Start Small: August 2011

Sunday 28 August 2011

Finding stuck or hot pixels in Nikon D80 images

In a previous article I presented a tiny Python program that used the Python Image Library (PIL) to correct a so called stuck or hot pixel in an image taken by a digital SLR. In this article we go one step further and see if we can use some simple statistics to identify those pixels automatically.

Finding pixels with the least variance

The idea is to identify those pixels in a bunch of pictures that don't change much from picture to picture, even if the subject and lighting conditions do change. There are of course many ways to define change but the one we explore here is the called the variance. Basically we compute for each pixel in a set of pictures its average and then sum (again for each pixel) the differences with its average in each picture. The pixels (or subpixels) that have the smallest sums are likely candidates for being hot or stuck.

Using PIL and Numpy for efficient number crunching

Obviously calculating the variance for each subpixel in a few hundred pictures will entail some serious number crunching if these pictures are from a megapixel camera. We therefore better use a serious number crunching library, like Numpy. Because we use Python 3, I recommend fetching the PIL and Numpy packages from Christoph Gohlke's page if you use Windows.

The code is shown below. The program will open all images given as arguments one by one using PIL's Image.open() function (line 9). PIL images can be directly converted by Numpy by the array() function. Because we might run out of memory we do not keep all images in memory together but process them one by one and calculate the variance using the so called on-line algorithm. The names of the variables used are the same as in the Wikipedia article but instead of scalars we use arrays. (Note, we do not actually calculate the variance but just the sum of squares of differences from the mean because we interested in the pixel with smallest variance whatever that may be and not in the value of the variance at such.) In the final lines (line 27) we print out the results, retrieving the index of the lowest variance with Numpy's argmin() function.

import Image
import numpy as np
import sys
from glob import glob

first = True
for arg in sys.argv[1:]:
	for filename in glob(arg):
		pic = Image.open(filename)
		pix = np.array(pic)
		if first:
			first = False
			firstpix = pix
			n = np.zeros(firstpix.shape)
			mean = np.zeros(firstpix.shape)
			M2 = np.zeros(firstpix.shape)
			delta = np.zeros(firstpix.shape)
		else:
			if pix.shape != firstpix.shape:
				print("shapes don't match")
				continue
		n += 1
		delta = pix - mean
		mean += delta/n
		M2 += delta*(pix - mean)

mini = np.unravel_index(M2.argmin(),M2.shape)
print('min',M2[mini],mini)

Results and limitations

Using a few hundred photo's I could easily identify a previously observed hot pixel. It did take a few minutes though, even on my pc which is a fairly powerful hexacore machine. Besides speed the biggest limitation is that I had to use JPEG images because I did not have RAW images available. JPEG image compression isn't lossless so the hot pixel tends to be smeared out. If you print out not just the pixel with the lowest variance but the ten pixels with the lowest variance, most of them are clustered around the same spot. (example code below).

sorti = M2.argsort(axis=None)
print(sep="\n",*[(i,M2[i]) for i in [np.unravel_index(i,M2.shape) for i in sorti[:10]]])

Sunday 21 August 2011

Windows 7 Gadgets: mini web applications

Windows 7 gadgets are small applications that are integrated in the Windows desktop. These gadgets are basically webpages displayed in a browser environment that hardly differs from a regular environment. It is therefore entirely possible to turn a gadget into a web application that retrieves data from a remote server with AJAX. In this article we explore some of the possibilities and see how we may use jQuery inside a gadget.

Anatomy of a Windows 7 gadget

There is actually some pretty decent documentation on gadgets available on the web (for example on msdn or here). The trick is to get an example gadget running with as little ballast as possible.

A gadget file is basically a zip file that contains a file gadget.html plus a number of additional files, the most important one, gadget.xml. This zip file does have a .gadget extension instead of a .zip extension. So these simple steps are needed to create a bare bones gadget:

Create a directory with a descriptive name, e.g. mygadget
In this directory create a file gadget.html
a subdirectory named css with a file mygadget.css
a file gadget.xml
and finally a file icon.png
pack the contents of this directory into a file mygadget.zip (i.e. not the toplevel directory itself)
rename this file to mygadget.gadget (although 7zip for example can pack to a file with any extension directly)
double click this file and follow through the install dialog

With the following gadget.html the result will look like the screenshot

<html xmlns="http://www.w3.org/1999/xhtml">
<head>	
	<title>My Gadget</title>
	<meta http-equiv="Content-Type" content="text/html; charset=unicode" />
	<link href="css/mygadget.css" rel="stylesheet" type="text/css" /> 
</head>
<body> 
<div id="main_image">
<p>42</p>
</div>
</body>
</html>

Not really impressive, I admit, but it shows how simple it is to create a gadget. The necessary gadget.xml mainly describes were to find the actual html code and what image to use in the gadget selector:

<?xml version="1.0" encoding="utf-8" ?>
<gadget>
  <name>Mygadget</name>
  <namespace><!--_locComment_text="{Locked}"-->StartSmall.Gadgets</namespace>
    <version><!--_locComment_text="{Locked}"-->1.0</version>
  <author name="Michel Anders">
    <info url="http://michelanders.blogspot.com" text="Start Small" />
    <logo src="icon.png" />
  </author>
  <copyright><!--_locComment_text="{Locked}"-->&#169; 2011</copyright>
  <description>Basic Gadget</description>
  <icons>
    <icon height="48" width="48" src="icon.png" />
  </icons>
  <hosts>
    <host name="sidebar">
      <base type="HTML" apiVersion="1.0.0" src="gadget.html" />
      <permissions>
        <!--_locComment_text="{Locked}"-->Full
      </permissions>
      <platform minPlatformVersion="1.0" />
    </host>
  </hosts>
</gadget>

Using jQuery in a Windows 7 gadget

Displaying static information is not much fun at all so let's see what options there are to create a more dynamic gadget:

Refer to external content like images that get refreshed
Refresh the content ourselves using JavaScript, possibly even interacting with the user

The first option is simple enough and we won't cover that here. The second option is more interesting because it opens up a whole array of possibilities. Let's have a look at how a gadget.html page should look to incorporate jQuery and how we can test if this works:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>	
	<title>My Gadget</title>
	<meta http-equiv="Content-Type" content="text/html; charset=unicode" />
	<link href="css/mygadget.css" rel="stylesheet" type="text/css" />
	<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js" type="text/javascript"></script>
</head>
<body> 
<div id="main_image">
<p>42</p>
</div>
<script type="text/javascript">$("#main_image p").append('<span> 43 44 45 </span>');</script>
</body>
</html>

As you can see this is surprisingly simple. The screenshot proves that the final lines of code are actually executed and change the contents of our basic gadget: Our next step is to check if we can retrieve data from a server.

JSONP in a Windows 7 gadget

Due to the same origin policy it is not entirely straight-forward to retrieve data from a server different from the server we get our webpage from. In the gadget environment every server is considered a different server because the gadget.html file originates from a file system. This means that even if we access a web application server on the same pc, we will be denied access.

Fortunately there is a workaround available in the form of JSONP and jQuery makes it very simple for use to use this. Consider the following gadget.html:

<html xmlns="http://www.w3.org/1999/xhtml">
<head>	
	<title>My Gadget</title>
	<meta http-equiv="Content-Type" content="text/html; charset=unicode" />
	<link href="css/mygadget.css" rel="stylesheet" type="text/css" />
	<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.6.1/jquery.min.js" type="text/javascript"></script>
</head>
<body> 
<div id="main_image">
<p>42</p>
</div>
<script type="text/javascript">
function Refresh() {
    $.ajax('http://127.0.0.1:8088/value',
		{dataType:'jsonp',scriptCharset: "utf-8",
		success:function(data, textStatus, jqXHR){
			$("#main_image").empty().append('<p>'+String(data)+'</p>');
		}});
}
window.setInterval("Refresh()", 5000);
</script>
</body>
</html>

It will contact an application server every 5 seconds to retrieve a number and will insert this number into the content div. That's all there is to it. The application server that serves these request is equally simple and can be build for example with the applicationserver module from a previous article:

from http.server import HTTPServer, BaseHTTPRequestHandler
from json import dumps
from applicationserver import IsExposed,ApplicationRequestHandler

class Application:

	def value(self,callback:str,_:int=0) -> IsExposed:
		r = "%s(%s);"%(callback,dumps(_ % 17)) 
		return r
		
class MyAppHandler(ApplicationRequestHandler):
	application=Application()
	
appserver = HTTPServer(('',8088),MyAppHandler)
appserver.serve_forever()

The only trick here is that JSONP requests pass an additional paramter (usually called callback that will hold a string (the name of a JavaScript function in the client) and that we have to use this parameter to contruct a return value that looks like a call to this function with the JSON encoded data as its argument (line 8). The _ parameter holds a random number added by JQuery to prevent caching by the browser. So if _ happened to be 123 and callback would be jquery456_123 the value we woudl return would be the string jquery456_123(4);

The data we return here is merely a random number but of course this could be anything and we could style it to be presented in a more readable form.

Sunday 14 August 2011

Correcting stuck or hot pixels in Nikon D80 images

A friend of mine became suddenly aware that her Nikon D80 camera had developed a so called stuck or hot pixel. Repairing a CCD chip in a camera is quite costly and what is more, looking back through the old photos she discovered the defect had been present for over a year. In this article we develop a tiny Python program that masks out a pixel in pretty much the same way that the built-in firmware in de camera does this for pixels that were hot or dead when the camera was constructed.

Correcting stuck or hot pixels in images

Hot or stuck pixels in a camera are annoying and other than replacing the ccd chip, nothing can be done about it with regard to the hardware. More on this in this illuminating article. Sometimes the camera can map out bad pixels but that won't fix old images, so the plan is to create a small program that does this on existing images.

The idea is to replace the offending pixel with a weighted average of the neighboring pixels so it will blend in with the surroundings. Diagonally adjacent pixels have a lower weight then pixels horizontally or vertically adjacent (see picture on the left). Such a program is simple enough to implement using PIL, the Python Imaging Library. There is even a version for Python 3 made available by Christoph Gohlke, although you will have to replace all relative import statements with absolute one if you want that to work. (Simply run the the installer, open all *.py files in site-packages/PIL and replace all occurrences of from . import with a simple import. A decent editor like notepad++ or ultraedit can manage that in one go).

The program will take the name of an image file and the x,y position of the pixel to fix as its arguments, e.g. python correctpixel.py myimage.jpg 1500,1271 and will produce a corrected image with the same name but with a c_ prefix added, in our example c_myimage.jpg. It will use a weighted average of the eight neighboring pixels as depicted in this image (the diagonally adjacent pixels have a weight of 1/sqrt(2).

The simplest way to find the exact location of the pixel to fix is to open the image in Gimp, locate the pixel and hover over it with the mouse: the location is shown in the status bar at the bottom. Depending on your zoom level these may be fractional numbers, but we need just the integral parts.

The program itself is rather straightforward and is shown in its entirety below:

import sys
import Image

def correct(im,xy,matrix):
 pim = im.load()
 x,y = xy
 maxx,maxy = im.size
 n = 0
 sumr,sumg,sumb = 0,0,0
 for dx,dy,w in matrix:
  px,py = x+dx,y+dy
  if px<0 or py<0 or px >= maxx or py >= maxy:
   break
  n += w
  r,g,b = pim[px,py]
  sumr,sumg,sumb = sumr+r*w,sumg+g*w,sumb+b*w
 pim[x,y]=(int(sumr/n),int(sumg/n),int(sumb/n))

w=1/(2**0.5)
matrixd=((0,1,1),(0,-1,1),(1,0,1),(-1,0,1),(-1,-1,w),(-1,1,w),(1,1,w),(1,-1,w))

im = Image.open(sys.argv[1])
xy = tuple(map(int,sys.argv[2].split(',')))
correct(im,xy,matrixd)
im.save('c_'+sys.argv[1],quality=97)

Given an opened image the function correct() will load the data and the loop of the list of pixels offsets and weights given in its matrix argument. It will check whether a neighbor is within the bounds of the image (line 12; if the stuck pixel is on an edge it might not be) and if so, sum its RGB components according to the weight of this pixel. The pixel is finally replaced by summed values divided by the number of summed pixels. We also take care to convert the RGB components to integers again (line 17Y).

The final lines take care of opening the image (line 22), and converting the pixel position argument to a tuple of integers before calling the correct() function. The image is saved again with a name prefixed with c_. The quality parameter is set to a very high value to more or less save the image with the same jpeg quality a the original image.

Note that at the moment the code shown works for jpeg images only (or to be precise, only for images with three bands, R,G,B.) Png images may have and extra transparency band and the code does not deal with that nor does it play nice with indexed formats (like gif).

Sunday 7 August 2011

Function annotations in Python, checking parameters in a web application server , part II

In a previous article I illustrated how we could put Python's function annotations to good use to provide use with a syntactically elegant way to describe which kind of parameter content would be acceptable to our simple web application framework. In this article the actual implementation is presented along with some notes on its usage.

A simple Python application server with full argument checking

The applicationserver module provides a ApplicationRequestHandler class that derives from the Python provided BaseHTTPRequestHandler class. For now all we do is provide a do_GET() method, i.e. we don't bother with HTTP POST methods yet, although it wouldn't take too much effort if we would.

The core of the code is presented below:

class ApplicationRequestHandler(BaseHTTPRequestHandler):
 
 def do_GET(self):
  url=urlsplit(self.path)
  path=url.path.strip('/')
  parts=path.split('/')
  if parts[0]=='':
   parts=['index']
   
  ob=self.application
  try:
   for p in parts:
    if hasattr(ob,p):
     ob=getattr(ob,p)
    else:
     raise AttributeError('unknown path '+ url.path)
  except AttributeError as e:
   self.send_error(404,str(e))
   return
   
  if ('return' in ob.__annotations__ and 
   issubclass(ob.__annotations__['return'],IsExposed)):
   try:
    kwargs=self.parse_args(url.query,ob.__annotations__)
   except Exception as e:
    self.send_error(400,str(e))
    return
    
   try:
    result=ob(**kwargs)
   except Exception as e:
    self.send_error(500,str(e))
    return
  else:
   self.send_error(404,'path not exposed'+ url.path)
   return
   
  self.wfile.write(result.encode())

 @staticmethod
 def parse_args(query,annotations):
  kwargs=defaultdict(list)
  if query != '':
   for p in query.split('&'):
    (name,value)=p.split('=',1)
    if not name in annotations:
     raise KeyError(name+' not annotated')
    else:
     kwargs[name].append(annotations[name](value))
   for k in kwargs:
    if len(kwargs[k])==1:
     kwargs[k]=kwargs[k][0]
  return kwargs

The first thing we do in the do_GET() method is splitting the path into separate components. The path is provided in the path member and is already stripped of hostname and query parameters. We strip from it any leading or trailing slashes as well (line 5). If there are no path components we will look for a method called index.

The next step is to check each path component and see if its a member of the application we registered with the handler. If it is, we retrieve it and check whether the next component is a member of this new object. If any of these path components is missing we raise an error (line 16).

If all parts of the path can be resolved as members, we check whether the final path points to an executable with a return allocation. If this return allocation is defined and equal to our IsExposed class (line 22) we are willing to execute it, otherwise we raise an error.

The next step is to check each argument, so we pass the query part of the URL to the static parse_args() method that will return us a dictionary of values if all values checked out ok. If so we call the method that we found earlier with these arguments and if all went well, write its result to the output stream that will deliver the content to the client.

The parse_args() method is not very complicated: It creates a default dictionary whose default will be an empty list. This way we can create a list of values if the query consists of more than one part with the same name. Then we split the query on the & character (line 44) and split each part in a name and a value part (these are separated by a = character). Next we check if the name is present in the annotations dictionary and if not raise a KeyError.

If we did find the name in the annotations dictionary its associated value should be an executable that we pass the value from the query part (line 48). The result of this check (or conversion) is appended to the list in the default dictionary. If the value in the annotation is not an executable an exception will be raised that will not be caught. Likewise will any exception within this callable bubble up to the calling code. The final lines (line 50-53) reduce those entries in the default dictionary that consist of a list with just a single item to just that item before returning.

Conclusion

Python function annotations can be used for many purposes and this example code show a rather elegant (I think) example that let's us specify in clear language what we expect of functions that are part of web applications, which makes it quite easy too catch many input errors in an way that can be adapted to almost any input pattern.