Wednesday, 13 February 2013

Region segmentation

The most ubiquitous use of computer vision in reactive robotics is to identify
a region in the image with a particular color, a process called region segmentation.
Region segmentation and color affordances are a staple perceptual
algorithm for successful entries to many different international robot competitions,
including the AAAI Mobile Robot Competition, RoboCup, and
MIROSOT. There are many color region segmentation algorithms available,
and Newton Labs sells a Cognachrome board dedicated to the rapid extraction
of colored regions. The basic concept is to identify all the pixels in an
image which are part of the region and then navigate to the region’s center
(centroid). The first step is to threshold all pixels which share the same color
(thresholding), then group those together and throw out any pixels which
don’t seem to be in the same area as the majority of the pixels (region growing).
Ch. 5 described a robot which used red to signify a red Coca-Cola can for
recycling. Ideally, during the searching-for-the-can behavior the robot would
see the world as a binary image (having only two values) consisting of red
and not-red. This partitioning of the world can be achieved by thresholding the
image and creating a binary image. A C/C++ code example is shown below:
for (i = 0; i < numberRows; i++)
  for (j = 0; j < numberColumns; j++) {
    if ((ImageIn[i][j][RED] == redValue)
        && (ImageIn[i][j][GREEN] == greenValue)
        && (ImageIn[i][j][BLUE] == blueValue)) {
      ImageOut[i][j] = 255;   /* pixel matches the target color exactly */
    }
    else {
      ImageOut[i][j] = 0;     /* everything else is "not red"           */
    }
  }

Note that the output of a thresholded color image is a two-dimensional
array, since there is no need to have more than one value attached to each
pixel. Also, in theory a binary image would permit only values of 0 and
1. However, on many compilers there is no particular benefit to a bit-level
representation, and it can complicate code reuse. Also, most display
software is designed to display at least 256 values, and the difference between 1
and 0 is not detectable by the human eye. Therefore, it is more common to
replace the 1 with a 255 and use a full byte per pixel.
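Once a binary image exists, the searching-for-the-can behavior only needs the
region’s center (centroid) to steer toward. A minimal sketch of that step is shown
below; it assumes the ImageOut array and the numberRows/numberColumns dimensions
from the example above, and centroidRow/centroidCol are illustrative names rather
than part of the original example:

long sumRow = 0, sumCol = 0, count = 0;
int  centroidRow, centroidCol;

for (i = 0; i < numberRows; i++)
  for (j = 0; j < numberColumns; j++) {
    if (ImageOut[i][j] == 255) {     /* pixel belongs to the red region */
      sumRow += i;
      sumCol += j;
      count++;
    }
  }

if (count > 0) {
  centroidRow = sumRow / count;      /* average row (vertical) position      */
  centroidCol = sumCol / count;      /* average column (horizontal) position */
}

Steering so that centroidCol drifts toward the middle column of the image is then
a simple reactive behavior.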
Thresholding works better in theory than in practice due to the lack of
color constancy. Because of shading from the object’s shape, even though a human
sees the object as a solid color, the computer will see it as a set of similar
colors. The common solution is to specify a range of high and low values on
each color plane. The C/C++ code would now become:
for (i = 0; i < numberRows; i++)
  for (j = 0; j < numberColumns; j++) {
    if (((ImageIn[i][j][RED] >= redValueLow)
         && (ImageIn[i][j][RED] <= redValueHigh))
        && ((ImageIn[i][j][GREEN] >= greenValueLow)
            && (ImageIn[i][j][GREEN] <= greenValueHigh))
        && ((ImageIn[i][j][BLUE] >= blueValueLow)
            && (ImageIn[i][j][BLUE] <= blueValueHigh))) {
      ImageOut[i][j] = 255;   /* pixel falls inside the color range */
    }
    else {
      ImageOut[i][j] = 0;
    }
  }
Changes in viewpoint and lighting mean that the range which defines
the object from the robot’s current position is likely to change, so the
color range for the object has to be made even wider to include the set of
color values the object can take. If the object’s color is unique in that environment,
this increase in the color range is acceptable. Otherwise, if there
are objects which have a color close enough to the object of interest, those
objects may be mistaken for the target. In some circles, the object of interest
is called the foreground, while everything else in the image is called the
background. Thresholding an image requires a significant contrast between the
background and foreground to be successful.
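A common way to reduce the impact of stray background pixels that happen to
pass the threshold is the region-growing step mentioned at the start of this
section: group the 255 pixels into connected regions and keep only the largest
one. The sketch below is one simple way to do that, assuming the ImageOut array
and dimensions from the examples above; the labels, area, and stack arrays are
illustrative scratch storage (heap-allocated via stdlib.h), not part of the
original example:

#include <stdlib.h>   /* calloc/malloc/free for the scratch arrays */

int *labels = (int *) calloc(numberRows * numberColumns, sizeof(int));     /* 0 = unlabeled     */
int *area   = (int *) calloc(numberRows * numberColumns + 1, sizeof(int)); /* pixels per label  */
int *stackR = (int *) malloc(numberRows * numberColumns * sizeof(int));    /* pixels waiting    */
int *stackC = (int *) malloc(numberRows * numberColumns * sizeof(int));    /* to be grown       */
int top, nextLabel = 0, r, c, k, best;

for (i = 0; i < numberRows; i++)
  for (j = 0; j < numberColumns; j++) {
    if (ImageOut[i][j] == 255 && labels[i * numberColumns + j] == 0) {
      nextLabel++;                               /* start a new region */
      top = 0;
      stackR[0] = i;
      stackC[0] = j;
      labels[i * numberColumns + j] = nextLabel;
      while (top >= 0) {                         /* grow the region    */
        r = stackR[top];
        c = stackC[top];
        top--;
        area[nextLabel]++;
        /* add any red, unlabeled 4-connected neighbor to this region */
        if (r > 0 && ImageOut[r - 1][c] == 255 && labels[(r - 1) * numberColumns + c] == 0) {
          labels[(r - 1) * numberColumns + c] = nextLabel;
          top++;  stackR[top] = r - 1;  stackC[top] = c;
        }
        if (r < numberRows - 1 && ImageOut[r + 1][c] == 255 && labels[(r + 1) * numberColumns + c] == 0) {
          labels[(r + 1) * numberColumns + c] = nextLabel;
          top++;  stackR[top] = r + 1;  stackC[top] = c;
        }
        if (c > 0 && ImageOut[r][c - 1] == 255 && labels[r * numberColumns + c - 1] == 0) {
          labels[r * numberColumns + c - 1] = nextLabel;
          top++;  stackR[top] = r;  stackC[top] = c - 1;
        }
        if (c < numberColumns - 1 && ImageOut[r][c + 1] == 255 && labels[r * numberColumns + c + 1] == 0) {
          labels[r * numberColumns + c + 1] = nextLabel;
          top++;  stackR[top] = r;  stackC[top] = c + 1;
        }
      }
    }
  }

/* keep only the largest region; everything else is treated as noise */
best = 1;
for (k = 2; k <= nextLabel; k++)
  if (area[k] > area[best])
    best = k;

for (i = 0; i < numberRows; i++)
  for (j = 0; j < numberColumns; j++)
    ImageOut[i][j] = (labels[i * numberColumns + j] == best) ? 255 : 0;

free(labels);  free(area);  free(stackR);  free(stackC);

The centroid computation shown earlier can then be rerun on the cleaned-up image,
so that small patches of similarly colored background no longer pull the centroid
away from the target.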

