project results: |
Towards Facade Structure Recognition
Jan Čech and Radim Šára
Center for Machine Perception
Czech Technical University in Prague
May 2008
In the world of rectified images of classical facades, we work on a
structural model that describes the images. We are looking for a
computational procedure that segments the image into the elementary units
of the structural model, the primitives. The primitives must
conform to global constraints, which we call the structure. The
constraints have some unknown common attributes, which must be
determined as part of the segmentation procedure.
This result focuses on window detection. Windows tend to be aligned
horizontally and vertically. The spacings between the columns or rows
of a window array may be irregular and are unknown. The structure
recognition task is then to find the maximum-probability aligned array
of unknown attributes (common window width and height and spacing
attributes per row/column).
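To make the attribute set concrete, the following minimal sketch shows one way an aligned window array could be represented. The class name `WindowArray` and all of its field names are hypothetical and chosen for illustration only; they are not taken from the authors' implementation.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WindowArray:
    """Hypothetical container for the attributes of an aligned window array."""
    x0: float               # left edge of the first column
    y0: float               # top edge of the first row
    width: float            # window width shared by all windows
    height: float           # window height shared by all windows
    col_gaps: List[float]   # gap between consecutive columns (may be irregular)
    row_gaps: List[float]   # gap between consecutive rows (may be irregular)

    def rectangles(self) -> List[Tuple[float, float, float, float]]:
        """Return (x, y, width, height) for every window, row by row."""
        xs = [self.x0]
        for gap in self.col_gaps:
            xs.append(xs[-1] + self.width + gap)
        ys = [self.y0]
        for gap in self.row_gaps:
            ys.append(ys[-1] + self.height + gap)
        return [(x, y, self.width, self.height) for y in ys for x in xs]
```

For example, `WindowArray(10, 20, 30, 40, [5, 15], [10]).rectangles()` yields six rectangles arranged in two rows and three columns with unequal column spacing.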
To solve the task we have proposed an attributed 2D stochastic
grammar. At this point, the production rules are designed manually.
The parsing procedure starts with an AdaBoost classifier trained on
exemplar window images. This produces what we call seed
detections (see the figure below). The classifier is run on all image
scales, since the window size is not known in advance. Next, each seed
detection is updated (its location is slightly shifted) by a
maximum-likelihood procedure using a parametric image likelihood model
learned from a small set of exemplar window images. The likelihood
model is currently independent of the AdaBoost classifier.
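The seed update can be illustrated as a local search over small shifts that keeps the location with the highest score. The sketch below is only an illustration under strong assumptions: `box_score` is a toy stand-in (windows assumed darker than the wall) for the learned parametric likelihood model, which is not specified here, and the names `shake_seed` and `radius` are hypothetical.

```python
def box_score(image, x, y, w, h):
    """Toy stand-in for a log-likelihood: minus the mean intensity in the box.

    Assumption: windows are darker than the wall. The real model is a
    parametric likelihood learned from exemplar window images.
    """
    total = sum(image[r][c] for r in range(y, y + h) for c in range(x, x + w))
    return -total / (w * h)


def shake_seed(image, x, y, w, h, radius=3):
    """Shift a seed box by at most `radius` pixels to maximize the score."""
    rows, cols = len(image), len(image[0])
    best_score, best_x, best_y = box_score(image, x, y, w, h), x, y
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            nx, ny = x + dx, y + dy
            # Skip shifts that would push the box outside the image.
            if nx < 0 or ny < 0 or nx + w > cols or ny + h > rows:
                continue
            score = box_score(image, nx, ny, w, h)
            if score > best_score:
                best_score, best_x, best_y = score, nx, ny
    return best_x, best_y
```

The exhaustive search is chosen for clarity; it is affordable because the shift range is small.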
Next, the parser continues to grow a structural component
from each updated seed detection. Part of that process is a
maximum-likelihood image search for a sub-structure (a single window
or a row of windows) within an attribute range predicted by the
current structural component. The same image likelihood model as for
the seed update is used. Each elementary growing step concludes by
updating the attributes of the whole structural component. We call this
update process, or local attribute re-optimization, component
shaking (the seed update discussed above is in fact performed by
a single-element component shake as well).
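A heavily simplified, one-dimensional sketch of the growing step is given below: a row of equally wide windows is extended to the right, and each new window is searched for only within the gap range predicted by the current component. Everything here is an assumption made for illustration: the toy score `window_score`, the names `grow_row`, `gap_range` and `min_score`, and the stopping rule. The shaking step that follows each growth step in the described procedure is omitted.

```python
def window_score(profile, x, width):
    """Toy stand-in for a log-likelihood on a 1D intensity profile.

    Assumption: windows are dark (0.0) and the wall is bright (1.0).
    """
    if x < 0 or x + width > len(profile):
        return float("-inf")
    return -sum(profile[x:x + width]) / width


def grow_row(profile, seed_x, width, gap_range=(2, 12), min_score=-0.2):
    """Greedily extend a row of windows to the right of the seed."""
    xs = [seed_x]
    while True:
        # Range predicted by the current component: the next window starts
        # one window width plus an admissible gap after the last one.
        start = xs[-1] + width + gap_range[0]
        stop = xs[-1] + width + gap_range[1]
        score, x = max((window_score(profile, c, width), c)
                       for c in range(start, stop + 1))
        if score < min_score:
            break  # no sufficiently likely window in the predicted range
        xs.append(x)
    return xs
```

On a profile with dark windows of width 6 starting at positions 5, 16 and 30, `grow_row(profile, 5, 6)` recovers all three starts even though the two gaps differ.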
Each structural component, even a partially grown one, is always
consistent with the grammatical rules.
Finally, the structural component of maximum probability is
selected for output. The current implementation of the parsing
procedure assumes there is just a single starting symbol per image.
Various generalizations are under development, including the
possibility of multiple starting symbols, relaxing the need for
rectification, a more complex structural model, and introducing
appearance attributes.
[Figure: seed detections (left) and maximum structural component (right)]
References
[1] Jan Cech and Radim Sara. Modules for Structure Detection.
eTRIMS deliverable D3.3, May 2008.
[2] Jan Cech and Radim Sara. Windowpane Detection based on Maximum
Aposteriori Probability Labeling. In: Image Analysis - From Theory
to Applications - Proceedings of the 12th Int Workshop on
Combinatorial Image Analysis (IWCIA '08), April 2008.