Summary


1. Abstract

We propose a novel framework for contour-based object detection and recognition, which we formulate as a joint contour fragment grouping and labeling problem. For a given set of contours of model shapes, we simultaneously perform selection of relevant contour fragments in edge images, grouping of the selected contour fragments, and their matching to the model contours. The inference in all these steps is performed using particle filters (PF), but with static observations. Our approach needs only one example shape per class as training data. The PF framework, combined with the decomposition of model contour fragments into part bundles, allows us to implement an intuitive search strategy for the target contour in a clutter of edge fragments: first a rough sketch of the model shape is identified, and then the shape details are fine-tuned. We show that this framework yields not only accurate object detections but also accurate localizations in real cluttered images.

2. Motivation

To endow machines with human-like vision for applications such as:

  • Robots
  • Medical Aid
  • Driving Assistance

3. Shape Representation

Color images are difficult for computers to interpret directly because of variations in color, texture, and so on. We therefore begin by extracting the edges of the image.


After extracting the edges, we randomly pick three points (A, B, and C on the swan) within the boundary of an object to form a triangle. We describe this triangle by the distances AB and AC and the angle BAC. The descriptors of all possible triangles within the boundary of the object are accumulated in a histogram, and we use this histogram to represent the shape.

Comparing the histograms of shapes a, b, and c, we see that because shapes a and b are similar, their histograms resemble each other more than they resemble the histogram of shape c.
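The triangle histogram described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the implementation used in this work: the function names, the number of sampled triangles, the bin counts, the normalization by the largest point-to-point distance, and the L1 histogram distance are all hypothetical choices. The shape is assumed to be given as an array of distinct 2D points.

```python
import numpy as np

def triangle_histogram(points, n_triangles=2000, bins=(8, 8, 8), seed=0):
    """Histogram of (|AB|, |AC|, angle BAC) over randomly sampled triangles.

    `points` is an (N, 2) array of distinct shape points (assumed input format).
    """
    rng = np.random.default_rng(seed)
    pts = np.asarray(points, dtype=float)
    # Assumption: divide lengths by the largest pairwise distance so that
    # the descriptor does not depend on the size of the shape.
    diffs = pts[:, None, :] - pts[None, :, :]
    scale = np.sqrt((diffs ** 2).sum(-1)).max()
    # Three distinct point indices per triangle (first 3 of a random ordering).
    idx = rng.random((n_triangles, len(pts))).argsort(axis=1)[:, :3]
    a, b, c = pts[idx[:, 0]], pts[idx[:, 1]], pts[idx[:, 2]]
    ab, ac = b - a, c - a
    len_ab = np.linalg.norm(ab, axis=1)
    len_ac = np.linalg.norm(ac, axis=1)
    cos_bac = np.clip((ab * ac).sum(axis=1) / (len_ab * len_ac), -1.0, 1.0)
    features = np.stack([
        np.clip(len_ab / scale, 0.0, 1.0),
        np.clip(len_ac / scale, 0.0, 1.0),
        np.arccos(cos_bac),
    ], axis=1)
    hist, _ = np.histogramdd(features, bins=bins,
                             range=[(0, 1), (0, 1), (0, np.pi)])
    return hist.ravel() / hist.sum()

def histogram_distance(h1, h2):
    """Half the L1 distance: 0 for identical histograms, 1 for disjoint ones."""
    return 0.5 * np.abs(h1 - h2).sum()

# Usage: an ellipse, the same ellipse at twice the size, and a five-lobed shape.
t = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
shape_a = np.stack([np.cos(t), 0.5 * np.sin(t)], axis=1)
shape_b = 2.0 * shape_a
r = 1.0 + 0.5 * np.cos(5.0 * t)
shape_c = np.stack([r * np.cos(t), r * np.sin(t)], axis=1)

h_a, h_b, h_c = (triangle_histogram(s) for s in (shape_a, shape_b, shape_c))
d_ab = histogram_distance(h_a, h_b)
d_ac = histogram_distance(h_a, h_c)
```

In this toy setup the histograms of the two ellipses are closer to each other than to the histogram of the lobed shape, which mirrors the comparison of shapes a, b, and c.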


4. Prior Knowledge About Shape

To accommodate possible deformations or broken edges, we represent the shape model as part bundles. The main purpose of the bundle design is to ensure shape flexibility.

4a. Understanding the Edges

Machines do not know what is in an image; to a machine, an image is just a block of data in memory. To make the machine understand the image, we extract its edges and link them into edge chains represented as data sequences. Finally, we match the prior knowledge (the shape model) to the image to determine what it contains.


Link edges to edge chains represented with data sequences.
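As a rough illustration of edge linking, the sketch below traces 8-connected edge pixels of a binary edge map into ordered chains of (row, column) coordinates. It is a simple greedy tracer written for this summary, not the linking procedure used in this work; the function name `link_edges`, the neighbor order, and the `min_length` parameter are assumptions.

```python
import numpy as np

# 8-connected neighborhood offsets (row, column).
NEIGHBORS = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
             (0, 1), (1, -1), (1, 0), (1, 1)]

def link_edges(edge_map, min_length=2):
    """Group edge pixels of a binary map into ordered chains (lists of (row, col))."""
    edges = np.asarray(edge_map, dtype=bool)
    visited = np.zeros_like(edges)
    chains = []

    def walk(r, c):
        """Follow unvisited neighboring edge pixels from (r, c) until none remain."""
        path = []
        while True:
            for dr, dc in NEIGHBORS:
                nr, nc = r + dr, c + dc
                if (0 <= nr < edges.shape[0] and 0 <= nc < edges.shape[1]
                        and edges[nr, nc] and not visited[nr, nc]):
                    visited[nr, nc] = True
                    path.append((nr, nc))
                    r, c = nr, nc
                    break
            else:
                return path

    for r, c in zip(*np.nonzero(edges)):
        r, c = int(r), int(c)
        if visited[r, c]:
            continue
        visited[r, c] = True
        forward = walk(r, c)    # trace in one direction from the seed pixel
        backward = walk(r, c)   # then in the other direction, if any pixels remain
        chain = backward[::-1] + [(r, c)] + forward
        if len(chain) >= min_length:
            chains.append(chain)
    return chains

# Usage: one horizontal edge segment plus an isolated noise pixel.
edge_map = np.zeros((5, 7), dtype=bool)
edge_map[2, 1:6] = True
edge_map[0, 6] = True
chains = link_edges(edge_map)
```

Each resulting chain is an ordered data sequence that can then be split into contour fragments; chains shorter than `min_length` (such as the isolated pixel here) are discarded as noise.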

5. System Framework


6. Object Detection for Real Images Using Particle Filters

We formulate object detection as a labeling problem: we want to find the correct correspondence (labeling) between the fragments of the model and the fragments of the image. An exhaustive search over all labelings would be prohibitively expensive, since the problem is NP-hard. We therefore trade off accuracy against complexity by using particle filters. We initialize approximately 100 particles, each of which represents one configuration of correspondences between model fragments and image fragments. Particles with higher similarity are more likely to survive. After several iterations of weighting and resampling, the particles approach the optimal configuration.
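The weighting and resampling loop can be sketched as below. This is a toy illustration under strong assumptions, not the method of this work: it assumes a precomputed table `similarity[j, i]` between model fragment `j` and image fragment `i`, scores a configuration as a plain sum of per-fragment similarities (ignoring part bundles and geometric consistency between fragments), and uses a simple random relabeling as the proposal. The function name and all parameters are hypothetical. The observations are static in the sense that the similarity table never changes between iterations.

```python
import numpy as np

def particle_filter_labeling(similarity, n_particles=100, n_iters=100, seed=0):
    """Search for a labeling of model fragments by image fragments.

    similarity: (n_model, n_image) array of assumed fragment similarities.
    Returns the best labeling seen and its score.
    """
    rng = np.random.default_rng(seed)
    n_model, n_image = similarity.shape
    rows = np.arange(n_model)
    # Each particle assigns one image fragment index to every model fragment.
    particles = rng.integers(n_image, size=(n_particles, n_model))
    best_labels, best_score = None, -np.inf
    for _ in range(n_iters):
        # Proposal: every particle relabels one randomly chosen model fragment.
        which = rng.integers(n_model, size=n_particles)
        particles[np.arange(n_particles), which] = rng.integers(
            n_image, size=n_particles)
        # Weighting against the static observations (the fixed table).
        scores = similarity[rows, particles].sum(axis=1)
        top = int(scores.argmax())
        if scores[top] > best_score:
            best_labels, best_score = particles[top].copy(), float(scores[top])
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()
        # Resampling: particles with higher similarity are more likely to survive.
        particles = particles[rng.choice(n_particles, size=n_particles, p=weights)]
    return best_labels, best_score

# Usage: 4 model fragments, 6 image fragments; model fragment j matches image fragment j.
similarity = np.zeros((4, 6))
similarity[np.arange(4), np.arange(4)] = 5.0
labels, score = particle_filter_labeling(similarity)
```

Keeping track of the best configuration seen so far is a convenience of this sketch; because every particle is perturbed at each iteration, the final population alone may not contain the best labeling found.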


7. Experimental Results

We apply the algorithm to the ETHZ dataset, which includes 255 images in 5 classes. Some examples of the detection results are shown below.

1. Particle Filter with Static Observations