1. Particle Filter with Static Observations
Let $S = \{s_1, \dots, s_m\}$ be a set of model contour parts (or segments), which are called sites in the Markov Random Field (MRF) terminology. Let $E$ be a set of edge fragments in a given image $I$, which are called labels in the MRF terminology. Edge fragments are generated using edge-linking software [17]; we then introduce break points at high-curvature points in order to allow for more flexible shape matching. Let $\Omega = \{f : S \to E\}$ be the set of all functions which represent label assignments to the sites. Our goal is to find a function $f$ with maximum posterior probability for a pdf $p_1 : \Omega \to \mathbb{R}_{\geq 0}$:

$$\hat{f} = \operatorname*{argmax}_{f \in \Omega} \, p_1(f \mid Z), \tag{1}$$
where $\mathbb{R}_{\geq 0}$ denotes the nonnegative real numbers and $Z$ is the set of observations. It is a very important property of our framework that the set of observations $Z$ is static: it is determined by the appearance of a target object (or a set of target objects). Our primary appearance feature is the shape of the contour of the target object (or target objects), measured by the shape similarity of extracted and grouped edge fragments to the target contour. The appearance features of target objects (which we also call model objects) can be manually defined, e.g., by drawing the contour of the query shape, or learned from training examples.
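For concreteness, the following is a minimal sketch of the break-point step, assuming edge fragments arrive as ordered point lists (polylines) from the edge-linking stage; the turning-angle criterion and the threshold value are illustrative choices, not necessarily those of [17].

```python
import numpy as np

def split_at_high_curvature(polyline, angle_thresh_deg=45.0):
    """Split an edge-linked polyline into fragments at high-curvature points.

    polyline: (n, 2) array of ordered 2D edge points.
    Returns a list of (k, 2) arrays, one per fragment. A point is a break
    point if the turning angle between the incoming and outgoing direction
    exceeds angle_thresh_deg (an illustrative proxy for high curvature).
    """
    pts = np.asarray(polyline, dtype=float)
    if len(pts) < 3:
        return [pts]
    v_in = pts[1:-1] - pts[:-2]    # incoming direction at each interior point
    v_out = pts[2:] - pts[1:-1]    # outgoing direction at each interior point
    cos_a = np.einsum('ij,ij->i', v_in, v_out) / (
        np.linalg.norm(v_in, axis=1) * np.linalg.norm(v_out, axis=1) + 1e-12)
    turn = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    breaks = np.where(turn > angle_thresh_deg)[0] + 1  # indices into pts
    fragments, start = [], 0
    for b in breaks:
        fragments.append(pts[start:b + 1])  # fragments share the break point
        start = b
    fragments.append(pts[start:])
    return fragments
```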
We propose to perform the optimization in Eq. 1 in the particle filter (PF) framework. This is possible since each function $f \in \Omega$ is a finite set of pairs, i.e., $f = \{(s_1, f(s_1)), \dots, (s_m, f(s_m))\} = \{x_1, \dots, x_m\}$, where $x_k = (s_k, f(s_k)) \in S \times E$. Obviously, the order of the $x$'s does not matter, i.e., each permutation of the $x$'s in the set defines the same function $f$. A key observation for the proposed approach is that a sequence $(x_1, \dots, x_m)$ maximizing $p$ clearly determines the set $\{x_1, \dots, x_m\}$ that maximizes $p_1$. Thus, we solve a more specific problem of finding a sequence. The sequence order is important in the proposed PF inference process, which is described below. Following a common notation in the PF framework, we denote a sequence $(x_1, \dots, x_m)$ as $x_{1:m}$. We can now restate our goal (1) as finding a sequence with maximum posterior probability:

$$\hat{x}_{1:m} = \operatorname*{argmax}_{x_{1:m}} \, p(x_{1:m} \mid Z). \tag{2}$$
Obviously, as stated above, a solution of Eq. 2, which is a sequence $\hat{x}_{1:m} = (\hat{x}_1, \dots, \hat{x}_m)$, defines a solution of Eq. 1 as $\hat{f} = \{\hat{x}_1, \dots, \hat{x}_m\}$, which is a set.
We will approximate $p(x_{1:m} \mid Z)$ in Eq. 2 in the framework of Bayesian importance sampling. By drawing samples $x_{1:m}^{(i)}$ for $i = 1, \dots, N$ from an easier-to-sample proposal distribution $\pi$, we obtain

$$p(x_{1:m} \mid Z) \approx \sum_{i=1}^{N} w^{(i)} \, \delta\big(x_{1:m} - x_{1:m}^{(i)}\big), \tag{3}$$

where $\delta$ is the Dirac delta function and

$$w^{(i)} \propto \frac{p\big(x_{1:m}^{(i)} \mid Z\big)}{\pi\big(x_{1:m}^{(i)} \mid Z\big)} \tag{4}$$
are normalized weights. The weights account for the fact that the proposal distribution π in general is not equal to the true distribution of successor states.
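To make Eqs. 3 and 4 concrete, here is a small self-contained illustration of Bayesian importance sampling on a toy one-dimensional target (not on correspondence sequences): samples drawn from a broad proposal $\pi$ are reweighted by $p/\pi$, and the normalized weights turn the sample set into the Dirac-mixture approximation of Eq. 3.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy densities standing in for p(x_{1:m} | Z) and the proposal pi.
p = lambda x: np.exp(-0.5 * (x - 2.0) ** 2)            # unnormalized target
pi_pdf = lambda x: np.exp(-0.5 * (x / 3.0) ** 2) / (3.0 * np.sqrt(2 * np.pi))

N = 10_000
x = rng.normal(0.0, 3.0, size=N)   # samples from the proposal pi
w = p(x) / pi_pdf(x)               # Eq. 4: w^(i) proportional to p / pi
w /= w.sum()                       # normalize the weights

# Eq. 3: the weighted samples approximate expectations under the target,
# e.g. the posterior mean (true value 2.0 for this toy target).
print((w * x).sum())
```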
However, due to the high dimensionality of $(S \times E)^m$, it is still hard to sample from $\pi$. Therefore, we will derive a recursive estimation of the weights and recursive sampling of the sequence elements one by one from $S \times E$. The recursive estimate of the importance weights will be obtained by factorizing the distributions $p$ and $\pi$ and by modeling the evolution of the hidden states in discrete time steps. We do not have any natural time parameter in our approach, but the discrete steps of expanding the sequence of hidden states by one new state can be interpreted as discrete time steps. Our derivation is similar to the PF derivation, but it differs fundamentally, since unlike the standard PF framework, the observations $Z$ do not arrive sequentially but are available at once. For every $t$ from 1 to $m$, we have

$$w_t^{(i)} \;\propto\; \frac{p\big(x_{1:t}^{(i)} \mid Z\big)}{\pi\big(x_{1:t}^{(i)} \mid Z\big)} \;=\; \frac{p\big(x_t^{(i)} \mid x_{1:t-1}^{(i)}, Z\big)\, p\big(x_{1:t-1}^{(i)} \mid Z\big)}{\pi\big(x_t^{(i)} \mid x_{1:t-1}^{(i)}, Z\big)\, \pi\big(x_{1:t-1}^{(i)} \mid Z\big)} \;\propto\; w_{t-1}^{(i)} \, \frac{p\big(Z \mid x_{1:t}^{(i)}\big)\, p\big(x_t^{(i)} \mid x_{1:t-1}^{(i)}\big)}{p\big(Z \mid x_{1:t-1}^{(i)}\big)\, \pi\big(x_t^{(i)} \mid x_{1:t-1}^{(i)}, Z\big)}. \tag{5}$$
To obtain the last equation, we apply Bayes' rule to decompose $p(x_t \mid x_{1:t-1}, Z)$ in a way that interchanges $x_t$ and $Z$: $p(x_t \mid x_{1:t-1}, Z) = p(Z \mid x_{1:t})\, p(x_t \mid x_{1:t-1}) / p(Z \mid x_{1:t-1})$. As is often the case in PF applications, we assume that $\pi(x_t \mid x_{1:t-1}, Z) = p(x_t \mid x_{1:t-1})$. Using this simple exploration-based proposal, the weight recursion in (5) becomes

$$w_t^{(i)} \;\propto\; w_{t-1}^{(i)} \, \frac{p\big(Z \mid x_{1:t}^{(i)}\big)}{p\big(Z \mid x_{1:t-1}^{(i)}\big)}. \tag{6}$$
By recursive substitution of the weights in (6), i.e., by applying (6) to $w_{t-1}^{(i)}, w_{t-2}^{(i)}, \dots, w_2^{(i)}$, we obtain

$$w_t^{(i)} \;\propto\; w_1^{(i)} \, \frac{p\big(Z \mid x_{1:t}^{(i)}\big)}{p\big(Z \mid x_1^{(i)}\big)}. \tag{7}$$
Finally, under the assumption that all particles have the same initial weight $w_1^{(i)}$ and the same initial observation probability $p\big(Z \mid x_1^{(i)}\big)$ for $i = 1, \dots, N$, we obtain

$$w_t^{(i)} \;\propto\; p\big(Z \mid x_{1:t}^{(i)}\big). \tag{8}$$
The weight in (8) represents particle evaluation with respect
to shape and other appearance features of the model
described in the observation set Z. The intuitive explanation
is that a new correspondence $x_t$ added to the sequence
of correspondences should increase the similarity of
the selected edge fragments in the image to the model object.
Thus, the new weight is more informative if evaluated
using the extended set of correspondences and the old
weight is not needed for the evaluation. For comparison,
the corresponding weight update in the standard PF
framework ([29]) is
$$w_t^{(i)} \;\propto\; w_{t-1}^{(i)} \, p\big(z_t \mid x_t^{(i)}\big), \tag{9}$$
where $z_t$ denotes the new observations obtained at time $t$. Because our observations $Z$ do not have any natural order, $Z$ cannot be expressed as a sequence of observations. We do not make any Markov assumption in the proposed formula (8), i.e., the new state $x_t^{(i)}$ depends on all previous states $x_{1:t-1}^{(i)}$ for each particle $(i)$.
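To make the contrast concrete, the two update rules can be written side by side; log_obs (standing for log p(Z | x_{1:t})) and log_lik (standing for log p(z_t | x_t)) are hypothetical scoring callbacks, not functions from the paper.

```python
# Hypothetical helpers: log_obs(prefix, Z) stands for log p(Z | x_{1:t}),
# log_lik(z_t, x_t) for log p(z_t | x_t). They are placeholders for the
# scoring functions whose construction is described later.

def log_weight_static(prefix, Z, log_obs):
    # Eq. 8: the weight is recomputed from the whole prefix x_{1:t};
    # the old weight w_{t-1} does not enter the evaluation at all.
    return log_obs(prefix, Z)

def log_weight_sequential(log_w_prev, z_t, x_t, log_lik):
    # Eq. 9 (standard PF): the old weight is reused and only the new
    # observation z_t that arrived at time t is scored.
    return log_w_prev + log_lik(z_t, x_t)
```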
Since the proposed PF framework performs sequential filtering, there are two important issues that need to be addressed: setting the initial correspondences $x_1^{(i)}$ for each particle $i = 1, \dots, N$ (Section 1.1) and the number of particles, which will be determined experimentally. We only mention here that the proposed approach is in some sense robust to the initial correspondences, since it does not matter with which correspondence we start as long as we start at some element of the optimal set of correspondences $\hat{f}$. In practice, we start at the most promising correspondences, which are determined by sampling from a distribution induced by the shape similarity between model contour segments and image edge fragments.
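A minimal sketch of this initialization step, assuming a precomputed matrix sim of nonnegative shape similarities with one row per model segment and one column per edge fragment; normalizing the similarities into a distribution is one plausible choice, not necessarily the exact construction used in the paper.

```python
import numpy as np

def sample_initial_correspondences(sim, n_particles, rng=None):
    """Sample one starting correspondence x_1 = (s_j, e_k) per particle.

    sim: (|S|, |E|) array of nonnegative shape similarities between model
         contour segments and image edge fragments (higher = more similar).
    Returns a list of (j, k) index pairs of length n_particles.
    """
    rng = rng or np.random.default_rng()
    probs = sim.ravel() / sim.ravel().sum()   # similarity -> distribution
    flat = rng.choice(sim.size, size=n_particles, p=probs)
    return [tuple(np.unravel_index(i, sim.shape)) for i in flat]
```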
The optimization of Eq. 2 is computed in the framework of Sequential Importance Resampling (SIR). We now outline our PF algorithm, which in each iteration, i.e., at every time step $t$, executes the following three steps for each particle $i = 1, \dots, N$:
1) Importance sampling / proposal: Sample a follower $x_t^{(i)} \sim p\big(x_t \mid x_{1:t-1}^{(i)}\big)$ and set $x_{1:t}^{(i)} = \big(x_{1:t-1}^{(i)}, x_t^{(i)}\big)$.
2) Importance weighting / evaluation: An individual importance weight $w_t^{(i)}$ is assigned to each particle according to Eq. 8.
3) Resampling: At the sampling stage we sample many more followers than the number of particles, which is referred to as prior boosting [12, 4], so as to capture multimodal likelihood regions. Thus we have a larger set of particles $\big\{x_{1:t}^{(i)} : i = 1, \dots, M\big\}$, where $M > N$, from which we subsample $N$ particles and assign equal weights to all of them, as in the standard SIR approach. While SIR requires a justification that it still computes (4) due to the presence of the old weights in (9), which are reset to be equal to $1/N$ after resampling in SIR, this fact is obvious in the proposed approach, since the old weights are not present in (8).
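The following sketch puts the three steps together. The names propose and log_obs are placeholder callbacks for the proposal distribution p(x_t | x_{1:t-1}) and the observation distribution p(Z | x_{1:t}) defined below, and drawing the M followers by repeatedly extending a randomly chosen parent is one plausible arrangement of the prior-boosting stage, not necessarily the authors' exact scheme.

```python
import numpy as np

def pf_static_observations(init_prefixes, propose, log_obs, Z, m, M, N,
                           rng=None):
    """Schematic SIR loop for Eqs. 2-8 (callbacks are placeholders).

    init_prefixes: N length-1 prefixes [x_1^(i)] from the initial sampling.
    propose(prefix, rng): draws one follower x_t ~ p(x_t | x_{1:t-1}).
    log_obs(prefix, Z):   log p(Z | x_{1:t}), i.e. Eq. 8 up to a constant.
    """
    rng = rng or np.random.default_rng()
    particles = list(init_prefixes)
    for t in range(2, m + 1):
        # 1) Proposal with prior boosting: grow M > N candidate prefixes.
        followers = []
        for _ in range(M):
            prefix = particles[rng.integers(len(particles))]
            followers.append(prefix + [propose(prefix, rng)])
        # 2) Evaluation: weight each follower by Eq. 8 (no old weights).
        log_w = np.array([log_obs(f, Z) for f in followers])
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # 3) Resampling: subsample N of the M followers; the kept
        #    particles then implicitly carry equal weight 1/N.
        idx = rng.choice(M, size=N, p=w)
        particles = [followers[i] for i in idx]
    # Return the best final sequence as an estimate of Eq. 2.
    return max(particles, key=lambda f: log_obs(f, Z))
```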
In order to be able to execute the proposed algorithm, we need to define the proposal distribution $p(x_t \mid x_{1:t-1})$ and the posterior distribution $p(Z \mid x_{1:t})$. We describe their constructions below.