Feedback For
Multimedia Retrieval
By Rong Yan, Alexander
G. and Rong Jin
Mwangi S. Kariuki
• What is negative pseudo-relevance feedback (NPRF) in multimedia retrieval?
• It arises from the high demand for content-based access to video.
• Content-based means that searching is not limited to manually indexed
terms: the system directly evaluates whether the video content (image and
audio) is similar to the query.
• Users need to query and retrieve based on the audio information and the
imagery of the video. This is content-based video retrieval (CBVR), or
content-based multimedia retrieval.
• It uses pattern recognition techniques.
• NPRF picks out images/items that are not similar to the query, or to the
relevant information, and uses them as negative examples.
• CBVR relies on a pre-defined generic similarity metric that determines the
distance between two images. Limitations include:
a. The visual feature representation is limited to capturing fairly low-level
physical features (color, texture, or shape).
b. Different query scenarios require different similarity metrics to
model the distribution of examples, e.g. sky and water (sea).
• A standard relevance feedback system iteratively asks the user to label more
training examples as relevant or non-relevant for the learning algorithm.
• After each round of interactive relevance feedback, the system must rebuild the classifier.
• The top-ranked examples from the generic similarity metric are not
always correct, because current visual information retrieval performs
poorly in applications, e.g. on the shape of cars.
The Informedia Digital Video Library System
• Focuses on information extraction from video.
• Involves the integration of speech, image and natural language.
• After retrieving the metadata, the system enables full-content
search and retrieval of the spoken language and visual content.
• The Informedia interface provides multiple levels of abstraction,
including:
a. Visual icons with relevance measure
b. Short titles or headlines
c. Topic identification of stories
d. Filmstrip (storyboard) views
e. Transcript
f. Dynamic maps
g. Active video skims
h. Face detection and recognition
i. Image retrieval
Relevance and Pseudo-Relevance Feedback in
Information Retrieval
• Main retrieval technique
• Pseudo-Relevance Feedback is an automatic retrieval
approach without any user intervention.
• Start with a small number of positive examples and no negative
examples, then extract strong negative examples to train the classifier.
• Transductive learning and co-training are two paradigms for
utilizing the information in unlabeled data.
• Co-training suits multimedia retrieval because
redundant information is available from different modalities.
Pseudo-Relevance Feedback
• Define the query: a text description plus audio, image, or video.
• The video retrieval algorithm retrieves a set of relevant video shots from the given
data collection.
• Given a target collection T and a query set Q, the retrieval algorithm should provide
a permutation of the video shots t(i) in T, sorted by their similarity
to the user queries q(i) in Q.
• The difference between two video segments is measured through a
similarity metric between their feature vectors.
• For each query, the video collection is then separated into two parts:
positive examples (T+) and negative examples (T-).
• Precision and recall are performance measures for retrieval systems, but
we use mean average precision because it takes the ranking into account.
• Precision is computed after every retrieved relevant shot, and these
precisions are averaged to give the average precision of a query. The mean
of the average precision over all queries gives the mean average precision.
• The main idea of the PRF approach is to automatically feed back the data
that are identified by the generic similarity metric.
• We define the positive distance d+ as the distance between the
positive data T+ and the queries.
• The negative distance d- is defined analogously, between the negative
data T- and the queries.
• The distances d+ and d- converge towards a Gaussian distribution
as the number of examples goes to infinity.
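• Therefore the probability density function (pdf) p(x) of both
distances has the Gaussian form
p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right),
where \mu and \sigma are the mean and standard deviation of the respective
distance. Its cumulative distribution is expressed through the error function erf(x).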
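The mean average precision measure described above can be sketched in a few lines. This is a minimal illustration, not the authors' evaluation code: the function names and the two example rankings are hypothetical, and it assumes every relevant shot appears somewhere in the ranked list.

```python
def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of booleans in rank order, True where the
    retrieved shot is relevant (hypothetical input format).
    """
    hits = 0
    precisions = []
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            # Precision measured at the rank of each relevant shot.
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def mean_average_precision(runs):
    """Mean of the per-query average precisions."""
    return sum(average_precision(r) for r in runs) / len(runs)


# Two made-up queries; True marks a relevant shot.
runs = [
    [True, False, True, False],   # precisions 1/1 and 2/3
    [False, True, False, False],  # precision 1/2
]
print(round(mean_average_precision(runs), 3))  # 0.667
```

Because precision is sampled only at the ranks of relevant shots, a system that places relevant shots earlier scores higher, which is why the measure reflects the ranking.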
Statistical Model for average
• Let p(t) be the probability density of the data distribution over T, and let
p+(t) and p-(t) be the densities of the positive and negative distributions.
Probabilistic Output and combination
Fusion: combine the base metric and the PRF metric.
This reduces the prediction variance and offers more stable results.
Linearly normalize the scores to a fixed interval, e.g. [-1, 1].
As a result, all scores (- and +) are Gaussian distributed, so we
can obtain the probability by applying Bayes' rule.
A parametric sigmoid model can also be used to fit
the posterior directly.
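The normalization, sigmoid, and fusion steps can be sketched as follows. This is an illustration under assumptions: the sigmoid parameters a and b are placeholders that would normally be fitted to labeled data, and the weighted average used for fusion is one simple choice, not necessarily the combination rule used in the original work.

```python
import math


def normalize(scores, low=-1.0, high=1.0):
    """Linearly rescale scores to the interval [low, high]."""
    s_min, s_max = min(scores), max(scores)
    if s_max == s_min:
        # All scores equal: map everything to the middle of the interval.
        return [(low + high) / 2.0 for _ in scores]
    scale = (high - low) / (s_max - s_min)
    return [low + (s - s_min) * scale for s in scores]


def sigmoid_posterior(score, a=-2.0, b=0.0):
    """Parametric sigmoid P(relevant | score) = 1 / (1 + exp(a*score + b)).

    a and b are assumed values for illustration only.
    """
    return 1.0 / (1.0 + math.exp(a * score + b))


def fuse(base_scores, prf_scores, weight=0.5):
    """Weighted average of the base and PRF posteriors (assumed rule)."""
    base = [sigmoid_posterior(s) for s in normalize(base_scores)]
    prf = [sigmoid_posterior(s) for s in normalize(prf_scores)]
    return [weight * p + (1.0 - weight) * q for p, q in zip(base, prf)]


# Made-up scores for three shots, on two different raw scales.
print(fuse([0.1, 0.9, 0.5], [10.0, 30.0, 20.0]))
```

Normalizing first puts both metrics on the same scale, so neither dominates the fused score simply because its raw values are larger.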
Base Similarity Metric
• Algorithm used to generate the base retrieval
• Expressed as follows;
• Can handle multiple examples in arbitrary metric spaces.
• The retrieval algorithm assigns a score to each video frame, while the basic unit is a
video shot (multiple frames). The maximal retrieval score of a frame within a video
shot is chosen as the shot's retrieval score.
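Taking the maximal frame score as the shot score can be sketched as below. The function name, the parallel-list input format, and the shot identifiers are hypothetical.

```python
def shot_scores(frame_scores, frame_to_shot):
    """Collapse per-frame scores into per-shot scores by taking the maximum.

    frame_scores: retrieval score of each frame.
    frame_to_shot: shot identifier of each frame, in the same order.
    """
    best = {}
    for score, shot in zip(frame_scores, frame_to_shot):
        # Keep the highest frame score seen so far for this shot.
        if shot not in best or score > best[shot]:
            best[shot] = score
    return best


# Four made-up frames belonging to two shots.
print(shot_scores([0.2, 0.7, 0.4, 0.1], ["s1", "s1", "s2", "s2"]))
```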
Sampling strategy
• A number of feedback training examples are sampled as the input to a
learning algorithm (positive examples).
• The subset of examples that are dissimilar to the queries is
treated as negative examples.
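The sampling strategy can be sketched as follows: the query examples serve as positives, and the shots ranked farthest from the queries serve as pseudo-negatives. This is a minimal sketch under assumptions: Euclidean distance stands in for the base metric, distance to the nearest query example is used for ranking, and the function names and the n_negative parameter are hypothetical.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def sample_feedback(queries, shots, n_negative):
    """Return (positives, negatives) for training a classifier.

    positives: the query examples themselves.
    negatives: the n_negative shots least similar to the queries.
    """
    def distance_to_queries(x):
        # Distance to the nearest query example (assumed ranking rule).
        return min(euclidean(x, q) for q in queries)

    ranked = sorted(shots, key=distance_to_queries)  # most similar first
    negatives = ranked[-n_negative:] if n_negative > 0 else []
    return list(queries), negatives


# One made-up query and four made-up shot feature vectors.
queries = [(0.0, 0.0)]
shots = [(0.1, 0.0), (5.0, 5.0), (1.0, 1.0), (9.0, 9.0)]
positives, negatives = sample_feedback(queries, shots, n_negative=2)
print(negatives)  # [(5.0, 5.0), (9.0, 9.0)]
```

No user labels are involved: the negatives come purely from the bottom of the initial ranking, which is what makes the feedback "pseudo".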
Classification Algorithm
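• SVMs are known to yield good generalization performance
compared to other classification algorithms. The decision function
has the standard form
f(x) = \sum_i \alpha_i y_i K(x_i, x) + b,
where x_i are the support vectors with labels y_i, \alpha_i their weights,
K the kernel function, and b the bias.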
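Once an SVM has been trained on the feedback examples, shots can be re-ranked by the value of its decision function. The sketch below only evaluates the standard kernel SVM decision function for given parameters. The RBF kernel, the gamma value, and the support vectors, weights, and bias are all made up for illustration and do not come from a trained model.

```python
import math


def rbf_kernel(u, v, gamma=0.5):
    """RBF kernel K(u, v) = exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svm_decision(x, support_vectors, labels, alphas, bias, kernel=rbf_kernel):
    """Evaluate f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(
        a * y * kernel(sv, x)
        for sv, y, a in zip(support_vectors, labels, alphas)
    ) + bias


# Hypothetical model: one positive and one negative support vector.
support_vectors = [(0.0, 0.0), (4.0, 4.0)]
labels = [+1, -1]
alphas = [1.0, 1.0]
bias = 0.0

shots = [(3.5, 3.5), (0.5, 0.5), (2.0, 2.0)]
# Re-rank shots: higher decision value means more likely relevant.
reranked = sorted(
    shots,
    key=lambda x: svm_decision(x, support_vectors, labels, alphas, bias),
    reverse=True,
)
print(reranked)
```

A positive decision value places a shot on the side of the query examples, and a negative value places it on the side of the pseudo-negatives.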
• Negative pseudo-relevance feedback improves information retrieval.
• Using a learning algorithm for classification is very successful.
• The multimedia query examples provide the positive training examples
for machine learning.
• Negative training examples are obtained from the initial simple Euclidean
similarity metric.
• An SVM classifier that learns to weight the discriminating
features improves retrieval performance.
• NPRF shows the ability to separate the Gaussian distributions of
the negative and positive images while reducing their variances.
• Answer 3rd slide.
