I Remember Seeing This Video: Image Driven Search in Video Collections

We present a novel technique for image-driven shot retrieval in video data. Specifically, given a query image, our method efficiently identifies the video segment containing that image. Each video is first divided into shots, and each shot is described by an embedded hidden Markov model (EHMM) trained on GIST-like descriptors of the frames in that shot. The trained EHMM computes the likelihood that a query image belongs to its shot. A Support Vector Machine classifier is trained for each EHMM and turns the likelihood value produced by that EHMM into a yes/no decision. Given a collection of shot models from one or more videos, the proposed technique efficiently decides whether an image belongs to a video by identifying the shot most likely to contain it. The proposed technique is evaluated on a realistic dataset.
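The retrieval flow described above (one likelihood model per shot, a per-shot accept/reject decision, then selection of the most likely shot) can be sketched in a few lines of Python. This is only an illustrative sketch under loud assumptions, not the authors' implementation: a diagonal Gaussian stands in for the EHMM, a likelihood threshold stands in for the per-shot SVM, and short toy vectors stand in for GIST-like descriptors. All names (`ShotModel`, `fit_shot`, `retrieve`, `margin`, `min_var`) are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class ShotModel:
    shot_id: str
    mean: List[float]   # per-dimension mean of the shot's frame descriptors
    var: List[float]    # per-dimension variance (floored for stability)
    threshold: float    # ASSUMPTION: stand-in for the per-shot SVM decision


def log_likelihood(model: ShotModel, descriptor: Sequence[float]) -> float:
    """Log-density of a descriptor under the shot's diagonal Gaussian.

    ASSUMPTION: the Gaussian replaces the EHMM purely for illustration.
    """
    total = 0.0
    for x, m, v in zip(descriptor, model.mean, model.var):
        total += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
    return total


def fit_shot(shot_id: str, frames: Sequence[Sequence[float]],
             margin: float = 5.0, min_var: float = 1e-3) -> ShotModel:
    """Fit one model per shot from its frame descriptors."""
    n, dims = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dims)]
    var = [max(sum((f[d] - mean[d]) ** 2 for f in frames) / n, min_var)
           for d in range(dims)]
    model = ShotModel(shot_id, mean, var, threshold=float("-inf"))
    # Accept anything scoring within `margin` of the worst training frame.
    model.threshold = min(log_likelihood(model, f) for f in frames) - margin
    return model


def retrieve(models: Sequence[ShotModel],
             query: Sequence[float]) -> Optional[str]:
    """Return the id of the most likely shot, or None if that shot rejects."""
    best = max(models, key=lambda m: log_likelihood(m, query))
    if log_likelihood(best, query) >= best.threshold:
        return best.shot_id
    return None


# Toy usage: two shots with well-separated descriptors.
shot_a = fit_shot("a", [[0.10, 0.20], [0.12, 0.18], [0.09, 0.22]])
shot_b = fit_shot("b", [[0.80, 0.90], [0.82, 0.88], [0.79, 0.91]])
models = [shot_a, shot_b]
```

With these toy shots, a query close to the frames of shot `a` is assigned to that shot, while a query far from every shot is rejected and `retrieve` returns `None`, mirroring the yes/no decision that follows the likelihood computation. In the method described above that decision comes from a trained SVM rather than a fixed margin.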