John Adcock - Academia.edu
Papers by John Adcock
The design and implementation of a search engine for lecture webcasts is described. A searchable text index is created, allowing users to locate material within lecture videos found on a variety of websites such as YouTube and Berkeley webcasts. The searchable index is built from the text of presentation slides appearing in the video along with other associated metadata such as the title and abstract when available. The automatic identification of distinct slides within the video stream presents several challenges. For example, picture-in-picture compositing of a speaker and a presentation slide, switching cameras, and slide builds confuse basic algorithms for extracting keyframe slide images. Enhanced algorithms are described that improve slide identification. A public system was deployed to test the algorithms and the utility of the search engine at www.talkminer.com. To date, over 17,000 lecture videos have been indexed from a variety of public sources.
onal shot boundary detection systems are comprised of three components: low-level frame-indexed feature extraction, inter-frame feature similarity comparison, and segmentation by extrema detection. Although TRECVID represents an enormous step towards associating performance variations with specific design choices among these system components, a systematic analysis of performance trade-offs with design parameters remains an elusive goal. A key obstacle is the numerous ad hoc thresholding schemes commonly used to detect cut and gradual shot boundaries from frame-indexed novelty or dissimilarity scores. Qi et al. [1] used supervised classification in lieu of thresholding for the final step in shot boundary detection. In their work, frame-indexed features are input to a kNN classifier to label frames as boundaries or non-boundaries. The behavior of the kNN classifier has been extensively studied, and it provides a principled alternative to thresholding. Additionally, the use of kNN for shot
TalkMiner is a search engine for lecture webcasts. Lecture videos are processed to recover a set of distinct slide images, and OCR is used to generate a list of indexable terms from the slides. On our prototype system, users can search and browse lists of lectures, slides in a specific lecture, and play the lecture video. Over 10,000 lecture videos have been indexed from a variety of sources. A public website now allows users to experiment with the search engine.
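As an illustration of how OCR terms from slides could support this kind of search, here is a minimal sketch of an inverted index. The abstract does not describe TalkMiner's actual index structure, so the data layout and the names `tokenize`, `build_index`, and `search` are all assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(slides):
    """Map each term to the set of (video_id, time_sec) slides containing it.

    `slides` is an iterable of (video_id, time_sec, ocr_text) tuples; this
    tuple layout is hypothetical, not TalkMiner's real schema.
    """
    index = defaultdict(set)
    for video_id, time_sec, ocr_text in slides:
        for term in tokenize(ocr_text):
            index[term].add((video_id, time_sec))
    return index


def search(index, query):
    """Return sorted slide hits whose OCR text contains every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    hits = set.intersection(*(index.get(term, set()) for term in terms))
    return sorted(hits)
```

Each hit pairs a video with the time at which the matching slide appears, which is the information a player would need to jump to that point in the lecture.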
The shot boundary detection system we are using for 2005 builds on the framework and system developed in 2004, which combines pairwise similarity analysis and supervised classification. Using primitive low-level image features, we build secondary features based on inter-frame dissimilarity. These secondary features are used as input to an efficient k-Nearest-Neighbor (kNN) classifier. The classifier labels each frame as a shot boundary or non-boundary, and the classifier outputs are minimally processed to determine the final segmentation. This year we added information-theoretic feature selection to determine two secondary feature subsets to improve cut transition detection and gradual transition detection, respectively. These systems appeared as runs sys10M 0X in the run table. This indeed improved performance over our baseline runs (sys05 0X), and improved on the performance of a similar system using random projection for dimension reduction (sys10R 0X). Our performance was worse th...
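The kNN labeling step can be sketched as follows. This is only an illustration under stated assumptions: the Euclidean distance, the majority vote, the value of `k`, and the run-collapsing post-processing in `segment` are choices made for this example, and the names `knn_label` and `segment` are hypothetical rather than taken from the system described.

```python
import math
from collections import Counter


def knn_label(train_features, train_labels, query, k=3):
    """Label one frame by majority vote among its k nearest training frames."""
    nearest = sorted(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )[:k]
    votes = Counter(train_labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


def segment(frame_labels):
    """Collapse each run of consecutive boundary frames (label 1) to its first index."""
    return [
        i
        for i, label in enumerate(frame_labels)
        if label == 1 and (i == 0 or frame_labels[i - 1] != 1)
    ]
```

With binary labels, an odd `k` avoids tied votes.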
Lecture Notes in Computer Science, 2005
Technology abounds for capturing presentations. However, no simple solution exists that users can "plug in and forget". ProjectorBox is our attempt to create a "zero user interaction" appliance that automatically captures, indexes, and manages presentation multimedia. It operates continuously to record the RGB information sent from presentation devices, such as a presenter's laptop, to display devices such as a projector.
Proceedings of the international conference on Multimedia - MM '10, 2010
For shot boundary detection, our approach combines pairwise similarity analysis and supervised classification. Using primitive low-level image features, we build secondary features based on inter-frame dissimilarity. The secondary features are motivated by prior work on media segmentation in which a kernel function is correlated along the main diagonal of a similarity matrix to construct a frame-indexed novelty measure. In contrast to many previous approaches, the kernel functions combine all pairwise dissimilarity information in a neighborhood of L frames around the current frame. These secondary features are used as input to an efficient k-Nearest-Neighbor (kNN) classifier. The classifier labels each frame as a shot boundary or non-boundary, and the classifier outputs are minimally processed to determine the final segmentation.
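The idea of correlating a kernel along the main diagonal can be sketched in a few lines. The abstract does not specify the kernels or features, so this example assumes a checkerboard kernel, Euclidean distance between per-frame feature vectors, and a window of L frames on each side of the candidate boundary; `dissimilarity_matrix` and `checkerboard_novelty` are hypothetical names.

```python
import math


def dissimilarity_matrix(features):
    """Pairwise Euclidean distance between per-frame feature vectors."""
    n = len(features)
    return [
        [math.dist(features[i], features[j]) for j in range(n)]
        for i in range(n)
    ]


def checkerboard_novelty(D, L):
    """Correlate a 2L x 2L checkerboard kernel along the main diagonal of D.

    For a candidate boundary before frame t, frame pairs on opposite sides
    of t get weight +1 and pairs on the same side get weight -1. Positions
    whose window would run past either end of the video keep a score of 0.
    """
    n = len(D)
    novelty = [0.0] * n
    for t in range(L, n - L + 1):
        score = 0.0
        for i in range(t - L, t + L):
            for j in range(t - L, t + L):
                same_side = (i < t) == (j < t)
                score += -D[i][j] if same_side else D[i][j]
        novelty[t] = score / (2 * L) ** 2
    return novelty
```

The score is high where frames differ across the candidate boundary but are alike on each side of it, which is the pattern an abrupt cut produces. Scores like these, computed with different kernels or window sizes, are the kind of secondary feature that could then be passed to a classifier.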