Mark Cummins | University of Oxford

Mark Cummins

I completed my DPhil thesis in the Mobile Robotics Group at Oxford, working on place recognition and appearance-based navigation. My advisor was Paul Newman, and the algorithm we developed is called FAB-MAP.

Since finishing, I co-founded a start-up company called Plink together with James Philbin; the company developed visual search engine technology. Our PlinkArt app allowed users to identify a painting just by taking a picture of it with their mobile phone. Plink was acquired by Google in April 2010. I now work on the Google Goggles team, developing Google's computer vision systems.
Supervisors: Paul Newman

Papers by Mark Cummins

PhotoOCR: Reading Text in Uncontrolled Conditions

2013 IEEE International Conference on Computer Vision, 2013

Accelerating FAB-MAP With Concentration Inequalities

IEEE Transactions on Robotics, 2010

Appearance-only SLAM at large scale with FAB-MAP 2.0

The International Journal of Robotics Research, 2010

We describe a new formulation of appearance-only SLAM suitable for very large scale place recognition. The system navigates in the space of appearance, assigning each new observation to either a new or a previously visited location, without reference to metric position. The system is demonstrated performing reliable online appearance mapping and loop-closure detection over a 1000 km trajectory, with mean filter update times of 14 ms. The scalability of the system is achieved by defining a sparse approximation to the FAB-MAP model suitable for implementation using an inverted index. Our formulation of the problem is fully probabilistic and naturally incorporates robustness against perceptual aliasing. We also demonstrate that the approach substantially outperforms the standard term-frequency inverse-document-frequency (tf-idf) ranking measure. The 1000 km data set comprising almost a terabyte of omni-directional and stereo imagery is available for use, and we hope that it will serve ...

Optical Character Recognition by Iterative Re-Segmentation of Text Images Using High-Level Cues

Sound Source Separation with Microphone Arrays

Toward Life Long Mapping

Method and apparatus for enabling virtual tags

Proceedings of the 13th International Symposium on Robotics Research (ISRR)

Robotics Research: The 13th International Symposium ISRR, Nov 8, 2010

PhotoOCR: Reading Text in Uncontrolled Conditions

We describe PhotoOCR, a system for text extraction from images. Our particular focus is reliable text extraction from smartphone imagery, with the goal of text recognition as a user input modality similar to speech recognition. Commercially available OCR performs poorly on this task. Recent progress in machine learning has substantially improved isolated character classification; we build on this progress by demonstrating a complete OCR system using these techniques. We also incorporate modern datacenter-scale distributed language modelling. Our approach is capable of recognizing text in a variety of challenging imaging conditions where traditional OCR systems fail, notably in the presence of substantial blur, low resolution, low contrast, high image noise and other distortions. It also operates with low latency; mean processing time is 600 ms per image. We evaluate our system on public benchmark datasets for text extraction and outperform all previously reported results, more than halving the error rate on multiple benchmarks. The system is currently in use in many applications at Google, and is available as a user input modality in Google Translate for Android.

Fast Approximate Place Recognition and Mapping for Mobile Robots

An Approach to Spatio-Temporally Consistent Scene Classification in Urban Workspaces: An Overview

Probabilistic Multispectral Place Recognition for Mobile Robotics

A Comparison of Loop Closing Techniques in Monocular SLAM

Highly Scalable Appearance-Only SLAM FAB-MAP 2.0

IAC-04-P.P.05 DIGITAL ORRERY A GRAPHICAL SOLAR SYSTEM SIMULATION AS AN EDUCATIONAL TOOL

FAB-MAP: Appearance-Based Place Recognition and Mapping using a Learned Visual Vocabulary Model

Detecting Salient Places for SLAM in Urban Environments using Multispectral Data

Real-time SLAM in large-scale environments is a computationally demanding task. This paper discusses an approach to dealing with storage and computation constraints based on selective evidence gathering at significant areas in the environment. Local measures of novelty, based on the input from multiple sensors, are used to detect salient locations. This can then be used to trigger the collection of reliable descriptors of the location.

Digital Orrery

Describing, Navigating and Recognising Urban Spaces-Building An End-to-End SLAM System

Robotics Research, Jan 1, 2011

RSLAM: A system for large-scale mapping in constant-time using stereo

International journal of …, Jan 1, 2011

SLAM Summer School 2009 - Appearance-Based Methods for Loop Closing

TEDx Dublin 2010 - Teaching Machines to See
