PATSEEK: Content Based Image Retrieval System for Patent Database
Related papers
Retrieval System for Patent Images
Procedia Technology, 2013
Patent information and images play important roles in describing the novelty of an invention. However, current patent collections do not support image retrieval, and patent images have become almost unsearchable. This paper presents a short review of existing research and of the challenges in the patent image retrieval domain. The review identifies image feature extraction as an important step for successfully matching query and database images. To improve the feature extraction step in patent image retrieval, we propose an approach based on the Affine-SIFT technique. The existing feature extraction techniques are compared to assess the potential of the proposed approach.
Towards content-based patent image retrieval: A framework perspective
World Patent Information, 2010
In this article, we discuss the potential benefits, requirements and challenges of patent image retrieval, and we then propose a framework that encompasses advanced image analysis and indexing techniques to address the need for content-based patent image search and retrieval. The proposed framework applies document image pre-processing and extracts image features and textual metadata in order to effectively support content-based image retrieval in the patent domain. To evaluate the capabilities of our proposal, we implemented a patent image search engine. Results based on a series of interaction modes, a comparison with existing systems and a quantitative evaluation of our engine provide evidence that image processing and indexing technologies are now sufficiently mature to be integrated into real-world patent retrieval applications.
Image search in patents: a review
International Journal on Document Analysis and Recognition (IJDAR), 2012
To verify the originality of an invention in a patent, the graphical description available in the form of patent drawings often plays a critical role. This paper introduces the importance, requirements and challenges of a patent image retrieval system. We present a brief account of the work done in the specific and related areas of the patent image domain. We begin with a review of work done dealing specifically with retrieval and analysis of images in the patent domain. Although the literature found dealing with patent images is small, there is a significant amount of work that has been done in related areas that is useful and applicable to the patent image area. From a methodological point of view, we present an overview of the algorithms developed for the retrieval and analysis of CAD and technical drawings, diagrams, data flow diagrams, circuit diagrams, data charts, flow charts, plots and symbol recognition.
Relational skeletons for retrieval in patent drawings
Image Processing, 2001. …, 2001
This paper presents the evaluation of a number of alternative algorithms for content-based retrieval from a database of technical drawings representing patents. The objective is to help patent evaluators in their search for an existing patent that bears too much similarity to the one under investigation. To achieve this, we have devised a system in which images (drawings) are represented either by attributed graphs based on the extracted line patterns or by histograms of attributes computed from those graphs. Retrieval is performed using either histogram comparison or a graph similarity measure. Promising results are presented along with possible extensions of the work.
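The histogram-comparison route mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say which comparison measure or which attributes the authors used, so histogram intersection, the bin layout, and the names `histogram_intersection` and `retrieve` are all assumptions chosen for illustration.

```python
def normalize(hist):
    """Scale a histogram so its bins sum to 1 (left unchanged if it is empty)."""
    total = sum(hist)
    return [h / total for h in hist] if total else list(hist)


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]: the overlap between two normalized histograms."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    a, b = normalize(h1), normalize(h2)
    return sum(min(x, y) for x, y in zip(a, b))


def retrieve(query_hist, database, top_k=3):
    """Return the top_k (score, name) pairs, most similar drawing first.

    `database` maps a drawing name to its attribute histogram
    (hypothetical data; the real attributes come from the graphs).
    """
    scored = [(histogram_intersection(query_hist, h), name)
              for name, h in database.items()]
    return sorted(scored, reverse=True)[:top_k]


if __name__ == "__main__":
    # Toy 3-bin histograms, e.g. counts of line attributes per bin (assumed).
    db = {"drawing_a": [4, 0, 0], "drawing_b": [2, 2, 0], "drawing_c": [0, 0, 4]}
    for score, name in retrieve([3, 0, 1], db):
        print(f"{name}: {score:.2f}")
```

Normalizing before comparison keeps a drawing with many line segments from dominating one with few, which is one reason an intersection-style measure is a common starting point.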
The aim of this document is to describe the methods we used in the Patent Image Classification and Image-based Patent Retrieval tasks of the CLEF-IP 2011 track. The patent image classification task consisted of categorizing patent images into pre-defined categories such as abstract drawing, graph, flowchart, table, etc. Our main aim in participating in this sub-task was to test how our image categorizer performs on this type of categorization problem. We therefore used SIFT-like local orientation histograms as low-level features, and on top of them we built a visual vocabulary specific to patent images using a Gaussian mixture model (GMM). This allowed us to represent images with Fisher Vectors and to train linear one-versus-all classifiers. As the results show, we obtain very good classification performance. Concerning the Image-based Patent Retrieval task, we kept the same image representation as for the Image Classification task and used dot product as sim...
International Journal of Computer Applications
For many years, search engines have been used for image retrieval. These search engines use shape-, content-, text- and caption-based approaches to retrieve relevant images from a web repository. Such a repository contains billions of 2D and 3D images as well as relevant information about those images. With the shape-based approach, the user has to supply the dimensions of the image in question to get a relevant response. This paper describes the need for an efficient search engine that retrieves information about an image when the user uploads the image or submits it as a query. Such an engine can prove very helpful for a novice user who is searching for information about an unknown or unfamiliar logo or image.
Implementation and Comparison of Text-Based Image Retrieval Schemes
International Journal of Advanced Computer Science and Applications
Search engines such as Google and Yahoo provide various libraries and APIs that give programmers and researchers easier and more efficient access to their collected data. When a user issues a search query, the dedicated Application Programming Interface (API) returns a JavaScript Object Notation (JSON) file that contains the desired data. Scraping techniques help image descriptors to separate the image's URL and the web host's URL into different documents, which makes it easier to implement different algorithms. The aim of this paper is to propose a novel approach to effectively filter out the desired image(s) from the retrieved data. More specifically, this work primarily focuses on applying simple yet efficient techniques to achieve accurate image retrieval. We compare two algorithms, i.e., Cosine similarity and Sequence Matcher, to determine which achieves higher accuracy with the fewest irrelevant results. The results show Cosine similarity to be more accurate than its counterpart in finding the most relevant image(s).
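The two measures compared in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not describe the paper's tokenization or weighting, so whitespace tokens with raw counts are assumed for cosine similarity, Python's standard `difflib.SequenceMatcher` is assumed to stand in for "Sequence Matcher", and the function names and sample captions are hypothetical.

```python
import math
from collections import Counter
from difflib import SequenceMatcher


def cosine_similarity(a: str, b: str) -> float:
    """Cosine of the angle between the word-count vectors of two strings."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sequence_similarity(a: str, b: str) -> float:
    """Character-level similarity ratio from difflib.SequenceMatcher."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank(query, captions, score):
    """Order image captions from most to least similar to the query."""
    return sorted(captions, key=lambda c: score(query, c), reverse=True)


if __name__ == "__main__":
    captions = ["blue bicycle", "red car", "green tree"]  # hypothetical captions
    print(rank("red sports car", captions, cosine_similarity))
    print(rank("red sports car", captions, sequence_similarity))
```

One visible difference between the two: the bag-of-words cosine ignores word order, whereas the sequence-based ratio is sensitive to it, so reordered captions score differently under each measure.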
Information retrieval in document image databases
IEEE Transactions on Knowledge and Data Engineering, 2004
With the rising popularity and importance of document images as an information source, information retrieval in document image databases has become a growing and challenging problem. In this paper, we propose an approach capable of matching partial word images to address two issues in document image retrieval: word spotting and similarity measurement between documents. First, each word image is represented by a primitive string. Then, an inexact string matching technique is used to measure the similarity between the two primitive strings generated from two word images. Based on this similarity, we can estimate how relevant one word image is to the other and thereby decide whether one is a portion of the other. To deal with various character fonts, we represent a word image with a primitive string that is tolerant to serif and font differences. Using this technique of inexact string matching, our method is able to successfully handle the problem of heavily touching characters. Experimental results on a variety of document image databases confirm the feasibility, validity, and efficiency of our proposed approach in document image retrieval.
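The idea of deciding whether one word is a portion of another through inexact string matching can be sketched as follows. The abstract gives neither the primitive alphabet nor the cost function, so this sketch assumes plain characters in place of primitives and unit edit costs; `partial_match_cost` and `partial_similarity` are hypothetical names, and the algorithm shown is the standard approximate-substring edit distance rather than the authors' own method.

```python
def partial_match_cost(pattern: str, text: str) -> int:
    """Minimum edit distance between `pattern` and any substring of `text`.

    Unit costs for substitution, insertion and deletion (an assumption).
    """
    # prev[j] is the best cost of aligning the pattern prefix seen so far
    # with some substring of text that ends at position j. The empty
    # pattern matches anywhere for free, so the first row is all zeros.
    prev = [0] * (len(text) + 1)
    for i, p in enumerate(pattern, start=1):
        curr = [i]  # pattern prefix of length i against empty text
        for j, t in enumerate(text, start=1):
            curr.append(min(
                prev[j - 1] + (p != t),  # match or substitute
                prev[j] + 1,             # drop a pattern symbol
                curr[j - 1] + 1,         # skip a text symbol
            ))
        prev = curr
    return min(prev)  # the match may end anywhere in the text


def partial_similarity(pattern: str, text: str) -> float:
    """Score in [0, 1]; 1.0 means `pattern` occurs exactly inside `text`."""
    if not pattern:
        return 1.0
    return 1.0 - partial_match_cost(pattern, text) / len(pattern)


if __name__ == "__main__":
    print(partial_similarity("health", "unhealthy"))
    print(partial_similarity("helth", "unhealthy"))
```

Because the first row is zero and the minimum is taken over the last row, the pattern is free to start and end anywhere in the longer string, which is what allows a partial word to be scored against a whole word.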