Content-Based Information Retrieval (CBIR) Research Papers

2025

The World Wide Web is growing rapidly, and data today is stored in a distributed manner, so there is a need for a search-engine architecture that lets people search the Web. Broad web search engines as well as many more specialized search tools rely on web crawlers to acquire large collections of pages for indexing and analysis. The crawler is an important module of a web search engine, and its quality directly affects the searching quality of the engine. Such a web crawler may interact with millions of hosts over a period of weeks or months, so issues of robustness, flexibility, and manageability are of major importance. Given some URLs, the crawler should retrieve the web pages of those URLs, parse the HTML files, add new URLs into its queue, and go back to the first phase of this cycle. The crawler can also retrieve other information from the HTML files as it parses them for new URLs. In this paper, we describe the design of a web crawler that uses the PageRank algorithm for distributed searches and can be run on a network of workstations. The crawler first identifies stop words (such as "a", "an", "the", and "and"); while searching web pages for a keyword, it removes all collected stop words. At the same time, the crawler collects snippets from web documents. All matching words and collected snippets are stored in a temporary cache at the crawlers' central server, where the PageRank algorithm is applied on the basis of the number of visits to each web page, the pages are arranged according to their ranks, and the results are displayed. Since extensive crawling of the web increases the chances of virus attacks and may halt the system's processing capacity, we provide a backup for the system by creating web services. The web service is designed so that any valid update to any database server automatically updates the backup servers; therefore, even if a server fails, the crawling process can continue.
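
The abstract does not spell out the ranking computation; as a point of reference, a minimal power-iteration PageRank over a toy link graph (the graph and the damping factor 0.85 are assumptions, not the paper's settings) could look like this in Python:

    # Minimal PageRank power iteration over an adjacency list.
    # The toy graph and damping factor are illustrative only; the paper
    # additionally weighs pages by visit counts, which is not modeled here.
    def pagerank(links, d=0.85, iters=50):
        pages = list(links)
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}
        for _ in range(iters):
            new = {p: (1.0 - d) / n for p in pages}
            for p, outs in links.items():
                targets = outs if outs else pages   # dangling page: spread evenly
                for q in targets:
                    new[q] += d * rank[p] / len(targets)
            rank = new
        return rank

    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    print(sorted(pagerank(graph).items(), key=lambda kv: -kv[1]))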

2025, Thesis

ROSA, N. A. Insertion of the knowledge of the specialist in the process of relevance feedback in content-based image retrieval: a feasibility study in mammography. 2007. 166 p. Doctoral Thesis - Faculdade de Medicina de Ribeirão Preto,...

2025

There are so many trademarks in existence that a good automated retrieval system is required to help protect them from possible infringements. However, it is from people's, i.e., the general consumers', viewpoint that the similarity or confusability of two trademarks is judged. Thus, in this paper we propose a hybrid system in which we explicitly incorporate human input into a computerized trademark retrieval scheme. Various surveys involving general consumers' cognition and responses are conducted, and the results are used as benchmarks in developing the automated part. The core mathematical features used in the scheme are four-gray-level Zernike moments and two new image compactness indices. Experimental results show that this hybrid system, when compared with human-generated results, is able to achieve an average accuracy rate of 95%, while that of the closest competing existing method is 65%.

2025

Region-based image retrieval has become a hot research topic. Color, texture, or shape feature extraction alone cannot give high precision and recall. To achieve both, this paper proposes a new content-based image retrieval method that uses a combination of color and texture features. The regions of interest (ROIs) are roughly identified by segmenting the image into fixed partitions, and the color and texture features of the ROIs are computed. Color moments are calculated to extract the color feature, after the image is converted from the RGB to the HSV color space. The Discrete Wavelet Transform is performed to extract the texture feature. A combined color-and-texture feature vector is computed for each ROI, and the Euclidean distance measure is used to compute the distance between the features of the query and target images. Preliminary experimental results show that the proposed method provides better retrieval results than some existing methods.
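
A minimal sketch of the color-moment and distance steps, assuming the RGB-to-HSV conversion has already been done (e.g. with skimage.color.rgb2hsv) and that each ROI arrives as an (H, W, 3) float array:

    import numpy as np

    def color_moments(hsv):
        # First three moments (mean, standard deviation, skewness) per
        # HSV channel, giving a 9-value color feature for one ROI.
        feats = []
        for c in range(3):
            ch = hsv[:, :, c].ravel()
            mean = ch.mean()
            std = ch.std()
            skew = np.cbrt(((ch - mean) ** 3).mean())
            feats += [mean, std, skew]
        return np.array(feats)

    def distance(query_vec, target_vec):
        # Euclidean distance between combined color+texture vectors.
        return np.linalg.norm(query_vec - target_vec)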

2025, AkiNik Publications

The exponential growth of multimedia data in diverse fields, including entertainment, education, healthcare, and surveillance, necessitates efficient systems for storing, indexing, and retrieving complex multimedia content. This chapter explores the foundational data structures and algorithms critical for managing multimedia databases effectively. It discusses spatial, temporal, and semantic challenges, emphasizing advanced indexing mechanisms like R-trees, quadtrees, and graph-based representations for scalable and accurate data management. Techniques for content-based retrieval, real-time processing, and adaptive compression are examined in detail, showcasing their applications in content recommendation systems, digital libraries, AR/VR platforms, and surveillance. The integration of artificial intelligence and distributed architectures emerges as a pivotal future direction, enabling semantic understanding, real-time analytics, and enhanced scalability. The chapter also highlights challenges, including the semantic gap, computational demands, and bias in AI systems, while proposing innovative solutions such as neural semantic search, edge computing, and immersive multimedia experiences. By bridging theoretical advancements and practical applications, this chapter provides a comprehensive framework for developing intelligent multimedia database systems that are efficient, adaptive, and future-ready.

2025, Pattern Recognition Letters

In this paper, the automatic medical annotation task of the 2007 CLEF cross-language image retrieval campaign (ImageCLEF) is described. The paper focuses on the images used, the task setup, and the results obtained in the evaluation campaign. The medical automatic image annotation task has existed in ImageCLEF since 2005, with increasing complexity, to evaluate the performance of state-of-the-art methods for completely automatic annotation of medical images based on visual properties. The paper also describes the evolution of the task from its origin in 2005 to 2007. The 2007 task, comprising 11,000 fully annotated training images and 1000 test images to be annotated, is a realistic task with a large number of possible classes at different levels of detail. A detailed analysis of the methods across participating groups is presented with respect to the (i) image representation, (ii) classification method, and (iii) use of the class hierarchy. The results show that methods which build on local image descriptors and discriminative models are able to provide good predictions of the image classes, mostly using techniques that were originally developed in the machine learning and computer vision domains for object recognition in non-medical images.

2024

The World Wide Web (WWW) has grown from a few thousand pages in 1993 to more than eight billion pages at present. Due to this explosion in size, web search engines are becoming increasingly important as the primary means of locating relevant information. This research aims to build a crawler that crawls the most important web pages; a crawling system has been built which consists of three main techniques. The first is a Best-First technique, which is used to select the most important pages. The second is a distributed crawling technique based on UbiCrawler, which is used to distribute the URLs of the selected web pages to several machines. The third is a duplicated-pages detection technique that uses a proposed document fingerprint algorithm.
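
The proposed document fingerprint algorithm is not given in the abstract; a generic shingle-hash sketch of duplicate-page detection, with all parameters illustrative, might be:

    import hashlib

    def fingerprint(text, k=5, keep=8):
        # Hash every k-word shingle and keep the smallest `keep` hashes;
        # near-duplicate pages end up sharing most of them.
        words = text.lower().split()
        shingles = {" ".join(words[i:i + k]) for i in range(max(len(words) - k + 1, 1))}
        hashes = sorted(int(hashlib.md5(s.encode()).hexdigest(), 16) for s in shingles)
        return set(hashes[:keep])

    def is_duplicate(fp_a, fp_b, threshold=0.75):
        # Jaccard overlap of the two fingerprints.
        return len(fp_a & fp_b) / max(len(fp_a | fp_b), 1) >= threshold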

2024

In academia and the professions, original thought and authenticity form the bedrock. With the rise of plagiarism detection, intellectual property is now better protected, but traditional plagiarism detectors face the challenge of detecting paraphrased, translated, or contextually altered content. This paper describes a proposed system in which NLP, deep learning techniques, and advanced linguistic analysis are applied in order to enhance the accuracy and efficiency of plagiarism detection. The proposed system integrates context-aware algorithms with semantic similarity assessment to overcome the limitations of traditional methods, which could strengthen the integrity of educational institutions and the authenticity of published works.

2024

With the rapid development of multimedia technology, traditional keyword-based information retrieval techniques are no longer sufficient, and content-based image retrieval (CBIR) has become an active research topic. A new content-based image retrieval method using correlation, median filtering, and edge extraction is proposed. To counter the tendency of the median filter to reduce image resolution, especially at edges, edge detection is applied to obtain the edge values, and the values at the edge positions of the median-filtered image are then replaced with the detected edge values. The aim is noise reduction with preservation of edge detail. After feature extraction, histogram equalization is applied for feature vector calculation and similarity measurement. The CBIR results show that the proposed method retrieves more relevant images from the image database than existing methods and has good image retrieval ability.
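
A sketch of the filtering idea, assuming a grayscale numpy image and SciPy; the paper's exact operators and thresholds are not specified, so the Sobel edge detector and the cutoff below are stand-ins:

    import numpy as np
    from scipy import ndimage

    def median_with_edges(gray):
        # Median-filter the image, then paste the original values back
        # at detected edge positions so edge detail survives smoothing.
        smooth = ndimage.median_filter(gray, size=3)
        gx = ndimage.sobel(gray.astype(float), axis=0)
        gy = ndimage.sobel(gray.astype(float), axis=1)
        magnitude = np.hypot(gx, gy)
        edges = magnitude > magnitude.mean() + 2 * magnitude.std()  # ad hoc threshold
        out = smooth.copy()
        out[edges] = gray[edges]
        return out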

2024

Applying semantic web technologies to knowledge sharing

2024, arXiv (Cornell University)

When humans describe images they tend to use combinations of nouns and adjectives, corresponding to objects and their associated attributes respectively. To generate such a description automatically, one needs to model objects, attributes and their associations. Conventional methods require strong annotation of object and attribute locations, making them less scalable. In this paper, we model object-attribute associations from weakly labelled images, such as those widely available on media sharing sites (e.g. Flickr), where only image-level labels (either objects or attributes) are given, without their locations and associations. This is achieved by introducing a novel weakly supervised non-parametric Bayesian model. Once learned, given a new image, our model can describe the image, including objects, attributes and their associations, as well as their locations and segmentation. Extensive experiments on benchmark datasets demonstrate that our weakly supervised model performs on par with strongly supervised models on tasks such as image description and retrieval based on object-attribute associations.

2024

The workshop "Mining Scientific Papers: Computational Linguistics and Bibliometrics" (CLBib 2015), co-located with the 15th International Society of Scientometrics and Informetrics Conference (ISSI 2015), brought together researchers in Bibliometrics and Computational Linguistics in order to study the ways Bibliometrics can benefit from large-scale text analytics and sense mining of scientific papers, thus exploring the interdisciplinarity of Bibliometrics and Natural Language Processing (NLP). The goals of the workshop were to answer questions like: How can we enhance author network analysis and Bibliometrics using data obtained by text analytics? What insights can NLP provide on the structure of scientific writing, on citation networks, and on in-text citation analysis? This workshop is a first step to foster reflection on this interdisciplinarity and the benefits that the two disciplines, Bibliometrics and Natural Language Processing, can derive from it.

2024

This paper presents a search engine for musical files on the Internet, based on the description of a musical fragment. The search engine allows that description to be pitch independent. Furthermore, the system admits some errors in the musical fragment description, according to a parameter available in the interface. This work presents a system for searching musical files over the Internet based on the description of musical excerpts. The search is performed independently of the key in which the excerpt is described by the user; at the same time, the scheme also tolerates a certain number of errors in the description of the musical excerpt, according to a parameter controllable from the interface.

2024

Today, the number of registered trademarks is huge and increasing rapidly, so identifying trademark infringement solely by manual inspection is tiring, laborious, and time consuming. To cope with the tremendous number of registered trademarks and to protect against infringement, a new automatic and efficient trademark retrieval system is necessary and urgent. This paper presents an efficient content-based trademark retrieval method based on the synergy between a color-spatial technique and the Zernike moment method. Zernike moments serve as an effective descriptor of the global shape of a trademark, while the color-spatial feature captures the spatial distribution of color in the trademark image. To retrieve visually similar trademarks, we use a weighted Euclidean distance measure. Experimental evaluations conducted on a database containing 1000 trademark images show that the proposed methodology is very effective in retrieving visually similar trademarks.
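
The weighted Euclidean measure reduces to a few lines; the 0.6/0.4 split between the color-spatial and Zernike parts below is an assumption for illustration, not the paper's tuned weighting:

    import numpy as np

    def weighted_euclidean(q, t, w):
        # q, t: concatenated color-spatial + Zernike feature vectors;
        # w: per-component weights balancing the two feature families.
        q, t, w = (np.asarray(x, dtype=float) for x in (q, t, w))
        return float(np.sqrt(np.sum(w * (q - t) ** 2)))

    # e.g. 10 color-spatial components weighted 0.6, 12 Zernike weighted 0.4
    w = np.array([0.6] * 10 + [0.4] * 12)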

2024, International Journal for Scientific Research & Development

Web data extraction is an important part of many Web data analysis applications. This paper formulates data extraction tasks such as image retrieval using advanced techniques. I propose an unsupervised, page-level data extraction approach to deduce the schema and templates for each individual Deep Website that contains either singleton or multiple data records in one webpage. FiVaTech applies tree matching, tree alignment, and other advanced techniques to achieve this challenging task. In experiments, FiVaTech has much higher precision than EXALG and is comparable with record-level extraction systems like ViPER and MSE. The experiments show encouraging results for the test pages used in many state-of-the-art Web data extraction works. The term CBIR has been widely used to describe the process of retrieving desired images from a large collection on the basis of features that can be automatically extracted from the images themselves. The features used for retrieval can be either primitive or semantic, but the extraction process must be predominantly automatic. Retrieval of images by manually assigned keywords is definitely not CBIR as the term is generally understood, even if the keywords describe image content.

2024, International Journal of Printing, Packaging & Allied Sciences

Content-based image retrieval (CBIR) applications grant great convenience, helping people find what they want among massive numbers of images. Images are retrieved from a huge image database using features comprising shape, color, and texture. Color is the most prominent among the visual features of an image. The Color Coherence Vector (CCV) is related to the color histogram method, still preserves some spatial information, and has proven more competent. An improved CCV method is presented in this paper with richer spatial information and without much added computing work. Multi-resolution Gabor filters and the gray-level co-occurrence matrix are used for extracting texture features, and the Fast Fourier Descriptor method is used for extracting shape features. Combining these three features yields fused features, which are then used in image retrieval; the best retrieval performance is shown by this multi-feature fusion retrieval method.
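
A common way to fuse such heterogeneous features is min-max normalization followed by concatenation; the sketch below assumes that recipe, since the abstract does not state the exact fusion rule:

    import numpy as np

    def fuse(color_vec, texture_vec, shape_vec):
        # Normalize each feature family to [0, 1], then concatenate,
        # so no single family dominates the distance computation.
        def norm(v):
            v = np.asarray(v, dtype=float)
            span = v.max() - v.min()
            return (v - v.min()) / span if span else np.zeros_like(v)
        return np.concatenate([norm(color_vec), norm(texture_vec), norm(shape_vec)])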

2024

DOUDKIN A. A., VORONOV A. A., MARUSHKO Y. E. Merging algorithm for VLSI layout frames by key points. An algorithm is considered for stitching frames of a VLSI layout layer to form a complete, distortion-free image of the layer. The frames were obtained by photographing a technological layer of the chip with a high-magnification microscope. The results are applicable to digital image processing and image analysis methods.

2024

Searching for a test image in image databases using features extracted from the content is currently an active research area. In this work, we present novel feature extraction approaches for content-based image retrieval when the query image is a color image. To facilitate robust man-machine interfaces, we accept query images with color attributes. Special attention is given to the similarity measure and the properties of different distance metrics, since the distance is measured between the test image and each object image from the database. Several applicable techniques from the literature are studied for these conditions. The goal of this paper is to present the user with a subset of images that are most similar to the object image. One of the most important aspects of the proposed methods is the accuracy measurement of the query image against different database images. The method significantly improves the feature extraction process and enables it to be used for other computer vision applications.

2024

Masking the writing style of an author has been used by novelists for the purpose of passing unnoticed, as well as by people who aim to share information without being linked to it. Within the PAN evaluation framework, the task of paraphrasing or changing the writing style of a document while maintaining the topic under discussion is presented. We propose a method that performs transformations on sentences with an unsupervised approach, i.e., without prior data about the author or the linguistic characteristics of a document collection. We make syntactic and semantic changes using dictionaries and semantic resources, as well as syntactic rules for sentence simplification. In the evaluation section, we discuss the observed strengths and weaknesses of the proposal.

2024

Using aggregation functions on structured data: a use case in the FIGHT-HF project (p. 2)
Bernard De Baets and Raúl Pérez-Fernández, Ranking rules characterized by means of monometrics and consensus states (p. 30)
József Dombi and O. Csiszár, Self De Morgan identity in nilpotent systems and a unified description of nilpotent operators (p. 33)
Paweł Drygaś, Properties of uninorms and its generalization (p. 36)

2024, Taxonomy of Mathematical Plagiarism

Plagiarism is a pressing concern, even more so with the availability of large language models. Existing plagiarism detection systems reliably find copied and moderately reworded text but fail for idea plagiarism, especially in the mathematical sciences, which heavily use formal mathematical notation. We make two contributions. First, we establish a taxonomy of mathematical content reuse by annotating 122 potentially plagiarised scientific document pairs. Second, we analyze the best-performing approaches to detect plagiarism and mathematical content similarity on the newly established taxonomy. We found that the best-performing methods for plagiarism and math content similarity achieve an overall detection score (PlagDet) of 0.06 and 0.16, respectively. The best-performing methods failed to detect most cases from all seven newly established math similarity types. The outlined contributions will benefit research in plagiarism detection systems, recommender systems, question-answering systems, and search engines. We make our experiment's code and annotated dataset available to the community: https://github.com/gipplab/Taxonomy-of-Mathematical-Plagiarism.

2024, RADIOELECTRONIC AND COMPUTER SYSTEMS

Noise-immune algorithms for estimating the parameters of moving objects from video data are presented. By modeling in Matlab, the noise immunity of the correlation-extremal algorithms needed to estimate the parameters of moving objects from video data has been investigated. Stability thresholds of the procedures for measuring range, angular, and velocity parameters under Gaussian, impulse, and multiplicative noise are estimated. Recommendations are provided on applying filtering procedures to raise the threshold of stable operation. The results of this research should be taken into account when designing technical vision systems for various purposes.

2024, CLEF (Working Notes)

Recent research has suggested that there is no general similarity measure which can be applied parameter-free to arbitrary databases. Instead, the optimal combination of similarity measures and parameters must be identified for each new image repository. This optimization loop is time consuming and depends on the experience of the designer as well as the knowledge of the medical expert. It would be useful if results that have been obtained for one dataset could be transferred to another image repository without extensive redesign of all relevant components. Transfer of data corpora is vital if image retrieval is integrated into complex environments such as picture archiving and communication systems (PACS). Image retrieval in medical applications (IRMA) is a framework that strictly separates data administration and application logic. This permits an efficient transfer of the data abstraction from one database to another without redesigning the software. It supports the loop of estimating a combination of distance measures, parameter adaptation, and result visualization, which is characteristic when an image retrieval application is used for varying data corpora. In this work the casimage dataset has been added as a data corpus to the IRMA system. The query performance has then been evaluated without optimization of the currently applied feature combination, which consists of scaled representations of the images, global texture, aspect ratio, and an evaluation of deformation between pixels of different images.

2024, CLEF (Working Notes)

A combination of several classifiers using global features for the content description of medical images is proposed. Besides two texture features, downscaled representations of the original images are used, which preserve spatial information and utilize distance measures that are robust to common variations in radiation dose, translation, and local deformation. No query refinement mechanisms are used. The single classifiers are used within a parallel combination scheme, with the optimization set being used to obtain the best weighting parameters. For the medical automatic annotation task, a categorization rate of 78.6% is obtained, which ranks 12th among 28 submissions. When applied in the medical retrieval task, this combination of classifiers yields a mean average precision (MAP) of 0.0172, which is rank 11 of 11 submitted runs for automatic, visual-only systems.

2024, CLEF (Working Notes)

A combination of several classifiers using global features for the content description of medical images is proposed. Besides well-known texture histogram features, downscaled representations of the original images are used, which preserve spatial information and utilize distance measures that are robust to common variations in radiation dose, translation, and local deformation. These features were evaluated for the annotation task and the interactive query task in ImageCLEF 2005 without using additional textual information or query refinement mechanisms. For the annotation task, a categorization rate of 86.7% was obtained, which ranks second among all submissions. When applied in the interactive query task, the image content descriptors yielded a mean average precision (MAP) of 0.0751, which is rank 14 of 28 submitted runs. As the image deformation model is not fit for interactive retrieval tasks, two mechanisms are evaluated regarding the trade-off between loss of accuracy and speed increase: hierarchical filtering and prototype selection.

2024

Recent research has suggested that there is no general similarity measure which can be applied to arbitrary databases without any parameterization. Hence, the optimal combination of similarity measures and parameters must be identified for each new image repository. This optimization loop is time consuming and depends on the experience of the designer as well as the knowledge of the medical expert. It would be useful if results that have been obtained for one data set could be transferred to another without extensive re-design. This transfer is vital if content-based image retrieval is integrated into complex environments such as picture archiving and communication systems. The image retrieval in medical applications (IRMA) project defines a framework that strictly separates data administration and application logic. This permits an efficient transfer of the data abstraction of one database to another without re-designing the software. In the ImageCLEF competition, the quer...

2024

Plagiarism is one of the major concerns in academics, literature, and other fields where it is necessary to check whether an idea is original. Plagiarism, simply put, is the act of copying someone's work and portraying it as your own. It is ethically incorrect and is considered a crime. Many tools for finding plagiarism are available, which can be downloaded or used directly online. These tools check similarity at the lexical and sentence level only; hence, they only make a statistical comparison of whether a sentence is plagiarised, not whether the idea is plagiarised. This project deals with detecting plagiarism at the semantic level as well as identifying paraphrases, while ignoring named entities, which add unnecessary plagiarism percentages. For this purpose, we use Latent Semantic Analysis and a Bidirectional LSTM model for paraphrase detection. The final plagiarism uses a neural network t...
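
A minimal sketch of the LSA stage only (the BiLSTM paraphrase model and the final neural network are out of scope here), using scikit-learn and a hypothetical two-document corpus:

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.metrics.pairwise import cosine_similarity

    docs = ["text of the suspect document ...",
            "text of a candidate source document ..."]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(docs)
    # Project into a low-rank latent semantic space (1 component only
    # because this demo corpus has just two documents).
    lsa = TruncatedSVD(n_components=1).fit_transform(tfidf)
    print(cosine_similarity(lsa[:1], lsa[1:]))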

2024

The paper presents improved content-based image retrieval (CBIR) techniques based on multilevel Block Truncation Coding (BTC) using multiple threshold values. BTC-based features constitute one of the CBIR methods proposed using the color features of an image. The approach considers the red, green, and blue planes of an image together to compute the feature vector. The color averaging methods used here are BTC Level-1, BTC Level-2, and BTC Level-3. The feature vector size per image is greatly reduced: the mean of each plane is used to find a threshold value, each plane is divided using that threshold value, and color averaging is applied; precision and recall are then calculated to measure the performance of the algorithm. Instead of using all pixel data of an image as the feature vector for image retrieval, these six feature vectors can be used, resulting in better performance, and increasing the number of feature vectors improves performance further. The proposed CBIR techniques are tested on generic image datab...
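
Following the abstract's description, a sketch of the Level-1 step, where each RGB plane's mean acts as the threshold and the upper/lower means form a six-value feature vector:

    import numpy as np

    def btc_level1_features(img):
        # img: (H, W, 3) RGB array. For each plane, split pixels at the
        # plane mean and average each half: 2 values x 3 planes = 6 features.
        feats = []
        for c in range(3):
            plane = img[:, :, c].astype(float)
            thr = plane.mean()
            upper = plane[plane >= thr]
            lower = plane[plane < thr]
            feats.append(upper.mean() if upper.size else thr)
            feats.append(lower.mean() if lower.size else thr)
        return np.array(feats)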

2024

Classification of wood images has always fascinated researchers and is a hot area of research, as it brings automation to the wood industry. Wood comes in different types, and a given piece of wood can have some good parts and some bad parts, such as cracks, water-affected regions, and sapwood. The objective of this paper is to classify the sapwood region and the non-sapwood region in oak wood images in order to get the good part of the wood (the heart part). Color-based and texture-based feature extraction strategies were employed, and it was found that color-based feature extraction techniques are better than the others for finding sapwood.

2024

Face recognition is mainly used to identify a person by comparing facial features, and several techniques are used to extract these features. In this paper a novel local pattern descriptor, called the local vector pattern (LVP), is used to extract the features. The LVP used in this paper extracts features in high-order derivative space for face recognition. The LVP is mainly used to reduce high redundancy and the feature-length-increase problem. The feature-length-increase problem is solved by a comparative space transform, which encodes various spatial surrounding relationships between the referenced pixel and its surrounding pixels. The linking of LVPs is compacted to produce more distinctive features and reduce the redundancy problem. The LVP extracts the micro-patterns encoded through the pairwise directions of vectors using an effective coding scheme called the Comparative Space Transform (CST) to successfully extract distinctive information. The histogram intersection method is used to evaluate the similarity between the spatial histograms of two distributions extracted from the LVP and to recognize the face image.
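
The histogram-intersection step is simple to state; a sketch assuming the two LVP spatial histograms come in as equal-length arrays:

    import numpy as np

    def histogram_intersection(h_query, h_model):
        # Sum of bin-wise minima, normalized by the model histogram mass
        # (the classic Swain-Ballard form); 1.0 means identical histograms.
        h_query = np.asarray(h_query, dtype=float)
        h_model = np.asarray(h_model, dtype=float)
        return float(np.minimum(h_query, h_model).sum() / h_model.sum())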

2024

Due to progress in digital imaging technology, image retrieval (IR) has become a very active research area in computer science. Although research in the Sketch-Based Image Retrieval (SBIR) field has increased, it is still difficult to bridge the gap in the image-sketch matching problem. Therefore, this paper presents a scalable SBIR system and contributes a more efficient retrieval result. The features of both the query sketch and the database images are extracted by the Scale Invariant Feature Transform (SIFT) algorithm. The cropped keypoint images are then processed by Canny edge detection. After blocking the edge image, matched feature values are obtained from the pixel-count ratio. The retrieved images similar to the query sketch are displayed by rank. Mean Average Precision (MAP) and recall rates are measured as evaluation criteria. To evaluate the performance of this system, the benchmark sketch dataset of Eitz et al. is used.
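
The first two feature stages map directly onto OpenCV calls (assuming OpenCV >= 4.4 for SIFT and a hypothetical input file); the blocking and pixel-count-ratio matching depend on parameters the abstract does not give:

    import cv2

    gray = cv2.imread("query_sketch.png", cv2.IMREAD_GRAYSCALE)

    # Stage 1: SIFT keypoints and descriptors of the query sketch.
    sift = cv2.SIFT_create()
    keypoints, descriptors = sift.detectAndCompute(gray, None)

    # Stage 2: Canny edge map (thresholds here are illustrative).
    edges = cv2.Canny(gray, 100, 200)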

2024, HAL (Le Centre pour la Communication Scientifique Directe)

Semantic Textual Similarity (STS) is an important component in many Natural Language Processing (NLP) applications and plays an important role in diverse areas such as information retrieval, machine translation, information extraction, and plagiarism detection. In this paper we propose two word embedding-based approaches devoted to measuring the semantic similarity between Arabic-English cross-language sentences. The main idea is to exploit Machine Translation (MT) and improved word embedding representations in order to capture the syntactic and semantic properties of words. MT is used to translate English sentences into Arabic in order to apply a classical monolingual comparison. Afterwards, two word embedding-based methods are developed to rate the semantic similarity. Additionally, Word Alignment (WA), Inverse Document Frequency (IDF), and Part-of-Speech (POS) weighting are applied to the examined sentences to support the identification of the words that are most descriptive in each sentence. The performance of our approaches is evaluated on a cross-language dataset containing more than 2400 Arabic-English sentence pairs. Moreover, the proposed methods are validated through the Pearson correlation between our similarity scores and human ratings.
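
A sketch of the IDF-weighted sentence representation and the similarity score; the embeddings and idf tables are assumptions here, built elsewhere (e.g. from pretrained vectors and a reference corpus):

    import numpy as np

    def sentence_vector(tokens, embeddings, idf):
        # IDF-weighted average of word vectors; out-of-vocabulary
        # tokens are simply skipped.
        vecs = [idf.get(t, 1.0) * embeddings[t] for t in tokens if t in embeddings]
        return np.mean(vecs, axis=0) if vecs else None

    def similarity(v1, v2):
        # Cosine similarity between two sentence vectors.
        return float(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))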

2024

As technology continues to increase the variety of formats in which medical images are created, transmitted, and analyzed, it has become more necessary to restrict the different ways in which this data is stored and formatted across the conflicting modalities. There is a significant increase in the use of medical images in clinical medicine, disease research, and education. While the literature lists several successful systems for content-based image retrieval and image management methods, they have been unable to make significant inroads into routine medical informatics. This paper presents a new approach to image retrieval based on color, texture, and shape using a pyramid-structured wavelet. The major advantage of such an approach is that little human intervention is required. However, most existing systems only allow a user to query using a complete image with multiple regions and are unable to retrieve similar-looking images based on a single region. Experimental results of the query system on different test image databases are given. This paper presents a comparative study between the color, texture, and shape features and the pyramid-structured wavelet technique, and generates receiver operating characteristic (ROC) curves to assess the results. The area under the curve is 0.58 for color, 0.68 for shape, 0.74 for texture, and 0.8 for the wavelet technique.

2024

In the past few years, large numbers of digital images have been used in various application areas. To store and retrieve these images we need efficient image retrieval techniques. Content-Based Image Retrieval (CBIR) is one such image retrieval system, but developing a CBIR system with an appropriate combination of low-level features is a big problem. Another problem with a CBIR system is choosing an effective similarity measure. In this paper, a comparison is made between different similarity measures. A number of similarity measures are applied to a combination of texture- and shape-based features. Here, the Euclidean, Manhattan, Minkowski, and Spearman distance measures are used for similarity measurement and applied to the combination of Local Binary Pattern and Edge Histogram Descriptor features. For performance evaluation, precision and recall are used.
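
The four compared measures are all available off the shelf; a sketch with random stand-ins for the LBP + EHD feature vectors:

    import numpy as np
    from scipy.spatial import distance
    from scipy.stats import spearmanr

    q = np.random.rand(64)   # stand-in for a query LBP+EHD vector
    t = np.random.rand(64)   # stand-in for a database image vector

    print("Euclidean:      ", distance.euclidean(q, t))
    print("Manhattan:      ", distance.cityblock(q, t))
    print("Minkowski (p=3):", distance.minkowski(q, t, p=3))
    print("Spearman dist.: ", 1 - spearmanr(q, t).correlation)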

2024, IEEE Access

Content-Based Image Retrieval (CBIR) is a highly active research field with numerous applications that are currently expanding beyond traditional CBIR methodologies. In this paper, a CBIR methodology is proposed to meet such demands. The query inputs of the proposed methodology are an image and a text. For instance, having an image, a user would like to obtain a similar one with some modification described in text format, which we refer to as a text-modifier. The proposed methodology uses a set of neural networks that operate in feature space and perform feature composition in a uniform, known domain, namely the textual feature domain. In this methodology, ResNet is used to extract image features and an LSTM to extract text features to form the query inputs. The proposed methodology uses a set of three single-hidden-layer non-linear feedforward networks in a cascading structure, labeled NetA, NetC, and NetB. NetA maps image features into corresponding textual features. NetC composes the textual features produced by NetA with text-modifier features to form the target image's textual features. NetB maps the target textual features to target image features that are used to recall the target image from the image base based on cosine similarity. The proposed architecture was tested using ResNet 18, 50, and 152 for extracting image features. The testing results are promising and can compete with the most recent approaches to our knowledge, as listed in Section 5.
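
The final recall step is plain cosine ranking in feature space; a sketch, with the composed query feature standing in for NetB's output:

    import numpy as np

    def recall_ranking(query_feat, db_feats):
        # Normalize, then rank database images by cosine similarity to
        # the composed query feature; returns indices, best match first.
        q = query_feat / np.linalg.norm(query_feat)
        db = db_feats / np.linalg.norm(db_feats, axis=1, keepdims=True)
        return np.argsort(-(db @ q))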

2024

Abstract: A human being can intuitively select a group of photographs from a set and identify them as similar, applying criteria based on cognitive learning processes centered on the photographs' visual characteristics. A computer attempts to emulate this process by computing a feature vector for an image, which allows images to be identified by attributes such as luminosity, predominant color, intensity, and hue, among others. The MPEG-7 standard is a representation of images and video that defines, among various attributes, features called visual descriptors, which can be used to apply similarity functions between images. However, the computational cost of obtaining these descriptors is high. In this work we propose performing these computations on the parallel architectures offered by graphics cards. To this end, the original MPEG-7 proposal is modified to fit GPUs, obtaining results in less time while maintaining accuracy. The experiments carried out support our proposal: we implemented color, texture, and shape visual descriptors and applied them to a large number of images, allowing the similarity between pairs to be determined with precision.

2024, Springer eBooks

An earlier chapter has shown that operations on floating-point numbers are naturally expressed in terms of integer or fixed-point operations on the significand and the exponent. For instance, to obtain the product of two floating-point numbers, one basically multiplies the significands and adds the exponents. However, obtaining the correct rounding of the result may require considerable design effort and the use of nonarithmetic primitives such as leading-zero counters and shifters. This chapter details the implementation of these algorithms in hardware, using digital logic. Describing in full detail all the possible hardware implementations of the needed integer arithmetic primitives is much beyond the scope of this book; the interested reader will find this information in the textbooks on the subject [345, 483, 187]. After an introduction to the context of hardware floating-point implementation in Section 8.1, we review these primitives in Section 8.2, discuss their cost in terms of area and delay, and then focus on wiring them together in the rest of the chapter. We assume in this chapter that inputs and outputs are encoded according to the IEEE 754-2008 Standard for Floating-Point Arithmetic.
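
The significand/exponent view of multiplication can be made concrete with a toy example on (s, e) pairs representing s * 2**e; real hardware must then handle rounding and the normalization this sketch only hints at:

    def fp_mul(a, b):
        # a, b: (significand, exponent) pairs with 1 <= significand < 2.
        (sa, ea), (sb, eb) = a, b
        s, e = sa * sb, ea + eb
        if s >= 2:              # product in [1, 4): renormalize into [1, 2)
            s, e = s / 2, e + 1
        return s, e

    # 1.5 * 2**3 == 12.0 times 1.25 * 2**-1 == 0.625 gives 7.5:
    print(fp_mul((1.5, 3), (1.25, -1)))   # -> (1.875, 2), i.e. 1.875 * 2**2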

2024

Figure 1: An overview of the system, from the input image and captions on the left, to the target image on the right. (Recovered from the figure: a reference image and caption pass through CLIP image and text encoders, a Combiner fuses the features, and a contrastive loss ties the combined features to the target image's CLIP features.)

2024, INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY

Efficient non-uniform color quantization and similarity measurement methods are proposed to enhance content-based image retrieval (CBIR) applications. The HSV color space is selected because it is close to the human visual perception system, and a non-uniform method is proposed to quantize an image into 37 colors. The marker histogram (MH) vector of 296 values is generated by segmenting the quantized image into 8 regions (multiples of 45°) and counting the occurrences of the quantized colors in their particular angles. To cope with rotated images, an incremental displacement is applied to the MH 7 times. To find similar images, we propose a new similarity measurement alongside 4 existing metrics. A uniform color quantization method from related work is implemented as well and compared to our quantization method. One hundred test images are selected from the Corel-1000 image database. Our experimental results show high retrieval precision ratios compared to other techniques.
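
Following the numbers in the abstract (37 colors x 8 sectors = 296 values), a sketch of the marker histogram, assuming the 37-color quantization has already produced an integer label image:

    import numpy as np

    def marker_histogram(quantized):
        # quantized: (H, W) array of color labels in 0..36. Count each
        # label inside 8 angular sectors (45 degrees each) around the
        # image center, then flatten to the 296-value MH vector.
        h, w = quantized.shape
        yy, xx = np.mgrid[0:h, 0:w]
        angles = np.degrees(np.arctan2(yy - h / 2.0, xx - w / 2.0)) % 360
        sectors = (angles // 45).astype(int)            # 0..7
        mh = np.zeros((8, 37), dtype=int)
        for s in range(8):
            labels, counts = np.unique(quantized[sectors == s], return_counts=True)
            mh[s, labels] = counts
        return mh.ravel()                               # length 296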

2024, Proceedings of the NAACL Student Research Workshop

This paper proposes an automatic tag assignment approach for various e-commerce products, where tag allotment is done solely based on the visual features of the image; it then builds a tag-based product retrieval system upon these allotted tags. The explosive growth of e-commerce products sold online has made manual annotation infeasible, and without such tags it is impossible for customers to find these products. Hence a scalable approach catering to such a large number of product images and allocating meaningful tags is essential, and it can be used to build an efficient tag-based product retrieval system. In this paper we propose one such approach based on feature extraction using deep convolutional neural networks to learn descriptive semantic features from product images. We then use inverse-distance-weighted k-nearest-neighbour classifiers along with several other multi-label classification approaches to assign appropriate tags to our images. We demonstrate the functioning of our algorithm on the Amazon product dataset for various categories of products like clothing and apparel, electronics, sports equipment, etc.
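
A sketch of the inverse-distance-weighted k-NN tagging step, assuming CNN features and tag lists prepared elsewhere; k and the epsilon guard are illustrative:

    import numpy as np

    def weighted_knn_tags(query_feat, db_feats, db_tags, k=5, eps=1e-8):
        # Each of the k nearest neighbours votes for its tags with
        # weight 1/distance; tags are returned strongest-first.
        d = np.linalg.norm(db_feats - query_feat, axis=1)
        votes = {}
        for i in np.argsort(d)[:k]:
            for tag in db_tags[i]:
                votes[tag] = votes.get(tag, 0.0) + 1.0 / (d[i] + eps)
        return sorted(votes, key=votes.get, reverse=True)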

2024, Jurnal Mnemonic

As internet technology continues to develop, koi farmers in Sumber Village, Blitar, still face obstacles in sales. The koi business "Sumber Koi" Blitar operates in seed rearing and the sale of koi fish. Sales were previously made through the Sumber Koi Blitar website, Instagram, and WhatsApp, so only regular customers could gain access. The farmers at "Sumber Koi" need a fast method to estimate the number of seeds, the types of seeds, and sales recapitulations to simplify the marketing of koi fish. To overcome this problem, the Market Basket Analysis (MBA) method is used, which can analyze products bought together, products frequently bought by customers, and the quantity of products purchased. The research method used in this study is the prototype method, which covers requirements gathering, building the prototype, coding the system, and testing the system. This research produced an application that can be used to estimate koi fish marketing using the Market Basket Analysis (MBA) method. The testing carried out included black-box testing, expert validator testing, and user testing. Black-box testing showed that all of the application's functionality works well. Validation testing by two validators yielded a conformity percentage of 77.5%, and user testing of the application yielded a conformity percentage of 89%.

2024

Precise documentation and organized storage of archaeological findings are essential for scientific research in archaeology. However, with an increasing number of findings, traditional methods cannot meet these modern requirements. Retrieval systems provide a variety of tools for these tasks, while being responsible for both long-term storage of findings and easy-to-use user interaction. In this paper, we discuss aspects of the retrieval of rotationally symmetric pottery. For this to work, specific properties have to be defined and extracted; we therefore use the shape of a vessel's profile line, since most vessels are solids of revolution. It turns out that this requires the vessel to be correctly aligned to its axis of rotation. We present a new approach that exploits symmetry features of the vessel and applies an optimization method based on particle swarms. Besides an approach for automatically extracting features of the profile line, we can offer the user ...

2024

In this paper, we present FeatSet, a compilation of visual features extracted from open image datasets reported in the literature. FeatSet has a collection of 11 visual features, consisting of color, texture, and shape representations of the images acquired from 13 datasets. We organized the available features in a standard collection, including metadata and labels when available. We also provide a description of the domain of each dataset included in our collection, with visual analysis using Multidimensional Scaling (MDS) and Principal Components Analysis (PCA) methods. FeatSet is recommended for supervised and non-supervised learning, and it also widely supports Content-Based Image Retrieval (CBIR) applications and complex data indexing using Metric Access Methods (MAMs).

2024, Journal of Information and Data Management

Real-world applications generate large amounts of images every day. With the generalized use of social media, users frequently share images acquired by smartphones. Also, hospitals, clinics, exhibits, factories, and other facilities generate images with potential use for many applications. Processing the generated images usually requires feature extraction, which can be time-consuming and laborious. In this paper, we present FeatSet+, a compilation of color, texture, and shape visual features extracted from 17 open image datasets reported in the literature. FeatSet+ provides a collection of 11 distinct visual features, extracted by well-known Feature Extraction Methods (FEMs) such as LBP, Haralick, and Color Layout. We organized the available features in a standard collection, including the metadata and labels, when available. Eleven of the datasets also contain classes, which aid the evaluation of supervised methods such as classifiers and clustering tasks. FeatSet+ is available for download in a public repository as SQL scripts and CSV files. Additionally, FeatSet+ provides a description of the domain of each dataset, including the reference to the original work and a link. We show the potential applicability of FeatSet+ in four computational tasks: multi-attribute analysis and retrieval, visual analysis using Multidimensional Scaling (MDS) and Principal Components Analysis (PCA), global feature classification, and dimensionality reduction. FeatSet+ can be employed to evaluate supervised and non-supervised learning tasks, also widely supporting Content-Based Image Retrieval (CBIR) applications and complex data indexing using Metric Access Methods (MAMs).

2024

Bedridden patients with skin lesions (ulcers) often do not have access to specialized clinic equipment. It is important to allow healthcare practitioners to use their smartphones to leverage information regarding the proper treatment to be carried out. Existing applications require special equipment, such as heat sensors, or focus only on general information. To fill this gap, we propose ULEARn, a DBMS-based framework for the processing of ulcer images, providing tools to store and retrieve similar images of past cases. The proposed mobile application ULEARn-App allows healthcare practitioners to send a photo of a patient to ULEARn and obtain timely feedback that allows the improvement of procedures in therapeutic interventions. Experimental results for ULEARn and ULEARn-App on a real-world dataset showed that our tool can quickly respond to the required analysis and retrieval tasks, being up to 4.6 times faster than the specialist's expected execution time.