Keyword Search Research Papers - Academia.edu
2025
Personalized web search is one of the growing concepts in web technologies. Personalization of web search means carrying out retrieval for each user by incorporating his or her interests. For a given query, a personalized web search can provide different search results for different users, or organize search results differently for each user, based upon their interests, preferences, and information needs. There are many personalized web search algorithms for analyzing user interests and producing results quickly; user profiling, hyperlink analysis, content analysis, and collaborative web search are some instances of such algorithms. In this paper we analyze various issues of personalized web search.
2025, Proceedings of EVA London 2025
2025
Keyword indices, topic directories, and link-based rankings are used to search and structure the rapidly growing Web today. Surprisingly little use is made of years of browsing experience of millions of people. Indeed, this information is routinely discarded by browsers. Even deliberate bookmarks are stored in a passive and isolated manner. All this goes against Vannevar Bush's dream of the Memex: an enhanced supplement to personal and community memory. We propose to demonstrate the beginnings of a 'Memex' for the Web: a browsing assistant for individuals and groups with focused interests. Memex blurs the artificial distinction between browsing history and deliberate bookmarks. The resulting glut of data is analyzed in a number of ways at the individual and community levels. Memex constructs a topic directory customized to the community, mapping their interests naturally to nodes in this directory. This lets the user recall topic-based browsing contexts by asking questions like "What trails was I following when I was last surfing about classical music?" and "What are some popular pages in or near my community's recent trail graph related to music?"
2025, Journal of Information Science
In this paper, four self-developed user interfaces that display document search results using different methods were compared. In order to create the four interfaces, two information elements: document categories and lines from the document were used. A user study compared the four interfaces. It was found that the category addition to the interface was beneficial in both measurable and subjective measures. It was also found that displaying the relevant lines from the document increased the effectiveness and shortened the search time in all cases and tasks. It was found that the participants preferred the interface containing categories and relevant lines to all other interfaces checked. It was also the fastest in the objective time measurement. Another sub-research that was conducted showed that the most important parameter for the users was the confidence level that the answer was accurate, and the least important parameter was the feeling of comfort while conducting a search.
2025, the Leibniz Center for Research in Computer Science
In this paper, we compared four self-developed user interfaces that display document search results using different methods. The two information elements that we used to create the four interfaces were document categories and lines from the document. A user study compared our four interfaces. The study showed that the addition of categories to the interface is beneficial in both objective and subjective measures. It also showed that displaying the relevant lines from the document instead of its first lines increases effectiveness and shortens the search time in all cases and tasks. The participants liked the interface containing categories and relevant lines better than all other interfaces checked. The search time in this interface was not only perceived as faster by the participants but also proved faster by the objective measurement. Organizing search results using text elements enables users to focus on items that are related to the search query rather than browse through the entire displayed search results list. Another sub-study that we conducted showed that the most important parameter for the users is the confidence level that the answer is accurate; the second parameter in terms of importance is the search time, and the least important parameter is the feeling of comfort while conducting a search.
2025, Proceedings of SEKE 2002 - acm
In the framework of a study that investigated the implementation of a model for displaying search results, the possibility of ranking documents that appear in a list of search results was examined. The purpose of this paper is to present the concept of using mutual references between documents as a tool for ranking them, and to present the findings of a study that investigated the applicability of the concept.
2025, International Journal of Computer Science, Engineering and Information Technology
Keyword search in relational databases allows users to search information without knowing the database schema or using Structured Query Language (SQL). In this paper, we address the problem of generating and evaluating candidate networks. In candidate network generation, overhead is caused by the growing number of joining tuples relative to the size of the minimal candidate network. To reduce this overhead, we propose candidate network generation algorithms that generate a minimum number of joining tuples according to the maximum tuple-set size. We first generate a set of joining tuples, the candidate networks (CNs). It is difficult to obtain an optimal query processing plan while generating a large number of joins, so we also develop a dynamic CN evaluation algorithm (D_CNEval) that generates connected tuple trees (CTTs) by reducing the size of intermediate join results. The performance of the proposed algorithms is evaluated on the IMDB and DBLP datasets and compared with existing algorithms.
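The core idea of joining keyword-matching tuples into connected tuple trees can be illustrated with a toy sketch. The tables, rows, and function below are hypothetical illustrations of the general technique, not the paper's candidate-network generation or D_CNEval algorithm:

```python
# Toy sketch: keyword search over two relations linked by a foreign key.
# Table names, columns, and rows are invented for illustration.
movies = {1: "The Matrix", 2: "Heat"}            # movie_id -> title
acts = [(1, "Keanu Reeves"), (2, "Al Pacino")]   # (movie_id, actor)

def connected_tuple_trees(keywords):
    """Join each actor tuple to its movie tuple and keep the joined
    trees whose combined text covers every query keyword."""
    results = []
    for movie_id, actor in acts:
        tree_text = (movies[movie_id] + " " + actor).lower()
        if all(k.lower() in tree_text for k in keywords):
            results.append((movies[movie_id], actor))
    return results

print(connected_tuple_trees(["matrix", "keanu"]))  # → [('The Matrix', 'Keanu Reeves')]
```

Real candidate-network generation enumerates many such join paths over the schema graph; the point of the sketch is only that an answer is a tree of joined tuples that collectively contains all query keywords.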
2025, International Journal of Advance Research and Innovative Ideas in Education
Keyword search is generally used to search large amounts of data. Certain queries are difficult to answer because of their ambiguity, and short, vague keywords make keyword diversification a problem. We propose a system to address these problems. Our system automatically expands keyword searches based on different context information in the XML data. It first selects a feature selection model for designing an effective XML keyword search over a large database, and then automatically diversifies the keyword search. A short and vague keyword query is used to search the XML data, and the feature selection model derives the search candidates for the query. We evaluate the effectiveness of our system on both real and synthetic datasets. Additional efficiency is achieved through a proposed pruning algorithm and implementation on the Hadoop platform.
2025
A framework for interface design that provides people with flexible control over different views for an information space is presented. The agileviews framework defines overviews, previews, reviews, peripheral views, and shared views that help people make decisions about where they should focus attention during information seeking. In addition to the views themselves, control mechanisms that facilitate low-effort actions and strategies
2025, International Journal of Science and Research (IJSR)
Cloud storage has become increasingly popular in recent years due to benefits such as scalability, availability, and low-cost service compared with traditional storage solutions. Organizations are motivated to migrate their data from local sites to central commercial public cloud servers. By outsourcing data to the cloud, users are relieved of storage maintenance. Although migrating data to cloud storage has many benefits, it brings many security problems, so data owners hesitate to migrate sensitive data: control of the data passes to the cloud service provider. This security problem induces data owners to encrypt data at the client side before outsourcing it. Encrypting data improves security but decreases efficiency, because searching over encrypted data is difficult; search techniques used on plain text cannot be applied to encrypted data. Existing solutions support only identical-keyword search, not semantic search. In this paper we propose a semantic multi-keyword ranked search system. To improve search efficiency, the system includes semantic search using the WordNet library. The vector space model and the TF-IDF model are used for index construction and query generation.
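The vector space model with TF-IDF scoring mentioned above can be sketched over plaintext. The corpus and query below are invented for illustration; the paper applies this kind of scoring inside an encrypted index, but the underlying arithmetic is the same:

```python
import math
from collections import Counter

# Hypothetical plaintext corpus; in an encrypted-search setting these
# vectors would be transformed before being stored on the server.
docs = ["cloud storage security", "semantic keyword search", "cloud search privacy"]

def tfidf_vector(text, corpus):
    """Sparse TF-IDF vector: term frequency times smoothed inverse
    document frequency."""
    words = text.split()
    n = len(corpus)
    return {w: (c / len(words)) * math.log((n + 1) / (1 + sum(1 for d in corpus if w in d.split())))
            for w, c in Counter(words).items()}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

query = tfidf_vector("cloud security", docs)
ranked = sorted(docs, key=lambda d: cosine(query, tfidf_vector(d, docs)), reverse=True)
print(ranked[0])  # → cloud storage security
```

Ranking then returns the top-k documents by cosine score; the semantic extension in the paper additionally expands the query with WordNet synonyms before building the query vector.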
2025, … of the 19th CIRP …
Biological inspiration for engineering design has occurred through a variety of techniques such as creation and use of databases, keyword searches of biological information in natural-language format, prior knowledge of biology, and chance observations of nature. This research focuses on utilizing the reconciled Functional Basis function and flow terms to identify suitable biological inspiration for function-based design. The organized search provides two levels of results: (1) associated with verb function only and (2) narrowed results associated with verb-noun (function-flow). A set of heuristics has been compiled to promote efficient searching using this technique. An example for creating smart flooring is also presented and discussed.
2025
Studying the history of museums on the web faces multiple challenges, including those related to the specificity of the website as an object of study (Brügger, 2009). The problem of the ephemeral character of the website is quite familiar to researchers of the live web and becomes even more complicated in relation to the archived web (van den Heuvel, 2010). Periodisation of websites' development and reconstruction of versions of websites for research are under ongoing discussion by scholars. There are several approaches to the periodisation of website evolution: 1) reference to technological changes in website construction and design (Allen, 2013; Helmond, 2013); 2) shifts in the content published on the websites (Chakraborty & Nanni, 2017); 3) generalisation of web development (Ben-David, 2019). The versioning of websites, selecting portions of information that should be taken into account while researching, and decomposing preserved data into fragments also relate to periodisation. A version can be considered a composition of the snapshots from a certain period. A year is often taken as the unit for reconstructing a website, or a selection of separate years with gaps in between may be used for tracing changes (Svarre & Skov, 2024). Of course, the approach depends on the research purposes and may vary. The proposed paper suggests considering periodisation based on an assessment of the resources preserved in web archives and available for research.
2025, Multimedia Tools and Applications
Web search users complain of the inaccurate results produced by current search engines. Most of these inaccurate results are due to a failure to understand the user's search goal. This paper proposes a method to extract users' intentions and to build an intention map representing the extracted intentions. The proposed method forms intention vectors from pages clicked in previous search logs for a given query. The components of an intention vector are the weights of the keywords in a document. The method extracts users' intentions by clustering the intention vectors and extracting intention keywords from each cluster. The extracted intentions for a query are represented in an intention map. To analyze the efficiency of the intention map, we extracted users' intentions using 2,600 search log entries from a domestic commercial search engine. Experimental results with a search engine using the intention maps show statistically significant improvements in user satisfaction scores.
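The pipeline of building intention vectors from clicked pages and clustering them can be sketched as follows. The clicked-page texts, the greedy one-pass clustering, and the similarity threshold are illustrative assumptions, not the paper's exact method:

```python
from collections import Counter

# Hypothetical clicked-page texts for the ambiguous query "jaguar".
clicks = [
    "jaguar car dealer price",
    "jaguar car engine specs",
    "jaguar animal habitat rainforest",
]

def intention_vector(text):
    """Keyword-weight vector for one clicked page (plain term frequency)."""
    words = text.split()
    return {w: c / len(words) for w, c in Counter(words).items()}

def similarity(a, b):
    return sum(a[w] * b.get(w, 0.0) for w in a)

def cluster(vectors, threshold=0.1):
    """Greedy one-pass clustering: join the first cluster whose seed
    vector is similar enough, else start a new cluster."""
    clusters = []
    for v in vectors:
        for c in clusters:
            if similarity(v, c[0]) >= threshold:
                c.append(v)
                break
        else:
            clusters.append([v])
    return clusters

groups = cluster([intention_vector(t) for t in clicks])
print(len(groups))  # → 2 (a "car" intention and an "animal" intention)
```

Each resulting cluster stands for one user intention; the intention map would then label each cluster with its most characteristic keywords (here, "car" versus "animal").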
2025
Study purpose. This research aims to analyze and highlight the potential of bee products in reducing oxidative stress and inflammation after physical activity/exercise. Materials and methods. This research uses a systematic review method, searching journal databases such as Scopus, Web of Science, PubMed and Embase. The inclusion criteria were articles published in the last 5 years and articles discussing bee products, honey, oxidative stress, inflammation, physical activity, and exercise. The exclusion criterion was articles published in disreputable journals. Titles, abstracts, and full texts of articles were screened, then verified and stored in Mendeley software. A total of 7,124 articles from the Scopus, Web of Science, PubMed and Embase databases were identified, of which 8 articles that met the inclusion criteria were selected and analyzed for this systematic review. Results. Bee products with antioxidant properties can reduce oxidative stress, and the anti-inflammatory properties of bee products can reduce uncontrolled inflammation due to exercise. Conclusions. Bee products contain flavonoids, which have antioxidant properties that can reduce oxidative stress. In addition, the anti-inflammatory properties of bee products can reduce uncontrolled inflammation due to physical activity/exercise. In this case, honey works by inhibiting inflammation through NF-κB signalling and by suppressing the secretion of pro-inflammatory cytokines such as TNF-α and inflammatory markers such as CRP. Reducing inflammation can reduce the intensity of muscle pain. It is recommended that bee products be used to reduce oxidative stress and inflammation after physical activity/exercise.
2025, National Conference on Artificial Intelligence
The paper presents and evaluates the power of best-first search over AND/OR search spaces in graphical models. The main virtue of the AND/OR representation is its sensitivity to the structure of the graphical model, which can translate into significant time savings. Indeed, in recent years depth-first AND/OR Branch-and-Bound algorithms were shown to be very effective when exploring such search spaces, especially when using caching. Since best-first strategies are known to be superior to depth-first when memory is utilized, exploring the best-first control strategy is called for. In this paper we introduce two classes of best-first AND/OR search algorithms: those that explore a context-minimal AND/OR search graph and use static variable orderings, and those that use dynamic variable orderings but explore an AND/OR search tree. The superiority of the best-first search approach is demonstrated empirically on various real-world benchmarks.
2025
Valuable data mining tools encourage companies to share their data to be mined. However, companies avoid passing their data to the miner directly because of privacy and confidentiality rules. Multi-Party Computation (MPC) is a cryptographic tool that performs aggregation over distributed data while preserving the privacy of sensitive data. In this paper, we present a secure summation algorithm for online transactions, where users join the system piecemeal. The algorithm achieves a highly practical response time: the execution time of the summation over 1000 users' data is only 0.9 s.
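One common way to realize privacy-preserving summation is additive masking with a second, non-colluding party. This is a minimal sketch of that general technique, not the paper's exact protocol:

```python
import random

# Each user sends a masked value to the aggregator and the mask to a
# second, non-colluding helper; neither party alone sees a raw value.
MODULUS = 2 ** 61 - 1  # arithmetic is done modulo a large prime

def mask_value(value):
    r = random.randrange(MODULUS)
    return (value + r) % MODULUS, r  # (aggregator share, helper share)

def secure_sum(values):
    masked, masks = zip(*(mask_value(v) for v in values))
    # Subtracting the helper's mask total recovers only the sum,
    # never any individual value.
    return (sum(masked) - sum(masks)) % MODULUS

print(secure_sum([10, 20, 30]))  # → 60
```

Because each user's mask is independent and uniform, the aggregator's view of any single masked value is statistically independent of the underlying data, which is what lets users join the computation piecemeal.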
2025, EPiC series in computing
This paper describes a simulated audio dataset of spoken words that accommodates microphone array design for training and evaluating keyword spotting systems. With this dataset one can train a neural network to detect the direction of the speaker. The dataset is an advanced version of the original, with noise added during speech at random locations, and with different rooms having different reverberation; hence it should be closer to real-world long-range applications. This task could be a new challenge for direction of arrival activated by keyword spotting; let us call this task KWDOA. The dataset could serve as an entry level for microphone array designs. Keyword spotting (later referred to only as KWS) is a challenging task that can be used in many technology areas. KWS is often used as a wake-up system for home assistants or systems like Google and Siri on phones. It requires high-efficiency computation and low memory utilization while maintaining high accuracy and low power consumption. However, few datasets use well-described multi-channel recordings. This dataset is simulated in a Python environment using the module pyroomacoustics [1] for a specific real-world microphone array with an inter-microphone distance of 57 mm. This microphone array can be used in the final prototype or as an education/testing/recording device. The original dataset, the Speech Commands Dataset [2], contains recordings of numerous speakers and is very useful as a dataset for close-range KWS. At the date of writing, a state-of-the-art model (by score) on Google Speech Commands V1-12 is available with an accompanying paper. This dataset should create more diversity among different models due to more varied environments. It can also help produce better KWS for devices like home assistants or environment-aware robots. The new dataset is released under the Creative Commons BY 4.0 license, which means that anyone can access this dataset and use it in research or development.
2025, Lecture Notes in Computer Science
Existing works on keyword search over relational databases typically do not consider users' search intention for a query and return many answers which often overwhelm users. We observe that a database is in fact a repository of real world objects that interact with each other via relationships. In this work, we identify four types of semantic paths between objects and design an algorithm called pathRank to compute and rank the results of keyword queries. The answers are grouped by the types of semantic paths which reflect different query interpretations, and are annotated to facilitate user understanding.
2025
Reaching the end of this journey, I find myself having gained a wealth of benefits, far beyond the undeniable acquisition of knowledge that accompanies research journeys of this kind. I believe the contribution of my graduate studies as a whole to the shaping of my character has been decisive. Research taught me persistence, patience, determination and insight. Beyond the enchanting journey into knowledge, it also generously offered me a wealth of travels, experiences and acquaintances in almost every corner of our planet, for which I feel especially fortunate and happy. And now, I feel the need to thank many collaborators, and friends, for their support and their contribution to the completion of this doctoral dissertation. I would like, first and foremost, to thank Professor Evangelia Pitoura, supervisor of my entire research path up to the completion of this dissertation. Our long collaboration contributed decisively to a wealth of knowledge and skills that I feel I have acquired. I thank her both for the substantive research collaboration and for the overall academic ethos she passed on to me. I would also like to thank my colleagues, and above all friends, in the Distributed Data Management and Processing Laboratory of the Department of Computer Science and Engineering of the University of Ioannina. First of all, Dr. Kostas Stefanidis, with whom I had an excellent collaboration, mainly in the first years of my research activity, and who guided me in the first steps of my research path. Also, Eftychia Koletsou, who always strengthens my faith in people and their capabilities, as well as Dimitris Souravlias. I could not omit Dr. Mohamed Sharaf, Lecturer at the University of Queensland, Australia, for the opportunity he offered me to visit and work for a period in his laboratory.
This experience was rewarding on a number of different levels. I would like to mention separately all the people of the Department of Computer Science (now Computer Science and Engineering) of the University of Ioannina who contributed in one way or another to the completion of this dissertation, especially my friends and fellow students Myrto, Georgia, Katerina, Eftychia, Giorgos and Petros. I also thank the members of my committee, Panos Chrysanthis, Peter Triantafillou, Panagiotis Vassiliadis, Panayiotis Tsaparas, Aristidis Likas and Nikos Mamoulis, for the honour of serving on it. A big thank-you to my parents, Kostas and Dina, not only for their support throughout my studies, but also for the way they raised me, to believe in myself and to persist in the face of every difficulty. I feel lucky to receive their boundless love every day. Finally, a huge thank-you to my partner Grigoris, who stands by my side and supports me at every step. This journey would not have been nearly as interesting had he not been part of it.
2025
Attribute-based encryption (ABE) is one of the recommended tools to secure real systems like the Internet of Things (IoT). Almost all ABE schemes utilize bilinear map operations, known as pairings. The challenge with these schemes is that performing pairings incurs high computation costs, and IoT devices are typically resource-constrained, so efficient pairing-free ABE schemes have been proposed to solve this issue. These schemes utilize classical cryptographic operations instead of heavy bilinear pairings. Recently, two pairing-free ciphertext-policy attribute-based encryption schemes have been proposed (by Das et al. and Sowjanya et al.). According to their claims, their schemes are secure against collusion attacks and provide indistinguishability in a selective-set security model. The first scheme has also been claimed to be secure against forgery attacks. In this paper, we show that the first scheme is vulnerable to ciphertext-only attacks, to collusion between four or more data users with specific features, and to forgery attacks. We also show that the second scheme is vulnerable to a key recovery attack, which can lead to a collusion attack. So, even though they are highly efficient, they have security vulnerabilities that violate the claims of the authors.
2025
The Public Key Encryption with Keyword Search (PEKS) scheme was proposed to preserve the security and privacy of data outsourced to a cloud environment while providing search operations over the encrypted data. Nevertheless, most existing PEKS schemes are prone to the key-escrow problem, because the private keys of the target users are known to the Key Generating Center (KGC). To address the key-escrow issue in PEKS schemes, Certificate-Less Public Key Encryption with Keyword Search (CL-PEKS) schemes have been designed. However, existing CL-PEKS schemes do not consider refreshing keyword searches; the cloud server can therefore store search trapdoors for keywords used in the system and launch keyword guessing attacks. In this research work, we propose a Certificate-Less Searchable Encryption with Refreshing Keyword Search (CL-SERKS) scheme, obtained by attaching date information to the encrypted data and keywords. We demonstrated that our proposed scheme is secure a...
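The effect of binding date information into searchable tags can be sketched with a symmetric-key analogue. This is a deliberate simplification: CL-SERKS itself is a public-key, certificate-less construction, and the key and function names below are hypothetical:

```python
import hashlib
import hmac

def keyword_tag(key: bytes, keyword: str, date: str) -> bytes:
    """Searchable tag bound to a date: a trapdoor stored by the server
    yesterday cannot be replayed today, because the tag changes with
    the date."""
    return hmac.new(key, f"{keyword}|{date}".encode(), hashlib.sha256).digest()

k = b"shared-secret"  # hypothetical key
tag_day1 = keyword_tag(k, "invoice", "2025-01-01")
tag_day2 = keyword_tag(k, "invoice", "2025-01-02")
print(tag_day1 != tag_day2)  # → True
```

The design point the sketch illustrates is that a stored trapdoor only matches ciphertexts tagged with the same date, which is what limits the server's ability to accumulate trapdoors for keyword guessing.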
2025, ADCS 2010
The medical domain has an abundance of textual resources of varying quality. The quality of medical articles depends largely on their publication types. However, identifying high-quality medical articles from search results remains to date a manual and time-consuming process. We present a simple, rule-based, post-retrieval approach to automatically identify medical articles belonging to three high-quality publication types, using only the title and abstract information of the articles. Our experiments show that such ...
2025, 2008 IEEE International Conference on Semantic Computing
While semantic search technologies have been proven to work well in specific domains, they still have to confront two main challenges to scale up to the Web in its entirety. In this work we address this issue with a novel semantic search system that a) provides the user with the capability to query Semantic Web information using natural language, by means of an ontology-based Question Answering (QA) system [14], and b) complements the specific answers retrieved during the QA process with a ranked list of documents from the Web. Our results show that ontology-based semantic search capabilities can be used to complement and enhance keyword search technologies.
2025, Bioinformatics
Motivation: The World Wide Web has profoundly changed the way in which we access information. Searching the internet is easy and fast, but more importantly, the interconnection of related contents makes it intuitive and closer to the associative organization of human memory. However, the information retrieval tools currently available to researchers in biology and medicine lag far behind the possibilities that the layman has come to expect from the internet. Results: By using genes and proteins as hyperlinks between sentences and abstracts, the information in PubMed can be converted into one navigable resource. iHOP (Information Hyperlinked over Proteins) is an online service that provides this gene-guided network as a natural way of accessing millions of PubMed abstracts and brings all the advantages of the internet to scientific literature research. Navigating across interrelated sentences within this network is closer to human intuition than the use of conventional keyword search...
2025, arXiv (Cornell University)
The Semantic Web is, without a doubt, gaining momentum in both industry and academia. The word "semantic" refers to "meaning": a semantic web is a web of meaning. In this fast-changing and result-oriented practical world, gone are the days where an individual had to struggle to find information on the Internet and knowledge management was the major issue. The Semantic Web has a vision of linking, integrating and analysing data from various data sources and forming a new information stream: a web of databases connected with each other, and machines interacting with other machines to yield results which are user-oriented and accurate. With the emergence of the Semantic Web framework, the naïve approach of searching for information on the syntactic web has become cliché. This paper proposes an optimised semantic keyword search, exemplified by simulating an ontology of Indian universities with a proposed algorithm that enables effective semantic retrieval of information that is easy to access and time-saving.
2025, Journal of human sciences
In this study, the "keyword search method" used in the field of digital forensics is investigated. As with search engines in Internet investigations, keyword search offers many benefits in computer forensics investigations. The study explains the keyword search loop: preparing the electronic data, indexing, querying with keywords, matching against the database, and displaying results to the user. Keyword search over different data types, such as forensic image data, live forensics data, static data and cloud computing data, is also discussed. The importance of keyword search in computer forensics is examined from five different aspects: rapid and effective investigations, contribution to privacy, relational analysis of the human-event-computing device triad, creating dictionaries for decrypting passwords, and recovering some deleted data and detecting steganographic data. In this context, the overall contribution of keyword search to computer forensics investigations is evaluated. Finally, some examples of computer forensics tools are discussed.
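The prepare–index–query–match–display loop described in the abstract can be sketched as a minimal inverted index. The evidence file names and contents below are invented for illustration; real forensic tools index raw disk images and carved files, not plain-text snippets:

```python
from collections import defaultdict

def build_index(documents):
    """Index step: map each keyword to the set of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, keywords):
    """Query/match step: return documents containing all query keywords."""
    hits = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*hits) if hits else set()

# Display step: evidence items matching an investigator's query.
evidence = {
    "img_001.txt": "transfer funds to offshore account",
    "img_002.txt": "meeting notes about the quarterly report",
    "img_003.txt": "delete the offshore transfer records",
}
idx = build_index(evidence)
print(sorted(search(idx, ["offshore", "transfer"])))  # ['img_001.txt', 'img_003.txt']
```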
2025, INFRASTRUKTUR PERKOTAAN
This study aims to analyze topics related to renewable energy infrastructure using a bibliometric approach, identifying research trends, international collaboration, and key topics developing in this field. The data were obtained from the Scopus database and analyzed using VOSviewer and Biblioshiny. The analysis shows that (1) international studies on renewable energy infrastructure are increasing; (2) there is broad collaboration between countries such as the United States, India, Germany, China, and Italy; (3) RWTH Aachen University contributes substantially to publications on this topic; (4) Li X. is the author with the largest contribution to renewable energy infrastructure research; and (5) several topics, such as electric vehicles, microgrids, energy storage, and demand response, remain rare, offering significant opportunities for further research. This study provides useful insights for researchers who wish to develop work in the field of renewable energy infrastructure, with the hope of extending its scope to topics that have not yet been widely studied.
2025, Future Generation Computer Systems
Searchable Encryption (SE) allows a client to search over large amounts of encrypted data outsourced to the Cloud. Although this helps maintain the confidentiality of the outsourced data, achieving privacy is a difficult and resource-intensive task. As query effectiveness increases, i.e., in the shift from single-keyword SE to multi-keyword SE, there is a notable drop in efficiency. This motivates exploiting the advances in multi-core architectures and multiple threads, where the search can be delegated across different threads and performed in parallel. The proposed scheme is based on probabilistic trapdoors formed using the property of modular inverses. The use of probabilistic trapdoors helps resist distinguishability attacks, and the rigorous security analysis highlights the advantage of having a probabilistic trapdoor. Furthermore, to validate the performance of the proposed scheme, it is implemented and deployed on British Telecommunication's public Cloud offering and tested over a real speech corpus. The implementation is also extended to measure the performance gain from the multi-core architecture, which helps maintain the lightweight property of the scheme.
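A minimal sketch of the thread-delegation idea, assuming the encrypted index is partitioned into shards that are searched in parallel. The deterministic SHA-256 token below is a stand-in for illustration only, not the paper's probabilistic, modular-inverse-based trapdoor construction:

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib

def token(keyword, key=b"k"):
    # Deterministic stand-in for a search trapdoor (illustrative only).
    return hashlib.sha256(key + keyword.encode()).hexdigest()

def search_shard(shard, tokens):
    """Search one partition of the index for all query tokens."""
    return {doc for tok in tokens for doc in shard.get(tok, set())}

def parallel_search(shards, keywords, workers=4):
    """Delegate the multi-keyword search across threads, one shard each."""
    toks = [token(k) for k in keywords]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(lambda s: search_shard(s, toks), shards))
    return set().union(*results)

# Toy index: keyword tokens -> document ids, split into two shards.
shard1 = {token("cloud"): {"d1", "d2"}}
shard2 = {token("audio"): {"d3"}}
print(sorted(parallel_search([shard1, shard2], ["cloud", "audio"])))  # ['d1', 'd2', 'd3']
```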
2025
Is this perfect communication? What if Alice is trying to send instructions? Aka, an algorithm Does Bob understand the correct algorithm? What if Alice and Bob speak in different (programming) languages?
2025, International Journal of Science and Research (IJSR)
2025, Lecture Notes in Computer Science
Keyword searching is the most common form of document search on the Web. Many Web publishers manually annotate the META tags and titles of their pages with frequently queried phrases in order to improve their placement and ranking. A "hidden phrase" is defined as a phrase that occurs in the META tag of a Web page but not in its body. In this paper we present an algorithm that mines the definitions of hidden phrases from Web documents. Phrase definitions allow (i) publishers to find relevant phrases with high query frequency, and (ii) search engines to test whether the content of the body of a document matches the phrases. We use co-occurrence clustering and association rule mining algorithms to learn phrase definitions from high-dimensional data sets. We also provide experimental results.
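Detecting hidden phrases themselves (as distinct from mining their definitions, which is the paper's contribution) reduces to comparing META-tag phrases against the page body. A minimal sketch, with invented example phrases:

```python
import re

def hidden_phrases(meta_keywords, body_text):
    """Return the META-tag phrases that never occur in the page body."""
    body = re.sub(r"\s+", " ", body_text.lower())
    return [p for p in meta_keywords if p.lower() not in body]

# Invented example: one phrase appears only in the META tag.
meta = ["cheap flights", "hotel deals", "travel insurance"]
body = "Compare hotel deals and book travel insurance online."
print(hidden_phrases(meta, body))  # ['cheap flights']
```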
2025, ArXiv
Different search engines provide different outputs for the same keyword. This may be due to different definitions of relevance and/or different knowledge or anticipation of users' preferences, but rankings are also suspected to be biased towards the engine's own content, which may be prejudicial to other content providers. In this paper, we make some initial steps toward a rigorous comparison and analysis of search engines by proposing a definition for the consensual relevance of a page with respect to a keyword, over a set of search engines. More specifically, we look at the results of several search engines for a sample of keywords, and define for each keyword the visibility of a page based on its ranking over all search engines. This allows us to define a score of each search engine for a keyword, and then its average score over all keywords. Based on page visibility, we can also define the consensus search engine as the one showing the most visible results for each keyword. We have impleme...
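The visibility and scoring definitions can be sketched as follows, assuming a simple 1/rank weighting (the paper's exact weighting may differ). The pages 'a'–'c' and the three engine rankings are invented for illustration:

```python
def visibility(rankings):
    """rankings: one result list per engine (page ids, best first).
    A page's visibility is the sum over engines of 1/(rank+1)."""
    vis = {}
    for result_list in rankings:
        for rank, page in enumerate(result_list):
            vis[page] = vis.get(page, 0.0) + 1.0 / (rank + 1)
    return vis

def engine_score(result_list, vis):
    """Score an engine by how highly it ranks the most visible pages."""
    return sum(vis[p] / (r + 1) for r, p in enumerate(result_list))

def consensus_ranking(rankings):
    """Consensus engine: order pages by their overall visibility."""
    vis = visibility(rankings)
    return sorted(vis, key=vis.get, reverse=True)

# Three invented engines; two agree, one dissents.
engines = [["a", "b", "c"], ["a", "b", "c"], ["c", "a", "b"]]
vis = visibility(engines)
print(consensus_ranking(engines))  # ['a', 'c', 'b']
```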
2025, Datenbank-spektrum
Although many works in the database community use open data in their experimental evaluation, repeating the empirical results of previous works remains a challenge. This holds true even if the source code or binaries of the tested algorithms are available. In this paper, we argue that providing access to the raw, original datasets is not enough. Real-world datasets are rarely processed without modification. Instead, the data is adapted to the needs of the experimental evaluation in the data preparation process. We showcase that the details of the data preparation process matter and subtle differences during data conversion can have a large impact on the outcome of runtime results. We introduce a data reproducibility model, identify three levels of data reproducibility, report about our own experience, and exemplify our best practices. Reproducibility is essential to scientific research. When new algorithms are proposed, they must be compared to existing work. Often, this process is frustrating due to missing information. First, an implementation of the competitors' approach is required. Missing details in the pseudocode, uncovered corner cases that are not discussed in the paper, and the lack of source code often make it cumbersome to reimplement existing work. Second, to be able to repeat previous experimental results, the datasets for these experiments are needed. This work was supported by the Austrian Science Fund (FWF): P 29859.
2025
A mix network is a cryptographic construction that enables a group of players to permute and re-encrypt a sequence of ciphertexts so as to hide the relationship between input and output ciphertexts. In this paper, we propose a mix network known as Millimix. In contrast to other proposed constructions, Millimix enjoys a very high level of computational efficiency on small input batches, that is, batches of several thousand items or smaller. Additionally, Millimix possesses the full set of properties typically sought, but generally unavailable, in other mix network constructions, including public verifiability, robustness against malicious coalitions of players, and strong privacy guarantees. Millimix therefore promises to serve as a useful and practical complement to existing mix network constructions.
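The core permute-and-re-encrypt operation can be sketched with toy ElGamal parameters (far too small to be secure, and omitting Millimix's verifiability and robustness machinery). Re-encryption multiplies a ciphertext by a fresh encryption of 1, so each mix server changes every ciphertext's appearance while preserving its plaintext:

```python
import random

p, g = 23, 5                      # toy group parameters (insecure, illustrative)
x = 6                             # decryption key (held by the receiver)
y = pow(g, x, p)                  # public key

def enc(m):
    r = random.randrange(1, p - 1)
    return (pow(g, r, p), m * pow(y, r, p) % p)

def reencrypt(c):
    """Multiply by a fresh encryption of 1: same plaintext, new ciphertext."""
    a, b = c
    s = random.randrange(1, p - 1)
    return (a * pow(g, s, p) % p, b * pow(y, s, p) % p)

def dec(c):
    a, b = c
    return b * pow(pow(a, x, p), p - 2, p) % p   # b * (a^x)^-1 mod p

def mix(batch):
    """One mix server: re-encrypt every ciphertext, then permute the batch."""
    out = [reencrypt(c) for c in batch]
    random.shuffle(out)
    return out

msgs = [3, 7, 12]
batch = mix(mix([enc(m) for m in msgs]))   # two mix servers in sequence
print(sorted(dec(c) for c in batch))       # [3, 7, 12]
```

The input-output linkage is hidden because both the order and the ciphertext values change at every server; only the multiset of plaintexts survives.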
2025
Searchable encryption is a technique that allows a client to store data in encrypted form on a curious server, such that data can be retrieved while leaking a minimal amount of information to the server. Many searchable encryption schemes have been proposed and proved secure in their own computational models. In this paper we propose a generic model for the analysis of searchable encryption. We then identify the security parameters of searchable encryption schemes and prove information-theoretical bounds on the security of the parameters. We argue that perfectly secure searchable encryption schemes cannot be efficient. We classify the seminal schemes into two categories: schemes that leak information upfront during the storage phase, and schemes that leak some information at every search. This helps designers choose the right scheme for an application.
2025
Searchable encryption is a technique that allows a client to store documents on a server in encrypted form. Stored documents can be retrieved selectively while revealing as little information as possible to the server. In the symmetric searchable encryption domain, the storage and the retrieval are performed by the same client. Most conventional searchable encryption schemes suffer from two disadvantages. First, searching the stored documents takes time linear in the size of the database, and/or uses heavy arithmetic operations. Second, the existing schemes do not consider adaptive attackers; a search query will reveal information even about documents stored in the future. If they do consider this, it is at a significant cost to updates. In this paper we propose a novel symmetric searchable encryption scheme that offers searching in time constant in the number of unique keywords stored on the server. We present two variants of the basic scheme which differ in the efficiency of search and update. We show how each scheme could be used in a personal health record system.
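A generic inverted-index construction illustrates how a hash-table lookup yields search time independent of the database size. This is a sketch of the general technique, not the scheme proposed in the paper; the per-posting pad encryption is illustrative only and assumes document ids of at most 16 bytes:

```python
import hmac, hashlib, secrets

def prf(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def build(docs, key):
    """Client side: one index entry per unique keyword -> O(1) lookup."""
    index = {}
    for doc_id, words in docs.items():
        for w in set(words):
            tok = prf(key, "t|" + w)          # search token for keyword w
            pad_key = prf(key, "p|" + w)      # key for posting-list pads
            ids = index.setdefault(tok, [])
            pad = prf(pad_key, str(len(ids)))[:16]
            ids.append(xor(doc_id.encode().ljust(16), pad))
    return index

def search(index, key, word):
    """Server looks up one hash-table bucket; client-derived keys decrypt it."""
    tok = prf(key, "t|" + word)
    pad_key = prf(key, "p|" + word)
    out = []
    for i, ct in enumerate(index.get(tok, [])):
        pad = prf(pad_key, str(i))[:16]
        out.append(xor(ct, pad).rstrip().decode())
    return out

key = secrets.token_bytes(32)
docs = {"d1": ["flu", "cough"], "d2": ["cough"], "d3": ["flu"]}
idx = build(docs, key)
print(sorted(search(idx, key, "flu")))  # ['d1', 'd3']
```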
2025
With the advent of cloud computing, it has become increasingly popular for PHR owners to outsource their documents to public cloud servers while allowing users to retrieve this data. For privacy reasons, secure search over encrypted cloud data has motivated several research works under the single-PHR-owner model. However, most cloud servers in practice do not serve just one PHR owner; instead, they support multiple PHR owners to share the benefits brought by cloud computing. In this paper, we propose schemes for privacy-preserving fuzzy keyword search in a multi-owner model, solving the problem of effective fuzzy keyword search over encrypted cloud data while maintaining keyword privacy. Fuzzy keyword search greatly enhances system usability: it returns exactly matching documents when users' search inputs match the predefined keywords and, when an exact match fails, the closest possible similar health records based on keyword similarity semantics. We systema...
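Fuzzy keyword matching is commonly built on wildcard-based fuzzy keyword sets. The sketch below checks whether two words are within edit distance 1 by intersecting their wildcard sets; the encryption layer is omitted, and in a real scheme the wildcard variants would be transformed into trapdoors rather than compared in the clear:

```python
def fuzzy_set(word):
    """Wildcard set for edit distance 1: a '*' stands for one edited
    position (insertion or substitution); deletions drop a character."""
    variants = {word}
    for i in range(len(word) + 1):
        variants.add(word[:i] + "*" + word[i:])        # insertion slot
    for i in range(len(word)):
        variants.add(word[:i] + "*" + word[i + 1:])    # substitution
        variants.add(word[:i] + word[i + 1:])          # deletion
    return variants

def fuzzy_match(query, indexed_keyword):
    """Words within edit distance 1 always share a wildcard variant."""
    return bool(fuzzy_set(query) & fuzzy_set(indexed_keyword))

print(fuzzy_match("diabetes", "diabates"))  # True: one substitution apart
print(fuzzy_match("diabetes", "cancer"))    # False
```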
2025, Technical Report No. 2000-34of the Leibniz Center for Research in Computer Science
The number of textual databases containing many full-text documents has increased in the last few years, mainly on the Internet, both in the number of databases and in their scope. The purpose of this essay is to present a comparative study, conducted at the Hebrew University of Jerusalem, of post-coordinated textual retrieval systems. The comparison was between the conventional interfaces of search engines used on the Internet and a new interface developed for this study. The study examined the behavior of database users carrying out various search tasks using different interfaces that display a list of search results for defined tasks, and compared the behavior of the users of the different interfaces. The findings show a clear advantage for the interface developed for the experiment over popular interfaces used on the Internet for similar purposes, in terms of ease of use, the user's sense of confidence while carrying out the task, and the perceived relevance of the information displayed by the interface.
2025, Proceedings of the Hypertext 2000 & Digital Libraries 2000
Information retrieval systems display search results by various methods. This paper focuses on a model for displaying a list of search results by means of textual elements that utilize a new information unit that replaces the currently used information unit. The paper includes a short description of several studies that support the model.
2025, SIGCHI Bulletin (A Quarterly Publication of the ACM Special Interest Group on Computer-Human Interaction)
This article presents the preliminary results of a study currently in progress on the display of search results from textual databases. The study deals with information that should be displayed in a list of documents that match the user's search criteria.
2025
This chapter describes the design, development and testing of one component of a system for creating generally applicable, data-driven sitemap tools and information retrieval applications. This category of software tools is used to enhance information seeking on the web. Such tools are set up by web site administrators who want to provide visitors with alternative browsing and navigation aids. The Generalized Relation Browser, or GRB, illustrates the look-ahead strategy for web navigation within the Agileviews framework. Agileviews define control mechanisms and interfaces for overviews, previews, reviews, peripheral views, and shared views intended to help people make better decisions while seeking information on the Internet. Within this framework, the overall goal of the GRB is to help users gain a better understanding of collections of online resources by providing them with a kind of sitemap. The GRB is the follow-on to the Federal Statistics Relation Browser prototype, which focused on the US Government Fedstats collection of web sites, providing users a more dynamic and informative browsing alternative to the static sitemap.
2025
Webometrics is concerned with measuring aspects of the web: web sites, web pages, parts of web pages, words in web pages, hyperlinks, and web search engine results. The web is a huge and easily accessible source of information, and there are limitless possibilities for measuring or counting, either on a huge scale (the number of web pages, the number of web sites, the number of blogs) or on a smaller scale. This study examined traffic ranks in India, especially for the Central Universities of the North East Region; the best-ranked Central Universities of the North East Region are NEHU and TU, with traffic ranks of 8,484 and 8,511 respectively. Nagaland University has the highest number of average pages viewed by users per day (4.1); Sikkim University has the highest (55.7%) upstream site share from Google among other
2025, arXiv (Cornell University)
Proxy signature schemes have been invented to delegate signing rights. This paper proposes a new concept of Identity-Based Strong Bi-Designated Verifier Threshold Proxy Signature (ID-SBDVTPS) schemes. Such a scheme enables an original signer to delegate signature authority to a group of 'n' proxy signers such that 't' or more of them can cooperatively sign messages on behalf of the original signer, and the signatures can only be verified by two designated verifiers, who cannot convince anyone else of this fact.
2025, Lecture Notes in Computer Science
Structured data sources promise to be the next driver of significant socio-economic impact for both people and companies. Nevertheless, accessing them through formal languages, such as SQL or SPARQL, can become cumbersome and frustrating for end-users. To overcome this issue, keyword search in databases is becoming the technology of choice, even if it suffers from efficiency and effectiveness problems that prevent it from being adopted at Web scale. In this paper, we motivate the need for a reference architecture for keyword search in databases to favor the development of scalable and effective components, also borrowing methods from neighboring fields, such as information retrieval and natural language processing. Moreover, we point out the need for a companion evaluation framework, able to assess the efficiency and the effectiveness of such new systems in the light of real and compelling use cases.
2025
This position paper discusses the need for considering keyword search over relational databases in the light of broader systems, where keyword search is just one of the components and which are aimed at better supporting users in their search tasks. These more complex systems call for appropriate evaluation methodologies which go beyond what is typically done today, i.e. measuring performances of components mostly in isolation or not related to the actual user needs, and, instead, able to consider the system as a whole, its constituent components, and their inter-relations with the ultimate goal of supporting actual user search tasks.
2025
Information retrieval and integration systems typically must handle incomplete and inconsistent data. Current approaches attempt to reconcile discrepant information by leveraging data quality, user preferences, or source provenance information. Such approaches may overlook the fact that information is interpreted relative to its context. Therefore, discrepancies may be explained, and thereby resolved, if contexts are taken into account. In this paper, we describe an information integrator that is capable of explaining ...
2025, Journal of Grid Computing
2025, Signal Processing Conference (EUSIPCO), 2014 Proceedings of the 22nd European
We present a system for Query-by-Example Spoken Term Detection on zero-resource languages. The system compares speech patterns by representing the signal using two different acoustic models: a Spectral Acoustic (SA) model covering the spectral characteristics of the signal, and a Temporal Acoustic (TA) model covering the temporal evolution of the speech signal. Given a query and an utterance to be compared, we first compute their posterior probabilities according to each of the two models, compute similarity matrices for each model, and combine these into a single enhanced matrix. The Subsequence Dynamic Time Warping (S-DTW) algorithm is used to find optimal subsequence alignment paths on this final matrix. Our experiments on data from the 2013 Spoken Web Search (SWS) task at the MediaEval benchmark evaluation show that this approach provides state-of-the-art results and significantly improves over both the single-model strategies and the standard metric baselines.
2025, MediaEval
We present a system for query by example on zero-resource languages. The system compares speech patterns by fusing the contributions of two acoustic models to cover both their spectral characteristics and their temporal evolution. The spectral model uses standard Gaussian mixtures to model classical MFCC features. We introduce phonetic priors to bias the unsupervised training of the model, and extend the standard similarity metric used to compare posterior vectors by incorporating inter-cluster distances. To model temporal evolution patterns we use long temporal context models. We combine the information obtained by both models when computing the similarity matrix, allowing the subsequence-DTW algorithm to find optimal subsequence alignment paths between query and reference data. The resulting alignment paths are locally filtered and globally normalized. Our experiments on MediaEval data show that this approach provides state-of-the-art results and significantly improves over the single-model and standard-metric baselines.
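The subsequence alignment step common to both of these abstracts can be sketched in plain Python, using the absolute difference between scalar features as a stand-in for the posterior-based similarity matrices the papers fuse. The start and end of the alignment are left unconstrained on the reference axis, which is what distinguishes subsequence-DTW from ordinary DTW:

```python
def s_dtw(query, reference, dist=lambda a, b: abs(a - b)):
    """Subsequence-DTW: align the whole query against the best-matching
    contiguous region of the reference signal."""
    n, m = len(query), len(reference)
    INF = float("inf")
    D = [[INF] * m for _ in range(n)]
    for j in range(m):                      # free start: no entry penalty
        D[0][j] = dist(query[0], reference[j])
    for i in range(1, n):
        for j in range(1, m):
            D[i][j] = dist(query[i], reference[j]) + min(
                D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    end = min(range(m), key=lambda j: D[n - 1][j])   # free end
    return D[n - 1][end], end

# Toy 1-D features: the query occurs exactly at reference indices 2..4.
q = [1, 2, 3]
ref = [9, 9, 1, 2, 3, 9, 9]
cost, end = s_dtw(q, ref)
print(cost, end)  # 0 4
```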