Hash Table Research Papers - Academia.edu

Geometric hashing is a model-based recognition technique based on matching transformation-invariant object representations stored in a hash table. In the last decade, a number of enhancements have been suggested to the basic method, improving its performance and reliability. One important enhancement is rehashing, which improves computational performance by dealing with the problem of non-uniform occupancy of hash bins. However, the proposed rehashing schemes aim to redistribute the hash entries uniformly, which is not appropriate for the Bayesian approach, another enhancement that optimizes the recognition rate in the presence of noise. In this paper, we derive the rehashing for the Bayesian voting scheme, thus improving computational performance by minimizing the hash table size and the number of bins accessed, while maintaining an optimal recognition rate.
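As a concrete illustration of the hash-table voting this abstract builds on, the following sketch implements basic geometric hashing for 2D point sets under similarity transformations. It is a minimal illustration of the classic method only, without the rehashing or Bayesian extensions the paper discusses; the bin size, model data, and all names are assumptions:

```python
import math
from collections import defaultdict

def quantize(x, y, step=0.25):
    """Map invariant coordinates to a discrete hash-bin key."""
    return (round(x / step), round(y / step))

def basis_coords(p, b0, b1):
    """Coordinates of p in the frame defined by the ordered basis pair (b0, b1);
    they are invariant under translation, rotation, and uniform scaling."""
    ux, uy = b1[0] - b0[0], b1[1] - b0[1]
    dx, dy = p[0] - b0[0], p[1] - b0[1]
    norm2 = ux * ux + uy * uy
    return ((dx * ux + dy * uy) / norm2, (-dx * uy + dy * ux) / norm2)

def build_table(models):
    """Preprocessing: hash every model point under every ordered basis pair."""
    table = defaultdict(list)
    for name, pts in models.items():
        for i, b0 in enumerate(pts):
            for j, b1 in enumerate(pts):
                if i != j:
                    for k, p in enumerate(pts):
                        if k not in (i, j):
                            key = quantize(*basis_coords(p, b0, b1))
                            table[key].append((name, (i, j)))
    return table

def recognize(table, scene):
    """Recognition: pick one scene basis, hash the rest, vote for (model, basis)."""
    votes = defaultdict(int)
    b0, b1 = scene[0], scene[1]          # a real system would try many bases
    for p in scene[2:]:
        for entry in table.get(quantize(*basis_coords(p, b0, b1)), []):
            votes[entry] += 1
    return max(votes, key=votes.get) if votes else None

model_pts = [(0, 0), (4, 0), (1, 3), (3, 1), (2, 5), (5, 2)]
table = build_table({"modelA": model_pts})
# The scene is the model under an arbitrary similarity transformation.
s, t = 1.8, 0.7
scene = [(s * (x * math.cos(t) - y * math.sin(t)) + 5,
          s * (x * math.sin(t) + y * math.cos(t)) - 2) for x, y in model_pts]
result = recognize(table, scene)
```

The non-uniform bin occupancy that motivates rehashing shows up directly in `table`: entries cluster in a few bins, which is what the paper's Bayesian rehashing redistributes.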

Roberto Ierusalimschy (Department of Computer Science, PUC-Rio, Rio de Janeiro, Brazil, roberto@inf.puc-rio.br); Luiz Henrique de Figueiredo (IMPA–Instituto de Matemática Pura e Aplicada, Rio de Janeiro, Brazil, lhf@impa.br); Waldemar Celes (Department of ...

Addresses topics on formal languages, types of languages, grammars, types of grammars, types of automata, and their classification.

This paper studies blockchain technology, which is receiving strong attention from industry and is being applied in many fields. Based on the data-security properties of blockchain, we analyze the application of blockchain technology to the problem of data integrity and transparency for text documents. First, the paper presents an overview of the technology components constituting a blockchain and their relevance and optimization for the data authentication/protection problem. Next, we present an experimental application of blockchain that forms a network which stores documents and maintains the integrity and transparency of the stored documents for external queries.
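The core mechanism behind document integrity in a blockchain — chaining document digests so any edit invalidates every later block — can be sketched briefly. This is a minimal hash-chain illustration assuming SHA-256, not the paper's system; all names are assumptions:

```python
import hashlib
import json

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def make_block(prev_hash: str, document: bytes) -> dict:
    """A minimal block: the document digest is chained to the previous block's hash."""
    block = {"prev": prev_hash, "doc": sha256_hex(document)}
    block["hash"] = sha256_hex(json.dumps(block, sort_keys=True).encode())
    return block

def verify_chain(chain, documents) -> bool:
    """Recompute every link; an edited document or reordered block breaks the chain."""
    if len(chain) != len(documents):
        return False
    prev = "0" * 64
    for block, doc in zip(chain, documents):
        expected = {"prev": prev, "doc": sha256_hex(doc)}
        expected["hash"] = sha256_hex(json.dumps(expected, sort_keys=True).encode())
        if block != expected:
            return False
        prev = block["hash"]
    return True

docs = [b"contract v1", b"contract v2", b"audit report"]
chain, prev = [], "0" * 64
for d in docs:
    block = make_block(prev, d)
    chain.append(block)
    prev = block["hash"]
```

Because each block commits to the previous block's hash, verifying the final hash transitively verifies every stored document — the integrity property the paper relies on.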

With the advancement in data storage technology, the cost per gigabyte has reduced significantly, causing users to negligently store redundant files on their systems. These may be created while taking manual backups or by improperly written programs. Often, files with exactly the same content have different file names, and files with different content may have the same name. Hence, an algorithm that identifies redundant files based on their file name and/or size alone is not enough. In this paper, the authors propose a novel approach where the N-layer hash of every file is individually calculated and stored in a hash table data structure. If the N-layer hash of a file matches a hash that already exists in the hash table, that file is marked as a duplicate, which can be deleted or moved to a specific location as per the user's choice. The hash table data structure helps achieve O(n) time complexity, and the use of N-layer hashes improves the accuracy of identifying redundant files. This approach can be used for a folder-specific, drive-specific, or system-wide scan as required.
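The table-driven scan described above can be sketched as follows. Re-hashing the digest N times is one simple reading of an "N-layer" hash — an illustrative assumption, not necessarily the authors' exact construction — and in-memory byte strings stand in for real files:

```python
import hashlib

def n_layer_hash(data: bytes, layers: int = 3) -> str:
    """Apply the hash function repeatedly (assumed reading of 'N-layer' hash)."""
    digest = data
    for _ in range(layers):
        digest = hashlib.sha256(digest).digest()
    return digest.hex()

def find_duplicates(files: dict) -> dict:
    """Map each duplicate file name to the first file seen with the same content.
    `files` maps name -> bytes; with real files you would stream from disk."""
    seen = {}          # content hash -> first file name with that hash
    duplicates = {}    # duplicate name -> original name
    for name, content in files.items():
        h = n_layer_hash(content)
        if h in seen:
            duplicates[name] = seen[h]
        else:
            seen[h] = name
    return duplicates

files = {
    "report.txt": b"quarterly numbers",
    "backup/report_copy.txt": b"quarterly numbers",   # same content, new name
    "notes.txt": b"unrelated content",
}
dups = find_duplicates(files)
```

Each lookup and insertion into `seen` is expected O(1), so scanning n files is O(n) overall, matching the abstract's claim.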

Communication Preserving Protocols for Secure Function Evaluation. Moni Naor, Kobbi Nissim (Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel; naor, kobbi@wisdom.weizmann.ac.il) ...

In SAS® Version 9.1, the hash table - the very first object introduced via the DATA Step Component Interface in Version 9.0 - has finally become robust and syntactically stable. The philosophy and application style of the hash object is quite different from any other structure ever used in the DATA step before. The most notable departure from tradition is its run-time nature: hash objects are instantiated and deleted, allocate memory, and are updated, all at run time. Intuitively, it is clear that such traits should make for very inventive and flexible programming unseen in the DATA step of yore. Still better, Version 9.2 has added new methods, attributes, and parameters. This paper includes both hash propaedeutics and material intended for programmers already familiar with SAS hashigana at basic to very advanced levels. A number of truly dynamic programming techniques utterly unthinkable before the advent of the canned hash objects in SAS are explored and explained using liv...

This article presents a new Fast Hash-based File Existence Checking (FHFEC) method for archiving systems. During the archiving process, there are many submissions that are actually unchanged files that do not need to be re-archived. In this system, instead of comparing the entire files, only digests of the files are compared. Strong cryptographic hash functions with a low probability of collision can be used as digests. We propose a fast algorithm to check whether a certain hash, that is, a corresponding file, is already stored in the system. The algorithm is based on dividing the whole domain of hashes into equally sized regions and on an array that holds exactly one pointer for each region. Each pointer points to the location of the first stored hash from the corresponding region, and has a null value if no hash from that region exists. The entire structure can be stored in random-access memory or, alternatively, on a dedicated hard disk. A statistical performance analysis shows that in certain cases FHFEC performs nearly optimally, and extensive simulations have confirmed these analytical results. The performance of FHFEC has been compared to that of binary search (BIS) and B+-trees, which are commonly used in file systems and databases for table indices. The results show that FHFEC significantly outperforms both.
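The region/pointer-array scheme described above can be sketched as follows. Digests are truncated to 64 bits and the region count is an assumed parameter, purely to keep the illustration small; all names are assumptions:

```python
import bisect
import hashlib

REGION_BITS = 8                  # 2**8 equally sized regions (assumed parameter)
NUM_REGIONS = 1 << REGION_BITS
HASH_BITS = 64                   # digests truncated to 64 bits for the sketch

def digest(data: bytes) -> int:
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def region_of(h: int) -> int:
    return h >> (HASH_BITS - REGION_BITS)

def build_index(hashes):
    """Sort the stored hashes and build one pointer per region: the position of
    the first stored hash falling in that region, or None if the region is empty."""
    store = sorted(hashes)
    pointers = [None] * NUM_REGIONS
    for pos, h in enumerate(store):
        r = region_of(h)
        if pointers[r] is None:
            pointers[r] = pos
    return store, pointers

def exists(store, pointers, h) -> bool:
    """Membership test that touches only the slice of the store for h's region."""
    r = region_of(h)
    if pointers[r] is None:
        return False
    start = pointers[r]
    end = len(store)
    for nxt in range(r + 1, NUM_REGIONS):   # first non-empty region after r
        if pointers[nxt] is not None:
            end = pointers[nxt]
            break
    i = bisect.bisect_left(store, h, start, end)
    return i < end and store[i] == h

archived = [digest(f"file{i}".encode()) for i in range(1000)]
store, pointers = build_index(archived)
```

Because cryptographic hashes are close to uniformly distributed, each region holds roughly `n / NUM_REGIONS` entries, which is why the lookup can be nearly optimal in the cases the analysis identifies.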

In recent years, reconstructing a sparse map from a simultaneous localization and mapping (SLAM) system on a conventional CPU has undergone remarkable progress. However, obtaining a dense map from the system often requires a high-performance GPU to accelerate computation. This paper proposes a dense mapping approach that can remove outliers and obtain a clean 3D model using a CPU in real time. The approach processes keyframes and establishes data association using multi-threading. Outliers are removed through change detection on associated vertices between keyframes. The implicit surface data of inliers is represented by a truncated signed distance function and fused with an adaptive weight. A global hash table and a local hash table are used to store and retrieve surface data for data reuse. Experimental results show that the proposed approach can precisely remove outliers in the scene and obtain a dense 3D map with better visual quality in real time.
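The weighted fusion step mentioned above is, in its standard form, a running weighted average of signed-distance observations per voxel, with the accumulated weight capped. A minimal sketch of that update rule (names and the cap value are assumptions, and the paper's adaptive weighting is simplified to a fixed per-observation weight):

```python
def fuse_tsdf(voxel, d_new, w_new, w_max=100.0):
    """Fuse a new truncated signed distance observation into a voxel as a
    weighted running average; the weight is capped so the model can still adapt.
    (Sketch of the standard TSDF update rule; names and the cap are assumptions.)"""
    d, w = voxel
    d = (w * d + w_new * d_new) / (w + w_new)
    w = min(w + w_new, w_max)
    return (d, w)

voxel = (0.0, 0.0)                        # (tsdf value, accumulated weight)
for obs in [0.10, 0.12, 0.08, 0.10]:      # noisy signed-distance observations
    voxel = fuse_tsdf(voxel, obs, 1.0)
```

The hash tables in the paper index voxels by their 3D coordinates so that only voxels near observed surfaces are allocated, which is what makes CPU-only dense fusion tractable.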

Interpolation is a technique used to find the value of a function f(x) at any given point x, given the function's values at a certain set of points. The interpolating function is constructed so that it passes through all the given points. This paper aims to analyze the problem of polynomial interpolation.
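A standard construction for the problem described above is the Lagrange form of the interpolating polynomial, sketched here as one concrete instance (the paper's own analysis may use a different form):

```python
def lagrange_interpolate(points, x):
    """Evaluate the Lagrange interpolating polynomial through `points` at x;
    the polynomial passes through every (x_i, y_i) exactly."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        term = yi
        for j, (xj, _) in enumerate(points):
            if i != j:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

pts = [(0.0, 1.0), (1.0, 3.0), (2.0, 7.0)]   # samples of f(x) = x*x + x + 1
```

Evaluating at one of the sample points returns the sample value exactly, which is the defining property of an interpolant.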

We present an efficient lock-free algorithm for parallel accessible hash tables with open addressing, which promises more robust performance and reliability than conventional lock-based implementations. "Lock-free" means that it is guaranteed that at least one process always completes its operation within a bounded number of steps. On a single-processor architecture our solution is as efficient as sequential hash tables, and this also holds on a multiprocessor architecture when all processors have comparable speeds. The algorithm tolerates processors with widely different speeds, including processors that come to a halt. It can easily be implemented in C-like languages and requires, on average, only constant time for inserting, deleting, or accessing elements. The algorithm allows the hash tables to grow and shrink as needed. Lock-free algorithms are hard to design correctly, even when apparently straightforward, and ensuring the correctness of the design at the earliest possible stage is a major challenge in any responsible system development. In view of the complexity of the algorithm, we turned to the interactive theorem prover PVS for mechanical support. We employ standard deductive verification techniques to prove around 200 invariance properties of our algorithm, and describe how this is achieved with the theorem prover PVS. CR Subject Classification (1991): D.1 Programming techniques. AMS Subject Classification (1991): 68Q22 Distributed algorithms, 68P20 Information storage and retrieval.
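The lock-free machinery itself is too involved for a short sketch, but the open-addressing layout it builds on can be shown sequentially. The sketch below uses linear probing and plain writes where the paper's algorithm would use atomic operations; it also omits deletion and is not lock-free. All names and the growth policy are assumptions:

```python
class OpenAddressingTable:
    """Open addressing with linear probing (sequential sketch; the paper's
    lock-free version replaces these plain writes with atomic operations)."""
    EMPTY = object()

    def __init__(self, capacity=8):
        self.capacity = capacity
        self.slots = [self.EMPTY] * capacity
        self.values = [None] * capacity
        self.size = 0

    def _probe(self, key):
        """Walk the probe sequence until the key or a free slot is found."""
        i = hash(key) % self.capacity
        while self.slots[i] is not self.EMPTY and self.slots[i] != key:
            i = (i + 1) % self.capacity      # linear probing
        return i

    def insert(self, key, value):
        if self.size + 1 > self.capacity // 2:   # grow to keep probe chains short
            self._resize(self.capacity * 2)
        i = self._probe(key)
        if self.slots[i] is self.EMPTY:
            self.size += 1
        self.slots[i] = key
        self.values[i] = value

    def get(self, key, default=None):
        i = self._probe(key)
        return self.values[i] if self.slots[i] is not self.EMPTY else default

    def _resize(self, new_capacity):
        old = [(k, v) for k, v in zip(self.slots, self.values)
               if k is not self.EMPTY]
        self.capacity = new_capacity
        self.slots = [self.EMPTY] * new_capacity
        self.values = [None] * new_capacity
        self.size = 0
        for k, v in old:
            self.insert(k, v)

table = OpenAddressingTable()
for i in range(20):
    table.insert(f"key{i}", i)
```

The hard part the paper solves is making `insert`, `get`, and the resize concurrently safe without locks, which is precisely what required proving ~200 invariants in PVS.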

A vectorized algorithm for entering data into a hash table is presented. A program that enters multiple data items could not be executed on vector processors by conventional vectorization techniques because of data dependences. The proposed method enables ...
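One plausible reading of how such an algorithm breaks the data dependence is round-based insertion: each round, all pending keys compute a candidate slot as if in parallel, exactly one winner per free slot is committed, and the losers retry. This is an illustrative assumption about the general approach, sketched in plain Python rather than with real vector instructions; all names are assumptions:

```python
def batch_insert(keys, table_size):
    """Insert many (unique) keys in rounds, mimicking vectorized hash insertion:
    per round, candidate slots are computed for all pending keys as if in
    parallel, one winner per free slot is written conflict-free, and the rest
    advance to their next linear-probe slot."""
    table = [None] * table_size
    pending = [(k, hash(k) % table_size) for k in keys]
    rounds = 0
    while pending:
        rounds += 1
        claims = {}                               # slot -> winning key this round
        for k, slot in pending:
            if table[slot] is None and slot not in claims:
                claims[slot] = k
        for slot, k in claims.items():
            table[slot] = k                       # conflict-free writes
        committed = set(claims.values())
        # Every remaining key's slot is now occupied, so advance its probe.
        pending = [(k, (slot + 1) % table_size)
                   for k, slot in pending if k not in committed]
    return table, rounds

table, rounds = batch_insert([f"k{i}" for i in range(10)], 32)
```

Within a round, no two writes touch the same slot, which is the property that makes the loop body expressible as vector operations on a vector processor.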

This paper presents a new parallel indexing data structure for answering queries. The index, called Bin-Hash, offers extremely high levels of concurrency and is therefore well suited for emerging commodity parallel processors, such as multi-cores, Cell processors, and general-purpose graphics processing units (GPUs). The Bin-Hash approach first bins the base data, and then partitions and separately stores the ...

This paper makes two contributions. First, it presents an architecture for customized content delivery for Ambient Intelligent Environments. We demonstrate how physical peers made up of a Bluetooth-based network of Java-enabled mobile phones can be used to provide customized content delivery from the web without the need for a dedicated web connection per device. Second, we present two algorithms, Self-OrganiziNG random walkerS (SONGS) and peer-to-peeR self-organIZed tEmporary overlayS (PRIZES), both providing mechanisms of temporary overlay formation in limited-connectivity ad-hoc networks. SONGS is an extension of the k-random-walk algorithm, whereas PRIZES is a forest-fire-type flooding mechanism. We then show how adding even naive self-organization to these algorithms significantly reduces leftover queries as well as latency in terms of hop counts.