Implementing collection of sets with trie: a stepping stone for performances?

The main operations of the Set Collection abstract data type (ADT) are insertion, search, and deletion. A well-known option for implementing these operations is a hash table. Although a hash table does not offer good worst-case time complexity, it is efficient in practice. Another option is the data structure known as the trie. The trie is useful for two main reasons. First, with such a data structure, the operations mentioned above admit very good theoretical time complexities. Second, a trie can be seen as a compact representation of a collection of sets, since the parts that the sets have in common are merged together. The aim of this article is to evaluate the performance of the trie data structure. The Java language provides an abstract class corresponding to the Set Collection ADT operations. We propose in this article three different implementations of this abstract class. All of them are variations on the way the children of nodes are managed. Theoretical complexities are then evaluated. A...
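As a rough illustration of the idea in this abstract, the following is a minimal sketch of a collection of sets stored in a trie. It is not the article's implementation: the class name `SetTrie`, the restriction to integer elements, and the use of a `HashMap` to hold each node's children are all assumptions made for the example. Each set is sorted so that it maps to a single path, and sets sharing a sorted prefix share the corresponding nodes.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

/** Hypothetical sketch: a collection of integer sets stored as a trie. */
class SetTrie {
    private static final class Node {
        // Assumption: children kept in a HashMap; an array or a sorted
        // list would be other ways to manage the children of a node.
        final Map<Integer, Node> children = new HashMap<>();
        boolean endOfSet; // true if the path from the root to here is a stored set
    }

    private final Node root = new Node();

    /** Inserts the set; returns false if it was already present. */
    boolean insert(Collection<Integer> set) {
        Node node = root;
        // Sorted order gives exactly one canonical path per set.
        for (int element : new TreeSet<Integer>(set)) {
            node = node.children.computeIfAbsent(element, k -> new Node());
        }
        boolean added = !node.endOfSet;
        node.endOfSet = true;
        return added;
    }

    /** Returns true if exactly this set is stored. */
    boolean contains(Collection<Integer> set) {
        Node node = root;
        for (int element : new TreeSet<Integer>(set)) {
            node = node.children.get(element);
            if (node == null) return false;
        }
        return node.endOfSet;
    }

    /** Removes the set; returns false if it was not present. */
    boolean remove(Collection<Integer> set) {
        List<Integer> elements = new ArrayList<>(new TreeSet<Integer>(set));
        List<Node> path = new ArrayList<>();
        Node node = root;
        path.add(node);
        for (int element : elements) {
            node = node.children.get(element);
            if (node == null) return false;
            path.add(node);
        }
        if (!node.endOfSet) return false;
        node.endOfSet = false;
        // Prune, bottom-up, nodes that no longer lead to any stored set.
        for (int i = elements.size(); i > 0; i--) {
            Node child = path.get(i);
            if (child.endOfSet || !child.children.isEmpty()) break;
            path.get(i - 1).children.remove(elements.get(i - 1));
        }
        return true;
    }
}
```

In this sketch each operation walks one node per element of the (sorted) set, so its cost depends on the size of the set being handled rather than on how many sets are stored; the sorting step and the choice of child container add their own costs.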

JAVA COLLECTION FRAMEWORK

Introduction: A collections framework is a set of interfaces and classes that provides a unified architecture for representing and manipulating collections. It is a library, a toolbox of interfaces and classes. This toolbox holds various collection interfaces, along with classes that serve as a powerful, object-oriented alternative to arrays. Interfaces allow collections to be manipulated independently of the details of their representation.

History: Versions of the Java platform before JDK 1.2 included a few data structure classes but did not contain a collections framework. Arrays and the Vector and Hashtable classes were the standard means of grouping Java objects; unfortunately, they were not easy to extend and did not implement a standard member interface. Several independent frameworks were developed to address the need for reusable collection data structures. The most widely used were Doug Lea's Collections package and the ObjectSpace Generic Collection Library (JGL), whose goal was consistency with the C++ Standard Template Library (STL). The Collections Framework was designed and developed primarily by Joshua Bloch and was introduced in JDK 1.2. It reused many ideas and classes from Doug Lea's Collections package. Doug Lea later developed a concurrency package containing new collection-related classes. An updated version of these concurrency utilities was included in JDK 5.0 as of JSR 166.

Benefits of JCF: JCF reduces programming effort by providing useful data structures and algorithms. It frees the user to concentrate on the program itself instead of the low-level plumbing needed to make it work. JCF increases program speed and quality: it provides high-performance, high-quality implementations of useful data structures and algorithms. Because the implementations of each interface are interchangeable, programs can easily be tuned by switching collection implementations.
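The interchangeability of implementations can be illustrated with a small sketch. The class and method names below (`InterchangeableSets`, `countDistinct`) are hypothetical and exist only for this example; `Set`, `HashSet`, and `TreeSet` are the standard `java.util` types. The method is written against the `Set` interface only, so the caller decides which implementation to use.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical example: code written against the Set interface only. */
class InterchangeableSets {
    /** Adds all words to the given set and returns the number of distinct words. */
    static int countDistinct(List<String> words, Set<String> target) {
        target.addAll(words);
        return target.size();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("trie", "hash", "trie", "set");

        // Same method, two different implementations of Set.
        Set<String> hashed = new HashSet<String>();  // hash table, no ordering guarantee
        Set<String> sorted = new TreeSet<String>();  // red-black tree, sorted iteration

        System.out.println(countDistinct(words, hashed)); // prints 3
        System.out.println(countDistinct(words, sorted)); // prints 3
    }
}
```

Only the line that constructs the set changes when one implementation is swapped for another; the rest of the program is unaffected.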
Because users are freed from the drudgery of writing their own data structures, they have more time to spend on improving the quality and performance of their programs.

Missing Data Structures in JCF: The Java Collections Framework includes a number of data structures; still, some are missing, such as Trie, Extensible Hashing, and Polyphase Merge.

Trie: Also known as a digital tree, and sometimes as a radix tree or prefix tree. It is an ordered tree data structure used to store an associative array whose keys are usually strings. Unlike a binary search tree, no node in the trie stores the key associated with that node; instead, its position in the tree defines the key with which it is associated.

Extensible Hashing: Described by Ronald Fagin in 1979. It is a type of hash system that treats a hash as a bit string and uses a trie for bucket lookup. Because of the hierarchical nature of the system, rehashing is an incremental operation.

Cascade Merge: Cascade merge sort is similar to polyphase merge sort but uses a simpler distribution. The merge is slower than a polyphase merge when there are fewer than six files, but faster when there are more than six.
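The trie described above, in which a node's position rather than a stored key defines the key, can be sketched as a small string-keyed map. This is only an illustration under stated assumptions: the class name `StringTrie`, the per-node `HashMap` of children, and the convention that a `null` value means "no key ends here" are choices made for the example, not part of the JCF or of the text above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: an associative array with string keys, stored as a trie. */
class StringTrie<V> {
    private static final class Node<V> {
        // No node stores its key; the characters on the path from the root spell it.
        final Map<Character, Node<V>> children = new HashMap<>();
        V value; // Assumption: null means no key ends at this node.
    }

    private final Node<V> root = new Node<V>();

    /** Associates the value with the key, creating nodes along the path as needed. */
    void put(String key, V value) {
        Node<V> node = root;
        for (char c : key.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node<V>());
        }
        node.value = value;
    }

    /** Returns the value stored for the key, or null if the key is absent. */
    V get(String key) {
        Node<V> node = root;
        for (char c : key.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return null;
        }
        return node.value;
    }
}
```

For example, after `put("tea", 1)` and `put("ten", 2)`, the two keys share the nodes for `t` and `e`; `get("te")` returns `null` because no key ends at that node.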

Implementing Sets with Hash Tables in Declarative Languages

1994

Abstract. Programming languages that use the set as their core data collection have two interesting features: first, many people from many different fields have experience in representing problems as relations between sets; second, sets are a suitable structure for exploiting data parallelism. This paper presents a technique for implementing sets in a logic programming system. It is based on hash tables and is aimed at a Subset Abstract Machine for the Subset Equational Language.
