A NEW STRING MATCHING ALGORITHM

The exact string matching algorithms efficiency review

Exact string matching algorithms have been very significant in many applications over the last two decades, owing to advances in technology that produce large volumes of data. The main factors in string matching algorithms are the number of attempts, the number of character comparisons, and the running time. These factors are influenced by the type of algorithm, the type of data, the data size, and the length of the pattern used. In this article, we review the advantages and disadvantages of exact string matching algorithms. We conclude that the suffix automata and hybrid algorithms are the faster algorithms, with the lowest number of attempts, and that the hashing approaches have the lower number of comparisons. The bit-parallelism algorithms have similar limitations.
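To make the two counting metrics concrete, here is a minimal sketch that instruments a brute-force search. It assumes that an "attempt" means one alignment of the pattern against the text and a "comparison" means one character-to-character test; the function name is hypothetical and the review's own measurement setup may differ.

```python
def naive_search_with_counts(text, pattern):
    """Brute-force search that also reports attempts and character comparisons.

    Assumption: one attempt = one alignment of the pattern against the text.
    """
    n, m = len(text), len(pattern)
    matches, attempts, comparisons = [], 0, 0
    for start in range(n - m + 1):
        attempts += 1
        j = 0
        # Compare left to right until a mismatch or the end of the pattern.
        while j < m:
            comparisons += 1
            if text[start + j] != pattern[j]:
                break
            j += 1
        if j == m:
            matches.append(start)
    return matches, attempts, comparisons


if __name__ == "__main__":
    print(naive_search_with_counts("abracadabra", "abra"))
```

Algorithms that skip alignments (for example via shift tables) lower the attempt count, while algorithms that filter alignments before comparing characters (for example via hashing) lower the comparison count.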

A comparison of the performance of four exact string matching algorithms

2007 IEEE International Conference on Electro/Information Technology, 2007

There are numerous exact string matching algorithms that have similar performance characteristics. Which algorithm is best depends on the length of the pattern being searched for, the number of letters in the alphabet, and the particular architecture on which the program is executed. Results of experiments on different recent microprocessors with different word lengths are reported.

A unified view to string matching algorithms

Lecture Notes in Computer Science, 1996

We present a unified view to sequential algorithms for many pattern matching problems, using a finite automaton built from the pattern which uses the text as input. We show the limitations of deterministic finite automata (DFA) and the advantages of using a bitwise simulation of non-deterministic finite automata (NFA). This approach gives very fast practical algorithms which have good complexity for small patterns on a RAM machine with word length O(log n), where n is the size of the text. For generalized string matching the time complexity is O(mn / log n), which for small patterns is linear. For approximate string matching we show that the two main known approaches to the problem are variations of the NFA simulation. For this case we present a different simulation technique which gives a running time of O(n) independently of the maximum number of errors allowed, k, for small patterns. This algorithm improves the best bit-wise or comparison based algorithms of running time O(kn) and can be used as a basic block for algorithms with good average case behavior. We also formalize previous bit-wise simulation of general NFAs achieving O(mn log log n / log n) time.
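The bitwise NFA simulation idea can be illustrated with the classical Shift-And technique for exact matching. This is a sketch of that standard technique only, not the generalized or approximate algorithms of the paper. Bit i of `state` is set when `pattern[:i + 1]` matches the text ending at the current character. Python integers are arbitrary precision, so the word-length restriction discussed in the abstract is not modeled here.

```python
def shift_and_search(text, pattern):
    """Exact matching by bit-parallel simulation of the pattern's NFA (Shift-And)."""
    m = len(pattern)
    if m == 0:
        return []
    # masks[ch] has bit i set when pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)
    accept = 1 << (m - 1)  # bit of the final NFA state
    state = 0
    matches = []
    for pos, ch in enumerate(text):
        # Advance every active state by one and restart at state 0,
        # then keep only the states whose pattern character equals ch.
        state = ((state << 1) | 1) & masks.get(ch, 0)
        if state & accept:
            matches.append(pos - m + 1)
    return matches


if __name__ == "__main__":
    print(shift_and_search("abracadabra", "abra"))
```

Each text character costs a constant number of word operations as long as the pattern fits in one machine word, which is where the "small patterns" condition comes from.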

A FAST STRING MATCHING ALGORITHM

Pattern matching is a well-known and important task in the pattern discovery process, used for finding nucleotide or amino acid sequence patterns in protein sequence databases. Although pattern matching is commonly used in computer science, its applications cover a wide range, including editors and information retrieval. In this paper we propose a new pattern matching algorithm with improved performance compared to the well-known algorithms in the literature. The proposed algorithm evolved from a comparative study of well-known algorithms such as Boyer-Moore, Horspool, and Raita. Its overall performance is improved by using the shift provided by the Horspool bad-character rule and by defining a fixed order of comparison. The proposed algorithm has been compared with other well-known algorithms.
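For reference, the Horspool bad-character shift mentioned above can be sketched as follows. This is plain Horspool, not the proposed algorithm: the abstract does not specify its fixed comparison order, so the window check here is an ordinary slice comparison, and the function name is hypothetical.

```python
def horspool_search(text, pattern):
    """Horspool search: shift by the text character under the window's last position."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Distance from the rightmost occurrence of each character
    # (excluding the last pattern position) to the end of the pattern.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        # Characters absent from the table allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return matches


if __name__ == "__main__":
    print(horspool_search("abracadabra", "abra"))
```

Algorithms in this family, Raita's among them, keep a bad-character shift of this kind and vary the order in which the window's characters are compared.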

A New Efficient Hybrid String Matching Algorithm to Solve the Exact String Matching Problem

String matching algorithms are among the most studied in computer science because of the fundamental role they play in many applications, such as information retrieval, editors, security applications, firewalls, and biological applications. This study introduces a new hybrid algorithm based on two well-known algorithms, namely the modified Horspool and SSABS hybrid algorithms. Two factors are used to analyze the proposed algorithm: the total number of character comparisons and the total number of attempts. The proposed hybrid algorithm, named ABSBMH, was tested on different types of standard data. ABSBMH shows fewer character comparisons than the other algorithms, but almost no difference in the number of attempts. This is because its preprocessing phase is based on the SSABS algorithm, which uses the same preprocessing phase as the Quick Search algorithm, and because the pattern lengths are selected randomly from the databases. The experimental results expose that
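The Quick Search preprocessing phase referred to above builds a shift table keyed on the text character immediately after the current window. The sketch below shows that table and a plain Quick Search loop around it. It is not the ABSBMH algorithm, whose comparison order is not given in the abstract, and the function names are hypothetical.

```python
def quick_search_shift_table(pattern):
    """Shift for each pattern character: distance from its rightmost occurrence to one past the end."""
    m = len(pattern)
    return {ch: m - i for i, ch in enumerate(pattern)}


def quick_search(text, pattern):
    """Quick Search: shift by the character just to the right of the window."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    shift = quick_search_shift_table(pattern)
    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos + m >= n:
            break  # no character to the right of the window
        # Characters absent from the pattern allow a shift of m + 1.
        pos += shift.get(text[pos + m], m + 1)
    return matches


if __name__ == "__main__":
    print(quick_search_shift_table("abra"))
    print(quick_search("abracadabra", "abra"))
```

Because the number of attempts is determined by the shift table alone, two algorithms that share this preprocessing would be expected to make similar numbers of attempts even when their comparison orders differ.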

COMPARATIVE ANALYSIS ON EFFICIENCY OF SINGLE STRING PATTERN MATCHING ALGORITHMS

Data is stored in different forms, but text remains the main form of exchanging information. The manipulation of text involves several problems, of which pattern matching is one. Pattern matching is routinely used in various computer applications, such as editors and information retrieval. A pattern matching algorithm matches the pattern exactly or approximately within the text. This paper presents a comparative analysis of various string pattern matching algorithms. Highly efficient algorithms such as the Brute Force algorithm, the Karp-Rabin algorithm, and the Boyer-Moore algorithm are used for exact or approximate pattern matching on diverse systems. After performing a detailed study on the above mentioned algorithms
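Of the algorithms named above, Karp-Rabin is the hashing-based one. A minimal sketch of its rolling hash follows. The `base` and `modulus` values are illustrative assumptions rather than values taken from the paper, and each hash hit is verified by a direct comparison so that collisions cannot produce false matches.

```python
def karp_rabin_search(text, pattern, base=256, modulus=1_000_000_007):
    """Karp-Rabin search with a polynomial rolling hash and explicit verification."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, modulus)  # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus
    matches = []
    for pos in range(n - m + 1):
        # Compare characters only when the hashes agree.
        if t_hash == p_hash and text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos < n - m:
            # Drop text[pos], shift the window, and append text[pos + m].
            t_hash = ((t_hash - ord(text[pos]) * high) * base + ord(text[pos + m])) % modulus
    return matches


if __name__ == "__main__":
    print(karp_rabin_search("abracadabra", "abra"))
```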

An innovative platform to improve the performance of exact string matching algorithms

Computing Research Repository, 2010

Exact String Matching is an essential issue in many computer science applications. Unfortunately, the performance of Exact String Matching algorithms, namely their execution time, does not meet the needs of these applications. This paper proposes a general platform, called the PXSMAlg platform, for improving the execution time of existing Exact String Matching algorithms. The function of this platform is to parallelize the

Faster Approximate String Matching

Algorithmica, 1999

We present a new algorithm for on-line approximate string matching. The algorithm is based on the simulation of a nondeterministic finite automaton built from the pattern and using the text as input. This simulation uses bit operations on a RAM machine with word length w = Ω(log n) bits, where n is the text size. This is essentially similar to the model used in Wu and Manber's work, although we improve the search time by packing the automaton states differently. The running time achieved is O(n) for small patterns (i.e., whenever mk = O(log n)), where m is the pattern length and k < m is the number of allowed errors. This is in contrast with the result of Wu and Manber, which is O(kn) for m = O(log n). Longer patterns can be processed by partitioning the automaton into many machine words, at O((mk/w) n) search cost. We allow generalizations in the pattern, such as classes of characters, gaps, and others, at essentially the same search cost.

An Improved Algorithm for Approximate String Matching

Given a text string, a pattern string, and an integer k, a new algorithm is presented for finding all occurrences of the pattern string in the text string with at most k differences. Both its theoretical and practical variants improve on the known algorithms.
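The k-differences problem itself can be stated with the classical dynamic-programming baseline (commonly attributed to Sellers), which runs in O(mn) time. The sketch below is that baseline, not the improved algorithm of the paper. It assumes "differences" means insertions, deletions, and substitutions, and it reports the text positions at which an approximate occurrence ends.

```python
def search_with_k_differences(text, pattern, k):
    """Return end positions in text where pattern matches with at most k differences."""
    m = len(pattern)
    # column[i] = minimum number of differences between pattern[:i]
    # and some substring of the text ending at the current position.
    column = list(range(m + 1))
    ends = []
    for pos, ch in enumerate(text):
        prev_diag = column[0]
        column[0] = 0  # an occurrence may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            new_val = min(prev_diag + cost,   # match or substitution
                          column[i] + 1,      # extra character in the text
                          column[i - 1] + 1)  # missing pattern character
            prev_diag = column[i]
            column[i] = new_val
        if column[m] <= k:
            ends.append(pos)
    return ends


if __name__ == "__main__":
    print(search_with_k_differences("abcdefg", "bcd", 1))
```

With k = 0 this reduces to exact matching; larger k admits occurrences such as "bc" or "bcde" for the pattern "bcd".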

String Matching Algorithms and their Applicability in various Applications

This paper describes the applicability of various string matching algorithms: which algorithm is best in which application, and why. It describes the optimal algorithm for various activities that include string matching as an important aspect of functionality. In all applications, the test string and the pattern class always need to be matched.
