A Fast Pattern Matching Algorithm Using Changing Consecutive Characters
Related papers
ERS-A Algorithm for Pattern Matching
2013
Pattern matching algorithms have a wide range of applications, including information retrieval, text processing, DNA sequence analysis and pattern recognition. In this paper, we propose a new algorithm, ERS-A, that enhances both the two sliding windows (TSW) and Fast Pattern Matching (RS-A) algorithms. Both ERS-A and TSW use two sliding windows to scan the text from the left and the right simultaneously. However, while TSW uses the Berry-Ravindran (BR) bad-character shift function, ERS-A uses the shift technique of the RS-A algorithm instead. RS-A uses the four consecutive text characters immediately following the pattern window, rather than the two consecutive characters used in BR. The experimental results show that ERS-A significantly improves the search process.
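The two-character Berry-Ravindran shift that the abstract uses as its baseline can be sketched as follows. This is a minimal illustration with a single left-to-right window, not the ERS-A or RS-A method itself (those use two windows and four consecutive characters). The function names `br_shift` and `br_search` are hypothetical, and the shift is computed on demand rather than from a precomputed table.

```python
def br_shift(pattern, a, b):
    """Berry-Ravindran shift for the two text characters a, b that
    immediately follow the current window (None means past the text end)."""
    m = len(pattern)
    # Default: neither character can align with the pattern, skip past both.
    shift = m + 2
    # b may align with the first pattern character.
    if pattern[0] == b:
        shift = m + 1
    # The pair (a, b) may align with pattern[i], pattern[i + 1].
    for i in range(m - 1):
        if pattern[i] == a and pattern[i + 1] == b:
            shift = min(shift, m - i)
    # a may align with the last pattern character: smallest possible shift.
    if pattern[m - 1] == a:
        shift = 1
    return shift


def br_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    hits = []
    j = 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            hits.append(j)
        # Look at the two characters just past the window, if they exist.
        a = text[j + m] if j + m < n else None
        b = text[j + m + 1] if j + m + 1 < n else None
        j += br_shift(pattern, a, b)
    return hits
```

For example, `br_search("abracadabra", "abra")` returns `[0, 7]`. Looking at more characters past the window, as RS-A is described as doing, raises the maximum possible shift at the cost of a larger preprocessing table.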
Comparative Study between Various Pattern Matching Algorithms
IJCA, 2016
This paper surveys the work done to date in text searching, a subfield of Natural Language Processing (NLP). The work includes studying and analyzing several algorithms devised for this problem, identifying their weaknesses, and trying to increase their efficiency. Experiments are run on the Knuth-Morris-Pratt, naïve search and Boyer-Moore algorithms, with text inputs of various sizes, to analyze their behavior. The results show that the Boyer-Moore algorithm is more efficient than the others on larger data sets, while the Knuth-Morris-Pratt algorithm works well on larger alphabets. These algorithms do have drawbacks, as their efficiency depends on the alphabet and pattern size. The paper also describes a new pattern matching algorithm that uses a delimiter for shifting the pattern while matching.
A FAST STRING MATCHING ALGORITHM
Pattern matching is a well-known and important task in the pattern discovery process, used to find nucleotide or amino acid sequence patterns in protein sequence databases. Although pattern matching is commonly used in computer science, its applications cover a wide range, including editors and information retrieval. In this paper we propose a new pattern matching algorithm with improved performance compared to the well-known algorithms in the literature. The proposed algorithm was developed after a comparative study of well-known algorithms such as Boyer-Moore, Horspool and Raita. Its overall performance is improved by using the Horspool bad-character shift and by defining a fixed order of comparison. The proposed algorithm is compared with other well-known algorithms.
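The Horspool bad-character shift mentioned in the abstract can be sketched as follows. This is the standard Horspool search, shown only to illustrate the shift rule. It is not the proposed algorithm, whose fixed comparison order is not specified here, and the function name `horspool_search` is hypothetical.

```python
def horspool_search(text, pattern):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # For each character in pattern[:-1], store the distance from its
    # rightmost occurrence to the last pattern position.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits = []
    j = 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            hits.append(j)
        # Shift by the text character under the last window position;
        # characters absent from pattern[:-1] allow a full shift of m.
        j += shift.get(text[j + m - 1], m)
    return hits
```

For example, `horspool_search("abracadabra", "abra")` returns `[0, 7]`. The shift depends only on one text character, so it is applied in the same way after a match or a mismatch.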
COMPARATIVE ANALYSIS ON EFFICIENCY OF SINGLE STRING PATTERN MATCHING ALGORITHMS
Data is stored in different forms, but text remains the main form of exchanging information. The manipulation of text involves several problems, of which pattern matching is one. Pattern matching is routinely used in various computer applications, such as editors and information retrieval. A pattern matching algorithm matches the pattern exactly or approximately within the text. This paper presents a comparative analysis of various string pattern matching algorithms. Highly efficient algorithms such as the Brute Force algorithm, the Karp-Rabin algorithm and the Boyer-Moore algorithm are used for exact or approximate pattern matching on diverse systems. After performing a detailed study on the above mentioned algorithms
High performance pattern search algorithm using three sliding windows
International Journal of Computer Aided Engineering and Technology
In this work we propose a new pattern matching algorithm based on text segmentation, slicing the text into three segments. The idea is to preprocess the pattern string before searching for it in the text, so as to achieve a substantial speedup over existing algorithms that preprocess the text or the pattern, or that do no preprocessing at all, such as the Brute Force algorithm. The behavior of the algorithm depends on the occurrence of consecutive characters when a mismatch occurs. The preprocessing considers the three consecutive text characters that immediately follow the aligned pattern window when a pattern character and a text character mismatch. The algorithm makes use of three sliding patterns. The experimental results show that the proposed algorithm outperforms other algorithms even when the pattern is at the end of the text.
A Family of Exact Pattern Matching Algorithms with Multiple Adjacent Search Windows
2017
A new family of comparison-based exact pattern matching algorithms is presented. They use multi-dimensional arrays to process more than one adjacent search window in each iteration of the search loop. This approach leads to a lower average computing time at the cost of space. However, the excessive space consumption can be avoided by replacing a multi-dimensional array with a series of one-dimensional arrays of pointers. The algorithms of this family perform well for short or medium-length patterns, when a shift of the search window by several pattern lengths at once is quite probable. Our algorithms outperform all other known algorithms for some values of pattern length on English text, genomic sequences and random text over an alphabet of size 8 or 32.
Improved Approximate Multiple-Pattern String Matching using Consecutive N-Grams
International Journal of Computer Applications, 2013
String matching is the problem of finding all occurrences of a given pattern in a large text, both being sequences of characters drawn from a finite alphabet. This problem is fundamental in computer science and underlies many applications such as text retrieval, symbol manipulation, computational biology, data mining and network security. Bit parallelism is used to increase the processing speed of string matching algorithms. The standard Shift-OR algorithm is used to perform approximate string matching. The algorithm acts as a filter: besides detecting correct matches, it also reports false matches. To improve the efficiency of the basic Shift-OR algorithm by reducing the number of false matches it reports, a Shift-OR variant with consecutive q-grams has been implemented. Instead of reading a single character at a time, the algorithm reads q characters at once. Extensive experiments have been done with the algorithm, and the results are compared with the basic version of the Shift-OR algorithm. The number of false matches is reduced considerably. The gain is due to the improved filtering efficiency of q-grams.
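The bit-parallel idea behind Shift-OR can be sketched with the basic exact, single-pattern version below. This is only the textbook baseline under stated assumptions: it is not the approximate, multiple-pattern q-gram variant the abstract proposes, and the function name `shift_or_search` is hypothetical.

```python
def shift_or_search(text, pattern):
    """Return the start index of every exact occurrence of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    all_ones = (1 << m) - 1
    # masks[ch] has bit i cleared exactly when pattern[i] == ch.
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, all_ones) & ~(1 << i)
    # Bit i of state is 0 when pattern[:i + 1] matches the text ending here.
    state = all_ones
    hits = []
    for pos, ch in enumerate(text):
        # Shift in a 0 (a new potential match start), then OR in the
        # mask so that mismatching positions are set back to 1.
        state = ((state << 1) | masks.get(ch, all_ones)) & all_ones
        if (state & (1 << (m - 1))) == 0:
            hits.append(pos - m + 1)
    return hits
```

For example, `shift_or_search("abracadabra", "abra")` returns `[0, 7]`. Each text character costs one shift and one OR regardless of pattern length, which is what makes the bit-parallel approach fast for patterns that fit in a machine word; Python integers are unbounded, so this sketch does not enforce that limit.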
Analysis of Pattern Searching Algorithms and Their Application
International Journal of Recent Contributions from Engineering, Science & IT (iJES)
Data volumes are growing tremendously, so there is a great need to store and process data. The problem of pattern searching has many applications. When text or words are searched for in computer application systems, pattern searching is used to produce the search results. The purpose of pattern searching is to find one text within another. For example, searching for text in books by hand takes a long time and is hard work, and pattern searching saves that time and effort. If words similar to the requested text are found, they are underlined; otherwise no matches are displayed. This paper compares the speed of different pattern searching algorithms, namely the Naive, KMP, Rabin-Karp, Finite Automata, Boyer-Moore, Aho-Corasick and Z algorithms. We will test the time complexity of these algorithms in the three programming languages C#, Java and Python using ...