Mukesh Rawat - Academia.edu

Papers by Mukesh Rawat

NLP based grievance redressal system for Indian Railways

arXiv, 2021

Rail Madad can be used by any railway customer to raise complaints about Indian Railways service delivery. It is a well-automated and robust grievance-redressal system accessible through a mobile app and a helpline number. The case study cited in the footnote shows, however, that grievances received on the social media accounts of Indian Railways are still analyzed manually. The system can therefore be improved by adding a plugin that automatically screens grievances posted on these accounts. The posted texts can be processed with text-analysis techniques to identify actionable tasks. We propose a plugin that checks whether the information a customer posts is complete, and then converts the grievance into actionable tasks, reducing human intervention and improving the efficiency of Rail Madad.
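
The completeness check such a plugin performs could, for example, be rule-based. Below is a minimal sketch assuming a keyword/regex screen; the required fields and patterns are hypothetical illustrations, not the system's actual rules:

```python
import re

# Hypothetical illustration: fields a grievance post might need before it
# can be routed automatically (not the paper's actual rule set).
REQUIRED_FIELDS = {
    "pnr": re.compile(r"\b\d{10}\b"),                                   # 10-digit PNR
    "train": re.compile(r"\btrain\s*(no\.?|number)?\s*\d{4,5}\b", re.I),
    "issue": re.compile(r"\b(delay|refund|clean|food|ac|coach|security)\b", re.I),
}

def missing_fields(post: str) -> list[str]:
    """Return the required fields that could not be found in the post."""
    return [name for name, pat in REQUIRED_FIELDS.items() if not pat.search(post)]

post = "Train 12952 coach B4 AC not working, PNR 8524163970, please help"
gaps = missing_fields(post)
if gaps:
    print("Ask the customer for:", ", ".join(gaps))   # incomplete grievance
else:
    print("Complete; route to the concerned department")
```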

Text Summarization using Extractive Techniques

2021 3rd International Conference on Advances in Computing, Communication Control and Networking (ICAC3N)

Design and Implementation of Novel Techniques for Content-Based Ranking of Web Documents

Process Mining Techniques for Pattern Recognition

Text Summarization Using Extractive Techniques

Process Mining Techniques for Pattern Recognition

Automatic Document Collection

Document classification is now an important research area, as large volumes of electronic documents are available as unstructured, semi-structured, and structured information. It applies to the World Wide Web, e-book sites, online forums, e-mail, blogs, digital libraries, and online government repositories, so organizing this information through proper categorization and knowledge discovery is essential. This paper surveys the existing literature and explores the techniques involved in automatic document classification, i.e. document representation, knowledge extraction, and classification, and proposes an algorithm and architecture for automatic document collection.

Automatic extraction of domain specific hidden data for efficient response by search engine

International Journal of Research and Engineering, 2017

As the amount of information on the web grows daily, much relevant data is available as hypertext. Users can view the publicly indexable pages of any website, but some information is hidden behind search forms and can be viewed only after filling in the appropriate search interface; the information behind such interfaces is known as hidden data. Current web crawlers use various methods to fetch this deep data from the web and deliver it to users according to their needs. The purpose of fetching such deep information is to provide a large set of relevant responses to the user's search query. This paper suggests a new technique that automatically fills in search interfaces, submits the forms to the World Wide Web, collects the response web pages returned after submission, and filters the relevant pages from these responses according to the user's query.
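
A minimal sketch of the automatic fill-and-submit step, assuming a simple HTML search form; the URL and field values are hypothetical and this is not the paper's exact algorithm:

```python
import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

SEED_URL = "https://example.com/books"   # hypothetical site with a search form
FILL_VALUES = {"q": "information retrieval", "category": "computer-science"}

def fill_and_submit(url: str) -> str:
    """Find the first form on the page, fill its fields, submit it, and
    return the HTML of the response (hidden-web) page."""
    page = requests.get(url, timeout=10)
    soup = BeautifulSoup(page.text, "html.parser")
    form = soup.find("form")
    if form is None:
        raise ValueError("no form found on the seed page")
    action = requests.compat.urljoin(url, form.get("action", url))
    data = {}
    for inp in form.find_all("input"):
        name = inp.get("name")
        if not name:
            continue
        # Use our domain-specific value if we have one, else keep the default
        # (this preserves hidden fields that carry form state).
        data[name] = FILL_VALUES.get(name, inp.get("value", ""))
    method = form.get("method", "get").lower()
    resp = requests.post(action, data=data, timeout=10) if method == "post" \
        else requests.get(action, params=data, timeout=10)
    return resp.text

html = fill_and_submit(SEED_URL)
print(len(html), "bytes of hidden-web response downloaded")
```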

Domain Independent Automatic Cluster Generation of Documents

Review of Information Retrieval models

A large amount of information from all domains is available online as hypertext in web pages. People from different disciplines consult different websites to fetch the information they need, and it is very difficult to remember the names of the websites for a specific domain. A search engine is therefore a system that mines information from the World Wide Web and presents it to the user according to a query. The information retrieval system (IRS) behind a search engine arranges web documents systematically and retrieves results matching the user query. In this paper we discuss the widely used information retrieval models, their evaluation parameters, and their applications.
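
As one example of the widely used models such a review covers, a minimal tf-idf vector space sketch with cosine ranking on a toy corpus (illustrative only):

```python
import math
from collections import Counter

docs = ["web search engine", "information retrieval models", "web information search"]
query = "web search"

# Vector space model: documents and the query become tf-idf vectors,
# then documents are ranked by cosine similarity to the query.
tokenized = [d.lower().split() for d in docs]
vocab = sorted({w for doc in tokenized for w in doc})
n = len(tokenized)
idf = {w: math.log(n / sum(w in doc for doc in tokenized)) + 1 for w in vocab}

def vec(tokens):
    tf = Counter(tokens)
    return [tf[w] * idf[w] for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

qvec = vec(query.lower().split())
for doc, dvec in sorted(zip(docs, map(vec, tokenized)),
                        key=lambda p: -cosine(qvec, p[1])):
    print(f"{cosine(qvec, dvec):.3f}  {doc}")
```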

An Improved Extraction Algorithm from Domain Specific Hidden Web

The web contains a large amount of information that grows every day. The World Wide Web consists of the Surface Web (the publicly indexed web) and the Deep Web, which holds hidden data and is also referred to as the Hidden Web, Deepnet, or the Invisible Web. A user can access the surface web directly through a search engine, but to reach hidden data the user must manually enter keywords into a search interface on the source website. The problem we address is devising efficient mechanisms to extract this information automatically beforehand, since crawlers cannot access it otherwise. In this paper we present a mechanism that extracts search forms from HTML pages spread over the web, automatically fills in and submits those forms at their source sites, and downloads the hidden-web pages into a repository for later use by web crawlers.
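
A minimal sketch of the form-extraction step (the fill-and-submit step is sketched under the earlier hidden-data paper above); the selection heuristic here is a hypothetical illustration, not the paper's algorithm:

```python
from urllib.parse import urljoin
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def extract_search_forms(html: str, base_url: str) -> list[dict]:
    """Describe each probable search form on a page: its action URL,
    HTTP method, and the names of its text fields."""
    soup = BeautifulSoup(html, "html.parser")
    forms = []
    for form in soup.find_all("form"):
        # Heuristic: keep only forms with at least one named text/search box.
        text_fields = [i.get("name") for i in form.find_all("input")
                       if i.get("type", "text") in ("text", "search") and i.get("name")]
        if not text_fields:
            continue
        forms.append({
            "action": urljoin(base_url, form.get("action", "")),
            "method": form.get("method", "get").lower(),
            "fields": text_fields,
        })
    return forms

html = '<form action="/search" method="get"><input type="text" name="q"></form>'
print(extract_search_forms(html, "https://example.com"))
```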

Efficiency measures for ranked pages by Markov Chain Principle

International Journal of Information Technology

More often than not, users visit the web documents that appear among the top few responses in the list of links returned by the search engine, and these results are the most likely to be accurate for the search query. Information retrieval by a search engine helps in retrieving the most relevant pages for a query. In this paper we propose a link-analysis technique based on the web graph structure, focusing on the ranking of links. The relevance of a link is evaluated using the Markov chain principle, and the occurrence of query keywords is also weighted into the overall ranking. Term proximity and discounted cumulative gain are used to simulate results, and the scores show that the proposed methodology efficiently improves the ranking of web pages.
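
As a point of reference for the Markov chain principle used here, a minimal sketch that ranks pages by the stationary distribution of a random-surfer Markov chain on a toy web graph. The graph and damping value are illustrative; this is not the paper's exact formulation, which additionally weights query keyword occurrence:

```python
import numpy as np

# Toy web graph: adjacency[i][j] = 1 if page i links to page j (hypothetical).
adjacency = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

# Row-stochastic transition matrix of the random-surfer Markov chain.
out_degree = adjacency.sum(axis=1, keepdims=True)
P = adjacency / out_degree
damping = 0.85
n = len(P)
G = damping * P + (1 - damping) / n   # teleportation makes the chain ergodic

# Power iteration converges to the stationary distribution (the page scores).
rank = np.full(n, 1.0 / n)
for _ in range(100):
    rank = rank @ G
print("stationary scores:", np.round(rank, 3))
```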

Review of crawling technique of Search Engine

Scientific Journal of India

An enhanced Boolean retrieval model for efficient searching

Scientific Journal of India

A large amount of information from all domains is available online as hypertext in web pages. People from different domains consult different websites to fetch the information they need, and it is very difficult to remember the names of the websites for a specific domain. A search engine is therefore a system that mines information from the World Wide Web and presents it to the user according to a query. The information retrieval system (IRS) behind a search engine arranges web documents systematically and retrieves results matching the user query. In this paper an efficient Boolean retrieval model is proposed that retrieves results according to the Boolean operations specified among the terms of the search query. The proposed model is also capable of storing large indexes.
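
A minimal sketch of classical Boolean retrieval over an inverted index, the baseline such a model builds on (toy corpus; not the paper's enhanced model): each term maps to the set of documents containing it, and AND/OR/NOT become set operations.

```python
docs = {
    1: "web search engine crawls documents",
    2: "boolean retrieval model for search",
    3: "web documents are indexed by the engine",
}

# Build the inverted index: term -> set of document IDs.
index: dict[str, set[int]] = {}
for doc_id, text in docs.items():
    for term in text.lower().split():
        index.setdefault(term, set()).add(doc_id)

all_ids = set(docs)

def posting(term: str) -> set[int]:
    return index.get(term.lower(), set())

def AND(a, b): return a & b
def OR(a, b): return a | b
def NOT(a): return all_ids - a

# Query: web AND (boolean OR engine) AND NOT crawls
result = AND(posting("web"),
             AND(OR(posting("boolean"), posting("engine")),
                 NOT(posting("crawls"))))
print(sorted(result))  # -> [3]
```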

XML Representation of Web Document used by Search Engine

International Journal of Engineering Trends and Technology

Information retrieval is the part of computer science that studies the recovery of information from a collection of written documents. Searches can be based on either full-text or other content-based indexing. The retrieved documents aim to satisfy a user's information need, usually expressed in natural language. XML is a highly structured language used to represent the logical structure of a document. This structural nature gives the user of an XML retrieval system the ability to issue more complex and precise queries than those used in traditional flat (unstructured) document retrieval. Users can exploit the structure of XML documents to restrict their search to specific structural elements within the XML document collection. Keywords: Information Retrieval, Structured Data, Semi-Structured Data, Data Centric, Indexing.
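
A minimal sketch of element-restricted querying, assuming a toy XML document and the XPath subset supported by Python's ElementTree (illustrative; not the paper's system):

```python
import xml.etree.ElementTree as ET

# Hypothetical toy document: the search is restricted to <abstract>
# elements instead of matching anywhere in the flat text.
doc = ET.fromstring("""
<article>
  <title>XML Representation of Web Documents</title>
  <abstract>Structured queries over XML allow element-level retrieval.</abstract>
  <body>Flat retrieval would have to match anywhere in this text.</body>
</article>
""")

query = "retrieval"
# ElementTree supports a limited XPath subset; .//abstract selects the element.
for elem in doc.findall(".//abstract"):
    if query in elem.text.lower():
        print("match in <abstract>:", elem.text.strip())
```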

Review of Web Clustering Algorithms and Evaluation

International Journal of Engineering Trends and Technology

Clustering is a procedure for dividing a set of data objects into meaningful sub-classes, called clusters. It discovers groups of data objects that are similar to each other in some sense: the members of a cluster resemble one another more than they resemble members of other clusters. The goal of clustering is to find high-quality clusters such that inter-cluster similarity is low and intra-cluster similarity is high. Clustering can be done by various techniques, for example hierarchical, partitioning, density-based, and grid-based methods. Hierarchical clustering is a method of cluster analysis that seeks to build a hierarchy of clusters, and it generally falls into two types. Agglomerative: a "bottom-up" approach in which every observation starts in its own cluster and pairs of clusters are merged as one moves up the hierarchy. Divisive: a "top-down" approach in which all observations start in one cluster and splits are performed recursively as one moves down the hierarchy. The purpose of clustering is to group the data from a massive data set and transform it into a sensible form for further use. Clustering is a major task in data analysis and data mining applications. Keywords: Clustering, Hierarchical clustering, Subclasses, Agglomerative Hierarchical clustering, Divisive Hierarchical clustering.
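
A minimal sketch of the agglomerative (bottom-up) variant using SciPy, on hypothetical 2-D points:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy 2-D points (hypothetical data) forming two visible groups.
points = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
                   [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])

# Agglomerative clustering: start with singleton clusters and repeatedly
# merge the two closest clusters (average linkage here).
Z = linkage(points, method="average")

# Cut the hierarchy into 2 flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 1 2 2 2]
```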

Comparison and Analysis of Two Approaches to Find Novel Documents out of Several Documents

International Journal of Computer Applications

Design and Analysis of Visual Insight of Child using IBM Multimedia Analysis and Retrieval System

International Journal of Computer Applications

Humans recognize objects with astonishing ease and speed; here we use behavioral methods to investigate the sequence of processes involved in visual object recognition in natural scenes. One of the great challenges of our time is to understand how our brains function. Images play a big role in representing the inner feelings of a child or a person: images of the daily events of his or her life are created in memory, so the images a child selects reflect the child's visual insight. The system developed here serves that purpose, and its results are reported in terms of accuracy compared with other face recognition systems.

The Hybrid Technique for Edge Detection using Bio-inspired Techniques

International Journal of Computer Applications

Image processing is the technique applied to process digital information stored in the form of images. Edge detection is the image-processing technique that detects the points at which image properties change sharply. In this paper a bee-colony-based edge detection technique is proposed as an enhanced version of an existing edge detection technique based on ant colony optimization. The proposed technique is implemented in MATLAB, and analysis shows that it performs well in terms of accuracy and execution time.
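
The bee-colony optimization itself is not reproduced here; as a baseline illustration of what an edge detector computes, a minimal gradient-magnitude (Sobel) sketch on a synthetic image:

```python
import numpy as np

def sobel_edges(img: np.ndarray, thresh: float = 0.25) -> np.ndarray:
    """Conventional gradient-magnitude edge map (baseline illustration,
    not the paper's bee-colony method): convolve with Sobel kernels,
    then threshold the gradient magnitude."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            patch = img[i - 1:i + 2, j - 1:j + 2]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    mag = np.hypot(gx, gy)
    return mag > thresh * mag.max()

# Tiny synthetic image: dark left half, bright right half -> vertical edge.
img = np.zeros((8, 8))
img[:, 4:] = 1.0
print(sobel_edges(img).astype(int))
```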

Comparison of Keyword based Clustering of Web Documents by using OPENSTACK 4J and by Traditional Method

International Journal of Computer Applications, 2016

Data Optimization Techniques using Bloom Filter in Big Data

International Journal of Computer Applications, 2016

Due to the advent of new technologies, devices, and communication means such as social networking sites, the amount of data produced by mankind grows rapidly every year. Traditional computing techniques are not enough to process such large amounts of data. Hadoop is a technology stack with the capacity to store large amounts of data on data nodes, and it uses the MapReduce algorithm to process and analyze large-scale datasets over large clusters; MapReduce is essential for Big Data processing. The algorithm divides a task into small parts, assigns those parts to many computers connected over the network, and collects the results to form the final result dataset. A Bloom filter is a probabilistic data structure used to make data processing more efficient: implemented alongside the mapper, it can reduce the amount of data that travels across the network. In this paper we implement a Bloom filter in the Hadoop architecture, which helps to reduce network traffic, saving bandwidth as well as data storage.
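
A minimal sketch of the Bloom filter idea and of how a mapper could use it to avoid shuffling irrelevant records (illustrative Python, not the paper's Hadoop implementation):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch: k hash functions set k bits; membership
    tests may yield false positives but never false negatives."""
    def __init__(self, size: int = 1024, hashes: int = 3):
        self.size, self.hashes = size, hashes
        self.bits = [False] * size

    def _positions(self, item: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(item))

# In a MapReduce join, the mapper can drop records whose key is certainly
# absent from the other table, so they are never shuffled over the network.
keys_on_small_table = BloomFilter()
for k in ["user42", "user99"]:
    keys_on_small_table.add(k)

for record_key in ["user42", "user77"]:
    if record_key in keys_on_small_table:
        print("emit", record_key)        # shuffled to reducers
    else:
        print("filter out", record_key)  # network traffic saved
```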

Ranking of Web Documents for Domain Specific Database

International Journal of Computer Applications, 2016
