Uttam Chauhan - Academia.edu
Papers by Uttam Chauhan
Lecture Notes in Networks and Systems, Dec 31, 2022
International Journal of Computing and Digital Systems, Apr 16, 2023
Recommendation systems help users select appropriate products or services from a wide range of choices, solving the problem of information overload to a remarkable extent. They are especially applicable in industries that sell products or provide services online, where businesses can grow by putting recommendation systems into practice. In this review article, we offer an overview of recommendation systems and their variations and extensions. We address the numerous techniques used for recommendation systems, including content-based filtering, collaborative filtering, and sequential and session-based approaches, and give a comparison of each technique for detailed analysis. The review extends across a variety of dataset domains, such as movies, music, jobs, products, and books. Besides datasets, we discuss various applications of recommendation systems across multiple industry domains and survey the evaluation metrics used in a wide range of recommendation systems. Finally, we summarize the challenges recommendation systems face, addressing which helps make them more accurate and reliable.
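As a concrete illustration of the collaborative-filtering family surveyed above, here is a minimal user-based sketch in Python. The users, items, and ratings are invented for illustration; a real system would add rating normalization, implicit feedback, and far larger neighborhoods.

```python
from math import sqrt

def cosine(u, v):
    # cosine similarity over the items both users have rated
    common = set(u) & set(v)
    if not common:
        return 0.0
    num = sum(u[i] * v[i] for i in common)
    den = sqrt(sum(u[i] ** 2 for i in common)) * sqrt(sum(v[i] ** 2 for i in common))
    return num / den if den else 0.0

def recommend(target, ratings, k=2):
    # ratings: {user: {item: score}}; rank unseen items by similarity-weighted votes
    sims = {u: cosine(ratings[target], r) for u, r in ratings.items() if u != target}
    neighbours = sorted(sims, key=sims.get, reverse=True)[:k]
    scores = {}
    for u in neighbours:
        for item, score in ratings[u].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sims[u] * score
    return sorted(scores, key=scores.get, reverse=True)

ratings = {
    "alice": {"matrix": 5, "inception": 4},
    "bob": {"matrix": 5, "inception": 5, "dune": 4},
    "carol": {"dune": 2, "up": 5},
}
print(recommend("alice", ratings))
```

Content-based filtering would instead compare item feature vectors against a profile built from the items the target user already liked.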
Lecture Notes in Networks and Systems, 2023
ACM Transactions on Asian and Low-Resource Language Information Processing, Jan 31, 2021
A topic model is one of the best stochastic models for summarizing an extensive collection of text. It has achieved considerable success in text analysis as well as text summarization, and it can be applied to a set of documents represented as a bag of words, without considering grammar or word order. We modeled topics for a corpus of Gujarati news articles. Because Gujarati has a diverse morphological structure and is inflectionally rich, processing Gujarati text is comparatively complex. The size of the vocabulary plays an important role in the inference process and the quality of topics: as the vocabulary size increases, inference becomes slower and topic semantic coherence decreases. If the vocabulary is diminished, topic inference can be accelerated, and the quality of topics may also improve. In this work, we prepared a list of suffixes that occur very frequently in Gujarati words and reduced inflectional forms to their root words using this list. Moreover, Gujarati single-letter words were eliminated for faster inference and better topic quality. Experimentally, we show that reducing inflectional forms to root words shrinks the vocabulary to a significant extent and makes the topic formation process quicker. The inflectional-form reduction and single-letter-word removal also enhanced the interpretability of topics, which was assessed through semantic coherence, word length, and topic size. The experimental results showed improvements in the topical semantic coherence score, and topic size grew notably as the number of tokens assigned to the topics increased.
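The vocabulary-reduction step described above can be sketched as suffix stripping plus single-letter-word removal. The suffix list below is a small illustrative sample of common Gujarati endings, not the curated list used in the paper.

```python
# Illustrative, not exhaustive: a production system would use a curated,
# frequency-ranked suffix list. Longer suffixes come first so a short
# marker is not stripped when a longer ending matches.
SUFFIXES = ["ોમાં", "માં", "થી", "ના", "ની", "નો", "નું", "ને", "ઓ"]

def strip_suffix(word):
    # reduce an inflected form to its root by removing one known suffix
    for suf in SUFFIXES:
        if word.endswith(suf) and len(word) > len(suf) + 1:
            return word[: -len(suf)]
    return word

def reduce_vocabulary(tokens):
    # drop single-letter tokens, then map inflected forms onto root forms
    return [strip_suffix(t) for t in tokens if len(t) > 1]

print(sorted(set(reduce_vocabulary(["શહેરમાં", "શહેરના", "શહેર"]))))
```

Collapsing several inflected surface forms onto one root shrinks the vocabulary, which is exactly what speeds up inference in the experiments above.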
Lecture Notes in Networks and Systems, 2020
The rapid growth of text available on the Internet in a variety of forms demands in-depth research into automatic text summarization. The summary produced from one or more documents conveys the important message while being significantly shorter than the original text. At the same time, summarizing a massive text collection exhibits several challenges: besides time complexity, the degree of semantic similarity is one of the major issues in text summarization. Summarized text helps the user understand a large corpus much faster and with ease. In this paper, we review several categories of text summarization. First, techniques are studied under the categories of extractive and abstractive summarization. The review is then extended to text summarization using topic models, and we further incorporate machine learning approaches for summarizing large texts. We also present a comparison of techniques over numerous performance measures.
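A minimal example of the extractive category discussed above: score sentences by the corpus frequency of their words and keep the highest-scoring ones. This is a toy heuristic for illustration, not any specific method from the review.

```python
import re
from collections import Counter

def extractive_summary(text, n=1):
    # naive frequency-based extractive summarizer: score each sentence by the
    # corpus frequency of its words, keep the top-n sentences in document order
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))
    def score(s):
        return sum(freq[w] for w in re.findall(r"\w+", s.lower()))
    top = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))[:n]
    return " ".join(sentences[i] for i in sorted(top))

text = ("Topic models summarize large text collections. Cats sleep all day. "
        "Topic models reduce text collections to a small set of topics.")
print(extractive_summary(text))
```

Abstractive methods, by contrast, generate new sentences rather than selecting existing ones, which is why they need sequence-generation models instead of sentence scoring.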
Sensors, Mar 1, 2023
This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Blockchain Applications for Healthcare Informatics
Information Processing in Agriculture, 2021
Recently, many methods for plant disease detection have been introduced under the influence of deep neural networks in computer vision. However, the dearth of transparency in this line of research makes its adoption in real-world scenarios less appealing. We propose an architecture named ResTS (Residual Teacher/Student) that serves as both a visualization and a classification technique for diagnosing plant disease. ResTS is a tertiary adaptation of the formerly proposed Teacher/Student architecture. It is grounded in a convolutional neural network (CNN) structure comprising two classifiers (ResTeacher and ResStudent) and a decoder. The architecture trains both classifiers in a reciprocal mode, and the representation conveyed between ResTeacher and ResStudent is used as a proxy to visualize the dominant areas in the image for categorization. Experiments show that the proposed ResTS structure (F1 score: 0.991) surpasses the Teacher/Student architecture (F1 score: 0.972) and yields finer visualizations of disease symptoms. The novel ResTS architecture incorporates residual connections in all of its constituents and performs batch normalization after each convolution operation, unlike the formerly proposed Teacher/Student architecture for plant disease diagnosis. The residual connections in ResTS help preserve gradients and circumvent the problem of vanishing or exploding gradients, while batch normalization after each convolution aids swift convergence and increases reliability. All test results are obtained on the PlantVillage dataset, comprising 54,306 images of 14 crop species.
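The two ingredients highlighted above, residual connections and post-layer batch normalization, can be sketched in pure Python with a fully connected stand-in for the convolutional layers. This is a conceptual illustration of y = relu(BN(f(x)) + x) only, not the ResTS implementation.

```python
import math

def batch_norm(batch, eps=1e-5):
    # normalize each feature to zero mean / unit variance across the batch
    n, dims = len(batch), len(batch[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        mean = sum(x[d] for x in batch) / n
        var = sum((x[d] - mean) ** 2 for x in batch) / n
        for i, x in enumerate(batch):
            out[i][d] = (x[d] - mean) / math.sqrt(var + eps)
    return out

def residual_block(batch, layer):
    # y = relu(BN(layer(x)) + x): the identity path lets gradients flow unchanged
    transformed = batch_norm([layer(x) for x in batch])
    return [[max(0.0, t + xi) for t, xi in zip(tv, x)]
            for tv, x in zip(transformed, batch)]

double = lambda x: [2.0 * v for v in x]  # stand-in for a learned layer
print(residual_block([[1.0, 2.0], [3.0, 4.0]], double))
```

Because the input is added back after the transformed path, the block can never do worse than an identity mapping, which is the intuition behind the gradient-preservation claim.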
ACM Computing Surveys, 2022
We cannot deal with a mammoth text corpus without summarizing it into a relatively small subset; a computational tool is sorely needed to understand such a gigantic pool of text. Probabilistic topic modeling discovers and explains an enormous collection of documents by reducing it to a topical subspace. In this work, we study the background and advancement of topic modeling techniques. We first introduce the preliminaries of topic modeling and review its extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word-embedded topic models, and topic models from a multilingual perspective. Research on topic modeling in distributed environments and on topic visualization approaches is also explored. We further cover implementation and evaluation techniques for topic models in brief. Comparison matrices are shown over the experimental results of the various categories of topic modeling, and diverse technical challenges and future directions are discussed.
Sensors
With the rapid growth of data and processing over the cloud, it has become easier to access those data; on the other hand, this poses many technical and security challenges for the users of those provisions. Fog computing makes these technical issues manageable to some extent and is one of the promising solutions for handling the big data produced by the IoT, which are often security-critical and time-sensitive. Massive IoT data analytics through a fog computing structure is emerging and requires extensive research to enable more proficient knowledge and smart decisions. Though advancements in big data analytics are taking place, they do not consider fog data analytics. Many challenges remain, including heterogeneity, security, accessibility, resource sharing, network communication overhead, and the real-time processing of complex data. This paper explores various research challenges and their solutions using next-generation fog data analytics and IoT networks...
2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), Jul 1, 2013
A real-world challenge for an organization's webmaster is to match users' needs and keep their attention on the website. The only option is to capture users' intent and provide them with a recommendation list. Web usage mining is a data mining method used to provide intelligent, personalized online services such as web recommendations; for this, it is usually necessary to model users' web access behavior. Web usage mining includes three processes: preprocessing, pattern discovery, and pattern analysis. After completing these three phases, the required usage patterns can be found and used for specific needs. Data abstraction is achieved through data preprocessing. The aim of discovering frequent sequential access patterns in web log data is to obtain information about the navigational behavior of users. In the proposed system, an efficient sequential pattern mining algorithm is used to identify frequent sequential web access patterns. The access patterns are retrieved from a graph, which is then used for matching and generating web links for recommendations.
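A hedged sketch of the graph-based idea above: build a transition graph from session logs and recommend the most frequent successors of the current page. The session data and function names are illustrative; the paper's actual sequential pattern mining algorithm is more elaborate than raw bigram counts.

```python
from collections import defaultdict

def build_graph(sessions):
    # count page-to-page transitions observed across user sessions
    graph = defaultdict(lambda: defaultdict(int))
    for session in sessions:
        for a, b in zip(session, session[1:]):
            graph[a][b] += 1
    return graph

def recommend_next(graph, page, k=2):
    # recommend the most frequent successors of the current page
    successors = graph.get(page, {})
    return sorted(successors, key=lambda p: successors[p], reverse=True)[:k]

sessions = [
    ["home", "products", "cart"],
    ["home", "products", "reviews"],
    ["home", "about"],
]
g = build_graph(sessions)
print(recommend_next(g, "home"))
```

A sequential pattern miner generalizes this by finding frequent subsequences longer than one step, so the recommendation can depend on the whole navigation path rather than just the last page.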
Among the ample approaches available for classification, the majority are applicable to structured data, such as relational databases, online transactional processing, or online analytical processing. Today, in the age of the Internet and its applications, a very large amount of data is transmitted from one geographical location to another, and much of it is unstructured, which creates serious problems for knowledge derivation. Classification can achieve better accuracy using the vector space model in an adaptive manner. The Internet is now broadly used around the world, so spam in email is one of the major problems for people attached to today's Internet life; it causes hardware as well as financial damage to companies and to individual users. Among the various approaches developed to stop spam, filtering is an important and popular one. The aim of this paper is to use a fuzzy clustering approach fo...
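Since the abstract is truncated at the fuzzy-clustering step, here is a generic two-cluster fuzzy c-means sketch on one-dimensional data, purely to illustrate soft membership; it is not the paper's method, and the data and initialization are invented for illustration.

```python
def fuzzy_cmeans(points, m=2.0, iters=30):
    # two-cluster fuzzy c-means on 1-D points; every point belongs to every
    # cluster with a membership degree in [0, 1] rather than a hard label
    centers = [min(points), max(points)]  # deterministic initialization
    u = [[0.0, 0.0] for _ in points]
    for _ in range(iters):
        # update memberships from inverse relative distances to the centers
        for i, x in enumerate(points):
            d = [abs(x - c) + 1e-9 for c in centers]
            for j in range(2):
                u[i][j] = 1.0 / sum((d[j] / dk) ** (2.0 / (m - 1.0)) for dk in d)
        # update centers as membership-weighted means
        for j in range(2):
            weights = [u[i][j] ** m for i in range(len(points))]
            centers[j] = sum(w * x for w, x in zip(weights, points)) / sum(weights)
    return centers, u

pts = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
centers, u = fuzzy_cmeans(pts)
print(centers)
```

For spam filtering, the soft memberships let a borderline message carry partial weight in both the spam and non-spam clusters instead of being forced into one.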
International Journal of Management, IT and Engineering, 2012
The majority of previous studies in data mining have concentrated on structured data, such as relational, transactional, and data warehouse data. In actuality, however, an important portion of the available information is stored in text databases, which consist of large collections of documents from various sources, such as news articles, research papers, e-books, digital libraries, e-mails, and web pages, and which are growing to terabytes in size. Among the ample provisions of the Internet, e-mail is very useful and broadly used, and spam is the issue most strongly attached to it. Among the various approaches developed to stop spam emails, filtering is an important and popular one. In this paper, to categorize the spam and non-spam email arriving at an inbox, the KNNC classification method is applied; classification can achieve better accuracy using the vector space model in an adaptive manner. To obtain accuracy in spam classification we have us...
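A minimal vector-space KNN classifier of the kind described above: messages become term-frequency vectors, cosine similarity finds the k most similar labelled messages, and a majority vote decides. The toy training data is invented, and KNNC as used in the paper may differ in its distance and weighting choices.

```python
import math
from collections import Counter

def vectorize(text):
    # bag-of-words term-frequency vector
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def knn_classify(message, labelled, k=3):
    # labelled: list of (message, label); majority vote among the k nearest
    q = vectorize(message)
    nearest = sorted(labelled, key=lambda ml: cosine(q, vectorize(ml[0])),
                     reverse=True)[:k]
    votes = [label for _, label in nearest]
    return max(set(votes), key=votes.count)

training = [
    ("win free money now", "spam"),
    ("claim your free prize now", "spam"),
    ("meeting at noon tomorrow", "ham"),
    ("lunch meeting agenda attached", "ham"),
]
print(knn_classify("claim your free money prize", training))
```

The adaptive aspect mentioned in the abstract would correspond to updating the labelled set as new messages are confirmed spam or non-spam, so the vector space tracks the current mail stream.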
International Journal of Computer Applications, 2013
A real-world challenge for an organization's webmaster is to match users' needs and keep their attention on the website. The only option is to capture users' intent and provide them with a recommendation list. Web usage mining is a data mining method used to provide intelligent, personalized online services such as web recommendations; for this, it is usually necessary to model users' web access behavior. Web usage mining includes three processes: preprocessing, pattern discovery, and pattern analysis. After completing these three phases, the required usage patterns can be found and used for specific needs. Data abstraction is achieved through data preprocessing. The aim of discovering frequent sequential access patterns in web log data is to obtain information about the navigational behavior of users. In the proposed system, an efficient sequential pattern mining algorithm is used to identify frequent sequential web access patterns. The access patterns are retrieved from a graph, which is then used for matching and generating web links for recommendations.
The Oxford Journal of Intelligent Decision and Data Science, 2018
Information retrieval has gained a lot of attention from the research community. A huge amount of work has been done in text summarization, document classification, query-based document retrieval, text mining, web content analysis, web page classification, and much more. Ample research has already been undertaken for English, as it is a trade language across the world; however, progress in Gujarati text information retrieval has not followed this tendency. An enormous amount of Gujarati text is available under various titles, and terabytes of text are being made available in the form of newspapers on websites. In this paper, LDA is applied to Gujarati text for topic modeling. The Latent Dirichlet Allocation model is used to uncover the concealed topics in a collection of documents; it gives a better view of what a large collection of text is composed of. This paper aims mainly at two objectives: 1) it targets a corpus of Gujarati-language news articles for topic inference, and 2) it optimizes topic coherence by incorporating a relevant word set. The document set consists of news articles from a Gujarati newspaper published daily across the state of Gujarat in India.
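Topic coherence, the second objective above, can be approximated with a UMass-style score: word pairs from a topic that co-occur in many documents score higher. Conventions vary (word ordering, choice of denominator), so this simplified variant is a sketch; it assumes every topic word appears in at least one document.

```python
import math
from itertools import combinations

def umass_coherence(topic_words, documents):
    # UMass-style coherence: topic words that co-occur in many documents score
    # higher; the +1 smoothing avoids log(0) for pairs that never co-occur.
    docsets = [set(d) for d in documents]
    def df(*words):
        return sum(all(w in ds for w in words) for ds in docsets)
    score = 0.0
    for wi, wj in combinations(topic_words, 2):
        score += math.log((df(wi, wj) + 1) / df(wi))  # assumes df(wi) > 0
    return score

docs = [["tax", "budget", "minister"], ["tax", "budget", "vote"], ["cricket", "match"]]
print(umass_coherence(["tax", "budget"], docs))   # co-occurring pair
print(umass_coherence(["tax", "cricket"], docs))  # unrelated pair
```

A topic whose top words keep appearing in the same documents gets a score near zero or above, while a topic mixing unrelated words is pushed negative, which matches the intuition of "incorporating the relevant word set".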
Lecture notes in networks and systems, Dec 31, 2022
International Journal of Computing and Digital Systems
Recommendation Systems help users select appropriate products or services from a wide range of ch... more Recommendation Systems help users select appropriate products or services from a wide range of choices. Thus, It solves the problem of information overload upto a remarkable extent. Specifically, It is highly applicable in certain industries that sell the product online or provide the services online. Recommendation Systems are very relevant in such a domain because they can grow their business by putting it in the practice. In this review article, we offer an overview of the Recommendation Systems and their variations and extension. We address the numerous techniques used for Recommendation Systems, including content-based filtering, collaborative filtering, sequential, session-based, etc. A comparison has been given for each technique for detailed analysis. It extends the review for the variety of dataset domains, such as movies, music, jobs, products, books, etc. Besides datasets, We have discussed various applications of the recommendation Systems across multiple domains in the industry. We survey various evaluation metrics used in a wide range of Recommendation Systems. In the end, we summarized the different challenges posed by the recommendation Systems, which helps make them more accurate and reliable.
ACM Transactions on Asian and Low-Resource Language Information Processing
A topic model is one of the best stochastic models for summarizing an extensive collection of tex... more A topic model is one of the best stochastic models for summarizing an extensive collection of text. It has accomplished an inordinate achievement in text analysis as well as text summarization. It can be employed to the set of documents that are represented as a bag-of-words, without considering grammar and order of the words. We modeled the topics for Gujarati news articles corpus. As the Gujarati language has a diverse morphological structure and inflectionally rich, Gujarati text processing finds more complexity. The size of the vocabulary plays an important role in the inference process and quality of topics. As the vocabulary size increases, the inference process becomes slower and topic semantic coherence decreases. If the vocabulary size is diminished, then the topic inference process can be accelerated. It may also improve the quality of topics. In this work, the list of suffixes has been prepared that encounters too frequently with words in Gujarati text. The inflectional f...
Lecture notes in networks and systems, 2023
ACM Transactions on Asian and Low-Resource Language Information Processing, Jan 31, 2021
A topic model is one of the best stochastic models for summarizing an extensive collection of tex... more A topic model is one of the best stochastic models for summarizing an extensive collection of text. It has accomplished an inordinate achievement in text analysis as well as text summarization. It can be employed to the set of documents that are represented as a bag-of-words, without considering grammar and order of the words. We modeled the topics for Gujarati news articles corpus. As the Gujarati language has a diverse morphological structure and inflectionally rich, Gujarati text processing finds more complexity. The size of the vocabulary plays an important role in the inference process and quality of topics. As the vocabulary size increases, the inference process becomes slower and topic semantic coherence decreases. If the vocabulary size is diminished, then the topic inference process can be accelerated. It may also improve the quality of topics. In this work, the list of suffixes has been prepared that encounters too frequently with words in Gujarati text. The inflectional forms have been reduced to the root words concerning the suffixes in the list. Moreover, Gujarati single-letter words have been eliminated for faster inference and better quality of topics. Experimentally, it has been proved that if inflectional forms are reduced to their root words, then vocabulary length is shrunk to a significant extent. It also caused the topic formation process quicker. Moreover, the inflectional forms reduction and single-letter word removal enhanced the interpretability of topics. The interpretability of topics has been assessed on semantic coherence, word length, and topic size. The experimental results showed improvements in the topical semantic coherence score. Also, the topic size grew notably as the number of tokens assigned to the topics increased.
International Journal of Computing and Digital Systems, Apr 16, 2023
Lecture notes in networks and systems, 2020
The rapid growth in the text available on the Internet in the variety of forms demands in-depth r... more The rapid growth in the text available on the Internet in the variety of forms demands in-depth research for summarizing text automatically. The summarized form produced from one or multiple documents conveys the important message, which is significantly shorter than the original text. At the same time, summarizing the massive text collection exhibits several challenges. Besides the time complexity, the semantic similarity degree is one of the major issues in text summarization. The summarized text helps the user understand the large corpus much faster and with ease. In this paper, we reviewed several categories of text summarization. First, the techniques have been studied under the category of extractive and abstractive text summarization. Then, a review has been extended for text summarization using topic models. Furthermore, we incorporated machine learning aspects to summarize the large text. We also presented a comparison of techniques over numerous performance measures.
Sensors, Mar 1, 2023
This article is an open access article distributed under the terms and conditions of the Creative... more This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY
Blockchain Applications for Healthcare Informatics
Information Processing in Agriculture, 2021
Abstract Recently many methods have been induced for plant disease detection by the influence of ... more Abstract Recently many methods have been induced for plant disease detection by the influence of Deep Neural Networks in Computer Vision. However, the dearth of transparency in these types of research makes their acquisition in the real-world scenario less approving. We propose an architecture named ResTS (Residual Teacher/Student) that can be used as visualization and a classification technique for diagnosis of the plant disease. ResTS is a tertiary adaptation of formerly suggested Teacher/Student architecture. ResTS is grounded on a Convolutional Neural Network (CNN) structure that comprises two classifiers (ResTeacher and ResStudent) and a decoder. This architecture trains both the classifiers in a reciprocal mode and the conveyed representation between ResTeacher and ResStudent is used as a proxy to envision the dominant areas in the image for categorization. The experiments have shown that the proposed structure ResTS (F1 score: 0.991) has surpassed the Teacher/Student architecture (F1 score: 0.972) and can yield finer visualizations of symptoms of the disease. Novel ResTS architecture incorporates the residual connections in all the constituents and it executes batch normalization after each convolution operation which is dissimilar to the formerly proposed Teacher/Student architecture for plant disease diagnosis. Residual connections in ResTS help in preserving the gradients and circumvent the problem of vanishing or exploding gradients. In addition, batch normalization after each convolution operation aids in swift convergence and increased reliability. All test results are attained on the PlantVillage dataset comprising 54 306 images of 14 crop species.
ACM Computing Surveys, 2022
We are not able to deal with a mammoth text corpus without summarizing them into a relatively sma... more We are not able to deal with a mammoth text corpus without summarizing them into a relatively small subset. A computational tool is extremely needed to understand such a gigantic pool of text. Probabilistic Topic Modeling discovers and explains the enormous collection of documents by reducing them in a topical subspace. In this work, we study the background and advancement of topic modeling techniques. We first introduce the preliminaries of the topic modeling techniques and review its extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word embedded topic models, and topic models in multilingual perspectives. Besides, the research work for topic modeling in a distributed environment, topic visualization approaches also have been explored. We also covered the implementation and evaluation techniques for topic models in brief. Comparison matrices have been shown over the experimental results of the various categories of topic modeling....
Sensors
With the rapid growth in the data and processing over the cloud, it has become easier to access t... more With the rapid growth in the data and processing over the cloud, it has become easier to access those data. On the other hand, it poses many technical and security challenges to the users of those provisions. Fog computing makes these technical issues manageable to some extent. Fog computing is one of the promising solutions for handling the big data produced by the IoT, which are often security-critical and time-sensitive. Massive IoT data analytics by a fog computing structure is emerging and requires extensive research for more proficient knowledge and smart decisions. Though an advancement in big data analytics is taking place, it does not consider fog data analytics. However, there are many challenges, including heterogeneity, security, accessibility, resource sharing, network communication overhead, the real-time data processing of complex data, etc. This paper explores various research challenges and their solution using the next-generation fog data analytics and IoT networks...
2013 Fourth International Conference on Computing, Communications and Networking Technologies (ICCCNT), Jul 1, 2013
A real world challenging task of the web master of an organization is to match the needs of user ... more A real world challenging task of the web master of an organization is to match the needs of user and keep their attention in their web site. So, only option is to capture the intuition of the user and provide them with the recommendation list. Web usage mining is a kind of data mining method that provide intelligent personalized online services such as web recommendations, it is usually necessary to model users" web access behavior. Web usage mining includes three process, namely, preprocessing, pattern discovery and pattern analysis. After the completion of these three phases the user can find the required usage patterns and use this information for the specific needs. The data abstraction is achieved through data preprocessing. The aim of discovering frequent sequential access patterns in Web log data is to obtain information about the navigational behavior of the users. In the proposed system, an efficient sequential pattern mining algorithm is used to identify frequent sequential web access patterns. The access patterns are retrieved from a Graph, which is then used for matching and generating web links for recommendations.
Among the ample of approaches available for classification approach, majority of them are applica... more Among the ample of approaches available for classification approach, majority of them are applicable for structured data, such as relational database, Online Transactional Data Processing or Online Analytical Data processing. But, today in the age of internet and its applications, very huge amount of data are being transmitted from one geographical location to another. The form of data would be in unstructured and it may create serious problem for knowledge derivation. Classification can work for better accuracy using Vector Space Model in adaptive manner. Today the internet is broadly used around the world. So the spam in the email or in the machine is the one of the major problem for persons who have attached today’s internet life and it causes hardware as well as financial damage to the companies and also to the individual users. Among various approaches developed to stop spam, filtering is an important and popular one. The aim of this paper is to use fuzzy clustering approach fo...
International Journal of Management, IT and Engineering, 2012
The majority of previous data mining studies have concentrated on structured data, such as relational, transactional, and data warehouse data. In actuality, however, an important portion of the available information is stored in text databases, which consist of large collections of documents from various sources, such as news articles, research papers, e-books, digital libraries, e-mails, and web pages. Moreover, this collection keeps growing and now measures in the terabytes. Among the many services the internet provides, e-mail is one of the most useful and widely used, and spam is the issue most strongly attached to it. Among the various approaches developed to stop spam emails, filtering is an important and popular one. In this paper, to categorize the spam and non-spam email that arrives at our email id, the KNN classification method is used; classification accuracy is improved by applying the Vector Space Model in an adaptive manner. For getting accuracy in spam classification we have us...
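A minimal sketch of KNN classification over a Vector Space Model for spam filtering; the training messages, tokenizer, and value of k below are illustrative assumptions, and the paper's actual feature weighting is not reproduced here:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency dicts."""
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def knn_classify(train, query, k=3):
    """train: list of (term-frequency dict, label); returns the majority
    label among the k nearest neighbours by cosine similarity."""
    ranked = sorted(train, key=lambda d: cosine(d[0], query), reverse=True)
    labels = [lbl for _, lbl in ranked[:k]]
    return Counter(labels).most_common(1)[0][0]

def tf(text):
    """Bag-of-words term frequencies."""
    return Counter(text.lower().split())

# Hypothetical labelled messages
train = [
    (tf("win free prize money now"), "spam"),
    (tf("free lottery win cash"), "spam"),
    (tf("meeting agenda attached for monday"), "ham"),
    (tf("project report review meeting"), "ham"),
]
print(knn_classify(train, tf("claim your free prize now"), k=3))  # → spam
```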
International Journal of Computer Applications, 2013
A real-world challenge for an organization's web master is to match the needs of users and keep their attention on the web site. The only option is to capture the users' intent and provide them with a recommendation list. Web usage mining is a data mining method used to provide intelligent, personalized online services such as web recommendations; for this, it is usually necessary to model users' web access behavior. Web usage mining includes three processes, namely preprocessing, pattern discovery, and pattern analysis. After these three phases are complete, the required usage patterns can be found and used for specific needs. Data abstraction is achieved through data preprocessing. The aim of discovering frequent sequential access patterns in web log data is to obtain information about the navigational behavior of users. In the proposed system, an efficient sequential pattern mining algorithm is used to identify frequent sequential web access patterns. The access patterns are retrieved from a graph, which is then used for matching and generating web links for recommendations.
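The pipeline described above can be illustrated with a toy frequent-sequence miner over clickstream sessions. The session data, support threshold, and pattern length are hypothetical, and a production system would use a full sequential-pattern algorithm (e.g., PrefixSpan) rather than simple bigram counting:

```python
from collections import defaultdict

def frequent_sequences(sessions, length=2, min_support=2):
    """Count contiguous page subsequences of a given length across user
    sessions and keep those meeting the support threshold."""
    counts = defaultdict(int)
    for session in sessions:
        for i in range(len(session) - length + 1):
            counts[tuple(session[i:i + length])] += 1
    return {seq: c for seq, c in counts.items() if c >= min_support}

def recommend(patterns, current_page):
    """Recommend next pages whose frequent pattern starts at the current
    page, ranked by support (an edge list of a simple access graph)."""
    cands = [(seq[1], c) for seq, c in patterns.items() if seq[0] == current_page]
    return [page for page, _ in sorted(cands, key=lambda x: -x[1])]

# Hypothetical preprocessed sessions from a web server log
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "contact"],
    ["home", "products", "cart", "checkout"],
]
patterns = frequent_sequences(sessions, length=2, min_support=2)
print(recommend(patterns, "products"))  # → ['cart']
```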
The Oxford Journal of Intelligent Decision and Data Science, 2018
Information retrieval has gained a lot of attention from the research community. A great deal of work is being done in text summarization, document classification, query-based document retrieval, text mining, web content analysis, web page classification, and more. In addition, ample research has already been conducted for English, as it is the trade language across the world; however, progress in Gujarati text information retrieval has not followed this tendency. There is an enormous amount of Gujarati text available under various titles, and terabytes of text are being made available in the form of newspapers on websites. In this paper, LDA has been applied to Gujarati text for modeling topics. The Latent Dirichlet Allocation model is used to uncover the concealed topics in a collection of documents. It gives a better view of a large collection of text: what exactly it is composed of. This paper aims mainly at two objectives: 1) it targets a Gujarati-language news articles corpus for topic inference, and 2) it optimizes topic coherence by incorporating the relevant word set. The document set consists of news articles from a Gujarati newspaper published daily across the state of Gujarat in India.
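LDA inference of the kind used above can be sketched with a toy collapsed Gibbs sampler. The English toy corpus and hyperparameters below are illustrative stand-ins for the Gujarati news corpus, and a real experiment would use a library implementation (e.g., Gensim) over a properly preprocessed vocabulary:

```python
import random
from collections import defaultdict

def gibbs_lda(docs, K=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Toy collapsed Gibbs sampler for LDA. docs: list of token lists.
    Returns the per-topic word counts from the final sampling state."""
    rng = random.Random(seed)
    V = len({w for d in docs for w in d})
    z = [[rng.randrange(K) for _ in d] for d in docs]
    ndk = [[0] * K for _ in docs]               # document-topic counts
    nkw = [defaultdict(int) for _ in range(K)]  # topic-word counts
    nk = [0] * K                                # topic totals
    for di, d in enumerate(docs):
        for wi, w in enumerate(d):
            t = z[di][wi]
            ndk[di][t] += 1; nkw[t][w] += 1; nk[t] += 1
    for _ in range(iters):
        for di, d in enumerate(docs):
            for wi, w in enumerate(d):
                t = z[di][wi]
                ndk[di][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
                # collapsed conditional p(z = k | rest)
                weights = [(ndk[di][k] + alpha) * (nkw[k][w] + beta)
                           / (nk[k] + V * beta) for k in range(K)]
                t = rng.choices(range(K), weights=weights)[0]
                z[di][wi] = t
                ndk[di][t] += 1; nkw[t][w] += 1; nk[t] += 1
    return nkw

# Hypothetical two-theme corpus (sports vs. politics)
docs = [
    "cricket bat ball cricket team".split(),
    "election vote minister election party".split(),
    "ball team cricket match".split(),
    "vote party minister poll".split(),
]
topics = gibbs_lda(docs, K=2)
top_words = [sorted(t, key=t.get, reverse=True)[:3] for t in topics]
```

Inspecting `top_words` shows which words dominate each inferred topic, which is exactly the "topic coherence" that vocabulary pruning aims to improve.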
ACM Computing Surveys, 2021
We are not able to deal with a mammoth text corpus without summarizing it into a relatively small subset. A computational tool is sorely needed to understand such a gigantic pool of text. Probabilistic topic modeling discovers and explains an enormous collection of documents by reducing it to a topical subspace. In this work, we study the background and advancement of topic modeling techniques. We first introduce the preliminaries of topic modeling and review its extensions and variations, such as topic modeling over various domains, hierarchical topic modeling, word-embedded topic models, and topic models from a multilingual perspective. Besides, research on topic modeling in distributed environments and topic visualization approaches has also been explored. We also cover implementation and evaluation techniques for topic models in brief. Comparison matrices are shown over the experimental results of the various categories of topic modeling, and diverse technical challenges and future directions are discussed.
Information Processing in Agriculture, 2021
Recently, many methods have been introduced for plant disease detection under the influence of Deep Neural Networks in Computer Vision. However, the dearth of transparency in this type of research makes its adoption in real-world scenarios less convincing. We propose an architecture named ResTS (Residual Teacher/Student) that can be used as a visualization and classification technique for diagnosing plant disease. ResTS is a tertiary adaptation of the formerly suggested Teacher/Student architecture. ResTS is grounded in a Convolutional Neural Network (CNN) structure that comprises two classifiers (ResTeacher and ResStudent) and a decoder. This architecture trains both classifiers in a reciprocal mode, and the representation conveyed between ResTeacher and ResStudent is used as a proxy to visualize the dominant areas in the image for categorization. Experiments have shown that the proposed ResTS structure (F1 score: 0.991) surpasses the Teacher/Student architecture (F1 score: 0.972) and can yield finer visualizations of disease symptoms. The novel ResTS architecture incorporates residual connections in all its constituents and executes batch normalization after each convolution operation, unlike the formerly proposed Teacher/Student architecture for plant disease diagnosis. Residual connections in ResTS help preserve the gradients and circumvent the problem of vanishing or exploding gradients, while batch normalization after each convolution operation aids swift convergence and increased reliability. All test results are attained on the PlantVillage dataset comprising 54,306 images of 14 crop species.
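The two architectural choices highlighted in the abstract, residual skip connections and batch normalization after each operation, can be illustrated with a dense-layer stand-in. This is not the actual ResTS convolutional block, just a NumPy sketch of the pattern under assumed shapes:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalize each feature over the batch dimension (inference-style,
    without the learned scale and shift, for brevity)."""
    return (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps)

def residual_block(x, W1, W2):
    """Dense stand-in for a ResTS-style block: linear op -> batch norm
    -> ReLU, twice, then add the skip connection."""
    h = np.maximum(batch_norm(x @ W1), 0.0)
    h = batch_norm(h @ W2)
    # skip connection keeps a direct gradient path to the input
    return np.maximum(h + x, 0.0)

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))          # batch of 8 feature vectors
W1 = rng.normal(size=(16, 16)) * 0.1  # hypothetical weights
W2 = rng.normal(size=(16, 16)) * 0.1
out = residual_block(x, W1, W2)
```

Because the skip term is an identity map, the gradient of `out` with respect to `x` always contains a direct component, which is what prevents it from vanishing through depth.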