Summit Haque - Academia.edu
Papers by Summit Haque
The inadequacy of spatially explicit and accessible data portals continues to be a substantial barrier for policymakers and concerned authorities in the least developed countries. The purpose of this study is to determine the potential of night-time light (NTL) data to measure spatial road infrastructure development. Day-Night Band (DNB) NTL data from the Visible Infrared Imaging Radiometer Suite (VIIRS) and Google Maps highway road data (RD) were used in this research. To analyze the relationship between VIIRS NTL and RD for two least developed countries, we performed the Chi-square test of independence, which revealed that the variables are dependent on one another. We then computed Cramér's V as a correlation coefficient to determine the strength of the association for both countries. Our findings revealed a correlation value of 0.334 in Bangladesh and 0.299 in Rwanda, demonstrating that VIIRS NTL and RD are strongly correlated. Following the discovery of a statistically significant correlation, we used the data for further exploratory analysis.
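The two statistics named in this abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code: it computes the Pearson chi-square statistic for a contingency table and then Cramér's V as sqrt(chi2 / (n * (min(rows, cols) - 1))). The function names and the example counts are hypothetical, and the table is assumed to have no empty row or column.

```python
from math import sqrt


def chi_square(table):
    """Return (chi-square statistic, total count) for a contingency table.

    `table` is a list of rows of observed counts. Assumes every row and
    column total is non-zero, so every expected count is positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n


def cramers_v(table):
    """Cramér's V: association strength in [0, 1] for a contingency table."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return sqrt(stat / (n * (k - 1)))


# Hypothetical counts: rows = low/high NTL class, columns = low/high road class.
example = [[10, 20], [20, 10]]
print(round(cramers_v(example), 3))
```

A value of 0 means the two categorical variables are independent in the sample, and 1 means each category of one variable fully determines the other.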
2019 International Conference on Bangla Speech and Language Processing (ICBSLP), 2019
Style transfer is an emerging trend in applications of deep learning. It has proven very useful for image and audio data, where the results are sometimes astonishing. Gradually, the style of textual data is also being changed in many novel works. This paper focuses on transferring the sentiment of a sentence. Given a positive clause, the negative version of that clause or sentence is generated while keeping the context the same. The opposite is also done with negative sentences. Previously this was a very tough job because the go-to techniques for such tasks, such as Recurrent Neural Networks (RNNs) [1] and Long Short-Term Memories (LSTMs) [2], cannot perform well on it. But as newer technologies like Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs) emerge, this work seems to become more and more feasible and effective. In this paper, a Multi-Generative Variational Auto-Encoder is employed to transfer sentiment values. In spite of working wit...
2019 International Conference on Bangla Speech and Language Processing (ICBSLP), 2019
Automatic Parts of Speech (POS) tagging is one of the most fundamental tasks for a language in Natural Language Processing (NLP), and its output acts as a feature for solving advanced NLP tasks. Named Entity Recognition (NER) is another essential NLP task for information retrieval. Researchers have not yet found satisfactory solutions to these two tasks for the Bangla language compared to other languages such as English and German. Moreover, many solutions depend heavily on handcrafted features that require strong linguistic expertise. As these two sequence labeling tasks are similar, in this work two different datasets for POS tagging and NER were prepared, and different deep neural network approaches were studied for solving the two tasks separately. All of the approaches were end to end and did not need any handcrafted features such as word suffixes or affixes, gazetteers, or dictionaries. This study came up with an end to end solution using a deep neural network-based model consisting of Bi-directio...
The Harrow-Hassidim-Lloyd algorithm is intended for solving systems of linear equations on quantum devices. The exponential advantage of the algorithm comes with four caveats. We present a numerical study of the performance of the algorithm when these caveats are not perfectly met. We observe that the algorithm achieves a higher success probability on diagonal matrices than on non-diagonal matrices. At the same time, it fails to perform well on sparse Hermitian matrices of lower or higher density. In addition, the Quantum Support Vector Machine algorithm is a promising algorithm for classification problems. We found that it works better on binary classification problems than on multi-label classification problems, and many opportunities remain for improving its performance.
2019 International Conference on Bangla Speech and Language Processing (ICBSLP), 2019
The biggest challenge of Bengali language processing is creating a strong data set to do research on. The main focus of this paper is to introduce an authentic and credible data set for Bengali sentiment analysis, open for all to use for educational purposes, in which the data was extracted from user comments on a well-known online news portal. Comments on various news items were scraped, and five sentiment labels were used to capture the true sentiment of each sentence. An online crowdsourcing platform was used for data annotation. To ensure the credibility and validity of the data set, every entry was tagged three times. Three text classification models were used for baseline evaluation to check the validity of the data set. This data set may be of valuable help for future work and research on Bengali sentiment analysis.
2021 Asian Conference on Innovation in Technology (ASIANCON), 2021
News is newly received, noteworthy information about current events. Events are constantly happening around the world, and mass media helps bring them to the general public. As the modern world moves toward more convenient environments, Bengali mass media are also leaning towards digital platforms. In this article, some supervised machine learning approaches and deep learning approaches are proposed for classifying Bengali news documents. We used an open dataset that contains more than three hundred thousand (376,211) Bengali text documents. Removing stop-words, dropping duplicate data, tokenizing, stemming, etc. were done as preprocessing steps. Bag-of-Words with TF-IDF and several word embedding approaches (average Word2Vec, GloVe and fastText) were used for feature extraction. We trained on our text corpus using supervised machine learning methods and deep learning methods. Among these models, Support Vector Machine with average Word2Vec achieved 97% accuracy and Bidirectional LSTM achieved 96% accuracy.
International Journal of Computer Applications, 2020
Identifying and categorizing opinions in a sentence is a prominent branch of natural language processing. It uses text classification to determine the intention of the author of the text. The intention can be to express happiness, sadness, patriotism, disgust, advice, etc. Most research on opinion or sentiment analysis is in the English language, but the Bengali corpus is growing day by day. A large number of online news portals publish their articles in Bengali, and a few of them have a comment section that allows people to express their opinions. Here, research has been done on Bengali sports news comments published in different newspapers to train a deep learning model that can categorize a comment according to its sentiment. Comments are collected and separated based on their inherent sentiment. The deep learning algorithms used are Convolutional Neural Network (CNN), Multilayer Perceptron, and Long Short-Term Memory (LSTM).
2017 20th International Conference of Computer and Information Technology (ICCIT), 2017
With the expansion of the population, the growing number of students is driving an increase in the number of educational institutions in Bangladesh. Every year, many students who have passed their SSC examination face a quandary in choosing a suitable college. Is it possible to ease their difficulty by helping them choose a suitable college? Is there any prospect of building a recommender system that will suggest a list of suitable colleges to a student? What are the factors that can be used to measure the effectiveness of a college for a student? In this study, we worked to build such a recommender system. We identified some factors that are effective for constructing a sorted list of colleges according to their suitability for a particular student. To build the recommender system, user-based and item-based collaborative filtering methods were used. Students are categorized according to their performance in the SSC examination, group, gender, spatial information, etc.
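As a rough illustration of the user-based collaborative filtering mentioned in this abstract, the sketch below ranks colleges a student has not yet rated by the similarity-weighted average of other students' ratings. It is an assumption-laden toy, not the authors' system: the rating dictionary, the student and college identifiers, and the function names are all hypothetical, and cosine similarity is one common choice among several.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse rating dicts (item -> rating)."""
    dot = sum(u[i] * v[i] for i in set(u) & set(v))
    norm_u = sqrt(sum(r * r for r in u.values()))
    norm_v = sqrt(sum(r * r for r in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, ratings):
    """Rank items the target has not rated, best first.

    Each score is the similarity-weighted mean of the ratings given by
    the other users who rated that item.
    """
    weighted, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], other_ratings)
        if sim <= 0:
            continue  # ignore users with no overlap with the target
        for item, rating in other_ratings.items():
            if item in ratings[target]:
                continue
            weighted[item] = weighted.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    scores = {item: weighted[item] / weights[item] for item in weighted}
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


# Hypothetical suitability ratings (1-5) of colleges A-D by three students.
ratings = {
    "s1": {"A": 5, "B": 1},
    "s2": {"A": 5, "B": 1, "C": 5, "D": 1},
    "s3": {"A": 1, "B": 5, "C": 1, "D": 5},
}
print(recommend("s1", ratings))
```

Here student s1 agrees closely with s2, so college C, which s2 rated highly, ranks above college D. An item-based variant would instead compare colleges by the students who rated them.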