Bachchu Paul - Academia.edu
Papers by Bachchu Paul
International Journal of Computing and Digital Systems, Jul 1, 2024
The use of hyperspectral images (HSI) has become prevalent in many sectors because they capture detailed spectral information (i.e., relationships between the collected spectral data and the objects in the HSI data) that cannot be obtained through ordinary imaging. Traditional RGB image classification approaches are insufficient for hyperspectral image classification (HSIC) because they struggle to capture the subtle spectral information within hyperspectral data. In the past few years, deep learning (DL) based models have become powerful and efficient non-linear feature extractors for a wide range of computer vision tasks. Furthermore, DL-based models do not require manual feature extraction. This motivated researchers to apply DL-based models to the classification of hyperspectral images, with impressive results. Deeper networks, however, can encounter vanishing gradient problems, which make optimization more difficult; regularisation and architectural improvements are being used to address this issue. Another key issue is that DL-based HSIC models require a large number of training samples, which is an important concern for hyperspectral data given the scarcity of public HSI datasets. This article provides an overview of deep learning for hyperspectral image classification and assesses the most recent methods. Among all the methods studied, SpectralNET offers significantly better performance, owing to its use of the wavelet transform.
Signals and communication technology, Oct 26, 2023
Circuits, Systems, and Signal Processing, Dec 23, 2023
Multimedia Tools and Applications
The articles in this volume have emerged in an interdisciplinary space in-between sociology, sociolinguistics, visual anthropology, political science and cultural studies. The contributors took part in one of the panels within the 2nd International Symposium of the International Association of Discourse and Society Studies, jointly organised by the Centre for Social Studies and the School of Humanities of the University of Coimbra in June 2015 (EDiSo, June 18-20). They all share a common interest in viewing urban landscapes as public spaces that are socially produced. Being situated in power relations, as well as based on inequalities, discourse and identities coexist and compete for representation. In this sense, each and every instance of visual communication – be it a piece of graffiti on a wall, a banner in an urban protest or political campaign, or even a shop or a street sign – performs ideologies as people and collectives who produce them make choices
Multimedia Tools and Applications
Studies in computational intelligence, Oct 4, 2022
Intelligent Computing and Communication, 2020
Speech recognition is highly developed and many solutions are available, but most of them support English and a few other non-Indian languages. People who are not proficient in those languages have difficulty using them. Bangla is the sixth most spoken language in the world [1], yet speech-based solutions are not fully available for it because of the complex nature of the language. A Bangla speech recognizer is therefore important. Every language has atomic sounds called phonemes. Vowel phonemes are among the most important aspects of a language, as most words contain them. In this paper, we categorize Bangla vowel phonemes using MFCC features and kNN-based classification. The phonemes were split into micro clips of 30 ms before categorization to reflect real-world scenarios, which often involve incomplete or extremely short-duration data. We experimented with several classifiers on our dataset of 92,649 short-duration clips and obtained a highest accuracy of 98.87%. We also compared the accuracy against standard MFCC features on our dataset and found a better result.
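The two steps named in this abstract, cutting audio into 30 ms micro clips and classifying feature vectors with k-nearest neighbours, can be sketched roughly as follows. This is only an illustrative sketch, not the authors' pipeline: the function names `split_into_clips` and `knn_predict` are hypothetical, the MFCC extraction step is omitted, and the toy 2-D vectors in the usage note merely stand in for real MFCC features.

```python
import math
from collections import Counter


def split_into_clips(samples, sample_rate, clip_ms=30):
    """Split a 1-D sample sequence into non-overlapping clips of clip_ms milliseconds."""
    clip_len = int(sample_rate * clip_ms / 1000)
    # A trailing remainder shorter than one full clip is dropped.
    return [samples[i:i + clip_len]
            for i in range(0, len(samples) - clip_len + 1, clip_len)]


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # Euclidean distance from the query to every training vector.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

For instance, `split_into_clips(list(range(16000)), 16000)` cuts one second of 16 kHz audio into clips of 480 samples each, and `knn_predict([[0, 0], [0, 1], [5, 5], [6, 5]], ["a", "a", "o", "o"], [0.2, 0.1])` votes among the three nearest toy vectors. In practice each clip would first be converted to a feature vector before the kNN step.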
Computational Intelligence in Pattern Recognition, 2021
Speech is an attractive research area in machine learning, and researchers have had considerable success with Automatic Speech Recognition (ASR) systems. ASR is gradually making its way into fields ranging from space exploration to home automation and from education to the commercial and public sectors, making daily life more manageable and comfortable. In this work, we aimed to build an isolated Bengali word recognition system for color names pronounced in Bengali dialects, which provides an audio-visual presentation of the recognized color. LPC is used to extract speech features based on pitch and fundamental frequency, and a KNN classifier is used for recognition. The proposed system achieved 94% testing accuracy on a dataset of 1500 audio samples in 15 classes, where each class represents a specific color pronounced in a Bengali dialect.
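As a rough illustration of what LPC feature extraction involves, the sketch below estimates linear prediction coefficients with the autocorrelation method and the Levinson-Durbin recursion. This is a textbook formulation chosen for the example, not necessarily the variant used in the paper; the function names `autocorrelation` and `lpc_coefficients` are hypothetical, and framing, windowing, and pre-emphasis are left out.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation values r[0..max_lag] of a 1-D sequence."""
    return [
        sum(signal[n] * signal[n - lag] for n in range(lag, len(signal)))
        for lag in range(max_lag + 1)
    ]


def lpc_coefficients(signal, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err) where a[0] == 1.0 and the predictor is
    x[n] ~ -(a[1] * x[n-1] + ... + a[order] * x[n-order]);
    err is the final prediction error energy.
    """
    r = autocorrelation(signal, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # silent or perfectly predicted frame: nothing left to model
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err
```

Applied to a decaying sequence such as `[0.9 ** n for n in range(200)]` with `order=1`, the recursion yields a first coefficient close to -0.9, i.e. it recovers the one-step predictor x[n] ≈ 0.9 · x[n-1]. The resulting coefficient vector per frame is the kind of feature a KNN classifier could then consume.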
Multimedia Tools and Applications
Computational Intelligence in Pattern Recognition, 2021
Speech is one of the most natural forms of communication. With the advancement of machine learning, many new avenues have opened for real-world applications. ASR is one such avenue: it enables speech-based communication between humans and digital devices that can recognize speech. In this paper, we design a Hidden Markov Model-based isolated Bangla numeral recognition system in which the Short-Time Fourier Transform is used to extract the feature vectors. The system achieved 91.50% accuracy on our own dataset of 2000 uttered samples in 10 classes, a satisfactory result for Bangla numeral recognition.
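The Fourier-transform feature step mentioned here can be sketched as follows: the signal is cut into overlapping frames, each frame is windowed, and the magnitude spectrum of each frame becomes one feature vector. This is a minimal sketch under stated assumptions, not the paper's implementation: the name `stft_magnitudes`, the Hann window, and the frame and hop sizes are all choices made for the example, and a naive DFT is used instead of an FFT to keep the code dependency-free. The HMM stage is not shown.

```python
import cmath
import math


def stft_magnitudes(samples, frame_len=256, hop=128):
    """Magnitude spectra of Hann-windowed frames (naive DFT, for illustration only)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        # Keep only the non-negative frequency bins 0 .. frame_len // 2.
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames
```

Each returned row is one time frame and each column one frequency bin, so a sequence of rows forms the kind of observation sequence a sequence model can be trained on. For real audio, an FFT routine from a numerical library would replace the inner sum.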
Advances in Intelligent Systems and Computing
Advances in Intelligent Systems and Computing
Intelligent Computing and Communication
Research in audio signal processing has developed considerably, and music signal processing is no exception. Musicians all over the globe have benefited tremendously from technological advancements, which have taken the music industry to the next level. Music composers and DJs are always interested in the background music (BGM) of a song, which is critical in setting the mood. BGM is also very important for automatic music transcription and for track composition for stage performers. Chords, which consist of two or more musical notes, are one of the fundamental entities of BGM. Identifying chords is thus a very important task, and it becomes challenging when the audio clips are short or not of studio quality. In this paper, a system is presented that can aid in distinguishing chords by their type/family. We experimented at the outset with two of the most fundamental chord types, major and minor, and obtained a highest accuracy of 99.28% on more than 6000 very short clips of one-second duration with a deep learning-based approach.
Lecture Notes in Electrical Engineering
Cyber Intelligence and Information Retrieval