Bachchu Paul - Academia.edu

Papers by Bachchu Paul

Deep Learning Based Hyperspectral Image Classification: A Review For Future Enhancement

International Journal of Computing and Digital Systems, Jul 1, 2024

Hyperspectral imaging (HSI) has become prevalent in many sectors because it captures detailed spectral information (i.e., relationships between the collected spectral data and the objects in the scene) that cannot be obtained through ordinary imaging. Traditional RGB image classification approaches are insufficient for hyperspectral image classification (HSIC) because they struggle to capture the subtle spectral information within hyperspectral data. In the past few years, deep learning (DL) models have become powerful and efficient non-linear feature extractors for a wide range of computer vision tasks. Furthermore, DL-based models do not require manual feature extraction. This motivated researchers to apply DL-based models to the classification of hyperspectral images, with impressive results. Deeper networks, however, may encounter vanishing gradient problems, which make optimization more difficult; regularisation and architectural improvements are being introduced to address this issue. Another key issue is that DL-based HSIC models require a large number of training samples, which is a significant concern given the scarcity of public HSI datasets. This article provides an overview of deep learning for hyperspectral image classification and assesses the most recent methods. Among all the methods studied, SpectralNET offers significantly better performance, owing to its use of the wavelet transform.

ATCBBC: A Novel Optimizer for Neural Network Architectures

Signals and Communication Technology, Oct 26, 2023

RAttSR: A Novel Low-Cost Reconstructed Attention-Based End-to-End Speech Recognizer

Circuits, Systems, and Signal Processing, Dec 23, 2023

Taxonomy of Music Genre Using Machine Intelligence from Feature Melting Technique

Identification of Mental State Through Speech Using a Deep Learning Approach

Machine learning approach of speech emotions recognition using feature fusion technique

Multimedia Tools and Applications

Ways of seeing, ways of making seen: Visual representations in urban landscapes

The articles in this volume have emerged in an interdisciplinary space in-between sociology, sociolinguistics, visual anthropology, political science and cultural studies. The contributors took part in one of the panels within the 2nd International Symposium of the International Association of Discourse and Society Studies, jointly organised by the Centre for Social Studies and the School of Humanities of the University of Coimbra in June 2015 (EDiSo, June 18-20). They all share a common interest in viewing urban landscapes as public spaces that are socially produced. Being situated in power relations, as well as based on inequalities, discourse and identities coexist and compete for representation. In this sense, each and every instance of visual communication – be it a piece of graffiti on a wall, a banner in an urban protest or political campaign, or even a shop or a street sign – performs ideologies as people and collectives who produce them make choices

A hybrid feature-extracted deep CNN with reduced parameters substitutes an End-to-End CNN for the recognition of spoken Bengali digits

Multimedia Tools and Applications

Isolated Bangla Spoken Digit and Word Recognition Using MFCC and DTW

Studies in Computational Intelligence, Oct 4, 2022

MFCC-Based Bangla Vowel Phoneme Recognition from Micro Clips

Intelligent Computing and Communication, 2020

Speech recognition is highly developed and many solutions are available, most of them for English and a few other non-Indian languages. People who are not proficient in those languages have difficulty using them. Bangla is the sixth most spoken language in the world [1], yet speech-based solutions are not fully available for it owing to the complex nature of the language. A Bangla speech recognizer is therefore important. Every language has atomic sounds called phonemes. Vowel phonemes are among the most important aspects of a language, as most words contain them. In this paper, we categorize Bangla vowel phonemes using MFCC features and kNN-based classification. The phonemes were split into micro clips of 30 ms before categorization to reflect real-world scenarios, which often involve incomplete or extremely short-duration data. We experimented with several classifiers on our dataset of 92,649 short-duration clips and obtained a highest accuracy of 98.87%. We compared this accuracy with that of standard MFCC features on our dataset and found a better result.
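The two steps named in the abstract above, cutting audio into 30 ms micro clips and classifying with k-nearest neighbours, can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the function names, the non-overlapping split, the Euclidean distance, and k=3 are all assumptions, and the plain feature vectors stand in for MFCCs, which are not computed here.

```python
import math
from collections import Counter


def split_into_micro_clips(samples, sample_rate, clip_ms=30):
    """Cut a signal into consecutive, non-overlapping clips of clip_ms milliseconds.

    Assumption: clips do not overlap, and a trailing remainder shorter
    than one full clip is dropped.
    """
    clip_len = int(sample_rate * clip_ms / 1000)
    if clip_len <= 0:
        raise ValueError("clip length must be at least one sample")
    n_clips = len(samples) // clip_len
    return [samples[i * clip_len:(i + 1) * clip_len] for i in range(n_clips)]


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors.

    Assumption: Euclidean distance; the vectors stand in for MFCC features.
    """
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

At a 16 kHz sampling rate, a 30 ms clip is 480 samples, so a 1000-sample signal yields two full clips and the remaining 40 samples are discarded.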

A Novel Approach of Audio-Visual Color Recognition Using KNN

Computational Intelligence in Pattern Recognition, 2021

Speech is an attractive area of machine learning research, and researchers have had considerable success with Automatic Speech Recognition (ASR) systems. ASR is gradually making its way into areas ranging from space exploration to home automation, from education to commerce, and into various public sectors of daily life, making it more manageable and comfortable. In this work, we aimed to build an isolated Bengali word recognition system for color names pronounced in Bengali dialects, which provides an audio-visual presentation of the recognized color. LPC is used to extract speech features based on pitch and fundamental frequency, and a KNN classifier is used for recognition. The proposed system achieved 94% testing accuracy on a dataset of 1500 audio samples in 15 classes, where each class represents a specific color pronounced in a Bengali dialect.

A novel pre-processing technique of amplitude interpolation for enhancing the classification accuracy of Bengali phonemes

Multimedia Tools and Applications

Bangla Spoken Numerals Recognition by Using HMM

Computational Intelligence in Pattern Recognition, 2021

Speech is one of the most natural forms of vocal communication. With the advancement of machine learning, many new approaches have opened up for real-world applications. ASR is a door to communication through speech between humans and digital devices that can recognize speech. In this paper, we design a Hidden Markov Model-based isolated Bangla numeral recognition system in which the Short-Time Fourier Transform is used to compute the feature vectors. The system achieved 91.50% accuracy on our own dataset of 2000 uttered samples in 10 classes, a satisfactory result for Bangla numeral recognition.
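The abstract above names the short-time Fourier transform as the source of its feature vectors. A minimal sketch of that step might look like the following. The frame length, hop size, Hann window, and function name are assumptions chosen for illustration and are not taken from the paper. The naive DFT is used only to keep the example dependency-free, and the HMM stage is not shown.

```python
import cmath
import math


def stft_magnitudes(samples, frame_len=8, hop=4):
    """Return one magnitude spectrum per Hann-windowed frame.

    Assumptions: periodic Hann window, frames that do not fit entirely
    inside the signal are skipped, and only bins 0..frame_len/2 are kept.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    spectra = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(frame_len // 2 + 1):
            # Naive DFT of the windowed frame at frequency bin k.
            coeff = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            spectrum.append(abs(coeff))
        spectra.append(spectrum)
    return spectra
```

Each returned spectrum is one feature vector per frame. In a recognizer of the kind described, a sequence of such vectors would be the observation sequence passed to the HMM.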

Indian Regional Spoken Language Identification Using Deep Learning Approach

Advances in Intelligent Systems and Computing

Voice-Based Railway Station Identification Using LSTM Approach

Advances in Intelligent Systems and Computing

Deep Learning-Based Music Chord Family Identification

Intelligent Computing and Communication

Research in audio signal processing has developed considerably, and music signal processing is no exception. Musicians around the globe have benefited tremendously from technological advancements, which have taken the music industry to the next level. Music composers and DJs are always interested in the background music (BGM) of a song, which is critical in setting the mood. BGM is also very important for automatic music transcription and for track composition for stage performers. Chords, which are formed from two or more musical notes, are one of the fundamental entities of BGM. Identifying chords is thus a very important task, and it becomes challenging when the audio clips are short or not of studio quality. In this paper, we present a system that can aid in distinguishing chords by their type/family. At the outset, we experimented with two of the most fundamental types of chords, major and minor, and obtained a highest accuracy of 99.28% on more than 6000 very short clips of one-second duration with a deep learning-based approach.

Bengali Spoken Numerals Recognition by MFCC and GMM Technique

Lecture Notes in Electrical Engineering

A Comparative Study on Sentiment Analysis Influencing Word Embedding Using SVM and KNN

Cyber Intelligence and Information Retrieval
