Computer Vision-Based Bengali Sign Language To Text Generation
Related papers
Video Captioning Based on Sign Language Using YOLOV8 Model
Springer, Cham, 2023
One of the fastest-growing research areas is the recognition of sign language, and many novel techniques have recently been developed in this field. Sign language is the primary means of communication for people who cannot hear or speak, so real-time recognition is essential for them. Hand gestures are one of the non-verbal communication methods used in sign language, and others must understand this language because it is these users' only means of communication. In this work, we propose designing and implementing a model that provides transcripts of the sign language used by disabled individuals during a live meeting or video conference. The dataset used in this study was downloaded from the Roboflow website and used for training and testing. Transfer learning is a key idea here, since a pre-trained model is used to identify the hand signals. The YOLOv8 model, created by Ultralytics, is employed for this purpose and instantly translates the letters of the alphabet (A-Z) into their corresponding text. In our method, the 26 ASL signs are recognized by first extracting the essential components of each sign from the real-time input video, which are then fed into the YOLOv8 deep learning model to identify the sign. The output is matched against the signs contained in the neural network and classified into the appropriate sign by comparing the extracted features with the original signs present in the database.
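A minimal sketch of the kind of real-time detection loop this abstract describes, using the Ultralytics YOLOv8 API on webcam frames. The weights file name "asl_yolov8.pt" is a hypothetical placeholder for a model fine-tuned on the Roboflow alphabet dataset; the paper's exact training setup is not reproduced here.

```python
# Sketch: real-time A-Z sign detection with a fine-tuned YOLOv8 model (assumed weights).
import cv2
from ultralytics import YOLO

model = YOLO("asl_yolov8.pt")          # hypothetical transfer-learned detector
cap = cv2.VideoCapture(0)              # webcam / meeting feed

while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    results = model(frame, verbose=False)[0]   # detect hand signs in the frame
    for box in results.boxes:
        letter = results.names[int(box.cls)]   # class index -> letter text
        x1, y1, x2, y2 = map(int, box.xyxy[0])
        cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        cv2.putText(frame, letter, (x1, y1 - 8),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.8, (0, 255, 0), 2)
    cv2.imshow("ASL transcript", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```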
Text to Sign Language Conversion by Using Python and Database of Images and Videos
The aim of this system is to design an independent communication system for people who are deaf and hard of hearing. The system converts text to sign language. It is a vision-based system: it takes alphabets and numerals as input, converts them into the equivalent sign code, and displays the result on a screen. This system uses Indian Sign Language; sign language is not the same in all parts of the world. Sign language is the language of deaf and mute people, through which they are able to express their thoughts; using it, they can transmit messages by combining hand shapes and different movements of the hands. Sign languages have their own alphabets and grammar. A system that converts text to sign code therefore helps communication between hearing people and people who are hard of hearing.
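A minimal sketch of the text-to-sign lookup this abstract describes, assuming a local folder "isl_signs/" that holds one image per alphabet or numeral (e.g. "A.png", "7.png"). The folder layout and file names are illustrative assumptions, not taken from the paper.

```python
# Sketch: map each letter/digit of the input text to a stored sign image and display it.
import os
import cv2

SIGN_DIR = "isl_signs"   # assumed image database, one file per character

def text_to_signs(text):
    """Yield (character, sign image) pairs for every letter/digit in the text."""
    for ch in text.upper():
        if not ch.isalnum():
            continue                      # skip characters with no stored sign image
        img = cv2.imread(os.path.join(SIGN_DIR, f"{ch}.png"))
        if img is not None:
            yield ch, img

for ch, img in text_to_signs("Hello 123"):
    cv2.imshow(f"ISL sign for {ch}", img)
    cv2.waitKey(800)                      # show each sign briefly, like a slideshow
cv2.destroyAllWindows()
```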
A Translator for Indian Sign Language to Text and Speech
International Journal for Research in Applied Science and Engineering Technology IJRASET, 2020
Verbal communication is the way people have interacted with each other over the years, but the case is different for the disabled. The barrier created between the impaired and other people is one of the setbacks of society. For deaf and mute people, sign language is the only way to communicate. To help them communicate efficiently with hearing people, an effective solution has been devised. Our aim is to design a system that analyses and recognizes the alphabets from a database of sign images. To accomplish this, the application uses image-processing techniques such as segmentation and feature extraction. We use a machine learning technique, the Convolutional Neural Network, for detection of sign language. We convert each image by cropping the background and keeping only the gesture, then convert the gesture to black and white, saved in PNG format at a 55x60 resolution. This system will help to remove the barrier between deaf-mute and hearing people. It will also help standardize Indian Sign Language in India and improve the quality of teaching and learning in institutes for the deaf and mute. Just as Hindi is recognized as the standard language for conversation throughout India, ISL would be recognized as the standard sign language throughout India. The main aim of this work is serving humanity, which is achieved by providing better teaching and better learning.
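A minimal sketch of the preprocessing this abstract describes (crop to the gesture, binarize, resize to 55x60) followed by a small CNN classifier. The network layer sizes and the Otsu threshold are assumptions for illustration, not the paper's exact configuration.

```python
# Sketch: crop/binarize/resize a gesture image and define a small CNN over 26 alphabet classes.
import cv2
import numpy as np
from tensorflow import keras

def preprocess(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    _, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    x, y, w, h = cv2.boundingRect(cv2.findNonZero(bw))   # crop to the gesture region
    bw = cv2.resize(bw[y:y + h, x:x + w], (55, 60))       # width 55, height 60
    return bw.astype("float32")[..., np.newaxis] / 255.0

model = keras.Sequential([
    keras.layers.Input(shape=(60, 55, 1)),
    keras.layers.Conv2D(32, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Conv2D(64, 3, activation="relu"),
    keras.layers.MaxPooling2D(),
    keras.layers.Flatten(),
    keras.layers.Dense(26, activation="softmax"),          # one class per alphabet
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```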
Conversion of Sign Language Video to Text and Speech
International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022
Sign language recognition (SLR) is a significant and promising technique to facilitate communication for hearing-impaired people. Here, we are dedicated to finding an efficient solution to the gesture recognition problem. This work develops a sign language (SL) recognition framework with deep neural networks that directly transcribes videos of SL signs to words. We propose a novel approach that uses video sequences containing both temporal and spatial features, so we use two different models to learn them. To train on the spatial features of the video sequences we use a Convolutional Neural Network (CNN), trained on the frames obtained from the video sequences of the training data. We use a Recurrent Neural Network (RNN) to train on the temporal features. The trained CNN model makes predictions for individual frames, producing a sequence of predictions or pooling-layer outputs for each video; this sequence is then given to the RNN to learn the temporal features. Thus, we perform sign language translation: given an input video, the sign shown in the video is recognized with the CNN and RNN and converted to text and speech.
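A minimal sketch of the two-stage architecture described above: a CNN produces a feature vector per frame, and an LSTM models the temporal pattern across the frame sequence. The backbone choice, sequence length, and vocabulary size are illustrative assumptions, not the paper's settings.

```python
# Sketch: per-frame CNN features (MobileNetV2, assumed backbone) followed by an LSTM over the sequence.
from tensorflow import keras

NUM_FRAMES, NUM_WORDS = 30, 100      # assumed clip length and vocabulary size

# Spatial model: a pretrained CNN used as a frozen per-frame feature extractor.
cnn = keras.applications.MobileNetV2(include_top=False, pooling="avg",
                                     input_shape=(224, 224, 3))
cnn.trainable = False

# Temporal model: apply the CNN to every frame, then run an LSTM over the sequence.
inputs = keras.Input(shape=(NUM_FRAMES, 224, 224, 3))
features = keras.layers.TimeDistributed(cnn)(inputs)      # (frames, feature_dim)
x = keras.layers.LSTM(128)(features)
outputs = keras.layers.Dense(NUM_WORDS, activation="softmax")(x)

model = keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```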
Real Time Translation of Sign Language to Speech and Text
IARJSET, 2021
We create a desktop application that uses a computer webcam to capture a person signing gestures in Indian Sign Language (ISL) and translates them into corresponding text and speech in real time. The translated sign language gesture is produced as text, which is further converted into audio; in this manner we implement a fingerspelling language translator. To detect gestures, we use a Convolutional Neural Network (CNN). A CNN is highly efficient at computer-vision problems and can detect the desired features with a high degree of accuracy given sufficient training. This project converts hand gestures of sign language to voice or text using machine learning techniques, and vice versa. We capture a real-time translation of Indian Sign Language using single- and double-hand gestures, recognize the words, and convert them into text and then to speech. If the person gives speech as input, it is first converted to text and the suitable sign is then displayed as output, and vice versa.
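A minimal sketch of the recognition-to-speech step this abstract describes, assuming a trained Keras classifier saved as "isl_cnn.h5" and an A-Z label list; both the weights file and the label set are hypothetical placeholders, and the text-to-speech engine (pyttsx3) is one possible choice rather than the paper's stated tool.

```python
# Sketch: classify one webcam frame with an assumed CNN and speak the predicted label.
import cv2
import numpy as np
import pyttsx3
from tensorflow import keras

model = keras.models.load_model("isl_cnn.h5")              # hypothetical trained CNN
labels = [chr(c) for c in range(ord("A"), ord("Z") + 1)]   # illustrative label set
engine = pyttsx3.init()

cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    roi = cv2.resize(frame, (64, 64)).astype("float32")[np.newaxis] / 255.0
    pred = labels[int(np.argmax(model.predict(roi, verbose=0)))]
    print("Recognized:", pred)
    engine.say(pred)          # speak the recognized letter/word
    engine.runAndWait()
cap.release()
```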
KU-BdSL: An open dataset for Bengali sign language recognition
Data in Brief, 2023
Sign language is a communication medium for people with speech and hearing disabilities. It has various forms with different, troublesome patterns that are difficult for the general public to comprehend. Bengali Sign Language (BdSL) is one of the more difficult sign languages due to its large number of letters, words, and expression techniques. Machine translation can ease the difficulty disabled people face in communicating with others. Within the machine learning (ML) domain, computer vision can be the solution for them, and every ML solution requires an optimized model and a proper dataset. Therefore, in this research work, we have created a BdSL dataset named 'KU-BdSL', which consists of 30 classes describing 38 consonants ('banjonborno') of the Bengali alphabet. The dataset includes 1500 images of hand signs in total, each representing Bengali consonant(s). Thirty-nine participants (30 males and 9 females) of different ages (21-38 years) participated in the creation of this dataset. We used smartphones to capture the images because of the availability of their high-definition cameras. We believe that this dataset can be beneficial to the deaf and dumb (D&D) community: identification of Bengali consonants of BdSL from images or videos is feasible using the dataset, and it can also be employed in a human-machine interface for disabled people. In the future, we will work on the vowels and word level of BdSL.
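A minimal sketch of loading an image dataset like KU-BdSL for training, assuming the 1500 images are arranged one folder per class ("KU-BdSL/<class_name>/*.jpg"); the published dataset's actual on-disk layout, image size, and split may differ.

```python
# Sketch: build train/validation datasets from a class-per-folder layout (assumed structure).
from tensorflow import keras

train_ds = keras.utils.image_dataset_from_directory(
    "KU-BdSL", validation_split=0.2, subset="training", seed=42,
    image_size=(128, 128), batch_size=32)
val_ds = keras.utils.image_dataset_from_directory(
    "KU-BdSL", validation_split=0.2, subset="validation", seed=42,
    image_size=(128, 128), batch_size=32)

print("Classes:", train_ds.class_names)   # expected: 30 consonant-group classes
```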
Bangla Sign Language Recognition and Sentence Building Using Deep Learning
2020 IEEE Asia-Pacific Conference on Computer Science and Data Engineering (CSDE), 2020
The modern age, being the era of information technology, would not have come this far without accumulated data and information. Communication is the basis of collecting or gathering data and information, yet almost 5% of the world's population is not blessed with the ability of verbal communication [1]. For deaf and mute people who lack the ability of verbal communication, sign language is the solution. Sign language differs from verbal language in every form and rule, which creates a gap between people conversing in verbal language and those communicating in sign language. Verbal languages are easy to interpret because they follow common rules, but sign language differs from region to region, and this hampers communication between hearing people and those interacting in sign languages. Human-to-human interpretation is difficult because of the rich vocabulary of word-level signs. To address this issue, we propose a machine-based approach for training and detecting Bangla Sign Language. Our aim is to train the system with enough samples of the different signs used in Bangla Sign Language. In this research, we use a Convolutional Neural Network (CNN) to train on each individual sign. In addition to working as a medium of communication between the deaf and mute and the rest of society, this approach would also serve as a tool for the hearing-deprived to learn and use sign language properly. Moreover, it would assist anyone willing to learn or develop sign language or who wishes to work with those who need to use sign language.
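A minimal sketch of the sentence-building step on top of a per-frame sign classifier: a prediction is committed only after it has been stable for several consecutive frames, which avoids duplicating a sign that stays in front of the camera. The stability window and the simulated prediction stream are assumptions; the classifier itself is not shown.

```python
# Sketch: debounce per-frame predictions into a sentence (hypothetical helper).
def build_sentence(frame_predictions, min_stable_frames=15):
    sentence, current, run = [], None, 0
    for pred in frame_predictions:
        if pred == current:
            run += 1
        else:
            current, run = pred, 1
        if run == min_stable_frames:          # sign held long enough -> commit it once
            sentence.append(pred)
    return " ".join(sentence)

# Example with a simulated prediction stream (each sign held ~20 frames):
stream = ["AMI"] * 20 + ["BHAT"] * 22 + ["KHAI"] * 18
print(build_sentence(stream))                 # -> "AMI BHAT KHAI"
```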
Real-time Telugu Sign Language Translator with Computer Vision
International Journal for Research in Applied Science and Engineering Technology
Sign language is the basic communication method among hearing-disabled and speech-disabled people. To express themselves, they require an interpreter or motion-sensing devices that convert sign language into one of the standard languages. However, there is no such system for those who speak Telugu, so they are forced to use the national language rather than the regional language of their culture, with the same issues of cumbersome hardware or the need for an interpreter. This paper proposes a system that detects hand gestures and signs from a real-time video stream, processed with the help of computer vision and classified with the YOLOv3 object detection algorithm. Additionally, the labels are mapped to the corresponding Telugu text. The style of learning is transfer learning, unlike conventional CNNs, RNNs, or traditional machine learning models: it involves applying a pre-trained model to a completely new problem and adapting it to the new problem's requirements efficiently. This requires less training effort in terms of dataset size while providing greater accuracy. It is the first system developed as a sign language translator for Telugu script, and it has given the best results compared to existing systems. The system is trained on 52 Telugu letters, 10 numbers, and 8 frequently used Telugu words.
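A minimal sketch of the label-mapping step this abstract mentions: detector class indices are converted to Telugu Unicode strings. The mapping covers only a few entries and is purely illustrative; the paper's full 70-class label set is not reproduced.

```python
# Sketch: map YOLO-style class IDs to Telugu text (illustrative partial mapping).
CLASS_TO_TELUGU = {
    0: "అ",        # letter "a"
    1: "ఆ",        # letter "aa"
    2: "ఇ",        # letter "i"
    3: "నమస్తే",   # frequently used word "namaste"
}

def detections_to_text(class_ids):
    """Convert a sequence of detected class IDs into Telugu text."""
    return " ".join(CLASS_TO_TELUGU.get(cid, "?") for cid in class_ids)

print(detections_to_text([3, 0, 2]))   # -> "నమస్తే అ ఇ"
```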
An Open CV Framework of Automated Sign language to Text translator for Speech and Hearing Impaired
Hearing-impaired people generally use sign language for communication, but they find it difficult to communicate with others who don't understand sign language. This project aims to lower this barrier in communication. It is based on the need for an electronic device that can translate sign language into text so that communication between the mute community and the general public becomes possible. Computer recognition of sign language is an important research problem for enabling communication with hearing-impaired people. This project introduces an efficient and fast algorithm for identifying the number of fingers opened in a gesture representing text of the Binary Sign Language. The system does not require the hand to be perfectly aligned to the camera, nor any specific background for the camera. The project uses an image-processing system to identify, in particular, the English alphabetic sign language used by deaf people to communicate. The basic objective of this project is to develop a computer-based intelligent system that will enable the hearing impaired to communicate with others using their natural hand gestures. The idea consists of designing and building an intelligent system using image processing, machine learning, and artificial intelligence concepts to take visual input of sign-language hand gestures and generate an easily recognizable form of output. Hence the objective of this project is to develop an intelligent system that can act as a translator between sign language and spoken language dynamically and can make communication between people with hearing impairment and hearing people both effective and efficient. The system we are implementing is for Binary Sign Language, but it can detect any sign language with prior image processing.
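A minimal sketch of a classic OpenCV finger-counting step of the kind this abstract describes: threshold the hand, take the largest contour, and count convexity defects whose angle is narrow enough to be a gap between fingers. The thresholds and angle cutoff are assumptions, not the paper's exact algorithm.

```python
# Sketch: count opened fingers from a hand image via contour convexity defects (assumed thresholds).
import cv2
import numpy as np

def count_fingers(bgr_image):
    gray = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0
    hand = max(contours, key=cv2.contourArea)                 # largest blob = hand
    hull = cv2.convexHull(hand, returnPoints=False)
    defects = cv2.convexityDefects(hand, hull)
    if defects is None:
        return 1                                              # no gaps: fist or one finger
    gaps = 0
    for s, e, f, _ in defects[:, 0]:
        a = np.linalg.norm(hand[e][0] - hand[s][0])
        b = np.linalg.norm(hand[f][0] - hand[s][0])
        c = np.linalg.norm(hand[e][0] - hand[f][0])
        angle = np.arccos((b**2 + c**2 - a**2) / (2 * b * c + 1e-6))
        if angle < np.pi / 2:                                 # narrow valley => finger gap
            gaps += 1
    return gaps + 1
```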
A Real Time Malaysian Sign Language Detection Algorithm Based on YOLOv3
International Journal of Recent Technology and Engineering, 2019
Sign language is a language that involves movement of hand gestures. It is a medium for hearing-impaired people (deaf or mute) to communicate with others. However, in order to communicate with a hearing-impaired person, the communicator has to have knowledge of sign language; this is to ensure that the message delivered by the hearing-impaired person is understood. This project proposes real-time Malaysian sign language detection based on the Convolutional Neural Network (CNN) technique using the You Only Look Once version 3 (YOLOv3) algorithm. Sign language images from web sources and frames from recorded sign language videos were collected, and the images were labelled as either alphabets or movements. Once the preprocessing phase was completed, the system was trained and tested on the Darknet framework. The system achieved 63 percent accuracy, with learning saturation (overfitting) at 7000 iterations. Once it is successfully conducted, this model will be integrated with other...
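A minimal sketch of the data-collection step mentioned above: sampling frames from recorded sign-language videos so they can later be labelled and trained on the Darknet framework. The file paths and sampling rate are assumptions for illustration.

```python
# Sketch: extract every n-th frame from a recorded sign video for later labelling (assumed paths).
import os
import cv2

def extract_frames(video_path, out_dir, every_nth=10):
    """Save every n-th frame of a video as a JPEG for labelling."""
    os.makedirs(out_dir, exist_ok=True)
    cap = cv2.VideoCapture(video_path)
    index = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % every_nth == 0:
            cv2.imwrite(os.path.join(out_dir, f"frame_{index:05d}.jpg"), frame)
            saved += 1
        index += 1
    cap.release()
    return saved

print(extract_frames("msl_signs.mp4", "dataset/images"), "frames extracted")
```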