Towards Indian Sign Language Sentence Recognition using INSIGNVID: Indian Sign Language Video Dataset
Related papers
Towards Performance Improvement in Indian Sign Language Recognition
2020
Sign language is a complete natural language used by deaf and dumb people. It has its own grammar and differs from spoken language to a great extent. Since people without hearing and speech impairments generally do not know sign language, deaf and dumb people find it difficult to communicate with them. A system that could translate sign language into text would make sign language understandable without a human interpreter. This paper describes a systematic approach that takes Indian Sign Language (ISL) video as input and converts it into text using a frame sequence generator and image augmentation techniques. By incorporating these two concepts, we increase the dataset size and reduce overfitting. It is demonstrated that with simple image manipulation techniques and batches of shifted video frames, sign language recognition performance can be significantly improved. The approach described in this paper achieves 99.57% accuracy on the...
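A minimal sketch (not the paper's code) of the two ideas this abstract names: simple per-frame image augmentation and batches of shifted frame windows. The frame count, image size, window length, and stride are assumptions.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Hypothetical video: 60 frames of 64x64 RGB.
frames = np.random.rand(60, 64, 64, 3).astype("float32")

# (1) Standard image augmentation applied per frame.
augmenter = ImageDataGenerator(rotation_range=10,
                               width_shift_range=0.1,
                               height_shift_range=0.1,
                               zoom_range=0.1)

def shifted_windows(video, window=16, stride=4):
    """Yield overlapping frame windows; each shift acts as a new sample."""
    for start in range(0, len(video) - window + 1, stride):
        yield video[start:start + window]

# (2) Every shifted window is further augmented, multiplying the
# effective dataset size.
augmented = [np.stack([augmenter.random_transform(f) for f in w])
             for w in shifted_windows(frames)]
print(len(augmented), augmented[0].shape)  # e.g. 12 windows of (16, 64, 64, 3)
```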
Real time Indian sign language recognition using transfer learning with VGG16
TELKOMNIKA Telecommunication Computing Electronics and Control, 2024
Interaction and communication are easy for most people but very complicated for those with hearing and speech disabilities; hence, sign language plays a crucial role in bridging this communication gap. Previous attempts to solve this problem with deep learning techniques, including convolutional neural networks (CNNs), support vector machines (SVMs), and k-nearest neighbours (KNN), suffer from low accuracy or cannot be employed in real time. This work addresses both issues: improving upon prior limitations and extending the challenge of classifying characters in Indian Sign Language (ISL). Our system, which recognizes 23 ISL hand gestures through a purely camera-based approach, eliminates expensive hardware such as hand gloves, making it economical. The system yields an accuracy of 97.5% on the training dataset, utilizing a pre-trained VGG16 CNN optimized with the Adam optimizer and a cross-entropy loss function. These results clearly show how effective transfer learning is for classifying ISL, and its potential for real-world applications.
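A hedged sketch of the transfer-learning setup the abstract reports: a frozen, pre-trained VGG16 backbone with a new 23-class head, compiled with Adam and cross-entropy. The head layer sizes and input resolution are illustrative assumptions.

```python
from tensorflow.keras.applications import VGG16
from tensorflow.keras import layers, models

base = VGG16(weights="imagenet", include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # freeze the pre-trained convolutional backbone

model = models.Sequential([
    base,
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(23, activation="softmax"),  # 23 ISL hand gestures
])

# Adam optimizer and cross-entropy loss, as reported in the abstract.
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```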
Conversion of Sign Language Video to Text and Speech
International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022
Sign language recognition (SLR) is a significant and promising technique for facilitating communication with hearing-impaired people. Here, we are dedicated to finding an efficient solution to the gesture recognition problem. This work develops a sign language (SL) recognition framework with deep neural networks that directly transcribes videos of SL signs to words. We propose a novel approach that uses video sequences, which contain both temporal and spatial features, and train two different models for the two kinds of features. To learn the spatial features of the video sequences we use a convolutional neural network (CNN), trained on the frames extracted from the training videos. We use a recurrent neural network (RNN) to model the temporal features: the trained CNN makes predictions for individual frames, yielding a sequence of predictions or pooling-layer outputs for each video, and this sequence is given to the RNN to train on the temporal features. Thus, we perform sign language translation: given an input video, the sign shown in the video is recognized using the CNN and RNN and converted to text and speech.
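An illustrative sketch of the two-stage pipeline the abstract describes: a frozen CNN produces a pooled feature vector per frame, and an RNN is trained on the resulting sequence. The backbone choice (MobileNetV2), frame count, and class count are assumptions standing in for the paper's unspecified architecture.

```python
from tensorflow.keras import layers, models
from tensorflow.keras.applications import MobileNetV2

NUM_FRAMES, NUM_CLASSES = 30, 50

# Stage 1: per-frame feature extractor (globally pooled CNN output).
cnn = MobileNetV2(weights="imagenet", include_top=False,
                  pooling="avg", input_shape=(224, 224, 3))
cnn.trainable = False

# Stage 2: RNN over the sequence of per-frame feature vectors.
rnn = models.Sequential([
    layers.Input(shape=(NUM_FRAMES, cnn.output_shape[-1])),
    layers.LSTM(128),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
rnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])

# Usage: run cnn.predict on each video's frames to get (NUM_FRAMES, 1280)
# features, stack videos into (batch, NUM_FRAMES, 1280), then fit the RNN.
```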
A Translator for Indian Sign Language to Text and Speech
International Journal for Research in Applied Science and Engineering Technology (IJRASET), 2020
Verbal communication is the primary way people have interacted with each other over the years, but the case is different for the disabled. The barrier between impaired and non-impaired people is one of the setbacks of society. For deaf and mute people, sign language is the only way to communicate. To help the deaf and mute communicate efficiently with others, an effective solution has been devised. Our aim is to design a system which analyses and recognizes various alphabets from a database of sign images. To accomplish this, the application uses image processing techniques such as segmentation and feature extraction. We use a machine learning technique, the convolutional neural network, for detection of sign language. We convert each image by cropping away the background and keeping only the gesture, then convert the gesture to black & white at 55×60 resolution in PNG format. This system will help to eradicate the barrier between deaf-mute and other people, and will help standardize Indian Sign Language in India. It will also improve the quality of teaching and learning in deaf and mute institutes. Just as Hindi is recognized as the standard language for conversation throughout India, ISL will be recognized as the standard sign language throughout India. The main aim of this work, serving mankind, is achieved by providing better teaching and better learning.
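A minimal OpenCV sketch of the preprocessing the abstract outlines: isolate the gesture, convert to black & white, and save a 55×60 PNG. The contour-based crop is an assumption standing in for the paper's unspecified background removal, and the file names are placeholders.

```python
import cv2

img = cv2.imread("sign.jpg")                       # hypothetical input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Binarize (black & white) with Otsu's threshold.
_, bw = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

# Crop to the largest contour as a stand-in for background removal.
contours, _ = cv2.findContours(bw, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
if contours:
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    bw = bw[y:y + h, x:x + w]

resized = cv2.resize(bw, (55, 60))                 # width 55, height 60 (assumed order)
cv2.imwrite("sign_processed.png", resized)
```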
INDIAN SIGN LANGUAGE TRANSLATION FOR HARD-OF-HEARING AND HARD-OF-SPEAKING COMMUNITY
IRJET, 2022
Sign language is an integral part of human communication, as it allows people to communicate with the hard-of-speaking and hard-of-hearing community and understand them better. However, not everyone is capable of using sign language, which creates a barrier between the two groups; without an interpreter, communication is hard. With the help of deep learning and machine learning systems, we can eliminate such barriers. The purpose of our machine learning project is to create a web/phone-camera-based sign language recognition and translation system that converts sign language gestures to text and vice versa in real time. Such systems can be implemented in two ways: vision-based or glove-based. Capturing and translating signs from the real world is the core objective of this project. The convolutional neural network (CNN) algorithm is used to implement our project. An OpenCV video stream captures the real-time gestures through the web camera or phone camera; the preprocessed images are then fed to a Keras CNN model, and the output is text predicting the sign. Not only does each country have its own sign language, but there are also many regional sign languages. Due to the Covid-19 pandemic, the alternative to normal communication is video calling, FaceTime, etc. Hard-of-speaking and hard-of-hearing people are not able to use such facilities effectively, causing a hindrance in communication. Our paper aims to solve this problem and proposes a system for the translation of sign language using a webcam, mic, smart mobile phones, etc.
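A hedged sketch of the real-time loop the abstract describes: OpenCV captures webcam frames, each frame is preprocessed and passed to a Keras CNN, and the predicted sign is overlaid as text. The model file, label set, and 64×64 input size are placeholders.

```python
import cv2
import numpy as np
from tensorflow.keras.models import load_model

model = load_model("sign_cnn.h5")        # hypothetical trained CNN
labels = [chr(ord("A") + i) for i in range(26)]

cap = cv2.VideoCapture(0)                # webcam stream
while True:
    ok, frame = cap.read()
    if not ok:
        break
    roi = cv2.resize(frame, (64, 64)).astype("float32") / 255.0
    pred = model.predict(roi[np.newaxis], verbose=0)
    cv2.putText(frame, labels[int(pred.argmax())], (10, 40),
                cv2.FONT_HERSHEY_SIMPLEX, 1.2, (0, 255, 0), 2)
    cv2.imshow("Sign recognition", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break
cap.release()
cv2.destroyAllWindows()
```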
Video Captioning Based on Sign Language Using YOLOV8 Model
Springer, Cham, 2023
One of the fastest-growing research areas is the recognition of sign language, and many novel techniques have lately been created in this field. People who are deaf-dumb primarily communicate using sign language, and real-time sign language recognition is essential for those who cannot hear or speak. Hand gestures are one of the non-verbal communication methods used in sign language, and since it is these people's only means of communication, others need to be able to understand it. In this work, we suggest creating and implementing a model to provide transcripts of the sign language that disabled individuals use during a live meeting or video conference. The dataset utilized in this study is downloaded from the Roboflow website and used for training and testing. Transfer learning is a key idea here, since a trained model is utilized to identify the hand signals. The YOLOv8 model, created by Ultralytics, is employed for this purpose and instantly translates the letters of the alphabet (A-Z) into their corresponding texts. In our method, the 26 ASL signs are recognized by first extracting the essential components of each sign from the real-time input video, which are then fed into the YOLOv8 deep learning model to identify the sign. The output is matched against the signs the network has learned and classified into the appropriate sign, based on a comparison between the retrieved features and the original signs present in the database.
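A brief sketch using the Ultralytics YOLOv8 API the abstract names; the weights file "asl_yolov8.pt" is a hypothetical model fine-tuned on the 26 ASL letter classes, not the authors' released checkpoint.

```python
from ultralytics import YOLO

model = YOLO("asl_yolov8.pt")            # fine-tuned A-Z detector (assumed)

# Run inference on a live webcam stream; each detection's class name
# is the recognized letter.
for result in model.predict(source=0, stream=True):
    for box in result.boxes:
        letter = result.names[int(box.cls)]
        print(letter, float(box.conf))
```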
Real-time Telugu Sign Language Translator with Computer Vision
International Journal for Research in Applied Science and Engineering Technology
Sign language is the basic communication method among hearing-disabled and speech-disabled people. To express themselves, they require an interpreter or motion-sensing devices that convert sign language into one of the standard languages. However, no such system exists for the Telugu language, so its speakers are forced to use the national language rather than the regional language of their culture, with the same issues of cumbersome hardware or the need for an interpreter. This paper proposes a system that detects hand gestures and signs from a real-time video stream, processed with the help of computer vision and classified with the YOLOv3 object detection algorithm; the labels are then mapped to the corresponding Telugu text. The style of learning is transfer learning, unlike conventional CNNs, RNNs, or traditional machine learning models: a pre-trained model is applied to a completely new problem and adapts to the new problem's requirements efficiently. This requires less training effort in terms of dataset size and gives greater accuracy. It is the first system developed as a sign language translator for Telugu script, and it has given the best results compared to existing systems. The system is trained on 52 Telugu letters, 10 numbers, and 8 frequently used Telugu words.
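An illustrative sketch of the detection and label-mapping step the abstract describes, using OpenCV's DNN module to run a YOLOv3 network; the config/weights paths, the detection-model wrapper, and the tiny class-to-Telugu table are all assumptions.

```python
import cv2

net = cv2.dnn.readNetFromDarknet("telugu_yolov3.cfg", "telugu_yolov3.weights")
model = cv2.dnn_DetectionModel(net)
model.setInputParams(size=(416, 416), scale=1 / 255.0, swapRB=True)

# Class index -> Telugu label (a tiny excerpt; the real system covers
# 52 letters, 10 numbers, and 8 words).
telugu_labels = {0: "అ", 1: "ఆ", 2: "ఇ"}

frame = cv2.imread("gesture.jpg")                  # hypothetical frame
class_ids, confidences, boxes = model.detect(frame, confThreshold=0.5)
for cid, conf in zip(class_ids, confidences):
    print(telugu_labels.get(int(cid), "?"), float(conf))
```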
Computer Vision-Based Bengali Sign Language To Text Generation
2022 IEEE 5th International Conference on Image Processing Applications and Systems (IPAS), 2022
Around 7% of the world's population has hearing and speech impairments and uses sign language as its communication method. In our country, many people are born with hearing and speech impairments, so our primary focus is to serve those people by converting Bangla sign language into text. Various projects on Bangla sign language already exist, but they focus mostly on separate alphabets and numerals. We therefore concentrate on Bangla word signs, since communication is done using words or phrases rather than letters. There is no proper database for Bangla word sign language, so we built a database for our work using BDSL. Sign language recognition (SLR) usually comes in two scenarios: isolated SLR, which recognizes signs word by word, and continuous SLR, which translates a whole sentence at once. We work on isolated SLR. We introduce a method using PyTorch and YOLOv5 for a video classification model that converts Bangla sign language into text from videos, where each video contains only one sign language word. We achieved an accuracy of 76.29% on the training dataset and 51.44% on the testing dataset. We are working to build a system that will make it easier for hearing- and speech-disabled people to interact with the general public.
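A hedged sketch of isolated SLR with PyTorch and YOLOv5, as the abstract outlines: the detector runs per frame and a simple majority vote over the video yields the single word sign. The weights path and the voting rule are assumptions, not the authors' exact method.

```python
import collections
import cv2
import torch

model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="bdsl_yolov5.pt")      # hypothetical BdSL weights

def classify_video(path):
    """Return the most frequently detected word sign across all frames."""
    votes = collections.Counter()
    cap = cv2.VideoCapture(path)
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        det = model(frame[..., ::-1])              # BGR -> RGB
        for *_, conf, cls in det.xyxy[0].tolist():
            votes[model.names[int(cls)]] += 1
    cap.release()
    return votes.most_common(1)[0][0] if votes else None

print(classify_video("sign_word.mp4"))
```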
Real-Time Detection and Translation for Indian Sign Language using Motion and Speech Recognition
International Journal for Research in Applied Science & Engineering Technology (IJRASET), 2022
Being able to communicate effectively is perhaps one of the most important life skills of all. Speech is the primary form of communication and is what enables us to express ourselves, but life can become very difficult if one lacks the gift of hearing. These people communicate with sign language, a distinctive yet exclusive language developed so that the deaf community can be a part of the common culture. In India, a large population depends on this form of communication; however, due to the lack of awareness of sign language in our day-to-day lives, they feel isolated and disconnected from the world. Therefore, we have created a platform which can bridge this gap of isolation and misunderstanding using the concepts of deep learning. Sign-L is a sign language translator which can translate actions to text, and voice to actions through animation. Beyond translation, it also provides tutorials to learn sign language and to raise the much-needed awareness among others.
Sign Language to Text-Speech Translator Using Machine Learning
International Journal of Emerging Trends in Engineering Research, 2021
Communication with deaf and mute people is quite a difficult task for others. One can communicate with them through sign language, but it is difficult for ordinary people to understand, which creates a huge gap and makes it hard to exchange ideas and thoughts. This gap has existed for years; to minimize it, new technologies must emerge. An interpreter is therefore necessary to act as a bridge between the deaf-mute and others. This paper proposes such a system: a sign language translator. The system uses an American Sign Language (ASL) dataset that is pre-processed based on threshold and intensity. It recognizes the sign language alphabet, joins the letters to form a sentence, and then converts the text to speech. Since sign language recognition is hand-based, an efficient hand-tracking technique provided by the cross-platform MediaPipe framework is used, which accurately detects the hand; an ANN architecture is then trained to classify the images. The system achieves 74% accuracy and recognizes almost all letters. Because it also converts the recognized text to speech, it is helpful for blind people as well.
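A minimal sketch of the pipeline this abstract describes: MediaPipe hand tracking supplies landmarks, a small ANN classifies the letter, and the result is spoken with text-to-speech. The trained classifier file, the landmark-vector feature layout, and the pyttsx3 speech step are assumptions.

```python
import cv2
import numpy as np
import mediapipe as mp
import pyttsx3
from tensorflow.keras.models import load_model

classifier = load_model("asl_ann.h5")    # hypothetical ANN over 21 landmarks x (x, y, z)
labels = [chr(ord("A") + i) for i in range(26)]
hands = mp.solutions.hands.Hands(max_num_hands=1)

frame = cv2.imread("letter.jpg")         # hypothetical single frame
result = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
if result.multi_hand_landmarks:
    # Flatten the 21 hand landmarks into a 63-dim feature vector.
    lm = result.multi_hand_landmarks[0].landmark
    features = np.array([[p.x, p.y, p.z] for p in lm]).flatten()[np.newaxis]
    letter = labels[int(classifier.predict(features, verbose=0).argmax())]

    engine = pyttsx3.init()              # speak the recognized letter
    engine.say(letter)
    engine.runAndWait()
```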