Implementing Sign Language Recognition System using Flex Sensors
Nwachukwu M. M, Eze C. E, Nnorom O. D, Nwebonyi H. A, Ezeagwu C. O
Department of Electronics and Computer Engineering, Nnamdi Azikiwe University, Awka, Nigeria.
DOI: https://doi.org/10.51584/IJRIAS.2024.909008
Received: 31 August 2024; Accepted: 13 September 2024; Published: 30 September 2024
ABSTRACT
Sign languages rely on a combination of handshapes, facial expressions, and body movements to convey meaning. They are usually learnt at a tender age as one’s first language. This paper presents the design, implementation, and development of a sensor-based smart glove system for the recognition of sign language. The system includes a wearable glove embedded with flex sensors for detecting the hand movements and gestures used in sign language. The information from the sensors is fed into a microcontroller running an algorithm that identifies the signs and converts them into speech. The system is intended to provide an inexpensive and effective means for people with hearing or speech impairments to communicate with others through sign language. Thirty-one phrases were successfully recognised by the implemented system. The performance of the system was evaluated through user studies and tests, and the results are presented to show the effectiveness of the proposed solution.
Keywords: Sign language recognition, sensor-based systems, smart gloves, flex sensors, hand movement detection, gesture recognition, microcontrollers, wearable and assistive technology.
INTRODUCTION
Interacting with people of diverse backgrounds is essential in our dynamic culture, whether for personal growth or professional objectives. Effective communication is important for every human being. However, people who have a hearing disability and/or a speech disability need a way to communicate other than vocal communication; they converse with each other via sign language. Nonetheless, learning and understanding sign language is quite difficult, and not everyone is able to interpret the meaning of the movements. Acquiring proficiency in sign language also takes time, because there is no practical, portable tool for reading it. People with hearing or speech impairments who are fluent in sign language therefore need a translator who is also fluent in sign language to convey their ideas to others effectively.
For many years, studies have been actively conducted on such systems [1, 2]. In their initial stages, these technologies relied on gloves with sensors that captured hand and finger movements during sign language conversations. Nevertheless, these systems were rarely successful because of the intricacy of sign languages and the difficulty of capturing and interpreting even minute gestures. With the advent of computer vision techniques and machine learning algorithms, researchers began to explore using cameras and other sensors to capture sign language data [3, 4]. These systems analyse video sequences of sign language communication and use computer algorithms to recognise signs and interpret their meaning [5].
Flex sensors can measure bending or flexing with little effort and a relatively low budget. Their lightness, compactness, robustness, measurement effectiveness and low power consumption make these sensors useful for manifold applications in diverse fields [6]. By attaching flex sensors to the smart glove, and thus to the fingers of a signer, the sensors can detect the degree of flexion in each finger and transmit this information to a computer or device for interpretation. This technology can be used to develop sign language recognition systems that can accurately translate sign language into spoken or written language, enabling better communication between people who are deaf or hard of hearing and those who are hearing [5].
The smart glove system translates analog signals from the flex sensors into digital signals that the Raspberry Pi system can understand. The Raspberry Pi system then translates the digital signals into words that the listeners can understand. This communication aids the listener in understanding what the user is attempting to say.
Nevertheless, sign language translation can be quite challenging because sign language differs in structure from spoken and written language. Some approaches proposed to enhance sign language translation involve developing better formulations of sign language syntax and grammar. Other authors have employed computational methods, such as machine learning algorithms, to process large datasets of signed language data and establish the relationships underlying sign language syntax and grammar.
One early, refined methodology for sign language recognition used hidden Markov models (HMMs). HMMs are statistical models that can be used to analyse sequential data and have been used successfully in speech recognition. Researchers adapted HMMs to analyse sign language data, with promising results [7]. In recent times, deep learning approaches, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have found applications in sign language recognition [8]. Specifically, CNNs have proven to be quite effective in analysing visual data such as images and videos, which makes them suitable for handling sign language videos. RNNs, on the other hand, are effective at analysing sequential data, such as the temporal sequence of signs in sign language [7]. Researchers have used RNNs to analyse video sequences of sign language and improve the accuracy of sign language recognition.
The prominent drawback of the popular machine-learning-based approaches highlighted above is the computational complexity of the overall system. Convolutional neural network-based approaches feed video streams to the models using algorithms such as You Only Look Once (YOLO), which are generally slow because of the computational cost involved. The high processing power required by such a system directly results in higher power consumption, which is a major challenge in developing countries. Therefore, this study aims to develop a flex-sensor-based alternative that can deliver a comparable level of accuracy with improved response time and lower power consumption.
REVIEW OF RELATED LITERATURE
The authors of [9] presented a system that allows the automatic translation of static gestures representing alphabets and signs in American Sign Language. The system employed the Hough transform and neural networks so that gesture recognition works without gloves or visual markers, using images of bare hands for a more natural interaction. The process involves converting the image into a feature vector, which is then matched against feature vectors from the training set. The extracted features are invariant to rotation, scaling, and translation, making the system more flexible [9]. Tests were done on a dataset of 300 hand sign images, giving a recognition accuracy of 92.3%.
[10] outlined two systems for recognizing American Sign Language (ASL) words. These systems used artificial neural networks (ANN) to convert ASL words into English. The first system relied on feature vectors captured at five specific moments in time. The second system applied histograms of these feature vectors. To extract gesture features, both systems used a sensory glove (Cyberglove™) and a Flock of Birds® 3-D motion tracker. The systems processed data, including finger joint angles and hand movement paths, through two neural networks. A velocity network figured out word length based on hand speed. A word recognition network sorted ASL signs into words. The researchers trained and tested the models with 60 ASL words. The systems achieved recognition accuracies of 92% and 95% respectively.
[11] discussed a technique to recognise letters and digits of the American Sign Language (ASL) alphabet using saliency detection in images. Following saliency detection, the images were processed with Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) to reduce dimensions, maximise internal class similarity, and minimise external class similarity. The resulting vectors were then trained and classified by neural networks (NN). The system aimed to facilitate communication with deaf individuals and interfacing with computers using standard letters in sign language. The experiments were run on a new benchmark database, and the system achieved a 99.88% recognition rate in 4-fold cross-validation over 4 training sessions, showing very high accuracy and outperforming other techniques.
[12] proposed an NN-based sign language recognition system. Signs were collected using a camera, from which raw and histogram features were extracted; accuracies of 70% and 85% were obtained on average for the raw and histogram features, respectively. [13] developed a multimodal fusion sign language recognition system, gathering 1,400 one-handed static gestures in all using Kinect and Leap Motion sensors. The data were classified with a CNN, and the highest accuracy of 97% was achieved by combining colour, depth, and Leap Motion data. [14] developed a vision-based static hand gesture recognition system and collected a total of 2,040 alphabet signs. The collected signs were segmented using median filtering, and an accuracy of 91.33% was achieved using a CNN.
[15] presented a method for finger-spelling recognition of the ASL alphabet using a k-Nearest Neighbours (k-NN) classifier. The research also examined the impact of Principal Component Analysis (PCA) on the performance of the k-NN classifier when used for dimensionality reduction. Empirical results showed that the k-NN classifier achieved the highest accuracy (99.8% for k = 3) when the pattern was represented by the full-dimensional feature set. However, the classifier achieved only 28.6% accuracy for k = 5 when the pattern was represented by PCA-reduced features. PCA’s inability to separate the data was due in part to the large number of redundant or highly correlated features among ASL alphabets, and this lack of separation led to the loss in accuracy. The k-NN classifier was more accurate, but it took much longer to recognise signs. The k-NN classifier was therefore found appropriate for self-assessment systems for special needs students learning ASL alphabet finger-spelling.
SYSTEM DESIGN AND ANALYSIS
Considerations were made regarding the device’s size, design complexity, cost, and power consumption. The power supply section, which consists of a 10,000 mAh power bank, supplies power to both the microcontroller (PIC16F877A) and the Raspberry Pi, which in turn supplies the output unit (the speaker). The rechargeable battery is used to supply 5 V to the PIC microcontroller; its positive and negative terminals are connected to a 5-volt pin on the PIC16F877A, placing a voltage across it.
The flex sensor’s resistance varies with the amount of bending of the flex, and the reading also depends on the value of the series resistor connected to it. The 5 V EMF applied to the sensor, acted upon by the series resistance and the degree of bending of the flex, results in a voltage drop across the voltage divider circuit of each flex sensor.
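As an illustration of this voltage-divider behaviour, the short sketch below computes the expected output voltage for a few flex resistances. The 5 V supply follows the description above, but the 10 kΩ series resistor and the flat/bent resistance range are assumed values used only for illustration, not component values taken from the prototype.

```python
# Illustrative voltage-divider calculation for one flex sensor channel.
# Assumed values (not from the paper): a 10 kOhm series resistor and a
# flex resistance of roughly 25 kOhm when flat to 100 kOhm when fully bent.

VCC = 5.0          # supply voltage in volts (as stated in the paper)
R_SERIES = 10_000  # fixed series resistor in ohms (assumed value)

def divider_output(r_flex_ohms: float) -> float:
    """Voltage seen by the ADC when the fixed resistor is on the low side."""
    return VCC * R_SERIES / (R_SERIES + r_flex_ohms)

for r_flex in (25_000, 50_000, 100_000):  # flat, half bent, fully bent (assumed)
    print(f"R_flex = {r_flex / 1000:.0f} kOhm -> Vout = {divider_output(r_flex):.2f} V")
```

Each finger therefore produces a distinct analog voltage that falls as the finger bends, which is what the microcontroller’s ADC samples.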
Fig. 1: Architecture of the Proposed System
Functional Units of the System
a. Power Supply Unit
This system requires a DC power supply of 5 V for the microcontroller circuit and the Raspberry Pi. A power bank with a capacity of 10,000 mAh at 5 V serves as the system’s power source, since it provides a stable voltage and current.
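For a rough sense of battery life, the estimate below divides the power bank’s 10,000 mAh capacity by the approximately 1 A draw recorded in the power supply test (Table 1). It is a simplification that ignores converter losses and the bank’s own efficiency, so the actual runtime will be somewhat lower.

```python
# Back-of-envelope runtime estimate for the 10,000 mAh power bank.
# The ~1 A figure comes from the supply test in Table 1; converter and
# battery losses are ignored, so the real runtime will be shorter.

capacity_mah = 10_000   # power bank capacity at 5 V output
draw_ma = 1_000         # approximate total system current draw (Table 1)

runtime_hours = capacity_mah / draw_ma
print(f"Estimated runtime: about {runtime_hours:.0f} hours")
```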
b. Smart Glove
The smart glove is made up of five flex sensors, whose resistance changes in response to finger bending, together with a microcontroller. Because of the flex sensors’ variable resistance, a voltage divider arrangement is used to obtain different voltage readings (analog signals) for the microcontroller, which converts them into digital signals using its ADC module and sends them to the Raspberry Pi unit over the UART communication protocol. The smart glove was powered from pin 2 and pin 6 of the Raspberry Pi. This method was shown to be quite effective throughout testing.
c. The Raspberry Pi unit
This unit powers the other units and also houses the code that runs the user’s guide; hence it is essential to the entire project. The 40-pin GPIO header was used to interface with the smart glove. The project made use of one of the four UARTs available on the Raspberry Pi 4: only uart0/1, exposed through GPIO pins 14 and 15, is enabled by default, and this is the interface the smart glove uses to connect with the Raspberry Pi unit. The extra UARTs may be enabled through device tree overlays. A level-shifter circuit was added to scale the voltage down to 3.3 volts at pin 10 (GPIO 15) so that the Raspberry Pi unit could operate reliably, since its GPIO pins cannot withstand 5 volts. For audible output to the user, a speaker or earphones are connected to the 4-pole stereo audio jack. This unit was tested and performed effectively.
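A minimal sketch of the Pi-side serial handling is shown below. It assumes the pyserial package, the default uart0 device exposed as /dev/serial0, a 9600 baud link, and a simple frame of five comma-separated ADC values per line; the baud rate and frame format are illustrative assumptions rather than details taken from the prototype firmware.

```python
# Minimal sketch: read one frame of five flex-sensor readings from the
# microcontroller over uart0 (/dev/serial0) on the Raspberry Pi.
# The 9600 baud rate and the comma-separated line format are assumptions.
from __future__ import annotations

import serial  # pyserial


def read_flex_frame(port: str = "/dev/serial0", baud: int = 9600) -> list[int]:
    """Return the five digitised flex readings from one line of UART data."""
    with serial.Serial(port, baudrate=baud, timeout=1.0) as uart:
        line = uart.readline().decode("ascii", errors="ignore").strip()
    # e.g. "512,300,870,640,120" -> [512, 300, 870, 640, 120]
    return [int(value) for value in line.split(",") if value]


if __name__ == "__main__":
    print(read_flex_frame())
```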
System Implementation and Evaluation
Software Development
The software is decomposed into subsystems so that each subsystem can be individually tested as a unit and debugged before the subsystems are integrated and tested as a complete software system, ensuring that the software design meets the specification. The system flowchart is shown in Fig. 2.
System Evaluation
Python and C are the two programming languages used in this system design. The C language was used to program the microcontroller for the analog-to-digital conversion of the voltage signals, while Python was used to program the Raspberry Pi. The microcontroller receives and converts the analog voltage signals from the flex sensors into a digital stream of data before sending it to the Raspberry Pi unit. By interpreting the sensor information, the system can measure how far each finger bends and then carry out the necessary control action, which in this case is calling out the intended gesture.
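The recognition step can be sketched as a nearest-match lookup of the five digitised readings against stored reference patterns, followed by a text-to-speech call. The reference values, the distance tolerance, and the use of the espeak command-line tool below are illustrative assumptions and not the exact logic of the deployed program.

```python
# Illustrative gesture lookup on the Raspberry Pi: match five ADC readings
# against stored reference patterns and speak the closest phrase.
# The reference values, tolerance, and use of `espeak` are assumptions.
from __future__ import annotations

import subprocess

# Hypothetical reference patterns: phrase -> expected five-sensor reading.
SIGN_DATABASE = {
    "hello":     [120, 130, 125, 118, 122],
    "thank you": [512, 500, 505, 498, 510],
    "help":      [870, 860, 855, 865, 872],
}


def recognise(reading: list[int], max_distance: int = 200) -> str | None:
    """Return the phrase whose stored pattern is closest to the reading."""
    best_phrase, best_dist = None, None
    for phrase, pattern in SIGN_DATABASE.items():
        dist = sum(abs(a - b) for a, b in zip(reading, pattern))
        if best_dist is None or dist < best_dist:
            best_phrase, best_dist = phrase, dist
    return best_phrase if best_dist is not None and best_dist <= max_distance else None


def speak(text: str) -> None:
    """Voice the recognised phrase; assumes the espeak utility is installed."""
    subprocess.run(["espeak", text], check=False)


if __name__ == "__main__":
    phrase = recognise([515, 498, 507, 495, 512])
    if phrase:
        speak(phrase)
```

A full reference table for the thirty-one phrases reported above would be built in the same way, with one stored pattern per phrase.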
The table below summarises the performance of the entire system as well as the tests carried out to ascertain whether it works according to the desired objectives and specifications. The entire system is evaluated based on the tests, observations, and results captured in Table 1.
Fig. 2: Flow chart for the program Implementation
Table 1: Test Results
S/N | TEST PLAN | EXPECTED TEST RESULT | ACTUAL TEST RESULT |
1 | Power supply voltage | 5Vdc at 1A. | 4.95V at 1A. |
2 | Raspberry Pi response | Very fast | Fast |
3 | Smart Glove Test | Vcc =5V, Glove should respond to test program | Vcc= 4.9V, Glove responded to test program as expected |
4 | Speaker Test | Sound perfect | Sound ok |
CONCLUSION
This research developed a sensor-based smart glove system for sign language recognition, using flex sensors, a Raspberry Pi, and a microcontroller to detect hand gestures and convert them into speech. The system achieved its design goals by creating the flex sensor system, compiling a sign database, and adding text-to-speech functionality. This is a step forward for sign language technology: flex sensors and microcontrollers can improve communication for people who are deaf or hard of hearing. Overall, the system is a good solution for communication and inclusivity between the deaf and hearing communities.
RECOMMENDATIONS
To enhance the system’s precision and consistency across situations, it would be beneficial to integrate noise-reduction algorithms and temperature-compensation methods. Adding an accelerometer alongside the flex sensors could enable tracking of hand motions by capturing extra data points, resulting in better recognition of intricate gestures. Broadening the system’s scope to accommodate different sign language dialects would make it more inclusive and valuable for users with diverse linguistic backgrounds. Moreover, considering alternative sensors and microcontrollers might lower the system’s cost while maintaining its effectiveness, thus increasing accessibility for the individuals who need it.
REFERENCES
- K. K. S. E. Razieh Rastgoo, “Sign Language Recognition: A Deep Survey,” Expert Systems with Applications, vol. Volume 164, p. 113794, 2021.
- B. H. R. B. Helen Cooper, “Sign Language Recognition,” Visual Analysis of Humans, pp. 539–562.
- P. R. S. M. A. D. S. D. S. B. Debasree Mitra, “Sign Language Recognition with Machine Learning,” in Advances in Communication, Devices and Networking, Singapore.
- O. B. M. A. I.A. Adeyanju, “Machine learning methods for sign language recognition: A critical review and analysis,” Intelligent Systems with Applications, vol. 12, no. 200056, 2021.
- N. D. C. Society, “National Deaf Children’s Society,” [Online]. Available: https://www.ndcs.org.uk/information-and-support/language-and-communication/sign-language/what-is-sign-language/#:~:text=Sign%20language%20is%20a%20visual,sign%20languages%20in%20the%20world. [Accessed 10th August 2024].
- F. R. L. S. L. R. Q. Giovanni Saggio, “Resistive flex sensors: a survey,” Smart Materials and Structures, vol. 25.
- O. N. H. &. B. R. Koller, “Deep sign: hybrid CNN-HMM for continuous sign language recognition,” Computer Vision and Image Understanding, vol. 180, pp. 53-64, 2018.
- K. K. N. C.-H. C. H. L. S. C. T. T. C.K.M. Lee, “American sign language recognition and training method with recurrent neural network,” Expert Systems with Applications, vol. 167, 2021.
- M. H. B. T. H. A. A.-M. Qutaishat Munib, “American sign language (ASL) recognition based on Hough transform and neural networks,” Expert Systems with Applications, vol. 32, no. 1, pp. 24-37, 2007.
- M. C. L. Cemil Oz, “Linguistic properties based on American Sign Language isolated word recognition with artificial neural networks using a sensory glove and motion tracker,” Neurocomputing, vol. 70, no. 16–18, pp. 2891-2901, 2007.
- M. A. R. K. H. Zamani, “Saliency based alphabet and numbers of American sign language recognition using linear feature extraction,” in 2014 4th International eConference on Computer and Knowledge Engineering (ICCKE), 2014.
- M. A. C. M. M. M. A. S. Dr.Mahesh kaluti, “Convolutional Neural Network for Detection of Sign Language,” International Journal of Computer Trends and Technology (IJCTT), vol. 67, no. 5, pp. 34-37, 2019.
- P. A. C. J. A. R. A. Ferreira, “Multimodal Learning for Sign Language Recognition,” in Iberian Conference on Pattern Recognition and Image Analysis, 2017.
- O. A. K. A. Oyedotun, “Deep learning in vision-based static hand gesture recognition,” Neural Computing and Applications, vol. 28, 12 2017.
- A. H. Y. Dewinta, “American Sign Language-Based Finger-spelling Recognition using k-Nearest Neighbours Classifier,” in The 3rd International Conference on Information and Communication Technology, Bali, Indonesia, 2015.
- M. C. Staff, “Sign language – Understanding the basics,” 20 May 2020. [Online]. [Accessed 22 April 2023].