dc.description.abstract | Sign language is a form of visual communication that relies on a combination of hand gestures, facial expressions, and body language to convey meaning. It is used daily by millions of deaf or hard-of-hearing individuals worldwide, as well as by those who communicate with them. Despite its importance, however, sign language recognition and translation remain challenging tasks because of the complexity and variability of sign language.
In recent years, computer vision techniques have been increasingly applied to sign language recognition and translation, with promising results. In this work, we introduce a sign language display system based on three-dimensional body modeling [1] and virtual try-on [2]. Our approach uses body mesh estimation to generate a 3D human model of the signer, which then serves as input to a multi-garment network [2] that simulates the appearance of clothing on the signer.
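As a rough illustration of this two-stage design, the following is a minimal Python sketch rather than the system's actual code: it pairs an SMPL-style parametric body model (here via the smplx package, one common choice for body modeling in the spirit of [1]) with a stub standing in for the multi-garment network of [2]; the MultiGarmentNet class, its behavior, and the model path are hypothetical placeholders.

    import torch
    import smplx

    class MultiGarmentNet(torch.nn.Module):
        # Hypothetical stub for the multi-garment network of [2]; a real
        # implementation would predict garment geometry and drape it on the body.
        def forward(self, body_vertices, garment_id):
            return body_vertices  # identity: returns the undressed mesh

    # Parametric body model: shape (betas) and pose parameters map to a 3D mesh.
    # Requires SMPL model files under 'models/'.
    body_model = smplx.create('models/', model_type='smpl', gender='neutral')

    betas = torch.zeros(1, 10)         # body shape coefficients (from estimation)
    body_pose = torch.zeros(1, 69)     # 23 joints x 3 axis-angle values
    global_orient = torch.zeros(1, 3)  # root orientation

    output = body_model(betas=betas, body_pose=body_pose,
                        global_orient=global_orient)
    body_vertices = output.vertices    # (1, 6890, 3) undressed body mesh

    garment_net = MultiGarmentNet()
    dressed_vertices = garment_net(body_vertices, garment_id='t-shirt')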
We collected a dataset of 100 sign language videos, each featuring a different signer performing a range of signs. For each video, we first use YOLOv5 [17] to crop out the signer, which provides cleaner input for human mesh estimation. We then apply a body mesh estimation algorithm, designed to improve the accuracy of wrist rotation, to extract the signer's body model from each frame, and apply a virtual try-on method to simulate different types of clothing on the signer. The result is a virtual human model whose pose and shape match those of the original signer, with clothing selected from a garment dataset. Finally, we combine these models frame by frame to generate a video of a virtually clothed human model performing the sign language.
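The per-frame pipeline could be sketched as follows; this is again a minimal illustration under stated assumptions, not the system's code. YOLOv5 is loaded through torch.hub, and estimate_body_mesh and render_dressed_model are hypothetical placeholders for the wrist-aware mesh-estimation and virtual try-on stages described above.

    import cv2
    import torch

    # YOLOv5 via torch.hub; in COCO, class 0 is 'person'.
    detector = torch.hub.load('ultralytics/yolov5', 'yolov5s')

    cap = cv2.VideoCapture('signer.mp4')
    fps = cap.get(cv2.CAP_PROP_FPS)
    writer = None

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # Detect the signer and crop the frame to the person box.
        rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        dets = detector(rgb).xyxy[0]  # columns: x1, y1, x2, y2, conf, class
        persons = dets[dets[:, 5] == 0]
        if len(persons) == 0:
            continue
        x1, y1, x2, y2 = persons[0, :4].int().tolist()
        crop = frame[y1:y2, x1:x2]

        mesh = estimate_body_mesh(crop)            # hypothetical: pose/shape from crop
        out = render_dressed_model(mesh, 'shirt')  # hypothetical: try-on + rendering

        if writer is None:
            h, w = out.shape[:2]
            writer = cv2.VideoWriter('virtual_signer.mp4',
                                     cv2.VideoWriter_fourcc(*'mp4v'), fps, (w, h))
        writer.write(out)

    cap.release()
    if writer is not None:
        writer.release()

| en_US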