
SamuelDiai/SLT-with-pose-estimations

 
 


Sign Language Transformers improved using Pose estimation Keypoints features.

This work was carried out as a project for the course "Object Recognition and Computer Vision" taught by Jean Ponce, Ivan Laptev, Cordelia Schmid and Josef Sivic at ENS Ulm. In this project, I improved an existing sign language translation architecture by incorporating pose estimates of the signer's body, face and hands.

The original Sign Language Transformer comes from Camgoz et al., Sign Language Transformers: Joint End-to-end Sign Language Recognition and Translation.

This code is based on the Joey NMT repository, which I modified to perform joint continuous sign language recognition and translation with additional pose estimation keypoints.

My contribution was to add pose estimation keypoints (2D or 3D) for the PHOENIX-2014T dataset, extracted with the DOPE algorithm (code: https://github.com/naver/dope, paper: https://www.ecva.net/papers/eccv_2020/papers_ECCV/papers/123710375.pdf).
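As a rough illustration of how 2D or 3D keypoints can become per-frame input features, the sketch below flattens the body, hand and face keypoints of each frame into one vector. The keypoint counts in `NUM_KEYPOINTS`, the helper names and the random data are placeholders chosen for this example; they are not taken from this repository or from the DOPE output format.

```python
import numpy as np

# Placeholder keypoint counts per channel (assumption for illustration only).
NUM_KEYPOINTS = {"body": 13, "hand": 21, "face": 84}


def keypoints_to_features(keypoints: dict) -> np.ndarray:
    """Flatten per-channel keypoints of shape (T, K, D) into a (T, F) matrix.

    T is the number of frames, K the number of keypoints in the channel and
    D the coordinate dimension (2 for 2D, 3 for 3D).
    """
    parts = []
    for name in sorted(keypoints):  # fixed channel order: body, face, hand
        kp = np.asarray(keypoints[name], dtype=np.float32)
        num_frames = kp.shape[0]
        parts.append(kp.reshape(num_frames, -1))  # (T, K * D)
    return np.concatenate(parts, axis=1)


def make_dummy_keypoints(num_frames: int, dim: int, seed: int = 0) -> dict:
    """Random stand-in for the output of a pose estimator."""
    rng = np.random.default_rng(seed)
    return {
        name: rng.standard_normal((num_frames, count, dim))
        for name, count in NUM_KEYPOINTS.items()
    }


features_2d = keypoints_to_features(make_dummy_keypoints(num_frames=5, dim=2))
features_3d = keypoints_to_features(make_dummy_keypoints(num_frames=5, dim=3))
print(features_2d.shape, features_3d.shape)
```

With these placeholder counts, moving from 2D to 3D keypoints only changes the width of the feature matrix, so the rest of a model can stay the same apart from its input dimension.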

The final report of this project can be found here

Improvements

I described and implemented three fusion types (early fusion, mid fusion and late fusion) that merge the information from the different channels (hand, body and face keypoints, and images) at different stages of the model.

The three fusion architectures are shown below. They represent three ways of merging the additional pose estimation keypoints into the SLT model. A more detailed description of the architectures can be found in the report.

Early Fusion:

diag_early

Mid Fusion:

diag_mid

Late Fusion:

diag_late
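To make the difference between the three fusion points concrete, here is a minimal NumPy sketch based on the generic meaning of early, mid and late fusion. It is not the code of this repository: the `encoder` function is a one-layer placeholder for a Transformer encoder, all weights are random, and the function names and feature sizes are assumptions. The exact architectures are the ones described in the report.

```python
import numpy as np


def linear(x, out_dim, rng):
    """Random linear projection standing in for a learned layer."""
    in_dim = x.shape[-1]
    weights = rng.standard_normal((in_dim, out_dim)) / np.sqrt(in_dim)
    return x @ weights


def encoder(x, hidden_dim, rng):
    """Placeholder for a Transformer encoder: projection followed by ReLU."""
    return np.maximum(linear(x, hidden_dim, rng), 0.0)


def early_fusion(image_feats, pose_feats, hidden_dim, rng):
    # Concatenate the raw features first, then encode them jointly.
    fused = np.concatenate([image_feats, pose_feats], axis=-1)
    return encoder(fused, hidden_dim, rng)


def mid_fusion(image_feats, pose_feats, hidden_dim, rng):
    # Embed each channel separately, merge the embeddings, then encode.
    image_emb = linear(image_feats, hidden_dim, rng)
    pose_emb = linear(pose_feats, hidden_dim, rng)
    fused = np.concatenate([image_emb, pose_emb], axis=-1)
    return encoder(fused, hidden_dim, rng)


def late_fusion(image_feats, pose_feats, hidden_dim, rng):
    # Encode each channel with its own encoder, merge the encoder outputs.
    image_enc = encoder(image_feats, hidden_dim, rng)
    pose_enc = encoder(pose_feats, hidden_dim, rng)
    fused = np.concatenate([image_enc, pose_enc], axis=-1)
    return linear(fused, hidden_dim, rng)


rng = np.random.default_rng(0)
num_frames, hidden_dim = 5, 64
image_feats = rng.standard_normal((num_frames, 1024))  # hypothetical image features
pose_feats = rng.standard_normal((num_frames, 236))    # hypothetical keypoint features

early_out = early_fusion(image_feats, pose_feats, hidden_dim, rng)
mid_out = mid_fusion(image_feats, pose_feats, hidden_dim, rng)
late_out = late_fusion(image_feats, pose_feats, hidden_dim, rng)
print(early_out.shape, mid_out.shape, late_out.shape)
```

All three variants return one vector per frame of the same size, so the downstream recognition and translation parts do not need to know which fusion type produced it.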

Results

A more detailed description of my work and the associated results can be found in the report here

Requirements

  • Download the feature files using the data/download.sh script.

  • [Optional] Create a conda or python virtual environment.

  • Install required packages using the requirements.txt file.

    pip install -r requirements.txt

Usage

python -m signjoey train configs/sign.yaml

Note: the default data directory is ./data. If you download the feature files somewhere else, you need to update the data_path parameters in your config file.
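If you prefer to update the paths programmatically, the sketch below replaces every `data_path` entry in a configuration that has already been loaded into nested Python dictionaries. The helper `set_data_path` and the layout of `example_config` are hypothetical and only illustrate the idea; check configs/sign.yaml for the actual structure.

```python
import copy


def set_data_path(config, new_path):
    """Return a copy of `config` with every `data_path` value replaced."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched

    def _walk(node):
        if isinstance(node, dict):
            for key, value in node.items():
                if key == "data_path":
                    node[key] = new_path  # overwrite in place on the copy
                else:
                    _walk(value)
        elif isinstance(node, list):
            for item in node:
                _walk(item)

    _walk(updated)
    return updated


# Illustrative config layout (assumption, not the real contents of sign.yaml).
example_config = {
    "data": {"data_path": "./data", "batch_size": 32},
    "training": {"epochs": 10},
}

new_config = set_data_path(example_config, "/mnt/datasets/slt")
print(new_config["data"]["data_path"])
```

Working on a deep copy keeps the original configuration available, which is convenient when the same base config is reused for several runs.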

About

Sign Language Transformers (CVPR'20)
