This project has been submitted for assessment in the L46 Principles of Machine Learning Systems module of the Computer Science Tripos, Part III, at the University of Cambridge.
This project investigates Byzantine attacks and Byzantine fault tolerance in the distributed training of deep learning models. Its objective is to evaluate the Byzantine-Tolerant All-Reduce (BTARD) algorithm of Gorbunov et al. (2021) in a controlled simulation environment, using a different convolutional model and additional types of Byzantine attacks beyond those considered in the original work.
The implementation of this project builds on the implementation at: https://github.com/yandex-research/btard.
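As a rough illustration of what such a controlled simulation involves, the sketch below shows Byzantine workers replacing their true gradients with corrupted ones (sign flipping or random noise) before aggregation. It is a minimal, self-contained PyTorch example: the function names (`sign_flip_attack`, `random_noise_attack`, `aggregate`) are illustrative only and do not correspond to the BTARD codebase or the experiment notebook, and plain averaging is shown only to mark the point where a robust aggregation rule such as BTARD's would be substituted.

```python
# Hypothetical sketch (not from the BTARD code): simulating Byzantine workers
# by corrupting their gradients before aggregation.
import torch


def sign_flip_attack(grad: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
    """Byzantine worker sends the negated (optionally scaled) gradient."""
    return -scale * grad


def random_noise_attack(grad: torch.Tensor, std: float = 10.0) -> torch.Tensor:
    """Byzantine worker sends Gaussian noise instead of its true gradient."""
    return torch.randn_like(grad) * std


def aggregate(gradients, byzantine_ids, attack):
    """Apply the chosen attack to the Byzantine workers, then average.

    Plain averaging is used here only to illustrate why a robust aggregator
    (as in BTARD) is needed; it is not the aggregation rule under test.
    """
    corrupted = [
        attack(g) if i in byzantine_ids else g
        for i, g in enumerate(gradients)
    ]
    return torch.stack(corrupted).mean(dim=0)


# Example: 4 simulated workers, worker 3 is Byzantine.
grads = [torch.randn(10) for _ in range(4)]
print(aggregate(grads, byzantine_ids={3}, attack=sign_flip_attack))
```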
References: Gorbunov, E., Borzunov, A., Diskin, M., & Ryabinin, M. (2021). Secure Distributed Training at Scale. arXiv preprint arXiv:2106.11257.
The repository contains:
- The written report (PDF): L46_Project_report.pdf
- The written report source (LaTeX): L46_project_report
- The experiment implementation (Jupyter Notebook): L46_project_experiment.ipynb
- The documentation of decisions and planning: doc decisions and planning.txt
- The setup and installation guide: setup guide.txt