Quantized compressive sampling of stochastic gradients for efficient communication in distributed deep learning

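## In a nutshell
Proposes Quantized Compressive Sampling, which addresses three problems of existing gradient compression methods: increased total variance of the gradients, an upper bound on the achievable compression ratio, and implicitly introduced bias.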

### Paper link
https://ojs.aaai.org/index.php/AAAI/article/view/5706

### Authors / Affiliations
Afshin Abdi, Faramarz Fekri
School of Electrical and Computer Engineering
Georgia Institute of Technology, Atlanta, GA

### Submission date (yyyy/MM/dd)
AAAI
2020/04/03
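## Overview
Quantization introduces several problems:
- Quantization noise increases the total variance of the gradients, so training can fail to converge with hyperparameters that would normally converge. The learning rate then has to be lowered, which increases the number of iterations.
- Convergence is not guaranteed for biased quantization methods.
- Small gradients are rounded to 0, so the corresponding parameters stop changing.

(a) Final convergence rate at a given learning rate
(b), (c) Convergence rate per iteration at a given learning rate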
![image](https://user-images.githubusercontent.com/47178807/120739470-81b09b80-c52c-11eb-8011-8ab90d7aaee7.png)
Error feedback is a technique that addresses these problems.
Each node keeps the error between the values before and after quantization.
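
As a rough illustration of error feedback, the sketch below uses the commonly described scheme in which the stored error is added back to the next gradient before quantizing. The scalar gradient stream, the round-to-nearest quantizer, and the names `quantize` and `run` are assumptions made for this example, not details taken from the paper.

```python
import math

def quantize(x, step):
    """Deterministic round-to-nearest onto a grid of width `step` (ties round up)."""
    return step * math.floor(x / step + 0.5)

def run(gradients, step, error_feedback):
    """Return the values a node would transmit for a stream of scalar gradients."""
    residual = 0.0  # quantization error carried over from earlier iterations
    sent = []
    for g in gradients:
        # With error feedback, the stored error is added back before quantizing.
        corrected = g + residual if error_feedback else g
        q = quantize(corrected, step)
        if error_feedback:
            residual = corrected - q  # what was lost in this round
        sent.append(q)
    return sent

grads = [0.25] * 8
plain = run(grads, step=1.0, error_feedback=False)
with_ef = run(grads, step=1.0, error_feedback=True)
print(sum(plain), sum(with_ef), sum(grads))
```

In this toy stream every gradient is below half the grid width, so plain rounding never transmits anything, while the error-feedback variant eventually transmits the accumulated amount.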

## Novelty / differences
Points out the problems caused by quantization (see Overview) and proposes a method that solves them.


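## Method
Consider a k × n matrix T that maps an n-dimensional gradient to k dimensions and is also used to reconstruct it. Let g (n × 1) be the original gradient and v (k × 1) the mapped gradient, i.e. v = Tg.
T is the product T = HR, where H consists of the top k rows of the Hadamard matrix and R is a diagonal matrix whose diagonal entries are randomly -1 or 1.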
![image](https://user-images.githubusercontent.com/47178807/120741364-e5889380-c52f-11eb-9e14-8e146ee74c32.png)
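
The construction T = HR can be sketched in a few lines. This is a minimal illustration that assumes the Sylvester form of the Hadamard matrix (n a power of two) with unnormalized ±1 entries; the function names `hadamard`, `make_projection`, and `project` are hypothetical, and the paper may normalize H differently.

```python
import random

def hadamard(n):
    """Sylvester construction of the n x n Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    H = [[1]]
    while len(H) < n:
        H = [row + row for row in H] + [row + [-x for x in row] for row in H]
    return H

def make_projection(n, k, seed):
    """Return T = H_k R as a k x n list of lists.

    H_k is the top k rows of the Hadamard matrix; R is a diagonal matrix of
    random signs, so multiplying by R flips the sign of whole columns of H_k.
    """
    rng = random.Random(seed)
    signs = [rng.choice((-1, 1)) for _ in range(n)]
    H = hadamard(n)
    return [[H[i][j] * signs[j] for j in range(n)] for i in range(k)]

def project(T, g):
    """Compute v = T g for a plain Python list g."""
    return [sum(t * x for t, x in zip(row, g)) for row in T]

T = make_projection(n=8, k=3, seed=0)
v = project(T, [0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 0.25, 1.0])
print(v)
```

Storing T explicitly costs k × n entries, which is fine for illustration; for large n a fast Walsh-Hadamard transform avoids materializing the matrix.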
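The Quantized Compressive Sampling of g is then given by the expression below, where s is the quantization scale factor and e is drawn from the uniform distribution on (-1/2, 1/2).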
![image](https://user-images.githubusercontent.com/47178807/120741673-6a73ad00-c530-11eb-9abe-e047340a89b5.png)
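
The exact formula is only in the figure, so the following is a sketch of one common form of dithered quantization rather than the paper's definition. It assumes each entry is divided by a grid width `step` (a hypothetical stand-in for the scale factor s, whose convention in the paper may be the reciprocal), perturbed by uniform noise e, and rounded to the nearest integer.

```python
import math
import random

def dithered_quantize(v, step, rng):
    """Map each entry to an integer code: round(x / step + e), e ~ U(-1/2, 1/2).

    floor(y + 0.5) is used instead of round() to avoid banker's rounding.
    """
    return [math.floor(x / step + rng.uniform(-0.5, 0.5) + 0.5) for x in v]

def dequantize(codes, step):
    """Map integer codes back to real values on the grid of width `step`."""
    return [step * c for c in codes]

rng = random.Random(0)
v = [0.3, -1.7, 2.45]
codes = dithered_quantize(v, step=0.5, rng=rng)
print(codes, dequantize(codes, 0.5))
```

With this form, a value that sits a fraction f of the way between two grid points is rounded up with probability f, so the decoded value equals the input in expectation even though each individual code is coarse.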
From this v̂, we want a transform that yields an estimate ĝ with E[ĝ] = g.
![image](https://user-images.githubusercontent.com/47178807/120742519-fafebd00-c531-11eb-9331-41b6faf58231.png)
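
One way to see why such an unbiased estimate exists is the short derivation below. It assumes that H has unnormalized ±1 entries, that the diagonal signs r_1, …, r_n of R are independent and uniform on {-1, 1}, and that the quantizer is unbiased given R, i.e. E[v̂ | R] = Tg. The 1/k scaling is a consequence of these assumptions and may differ from the normalization used in the paper.

```latex
\begin{aligned}
\left(T^\top T\right)_{ij} &= r_i\, r_j \sum_{m=1}^{k} H_{mi} H_{mj}, \\
\mathbb{E}_R\!\left[\left(T^\top T\right)_{ij}\right]
  &= \delta_{ij} \sum_{m=1}^{k} H_{mi}^2 = k\,\delta_{ij}, \\
\hat g = \tfrac{1}{k}\, T^\top \hat v
  \;\Longrightarrow\;
  \mathbb{E}[\hat g] &= \tfrac{1}{k}\, \mathbb{E}_R\!\left[T^\top T\right] g = g .
\end{aligned}
```

Without R, the matrix H^T H / k is fixed and has rank k, so every estimate would lie in the same k-dimensional subspace and could not be unbiased for general g.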
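After a derivation that is omitted here, the result is a trade-off between the size of the variance and the mapped dimension k (the communication volume).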
![image](https://user-images.githubusercontent.com/47178807/120742628-34cfc380-c532-11eb-9e4e-1738ec3f43d4.png)
![image](https://user-images.githubusercontent.com/47178807/120742665-4749fd00-c532-11eb-81a0-a1d660507a1a.png)
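Algorithm: map the gradient to k dimensions, quantize it, communicate it, and then reconstruct it with the transform matrix.
This presupposes that both sides use the same seed, so that the receiver can regenerate the same random matrix exactly.
Adding the random component presumably prevents the gradient from always being projected onto a fixed k-dimensional subspace of R^n?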
![image](https://user-images.githubusercontent.com/47178807/120743678-71042380-c534-11eb-84fe-43cbb43d43dc.png)
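
The steps above (project, quantize, communicate, reconstruct with a shared seed) can be sketched end to end as follows. This is an illustration under stated assumptions, not the paper's algorithm: it uses a fast Walsh-Hadamard transform in Sylvester order, a dithered rounding quantizer with hypothetical grid width `step`, and a 1/k rescaling on the receiver. All function names are made up for the example.

```python
import math
import random

def fwht(x):
    """Unnormalized fast Walsh-Hadamard transform (Sylvester order); len(x) must be a power of two."""
    a = list(x)
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                a[j], a[j + h] = a[j] + a[j + h], a[j] - a[j + h]
        h *= 2
    return a

def shared_signs(n, seed):
    """Diagonal of R; sender and receiver call this with the same seed."""
    rng = random.Random(seed)
    return [rng.choice((-1, 1)) for _ in range(n)]

def compress(g, k, step, signs, dither_rng):
    """Sender: v = H_k R g, then dithered rounding of v / step to integer codes."""
    v = fwht([s * x for s, x in zip(signs, g)])[:k]
    return [math.floor(x / step + dither_rng.uniform(-0.5, 0.5) + 0.5) for x in v]

def decompress(codes, n, step, signs):
    """Receiver: g_hat = (1/k) R H_k^T (step * codes), using symmetry of H."""
    k = len(codes)
    padded = [step * c for c in codes] + [0.0] * (n - k)
    return [s * b / k for s, b in zip(signs, fwht(padded))]

g = [0.5, -1.25, 2.0, 0.0, 0.75, -0.5, 1.5, -2.0]
codes = compress(g, k=4, step=0.125, signs=shared_signs(len(g), seed=7),
                 dither_rng=random.Random())
# Only `codes` and the seed need to be communicated; R is regenerated here.
g_hat = decompress(codes, len(g), 0.125, shared_signs(len(g), seed=7))
print(codes)
print(g_hat)
```

A single reconstruction with k < n is noisy; the estimate is only correct on average over the random signs, which is where the variance versus k trade-off mentioned earlier comes from.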




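## Results
Training convergence appears to be unaffected, but there were no experimental results on the all-important communication time.
Experiments were run on FC, LeNet, CifarNet, and AlexNet.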

The proposed method also compresses, yet it is still fast.
![image](https://user-images.githubusercontent.com/47178807/120744836-d78a4100-c536-11eb-8bf4-50dafccd651d.png)

Accuracy as the number of nodes increases
![image](https://user-images.githubusercontent.com/47178807/120744891-f983c380-c536-11eb-89de-d878b13f3341.png)

Convergence rate
![image](https://user-images.githubusercontent.com/47178807/120744947-115b4780-c537-11eb-80c5-d0dc73a37565.png)


## Comments

