1 Introduction
2 Organization of the Notes
3 The Push-Pull Method
3.1 Introduction
3.2 Analysis of Convergence
3.2.1 Relationship between two iteration steps
3.2.2 Inequalities
3.2.3 Spectral radius of \(A\)
4 Distributed Stochastic Gradient Tracking (DSGT) Method
4.1 Introduction
4.2 Analysis of Convergence
4.2.1 Relationship between two iteration steps
4.2.2 Inequalities
4.2.3 Spectral radius of \(A_{dsgt}\)
5 Summary of Push-Pull and DSGT
5.1 Questions
6 Gossip-like Push-Pull and DSGT
6.1 G-Push-Pull
7 Asymptotic network independence
7.1 SGD and DSGD
7.2 Bounds
7.3 Possible ways to achieve asymptotic network independence
8 Some results in asymptotic network independence
8.1 Compressed Communication
8.1.1 CHOCO-SGD
8.1.2 Stochastic gradient push
8.2 \(D^2\)
9 A sharp estimate of the transient time of DSGD
9.1 \(U(k)\) and \(V(k)\)
9.2 Asymptotic network independence of DSGD
9.2.1 Sublinear rate
9.2.2 Asymptotic network independence
9.2.3 Improved Bound
9.3 Transient time
9.4 Sharpness
9.5 Summary
10 Comparison
10.1 Assumptions of different schemes
10.2 Convergence rate
11 Exact diffusion
11.1 Decreasing stepsize
11.1.1 Preliminary results
12 Decentralized Proximal Gradient Algorithms with Linear Convergence Rates
12.1 UDA
12.2 PUDA
13 NIDS
14 Final Words
References
Chapter 14
Final Words