Implementations of the calculation of a Minimum Spanning Tree of a given graph in a distributed setting (in a traditional massively parallel computing setting, as well as a map-reduce variant), assuming super-linear memory per node, as defined by Mohsen Ghaffari.
![]() Rounds vs Edges |
![]() Rounds vs Epsilon |
![]() Rounds vs Vertices |
The benchmarking was done with respect to the number of communication rounds rather than the wall-clock time, as in traditional MPC algorithms.
The Map-Reduce implementation consistently takes one more round than that of its MPC variant because it has to make an additional communication (map + reduce step) to filter down from the graph that can fit in a single node, down to the MST itself. In the MPC implementation, this computation is local to a node and does not require an additional communication.
Note
The MapReduce implementation always took longer to run, likely because of the repeated reads and writes to disk. Since the MPC implementation is just a local simulation, the cost of each communication was considerably less.
Is inspired by the Filtering Algorithm by Lattanzi et al..
For further details, see the corresponding directory.
Was written with the intention of bridging theory and practice of Distributed Algorithms. It is implemented using Hadoop
's primitives for this purpose.
For further details, see the corresponding directory.
Prof. Kishore Kothapalli for his guidance and knowledge of the above algorithms.