DisNet: A Framework for Distributed Graph Computation | IEEE Conference Publication | IEEE Xplore

DisNet: A Framework for Distributed Graph Computation


Abstract:

With the rise of network science as an exciting interdisciplinary research topic, efficient graph algorithms are in high demand. Problematically, many such algorithms mea...Show More

Abstract:

With the rise of network science as an exciting interdisciplinary research topic, efficient graph algorithms are in high demand. Problematically, many such algorithms measuring important properties of networks have asymptotic lower bounds that are quadratic, cubic, or higher in the number of vertices. For analysis of social networks, transportation networks, communication networks, and a host of others, computation is intractable. In these networks computation in serial fashion requires years or even decades. Fortunately, these same computational problems are often naturally parallel. We present here the design and implementation of a master-worker framework for easily computing such results in these circumstances. The user needs only to supply two small fragments of code describing the fundamental kernel of the computation. The framework automatically divides and distributes the workload and manages completion using an arbitrary number of heterogeneous computational resources. In practice, we have used thousands of machines and observed commensurate speedups. Writing only 31 lines of standard C++ code, we computed betweenness centrality on a network of 4.7M nodes in 25 hours.
Date of Conference: 25-27 July 2011
Date Added to IEEE Xplore: 18 August 2011
ISBN Information:
Conference Location: Kaohsiung, Taiwan

I. Introduction and Related Work

DisNet is an architecture for achieving difficult feats of computation on large networks. It allows network scientists to develop and deploy new and existing algorithms to obtain results for intractable problems quickly. Architecturally, it is a master-worker framework where the master coordinates vertex-centric computation by distributing vertices to workers. In this paradigm, users must specify only how to compute results for a single vertex and how to combine computed results from two vertices. Each potentially multi-core worker machine has its own local in-memory copy of the network. DisNet is not an attempt at enabling the computation of any problem in any network. It is a recognition that many interesting forms of network analysis lend themselves to computation in a naturally parallel fashion that makes them feasible for the majority of interesting networks. While the rest of this paper will describe in detail the implementation and scaling properties of DisNet, from a user perspective the system is incredibly simple. Users do not need to deploy any complex architectures, learn any special primitives, or understand any principles of parallel processing. With a only a basic knowledge of network science or graph algorithms, users can leverage a wide variety of different computing resources, grids, and clouds to solve their problems. More concretely, by writing between 16 and 31 lines of standard C++ code with no special primitives or consideration of parallelization, we were able to leverage thousands of machines to compute algorithms such as exact diameter and betweenness centrality in under 25 hours for a 4.7M node network.

Contact IEEE to Subscribe

References

References is not available for this document.