Single Machine Graph Analytics on Massive Datasets Using Intel Optane DC Persistent Memory

Published in Proceedings of the International Conference on Very Large Data Bases (PVLDB), 2020

Recommended citation: Gurbinder Gill, Roshan Dathathri, Loc Hoang, Ramesh Peri, Keshav Pingali, “Single Machine Graph Analytics on Massive Datasets Using Intel Optane DC Persistent Memory,” Proceedings of the 46th International Conference on Very Large Data Bases (PVLDB), 13(8), April 2020. https://doi.org/10.14778/3389133.3389145

(Download publication here) (Download source code here)

(Download an earlier arxiv version of the paper here)

Read about our work in the SIGARCH blog.

Abstract

Intel Optane DC Persistent Memory (Optane PMM) is a new kind of byte-addressable memory with higher density and lower cost than DRAM. This enables the design of affordable systems that support up to 6 TB of randomly accessible memory. In this paper, we present key runtime and algorithmic principles to consider when performing graph analytics on extreme-scale graphs with Optane PMM, and we highlight the principles that can also apply to graph analytics on all large-memory platforms.

To demonstrate the importance of these principles, we evaluate four existing shared-memory graph frameworks and one out-of-core graph framework on large real-world graphs using a machine with 6 TB of Optane PMM. Our results show that frameworks using the runtime and algorithmic principles advocated in this paper (i) perform significantly better than the others and (ii) are competitive with graph analytics frameworks running on production clusters.