Given a graph G and a positive integer k, the Graphlet Sampling problem asks to sample a connected induced k-vertex subgraph of G uniformly at random. Graphlet sampling enhances machine learning applications by transforming graph structures into feature vectors for tasks such as graph classification and subgraph identification, boosting neural network performance, and supporting clustered federated learning by capturing local structures and relationships. A recent work has shown that the problem admits an algorithm that preprocesses G in time O(nk2 log k + m), and draws one sample in expected time kO(k) log n, where n = |V (G)| and m = |E(G)|. Such an algorithm relies on the assumption that the input graph fits into main memory and it does not seem to be straightforward to adapt it to very large graphs. We consider Graphlet Sampling in the semi-streaming setting, where we have a memory of M = Ω(n log n) words, and G can be only read through sequential passes over the edge list. We develop a semi-streaming algorithm that preprocesses G in p = O(log n) passes and samples Θ(M k−O(k)) independent uniform k-graphlets in O(k) passes. For constant k, both phases run in time O((n + m) log n). We also show that the tradeoff between memory and number of passes of our algorithms is near-optimal. Our extensive evaluation on very large graphs shows the effectiveness of our algorithms

Efficient Streaming Algorithms for Graphlet Sampling / Y. Bourreau, M. Bressan, T. Hubert Chan, Q. Kuang, M. Sozio (ADVANCES IN NEURAL INFORMATION PROCESSING SYSTEMS). - In: Advances in Neural Information Processing Systems 37 / [a cura di] A. Globerson, L. Mackey, D. Belgrave, A. Fan, U. Paquet, J. Tomczak, C. Zhang. - [s.l] : Neural Information Processing Systems Foundation, 2024. - ISBN 9798331314385. - pp. 1-30 (( Intervento presentato al 38. convegno Conference on Neural Information Processing Systems (NeurIPS 2024) tenutosi a Vancouver nel 2024.

Efficient Streaming Algorithms for Graphlet Sampling

M. Bressan
Secondo
;
2024

Abstract

Given a graph G and a positive integer k, the Graphlet Sampling problem asks to sample a connected induced k-vertex subgraph of G uniformly at random. Graphlet sampling enhances machine learning applications by transforming graph structures into feature vectors for tasks such as graph classification and subgraph identification, boosting neural network performance, and supporting clustered federated learning by capturing local structures and relationships. A recent work has shown that the problem admits an algorithm that preprocesses G in time O(nk2 log k + m), and draws one sample in expected time kO(k) log n, where n = |V (G)| and m = |E(G)|. Such an algorithm relies on the assumption that the input graph fits into main memory and it does not seem to be straightforward to adapt it to very large graphs. We consider Graphlet Sampling in the semi-streaming setting, where we have a memory of M = Ω(n log n) words, and G can be only read through sequential passes over the edge list. We develop a semi-streaming algorithm that preprocesses G in p = O(log n) passes and samples Θ(M k−O(k)) independent uniform k-graphlets in O(k) passes. For constant k, both phases run in time O((n + m) log n). We also show that the tradeoff between memory and number of passes of our algorithms is near-optimal. Our extensive evaluation on very large graphs shows the effectiveness of our algorithms
Settore INFO-01/A - Informatica
Settore IINF-05/A - Sistemi di elaborazione delle informazioni
2024
https://proceedings.neurips.cc/paper_files/paper/2024/hash/23d64d26abb5a0e9f2014cfcc700f82a-Abstract-Conference.html
Book Part (author)
File in questo prodotto:
File Dimensione Formato  
NeurIPS-2024-efficient-streaming-algorithms-for-graphlet-sampling-Paper-Conference.pdf

accesso aperto

Tipologia: Publisher's version/PDF
Dimensione 536.51 kB
Formato Adobe PDF
536.51 kB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/2434/1151035
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact