[R] VLouvain: Louvain Community Detection Directly on Vectors, No Graph Construction

Reddit r/MachineLearning / 3/24/2026

💬 OpinionIdeas & Deep AnalysisTools & Practical UsageModels & Research

共有:

Key Points

VLouvain re-derives Louvain community detection to operate directly on an embedding matrix, avoiding explicit similarity-graph construction and the resulting O(n^2) edge explosion.
The method computes degree and modularity gains from community-level vector sums, achieving O(n·d) state while remaining mathematically identical to standard Louvain.
Benchmarks on Amazon Products (1.57M nodes, d=200) report ~11,300s runtime for VLouvain, while multiple graph/parallel libraries fail well below that scale.
The paper finds that Top-K similarity sparsification with FAISS does not preserve meaningful structure (very low NMI vs the full graph), implying naive truncation yields near-random communities.
As a GraphRAG drop-in, VLouvain reduces graph construction/indexing time from ~3 hours to ~5.3 minutes and improves MultiHopRAG retrieval recall from 37.9% to 48.8%.

You have embeddings for your objects. You want to build a similarity graph and find communities, whether for GraphRAG, a recommender system, or just finding structure in your data. So you compute pairwise similarities, build the graph, run Louvain. Except now you have O(n^2) edges and everything crashes above ~15K nodes.

VLouvain reformulates Louvain to work directly on the embedding matrix. Degrees and modularity gains are computed from community-level vector sums, no edges involved. You maintain O(n*d) state instead of O(n^2). The result is mathematically identical to standard Louvain, not an approximation.

On Amazon Products (1.57M nodes, d=200), VLouvain completes in ~11,300 seconds. Every other method we tested (cuGraph, iGraph, GVE, NetworKit) fails before reaching half that scale.

One thing we didn't expect: Top-K sparsification doesn't save you. We built exact and approximate Top-K graphs via FAISS, and even at K=256 the partitions had NMI ~0.04 against the full graph. If you're truncating your similarity graph to make Louvain feasible, you're getting back essentially random communities.

As a drop-in replacement for graph construction in GraphRAG, indexing went from 3 hours to 5.3 minutes, retrieval recall improved from 37.9% to 48.8% on MultiHopRAG.

Paper (EDBT 2026): https://openproceedings.org/2026/conf/edbt/paper-72.pdf

Code: https://github.com/yutengkai/VLouvain

submitted by /u/Greedy-Teach1533
[link] [comments]

Build a WhatsApp AI Assistant Using Laravel, Twilio and OpenAI

Dev.to

Santa Augmentcode Intent Ep.6

Dev.to

Your Agent Hired Another Agent. The Output Was Garbage. The Money's Gone.

Dev.to

Anthropic shut down the Claude OAuth workaround. Here's the cheapest alternative in 2026.

Dev.to

ClawRouter vs TeamoRouter: one requires a crypto wallet, one doesn't

Dev.to

[R] VLouvain: Louvain Community Detection Directly on Vectors, No Graph Construction

Key Points

Related Articles

Build a WhatsApp AI Assistant Using Laravel, Twilio and OpenAI

Santa Augmentcode Intent Ep.6

Your Agent Hired Another Agent. The Output Was Garbage. The Money's Gone.

Anthropic shut down the Claude OAuth workaround. Here's the cheapest alternative in 2026.

ClawRouter vs TeamoRouter: one requires a crypto wallet, one doesn't

関連おすすめサービス

Notta搭載AI議事録イヤホン ZENCHORD1

AI搭載ボイスレコーダー Plaud

画像高画質化AIツール Aiarty Image Enhancer