UltipaDocs
Products
Solutions
Resources
Company
Start Free Trial
UltipaDocs
Start Free Trial
  • Introduction
  • Running Algorithms
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Eccentricity Centrality
    • Betweenness Centrality
    • Bridges
    • Articulation Points
    • Eigenvector Centrality
    • Katz Centrality
    • CELF
    • PageRank
    • ArticleRank
    • TextRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • KNN
    • Vector Similarity
    • Bipartite Graph
    • HyperANF
    • Weakly Connected Components (WCC)
    • Strongly Connected Components (SCC)
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Triangle Count
    • Clique Count
    • k-Core
    • k-Truss
    • p-Cohesion
    • Induced Subgraph
    • Topological Sort
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Dijkstra's Shortest Path
    • A* Shortest Path
    • Yen's K-Shortest Paths
    • Shortest Path (BFS)
    • Delta-Stepping SSSP
    • Shortest Path Faster Algorithm (SPFA)
    • All-Pairs Shortest Path (APSP)
    • Minimum Spanning Tree (MST)
    • K-Spanning Tree
    • Steiner Tree
    • Prize-Collecting Steiner Tree (PCST)
    • Minimum Cost Flow
    • Maximum Flow
    • K-Hop Fast
    • Longest Path (DAG)
    • Random Walk
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Same Community
    • Louvain
    • Leiden
    • Modularity Optimization
    • Label Propagation
    • HANP
    • SLPA
    • k-Means
    • HDBSCAN
    • K-1 Coloring
    • Modularity
    • Conductance
    • Max k-Cut
      • Node2Vec
      • Struc2Vec
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Algorithms
  4. /
  5. Community Detection

Modularity

Overview

The Modularity algorithm evaluates the quality of an existing community partition by computing its modularity score. Unlike community detection algorithms that find communities, this algorithm measures how good a given partition is.

It is typically used after running a community detection algorithm (such as Louvain and Leiden) to assess the quality of the detected communities.

Concepts

Modularity

In many networks, nodes tend to naturally form groups or communities, characterized by dense connections within a community and relatively sparse connections between communities.

Consider an equivalent network G' to G, where G' remains the same community partition and the same number of edges as in G, but the edges are placed randomly. If G has a strong community structure, the ratio of intra-community edges to the total number of edges in G should be higher than the expected ratio in G'. A greater disparity between the actual ratio and expected ratios indicates a more prominent community structure in G. This concept forms the basis of modularity. The modularity is one of the widely used methods to evaluate the quality of a community partition. The Louvain algorithm is designed to find partitions that maximize modularity.

Modularity is a value that ranges from -1 to 1. A value close to 1 indicates a strong community structure, while negative values imply that the partitioning is likely not meaningful. For a connected graph, the modularity generally falls within the range of -0.5 to 1.

Considering the weights of edges in the graph, the modularity (Q) is defined as

where,

  • m is the total sum of edge weights in the graph;
  • Aij is the sum of edge weights between nodes i and j, and 2m = ∑ijAij;
  • ki is the sum of weights of all edges attached to node i;
  • Ci represents the community to which node iis assigned, δ(Ci,Cj) is 1 if Ci= Cj, and 0 otherwise.

Note, kikj2m is the expected sum of weights of edges between nodes i and j if edges are placed at random. Both Aij and kikj2m are divided by 2m because each pair of distinct nodes in a community is considered twice, such as Aab = Aba, kakb2m = kbka2m.

We can also write the above formula as the following:

where,

  • ∑inc is the sum of weights of edges inside community C, i.e., the intra-community weight;
  • ∑totc is the sum of weights of edges incident to nodes in community C, i.e, the total-community weight;
  • m has the same meaning as above, and 2m = ∑c∑totc.

Nodes in this graph are assigned into 3 communities, take community C1 as example:

  • ∑inC1 = Aaa + Aab + Aac + Aad + Aba + Aca + Ada = 1.5 + 1 + 0.5 + 3 + 1 + 0.5 + 3 = 10.5
  • (∑totC1)2 = kaka + kakb + kakc + kakd + kbka + kbkb + kbkc + kbkd + kcka + kckb + kckc + kckd + kdka + kdkb + kdkc + kdkd + = (ka + kb + kc + kd)2 = (6 + 2.7 + 2.8 + 3)2 = 14.52

Considerations

  • The algorithm treats all edges as undirected.
  • Community assignments are read from a node property specified by communityProperty. If not specified, each node is treated as its own community.

Example Graph

GQL
INSERT (A:default {_id: "A", comm_id: 0}), (B:default {_id: "B", comm_id: 0}),
       (C:default {_id: "C", comm_id: 0}), (D:default {_id: "D", comm_id: 1}),
       (E:default {_id: "E", comm_id: 1}), (F:default {_id: "F", comm_id: 1}),
       (G:default {_id: "G", comm_id: 2}), (H:default {_id: "H", comm_id: 2}),
       (A)-[:default]->(B), (A)-[:default]->(C),
       (B)-[:default]->(C), (A)-[:default]->(D),
       (D)-[:default]->(E), (D)-[:default]->(F),
       (E)-[:default]->(F), (G)-[:default]->(D),
       (G)-[:default]->(H)

Parameters

NameTypeDefaultDescription
communityPropertySTRING/Node property storing community assignments. If not specified, each node is treated as its own community.

Run Mode

Returns:

ColumnTypeDescription
modularityFLOATOverall modularity score Q
communityCountINTNumber of communities
GQL
CALL algo.modularity({
  communityProperty: "comm_id"
}) YIELD modularity, communityCount

Stream Mode

Returns the same columns as run mode, streamed for memory efficiency.

GQL
CALL algo.modularity.stream({
  communityProperty: "comm_id"
}) YIELD modularity, communityCount
RETURN modularity, communityCount

Stats Mode

Returns:

ColumnTypeDescription
nodeCountINTTotal number of nodes
modularityFLOATOverall modularity score Q
communityCountINTNumber of communities
GQL
CALL algo.modularity.stats({
  communityProperty: "comm_id"
}) YIELD nodeCount, modularity, communityCount