UltipaDocs
Try Playground
  • Introduction
    • Show Algorithms
    • Install and Uninstall
    • Run Algorithms
    • Algorithm Results and Statistics
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Graph Centrality
    • Betweenness Centrality
    • Eigenvector Centrality
    • CELF
    • PageRank
    • ArticleRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • K-Hop All
    • Bipartite Graph
    • HyperANF
    • Connected Component
    • Triangle Counting
    • Induced Subgraph
    • k-Core
    • k-Truss
    • p-Cohesion
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Topological Sort
    • Schema Overview
    • Dijkstra's Single-Source Shortest Path
    • Delta-Stepping Single-Source Shortest Path
    • Shortest Path Faster Algorithm (SPFA)
    • Minimum Spanning Tree
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Louvain
    • Leiden
    • Label Propagation
    • HANP
    • k-Means
    • kNN (k-Nearest Neighbors)
    • K-1 Coloring
    • Conductance
      • Random Walk
      • Node2Vec Walk
      • Node2Vec
      • Struc2Vec Walk
      • Struc2Vec
      • GraphSAGE
      • GraphSAGE Train
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Analytics & Algorithms
  4. /
  5. Community Detection & Classification

Leiden

✓ File Writeback ✓ Property Writeback ✓ Direct Return ✓ Stream Return ✓ Stats

Overview

The Leiden algorithm is a community detection algorithm designed to maximize modularity in a graph. It was developed to address potential issues in the results obtained by the popular Louvain algorithm, where some communities may not be well-connected or even disconnected. The Leiden algorithm is faster compared to the Louvain algorithm and guarantees to produce partitions in which all communities are internally connected. The algorithm is also named after the location of its authors.

  • V.A. Traag, L. Waltman, N.J. van Eck, From Louvain to Leiden: guaranteeing well-connected communities (2019)
  • V.A. Traag, P. Van Dooren, Y. Nesterov, Narrow scope for resolution-limit-free community detection (2011)

Concepts

Modularity

The concept of modularity is explained in the Louvain algorithm. The modularity formula used in the Leiden algorithm is extended to handle different levels of community granularity:

γ > 0 is the resolution parameter that modulates the density of connections within communities and between communities. When γ > 1, it leads to more, smaller and well-connected communities; when γ < 1, it leads to fewer, larger and less well-connected communities.

Leiden

The Leiden algorithm starts from a singleton partition, in which each node is in its own community. Then algorithm iteratively runs through passes, and each pass is made of three phases:

First Phase: Fast Modularity Optimization

Unlike the first phase of the Louvain algorithm, which keeps visiting all nodes in the graph until no node movements can increase the modularity; the Leiden algorithm takes a more efficient approach, it only visits all nodes once, afterwards it visits only nodes whose neighborhood has changed. To do that, the Leiden algorithm maintains a queue and initializes it by adding all nodes in the graph in a random order.

Then repeat the following steps until the queue is empty:

  • Remove the first node from the front of the queue.
  • Reassign the node to a different community which has the maximum gain of modularity (ΔQ); or keep the node in its original community if there is no positive gain.
  • If the node is moved to a different community, add to the rear of the queue all neighbors of the node that do not belong to the node’s new community and that are not yet in the queue.

The first phase ends with a partition P of the base or aggregate graph.

Second Phase: Refinement

This phase is designed to get a refined partition Prefined of P to guarantee that all communities are well-connected.

Prefined is initially set to a singleton partition, in which each node in the base or aggregate graph is in its own community. Refine each community C ∈ P as follows.

1. Consider only nodes v ∈ C that are well-connected within C:

where,

  • W(v,C-v) is the sum of edge weights between node v and the rest of nodes in C;
  • kv is the sum of weights of all edges attached to node v;
  • ∑totc the sum of weights of all edges attached to nodes in C.

Take community C1 in the above graph as example, where

  • m = 18.1
  • ∑totC1 = ka + kb + kc + kd = 6 + 2.7 + 2.8 + 3 = 14.5

Set γ as 1.2, then:

  • W(a, C1) - γ/m ⋅ ka ⋅ (∑totC1 - ka) = 4.5 - 1.2/18.1*6*(14.5 - 6) = 1.12
  • W(b, C1) - γ/m ⋅ kb ⋅ (∑totC1 - kb) = 1 - 1.2/18.1*2.7*(14.5 - 2.7) = -1.11
  • W(c, C1) - γ/m ⋅ kc ⋅ (∑totC1 - kc) = 0.5 - 1.2/18.1*2.8*(14.5 - 2.8) = -1.67
  • W(d, C1) - γ/m ⋅ kd ⋅ (∑totC1 - kd) = 3 - 1.2/18.1*3*(14.5 - 3) = 0.71

In this case, only nodes a and d are considered well-connected in community C1.

2. Visit each node v in random order, if it is still on its own in a community in Prefined, randomly merge it to a community C' ∈ Prefined for which the modularity increases. It is required that C' must be well-connected with C:

Note that in this phase, each node is not necessarily greedily merged with the community that yields the maximum gain of modularity. However, the larger the gain, the more likely a community is to be selected. The degree of randomness in the selection of a community is determined by a parameter θ > 0:

Randomness in the selection of a community allows the partition space to be explored more broadly.

After the refinement phase is concluded, communities in P often are split into multiple communities in Prefined, but not always.

Third Phase: Community Aggregation

The aggregate graph is created based on Prefined. However, the partition for the aggregate graph is based on P. The aggregation process is the same as Louvain.

P - color blocks, Prefined - node colors

Once this third phase is completed, another pass is applied to the aggregate graph. The passes are iterated until there are no more changes, and a maximum modularity is attained.

Considerations

  • If node i has any self-loop, when calculating ki, the weight of self-loop is counted only once.
  • The Leiden algorithm ignores the direction of edges but calculates them as undirected edges.

Syntax

  • Command: algo(leiden)
  • Parameters:
Name
Type
Spec
Default
Optional
Description
phase1_loop_numint≥15YesThe maximum loop number of the first phase during each pass
min_modularity_increasefloat[0,1]0.01YesThe minimum gain of modularity (ΔQ) to move a node to another community in the first phase
edge_schema_property[]@<schema>?.<property>Numeric type, must LTE/YesEdge properties to use as weights, where the values of multiple properties are summed up; all edge weights are considered as 1 if not set
gammafloat>01YesThe resolution parameter γ
thetafloat>00.01YesThe parameter θ which controls the degree of randomness during community merging in the second phase
limitint≥-1-1YesNumber of results to return, -1 to return all results
orderstringasc, desc/YesSort communities by the number of nodes in it (only valid in mode 2 of the stream() execution)

Examples

File Writeback

SpecContentDescription
filename_community_id_id,community_idNode and its community ID
filename_idscommunity_id,_id,_id,...Community ID and the ID of nodes in it
filename_numcommunity_id,countCommunity ID and the number of nodes in it
UQL
algo(leiden).params({ 
  phase1_loop_num: 5, 
  min_modularity_increase: 0.1,
  edge_schema_property: 'weight'
}).write({
  file:{
    filename_community_id: 'communityID',
    filename_ids: 'ids',
    filename_num: 'num'
  }
})

Property Writeback

SpecContentWrite toData Type
propertycommunity_idNode propertyuint32
UQL
algo(leiden).params({ 
  phase1_loop_num: 5, 
  min_modularity_increase: 0.1,
  edge_schema_property: 'weight'
}).write({
  db:{
    property: 'communityID'
  }
})

Direct Return

Alias Ordinal
Type
Description
Columns
0[]perNodeNode and its community ID_uuid, community_id
1KVNumber of communities, modularitycommunity_count, modularity
UQL
algo(leiden).params({ 
  phase1_loop_num: 6, 
  min_modularity_increase: 0.5,
  edge_schema_property: 'weight'
}) as results, stats
return results, stats

Stream Return

SpecContentAlias OrdinalTypeDescriptionColumns
mode1 or if not set0[]perNodeNode and its community ID_uuid, community_id
2[]perCommunityCommunity and the number of nodes in itcommunity_id, count
UQL
algo(leiden).params({ 
  phase1_loop_num: 6, 
  min_modularity_increase: 0.5,
  edge_schema_property: 'weight'
}).stream() as results
group by results.community_id
return table(results.community_id, max(results._uuid))
UQL
algo(leiden).params({ 
  phase1_loop_num: 5, 
  min_modularity_increase: 0.1,
  order: "desc"
}).stream({
  mode: 2
}) as results
return results

Stats Return

Alias Ordinal
Type
DescriptionColumns
0KVNumber of communities, modularitycommunity_count, modularity
UQL
algo(leiden).params({ 
  phase1_loop_num: 5, 
  min_modularity_increase: 0.1
}).stats() as stats
return stats