UltipaDocs
Try Playground
  • Introduction
    • Show Algorithms
    • Install and Uninstall
    • Run Algorithms
    • Algorithm Results and Statistics
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Graph Centrality
    • Betweenness Centrality
    • Eigenvector Centrality
    • CELF
    • PageRank
    • ArticleRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • K-Hop All
    • Bipartite Graph
    • HyperANF
    • Connected Component
    • Triangle Counting
    • Induced Subgraph
    • k-Core
    • k-Truss
    • p-Cohesion
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Topological Sort
    • Schema Overview
    • Dijkstra's Single-Source Shortest Path
    • Delta-Stepping Single-Source Shortest Path
    • Shortest Path Faster Algorithm (SPFA)
    • Minimum Spanning Tree
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Louvain
    • Leiden
    • Label Propagation
    • HANP
    • k-Means
    • kNN (k-Nearest Neighbors)
    • K-1 Coloring
    • Conductance
      • Random Walk
      • Node2Vec Walk
      • Node2Vec
      • Struc2Vec Walk
      • Struc2Vec
      • GraphSAGE
      • GraphSAGE Train
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Analytics & Algorithms
  4. /
  5. Community Detection & Classification

Label Propagation Algorithm

✓ File Writeback ✓ Property Writeback ✓ Direct Return ✓ Stream Return ✓ Stats

Overview

The Label Propagation algorithm (LPA) is a community detection algorithm using label propagation. Each node is initialized with a label and at every iteration of the algorithm, each node updates its label to the one that is most prevalent among its neighbors. This iterative process allows densely connected groups of nodes to converge on a consensus labels, nodes sharing the same label then forming a community.

LPA does not optimize any specific chosen measure of community strengths, and does not require the number of communities to be predefined. Instead, it leverages the network structure to guide its progression. This simplicity enables LPA to efficiently analyze large and complex networks.

Related material of the algorithm:

  • U.N. Raghavan, R. Albert, S. Kumara, Near linear time algorithm to detect community structures in large-scale networks (2007)

Concepts

Label

Label of a node is initialized with a specified property value, or its unique UUID.

Nodes that have the same label at the end of the algorithm indicate their affiliation to a common community.

NOTE

In LPA, all initial labels are valid and able to propagate. If there is a need to specify some labels as invalid, please contact Ultipa team for customization.

Label Propagation

In the simplest settings, at every iteration of propagation, each node updates its label to the one that the maximum numbers of its neighbors belongs to.

As the following example shows, the label of the blue node will be updated from d to c.

When considering node and edge weights, the label weight equals to the sum of the products of the corresponding node and edge weights, each node updates its label to the one with the largest weight.

As the weights of nodes and edges denoted in the example below, the label of the blue node will be updated from d to a.

Multi-label Propagation

In multi-label propagation, each node accept multiple labels during the propagation. In this case, a label probability that is proportional to its weight is given to each label of a node, while the sum of label probabilities of each node keeps as 1.

In the example below, each node keeps 2 labels, the probabilities are written next to labels, the labels of the blue node will be updated from d, c to a, c with label probabilities Pa = 6.3/(6.3+1.85) = 0.77 and Pc = 1.85/(6.3+1.85) = 0.23.

Considerations

  • LPA ignores the direction of edges but calculates them as undirected edges.
  • Node with self-loops propagates its current label(s) to itself, and each self-loop is counted twice.
  • LPA follows the synchronous update principle when updating node labels. This means that all nodes update their labels simultaneously based on the labels of their neighbors. However, in some cases, label oscillations can occur, particularly in bipartite graphs. To address this issue, the algorithm incorporates an interrupt mechanism that detects and prevents excessive label oscillations.
  • Due to factors such as the order of nodes, the random selection of labels with equal weights, and parallel calculations, the community division results of LPA may vary.

Syntax

  • Command: algo(lpa)
  • Parameters:
Name
Type
Spec
Default
Optional
Description
node_label_property@<schema>?.<property>Numeric/String type, must LTE/YesNode property to initialize node labels, nodes without the property are not involved in label propagation; UUID is used as label for all nodes if not set
node_weight_property@<schema>?.<property>Numeric type, must LTE/YesNode property to use as node weight
edge_weight_property@<schema>?.<property>Numeric type, must LTE/YesEdge property to use as edge weight
loop_numint≥15YesNumber of propagation iterations
kint≥11YesMaximum number of labels each node keeps in the end (all labels are ordered by their probability from high to low)

Examples

The example graph is as follows, nodes are of schema user, edges are of schema connect, the value of @connect.strength is shown in the graph:

File Writeback

Spec
Content
filename_id,label_1,probability_1,...label_k,probability_k
UQL
algo(lpa).params({
  k: 2,
  loop_num: 5,
  edge_weight_property: 'strength'
}).write({
  file:{
    filename: "lpa"
  }
})

Statistics: label_count = 7
Results: File lpa

File
O,1,0.599162,2,0.400838,
N,1,0.634582,2,0.365418,
M,1,0.610834,2,0.389166,
L,1,0.607434,2,0.392566,
K,1,0.619842,2,0.380158,
J,14,0.655975,8,0.344025,
I,14,0.546347,8,0.453653,
H,9,0.690423,7,0.309577,
G,14,0.569427,8,0.430573,
F,9,0.784132,7,0.215869,
E,9,0.519003,12,0.480997,
D,14,0.781072,9,0.218928,
C,12,0.540345,9,0.459655,
B,9,0.559427,14,0.440573,
A,14,0.768171,12,0.231829,

Property Writeback

Spec
Content
Write to
Data Type
propertylabel_1, probability_1, ... label_k, probability_kNode propertyLabel: string,
Label probability: float
UQL
algo(lpa).params({
  node_label_property: 'interest',
  edge_weight_property: '@connect.strength',
  k: 2,
  loop_num: 10
}).write({
  db:{
    property: "lab"
  }
})

Statistics: label_count = 5
Results: The labels and the corresponding probability of each node is written to new properties lab_1, probability_1, lab_2 and probability_2 respectively

Direct Return

Alias Ordinal
Type
Description
Columns
0[]perNodeNode and its labels, label probabilities_uuid, label_1, probability_1, ... label_k, probability_k
1KVNumber of labelslabel_count
UQL
algo(lpa).params({
  node_label_property: '@user.interest',
  node_weight_property: '@user.level'
}) as res
return res

Results: res

_uuidlabel_1probability_1
15novel1.000000
14swimming1.000000
13novel1.000000
12novel1.000000
11novel1.000000
10violin1.000000
9badminton1.000000
8piano1.000000
7badminton1.000000
6badminton1.000000
5piano1.000000
4piano1.000000
3piano1.000000
2piano1.000000
1piano1.000000
UQL
algo(lpa).params({
  node_label_property: 'interest',
  k: 2
}) as res, stats
return res, stats

Results: res and stats

_uuidlabel_1probability_1label_2probability_2
15novel0.642453saxophone0.357547
14swimming0.577773saxophone0.422227
13novel0.610180swimming0.389820
12saxophone0.608193novel0.391807
11piano0.536380saxophone0.463620
10piano0.588276movie0.411724
9piano0.595449movie0.404551
8piano0.637065movie0.362935
7piano0.554655movie0.445345
6piano0.720096movie0.279904
5piano0.502892flute0.497108
4piano0.648339flute0.351661
3piano0.520442flute0.479558
2piano0.624170flute0.375831
1piano0.670773flute0.329227
label_count
6

Stream Return

Alias Ordinal
Type
Description
Columns
0[]perNodeNode and its labels, label probabilities_uuid, label_1, probability_1, ... label_k, probability_k
UQL
algo(lpa).params({
  node_label_property: '@user.interest',
  node_weight_property: '@user.level',
  edge_weight_property: 'strength',
  loop_num: 10
}).stream() as lpa
group by lpa.label_1
with count(lpa) as labelCount
return table(lpa.label_1, labelCount) 
order by labelCount desc

Results: table(lpa.label_1, labelCount)

lpa.label_1labelCount
piano5
swimming3
violin2
novel2
tennis2

Stats Return

Alias Ordinal
Type
DescriptionColumns
0KVNumber of labelslabel_count
UQL
algo(lpa).params({
  node_label_property: 'interest',
  edge_weight_property: 'strength',
  k: 1,
  loop_num: 5
}).stats() as count 
return count

Results: count

label_count
5