UltipaDocs
Products
Solutions
Resources
Company
Start Free Trial
UltipaDocs
Start Free Trial
  • Introduction
  • Running Algorithms
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Eccentricity Centrality
    • Betweenness Centrality
    • Bridges
    • Articulation Points
    • Eigenvector Centrality
    • Katz Centrality
    • CELF
    • PageRank
    • ArticleRank
    • TextRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • KNN
    • Vector Similarity
    • Bipartite Graph
    • HyperANF
    • Weakly Connected Components (WCC)
    • Strongly Connected Components (SCC)
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Triangle Count
    • Clique Count
    • k-Core
    • k-Truss
    • p-Cohesion
    • Induced Subgraph
    • Topological Sort
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Dijkstra's Shortest Path
    • A* Shortest Path
    • Yen's K-Shortest Paths
    • Shortest Path (BFS)
    • Delta-Stepping SSSP
    • Shortest Path Faster Algorithm (SPFA)
    • All-Pairs Shortest Path (APSP)
    • Minimum Spanning Tree (MST)
    • K-Spanning Tree
    • Steiner Tree
    • Prize-Collecting Steiner Tree (PCST)
    • Minimum Cost Flow
    • Maximum Flow
    • K-Hop Fast
    • Longest Path (DAG)
    • Random Walk
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Same Community
    • Louvain
    • Leiden
    • Modularity Optimization
    • Label Propagation
    • HANP
    • SLPA
    • k-Means
    • HDBSCAN
    • K-1 Coloring
    • Modularity
    • Conductance
    • Max k-Cut
      • Node2Vec
      • Struc2Vec
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Algorithms
  4. /
  5. Graph Structure

p-Cohesion

Overview

The p-Cohesion algorithm computes a per-node cohesion value that measures how well each node is embedded within a densely connected neighborhood. The concept of p-cohesion was first proposed by S. Morris in a contagion model describing interactions within large populations:

  • S. Morris, Contagion. The Review of Economic Studies, 67(1), 57–78 (2000)

Concepts

p-Cohesion

One natural measure of the cohesion of a group is the relative frequency of ties among its members compared to non-members. Let cohesion be a constant p ∈ (0,1). A p-Cohesion is a connected subgraph in which every node has, at least, a proportion p of its neighbors within the subgraph. In other words, each node has at most, a proportion (1 − p) of its neighbors outside the subgraph.

The p-Cohesion model offers two key advantages over other cohesive subgraph models:

  • With a high p value, p-Cohesion ensures both strong internal cohesiveness and sparse connections to outside nodes.
  • It considers the proportion of neighbors within the group rather than a fixed number (such as the k value in k-Core), making it better suited for graphs with varying node degrees.

The example graph below illustrates this. Suppose p = 0.6. A grey label next to each node shows the minimum number of internal neighbors required for the node to remain in a p-Cohesion.

Below are the minimal p-Cohesion subgraphs, in terms of node count, that include node a and node j, respectively.

Cohesion Value

This algorithm computes the cohesion value of a node, which is the maximum p for which the node belongs to a p-cohesive subgraph. A higher cohesion value indicates the node is embedded in a more tightly connected neighborhood.

NOTE

Note that p-cohesion is inherently a group property: whether a set of nodes is p-cohesive depends on which nodes form the group together. The per-node cohesion value produced by this algorithm is a useful ranking metric, but does not indicate which specific group the node is cohesive with.

Considerations

  • The algorithm treats all edges as undirected.

Example Graph

GQL
INSERT (A:default {_id: "A"}), (B:default {_id: "B"}),
       (C:default {_id: "C"}), (D:default {_id: "D"}),
       (E:default {_id: "E"}), (F:default {_id: "F"}),
       (G:default {_id: "G"}), (H:default {_id: "H"}),
       (I:default {_id: "I"}), (J:default {_id: "J"}),
       (K:default {_id: "K"}), (L:default {_id: "L"}),
       (K)-[:default]->(J), (K)-[:default]->(L),
       (J)-[:default]->(L), (L)-[:default]->(C),
       (C)-[:default]->(A), (A)-[:default]->(B),
       (C)-[:default]->(B), (A)-[:default]->(D),
       (B)-[:default]->(G), (B)-[:default]->(D),
       (D)-[:default]->(C), (C)-[:default]->(E),
       (C)-[:default]->(F), (D)-[:default]->(E),
       (E)-[:default]->(F), (D)-[:default]->(F),
       (D)-[:default]->(H), (I)-[:default]->(H),
       (F)-[:default]->(I)

Parameters

NameTypeDefaultDescription
pFLOAT/Cohesion threshold (0 < p ≤ 1). When set, only nodes with cohesion ≥ p are returned.

Run Mode

Returns:

ColumnTypeDescription
nodeIdSTRINGNode identifier (_id)
cohesionFLOATMinimal p-cohesion value for this node
GQL
CALL algo.pcohesion({
  p: 0.7
}) YIELD nodeId, cohesion

Stream Mode

Returns the same columns as run mode, streamed for memory efficiency.

GQL
CALL algo.pcohesion.stream() YIELD nodeId, cohesion
RETURN nodeId, cohesion

Stats Mode

Returns:

ColumnTypeDescription
nodeCountINTTotal number of nodes
minCohesionFLOATMinimum cohesion value
maxCohesionFLOATMaximum cohesion value
avgCohesionFLOATAverage cohesion value
GQL
CALL algo.pcohesion.stats() YIELD nodeCount, minCohesion, maxCohesion, avgCohesion

Write Mode

Computes results and writes them back to node properties. The write configuration is passed as a second argument map.

Write parameters:

NameTypeDescription
db.propertySTRING or MAPNode property to write results to. String: writes the cohesion column in results to a property. Map: explicit column-to-property mapping (e.g., {cohesion: 'p_cohesion'}).

Writable columns:

ColumnTypeDescription
cohesionFLOATMinimal p-cohesion value

Returns:

ColumnTypeDescription
task_idSTRINGTask identifier for tracking via SHOW TASKS
nodesWrittenINTNumber of nodes with properties written
computeTimeMsINTTime spent computing the algorithm (milliseconds)
writeTimeMsINTTime spent writing properties to storage (milliseconds)
GQL
CALL algo.pcohesion.write({}, {
  db: {
    property: "p_cohesion"
  }
}) YIELD task_id, nodesWritten, computeTimeMs, writeTimeMs