UltipaDocs
Try Playground
  • Introduction
  • Managing HDC Graphs
  • Managing Distributed Projections
  • Installing Algorithms
  • Running Algorithms
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Graph Centrality
    • Betweenness Centrality
    • Eigenvector Centrality
    • Katz Centrality
    • CELF
    • PageRank
    • ArticleRank
    • TextRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • K-Hop All
    • Bipartite Graph
    • HyperANF
    • Connected Component
    • Triangle Counting
    • Induced Subgraph
    • k-Core
    • k-Truss
    • p-Cohesion
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Topological Sort
    • Schema Overview
    • Dijkstra's Single-Source Shortest Path
    • Delta-Stepping Single-Source Shortest Path
    • Shortest Path Faster Algorithm (SPFA)
    • Minimum Spanning Tree
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Louvain
    • Leiden
    • Label Propagation
    • HANP
    • k-Means
    • kNN (k-Nearest Neighbors)
    • K-1 Coloring
    • Conductance
      • Random Walk
      • Node2Vec Walk
      • Node2Vec
      • Struc2Vec Walk
      • Struc2Vec
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Analytics & Algorithms
  4. /
  5. Connectivity & Compactness

Connected Component

HDC Distributed

Overview

The Connected Component algorithm identifies the connected components in a graph, which is the essential indicator to examine the connectivity and topology characteristics of the graph.

The number of connected components in a graph can serve as a coarse-grained metering method. If the number of connected components remains unchanged after certain operations or modifications to the graph, it suggests that the macroscopic connectivity and topology characteristics of the graph have not been altered significantly.

This information is valuable in various graph analysis scenarios. For example, in social networks, if the number of connected components remains the same over time, it implies that the overall connectivity patterns and community structures within the network have not experienced substantial changes.

Concepts

Connected Component

A connected component is a maximal subset of nodes in a graph where all nodes in that subset are reachable from one another by following edges in the graph. A maximal subset means that no additional nodes can be added to the subset without breaking the connectivity requirement.

The number of connected components in a graph indicates the level of disconnectedness or the presence of distinct subgraphs within the overall graph. A graph with exactly one connected component encompassing all nodes is called a connected graph.

Weakly and Strongly Connected Component

There are two important concepts related to connected component: weakly connected component (WCC) and strongly connected component (SCC):

  • A WCC is a subset of nodes in a directed or undirected graph where a path exists between any pair of nodes when edge directions are ignored.
  • An SCC is a subset of nodes in a directed graph where there is a directed path between every pair of nodes. In other words, for any two nodes u and v in an SCC, there exists a directed path from u to v and from v to u. All edges along these paths follow the same direction.

This example shows the 3 strongly connected components and 2 weakly connected components of a graph. The number of SCCs in a graph is always equal to or greater than the number of WCCs, since SCCs impose stricter connectivity conditions than WCCs.

Considerations

  • Each isolated node in the graph constitutes a connected component and is considered both a strongly connected component and a weakly connected component.

Example Graph

Run the following statements on an empty graph to define its structure and insert data:

ALTER GRAPH CURRENT_GRAPH ADD NODE {
  member ()
};
ALTER GRAPH CURRENT_GRAPH ADD EDGE {
  helps ()-[]->()
};
INSERT (Mike:member {_id: "Mike"}),
       (Cathy:member {_id: "Cathy"}),
       (Anna:member {_id: "Anna"}),
       (Joe:member {_id: "Joe"}),
       (Sam:member {_id: "Sam"}),
       (Bob:member {_id: "Bob"}),
       (Bill:member {_id: "Bill"}),
       (Alice:member {_id: "Alice"}),
       (Cathy)-[:helps]->(Mike),
       (Anna)-[:helps]->(Sam),
       (Anna)-[:helps]->(Joe),
       (Joe)-[:helps]->(Bob),
       (Bob)-[:helps]->(Joe),
       (Bob)-[:helps]->(Bill),
       (Bill)-[:helps]->(Alice),
       (Bill)-[:helps]->(Anna),
       (Alice)-[:helps]->(Anna);

Running on HDC Graphs

Creating HDC Graph

To load the entire graph to the HDC server hdc-server-1 as my_hdc_graph:

CREATE HDC GRAPH my_hdc_graph ON "hdc-server-1" OPTIONS {
  nodes: {"*": ["*"]},
  edges: {"*": ["*"]},
  direction: "undirected",
  load_id: true,
  update: "static"
}

Parameters

Algorithm name: connected_component

Name
Type
Spec
Default
Optional
Description
cc_typeInteger1, 21YesSpecifies the type of connected component to identify. Set to 1 for WCC, or 2 for SCC.
return_id_uuidStringuuid, id, bothuuidYesIncludes _uuid, _id, or both to represent nodes in the results.
limitInteger≥-1-1YesLimits the number of results returned. Set to -1 to include all results.
orderStringasc, desc/YesSorts the results by count; this option is only valid in Stream Return when mode is set to 2.

In the results of this algorithm, each connected component is represented by the same community_id, which corresponds to the _uuid value of one of its nodes.

File Writeback

This algorithm can generate three files:

Spec
Content
filename_community_id
  • _id/_uuid: The node.
  • community_id: ID of the connected component the node belongs to.
filename_ids
  • community_id: ID of the connected component.
  • _ids/_uuids: Nodes belonging to the connected component.
filename_num
  • community_id: ID of the connected component.
  • count: Number of nodes in the connected component.
CALL algo.connected_component.write("my_hdc_graph", {
  return_id_uuid: "id",
  cc_type: 1
}, {
  file: {
    filename_community_id: "f1",
    filename_ids: "f2",
    filename_num: "f3"
  }
})

Result:

_id,community_id
Alice,0
Cathy,1
Anna,0
Bob,0
Joe,0
Bill,0
Mike,1
Sam,0

DB Writeback

Writes the community_id values from the results to the specified node property. The property type is uint32.

CALL algo.connected_component.write("my_hdc_graph", {}, {
  db: {
    property: "wcc_id"
  }
})

Stats Writeback

CALL algo.connected_component.write("my_hdc_graph", {}, {
  stats: {}
})

Result:

community_count
2

Full Return

CALL algo.connected_component.run("my_hdc_graph", {
  return_id_uuid: "id",
  cc_type: 2
}) YIELD r
RETURN r

Result:

_idcommunity_id
Alice0
Cathy1
Anna0
Bob0
Joe0
Bill0
Mike6
Sam7

Stream Return

This Stream Return supports two modes:

ItemSpecColumns
mode1 (Default)
  • _id/_uuid: The node.
  • community_id: ID of the connected component the node belongs to.
2
  • community_id: ID of the connected component.
  • count: Number of nodes in the connected component.
CALL algo.connected_component.stream("my_hdc_graph", {
  return_id_uuid: "id",
  cc_type: 2
}) YIELD r
RETURN r

Result:

_idcommunity_id
Alice0
Cathy1
Anna0
Bob0
Joe0
Bill0
Mike6
Sam7
CALL algo.connected_component.stream("my_hdc_graph", {
  return_id_uuid: "id",
  cc_type: 2,
  order: "asc"
}, {
  mode: 2
}) YIELD r
RETURN r

Result:

community_idcount
61
11
71
05

Stats Return

CALL algo.connected_component.stats("my_hdc_graph", {}) YIELD wcc_count
RETURN wcc_count

Result:

community_count
2

Running on Distributed Projections

Creating Distributed Projection

To project the entire graph to its shard servers as myProj:

CREATE PROJECTION myProj OPTIONS {
  nodes: {"*": ["*"]}, 
  edges: {"*": ["*"]},
  direction: "undirected",
  load_id: true
}

Parameters

Algorithm name: wcc

The algorithm does not require any parameters.

The distributed version of this algorithm supports identifying only weakly connected components (WCC) in the graph. In the results of this algorithm, each connected component is represented by the same community_id.

File Writeback

CALL algo.wcc.write("myProj", {}, {
  file: {
    filename: "wcc"
  }
})

Result:

File: wcc
_id,community_id
Anna,4827860999564427272
Joe,4827860999564427272
Sam,4827860999564427272
Mike,6413128068398841858
Bill,4827860999564427272
Cathy,6413128068398841858
Alice,4827860999564427272
Bob,4827860999564427272

DB Writeback

Writes the community_id values from the results to the specified node property. The property type is uint64.

CALL algo.wcc.write("myProj", {}, {
  db: {
    property: "wcc_id"
  }
})