UltipaDocs
Products
Solutions
Resources
Company
Start Free Trial
UltipaDocs
Start Free Trial
  • Introduction
  • Running Algorithms
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Eccentricity Centrality
    • Betweenness Centrality
    • Bridges
    • Articulation Points
    • Eigenvector Centrality
    • Katz Centrality
    • CELF
    • PageRank
    • ArticleRank
    • TextRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • KNN
    • Vector Similarity
    • Bipartite Graph
    • HyperANF
    • Weakly Connected Components (WCC)
    • Strongly Connected Components (SCC)
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Triangle Count
    • Clique Count
    • k-Core
    • k-Truss
    • p-Cohesion
    • Induced Subgraph
    • Topological Sort
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Dijkstra's Shortest Path
    • A* Shortest Path
    • Yen's K-Shortest Paths
    • Shortest Path (BFS)
    • Delta-Stepping SSSP
    • Shortest Path Faster Algorithm (SPFA)
    • All-Pairs Shortest Path (APSP)
    • Minimum Spanning Tree (MST)
    • K-Spanning Tree
    • Steiner Tree
    • Prize-Collecting Steiner Tree (PCST)
    • Minimum Cost Flow
    • Maximum Flow
    • K-Hop Fast
    • Longest Path (DAG)
    • Random Walk
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Same Community
    • Louvain
    • Leiden
    • Modularity Optimization
    • Label Propagation
    • HANP
    • SLPA
    • k-Means
    • HDBSCAN
    • K-1 Coloring
    • Modularity
    • Conductance
    • Max k-Cut
      • Node2Vec
      • Struc2Vec
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Algorithms
  4. /
  5. Similarity

Euclidean Distance

Overview

In mathematics, the Euclidean distance between two points in Euclidean space is the length of a line segment between the two points. In the graph, N numeric node properties (features) are specified to represent each node's position in an N-dimensional Euclidean space.

Concepts

Euclidean Distance

In 2-dimensional space, the formula to compute the Euclidean distance between points A(x1, y1) and B(x2, y2) is:

In 3-dimensional space, the formula to compute the Euclidean distance between points A(x1, y1, z1) and B(x2, y2, z2) is:

Generalized to N-dimensional space, the formula to compute the Euclidean distance is:

where xi1 represents the i-th dimensional coordinates of the first point, and xi2 represents the i-th dimensional coordinates of the second point.

Euclidean distance ranges from 0 to +∞; smaller values indicate greater similarity between the two nodes.

Normalized Euclidean Distance

This algorithm returns the normalized Euclidean distance using the formula 1 / (1 + d), which scales the result into the range (0, 1]:

Values closer to 1 indicate greater similarity. For example, if the Euclidean distance between two nodes is 94.38, the normalized similarity is 1 / (1 + 94.38) ≈ 0.01049.

Considerations

  • The calculation of Euclidean distance between two nodes is independent of their connectivity in the graph — it uses node properties only.

Example Graph

GQL
INSERT (:product {_id: "product1", price: 50, weight: 160, width: 20, height: 152}),
       (:product {_id: "product2", price: 42, weight: 90, width: 30, height: 90}),
       (:product {_id: "product3", price: 24, weight: 50, width: 55, height: 70}),
       (:product {_id: "product4", price: 38, weight: 20, width: 32, height: 66})

Parameters

NameTypeDefaultDescription
typeSTRINGjaccardType of similarity to compute: euclidean.
idsLIST/First group of node _ids. If empty, all nodes are used.
ids2LIST/Second group of node _ids for pairing mode. If empty, selection mode is used.
node_propertyLIST/Required. Numeric node properties to form a vector for each node.
degreeCutoffINT0Minimum degree to include a node (0 = no cutoff).
orderSTRING/Sorts results by similarity: asc or desc.
limitINT-1Maximum total results returned (-1 = all).
top_limitINT-1Maximum results per source node in selection mode (-1 = all).

Supports three computation modes:

  • All-pairs: When both ids and ids2 are empty, computes similarity between all node pairs in the graph.
  • Pairing: When both ids and ids2 are specified, computes similarity between each node in ids and each node in ids2.
  • Selection: When only ids is specified (no ids2), computes similarity between each node in ids and all other nodes. Use top_limit to limit results per source node.

Run Mode

Returns:

ColumnTypeDescription
node1STRINGFirst node identifier (_id)
node2STRINGSecond node identifier (_id)
similarityFLOATNormalized Euclidean distance (closer to 1 = more similar)

Euclidean similarity in pairing mode:

GQL
CALL algo.similarity({
  type: "euclidean",
  ids: ["product1"],
  ids2: ["product2", "product3", "product4"],
  node_property: ["price", "weight", "width", "height"]
}) YIELD node1, node2, similarity

Result:

node1node2similarity
product1product20.010484136264957374
product1product30.006898369064315755
product1product40.00601761870467499

Euclidean similarity in selection mode (top 1 per source node):

GQL
CALL algo.similarity({
  type: "euclidean",
  ids: ["product1", "product3"],
  node_property: ["price", "weight", "width", "height"],
  top_limit: 1
}) YIELD node1, node2, similarity

Result:

node1node2similarity
product1product20.010484136264957374
product3product40.024091011098206213

Stream Mode

Returns the same columns as run mode, streamed for memory efficiency.

GQL
CALL algo.similarity.stream({
  type: "euclidean",
  ids: ["product1"],
  node_property: ["price", "weight", "width", "height"],
  order: "desc"
}) YIELD node1, node2, similarity
RETURN node1, node2, similarity

Result:

node1node2similarity
product1product20.010484136264957374
product1product30.006898369064315755
product1product40.00601761870467499

Stats Mode

Returns:

ColumnTypeDescription
pairCountINTNumber of node pairs computed
minSimilarityFLOATMinimum normalized distance
maxSimilarityFLOATMaximum normalized distance
avgSimilarityFLOATAverage normalized distance
GQL
CALL algo.similarity.stats({
  type: "euclidean",
  node_property: ["price", "weight", "width", "height"]
}) YIELD pairCount, minSimilarity, maxSimilarity, avgSimilarity

Result:

pairCountminSimilaritymaxSimilarityavgSimilarity
120.006017618704674990.0240910110982062130.013147026110302051

Write Mode

Computes results and writes them back to node properties. The write configuration is passed as a second argument map.

Write parameters:

NameTypeDescription
db.propertySTRING or MAPNode property to write results to. String: writes the similarity column in results to a property. Map: explicit column-to-property mapping (e.g., {similarity: 'euc_score'}).

Writable columns:

ColumnTypeDescription
similarityFLOATNormalized Euclidean distance

Returns:

ColumnTypeDescription
task_idSTRINGTask identifier for tracking via SHOW TASKS
nodesWrittenINTNumber of nodes with properties written
computeTimeMsINTTime spent computing the algorithm (milliseconds)
writeTimeMsINTTime spent writing properties to storage (milliseconds)
GQL
CALL algo.similarity.write({
  type: "euclidean",
  ids: ["product1", "product2"],
  node_property: ["price", "weight", "width", "height"]
}, {
  db: {
    property: "sim_score"
  }
}) YIELD task_id, nodesWritten, computeTimeMs, writeTimeMs