UltipaDocs
Try Playground
  • Introduction
    • Show Algorithms
    • Install and Uninstall
    • Run Algorithms
    • Algorithm Results and Statistics
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Graph Centrality
    • Betweenness Centrality
    • Eigenvector Centrality
    • CELF
    • PageRank
    • ArticleRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • K-Hop All
    • Bipartite Graph
    • HyperANF
    • Connected Component
    • Triangle Counting
    • Induced Subgraph
    • k-Core
    • k-Truss
    • p-Cohesion
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Topological Sort
    • Schema Overview
    • Dijkstra's Single-Source Shortest Path
    • Delta-Stepping Single-Source Shortest Path
    • Shortest Path Faster Algorithm (SPFA)
    • Minimum Spanning Tree
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Louvain
    • Leiden
    • Label Propagation
    • HANP
    • k-Means
    • kNN (k-Nearest Neighbors)
    • K-1 Coloring
    • Conductance
      • Random Walk
      • Node2Vec Walk
      • Node2Vec
      • Struc2Vec Walk
      • Struc2Vec
      • GraphSAGE
      • GraphSAGE Train
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Analytics & Algorithms
  4. /
  5. Similarity

Euclidean Distance

✓ File Writeback ✕ Property Writeback ✓ Direct Return ✓ Stream Return ✕ Stats

Overview

In mathematics, the Euclidean distance between two points in Euclidean space is the length of a line segment between the two points. In the graph, specifying N numeric properties (features) of nodes to indicate the location of the node in an N-dimensional Euclidean space.

Concepts

Euclidean Distance

In 2-dimensional space, the formula to compute the Euclidean distance between points A(x1, y1) and B(x2, y2) is:

In 3-dimensional space, the formula to compute the Euclidean distance between points A(x1, y1, z1) and B(x2, y2, z2) is:

Generalize to N-dimensional space, the formula to compute the Euclidean distance is:

where xi1 represents the i-th dimensional coordinates of the first point, xi2 represents the i-th dimensional coordinates of the second point.

The Euclidean distance ranges from 0 to +∞; the smaller the value, the more similar the two nodes.

Normalized Euclidean Distance

Normalized Euclidean distance scales the Euclidean distance into range from 0 to 1; the closer to 1, the more similar the two nodes.

Ultipa adopts the following formula to normalize the Euclidean distance:

Considerations

  • Theoretically, the calculation of Euclidean distance between two nodes does not depend on their connectivity.

Syntax

  • Command: algo(similarity)
  • Parameters:
Name
Type
Spec
Default
Optional
Description
ids / uuids[]_id / []_uuid//NoID/UUID of the first group of nodes to calculate
ids2 / uuids2[]_id / []_uuid//YesID/UUID of the second group of nodes to calculate
typestringeuclideanDistance, euclideancosineNoType of similarity; euclideanDistance is to compute Euclidean Distance, euclidean is to compute Normalized Euclidean Distance
node_schema_property[]@<schema>?.<property>Numeric type, must LTE/NoSpecify two or more node properties to form the vectors, all properties must belong to the same (one) schema
limitint≥-1-1YesNumber of results to return, -1 to return all results
top_limitint≥-1-1YesIn the selection mode, limit the maximum number of results returned for each node specified in ids/uuids, -1 to return all results with similarity > 0; in the pairing mode, this parameter is invalid

The algorithm has two calculation modes:

  1. Pairing: when both ids/uuids and ids2/uuids2 are configured, pairing each node in ids/uuids with each node in ids2/uuids2 (ignore the same node) and computing pair-wise similarities.
  2. Selection: when only ids/uuids is configured, for each target node in it, computing pair-wise similarities between it and all other nodes in the graph. The returned results include all or limited number of nodes that have similarity > 0 with the target node and is ordered by the descending similarity.

Examples

The example graph has 4 products (edges are ignored), each product has properties price, weight, weight and height:

File Writeback

SpecContent
filenamenode1,node2,similarity
UQL
algo(similarity).params({
  uuids: [1], 
  uuids2: [2,3,4],
  node_schema_property: ['price', 'weight', 'width', 'height'],
  type: 'euclideanDistance'
}).write({
  file:{ 
    filename: 'ed'
  }
})

Results: File ed

File
product1,product2,94.3822
product1,product3,143.962
product1,product4,165.179
UQL
algo(similarity).params({
  uuids: [1,2,3,4],
  node_schema_property: ['price', 'weight', 'width', 'height'],
  type: 'euclidean'
}).write({
  file:{ 
    filename: 'ed_list'
  }
})

Results: File ed_list

File
product1,product2,0.010484
product1,product3,0.006898
product1,product4,0.006018
product2,product3,0.018082
product2,product4,0.013309
product2,product1,0.010484
product3,product4,0.024091
product3,product2,0.018082
product3,product1,0.006898
product4,product3,0.024091
product4,product2,0.013309
product4,product1,0.006018

Direct Return

Alias Ordinal
Type
DescriptionColumns
0[]perNodePairNode pair and its similaritynode1, node2, similarity
UQL
algo(similarity).params({
  uuids: [1,2], 
  uuids2: [2,3,4],
  node_schema_property: ['price', 'weight', 'width', 'height'],
  type: 'euclideanDistance'
}) as distance
return distance

Results: distance

node1node2similarity
1294.3822017119753
13143.96180048888
14165.178691119648
2354.3046959295419
2474.1350119714025
UQL
algo(similarity).params({
  uuids: [1,2],
  type: 'euclidean',
  node_schema_property: ['price', 'weight', 'width', 'height'],
  top_limit: 1
}) as top
return top

Results: top

node1node2similarity
120.0104841362649574
230.0180816471945529

Stream Return

Alias Ordinal
Type
DescriptionColumns
0[]perNodePairNode pair and its similaritynode1, node2, similarity
UQL
algo(similarity).params({
  uuids: [3], 
  uuids2: [1,2,4],
  node_schema_property: ['@product.price', '@product.weight', '@product.width'],
  type: 'euclidean'
}).stream() as distance
where distance.similarity > 0.01
return distance

Results: distance

node1node2similarity
320.019422
340.024206
UQL
algo(similarity).params({
  uuids: [1,3],
  node_schema_property: ['price', 'weight', 'width', 'height'],
  type: 'euclideanDistance',
  top_limit: 1
}).stream() as top
return top

Results: top

node1node2similarity
14165.1787
31143.9618