In mathematics, the Euclidean distance between two points in Euclidean space is the length of a line segment between the two points. In the graph, N numeric node properties (features) are specified to represent each node's position in an N-dimensional Euclidean space.
In 2-dimensional space, the formula to compute the Euclidean distance between points A(x1, y1) and B(x2, y2) is:

In 3-dimensional space, the formula to compute the Euclidean distance between points A(x1, y1, z1) and B(x2, y2, z2) is:

Generalized to N-dimensional space, the formula to compute the Euclidean distance is:

where xi1 represents the i-th dimensional coordinates of the first point, and xi2 represents the i-th dimensional coordinates of the second point.
Euclidean distance ranges from 0 to +∞; smaller values indicate greater similarity between the two nodes.
This algorithm returns the normalized Euclidean distance using the formula 1 / (1 + d), which scales the result into the range (0, 1]:
Values closer to 1 indicate greater similarity. For example, if the Euclidean distance between two nodes is 94.38, the normalized similarity is 1 / (1 + 94.38) ≈ 0.01049.

GQLINSERT (:product {_id: "product1", price: 50, weight: 160, width: 20, height: 152}), (:product {_id: "product2", price: 42, weight: 90, width: 30, height: 90}), (:product {_id: "product3", price: 24, weight: 50, width: 55, height: 70}), (:product {_id: "product4", price: 38, weight: 20, width: 32, height: 66})
| Name | Type | Default | Description |
|---|---|---|---|
type | STRING | jaccard | Type of similarity to compute: euclidean. |
ids | LIST | / | First group of node _ids. If empty, all nodes are used. |
ids2 | LIST | / | Second group of node _ids for pairing mode. If empty, selection mode is used. |
node_property | LIST | / | Required. Numeric node properties to form a vector for each node. |
degreeCutoff | INT | 0 | Minimum degree to include a node (0 = no cutoff). |
order | STRING | / | Sorts results by similarity: asc or desc. |
limit | INT | -1 | Maximum total results returned (-1 = all). |
top_limit | INT | -1 | Maximum results per source node in selection mode (-1 = all). |
Supports three computation modes:
ids and ids2 are empty, computes similarity between all node pairs in the graph.ids and ids2 are specified, computes similarity between each node in ids and each node in ids2.ids is specified (no ids2), computes similarity between each node in ids and all other nodes. Use top_limit to limit results per source node.Returns:
| Column | Type | Description |
|---|---|---|
node1 | STRING | First node identifier (_id) |
node2 | STRING | Second node identifier (_id) |
similarity | FLOAT | Normalized Euclidean distance (closer to 1 = more similar) |
Euclidean similarity in pairing mode:
GQLCALL algo.similarity({ type: "euclidean", ids: ["product1"], ids2: ["product2", "product3", "product4"], node_property: ["price", "weight", "width", "height"] }) YIELD node1, node2, similarity
Result:
| node1 | node2 | similarity |
|---|---|---|
| product1 | product2 | 0.010484136264957374 |
| product1 | product3 | 0.006898369064315755 |
| product1 | product4 | 0.00601761870467499 |
Euclidean similarity in selection mode (top 1 per source node):
GQLCALL algo.similarity({ type: "euclidean", ids: ["product1", "product3"], node_property: ["price", "weight", "width", "height"], top_limit: 1 }) YIELD node1, node2, similarity
Result:
| node1 | node2 | similarity |
|---|---|---|
| product1 | product2 | 0.010484136264957374 |
| product3 | product4 | 0.024091011098206213 |
Returns the same columns as run mode, streamed for memory efficiency.
GQLCALL algo.similarity.stream({ type: "euclidean", ids: ["product1"], node_property: ["price", "weight", "width", "height"], order: "desc" }) YIELD node1, node2, similarity RETURN node1, node2, similarity
Result:
| node1 | node2 | similarity |
|---|---|---|
| product1 | product2 | 0.010484136264957374 |
| product1 | product3 | 0.006898369064315755 |
| product1 | product4 | 0.00601761870467499 |
Returns:
| Column | Type | Description |
|---|---|---|
pairCount | INT | Number of node pairs computed |
minSimilarity | FLOAT | Minimum normalized distance |
maxSimilarity | FLOAT | Maximum normalized distance |
avgSimilarity | FLOAT | Average normalized distance |
GQLCALL algo.similarity.stats({ type: "euclidean", node_property: ["price", "weight", "width", "height"] }) YIELD pairCount, minSimilarity, maxSimilarity, avgSimilarity
Result:
| pairCount | minSimilarity | maxSimilarity | avgSimilarity |
|---|---|---|---|
| 12 | 0.00601761870467499 | 0.024091011098206213 | 0.013147026110302051 |
Computes results and writes them back to node properties. The write configuration is passed as a second argument map.
Write parameters:
| Name | Type | Description |
|---|---|---|
db.property | STRING or MAP | Node property to write results to. String: writes the similarity column in results to a property. Map: explicit column-to-property mapping (e.g., {similarity: 'euc_score'}). |
Writable columns:
| Column | Type | Description |
|---|---|---|
similarity | FLOAT | Normalized Euclidean distance |
Returns:
| Column | Type | Description |
|---|---|---|
task_id | STRING | Task identifier for tracking via SHOW TASKS |
nodesWritten | INT | Number of nodes with properties written |
computeTimeMs | INT | Time spent computing the algorithm (milliseconds) |
writeTimeMs | INT | Time spent writing properties to storage (milliseconds) |
GQLCALL algo.similarity.write({ type: "euclidean", ids: ["product1", "product2"], node_property: ["price", "weight", "width", "height"] }, { db: { property: "sim_score" } }) YIELD task_id, nodesWritten, computeTimeMs, writeTimeMs