UltipaDocs
Products
Solutions
Resources
Company
Start Free Trial
UltipaDocs
Start Free Trial
  • Introduction
  • Running Algorithms
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Eccentricity Centrality
    • Betweenness Centrality
    • Bridges
    • Articulation Points
    • Eigenvector Centrality
    • Katz Centrality
    • CELF
    • PageRank
    • ArticleRank
    • TextRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • KNN
    • Vector Similarity
    • Bipartite Graph
    • HyperANF
    • Weakly Connected Components (WCC)
    • Strongly Connected Components (SCC)
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Triangle Count
    • Clique Count
    • k-Core
    • k-Truss
    • p-Cohesion
    • Induced Subgraph
    • Topological Sort
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Dijkstra's Shortest Path
    • A* Shortest Path
    • Yen's K-Shortest Paths
    • Shortest Path (BFS)
    • Delta-Stepping SSSP
    • Shortest Path Faster Algorithm (SPFA)
    • All-Pairs Shortest Path (APSP)
    • Minimum Spanning Tree (MST)
    • K-Spanning Tree
    • Steiner Tree
    • Prize-Collecting Steiner Tree (PCST)
    • Minimum Cost Flow
    • Maximum Flow
    • K-Hop Fast
    • Longest Path (DAG)
    • Random Walk
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Same Community
    • Louvain
    • Leiden
    • Modularity Optimization
    • Label Propagation
    • HANP
    • SLPA
    • k-Means
    • HDBSCAN
    • K-1 Coloring
    • Modularity
    • Conductance
    • Max k-Cut
      • Node2Vec
      • Struc2Vec
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Algorithms
  4. /
  5. Similarity

K-Nearest Neighbors (KNN)

Overview

KNN (K-Nearest Neighbors) finds the K most similar nodes for each node based on neighborhood structure. Useful for building similarity graphs and recommendation systems. For each node, the algorithm identifies its K closest neighbors by comparing neighborhood overlap using the specified similarity metric: jaccard, cosine, or overlap.

Three metrics are supported:

  • Jaccard: Intersection over union of neighbor sets. See Jaccard Similarity for details.
  • Overlap: Intersection over the smaller neighbor set. See Overlap Similarity for details.
  • Cosine: Cosine similarity between node property vectors. See Cosine Similarity for details.

Example Graph

GQL
INSERT (Sue:user {_id: "Sue", age: 28, score: 85}),
       (Dave:user {_id: "Dave", age: 35, score: 72}),
       (Ann:user {_id: "Ann", age: 30, score: 90}),
       (Mark:user {_id: "Mark", age: 32, score: 78}),
       (May:user {_id: "May", age: 26, score: 88}),
       (Jay:user {_id: "Jay", age: 29, score: 82}),
       (Billy:user {_id: "Billy", age: 24, score: 65}),
       (Dave)-[:know]->(Sue), (Dave)-[:know]->(Ann),
       (Mark)-[:know]->(Dave), (May)-[:know]->(Mark),
       (May)-[:know]->(Jay), (Jay)-[:know]->(Ann),
       (Ann)-[:know]->(Billy), (Mark)-[:know]->(Ann)

Parameters

NameTypeDefaultDescription
kINT10Number of nearest neighbors to find per node.
metricSTRINGjaccardSimilarity metric to use: jaccard, overlap, or cosine.
node_propertyLIST/Numeric node properties to form vectors (required for cosine metric).
degreeCutoffINT0Minimum degree to include a node (0 = no cutoff).

Run Mode

Returns:

ColumnTypeDescription
node1STRINGSource node identifier (_id)
node2STRINGNeighbor node identifier (_id)
similarityFLOATSimilarity score
rankINTNeighbor rank (1 = most similar)
GQL
CALL algo.knn({
  k: 2,
  metric: "jaccard"
}) YIELD node1, node2, similarity, rank

Result:

node1node2similarityrank
MayAnn0.51
MayDave0.252
MarkJay0.66666666666666661
MarkSue0.33333333333333332
JayMark0.66666666666666661
JayBilly0.52
SueMark0.33333333333333331
SueAnn0.252
DaveBilly0.33333333333333331
DaveMay0.252
BillyJay0.51
BillyDave0.33333333333333332
AnnMay0.51
AnnSue0.252

Stream Mode

Returns the same columns as run mode, streamed for memory efficiency.

GQL
CALL algo.knn.stream({
  k: 1,
  metric: "cosine",
  node_property: ["age", "score"]
}) YIELD node1, node2, similarity, rank
RETURN node1, node2, similarity, rank

Result:

node1node2similarityrank
MaySue0.99952153413184161
MarkBilly0.99936590363544461
JayBilly0.9999051568114481
SueAnn0.99999375700386131
DaveMark0.99800618582621491
BillyJay0.9999051568114481
AnnSue0.99999375700386131

Stats Mode

Returns:

ColumnTypeDescription
nodeCountINTTotal number of nodes
pairCountINTNumber of neighbor pairs found
minSimilarityFLOATMinimum similarity score
maxSimilarityFLOATMaximum similarity score
avgSimilarityFLOATAverage similarity score
GQL
CALL algo.knn.stats({
  k: 3,
  metric: "overlap"
}) YIELD nodeCount, pairCount, minSimilarity, maxSimilarity, avgSimilarity

Result:

nodeCountpairCountminSimilaritymaxSimilarityavgSimilarity
7190.333333333333333310.8596491228070174

Write Mode

Computes results and writes them back to node properties. The write configuration is passed as a second argument map.

Write parameters:

NameTypeDescription
db.propertySTRING or MAPNode property to write results to. String: writes the similarity column in results to a property. Map: explicit column-to-property mapping (e.g., {similarity: 'sim_score', rank: 'sim_rank'}).

Writable columns:

ColumnTypeDescription
similarityFLOATSimilarity score
rankINTNeighbor rank (1 = most similar)

Returns:

ColumnTypeDescription
task_idSTRINGTask identifier for tracking via SHOW TASKS
nodesWrittenINTNumber of nodes with properties written
computeTimeMsINTTime spent computing the algorithm (milliseconds)
writeTimeMsINTTime spent writing properties to storage (milliseconds)
GQL
CALL algo.knn.write({k: 5, metric: "jaccard"}, {
  db: {
    property: "sim_score"
  }
}) YIELD task_id, nodesWritten, computeTimeMs, writeTimeMs