UltipaDocs
Try Playground
  • Introduction
    • Show Algorithms
    • Install and Uninstall
    • Run Algorithms
    • Algorithm Results and Statistics
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Graph Centrality
    • Betweenness Centrality
    • Eigenvector Centrality
    • CELF
    • PageRank
    • ArticleRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • K-Hop All
    • Bipartite Graph
    • HyperANF
    • Connected Component
    • Triangle Counting
    • Induced Subgraph
    • k-Core
    • k-Truss
    • p-Cohesion
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Topological Sort
    • Schema Overview
    • Dijkstra's Single-Source Shortest Path
    • Delta-Stepping Single-Source Shortest Path
    • Shortest Path Faster Algorithm (SPFA)
    • Minimum Spanning Tree
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Louvain
    • Leiden
    • Label Propagation
    • HANP
    • k-Means
    • kNN (k-Nearest Neighbors)
    • K-1 Coloring
    • Conductance
      • Random Walk
      • Node2Vec Walk
      • Node2Vec
      • Struc2Vec Walk
      • Struc2Vec
      • GraphSAGE
      • GraphSAGE Train
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Analytics & Algorithms
  4. /
  5. Community Detection & Classification

kNN (k-Nearest Neighbors)

✓ File Writeback ✕ Property Writeback ✓ Direct Return ✓ Stream Return ✕ Stats

Overview

The k-Nearest Neighbors (kNN) algorithm is a classification technique that classifies the target node based on the classifications of its k nearest (most similar) nodes. kNN was proposed by T.M. Cover and P.E. Hart in 1967 and has since become one of the simplest and widely used classification algorithms:

  • T.M. Cover, P.E. Hart, Nearest Neighbor Pattern Classification (1967)

Although containing the word neighbor in its name, kNN does not explicitly consider the edges between nodes when calculating similarity. Instead, it focuses solely on node properties.

Concepts

Similarity Metric

Ultipa's kNN algorithm computes pair-wise cosine similarity between the target node and all other nodes in the graph, then selects the k nodes with the highest similarity to the target node.

Vote on Classification

One node property is selected as the class label. After finding the nearest k nodes for the target node, assign the majority label among the k nodes to the target node.

If multiple labels occur with the highest frequency, the label of the node with the highest similarity will be selected.

Syntax

  • Command: algo(knn)
  • Parameters:
Name
Type
Spec
Default
Optional
Description
node_id_uuid//NoUUID of the target node
node_schema_property[]@<schema>?.<property>Numeric type, must LTE, must be of the same schema as the target node/NoTwo or more node properties to compute the cosine similarity
top_kint>0/NoThe number of the nearest nodes to select
target_schema_property@<schema>?.<property>Numeric/String type, must LTE, must be of the same schema as the target node/NoNode property to use as the class label

Examples

The example graph has 6 image nodes (edges are ignored), and each node has properties d1, d2, d3, d4 and type:

File Writeback

Spec
Content
Description
filenameFirst row: attribute_value
Second row and later: _id,similarity
First row: The elected class label
Second row and later: ID of the nearest node and its cosine similarity with the target node
UQL
algo(knn).params({
  node_id: 1,
  node_schema_property: ['d1', 'd2', 'd3', 'd4'],
  top_k: 4,
  target_schema_property: @image.type
}).write({
  file:{
    filename: "knn"
    }
})

Results: File knn

File
Gold
top k : image4,0.538975
image3,0.705072
image6,0.841922
image2,0.85516

Direct Return

Alias Ordinal
Type
Description
Columns
0KVThe elected class label and its number of occurrences among the k nearest neighborsattribute_value, count
1[]perNodeThe nearest node and its cosine similarity with the target nodenode, similarity
UQL
algo(knn).params({
  node_id: 1,
  node_schema_property: ['d1', 'd2', 'd3', 'd4'],
  top_k: 4,
  target_schema_property: @image.type
}) as a1, a2 
return a1, a2

Results: a1 and a2

attribute_valuecount
Gold2
nodesimilarity
40.538974677919475
30.705071517140301
60.841922130134788
20.855159652306166

Stream Return

Alias Ordinal
Type
Description
Columns
0KVThe elected class label and its number of occurrences among the k nearest neighborsattribute_value, count
UQL
algo(knn).params({
  node_id: 2,
  node_schema_property: ['@image.d1', '@image.d2', '@image.d3', '@image.d4'],
  top_k: 5,
  target_schema_property: @image.type
}).stream() as label
find().nodes({_uuid == 2}) as target
return case
  when target.type == label.attribute_value then 'True'
  else 'false'
end

Results: false

UQL
find().nodes({@image}) as images
call {
    with images._uuid as target
    algo(knn).params({
      node_id: target,
      node_schema_property: ['d1', 'd2', 'd3', 'd4'],
      top_k: 3,
      target_schema_property: 'type'
    }).stream() as label
    return label
}
return table(images._id, label.attribute_value)

Results: table(images._id, label.attribute_value)

images._idlabel.attribute_value
image1Silver
image2Silver
image3Gold
image4Silver
image5Gold
image6Gold