Overview
The k-Nearest Neighbors (kNN) algorithm, also known as the nearest neighbor algorithm, classifies a given sample based on the classifications of its K most similar samples (measured by, for example, cosine similarity). Proposed in 1967 by IEEE members T. M. Cover and P. E. Hart, it is one of the simplest data classification techniques.
Although its name contains the word 'neighbor', kNN does not rely on the relationships between samples when finding similar samples; in a graph, this means that kNN has nothing to do with the edges between nodes.
Related material of the algorithm:
- T.M. Cover, P.E. Hart, Nearest Neighbor Pattern Classification (1967)
Basic Concept
Select Similar Nodes
Ultipa's kNN algorithm treats each node in the graph as a sample and uses the cosine similarity between nodes (see the chapter Cosine Similarity) as the similarity between samples; it then selects the K nodes y that are most similar to the given node x as the K nearest neighbors Kx.
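For reference, writing the selected numeric properties of two nodes as vectors x = (x1, ..., xn) and y = (y1, ..., yn), cosine similarity takes the standard form (see the Cosine Similarity chapter for the full treatment):

$$\text{similarity}(x, y) = \cos(x, y) = \frac{\sum_{i=1}^{n} x_i\, y_i}{\sqrt{\sum_{i=1}^{n} x_i^{2}}\;\sqrt{\sum_{i=1}^{n} y_i^{2}}}$$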
Vote on Classification
One node property is designated as the classification label. The labels of the selected K nearest neighbors are counted, and the label with the most occurrences is used as the label of x.

If more than one label ties for the most occurrences, the label of the node with the highest similarity is used.
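The selection and voting logic can be sketched in a few lines of Python (a minimal illustration only, not Ultipa's implementation; the cosine function, sample data and property values below are hypothetical):

```python
from collections import Counter
from math import sqrt

def cosine(x, y):
    """Cosine similarity between two numeric property vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sqrt(sum(a * a for a in x)) * sqrt(sum(b * b for b in y)))

def knn_classify(x_vec, samples, k):
    """Select the k samples most similar to x_vec and vote on their labels.

    samples: list of (node_id, property_vector, label).
    Ties are broken by the label of the most similar neighbor,
    mirroring the rule described above.
    """
    # Rank all candidate nodes by cosine similarity to x, descending
    ranked = sorted(
        ((cosine(x_vec, vec), node_id, label) for node_id, vec, label in samples),
        reverse=True,
    )
    neighbors = ranked[:k]

    # Count label occurrences among the k nearest neighbors
    counts = Counter(label for _, _, label in neighbors)
    best_count = max(counts.values())
    tied = {label for label, c in counts.items() if c == best_count}

    # Tie-break: take the label of the neighbor with the highest similarity
    for _sim, _node_id, label in neighbors:
        if label in tied:
            return label, best_count, neighbors

# Hypothetical data: (node_id, [price, weight, width], height label)
samples = [
    (2, [10.0, 2.0, 3.0], 70),
    (3, [12.0, 2.5, 3.5], 70),
    (4, [8.0, 1.0, 5.0], 55),
    (5, [1.0, 9.0, 0.5], 40),
]
label, count, neighbors = knn_classify([11.0, 2.2, 3.2], samples, k=3)
print(label, count)  # the voted label and its number of occurrences
```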
Special Case
Lonely Node, Disconnected Graph
Theoretically, the calculation of kNN does not depend on the existence of edges in the graph; whether the node to be calculated is a lonely node, or which connected component it is located in, has no effect on the result.
Self-loop Edge
The calculation of kNN has nothing to do with edges.
Directed Edge
The calculation of kNN has nothing to do with edges.
Results and Statistics
The graph below has 5 product nodes (edges are ignored); each product node has properties price, weight, width and height:

Algorithm results: Given node product1, select the 3 most similar nodes based on properties price, weight and width, and specify property height as the classification label
- The voted classification label (attribute_value) and the number of times (count) it appears among the 3 kNN:
attribute_value | count |
---|---|
70 | 2 |
- The selected 3 kNN (node) and their cosine similarities (similarity) with product1:
node | similarity |
---|---|
4 | 0.6348920499152856 |
3 | 0.7811137489489366 |
2 | 0.9763079726833555 |
Note: The cosine similarity between product5 and product1 is 0.404234826778782, thus it is not included in the 3 kNN.
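As a quick check, sorting the four similarity values reported above and keeping the top 3 confirms that product5 falls outside the 3 kNN (the dictionary keys below are the product node IDs):

```python
# Cosine similarities with product1, taken from the result tables above
similarities = {
    2: 0.9763079726833555,
    3: 0.7811137489489366,
    4: 0.6348920499152856,
    5: 0.404234826778782,   # product5, the excluded candidate
}

# Keep the 3 candidates with the highest similarity
top_3 = sorted(similarities, key=similarities.get, reverse=True)[:3]
print(top_3)  # [2, 3, 4] -- product5 does not make it into the 3 kNN
```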
Algorithm statistics: N/A
Command and Configuration
- Command:
algo(knn)
- Configurations for the parameter params():
Name | Type | Default | Specification | Description |
---|---|---|---|---|
node_id | _uuid | / | Mandatory | UUID of the node to be calculated |
node_schema_property | []@<schema>?.<property> | / | Numeric node properties, LTE needed; at least two properties are required, and the schema should be the same as that of node node_id | Properties of the node that participate in the calculation of cosine similarity |
top_k | int | / | >0, mandatory | The number of the most similar nodes to select and vote on the classification label |
target_schema_property | @<schema>?.<property> | / | LTE needed; the schema should be the same as that of node node_id | The property of the node where the classification label is located |
Example: Estimate the value of height according to properties price, weight and width of the node with UUID = 1
algo(knn).params({
node_id: 1,
node_schema_property: [@product.price,@product.weight,@product.width],
top_k: 4,
target_schema_property: @product.height
}) as height return height
Algorithm Execution
Task Writeback
1. File Writeback
Configuration | Data in Each Row | Description |
---|---|---|
filename | First row: attribute_value; second row and later: _id,similarity | The first row is the elected classification label; the second row and later are the selected kNN node IDs and their cosine similarities with the given node |
Example: Estimate the value of height according to properties price, weight and width of the node with UUID = 1, and write the algorithm results back to a file named knn
algo(knn).params({
node_id: 1,
node_schema_property: [@product.price,@product.weight,@product.width],
top_k: 4,
target_schema_property: @product.height
}).write({
file:{
filename: "knn"
}
})
2. Property Writeback
Not supported by this algorithm.
3. Statistics Writeback
Not supported by this algorithm.
Direct Return
Alias Ordinal | Type | Description | Column Name |
---|---|---|---|
0 | KV | The elected classification label and the number of times it occurs among the K nearest neighbors | attribute_value, count |
1 | []perNode | The selected K nearest neighbors and their cosine similarities | node, similarity |
Example: Estimate the value of height according to properties price, weight and width of the node with UUID = 1, define the algorithm results as aliases a1 and a2, and return the results
algo(knn).params({
node_id: 1,
node_schema_property: [@product.price,@product.weight,@product.width],
top_k: 3,
target_schema_property: @product.height
}) as a1, a2 return a1, a2
Streaming Return
Alias Ordinal | Type | Description | Column Name |
---|---|---|---|
0 | KV | The elected classification label and the number of times it occurs among the K nearest neighbors | attribute_value, count |
Example: Estimate the value of height according to properties price, weight and width of the node with UUID = 1, and update the value of height of that node
algo(knn).params({
node_id: 1,
node_schema_property: [@product.price,@product.weight,@product.width],
top_k: 3,
target_schema_property: @product.height
}).stream() as a1
with a1.attribute_value as value
update().nodes(1).set({height: value})
Real-time Statistics
This algorithm has no statistics.