v4.2

    Euclidean Distance

    Overview

    Euclidean distance, named after the ancient Greek mathematician Euclid, is the most commonly used distance measure. It gives the absolute distance between two nodes in a multi-dimensional space, that is, the length of the straight line segment between them. The Euclidean distance between two nodes in a graph is calculated by using N numeric properties of each node to form two N-dimensional vectors.

    Basic Concept

    Vector

    The vector is a basic concept in advanced mathematics, and vectors in low-dimensional spaces are relatively easy to understand and express. The following diagram shows the relationship between vectors A, B and the coordinate axes in 2- and 3-dimensional spaces respectively, as well as the angle θ between them:

    When two nodes in a graph are compared, N properties of each node are used to form two N-dimensional vectors, one per node.
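    As a minimal sketch of how N properties become an N-dimensional vector, the snippet below reads the listed properties from a node in a fixed order. The helper name `node_vector` and the property values are hypothetical and only serve as an illustration; they are not taken from the example graph.

```python
from typing import List, Mapping, Sequence


def node_vector(node: Mapping[str, float], properties: Sequence[str]) -> List[float]:
    """Build a vector from the listed numeric properties, keeping their order."""
    return [float(node[name]) for name in properties]


# Hypothetical property values, for illustration only.
node_a = {"price": 10.0, "weight": 2.0, "width": 3.0, "height": 4.0}
node_b = {"price": 12.0, "weight": 1.5, "width": 3.0, "height": 6.0}
properties = ["price", "weight", "width", "height"]

vector_a = node_vector(node_a, properties)
vector_b = node_vector(node_b, properties)
```

    Both vectors must be built from the same property list in the same order, so that the i-th coordinate means the same thing for both nodes.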

    Euclidean Distance

    In 2-dimensional space, the formula to calculate the Euclidean distance is:

    $$d = \sqrt{(x_{11} - x_{12})^2 + (x_{21} - x_{22})^2}$$

    In 3-dimensional space, the formula to calculate the Euclidean distance is:

    $$d = \sqrt{(x_{11} - x_{12})^2 + (x_{21} - x_{22})^2 + (x_{31} - x_{32})^2}$$

    Generalized to n-dimensional space, the formula to calculate the Euclidean distance is:

    $$d = \sqrt{\sum_{i=1}^{n} (x_{i1} - x_{i2})^2}$$

    where $x_{i1}$ is the i-th coordinate of the first node and $x_{i2}$ is the i-th coordinate of the second node.

    Euclidean distance takes values in [0, +∞); the smaller the value, the more similar the two nodes are.
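    The n-dimensional definition can be sketched in a few lines of Python: sum the squared coordinate differences and take the square root. The function name `euclidean_distance` is illustrative and is not part of any Ultipa API.

```python
import math
from typing import Sequence


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    # Sum of squared per-coordinate differences, then the square root.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(euclidean_distance([0.0, 0.0], [3.0, 4.0]))  # 5.0
```

    Identical vectors give a distance of 0, which is the lower end of the range.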

    Normalized Euclidean Distance

    Normalized Euclidean distance is an improvement on Euclidean distance. Its values lie in the range [0,1]; the larger the value, the more similar the two nodes are.

    Ultipa adopts the following formula to normalize Euclidean distance:
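    The example results in this document are consistent with the mapping 1 / (1 + d), where d is the Euclidean distance: a distance of 94.3822017119753 between products 1 and 2 corresponds to a normalized value of 0.010484. The sketch below assumes that mapping. It is inferred from those results rather than quoted from Ultipa, and the function name is illustrative.

```python
def normalized_euclidean(distance: float) -> float:
    """Map a non-negative distance to a similarity, assuming 1 / (1 + d)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return 1.0 / (1.0 + distance)


# Distances taken from the pairing-mode example results in this document.
for d in (94.3822017119753, 143.96180048888, 165.178691119648):
    print(round(normalized_euclidean(d), 6))
```

    Under this assumption a distance of 0 maps to 1, and larger distances approach 0 without reaching it.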

    Special Case

    Isolated Node, Disconnected Graph

    Theoretically, the calculation of Euclidean distance between two nodes does not depend on the existence of edges in the graph. Regardless of whether the two nodes to be calculated are isolated nodes or whether they are in the same connected component, it does not affect the calculation of their Euclidean distance.

    Self-loop Edge

    The calculation of Euclidean distance has nothing to do with edges.

    Directed Edge

    The calculation of Euclidean distance has nothing to do with edges.

    Command and Configuration

    • Command: algo(similarity)
    • Configurations for the parameter params():
    | Name | Type | Default | Specification | Description |
    |---|---|---|---|---|
    | ids / uuids | []_id / []_uuid | / | Mandatory | IDs or UUIDs of the first set of nodes to be calculated |
    | ids2 / uuids2 | []_id / []_uuid | / | Optional | IDs or UUIDs of the second set of nodes to be calculated |
    | type | string | cosine | jaccard / overlap / cosine / pearson / euclideanDistance / euclidean | Measurement of the similarity: jaccard (Jaccard Similarity); overlap (Overlap Similarity); cosine (Cosine Similarity); pearson (Pearson Correlation Coefficient); euclideanDistance (Euclidean Distance); euclidean (Normalized Euclidean Distance) |
    | node_schema_property | []@<schema>?.<property> | / | Numeric node property; LTE needed; schema can be either carried or not | When type is cosine / pearson / euclideanDistance / euclidean, two or more node properties must be specified to form the vector; when type is jaccard / overlap, this parameter is invalid |
    | limit | int | -1 | >=-1 | Number of results to return; returns all results if set to -1 |
    | top_limit | int | -1 | >=-1 | Only available in the selection mode; limits the length of the selection results (top_list) of each node; returns the full top_list if set to -1 |

    Calculation Mode

    This algorithm has two calculation modes:

    1. Pairing mode: when two sets of valid nodes are configured, each node in the first set is paired with each node in the second set (Cartesian product), and the similarity is calculated for every node pair.
    2. Selection mode: when only one set (the first) of valid nodes is configured, the similarity between each node in the set and every other node in the graph is calculated; results with similarity > 0 are returned in descending order of similarity.
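    The two modes can be sketched as follows. This is an illustration of the behaviour described above, not Ultipa's implementation: the function names, the in-memory `vectors` dictionary and its values are hypothetical, and plain Euclidean distance (via `math.dist`) stands in for the chosen measurement.

```python
import math
from itertools import product
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def pairing_mode(vectors: Dict[int, Vector], first: Sequence[int],
                 second: Sequence[int]) -> List[Tuple[int, int, float]]:
    """Score every (node1, node2) pair in the Cartesian product of the two sets."""
    return [(n1, n2, math.dist(vectors[n1], vectors[n2]))
            for n1, n2 in product(first, second)]


def selection_mode(vectors: Dict[int, Vector], first: Sequence[int],
                   top_limit: int = -1) -> Dict[int, List[Tuple[int, float]]]:
    """For each node in `first`, rank all other nodes by score, highest first."""
    results = {}
    for node in first:
        scored = [(other, math.dist(vectors[node], vectors[other]))
                  for other in vectors if other != node]
        # Keep only positive scores, then sort in descending order.
        ranked = sorted((pair for pair in scored if pair[1] > 0),
                        key=lambda pair: pair[1], reverse=True)
        # top_limit = -1 keeps the full list, otherwise truncate it.
        results[node] = ranked if top_limit == -1 else ranked[:top_limit]
    return results


# Hypothetical 2-dimensional vectors keyed by node UUID.
vectors = {1: [0.0, 0.0], 2: [3.0, 4.0], 3: [6.0, 8.0]}
pairs = pairing_mode(vectors, [1], [2, 3])
top = selection_mode(vectors, [1], top_limit=1)
```

    Here `pairs` holds one row per node pair, while `top` holds one ranked list per node in the first set, which mirrors the node1,node2,similarity and node,top_list result shapes.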

    Examples

    Example Graph

    The example graph has four product nodes, product1, product2, product3 and product4 (UUIDs 1, 2, 3 and 4 in order; edges are ignored). Each product node has the properties price, weight, width and height:

    Task Writeback

    1. File Writeback

    | Calculation Mode | Configuration | Data in Each Row |
    |---|---|---|
    | Pairing mode | filename | node1,node2,similarity |
    | Selection mode | filename | node,top_list |

    Example: Calculate the Euclidean distance between product UUID = 1 and products UUID = 2, 3, 4 using the properties price, weight, width and height, and write the results back to a file

    algo(similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height],
      type: "euclideanDistance"
    }).write({
      file:{ 
        filename: "ed"
      }
    })
    

    Results: File ed

    product1,product2,94.3822
    product1,product3,143.962
    product1,product4,165.179
    

    Example: Calculate the normalized Euclidean distance between each of products UUID = 1, 2, 3, 4 and all other products in the graph using the properties price, weight, width and height, and write the results back to a file

    algo(similarity).params({
      uuids: [1,2,3,4],
      node_schema_property: [price,weight,width,height],
      type: "euclidean"
    }).write({
      file:{ 
        filename: "ed_list"
      }
    })
    

    Results: File ed_list

    product1,product2:0.010484;product3:0.006898;product4:0.006018;
    product2,product3:0.018082;product4:0.013309;product1:0.010484;
    product3,product4:0.024091;product2:0.018082;product1:0.006898;
    product4,product3:0.024091;product2:0.013309;product1:0.006018;
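    A row of the selection-mode file can be read back with a small parser. The sketch below assumes the layout shown in the results above: the node, a comma, then `name:value` entries each terminated by a semicolon. The function name `parse_top_list_row` is hypothetical.

```python
from typing import List, Tuple


def parse_top_list_row(row: str) -> Tuple[str, List[Tuple[str, float]]]:
    """Split 'node,a:0.1;b:0.2;' into the node and its (name, value) entries."""
    node, _, top_list = row.strip().partition(",")
    entries = []
    for item in top_list.split(";"):
        if not item:
            continue  # skip the empty piece after the trailing ';'
        name, _, value = item.rpartition(":")
        entries.append((name, float(value)))
    return node, entries


row = "product1,product2:0.010484;product3:0.006898;product4:0.006018;"
node, entries = parse_top_list_row(row)
```

    The entries keep the file's order, so the first entry is the one with the highest value for that node.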
    

    2. Property Writeback

    Not supported by this algorithm.

    3. Statistics Writeback

    This algorithm has no statistics.

    Direct Return

    | Calculation Mode | Alias Ordinal | Type | Description | Column Name |
    |---|---|---|---|---|
    | Pairing mode | 0 | []perNodePair | Node pair and its similarity | node1, node2, similarity |
    | Selection mode | 0 | []perNode | Node and its selection results | node, top_list |

    Example: Calculate the Euclidean distance between product UUID = 1 and products UUID = 2, 3, 4 using the properties price, weight, width and height, and order the results by descending distance

    algo(similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height],
      type: "euclideanDistance"
    }) as distance
    return distance
    order by distance.similarity desc
    

    Results:

    node1 node2 similarity
    1 4 165.178691119648
    1 3 143.96180048888
    1 2 94.3822017119753

    Example: For each of products UUID = 1, 2, select the product with the highest normalized Euclidean distance, using the properties price, weight, width and height

    algo(similarity).params({
      uuids: [1,2],
      type: "euclidean",
      node_schema_property: [price,weight,width,height],
      top_limit: 1
    }) as top
    return top
    

    Results:

    node top_list
    1 2:0.010484,
    2 3:0.018082,

    Streaming Return

    | Calculation Mode | Alias Ordinal | Type | Description | Column Name |
    |---|---|---|---|---|
    | Pairing mode | 0 | []perNodePair | Node pair and its similarity | node1, node2, similarity |
    | Selection mode | 0 | []perNode | Node and its selection results | node, top_list |

    Example: Calculate the normalized Euclidean distance between product UUID = 3 and products UUID = 1, 2, 4 using the properties price, weight, width and height, and return only the results with similarity above 0.01

    algo(similarity).params({
      uuids: [3], 
      uuids2: [1,2,4],
      node_schema_property: [price,weight,width,height],
      type: "euclidean"
    }).stream() as distance
    where distance.similarity > 0.01
    return distance
    

    Results:

    node1 node2 similarity
    3 2 0.0180816471945529
    3 4 0.0240910110982062

    Example: For each of products UUID = 1, 3, select the product at the greatest Euclidean distance, using the properties price, weight, width and height

    algo(similarity).params({
      uuids: [1,3],
      node_schema_property: [price,weight,width,height],
      type: "euclideanDistance",
      top_limit: 1
    }).stream() as top
    return top
    

    Results:

    node top_list
    1 4:165.178691,
    3 1:143.961800,

    Real-time Statistics

    This algorithm has no statistics.
