
    Cosine Similarity

      Basic  

    Overview

    Cosine similarity measures how similar two N-dimensional vectors are by the cosine of the angle between them in vector space. In a graph, the Cosine Similarity algorithm represents each node as a vector of N numeric node properties and computes the similarity between two nodes.

    In general, cosine similarity ranges over [-1,1]; when all property values are non-negative, it ranges over [0,1]. The larger the value, the more similar the two nodes are. If two nodes are identical, their cosine similarity is 1.
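
    A minimal Python sketch of the calculation described above, included only to make the definition concrete. The function name cosine_similarity and the sample property values are hypothetical and are not part of the algorithm's interface.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical (price, weight) vectors; the values are illustrative only.
same_direction = cosine_similarity([50.0, 160.0], [100.0, 320.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

    Vectors that point in the same direction score 1 regardless of their lengths, and perpendicular vectors score 0.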

    Basic Concept

    Vectorial Angle

    The vector is one of the basic concepts of advanced mathematics, and vectors in low-dimensional spaces are relatively easy to understand and express. The following diagram shows the relationship between vectors A and B and the coordinate axes in 2- and 3-dimensional spaces respectively, as well as the angle θ between them:

    Vectorial Angle in Graph

    When two nodes in a graph are compared, N specified numeric properties form one vector per node, and the cosine of the angle between the two vectors is calculated according to the following formula:
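
    Written out for two nodes whose N property values form the vectors A = (a_1, ..., a_N) and B = (b_1, ..., b_N), the standard definition of cosine similarity is as follows. The component symbols a_i and b_i are notation assumed here for illustration.

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{\sum_{i=1}^{N} a_i b_i}
         {\sqrt{\sum_{i=1}^{N} a_i^{2}} \; \sqrt{\sum_{i=1}^{N} b_i^{2}}}
```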

    Special Case

    Lonely Node, Disconnected Graph

    The calculation of cosine similarity between two nodes does not depend on the existence of edges in the graph. Whether the two nodes are lonely nodes, or whether they belong to the same connected component, has no effect on their cosine similarity.

    Self-loop Edge

    The calculation of cosine similarity has nothing to do with edges.

    Directed Edge

    The calculation of cosine similarity has nothing to do with edges.

    Results and Statistics

    The graph below has 4 product nodes (edges are ignored). The properties price, weight, width and height form the vectors on which the Cosine Similarity algorithm is run:


    Algorithm results: Calculate the cosine similarity between product1 and the other 3 products; return node1, node2 and cosine, where node1 and node2 are both UUIDs

    node1 node2 cosine
    1 2 0.9865294135291195
    1 3 0.8788584075196542
    1 4 0.8168761502672031

    Algorithm statistics: N/A

    Command and Configuration

    • Command: algo(cosine_similarity)
    • Configurations for the parameter params():
    | Name | Type | Default Value | Specification | Description |
    |------|------|---------------|---------------|-------------|
    | ids / uuids | []_id / []_uuid | / | Mandatory | IDs or UUIDs of the first set of nodes to be calculated; configure only one of the two |
    | ids2 / uuids2 | []_id / []_uuid | / | Mandatory | IDs or UUIDs of the second set of nodes to be calculated; configure only one of the two |
    | limit | int | -1 | >=-1 | Number of node pairs uuids × uuids2 to return; all results are returned if set to -1 or not set |
    | node_schema_property | []@<schema>?.<property> | / | Numeric node property, LTE needed; at least two properties are required | Node properties to form the dimensions of the vector |

    Example: Calculate the cosine similarity between nodes UUID = 1,2 and nodes UUID = 3,4 using the properties price and weight

    algo(cosine_similarity).params({
      uuids: [1,2],
      uuids2: [3,4],
      node_schema_property: [price, weight]
    }) as cs
    return cs
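
    The way the two node sets and limit interact can be sketched in Python as follows. This is an illustration under assumptions, not the algorithm's implementation: the helper names cosine and pairwise_cosine, the property values, the order in which pairs are produced, and which pairs are kept when limit truncates the result are all hypothetical.

```python
import math
from itertools import product

def cosine(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def pairwise_cosine(properties, first, second, limit=-1):
    """Return (node1, node2, cosine) rows for the pairs in first x second.

    properties maps a node identifier to its property vector.
    limit=-1 returns every pair; otherwise at most `limit` rows are returned.
    """
    if limit < -1:
        raise ValueError("limit must be >= -1")
    rows = []
    for n1, n2 in product(first, second):
        if limit != -1 and len(rows) >= limit:
            break
        rows.append((n1, n2, cosine(properties[n1], properties[n2])))
    return rows

# Hypothetical (price, weight) values for four nodes; illustrative only.
props = {1: [50.0, 160.0], 2: [42.0, 90.0], 3: [24.0, 50.0], 4: [35.0, 55.0]}
all_rows = pairwise_cosine(props, [1, 2], [3, 4])
first_three = pairwise_cosine(props, [1, 2], [3, 4], limit=3)
```

    With two nodes in each set, the unlimited call yields 2 × 2 = 4 node pairs, and limit=3 keeps three of them.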
    

    Algorithm Execution

    Task Writeback

    1. File Writeback

    | File Configuration Item | Data in Each Row |
    |-------------------------|------------------|
    | filename | node1,node2,cosine |

    Example: Calculate the cosine similarity between node UUID = 1 and the other nodes using the properties price, weight, width and height, and write the algorithm results back to a file named cs_result

    algo(cosine_similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height]
    }).write({
      file:{ 
        filename: "cs_result"
      }
    })
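
    A minimal Python sketch for reading such a result file, assuming each row is comma-separated as node1,node2,cosine with no header line. The exact file format and where the file is stored are assumptions, and the function name read_cosine_rows is hypothetical. The sample rows reuse the values from the results table above.

```python
import csv
import io

def read_cosine_rows(handle):
    """Parse rows of node1,node2,cosine into (int, int, float) tuples."""
    rows = []
    for record in csv.reader(handle):
        if not record:
            continue  # skip blank lines
        node1, node2, cosine = record
        rows.append((int(node1), int(node2), float(cosine)))
    return rows

# Stand-in for the contents of cs_result (assumed layout).
sample = io.StringIO(
    "1,2,0.9865294135291195\n"
    "1,3,0.8788584075196542\n"
    "1,4,0.8168761502672031\n"
)
rows = read_cosine_rows(sample)
most_similar = max(rows, key=lambda row: row[2])
```

    For a file on disk, the same function can be given a handle from open("cs_result", newline="").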
    

    2. Property Writeback

    Not supported by this algorithm.

    3. Statistics Writeback

    This algorithm has no statistics.

    Direct Return

    | Alias Ordinal | Type | Description | Column Name |
    |---------------|------|-------------|-------------|
    | 0 | []perNodePair | Node pair and its cosine similarity | node1, node2, cosine |

    Example: Calculate the cosine similarity between node UUID = 1 and the other nodes using the properties price, weight, width and height, define the algorithm results as an alias named cs, and return the results

    algo(cosine_similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height]
    }) as cs 
    return cs
    

    Streaming Return

    | Alias Ordinal | Type | Description | Column Name |
    |---------------|------|-------------|-------------|
    | 0 | []perNodePair | Node pair and its cosine similarity | node1, node2, cosine |

    Example: Calculate the cosine similarity between node UUID = 1 and the other nodes using the properties price, weight, width and height, define the algorithm results as an alias named cs, and return 2 results

    algo(cosine_similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height]
    }).stream() as cs 
    return cs limit 2
    

    Real-time Statistics

    This algorithm has no statistics.
