v4.0

    Pearson Correlation Coefficient

    Overview

    The Pearson correlation coefficient measures the linear correlation between two variables. To calculate it between two nodes in a graph, N numeric properties of each node are used to form two N-dimensional vectors.

    Basic Concept

    Vector

    The vector is a basic concept in advanced mathematics, and vectors in low-dimensional spaces are relatively easy to understand and express. The following diagram shows vectors A and B relative to the coordinate axes in 2- and 3-dimensional spaces, as well as the angle θ between them:

    When comparing two nodes in a graph, N properties of each node are used to form two N-dimensional vectors, one per node.
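
As a rough illustration of how N numeric properties become an N-dimensional vector, the Python sketch below reads the chosen properties of each node in a fixed order. The node dictionaries, their property values and the `to_vector` helper are hypothetical; they are not part of the algorithm's interface and are not the data used elsewhere on this page.

```python
from typing import Dict, List, Sequence


def to_vector(node: Dict[str, float], properties: Sequence[str]) -> List[float]:
    """Return the node's values for the given properties, in the given order."""
    return [float(node[name]) for name in properties]


# Hypothetical nodes with made-up property values.
node_a = {"price": 50, "weight": 160, "width": 20, "height": 152}
node_b = {"price": 42, "weight": 90, "width": 30, "height": 90}

# The same property order must be used for both nodes so dimensions line up.
properties = ["price", "weight", "width", "height"]
vec_a = to_vector(node_a, properties)
vec_b = to_vector(node_b, properties)
print(vec_a)
print(vec_b)
```

Using one shared property list for both nodes keeps each dimension of the two vectors tied to the same property.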

    Pearson Correlation Coefficient

    The Pearson correlation coefficient takes values in the range [-1, 1]. Let r denote the Pearson correlation coefficient; then:

    • r > 0 indicates a positive correlation, i.e. as one variable becomes larger, the other tends to become larger;
    • r < 0 indicates a negative correlation, i.e. as one variable becomes larger, the other tends to become smaller;
    • r = 1 or r = -1 indicates that the two variables can be described exactly by a linear equation, i.e. all data points fall on the same line;
    • r = 0 indicates that there is no linear correlation (although other kinds of correlation may exist).

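    The Pearson correlation coefficient is defined as the covariance of the two variables divided by the product of their standard deviations. For two N-dimensional vectors X = (x_1, ..., x_N) and Y = (y_1, ..., y_N) with means \bar{x} and \bar{y}, it is calculated as:

    $$ r = \frac{\operatorname{cov}(X, Y)}{\sigma_X \sigma_Y} = \frac{\sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{N} (x_i - \bar{x})^2} \, \sqrt{\sum_{i=1}^{N} (y_i - \bar{y})^2}} $$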
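
The quotient of covariance and standard deviations can be written out directly. The function below is a minimal Python sketch of that definition, not the implementation used by the algorithm; the name `pearson` and the choice to raise an error for zero-variance vectors are assumptions made for illustration.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("both vectors need the same length, at least 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of centered products. The 1/n factors of the covariance and of the
    # two standard deviations cancel, so they are left out.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Assumption: treat a constant vector as an error, since r is undefined.
        raise ValueError("r is undefined when a vector has zero variance")
    return sxy / math.sqrt(sxx * syy)


print(pearson([1, 2, 3, 4], [2, 4, 5, 9]))
```

A vector whose components are all equal has zero standard deviation, which makes the denominator zero; the sketch rejects that input rather than returning a value.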

    Special Case

    Lonely Node, Disconnected Graph

    The Pearson correlation coefficient between two nodes is calculated from node properties only, so it does not depend on the existence of edges in the graph. Whether the two nodes are lonely (isolated) nodes, and whether they are in the same connected component, has no effect on the result.

    Self-loop Edge

    Self-loop edges have no effect, because the calculation of the Pearson correlation coefficient does not involve edges.

    Directed Edge

    Edge direction has no effect, because the calculation of the Pearson correlation coefficient does not involve edges.

    Results and Statistics

    The graph below has 4 product nodes (edges are ignored); the properties price, weight, width and height are used to form the vectors:

    Algorithm results: The Pearson correlation coefficient is calculated between product 1 and each of the other 3 products; node1, node2 and similarity are returned

    node1 node2 similarity
    1 2 0.9987851216012547
    1 3 0.4743838031328631
    1 4 0.21049415016958328

    Algorithm statistics: N/A

    Command and Configuration

    • Command: algo(similarity)
    • Configurations for the parameter params():
    Name | Type | Default | Specification | Description
    ids / uuids | []_id / []_uuid | / | Mandatory | IDs or UUIDs of the first set of nodes to be calculated; only one of the two needs to be configured
    ids2 / uuids2 | []_id / []_uuid | / | Mandatory | IDs or UUIDs of the second set of nodes to be calculated; only one of the two needs to be configured
    node_schema_property | []@<schema>?.<property> | / | Numeric node property, LTE needed; at least two properties are required | Node properties that form the dimensions of the vector
    type | string | cosine | jaccard / overlap / cosine / pearson / euclideanDistance / euclidean | Similarity measure: jaccard calculates Jaccard similarity, overlap calculates overlap similarity, cosine calculates cosine similarity, pearson calculates the Pearson correlation coefficient, euclideanDistance calculates Euclidean distance, euclidean calculates normalized Euclidean distance
    limit | int | -1 | >=-1 | Number of node pairs (uuids × uuids2) to return; all results are returned if set to -1 or not set

    Example: Calculate the Pearson correlation coefficient between nodes with UUID = 1, 2 and nodes with UUID = 3, 4 using the properties price and weight

    algo(similarity).params({
      uuids: [1,2],
      uuids2: [3,4],
      node_schema_property: [price, weight],
      type: "pearson"
    }) as p
    return p
    

    Algorithm Execution

    Task Writeback

    1. File Writeback

    Configuration | Data in Each Row
    filename | node1,node2,similarity

    Example: Calculate the Pearson correlation coefficient between the node with UUID = 1 and the other nodes using the properties price, weight, width and height, and write the algorithm results back to a file named pearson

    algo(similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height],
      type: "pearson"
    }).write({
      file:{ 
        filename: "pearson"
      }
    })
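
Once results have been written back, each row carries node1, node2 and similarity. The Python sketch below shows one way such rows might be read and ranked. It assumes the rows are comma-separated with no header line, which this page does not confirm, and it reads from an in-memory string holding the result values listed in Results and Statistics instead of a real file; `read_similarity_rows` is a hypothetical helper.

```python
import csv
import io
from typing import Iterable, List, Tuple

# Rows in the node1,node2,similarity layout, copied from the results table.
sample = (
    "1,2,0.9987851216012547\n"
    "1,3,0.4743838031328631\n"
    "1,4,0.21049415016958328\n"
)


def read_similarity_rows(handle: Iterable[str]) -> List[Tuple[int, int, float]]:
    """Parse node1,node2,similarity rows (assumed comma-separated, no header)."""
    rows = []
    for node1, node2, similarity in csv.reader(handle):
        rows.append((int(node1), int(node2), float(similarity)))
    return rows


rows = read_similarity_rows(io.StringIO(sample))
# Pick the pair with the highest Pearson correlation coefficient.
best = max(rows, key=lambda row: row[2])
print(best)
```

To read an actual writeback file, the `io.StringIO(sample)` argument would be replaced by an opened file handle, adjusted to whatever delimiter and header the file really uses.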
    

    2. Property Writeback

    Not supported by this algorithm.

    3. Statistics Writeback

    This algorithm has no statistics.

    Direct Return

    Alias Ordinal | Type | Description | Column Name
    0 | []perNodePair | Node pair and its similarity | node1, node2, similarity

    Example: Calculate the Pearson correlation coefficient between the node with UUID = 1 and the other nodes using the properties price, weight, width and height, assign the algorithm results to the alias similarity, and return the results

    algo(similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height],
      type: "pearson"
    }) as similarity 
    return similarity
    

    Streaming Return

    Alias Ordinal | Type | Description | Column Name
    0 | []perNodePair | Node pair and its similarity | node1, node2, similarity

    Example: Calculate the Pearson correlation coefficient between the node with UUID = 1 and the other nodes using the properties price, weight, width and height, assign the algorithm results to the alias similarity, and return 2 results

    algo(similarity).params({
      uuids: [1], 
      uuids2: [2,3,4],
      node_schema_property: [price,weight,width,height],
      type: "pearson"
    }).stream() as similarity 
    return similarity limit 2
    

    Real-time Statistics

    This algorithm has no statistics.
