Change Password

Please enter the password.
Please enter the password. Between 8-64 characters. Not identical to your email address. Contain at least 3 of uppercase, lowercase, numbers, and special characters (such as @*&#).
Please enter the password.
Submit

Change Nickname

Current Nickname:
Submit

v4.2
Search
中文EN
v4.2

    K-Hop All

    Overview

    K-Hop Whole Graph algorithm calculates the number of K-hop neighbors for each node and returns those neighbors. The algorithm is widely used in scenarios such as relationship discovery, impact prediction and friend suggestion and so on.

    K-Hop Whole Graph can be viewed as the batch execution of UQL command khop(). Although Ultipa has optimized the K-Hop Whole Graph algorithm for high concurrency performance, massive system resources can still be consumed by the algorithm in large graphs (over tens of millions nodes and edges) or graphs with many super nodes. Therefore, it is recommended to avoid performing K-Hop Whole Graph calculation that is too deep.

    In a graph which nodes/edges = 100, the theoretical computation amount required to query 5-hop neighbors of a node is 105 (10 billion computational complexity) and it would take 100ms; then 1 million seconds (12 days) are needed to complete such query in a graph of 10 million nodes.

    Basic Concept

    BFS

    BFS is short for Breadth First Search, together with DFS (Depth First Search), are two the most basic traversal principles. K-Hop query follows BFS principle, which means that when traversing the whole graph from one node, it first traverses all the neighbor nodes of the current node that have not arrived yet, and then traverses the neighbor nodes of these neighbor nodes in turns that have not arrived yet.

    Starting from the red node and traversing by the BFS principle, the yellow, green, blue nodes are found in succession. Color of these nodes is classified by the distance to the start node, that is, the number of edges of the shortest path(s) from the start node to them. This distance is what the K means in K-Hop query, which is the K-hop neighbors of the start node.

    Once the start node is given, K values of the other nodes in the graph are also determined. In other words, if a node is the 5th level neighbor of the start node, then it is impossible to appear in the 4th, 6th or other levels.

    Direction of K-Hop

    K-Hop query in directed graph can be restrained by the direction of the edge where the neighbor is located, this is when some nodes do not have K values and not all nodes are found in the traversal result.

    Starting from the red node in the directed graph on the left side, and traversing by the outbound edges and inbound edges; two distinct traversal results are obtained, the purple node does not belong to any level when traversing in the inbound direction and its K value does not exist.

    Special Case

    Lonely Node, Disconnected Graph

    Lonely node will not appear in any K-Hop query result.

    For disconnected graph, only nodes belong to the same connected component with the start node are involved in the K-Hop query result, while other nodes cannot be traversed.

    Self-loop Edge

    Each node is traversed only once in K-Hop query, so the self-loop edge of node can be considered invalid.

    Directed Edge

    When the K-Hop query is performed without specifying the direction of edges, the direction of edges is ignored, and certain K values are calculated for all nodes in the connected component where that start node is located.

    Results and Statistics

    Take the transaction graph below as an example, the color of card represent its level, run the K-Hop Whole Graph algorithm against all nodes:

    Algorithm results: Calculate 1-hop neighbors for each node and count the maximum level of the results, return _id, value or _uuid, value according to the execution method, column value includes the aggregative counting result(s) and the number of neighbors

    _uuid _id value
    1 Card1 3, 3
    2 Card2 3, 3
    3 Card3 2, 3
    4 Card4 3, 1
    5 Card5 2, 2
    6 Card6 , 0

    Algorithm statistics: N/A

    Command and Configuration

    • Command: algo(khop_all)
    • Configurations for the parameter params():
    Name Type
    Default
    Specification
    Description
    ids / uuids []_id / []_uuid / / IDs or UUIDs of nodes to be calculated; all nodes to be calculated if not set
    k_start int 1 >= 1 The minimum K-Hop query depth, the query depth is [k_start, k_end], k_endk_start
    k_end int 1 >= 1 The maximum K-Hop query depth, the query depth is [k_start, k_end], k_endk_start
    src_include int 0 0 or 1 Include the start node in the result or not, 1 means to include the start node, 0 or not set means to exclude the start node
    direction string / in/out, case insensitive Edge direction in the path; direction ignored if not set
    node_property []@<schema>?.<property> / Numeric node property, LTE needed Node property/properties to be aggregatively counted, schema can be either carried or not; must be used with aggregate_opt; results without any specified property will not be aggregatively counted
    aggregate_opt []string / max / min / mean / sum / var / dev, case insensitive The method that each specified property of the results to be aggregatively counted; must be used with node_property and one property corresponds to one aggregative counting method; max means the maximum, min means the minimum, mean means the average, sum means the sum, var means the variance, and dev means the standard deviation
    limit int -1 >=-1 Number of results to return; return all results if sets to -1 or not set

    Example: Calculate the number of 2-hop neighbors in the outbound direction of all nodes

    algo(khop_all).params({ 
      k_start: 2,
      k_end: 2,
      direction: "out"
    }) as k2
    return k2
    

    Example: Calculate 2-hop and 3-hop neighbors of nodes UUID = 3,4,5 and aggregatively count the maximum value of property @card.level and sum of property @card.balance of the results, also include the start node (UUID = 3,4,5) in each node's results

    algo(khop_all).params({ 
      uuids: [3,4,5],
      k_start: 2,
      k_end: 3,
      src_include: 1,
      node_property: [@card.level, @card.balance],
      aggregate_opt: ["max", "sum"]
    }) as k2
    return k2
    

    Algorithm Execution

    Task Writeback

    1. File Writeback

    Configuration Data in Each Row
    Description
    filename_ids _id,_id Each result takes up one line, the first _id in each line represents the node that is calculated, the second _id represents the neighbor of the calculated node
    filename _id,aggregate_result1,...,aggregate_resultN,count _id represents the node that is calculated, the next is/are the one or multiple aggregative counting result(s) (aggregate_result1 ~ aggregate_resultN), the last count is the number of neighbors

    Example: Calculate 2-hop and 3-hop neighbors of all nodes and aggregatively count the maximum value of property @card.level and sum of property @card.balance of the results, write the IDs of neighbors of each node back to file named neighbors, write the aggregative counting results and the number of neighbors of each node back to file named counts

    algo(khop_all).params({ 
      k_start: 3,
      k_end: 3,
      node_property: [@card.level, @card.balance],
      aggregate_opt: ["max", "sum"]
    }).write({
      file:{
        filename_ids: "neighbors",
        filename: "counts"
      }
    })
    

    2. Property Writeback

    Configuration Writeback Content Type Data Type
    property Number of neighbors Node property double

    Example: Calculate 2-hop neighbors of all nodes, write the number of neighbors back to node property named khop2

    algo(khop_all).params({ 
      k_start: 2,
      k_end: 2
    }).write({
      db:{ 
        property: "khop2"
      }
    })
    

    3. Statistics Writeback

    This algorithm has no statistics.

    Direct Return

    Alias Ordinal Type
    Description
    Column Name
    0 []perNode Node and its aggregative counting result(s) and number of neighbors _uuid, value

    Example: Calculate 3-hop neighbors of all nodes and aggregatively count the maximum value of property @card.level and sum of property @card.balance of the results, define algorithm results as alias named nei, and return the results

    algo(khop_all).params({ 
      k_start: 3,
      k_end: 3,
      node_property: [@card.level, @card.balance],
      aggregate_opt: ["max", "sum"]
    }) as nei
    return nei
    

    Streaming Return

    Alias Ordinal Type
    Description
    Column Name
    0 []perNode Node and its aggregative counting result(s) and number of neighbors _uuid, value

    Example: Calculate 1-hop to 3-hop neighbors of node UUID = 3 and aggregatively count the maximum value of property @card.level of the results, and return only the aggregative counting result and the number of neighbors

    algo(khop_all).params({
       uuids: [3],
       k_start: 1,
       k_end: 3,
       node_property: @card.level,
       aggregate_opt: max
    }).stream() as results 
    return results.value
    

    Real-time Statistics

    This algorithm has no statistics.

    Please complete the following information to download this book
    *
    公司名称不能为空
    *
    公司邮箱必须填写
    *
    你的名字必须填写
    *
    你的电话必须填写
    *
    你的电话必须填写