Change Password

Please enter the password.
Please enter the password. Between 8-64 characters. Not identical to your email address. Contain at least 3 of: uppercase, lowercase, numbers, and special characters.
Please enter the password.
Submit

Change Nickname

Current Nickname:
Submit

Apply New License

License Detail

Please complete this required field.

  • Ultipa Graph V4

Standalone

Please complete this required field.

Please complete this required field.

The MAC address of the server you want to deploy.

Please complete this required field.

Please complete this required field.

Cancel
Apply
ID
Product
Status
Cores
Applied Validity Period(days)
Effective Date
Excpired Date
Mac Address
Apply Comment
Review Comment
Close
Profile
  • Full Name:
  • Phone:
  • Company:
  • Company Email:
  • Country:
  • Language:
Change Password
Apply

You have no license application record.

Apply
Certificate Issued at Valid until Serial No. File
Serial No. Valid until File

Not having one? Apply now! >>>

Product Created On ID Amount (USD) Invoice
Product Created On ID Amount (USD) Invoice

No Invoice

Search
    English

      Fast Random Projection

      Overview

      Fast Random Projection, or FastRP in short, is a graph embedding algorithm that is observably faster than the traditional methods such as random walk. It allows the iterative computation of node embeddings so that the similarity matrix need not be explicitly constructed, which speeds up the algorithm. FastRP was proposed by H. Chen et al. in 2019.

      Related materials of the algorithm:

      FastRP Framework

      FastRP algorithm is a procedure with two components: first, similarity matrix construction, then dimension reduction by random projection to get the embeddinng result.

      Similarity Matrix Construction

      FastRP keeps two key features when constructing node similarity matrix:

      1. Preserve high-order proximity in the graph
      2. Perform element-wise normalization on the matrix

      Let S be the adjacency matrix of graph G =(V,E), D be the degree matrix, and A be the transition matrix and A = D-1S. See the example below, the number on each edge represents the edge weight:

      However, directly applying dimension reduction techniques on S or A is problematic. Real-world graphs are usually extremely sparse, which means most of the entries in S or A are 0. However, the absence of edge between two nodes does not imply that there is no association between them.

      The fact is that in matrix A, entry Aij stands for the probablity of node i reaching to node j with 1-step random walk. Furthermore, if raises A to k-th power to get matrix Ak, then Akij denotes the probability of reaching node j from node i in k steps of random walk. Hence, matrix Ak perserves high-order proximity of the graph.

      Now consider the normalization on matrix Ak. First, many Ak of the real-world has a heavy-tailed distribution, normalization is important to balance the influene of node degrees. In addition, it can be proved that Akij → dj/2m when k → ∞, where m = |E| is the number of edges in the graph, and dj is the degree of the j-th node.

      Let's construct a normalization diagonal matrix L and entry Lij = (dj/2m)β, where β is the normalization strength. Normalization strength controls the impact of node degree on the embeddings: when β is negative, the importance of high-degree neighbors is weakened, and the opposite when β is positive.

      Matrix Dimensionality Deduction

      Johnson–Lindenstrauss Lemma

      FastRP is theoretically backed by Johnson–Lindenstrauss Lemma (or JL Lemma for short) originally proposed by W. Johnson and J. Lindenstrauss in 1984, after continued research by later generations, it is now generally accepted as:

      A node set M in a high-dimensional Euclidean space can be embeded into a low-dimensional space via a linear projection R and approximately preserve pairwise distances among nodes, i.e.:

      where 0 < ϵ < 1 is an acceptable distance error, the embedding dimension d is irrelevant to the dimension of the original node set, but depends on the size of the node set n = |M| and ϵ:

      Random Projection

      Random Projection is a simple but efficient approach of dimensionality reduction:

      1. A n×m dataset M as input, where n is the number of elements, m is the original dimensionality (number of features)
      2. Construct a random projection matrix R of m×d, where d is the dimensionality that intend to reduce to (d ≪ n)
      3. Get the low-dimensional matrix N by simply calculating N = M × R, and the final dimensionality would be n×d.

      Dimensionality d depends on the dataset's size n, not on the original dimensionality m. In FastRP algorithm, for graph data, we have m = n.

      The difference among different random projection algorithms ismostly in the construction of R. FastRP adopts Very Sparse Random Projection, proposed by P. Li et al.:

      where s=sqrt(m).

      Algorithm Process

      The process of FastRP algorithm are as followed:

      1. Generate the random projection matrix R
      2. First iteration:generate dimensionality reduction matrix N1 = A ⋅ L ⋅ R
      2. Second iteration: generate dimensionality reduction matrix N2 = A2 ⋅ L ⋅ R = A ⋅ N1
      3. Iterate for k times: generate dimensionality reduction matrix Nk = A ⋅ Nk-1
      4. Generate the final embedding results, i.e., N = α1N1 + ... + αkNk, where α1,...,αk are the iteration weights.

      Command and Configuration

      • Command:algo(fastRP)
      • Configurations for the parameter params():
      Name Type
      Default
      Specification
      Description
      dimension int / >=1 Dimension of node embedding vector
      normalizationStrength float / / Normalization strength β
      iterationWeights []float [0.0,1.0,1.0] >0 Weight for each iteration, the size of the array is the number of iterations
      edge_schema_property []@<schema>?.<property> / Numeric edge property, LTE needed Edge weight property/properties, schema can be either carried or not; edge without any specified property does not participate in the calculation
      node_schema_property []@<schema>?.<property> / Numeric node property, LTE needed Node property/properties to be used as features, schema can be either carried or not; node without any specified property does not participate in the calculation
      propertyDimension int / (0,dimension] Random projection matrix R is formed by concatenation of two parts: the first part is formed by the Very Sparse Random Projection algorithm, the second part of length propertyDimension is a linear combination of the property vectors, using the property values of the node as weights
      limit int -1 -1 or >=0 Number of results to return; return all results if sets to -1 or not set

      Algorithm Execution

      Task Writeback

      1. File Writeback

      Configuration
      Data in Each Column
      filename _id, embedding_result

      2. Property Writeback

      Not supported by this algorithm.

      3. Statistics Writeback

      This algorithm has no statistics.

      Direct Return

      Alias Ordinal Type
      Description
      Column Name
      0 []perNode Node and its graph embedding results _uuid, embedding_result

      Streaming Return

      Alias Ordinal Type
      Description
      Column Name
      0 []perNode Node and its graph embedding results _uuid, embedding_result

      Real-time Statistics

      This algorithm has no statistics.

      Please complete the following information to download this book
      *
      公司名称不能为空
      *
      公司邮箱必须填写
      *
      你的名字必须填写
      *
      你的电话必须填写