Change Password

Please enter the password.
Please enter the password. Between 8-64 characters. Not identical to your email address. Contain at least 3 of: uppercase, lowercase, numbers, and special characters.
Please enter the password.
Submit

Change Nickname

Current Nickname:
Submit

Apply New License

License Detail

Please complete this required field.

  • Ultipa Blaze (v4)

Standalone

Please complete this required field.

Please complete this required field.

Please complete this required field.

Please complete this required field.

Leave it blank if an HDC service is not required.

Please complete this required field.

Leave it blank if an HDC service is not required.

Please complete this required field.

Please complete this required field.

Mac addresses of all servers, separated by line break or comma.

Please complete this required field.

Please complete this required field.

Cancel
Apply
ID
Product
Status
Cores
Maximum Shard Services
Maximum Total Cores for Shard Service
Maximum HDC Services
Maximum Total Cores for HDC Service
Applied Validity Period(days)
Effective Date
Expired Date
Mac Address
Reason for Application
Review Comment
Close
Profile
  • Full Name:
  • Phone:
  • Company:
  • Company Email:
  • Country:
  • Language:
Change Password
Apply

You have no license application record.

Apply
Certificate Issued at Valid until Serial No. File
Serial No. Valid until File

Not having one? Apply now! >>>

Product Created On ID Amount (USD) Invoice
Product Created On ID Amount (USD) Invoice

No Invoice

v5.0
Search
    English
    v5.0

      TextRank

      HDC

      Overview

      TextRank, derived from PageRank, is a graph-based ranking model for text processing. It can be used for various natural language processing tasks, including keyword extraction, keyphrase extraction, and text summarization.

      Concepts

      Text as a Graph

      To apply the TextRank algorithm, the text must first be represented as a graph .The structure of the graph depends on the specific application:

      • Nodes: Text units that best fit the task, such as words, collocations, or sentences, are added as nodes in the graph.
      • Edges: Relationships between text units, such as semantic similarity, co-occurrence, or contextual overlap, are used to connect nodes with edges.

      Sample graph build for keyphrase extraction: nodes are selected lexical units from the text, and edges are established based on co-occurrence within a defined window of words (Source: Original paper)

      TextRank Model

      TextRank computes the ranks of all text units recursively using a "recommendation" mechanism, similar to the PageRank algorithm. It incorporates edge weights through a modified formula that integrates them effectively:

      where,

      • Out(v) is the set of nodes that node v points to;
      • wvu is the edge weight between nodes v and u;
      • d is the damping factor.

      Considerations

      • The rank of isolated text unit will stay the same as the value of (1 - d).
      • A self-loop acts as both a successor and a predecessor, meaning a node can pass some rank to itself. If a network has many self-loops, it will take more iterations to converge.

      Example Graph

      To create this graph:

      // Runs each row separately in order in an empty graphset
      create().edge_property(@default, "weight", int32)
      insert().into(@default).nodes([{_id:"A"}, {_id:"B"}, {_id:"C"}, {_id:"D"}, {_id:"E"}, {_id:"F"}, {_id:"G"}])
      insert().into(@default).edges([{_from:"A", _to:"E", weight:3}, {_from:"B", _to:"A", weight:3}, {_from:"B", _to:"E", weight:2}, {_from:"C", _to:"A", weight:1}, {_from:"C", _to:"D", weight:4}, {_from:"D", _to:"E", weight:5}, {_from:"E", _to:"G", weight:2}, {_from:"F", _to:"B", weight:1}, {_from:"F", _to:"G", weight:3}])
      

      Creating HDC Graph

      To load the entire graph to the HDC server hdc-server-1 as hdc_textrank:

      CALL hdc.graph.create("hdc-server-1", "hdc_textrank", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      })
      

      hdc.graph.create("hdc_textrank", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      }).to("hdc-server-1")
      

      Parameters

      Algorithm name: text_rank

      Name
      Type
      Spec
      Default
      Optional
      Description
      init_value Float >0 0.2 Yes The initial rank assigned to all nodes.
      loop_num Integer ≥1 5 Yes The maximum number of iteration rounds. The algorithm terminates after all iterations are completed.
      damping Float (0,1) 0.8 Yes The damping factor.
      max_change Float ≥0 0 Yes The algorithm terminates when the changes in all ranks between iterations are less than the specified max_change, indicating that the result is stable. Sets to 0 to disable this criterion.
      edge_schema_property []"<@schema.?><property>" / / No Numeric edge properties as weights, summing values across the specified properties; edges without the specified properties are ignored.
      return_id_uuid String uuid, id, both uuid Yes Includes _uuid, _id, or both values to represent nodes in the results.
      limit Integer ≥-1 -1 Yes Limits the number of results returned; -1 includes all results.
      order String asc, desc / Yes Sorts the results by rank.

      File Writeback

      CALL algo.text_rank.write("hdc_textrank", {
        params: {
          return_id_uuid: "id",
          init_value: 1,
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc'
        },
        return_params: {
          file: {
            filename: "textrank"
          }
        }
      })
      

      algo(text_rank).params({
        projection: "hdc_textrank",
        return_id_uuid: "id",
        init_value: 1,
        loop_num: 50,
        damping: 0.8,
        edge_schema_property: "weight",
        order: 'desc'
      }).write({
        file: {
          filename: "textrank"
        }
      })
      

      Result:

      _id,text_rank
      G,0.973568
      E,0.81696
      A,0.3472
      D,0.328
      B,0.24
      F,0.2
      C,0.2
      

      DB Writeback

      Writes the text_rank values from the results to the specified node property. The property type is double.

      CALL algo.text_rank.write("hdc_textrank", {
        params: {
          loop_num: 50,
          edge_schema_property: "@default.weight"
        },
        return_params: {
          db: {
            property: "rank"
          }
        }
      })
      

      algo(text_rank).params({
        projection: "hdc_textrank",
        loop_num: 50,
        edge_schema_property: "@default.weight"
      }).write({
        db:{ 
          property: 'rank'
        }
      })
      

      Full Return

      CALL algo.text_rank("hdc_textrank", {
        params: {
          return_id_uuid: "id",    
          init_value: 1,
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        },
        return_params: {}
      }) YIELD TR
      RETURN TR
      

      exec{
        algo(text_rank).params({
          return_id_uuid: "id",    
          init_value: 1,
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        }) as TR
        return TR
      } on hdc_textrank
      

      Result:

      _id text_rank
      G 0.973568
      E 0.81696
      A 0.3472
      D 0.328
      B 0.24

      Stream Return

      CALL algo.text_rank("hdc_textrank", {
        params: {
          return_id_uuid: "id",
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        },
        return_params: {
        	stream: {}
        }
      }) YIELD TR
      RETURN TR
      

      exec{
        algo(text_rank).params({
          return_id_uuid: "id",
          loop_num: 50,
          damping: 0.8,
          edge_schema_property: "weight",
          order: 'desc',
          limit: 5
        }).stream() as TR
        return TR
      } on hdc_textrank
      

      Result:

      _id text_rank
      G 0.973568
      E 0.81696
      A 0.3472
      D 0.328
      B 0.24
      Please complete the following information to download this book
      *
      公司名称不能为空
      *
      公司邮箱必须填写
      *
      你的名字必须填写
      *
      你的电话必须填写