Change Password

Please enter the password.
Please enter the password. Between 8-64 characters. Not identical to your email address. Contain at least 3 of: uppercase, lowercase, numbers, and special characters.
Please enter the password.
Submit

Change Nickname

Current Nickname:
Submit

Apply New License

License Detail

Please complete this required field.

  • Ultipa Blaze (v4)

Standalone

Please complete this required field.

Please complete this required field.

Please complete this required field.

Please complete this required field.

Leave it blank if an HDC service is not required.

Please complete this required field.

Leave it blank if an HDC service is not required.

Please complete this required field.

Please complete this required field.

Mac addresses of all servers, separated by line break or comma.

Please complete this required field.

Please complete this required field.

Cancel
Apply
ID
Product
Status
Cores
Maximum Shard Services
Maximum Total Cores for Shard Service
Maximum HDC Services
Maximum Total Cores for HDC Service
Applied Validity Period(days)
Effective Date
Expired Date
Mac Address
Reason for Application
Review Comment
Close
Profile
  • Full Name:
  • Phone:
  • Company:
  • Company Email:
  • Country:
  • Language:
Change Password
Apply

You have no license application record.

Apply
Certificate Issued at Valid until Serial No. File
Serial No. Valid until File

Not having one? Apply now! >>>

Product Created On ID Amount (USD) Invoice
Product Created On ID Amount (USD) Invoice

No Invoice

v5.0
Search
    English
    v5.0

      CELF

      HDC

      Overview

      The CELF (Cost Effective Lazy Forward) algorithm selects seed nodes in a network to act as propagation sources and maximize the number of influenced nodes. This is known as Influence Maximization (IM), where 'influence' represents anything that can be spread across the network, such as contamination, information, disease, etc.

      CELF was proposed by Jure Leskovec et al. in 2007, it improves the traditional Greedy algorithm based on the IC model by taking advantage of the submodularity. It only calculates the spread score for all nodes only at the initial stage and does not recalculate for all nodes afterwards, hence cost-effective.

      Related materials of the algorithm:

      A typical application of the algorithm is to prevent epidemic outbreak by selecting a small group of people to monitor, so that any disease can be detected in an early stage.

      Concepts

      Spread Function - Independent Cascade

      This algorithm uses the Independent Cascade (IC) model to simulate influence propagation in the network. IC is a probabilistic model, it starts with a set of active seed nodes, and in step k:

      • For each node that becomes active in step k-1, it has a single chance to activate each inactive outgoing neighbor with a success probability.
      • The process runs until no more activations are possible.

      The spread of a given seed set is measured by the number of active nodes in the graph at the end of the process. This process is repeated many times (using Monte Carlo simulations), and the average spread is calculated.

      Submodularity

      The spread function IC() is called submodular as the marginal gain of a single node v is diminishing as the seed set S grows:

      where the seed set |Sk+1| > |Sk|, S ∪ {v} means to add node v into the seed set.

      Submodularity of the spread function is the key property exploited by CELF. CELF significantly improves the traditional Greedy algorithm that is used to solve the influence maximization problem, it runs a lot faster while achieving near optimal results.

      Lazy Forward

      At initialization, CELF, like the Greedy algorithm, computes the spread for each node and stores them in a list sorted by descending spread. As the seed set is empty now, the spread for each node can be viewed as its initial marginal gain.

      In the first iteration, the top node is moved from the list to the seed set.

      In the next iteration, only the marginal gain of the current top-ranked node is recalculated. After sorting, if that node remains at top, move it to the seed set; if not, repeat the process for the new top node.

      Unlike Greedy, CELF avoids calculating marginal gain for all the rest nodes in each iteration, this is where the submodularity of the spread function is considered - the marginal gain of every node in this round is always lower than the previous round. So if the top node remains at top, we can put it into the seed set directly without calculating for other nodes.

      The algorithm terminates when the seed set reaches the specified size.

      Example Graph

      To create this graph:

      // Runs each row separately in order in an empty graphset
      create().node_schema("account").edge_schema("follow")
      create().node_property(@account, "createdOn", datetime)
      insert().into(@account).nodes([{_id:"A", createdOn: "2016-12-3"}, {_id:"B", createdOn:"2022-1-30" }, {_id:"C", createdOn: "2019-11-8"}, {_id:"D", createdOn: "2019-1-16"}, {_id:"E", createdOn: "2016-3-4"}, {_id:"F", createdOn: "2019-11-10"}, {_id:"G", createdOn: "2019-7-26"}, {_id:"H", createdOn: "2016-7-11"}, {_id:"I", createdOn: "2018-12-13"},{_id:"J", createdOn: "2018-3-21"}])
      insert().into(@follow).edges([{_from:"A", _to:"B"}, {_from:"A", _to:"G"}, {_from:"B", _to:"F"}, {_from:"C", _to:"J"}, {_from:"D", _to:"J"}, {_from:"E", _to:"A"}, {_from:"F", _to:"C"}, {_from:"F", _to:"G"}, {_from:"G", _to:"H"}, {_from:"H", _to:"C"}, {_from:"H", _to:"E"}, {_from:"H", _to:"J"}, {_from:"I", _to:"B"}, {_from:"C", _to:"B"}])
      

      Creating HDC Graph

      To load the entire graph to the HDC server hdc-server-1 as hdc_celf:

      CALL hdc.graph.create("hdc-server-1", "hdc_celf", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      })
      

      hdc.graph.create("hdc_celf", {
        nodes: {"*": ["*"]},
        edges: {"*": ["*"]},
        direction: "undirected",
        load_id: true,
        update: "static",
        query: "query",
        default: false
      }).to("hdc-server-1")
      

      Parameters

      Algorithm name: celf

      Name
      Type
      Spec
      Default
      Optional
      Description
      seedSetSize Integer >0 1 Yes The size of the seed set.
      monteCarloSimulations Integer >0 1000 Yes The number of Monte Carlo simulations.
      propagationProbability Float (0,1) 0.1 Yes The probability that a node with activation capability successfully activates each of its outgoing neighbors in a given round.
      return_id_uuid String uuid, id, both uuid Yes Includes _uuid, _id, or both values to represent nodes in the results.

      File Writeback

      CALL algo.celf.write("hdc_celf", {
        params: {
          return_id_uuid: "id",
          seedSetSize: 3,
          monteCarloSimulations: 1000,
          propagationProbability: 0.5
        },
        return_params: {
          file: {
            filename: "seeds"
          }
        }
      })
      

      algo(celf).params({
        projection: "hdc_celf",
        return_id_uuid: "id",
        seedSetSize: 3,
        monteCarloSimulations: 1000,
        propagationProbability: 0.5
      }).write({
        file: {
          filename: "seeds"
        }
      })
      

      Result:

      _id,spread
      H,3.612
      I,1.673
      A,1.353
      

      Full Return

      CALL algo.celf("hdc_celf", {
        params: {
          return_id_uuid: "id",    
          seedSetSize: 2,
          monteCarloSimulations: 1000,
          propagationProbability: 0.6
        },
        return_params: {}
      }) YIELD seeds
      RETURN seeds
      

      exec{
        algo(celf).params({
          return_id_uuid: "id",    
          seedSetSize: 2,
          monteCarloSimulations: 1000,
          propagationProbability: 0.6
        }) as seeds
        return seeds
      } on hdc_celf
      

      Result:

      _id spread
      H 4.466
      I 1.714

      Stream Return

      CALL algo.celf("hdc_celf", {
        params: {
          return_id_uuid: "id",
          seedSetSize: 3,
          monteCarloSimulations: 1000,
          propagationProbability: 0.6
        },
        return_params: {
        	stream: {}
        }
      }) YIELD seeds
      MATCH (nodes WHERE nodes._id = seeds._id)
      RETURN nodes._id, nodes.createdOn, seeds.spread
      

      exec{
        algo(celf).params({
          return_id_uuid: "id",
          seedSetSize: 3,
          monteCarloSimulations: 1000,
          propagationProbability: 0.6 
        }).stream() as seeds
        find().nodes({_id == seeds._id}) as nodes
        return nodes._id, nodes.createdOn, seeds.spread
      } on hdc_celf
      

      Result:

      nodes._id nodes.createdOn seeds.spread
      H 2016-07-11 00:00:00 4.466
      I 2018-12-13 00:00:00 1.714
      A 2016-12-03 00:00:00 1.168
      Please complete the following information to download this book
      *
      公司名称不能为空
      *
      公司邮箱必须填写
      *
      你的名字必须填写
      *
      你的电话必须填写