UltipaDocs
Try Playground
  • Introduction
    • Show Algorithms
    • Install and Uninstall
    • Run Algorithms
    • Algorithm Results and Statistics
    • Degree Centrality
    • Closeness Centrality
    • Harmonic Centrality
    • Graph Centrality
    • Betweenness Centrality
    • Eigenvector Centrality
    • CELF
    • PageRank
    • ArticleRank
    • HITS
    • SybilRank
    • Jaccard Similarity
    • Overlap Similarity
    • Cosine Similarity
    • Pearson Correlation Coefficient
    • Euclidean Distance
    • K-Hop All
    • Bipartite Graph
    • HyperANF
    • Connected Component
    • Triangle Counting
    • Induced Subgraph
    • k-Core
    • k-Truss
    • p-Cohesion
    • k-Edge Connected Components
    • Local Clustering Coefficient
    • Topological Sort
    • Schema Overview
    • Dijkstra's Single-Source Shortest Path
    • Delta-Stepping Single-Source Shortest Path
    • Shortest Path Faster Algorithm (SPFA)
    • Minimum Spanning Tree
    • Breadth-First Search (BFS)
    • Depth-First Search (DFS)
    • Adamic-Adar Index
    • Common Neighbors
    • Preferential Attachment
    • Resource Allocation
    • Total Neighbors
    • Louvain
    • Leiden
    • Label Propagation
    • HANP
    • k-Means
    • kNN (k-Nearest Neighbors)
    • K-1 Coloring
    • Conductance
      • Random Walk
      • Node2Vec Walk
      • Node2Vec
      • Struc2Vec Walk
      • Struc2Vec
      • GraphSAGE
      • GraphSAGE Train
      • LINE
      • Fast Random Projection
      • Summary of Graph Embedding
      • Gradient Descent
      • Backpropagation
      • Skip-gram
      • Skip-gram Optimization
  1. Docs
  2. /
  3. Graph Analytics & Algorithms
  4. /
  5. Centrality

HITS

✓ File Writeback ✓ Property Writeback ✓ Direct Return ✓ Stream Return ✕ Stats

Overview

The HITS (Hyperlink-Induced Topic Search) algorithm was developed by L.M. Kleinberg in 1999 with the purpose of improving the quality of search methods on the World Wide Web (WWW). HITS makes use of the mutual reinforcing relationship between authorities and hubs to evaluate and rank a set of linked entities.

  • L.M. Kleinberg, Authoritative Sources in a Hyperlinked Environment (1999)

Concepts

Authority and Hub

In WWW, hyperlinks represent some latent human judgment: the creator of page p, by including a link to page q, has in some measure conferred authority on q. Instructively, a node with large in-degree is viewed as an authority.

If a node points to considerable number of authoritative nodes, it is referred to as a hub.

As illustrated in the graph below, red nodes are good authorities, green nodes are good hubs.

Hubs and authorities exhibit what could be called a mutually reinforcing relationship: a good hub points to many good authorities; a good authority is pointed to by many good hubs.

Compute Authorities and Hubs

HITS algorithm operates on the whole graph iteratively to compute the authority weight (denoted as x) and hub weight (denoted as y) for each node through the link structure. Nodes with larger x-values and y-values are viewed as better authorities and hubs respectively.

In a directed graph G = (V, E), all nodes are initialized with x = 1 and y = 1. In each iteration, for each node p ∈ V, update its x and y values as follows:

Here is an example:

At the end of one iteration, normalize all x values and all y values to meet the invariant below:

The algorithm continues until the change of all x values and y values converges to within some tolerance, or the maximum iteration rounds is met. In the experiments of the original author, the convergence is quite rapid, 20 iterations are normally sufficient.

Considerations

  • In HITS algorithm, self-loops are ignored.
  • Authority weight of nodes with no in-links is 0, hub weight of nodes with out-links is 0.

Syntax

  • Command: algo(hits_centrality)
  • Parameters:
Name
Type
Spec
Default
Optional
Description
max_loop_numint>=120YesMaximum rounds of iterations; the algorithm ends after running for all rounds, even though the condition of tolerance is not met
tolerancefloat(0,1)0.001YesWhen all authority weights and hub weights change less than the tolerance between iterations, the result is considered stable and the algorithm ends
limitint≥-1-1YesNumber of results to return, -1 to return all results

Examples

The example graph is as follows:

File Writeback

SpecContent
filename_id,authority,hub
UQL
algo(hits_centrality).params({}).write({
  file: {
    filename: 'rank'
  }
})

Results: File rank

File
H,0.000000,0.000000
G,0.213196,0.190701
F,0.426420,0.000000
E,0.000000,0.476726
D,0.000000,0.572083
C,0.000000,0.476726
B,0.213196,0.381382
A,0.852796,0.190701

Property Writeback

SpecContentWrite toData Type
authorityauthorityNode propertydouble
hubhubNode propertydouble
UQL
algo(hits_centrality).params({
  max_loop_num: 20,
  tolerance: 0.0001
}).write({
  db: {
    authority: 'auth',
    hub: 'hub'
  }
})

Results: Authority weight for each node is written to a new property named auth, hub weight for each node is written to a new property named hub

Direct Return

Alias Ordinal
Type
DescriptionColumns
0[]perNodeNode and its authority and hub weight_uuid, authority, hub
UQL
algo(hits_centrality).params() as rank
return rank

Results: rank

_uuidauthorityhub
83.20199049138017e-110
70.2131964440937410.190700611234451
60.4264195300291661.43197368054726e-11
500.476726292571473
400.572082555485605
300.476726292571473
20.2131964440937410.381381944251153
10.8527959526529630.190700611234451

Stream Return

Alias Ordinal
Type
DescriptionColumns
0[]perNodeNode and its authority and hub weight_uuid, authority, hub
UQL
algo(hits_centrality).params({
  max_loop_num: 20,
  tolerance: 0.0001
}).stream() as rank
find().nodes({_uuid == rank._uuid}) as nodes
order by rank.hub desc
return table(nodes._id, rank.hub)

Results: table(nodes._id, rank.hub)

nodes._idrank.hub
D0.572082555485605
E0.476726292571473
C0.476726292571473
B0.381381944251153
G0.190700611234451
A0.190700611234451
F1.43197368054726e-11
H0