Change Password

Please enter the password.
Please enter the password. Between 8-64 characters. Not identical to your email address. Contain at least 3 of: uppercase, lowercase, numbers, and special characters.
Please enter the password.
Submit

Change Nickname

Current Nickname:
Submit

Apply New License

License Detail

Please complete this required field.

  • Ultipa Graph V4

Standalone

Please complete this required field.

Please complete this required field.

The MAC address of the server you want to deploy.

Please complete this required field.

Please complete this required field.

Cancel
Apply
ID
Product
Status
Cores
Applied Validity Period(days)
Effective Date
Excpired Date
Mac Address
Apply Comment
Review Comment
Close
Profile
  • Full Name:
  • Phone:
  • Company:
  • Company Email:
  • Country:
  • Language:
Change Password
Apply

You have no license application record.

Apply
Certificate Issued at Valid until Serial No. File
Serial No. Valid until File

Not having one? Apply now! >>>

Product Created On ID Amount (USD) Invoice
Product Created On ID Amount (USD) Invoice

No Invoice

ULTIPA GRAPH Benchmark Report (2022)

Download PDF File

1. Executive Summary

Ultipa has recently released v4.0 of its flagship Ultipa Graph database product. To recap Ultipa Graph v4.0's characteristics:

  • Ultipa Graph v4.0 continues to evolve over its HTAP architecture for maximum speed, lowest latency and high throughput, great data consistency (across the entire cluster), and with advanced horizontal scalability. For those who are interested in graph database scalability design, refer to this article.
  • Ultipa Graph Database is 10x to 1,000x+ faster than any other graph database systems in terms of data processing capabilities, the deeper the better.
  • Ultipa Graph is the world’s only 4th-generation graph system, leveraging its patent-pending high-density parallel computing, ultra-deep graph traversal and dynamic graph pruning technologies.
  • Ultipa is currently powering some of the world’s largest banks’ sophisticated and sea-volume data analytics and real-time decision making, anti-fraud, liquidity risk management, asset liability management, risk management systems, etc. Previously, no other systems have the capability to address customers’ challenges with such speed and celerity.

This benchmark focuses on examining the following characteristics of a graph system:

  • Data Loading
  • Graph Traversal
    • K-Hop, Shortest Paths (All Paths), etc.
  • Graph Algorithms
    • PageRank, LPA, Louvain, Similarity, etc.
  • Comparison with the following systems:
    • Neo4j, TigerGraph, JanusGraph and ArangoDB

2. Testing Bed

2.1. Hardware Platform

The benchmark testing-bed cluster is composed of 3 cloud-based server instances with the following configurations:

Server

Configuration

CPU

Intel Xeon 16-core (32-vCPU)

Memory

256GB

Disk

1TB HDD (Cloud-based)

Network

5Gbps

2.2. Software

Software

Description

OS

Linux

Graph Databases

Ultipa Graph v4.0

Neo4j v4.0.7 Enterprise Edition

TigerGraph v3.1

JanusGraph v0.6.1

ArangoDB v3.7

Note: Benchmark results across multiple graph databases are show under 3.2.3.

2.3. Dataset

Dataset

Description

Twitter-2010

http://an.kaist.ac.kr/traces/WWW2010.html

Dataset

Twitter_rv.tar.gz

Nodes: 41.6M, Edges: 1.47B (1,470M)

Data Modeling

Extend the dataset to allow nodes and edges to have properties. For instance, while running the PageRank/LPA/Louvain graph algorithms, results can be written back to the nodes as properties, which can be updated from time to time.

3. Functional Testing

3.1 Summary of Functional Testing

Testing Items

Testing Standards

Ultipa Results

Installation

The total time to have the graph database system deployed (the installation phase).

~30 min

Extensibility

Support of distributed architecture, data partitioning, horizontal and vertical scalability.

HTAP Distributed Architecture, scalable both horizontally and vertically

Graph Update

Graph modeling update can happen without suspense or shutdown of services, including real-time updates to nodes and edges.

Online update to vertices/edges, changes are reflected instantly to queries or algorithms’ results.

Data Loading

Support of batch or stream type of data loading; support of delimited text (i.e., CSV) or JSON format ingestion; support of stop-n-resume.

Support

Query Language

Natively supports graph query language.

Powerful UQL (Ultipa Query Language), easy-to-learn and easy-to-use. Can be tech and business personnel oriented at the same time.

High-concurrency Query

Ability to execute sophisticated graph queries in a highly concurrent fashion.

Support

Influence Algorithm

Support of the LPA, PageRank graph algorithms.

Support

Community Detection Algorithms

Support of the WCC/SCC, LPA and Louvain algorithms.

Support

Graph Interaction

Support of meta-data interaction, modification, display, highlight, expansion, etc.

Support

Management & Monitoring

Support of system run-time monitoring and management, such as CPU, RAM, Disk, networks, etc.

Support

Log Management

Detailed logging mechanism.

Support

Graphsets/Graphs

Support of multiple graphsets, sharing of nodes/edges across multiple graphs.

Support

Privilege Management

User privilege, access-control mechanism.

Support

Backup & Restore

Online backup and restoration support.

Support

High Availability

Does the system support HA setup?

Support

Disaster Recovery

Does the system support multi-city disaster recovery?

Support

3.2. Performance Testing

3.2.1. Summary of Ultipa’s Performance Testing

Testing Items

Testing Standard

Ultipa Testing Results

Data Loading

Ingesting 100% of the data of testing dataset, measure the total time.

2,700s

Storage Size

Loaded data size versus raw data size

1.3

1-Hop Query

Given a seed file with multiple nodes, check each node’s total number of 1-hop neighbors, log average execution time.

0.00062s

(Average Time)

2-Hop Query

Given a seed file with multiple nodes, check each node’s total number of 2-hop neighbors, log average execution time.

0.027s (Average)

 

3-Hop Query

Given a seed file with multiple nodes, check each node’s total number of 3-hop neighbors, log average execution time.

0.520s (Average)

6-Hop

Given a seed file with multiple nodes, check each node’s total number of 6-hop neighbors, log average execution time.

1.408s (Average)

23-Hop

Given a seed file with multiple node, check each node’s total number of 23-hop neighbors, log average execution time.

1.295s (Average)

Shortest-Path

Given any random pair of nodes, count their total number of shortest paths, mark the calculation time.

0.18s (Average)

Topology-Change & Query Results Change

Change the topology of the dataset, examine query results change (i.e., K-hop results of a particularly affected node).

In Real-time

Jaccard Similarity

Calculate and return Top-10 nodesbased on Jaccard Similarity against 10 nodes, log average execution time.

4.99s

PageRank

PageRank algorithm. Each algorithm is run multiple times, log the average time.

23s

LPA

Label Propagation Algorithm. Each algorithm is run multiple times, log the average time.

80s

 

Louvain

Louvain community detection algorithm. Each algorithm is run multiple times, log the average time.

210s

3.2.2. Itemized Performance Testing

3.2.2.1. Data Loading

Testing Purpose: Loading the entire dataset into graph database and start providing services. This test can show how fast a graph system ingest large volume of data.

Ultipa Testing Results:

Testing Item

Description

# of Nodes

# of Edges

Loading Time

Data Loading

Mark the total time for the entire dataset is loaded and system up-n-running.

41,652,330

1,468,365,182

2,700 seconds

Testing Results by All Systems:

Dataset

Twitter-2010

Graph System

Ultipa

TigerGraph

Neo4j

JanusGraph

ArangoDB

Loading Time(s)

2,700

1,550

3,120

20,800

32,200

Time Normalized

1.7

1

2

13.4

20.8

Storage Size

30GB

24GB

55GB

56GB

128GB

Size Normalized

1.25

1

2.3

2.3

5.3

3.2.2.2. K-Hop Neighbor Querying

Testing Purpose:K-hop query is a fundamental type of query with any graph system. It allows user to quickly identify the scope of impact of an entity (the subject node). K-hop must be implemented using BFS (Breadth-First Search) method, and there are a few caveats:

  • There are usually 2 ways of defining K-Hop and the results are different. One way is all neighbors from the 1st hop all the way to the Kth hop, the other way is the neighbors exactly Kth hop away from the starting node.
  • If a node appears on the Nth hop of a specific node, it will NOT appear on any other hop. While executing the algorithm, it’s pivotal to conduct de-duplication of nodes across different hops, otherwise, the results would be wrong.
  • The number of a node’s immediate neighbors are NOT the same as the number of edges, because a pair of connecting nodes may have 2 edges connecting them bi-directionally.

Ultipa Testing Results:

Note: The below testing results pertaining to Ultipa system log the number of nodes on the Nth hop. For K-hop query, this is slightly more time consuming than other systems calculating only the total number of nodes from 1st all the way up to the Kth hop.

Testing Item

Item Description

Node ID

# of Neighbors at Nth Hop

Execution Time (ms)

Average Execution Time (ms)

1-Hop

For each node, query for the # of nodes that are exactly 1-hop away, log execution time.

20727483

973

0.509ms

0.62ms

50329304

4,746

0.815

26199460

19,954

0.868

1177521

4,272

0.777

27960125

7

0.484

30440025

3,386

0.733

15833920

2,181

0.505

15015183

2,279

0.492

33153097

116

0.491

21250581

37

0.536

2-Hop

For each node, query for the # of nodes that are 2-hop away, log execution time.

20727483

2,874,132

5.823ms

27.33ms

50329304

2,714,697

5.606

26199460

7,818,213

69.953

1177521

19,318,826

70.281

27960125

533,108

1.983

30440025

11,294,131

21.967

15833920

7,858,255

22.746

15015183

5,933,114

27.222

33153097

12,181,518

25.157

21250581

11,063,971

22.537

3-Hop

For each node, query for the # of nodes that are 3-hop away, log execution time.

20727483

27,206,363

400.011ms

520ms

50329304

29,939,223

407.992

26199460

31,324,330

526.919

1177521

17,139,727

741.438

27960125

20,280,156

101.114

30440025

23,120,607

748.711

15833920

26,263,805

649.454

15015183

27,727,527

475.516

33153097

20,512,136

507.208

21250581

20,804,932

641.811

6-Hop

For each node, query for the # of nodes that are 6-hop away, log execution time.

20727483

10,028

1213.226ms

1408.28ms

50329304

9,052

1510.373

26199460

3,022

1372.880

1177521

3,101

1220.169

27960125

25,838

1576.181

30440025

5,437

1659.473

15833920

5,618

1210.724

15015183

6,197

1320.371

33153097

6,033

1229.563

21250581

6,738

1769.825

23-Hop

For each node, query for the # of nodes that are 23-hop away, log execution time.

15738828

3

1203.212ms

1295.19ms

9358362

1

1279.990

9358352

1

1529.716

17571568

1

1167.854

Ultipa Query Language:
khop().src({_id == Vertex_ID}).depth(#_of_hop).boost() as n return count(n)

 Testing Results by All Systems:

Dataset

Twitter-2010

Graph System

Ultipa

TigerGraph

Neo4j

JanusGraph

ArangoDB

1-Hop (second)

0.00062

0.024

0.2

0.39

1.667

2-Hop (second)

0.055

0.46

18.3

27.7

28.9

3-Hop (second)

0.52

6.6

290

4300

3888

6-Hop (second)

1.82

62.5

N/A

N/A

N/A

23-Hop (second)

1.92

N/A

N/A

N/A

N/A

3.2.2.3. Shortest Paths

Testing Purpose: Finding shortest paths is a fundamental type of graph query. Like K-hop query, it is also implanted using BFS method, namely breadth-first search. The caveat with Shortest Path finding is that there are usually multiple paths instead of one, some graph systems can NOT return the correct number of paths due to faulty implementation. It’s often more challenging to query for shortest paths on large dataset like Twitter, therefore it’s meaningful to examine how swiftly a graph system can run this query.

Testing Results: Find all shortest paths between any pair of nodes and log the execution time and total number of paths.

Testing Item

Item Description

Starting Vertex

End Vertex

# of Shortest Paths

Execution Time

Average Execution Time

Find All Shortest Paths between Random Pairs of Nodes

For any pair of nodes, query for all the shortest paths between them, log the execution time and total # of paths.

50329304

21613682

25389

0.152s

0.18s

49449489

15645246

2080710

2.484s

39687658

47978366

46

0.053s

15489748

38548456

8

0.017s

17524616

29884615

5

2.22s

33033471

37029346

273

0.018

37048837

17555248

1114

0.01s

26895497

37893465

2

0.004s

26468497

37164965

6

0.002s

54151557

37190965

30837

0.05s

Ultipa Query Language

ab().src({_id == Starting_Node_ID}).dest({_id == Ending_Node_ID}).depth(6).shortest().boost() as paths return count(paths)

Testing Results by All Systems:

Dataset

Twitter-2010

Graph Database

Ultipa

TigerGraph

Neo4j

JanusGraph

ArangoDB

All Shortest Paths

 (in seconds)

0.18

9.2

Note: Some ≥6-hop paths can’t return in 10-min.

Can’t return any ≥3-Hop paths.

Can’t return any ≥3-Hop paths.

Can’t return # of paths correctly (only 1 path found)

 

3.2.2.4. PageRank

Testing Purpose: Iterative algorithm like PageRank can be used to validate

Testing Results: It is imperative that PageRank algorithm should be conducted in an iterative fashion that traverses all nodes and edges. Some graph databases, by default, do this wrongly. Neo4j is one such example, it allows running the algorithm on limited number of vertices instead of on ALL the vertices. On the other hand, returning results in an ordered way is useful in determining the top-ranked pages (nodes or entities), unfortunately, many graph systems do NOT support result-ranking natively.

Testing Item

Item Description

Ultipa Testing Results

Top-10 Highest Ranked Vertices

Execution Time (in seconds)

Impact Analysis

Run PageRank algorithm for 10 iterations, with damping-0.8, return the top-10 highest ranked vertices and their rankings, log the total execution time.

813286,14276.1

14224719,12240

31567254,10288.6

15131310,9860.32

16049683,8546.38

7040932,6775.97

14075928,6672.9

12687952,5345.58

5380672,5021.32

26784273,2886.91

23s

Ultipa Query Language:

algo(page_rank).params({order:’desc’,init_value:0.2,loop_num:10,damping:0.8,limit:10}).write({file:{filename:”my_pagerank_result”}})

Note: The chart below shows how Ultipa compares with Tigergraph and other systems. Please note that ONLY Ultipa returns the PageRank results in an ordered way, which is more time consuming. The systems that do not carry a result couldn’t finish the algorithm.

3.2.2.5. Community Detection

Testing Purpose: LPA and Louvain are the two commonly used community detection algorithms. It’s useful to verify how fast and swiftly can a graph database system finish both algorithms on a relatively large dataset such as Twitter with over 1.5 billion nodes and edges.

Testing Results:

Testing Item

Item Description

# of Communities

Execution Time       (in seconds)

LPA

Launch the LPA algorithm, set iterations=10, print out # of communities and total execution time.

116602

80s

Louvain

Launch the Louvain Community Detection algorithm, set iterations=10, modularity=0.01, print out # of communities and total execution time.

1207

210s

Ultipa Query Language:

algo(lpa).params({loop_num:10}).write()

algo(louvain).params({phase_1_loop_num:10.min_modularity_increase:0.01}).write

 

Testing Results by All Systems:

Dataset

Twitter(2010): 41.6M, 1470M, 24.6GB

Graph DB

Ultipa

TigerGraph

Neo4j

JanusGraph

ArangoDB

PageRank

(Seconds)

23

258

 

600

N/A

Unfinished

N/A

Unfinished

LPA

(Seconds)

80

900

N/A

Unfinished

N/A

Unfinished

N/A

Unfinished

Louvain

(Seconds)

210

N/A

Not-Tested

N/A

Can’t finish projection in 30-min

N/A

Not-Tested

N/A

Not-Tested

Not: Due to resource or feature limitations, the following tests were only conducted against Ultipa Graph v4.0 but not on other graph database systems.

 

3.2.2.6. Update Dataset and Compare Query Results

Testing Purpose: To validate if the graph database system has the ability to update data contents and allow for accurate data querying in real-time. It is not uncommon that some graph systems use caching mechanism to pre-calculate and store-aside results which do NOT change accordingly after the data contents are change. In this testing item, we create a new edge connecting the provided starting vertex with one of its 3rd-hop neighboring vertex, and compare the 1-to-3-to-6 Hop results before and after the edge is inserted.

Testing Results:

Testing Item

Item Description

New-Edge Starting Vertex ID

New-Edge Ending Vertex ID

Query Depth

K-Hop Before Edge Adding

K-Hop After Edge Adding

Update the dataset, do queries against it twice to compare results

Add a new edge between the starting vertex to one of its 3rd-hop neighboring vertex, and check its 1-hop,3-hop, 6-hop results before and after the edge-adding event

20727483

28843543

1

973

974

3

27206363

27210397

6

10028

10027

50329304

21378173

1

4746

4747

3

29939223

29939314

6

9052

9052

26199460

32278263

1

19954

19955

3

31324330

31324333

6

3022

3022

1177521

6676222

1

4272

4273

3

17139727

17139725

6

3101

3101

27960125

48271231

1

7

8

3

20280156

20283107

6

25838

25836

30440025

38232241

1

3386

3387

3

23120607

23121930

6

5437

5431

 

3.2.2.7. Similarity Algorithm

Testing Item

Testing Item Description

Provided ID of Starting Vertex

Execution Time

Average Time in Seconds

Similarity

Given a vertex, query for Top-10 vertices that are most similar using Jaccard Similarity.

20727483

1.586s

4.99s

50329304

2.003s

26199460

10.573s

1177521

10.52s

27960125

0.359s

30440025

2.865s

15833920

5.042s

15015183

5.283s

33153097

5.201s

21250581

4.525s

Ultipa Query Language:

find().nodes({_id == Provided_ID}) as n with n

algo(similarity).params({uuids:[n],top_limit:10,type:"jaccard"}) as r return r

 

4. Conclusion

Benchmarks help us understand how a system performs in comparison with other systems.

Admittedly, it's difficult to fine tune every single system to its optimal state (i.e., with better looking numbers), therefore, other vendors may have different opinions about why their systems seem to be underperforming.

The argument here, however, is that Ultipa Graph database is the 1st system that has been designed from scratch to serve complicated graph queries (and evn algorithms) in real-time. Our entire architecture, data structure, graph computing engine, storage engine, graph query engine, API/SDK, toolchains were all designed and implemented in a way to serve this real-time-ness. As we are spreading into more and more commerical and industrial fields, we (and hopefully you too) will see that our concept of graph augmented intelligence and eXplainable AI (XAI) can empower smart enterprises smoothly transforming digitally with speed and ease.

To best understand a graph vendor's product, you should also look at things beyond just the numbers in the benchmarks, and here are some recommended readings:

We hope you enjoy the reading and internalize the cool stuff within.

 

May the best graph be with you.

Please complete the following information to download this book
*
公司名称不能为空
*
公司邮箱必须填写
*
你的名字必须填写
*
你的电话必须填写