ULTIPA GRAPH Benchmark Report (2022)

Profile

Account ID:

Full Name:
Phone:
Company:
Company Email:

Change Password

Apply

You have no license application record.

Apply

Certificate	Issued at	Valid until	Serial No.	File

Serial No.	Valid until	File

Not having one? Apply now! >>>

Product	Created On	ID	Amount (USD)	Invoice

Product	Created On	ID	Amount (USD)	Invoice

No Invoice

Download PDF File

1. Executive Summary

Ultipa has recently released v4.0 of its flagship Ultipa Graph database product. To recap Ultipa Graph v4.0's characteristics:

Ultipa Graph v4.0 continues to evolve over its HTAP architecture for maximum speed, lowest latency and high throughput, great data consistency (across the entire cluster), and with advanced horizontal scalability. For those who are interested in graph database scalability design, refer to this article.
Ultipa Graph Database is 10x to 1,000x+ faster than any other graph database systems in terms of data processing capabilities, the deeper the better.
Ultipa Graph is the world’s only 4^th-generation graph system, leveraging its patent-pending high-density parallel computing, ultra-deep graph traversal and dynamic graph pruning technologies.
Ultipa is currently powering some of the world’s largest banks’ sophisticated and sea-volume data analytics and real-time decision making, anti-fraud, liquidity risk management, asset liability management, risk management systems, etc. Previously, no other systems have the capability to address customers’ challenges with such speed and celerity.

This benchmark focuses on examining the following characteristics of a graph system:

Data Loading
Graph Traversal
- K-Hop, Shortest Paths (All Paths), etc.
Graph Algorithms
- PageRank, LPA, Louvain, Similarity, etc.
Comparison with the following systems:
- Neo4j, TigerGraph, JanusGraph and ArangoDB

2. Testing Bed

2.1. Hardware Platform

The benchmark testing-bed cluster is composed of 3 cloud-based server instances with the following configurations:

Server	Configuration
CPU	Intel Xeon 16-core (32-vCPU)
Memory	256GB
Disk	1TB HDD (Cloud-based)
Network	5Gbps

2.2. Software

Software

Description

Linux

Graph Databases

Ultipa Graph v4.0

Neo4j v4.0.7 Enterprise Edition

TigerGraph v3.1

JanusGraph v0.6.1

ArangoDB v3.7

Note: Benchmark results across multiple graph databases are show under 3.2.3.

2.3. Dataset

Dataset	Description
Twitter-2010	http://an.kaist.ac.kr/traces/WWW2010.html
Dataset	Twitter_rv.tar.gz Nodes: 41.6M, Edges: 1.47B (1,470M)
Data Modeling	Extend the dataset to allow nodes and edges to have properties. For instance, while running the PageRank/LPA/Louvain graph algorithms, results can be written back to the nodes as properties, which can be updated from time to time.

3. Functional Testing

3.1 Summary of Functional Testing

Testing Items	Testing Standards	Ultipa Results
Installation	The total time to have the graph database system deployed (the installation phase).	~30 min
Extensibility	Support of distributed architecture, data partitioning, horizontal and vertical scalability.	HTAP Distributed Architecture, scalable both horizontally and vertically
Graph Update	Graph modeling update can happen without suspense or shutdown of services, including real-time updates to nodes and edges.	Online update to vertices/edges, changes are reflected instantly to queries or algorithms’ results.
Data Loading	Support of batch or stream type of data loading; support of delimited text (i.e., CSV) or JSON format ingestion; support of stop-n-resume.	Support
Query Language	Natively supports graph query language.	Powerful UQL (Ultipa Query Language), easy-to-learn and easy-to-use. Can be tech and business personnel oriented at the same time.
High-concurrency Query	Ability to execute sophisticated graph queries in a highly concurrent fashion.	Support
Influence Algorithm	Support of the LPA, PageRank graph algorithms.	Support
Community Detection Algorithms	Support of the WCC/SCC, LPA and Louvain algorithms.	Support
Graph Interaction	Support of meta-data interaction, modification, display, highlight, expansion, etc.	Support
Management & Monitoring	Support of system run-time monitoring and management, such as CPU, RAM, Disk, networks, etc.	Support
Log Management	Detailed logging mechanism.	Support
Graphsets/Graphs	Support of multiple graphsets, sharing of nodes/edges across multiple graphs.	Support
Privilege Management	User privilege, access-control mechanism.	Support
Backup & Restore	Online backup and restoration support.	Support
High Availability	Does the system support HA setup?	Support
Disaster Recovery	Does the system support multi-city disaster recovery?	Support

3.2. Performance Testing

3.2.1. Summary of Ultipa’s Performance Testing

Testing Items	Testing Standard	Ultipa Testing Results
Data Loading	Ingesting 100% of the data of testing dataset, measure the total time.	2,700s
Storage Size	Loaded data size versus raw data size	1.3
1-Hop Query	Given a seed file with multiple nodes, check each node’s total number of 1-hop neighbors, log average execution time.	0.00062s (Average Time)
2-Hop Query	Given a seed file with multiple nodes, check each node’s total number of 2-hop neighbors, log average execution time.	0.027s (Average)
3-Hop Query	Given a seed file with multiple nodes, check each node’s total number of 3-hop neighbors, log average execution time.	0.520s (Average)
6-Hop	Given a seed file with multiple nodes, check each node’s total number of 6-hop neighbors, log average execution time.	1.408s (Average)
23-Hop	Given a seed file with multiple node, check each node’s total number of 23-hop neighbors, log average execution time.	1.295s (Average)
Shortest-Path	Given any random pair of nodes, count their total number of shortest paths, mark the calculation time.	0.18s (Average)
Topology-Change & Query Results Change	Change the topology of the dataset, examine query results change (i.e., K-hop results of a particularly affected node).	In Real-time
Jaccard Similarity	Calculate and return Top-10 nodesbased on Jaccard Similarity against 10 nodes, log average execution time.	4.99s
PageRank	PageRank algorithm. Each algorithm is run multiple times, log the average time.	23s
LPA	Label Propagation Algorithm. Each algorithm is run multiple times, log the average time.	80s
Louvain	Louvain community detection algorithm. Each algorithm is run multiple times, log the average time.	210s

3.2.2. Itemized Performance Testing

3.2.2.1. Data Loading

Testing Purpose: Loading the entire dataset into graph database and start providing services. This test can show how fast a graph system ingest large volume of data.

Ultipa Testing Results:

Testing Item	Description	# of Nodes	# of Edges	Loading Time
Data Loading	Mark the total time for the entire dataset is loaded and system up-n-running.	41,652,330	1,468,365,182	2,700 seconds

Testing Results by All Systems:

Dataset	Twitter-2010
Graph System	Ultipa	TigerGraph	Neo4j	JanusGraph	ArangoDB
Loading Time(s)	2,700	1,550	3,120	20,800	32,200
Time Normalized	1.7	1	2	13.4	20.8
Storage Size	30GB	24GB	55GB	56GB	128GB
Size Normalized	1.25	1	2.3	2.3	5.3

3.2.2.2. K-Hop Neighbor Querying

Testing Purpose：K-hop query is a fundamental type of query with any graph system. It allows user to quickly identify the scope of impact of an entity (the subject node). K-hop must be implemented using BFS (Breadth-First Search) method, and there are a few caveats:

There are usually 2 ways of defining K-Hop and the results are different. One way is all neighbors from the 1^st hop all the way to the K^th hop, the other way is the neighbors exactly K^thhop away from the starting node.
If a node appears on the N^th hop of a specific node, it will NOT appear on any other hop. While executing the algorithm, it’s pivotal to conduct de-duplication of nodes across different hops, otherwise, the results would be wrong.
The number of a node’s immediate neighbors are NOT the same as the number of edges, because a pair of connecting nodes may have 2 edges connecting them bi-directionally.

Ultipa Testing Results:

Note: The below testing results pertaining to Ultipa system log the number of nodes on the N^th hop. For K-hop query, this is slightly more time consuming than other systems calculating only the total number of nodes from 1^st all the way up to the K^th hop.

Testing Item	Item Description	Node ID	# of Neighbors at N^th Hop	Execution Time (ms)	Average Execution Time (ms)
1-Hop	For each node, query for the # of nodes that are exactly 1-hop away, log execution time.	20727483	973	0.509ms	0.62ms
		50329304	4,746	0.815
		26199460	19,954	0.868
		1177521	4,272	0.777
		27960125	7	0.484
		30440025	3,386	0.733
		15833920	2,181	0.505
		15015183	2,279	0.492
		33153097	116	0.491
		21250581	37	0.536
2-Hop	For each node, query for the # of nodes that are 2-hop away, log execution time.	20727483	2,874,132	5.823ms	27.33ms
		50329304	2,714,697	5.606
		26199460	7,818,213	69.953
		1177521	19,318,826	70.281
		27960125	533,108	1.983
		30440025	11,294,131	21.967
		15833920	7,858,255	22.746
		15015183	5,933,114	27.222
		33153097	12,181,518	25.157
		21250581	11,063,971	22.537
3-Hop	For each node, query for the # of nodes that are 3-hop away, log execution time.	20727483	27,206,363	400.011ms	520ms
		50329304	29,939,223	407.992
		26199460	31,324,330	526.919
		1177521	17,139,727	741.438
		27960125	20,280,156	101.114
		30440025	23,120,607	748.711
		15833920	26,263,805	649.454
		15015183	27,727,527	475.516
		33153097	20,512,136	507.208
		21250581	20,804,932	641.811
6-Hop	For each node, query for the # of nodes that are 6-hop away, log execution time.	20727483	10,028	1213.226ms	1408.28ms
		50329304	9,052	1510.373
		26199460	3,022	1372.880
		1177521	3,101	1220.169
		27960125	25,838	1576.181
		30440025	5,437	1659.473
		15833920	5,618	1210.724
		15015183	6,197	1320.371
		33153097	6,033	1229.563
		21250581	6,738	1769.825
23-Hop	For each node, query for the # of nodes that are 23-hop away, log execution time.	15738828	3	1203.212ms	1295.19ms
		9358362	1	1279.990
		9358352	1	1529.716
		17571568	1	1167.854

Ultipa Query Language:
khop().src({_id == Vertex_ID}).depth(#_of_hop).boost() as n return count(n)

Testing Results by All Systems:

Dataset	Twitter-2010
Graph System	Ultipa	TigerGraph	Neo4j	JanusGraph	ArangoDB
1-Hop (second)	0.00062	0.024	0.2	0.39	1.667
2-Hop (second)	0.055	0.46	18.3	27.7	28.9
3-Hop (second)	0.52	6.6	290	4300	3888
6-Hop (second)	1.82	62.5	N/A	N/A	N/A
23-Hop (second)	1.92	N/A	N/A	N/A	N/A

3.2.2.3. Shortest Paths

Testing Purpose: Finding shortest paths is a fundamental type of graph query. Like K-hop query, it is also implanted using BFS method, namely breadth-first search. The caveat with Shortest Path finding is that there are usually multiple paths instead of one, some graph systems can NOT return the correct number of paths due to faulty implementation. It’s often more challenging to query for shortest paths on large dataset like Twitter, therefore it’s meaningful to examine how swiftly a graph system can run this query.

Testing Results: Find all shortest paths between any pair of nodes and log the execution time and total number of paths.

Testing Item	Item Description	Starting Vertex	End Vertex	# of Shortest Paths	Execution Time	Average Execution Time
Find All Shortest Paths between Random Pairs of Nodes	For any pair of nodes, query for all the shortest paths between them, log the execution time and total # of paths.	50329304	21613682	25389	0.152s	0.18s
		49449489	15645246	2080710	2.484s
		39687658	47978366	46	0.053s
		15489748	38548456	8	0.017s
		17524616	29884615	5	2.22s
		33033471	37029346	273	0.018
		37048837	17555248	1114	0.01s
		26895497	37893465	2	0.004s
		26468497	37164965	6	0.002s
		54151557	37190965	30837	0.05s

Ultipa Query Language

ab().src({_id == Starting_Node_ID}).dest({_id == Ending_Node_ID}).depth(6).shortest().boost() as paths return count(paths)

Testing Results by All Systems:

Dataset

Twitter-2010

Graph Database

Ultipa

TigerGraph

Neo4j

JanusGraph

ArangoDB

All Shortest Paths

(in seconds)

0.18

9.2

Note: Some ≥6-hop paths can’t return in 10-min.

Can’t return any ≥3-Hop paths.

Can’t return # of paths correctly (only 1 path found)

3.2.2.4. PageRank

Testing Purpose: Iterative algorithm like PageRank can be used to validate

Testing Results: It is imperative that PageRank algorithm should be conducted in an iterative fashion that traverses all nodes and edges. Some graph databases, by default, do this wrongly. Neo4j is one such example, it allows running the algorithm on limited number of vertices instead of on ALL the vertices. On the other hand, returning results in an ordered way is useful in determining the top-ranked pages (nodes or entities), unfortunately, many graph systems do NOT support result-ranking natively.

Testing Item

Item Description

Ultipa Testing Results

Top-10 Highest Ranked Vertices

Execution Time (in seconds)

Impact Analysis

Run PageRank algorithm for 10 iterations, with damping-0.8, return the top-10 highest ranked vertices and their rankings, log the total execution time.

813286,14276.1

14224719,12240

31567254,10288.6

15131310,9860.32

16049683,8546.38

7040932,6775.97

14075928,6672.9

12687952,5345.58

5380672,5021.32

26784273,2886.91

23s

Ultipa Query Language:

algo(page_rank).params({order:’desc’,init_value:0.2,loop_num:10,damping:0.8,limit:10}).write({file:{filename:”my_pagerank_result”}})

Note: The chart below shows how Ultipa compares with Tigergraph and other systems. Please note that ONLY Ultipa returns the PageRank results in an ordered way, which is more time consuming. The systems that do not carry a result couldn’t finish the algorithm.

3.2.2.5. Community Detection

Testing Purpose: LPA and Louvain are the two commonly used community detection algorithms. It’s useful to verify how fast and swiftly can a graph database system finish both algorithms on a relatively large dataset such as Twitter with over 1.5 billion nodes and edges.

Testing Results:

Testing Item	Item Description	# of Communities	Execution Time (in seconds)
LPA	Launch the LPA algorithm, set iterations=10, print out # of communities and total execution time.	116602	80s
Louvain	Launch the Louvain Community Detection algorithm, set iterations=10, modularity=0.01, print out # of communities and total execution time.	1207	210s

Ultipa Query Language:

algo(lpa).params({loop_num:10}).write()

algo(louvain).params({phase_1_loop_num:10.min_modularity_increase:0.01}).write

Testing Results by All Systems:

Dataset	Twitter(2010): 41.6M, 1470M, 24.6GB
Graph DB	Ultipa	TigerGraph	Neo4j	JanusGraph	ArangoDB
PageRank (Seconds)	23	258	600	N/A Unfinished	N/A Unfinished
LPA (Seconds)	80	900	N/A Unfinished	N/A Unfinished	N/A Unfinished
Louvain (Seconds)	210	N/A Not-Tested	N/A Can’t finish projection in 30-min	N/A Not-Tested	N/A Not-Tested

Not: Due to resource or feature limitations, the following tests were only conducted against Ultipa Graph v4.0 but not on other graph database systems.

3.2.2.6. Update Dataset and Compare Query Results

Testing Purpose: To validate if the graph database system has the ability to update data contents and allow for accurate data querying in real-time. It is not uncommon that some graph systems use caching mechanism to pre-calculate and store-aside results which do NOT change accordingly after the data contents are change. In this testing item, we create a new edge connecting the provided starting vertex with one of its 3^rd-hop neighboring vertex, and compare the 1-to-3-to-6 Hop results before and after the edge is inserted.

Testing Results:

Testing Item	Item Description	New-Edge Starting Vertex ID	New-Edge Ending Vertex ID	Query Depth	K-Hop Before Edge Adding	K-Hop After Edge Adding
Update the dataset, do queries against it twice to compare results	Add a new edge between the starting vertex to one of its 3^rd-hop neighboring vertex, and check its 1-hop,3-hop, 6-hop results before and after the edge-adding event	20727483	28843543	1	973	974
				3	27206363	27210397
				6	10028	10027
		50329304	21378173	1	4746	4747
				3	29939223	29939314
				6	9052	9052
		26199460	32278263	1	19954	19955
				3	31324330	31324333
				6	3022	3022
		1177521	6676222	1	4272	4273
				3	17139727	17139725
				6	3101	3101
		27960125	48271231	1	7	8
				3	20280156	20283107
				6	25838	25836
		30440025	38232241	1	3386	3387
				3	23120607	23121930
				6	5437	5431

3.2.2.7. Similarity Algorithm

Testing Item	Testing Item Description	Provided ID of Starting Vertex	Execution Time	Average Time in Seconds
Similarity	Given a vertex, query for Top-10 vertices that are most similar using Jaccard Similarity.	20727483	1.586s	4.99s
		50329304	2.003s
		26199460	10.573s
		1177521	10.52s
		27960125	0.359s
		30440025	2.865s
		15833920	5.042s
		15015183	5.283s
		33153097	5.201s
		21250581	4.525s

Ultipa Query Language:

find().nodes({_id == Provided_ID}) as n with n

algo(similarity).params({uuids:[n],top_limit:10,type:"jaccard"}) as r return r

4. Conclusion

Benchmarks help us understand how a system performs in comparison with other systems.

Admittedly, it's difficult to fine tune every single system to its optimal state (i.e., with better looking numbers), therefore, other vendors may have different opinions about why their systems seem to be underperforming.

The argument here, however, is that Ultipa Graph database is the 1st system that has been designed from scratch to serve complicated graph queries (and evn algorithms) in real-time. Our entire architecture, data structure, graph computing engine, storage engine, graph query engine, API/SDK, toolchains were all designed and implemented in a way to serve this real-time-ness. As we are spreading into more and more commerical and industrial fields, we (and hopefully you too) will see that our concept of graph augmented intelligence and eXplainable AI (XAI) can empower smart enterprises smoothly transforming digitally with speed and ease.

To best understand a graph vendor's product, you should also look at things beyond just the numbers in the benchmarks, and here are some recommended readings:

We hope you enjoy the reading and internalize the cool stuff within.

May the best graph be with you.

Change Password

Change Email

Change Nickname

1. Executive Summary

2. Testing Bed

2.1. Hardware Platform

2.2. Software

2.3. Dataset

3. Functional Testing

3.1 Summary of Functional Testing

3.2. Performance Testing

3.2.1. Summary of Ultipa’s Performance Testing

3.2.2. Itemized Performance Testing

3.2.2.1. Data Loading

3.2.2.2. K-Hop Neighbor Querying

3.2.2.3. Shortest Paths

3.2.2.4. PageRank

3.2.2.5. Community Detection

algo(lpa).params({loop_num:10}).write()

algo(louvain).params({phase_1_loop_num:10.min_modularity_increase:0.01}).write

3.2.2.6. Update Dataset and Compare Query Results

3.2.2.7. Similarity Algorithm

4. Conclusion

ID
Product
Status
Cores
Maximum Shard Services
Maximum Total Cores for Shard Service
Maximum HDC Services
Maximum Total Cores for HDC Service
Applied Validity Period(days)
Effective Date
Expired Date
Mac Address
Reason for Application
Review Comment