1. Executive Summary
Ultipa has recently released v4.0 of its flagship Ultipa Graph database product. To recap Ultipa Graph v4.0's characteristics:
- Ultipa Graph v4.0 continues to evolve on its HTAP architecture, delivering maximum speed, low latency, high throughput, strong data consistency across the entire cluster, and advanced horizontal scalability. Readers interested in graph database scalability design can refer to this article.
- Ultipa Graph Database is 10x to 1,000x+ faster than other graph database systems in data processing; the deeper the traversal, the larger the advantage.
- Ultipa Graph is the world’s only 4th-generation graph system, leveraging its patent-pending high-density parallel computing, ultra-deep graph traversal and dynamic graph pruning technologies.
- Ultipa currently powers sophisticated, large-volume data analytics and real-time decision-making systems at some of the world’s largest banks, covering anti-fraud, liquidity risk management, asset-liability management, and other risk management scenarios. Previously, no other system could address these customers’ challenges with such speed.
This benchmark focuses on examining the following characteristics of a graph system:
- Data Loading
- Graph Traversal
  - K-Hop, Shortest Paths (All Paths), etc.
- Graph Algorithms
  - PageRank, LPA, Louvain, Similarity, etc.
- Comparison with the following systems:
  - Neo4j, Tigergraph, JanusGraph, and ArangoDB
2. Testing Bed
2.1. Hardware Platform
The benchmark testing-bed cluster is composed of 3 cloud-based server instances with the following configurations:
| Server | Configuration |
|---|---|
| CPU | Intel Xeon, 16 cores (32 vCPUs) |
| Memory | 256 GB |
| Disk | 1 TB HDD (cloud-based) |
| Network | 5 Gbps |
2.2. Software
| Software | Description |
|---|---|
| OS | Linux |
| Graph Database | Ultipa Graph v4.0; Neo4j v4.0.7 Enterprise Edition; Tigergraph v3.1; JanusGraph v0.6.1; ArangoDB v3.7 |
Note: Benchmark results across multiple graph databases are shown in Section 3.2.3.
2.3. Datasets
Dataset: Twitter-2010

| Item | Description |
|---|---|
| Dataset | Twitter_rv.tar.gz; vertices: 41.6M; edges: 1.47B (1,470M) |
| Data Modeling | Extend the dataset so that vertices and edges can carry attributes. For instance, when running the PageRank/LPA/Louvain graph algorithms, results can be written back to the vertices as attributes and updated from time to time; edge attributes, on the other hand, can be created to … |
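The write-back pattern described under Data Modeling can be illustrated with a short, database-agnostic Python sketch. The dictionary-based graph and the plain `pagerank` helper below are illustrative assumptions, not Ultipa's storage format or API; the point is simply that algorithm results are stored as vertex attributes and overwritten whenever the algorithm is re-run.

```python
# Minimal, database-agnostic sketch of writing algorithm results back
# to vertex attributes. The data structures below are illustrative only
# and do not represent Ultipa's storage format or API.
from collections import defaultdict

# Property graph: vertex id -> attribute dict, plus a directed edge list.
vertices = {v: {"name": f"user_{v}"} for v in range(4)}
edges = [(0, 1), (1, 2), (2, 0), (3, 0)]

def pagerank(vertex_ids, edge_list, damping=0.85, iterations=20):
    """Plain-Python PageRank (no dangling-node handling), purely to
    illustrate result write-back."""
    out_links = defaultdict(list)
    for src, dst in edge_list:
        out_links[src].append(dst)
    rank = {v: 1.0 / len(vertex_ids) for v in vertex_ids}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / len(vertex_ids) for v in vertex_ids}
        for src, targets in out_links.items():
            share = damping * rank[src] / len(targets)
            for dst in targets:
                new_rank[dst] += share
        rank = new_rank
    return rank

# Write results back as a vertex attribute; re-running the algorithm
# later simply overwrites the attribute with fresh values.
for v, score in pagerank(list(vertices), edges).items():
    vertices[v]["pagerank"] = score
```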
3. Functional Testing
3.1. Summary of Functional Testing
| Testing Items | Testing Standards | Ultipa Results |
|---|---|---|
| Installation | The total time to have the graph database system deployed (the installation phase). | ~30 min |
| Extensibility | Support of distributed architecture, data partitioning, and horizontal and vertical scalability. | HTAP distributed architecture, scalable both horizontally and vertically |
| Graph Update | Graph model updates can happen without suspending or shutting down services, including real-time updates to vertices and edges. | Online updates to vertices/edges; changes are reflected instantly in query and algorithm results. |
| Data Loading | Support of batch or stream data loading; support of delimited text (e.g., CSV) or JSON ingestion; support of stop-and-resume. | Support |
| Query Language | Natively supports a graph query language. | Powerful UQL (Ultipa Query Language), easy to learn and easy to use; suitable for both technical and business personnel. |
| High-concurrency Query | Ability to execute sophisticated graph queries in a highly concurrent fashion (see the sketch after this table). | Support |
| Influence Algorithms | Support of the LPA and PageRank graph algorithms. | Support |
| Community Detection Algorithms | Support of the WCC/SCC, LPA, and Louvain algorithms. | Support |
| Graph Interaction | Support of metadata interaction, modification, display, highlighting, expansion, etc. | Support |
| Management & Monitoring | Support of system run-time monitoring and management, such as CPU, RAM, disk, and network. | Support |
| Log Management | Detailed logging mechanism. | Support |
| Graphsets/Graphs | Support of multiple graphsets and sharing of nodes/edges across multiple graphs. | Support |
| Privilege Management | User privilege and access-control mechanism. | Support |
| Backup & Restore | Online backup and restoration support. | Support |
| High Availability | Does the system support an HA setup? | Support |
| Disaster Recovery | Does the system support multi-city disaster recovery? | Support |
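How the High-concurrency Query item is exercised is not spelled out in this report; one common approach is a thread-pool client that fires the same query concurrently and records per-request latency. The sketch below assumes a placeholder `run_query` function standing in for whatever client call the system under test exposes.

```python
# Generic concurrency harness sketch; run_query is a placeholder for the
# client call of whichever graph database is being tested.
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean

def run_query(seed_vertex):
    """Placeholder: issue one graph query (e.g., a 2-hop count) and return."""
    time.sleep(0.01)  # simulate a server round-trip
    return seed_vertex

def timed(seed_vertex):
    start = time.perf_counter()
    run_query(seed_vertex)
    return time.perf_counter() - start

seeds = list(range(1000))
with ThreadPoolExecutor(max_workers=64) as pool:  # 64 concurrent clients
    latencies = list(pool.map(timed, seeds))

print(f"avg latency: {mean(latencies):.4f}s, "
      f"p99: {sorted(latencies)[int(0.99 * len(latencies))]:.4f}s")
```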
3.2. Performance Testing
3.2.1. Summary of Ultipa’s Performance Testing
| Testing Items | Testing Standard | Ultipa Testing Results |
|---|---|---|
| Data Loading | Ingest 100% of the test dataset; measure the total time. | 2,700 seconds |
| Storage Size | Ratio of loaded data size to raw data size. | 1.3× |
| 1-Hop Query | Given a seed file with multiple vertices, count each vertex’s 1-hop neighbors (see the sketch after this table); log the average execution time. | 0.00062 s (average) |
| 2-Hop Query | Given a seed file with multiple vertices, count each vertex’s 2-hop neighbors; log the average execution time. | 0.027 s (average) |
| 3-Hop Query | Given a seed file with multiple vertices, count each vertex’s 3-hop neighbors; log the average execution time. | 0.520 s (average) |
| 6-Hop Query | Given a seed file with multiple vertices, count each vertex’s 6-hop neighbors; log the average execution time. | 1.408 s (average) |
| 23-Hop Query | Given a seed file with multiple vertices, count each vertex’s 23-hop neighbors; log the average execution time. | 1.295 s (average) |
| Shortest Path | Given any random pair of vertices, count their total number of shortest paths; log the calculation time. | 0.18 s (average) |
| Topology Change & Query Result Change | Change the topology of the dataset and examine how query results change (e.g., the K-hop results of an affected vertex). | In real time |
| Jaccard Similarity | Calculate and return the Top-10 vertices by Jaccard similarity against 10 seed nodes; log the average execution time. | 4.99 s (average) |
| PageRank | PageRank algorithm; run multiple times and log the average time. | 23 s (average) |
| LPA | Label Propagation Algorithm; run multiple times and log the average time. | 80 s (average) |
| Louvain | Louvain community detection algorithm; run multiple times and log the average time. | 210 s (average) |
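For clarity on what the K-hop and Jaccard Similarity rows measure, the sketch below gives reference definitions over a plain in-memory adjacency list. This is illustrative only, not the benchmarked implementation, and it assumes "K-hop neighbors" means vertices whose shortest distance from the seed is exactly K; the report itself does not state whether the count is taken at exactly K hops or within K hops.

```python
# Reference definitions for the K-hop and Jaccard metrics over a plain
# in-memory adjacency list. Illustrative sketch only, not the benchmarked
# implementation.
from collections import deque  # (BFS frontier kept as sets below)

def khop_count(adj, seed, k):
    """Count vertices whose shortest (unweighted) distance from seed is exactly k."""
    visited = {seed}
    frontier = {seed}
    for _ in range(k):
        next_frontier = set()
        for v in frontier:
            for nbr in adj.get(v, ()):
                if nbr not in visited:
                    visited.add(nbr)
                    next_frontier.add(nbr)
        frontier = next_frontier
    return len(frontier)

def jaccard(adj, a, b):
    """Jaccard similarity of the 1-hop neighbor sets of vertices a and b."""
    na, nb = set(adj.get(a, ())), set(adj.get(b, ()))
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
print(khop_count(adj, 1, 2))  # vertices exactly 2 hops from vertex 1 -> 1 (vertex 4)
print(jaccard(adj, 2, 3))     # neighbor sets {1,4} vs {1,4} -> 1.0
```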
3.2.2. Itemized Performance Testing
3.2.2.1. Data Loading
Testing Purpose: Load the entire dataset into the graph database and start providing services. This test shows how fast a graph system can ingest a large volume of data.
Ultipa Testing Results:
| Testing Item | Description | # of Vertices | # of Edges | Loading Time |
|---|---|---|---|---|
| Data Loading | Record the total time until the entire dataset is loaded and the system is up and running. | 41,652,330 | 1,468,365,182 | 2,700 seconds |
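From the figures above, an approximate ingestion throughput can be derived (our arithmetic, not a number stated in the report):

```python
# Derived ingestion throughput, computed from the table above.
vertices, edges, seconds = 41_652_330, 1_468_365_182, 2_700
print(f"{(vertices + edges) / seconds:,.0f} records/s")  # ~559,266 records/s
print(f"{edges / seconds:,.0f} edges/s")                 # ~543,839 edges/s
```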
Testing Results by All Systems