1 - HugeGraph BenchMark Performance

Note:

The current performance metrics are based on an earlier version. The latest version has significant improvements in both performance and functionality. We encourage you to refer to the most recent release featuring autonomous distributed storage and enhanced computational push down capabilities. Alternatively, you may wait for the community to update the data with these enhancements.

1 Test environment

1.1 Hardware information

| CPU | Memory | NIC | Disk |
|------|--------|-----|------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD |

1.2 Software information

1.2.1 Test cases

Testing is done using the graphdb-benchmark, a benchmark suite for graph databases. This benchmark suite mainly consists of four types of tests:

  • Massive Insertion, which involves batch insertion of vertices and edges, with a certain number of vertices or edges being submitted at once.
  • Single Insertion, which involves the immediate insertion of each vertex or edge, one at a time (see the sketch after this list).
  • Query, which mainly includes the basic query operations of the graph database:
    • Find Neighbors, which queries the neighbors of all vertices.
    • Find Adjacent Nodes, which queries the adjacent vertices of all edges.
    • Find the Shortest Path, which queries the shortest path from the first vertex to 100 random vertices.
  • Clustering, which is a community detection algorithm based on the Louvain Method.
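The difference between the two insertion modes can be sketched with the generic TinkerPop API (HugeGraph exposes a TinkerPop-compatible Graph). This is only an illustration of the commit pattern, not graphdb-benchmark's actual code; the vertex label "node", the property "name", and BATCH_SIZE are assumptions made for the example.

```java
// Illustration of Massive vs Single Insertion using the TinkerPop Graph API.
// Assumptions: a transactional Graph instance, label "node", property "name",
// and BATCH_SIZE are all placeholders chosen for this sketch.
import java.util.List;
import org.apache.tinkerpop.gremlin.structure.Graph;
import org.apache.tinkerpop.gremlin.structure.T;

public class InsertionModes {
    static final int BATCH_SIZE = 500; // elements per commit (assumed value)

    // Massive Insertion: accumulate vertices and commit once per batch
    static void massiveInsert(Graph graph, List<String> names) {
        int count = 0;
        for (String name : names) {
            graph.addVertex(T.label, "node", "name", name);
            if (++count % BATCH_SIZE == 0) {
                graph.tx().commit(); // one commit for the whole batch
            }
        }
        graph.tx().commit(); // flush the remainder
    }

    // Single Insertion: every vertex is committed immediately
    static void singleInsert(Graph graph, List<String> names) {
        for (String name : names) {
            graph.addVertex(T.label, "node", "name", name);
            graph.tx().commit(); // commit per element
        }
    }
}
```

Committing once per batch instead of once per element is what makes Massive Insertion much faster than Single Insertion in the results below.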
1.2.2 Test dataset

Tests are conducted using both synthetic and real data.

The scale of the datasets used in this test:

| Name | Number of Vertices | Number of Edges | File Size |
|------|--------------------|-----------------|-----------|
| email-enron.txt | 36,691 | 367,661 | 4MB |
| com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB |
| amazon0601.txt | 403,393 | 3,387,388 | 47.9MB |
| com-lj.ungraph.txt | 3,997,961 | 34,681,189 | 479MB |

1.3 Service configuration

  • HugeGraph version: 0.5.6; RestServer, Gremlin Server, and backends are on the same server

    • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, using thrift+Cassandra mode

    • Cassandra version: cassandra-3.10, commit-log and data use SSD together
  • Neo4j version: 2.0.1

The Titan version supported by graphdb-benchmark is 0.5.4.

2 Test results

2.1 Batch insertion performance

| Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w) |
|---------|------------------|------------------|---------------------------|------------------------|
| HugeGraph | 0.629 | 5.711 | 5.243 | 67.033 |
| Titan | 10.15 | 108.569 | 150.266 | 1217.944 |
| Neo4j | 3.884 | 18.938 | 24.890 | 281.537 |

Explanation

  • The data scale is in the table header in terms of edges
  • The data in the table is the time for batch insertion, in seconds
  • For example, HugeGraph(RocksDB) spent 5.711 seconds to insert 3 million edges of the amazon0601 dataset.
Conclusion
  • The performance of batch insertion: HugeGraph(RocksDB) > Neo4j > Titan(thrift+Cassandra)

2.2 Traversal performance

2.2.1 Explanation of terms
  • FN (Find Neighbor): Traverse all vertices, find the adjacent edges based on each vertex, and use the edges and vertices to find the other vertices adjacent to the original vertex.
  • FA (Find Adjacent): Traverse all edges, and get the source vertex and target vertex based on each edge (see the sketch below).
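As a rough illustration, the two traversal patterns can be written as Gremlin-Java (TinkerPop) traversals like the following sketch; graphdb-benchmark's own implementation may differ, and counting the results is only used here to force full evaluation.

```java
// A minimal sketch of the FN / FA access patterns in Gremlin-Java (TinkerPop).
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;

public class Traversals {
    // FN: for every vertex, walk its incident edges and reach the vertex on the other side
    static long findNeighbors(GraphTraversalSource g) {
        return g.V().bothE().otherV().count().next();
    }

    // FA: for every edge, fetch its two endpoint vertices
    static long findAdjacent(GraphTraversalSource g) {
        return g.E().bothV().count().next();
    }
}
```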
2.2.2 FN performance
| Backend | email-enron(3.6w) | amazon0601(40w) | com-youtube.ungraph(120w) | com-lj.ungraph(400w) |
|---------|-------------------|-----------------|---------------------------|-----------------------|
| HugeGraph | 4.072 | 45.118 | 66.006 | 609.083 |
| Titan | 8.084 | 92.507 | 184.543 | 1099.371 |
| Neo4j | 2.424 | 10.537 | 11.609 | 106.919 |

Explanation

  • The data in the table header “()” represents the data scale, in terms of vertices.
  • The data in the table represents the time spent traversing vertices in seconds.
  • For example, HugeGraph uses the RocksDB backend to traverse all vertices in amazon0601, and search for adjacent edges and another vertex, which takes a total of 45.118 seconds.
2.2.3 FA performance
| Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w) |
|---------|------------------|------------------|---------------------------|------------------------|
| HugeGraph | 1.540 | 10.764 | 11.243 | 151.271 |
| Titan | 7.361 | 93.344 | 169.218 | 1085.235 |
| Neo4j | 1.673 | 4.775 | 4.284 | 40.507 |

Explanation

  • The data size in the header “()” is based on the number of edges.
  • The data in the table is the time it takes to traverse the edges, in seconds.
  • For example, HugeGraph with the RocksDB backend traverses all edges in the amazon0601 dataset and looks up the two vertices of each edge, taking a total of 10.764 seconds.
Conclusion
  • Traversal performance: Neo4j > HugeGraph(RocksDB) > Titan(thrift+Cassandra)

2.3 Performance of Common Graph Analysis Methods in HugeGraph

Terminology Explanation
  • FS (Find Shortest Path): finding the shortest path between two vertices
  • K-neighbor: all vertices that can be reached within K hops (including 1, 2, 3…(K-1), K hops) from the starting vertex
  • K-out: all vertices that can be reached by traversing exactly K out-edges from the starting vertex (see the sketch after this list).
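A minimal Gremlin-Java (TinkerPop) sketch of the K-neighbor and K-out semantics is shown below. HugeGraph also provides dedicated K-out/K-neighbor traverser REST APIs; the label-free traversal here is an assumption made for illustration, not the exact query used in the test.

```java
// Sketch of K-neighbor (within k hops) vs K-out (exactly k hops along out-edges).
import java.util.Set;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;

public class KQueries {
    // K-neighbor: every vertex reachable within at most k hops of the start vertex
    static Set<Object> kNeighbor(GraphTraversalSource g, Object startId, int k) {
        return g.V(startId)
                .repeat(__.both().dedup()) // expand one hop per iteration, dropping duplicates
                .emit()                    // keep the vertices of every intermediate hop (1..k)
                .times(k)
                .id()
                .toSet();
    }

    // K-out: only the vertices reached by exactly k hops along out-edges
    static Set<Object> kOut(GraphTraversalSource g, Object startId, int k) {
        return g.V(startId)
                .repeat(__.out().dedup())  // follow out-edges only
                .times(k)                  // no emit(): only the k-th hop survives
                .id()
                .toSet();
    }
}
```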
FS performance
| Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) | com-lj.ungraph(3000w) |
|---------|------------------|------------------|---------------------------|------------------------|
| HugeGraph | 0.494 | 0.103 | 3.364 | 8.155 |
| Titan | 11.818 | 0.239 | 377.709 | 575.678 |
| Neo4j | 1.719 | 1.800 | 1.956 | 8.530 |

Explanation

  • The data in the header “()” represents the data scale in terms of edges
  • The data in the table is the time it takes to find the shortest path from the first vertex to 100 randomly selected vertices in seconds
  • For example, HugeGraph with the RocksDB backend took a total of 0.103s to find the shortest paths from the first vertex to 100 randomly selected vertices in the amazon0601 graph (see the sketch below).
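For reference, a single source-to-target lookup of this kind can be expressed with the standard TinkerPop shortest-path recipe, as sketched below; the benchmark repeats it from the first vertex to 100 random targets, and HugeGraph also ships a dedicated shortest-path traverser API. This is an illustrative sketch, not the benchmark's code.

```java
// One unweighted shortest-path lookup via the TinkerPop repeat/simplePath/until recipe.
import java.util.List;
import org.apache.tinkerpop.gremlin.process.traversal.Path;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__;

public class ShortestPathLookup {
    static List<Object> shortestPath(GraphTraversalSource g, Object sourceId, Object targetId) {
        Path p = g.V(sourceId)
                  .repeat(__.both().simplePath()) // expand hop by hop, never revisit a vertex on the path
                  .until(__.hasId(targetId))      // stop a traverser once it reaches the target
                  .path()
                  .limit(1)                       // the first complete path found is a shortest one
                  .next();
        return p.objects();                       // the vertices along the path
    }
}
```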
Conclusion
  • In scenarios with small data size or few vertex relationships, HugeGraph outperforms Neo4j and Titan.
  • As the data size increases and the degree of vertex association increases, the performance of HugeGraph and Neo4j tends to be similar, both far exceeding Titan.
K-neighbor Performance
| Vertex | Depth | Degree 1 | Degree 2 | Degree 3 | Degree 4 | Degree 5 | Degree 6 |
|--------|-------|----------|----------|----------|----------|----------|----------|
| v1 | Time | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM |
| v111 | Time | 0.027s | 0.034s | 0.115s | 1.36s | OOM | |
| v1111 | Time | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM |

Explanation

  • HugeGraph-Server’s JVM memory is set to 32GB and may experience OOM when the data is too large.
K-out performance
| Vertex | Depth | 1st Degree | 2nd Degree | 3rd Degree | 4th Degree | 5th Degree | 6th Degree |
|--------|-------|------------|------------|------------|------------|------------|------------|
| v1 | Time | 0.054s | 0.057s | 0.109s | 0.526s | 3.77s | OOM |
| | Degree | 10 | 133 | 2,453 | 50,830 | 1,128,688 | |
| v111 | Time | 0.032s | 0.042s | 0.136s | 1.25s | 20.62s | OOM |
| | Degree | 10 | 211 | 4,944 | 113,150 | 2,629,970 | |
| v1111 | Time | 0.039s | 0.045s | 0.053s | 1.10s | 2.92s | OOM |
| | Degree | 10 | 140 | 2,555 | 50,825 | 1,070,230 | |

Explanation

  • The JVM memory of HugeGraph-Server is set to 32GB, and OOM may occur when the data is too large.
Conclusion
  • In the FS scenario, HugeGraph outperforms Neo4j and Titan in terms of performance.
  • In the K-neighbor and K-out scenarios, HugeGraph can achieve results returned within seconds within 5 degrees.

2.4 Comprehensive Performance Test - CW

| Database | Size 1000 | Size 5000 | Size 10000 | Size 20000 |
|----------|-----------|-----------|------------|------------|
| HugeGraph(core) | 20.804 | 242.099 | 744.780 | 1700.547 |
| Titan | 45.790 | 820.633 | 2652.235 | 9568.623 |
| Neo4j | 5.913 | 50.267 | 142.354 | 460.880 |

Explanation

  • The “Size” in the table header is in terms of vertices.
  • The data in the table is the time required to complete community discovery in seconds. For example, if HugeGraph uses the RocksDB backend and operates on a dataset of 10,000 vertices, and the community aggregation is no longer changing, it takes 744.780 seconds.
  • The CW test is a comprehensive evaluation of CRUD operations.
  • In this test, HugeGraph, like Titan, did not use the client and directly operated on the core.
Conclusion
  • Performance of community detection algorithm: Neo4j > HugeGraph > Titan

2 - HugeGraph-API Performance

The HugeGraph API performance test mainly tests HugeGraph-Server’s ability to concurrently process RESTful API requests, including:

  • Single insertion of vertices/edges
  • Batch insertion of vertices/edges
  • Vertex/Edge Queries

For the performance test of the RESTful API of each release version of HugeGraph, please refer to:

Updates coming soon, stay tuned!

2.1 - v0.5.6 Stand-alone(RocksDB)

Note:

The current performance metrics are based on an earlier version. The latest version has significant improvements in both performance and functionality. We encourage you to refer to the most recent release featuring autonomous distributed storage and enhanced computational push down capabilities. Alternatively, you may wait for the community to update the data with these enhancements.

1 Test environment

Information of the machine under test:

| CPU | Memory | NIC | Disk |
|------|--------|-----|------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD, 2.7T HDD |

  • Load-generating machine information: configured the same as the machine under test.
  • Testing tool: Apache JMeter 2.5.1

Note: The load-generating machine and the machine under test are located in the same local network.

2 Test description

2.1 Definition of terms (the unit of time is ms)

  • Samples: The total number of threads completed in the current scenario.
  • Average: The average response time.
  • Median: The statistical median of the response time.
  • 90% Line: The response time below which 90% of all threads fall.
  • Min: The minimum response time.
  • Max: The maximum response time.
  • Error: The error rate.
  • Throughput: The number of requests processed per unit of time.
  • KB/sec: Throughput measured in terms of data transferred per second.

2.2 Underlying storage

RocksDB is used as the backend storage. HugeGraph and RocksDB both run on the same machine, and the server-related configuration files keep their defaults except for the host and port.

3 Summary of performance results

  1. The speed of inserting a single vertex or edge in HugeGraph is about 10,000 per second (see the request sketch after this list)
  2. The batch insertion speed of vertices and edges is much higher than the single insertion speed
  3. The concurrency of querying vertices and edges by ID can reach more than 13,000, and the average request latency is less than 50ms
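The single-insertion scenario boils down to many concurrent REST requests of roughly the following shape, sketched here with the Java 11 HttpClient. The URL path and payload mirror the HugeGraph RESTful API of this era, and the graph name "hugegraph", label "person", and property values are example assumptions; check them against the API docs of your server version before reuse.

```java
// Hedged sketch of one single-vertex insertion request against HugeGraph-Server.
// The endpoint path, graph name and payload fields are assumptions for illustration.
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class SingleVertexInsert {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String body = "{\"label\": \"person\", \"properties\": {\"name\": \"marko\", \"age\": 29}}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://127.0.0.1:8080/apis/graphs/hugegraph/graph/vertices"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```

In the JMeter scenarios below, many threads issue requests like this one in parallel; the batch scenario differs mainly in posting many vertices or edges per request.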

4 Test results and analysis

4.1 Batch insertion

4.1.1 Upper limit stress testing
Test methods

The upper limit of stress testing is to continuously increase the concurrency and test whether the server can still provide services normally.

Stress Parameters

Duration: 5 minutes

Maximum insertion speed for vertices:
Conclusion:
  • With a concurrency of 2200, the throughput for vertices is 2026.8 requests per second. Since each request inserts 200 vertices, the system inserts about 405,360 vertices per second (2026.8 * 200).
Maximum insertion speed for edges:
Conclusion:
  • With a concurrency of 900, the throughput for edges is 776.9 requests per second. Since each request inserts 500 edges, the system inserts about 388,450 edges per second (776.9 * 500).

4.2 Single insertion

4.2.1 Stress limit testing
Test Methods

Stress limit testing is a process of continuously increasing the concurrency level to test the upper limit of the server’s ability to provide normal service.

Stress parameters
  • Duration: 5 minutes.
  • Service exception indicator: Error rate greater than 0.00%.
Single vertex insertion
Conclusion:
  • With a concurrency of 11,500, the throughput is 10,730, i.e. the concurrency capability for single vertex insertion is 11,500.
Single edge insertion
Conclusion:
  • With a concurrency of 9,000, the throughput is 8,418, i.e. the concurrency capability for single edge insertion is 9,000.

4.3 Search by ID

4.3.1 Stress test upper limit
Testing method

Continuously increase the concurrency level to test the upper limit at which the server can still provide service normally.

Stress parameters
  • Duration: 5 minutes
  • Service abnormality indicator: error rate greater than 0.00%
Querying vertices by ID
Conclusion:
  • Concurrency is 14,000, throughput is 12,663. The concurrency capacity for querying vertices by ID is 14,000, with an average delay of 44ms.
Querying edges by ID
Conclusion:
  • Concurrency is 13,000, throughput is 12,225. The concurrency capacity for querying edges by ID is 13,000, with an average delay of 12ms (see the request sketch below).
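For reference, the query-by-ID scenario issues GET requests of roughly the following shape, again sketched with the Java 11 HttpClient. The vertex ID format depends on the schema; the quoted ID "1:marko" is a placeholder assumption, and the URL path should be checked against the HugeGraph RESTful API docs for your server version.

```java
// Hedged sketch of one query-by-ID request against HugeGraph-Server.
// The endpoint path and the ID format ("1:marko") are assumptions for illustration.
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class VertexQueryById {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        String vertexId = URLEncoder.encode("\"1:marko\"", StandardCharsets.UTF_8); // placeholder ID
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://127.0.0.1:8080/apis/graphs/hugegraph/graph/vertices/" + vertexId))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```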

2.2 - v0.5.6 Cluster(Cassandra)

Note:

The current performance metrics are based on an earlier version. The latest version has significant improvements in both performance and functionality. We encourage you to refer to the most recent release featuring autonomous distributed storage and enhanced computational push down capabilities. Alternatively, you may wait for the community to update the data with these enhancements.

1 Test environment

Information of the machine under test

| CPU | Memory | NIC | Disk |
|------|--------|-----|------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD, 2.7T HDD |

  • Load-generating machine information: configured the same as the machine under test.
  • Testing tool: Apache JMeter 2.5.1.

Note: The machine used to initiate the load and the machine being tested are located in the same data center (or server room)

2 Test Description

2.1 Definition of terms (the unit of time is ms)

  • Samples – The total number of threads completed in this scenario.
  • Average – The average response time.
  • Median – The median response time in statistical terms.
  • 90% Line – The response time below which 90% of all threads fall.
  • Min – The minimum response time.
  • Max – The maximum response time.
  • Error – The error rate.
  • Throughput – The number of transactions processed per unit of time.
  • KB/sec – The throughput measured in terms of data transmitted per second.

2.2 Low-Level Storage

A 15-node Cassandra cluster is used for backend storage. HugeGraph and the Cassandra cluster are located on separate servers. Server-related configuration files are modified only for host and port settings, while the rest remain default.

3 Summary of Performance Results

  1. The speed of a single vertex and edge insertion in HugeGraph is 9000 and 4500 per second, respectively.
  2. The speed of bulk vertex and edge insertion is 50,000 and 150,000 per second, respectively, which is much higher than the single insertion speed.
  3. The concurrency for querying vertices and edges by ID can reach more than 12,000, and the average request delay is less than 70ms.

4 Test Results and Analysis

4.1 Batch Insertion

4.1.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to test the upper limit of the server’s ability to provide services.

Pressure Parameters

Duration: 5 minutes.

Maximum Insertion Speed of Vertices:
Conclusion:
  • At a concurrency level of 3500, the throughput of vertices is 261, and the amount of data processed per second is 52,200 (261 * 200).
Maximum Insertion Speed of Edges:
Conclusion:
  • At a concurrency level of 1000, the throughput of edges is 323, and the amount of data processed per second is 161,500 (323 * 500).

4.2 Single Insertion

4.2.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to test the upper limit of the server’s ability to provide services.

Pressure Parameters
  • Duration: 5 minutes.
  • Service exception mark: Error rate greater than 0.00%.
Single Insertion of Vertices:
Conclusion:
  • At a concurrency level of 9000, the throughput is 8400, and the single-insertion concurrency capability for vertices is 9000.
Single Insertion of Edges:
Conclusion:
  • At a concurrency level of 4500, the throughput is 4160, and the single-insertion concurrency capability for edges is 4500.

4.3 Query by ID

4.3.1 Pressure Upper Limit Test
Test Method

Continuously increase the concurrency level to test the upper limit at which the server can still provide services normally.

Pressure Parameters
  • Duration: 5 minutes
  • Service exception flag: error rate greater than 0.00%
Query by ID for vertices
Conclusion:
  • The concurrent capacity of the vertex search by ID is 14500, with a throughput of 13576 and an average delay of 11ms.
Edge search by ID
Conclusion:
  • For edge ID-based queries, the server’s concurrent capacity is up to 12,000, with a throughput of 10,688 and an average latency of 63ms.

3 - HugeGraph-Loader Performance

Note:

The current performance metrics are based on an earlier version. The latest version has significant improvements in both performance and functionality. We encourage you to refer to the most recent release featuring autonomous distributed storage and enhanced computational push down capabilities. Alternatively, you may wait for the community to update the data with these enhancements.

Use Cases

When the number of graph elements (vertices and edges) to be batch inserted is at the billion level or below, or the total data size is less than a TB, the HugeGraph-Loader tool can be used to continuously and quickly import graph data.

Performance

The test uses the edge data of a website (web-link) dataset.

RocksDB single-machine performance (Update: multi-raft + rocksdb cluster is supported now)

  • When the label index is turned off, 228k edges/s.
  • When the label index is turned on, 153k edges/s.

Cassandra cluster performance

  • When label index is turned on by default, 63k edges/s.

4 -

1 Test environment

1.1 Hardware information

| CPU | Memory | NIC | Disk |
|------|--------|-----|------|
| 48 Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz | 128G | 10000Mbps | 750GB SSD |

1.2 Software information

1.2.1 Test cases

Testing is done using graphdb-benchmark, a benchmark suite for graph databases. This benchmark suite mainly consists of four types of tests:

  • Massive Insertion, which involves batch insertion of vertices and edges, with a certain number of vertices or edges being submitted at once.

  • Single Insertion, which involves the immediate insertion of each vertex or edge, one at a time.

  • Query, which mainly includes the basic query operations of the graph database:

    • Find Neighbors, which queries the neighbors of all vertices.
    • Find Adjacent Nodes, which queries the adjacent vertices of all edges.
    • Find Shortest Path, which queries the shortest paths from the first vertex to 100 random vertices.
  • Clustering, which is a community detection algorithm based on the Louvain Method.

1.2.2 Test dataset

Tests are conducted using both synthetic and real data.

The scale of the datasets used in this test:

| Name | Number of Vertices | Number of Edges | File Size |
|------|--------------------|-----------------|-----------|
| email-enron.txt | 36,691 | 367,661 | 4MB |
| com-youtube.ungraph.txt | 1,157,806 | 2,987,624 | 38.7MB |
| amazon0601.txt | 403,393 | 3,387,388 | 47.9MB |

1.3 Service configuration

  • HugeGraph version: 0.4.4; RestServer, Gremlin Server, and backends are on the same server
  • Cassandra version: cassandra-3.10, commit-log and data share the SSD
  • RocksDB version: rocksdbjni-5.8.6
  • Titan version: 0.5.4, using thrift+Cassandra mode

The Titan version supported by graphdb-benchmark is 0.5.4.

2 Test results

2.1 Batch insertion performance

| Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) |
|---------|------------------|------------------|---------------------------|
| Titan | 9.516 | 88.123 | 111.586 |
| RocksDB | 2.345 | 14.076 | 16.636 |
| Cassandra | 11.930 | 108.709 | 101.959 |
| Memory | 3.077 | 15.204 | 13.841 |

Explanation

  • The data in “()” in the table header is the data scale, in terms of edges
  • The data in the table is the batch insertion time, in seconds
  • For example, HugeGraph with the RocksDB backend takes 14.076s to insert the 3 million edges of the amazon0601 dataset, at a speed of about 210k edges/s
Conclusion
  • The insertion performance of the RocksDB and Memory backends is better than Cassandra
  • When both HugeGraph and Titan use Cassandra as the backend, their insertion performance is close

2.2 Traversal performance

2.2.1 Explanation of terms
  • FN (Find Neighbor): Traverse all vertices, find the adjacent edges of each vertex, and use the edges and vertices to find the other vertices adjacent to the original vertex.
  • FA (Find Adjacent): Traverse all edges, and get the source vertex and target vertex of each edge.
2.2.2 FN performance
| Backend | email-enron(3.6w) | amazon0601(40w) | com-youtube.ungraph(120w) |
|---------|-------------------|-----------------|---------------------------|
| Titan | 7.724 | 70.935 | 128.884 |
| RocksDB | 8.876 | 65.852 | 63.388 |
| Cassandra | 13.125 | 126.959 | 102.580 |
| Memory | 22.309 | 207.411 | 165.609 |

Explanation

  • The data in “()” in the table header is the data scale, in terms of vertices
  • The data in the table is the time spent traversing vertices, in seconds
  • For example, HugeGraph with the RocksDB backend traverses all vertices in amazon0601 and looks up the adjacent edges and the other vertex, taking a total of 65.852s
2.2.3 FA performance
| Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) |
|---------|------------------|------------------|---------------------------|
| Titan | 7.119 | 63.353 | 115.633 |
| RocksDB | 6.032 | 64.526 | 52.721 |
| Cassandra | 9.410 | 102.766 | 94.197 |
| Memory | 12.340 | 195.444 | 140.89 |

Explanation

  • The data in “()” in the table header is the data scale, in terms of edges
  • The data in the table is the time spent traversing edges, in seconds
  • For example, HugeGraph with the RocksDB backend traverses all edges in amazon0601 and queries the two vertices of each edge, taking a total of 64.526s
Conclusion
  • HugeGraph RocksDB > Titan thrift+Cassandra > HugeGraph Cassandra > HugeGraph Memory

2.3 Performance of Common Graph Analysis Methods in HugeGraph

Explanation of terms
  • FS (Find Shortest Path): finding the shortest path between two vertices
  • K-neighbor: all vertices that can be reached within K hops (including 1, 2, 3…(K-1), K hops) from the starting vertex
  • K-out: all vertices that can be reached by traversing exactly K out-edges from the starting vertex
FS performance
| Backend | email-enron(30w) | amazon0601(300w) | com-youtube.ungraph(300w) |
|---------|------------------|------------------|---------------------------|
| Titan | 11.333 | 0.313 | 376.06 |
| RocksDB | 44.391 | 2.221 | 268.792 |
| Cassandra | 39.845 | 3.337 | 331.113 |
| Memory | 35.638 | 2.059 | 388.987 |

Explanation

  • The data in “()” in the table header is the data scale, in terms of edges
  • The data in the table is the time to find the shortest paths from the first vertex to 100 randomly selected vertices, in seconds
  • For example, HugeGraph with the RocksDB backend finds the shortest paths from the first vertex to 100 random vertices, taking a total of 2.059s
Conclusion
  • In scenarios with a small data scale or few vertex associations, Titan's shortest-path performance is better than HugeGraph's
  • As the data scale grows and the vertex association degree increases, HugeGraph's shortest-path performance becomes better than Titan's
K-neighbor performance
| Vertex | Depth | Degree 1 | Degree 2 | Degree 3 | Degree 4 | Degree 5 | Degree 6 |
|--------|-------|----------|----------|----------|----------|----------|----------|
| v1 | Time | 0.031s | 0.033s | 0.048s | 0.500s | 11.27s | OOM |
| v111 | Time | 0.027s | 0.034s | 0.115s | 1.36s | OOM | |
| v1111 | Time | 0.039s | 0.027s | 0.052s | 0.511s | 10.96s | OOM |

Explanation

  • HugeGraph-Server's JVM memory is set to 32GB, and OOM may occur when the data size is too large
K-out performance
| Vertex | Depth | 1st Degree | 2nd Degree | 3rd Degree | 4th Degree | 5th Degree | 6th Degree |
|--------|-------|------------|------------|------------|------------|------------|------------|
| v1 | Time | 0.054s | 0.057s | 0.109s | 0.526s | 3.77s | OOM |
| | Degree | 10 | 133 | 2,453 | 50,830 | 1,128,688 | |
| v111 | Time | 0.032s | 0.042s | 0.136s | 1.25s | 20.62s | OOM |
| | Degree | 10 | 211 | 4,944 | 113,150 | 2,629,970 | |
| v1111 | Time | 0.039s | 0.045s | 0.053s | 1.10s | 2.92s | OOM |
| | Degree | 10 | 140 | 2,555 | 50,825 | 1,070,230 | |

Explanation

  • HugeGraph-Server's JVM memory is set to 32GB, and OOM may occur when the data size is too large
Conclusion
  • In the FS scenario, HugeGraph outperforms Titan
  • In the K-neighbor and K-out scenarios, HugeGraph can return results within seconds within 5 degrees

2.4 Comprehensive Performance Test - CW

| Database | Size 1000 | Size 5000 | Size 10000 | Size 20000 |
|----------|-----------|-----------|------------|------------|
| Titan | 45.943 | 849.168 | 2737.117 | 9791.46 |
| Memory(core) | 41.077 | 1825.905 | * | * |
| Cassandra(core) | 39.783 | 862.744 | 2423.136 | 6564.191 |
| RocksDB(core) | 33.383 | 199.894 | 763.869 | 1677.813 |

Explanation

  • The “Size” is in terms of vertices
  • The data in the table is the time required to complete community detection, in seconds; for example, HugeGraph with the RocksDB backend on a dataset of size 10000 takes 763.869s until the community aggregation no longer changes
  • “*” means the test did not finish within 10000s
  • The CW test is a comprehensive evaluation of CRUD operations
  • The last three rows are different backends of HugeGraph; in this test, HugeGraph, like Titan, did not go through the client and operated directly on the core
Conclusion
  • When using the Cassandra backend, HugeGraph performs slightly better than Titan, and the advantage becomes more obvious as the data scale grows; at size 20000, HugeGraph is 30% faster than Titan
  • When using the RocksDB backend, HugeGraph performs far better than Titan and HugeGraph's Cassandra backend, 6 times and 4 times faster respectively