This page catalogs the performance levers GQLDB exposes — what they do, when to reach for them, and where the full reference lives. Treat it as a starting index: most rows link to the canonical page for that lever.
Before changing settings, find out where the time is actually going. GQLDB gives you two complementary tools:
| Tool | What it tells you | Reference |
|---|---|---|
EXPLAIN <query> | Estimated cost, planned operators, estimated row counts. No execution. | Execution Plan |
PROFILE <query> | Same as EXPLAIN plus actual per-operator row counts and elapsed time after running the query. | Execution Plan |
A profile reading is worth more than any guess about what's slow. If the plan picks a full scan where you expected an index hit, the fix is upstream (add the index, fix the predicate). If the plan looks right but a single operator dominates the time, the fix is at that operator (cardinality, memory, parallelism).
The single largest lever for read latency. GQLDB supports three index families:
| Index | What it accelerates | Reference |
|---|---|---|
| Property index | Equality and range predicates on node / edge properties, anywhere a MATCH filters by a non-key property. | Index |
| Full-text index | CONTAINS / phrase search on string properties, with tokenizer-aware matching. | Full-text Index |
| Vector index | Approximate-nearest-neighbor search over dense vector properties: embeddings, image features, etc. | Vector Index |
If a query scans the whole graph but only returns a handful of rows, you almost certainly want an index. Confirm with EXPLAIN before and after to see the operator change.
For graph algorithms — PageRank, shortest path, community detection, embeddings — the compute engine builds an in-memory topology representation that is orders of magnitude faster than executing the same logic through the row-oriented GQL pipeline.
| Lever | What it does | Reference |
|---|---|---|
ALTER GRAPH <graphName> SET COMPUTE ENABLED | Enables the compute engine for a graph; builds the topology cache on first algorithm call. | Computing Engine |
| Cached properties | Promote frequently accessed properties into the compute cache so algorithms read them without a round trip to LSM. | Computing Engine |
| Topology build state | Inspect whether the topology cache is ready, building, or stale. | Computing Engine |
Rule of thumb: if a query runs CALL algo.*, it needs the compute engine. If it's pure pattern matching, it doesn't.
Edge _id is a per-graph toggle that controls whether edges carry a fast _id lookup path. When enabled (the default), each edge gets a UUID-style _id, backed by a hidden on-disk index and an in-memory cache; you can match an edge by _id in O(1) and assign custom _id values at insertion. When disabled, the index and cache go away, edges fall back to system-assigned e:<N> identifiers, and _id-based edge lookups are rejected.
The lookup is real overhead: roughly 50 bytes per edge in the in-memory cache (≈5 GB for 100M edges, ≈50 GB for 1B), plus the hidden index on disk and a uniqueness check on every write. Turn it off when the workload never addresses edges by _id — typically pure-topology workloads (graph algorithms, k-hop traversal, recommendation engines) and bulk ETL pipelines where insert throughput matters more than per-edge addressability. Nodes don't have an equivalent switch because node _id lookup is free (nodes are stored by their _id directly).
Toggling on a populated graph runs a background converter that walks every edge; inspect progress with SHOW EDGE_ID STATUS. See Node and Edge IDs for the syntax, toggle states, and what happens to existing _id values after a transition.
The right memory settings depend on graph size and access pattern. The flags below are all set at startup; see Database Installation → Important Flags for the full table.
| Flag | What it controls | When to tune |
|---|---|---|
-cache-size | In-memory read cache (4 KB pages). Default 10000 ≈ 40 MB. | Raise for read-heavy workloads on graphs that don't fit in OS page cache. Lower on small instances to free RAM for other consumers. |
-mem-limit-bytes | Soft memory ceiling for the process. 0 = auto from cgroup, -1 = disabled. | Set explicitly when running alongside other heavy processes on the same host. |
-max-msg-size | Per-message gRPC limit (both directions). | Raise if large INSERT payloads or large RETURN result sets fail with ResourceExhausted. |
WAL fsync policy is the single biggest write-side knob:
-wal-sync-mode | Behavior | Tradeoff |
|---|---|---|
0 None | No fsync; memtable only. | Highest throughput, whole memtable lost on crash. |
1 Batch | Group-commit fsync. | Strong throughput, group-commit window lost on crash. |
2 Every | fsync on every commit. | Zero data loss, lowest throughput. |
3 Async (default) | Background fsync; ~100 ms write-loss window. | Balanced — what most workloads should keep. |
Set via -wal-sync-mode at startup or in -config. Don't pick 0 for any data you care about — it's a benchmark mode, not a production setting.
The planner uses db.stats() outputs to pick operators. Stale statistics produce stale plans.
| Lever | What it does | Reference |
|---|---|---|
db.stats() | Inspect current statistics: nodeCount, edgeCount, per-label property frequency. | Database Info → Statistics |
db.reload_stats() | Force a full rebuild from a fresh scan. O(N + E). | Database Info → Rebuilding Statistics |
Run db.reload_stats() after a bulk import, a restore, or whenever db.stats() reports statsReady = false. Don't run it on a hot path — it's a maintenance operation.
Disk-bound workloads benefit more from hardware than from any flag:
gp3 / io2 are acceptable for moderate workloads; instance-store NVMe wins for write-heavy.On HA topologies, v1.0 routes all reads to the leader — no read scale-out. v1.1 introduces opt-in follower reads with a driver-side staleness tolerance: reads can route to a follower if the follower's lag is within the configured window. See Clustering → Routing & Driver Behavior.
For pure read scale today, the path is vertical (bigger leader instance) or app-side caching in front of the database, not horizontal across followers.
Imports are usually faster as a series of large transactions than as one giant transaction or many tiny ones. The two main paths:
| Path | Best for | Reference |
|---|---|---|
LOAD CSV | Files already on disk on the server. Supports dump form and inline import. | LOAD CSV |
Driver batched INSERT | Data arriving from the application side. Pack 1k–10k entities per transaction. | Ultipa Drivers |
Before a bulk import: disable indexes you'll rebuild after, optionally lower -wal-sync-mode to 1 (Batch) for the import window, and run db.reload_stats() once the import completes so the planner sees current data.