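The LOAD CSV statement has two forms: a dump form, which returns the parsed rows to the client, and an import form, which writes the rows into the graph with INTO.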
Both forms read a local filesystem path or a file:// URI. Remote schemes (http://, https://, s3://, ...) are not supported.
Syntax

    <load csv statement> ::=
      "LOAD CSV FROM" <single-quoted file source> { <dump csv> | <import from csv> }

    <dump csv> ::=
      [ "AS" <row variable> ]
      [ "WITH HEADER" [ "DELIMITER" <character> [ "QUOTE" <character> ] ] ]

    <import from csv> ::=
      "WITH HEADER" [ "DELIMITER" <character> [ "QUOTE" <character> ] ]
      "INTO" <label name>
      [ "EDGE FROM" <label name> "(" <column> ")" "TO" <label name> "(" <column> ")" ]
      [ "MAPPING" "(" <property mapping> { "," ... } ")" ]

    <property mapping> ::=
      <property> ":" <column> [ "AS" <type> ]

Details
- <row variable> defaults to row when AS is omitted.
- With WITH HEADER, the first row is consumed as column names and each subsequent row is keyed by those names. Without WITH HEADER, each row is keyed positionally: col0, col1, ... .
- DELIMITER accepts a single character (e.g. ',', ';', '\t'). QUOTE accepts the double quote (") as the field-quote character.
- Absolute paths (e.g. /data/users.csv) and file:/// URIs are used as-is. Relative paths resolve against the server process's working directory (the directory the database was launched from), not the .gdb data folder. Prefer absolute paths in production.

Sample files:

user.csv

    _id,name,balance,risk_score,is_mule
    USR_000000,David Brown,16625.7,0.86,true
    USR_000001,James Johnson,62870.63,0.76,true
    USR_000002,Jane Brown,17065.87,0.85,true

transfers.csv

    _from,_to,amount,is_suspicious,timestamp
    USR_000000,USR_000002,3384.45,false,2024-01-27T19:19:00Z
    USR_000001,USR_000000,4353.22,true,2024-01-25T06:18:00Z
    USR_000000,USR_000001,2482.13,false,2024-01-30T15:27:00Z
    USR_000001,USR_000002,4050.61,false,2024-01-28T01:02:00

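Because relative paths depend on where the server was launched, a client may want to normalize a source path before putting it into a statement. The following is a minimal Python sketch of the documented path rules, assuming POSIX-style paths and that the caller knows the server's working directory. The helper name `absolute_source` and the directory `/srv/graph` are hypothetical and not part of the database.

```python
from pathlib import PurePosixPath


def absolute_source(path: str, server_cwd: str) -> str:
    """Return a source string that does not depend on the server's working directory.

    Hypothetical helper: absolute paths and file:// URIs pass through unchanged,
    relative paths are joined onto the directory the server was launched from.
    """
    if path.startswith("file://"):
        return path  # URIs are used as-is
    p = PurePosixPath(path)
    if p.is_absolute():
        return path  # absolute paths are used as-is
    return str(PurePosixPath(server_cwd) / p)


print(absolute_source("data/users.csv", "/srv/graph"))   # /srv/graph/data/users.csv
print(absolute_source("/data/users.csv", "/srv/graph"))  # /data/users.csv
print(PurePosixPath("/data/users.csv").as_uri())         # file:///data/users.csv
```

The sketch only rewrites strings; it does not check that the file exists on the server.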
Read a CSV file without WITH HEADER. Every line, including a header line if the file has one, is returned as a map keyed positionally (col0, col1, ...) under a single column named row (the default variable):

    LOAD CSV FROM 'data/users.csv'

Result:

    [
      { "row": { "col2": "balance", "col3": "risk_score", "col4": "is_mule", "col0": "_id", "col1": "name" } },
      { "row": { "col0": "USR_000000", "col1": "David Brown", "col2": "16625.7", "col3": "0.86", "col4": "true" } },
      { "row": { "col0": "USR_000001", "col1": "James Johnson", "col2": "62870.63", "col3": "0.76", "col4": "true" } },
      { "row": { "col3": "0.85", "col4": "true", "col0": "USR_000002", "col1": "Jane Brown", "col2": "17065.87" } }
    ]

Read a CSV file with a header so the map is keyed by column name:

    LOAD CSV FROM 'data/users.csv' WITH HEADER

Result:

    [
      { "row": { "_id": "USR_000000", "name": "David Brown", "balance": "16625.7", "risk_score": "0.86", "is_mule": "true" } },
      { "row": { "_id": "USR_000001", "name": "James Johnson", "balance": "62870.63", "risk_score": "0.76", "is_mule": "true" } },
      { "row": { "_id": "USR_000002", "name": "Jane Brown", "balance": "17065.87", "risk_score": "0.85", "is_mule": "true" } }
    ]

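The difference between the two results can be imitated on the client with Python's csv module. This is only an illustration of the keying rule, not the server's implementation; `key_rows` is a hypothetical helper, and the sample text is a shortened copy of the users file.

```python
import csv
import io

SAMPLE = """_id,name,balance,risk_score,is_mule
USR_000000,David Brown,16625.7,0.86,true
USR_000001,James Johnson,62870.63,0.76,true
"""


def key_rows(text: str, with_header: bool) -> list[dict[str, str]]:
    """Imitate how the dump form keys each row (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text)))
    if with_header:
        # The first row supplies the keys and is not returned as data.
        header, data = rows[0], rows[1:]
        return [dict(zip(header, r)) for r in data]
    # No header: every line is data, keyed col0, col1, ...
    return [{f"col{i}": v for i, v in enumerate(r)} for r in rows]


positional = key_rows(SAMPLE, with_header=False)
named = key_rows(SAMPLE, with_header=True)
print(positional[0]["col0"])  # _id  (the header line itself comes back as data)
print(named[0]["name"])       # David Brown
```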
Bind the per-row map to a different column name with AS:

    LOAD CSV FROM '/data/users.csv' AS u

Read a tab-separated file:

    LOAD CSV FROM 'data/users.tsv' AS u WITH HEADER DELIMITER '\t'

Specify the field-quote character with QUOTE. The double quote (") is the supported quoting character:

    LOAD CSV FROM 'data/users.csv' AS u WITH HEADER DELIMITER ',' QUOTE '"'

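To see what a delimiter and a quote character control, the sketch below parses one semicolon-delimited line with Python's csv module. It assumes the server follows the usual CSV convention that a quoted field may contain the delimiter; the sample line is made up for illustration and does not come from the sample files.

```python
import csv
import io

# Hypothetical line: the name field contains the delimiter, so it is quoted.
line = 'USR_000003;"Brown; David";120.5\n'

reader = csv.reader(io.StringIO(line), delimiter=";", quotechar='"')
fields = next(reader)
print(fields)  # ['USR_000003', 'Brown; David', '120.5']
```

Without the quotes, the same line would split into four fields instead of three.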
The result set has a single column, named row by default or whatever name AS specifies (u in the examples above). Each value in that column is a MAP; access its fields client-side after fetching, e.g. row['name'] in your driver.
The rows are returned to the client only; they cannot be passed to a following RETURN, INSERT, or MATCH in the same query.
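Since the dump form hands every value back as a string, typed values have to be built on the client. The sketch below assumes the driver returns the result as a list of dicts shaped like the JSON output shown earlier, here with the column named u; the `CONVERTERS` table and `convert` function are hypothetical client code, not a driver API.

```python
from typing import Any, Callable

# One entry as returned by the dump form with AS u WITH HEADER (all values are strings).
result = [
    {"u": {"_id": "USR_000000", "name": "David Brown", "balance": "16625.7",
           "risk_score": "0.86", "is_mule": "true"}},
]

# Hypothetical per-column converters chosen by the client.
CONVERTERS: dict[str, Callable[[str], Any]] = {
    "balance": float,
    "risk_score": float,
    "is_mule": lambda s: s.lower() == "true",
}


def convert(record: dict[str, str]) -> dict[str, Any]:
    """Apply a converter where one is registered; leave other fields as strings."""
    return {key: CONVERTERS.get(key, str)(value) for key, value in record.items()}


users = [convert(entry["u"]) for entry in result]
print(type(users[0]["balance"]).__name__, users[0]["is_mule"])  # float True
```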
Use INTO <label> to write each CSV row as a node (or as an edge, with EDGE FROM ... TO ...) in a single statement; no chained INSERT is needed. The statement returns a one-row summary with the count of imported (and, for edges, skipped) records.
Without MAPPING, every CSV column becomes a property of the same name on the new node. WITH HEADER is required (the import needs column names to map by). All values default to STRING.

    LOAD CSV FROM 'data/users.csv' WITH HEADER INTO User

Add a MAPPING (...) clause to rename columns, select a subset, and apply type coercion. Include _id in the mapping if you plan to reference these nodes from a later edge import.

    LOAD CSV FROM 'data/users.csv' WITH HEADER INTO Person
    MAPPING (
      _id: '_id',
      name: 'name',
      balance: 'balance' AS FLOAT,
      riskScore: 'risk_score' AS FLOAT,
      isMule: 'is_mule' AS BOOL
    )

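When the mapping is generated from a schema rather than typed by hand, the statement text can be assembled programmatically. The sketch below follows the syntax given earlier and reproduces the Person import on a single line. `mapping_clause` and `import_statement` are hypothetical helpers, and they do not escape quotes in their inputs, so they suit trusted input only.

```python
from typing import Optional

Mapping = tuple[str, str, Optional[str]]  # (property, column, type or None)


def mapping_clause(mappings: list[Mapping]) -> str:
    """Render MAPPING ( property: 'column' [AS TYPE], ... )."""
    parts = []
    for prop, column, type_name in mappings:
        item = f"{prop}: '{column}'"
        if type_name:
            item += f" AS {type_name}"
        parts.append(item)
    return "MAPPING ( " + ", ".join(parts) + " )"


def import_statement(source: str, label: str, mappings: list[Mapping]) -> str:
    """Render a node import; no escaping is performed (sketch only)."""
    return f"LOAD CSV FROM '{source}' WITH HEADER INTO {label} " + mapping_clause(mappings)


stmt = import_statement("data/users.csv", "Person", [
    ("_id", "_id", None),
    ("name", "name", None),
    ("balance", "balance", "FLOAT"),
    ("riskScore", "risk_score", "FLOAT"),
    ("isMule", "is_mule", "BOOL"),
])
print(stmt)
```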
For edges, name the endpoint labels and the CSV columns that supply each endpoint's _id. Rows whose endpoint _id is missing in the graph are skipped and reported in the summary instead of failing the whole import. Edge import requires the graph to have EDGE_ID enabled (the default for new graphs).

    LOAD CSV FROM 'data/transfers.csv' WITH HEADER INTO TRANSFERS
    EDGE FROM Person('_from') TO Person('_to')
    MAPPING (
      isSuspicious: 'is_suspicious' AS BOOL,
      timestamp: 'timestamp' AS TIMESTAMP
    )

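To estimate ahead of time how many edge rows would be skipped, the endpoint check can be approximated on the client. The sketch below compares the endpoint columns of an edge file against the _id values of a node file. It uses trimmed copies of the sample files with USR_000002 left out of the users, so that one transfer has a missing endpoint; `split_edges` is a hypothetical helper, and the real import checks against nodes already in the graph rather than against a file.

```python
import csv
import io

USERS = """_id,name
USR_000000,David Brown
USR_000001,James Johnson
"""

# The second row points at an _id that is absent from USERS.
TRANSFERS = """_from,_to,amount
USR_000000,USR_000001,2482.13
USR_000001,USR_000002,4050.61
"""


def split_edges(users_csv: str, edges_csv: str):
    """Separate edge rows whose endpoints both exist from rows that would be skipped."""
    known = {row["_id"] for row in csv.DictReader(io.StringIO(users_csv))}
    importable, skipped = [], []
    for row in csv.DictReader(io.StringIO(edges_csv)):
        ok = row["_from"] in known and row["_to"] in known
        (importable if ok else skipped).append(row)
    return importable, skipped


importable, skipped = split_edges(USERS, TRANSFERS)
print(len(importable), len(skipped))  # 1 1
```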
Supported types for AS <type> in MAPPING:

| Type | Aliases | Accepted format |
|---|---|---|
| STRING | TEXT, CHAR, VARCHAR | cell preserved verbatim |
| INT / INTEGER | BIGINT, SMALLINT, INT8/16/32/64, UINT, UINT8/16/32/64 | base-10, whitespace trimmed |
| FLOAT / DOUBLE | REAL, FLOAT32, FLOAT64 | IEEE 754, whitespace trimmed |
| DECIMAL | NUMERIC | arbitrary-precision, stored as the original digit string |
| BOOL / BOOLEAN | | true/false/t/f/yes/no/y/n/1/0 (case-insensitive) |
| DATE | | YYYY-MM-DD |
| TIME | | HH:MM[:SS[.fff]] |
| DATETIME | LOCAL_DATETIME | RFC 3339 or YYYY-MM-DD HH:MM:SS (no zone) |
| TIMESTAMP | | RFC 3339 (assumes UTC if no offset) |
| ZONED_DATETIME | | RFC 3339 with offset (2024-01-15T09:00:00+09:00) |
| DURATION | INTERVAL | ISO 8601 (PT3H30M, P1Y2M, -PT15M, fractional units OK) |
| BYTES | BLOB, BINARY, VARBINARY | hex (0xDEADBEEF or deadbeef) or base64 |
| POINT | | 1.5 2.5, 1.5,2.5, or POINT(1.5 2.5); longitude first |
| POINT3D | | 1.0 2.0 3.0, 1.0,2.0,3.0, or POINT(1.0 2.0 3.0) |
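A client can pre-check a column against the accepted formats before running an import. The sketch below covers BOOL, DATE, and TIME only, as listed in the table. The function names are hypothetical, the checks do not trim whitespace (the table does not say whether the server does for these types), and allowing any number of fractional-second digits for TIME is an assumption.

```python
import re
from datetime import date

BOOL_TOKENS = {"true", "false", "t", "f", "yes", "no", "y", "n", "1", "0"}
DATE_RE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")
# Shape check only: HH:MM, optional :SS, optional fractional seconds.
TIME_RE = re.compile(r"^\d{2}:\d{2}(:\d{2}(\.\d+)?)?$")


def is_bool(cell: str) -> bool:
    """Case-insensitive match against the BOOL tokens from the table."""
    return cell.lower() in BOOL_TOKENS


def is_date(cell: str) -> bool:
    """YYYY-MM-DD that also names a real calendar day."""
    m = DATE_RE.match(cell)
    if not m:
        return False
    try:
        date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
        return True
    except ValueError:
        return False


def is_time(cell: str) -> bool:
    """HH:MM[:SS[.fff]] by shape; hour and minute ranges are not checked."""
    return TIME_RE.match(cell) is not None


print(is_bool("TRUE"), is_date("2024-01-27"), is_time("19:19"))  # True True True
```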
Notes:
- Empty cells become NULL (the property is omitted from the row) rather than an empty-string value.
- Only local paths and file:// URIs are supported. HTTP(S) and object-storage sources are not yet wired in.
- All values are returned as STRING in the dump form; downstream client code is responsible for any type conversion. The inline-import form does its own coercion via AS <TYPE> in the MAPPING clause.