Based on the retail business graph model built in the previous section, this section explains how to use Ultipa Manager and the Importer tool of Ultipa Transporter to batch import data into Ultipa Server.
Importing graph data files into an Ultipa graphset is the process of inputting a graph. Each graph data file represents a schema; each row of data (table header excluded) in the file represents a node or an edge; each column of data represents a property. Column separators such as `,` and `;` are supported.
Shown below are the graph data files CUSTOMER, MERCHANT, and TRANSACTION for schemas customer, merchant, and transfer:
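As a hypothetical excerpt for illustration only (the values below are invented; the column names follow the YAML configuration used in the Transporter section later), the three files could look like:

```text
CUSTOMER.csv
cust_no,cust_name,risk_level,card_level,balance
C001,Alice,2,1,1032.50

MERCHANT.csv
merchant_no,type
M001,retail

TRANSACTION.csv
cust_no,merchant_no,tran_date,tran_amount,tran_type,result
C001,M001,2022/3/2 22:12:56,89.00,1,1
```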
The three files use comma `,` as the column separator, and all of them have table headers. Note from the headers that the columns holding the node system property `_id` and the edge system properties `_from` and `_to` still use their original names (such as `merchant_no`), so they need to be renamed to the correct system property names. This can be done either in the files themselves or during graph inputting.
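If you prefer to rename the headers in the files themselves, a few lines of the standard library suffice. The sketch below is a minimal example, assuming the MERCHANT.csv header described above, and renames `merchant_no` to the system property name `_id`:

```python
# Minimal sketch: rename header columns of a CSV file in memory.
# The mapping below assumes the MERCHANT.csv header described above.
import csv
import io

RENAMES = {"merchant_no": "_id"}

def rename_headers(csv_text: str) -> str:
    """Return csv_text with its header row renamed per RENAMES."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    rows[0] = [RENAMES.get(col, col) for col in rows[0]]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

print(rename_headers("merchant_no,type\nM001,retail\n"))
# _id,type
# M001,retail
```

The same pattern applies to renaming `cust_no` to `_id` in CUSTOMER.csv, or to `_from`/`_to` in TRANSACTION.csv.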
Input graph via Manager
Inputting a graph with Manager is easy and highly visual.
Prepare graphset and schema
Make sure the graphset and graph model (node schema and edge schema) have been created before inputting graph via Manager.
- Create graphset
Run the UQL command in the CLI window on top of Manager to create the graphset:
// Create graphset 'retail_test'
create().graph("retail_test")
Or create the graphset via the UI:
Note: switch to the new graphset after creation.
- Create schema
Run the UQL command in the CLI window on top of Manager to create the schemas:
// Create node schemas 'customer' and 'merchant', and edge schema 'transfer'
create()
  .node_schema("customer")
  .node_schema("merchant")
  .edge_schema("transfer")
Or create the schemas via the UI:
Import node data
Property values for `_from` and `_to` have to be provided when importing edge data, and the nodes they refer to must already exist in the graphset, so importing node data has to be completed prior to edge data.
Take importing CUSTOMER.csv as an example; the import can be completed in four steps:
- Upload files
Click "File" on the left menu of Manager, find "Import" in the popup window, click "+" on the right, choose "+ Node", and upload the local file CUSTOMER.csv:
- Specify separators and headers
This decides whether columns and headers can be correctly identified. The CUSTOMER.csv file uses the column separator `,` and has headers. Users can click "Preview" to check whether the data is split into columns correctly and whether the header row is identified correctly (instead of being identified as a row of node data):
- Specify schema and property data types
This decides whether each data column can be identified as (or created as) the property it represents. CUSTOMER.csv corresponds to schema `customer`, so choose `customer` as its schema. Red triangular icons on the left of some columns mean that those properties have not been created in `customer` yet. After setting the first column's header to `_id`, the red triangle on its left disappears, indicating that the column has been identified as the system property `_id`; the remaining columns need to be created as custom properties by filling in their data types and clicking "+". When all red triangles have disappeared, all properties have been created.
When importing files, users can choose to upsert or overwrite. As we are adding new data here, both modes work.
Importing the MERCHANT.csv file follows the same operations as above.
Import edge data
After importing CUSTOMER.csv and MERCHANT.csv, we are ready to import TRANSACTION.csv. Choose "+ Edge" when uploading, select the local file TRANSACTION.csv, choose `transfer` as its schema, and change the column headers to the system property names `_from` and `_to` as before:
Check import results
This step verifies the results after importing. Click "Schema" on the left menu of Manager, check the tree structure of schemas in the popup board, and verify the numbers of nodes and edges. Users can also view each schema's custom properties and data types in the expanded tree structure.
Input graph via Transporter
Compared with Manager, using Transporter to input a graph is faster, and it can input multiple graphs simultaneously.
Prepare YAML files
When using the Importer tool from Ultipa Transporter to input a graph, users do not have to create the graphset and schemas beforehand, but they do need a YAML configuration file, which includes the server's connection information, the target graphset's name, each data file's schema, and each data column's property type. It also states the column separator, the import mode, and details of how data is handled in batches.
The YAML configuration file is divided into four sections:
- Section 1 Server Information
server:
  host: "192.168.100.100:60010"
  username: "root"
  password: "root"
  graphset: "retail_test"
  crt: ""
This is the server connection information. The `username` and `password` should be provided by the server administrator; `graphset` can be a graphset yet to be created; `crt` can be left empty if TLS is not used for the connection.
- Section 2 Node File Information
nodeConfig:
  # About CUSTOMER.csv file
  - schema: "customer"
    file: "./CUSTOMER.csv"
    types:
      - name: cust_no
        type: _id
      - name: risk_level
        type: int32
      - name: card_level
        type: int32
      - name: balance
        type: float
  # About MERCHANT.csv file
  - schema: "merchant"
    file: "./MERCHANT.csv"
    types:
      - name: merchant_no
        type: _id
Data columns `cust_no` and `merchant_no` are declared as the system property `_id`, so these two columns will be automatically read as string; `risk_level`, `card_level`, and `balance` will be read as int32, int32, and float respectively; CUSTOMER.csv's data column `cust_name` and MERCHANT.csv's data column `type` are not mentioned and will be read as string by default.
- Section 3 Edge File Information
edgeConfig:
  # About TRANSACTION.csv file
  - schema: "transfer"
    file: "./TRANSACTION.csv"
    types:
      - name: cust_no
        type: _from
      - name: merchant_no
        type: _to
      - name: tran_date
        type: timestamp
      - name: tran_amount
        type: float
      - name: tran_type
        type: int32
Data columns `cust_no` and `merchant_no` are declared as the system properties `_from` and `_to`, so these two columns will be automatically read as string; `tran_date`, `tran_amount`, and `tran_type` will be read as timestamp, float, and int32 respectively; data column `result` is not mentioned and will be read as string by default.
- Section 4 Global Information
settings:
  separator: ","
  importMode: insert
  yes: true
  batchSize: 10000
  threads: 8
The column separator is comma `,`. Since we are adding new nodes and edges here, the import mode can be insert, overwrite, or upsert. `yes: true` means that the designated graphset, schemas, and properties will be created automatically if they do not exist. `batchSize` and `threads` are the size of each import batch and the number of concurrent batches; they matter when a huge amount of data is imported.
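To see what `batchSize` and `threads` mean in practice, here is a minimal sketch (not Ultipa's actual implementation; `send_batch` is a hypothetical placeholder for the real network call) of splitting rows into batches of 10000 and processing them on 8 concurrent workers:

```python
# Sketch of batched, concurrent import: split rows into batches of
# BATCH_SIZE and process them on THREADS workers, mirroring the
# batchSize/threads settings above. send_batch is a placeholder.
from concurrent.futures import ThreadPoolExecutor
from itertools import islice

BATCH_SIZE = 10000
THREADS = 8

def batches(rows, size):
    """Yield consecutive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

def send_batch(batch):
    # Placeholder for the real call that inserts one batch into the server.
    return len(batch)

rows = range(25000)  # stand-in for 25000 parsed CSV rows
with ThreadPoolExecutor(max_workers=THREADS) as pool:
    inserted = sum(pool.map(send_batch, batches(rows, BATCH_SIZE)))
print(inserted)  # 25000 (sent as 3 batches: 10000 + 10000 + 5000)
```

Larger batches reduce per-request overhead, while more threads keep the server busy; the best values depend on the hardware and data volume.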
When indenting a new line, press the space bar twice instead of using the Tab key.
Save the sections above in the configuration file import_retail.yml (Click to download).
Import node and edge files
Place Transporter's import tool ultipa-importer, the node and edge files CUSTOMER.csv, MERCHANT.csv, and TRANSACTION.csv, and the configuration file import_retail.yml in the same directory, open a command-line tool in that directory (e.g. right-click to open PowerShell), and execute the command below:
./ultipa-importer --config ./import_retail.yml
When running the command, if `bash: ./ultipa-importer: Permission denied` appears, it means the required execution privileges have not been granted; execute `chmod 777 ultipa-importer` to obtain them before running the ultipa-importer command again.
When importing data columns of dates via ultipa-importer, the dates need to be changed into the `yyyy-mm-dd hh:mm:ss` format first; for instance, `2022/3/2 22:12:56` needs to be changed to `2022-03-02 22:12:56` before importing.