      Transporter | Importer

      This manual covers the usage of Ultipa Importer (Go Version), a lightweight command-line tool that imports multiple metadata files from the local machine into the Ultipa Graph database in one command.

      Ultipa Transporter can handle instances hosted on-premises and on Ultipa Cloud
      Data files, the configuration file and Ultipa Transporter reside on the local PC

      Change Log (V4.1 to V4.2)

      • Supports various formats of time (see FAQ)
      • Supports text type
      • Adds MaxPacketSize to adjust the maximum bytes of each packet the SDK processes
      • Adds timezone to set the timezone for timestamp values
      • Checks whether graphset status is MOUNTED before import

      Prerequisites

      • node files and edge files
      • configuration file (yml)
      • a command line terminal that is compatible with your operating system
      • a version of Ultipa Importer compatible with your operating system

      Background Knowledge - System Properties

      System properties of node:

      • _id: ID of node, a string of maximum 128 bytes
      • _uuid: ID of node, a uint64

      System properties of edge:

      • _uuid: ID of edge, a uint64
      • _from: the _id of start-node (FROM) of edge
      • _to: the _id of end-node (TO) of edge
      • _from_uuid: the _uuid of start-node (FROM) of edge
      • _to_uuid: the _uuid of end-node (TO) of edge

      Failures induced by system properties:

      • ID of data already exists in the current graphset when the import mode is insert
      • Not providing ID of FROM or TO when importing edge data
      • ID of FROM or TO exists neither in the current graphset nor in the node files imported in the same command, and the system is not allowed to create such FROM or TO nodes (createNodeIfNotExist is not set to 'true')
      • Node ID, FROM or TO is provided in both string and uint64 types, but the mapping between the two types of ID is inconsistent with that in the graphset or the node files imported in the same command

      When providing only one type of node ID, FROM or TO, either string or uint64, the other type of ID will be automatically mapped or generated by the system.
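
      For example, if an edge row references a FROM or TO whose ID is found neither in the graphset nor in the node files imported in the same command, that edge fails to import unless the system is allowed to create the missing node. A minimal sketch of the relevant settings-level parameter (see YML: settings below):

      settings:
        createNodeIfNotExist: true	# auto-create nodes for missing FROM/TO referenced by edges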

      Data File

      • Each file: nodes or edges that belong to a specific schema
      • Each row (except headers): a node or an edge
      • Each column: a property
      • File format: csv (file extension does not matter)
      • File delimiter: `,`, `\t`, `\|` and `;` (same delimiter for all files)
      • File headers (column name) format: <property_name> or <property_name>:<property_type>, headerless allowed
      • Valid <property_name> (name): 2 ~ 64 characters, must not start with a tilde '~' or contain a backtick '`'
      • Valid <property_type> (type):
        • For system properties: _id, _uuid, _from, _to, _from_uuid, _to_uuid
        • For custom properties: string, text, float, double, int32, uint32, int64, uint64, datetime, timestamp
        • For columns to be ignored: _ignore

      The data type declared for each property should match the data in its column; misparsed data, for example a column of integers read as strings, is reported as an error.
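
      For illustration, a header following the <property_name>:<property_type> format could look like the hypothetical node file below (the schema, names and values are made up here); the untyped 'name' column defaults to string, and the last column is skipped via _ignore:

      productID:_id,name,price:float,internalCode:_ignore
      B0002CZSJO,HDMI Cable,12.99,X-001
      B0000AQRSU,USB Hub,25.50,X-002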

      YML: server

      server:
        host: "192.168.35.151:60024"	# for cluster, separate multiple server nodes with comma ','
        username: employee533
        password: joaedSSGsdf
        crt: ""					# The directory of the SSL certificate when both servers are in SSL mode
        graphset: test_graph		# The graphset name, or use graphset 'default' by default
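
      When connecting to a cluster, the host value takes the comma-separated form mentioned in the comment above; a minimal sketch (the addresses below are placeholders):

      server:
        host: "192.168.35.151:60024,192.168.35.152:60024,192.168.35.153:60024"	# hypothetical cluster nodes, separated by ','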
      

      YML: nodeConfig | edgeConfig

      • Headerless: edge file 'review.csv' of schema @review, columns are FROM, TO, rating and comment:

      A2CMX45JPSCTUJ,B0002CZSJO,5,The Best Cable
      A3EIML9QZO5NZZ,B0000AQRSU,5,awesome
      A3C9F3SZWLWDZF,B000165DSM,2,Cannot recommend
      A1C60KQ8VJZBS5,B0002CZV82,4,It's a wedgie
      

      edgeConfig:
        - schema: review		# Schema of current data file, mandatory
          file: ./review.csv	# The directory of the data file
          properties:			# set `properties` when the file is headerless
            # declare name and type for each column, must correspond to the sequence of columns
            - name: _from		
              type: _from		# declare system property
            - name: _to
              type: _to
            - name: rating	# declare custom property
              type: int32		# declare property type; must be consistent with that in the graphset if the property already exists
            - name: comment	# set to string when `type` is not set	
      
      • With header: node file 'reviewer.txt' of schema @reviewer; note that 'level' is mistakenly marked as 'string' which should be 'uint32':

      reviewerID,username,level:string,birthday:datetime
      A00625243BI8W1SSZNLMD,jespi59jr,12,1984-05-31
      A10044ECXDUVKS,Dean J Copely,10,1987-11-02
      A102MU6ZC9H1N6,Teresa Halbert,5,2001-08-14
      A109JTUZXO61UY,Mike C,9,1998-02-19
      

      nodeConfig:
        - schema: reviewer
          file: ./reviewer.txt
          types:				# set `types` when the file has headers
            # declare or revise type for columns if necessary, regardless of sequence
            - name: level
              type: uint32
            - name: reviewerID
              type: _id		# declare system property
            # use string for 'username' by default
            # use 'datetime' from header for 'birthday'
      

      A yml file can include both nodeConfig and edgeConfig; under each of them, multiple sets of schema and the related parameters can be listed at the same level.
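
      A minimal sketch of such a combined file (the schemas and file names below are hypothetical, and the data files are assumed to carry self-describing headers so that types and properties can be omitted):

      nodeConfig:
        - schema: reviewer
          file: ./reviewer.txt
        - schema: product
          file: ./product.csv
      edgeConfig:
        - schema: review
          file: ./review.csv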

      Other parameters:

      Parameter | Specification | Default Value | Description
      skip | int | 0 | The number of rows to be skipped (not imported) from the first record.
      limit | int | (no limit) | The total number of rows to import from the current data file.

      Parameters skip and limit are at the same level as schema, file, etc.; they are usually set when re-importing a data file after an error has occurred.
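
      For example, if the log shows that a batch of 1000 rows starting at row 3001 was skipped, that range can be re-imported alone (the figures below are hypothetical):

      edgeConfig:
        - schema: review
          file: ./review.csv
          skip: 3000		# skip the first 3000 rows
          limit: 1000		# import only the next 1000 rows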

      YML: settings

      settings:
        separator: ","		# The delimiter of data columns of all the files to be imported, supports `,`, `\t`, `\|` and `;`, or take `,` by default
        importMode: overwrite	# The mode of an import operation, supports `insert`, `upsert`, and `overwrite`, or take `insert` by default
        yes: true				# Whether to auto-create graphset, schema and properties that do not exist, or do not auto-create by default
        threads: 32			# The maximum threads (no less than 2), or take the number of CPUs that run the Importer by default; 32 threads recommended
        batchSize: 1000		# The number of rows in each batch, valid from 500 to 10000; an integer of 100000/number_of_properties is recommended, or take 10000 by default
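
      For example, if each row carries 20 properties (an assumed figure), the recommended batchSize would be around 100000/20 = 5000.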
      

      Other parameters:

      Parameter | Specification | Default Value | Description
      logPath | <log_path> | ./log/ | The path of the log file, e.g., '/data/import/log/'
      MaxPacketSize | int | 41943040 (40M) | The maximum bytes of each packet the Go SDK processes
      timezone | string | (local timezone) | The timezone of timestamp values, e.g., +08:00, Asia/Shanghai, etc.
      createNodeIfNotExist | bool | false | true: create nodes for the non-existing _from, _to, _from_uuid or _to_uuid of edges; false: leave them non-existing and their related edges un-imported
      stopWhenError | bool | false | (When an error occurs) true: terminate the import operation immediately; false: skip the error data batch and continue with the next batch
      fitToHeader | bool | false | (When the header length and the number of data columns are inconsistent) true: omit or auto-fill columns based on the header; false: stop and throw an error

      Command Line

      1. Show help

      ./ultipa-importer --help
      
      2. Download the configuration sample file

      ./ultipa-importer --sample
      
      3. Execute the import operation; the config file in.yml is in the current directory

      ./ultipa-importer --config ./in.yml
      

      All parameters:

      Command | Description
      --help | show help information
      --config <FILE_PATH_NAME> | define the configuration file and execute the import operation
      --sample | true: generate a sample config file; false: do not generate a sample config file
      --host <IP:PORT> | overwrite the parameter host in the config file
      --graph <GRAPH_NAME> | overwrite the parameter graphset in the config file
      --username <USERNAME> | overwrite the parameter username in the config file
      --password <PASSWORD> | overwrite the parameter password in the config file
      --maxPacketSize <MAXPACKETSIZE> | overwrite the parameter MaxPacketSize in the config file
      --logAppend | true: append multiple error info into one log file; false: generate a log file for each error info
      --progressLog <boolean> | (for Ultipa Manager) true: generate progress log; false: do not generate progress log
      --version | true: show Ultipa Importer version; false: do not show Ultipa Importer version
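
      For example, the overriding parameters can be combined with --config in one command (the address and graph name below are placeholders):

      ./ultipa-importer --config ./in.yml --host 192.168.35.152:60024 --graph test_graph_2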

      Errors

      Before Importing

      Definition: errors triggered when checking configurations in the yml file, creating graphset or creating schema.

      Triggers:

      1. the content of the yml file does not conform to the yml format
      2. parameter configuration error, such as a property name or data type that does not comply with the UQL specification
      3. failure when creating graphset and/or schema

      During Importing

      Definition: errors triggered when importing data files to the remote server.

      Types:

      1. server returned error;
      2. network error;
      3. data format error (inconsistency between declared file header and data columns, see fitToHeader for solution);
      4. duplicated identifier (data ID already exists under insert mode).

      When an error occurs while importing a data batch, the error type and the skipped data rows (represented by the start row position and the total number of rows) are recorded in the log file, which makes it easy to re-import them using parameters skip and limit.

      FAQ

      Q: I got such error 'rpc error: code = ResourceExhausted desc = Received message larger than max (31324123 vs. 4194304)', what does it mean and how to solve it?

      A: This message means that, when importing a data batch, the packet size (31324123 bytes) exceeded the limit of 4194304 bytes. Possible reasons are too many properties imported at a time, excessive property volume (long texts stored in the text type), or a too-large batchSize; any of these can make the data volume of a batch exceed the default server config max_rpc_msgsize (4M) and/or the MaxPacketSize of the Go SDK (40M).

      Solution A: reduce the batchSize in the config file
      Solution B: raise the setting of MaxPacketSize in the config file, and/or max_rpc_msgsize in the server config (the latter requires a server reboot).
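
      A sketch of such adjustments in the settings block of the config file (the figures below are illustrative assumptions only):

      settings:
        batchSize: 500			# fewer rows per batch
        MaxPacketSize: 83886080	# raise the Go SDK packet limit to 80M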

      Q: How to set timezone for time values?

      A: Please follow the format examples below:

      • [YY]YY-MM-DD HH:MM:SS
      • [YY]YY-MM-DD HH:MM:SSZ
      • [YY]YY-MM-DDTHH:MM:SSZ
      • [YY]YY-MM-DDTHH:MM:SS[+/-]0x00
      • [YY]YYMMDDHH:MM:SS[+/-]0x00

      A 4-digit or 2-digit year is supported (a 2-digit year is parsed as 19xx if the year is ≥70, or as 20xx if it is <70); month and day can be 2-digit or 1-digit; the dash (-) can be replaced with a slash (/); [+/-]0x00 stands for an offset such as +0700 or -0300, and Z stands for the UTC 0 timezone.
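
      For instance, under these rules (the sample values below are made up for illustration):

      • 98-2-19 08:05:00 is read as 1998-02-19 08:05:00
      • 25/08/14T10:30:00Z is read as 2025-08-14 10:30:00 in UTC 0
      • 2024-05-31T23:59:59+0800 carries an explicit +0800 offset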
