    v4.0

    Transporter Instructions

    This manual covers the usage of Ultipa Transporter (Go).

    Function Preview

    Ultipa Transporter is a lightweight command-line tool for fast import and export of metadata to and from the Ultipa Graph database. Batch import/export is supported in remote mode.

    A single command line can import or export multiple node files and edge files.

    • Local Operation (to be declared in local in the yml file)

    Local operation is executed against graphsets in a local Ultipa database. The directory of the local database server needs to be declared via parameter path (default: './data').

    Local import is normally used to initialize the Ultipa server; in that case the Ultipa server should be stopped and no previous import should have been executed.

    We strongly recommend that local import be carried out under the supervision of a certified Ultipa Graph Database engineer.

    • Remote Operation (to be declared in server in the yml file)

    Remote operation is executed against graphsets in a remote Ultipa database. The IP and port of the remote database server should be declared via parameter host, along with username and password if required.

    Node/Edge File

    Files carrying node/edge information (either exported or to be imported) are stored in the local directory. A file contains either nodes of a specific schema or edges of a specific schema; each row (except the header) represents a node or an edge, and each column represents a property of the node/edge. Files are in csv format; supported delimiters are ',', '\t', '|' and ';'.

    Data contained in the files (either node or edge) should be declared via nodeConfig and edgeConfig. The format (and delimiter, if any) of all files in one operation should be consistent.

    Import Declaration

    Information that is consistent for all files (to be declared in settings in the yml file): delimiter (separator), number of threads (threads) and so on; see settings introduced later for details. A csv or tsv file to import may or may not have a header; when it does, each column name should be <property_name> or <property_name>:<property_type>.

    Information to be declared for each file (to be declared in nodeConfig, edgeConfig in the yml file): file path (file), file format (fileType), schema of data (- schema), start position of import (skip), number of rows to import (limit), and property list (properties or types):

    (for csv or tsv file)

    • when the file is headerless, for each column in turn, use properties to:
      • declare property name and type using - name and type;
      • declare property name only using - name, in which case the type is 'string';
      • when there are fewer - name entries than columns, columns will be omitted from the left;
    • when the file contains a header, for some columns, use types to:
      • declare or modify property type using - name and type;
      • set the property type to 'string' using - name only;
      • for columns whose - name does not appear, the property type will be 'string' if the header is <property_name>, or the <property_type> from the header if the header is <property_name>:<property_type>.

    Valid types are listed below:

    • _id, _uuid, _from, _to, _from_uuid, _to_uuid (for declaration of Ultipa system properties);
    • string, float, double, int, int32, uint32, int64, uint64, datetime, timestamp;
    • _ignore (for ignoring a column)

    The data type declared for each property should match the data in its column. Misparsed data, for example a column of integers read as strings, will not be reported as an error.
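
    As a minimal sketch of the headerless case described above, the following nodeConfig fragment declares the columns of a four-column csv file in order. The schema name, file path and property names are hypothetical; giving the skipped column a placeholder name together with type _ignore is an assumption based on the type list above, not a confirmed requirement.

    ```yaml
    nodeConfig:
      - schema: product              # hypothetical schema name
        file: ./product.csv          # hypothetical headerless file with 4 columns
        fileType: csv
        properties:
          - name: sku                # column 1: used as the node's _id
            type: _id
          - name: title              # column 2: no type given, so 'string'
          - name: price              # column 3
            type: double
          - name: remark             # column 4: skipped (placeholder name, assumed)
            type: _ignore
    ```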

    Export Declaration

    Information that is consistent for all files (to be declared in settings in the yml file): file directory (outPath) and log directory (logPath). An exported csv (delimiter ',') or tsv (delimiter '\t') file can either be headerless or include a header (with <property_name>:<property_type> as the column name). Batch size is set automatically and need not be configured.

    Information to be declared for each file (to be declared in nodeConfig, edgeConfig in the yml file): file format (fileType), schema of data (- schema), and property list (properties), not including Ultipa system properties. Ultipa system properties are always exported and need no declaration.

    Metadata Knowledge

    Unique Identifier

    A node has two unique identifiers: _id (32bit-string) and _uuid (uint64-integer). An edge therefore identifies its starting node by either _from or _from_uuid, and its ending node by either _to or _to_uuid. An edge itself, on the other hand, has only one identifier, _uuid. These six properties are Ultipa system properties.

    Import Mode

    When the import mode importMode is set to upsert or overwrite, a node/edge that already exists (meaning that its identifier is provided in the file and the identifier is found in the graphset) will update or overwrite its corresponding record in the graphset; otherwise (which is, either there is no identifier in the file, or the identifier is not found in the graphset) the node/edge will be inserted into the graphset as a new record.

    When the import mode importMode is set to insert, only new nodes/edges (either there is no identifier in the file, or the identifier is not found in the graphset) will be inserted; any identifier that already exists in the graphset will trigger an error.

    If _id or _uuid is absent from the file, it will be automatically generated by the system upon insertion.
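
    For instance, to update existing records rather than fail on identifiers that already exist, the mode might be set under settings as sketched below. Only importMode is shown; all other settings are assumed to stay at their defaults.

    ```yaml
    settings:
      importMode: upsert   # 'insert' (used when omitted), 'upsert' or 'overwrite'
    ```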

    Special Requirements on Edges

    An edge file must contain starting node and ending node of edges.

    To successfully import edges, both the starting node and the ending node of each edge should already exist; otherwise the import will fail. To avoid this, set createNodeIfNotExist to 'true' so that the system creates the missing nodes and the edge can be imported.

    How does Transporter judge whether an ending node exists when importing edge files? During remote operation, Transporter will search the node file being imported in the same command line as well as the graphset for the ending nodes.

    What will happen if only system properties _from and _to are provided as unique identifiers of ending nodes, or only _from_uuid and _to_uuid are provided? During remote operation, Transporter will automatically project the other two system properties.
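
    The requirements above can be sketched as the following fragment, which imports a headerless two-column edge file and lets Transporter create any missing nodes. The schema name and file path are hypothetical; the _from/_to declarations follow the same form as the sample in YML Samples.

    ```yaml
    edgeConfig:
      - schema: follow               # hypothetical schema name
        file: ./follow.csv           # hypothetical headerless file: from-id,to-id
        fileType: csv
        properties:
          - name: _from              # column 1: _id of the starting node
            type: _from
          - name: _to                # column 2: _id of the ending node
            type: _to

    settings:
      createNodeIfNotExist: true     # create nodes that are not found
    ```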

    Import Error

    There are 5 types of error that may occur during an import operation:

    1. server returned error;
    2. network error;
    3. parameter config error;
    4. data format error (inconsistency between declared file header and data columns);
    5. duplicated identifier (node or edge already exists under insert mode).

    By default, when an error occurs while importing a data batch, the whole batch is skipped and the operation continues with the next batch; set parameter stopWhenError to make the import stop completely once an error occurs, without importing any later batches. The error type and the skipped data rows (represented by the start row position and the total number of rows) are both recorded in the log file, which makes re-import with parameters skip and limit easy.
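
    As an illustration of such a re-import, suppose the log reports a skipped batch starting at row position 20000 and covering 10000 rows (hypothetical numbers). A sketch of the corresponding config follows; it assumes that the start position from the log can be used directly as skip, and that the hypothetical file carries a header with typed column names, so no types list is needed.

    ```yaml
    nodeConfig:
      - schema: student
        file: ./student.csv    # hypothetical file with header
        fileType: csv
        skip: 20000            # start row position taken from the log (assumed usable as-is)
        limit: 10000           # number of skipped rows taken from the log

    settings:
      stopWhenError: true      # stop at the first error instead of skipping the batch
    ```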

    The 4th type of error, which is data-format related, occurs when:

    • in the case of a headerless file, the number of - name entries declared under properties is different from the number of columns in the file;
    • in the case of a file with header, some data columns have no column name, or some column names at the tail have no data.

    To avoid the 4th type of error, set fitToHeader to 'true' and let Transporter import data columns according to the properties declared: columns without names will be ignored, and names with no data will be taken as properties and filled with an empty value or 0.

    Command and Parameters

    Ultipa Transporter has two tools, ultipa-importer and ultipa-exporter, and both accept the following parameters:

    Parameter | Description
    --help | Help
    --config <config_file_path> | Path of the configuration file

    Example of Import:

    /opt/ultipa-transporter/ultipa-importer --config ./in.yml
    

    Example of Export:

    /opt/ultipa-transporter/ultipa-exporter --config ./out.yml
    

    Example of Help:

    /opt/ultipa-transporter/ultipa-exporter --help
    

    Configuration File

    Both import and export operations need a configuration file in yml format. There are four parts in a yml config file:

    • server
    • nodeConfig
    • edgeConfig
    • settings

    server

    Parameter | Specification | Description
    host | <ip>:<port> | The IP and port of the remote server
    username | string | The username to log in to the remote server, if required
    password | string | The password to log in to the remote server, if required
    crt | <file_path> | The absolute path of the SSL certificate used to communicate with the remote server, required when both servers are in SSL mode
    graphset | string | The graphset name, or 'default' when not using this parameter; a non-existing graphset name will lead to failure unless Transporter is instructed to auto-create the graphset via yes under settings

    nodeConfig | edgeConfig

    Parameter | Specification | Operation | Description
    - schema | string | Import/Export | The schema of nodes/edges, or 'default' when not using this parameter; a non-existing schema name will lead to failure unless, in an import operation, Transporter is instructed to auto-create the schema via yes under settings; in an export operation, use '*' to declare all properties of all schemas
    file | <file_path> | Import | The absolute path and name of the node/edge file to import, such as '/opt/ultipa-server/import/amz/node.csv'
    skip | int | Import | The number of records to skip (not import) from the beginning of the data file; no data is skipped when not using this parameter
    limit | int | Import | The number of records to import; import continues until the end of the file when not using this parameter
    properties or types | | Import/Export | Indicates that file columns are declared next; use properties when importing a headerless file or when exporting files, and use types when importing a file with header; a - schema cannot have both properties and types
    - name | string | Import/Export | The name of a property; a non-existing property name will lead to failure unless, in an import operation, Transporter is instructed to auto-create the property via yes under settings
    type | string | Import | The data type of a particular column; valid types are: string, int, int32, int64, uint32, uint64, float, double, datetime and timestamp for custom properties; _id, _uuid, _from, _to, _from_uuid and _to_uuid for system properties; _ignore for omitting a column; 'string' is used when not using this parameter; properties that already exist in the graphset need a data type consistent with the record in the graphset

    Note: Parameters with a dash '-' ('- dir', '- schema' and '- name') and their sub-parameters carried can appear multiple times.

    settings

    Parameter | Specification | Operation | Description
    logPath | <path> | Import/Export | The path of the log file, such as '/opt/ultipa-server/log/', or './log/' when not using this parameter
    separator | string | Import | The delimiter that separates data fields in the csv (or tsv) file during an import operation; supports ',', '\t', '|' and ';', or ',' when not using this parameter
    threads | int | Import | The maximum number of threads (an integer no less than 2) during an import operation, or 2 when not using this parameter; 5 ~ 8 threads are recommended
    batchSize | int | Import | The number of records in each batch during an import operation, valid from 500 to 10000; an integer of 100000/number_of_properties is recommended, or 10000 when not using this parameter
    importMode | string | Import | The mode of an import operation; supports 'insert', 'upsert' and 'overwrite', or 'insert' when not using this parameter
    createNodeIfNotExist | bool | Import | Whether to create nodes for the non-existing _from, _to, _from_uuid or _to_uuid of edges; otherwise those nodes are left non-existing and their related edges are not imported
    stopWhenError | bool | Import | Whether to terminate the import operation once an error occurs; when not using this parameter, the error data batch is skipped and the import continues with the next batch
    yes | bool | Import | Whether to auto-create graphset, schema and properties that do not exist
    fitToHeader | bool | Import | Whether to omit or fill up data columns according to the header in the data file or the header configured in the yml file; an inconsistency between data columns and property header will trigger an error when not using this parameter
    writeHeader | bool | Export | Whether to write a header into the csv (or tsv) file during an export operation, with column names in the form of <property>:<type>; the header is written when not using this parameter
    outPath | <path> | Export | The path of the exported files, such as '/opt/ultipa-server/import/amz/', or './export/' when not using this parameter

    Note: exported files (node and edge) are automatically named; e.g., a node file is named in the format <schema>.node.<file_type>, such as 'default.node.csv'.
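
    As a worked example of the import settings in the table above: for a hypothetical schema with 25 properties, the recommended batch size is 100000 / 25 = 4000, which lies within the valid range of 500 to 10000. A sketch of the matching settings block, with the thread count chosen from the recommended 5 ~ 8 range:

    ```yaml
    settings:
      separator: ","
      threads: 6          # within the recommended 5 ~ 8
      batchSize: 4000     # 100000 / 25 properties (hypothetical schema)
    ```

    With 10 or fewer properties the same formula gives 10000 or more, so the batch size would simply stay at the maximum of 10000.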

    YML Samples

    Remote Import

    Example: Given the following 3 files, import them into a remote server.

    Node file-A (txt): @student, in which the property type of 'age' is mistakenly written as 'string'; please revise it to 'uint32' during import:

    stuNo:_id,name:string,age:string,gender:string
    20215865,Alice,24,f
    20215925,Jack,25,m
    20215973,John,28,m
    20215990,Grace,25,f
    

    Node file-B (txt): @course; please create an empty column for property 'professor':

    crsNo,title,credit
    CS202104,Computer Principle and Application,4
    SH202127,Art of File and Television,2.5
    MS202104,Calculus,3
    

    Edge file (csv): @enroll, in which the 1st column is stuNo and the 2nd column is crsNo; please ignore the 3rd column when importing:

    20215865,SH202127,84.5
    20215925,MS202104,77.5
    20215973,CS202104,86
    20215990,SH202127,64.5
    

    Command line:

    ./ultipa-importer --config ./in.yml
    

    in.yml:

    server:
      host: "192.168.35.151:60024"
      username: employee533
      password: joaedSSGsdf
      crt: ""
      graphset: test_graph
      
    nodeConfig:
      - schema: student
        file: ./student.txt
        fileType: txt
        types:
          - name: age
            type: uint32
      - schema: course
        file: ./course.txt
        fileType: txt
        types:
          - name: title
          - name: professor
          - name: crsNo
            type: _id
          - name: credit
            type: float
    
    edgeConfig:
      - schema: enroll
        file: ./enroll.csv
        fileType: csv
        properties:
          - name: _from
            type: _from
          - name: _to
            type: _to
    
    settings:
      separator: ","
      yes: true
      fitToHeader: true
    

    Remote Export

    Example: export from a remote database all node properties of @student, node properties _id and 'title' of @course, and all edge properties of all schemas.

    Command line:

    ./ultipa-exporter --config ./out.yml
    

    out.yml:

    server:
      host: "192.168.35.151:60024"
      username: employee533
      password: joaedSSGsdf
      crt: ""
      graphset: test_graph
      
    nodeConfig:
      - schema: student
      - schema: course
        properties:
          - name: _id
          - name: title
    
    edgeConfig:
      - schema: "*"
    
    settings:
      writeHeader: false
      outPath: ./export/temp
    

    FAQ

    Q: How do I run a remote import with Docker?

    A: Follow the steps below:

    1. Start container: use command docker run with parameter -itd, map the host directory to the container directory via parameter -v, and declare the container name via parameter --name:
    docker run -itd \
    -v /tmp/transporter_test/data:/opt/ultipa-transporter/data \
    --name transporter4.0test <Transporter_image>
    
    2. Enter container: use command docker exec with parameter -it:
    docker exec -it transporter4.0test bash
    
    3. Run import command:
    ./ultipa-importer --config ./data/in.yml
    