Ultipa (Ultipa Graph HTAP System, hereinafter Ultipa Graph) is an ultra-high-performance native graph computing service framework and storage infrastructure.
The Ultipa Graph product matrix contains the following:
- World's fastest real-time graph computing engine;
- High-performance, highly available persistent storage services;
- A concise, smooth graphical user interface and knowledge graph system;
- Feature-rich CLI toolkit;
- High-performance, flexible import/export toolkit;
- Easy-to-deploy Docker images;
- SDKs and APIs for mainstream programming languages;
- UQL, a Graph Query Language (GQL), which is the focus of this manual.
Ultipa Graph supports a rich collection of query methods, many ultra-high-performance graph algorithms, and real-time processing of large amounts of data. This saves users considerable time by turning time-consuming, non-real-time operations into real-time ones, and opens up broad possibilities for applying graph computing products in business and big data analytics scenarios.
What is UQL
Ultipa Query Language, UQL (or uQL) for short, is a general-purpose Graph Query Language. It is a high-performance query and management language designed for the Ultipa Graph database and graph computing engine, and developers can learn it quickly to get started with the Ultipa Graph system. UQL supports querying, deleting, modifying and adding data, graph traversal, subgraph matching, schema management, property management, engine management, index management, GraphSet management, task management, permission management, and other functions of the Ultipa Graph system. Users can invoke UQL via Ultipa CLI (the command-line tool), Ultipa Manager (a graphical interface) or the Ultipa SDKs (Software Development Kits). UQL will soon achieve Turing completeness and be available to developers.
UQL keeps pace with the international GQL standard in overall functionality and compatibility. The GQL standard is expected to be launched in 2024. The Ultipa team has begun participating in the standardization organization to help build the standard Graph Query Language.
Terminologies
Name | Description |
---|---|
GQL | Graph Query Language, the first and only standard data query language introduced since SQL (Structured Query Language). |
UQL (uQL) | Ultipa Graph Query Language, a kind of GQL that can fully operate the Ultipa Graph system. |
Node | In graph theory, a node is formally called a "vertex". In Ultipa Graph, a vertex is called a "node". |
Edge | An edge connects a pair of nodes. All edges in the Ultipa Graph system are directed (see the description of "Direction" below). |
Path | A path is a sequence that alternates between nodes and edges and has a definite initial-node and terminal-node. A node may appear more than once in a path, but an edge may not. The sequence of nodes and edges in a path can be regarded as the unique identifier of the path. |
Circle | A path is judged as 'has circle' if any of its nodes other than the initial-node and the terminal-node coincides with another node in the path (including the initial-node and the terminal-node). The initial-node coinciding with the terminal-node does not by itself make the path 'has circle'. (Paths that have a circle can be excluded from the results by using the parameter no_circle() in the path query command.) |
Shortest Path | A path that uses the fewest edges (at least one) to walk from its initial-node to its terminal-node is called a shortest path between those two nodes. For a weighted path, 'fewest edges' should be read as 'minimum sum of edge weights'. |
Graph | A dataset of nodes and edges is called a graph. A graph can be viewed as a collection of multiple paths. The smallest unit of a graph is a node. |
Subgraph | A subgraph is a subset of the nodes and edges of the whole graph. The result of a node query or path query can be considered a subgraph. |
GraphSet | A GraphSet comprises a set of nodes and edges along with the indexes, user privileges and algorithmic tasks created on the graph. In the Ultipa Graph system, a user can create more than one GraphSet. |
Schema | In the Ultipa Graph system, a node or edge schema comprises a set of properties that describe the structure and content of a node or edge. Each node or edge can belong to only one schema. |
Property | A property belongs to a schema and describes a characteristic of a node or edge. Properties support data types such as int32, float and string, and data structures such as array and dictionary. |
Property Index | An index mechanism used to improve the efficiency of property queries. The created index tree is stored on disk. |
Full-text Index | A full-text index (built by word segmentation of the text) is created to improve the query efficiency of long strings. It supports different dictionaries in order to optimize word segmentation for different datasets. |
Engine Index | Improves the efficiency of path queries and deep graph traversal, often by several orders of magnitude, by loading properties into the computing engine. |
Instance | An Ultipa Graph system instance is a running application on Ultipa Server. Each instance generally runs on one virtual or physical host, and multiple instances can form a cluster environment. |
Filter | Used to filter nodes and edges during graph queries. An Ultipa filter is essentially a logic tree, much like an IF condition in programming languages, built from conditional, logical and numeric operators. Refer to the chapter Filter for details. |
LTE | Load to Engine: load a property into the computing engine. |
UFE | Unload from Engine: remove a property from the computing engine. |
Direction (In) | An edge pointing to node A from another node is called an "In" edge of node A, or an edge of node A in the inbound direction. It is written as either A<-- or -->A. |
Direction (Out) | An edge pointing from node A to another node is called an "Out" edge of node A, or an edge of node A in the outbound direction. It is written as either A--> or <--A. |
Direction (Left) | An edge in a given path that points from the latter node to the previous node is called a "Left" edge. It is written as A<--B. |
Direction (Right) | An edge in a given path that points from the previous node to the latter node is called a "Right" edge. It is written as A-->B. |
ID (_id) | Unique identifier that is exclusive to nodes. It is stored as a string with a maximum length of 128 bytes. |
UUID (_uuid) | Unique identifier for nodes and edges. It is stored as uint64. |
Edge Start (_from, _from_uuid) | The ID or UUID of the start node of a directed edge. |
Edge End (_to, _to_uuid) | The ID or UUID of the end node of a directed edge. |
V4.3 | Features supported by V4.1 and above. |
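The 'has circle' rule from the table above can be illustrated outside of UQL. The following is a minimal Python sketch, not Ultipa's implementation: the function name has_circle and the representation of a path as a plain list of node IDs are assumptions made for illustration.

```python
def has_circle(node_ids):
    """Return True if the path 'has circle' under the definition above.

    node_ids is a hypothetical representation: the node IDs of a path in
    order, from initial-node to terminal-node (edges are omitted).
    """
    last = len(node_ids) - 1
    # Only interior nodes (neither initial nor terminal) are inspected.
    for i in range(1, last):
        for j in range(len(node_ids)):
            # An interior node matching any other node, including the
            # initial-node or terminal-node, means the path has a circle.
            if i != j and node_ids[i] == node_ids[j]:
                return True
    # The initial-node matching the terminal-node alone is not a circle.
    return False
```

Under this sketch, the path A, B, A is not reported as having a circle because only its initial-node and terminal-node coincide, whereas the path A, B, C, B is reported because the interior node B coincides with the terminal-node.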
Specification
UQL's design is rooted in a deep understanding of graphs and meets industry demand for high-dimensional, extensible graphs.
Chain Statement + Semantic Assembly + Alias Call
UQL Example:
n(1).e({time > prev_e.time})[3].n(as target)
group by target.level with count(target) as quantity
order by quantity desc
return target.level, quantity limit 10
Instructions:
- The n(...).e(...)[3].n(...) in the above example is a chain statement in the style [command].[parameter].[parameter].... It performs the insert, update, delete and query operations of a UQL statement. A UQL statement may contain multiple chain statements.
- The group by ..., with ..., order by ..., return ... and limit ... in the above example are clauses that process and assemble the query results of a UQL statement.
- The query results of a UQL statement can also be recomputed by a function, such as count(...) in the above example.
- Query results that are temporarily saved and passed between chain statements and clauses should be given a custom alias, such as target and quantity in the example. Use as to define a custom alias.
- Query results that are passed within a multi-edge template are given the system aliases prev_n and prev_e, to be called by an Ultipa Filter. Refer to the usage of ...time > prev_e.time... in the example.
- UQL supports the escape character \, tab \t, carriage return line feed \r\n, and the comment delimiters //, /* and */.
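To make the role of the clauses concrete, the following Python sketch mimics what group by ... with count(...), order by ... desc and limit do to the nodes aliased as target in the example. It is an illustration only and does not call any Ultipa API; the sample level values and the helper name summarize_levels are hypothetical.

```python
from collections import Counter


def summarize_levels(target_levels, limit=10):
    """Mimic: group by target.level with count(target) as quantity,
    order by quantity desc, return target.level, quantity limit 10.

    target_levels is a hypothetical list holding the 'level' property of
    each node that the chain statement bound to the alias 'target'.
    """
    # group by + count: one quantity per distinct level.
    quantity_by_level = Counter(target_levels)
    # order by quantity desc, then limit: keep only the largest groups.
    return quantity_by_level.most_common(limit)


# Hypothetical levels of the terminal nodes found by the path template.
rows = summarize_levels([2, 1, 2, 3, 2, 1])
print(rows)  # [(2, 3), (1, 2), (3, 1)]
```

Each returned pair corresponds to one row of target.level and quantity: the chain statement produces the nodes, and the clauses only reshape that result.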
UQL Design Principles:
- Ability to return and use high-dimensional results;
- Advanced and clearly defined data structures, such as subgraph, path, node, edge, property, array and table;
- Easy depiction of subgraph filtering, with direct docking and adaptation to the high-performance computing engine;
- Minimal cognitive load: easy to read, write and learn;
- Chain statement + semantic assembly, with support for various functional extensions; the "chain" itself is a path;
- The functional style is better suited to today's complex data processing needs and leaves ample room for extension;
- The functional style allows users to customize extended syntax features to meet the demands of operating on complex graphs.
Why Not SQL-like:
- SQL cannot clearly express high-dimensional data and its combinations, such as paths, nodes, edges, properties and collections of aggregation results;
- Path filtering in SQL, as needed for path queries, template queries and graph traversal, is too complicated and inefficient;
- SQL constructs such as nested statements and table joins are hard to follow because they run contrary to how people naturally reason.
UQL Syntax Features
UQL has DQL, DDL, DML and DCL syntax features:
- DQL: Data Query Language, to query nodes, edges and paths in the graph;
- DDL: Data Definition Language, to add/delete GraphSets, modify schemas, control indexes, etc.;
- DML: Data Manipulation Language, to add, delete, modify and view the content of a GraphSet;
- DCL: Data Control Language, mainly used to manage database permission settings, such as user management, role management, and granting and revoking permissions.
Reserved Words
Category | Words |
---|---|
System Property | _id, _uuid, _from, _to, _from_uuid, _to_uuid |
System Table Alias | _graph, _nodeSchema, _edgeSchema, _nodeProperty, _edgeProperty, _nodeIndex, _edgeIndex, _nodeFulltext, _edgeFulltext, _statistic, _top, _task, _policy, _user, _privilege, _algoList, _extaList |
System Alias | this, prev_n, prev_e |
Clause Keyword | in, nin, contains, xor, as, group by, order by, asc, desc, skip, limit, return, with, where, uncollect, union, union all, call |
Function Keyword | case, when, then, else, end |
UQL Prefix | exec task, explain, profile, debug |
Query Prefix | optional |
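A client that generates UQL might want to avoid custom aliases that collide with the reserved words above. A minimal Python sketch follows. The words are copied from the table (system properties, system aliases and single-word keywords only; system table aliases and multi-word keywords are left out for brevity). The function name is_reserved and the assumption of exact, case-sensitive matching are hypothetical and not taken from the Ultipa documentation.

```python
# Single-word reserved words copied from the table above. Multi-word
# keywords such as "group by" are omitted on the assumption that a
# custom alias is a single token.
RESERVED_WORDS = {
    "_id", "_uuid", "_from", "_to", "_from_uuid", "_to_uuid",
    "this", "prev_n", "prev_e",
    "in", "nin", "contains", "xor", "as", "asc", "desc", "skip", "limit",
    "return", "with", "where", "uncollect", "union", "call",
    "case", "when", "then", "else", "end",
    "explain", "profile", "debug", "optional",
}


def is_reserved(alias):
    """Return True if alias exactly matches a reserved word.

    Assumption: matching is exact and case-sensitive; the source does not
    state how UQL treats letter case.
    """
    return alias in RESERVED_WORDS
```

With this sketch, the aliases target and quantity from the earlier example pass the check, while prev_e and limit are flagged.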