Ultipa (Ultipa Graph HTAP System, hereinafter Ultipa Graph) is an ultra-high-performance native graph computing service framework and storage infrastructure.
The Ultipa Graph product matrix contains the following:
- A real-time graph computing engine, described by Ultipa as the world's fastest;
- High-performance, high-availability persistent storage services;
- A concise and smooth graphical user interface and knowledge graph system;
- A feature-rich CLI toolkit;
- A high-performance and flexible import/export toolkit;
- Easy-to-deploy Docker images;
- SDKs and APIs for mainstream programming languages;
- UQL, a Graph Query Language (GQL), which is the focus of this manual.
Ultipa Graph supports a rich collection of query methods, many ultra-high-performance graph algorithms, and real-time processing of large amounts of data. By turning time-consuming, non-real-time operations into real-time ones, it saves users considerable time and opens up the use of graph computing products in business and big data analytics scenarios.
What is UQL
Ultipa Query Language, or UQL (also written uQL), is a general-purpose graph query language. It is a high-performance query and management language designed for the Ultipa Graph database and graph computing engine, and developers can learn it quickly to get started with the Ultipa Graph system. UQL supports querying, inserting, updating and deleting data, graph traversal, subgraph matching, schema management, property management, engine management, index management, GraphSet management, task management, permission management, and other functions of the Ultipa Graph system. Users can invoke UQL via Ultipa CLI (the command line tool), Ultipa Manager (a graphical interface) or the Ultipa SDKs (software development kits). UQL will soon achieve Turing completeness and be available to developers.
UQL keeps pace with the international GQL standard in overall functionality and compatibility. The GQL standard is expected to be launched in 2022. The Ultipa team has begun participating in the standardization organization to help build the standard graph query language.
|GQL||Graph Query Language, the only standard data query language introduced since SQL (Structured Query Language).|
|UQL (uQL)||Ultipa Graph Query Language, a kind of GQL that can fully operate the Ultipa Graph system.|
|Node||In graph theory, a node is formally called a "vertex". Ultipa Graph calls a vertex a "node".|
|Edge||An edge connects a pair of nodes. An edge is either (1) directed or (2) undirected (see the description of "Direction" below).|
|Path||A path is a sequence with a definite start node and end node that alternates between nodes and edges. Nodes in a path can repeat, while edges cannot. The sequence of nodes and edges in a path can be regarded as a unique identifier of the path.|
|Circle||A path is regarded as containing a circle when at least two of its nodes (except the start node) repeat each other. Paths with circles can be filtered out by using the parameter
|Graph||A dataset of nodes and edges is called a graph. A graph can be viewed as a collection of multiple paths. The smallest unit of a graph is the node.|
|Subgraph||A subgraph is a subset of the nodes and edges of the whole graph. The result of a node query or path query can be considered a subgraph.|
|GraphSet||A GraphSet comprises a set of nodes and edges along with the indexes, user privileges and algorithmic tasks created on the graph. In the Ultipa Graph system, a user can create more than one GraphSet.|
|Schema||In the Ultipa Graph system, a schema of a node or edge includes a set of properties that describe the structure and content of the node or edge. Each node or edge can belong to only one schema.|
|Property||A property belongs to a schema and describes a characteristic of a node or edge. Properties support rich data types such as
|Property Index||An index mechanism used to improve the query efficiency of properties. The created index tree is stored on disk.|
|Full-text Index||A full-text index (built by word segmentation of the text) is created to improve the query efficiency of long strings. Full-text indexes support different dictionaries in order to optimize word segmentation for different datasets.|
|Engine Index||Improves the efficiency of path queries and deep graph traversal by loading properties into the computing engine. Efficiency often increases by several orders of magnitude.|
|Instance||Ultipa Graph system instances are the running applications on Ultipa Server. Each instance generally runs on one virtual or physical host, and multiple instances can form a cluster environment.|
|Filter||Used to filter nodes and edges during graph queries. An Ultipa filter is essentially a logic tree, much like an IF in programming languages, built from various conditional, logical and numeric operators. Refer to the chapter Filter for details.|
|LTE||Load to Engine: loads a property into the computing engine.|
|UFE||Unload from Engine: removes a property from the computing engine.|
|Direction (In)||An edge pointing to node A from another node is called an "In" edge of node A, or an edge of node A in the inbound direction. It is written as either A<-- or -->A.|
|Direction (Out)||An edge pointing from node A to another node is called an "Out" edge of node A, or an edge of node A in the outbound direction. It is written as either A--> or <--A.|
|Direction (Left)||An edge in a given path that points from the latter node to the previous node is called a "Left" edge. It is written as A<--B.|
|Direction (Right)||An edge in a given path that points from the previous node to the latter node is called a "Right" edge. It is written as A-->B.|
||Unique identifier that is exclusive to node. It is stored as
||Unique identifier for nodes and edges. It is stored as
|Edge Start (
||The ID or UUID of the start node of directed edge.|
|Edge End (
||The ID or UUID of the end node of directed edge.|
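The Path and Direction entries above can be made concrete with a small sketch. It is written in Python rather than UQL, and the `Edge` tuple and the three helper functions are hypothetical names introduced only for illustration; they are not part of Ultipa Graph or its SDKs.

```python
from typing import NamedTuple

class Edge(NamedTuple):
    # Hypothetical in-memory edge record, not an Ultipa type.
    uuid: int
    start: str  # node the edge points from
    end: str    # node the edge points to

def edges_are_unique(path_edges):
    """A path may revisit nodes, but no edge may appear twice."""
    ids = [edge.uuid for edge in path_edges]
    return len(ids) == len(set(ids))

def direction_for_node(edge, node):
    """Return 'in' if the edge points to `node`, 'out' if it points away.

    A self-loop would be reported as 'in' by this simple check.
    """
    if edge.end == node:
        return "in"
    if edge.start == node:
        return "out"
    raise ValueError("edge is not attached to this node")

def direction_in_path(edge, previous_node, latter_node):
    """Return 'right' for previous --> latter, 'left' for previous <-- latter."""
    if (edge.start, edge.end) == (previous_node, latter_node):
        return "right"
    if (edge.start, edge.end) == (latter_node, previous_node):
        return "left"
    raise ValueError("edge does not connect these two nodes")

# Path A, e1, B, e2, C where e1 is A-->B and e2 is B<--C.
e1 = Edge(uuid=1, start="A", end="B")
e2 = Edge(uuid=2, start="C", end="B")
```

Here `e1` is an "Out" edge of A and an "In" edge of B, and within the path it is a "Right" edge, while `e2` is a "Left" edge because it points from the latter node C back to the previous node B.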
UQL's design is rooted in a deep understanding of graphs, and it meets industry demand for high-dimensional, extensible graphs.
Chain Statement + Semantic Assembly + Alias Call
- The add, modify, delete and view expressions of UQL are written as chain statements of the form [command].[parameter].[parameter]..., such as a().b().c().d().... A UQL statement may contain multiple chain statements.
- Query results of UQL are processed and assembled through clauses, such as
- Chain statements and clauses temporarily save data and pass or call data between each other through aliases, such as custom aliases defined by as, and system aliases such as prev_e that are used in template queries. An alias represents a column of data in the data stream; when the alias is called, the data in that column is called row by row.
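The chain style and the alias-as-column idea can be mimicked in a few lines of Python. This is only an analogy under stated assumptions: the `Chain` class, its `param`, `as_` and `render` methods, the `rows` helper and the sample stream are all hypothetical, and the rendered text reuses the placeholder names a, b, c from above rather than real UQL commands.

```python
class Chain:
    """Hypothetical builder that records a command and its chained parameters."""

    def __init__(self, command):
        self.parts = [command]
        self.alias = None

    def param(self, name):
        self.parts.append(name)
        return self  # returning self is what makes the calls chainable

    def as_(self, alias):
        self.alias = alias  # custom alias for the data this chain produces
        return self

    def render(self):
        text = ".".join(part + "()" for part in self.parts)
        return (text + " as " + self.alias) if self.alias else text


def rows(stream, alias):
    """Calling an alias walks its column of the data stream row by row."""
    for value in stream[alias]:
        yield value


statement = Chain("a").param("b").param("c").as_("x").render()

# Assumed shape of a data stream: one column of values per alias.
stream = {"x": ["row1", "row2", "row3"]}
```

With these definitions, `statement` holds the text `a().b().c() as x`, and `rows(stream, "x")` yields the values of column x one row at a time.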
UQL Design Principles:
- Ability to return and use high-dimensional results;
- Advanced and clearly defined data structures, such as subgraph, path, node, edge, property, array and table;
- Easy description of subgraph filtering, with direct connection and adaptation to the high-performance computing engine;
- Minimal cognitive load: easy to read, write and learn;
- Chain statement + semantic assembly, with support for various functional extensions; the "chain" itself is a path;
- The functional style is better suited to today's complex data processing needs and leaves ample room for extension;
- The functional style allows users to customize extended language syntax features to meet the demands of operating on complex graphs;
- Runs as functional JIT, which makes processing more efficient.
Why Not SQL-like:
- SQL cannot clearly express high-dimensional data and its combinations, such as paths, nodes, edges, properties and collections of aggregation results;
- Path filtering in SQL, such as path queries, template queries and graph traversal, is too complicated and inefficient;
- SQL constructs such as nested statements and table joins are hard to follow because they run counter to how people naturally reason.
UQL Syntax Features
UQL has DQL, DDL, DML and DCL syntax features:
- DQL: Data Query Language, to query nodes, edges and paths in the graph;
- DDL: Data Definition Language, to add/delete GraphSets, modify schemas, control indexes, etc.;
- DML: Data Manipulation Language, to add, delete, modify and view the content of a GraphSet;
- DCL: Data Control Language, mainly used to manage database permission settings, such as user management, role management, and granting and revoking permissions.
When naming a graph, schema, property, alias, index, policy, user and so on in UQL, the general naming conventions are as follows:
- 2~64 characters
- Must start with a letter
- May contain letters, underscores and digits ( _ , A-Z, a-z, 0-9)
Special cases are described separately when introducing the relevant content.
Custom names are case sensitive. For example, GraphSet "Bank" and "bank" are regarded as two different GraphSets.
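The general convention above can be checked with a short regular expression. The following is a minimal Python sketch; the function name `is_valid_name` is hypothetical, and it covers only the general rule, not the special cases described elsewhere in the manual.

```python
import re

# One leading letter, then 1 to 63 letters, digits or underscores: 2~64 characters in total.
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{1,63}")

def is_valid_name(name):
    """Return True if `name` follows the general naming convention."""
    return NAME_PATTERN.fullmatch(name) is not None

# Names are compared as-is, so "Bank" and "bank" are two different names.
names_differ = "Bank" != "bank"
```

For example, `is_valid_name("Bank")` and `is_valid_name("bank")` both pass, while a single-character name, a name starting with a digit or underscore, or a name containing a hyphen is rejected.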
|properties begin with
||reserved character for system properties||system defined property, such as
|full-text filters begin with
||reserved character for full-text filter||the full-text filter that is being called, i.e. prefix
||system alias||previous node of the current node/edge in template query (case sensitive)|
||system alias||previous edge of the current node/edge in template query (case sensitive)|
||system alias||current node/edge in various queries, only used to disambiguate UQL statements (case sensitive)|
||conditional operator||to judge whether an operand exists in a set|
||conditional operator||to judge whether an operand doesn't exist in a set|
||conditional operator||to judge whether a full-text index contains one or multiple values|
||clause keyword||to define alias|
||clause keyword||to group the rows in data stream, it's a combination of
||clause keyword||to sort the rows in data stream, it's a combination of
||clause keyword||to define the order is ascending or descending in
||clause keyword||to discard the first N rows of the data stream|
||clause keyword||to keep the first N rows of the data stream|
||clause keyword||to assemble the return values|
||clause keyword||to assemble the values and to pass backwards|
||clause keyword||to filter the rows in data stream|
||clause keyword||to expand the array elements in the column|
||function keyword||to map the rows in data stream|
||function keyword||to define the conditions that the rows have to meet in
||function keyword||the value after mapping of the rows that meet the conditions in
||function keyword||the value after mapping of the rows that don't meet the conditions in
||function keyword||to end
||task prefix||to run non-algorithm UQLs as background tasks|
||no-return prefix||to return a pseudo result for the queries that have no return, can be a node, an edge or a path whose UUID is 0|