Every chain statement (insertion, update, or query for Node, Edge, or Path) and every clause in UQL produces a data stream. Both chain statements and clauses can also accept data streams as input.
A data stream can contain multiple data columns; the length of a column is its number of rows. A data column must be given an alias before subsequent statements can use it, and the structure type of a data column is the same as the structure type of its alias (NODE, EDGE, PATH, ATTR, ARRAY, TABLE). See Query for more information about Alias.
Homologous columns are the one or more data columns derived from the same data stream (that is, from the same chain statement or clause), together with any data columns that were passed into that statement as input, if applicable. During the execution of UQL (except for the deduplication operation in the RETURN clause), homologous columns have the same length, and the values in the same row of each column are correlated.
Example: The template query statement in the image below found 5 paths. The column of the paths' end nodes, the column of path lengths (number of edges), and the column of the paths themselves are homologous columns, and each contains 5 rows of data.
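The relationship between homologous columns can be modelled outside UQL. The Python sketch below is only an illustration, not UQL syntax or Ultipa's implementation: the node names, the aliases `p`, `end_node` and `length`, and the helpers `same_length` and `rows` are all hypothetical. It derives three columns from one set of 5 paths and shows that they share a length and stay correlated row by row.

```python
# Hypothetical model: a data stream is a dict mapping alias -> column (a list).
# Each path is represented as its sequence of node names (made-up data).
paths = [
    ["A", "B"],
    ["A", "C"],
    ["A", "B", "D"],
    ["A", "C", "D"],
    ["A", "B", "D", "E"],
]

# Three columns derived from the same statement result, hence homologous.
stream = {
    "p": paths,
    "end_node": [p[-1] for p in paths],     # last node of each path
    "length": [len(p) - 1 for p in paths],  # number of edges in each path
}


def same_length(columns):
    """Return True if every column has the same number of rows."""
    return len({len(col) for col in columns.values()}) == 1


def rows(columns):
    """Yield one dict per row, pairing the correlated values of each column."""
    aliases = list(columns)
    for values in zip(*(columns[a] for a in aliases)):
        yield dict(zip(aliases, values))


if __name__ == "__main__":
    print(same_length(stream))
    for row in rows(stream):
        print(row)
```

Equal length alone does not make columns homologous. What matters in this model is that row i of every column describes the same path.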
If one of the homologous columns is aggregated or deduplicated (column length becomes smaller), or processed with the UNCOLLECT operation (column length becomes larger), the length of the other homologous columns may be affected. Details follow in the later sections on WITH and RETURN.
Non-homologous columns are usually derived from two or more unrelated statements. Their lengths often differ, and the values in the same row are not correlated.
Example: The two K-Hop query statements in the image below produce two data streams of lengths 3 and 5. These two data columns are non-homologous.
Data Stream and Chain Statement
During the execution of UQL, when a data stream produced earlier is used as the input to a subsequent chain statement, the chain statement is executed as many times as the length of that data stream, and each execution uses one row of the data stream (the system may apply optimizations depending on the actual situation).
Example: The template query statement in the image below found multiple paths with 4 distinct end nodes after deduplication. The deletion statement that follows can be viewed as being executed 4 times in total, deleting 1 node each time:
A query statement that is executed multiple times according to the length of its input data stream produces a new data stream whose length equals the sum of the numbers of results of all executions.
Example: The template query statement in the image below found 2 nodes, one blue and one red. The first execution of the subsequent node query statement found 2 blue nodes and the second execution found 3 red nodes, giving 5 rows of data in total:
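The per-row execution model can be sketched in Python. This is a conceptual illustration under assumed data, not how the UQL engine works internally: `NODE_COLORS`, `find_nodes_by_color` and `run_per_row` are hypothetical names. The query runs once for each row of the upstream stream, and the output length is the sum of the per-execution result counts.

```python
# Hypothetical in-memory data: node id -> color (2 blue nodes, 3 red nodes).
NODE_COLORS = {1: "blue", 2: "blue", 3: "red", 4: "red", 5: "red"}


def find_nodes_by_color(color):
    """Stand-in for a node query: return the ids of all nodes with this color."""
    return [node for node, c in NODE_COLORS.items() if c == color]


def run_per_row(input_column, query):
    """Execute `query` once per input row and concatenate all results."""
    output = []
    for row in input_column:
        output.extend(query(row))
    return output


# The upstream data stream has 2 rows, so the query executes twice.
upstream = ["blue", "red"]
result = run_per_row(upstream, find_nodes_by_color)

if __name__ == "__main__":
    print(len(result))  # 2 blue + 3 red = 5 rows
```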
When multiple data streams are used simultaneously as the input to a chain statement, the chain statement can be understood to execute as many times as the length of the shortest stream.
Example: The template query statement in the image below produced two data streams of lengths 3 and 2. The longer data stream is trimmed to the shortest length of 2, so 2 rows of data are passed into the third query statement, and 3 query results are obtained:
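Trimming to the shortest stream behaves much like Python's built-in `zip`, which stops at the end of the shortest input. The sketch below is an analogy with made-up data, not UQL itself: the stream contents, the `RESULTS` table and `third_query` are hypothetical, chosen so that 2 paired rows yield 3 results in total.

```python
# Two hypothetical upstream data streams of lengths 3 and 2.
stream_a = ["a1", "a2", "a3"]
stream_b = ["b1", "b2"]

# Hypothetical results of the third query for each (a, b) input pair.
RESULTS = {
    ("a1", "b1"): ["r1"],
    ("a2", "b2"): ["r2", "r3"],
}


def third_query(a, b):
    """Stand-in for a query that takes one row from each input stream."""
    return RESULTS.get((a, b), [])


# zip stops at the shortest input, so "a3" is dropped and 2 rows remain.
paired = list(zip(stream_a, stream_b))

# One execution per paired row; the results are concatenated.
output = [r for a, b in paired for r in third_query(a, b)]

if __name__ == "__main__":
    print(paired)
    print(output)
```

Note that the row pairing here is purely positional: row i of one stream meets row i of the other, even though the two streams are unrelated.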
Trimming multiple data streams to the shortest length is rarely useful in practice. More often, multiple data streams are first combined into one data stream for use in the subsequent chain statement. Details follow in the section on WITH.
Data Stream and Clause
Different clauses process data streams differently. Some clauses, such as GROUP BY, ORDER BY, LIMIT and SKIP, can only process homologous columns, while others, such as WITH, RETURN and WHERE, can process both homologous and non-homologous columns. Whether the input columns are homologous or not, every clause produces homologous columns after execution. See the following sections for each clause.
The vast majority of clauses require the data columns they operate on to be explicitly designated. A few clauses (such as SKIP and LIMIT) do not, because they operate on the results from their subsequent clauses.