Ultipa offers a rich and ever-growing set of algorithms for graph analytics. This document explains the basic concepts and implementation of each algorithm, as well as how to invoke it and write back its results.
Many Ultipa algorithms compute in real time. Algorithms that run over the whole graph and all of its data can achieve a near-real-time effect by running as asynchronous tasks. The parameter write() makes an algorithm run as a task, and adding the exec task prefix allocates that task to an analysis node (server) for computing, as detailed in the Task chapter of the UQL documentation.
The Ultipa algorithm package is offered to users as a hot-pluggable plugin that can be hot-updated. Both an advanced algorithm package and a custom algorithm package are available.
Show Algorithm
Returned table name: _algoList
Returned table header: name | param | detail
Syntax:
// To show information of algorithms installed in the current Ultipa instance
show().algo()
Algorithm Command and Parameter
- Command:
algo(<algorithm_name>)
- Parameter:
Parameter | Data Type | Specification | Description |
---|---|---|---|
params() | obj | Mandatory | To configure the algorithm with objects wrapped by key-value (KV) pairs; see each algorithm chapter for the configuration details |
node_filter() | obj | Filter | The filtering rules for all nodes; nodes that do not satisfy the rules will be ignored |
edge_filter() | obj | Filter | The filtering rules for all edges; edges that do not satisfy the rules will be ignored |
Support for the parameters node_filter() and edge_filter() is being added to algorithms progressively, so they are not yet detailed in each algorithm chapter.
Algorithm Execution Result
Algorithm Results
In general, algorithm results are a data stream of multiple columns.
Algorithm Statistics
In general, algorithm statistics are a single data row of multiple key-value (KV) pairs. Some algorithms do not have statistics.
Algorithm Execution Method
There are four methods of executing an algorithm; only one can be used at a time:
Execution Method | Execution Parameter | Description | Assemble with Other UQL Statement |
---|---|---|---|
Task Writeback | write() | Run the algorithm as a task; the task ID is returned to the client (SDK/API/Manager). Algorithm results can be written back to file (through RPC) or to node/edge property, and algorithm statistics can be written back to the task information | Not supported |
Direct Return | / | Define aliases for the algorithm statement and use them with a return clause, to return algorithm results and statistics directly to the client | Not supported |
Streaming Return | stream() | Return algorithm results in real time | Supported |
Real-time Statistics | stats() | Return algorithm statistics in real time | Supported |
See chapter Task of the UQL documentation for algorithm task operations.
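The four execution methods differ only in what follows params() in the statement. As a rough illustration, the Python sketch below assembles the statement text for each method and rejects a write() body combined with any other method. The helper name build_algo_statement and its arguments are hypothetical and not part of any Ultipa SDK; the helper only builds a string and does not send anything to a server.

```python
# Suffix appended after params() for each execution method in the table above.
# "direct" appends nothing: it relies on aliases plus a return clause instead.
_METHOD_SUFFIX = {
    "write": ".write({body})",
    "direct": "",
    "stream": ".stream()",
    "stats": ".stats()",
}


def build_algo_statement(name: str, params: str = "", method: str = "direct",
                         write_body: str = "") -> str:
    """Compose the text of an algo() statement that uses exactly one execution method.

    This is a hypothetical string helper, not an Ultipa API.
    """
    if method not in _METHOD_SUFFIX:
        raise ValueError(f"unknown execution method: {method!r}")
    if write_body and method != "write":
        raise ValueError("a write() body only applies to task writeback")
    suffix = _METHOD_SUFFIX[method].format(body=write_body)
    return f"algo({name}).params({params}){suffix}"


# Statements matching the examples used in this document.
stats_stmt = build_algo_statement("degree", method="stats")
write_stmt = build_algo_statement("lpa", "{k:2}", "write", '{db:{property: "label"}}')
```

An alias (for example `as sta`) and any return clause would still be appended by the caller, since alias rules differ between the methods.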
Task Writeback
Use the write() parameter to write the algorithm results and statistics back to specified location(s). Three kinds of writeback are available: file writeback (file), property writeback (db) and statistics writeback (stats). Writing back to file and to property at the same time is not supported; however, the statistics are written back to the task automatically if the algorithm has statistics.
1. File Writeback
Write the algorithm results back to one or multiple files; filename(s) are required. The file(s) contain no table header, and the values in each row are separated by commas.
Syntax: Wrap file object in the parameter write()
// Write the algorithm results back to one file, and automatically write the algorithm statistics (if any) back to the algorithm task
algo(<>).params(<>).write({
file:{
filename: "<filename>"
}
})
// Write the algorithm results back to multiple files, and automatically write the algorithm statistics (if any) back to the algorithm task
algo(<>).params(<>).write({
file:{
filename_<result1>: "<filename>",
filename_<result2>: "<filename>",
...
}
})
For the filename extension, .csv or .txt is suggested; the extension can also be omitted.
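Because the written-back file has no header row, whoever reads it has to supply the column names. The Python sketch below shows one way to do that with the standard csv module. The column names _uuid and count and the sample rows are assumptions made for illustration; the actual columns depend on the algorithm and are described in each algorithm chapter.

```python
import csv
import io
from typing import Dict, Iterable, Iterator, List


def read_writeback_rows(lines: Iterable[str], columns: List[str]) -> Iterator[Dict[str, str]]:
    """Yield one dict per data row, pairing each value with a caller-supplied column name."""
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        if len(row) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(row)}: {row}")
        yield dict(zip(columns, row))


# Hypothetical file content standing in for a downloaded writeback file.
sample = io.StringIO("1,3\n2,0\n")
rows = list(read_writeback_rows(sample, ["_uuid", "count"]))
```

Values are kept as strings here; converting them to numbers is left to the caller, since the data type of each column varies by algorithm.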
Example: Run the Triangle Counting algorithm on the graph and write the results back to a file named count.csv
algo(triangle_counting).params().write({
file:{
filename: "count.csv"
}
})
Results displayed in Ultipa Manager: the file download link and the algorithm statistics are both displayed under the Result column of the task information.
Example: Run the K-Hop Whole Graph algorithm on the graph and write the results back to files named khop_ids.csv and khop_num.csv
algo(khop_all).params().write({
file:{
filename_ids: "khop_ids.csv",
filename_num: "khop_num.csv"
}
})
Results displayed in Ultipa Manager: the file download links are displayed under the Result column of the task information; the K-Hop Whole Graph algorithm does not have statistics.
2. Property Writeback
Write the algorithm results back to one or multiple properties; the property name is required. The property to be written back can be a node property or an edge property.
Property writeback is a whole-data operation: it writes to all nodes or all edges in the current graphset, so no schema needs to be specified when the property name is given. If a schema does not contain the property, the property is created automatically; if the property exists but its data type differs from that of the data to be written back, the writeback fails for that schema. Nodes or edges that have calculation results receive those results in the property; nodes or edges that have no result (for example, when the algorithm does not run on the whole data) receive 0, null or another empty value, according to the data type.
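The per-schema rule described above (create the property if it is missing, write if the type matches, fail if the type differs) can be modelled in a few lines of Python. This is only a sketch of the stated rule, not of how Ultipa implements it; the schema names, property names and type names are all made up for illustration.

```python
from typing import Dict

# Each schema maps a property name to a data type name. All names are illustrative.
Schema = Dict[str, str]


def plan_property_writeback(schemas: Dict[str, Schema], prop: str,
                            result_type: str) -> Dict[str, str]:
    """Return the outcome for each schema: 'create', 'write' or 'fail'."""
    plan = {}
    for schema_name, properties in schemas.items():
        if prop not in properties:
            plan[schema_name] = "create"  # missing property is created automatically
        elif properties[prop] == result_type:
            plan[schema_name] = "write"   # types match, so results are written
        else:
            plan[schema_name] = "fail"    # type mismatch fails for this schema only
    return plan


schemas = {
    "account": {"name": "string"},
    "card": {"centrality": "float"},
    "merchant": {"centrality": "string"},
}
plan = plan_property_writeback(schemas, "centrality", "float")
```

In this hypothetical graphset the writeback would create the property on account, write to card, and fail for merchant only, without affecting the other two schemas.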
Syntax: Wrap db object in parameter write()
// Write algorithm results back to one or multiple properties, and write the algorithm statistics (if any) back to the algorithm task
algo(<>).params(<>).write({
db:{
property: "<property>"
}
})
Example: Run Closeness Centrality algorithm in the graph and write the results back to node property centrality
algo(closeness_centrality).params().write({
db:{
property: "centrality"
}
})
Results displayed in Ultipa Manager: the algorithm results are written back to the property centrality of nodes; the Closeness Centrality algorithm does not have statistics, so the Result column of the task information is blank.
Example: Run the Label Propagation algorithm on the graph, keeping at most two labels for each node, and write the results back to node properties label_1, probability_1, label_2 and probability_2
algo(lpa).params({k:2}).write({
db:{
property: "label"
}
})
Results displayed in Ultipa Manager: the algorithm results are written back to the properties label_1, probability_1, label_2 and probability_2 of nodes, and the algorithm statistics are displayed under the Result column of the task information.
When an algorithm produces multiple results for each node (or edge), only one property name <property> still needs to be specified; the system automatically writes the results back to different properties.
In the Label Propagation example above, only the property name label is specified, for the 'label' result; the property name for the 'probability' result is pre-defined and cannot be modified, and the system automatically appends a serial number to each property name when writing back.
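To make the naming pattern in the Label Propagation example concrete, the Python sketch below reproduces the four property names from a user-chosen label property and k. It is inferred from that single example only: the numbering from 1, the underscore separator and the behaviour for other values of k are assumptions, and the function name is hypothetical.

```python
from typing import List


def lpa_writeback_properties(label_property: str, k: int) -> List[str]:
    """List the property names written back for k labels per node.

    The label name is user-chosen; 'probability' is fixed, as the text states.
    """
    names = []
    for i in range(1, k + 1):
        names.append(f"{label_property}_{i}")  # serial number appended to the label
        names.append(f"probability_{i}")       # pre-defined name, same serial number
    return names


names = lpa_writeback_properties("label", 2)
```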
3. Statistics Writeback
Algorithm statistics are recorded in the task information if the algorithm has statistics. They are recorded automatically during file writeback or property writeback; statistics writeback records the algorithm statistics alone.
Syntax: Use parameter write() directly
// Write algorithm statistics back to task
algo(<>).params(<>).write()
Example: Run Degree algorithm in the graph and write the statistics total_degree and average_degree back to the task
algo(degree).params().write()
Results displayed in Ultipa Manager: the algorithm statistics are written back to the Result column of the task information.
Direct Return
Syntax: Define alias and assemble with return clause
// Return algorithm results and statistics in real time
algo(<>).params(<>) as <alias1>, <alias2>
return <alias1>, <alias2>
// Return algorithm results in real time
algo(<>).params(<>) as <alias>
return <alias>
The order of the aliases defined for an algorithm statement matters: the first alias refers to the algorithm results, and the second refers to the algorithm statistics. This order is fixed by the system. If the algorithm does not have statistics, only one alias can be defined for the algorithm statement.
A specific column of the algorithm results or statistics can be referenced in the return clause. For the definition and reference of aliases, see the Query chapter of the UQL documentation.
Example: Run Degree algorithm in the graph and return the results (define as alias a1) and the statistics (define as alias a2) directly
algo(degree).params() as a1, a2
return a1, a2
Results displayed in Ultipa Manager: a1, a2
Example: Modify the above example, return algorithm results and statistics in separated columns
algo(degree).params() as a1, a2
return a1._uuid, a1.degree, a2.total_degree, a2.average_degree
Results displayed in Ultipa Manager: a1._uuid, a1.degree, a2.total_degree, a2.average_degree
For the definition, usage and reference of alias, please read the Query chapter of the UQL documentation.
Streaming Return
Syntax: Use parameter stream()
// Execute the algorithm and define the results as alias '<alias>'; the results can then be returned or used as the input of subsequent UQL statements
algo(<>).params(<>).stream() as <alias>
...
Example: Run Closeness Centrality algorithm in the graph and return the results as real-time data stream (define as alias cc)
algo(closeness_centrality).params().stream() as cc
return cc
Results displayed in Ultipa Manager: cc
Example: Modify the above example, return UUID of nodes whose closeness centrality is greater than 0.5
algo(closeness_centrality).params().stream() as cc
where cc.centrality > 0.5
return cc._uuid
Results displayed in Ultipa Manager: cc._uuid
Real-time Statistics
Syntax: Use parameter stats()
// Execute the algorithm and define the statistics as alias '<alias>'; the statistics can then be returned or used as the input of subsequent UQL statements
algo(<>).params(<>).stats() as <alias>
...
Example: Run Degree algorithm in the graph and return the statistics (define as alias sta) in real time
algo(degree).params().stats() as sta
return sta
Results displayed in Ultipa Manager: sta
Example: Modify the above example, return algorithm statistics in separated columns
algo(degree).params().stats() as sta
return sta.total_degree, sta.average_degree
Results displayed in Ultipa Manager: sta.total_degree, sta.average_degree
Algorithm Results Visualization (Under Development)
algo_dv(<algorithm_name>).params(<configuration>).id(<task_id>)
Algorithm results visualization refers to a 3D presentation of the algorithm results according to the computing purpose, especially the rendering of the color and results of nodes. The algorithm must first finish executing as a background task (i.e. with the write() parameter) before its results can be visualized.
Syntax:
- Command:
algo_dv(<algorithm_name>)
- Parameter: (see the table below)
Parameter | Type | Specification | Description |
---|---|---|---|
params() | obj | Mandatory | To configure the algorithm with objects; see later chapters for the configuration details of each algorithm |
id() | int | Mandatory | The ID of the task created earlier by running algo().write(type: "visualization"); the <algorithm_name> in algo_dv() must be consistent with the algorithm name in that task |