Overview
A topological sorting of a directed graph is an ordering of its nodes into a sequence, where the start node of every edge appears before its end node. Topological sorting is applicable only to directed acyclic graphs (DAGs) that do not contain any cycles.
Topological sorting have various applications in computer science and other fields. In project management and job scheduling, topological sorting plays a crucial role in determining the optimal order of task execution based on their dependencies. It is also useful for resolving dependencies between modules, libraries, or components in software development. By utilizing this algorithm, dependencies can be resolved in the correct sequence, mitigating conflicts and potential errors.
Concepts
Directed Acyclic Graph (DAG)
A directed acyclic graph (DAG) is a type of directed graph with no directed cycles. That is, it is not possible to start at any node v and follow a directed path to return back to v in a DAG.
As shown here, the first and second graphs are DAGs, while the third graph does contain a directed cycle (B→C→D→B) and therefore does not qualify as a DAG.
A directed graph is a DAG if and only if it can be topologically sorted.
Topological Sort
Every DAG has at least one topological sorting.
In the above examples, nodes in the first graph has 3 possible sortings:
- A, E, B, D, C
- A, B, E, D, C
- A, B, D, E, C
A DAG has a unique topological sorting if and only if it has a directed path containing all the nodes, in which case the sorting is the same as the order in which the nodes appear in the path.
In the following example, nodes only has 1 possible sorting: A, B, D, C, E, F.
Considerations
Running the Topogical Sort algorithm on a graph with cycles will cause the omitting of some nodes. The omitted nodes are:
- Nodes that are part of a cycle (including self-cycles).
- Nodes that are reachable from the above nodes through outgoing edges.
In the given example, first is to omit nodes C, D and G, which form the cycle. Then, nodes F, J and H which are reachable from them are also omitted. As a result, the topological sorting result is A, I, B, E.
If a graph is disconnected, or becomes disconnected after omiting nodes that form the cycle and nodes influenced by them, the topological sorting is performed within each connected component. The sorting results are then returned consistently across all components. Isolated nodes are also included in the results and are not overlooked.
Example Graph
To create this graph:
// Runs each row separately in order in an empty graphset
insert().into(@default).nodes([{_id:"A"}, {_id:"B"}, {_id:"C"}, {_id:"D"}, {_id:"E"}, {_id:"F"}, {_id:"G"}, {_id:"H"}])
insert().into(@default).edges([{_from:"A", _to:"B"}, {_from:"A", _to:"C"}, {_from:"A", _to:"D"}, {_from:"A", _to:"E"}, {_from:"E", _to:"G"}, {_from:"F", _to:"D"}, {_from:"F", _to:"E"}, {_from:"H", _to:"G"}])
Creating HDC Graph
To load the entire graph to the HDC server hdc-server-1
as hdc_topo_sort
:
CALL hdc.graph.create("hdc-server-1", "hdc_topo_sort", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
})
hdc.graph.create("hdc_topo_sort", {
nodes: {"*": ["*"]},
edges: {"*": ["*"]},
direction: "undirected",
load_id: true,
update: "static",
query: "query",
default: false
}).to("hdc-server-1")
Parameters
Algorithm name: topological_sort
Name |
Type |
Spec |
Default |
Optional |
Description |
---|---|---|---|---|---|
return_id_uuid |
String | uuid , id , both |
uuid |
Yes | Includes _uuid , _id , or both to represent nodes in the results; this option is only valid in File Writeback. |
File Writeback
CALL algo.topological_sort.write("hdc_topo_sort", {
params: {
return_id_uuid: "id"
},
return_params: {
file: {
filename: "sort"
}
}
})
algo(topological_sort).params({
projection: "hdc_topo_sort",
return_id_uuid: "id"
}).write({
file: {
filename: "sort"
}
})
Result:
_id
F
H
A
B
C
D
E
G
Full Return
CALL algo.topological_sort("hdc_topo_sort", {
params: {},
return_params: {}
}) YIELD result
RETURN result
exec{
algo(topological_sort).params() as result
return result
} on hdc_topo_sort
[{"id":"F","uuid":"2882304861028745219","schema":"default","values":{}}]
[{"id":"H","uuid":"3386708019294240772","schema":"default","values":{}}]
[{"id":"A","uuid":"10016006670783610881","schema":"default","values":{}}]
[{"id":"B","uuid":"3530823207370096641","schema":"default","values":{}}]
[{"id":"C","uuid":"12033619303845593090","schema":"default","values":{}}]
[{"id":"D","uuid":"288231475663339522","schema":"default","values":{}}]
[{"id":"E","uuid":"10520409829049106435","schema":"default","values":{}}]
[{"id":"G","uuid":"13690943966717935617","schema":"default","values":{}}]
Stream Return
CALL algo.topological_sort("hdc_topo_sort", {
params: {},
return_params: {
stream: {}
}
}) YIELD r
FOR node IN r
RETURN node._id
exec{
algo(topological_sort).params({}).stream() as r
uncollect r as node
return node._id
} on hdc_topo_sort
Result:
node._id |
---|
F |
H |
A |
B |
C |
D |
E |
G |