UltipaDocs
Products
Solutions
Resources
Company
Start Free Trial
UltipaDocs
Start Free Trial
  • Introduction
  • GQL vs Other Languages
    • Overview
    • Node and Edge Patterns
    • Path Patterns
    • Quantified Paths
    • Questioned Paths
    • Shortest Paths
    • Cheapest Paths
    • K-Hop Traversal
    • Graph Patterns
    • Overview
    • Open Graphs
    • Closed Graphs
    • Graph Types
    • Constraints
    • Projections
    • Storage Maintenance
    • Node and Edge IDs
    • INSERT
    • INSERT OVERWRITE
    • UPSERT
    • MERGE
    • SET
    • REMOVE
    • DELETE
    • FOREACH
    • LOAD CSV
    • Query Composition
    • Result Table and Visualization
    • MATCH
    • OPTIONAL MATCH
    • FILTER
    • LET
    • FOR
    • ORDER BY
    • LIMIT
    • SKIP
    • CALL
    • RETURN
    • Composite Query
    • NEXT
    • All Functions
    • Element Functions
    • Path Functions
    • Aggregate Functions
    • Mathematical Functions
    • Trigonometric Functions
    • String Functions
    • List Functions
    • Datetime Functions
    • Spatial Functions
    • Null Functions
    • Utility Functions
    • Type Conversion Functions
    • Table Functions
  • Operators
  • Predicates
    • Overview
    • CASE
    • LET Value Expression
    • Value Query Expression
    • List Expressions
    • Current Values
    • Index
    • Full-text Index
    • Vector Index
  • Transactions
  • Triggers
  • Query Management
  • Execution Plan
    • Variables
    • Values and Types
    • Comments
    • Reserved Words
    • Naming Conventions
    • Syntactic Notation
  • GQL Conformance
  1. Docs
  2. /
  3. ISO GQL
  4. /
  5. Functions

Aggregate Functions

Overview

An aggregate function performs a calculation on a set of values and returns a single scalar value.

DISTINCT

All aggregate functions support the use of the set quantifier DISTINCT to deduplicate values before aggregation.

Null Values

Rows containing null values are ignored by all aggregate functions, except count(*).

Example Graph

GQL
INSERT (p1:Paper {_id:'P1', title:'Efficient Graph Search', score:6, author:'Alex'}),
       (p2:Paper {_id:'P2', title:'Optimizing Queries', score:9, author:'Alex'}),
       (p3:Paper {_id:'P3', title:'Path Patterns', score:7, author:'Zack'}),
       (p1)-[:Cites {weight:2}]->(p2),
       (p2)-[:Cites {weight:1}]->(p3)

collect()

Collects a set of values into a list. collect_list() is a synonym.

Syntaxcollect(<values>)
ArgumentsNameTypeDescription
<values>AnyThe target values
Return TypeLIST
GQL
MATCH (n)
RETURN collect(n.title)

Result:

collect(n.title)
["Optimizing Queries","Efficient Graph Search","Path Patterns"]

collect_distinct()

Collects a set of values into a list, removing duplicates. Equivalent to collect(DISTINCT <values>).

Syntaxcollect_distinct(<values>)
ArgumentsNameTypeDescription
<values>AnyThe target values
Return TypeLIST
GQL
MATCH (n:Paper)
RETURN collect_distinct(n.author)

Result:

collect_distinct(n.author)
["Zack","Alex"]

count()

Returns the number of rows in the input.

Syntaxcount(<values>)
ArgumentsNameTypeDescription
<values>AnyThe target values
Return TypeUINT
GQL
MATCH (n)
RETURN count(n)

Result: 3

count(*)

count(*) returns the number of rows in the intermediate result table.

Comparing the following two queries, the null values are only considered when using count(*):

GQL
FOR item IN [1, "a", "2", "b3", null]
RETURN count(item)

Result: 4

GQL
FOR item IN [1, "a", "2", "b3", null]
RETURN count(*)

Result: 5

count(DISTINCT)

You can include the set quantifier DISTINCT in count() to return the number of distinct rows in the input.

GQL
FOR item IN [1, 1, "a", "2", "b3", null]
RETURN count(DISTINCT item)

Result: 4

max()

Returns the maximum value in a set of values.

Syntaxmax(<values>)
ArgumentsNameTypeDescription
<values>AnyThe target values
Return TypeNumeric
GQL
MATCH (n)
RETURN max(n.score)

Result: 9

min()

Returns the minimum value in a set of values.

Syntaxmin(<values>)
ArgumentsNameTypeDescription
<values>AnyThe target values
Return TypeNumeric
GQL
MATCH (n)
RETURN min(n.score)

Result: 6

avg()

Computes the average of a set of numeric values.

Syntaxavg(<values>)
ArgumentsNameTypeDescription
<values>NumericThe target values
Return TypeDOUBLE
GQL
MATCH (n)
RETURN avg(n.score)

Result: 7.333333333333333

sum()

Computes the sum of a set of numeric values.

Syntaxsum(<values>)
ArgumentsNameTypeDescription
<values>NumericThe target values
Return TypeDOUBLE
GQL
MATCH (n)
RETURN sum(n.score)

Result: 22

percentile_cont()

Computes the continuous percentile value over a set of numeric values.

Syntaxpercentile_cont(<values>, <percentile>)
ArgumentsNameTypeDescription
<values>NumericThe target values
<percentile>NumericNumber between 0.0 and 1.0
Return TypeDOUBLE

percentile_cont() is computed using the following steps:

  • Sort the values in ascending order.
  • Compute the percentile position as p = percentile × (n − 1) + 1, where n is the number of non-null values.
  • Determine the percentile value using linear interpolation:
    • If p is an integer, the corresponding value at that position is the percentile value.
    • If p is a decimal between two integers p1 and p2 (p1 < p < p2), interpolate between the value v1 at position p1 and the value v2 at position p2 to compute the percentile value as v1 + (p - p1) × (v2 - v1).
GQL
FOR item IN [3, 9, 4, 7, 6]
RETURN percentile_cont(item, 0.4)

Result: 5.2

GQL
FOR item IN [3, 9, 4, 7, 6]
RETURN percentile_cont(item, 0.5)

Result: 6

percentile_disc()

Computes the discrete percentile value over a set of numeric values.

Syntaxpercentile_disc(<values>, <percentile>)
ArgumentsNameTypeDescription
<values>NumericThe target values
<percentile>NumericNumber between 0.0 and 1.0
Return TypeDOUBLE

percentile_disc() is computed using the following steps:

  • Sort the values in ascending order.
  • Compute the percentile position as p = ceil(percentile × n), where n is the number of non-null values.
  • The value at the position p is selected as the percentile value.
GQL
FOR item IN [3, 9, 4, 7, 6]
RETURN percentile_disc(item, 0.4)

Result: 4

GQL
FOR item IN [3, 9, 4, 7, 6]
RETURN percentile_disc(item, 0.5)

Result: 6

stddev_samp()

Computes the sample standard deviation of a set of numeric values. stddev() is a synonym.

stddev_samp( x 1 , ...,  x n ) = 1 n − 1 ∑ i = 1 n ( x i − x — ) 2

Syntaxstddev_samp(<values>)
ArgumentsNameTypeDescription
<values>NumericThe target values
Return TypeDOUBLE
GQL
MATCH (n)
RETURN stddev_samp(n.score)

Result: 1.5275252316519468

stddev_pop()

Computes the population standard deviation of a set of numeric values.

stddev_pop( x 1 , ...,  x n ) = 1 n ∑ i = 1 n ( x i − x — ) 2

Syntaxstddev_pop(<values>)
ArgumentsNameTypeDescription
<values>NumericThe target values
Return TypeNumeric
GQL
MATCH (n)
RETURN stddev_pop(n.score)

Result: 1.247219128924647

var_samp()

Computes the sample variance of a set of numeric values (the square of stddev_samp()). variance() is a synonym.

Syntaxvariance(<values>) or var_samp(<values>)
ArgumentsNameTypeDescription
<values>NumericThe target values
Return TypeDOUBLE
GQL
MATCH (n)
RETURN variance(n.score)

Result: 2.3333333333333335

var_pop()

Computes the population variance of a set of numeric values (the square of stddev_pop()).

Syntaxvar_pop(<values>)
ArgumentsNameTypeDescription
<values>NumericThe target values
Return TypeDOUBLE
GQL
MATCH (n)
RETURN var_pop(n.score)

Result: 1.5555555555555556