A full-text index is a type of index specialized for efficiently searching string or text properties, especially large text fields such as descriptions, comments, or articles.
Full-text indexes work by breaking down the text into smaller segments called tokens. When a query is performed, the search engine matches specified keywords against these tokens instead of the original full text, allowing for faster retrieval of relevant results. Full-text indexes support both precise and fuzzy matches.
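The token-matching idea can be illustrated with a toy inverted index. This is only a conceptual sketch in Python, not how the database implements its index: the tokenizer (lowercasing and splitting on non-alphanumeric characters) and the function names `tokenize`, `build_index`, and `search_all` are assumptions made for illustration.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed analyzer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each token to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search_all(index, keywords):
    """Return IDs of documents containing every keyword (AND semantics)."""
    postings = [index.get(token, set()) for token in tokenize(keywords)]
    return set.intersection(*postings) if postings else set()

# Hypothetical sample data.
docs = {
    1: "A graph database stores nodes and edges.",
    2: "Relational database systems use tables.",
    3: "Graph algorithms explore connectivity.",
}
index = build_index(docs)
print(search_all(index, "graph database"))  # {1}
```

The query is answered by intersecting small per-token sets rather than scanning every text value, which is why matching against tokens is faster than matching against the original full text.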
To retrieve all full-text indexes in the current graph:
SHOW FULLTEXT
To retrieve only node or edge full-text indexes:
SHOW NODE FULLTEXT
SHOW EDGE FULLTEXT
The result includes the following fields:
| Field | Description |
|---|---|
| index_name | Full-text index name. |
| entity_type | NODE or EDGE. |
| schema_name | The label of the full-text index. |
| properties | The indexed properties. |
| analyzer | The text analyzer used. |
| status | Index status: ready, loading, or building. |
| doc_count | Number of documents indexed. |
| progress | Build/loading progress. |
You can create a full-text index using the CREATE FULLTEXT statement. The index is built asynchronously; use SHOW FULLTEXT to check the build progress.
Syntax
<create full-text index statement> ::= "CREATE FULLTEXT" <index name> "ON" < "NODE" | "EDGE" > <label> "(" <property name> [ { "," <property name> }... ] ")"
Details
<index name> must be unique among nodes and among edges, but a node full-text index and an edge full-text index may share the same name.
To create a full-text index named prodDesc for the description property of product nodes:
CREATE FULLTEXT prodDesc ON NODE product (description)
To create a full-text index named reviewText for the content and excerpt properties of review edges:
CREATE FULLTEXT reviewText ON EDGE review (content, excerpt)
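Because the index is built asynchronously, a client may want to wait until it is ready before querying it. The Python sketch below polls the SHOW FULLTEXT result for that purpose. It is not tied to any real driver: `fetch_indexes` is a hypothetical callable that you would implement with your own client, assumed to run SHOW FULLTEXT and return the rows as dictionaries keyed by the field names listed above. The function name `wait_until_ready` and its parameters are likewise illustrative.

```python
import time

def wait_until_ready(fetch_indexes, index_name, entity_type,
                     interval=1.0, timeout=60.0):
    """Poll until the named index reports status 'ready', then return its row.

    fetch_indexes is a hypothetical callable (not a real driver API) that
    returns the SHOW FULLTEXT rows as dicts. entity_type is "NODE" or "EDGE";
    it is checked because a node index and an edge index may share a name.
    """
    deadline = time.monotonic() + timeout
    while True:
        for row in fetch_indexes():
            if (row["index_name"] == index_name
                    and row["entity_type"] == entity_type
                    and row["status"] == "ready"):
                return row
        if time.monotonic() >= deadline:
            raise TimeoutError(f"index {index_name!r} not ready after {timeout}s")
        time.sleep(interval)
```

A call such as `wait_until_ready(fetch_indexes, "prodDesc", "NODE")` would block until the prodDesc node index is ready or the timeout elapses.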
You can drop a full-text index using the DROP NODE FULLTEXT or DROP EDGE FULLTEXT statement. Dropping a full-text index does not affect the actual property values.
DROP NODE FULLTEXT prodDesc
DROP EDGE FULLTEXT reviewText
Use IF EXISTS to avoid an error when the index doesn't exist:
DROP NODE FULLTEXT IF EXISTS prodDesc
To use a full-text index in search conditions, use the syntax ~<fulltextIndexName> CONTAINS "<keywords>":
- The ~ symbol marks the full-text index.
- CONTAINS checks whether the segmented tokens in the full-text index match the query.
- Use a backslash (\) to escape characters in <keywords>, such as the inner double quotes of the phrase operator below.

By default, multiple keywords separated by spaces are combined with AND (all must match). Additional operators are supported within the <keywords> string:
| Operator | Syntax | Description |
|---|---|---|
| AND (default) | "graph database" | Entries whose tokens include both graph and database. |
| OR | "graph OR database" | Entries whose tokens include graph or database (or both). |
| NOT | "-graph" | Entries whose tokens do not include graph. |
| Phrase | "\"graph database\"" | Entries whose tokens include graph followed immediately by database. |
| Proximity | "\"graph database\"~5" | Entries whose tokens include both graph and database within 5 token positions of each other. |
| Wildcard | "graph*" | Entries whose tokens start with graph (e.g., graph, graphics, graphdb). |
| Wildcard | "grap?" | Entries whose tokens match with ? as any single character (e.g., graph, grape). |
| Grouped | "(graph OR network) AND database" | Entries matching the combined sub-expressions; parentheses control precedence. |
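When keyword strings are assembled by an application, small helpers can keep the operator syntax and the quote escaping consistent. The following Python sketch only builds strings that follow the table above. The helper names (`phrase`, `any_of`, `exclude`, `contains_condition`) are hypothetical, and it assumes that double quotes are the only characters that need escaping; it does not validate user input.

```python
def phrase(*words, within=None):
    """Quote words as a phrase; append ~N for a proximity match if within is set."""
    text = '"' + " ".join(words) + '"'
    return text if within is None else f"{text}~{within}"

def any_of(*terms):
    """Combine terms with OR, parenthesized so the group can be nested."""
    return "(" + " OR ".join(terms) + ")"

def exclude(term):
    """Negate a term with the leading '-' operator."""
    return "-" + term

def contains_condition(index_name, keywords):
    """Build ~<index> CONTAINS "<keywords>", escaping inner double quotes."""
    escaped = keywords.replace('"', '\\"')
    return f'~{index_name} CONTAINS "{escaped}"'

print(contains_condition("prodDesc", phrase("graph", "database", within=5)))
# ~prodDesc CONTAINS "\"graph database\"~5"
print(contains_condition("prodDesc", any_of("graph", "network") + " AND database"))
# ~prodDesc CONTAINS "(graph OR network) AND database"
```

The returned condition can then be placed in a WHERE clause of a MATCH statement.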
To find nodes whose tokens in the full-text index prodDesc include both graph and database:
MATCH (n WHERE ~prodDesc CONTAINS "graph database") RETURN n
To find nodes whose tokens in the full-text index prodDesc include graph or database:
MATCH (n WHERE ~prodDesc CONTAINS "graph OR database") RETURN n
To find edges whose tokens in the full-text index reviewText include graph as well as a token starting with ult:
MATCH ()-[e WHERE ~reviewText CONTAINS "graph ult*"]-() RETURN e
Note: Full-text indexes only apply to the first node in a path pattern when retrieving paths.
For example, this query is not supported:
MATCH p = ()-[]-(WHERE ~prodDesc CONTAINS "graph") RETURN p
You may revise the query as follows:
MATCH (n WHERE ~prodDesc CONTAINS "graph") MATCH p = ()-[]-(n) RETURN p
This query is not supported either:
MATCH p = ()-[WHERE ~reviewText CONTAINS "ult*"]-() RETURN p
You may revise the query as follows:
MATCH ()-[e WHERE ~reviewText CONTAINS "ult*"]-() MATCH p = ()-[e]-() RETURN p