
TigerGraph Docs: GSQL Language Reference Part 2 - Querying v1.0

Version 1.0

Copyright © 2015-2017 TigerGraph. All Rights Reserved.
For technical support on this topic, contact support@tigergraph.com with a subject line beginning with "GSQL".




Introduction

The GSQL® Query Language (aka GQuery) is a language for the exploration and analysis of large scale graphs. The high-level language makes it easy to perform powerful graph traversal queries in the TigerGraph system. By combining features familiar to database users and programmers with highly expressive new capabilities, the GSQL query language offers both easy authoring and powerful execution. A GSQL query contains one or more SELECT statements, where each SELECT statement describes a traversal over a set of vertices and edges in the graph or describes a selection of a subset of vertices. By combining multiple SELECT statements, the user can map out query patterns to answer a virtually unlimited set of real-life data questions.

This document focuses on the formal specification for the GSQL Query Language. It includes example queries which demonstrate the language, each of which works on one of the following six graphs: workNet, socialNet, friendNet, computerNet, minimalNet, and investmentNet. Their schemas are shown below. Appendix D lists the full command and data files to create and load these graphs with small sets of data (~10 to 20 vertices). The data sets are small so that you can understand the result of each query example. The ZIP file gsql_example_graphs_v0.8.1.zip contains all of the command and data files.

Schemas for Example Graphs

Graph Schema: socialNet
CREATE VERTEX person(PRIMARY_ID personId UINT, id STRING, gender STRING) WITH STATS="OUTDEGREE_BY_EDGETYPE"
CREATE UNDIRECTED EDGE friend(FROM person, TO person)
CREATE VERTEX post(PRIMARY_ID postId UINT, subject STRING, postTime DATETIME)
CREATE DIRECTED EDGE posted(FROM person, TO post)
CREATE DIRECTED EDGE liked(FROM person, TO post, actionTime DATETIME)

Graph Schema: workNet
CREATE VERTEX person(PRIMARY_ID personId STRING, id STRING, locationId STRING, skillSet SET<INT>, skillList LIST<INT>, interestSet SET<STRING COMPRESS>, interestList LIST<STRING COMPRESS>)
CREATE VERTEX company(PRIMARY_ID clientId STRING, id STRING, country STRING)
CREATE UNDIRECTED EDGE worksFor(FROM person, TO company, startYear INT, startMonth INT, fullTime BOOL)


Graph Schema: friendNet
CREATE VERTEX person(PRIMARY_ID personId UINT, id STRING)
CREATE UNDIRECTED EDGE friend(FROM person, TO person)
CREATE UNDIRECTED EDGE coworker(FROM person, TO person)


Graph Schema: computerNet
CREATE VERTEX computer(PRIMARY_ID compID STRING, id STRING)
CREATE DIRECTED EDGE connected(FROM computer, TO computer, connectionSpeed INT)

Graph Schema: minimalNet
CREATE VERTEX testV(PRIMARY_ID id STRING)
CREATE UNDIRECTED EDGE testE(FROM testV, TO testV)


Graph Schema: investmentNet
TYPEDEF TUPLE < age UINT (4), mothersName STRING(20) > SECRET_INFO
CREATE VERTEX person(PRIMARY_ID personId STRING, portfolio MAP<STRING, DOUBLE>, secretInfo SECRET_INFO)
CREATE VERTEX stockOrder(PRIMARY_ID orderId STRING, ticker STRING, orderSize UINT, price FLOAT)
CREATE UNDIRECTED EDGE makeOrder(FROM person, TO stockOrder, orderTime DATETIME)



CREATE / INSTALL / RUN / SHOW / DROP QUERY



A GSQL query is a compiled data retrieval-and-computation task. Users can write queries to explore a data graph however they like, to read and make computations on the graph data along the way, to update the graph, and to deliver resulting data. A query is analogous to a user-defined procedure or function: it can have one or more input parameters, and it can produce output in two ways: by returning a value or by printing. Using a query is a three-step procedure:

  1. CREATE QUERY: define the functionality of the query
  2. INSTALL QUERY: compile the query
  3. RUN QUERY: execute the query with input values


EBNF for CREATE QUERY
createQuery := CREATE [OR REPLACE] QUERY name "(" [parameterList] ")"
               FOR GRAPH name
               [RETURNS "(" baseType | accumType ")"]
               "{" [typedefs] [declAccumStmts] [declStmts] [declExceptStmts] queryBodyStmts "}"

parameterValueList := parameterValue [, parameterValue]*
parameterValue := parameterConstant
                | "[" parameterValue [, parameterValue]* "]"   // BAG or SET
                | "(" stringLiteral, stringLiteral ")"         // a generic VERTEX value
parameterConstant := numeric | stringLiteral | TRUE | FALSE
parameterList := parameterType name ["=" constant]
                 ["," parameterType name ["=" constant]]*

typedefs := (typedef ";")+
declAccumStmts := (declAccumStmt ";")+
declExceptStmts := (declExceptStmt ";")+
declStmts := (declStmt ";")+
queryBodyStmts := (queryBodyStmt ";")+

installQuery := INSTALL QUERY [installOptions] ( "*" | ALL | name [, name]* )
runQuery := RUN QUERY [runOptions] name "(" parameterValueList ")"
showQuery := SHOW QUERY name
dropQuery := DROP QUERY ( "*" | ALL | name [, name]* )

CREATE QUERY Statement

createQuery := CREATE [OR REPLACE] QUERY name "(" [parameterList] ")"
               FOR GRAPH name
               [RETURNS "(" baseType | accumType ")"]
               "{" [typedefs] [declAccumStmts] [declStmts] [declExceptStmts] queryBodyStmts "}"

CREATE QUERY defines the functionality of a query on a given graph schema.

A query has a name, a parameter list, the name of the graph being queried, an optional RETURNS type (see Section "RETURN Statement" for more details), and a body. The body consists of an optional sequence of typedefs, followed by an optional sequence of declarations, then followed by one or more statements. The body defines the behavior of the query.

If the optional keywords OR REPLACE are included, then this query definition, if error-free, will replace a previous definition with the same query name.  However, if there are any errors in this query definition, then the previous query definition will be maintained.  If the OR REPLACE option is not used, then GSQL will reject a CREATE QUERY command that uses an existing name.
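
The acceptance rules above can be summarized with a small model. This is only an illustrative Python sketch of the three cases, not how GSQL is implemented; the names `catalog` and `create_query` are hypothetical and exist only for this example.

```python
# Toy model of the CREATE [OR REPLACE] QUERY rules described above.
# `catalog` and `create_query` are hypothetical names, not part of GSQL.

def create_query(catalog, name, definition, has_errors=False, or_replace=False):
    """Return True if `definition` was stored under `name`, False otherwise."""
    if name in catalog and not or_replace:
        return False  # existing name without OR REPLACE: command rejected
    if has_errors:
        return False  # faulty definition: any previous definition is kept
    catalog[name] = definition
    return True

catalog = {}
create_query(catalog, "userPosts", "v1")                                    # stored
create_query(catalog, "userPosts", "v2")                                    # rejected
create_query(catalog, "userPosts", "v3", has_errors=True, or_replace=True)  # v1 kept
create_query(catalog, "userPosts", "v4", or_replace=True)                   # replaced
print(catalog["userPosts"])
```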

Typedefs allow the programmer to define custom types for use within the body. The declarations support definition of accumulators (see Chapter "Accumulators" for more details) and global/local variables. All accumulators and global variables must be declared before any statements. There are various types of statements that can be used within the body. Typically, the core of a query body is one or more SELECT, UPDATE, INSERT, or DELETE statements. The language supports conditional statements such as an IF statement as well as looping constructs such as WHILE and FOREACH. It also supports calling functions, assigning variables, printing, and modifying the graph data.

The query body may include calls to other queries. That is, the other queries are treated as subquery functions.  See the subsection on "Queries as Functions".

Example of a CREATE QUERY statement
CREATE QUERY userPosts (STRING uid) FOR GRAPH socialNet RETURNS (int)
{
  # declaration statements
  users = {person.*};
  # body statements
  posts = SELECT p
          FROM users:u-(posted)->:p
          WHERE u.id == uid;
  PRINT posts;
  RETURN posts.size();
}

Query Parameter and Return Types

This table lists the supported data types for input parameters and return values.

Parameter Types
  • any baseType (except EDGE): INT, UINT, FLOAT, DOUBLE, STRING, BOOL, VERTEX, JSONOBJECT, JSONARRAY
  • SET<baseType>, BAG<baseType>
  • Exception: EDGE type is not supported, either as a primitive parameter or as part of a complex type.
Return Types
  • any baseType (including EDGE): INT, UINT, FLOAT, DOUBLE, STRING, BOOL, VERTEX, EDGE, JSONOBJECT, JSONARRAY
  • any accumulator type, except GroupByAccum

Statement Types

A statement is a standalone instruction that expresses an action to be carried out. The most common statements are data manipulation language (DML) statements. DML statements include the SELECT, UPDATE, INSERT INTO, DELETE FROM, and DELETE statements.

A GSQL query has two levels of statements. The upper-level statement type is called query-body-level statement, or query-body statement for short. This statement type is part of either the top-level block or a query-body control flow block. For example, each of the statements at the top level directly under CREATE QUERY is a query-body statement. If one of the statements is a CASE statement with several THEN blocks, each of the statements in the THEN blocks is also a query-body statement. Each query-body statement ends with a semicolon.

The lower-level statement type is called DML-sub-level statement, or DML-sub-statement for short. This statement type is used inside certain query-body DML statements to define particular data manipulation actions. DML-sub-statements are comma-separated. There is no comma or semicolon after the last DML-sub-statement in a block. For example, if one of the top-level statements is a SELECT statement, each of the statements in its ACCUM clause is a DML-sub-statement. If one of those DML-sub-statements is a CASE statement, each of the statements in the THEN blocks is also a DML-sub-statement.

There is some overlap in the types. For example, an assignStmt can be used either at the query-body level or the DML-sub-level.

queryBodyStmts := (queryBodyStmt ";")+

queryBodyStmt := assignStmt            // Assignment
               | vSetVarDeclStmt       // Declaration
               | gAccumAssignStmt      // Assignment
               | gAccumAccumStmt       // Assignment
               | funcCallStmt          // Function Call
               | selectStmt            // Select
               | queryBodyCaseStmt     // Control Flow
               | queryBodyIfStmt       // Control Flow
               | queryBodyWhileStmt    // Control Flow
               | queryBodyForEachStmt  // Control Flow
               | BREAK                 // Control Flow
               | CONTINUE              // Control Flow
               | updateStmt            // Data Modification
               | insertStmt            // Data Modification
               | queryBodyDeleteStmt   // Data Modification
               | printStmt             // Output
               | logStmt               // Output
               | returnStmt            // Output
               | raiseStmt             // Exception
               | tryStmt               // Exception

DMLSubStmtList := DMLSubStmt ["," DMLSubStmt]*

DMLSubStmt := assignStmt           // Assignment
            | funcCallStmt         // Function Call
            | gAccumAccumStmt      // Assignment
            | vAccumFuncCall       // Function Call
            | localVarDeclStmt     // Declaration
            | DMLSubCaseStmt       // Control Flow
            | DMLSubIfStmt         // Control Flow
            | DMLSubWhileStmt      // Control Flow
            | DMLSubForEachStmt    // Control Flow
            | BREAK                // Control Flow
            | CONTINUE             // Control Flow
            | insertStmt           // Data Modification
            | DMLSubDeleteStmt     // Data Modification
            | logStmt              // Output

Guidelines for understanding statement type hierarchy:

  • Top-level statements are Query-Body type (each statement ending with a semicolon).
  • The statements within a DML statement are DML-sub statements (comma-separated list).
  • The blocks within a Control Flow statement have the same type as the entire Control Flow statement itself.
Schematic illustration of relationship between queryBodyStmt and DMLSubStmt
# Each statement's operation type is either ControlFlow, DML, or other.
# Each statement's syntax type is either queryBodyStmt or DMLSubStmt.

CREATE QUERY (parameterList) FOR GRAPH g [
  other queryBodyStmt1;
  ControlFlow queryBodyStmt2        # ControlFlow inside top level.
    other queryBodyStmt2.1;         # subStmts in ControlFlow are queryBody unless inside DML.
    ControlFlow queryBodyStmt2.2    # ControlFlow inside ControlFlow inside top level
      other queryBodyStmt2.2.1;
      other queryBodyStmt2.2.2;
    END;
    DML queryBodyStmt2.3            # DML inside ControlFlow inside top-level
      other DMLSubStmt2.3.1,        # switch to DMLSubStmt
      other DMLSubStmt2.3.2
    ;
  END;
  DML queryBodyStmt3                # DML inside top level.
    other DMLSubStmt3.1,            # All subStmts in DML must be DMLSubStmt type
    ControlFlow DMLSubStmt3.2       # ControlFlow inside DML inside top level
      other DMLSubStmt3.2.1,
      other DMLSubStmt3.2.2
    ,
    DML DMLSubStmt3.3
      other DMLSubStmt3.3.1,
      other DMLSubStmt3.3.2
  ;
  other queryBodyStmt4;


Here is a descriptive list of query-body statements:

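| EBNF term | Common Name | Description |
| --- | --- | --- |
| assignStmt | Assignment Statement | See Chapter 6: "Declaration and Assignment Statements" |
| vSetVarDeclStmt | Vertex Set Variable Declaration Statement | See Chapter 6: "Declaration and Assignment Statements" |
| gAccumAssignStmt | Global Accumulator Assignment Statement | See Chapter 6: "Declaration and Assignment Statements" |
| gAccumAccumStmt | Global Accumulator Accumulation Statement | See Chapter 6: "Declaration and Assignment Statements" |
| funcCallStmt | Function Call or Query Call Statement | See Chapter 6: "Declaration and Assignment Statements" |
| selectStmt | SELECT Statement | See Chapter 7: "SELECT Statement" |
| queryBodyCaseStmt | Query-body CASE Statement | See Chapter 8: "Control Flow Statements" |
| queryBodyIfStmt | Query-body IF Statement | See Chapter 8: "Control Flow Statements" |
| queryBodyWhileStmt | Query-body WHILE Statement | See Chapter 8: "Control Flow Statements" |
| queryBodyForEachStmt | Query-body FOREACH Statement | See Chapter 8: "Control Flow Statements" |
| updateStmt | UPDATE Statement | See Chapter 9: "Data Modification Statements" |
| insertStmt | INSERT INTO Statement | See Chapter 9: "Data Modification Statements" |
| queryBodyDeleteStmt | Query-body DELETE Statement | See Chapter 9: "Data Modification Statements" |
| printStmt | PRINT Statement | See Chapter 10: "Output Statements" |
| logStmt | LOG Statement | See Chapter 10: "Output Statements" |
| returnStmt | RETURN Statement | See Chapter 10: "Output Statements" |
| raiseStmt | RAISE Statement | See Chapter 11: "Exception Statements" |
| tryStmt | TRY Statement | See Chapter 11: "Exception Statements" |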

Here is a descriptive list of DML-sub-statements:

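| EBNF term | Common Name | Description |
| --- | --- | --- |
| assignStmt | Assignment Statement | See Chapter 6: "Declaration and Assignment Statements" |
| funcCallStmt | Function Call Statement | See Chapter 6: "Declaration and Assignment Statements" |
| gAccumAccumStmt | Global Accumulator Accumulation Statement | See Chapter 6: "Declaration and Assignment Statements" |
| vAccumFuncCall | Vertex-attached Accumulator Function Call Statement | See Chapter 6: "Declaration and Assignment Statements" |
| localVarDeclStmt | Local Variable Declaration Statement | See Chapter 7: "SELECT Statement" |
| DMLSubCaseStmt | DML-sub CASE Statement | See Chapter 8: "Control Flow Statements" |
| DMLSubIfStmt | DML-sub IF Statement | See Chapter 8: "Control Flow Statements" |
| DMLSubForEachStmt | DML-sub FOREACH Statement | See Chapter 8: "Control Flow Statements" |
| DMLSubWhileStmt | DML-sub WHILE Statement | See Chapter 8: "Control Flow Statements" |
| insertStmt | INSERT INTO Statement | See Chapter 9: "Data Modification Statements" |
| DMLSubDeleteStmt | DML-sub DELETE Statement | See Chapter 9: "Data Modification Statements" |
| logStmt | LOG Statement | See Chapter 10: "Output Statements" |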


INSTALL QUERY

installQuery := INSTALL QUERY [installOptions] ( "*" | ALL | name [, name]* )

A query must be installed before it can be executed. The INSTALL QUERY command will install the queries listed:

INSTALL QUERY queryName1, queryName2, ...

It can also install all uninstalled queries, using either of the following commands:
INSTALL QUERY *
INSTALL QUERY ALL

The following options are available:

-force Option

Reinstall the query even if the system indicates the query is already installed. This is useful for overwriting an installation that is corrupted or otherwise outdated, without having to drop and then recreate the query. If this option is not used, the GSQL shell will refuse to re-install a query that is already installed.

-OPTIMIZE Option

During standard installation, the user-defined queries are dynamically linked to the GSQL language code. Any time after INSTALL QUERY has been performed, another statement, INSTALL QUERY -OPTIMIZE, can be executed. The names of the individual queries are not needed. This operation optimizes all previously installed queries, reducing their run times by about 20%. Optimize a query if query run time is more important to you than query installation time.

Legal:
CREATE QUERY query1...
INSTALL QUERY query1

RUN QUERY query1(...)
...
INSTALL QUERY -OPTIMIZE    # (optional) optimizes run time performance for all installed queries, including query1
RUN QUERY query1(...)      # runs faster than before


Illegal:
INSTALL QUERY -OPTIMIZE query_name

Running a Query

Installing a query creates a REST++ endpoint. Once a query is installed, there are two ways to execute it. One way is through the GSQL shell:
RUN QUERY query_name( parameterValues )

CREATE, INSTALL, RUN example
CREATE QUERY RunQueryEx(INT p1, STRING p2, DOUBLE p3) FOR GRAPH testGraph{ .... }
INSTALL QUERY RunQueryEx
RUN QUERY RunQueryEx(1, "test", 3.14)

Query output size limitation

There is a maximum size limit of 2GB for the result set of a SELECT block. A SELECT block is the main component of a query which searches for and returns data from the graph. If the result of the SELECT block is larger than 2GB, the system will return no data. No error message is produced.

The other way is to submit an HTTP request directly to the REST++ server, which can also reduce the query response time: send a GET request to "http://server_ip:9000/query/queryname". If the REST++ server is local, then server_ip is localhost. The query parameter values are either included directly in the query string of the HTTP request's URL or supplied using a data payload.

The following two curl commands are each equivalent to the RUN QUERY command above. The first gives the parameter values in the query string in a URL. This example illustrates the simple format for primitive data types such as INT, DOUBLE, and STRING. The second gives the parameter values through the curl command's data payload -d option.

Running a query via HTTP request
curl -X GET "http://localhost:9000/query/RunQueryEx?p1=1&p2=test&p3=3.14"
curl -d @RunQueryExPara.dat -X GET "http://localhost:9000/query/RunQueryEx"

where RunQueryExPara.dat contains exactly the same string as the query string in the first URL.

RunQueryExPara.dat
p1=1&p2=test&p3=3.14
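
When the request is assembled by a program rather than typed by hand, the same query string can be generated from a dictionary of parameter values. The following is a minimal Python sketch using only the standard library; the server address and the RunQueryEx query are the ones assumed in the curl example above, and the script only builds the URL without sending it.

```python
from urllib.parse import urlencode

# Assumed values: a local REST++ server on port 9000 and the RunQueryEx query.
base_url = "http://localhost:9000/query/RunQueryEx"
params = {"p1": 1, "p2": "test", "p3": 3.14}

# urlencode joins name=value pairs with "&" and escapes special characters.
query_string = urlencode(params)
url = f"{base_url}?{query_string}"
print(url)
```

The same `query_string` value could be written to a file such as RunQueryExPara.dat and sent as a data payload instead.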

To see a list of the parameter names and types for the user-installed GSQL queries, run the following REST++ request:

curl -X GET "http://localhost:9000/endpoints?dynamic=true"

By using the data payload option, the user can avoid using a long and complex URL. In fact, to call the same query but with different parameters, only the data payload file contents need to be changed; the HTTP request can be the same. The file loader loads the entire file, appends multiple lines into one, and uses the resulting string as the URL query string. If both a query string and a data payload are given (which we strongly discourage), both are included, where the URL query string's parameter values overwrite the values given in the data payload.
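
The precedence rule described above can be modeled in a few lines. This Python sketch only mimics the described behavior for single-valued parameters; it is not the REST++ implementation, and the function names `load_payload` and `effective_params` are hypothetical.

```python
from urllib.parse import parse_qsl

def load_payload(text):
    """Join all lines of a payload file into one query string."""
    return "".join(text.splitlines())

def effective_params(url_query, payload_text):
    """Merge parameters; URL values replace payload values of the same name."""
    merged = dict(parse_qsl(load_payload(payload_text)))
    merged.update(dict(parse_qsl(url_query)))
    return merged

# A payload split over two lines, plus a URL query string that also sets p3.
payload = "p1=1&p2=test\n&p3=3.14\n"
print(effective_params("p3=2.5", payload))
```

Because repeated names (as used for SET or BAG parameters) collapse to a single entry in a dictionary, this sketch is only meaningful for scalar parameters.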

Complex Type Parameter Passing

This subsection describes how to format the complex type parameter values when executing a query by RUN QUERY or curl command. More details about all parameter types are described in Section "Query Parameter Types".

| Parameter type | RUN QUERY | Query string for GET /query HTTP request |
| --- | --- | --- |
| SET or BAG of primitives | Square brackets enclose the collection of values. Example: a set p1 of integers: [1,5,10] | Assign multiple values to the same parameter name. Example: a set p1 of integers: p1=1&p1=5&p1=10 |
| VERTEX<type> | If the vertex type is specified in the query definition, then the vertex argument is simply vertex_id. Example: vertex type is person and desired id is person2: "person2" | parameterName=vertex_id. Example: vertex type is person and desired id is person2: vp=person2 |
| VERTEX (type not pre-specified) | If the type is not defined in the query definition, then the argument must provide both the id and type in parentheses: (vertex_id, vertex_type). Example: a vertex va with id="person1" and type="person": ("person1","person") | parameterName=vertex_id&parameterName.type=vertex_type. Example: parameter vertex va when type="person" and id="person1": va=person1&va.type=person |
| SET or BAG of VERTEX<type> | Same as a SET or BAG of primitives, where the primitive type is vertex_id. Example: [ "person3", "person4" ] | Same as a SET or BAG of primitives, where the primitive type is vertex_id. Example: vp=person3&vp=person4 |
| SET or BAG of VERTEX (type not pre-specified) | Square brackets enclose a comma-separated list of vertex (id, type) pairs. Mixed types are permitted. Example: [ ("person1","person") , ("11","post") ] | The SET or BAG must be treated like an array, specifying the first, second, etc. elements with indices [0], [1], etc. This example provides the same input arguments as the RUN QUERY example to the left: vp[0]=person1&vp[0].type=person&vp[1]=11&vp[1].type=post |

When square brackets are used in a curl URL, either the -g option or escape characters must be used. If the parameters are given by data payload (either by file or data payload string), the -g option is not needed and escape characters should not be used.

Below are examples.

Running a query via HTTP request - complex parameter type
# 1. SET or BAG
CREATE QUERY RunQueryEx2(SET<INT> p1) FOR GRAPH testGraph{ .... }
# To run this query (either RUN QUERY or curl):
GSQL > RUN QUERY RunQueryEx2([1,5,10])
curl -X GET "http://localhost:9000/query/RunQueryEx2?p1=1&p1=5&p1=10"

# 2. VERTEX.
# First parameter is any vertex; second parameter must be a person type.
CREATE QUERY printOneVertex(VERTEX va, VERTEX<person> vp) FOR GRAPH socialNet {
  PRINT va, vp;
}
# To run this query:
GSQL > RUN QUERY printOneVertex(("person1","person"),"person2") # 1st param must give type: (vertex_id, vertex_type)
curl -X GET 'http://localhost:9000/query/printOneVertex?va=person1&va.type=person&vp=person2'

# 3. BAG or SET of VERTEX, any type
CREATE QUERY printOneBagVertices(BAG<VERTEX> va) FOR GRAPH socialNet {
  PRINT va;
}
# To run this query:
GSQL > RUN QUERY printOneBagVertices([("person1","person"), ("11","post")]) # [(vertex_1_id, vertex_1_type), (vertex_2_id, vertex_2_type), ...]
curl -X GET 'http://localhost:9000/query/printOneBagVertices?va\[0\]=person1&va\[0\].type=person&va\[1\]=11&va\[1\].type=post'
curl -g -X GET 'http://localhost:9000/query/printOneBagVertices?va[0]=person1&va[0].type=person&va[1]=11&va[1].type=post'

# 4. BAG or SET of VERTEX, pre-specified type
CREATE QUERY printOneSetVertices(SET<VERTEX<person>> vp) FOR GRAPH socialNet {
  PRINT vp;
}
# To run this query:
GSQL > RUN QUERY printOneSetVertices(["person3", "person4"]) # [vertex_1_id, vertex_2_id, ...]
curl -X GET 'http://localhost:9000/query/printOneSetVertices?vp=person3&vp=person4'
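
The query-string conventions in the table above can also be generated programmatically. The sketch below is one possible Python helper, not part of any TigerGraph client library; the function name `encode_params` and the way Python values are mapped to parameter kinds (tuples for generic vertices, lists for SET or BAG) are assumptions made for this example.

```python
from urllib.parse import urlencode

def encode_params(params):
    """Encode query parameters using the conventions in the table above.

    Value shapes (a convention of this sketch, not of REST++):
      scalar              -> name=value (also used for a VERTEX<type> id)
      (id, type) tuple    -> name=id&name.type=type (generic VERTEX)
      list of scalars     -> name=v1&name=v2 (SET or BAG)
      list of (id, type)  -> name[0]=id&name[0].type=type&... (generic vertices)
    """
    pairs = []
    for name, value in params.items():
        if isinstance(value, tuple):
            vid, vtype = value
            pairs += [(name, vid), (f"{name}.type", vtype)]
        elif isinstance(value, list):
            for i, item in enumerate(value):
                if isinstance(item, tuple):
                    vid, vtype = item
                    pairs += [(f"{name}[{i}]", vid), (f"{name}[{i}].type", vtype)]
                else:
                    pairs.append((name, item))
        else:
            pairs.append((name, value))
    # Keep square brackets literal; a curl URL then needs -g or escape characters.
    return urlencode(pairs, safe="[]")

print(encode_params({"va": [("person1", "person"), ("11", "post")]}))
```

The output of the last line matches the query string used in example 3 above.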


Payload Size Limit

This data payload option can accept a file up to 128MB by default. To increase this limit to xxx MB, use the following command:

gadmin --set nginx.client_max_body_size xxx -f

The upper limit of this setting is 1024 MB. Raising the size limit for the data payload buffer reduces the memory available for other operations, so be cautious about increasing this limit.

For more detailed information about REST++ endpoints and requests, see the RESTPP API User Guide.


The following options are available when running a query:

All-Vertex Mode -av Option

Some queries run with all or almost all vertices in their SELECT statements, e.g., the PageRank algorithm. In this case, the graph processing engine can run much more efficiently in all-vertex mode. In the all-vertex mode, all vertices are always selected, and the following actions become ineffective:

  • Filtering with selected vertices or vertex types. The source vertex set must be all vertices.
  • Filtering with the WHERE clause.
  • Filtering with the HAVING clause.
  • Assigning a designated vertex or a designated vertex type to a vertex set, e.g., X = {vertex_type.*}

To run the query in all-vertex mode, use the -av option in shell mode or include __GQUERY__USING_ALL_ACTIVE_MODE=true in the query string of an HTTP request.

GSQL > run query -av test()

## In a curl URL call. Note the use of both single and double underscores.
curl -X GET 'http://localhost:9000/query/test?__GQUERY__USING_ALL_ACTIVE_MODE=true'

Diagnose -d Option

The diagnose option can be turned on in order to produce a diagnostic monitoring log, which contains the processing time of each SELECT block. To turn on the monitoring log, use the -d option in shell mode or __GQUERY__monitor=true in the query string of an HTTP request.

GSQL > run query -d test()

## In a curl URL call. Note the use of both single and double underscores.
curl -X GET 'http://localhost:9000/query/test?__GQUERY__monitor=true'

The path of the generated log file will be shown as part of the output message. An example log is shown below:

Query Block Start (#6) start at 11:52:06.415284
Query Block Start (#6) end at 11:52:06.415745 (takes 0.000442 s)
Query test takes totally 0.001 s (restpp's pre/post process time not included)
---------------- Summary (sort by total_time desc) ----------------
Query Block Start on Line 6
----------------------------------------------------------
total iterations count : 1
avg iterations stats : 0.000442s
max iterations stats : 0.000442s
min iterations stats : 0.000442s
total activated vertex count : 2
max activated vertex count : 2
min activated vertex count : 2
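
For longer logs it can help to pull out the per-block timings automatically. The Python sketch below assumes the log lines look exactly like the sample above, and that the number in "(#6)" is the line number reported in the summary ("on Line 6"); the real log format may vary, and the function name `block_times` is hypothetical.

```python
import re

# Matches lines such as:
#   Query Block Start (#6) end at 11:52:06.415745 (takes 0.000442 s)
BLOCK_END = re.compile(r"\(#(\d+)\) end at \S+ \(takes ([0-9.]+) s\)")

def block_times(log_text):
    """Return {line_number: total_seconds} summed over each block's runs."""
    totals = {}
    for match in BLOCK_END.finditer(log_text):
        line_no, seconds = int(match.group(1)), float(match.group(2))
        totals[line_no] = totals.get(line_no, 0.0) + seconds
    return totals

sample = """Query Block Start (#6) start at 11:52:06.415284
Query Block Start (#6) end at 11:52:06.415745 (takes 0.000442 s)
Query test takes totally 0.001 s (restpp's pre/post process time not included)"""
print(block_times(sample))
```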

GSQL Query Output Format

The standard output of GSQL queries is in industry-standard JSON format. JSON is a hierarchical, nested structure of unordered sets (enclosed in curly braces), ordered lists (enclosed in square brackets), and key:value pairs. Strings are enclosed in double quotation marks.

At the top level of the JSON structure are two required objects: "error" and "results". The value of "error" will be "false" if the query was successful or "true" if an error was detected. The output of the query will be contained in the array called "results". Other top-level objects, such as "debug" and "message", are reserved for future or internal use. Note that the top-level objects are enclosed in curly braces, meaning that they form an unordered set. They may appear in any order.

Top Level JSON of a Valid Query - Example
{
  "error": false,
  "message": "",
  "results": [
  ]
}

The "results" object is a sequential list of the data objects specified by the PRINT statements. Details are described in the Chapter "Output Statements".
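
A client program typically checks "error" first and then reads "results". The following Python sketch shows one way to do that with the standard json module; it assumes the response has the top-level shape shown above, and the function name `extract_results` is hypothetical.

```python
import json

def extract_results(response_text):
    """Return the "results" list, or raise if the query reported an error."""
    response = json.loads(response_text)
    if response["error"]:
        # "message" is treated as optional here, since it is a reserved field.
        raise RuntimeError(response.get("message", "query failed"))
    return response["results"]

sample = '{"error": false, "message": "", "results": []}'
print(extract_results(sample))
```

Because the top-level keys form an unordered set, the code looks them up by name and never relies on their position.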

SHOW QUERY

To show the GSQL text of a query, run "SHOW QUERY query_name". Additionally, the "ls" GSQL command lists all created queries and identifies which queries have been installed.

DROP QUERY

To drop a query, run "DROP QUERY query_name". The query will be uninstalled (if it has been installed) and removed from the dictionary. To drop all queries, either of the following commands can be used:
DROP QUERY ALL
DROP QUERY *

The GSQL language will refuse to drop an installed query Q if another query R is installed which calls query Q. That is, all calling queries must be dropped before or at the same time that their called subqueries are dropped.
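
The dependency rule can be expressed as a simple check over a call graph. This Python sketch only models the rule as stated; it is not how GSQL tracks dependencies, and the function name `blocking_callers` and the `installed_calls` mapping are hypothetical.

```python
def blocking_callers(to_drop, installed_calls):
    """Return installed queries that prevent dropping the queries in `to_drop`.

    installed_calls maps each installed query name to the set of queries it calls.
    An empty result means the drop is allowed under the rule above.
    """
    to_drop = set(to_drop)
    return {
        caller
        for caller, callees in installed_calls.items()
        if caller not in to_drop and callees & to_drop
    }

calls = {"R": {"Q"}, "Q": set()}          # R calls Q
print(blocking_callers({"Q"}, calls))       # {'R'}: R is still installed and calls Q
print(blocking_callers({"Q", "R"}, calls))  # set(): dropped at the same time
```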




Data Types



This section describes the data types that are native to and are supported by the GSQL Query Language. Most of the data objects used in queries come from one of three sources: (1) the query's input parameters, (2) the vertices, edges, and their attributes which are encountered when traversing the graph, or (3) variables defined within the query that are used to assist in the computational work of the query.

This section covers the following subset of the EBNF language definitions:

EBNF for Data Types
lowercase := [a-z]
uppercase := [A-Z]
letter := lowercase | uppercase
digit := [0-9]
integer := ["-"]digit+
real := ["-"]("." digit+) | ["-"](digit+ "." digit*)
numeric := integer | real
stringLiteral := '"' [~["] | '\\' ('"' | '\\')]* '"'

name := (letter | "_") [letter | digit | "_"]* // Can be a single "_" or start with "_"

type := baseType | name | accumType | STRING COMPRESS

baseType := INT | UINT | FLOAT | DOUBLE | STRING | BOOL | VERTEX ["<" name ">"] | EDGE | JSONOBJECT | JSONARRAY | DATETIME

filePath := name | stringLiteral

typedef := TYPEDEF TUPLE "<" tupleType ">" name

tupleType := (baseType name) | (name baseType) ["," (baseType name) | (name baseType)]*

parameterType := baseType | [ SET | BAG ] "<" baseType ">"

Identifiers

An identifier is the name for an instance of a language element. In the GSQL query language, identifiers are used to name elements such as a query, a variable, or a user-defined function. In the EBNF syntax, an identifier is referred to as a name. It can be a sequence of letters, digits, or underscores ("_"). Other punctuation characters are not supported. The initial character can only be a letter or an underscore.

name (identifier)
name := (letter | "_") [letter | digit | "_"]*
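
The name rule translates directly into a regular expression. The Python sketch below checks only the character rules given by the EBNF above; it does not check for reserved words, and the function name `is_valid_name` is hypothetical.

```python
import re

# name := (letter | "_") [letter | digit | "_"]*
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_valid_name(text):
    """True if `text` satisfies the EBNF rule for name (character rules only)."""
    return NAME.fullmatch(text) is not None

for candidate in ["userPosts", "_", "_tmp1", "2fast", "my-query"]:
    print(candidate, is_valid_name(candidate))
```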

Overview of Types

Different types of data can be used in different contexts. The EBNF syntax defines five classes of data types. The most basic is called baseType. The other four are supersets of baseType. The table below gives an overview of their definitions and their uses.

| EBNF term | Description | Use |
| --- | --- | --- |
| baseType | INT, UINT, FLOAT, DOUBLE, STRING, BOOL, DATETIME, VERTEX, EDGE, JSONOBJECT, or JSONARRAY | global variable; query return value |
| tupleType | sequence of baseType | user-defined tuple |
| parameterType | baseType (except EDGE), or a SET or BAG of baseType | query parameter |
| elementType | baseType, STRING COMPRESS, or identifier | element for most types of container accumulators: SetAccum, BagAccum, GroupByAccum, key of a MapAccum element |
| type | baseType, STRING COMPRESS, identifier, or accumType | element of a ListAccum, value of a MapAccum element; local variable |


Base Types

The query language supports the following base types, which can be declared and assigned anywhere within their scope. Any of these base types may be used when defining a global variable, a local variable, a query return value, a parameter, part of a tuple, or an element of a container accumulator. Accumulators are described in detail in a later section.

BNF
baseType := INT | UINT | FLOAT | DOUBLE | STRING | BOOL | VERTEX ["<" name ">"] | EDGE | JSONOBJECT | JSONARRAY | DATETIME

The default value of each base type is shown in the table below. The default value is the initial value of a base type variable (see Section "Variable Types" for more details), or the default return value for some functions (see Section "Operators, Functions, and Expressions" for more details).

The first seven types (INT, UINT, FLOAT, DOUBLE, BOOL, STRING, and DATETIME) are the same ones mentioned in the "Attribute Data Types" section of the GSQL Language Reference, Part 1.

| Type | Default value |
| --- | --- |
| INT, UINT, FLOAT, DOUBLE (see note below) | 0 |
| BOOL | false |
| STRING | "" |
| DATETIME | 1970-01-01 00:00:00 |
| VERTEX | "Unknown" |
| EDGE | No edge: {} |
| JSONOBJECT | An empty object: {} |
| JSONARRAY | An empty array: [] |

FLOAT and DOUBLE input values must be in fixed point d.dddd format, where d is a digit. Output values will be printed in either fixed point or exponential notation, whichever is more compact.

The GSQL Loader can read FLOAT and DOUBLE values with exponential notation (e.g., 1.25E-7).
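
The fixed-point restriction corresponds to the integer and real rules in the EBNF for data types. As a rough illustration, the Python sketch below transcribes those two rules into regular expressions; it assumes that query input values follow exactly that grammar, and the function name `is_numeric_literal` is hypothetical.

```python
import re

# integer := ["-"]digit+
INTEGER = re.compile(r"-?[0-9]+")
# real := ["-"]("." digit+) | ["-"](digit+ "." digit*)
REAL = re.compile(r"-?(?:\.[0-9]+|[0-9]+\.[0-9]*)")

def is_numeric_literal(text):
    """True if `text` matches the EBNF rule numeric := integer | real."""
    return bool(INTEGER.fullmatch(text) or REAL.fullmatch(text))

for candidate in ["3.14", "-.5", "10.", "42", "1.25E-7"]:
    print(candidate, is_numeric_literal(candidate))
```

Under these rules, a value written in exponential notation such as 1.25E-7 is not accepted, which is consistent with the note above.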


VERTEX and EDGE

VERTEX and EDGE are the two types of objects which form a graph. A query parameter or variable can be declared as either of these two types. In addition, the schema for the graph defines specific vertex and edge types (e.g., CREATE VERTEX person). The parameter or variable type can be restricted by giving the vertex/edge type in angle brackets < > after the keyword VERTEX/EDGE. A VERTEX or EDGE variable declared without a specifier is called a generic type. Below are examples of generic and typed vertex and edge variable declarations:

Examples of generic and typed VERTEX and EDGE declarations
VERTEX anyVertex;
VERTEX<person> owner;
EDGE anyEdge;
EDGE<friendship> friendEdge;


Vertex and Edge Attribute Types

The following table maps vertex or edge attribute types in the Data Definition Language (DDL) to GSQL query language types. Accumulators are introduced in Section "Accumulators".

| DDL | GSQL Query |
| --- | --- |
| INT | INT |
| UINT | UINT |
| FLOAT | FLOAT |
| DOUBLE | DOUBLE |
| BOOL | BOOL |
| STRING | STRING |
| STRING COMPRESS | STRING |
| SET<type> | SetAccum<type> |
| LIST<type> | ListAccum<type> |
| DATETIME | DATETIME |

JSONOBJECT and JSONARRAY

These two base types allow users to pass a complex data object or to write output in a customized format. These types follow the industry standard definition of JSON at www.json.org. A JSONOBJECT instance's external representation (as input and output) is a string, starting and ending with curly braces "{" and "}", which enclose an unordered list of string:value pairs. A JSONARRAY is represented as a string, starting and ending with square brackets "[" and "]", which enclose an ordered list of values. Since a value can be an object or an array, JSON supports hierarchical, nested data structures.

More details are introduced in the Section entitled "JSONOBJECT and JSONARRAY Functions".

A JSONOBJECT or JSONARRAY value is immutable. No operator is allowed to modify its value.

TUPLE

A tuple is a user-defined data structure consisting of a fixed sequence of baseType variables. Tuple types can be created and named using a TYPEDEF statement. Tuples must be defined first, before any other statements in a query.

EBNF for tuples
typedef := TYPEDEF TUPLE "<" tupleType ">" name
tupleType := (baseType name) | (name baseType) ["," (baseType name) | (name baseType)]*

A tuple can also be defined in a graph schema and then can be used as a vertex or edge attribute type. A tuple type which has been defined in the graph schema does not need to be re-defined in a query.

The graph schema investmentNet contains two complex attributes:

  • user-defined tuple SECRET_INFO, which is used for the secretInfo attribute in the person vertex.
  • portfolio MAP<STRING, DOUBLE> attribute, also in the person vertex.
investmentNet schema
TYPEDEF TUPLE <age UINT(4), mothersName STRING(20)> SECRET_INFO
CREATE VERTEX person(PRIMARY_ID personId STRING, portfolio MAP<STRING, DOUBLE>, secretInfo SECRET_INFO)
CREATE VERTEX stockOrder(PRIMARY_ID orderId STRING, ticker STRING, orderSize UINT, price FLOAT)
CREATE UNDIRECTED EDGE makeOrder(FROM person, TO stockOrder, orderTime DATETIME)
CREATE GRAPH investmentNet (*)

The query below reads both the SECRET_INFO tuple and the portfolio MAP. The query does not need to redefine SECRET_INFO, because that tuple type is already defined in the schema. To read and save the map, we define a MapAccum with the same key:value type as the original portfolio map. (The "Accumulators" chapter has more information about accumulators.) In addition, the query creates a new tuple type, ORDER_RECORD.

tupleEx query
CREATE QUERY tupleEx(VERTEX<person> p) FOR GRAPH investmentNet {
  #TYPEDEF TUPLE <UINT age, STRING mothersName> SECRET_INFO;                    # already defined in schema
  TYPEDEF TUPLE <STRING ticker, FLOAT price, DATETIME orderTime> ORDER_RECORD;  # new for query

  SetAccum<SECRET_INFO> @@info;
  ListAccum<ORDER_RECORD> @@orderRecords;
  MapAccum<STRING, DOUBLE> @@portf;   # corresponds to MAP<STRING, DOUBLE> attribute

  INIT = {p};

  # Get person p's secret_info and portfolio
  X = SELECT v
      FROM INIT:v
      ACCUM @@portf += v.portfolio,
            @@info += v.secretInfo;

  # Search person p's orders to record ticker, price, and order time.
  # Note that the tuple gathers info from both edges and vertices.
  orders = SELECT t
           FROM INIT:s -(makeOrder:e)->stockOrder:t
           ACCUM @@orderRecords += ORDER_RECORD(t.ticker, t.price, e.orderTime);

  PRINT @@portf, @@info;
  PRINT @@orderRecords;
}
GSQL > RUN QUERY tupleEx("person1") { "error": false, "message": "", "results": [ { "@@info": [{ "mothersName": "JAMES", "age": 25 }], "@@portf": { "AAPL": 3142.24, "MS": 5000, "G": 6112.23 } }, {"@@orderRecords": [ { "ticker": "AAPL", "orderTime": "2017-03-03 18:42:28", "price": 34.42 }, { "ticker": "B", "orderTime": "2017-03-03 18:42:30", "price": 202.32001 }, { "ticker": "A", "orderTime": "2017-03-03 18:42:29", "price": 50.55 }


STRING COMPRESS

STRING COMPRESS is an integer type encoded by the system to represent string values. STRING COMPRESS uses less memory than STRING. The STRING COMPRESS type is designed to act like STRING: data are loaded and printed just as string data, and most functions and operators which take STRING input can also take STRING COMPRESS input. The difference is in how the data are stored internally. A STRING COMPRESS value can be obtained from a STRING_SET COMPRESS or STRING_LIST COMPRESS attribute or from converting a STRING value.

The STRING COMPRESS type is beneficial for sets of string values when the same values are used multiple times. In practice, STRING COMPRESS is most useful in container accumulators such as ListAccum<STRING COMPRESS> or SetAccum<STRING COMPRESS>.

An accumulator (introduced in Section "Accumulators") containing STRING COMPRESS stores the dictionary when it is assigned an attribute value or a value from another accumulator containing STRING COMPRESS. An accumulator containing STRING COMPRESS can store multiple dictionaries. A STRING value can be converted to a STRING COMPRESS value only if the value is in the dictionaries. If the STRING value is not in the dictionaries, the original string value is saved. A STRING COMPRESS value can be automatically converted to a STRING value.
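
The dictionary behavior described above can be pictured with a small Python model. This is a sketch only: the class name CompressedStrings and the choice of list positions as integer codes are assumptions made for illustration, and TigerGraph's actual internal encoding is not specified in this document.

```python
class CompressedStrings:
    """Toy model: strings found in the dictionary are stored as integer codes."""

    def __init__(self, dictionary):
        # Position in the dictionary list serves as the integer code (assumption).
        self.codes = {s: i for i, s in enumerate(dictionary)}
        self.decode_table = list(dictionary)
        self.items = []

    def add(self, value):
        # Store the code when the string is in the dictionary;
        # otherwise keep the original string value.
        self.items.append(self.codes.get(value, value))

    def decoded(self):
        # Output always shows plain strings.
        return [self.decode_table[x] if isinstance(x, int) else x
                for x in self.items]


acc = CompressedStrings(["music", "engineering", "teaching"])
for s in ["music", "teaching", "teaching", "xyz"]:
    acc.add(s)

print(acc.items)      # internal form: codes, plus the raw string "xyz"
print(acc.decoded())  # what is shown on output
```

Repeated values such as "teaching" cost one small integer each, which is why the type pays off when the same strings occur many times.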

When a STRING COMPRESS value is output (e.g., by a PRINT statement), it is shown as a STRING.

STRING COMPRESS is not a base type.


STRING COMPRESS example
CREATE QUERY stringCompressEx(VERTEX<person> m1) FOR GRAPH workNet {
  ListAccum<STRING COMPRESS> @@strCompressList, @@strCompressList2;
  SetAccum<STRING COMPRESS> @@strCompressSet, @@strCompressSet2;
  ListAccum<STRING> @@strList, @@strList2;
  SetAccum<STRING> @@strSet, @@strSet2;

  S = {m1};

  S = SELECT s
      FROM S:s
      ACCUM @@strSet += s.interestSet,
            @@strList += s.interestList,
            @@strCompressSet += s.interestSet,    # use the dictionary from person.interestSet
            @@strCompressList += s.interestList;  # use the dictionary from person.interestList

  @@strCompressList2 += @@strCompressList;  # @@strCompressList2 gets the dictionary from @@strCompressList, which is from person.interestList
  @@strCompressList2 += "xyz";              # "xyz" is not in the dictionary, so store the actual string value

  @@strCompressSet2 += @@strCompressSet;
  @@strCompressSet2 += @@strSet;

  @@strList2 += @@strCompressList;  # string compress integer values are decoded to strings
  @@strSet2 += @@strCompressSet;

  PRINT @@strSet, @@strList, @@strCompressSet, @@strCompressList;
  PRINT @@strSet2, @@strList2, @@strCompressSet2, @@strCompressList2;
}
Result
GSQL > RUN QUERY stringCompressEx("person12") { "debug": "", "error": false, "message": "", "results": [ { "@@strCompressList": [ "music", "engineering", "teaching", "teaching", "teaching" ], "@@strSet": [ "teaching", "engineering", "music" ], "@@strCompressSet": [ "music", "engineering", "teaching" ], "@@strList": [ "music", "engineering", "teaching", "teaching", "teaching" ] }, { "@@strSet2": [ "music", "engineering", "teaching" ], "@@strCompressList2": [ "music", "engineering", "teaching", "teaching", "teaching", "xyz" ], "@@strList2": [ "music", "engineering", "teaching", "teaching", "teaching" ], "@@strCompressSet2": [ "teaching", "engineering", "music" ] } ] }

Query Parameter Types

Input parameters to a query can be any base type (except EDGE, JSONARRAY, or JSONOBJECT). A parameter can also be a SET or BAG which uses a base type (except EDGE) as the element type. Within the query, SET, BAG, and MAP are converted to SetAccum, BagAccum, and MapAccum, respectively (see Section "Accumulators" for more details).

A query parameter is immutable. It cannot be assigned any value.


BNF
parameterType := baseType | [ SET | BAG ] "<" baseType ">"
Examples of collection type parameters
(SET<VERTEX<person>> p1, BAG<INT> ids, MAP<UINT, STRING> names)

End of Data Types Section



Accumulators


Accumulators are special types of variables that accumulate information about the graph during its traversal and exploration. Because they are a unique and important feature of the GSQL query language, we devote a separate section to introducing them; additional detail on their usage is covered in other sections, the "SELECT Statement" section in particular. This section covers the following subset of the EBNF language definitions:

EBNF
declAccumStmt := accumType "@"name ["=" constant][, "@"name ["=" constant]]*
               | "@"name ["=" constant][, "@"name ["=" constant]]* accumType
               | [STATIC] accumType "@@"name ["=" constant][, "@@"name ["=" constant]]*
               | [STATIC] "@@"name ["=" constant][, "@@"name ["=" constant]]* accumType

accumType := "SumAccum" "<" ( INT | FLOAT | DOUBLE | STRING ) ">"
           | "MaxAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
           | "MinAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
           | "AvgAccum"
           | "OrAccum"
           | "AndAccum"
           | "BitwiseOrAccum"
           | "BitwiseAndAccum"
           | "ListAccum" "<" type ">"
           | "SetAccum" "<" elementType ">"
           | "BagAccum" "<" elementType ">"
           | "MapAccum" "<" elementType "," type ">"
           | "HeapAccum" "<" name ">" "(" (integer | name) "," name [ASC | DESC] ["," name [ASC | DESC]]* ")"
           | "GroupByAccum" "<" elementType name ["," elementType name]* , accumType name ["," accumType name]* ">"
           | "ArrayAccum" "<" name ">"

elementType := baseType | name | STRING COMPRESS

gAccumAccumStmt := "@@"name "+=" expr

accumClause := ACCUM DMLSubStmtList

postAccumClause := POST-ACCUM DMLSubStmtList


There are a number of different types of accumulators, each providing specific accumulation functions. Accumulators are declared to have one of two types of association: global or vertex-attached.

More technically, accumulators are mutable mutex variables shared among all the graph computation threads exploring the graph within a given query. To improve performance, the graph processing engine employs multithreaded processing. Modification of accumulators is coordinated at run-time so the accumulation operator works correctly (i.e., mutually exclusively) across all threads. This is particularly relevant in the ACCUM clause. During traversal of the graph, the selected set of edges or vertices is partitioned among a group of threads. These threads have shared mutually exclusive access to the accumulators.
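
As a rough illustration of this coordination, the Python sketch below partitions a stand-in edge set among four threads that all update one shared accumulator under a lock. The SumAccum class here is a toy written for this example, not TigerGraph's implementation, and the edge set is just a list of integers.

```python
import threading


class SumAccum:
    """Toy shared accumulator: each update runs under a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def accumulate(self, x):
        with self._lock:        # only one thread updates at a time
            self.value += x


edges = list(range(1000))       # stand-in for a selected edge set
total = SumAccum()


def worker(chunk):
    # Each thread handles its own partition of the edges.
    for _edge in chunk:
        total.accumulate(1)


threads = [threading.Thread(target=worker, args=(edges[i::4],))
           for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(total.value)              # every edge is counted exactly once
```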

Declaration of Accumulators

All accumulator variables must be declared at the beginning of a query, immediately after any typedefs, and before any other type of statement. The scope of the accumulator variables is the entire query.

The name of a vertex-attached accumulator begins with a single "@".  The name of a global accumulator begins with "@@". Additionally, a global accumulator may be declared to be static.

EBNF for Accumulator Declaration
declAccumStmt := accumType "@"name ["=" constant][, "@"name ["=" constant]]*
               | "@"name ["=" constant][, "@"name ["=" constant]]* accumType
               | [STATIC] accumType "@@"name ["=" constant][, "@@"name ["=" constant]]*
               | [STATIC] "@@"name ["=" constant][, "@@"name ["=" constant]]* accumType

Vertex-attached Accumulators

Vertex-attached accumulators are mutable state variables that are attached to each vertex in the graph for the duration of the query's lifetime. They act as run-time attributes of a vertex. They are shared, in a mutually exclusive manner, among all of the query's processes. Vertex-attached accumulators can be set to a value with the = operator. Additionally, the accumulate operator += can be used to update the state of the accumulator; the function of += depends on the accumulator type. In the example below, there are two accumulators attached to each vertex. The initial value of an accumulator of a given type is predefined; however, it can be changed at declaration, as with the accumulator @weight below. All vertex-attached accumulator names have a single leading at-sign "@".

Vertex-Attached Accumulators
SumAccum<int> @neighbors;
MaxAccum<float> @weight = 2.8;

If there is a graph with 10 vertices, then there is an instance of @neighbors and @weight for each vertex (hence 10 of each, and 20 total accumulator instances). These are accessed via the dot operator on a vertex variable or a vertex alias (e.g., v.@neighbors). The accumulator operator += only impacts the accumulator for the specific vertex being referenced. A statement such as v1.@neighbors += 1 will only impact v1's @neighbors and not the @neighbors for other vertices.
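
One way to picture the per-vertex instances is as a table keyed by vertex. The Python sketch below models @neighbors and @weight for a made-up three-vertex graph (the vertex names, edges, and the values 3.5 and 1.0 are all hypothetical); it is a model of the semantics, not GSQL.

```python
# Hypothetical three-vertex graph.
vertices = ["v1", "v2", "v3"]
edges = [("v1", "v2"), ("v1", "v3"), ("v2", "v3")]

# One instance per vertex, each starting at its declared initial value.
neighbors = {v: 0 for v in vertices}    # models SumAccum<int> @neighbors
weight = {v: 2.8 for v in vertices}     # models MaxAccum<float> @weight = 2.8

for src, tgt in edges:
    neighbors[src] += 1                 # like src.@neighbors += 1

weight["v1"] = max(weight["v1"], 3.5)   # like v1.@weight += 3.5
weight["v2"] = max(weight["v2"], 1.0)   # 1.0 < 2.8, so v2 keeps 2.8

print(neighbors)
print(weight)
```

Updating v1's instance leaves the instances of v2 and v3 untouched, which is the point of the paragraph above.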

Vertex-attached accumulators can only be accessed or updated (via = or +=) in an ACCUM or POST-ACCUM clause within a SELECT block.  The only exception to this rule is that vertex-attached accumulators can be referenced in a PRINT statement, as the PRINT has access to all information attached to a vertex set.

Edge-attached accumulators are not supported.

Global Accumulators

A global accumulator is a single mutable accumulator that can be accessed or updated within a query.  The names of global accumulators start with a double at-sign "@@".

Global Accumulators
SumAccum<int> @@totalNeighbors;
MaxAccum<float> @@entropy = 1.0;

Global accumulators can only be assigned (using the = operator) outside a SELECT block (i.e., not within an ACCUM or POST-ACCUM clause). Global accumulators can be accessed or updated via the accumulate operator += anywhere within a query, including inside a SELECT block.

It is important to note that the accumulation operation for global accumulators in an ACCUM clause executes once for each process. That is, if the FROM clause uses an edge-induced selection (introduced in Section "SELECT Statement"), the ACCUM clause executes one process for each edge in the selected edge set. If the FROM clause uses a vertex-induced selection (introduced in Section "SELECT Statement"), the ACCUM clause executes one process for each vertex in the selected vertex set. Since global accumulators are shared in a mutually exclusive manner among processes, they behave very differently than a non-accumulator variable (see Section "Variable Types" for more details) in an ACCUM clause. Take the following code example. The global accumulator @@globalRelationshipCount is accumulated for every worksFor edge traversed since it is shared among processes. Conversely, relationshipCount appears to have only been incremented once. This is because a non-accumulator variable is not shared among processes. Each process has its own separate unshared copy of relationshipCount and increments the original value by one. (E.g., each process increments relationshipCount from 0 to 1.) There is no accumulation and the final value is one.

Global Variable vs Global Accumulator
#Count the total number of employment relationships for all companies
CREATE QUERY countEmploymentRelationships() FOR GRAPH workNet {
  INT relationshipCount;
  SumAccum<INT> @@globalRelationshipCount;

  start = {company.*};

  companies = SELECT s
              FROM start:s -(worksFor)-> :t
              ACCUM @@globalRelationshipCount += 1,
                    relationshipCount = relationshipCount + 1;

  PRINT relationshipCount;
  PRINT @@globalRelationshipCount;
}


Result
GSQL > RUN QUERY countEmploymentRelationships() { "error": false, "message": "", "results": [ {"relationshipCount": 1}, {"@@globalRelationshipCount": 17} ] }
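
The difference between the two counters can be modeled in a few lines of Python. This is a simplified sketch under stated assumptions: the edge list is invented (17 edges, to mirror the count reported above), and each "process" is modeled as a function call that receives its own copy of the plain variable.

```python
# Hypothetical edge set with 17 worksFor-style edges.
edges = [("company%d" % (i % 5), "person%d" % i) for i in range(17)]

relationship_count = 0           # plain variable from the query body
global_relationship_count = 0    # stands in for @@globalRelationshipCount


def accum_for_edge(local_count):
    # Each edge's process works on its own copy of the plain variable.
    local_count = local_count + 1
    return local_count


per_process_values = []
for _edge in edges:
    per_process_values.append(accum_for_edge(relationship_count))
    global_relationship_count += 1   # the accumulator is shared by all processes

print(set(per_process_values))       # every copy went from 0 to 1
print(global_relationship_count)     # one increment per edge
```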

Static Global Accumulators

A static global accumulator retains its value after the execution of a query. To declare a static global accumulator, include the STATIC keyword at the beginning of the declaration statement. For example, if a static global accumulator is incremented by 1 each time a query is executed, then its value is equal to the number of times the query has been run, since the query was installed. Each static global accumulator belongs to the particular query in which it is declared; it cannot be shared among different queries. The value only persists in the context of running the same query multiple times.  The value will reset to the default value when the GPE is restarted.

Static Global Accumulators example
CREATE QUERY staticAccumEx(INT x) FOR GRAPH minimalNet {
  STATIC ListAccum<INT> @@testList;
  @@testList += x;
  PRINT @@testList;
}


Result
GSQL > RUN QUERY staticAccumEx(2) { "results": [ { "@@testList": [ 2 ] } ], "error": false, "message": "" } GSQL > RUN QUERY staticAccumEx(3) { "results": [ { "@@testList": [ 2, 3 ] } ], "error": false, "message": "" }

There is no command to deallocate a static global accumulator. If a static global accumulator is a collection accumulator and it is no longer needed, it should be cleared to minimize memory usage.
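
The persistence of a static global accumulator across runs can be sketched in Python with state that lives outside the function being called. The dictionary _static_state and the function static_accum_ex are hypothetical stand-ins for the engine's storage and for the staticAccumEx query; clearing the dictionary models a GPE restart.

```python
_static_state = {}   # lives outside any single run, like static accumulator storage


def static_accum_ex(x):
    # Fetch the persistent list, creating it at its default (empty) on first use.
    test_list = _static_state.setdefault("@@testList", [])
    test_list.append(x)          # models @@testList += x
    return list(test_list)       # models PRINT @@testList


first = static_accum_ex(2)
second = static_accum_ex(3)
_static_state.clear()            # models a restart: value returns to its default
third = static_accum_ex(4)

print(first, second, third)
```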

Accumulator Types

The following are the accumulator types we currently support. Each type of accumulator supports one or more data types.

EBNF for Accumulator Types
accumType := "SumAccum" "<" ( INT | FLOAT | DOUBLE | STRING ) ">"
           | "MaxAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
           | "MinAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
           | "AvgAccum"
           | "OrAccum"
           | "AndAccum"
           | "BitwiseOrAccum"
           | "BitwiseAndAccum"
           | "ListAccum" "<" type ">"
           | "SetAccum" "<" elementType ">"
           | "BagAccum" "<" elementType ">"
           | "MapAccum" "<" elementType "," type ">"
           | "HeapAccum" "<" name ">" "(" (integer | name) "," name [ASC | DESC] ["," name [ASC | DESC]]* ")"
           | "GroupByAccum" "<" elementType name ["," elementType name]* , accumType name ["," accumType name]* ">"
           | "ArrayAccum" "<" name ">"

elementType := baseType | name | STRING COMPRESS

gAccumAccumStmt := "@@"name "+=" expr

The accumulators fall into two major groups:

  • Scalar Accumulators store a single value:
    • SumAccum
    • MinAccum, MaxAccum
    • AvgAccum
    • AndAccum, OrAccum
    • BitwiseAndAccum, BitwiseOrAccum
  • Collection Accumulators store a set of values:
    • ListAccum
    • SetAccum
    • BagAccum
    • MapAccum
    • ArrayAccum
    • HeapAccum
    • GroupByAccum

The details of each accumulator type are summarized in the table below.  The Accumulation Operation column explains how the accumulator accumName is updated when the statement accumName += newVal is executed. Following the table are example queries for each accumulator type.

Table Ac1: Accumulator Types and Their Accumulation Behavior

Accumulator Type (Case Sensitive) | Default Initial Value | Accumulation Operation (result of accumName += newVal)
SumAccum<INT> | 0 | accumName plus newVal
SumAccum<FLOAT or DOUBLE> | 0.0 | accumName plus newVal
SumAccum<STRING> | empty string | String concatenation of accumName and newVal
MaxAccum<INT> | INT_MIN | The greater of newVal and accumName
MaxAccum<FLOAT or DOUBLE> | FLOAT_MIN or DOUBLE_MIN | The greater of newVal and accumName
MaxAccum<VERTEX> | the vertex with internal id 0 | The vertex with the greater internal id, either newVal or accumName
MinAccum<INT> | INT_MAX | The lesser of newVal and accumName
MinAccum<FLOAT or DOUBLE> | FLOAT_MAX or DOUBLE_MAX | The lesser of newVal and accumName
MinAccum<VERTEX> | unknown | The vertex with the lesser internal id, either newVal or accumName
AvgAccum | 0.0 (double precision) | Double precision average of newVal and all previous values accumulated to accumName
AndAccum | True | Boolean AND of newVal and accumName
OrAccum | False | Boolean OR of newVal and accumName
BitwiseAndAccum | -1 (INT) = 64-bit sequence of 1s | Bitwise AND of newVal and accumName
BitwiseOrAccum | 0 (INT) = 64-bit sequence of 0s | Bitwise OR of newVal and accumName
ListAccum<type> (ordered collection of elements) | empty list | List with newVal appended to end of accumName. newVal can be a single value or a list. If accumName is [ 2, 4, 6 ], then accumName += 4 produces accumName equal to [ 2, 4, 6, 4 ]
SetAccum<type> (unordered collection of elements, duplicate items not allowed) | empty set | Set union of newVal and accumName. newVal can be a single value or a set/bag. If accumName is ( 2, 4, 6 ), then accumName += 4 produces accumName equal to ( 2, 4, 6 )
BagAccum<type> (unordered collection of elements, duplicate items allowed) | empty bag | Bag union of newVal and accumName. newVal can be a single value or a set/bag. If accumName is ( 2, 4, 6 ), then accumName += 4 produces accumName equal to ( 2, 4, 4, 6 )
MapAccum<type, type> (unordered collection of (key,value) pairs) | empty map | Add or update a key:value pair to the accumName map. If accumName is [ ("red",3), ("green",4), ("blue",2) ], then accumName += ("black" -> 5) produces accumName equal to [ ("red",3), ("green",4), ("blue",2), ("black",5) ]
ArrayAccum<accumType> | empty list | See the ArrayAccum section below for details.
HeapAccum<tuple>(heapSize, sortKey [, sortKey_i]*) (sorted collection of tuples) | empty heap | Insert newVal into the accumName heap, maintaining the heap in sorted order, according to the sortKey(s) and size limit declared for this HeapAccum
GroupByAccum<type [, type]*, accumType [, accumType]*> | empty group by map | Add or update a key:value pair in accumName. See Section "GroupByAccum" for more details.
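
Several rows of the table follow one pattern: an initial value plus a rule for combining the current value with newVal. The Python sketch below models that pattern for a few of the simpler types. The Accum class is an illustration written for this example, and the infinities merely stand in for the INT_MIN and INT_MAX defaults; it is not how the engine implements accumulators.

```python
import operator


class Accum:
    """Toy accumulator: a current value plus a rule for combining a new value."""

    def __init__(self, initial, combine):
        self.value = initial
        self.combine = combine

    def __iadd__(self, new_val):
        # accumName += newVal  ->  value = combine(value, newVal)
        self.value = self.combine(self.value, new_val)
        return self


sum_acc = Accum(0, operator.add)
max_acc = Accum(float("-inf"), max)    # -inf stands in for INT_MIN
min_acc = Accum(float("inf"), min)     # +inf stands in for INT_MAX
list_acc = Accum([], lambda acc, x: acc + [x])
set_acc = Accum(set(), lambda acc, x: acc | {x})
and_acc = Accum(True, lambda acc, x: acc and x)
or_acc = Accum(False, lambda acc, x: acc or x)

for x in [2, 4, 6, 4]:
    sum_acc += x
    max_acc += x
    min_acc += x
    list_acc += x
    set_acc += x

for flag in [True, False, True]:
    and_acc += flag
    or_acc += flag

print(sum_acc.value, max_acc.value, min_acc.value)
print(list_acc.value, sorted(set_acc.value))
print(and_acc.value, or_acc.value)
```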

SumAccum

The SumAccum type computes and stores the cumulative sum of numeric values or the cumulative concatenation of text values. The output of a SumAccum is a single numeric or string value. SumAccum variables operate on values of type INT, UINT, FLOAT, DOUBLE, or STRING only.

The += operator updates the accumulator's state. For INT, FLOAT, and DOUBLE types, += arg performs a numeric addition, while for the STRING value type += arg concatenates arg to the current value of the SumAccum.

SumAccum Example
# SumAccum Example
CREATE QUERY sumAccumEx() FOR GRAPH minimalNet {
  SumAccum<INT> @@intAccum;
  SumAccum<FLOAT> @@floatAccum;
  SumAccum<DOUBLE> @@doubleAccum;
  SumAccum<STRING> @@stringAccum;

  @@intAccum = 1;
  @@intAccum += 1;

  @@floatAccum = @@intAccum;
  @@floatAccum = @@floatAccum / 3;

  @@doubleAccum = @@floatAccum * 8;
  @@doubleAccum += -1;

  @@stringAccum = "Hello ";
  @@stringAccum += "World";

  PRINT @@intAccum;
  PRINT @@floatAccum;
  PRINT @@doubleAccum;
  PRINT @@stringAccum;
}


Result
GSQL > RUN QUERY sumAccumEx() { "error": false, "message": "", "results": [ {"@@intAccum": 2}, {"@@floatAccum": 0.66667}, {"@@doubleAccum": 4.33333}, {"@@stringAccum": "Hello World"} ] }

MinAccum / MaxAccum

The MinAccum and MaxAccum types calculate and store the cumulative minimum or the cumulative maximum of a series of values. The output of a MinAccum or a MaxAccum is a single value. MinAccum and MaxAccum variables operate only on values of type INT, UINT, FLOAT, DOUBLE, or VERTEX (optionally restricted to a specific vertex type).

For MinAccum, += arg checks if the current value held is less than arg and stores the smaller of the two. MaxAccum behaves the same, with the exception that it checks for and stores the greater instead of the lesser of the two.

MinAccum and MaxAccum Example
# MinAccum and MaxAccum Example
CREATE QUERY minMaxAccumEx() FOR GRAPH minimalNet {
  MinAccum<INT> @@minAccum;
  MaxAccum<FLOAT> @@maxAccum;

  @@minAccum += 40;
  @@minAccum += 20;
  @@minAccum += -10;

  @@maxAccum += -1.1;
  @@maxAccum += 2.5;
  @@maxAccum += 2.8;

  PRINT @@minAccum;
  PRINT @@maxAccum;
}


Result
GSQL > INSTALL QUERY minMaxAccumEx GSQL > RUN QUERY minMaxAccumEx() { "error": false, "message": "", "results": [ {"@@minAccum": -10}, {"@@maxAccum": 2.8} ] }

MinAccum and MaxAccum operating on VERTEX type have a special comparison. They do not compare vertex ids, but TigerGraph internal ids, which might not be in the same order as the external ids. Comparing internal ids is much faster, so MinAccum/MaxAccum<VERTEX> provide an efficient way to compare and select vertices. This is helpful for some graph algorithms that require the vertices to be numbered and sortable. For example, the following query returns one post from each person. The returned vertex is not necessarily the vertex with the alphabetically largest id.

MaxAccum<VERTEX> example
# Output one random post vertex from each person
CREATE QUERY minMaxAccumVertex() FOR GRAPH socialNet {
  MaxAccum<VERTEX> @maxVertex;

  allUser = {person.*};
  allUser = SELECT src
            FROM allUser:src -(posted)-> post:tgt
            ACCUM src.@maxVertex += tgt;

  PRINT allUser;
}
Result
GSQL > RUN QUERY minMaxAccumVertex() { "results": [ { "v_id": "person1", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "0"} }, { "v_id": "person2", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "1"} }, { "v_id": "person3", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "2"} }, { "v_id": "person4", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "3"} }, { "v_id": "person5", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "11"} }, { "v_id": "person6", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "10"} }, { "v_id": "person7", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "9"} }, { "v_id": "person8", "v_type": "person", "v_set": "allUser", "v": {"@maxVertex": "8"} } ], "error": false, "message": "" }
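
To see why the chosen vertex need not be the one with the alphabetically largest id, consider the Python sketch below. The mapping from external ids to internal ids is entirely made up for the example; real internal ids are assigned by the system.

```python
# Hypothetical mapping from external id to internal id.
internal_id = {"post_a": 7, "post_b": 2, "post_c": 5}


def max_vertex(vertices):
    # MaxAccum<VERTEX>-style choice: the greatest internal id wins.
    return max(vertices, key=lambda v: internal_id[v])


by_internal = max_vertex(internal_id)   # compares 7, 2, 5
by_external = max(internal_id)          # alphabetical comparison of external ids

print(by_internal, by_external)
```

Here the internal-id comparison selects post_a, while an alphabetical comparison of external ids would select post_c.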

AvgAccum

The AvgAccum type calculates and stores the cumulative mean of a series of numeric values. Internally, its state information includes the sum value of all inputs and a count of how many input values it has accumulated. The output is the mean value; the sum and the count values are not accessible to the user. The data type of an AvgAccum variable is not declared; all AvgAccum accumulators accept inputs of type INT, UINT, FLOAT, and DOUBLE. The output is always DOUBLE type.

The += arg operation updates the AvgAccum variable's state to be the mean of all the previous arguments along with the current argument. The = arg operation clears all the previously accumulated state and sets the new state to be arg with a count of one.

AvgAccum Example
# AvgAccum Example
CREATE QUERY avgAccumEx() FOR GRAPH minimalNet {
  AvgAccum @@averageAccum;

  @@averageAccum += 10;
  @@averageAccum += 5.5;  # avg = (10+5.5) / 2.0
  @@averageAccum += -1;   # avg = (10+5.5-1) / 3.0
  PRINT @@averageAccum;   # 4.8333...

  @@averageAccum = 99;    # reset
  @@averageAccum += 101;  # avg = (99 + 101) / 2
  PRINT @@averageAccum;   # 100
}


Result
GSQL > RUN QUERY avgAccumEx() { "error": false, "message": "", "results": [ {"@@averageAccum": 4.83333}, {"@@averageAccum": 100} ] }
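
The sum-and-count state described above can be modeled directly. The Python class below is a sketch of those semantics (the method names accumulate and assign are invented stand-ins for += and =); it reproduces the two values printed by avgAccumEx.

```python
class AvgAccum:
    """Toy model of AvgAccum: keeps a running sum and a count."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def accumulate(self, x):     # models  += x
        self.total += x
        self.count += 1

    def assign(self, x):         # models  = x : discard history, count becomes 1
        self.total = float(x)
        self.count = 1

    @property
    def value(self):
        # Default output is 0.0 before anything has been accumulated.
        return self.total / self.count if self.count else 0.0


avg = AvgAccum()
for x in (10, 5.5, -1):
    avg.accumulate(x)
first = avg.value        # (10 + 5.5 - 1) / 3

avg.assign(99)
avg.accumulate(101)
second = avg.value       # (99 + 101) / 2

print(round(first, 5), second)
```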

AndAccum / OrAccum

The AndAccum and OrAccum types calculate and store the cumulative result of a series of boolean operations. The output of an AndAccum or an OrAccum is a single boolean value (True or False). AndAccum and OrAccum variables operate on boolean values only.  The data type does not need to be declared.

For AndAccum, += arg updates the state to be the logical AND between the current boolean state and arg . OrAccum behaves the same, with the exception that it stores the result of a logical OR operation.

AndAccum and OrAccum Example
# AndAccum and OrAccum Example
CREATE QUERY andOrAccumEx() FOR GRAPH minimalNet {
  # T = True
  # F = False
  AndAccum @@andAccumVar;  # (default value = T)
  OrAccum @@orAccumVar;    # (default value = F)

  @@andAccumVar += True;   # T and T = T
  @@andAccumVar += False;  # T and F = F
  @@andAccumVar += True;   # F and T = F
  PRINT @@andAccumVar;

  @@orAccumVar += False;   # F or F == F
  @@orAccumVar += True;    # F or T == T
  @@orAccumVar += False;   # T or F == T
  PRINT @@orAccumVar;
}


Result
GSQL > RUN QUERY andOrAccumEx() { "error": false, "message": "", "results": [ {"@@andAccumVar": false}, {"@@orAccumVar": true} ] }

BitwiseAndAccum / BitwiseOrAccum

The BitwiseAndAccum and BitwiseOrAccum types calculate the cumulative result of a series of bitwise boolean operations and store the resulting bit sequence. BitwiseAndAccum and BitwiseOrAccum operate on INT only. The data type does not need to be declared.

Fundamental to understanding and using bitwise operations is the knowledge that integers are stored in base-2 representation as a 64-bit sequence of 1s and 0s. "Bitwise" means that each bit is treated as a separate boolean value, with 1 representing true and 0 representing false. Hence, an integer is equivalent to a sequence of boolean values. Computing the Bitwise AND of two numbers A and B means to compute the bit sequence C where the j-th bit of C, denoted C_j, is equal to (A_j AND B_j).

For BitwiseAndAccum, += arg updates the accumulator's state to be the Bitwise AND of the current state and arg . BitwiseOrAccum behaves the same, with the exception that it computes a Bitwise OR.

Bitwise Operations and Negative Integers

Most computer systems represent negative integers using "2's complement" format, where the uppermost bit has special significance. An operation that changes the uppermost bit moves the value across the boundary between non-negative and negative numbers.
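
The following Python sketch illustrates 64-bit two's complement patterns and why -1 is the natural starting value for a bitwise AND. The helper names bits64 and to_signed64 are invented for this example; Python integers are unbounded, so the sketch masks to 64 bits explicitly.

```python
MASK64 = (1 << 64) - 1


def bits64(n):
    """Two's-complement bit pattern of n as a 64-character string."""
    return format(n & MASK64, "064b")


def to_signed64(u):
    """Interpret an unsigned 64-bit pattern as a signed integer."""
    return u - (1 << 64) if u >> 63 else u


all_ones = bits64(-1)                # -1 is sixty-four 1 bits
and_state = -1 & 170                 # AND with all 1s leaves 170 unchanged
or_state = 0 | 170 | 85              # 10101010 | 01010101 -> 11111111
top_bit = to_signed64(1 << 63)       # only the uppermost bit set
flipped = to_signed64(((1 << 63) - 1) | (1 << 63))   # largest positive, then set top bit

print(all_ones)
print(and_state, or_state)
print(top_bit, flipped)
```

Setting the uppermost bit on the largest positive value yields -1, which shows how a single bit change crosses the sign boundary.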


BitwiseAndAccum and BitwiseOrAccum Example
# BitwiseAndAccum and BitwiseOrAccum Example
CREATE QUERY bitwiseAccumEx() FOR GRAPH minimalNet {
  BitwiseAndAccum @@bwAndAccumVar;  # default value = 64-bits of 1 = -1 (INT)
  BitwiseOrAccum @@bwOrAccumVar;    # default value = 64-bits of 0 = 0 (INT)

  # 11110000 = 240
  # 00001111 = 15
  # 10101010 = 170
  # 01010101 = 85

  # BitwiseAndAccum
  @@bwAndAccumVar += 170;  # 11111111 & 10101010 -> 10101010
  @@bwAndAccumVar += 85;   # 10101010 & 01010101 -> 00000000
  PRINT @@bwAndAccumVar;   # 0

  @@bwAndAccumVar = 15;    # reset to 00001111
  @@bwAndAccumVar += 85;   # 00001111 & 01010101 -> 00000101
  PRINT @@bwAndAccumVar;   # 5

  # BitwiseOrAccum
  @@bwOrAccumVar += 170;   # 00000000 | 10101010 -> 10101010
  @@bwOrAccumVar += 85;    # 10101010 | 01010101 -> 11111111 = 255
  PRINT @@bwOrAccumVar;    # 255

  @@bwOrAccumVar = 15;     # reset to 00001111
  @@bwOrAccumVar += 85;    # 00001111 | 01010101 -> 01011111 = 95
  PRINT @@bwOrAccumVar;    # 95
}


Results
GSQL > RUN QUERY bitwiseAccumEx() { "error": false, "message": "", "results": [ {"@@bwAndAccumVar": 0}, {"@@bwAndAccumVar": 5}, {"@@bwOrAccumVar": 255}, {"@@bwOrAccumVar": 95} ] }

ListAccum

The ListAccum type maintains a sequential collection of elements. The output of a ListAccum is a list of values in the order the elements were added. The element type can be any base type, tuple, or STRING COMPRESS. Additionally, a ListAccum can contain a nested collection of type ListAccum. Nesting of ListAccums is limited to a depth of three.

The += arg operation appends arg to the end of the list.

ListAccum supports two additional operations:

  • @list + arg (numeric or STRING data) adds arg to each element of @list. If the data type is INT, FLOAT, or DOUBLE, numeric addition is performed. If the data type is STRING, concatenation is performed. Other data types are not supported.
  • @list1 * @list2 (STRING data only) generates a new list of strings consisting of every combination of an element of the first list followed by an element of the second list.
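
The two operations can be approximated in Python as follows. This is only a model of the described results: the variable names are invented, and the order in which GSQL emits the products is not asserted here (itertools.product may order them differently).

```python
from itertools import product

letters = ["a", "b"]
words = ["Hello", "World"]

# @list + arg: apply + to every element of the list.
added = [s + "x" for s in letters]

# @list1 * @list2: each element of the first list followed by each
# element of the second list.
multiplied = [first + second for first, second in product(words, letters)]

print(added)
print(sorted(multiplied))
```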

ListAccum also supports the following class functions.

Functions which modify the ListAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.


function (T is the element type) | return type | Accessor / Mutator | description
size() | INT | Accessor | Returns the number of elements in the list.
contains( T val ) | BOOL | Accessor | Returns true if the list contains the value, and false otherwise.
get( INT idx ) | T | Accessor | Returns the value at the given index position in the list. The index begins at 0. If the index is out of bound (including any negative value), the default value of the element type is returned.
clear() | VOID | Mutator | Clears the list so it becomes empty with size 0.
update( INT index, T value ) | VOID | Mutator | Assigns value to the list element at position index.
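
The out-of-bound rule for get() differs from what many languages do (Python, for instance, treats a negative index as counting from the end). The sketch below models the rule stated in the table; list_get is a hypothetical helper, and the default value is passed in explicitly because Python lists carry no element type.

```python
def list_get(values, idx, default):
    """Return values[idx], or default when idx is negative or too large."""
    if 0 <= idx < len(values):
        return values[idx]
    return default


ints = [1, 3, 5, 7, 9, 11, 13, 15]

in_range = list_get(ints, 0, 0)     # first element
too_large = list_get(ints, 8, 0)    # index 8 is past the end -> default 0
negative = list_get(ints, -1, 0)    # negative index -> default 0, not the last element

print(in_range, too_large, negative)
```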


ListAccum Example
# ListAccum Example
CREATE QUERY listAccumEx() FOR GRAPH minimalNet {
  ListAccum<INT> @@intListAccum;
  ListAccum<STRING> @@stringListAccum;
  ListAccum<STRING> @@stringMultiplyListAccum;
  ListAccum<STRING> @@stringAdditionAccum;
  ListAccum<STRING> @@letterListAccum;
  ListAccum<ListAccum<STRING>> @@nestedListAccum;

  @@intListAccum = [1,3,5];
  @@intListAccum += [7,9];
  @@intListAccum += 11;
  @@intListAccum += 13;
  @@intListAccum += 15;

  PRINT @@intListAccum;
  PRINT @@intListAccum.get(0), @@intListAccum.get(1);
  PRINT @@intListAccum.get(8);  # Out of bound: default value of int: 0

  #Other built-in functions
  PRINT @@intListAccum.size();
  PRINT @@intListAccum.contains(2);
  PRINT @@intListAccum.contains(3);

  @@stringListAccum += "Hello";
  @@stringListAccum += "World";
  PRINT @@stringListAccum;

  @@letterListAccum += "a";
  @@letterListAccum += "b";

  #Addition results in appending to each element in list (STRING example)
  # Ex: [a,b,c] + "x" = [ax, bx, cx]
  @@stringAdditionAccum = @@letterListAccum + "x";
  PRINT @@stringAdditionAccum;

  #Multiplication results in combination of all element permutations (STRING TYPE ONLY)
  # Ex: [a,b] * [c,d] = [ac, ad, bc, bd]
  @@stringMultiplyListAccum = @@stringListAccum * @@letterListAccum;
  PRINT @@stringMultiplyListAccum;

  #Two dimensional list (3 dimensions is possible as well)
  @@nestedListAccum += [["foo", "bar"], ["Big", "Bang", "Theory"], ["String", "Theory"]];
  PRINT @@nestedListAccum;
  PRINT @@nestedListAccum.get(0);
  PRINT @@nestedListAccum.get(0).get(1);
}


Result
GSQL > RUN QUERY listAccumEx() { "error": false, "message": "", "results": [ { "@@intListAccum": [ 1, 3, 5, 7, 9, 11, 13, 15 ] }, { "@@intListAccum.get(0)": 1, "@@intListAccum.get(1)": 3 }, { "@@intListAccum.get(8)": 0 }, { "@@intListAccum.size()": 8 }, { "@@intListAccum.contains(2)": false }, { "@@intListAccum.contains(3)": true }, { "@@stringListAccum": [ "Hello", "World" ] }, { "@@stringAdditionAccum": [ "ax", "bx" ] }, { "@@stringMultiplyListAccum": [ "Helloa", "Worlda", "Hellob", "Worldb" ] }, { "@@nestedListAccum": [ [ "foo", "bar" ], [ "Big", "Bang", "Theory" ], [ "String", "Theory" ] ] }, { "@@nestedListAccum.get(0)": [ "foo", "bar" ] }, { "@@nestedListAccum.get(0).get(1)": "bar" } ] }
Example for update function on a global ListAccum
CREATE QUERY listAccumUpdateEx() FOR GRAPH workNet {
  # Global ListAccum
  ListAccum<INT> @@intListAccum;
  ListAccum<STRING> @@stringListAccum;
  ListAccum<BOOL> @@passFail;

  @@intListAccum += [0,2,4,6,8];
  @@stringListAccum += ["apple","banana","carrot","daikon"];

  # Global update at Query-Body Level
  @@passFail += @@intListAccum.update(1,-99);
  @@passFail += @@intListAccum.update(@@intListAccum.size()-1,40); // last element
  @@passFail += @@stringListAccum.update(0,"zero"); // first element
  @@passFail += @@stringListAccum.update(4,"four"); // FAIL: out-of-range

  PRINT @@intListAccum, @@stringListAccum, @@passFail;
}
Results for listAccumUpdateEx
GSQL > RUN QUERY listAccumUpdateEx1() { "error": false, "message": "", "results": [{ "@@passFail": [ true, true, true, false ], "@@intListAccum": [ 0, -99, 4, 6, 40 ], "@@stringListAccum": [ "zero", "banana", "carrot", "daikon" ] }] }
Example for update function on a vertex-attached ListAccum
CREATE QUERY listAccumUpdateEx2(SET<VERTEX<person>> seed) FOR GRAPH workNet {
  # Each person has a LIST<INT> of skills and a LIST<STRING COMPRESS> of interests.
  # This function copies their lists into ListAccums, and then updates the last
  # int with -99 and updates the last string with "fizz".
  ListAccum<INT> @intList;
  ListAccum<STRING COMPRESS> @stringList;
  ListAccum<STRING> @@intFails, @@strFails;

  S0 (person) = seed;
  S1 = SELECT s
       FROM S0:s
       ACCUM
         s.@intList = s.skillList,
         s.@stringList = s.interestList
       POST-ACCUM
         INT len = s.@intList.size(),
         IF NOT s.@intList.update(len-1,-99) THEN
           @@intFails += s.id
         END,
         INT len2 = s.@stringList.size(),
         IF NOT s.@stringList.update(len2-1,"fizz") THEN
           @@strFails += s.id
         END
  ;
  PRINT S1.skillList, S1.interestList, S1.@intList, S1.@stringList;
  PRINT @@intFails, @@strFails;
}
Results for listAccumUpdateEx2
GSQL > RUN QUERY listAccumUpdateEx2(["person1","person5"]) { "error": false, "message": "", "results": [ { "v_set": "S1", "v_id": "person1", "v": { "S1.@stringList": [ "management", "fizz" ], "S1.interestList": [ "management", "financial" ], "S1.skillList": [ 1, 2, 3 ], "S1.@intList": [ 1, 2, -99 ] }, "v_type": "person" }, { "v_set": "S1", "v_id": "person5", "v": { "S1.@stringList": [ "sport", "financial", "fizz" ], "S1.interestList": [ "sport", "financial", "engineering" ], "S1.skillList": [ 8, 2, 5 ], "S1.@intList": [ 8, 2, -99 ] }, "v_type": "person" }, { "@@strFails": [], "@@intFails": [] } ] }
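The update examples above suggest that update(index, value) replaces one element and reports success as a boolean, returning false when the index is out of range. A minimal Python sketch of that behavior follows. The helper list_update is a hypothetical name, and the out-of-range rule is inferred from the printed @@passFail values rather than from a formal definition.

```python
def list_update(values, index, new_value):
    """Replace values[index] in place; return True on success, False if out of range."""
    if 0 <= index < len(values):
        values[index] = new_value
        return True
    return False  # the list is left unchanged


ints = [0, 2, 4, 6, 8]
strs = ["apple", "banana", "carrot", "daikon"]

pass_fail = [
    list_update(ints, 1, -99),
    list_update(ints, len(ints) - 1, 40),  # last element
    list_update(strs, 0, "zero"),          # first element
    list_update(strs, 4, "four"),          # out of range, so False
]

print(ints, strs, pass_fail)
```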


SetAccum

The SetAccum type maintains a collection of unique elements. The output of a SetAccum is a list of elements in arbitrary order. A SetAccum instance can contain values of one type. The element type can be any base type, tuple, or STRING COMPRESS.

For SetAccum, the += arg operation adds a non-duplicate element to the set. If arg is already represented in the set, then the SetAccum state does not change.

SetAccum also can be used with the three canonical set operators: UNION, INTERSECT, and MINUS (see Section "Set/Bag Expression and Operators" for more details).

SetAccum also supports the following class functions.

Functions which modify the SetAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.


function (T is the element type) return type Accessor / Mutator description
size() INT Accessor Returns the number of elements in the set.
contains( T value ) BOOL Accessor Returns true/false if the set does/doesn't contain the value .
clear() VOID Mutator Clears the set so it becomes empty with size 0.
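To make the SetAccum rules concrete, here is a small Python model built on the built-in set type. It is a sketch under stated assumptions, not TigerGraph code: the class name SetAccumSketch is hypothetical, and mapping UNION, INTERSECT, and MINUS onto Python's |, &, and - operators is an analogy only.

```python
class SetAccumSketch:
    """Toy stand-in for SetAccum<INT>; not TigerGraph code."""

    def __init__(self):
        self.items = set()

    def __iadd__(self, arg):
        # Assumption: a tuple or set argument contributes each of its members,
        # mirroring "@@intSetAccum += (1,2,3,4)" in the documented example.
        if isinstance(arg, (tuple, set, frozenset)):
            self.items.update(arg)
        else:
            self.items.add(arg)  # adding a duplicate leaves the set unchanged
        return self

    def size(self):
        return len(self.items)

    def contains(self, value):
        return value in self.items

    def clear(self):
        self.items.clear()


int_set = SetAccumSketch()
int_set += 5
int_set.clear()
for value in (4, 11, 1, 11):
    int_set += value
int_set += (1, 2, 3, 4)

other = {3, 4, 5}
union = int_set.items | other       # analogous to UNION
intersect = int_set.items & other   # analogous to INTERSECT
minus = int_set.items - other       # analogous to MINUS

print(sorted(int_set.items), sorted(union), sorted(intersect), sorted(minus))
```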


SetAccum Example
# SetAccum Example
CREATE QUERY setAccumEx() FOR GRAPH minimalNet {
  SetAccum<INT> @@intSetAccum;
  SetAccum<STRING> @@stringSetAccum;

  @@intSetAccum += 5;
  @@intSetAccum.clear();

  @@intSetAccum += 4;
  @@intSetAccum += 11;
  @@intSetAccum += 1;
  @@intSetAccum += 11; # Sets do not store duplicates

  @@intSetAccum += (1,2,3,4); # Can create simple sets this way
  PRINT @@intSetAccum;
  PRINT @@intSetAccum.contains(3);

  @@stringSetAccum += "Hello";
  @@stringSetAccum += "Hello";
  @@stringSetAccum += "There";
  @@stringSetAccum += "World";
  PRINT @@stringSetAccum;

  PRINT @@stringSetAccum.contains("Hello");
  PRINT @@stringSetAccum.size();
}


Result
GSQL > RUN QUERY setAccumEx() { "error": false, "message": "", "results": [ {"@@intSetAccum": [ 3, 1, 2, 11, 4 ]}, {"@@intSetAccum.contains(3)": true}, {"@@stringSetAccum": [ "World", "There", "Hello" ]}, {"@@stringSetAccum.contains(Hello)": true}, {"@@stringSetAccum.size()": 3} ] }

BagAccum

The BagAccum type maintains a collection of elements with duplicated elements allowed. The output of a BagAccum is a list of elements in arbitrary order. A BagAccum instance can contain values of one type. The element type can be any base type, tuple, or STRING COMPRESS.

For BagAccum, the += arg operation adds an element to the bag.

BagAccum also supports the following class functions.

Functions which modify the BagAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.
function (T is the element type) return type Accessor / Mutator description
size() INT Accessor Returns the number of elements in the bag.
contains( T value ) BOOL Accessor Returns true/false if the bag does/doesn't contain the value .
clear() VOID Mutator Clears the bag so it becomes empty with size 0.
remove( T value ) VOID Mutator Removes one instance of value from the bag.
removeAll( T value ) VOID Mutator Removes all instances of the given value from the bag.
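The difference between remove and removeAll can be illustrated with a multiset. The Python sketch below uses collections.Counter as the bag. The helper names remove_one and remove_all are hypothetical, and the data mirrors the string values used in the BagAccum example that follows.

```python
from collections import Counter

# A bag keeps a count for every distinct value
bag = Counter()
for word in ["Hello", "Hello", "There", "World"]:
    bag[word] += 1


def remove_one(bag, value):
    """Drop a single instance of value, if one is present."""
    if bag[value] > 0:
        bag[value] -= 1
        if bag[value] == 0:
            del bag[value]


def remove_all(bag, value):
    """Drop every instance of value."""
    bag.pop(value, None)


remove_one(bag, "Hello")   # one "Hello" remains
remove_all(bag, "There")   # no "There" remains

print(sorted(bag.elements()))
```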



BagAccum Example
# BagAccum Example
CREATE QUERY bagAccumEx() FOR GRAPH minimalNet {
  #Unordered collection
  BagAccum<INT> @@intBagAccum;
  BagAccum<STRING> @@stringBagAccum;

  @@intBagAccum += 5;
  @@intBagAccum.clear();

  @@intBagAccum += 4;
  @@intBagAccum += 11;
  @@intBagAccum += 1;
  @@intBagAccum += 11; #Bag accums can store duplicates
  @@intBagAccum += (1,2,3,4);
  PRINT @@intBagAccum;

  PRINT @@intBagAccum.size();
  PRINT @@intBagAccum.contains(4);

  @@stringBagAccum += "Hello";
  @@stringBagAccum += "Hello";
  @@stringBagAccum += "There";
  @@stringBagAccum += "World";
  PRINT @@stringBagAccum.contains("Hello");
  @@stringBagAccum.remove("Hello"); #Remove one matching element
  @@stringBagAccum.removeAll("There"); #Remove all matching elements
  PRINT @@stringBagAccum;
}


Result
GSQL > RUN QUERY bagAccumEx() { "error": false, "message": "", "results": [ {"@@intBagAccum": [ 3, 1, 1, 2, 11, 11, 4, 4 ]}, {"@@intBagAccum.size()": 8}, {"@@intBagAccum.contains(4)": true}, {"@@stringBagAccum.contains(Hello)": true}, {"@@stringBagAccum": [ "World", "Hello" ]} ] }

MapAccum

The MapAccum type maintains a collection of (key -> value) pairs. The output of a MapAccum is a set of key and value pairs in which the keys are unique.

The key type of a MapAccum can be any base type, tuple, or STRING COMPRESS. If the key type is VERTEX, then only the vertex's id is stored and displayed.

The value type of a MapAccum can be any base type, tuple, STRING COMPRESS, or any type of accumulator except HeapAccum.

For MapAccum, the += (key->val) operation adds a key-value element to the collection if key is not yet used in the MapAccum. If the MapAccum already contains key , then val is accumulated to the current value, where the accumulation operation depends on the data type of val . (Strings would get concatenated, lists would be appended, numerical values would be added, etc.)
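The insert-or-accumulate rule can be sketched in Python with a plain dictionary. This is an illustration only: map_accumulate is a hypothetical helper, and Python's + operator stands in for the type-dependent accumulation (numeric addition, string concatenation, list appending).

```python
def map_accumulate(store, key, value):
    """Mimic MapAccum "+=": insert a new key, or accumulate into an existing one."""
    if key in store:
        # Python "+" adds numbers and concatenates strings or lists
        store[key] = store[key] + value
    else:
        store[key] = value


int_map = {}
for key, value in [("foo", 3), ("bar", 2), ("baz", 2), ("baz", 1)]:
    map_accumulate(int_map, key, value)

string_map = {}
for key, value in [(1, "apple"), (4, "a"), (4, "b"), (4, "c")]:
    map_accumulate(string_map, key, value)

print(int_map)
print(string_map)
```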

MapAccum also supports the following class functions.

Functions which modify the MapAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.


function (KEY is the key type) return type Accessor / Mutator description
size() INT Accessor Returns the number of elements in the map.
containsKey( KEY key ) BOOL Accessor Returns true/false if the map does/doesn't contain key .
get( KEY key ) value type Accessor Returns the value which the map associates with key . If the map doesn't contain key , then the return value is undefined.
clear() VOID Mutator Clears the map so it becomes empty with size 0.

MapAccum Example
#MapAccum Example
CREATE QUERY mapAccumEx() FOR GRAPH minimalNet {
  #Map(Key, Value)
  # Keys can be INT or STRING only
  MapAccum<STRING, INT> @@intMapAccum;
  MapAccum<INT, STRING> @@stringMapAccum;
  MapAccum<INT, MapAccum<STRING, STRING>> @@nestedMapAccum;

  @@intMapAccum += ("foo" -> 1);
  @@intMapAccum.clear();

  @@intMapAccum += ("foo" -> 3);
  @@intMapAccum += ("bar" -> 2);
  @@intMapAccum += ("baz" -> 2);
  @@intMapAccum += ("baz" -> 1); #add 1 to existing value

  PRINT @@intMapAccum.containsKey("baz");
  PRINT @@intMapAccum.get("bar");
  PRINT @@intMapAccum.get("root");

  @@stringMapAccum += (1 -> "apple");
  @@stringMapAccum += (2 -> "pear");
  @@stringMapAccum += (3 -> "banana");
  @@stringMapAccum += (4 -> "a");
  @@stringMapAccum += (4 -> "b"); #append "b" to existing value
  @@stringMapAccum += (4 -> "c"); #append "c" to existing value

  PRINT @@intMapAccum;
  PRINT @@stringMapAccum;

  #Checking and getting keys
  IF @@stringMapAccum.containsKey(1) THEN
    PRINT @@stringMapAccum.get(1);
  END;

  #Map nesting
  @@nestedMapAccum += ( 1 -> ("foo" -> "bar") );
  @@nestedMapAccum += ( 1 -> ("flip" -> "top") );
  @@nestedMapAccum += ( 2 -> ("fizz" -> "pop") );
  @@nestedMapAccum += ( 1 -> ("foo" -> "s") );

  PRINT @@nestedMapAccum;

  IF @@nestedMapAccum.containsKey(1) THEN
    IF @@nestedMapAccum.get(1).containsKey("foo") THEN
      PRINT @@nestedMapAccum.get(1).get("foo");
    END;
  END;
}


Result
GSQL > RUN QUERY mapAccumEx() { "error": false, "message": "", "results": [ {"@@intMapAccum.containsKey(baz)": true}, {"@@intMapAccum.get(bar)": 2}, {"@@intMapAccum.get(root)": 0}, {"@@intMapAccum": { "bar": 2, "foo": 3, "baz": 3 }}, {"@@stringMapAccum": { "1": "apple", "2": "pear", "3": "banana", "4": "abc" }}, {"@@stringMapAccum.get(1)": "apple"}, {"@@nestedMapAccum": { "1": { "foo": "bars", "flip": "top" }, "2": {"fizz": "pop"} }}, {"@@nestedMapAccum.get(1).get(foo)": "bars"} ] }

ArrayAccum

The ArrayAccum type maintains an array of accumulators. An array is a fixed-length sequence of elements, with direct access to elements by position.  The ArrayAccum has these particular characteristics:

  • The elements are accumulators, not primitive or base data types. All accumulators, except HeapAccum, MapAccum, and GroupByAccum, can be used.
  • An ArrayAccum instance can be multidimensional. There is no limit to the number of dimensions.
  • The size can be set at run-time (dynamically).
  • There are operators which update the entire array efficiently.

When an ArrayAccum is declared, the instance name should be followed by a pair of brackets for each dimension.  The brackets may either contain an integer constant to set the size of the array, or they may be empty. In that case, the size must be set with the reallocate function before the ArrayAccum can be used.

ArrayAccum declaration example
ArrayAccum<SetAccum<STRING>> @@names[10];
ArrayAccum<SetAccum<INT>> @@ids[][]; // 2-dimensional, size to be determined

Because each element of an ArrayAccum is itself an accumulator, the operators =, +=, and + can be used in two contexts: accumulator-level and element-level.

Element-level operations

If @A is an ArrayAccum of length 6, then @A[0] and @A[5] refer to its first and last elements, respectively. Referring to an ArrayAccum element is like referring to an accumulator of that type.  For example, given the following definitions:

ArrayAccum<SumAccum<INT>> @@Sums[3];
ArrayAccum<ListAccum<STRING>> @@Lists[2];

then @@Sums[0], @@Sums[1], and @@Sums[2] each refer to an individual SumAccum<INT>, and @@Lists[0] and @@Lists[1] each refer to a ListAccum<STRING>, supporting all the operations for those accumulator and data types.

@@Sums[1] = 1;
@@Sums[1] += 2; // value is now 3
@@Lists[0] = "cat";
@@Lists[0] += "egory"; // value is now "category"

Accumulator-level operations

The operators =, +=, and + have special meanings when applied to an ArrayAccum as a whole. These operations efficiently update an entire ArrayAccum. All of the ArrayAccums involved must have the same element type.

Operator Description Example
= sets the ArrayAccum on the left equal to the ArrayAccum on the right. The two ArrayAccums must have the same element type, but the left-side ArrayAccum will change its size and dimensions to match the one on the right-side. @A = @B;
+ performs element-by-element addition of two ArrayAccums of the same type and size. The result is a new ArrayAccum of the same size. @C = @A + @B; // @A and @B must be the same size
+= performs element-by-element accumulation (+=) from the right-side ArrayAccum to the left-side ArrayAccum. They must be the same type and size. @A += @B; // @A and @B must be the same size
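The element-by-element behavior of + and += can be sketched in Python with plain lists standing in for one-dimensional ArrayAccum<SumAccum<INT>> instances. The helper array_add is a hypothetical name, and raising an error on a size mismatch is an assumption about how the same-size requirement might be enforced. The values are taken from Tests 2 and 3 of the ArrayAccumOp3 example later in this section.

```python
def array_add(left, right):
    """Element-by-element sum of two equally sized lists; returns a new list."""
    if len(left) != len(right):
        raise ValueError("operands must have the same size")
    return [x + y for x, y in zip(left, right)]


array_b = [100, 99, 98, 97]
array_c = [100, 99, 98, 97]

# "B += C": accumulate C into B, element by element
array_b = array_add(array_b, array_c)

# "A = B + C": A takes the size of the result
array_a = array_add(array_b, array_c)

print(array_b)
print(array_a)
```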


ArrayAccum also supports the following class functions.

Functions which modify the ArrayAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.
function return type Accessor / Mutator description
size() INT Accessor Returns the total number of elements in the (multi-dimensional) array. For example, the size of an ArrayAccum declared as @A[3][4] is 12.
reallocate( INT, ... ) VOID Mutator Discards the previous ArrayAccum instance and creates a new ArrayAccum, with the size(s) given. An N-dimensional ArrayAccum requires N integer parameters. The reallocate function cannot be used to change the number of dimensions.
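As a rough illustration of the two functions above, the Python sketch below tracks only the shape of an array. The class ArrayShapeSketch is hypothetical, and rejecting a changed number of dimensions with an error is an assumption about how that restriction might surface.

```python
from math import prod


class ArrayShapeSketch:
    """Tracks only the dimensions of an array; element storage is omitted."""

    def __init__(self, *dims):
        self.dims = tuple(dims)

    def size(self):
        # Total element count is the product of the per-dimension sizes
        return prod(self.dims)

    def reallocate(self, *dims):
        # The number of dimensions is fixed at declaration time
        if len(dims) != len(self.dims):
            raise ValueError("reallocate cannot change the number of dimensions")
        self.dims = tuple(dims)


grid = ArrayShapeSketch(3, 4)
print(grid.size())

grid.reallocate(2, 5)
print(grid.size())
```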
Example of ArrayAccum Element-level Operations
CREATE QUERY ArrayAccumElem() FOR GRAPH minimalNet {
  ArrayAccum<SumAccum<DOUBLE>> @@aaSumD[2][2]; # 2D Sum Double
  ArrayAccum<SumAccum<STRING>> @@aaSumS[2][2]; # 2D Sum String
  ArrayAccum<MaxAccum<INT>> @@aaMax[2];
  ArrayAccum<MinAccum<UINT>> @@aaMin[2];
  ArrayAccum<AvgAccum> @@aaAvg[2];
  ArrayAccum<AndAccum<BOOL>> @@aaAnd[2];
  ArrayAccum<OrAccum<BOOL>> @@aaOr[2];
  ArrayAccum<BitwiseAndAccum> @@aaBitAnd[2];
  ArrayAccum<BitwiseOrAccum> @@aaBitOr[2];
  ArrayAccum<ListAccum<INT>> @@aaList[2][2]; # 2D List
  ArrayAccum<SetAccum<FLOAT>> @@aaSetF[2];
  ArrayAccum<BagAccum<DATETIME>> @@aaBagT[2];

  ## for test data
  ListAccum<STRING> @@words;
  BOOL toggle = false;
  @@words += "1st";
  @@words += "2nd";
  @@words += "3rd";
  @@words += "4th";

  # Int: a[0] += 1, 2; a[1] += 3, 4
  # Bool: alternate true/false
  # Float: a[0] += 1.111, 2.222; a[1] += 3.333, 4.444
  # 2D Doub: a[0][0] += 1.111, 2.222; a[0][1] += 5.555, 6.666;
  #          a[1][0] += 3.333, 4.444; a[1][1] += 7.777, 8.888;
  FOREACH i IN RANGE [0,1] DO
    FOREACH n IN RANGE [1, 2] DO
      toggle = NOT toggle;
      @@aaMax[i] += i*2 + n;
      @@aaMin[i] += i*2 + n;
      @@aaAvg[i] += i*2 + n;
      @@aaAnd[i] += toggle;
      @@aaOr[i] += toggle;
      @@aaBitAnd[i] += i*2 + n;
      @@aaBitOr[i] += i*2 + n;
      @@aaSetF[i] += (i*2 + n)/0.9;
      @@aaBagT[i] += epoch_to_datetime(i*2 + n);

      FOREACH j IN RANGE [0,1] DO
        @@aaSumD[i][j] += (j*4 + i*2 + n)/0.9;
        @@aaSumS[i][j] += @@words.get((j*2 + i + n)%4);
        @@aaList[i][j] += j*4 +i*2 + n ;
      END;
    END;
  END;

  PRINT @@aaSumD;
  PRINT @@aaSumS;
  PRINT @@aaMax;
  PRINT @@aaMin;
  PRINT @@aaAvg;
  PRINT @@aaAnd;
  PRINT @@aaOr;
  PRINT @@aaBitAnd;
  PRINT @@aaBitOr;
  PRINT @@aaList;
  PRINT @@aaSetF;
  PRINT @@aaBagT;
}
Results for ArrayAccumElem
{ "error": false, "message": "", "results": [ {"@@aaSumD": [ [ 3.33333, 12.22222 ], [ 7.77778, 16.66667 ] ]}, {"@@aaSumS": [ [ "2nd3rd", "4th1st" ], [ "3rd4th", "1st2nd" ] ]}, {"@@aaMax": [ 2, 4 ]}, {"@@aaMin": [ 1, 3 ]}, {"@@aaAvg": [ 1.5, 3.5 ]}, {"@@aaAnd": [ false, false ]}, {"@@aaOr": [ true, true ]}, {"@@aaBitAnd": [ 0, 0 ]}, {"@@aaBitOr": [ 3, 7]}, {"@@aaList": [ [ [ 1, 2 ], [ 5, 6] ], [ [ 3, 4 ], [ 7, 8 ] ] ]}, {"@@aaSetF": [ [ 2.22222, 1.11111], [ 4.44444, 3.33333 ] ]}, {"@@aaBagT": [ [ 2, 1 ], [ 4, 3 ] ]} ] }


Example of Operations between Whole ArrayAccums
CREATE QUERY ArrayAccumOp3(INT lenA) FOR GRAPH minimalNet {
  ArrayAccum<SumAccum<INT>> @@arrayA[5]; // Original size
  ArrayAccum<SumAccum<INT>> @@arrayB[2];
  ArrayAccum<SumAccum<INT>> @@arrayC[][]; // No size
  STRING msg;

  @@arrayA.reallocate(lenA); # Set/Change size dynamically
  @@arrayB.reallocate(lenA+1);
  @@arrayC.reallocate(lenA, lenA+1);

  // Initialize arrays
  FOREACH i IN RANGE[0,lenA-1] DO
    @@arrayA[i] += i*i;
    FOREACH j IN RANGE[0,lenA] DO
      @@arrayC[i][j] += j*10 + i;
    END;
  END;
  FOREACH i IN RANGE[0,lenA] DO
    @@arrayB[i] += 100-i;
  END;
  msg = "Initial Values";
  PRINT msg, @@arrayA, @@arrayB, @@arrayC;

  msg = "Test 1: A = C, C = B"; // = operator
  @@arrayA = @@arrayC; // change dimensions: 1D <- 2D
  @@arrayC = @@arrayB; // change dimensions: 2D <- 1D
  PRINT msg, @@arrayA, @@arrayC;

  msg = "Test 2: B += C"; // += operator
  @@arrayB += @@arrayC; // B and C must have same size & dim
  PRINT msg, @@arrayB, @@arrayC;

  msg = "Test 3: A = B + C"; // + operator
  @@arrayA = @@arrayB + @@arrayC; // B & C must have same size & dim
  PRINT msg, @@arrayA; // A changes size & dim
}
Results for Query ArrayAccumOp3
RUN QUERY ArrayAccumOp3(3) { "error": false, "message": "", "results": [ { "msg": "Initial Values", "@@arrayC": [ [ 0, 10, 20, 30 ], [ 1, 11, 21, 31 ], [ 2, 12, 22, 32 ] ], "@@arrayB": [ 100, 99, 98, 97 ], "@@arrayA": [ 0, 1, 4 ] }, { "msg": "Test 1: A = C, C = B", "@@arrayC": [ 100, 99, 98, 97 ], "@@arrayA": [ [ 0, 10, 20, 30 ], [ 1, 11, 21, 31 ], [ 2, 12, 22, 32 ] ] }, { "msg": "Test 2: B += C", "@@arrayC": [ 100, 99, 98, 97 ], "@@arrayB": [ 200, 198,196, 194 ] }, { "msg": "Test 3: A = B + C", "@@arrayA": [ 300, 297, 294, 291 ] } ] }


Example for Vertex-Attached ArrayAccum
CREATE QUERY ArrayAccumLocal() FOR GRAPH socialNet {
  # Count each person's edges by type
  # friend/liked/posted edges are type 0/1/2, respectively
  ArrayAccum<SumAccum<INT>> @edgesByType[3];

  Persons = {person.*};
  Persons = SELECT s
            FROM Persons:s -(:e)-> :t
            ACCUM CASE e.type
                    WHEN "friend" THEN s.@edgesByType[0] += 1
                    WHEN "liked" THEN s.@edgesByType[1] += 1
                    WHEN "posted" THEN s.@edgesByType[2] += 1
                  END
            ORDER BY s.id;

  PRINT Persons.@edgesByType;
}
Results for Query ArrayAccumLocal
{ "error": false, "message": "", "results": [ { "v_set": "Persons", "v_id": "person1", "v": {"Persons.@edgesByType": [ 2, 1, 1 ]}, "v_type": "person" }, { "v_set": "Persons", "v_id": "person2", "v": {"Persons.@edgesByType": [ 2, 2, 1 ]}, "v_type": "person" }, { "v_set": "Persons", "v_id": "person3", "v": {"Persons.@edgesByType": [ 2, 1, 1 ]}, "v_type": "person" }, { "v_set": "Persons", "v_id": "person4", "v": {"Persons.@edgesByType": [ 3, 1, 1 ]}, "v_type": "person" }, { "v_set": "Persons", "v_id": "person5", "v": {"Persons.@edgesByType": [ 2, 1, 2 ]}, "v_type": "person" } ] }


HeapAccum

The HeapAccum type maintains a sorted collection of tuples and enforces a maximum number of tuples in the collection. The output of a HeapAccum is a sorted collection of tuple elements. The += arg operation adds a tuple to the collection in sorted order. If the HeapAccum is already at maximum capacity when the += operator is applied, then the tuple which is last in the sorted order is dropped from the HeapAccum. Sorting of tuples is performed on one or more defined tuple fields ordered either ascending or descending. Sorting precedence is performed based on defined tuple fields from left to right.

The declaration of a HeapAccum is more complex than for most other accumulators, because the user must define a custom tuple type, set the maximum capacity of the HeapAccum, and specify how the HeapAccum should be sorted. The declaration syntax is outlined in the figure below:

HeapAccum declaration syntax
TYPEDEF TUPLE<type field_1,.., type field_n> tupleName;
...
HeapAccum<tupleName>(capacity, field_a [ASC|DESC],... , field_z [ASC|DESC]);

First, the HeapAccum declaration must be preceded by a TYPEDEF statement which defines the tuple type. At least one of the fields (field_1, ..., field_n) must be of a data type that can be sorted.

In the declaration of the HeapAccum itself, the keyword "HeapAccum" is followed by the tuple type in angle brackets < >. This is followed by a parenthesized list of two or more parameters. The first parameter is the maximum number of tuples that the HeapAccum may store. This parameter must be a positive integer. The subsequent parameters are a subset of the tuple's fields, which are used as sort keys. The sort key hierarchy is from left to right, with the leftmost key being the primary sort key. The keywords ASC and DESC indicate Ascending (lowest value first) or Descending (highest value first) sort order. Ascending order is the default.
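The capacity and sort-key rules can be modeled in a few lines of Python. The sketch below is an illustration under stated assumptions, not how TigerGraph implements HeapAccum: the class HeapAccumSketch, its add method (standing in for +=), and the (field, descending) pair encoding of sort keys are all hypothetical. It keeps a sorted list and truncates it to the capacity after every insertion.

```python
from collections import namedtuple

TestResult = namedtuple("TestResult", ["firstName", "lastName", "score"])


class HeapAccumSketch:
    """Bounded, sorted collection of tuples (illustrative only)."""

    def __init__(self, capacity, *sort_keys):
        # sort_keys are (field_name, descending) pairs; leftmost is the primary key
        self.capacity = capacity
        self.sort_keys = sort_keys
        self.items = []

    def add(self, item):
        self.items.append(item)
        # Stable sorts applied from the least to the most significant key
        for field, descending in reversed(self.sort_keys):
            self.items.sort(key=lambda t: getattr(t, field), reverse=descending)
        del self.items[self.capacity:]  # drop whatever sorts last

    def resize(self, capacity):
        self.capacity = capacity
        del self.items[self.capacity:]

    def top(self):
        # The empty-heap case (a tuple of default values) is not modeled here
        return self.items[0]


heap = HeapAccumSketch(4, ("score", True), ("lastName", False))
for row in [("Bruce", "Wayne", 80), ("Peter", "Parker", 80), ("Tony", "Stark", 100),
            ("Bruce", "Banner", 95), ("Jean", "Summers", 95), ("Clark", "Kent", 80)]:
    heap.add(TestResult(*row))

print(heap.top())
print([t.lastName for t in heap.items])
```

With a capacity of 4, sorting by score descending and then lastName ascending, the three lowest-ranked tuples with score 80 compete for the final slot, and the one whose last name sorts first is kept.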

HeapAccum also supports the following class functions.

Functions which modify the HeapAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.
function return type Accessor / Mutator description
size() INT Accessor Returns the number of elements in the heap.
top( ) tupleType Accessor Returns the top tuple. If this heap doesn't contain anything, returns a tuple with each element equal to the default value.
resize( INT ) VOID Mutator Changes the maximum capacity of the heap.
clear() VOID Mutator Clears the heap so it becomes empty with size 0.


HeapAccum Example
#HeapAccum Example
CREATE QUERY heapAccumEx() FOR GRAPH minimalNet {
  TYPEDEF tuple<STRING firstName, STRING lastName, INT score> testResults;

  #Heap with max size of 4 sorted descending by score then ascending last name
  HeapAccum<testResults>(4, score DESC, lastName ASC) @@topTestResults;

  @@topTestResults += testResults("Bruce", "Wayne", 80);
  @@topTestResults += testResults("Peter", "Parker", 80);
  @@topTestResults += testResults("Tony", "Stark", 100);
  @@topTestResults += testResults("Bruce", "Banner", 95);
  @@topTestResults += testResults("Jean", "Summers", 95);
  @@topTestResults += testResults("Clark", "Kent", 80);

  #Show element with the highest sorted position
  PRINT @@topTestResults.top();

  PRINT @@topTestResults;

  #Increase the size of the heap to add more elements
  @@topTestResults.resize(5);

  #Find the size of the current heap
  PRINT @@topTestResults.size();

  @@topTestResults += testResults("Bruce", "Wayne", 80);
  @@topTestResults += testResults("Peter", "Parker", 80);
  PRINT @@topTestResults;

  #Resizing smaller WILL REMOVE excess elements from the HeapAccum
  @@topTestResults.resize(3);
  PRINT @@topTestResults;

  #Increasing capacity will not restore dropped elements
  @@topTestResults.resize(5);
  PRINT @@topTestResults;

  #Removes all elements from the HeapAccum
  @@topTestResults.clear();
  PRINT @@topTestResults.size();
}


Result
GSQL > RUN QUERY heapAccumEx() { "error": false, "message": "", "results": [ {"@@topTestResults.top()": { "firstName": "Tony", "lastName": "Stark", "score": 100 }}, {"@@topTestResults": [ { "firstName": "Tony", "lastName": "Stark", "score": 100 }, { "firstName": "Bruce", "lastName": "Banner", "score": 95 }, { "firstName": "Jean", "lastName": "Summers", "score": 95 }, { "firstName": "Clark", "lastName": "Kent", "score": 80 } ]}, {"@@topTestResults.size()": 4}, {"@@topTestResults": [ { "firstName": "Tony", "lastName": "Stark", "score": 100 }, { "firstName": "Bruce", "lastName": "Banner", "score": 95 }, { "firstName": "Jean", "lastName": "Summers", "score": 95 }, { "firstName": "Clark", "lastName": "Kent", "score": 80 }, { "firstName": "Peter", "lastName": "Parker", "score": 80 } ]}, {"@@topTestResults": [ { "firstName": "Tony", "lastName": "Stark", "score": 100 }, { "firstName": "Bruce", "lastName": "Banner", "score": 95 }, { "firstName": "Jean", "lastName": "Summers", "score": 95 } ]}, {"@@topTestResults": [ { "firstName": "Tony", "lastName": "Stark", "score": 100 }, { "firstName": "Bruce", "lastName": "Banner", "score": 95 }, { "firstName": "Jean", "lastName": "Summers", "score": 95 } ]}, {"@@topTestResults.size()": 0} ] }

GroupByAccum

The GroupByAccum is a compound accumulator, an accumulator of accumulators. At the top level, it is a MapAccum where both the key and the value can have multiple fields. Moreover, each of the value fields is an accumulator type.

GroupByAccum syntax
GroupByAccum<type [, type]* , accumType [, accumType]* >

In the EBNF above, the type terms form the key set, and the accumType terms form the map's value. Since they are accumulators, they perform a grouping. Like a MapAccum, if we try to store a (key->value) whose key has already been used, then the new value will accumulate to the data which is already stored.  In this case, each field of the multiple-field value has its own accumulation function. One way to think about GroupByAccum is that each unique key is a group ID.

In GroupByAccum, each key type can be a base type, tuple, or STRING COMPRESS. The accumulators are used for aggregating group values. Each accumulator type can be any type except HeapAccum. Each base type and each accumulator type must be followed by an alias. Below is an example declaration.

GroupByAccum<INT a, STRING b, MaxAccum<INT> maxa, ListAccum<ListAccum<INT>> lists> @@group;

To add new data to this GroupByAccum, the data should be formatted as (key1, key2 -> value1, value2) .
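The grouping behavior can be sketched in Python as a dictionary keyed by the tuple of key fields, with one accumulation function per value field. This is an illustrative model only. GroupBySketch, max_accum, and list_accum are hypothetical names, and the data mirrors the @@group declaration above (a MaxAccum<INT> and a ListAccum<ListAccum<INT>>).

```python
def max_accum(current, new):
    """MaxAccum-like rule: keep the larger value."""
    return new if current is None else max(current, new)


def list_accum(current, new):
    """ListAccum-like rule: append the new item to the stored list."""
    return ([] if current is None else current) + [new]


class GroupBySketch:
    def __init__(self, *accum_fns):
        self.accum_fns = accum_fns  # one accumulation rule per value field
        self.groups = {}            # maps a key tuple to its list of accumulated values

    def add(self, keys, values):
        current = self.groups.get(keys, [None] * len(self.accum_fns))
        self.groups[keys] = [
            fn(old, new) for fn, old, new in zip(self.accum_fns, current, values)
        ]


group = GroupBySketch(max_accum, list_accum)
group.add((1, "a"), (1, [1]))
group.add((1, "a"), (2, [2]))  # same key: each field accumulates with its own rule
group.add((2, "b"), (1, [4]))

print(group.groups)
```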

GroupByAccum also supports the following class functions.

Functions which modify the GroupByAccum (mutator functions) can be used only under the following conditions:

  • Mutator functions of global accumulators may only be used at the query-body level.
  • Mutator functions of vertex-attached accumulators may only be used in a POST-ACCUM clause.
function (KEY1..KEYn are the key types) return type Accessor / Mutator description
size() INT Accessor Returns the number of elements in the GroupByAccum.
get( KEY1 key_value1 , KEY2 key_value2 ... ) element type(s) of the accumulator(s) Accessor Returns the values from each accumulator in the group associated with the given key(s). If the key(s) do not exist, returns the default value(s) of the accumulator type(s).
containsKey( KEY1 key_value1 , KEY2 key_value2 ... ) BOOL Accessor Returns true/false if the accumulator does/doesn't contain the key(s).
clear() VOID Mutator Clears the GroupByAccum so it becomes empty with size 0.
remove ( KEY1 key_value1 , KEY2 key_value2 ... ) VOID Mutator Removes the group associated with the key(s).
GroupByAccum Example
#GroupByAccum Example
CREATE QUERY groupByAccumEx () FOR GRAPH socialNet {
  ## declaration: the first two primitive types are the group-by keys; the remaining accumulator types are the aggregates
  GroupByAccum<INT a, STRING b, MaxAccum<INT> maxa, ListAccum<ListAccum<INT>> lists> @@group;
  GroupByAccum<STRING gender, MapAccum<VERTEX<person>, DATETIME> m> @@group2;

  # nested GroupByAccum
  GroupByAccum<INT a, MaxAccum<INT> maxa, GroupByAccum<INT a, MaxAccum<INT> maxa> heap> @@group3;

  Start = { person.* };

  ## usage of global GroupByAccum
  @@group += (1, "a" -> 1, [1]);
  @@group += (1, "a" -> 2, [2]);
  @@group += (2, "b" -> 1, [4]);

  @@group3 += (2 -> 1, (2 -> 0) );
  @@group3 += (2 -> 1, (2 -> 5) );
  @@group3 += (2 -> 5, (3 -> 3) );

  PRINT @@group, @@group.get(1, "a"), @@group.get(1, "a").lists, @@group.containsKey(1, "c"), @@group3;

  ## two kinds of foreach
  FOREACH g IN @@group DO
    PRINT g.a, g.b, g.maxa, g.lists;
  END;

  FOREACH (g1,g2,g3,g4) IN @@group DO
    PRINT g1,g2,g3,g4;
  END;

  S = SELECT v
      FROM Start:v - (liked:e) - post:t
      ACCUM @@group2 += (v.gender -> (v -> e.actionTime));

  PRINT @@group2, @@group2.get("Male").m, @@group2.get("Female").m;
}

Result
GSQL > RUN QUERY groupByAccumEx() { "results": [ { "@@group": [ { "a": 2, "b": "b", "maxa": 1, "lists": [ [4] ] }, { "a": 1, "b": "a", "maxa": 2, "lists": [ [1], [2] ] } ], "@@group.get(1,a)": { "maxa": 2, "lists": [ [1], [2] ] }, "@@group.get(1,a).lists": [ [1], [2] ], "@@group.containsKey(1,c)": false, "@@group3": [ { "a": 2, "maxa": 5, "heap": [ { "a": 3, "maxa": 3 }, { "a": 2, "maxa": 5 } ] } ] }, { "g.a": 2, "g.b": "b", "g.maxa": 1, "g.lists": [ [4] ] }, { "g.a": 1, "g.b": "a", "g.maxa": 2, "g.lists": [ [1], [2] ] }, { "g1": 2, "g2": "b", "g3": 1, "g4": [ [4] ] }, { "g1": 1, "g2": "a", "g3": 2, "g4": [ [1], [2] ] }, { "@@group2": [ { "gender": "Male", "m": { "person6": 1263468185, "person1": 1263209520, "person3": 1263618953, "person7": 1263295325, "person8": 1263180365 } }, { "gender": "Female", "m": { "person2": 2526519281, "person5": 1263330725, "person4": 1263352565 } } ], "@@group2.get(Male).m": { "person6": 1263468185, "person1": 1263209520, "person3": 1263618953, "person7": 1263295325, "person8": 1263180365 }, "@@group2.get(Female).m": { "person2": 2526519281, "person5": 1263330725, "person4": 1263352565 } } ], "error": false, "message": "" }

Nested Accumulators

Certain collection accumulators may be nested. That is, an accumulator may contain a collection of elements where the elements themselves are accumulators. For example:

ListAccum<ListAccum<INT>> @@matrix; # a 2-dimensional jagged array of integers. Each inner list has its own unique size.

Only ListAccum, ArrayAccum, MapAccum, and GroupByAccum can contain other accumulators. However, not all combinations of collection accumulators are allowed. The following constraints apply:

  1. ListAccum: ListAccum is the only accumulator type which can be nested within ListAccum, up to a depth of 3:

    ListAccum<ListAccum<INT>>
    ListAccum<ListAccum<ListAccum<INT>>>
    ListAccum<SetAccum<INT>> # illegal
  2. MapAccum: All accumulator types, except for HeapAccum, can be nested within MapAccum as the value type. For example,

    MapAccum<STRING, ListAccum<INT>>
    MapAccum<INT, MapAccum<INT, STRING>>
    MapAccum<VERTEX, SumAccum<INT>>
    MapAccum<STRING, SetAccum<VERTEX>>
    MapAccum<STRING, GroupByAccum<VERTEX a, MaxAccum<INT> maxs>>
    MapAccum<SetAccum<INT>, INT> # illegal


  3. GroupByAccum: All accumulator types, except for HeapAccum, can be nested within GroupByAccum as the accumulator type. For example:

    GroupByAccum<INT a, STRING b, MaxAccum<INT> maxs, ListAccum<ListAccum<INT>> lists>
  4. ArrayAccum: Unlike the other accumulators in this list, where nesting is optional, nesting is mandatory for ArrayAccum. See the ArrayAccum section above.


It is legal to define nested ListAccums to form a multi-dimensional array. Note the declaration statements and the nested [ bracket ] notation in the example below:

CREATE QUERY nestedAccumEx() FOR GRAPH minimalNet {
  ListAccum<ListAccum<INT>> @@_2d_list;
  ListAccum<ListAccum<ListAccum<INT>>> @@_3d_list;
  ListAccum<INT> @@_1d_list;
  SumAccum <INT> @@sum = 4;

  @@_1d_list += 1;
  @@_1d_list += 2;
  // add 1D-list to 2D-list as element
  @@_2d_list += @@_1d_list;

  // add 1D-enum-list to 2D-list as element
  @@_2d_list += [@@sum, 5, 6];
  // combine 2D-enum-list and 2d-list
  @@_2d_list += [[7, 8, 9], [10, 11], [12]];

  // add an empty 1D-list
  @@_1d_list.clear();
  @@_2d_list += @@_1d_list;

  // combine two 2D-list
  @@_2d_list += @@_2d_list;

  PRINT @@_2d_list;

  // test 3D-list
  @@_3d_list += @@_2d_list;
  @@_3d_list += [[7, 8, 9], [10, 11], [12]];
  PRINT @@_3d_list;
}
Result
GSQL > RUN QUERY nestedAccumEx() { "error": false, "message": "", "results": [ {"@@_2d_list": [ [1,2], [4,5,6], [7,8,9], [10,11], [12], [], [1,2], [4,5,6], [7,8,9], [10,11], [12], [] ]}, {"@@_3d_list": [ [ [1,2], [4,5,6], [7,8,9], [10,11], [12], [], [1,2], [4,5,6], [7,8,9], [10,11], [12], [] ], [ [7,8,9], [10,11], [12] ] ]} ] }




End of Accumulators Section



Operators, Functions, and Expressions


An expression is a combination of fixed values, variables, operators, function calls, and groupings which specify a computation, resulting in a data value. This section of the specification describes the literals (fixed values), operators, and functions available in the GSQL query language. It covers the subset of the EBNF definitions shown below. However, more so than in other sections of the specification, syntax alone is not an adequate description. The semantics (functionality) of the particular operators and functions are an essential complement to the syntax.

EBNF for Operations, Functions, and Expressions
constant := numeric | stringLiteral | TRUE | FALSE | GSQL_UINT_MAX
          | GSQL_INT_MAX | GSQL_INT_MIN | TO_DATETIME "(" stringLiteral ")"

mathOperator := "*" | "/" | "%" | "+" | "-" | "<<" | ">>" | "&" | "|"

comparisonOperator := "<" | "<=" | ">" | ">=" | "==" | "!="

condition := expr
           | expr comparisonOperator expr
           | expr [ NOT ] IN setBagExpr
           | expr IS [ NOT ] NULL
           | expr BETWEEN expr AND expr
           | "(" condition ")"
           | NOT condition
           | condition (AND | OR) condition
           | (TRUE | FALSE)

expr := ["@@"]name
      | name "." "type"
      | name "." ["@"]name
      | name "." "@"name ["\'"]
      | name "." name "." name "(" [argList] ")"
      | name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
      | name ["<" type ["," type"]* ">"] "(" [argList] ")"
      | name "." "@"name ("." name "(" [argList] ")")+ ["." name]
      | "@@"name ("." name "(" [argList] ")")+ ["." name]
      | COALESCE "(" [argList] ")"
      | ( COUNT | ISEMPTY | MAX | MIN | AVG | SUM ) "(" setBagExpr ")"
      | expr mathOperator expr
      | "-" expr
      | "(" expr ")"
      | "(" argList "->" argList ")"    // key value pair for MapAccum
      | "[" argList "]"                 // a list
      | constant
      | setBagExpr
      | name "(" argList ")"

setBagExpr := ["@@"]name
            | name "." ["@"]name
            | name "." "@"name ("." name "(" [argList] ")")+
            | name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
            | "@@"name ("." name "(" [argList] ")")+
            | setBagExpr (UNION | INTERSECT | MINUS) setBagExpr
            | "(" argList ")"
            | "(" setBagExpr ")"

argList := expr ["," expr]*


Constants

constant := numeric | stringLiteral | TRUE | FALSE | GSQL_UINT_MAX | GSQL_INT_MAX | GSQL_INT_MIN | TO_DATETIME "(" stringLiteral ")"

Each primitive data type supports constant values:

Data Type | Constant | Examples
Numeric types (INT, UINT, FLOAT, DOUBLE) | numeric | 123, -5, 45.67, 2.0e-0.5
UINT | GSQL_UINT_MAX |
INT | GSQL_INT_MAX, GSQL_INT_MIN |
boolean | TRUE, FALSE |
string | stringLiteral | "atoz@com", "0.25"


GSQL_UINT_MAX = 2 ^ 64 - 1 = 18446744073709551615

GSQL_INT_MAX = 2 ^ 63 - 1 =  9223372036854775807

GSQL_INT_MIN = -2 ^ 63     = -9223372036854775808
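The three limits above are the usual bounds of 64-bit unsigned and signed integers. The following sketch uses Python purely as a calculator (it is not GSQL) to confirm the decimal values; the variable names simply mirror the GSQL constants.

```python
# Python used only as a calculator here; this is not GSQL.
GSQL_UINT_MAX = 2**64 - 1      # largest 64-bit unsigned integer
GSQL_INT_MAX = 2**63 - 1       # largest 64-bit signed integer
GSQL_INT_MIN = -(2**63)        # smallest 64-bit signed integer

print(GSQL_UINT_MAX)
print(GSQL_INT_MAX)
print(GSQL_INT_MIN)
```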

Operators

An operator is a keyword token which performs a specific computational function to return a resulting value, using the adjacent expressions (its operands) as input values.  An operator is similar to a function in that both compute a result from inputs, but syntactically they are different. The most familiar operators are the mathematical operators for addition  +  and subtraction  - .

Tip: The operators listed in this section are designed to behave like the operators in MySQL.


Mathematical Operators and Expressions

We support the following standard mathematical operators and meanings. The latter four ("<<" | ">>" | "&" | "|") are for bitwise operations.  See the section below: "Bit Operators".

mathOperator := "*" | "/" | "%" | "+" | "-" | "<<" | ">>" | "&" | "|"

Operator precedences are shown in the following list, from highest precedence to the lowest. Operators that are shown together on a line have the same precedence:

Operator Precedence, highest to lowest
*, /, %
-, +
<<, >>
&
|
==, >=, >, <=, <, !=



Example 1. Math Operators + - * /
CREATE QUERY mathOperators() FOR GRAPH minimalNet {
  int x;
  int y;
  int z;
  float v;

  x = 5;
  y = 4;

  z = x * y;    # z = 20
  z = x - y;    # z = 1
  z = x + y;    # z = 9
  z = x / y;    # z = 1
  z = x / 4.0;  # z = 1
  v = x / y;    # v = 1
  v = x / 4.0;  # v = 1.25
  v = x % 3;    # v = 2
  z = x % y;    # z = 1
}
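The division cases in this example depend on the operand types and on the type of the variable receiving the result. The sketch below is a Python model of that arithmetic for x = 5 and y = 4, not GSQL code. It assumes that INT / INT discards the fractional part and that storing a fractional value in an INT variable also discards it; the helper name assign_int is hypothetical, and negative operands are not covered.

```python
# Python model of the arithmetic in mathOperators(); this is not GSQL.
def assign_int(value):
    """Model storing a numeric result in an INT variable (fraction dropped)."""
    return int(value)

x, y = 5, 4

z_int_div = x // y               # INT / INT             -> 1
z_mixed = assign_int(x / 4.0)    # 1.25 stored in an INT -> 1
v_int_div = float(x // y)        # INT / INT stored in a FLOAT -> 1.0
v_mixed = x / 4.0                # INT / FLOAT           -> 1.25
v_mod = float(x % 3)             # remainder stored in a FLOAT -> 2.0

print(z_int_div, z_mixed, v_int_div, v_mixed, v_mod)
```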

Boolean Operators

We support the standard Boolean operators AND, OR, and NOT, with the standard order of precedence: NOT binds most tightly, then AND, then OR.

Bit Operators

Bit operators (<<, >>, &, and |) operate on integers and return an integer.

Bit Operators
CREATE QUERY bitOperationTest() FOR GRAPH minimalNet {
  PRINT 80 >> 2;      # 20
  PRINT 80 << 2;      # 320
  PRINT 2 + 80 >> 4;  # 5
  PRINT 2 | 3 ;       # 3
  PRINT 2 & 3 ;       # 2
  PRINT 2 | 3 + 2;    # 7
  PRINT 2 & 3 - 2;    # 0
}
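The mixed expressions in this example rely on + and - binding more tightly than the shift operators, which in turn bind more tightly than & and |. Python happens to use the same relative precedence for these operators, so the following sketch (Python, not GSQL) spells out the implied grouping with explicit parentheses.

```python
# Python model of bitOperationTest(); this is not GSQL.
# Each pair shows the expression as written and with its implied grouping.
shift_after_add = (2 + 80 >> 4, (2 + 80) >> 4)   # 82 >> 4
or_after_add = (2 | 3 + 2, 2 | (3 + 2))          # 2 | 5
and_after_sub = (2 & 3 - 2, 2 & (3 - 2))         # 2 & 1

print(80 >> 2, 80 << 2)
print(shift_after_add, or_after_add, and_after_sub)
```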

String Operators

Operator + can be used for concatenating strings.

Tuple Fields

The fields of the tuple can be accessed using the dot operator.

Comparison Operators and Conditions

A condition is an expression which evaluates to a boolean value of either true or false. One type of condition uses the familiar comparison operators. A comparison operator compares two numeric values.

comparisonOperator := "<" | "<=" | ">" | ">=" | "==" | "!="

condition := expr
           | expr comparisonOperator expr
           | expr [ NOT ] IN setBagExpr
           | expr IS [ NOT ] NULL
           | expr BETWEEN expr AND expr
           | "(" condition ")"
           | NOT condition
           | condition (AND | OR) condition
           | (TRUE | FALSE)
           | expr NOT? LIKE expr (ESCAPE ESCAPE_CHAR)?


BETWEEN expr AND expr

The expression expr1 BETWEEN expr2 AND expr3 is true if the value expr1 is in the range from expr2 to expr3, including the endpoint values. Each expression must be numeric.

" expr1 BETWEEN expr2 AND expr3 " is equivalent to " expr1 <= expr3 AND expr1 >= expr2".

BETWEEN AND example
CREATE QUERY mathOperatorBetween() FOR GRAPH minimalNet {
  int x;
  bool b;
  x = 1;
  b = (x BETWEEN 0 AND 100); PRINT b;  # True
  b = (x BETWEEN 1 AND 2); PRINT b;    # True
  b = (x BETWEEN 0 AND 1); PRINT b;    # True
}


IS NULL, IS NOT NULL

IS NULL and IS NOT NULL can be used for checking whether an optional parameter is given any value.

IS NULL example
CREATE QUERY parameterIsNULL (INT p) FOR GRAPH minimalNet {
  IF p IS NULL THEN
    PRINT "p is null";
  ELSE
    PRINT "p is not null";
  END;
}


Result
GSQL > RUN QUERY parameterIsNULL(_)
{
  "error": false,
  "message": "",
  "results": [ { "p is null": "p is null" } ]
}
GSQL > RUN QUERY parameterIsNULL(3)
{
  "error": false,
  "message": "",
  "results": [ { "p is not null": "p is not null" } ]
}

Every attribute value stored in GSQL is a valid value, so IS NULL and IS NOT NULL are only effective for query parameters.


LIKE

The LIKE operator is used for string pattern matching. The expression

string1 LIKE string_pattern

evaluates to boolean true if string1 matches the pattern in string_pattern ; otherwise it is false. Both operands must be strings. LIKE may be used only in WHERE clauses. Additionally, string_pattern supports the following wildcard and other symbols, in order to express a pattern:

character or syntax | meaning
% | Matches zero or more characters. Example: "%abc%" matches any string which contains the sequence "abc".
_ (underscore) | Matches any single character. Example: "_abc_e" matches any 6-character string where the 2nd to 4th characters are "abc" and the last character is "e".
[charlist] | Matches any character in charlist. charlist is a concatenated character set, with no separators. Example: "[Tiger]" matches either T, i, g, e, or r.
[^charlist] | Matches any character NOT in charlist. Example: "[^qxz]" matches any character other than q, x, or z.
[!charlist] | Matches any character NOT in charlist.

special syntax within charlist | meaning
α-β | Matches a character in the range from α to β. A charlist can have multiple ranges. Example: "[a-mA-M0-3]" matches a letter from a to m, upper or lower case, or a digit from 0 to 3.
\\ | Matches the character \
\\] | Matches the character ]. No special treatment is needed for [ inside a charlist. Example: "%[\\]!]" matches any string which ends with either ] or !
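One way to reason about these wildcards is to translate a LIKE pattern into a regular expression. The sketch below does this in Python; it is an illustrative model, not how GSQL implements LIKE. The function names like_to_regex and like are hypothetical, matching is assumed to be case-sensitive and to cover the whole string, and the backslash escapes inside a charlist are deliberately left out.

```python
import re

def like_to_regex(pattern: str) -> str:
    """Translate a LIKE pattern (%, _, [charlist]) into a Python regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            parts.append(".*")            # zero or more characters
        elif ch == "_":
            parts.append(".")             # exactly one character
        elif ch == "[":
            end = pattern.find("]", i + 1)
            if end == -1:
                raise ValueError("unterminated charlist")
            body = pattern[i + 1:end]
            negate = body[:1] in ("^", "!")
            if negate:
                body = body[1:]
            if not body:
                raise ValueError("empty charlist")
            # Keep "-" unescaped so that ranges such as a-m still work.
            members = "".join("-" if c == "-" else re.escape(c) for c in body)
            parts.append("[" + ("^" if negate else "") + members + "]")
            i = end
        else:
            parts.append(re.escape(ch))   # ordinary character, matched literally
        i += 1
    return "".join(parts)

def like(text: str, pattern: str) -> bool:
    """Return True if the whole of text matches the LIKE pattern."""
    return re.fullmatch(like_to_regex(pattern), text, flags=re.DOTALL) is not None

print(like("xxabcxx", "%abc%"), like("1abc2e", "_abc_e"), like("n", "[a-mA-M0-3]"))
```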

Mathematical Functions

There are a number of built-in functions which act on either an accumulator, a base type, or vertex variable. The accumulator function calls are discussed in detail in the "Accumulators" section.

Below is a list of built-in functions which act on either INT, FLOAT, or DOUBLE value(s).

function name and parameters (NUM means INT, FLOAT, or DOUBLE) | description | return type
abs(NUM num) | Returns the absolute value of num | Same as parameter type
sqrt(NUM num) | Returns the square root of num | FLOAT
pow(NUM base, NUM exp) | Returns base^exp (base raised to the power exp) | INT if base and exp are both INT; otherwise FLOAT
acos(NUM num) | arc cosine | FLOAT
asin(NUM num) | arc sine | FLOAT
atan(NUM num) | arc tangent | FLOAT
atan2(NUM y, NUM x) | arc tangent of y / x | FLOAT
ceil(NUM num) | rounds upward | INT
cos(NUM num) | cosine | FLOAT
cosh(NUM num) | hyperbolic cosine | FLOAT
exp(NUM num) | base-e exponential | FLOAT
floor(NUM num) | rounds downward | INT
fmod(NUM numer, NUM denom) | floating-point remainder of numer / denom | FLOAT
ldexp(NUM x, NUM exp) | x * 2^exp | FLOAT
log(NUM num) | natural logarithm | FLOAT
log10(NUM num) | common (base-10) logarithm | FLOAT
sin(NUM num) | sine | FLOAT
sinh(NUM num) | hyperbolic sine | FLOAT
tan(NUM num) | tangent | FLOAT
tanh(NUM num) | hyperbolic tangent | FLOAT
to_string(NUM num) | Converts num to a STRING value | STRING
float_to_int(FLOAT num) | Converts num to an INT value by truncating the fractional part | INT
str_to_int(STRING str) | Converts str to an INT value. If str is a floating-point number, the fractional part is truncated; if str is not a numerical value, returns 0. | INT
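The two conversion functions at the end of the table have rules that are easy to misread, so here is a small Python model of them (not GSQL). It assumes that "truncating" means dropping the fractional part toward zero, and it treats any string that Python cannot parse as a number as "not a numerical value". The Python function names reuse the GSQL names only for readability.

```python
# Python model of float_to_int and str_to_int as described above; not GSQL.
def float_to_int(num: float) -> int:
    """Drop the fractional part (assumed to truncate toward zero)."""
    return int(num)

def str_to_int(text: str) -> int:
    """Parse text as a number and truncate it; return 0 if it is not numeric."""
    try:
        return int(text)                 # plain integer string
    except ValueError:
        pass
    try:
        return int(float(text))          # floating-point string, fraction dropped
    except (ValueError, OverflowError):
        return 0                         # not a numerical value

print(float_to_int(2.9), str_to_int("42"), str_to_int("3.99"), str_to_int("abc"))
```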

Datetime Functions

The following functions convert between DATETIME and other types.

function name and parameters | description | return type
to_datetime(STRING str) | Converts str to a DATETIME value | DATETIME
epoch_to_datetime(INT int_value) | Converts int_value to a DATETIME value by epoch time conversion | DATETIME
datetime_to_epoch(DATETIME date) | Converts date to epoch time. | INT

The following function converts a DATETIME value into a string format specified by the user:

function name and parameters | description | return type
datetime_format(DATETIME date [, STRING str]) | Prints date as the str indicates. The specifiers listed below may be used in str. The "%" character is required before the format specifier characters. If str is not given, "%Y-%m-%d %H:%M:%S" is used. | STRING

specifier | description
%Y | Year, numeric, four digits
%m | Month, numeric (1..12)
%d | Day of the month, numeric (1..31)
%H | Hour, numeric (0..23)
%M | Minutes, numeric (0..59)
%S | Seconds (0..59)
datetime_format example
# Show the post time of every post
CREATE QUERY allposttime() FOR GRAPH socialNet {
  start = {post.*};
  PRINT datetime_format(start.postTime, "a message was posted at %H:%M:%S on %Y/%m/%d");
}
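The six specifiers accepted by datetime_format have the same letters as the corresponding Python strftime codes, so the format string from the example can be tried out in Python. This is a sketch for experimenting with format strings, not GSQL; Python zero-pads each field, and whether GSQL pads identically is an assumption here. The sample timestamp is the postTime of the "coffee" post in the socialNet example data.

```python
from datetime import datetime

# Default used by datetime_format when no format string is given.
DEFAULT_FORMAT = "%Y-%m-%d %H:%M:%S"

def datetime_format(date: datetime, fmt: str = DEFAULT_FORMAT) -> str:
    """Python stand-in for GSQL datetime_format, built on strftime."""
    return date.strftime(fmt)

post_time = datetime(2011, 2, 7, 5, 2, 51)
message = datetime_format(post_time, "a message was posted at %H:%M:%S on %Y/%m/%d")
default = datetime_format(post_time)

print(message)
print(default)
```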

The following are other functions related to DATETIME:

function name and parameters | description | return type
now() | Returns the current time in DATETIME type. | DATETIME
year(DATETIME date) | Extracts the year of date. | INT
month(DATETIME date) | Extracts the month of date. | INT
day(DATETIME date) | Extracts the day of month of date. | INT
hour(DATETIME date) | Extracts the hour of date. | INT
minute(DATETIME date) | Extracts the minute of date. | INT
second(DATETIME date) | Extracts the second of date. | INT
datetime_add(DATETIME date, INTERVAL int_value time_unit) | INTERVAL is a keyword; time_unit is one of the keywords YEAR, MONTH, DAY, HOUR, MINUTE, or SECOND. The function returns the DATETIME value which is int_value units later than date. For example, datetime_add(now(), INTERVAL 1 MONTH) returns a DATETIME value which is 1 month from now. | DATETIME
datetime_sub(DATETIME date, INTERVAL int_value time_unit) | Same as datetime_add, except that the returned value is int_value units earlier than date. | DATETIME
datetime_diff(DATETIME date1, DATETIME date2) | Returns the difference in seconds of these two DATETIME values: (date1 - date2). | INT
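The relationship between the epoch conversions, datetime_add, datetime_sub, and datetime_diff can be sketched in Python as follows. This is a model, not GSQL: it assumes epoch values are seconds since 1970-01-01 in UTC, and it covers only the fixed-length units (DAY, HOUR, MINUTE, SECOND), because YEAR and MONTH need calendar arithmetic that this document does not specify. The epoch value used is one of the postTime values that appears in the vertex function example output.

```python
from datetime import datetime, timedelta, timezone

# Only fixed-length units are modeled; YEAR and MONTH are intentionally omitted.
FIXED_UNITS = {"DAY": "days", "HOUR": "hours", "MINUTE": "minutes", "SECOND": "seconds"}

def epoch_to_datetime(seconds: int) -> datetime:
    return datetime.fromtimestamp(seconds, tz=timezone.utc)

def datetime_to_epoch(date: datetime) -> int:
    return int(date.timestamp())

def datetime_add(date: datetime, amount: int, unit: str) -> datetime:
    return date + timedelta(**{FIXED_UNITS[unit]: amount})

def datetime_sub(date: datetime, amount: int, unit: str) -> datetime:
    return date - timedelta(**{FIXED_UNITS[unit]: amount})

def datetime_diff(date1: datetime, date2: datetime) -> int:
    """Difference in seconds, computed as date1 - date2."""
    return int((date1 - date2).total_seconds())

posted = epoch_to_datetime(1297054971)
print(posted, datetime_diff(datetime_add(posted, 1, "DAY"), posted))
```

Note that the sign of datetime_diff follows the argument order: an earlier first argument gives a negative result.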

JSONOBJECT and JSONARRAY Functions

JSONOBJECT and JSONARRAY are base types, meaning they can be used as a parameter type, an element type for most accumulators, or a return type.  This enables the input and output of complex, customized data structures. For input and output, a string representation of the JSON is used. Hence, the GSQL query language offers several functions to convert a formatted string into JSON and then to search and access the components of a JSON structure.

Data Conversion Functions

The following parsing functions convert a string into a JSONOBJECT or a JSONARRAY:

function name | description | return type
parse_json_object(STRING str) | Converts str into a JSON object | JSONOBJECT
parse_json_array(STRING str) | Converts str into a JSON array | JSONARRAY

Both functions generate a run-time error if the input string cannot be converted into a JSON object or a JSON array. To be properly formatted, besides having the proper nesting and matching of curly braces  { } and brackets [ ], each value field must be one of the following: a string (in double quotes "), a number, a boolean ( true or false ), or a JSONOBJECT or JSONARRAY. Each key of a key:value pair must be a string in double quotes.

See examples below.

parse_json_object and parse_json_array example
CREATE QUERY jsonEx (STRING strA, STRING strB) FOR GRAPH minimalNet {
  JSONARRAY jsonA;
  JSONOBJECT jsonO;

  jsonA = parse_json_array( strA );
  jsonO = parse_json_object( strB );

  PRINT jsonA, jsonO;
}
Result
GSQL > RUN QUERY jsonEx("[123]","{\"123\":\"123\"}")
or
curl -X GET 'http://localhost:9000/query/jsonEx?strA=\[123\]&strB=\{"123":"123"\}'
{
  "error": false,
  "message": "",
  "results": [
    {
      "jsonA": [ 123 ],
      "jsonO": { "123": "123" }
    }
  ]
}
GSQL > RUN QUERY jsonEx("{123}","{\"123\":\"123\"}")
Runtime Error: {123} cannot be parsed as a json array.


Data Access Methods

JSONOBJECT and JSONARRAY are object classes, each class supporting a set of data access methods, using dot notation:

jsonVariable.functionName(parameter_list)

The following methods (class functions) can act on a JSONOBJECT variable:

method name | description | return type
containsKey(STRING keyStr) | Returns a boolean value indicating whether the JSON object contains the key keyStr. | BOOL
getInt(STRING keyStr) | Returns the numeric value associated with key keyStr as an INT. | INT
getDouble(STRING keyStr) | Returns the numeric value associated with key keyStr as a DOUBLE. | DOUBLE
getString(STRING keyStr) | Returns the string value associated with key keyStr. | STRING
getBool(STRING keyStr) | Returns the bool value associated with key keyStr. | BOOL
getJsonObject(STRING keyStr) | Returns the JSONOBJECT associated with key keyStr. | JSONOBJECT
getJsonArray(STRING keyStr) | Returns the JSONARRAY associated with key keyStr. | JSONARRAY

Each of the above getType(STRING keyStr) methods generates a run-time error if

  1. The key keyStr doesn't exist, or
  2. The method's return type is different than the stored value type. See the following note about numeric data.

Note: Pure JSON stores "numbers" without distinguishing between INT and DOUBLE, but for TigerGraph, if the input value is all digits, it will be stored as INT. Other numeric values are stored as DOUBLE. The getDouble function can read an INT and return its equivalent DOUBLE value, but it is an error to call getInt for a DOUBLE value.
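The asymmetry between getInt and getDouble can be modeled with Python's json module, which likewise parses 1 as an integer and 3.0 as a float. The sketch below is Python, not GSQL; the helpers get_int and get_double are hypothetical stand-ins, with KeyError modeling the "key doesn't exist" error and TypeError modeling the type-mismatch error.

```python
import json

def get_int(obj: dict, key: str) -> int:
    value = obj[key]                      # KeyError models "key doesn't exist"
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"value for {key!r} is not stored as INT")
    return value

def get_double(obj: dict, key: str) -> float:
    value = obj[key]                      # KeyError models "key doesn't exist"
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"value for {key!r} is not numeric")
    return float(value)                   # an INT is widened to its DOUBLE equivalent

doc = json.loads('{"int": 1, "double": 3.0, "string": "xyz", "bool": true}')

print(get_int(doc, "int"), get_double(doc, "int"), get_double(doc, "double"))
```

Calling get_int(doc, "double") raises TypeError in this model, mirroring the rule that getInt cannot read a value stored as DOUBLE.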

The following methods can act on a JSONARRAY variable:

method name | description | return type
size() | Returns the size of this array. | INT
getInt(INT idx) | Returns the numeric value at position idx as an INT. | INT
getDouble(INT idx) | Returns the numeric value at position idx as a DOUBLE. | DOUBLE
getString(INT idx) | Returns the string value at position idx. | STRING
getBool(INT idx) | Returns the bool value at position idx. | BOOL
getJsonObject(INT idx) | Returns the JSONOBJECT value at position idx. | JSONOBJECT
getJsonArray(INT idx) | Returns the JSONARRAY value at position idx. | JSONARRAY

Similar to the methods of JSONOBJECT, each of the above getType(INT idx) methods generates a run-time error if

  1. idx is out of bounds, or
  2. The method's return type is different than the stored value type. See the following note about numeric data.

Note: Pure JSON stores "numbers" without distinguishing between INT and DOUBLE, but for TigerGraph, if the input value is all digits, it will be stored as INT. Other numeric values are stored as DOUBLE. The getDouble function can read an INT and return its equivalent DOUBLE value, but it is an error to call getInt for a DOUBLE value.

Below is an example of using these functions and methods :

JSONOBJECT and JSONARRAY function example
CREATE QUERY jsonEx2 () FOR GRAPH minimalNet {
  JSONOBJECT jsonO, jsonO2;
  JSONARRAY jsonA, jsonA2;
  STRING str, str2;

  str = "{\"int\":1, \"double\":3.0, \"string\":\"xyz\", \"bool\":true, \"obj\":{\"obj\":{\"bool\":false}}, \"arr\":[\"xyz\",123,true] }";
  str2 = "[\"xyz\", 123, false, 5.0]";

  jsonO = parse_json_object( str );
  jsonA = parse_json_array( str2 );

  jsonO2 = jsonO.getJsonObject("obj");
  jsonA2 = jsonO.getJsonArray("arr");

  PRINT jsonO;
  PRINT jsonO.getBool("bool"), jsonO.getJsonObject("obj"), jsonO.getJsonArray("arr"),
        jsonO2.getJsonObject("obj"), jsonA2.getString(0), jsonA.getDouble(3), jsonA.getDouble(1);
}
Result
GSQL > RUN QUERY jsonEx2() { "results": [ { "jsonO": { "arr": ["xyz", 123, true], "bool": true, "double": 3, "int": 1, "obj": { "obj": { "bool": false } }, "string": "xyz" } }, { "jsonO.getBool(bool)": true, "jsonO.getJsonObject(obj)": { "obj": { "bool": false } }, "jsonO.getJsonArray(arr)": ["xyz", 123, true], "jsonO2.getJsonObject(obj)": { "bool": false }, "jsonA2.getString(0)": "xyz", "jsonA.getDouble(3)": 5, "jsonA.getDouble(1)": 123 } ], "error": false, "message": "" }

Vertex, Edge, and Accumulator Functions and Attributes

Accessing attributes

Attributes on vertices or edges are defined in the graph schema. Additionally, each vertex and edge has a built-in STRING attribute called type which represents the user-defined type of that edge or vertex. These attributes, including type , can be accessed for a particular edge or vertex with the dot operator.

For example, the following code snippet shows two different SELECT statements which produce equivalent results. The first uses the dot operator on the vertex variable v to access the "subject" attribute, which is defined in the graph schema. The FROM clause in the first SELECT statement requires that every target vertex be of type "post" (also defined in the graph schema). The second SELECT statement checks that the vertex variable v is a "post" vertex by using the dot operator to access the built-in type attribute.

Accessing vertex variable attributes
CREATE QUERY coffeeRelatedPosts() FOR GRAPH socialNet {
  allVertices = {ANY};
  results = SELECT v
            FROM allVertices:s -(:e)-> post:v
            WHERE v.subject == "coffee";
  PRINT results;

  results = SELECT v
            FROM allVertices:s -(:e)-> :v
            WHERE v.type == "post" AND v.subject == "coffee";
  PRINT results;
}
Result
GSQL > RUN QUERY coffeeRelatedPosts() { "error":false, "message":"", "results":[ { "v_set":"results", "v_id":"4", "v":{ "postTime":"2011-02-07 05:02:51", "subject":"coffee" }, "v_type":"post" }, { "v_set":"results", "v_id":"4", "v":{ "postTime":"2011-02-07 05:02:51", "subject":"coffee" }, "v_type":"post" } ] }

Vertex Functions

Below is a list of built-in functions that can be accessed by vertex aliases, using the dot operator:

Syntax for vertex functions
vertex_alias.function_name(parameter)[.FILTER(condition)]

Currently, these functions are only available for vertex aliases (defined in the FROM clause); vertex variables do not have these functions.

Note that in order to calculate outdegree by edge type, the graph schema must be defined such that vertices keep track of their edge types using WITH STATS="OUTDEGREE_BY_EDGETYPE" (however, "OUTDEGREE_BY_EDGETYPE" is now the default STATS option).

function name | description | return type
outdegree([STRING edgeType]) | Returns the number of outgoing or undirected edges connected to the vertex. If the optional STRING argument edgeType is given, then count only edges of the given edgeType. | INT
neighbors([STRING edgeType]) | Returns the set of ids for the vertices which are out-neighbors or undirected neighbors of the vertex. If the optional STRING argument edgeType is given, then include only those neighbors reachable by edges of the given edgeType. | BagAccum<VERTEX>
neighborAttribute(STRING edgeType, STRING targetVertexType, STRING attribute) | From the given vertex, traverses the given edgeType to the given targetVertexType, and returns the set of values for the given attribute. edgeType can only be a string literal. | BagAccum<attributeType>
edgeAttribute(STRING edgeType, STRING attribute) | From the given vertex, traverses the given edgeType, and returns the set of values for the given edge attribute. edgeType can only be a string literal. | BagAccum<attributeType>
Vertex function examples
CREATE QUERY vertexFunctionExample(vertex<person> m1) FOR GRAPH socialNet {
  SetAccum<Vertex> @neighborSet;
  SetAccum<Vertex> @neighborSet2;
  SetAccum<DATETIME> @attr1;
  BagAccum<DATETIME> @attr2;
  int deg1, deg2, deg3, deg4;

  S = {m1};
  S2 = SELECT S
       FROM S - (posted:e) -> post:t
       ACCUM deg1 = S.outdegree(),
             deg2 = S.outdegree("posted"),
             deg3 = S.outdegree(e.type),   # same as deg2
             STRING str = "posted",
             deg4 = S.outdegree(str);      # same as deg2
  PRINT deg1, deg2, deg3, deg4;

  S3 = SELECT S
       FROM S:s
       POST-ACCUM s.@neighborSet += s.neighbors(),
                  s.@neighborSet2 += s.neighbors("posted"),
                  s.@attr1 += s.neighborAttribute("posted", "post", "postTime"),
                  s.@attr2 += s.edgeAttribute("liked", "actionTime");
  PRINT S3;
}
Result
GSQL > RUN QUERY vertexFunctionExample("person5") { "results":[ { "deg1":5, "deg2":2, "deg3":2, "deg4":2 }, { "v_id":"person5", "v_type":"person", "v_set":"S3", "v":{ "id":"", "gender":"Female", "@attr2":[1263330725], "@attr1":[1296694941, 1297054971], "@neighborSet":["6", "11", "person4", "4", "person7"], "@neighborSet2":["11", "4"] } } ], "error":false, "message":"" }

.FILTER

The optional .FILTER(condition) clause offers an additional filter for selecting which elements are added to the output set of the neighbors, neighborAttribute, and edgeAttribute functions. The condition is evaluated for each element. If the condition is true, the element is added to the output set; if false, it is not. An example is shown below:

Example: vertex functions with optional filter
CREATE QUERY filterEx (SET<STRING> pIds, INT yr) FOR GRAPH workNet {
  SetAccum<vertex<company>> @recentEmplr, @allEmplr;
  BagAccum<string> @diffCountry, @allCountry;

  Start = {person.*};

  L0 = SELECT v
       FROM Start:v
       WHERE v.id IN pIds
       ACCUM
         # filter using edge attribute
         v.@recentEmplr += v.neighbors("worksFor").filter(worksFor.startYear >= yr),
         v.@allEmplr += v.neighbors("worksFor").filter(true),

         # vertex alias attribute and neighbor type attribute
         v.@diffCountry += v.neighborAttribute("worksFor", "company", "id")
                            .filter(v.locationId != company.country),
         v.@allCountry += v.neighborAttribute("worksFor", "company", "id")
       ;

  PRINT yr, L0.@recentEmplr, L0.@allEmplr, L0.@diffCountry, L0.@allCountry;
}
Result
GSQL > RUN QUERY filterEx(["person1","person2"],2016) { "error": false, "message": "", "results": [ { "v_set": "L0", "v_id": "person1", "v": { "L0.@diffCountry": ["company2"], "L0.@recentEmplr": ["company1"], "L0.@allCountry": [ "company1", "company2" ], "yr": 2016, "L0.@allEmplr": [ "company2", "company1" ] }, "v_type": "person" }, { "v_set": "L0", "v_id": "person2", "v": { "L0.@diffCountry": ["company1"], "L0.@recentEmplr": [], "L0.@allCountry": [ "company1", "company2" ], "yr": 2016, "L0.@allEmplr": [ "company2", "company1" ] }, "v_type": "person" } ] }


Edge Functions

Below are the built-in functions that can be accessed by edge aliases, using the dot operator. Edge functions follow the same general rules as vertex functions (see above).

function name | description | return type
isDirected() | Returns a boolean value indicating whether this edge is directed or undirected. | BOOL


Accumulator Functions

Accumulator functions for each accumulator type are described in the "Accumulator Type" section.

Set/Bag Expression and Operators

SELECT blocks take an input vertex set and perform various selection and filtering operations to produce an output set. Therefore, set/bag expressions and their operators are a useful and powerful part of the GSQL query language. A set/bag expression can use either SetAccum or BagAccum.

BNF
setBagExpr := ["@@"] name
            | name "." ["@"] name
            | name "." "@" name ("." name "(" [argList] ")")+
            | name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
            | "@@" name ("." name "(" [argList] ")")+
            | setBagExpr (UNION | INTERSECT | MINUS) setBagExpr
            | "(" argList ")"
            | "(" setBagExpr ")"

Set/Bag Expression Operators - UNION, INTERSECT, MINUS

The operators are straightforward. When both operands are sets, the result expression is a set. When at least one operand is a bag, the result expression is a bag. If one operand is a bag and the other is a set, the operator treats the set operand as a bag containing one of each value.

Set Operators - Example
// A is a SET<INT> with value (1, 2, 3, 4)
// B is a SET<INT> with value (2, 4, 6, 8)
// C is a SET<INT>
C = A UNION B      // C = (1, 2, 3, 4, 6, 8)
C = A INTERSECT B  // C = (2, 4)
C = A MINUS B      // C = (1, 3)

// D is a BAG<INT> with value (1, 2, 2, 3)
// E is a BAG<INT> with value (2, 3, 5, 7)
// F is a BAG<INT>
F = D UNION E      // F = (1, 2, 2, 2, 3, 3, 5, 7)
F = D INTERSECT E  // F = (2, 3)
F = D MINUS E      // F = (1, 2)
F = D MINUS A      // F = (2)
F = D UNION A      // F = (1, 1, 2, 2, 2, 3, 3, 4) because D UNION A is a bag
F = A UNION B      // F = (1, 2, 3, 4, 6, 8) because A UNION B is a set
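The bag results in this example follow multiset arithmetic: UNION adds occurrence counts, INTERSECT keeps the smaller count, and MINUS subtracts counts without going below zero. Python's collections.Counter has the same three operations, so it can serve as a model (this is Python, not GSQL). The helper bag is hypothetical and simply lists a multiset's elements in sorted order.

```python
from collections import Counter

def bag(counter: Counter) -> list:
    """List the elements of a multiset in sorted order, with repeats."""
    return sorted(counter.elements())

A = {1, 2, 3, 4}                 # a set
D = Counter([1, 2, 2, 3])        # a bag
E = Counter([2, 3, 5, 7])        # a bag

union_de = bag(D + E)            # UNION adds the counts
intersect_de = bag(D & E)        # INTERSECT keeps the minimum count
minus_de = bag(D - E)            # MINUS subtracts counts, floor at zero
minus_da = bag(D - Counter(A))   # the set A is treated as a bag, one of each value
union_da = bag(D + Counter(A))

print(union_de, intersect_de, minus_de, minus_da, union_da)
```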

The result of these operations is another set/bag expression, so these operations can be nested and chained to form more complex operations, such as

(setBagExpr_A INTERSECT (setBagExpr_B UNION setBagExpr_C) ) MINUS setBagExpr_D

Set/Bag Expression Membership Operators

For example, suppose setBagExpr_A is ("a", "b", "c"):

"a" IN setBagExpr_A => true
"d" IN setBagExpr_A => false
"a" NOT IN setBagExpr_A => false
"d" NOT IN setBagExpr_A => true

The IN and NOT IN operators support all base types on the left-hand side, and any set/bag expression on the right-hand side. The base type must be the same as the accumulator's element type. IN and NOT IN return a BOOL value.

The following example uses NOT IN to exclude neighbors that are on a blacklist.

Set Membership example
CREATE QUERY friendsNotInblacklist (VERTEX<person> seed, SET<VERTEX<person>> blackList) FOR GRAPH socialNet {
  Start = {seed};
  Result = SELECT v
           FROM Start:s-(friend:e)-person:v
           WHERE v NOT IN blackList;
  PRINT Result;
}
Result
GSQL > RUN QUERY friendsNotInblacklist("person1", ["person2"]) { "error": false, "message": "", "results": [{ "v_set": "Result", "v_id": "person8", "v": { "gender": "Male", "id": "person8" }, "v_type": "person" }] }

Aggregation Functions - COUNT, SUM, MIN, MAX, AVG

Each aggregation function takes a set/bag expression as its input parameter and returns one value or element.

  • count() : Returns the size (INT) of the set or bag.
  • sum() : Returns the sum of all elements. This is only applicable to a set/bag expression with numeric type.
  • min() : Returns the member with minimum value. This is only applicable to a set/bag expression with numeric type.
  • max() : Returns the member with maximum value. This is only applicable to a set/bag expression with numeric type.
  • avg() : Returns the average of all elements. This is only applicable to a set/bag expression with numeric type. The average is INT if the element type of the set/bag expression is INT.
Aggregation function example
CREATE QUERY aggregateFuncEx() FOR GRAPH minimalNet {
  BagAccum<INT> @@t;
  @@t += -5;
  @@t += 2;
  @@t += -1;
  PRINT max(@@t), min(@@t), avg(@@t), count(@@t), sum(@@t);
}
Result
GSQL > RUN QUERY aggregateFuncEx() { "results": [ { "max(@@t)": 2, "min(@@t)": -5, "avg(@@t)": -1, "count(@@t)": 3, "sum(@@t)": -4 } ], "error": false, "message": "" }
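In this result, avg(@@t) is -1 even though the exact average of -5, 2, and -1 is about -1.33. That is consistent with the INT average dropping its fractional part toward zero; this rounding direction is inferred from the output above rather than stated by the specification. The Python sketch below (not GSQL) reproduces the result with a hypothetical helper avg_int, and contrasts it with Python's floor division, which would give -2.

```python
def avg_int(values):
    """Integer average that drops the fractional part toward zero."""
    total = sum(values)
    quotient = abs(total) // len(values)
    return quotient if total >= 0 else -quotient

t = [-5, 2, -1]

print(max(t), min(t), avg_int(t), len(t), sum(t))
print(sum(t) // len(t))   # Python floor division rounds down instead
```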

Miscellaneous Functions

SelectVertex()

SelectVertex() reads a data file which lists particular vertices of the graph and returns the corresponding vertex set. This function can only be used in a vertex set variable declaration statement as a seed set. The data file must be organized as a table with one or more columns.  One column must be for vertex id.  Optionally, another column is for vertex type. SelectVertex() has five parameters explained in the below table: filePath, vertexIdColumn, vertexTypeColumn, separator, and header. The rules for column separators and column headings are the same as for the GSQL Loader.


parameter name | type | description
filePath | string | The absolute file path of the input file to be read. A relative path is not supported.
vertexIdColumn | $ num, or $ "column_name" if header is true | The vertex id column position.
vertexTypeColumn | $ num, $ "column_name" if header is true, or a vertex type | The vertex type column position or a specific vertex type.
separator | single-character string | The column separator character.
header | bool | Whether this file has a header.

One vertex set variable declaration statement can have multiple SelectVertex() function calls. However, if a declaration statement has multiple SelectVertex() calls referring to the same file, they must use the same separator and header parameters. If any row of the file contains an invalid vertex type, a run-time error occurs; if any row of the file contains a nonexistent vertex id, a warning message is shown with the count of nonexistent ids.

Below is a query example using SelectVertex calls, reading from the data file selectVertexInput.csv.

selectVertexInput.csv
c1,c2,c3
person1,person,3
person5,person,4
person6,person,5


selectVertex example
SET sys.data_root = "." # change this to the absolute path
CREATE QUERY selectVertexEx(STRING filename) FOR GRAPH socialNet {
  S = {SelectVertex(filename, $"c1", $1, ",", true),
       # SelectVertex(filename, $2, post, ",", false)  # illegal
       SelectVertex(filename, $2, post, ",", true)
       # SelectVertex("$sys.data_root/selectVertexInput.csv", $2, post, ",", true)  # need to change line 1 to the absolute path of the directory
      };
  PRINT S;
}
Result
GSQL > RUN QUERY selectVertexEx("/file_directory/selectVertexInput.csv") { "error": false, "message": "", "results": [ { "v_set": "S", "v_id": "person1", "v": {"gender": "Male"}, "v_type": "person" }, { "v_set": "S", "v_id": "person5", "v": {"gender": "Female"}, "v_type": "person" }, { "v_set": "S", "v_id": "person6", "v": {"gender": "Male"}, "v_type": "person" }, { "v_set": "S", "v_id": "3", "v": { "postTime": "2011-02-05 01:02:44", "subject": "cats" }, "v_type": "post" }, { "v_set": "S", "v_id": "4", "v": { "postTime": "2011-02-07 05:02:51", "subject": "coffee" }, "v_type": "post" }, { "v_set": "S", "v_id": "5", "v": { "postTime": "2011-02-06 01:02:02", "subject": "tigergraph" }, "v_type": "post" } ] }
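To see how the two SelectVertex() calls in this example yield six vertices from a three-row file, the column selection can be modeled in Python (not GSQL). The sketch assumes that numeric column positions are zero-based, which is inferred from the example ($1 selects column c2 and $2 selects column c3). The function select_vertex and its keyword arguments are hypothetical, and it reads from a string instead of a file path so that it is self-contained.

```python
import csv
import io

CSV_TEXT = "c1,c2,c3\nperson1,person,3\nperson5,person,4\nperson6,person,5\n"

def select_vertex(text, id_col, type_col=None, fixed_type=None, sep=",", header=True):
    """Return a set of (vertex_id, vertex_type) pairs read from delimited text."""
    rows = list(csv.reader(io.StringIO(text), delimiter=sep))
    names = rows[0] if header else []
    data = rows[1:] if header else rows

    def index(col):
        # A string is a column name (needs a header); an int is a position.
        return names.index(col) if isinstance(col, str) else col

    result = set()
    for row in data:
        vtype = fixed_type if fixed_type is not None else row[index(type_col)]
        result.add((row[index(id_col)], vtype))
    return result

# Mirrors SelectVertex(filename, $"c1", $1, ",", true) and
#         SelectVertex(filename, $2, post, ",", true)
seed = select_vertex(CSV_TEXT, "c1", type_col=1) | select_vertex(CSV_TEXT, 2, fixed_type="post")

print(sorted(seed))
```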

to_vertex() and to_vertex_set()

to_vertex() and to_vertex_set() convert a string or a string set into a vertex or a vertex set, respectively, of a given vertex type. These two functions are useful when the vertex id(s) are obtained and only known at run-time.

Running these functions requires real-time conversion of an external id to a GSQL internal id, which is a relatively slow process. Therefore,

  1. If the user can always know the id before running the query, define the query with VERTEX or SET<VERTEX> parameters instead of STRING or SET<STRING> parameters, and avoid calling to_vertex() or to_vertex_set().
  2. Calling to_vertex_set() one time is much faster than calling to_vertex() multiple times. Use to_vertex_set() instead of to_vertex() as much as possible.

The first parameter of to_vertex() is the vertex id string. The first parameter of to_vertex_set() is a string set representing vertex ids. The second parameter of both functions is the vertex type string.

Function signatures for to_vertex() and to_vertex_set()
VERTEX to_vertex(STRING id, STRING vertex_type)
SET<VERTEX> to_vertex_set(SET<STRING>, STRING vertex_type)

If the vertex id or the vertex type doesn't exist, to_vertex() will have a run-time error, as shown below. However, to_vertex_set() will have a run-time error only if the vertex type doesn't exist. If one or more vertex ids are nonexistent, to_vertex_set() will display a warning message but will still run, converting all valid ids and skipping nonexistent vertex ids. If the user wants an error instead of a warning if a nonexistent id is given when converting a string set to a vertex set, the user can use to_vertex() inside a FOREACH loop, instead of to_vertex_set(). See the example below .

to_vertex() and to_vertex_set() example
CREATE QUERY to_vertex_set (SET<STRING> uids, STRING uid, STRING vtype) FOR GRAPH workNet {
  SetAccum<VERTEX> @@v2, @@v3;
  SetAccum<STRING> @@strSet;
  VERTEX v;

  v = to_vertex (uid, vtype);           # to_vertex assigned to a vertex variable
  PRINT v;                              # vertex variable -> only vertex id is printed

  @@v2 += to_vertex (uid, vtype);       # to_vertex accumulated to a SetAccum<VERTEX>
  PRINT @@v2;                           # SetAccum of vertex -> only vertex ids are printed

  S2 = to_vertex_set (uids, vtype);     # to_vertex_set assigned to a vertex set variable
  PRINT S2;                             # vertex set variable -> full details printed

  @@strSet = uids;                      # Show SET<STRING> & SetAccum<STRING> are the same
  S3 = to_vertex_set(@@strSet, vtype);  # Input to to_vertex_set is SetAccum<STRING>
  SDIFF = S2 MINUS S3;                  # Now S2 = S3, so SDIFF is empty
  PRINT SDIFF.size();

  #FOREACH vid in uids DO    # In this case non-existing ids in uids causes run-time error
  #  @@v3 += to_vertex( vid, vtype );
  #END;
  #L3 = @@v3;
  #PRINT L3;
}
Result
GSQL > RUN QUERY to_vertex_set(["person1","personx","person2"], "person3", "person") { "error": false, "message": "Runtime Warning: 1 ids are invalid person vertex ids.", "results": [ {"v": "person3"}, {"@@v2": ["person3"]}, { "v_set": "S2", "v_id": "person1", "v": { "interestList": ["management","financial"], "skillSet": [3,2,1], "skillList": [1, 2,3], "locationId": "us", "interestSet": ["financial","management"], "id": "person1" }, "v_type": "person" }, { "v_set": "S2", "v_id": "person2", "v": { "interestList": ["engineering"], "skillSet": [6,5,3,2], "skillList": [2,3,5,6], "locationId": "chn", "interestSet": ["engineering"], "id": "person2" }, "v_type": "person" }, {"SDIFF.size()": 0} ] } GSQL > RUN QUERY to_vertex_set(["person1","personx"], "person1", "abc") Runtime Error: abc is not valid vertex type.
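The difference between the strict single-id conversion and the lenient batch conversion can be modeled outside GSQL. The C++ sketch below only illustrates the behavior described above; it is not TigerGraph's implementation. It assumes vertex ids are plain strings and that `known_ids` stands in for the ids stored in the graph. The names `IdSet`, `BatchResult`, `to_vertex_model`, `to_vertex_set_model`, and `strict_conversion_fails` are hypothetical.

```cpp
#include <cassert>
#include <set>
#include <stdexcept>
#include <string>

// Hypothetical model: a vertex is represented by its id string, and
// known_ids plays the role of the ids that exist in the graph.
typedef std::set<std::string> IdSet;

// Result of the lenient batch conversion: valid ids plus a count of skipped ones.
struct BatchResult {
  IdSet converted;
  int invalid_count;
};

// Models to_vertex(): a nonexistent id is an error.
inline std::string to_vertex_model(const IdSet& known_ids, const std::string& id) {
  if (known_ids.count(id) == 0) {
    throw std::runtime_error("invalid vertex id: " + id);
  }
  return id;
}

// Models to_vertex_set(): nonexistent ids are counted (the warning) and skipped.
inline BatchResult to_vertex_set_model(const IdSet& known_ids, const IdSet& ids) {
  BatchResult result;
  result.invalid_count = 0;
  for (IdSet::const_iterator it = ids.begin(); it != ids.end(); ++it) {
    if (known_ids.count(*it) > 0) {
      result.converted.insert(*it);
    } else {
      ++result.invalid_count;
    }
  }
  return result;
}

// Models the FOREACH + to_vertex() pattern: returns true if any id is rejected.
inline bool strict_conversion_fails(const IdSet& known_ids, const IdSet& ids) {
  try {
    for (IdSet::const_iterator it = ids.begin(); it != ids.end(); ++it) {
      to_vertex_model(known_ids, *it);
    }
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```

With known ids person1, person2, and person3 and the input person1, personx, person2, the batch model converts two ids and reports one invalid id, while the strict model fails. This mirrors the warning in the sample output above.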

COALESCE()

The COALESCE function evaluates each argument value in order, and returns the first value which is not NULL. This evaluation is the same as that used for IS NULL and IS NOT NULL. The COALESCE function requires that all its arguments have the same data type (BOOL, INT, FLOAT, DOUBLE, STRING, or VERTEX). The only exception is that different numeric types can be used together. In this case, all values are converted to the type of the first argument.

coalesce function example
CREATE QUERY coalesceFuncEx (INT p1, DOUBLE p2) FOR GRAPH minimalNet {
  PRINT COALESCE(p1, p2, 999.5);  # p2 and the last value will be converted into first argument type, which is INT.
}


Result
GSQL > RUN QUERY coalesceFuncEx(_,_) { "results": [ { "coalesce(p1,p2,999.5)": 999 } ], "error": false, "message": "" } GSQL > RUN QUERY coalesceFuncEx(1,2) { "results": [ { "coalesce(p1,p2,999.5)": 1 } ], "error": false, "message": "" } GSQL > RUN QUERY coalesceFuncEx(_,2.5) { "error": false, "message": "", "results": [ { "coalesce(p1,p2,999.5)": 2 } ] }
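The three sample runs above follow from two rules: pick the first non-NULL argument, then convert it to the type of the first argument (INT here). The C++ sketch below models only those two rules, under stated assumptions. `NullableNum`, `null_num`, `num`, and `coalesce_as_int` are hypothetical names. The conversion truncates toward zero, which agrees with the sample output for positive values. Returning 0 when every argument is NULL assumes that 0 is the INT default.

```cpp
#include <cassert>
#include <initializer_list>

// Hypothetical stand-in for a nullable numeric GSQL value.
struct NullableNum {
  bool is_null;
  double value;
};

inline NullableNum null_num() { return {true, 0.0}; }
inline NullableNum num(double v) { return {false, v}; }

// Models COALESCE when the first argument's type is INT:
// scan left to right, return the first non-NULL value converted to an integer.
inline long long coalesce_as_int(std::initializer_list<NullableNum> args) {
  for (const NullableNum& a : args) {
    if (!a.is_null) {
      return static_cast<long long>(a.value);  // truncation: 999.5 -> 999, 2.5 -> 2
    }
  }
  return 0;  // assumption: the INT default value is 0
}
```

For example, `coalesce_as_int({null_num(), num(2.5), num(999.5)})` models `coalesceFuncEx(_,2.5)` and yields 2.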

The COALESCE function is useful when multiple optional parameters are allowed, and one of them must be chosen if available. For example,

coalesce function example
CREATE QUERY coalesceFuncEx2 (STRING homePhoneNumber, STRING cellPhoneNumber, STRING companyPhoneNumber) FOR GRAPH socialNet { PRINT "contact number:" + COALESCE(homePhoneNumber, cellPhoneNumber, companyPhoneNumber, "N/A"); }

The COALESCE function's parameter list should have a default value as the last argument. Otherwise, if all values are NULL, the default value of the data type is returned.


Dynamic Expressions with EVALUATE

The function evaluate() takes a string argument and interprets it as an expression which is evaluated during run-time. This enables users to create a general purpose query instead of separate queries for each specific computation.

evaluate(expressionStr, typeStr)

The evaluate() function has two parameters: expressionStr is the expression string, and typeStr is a string literal indicating the type of expression. This function returns a value whose type is typeStr and whose value is the evaluation of expressionStr. The following rules apply:

  1. evaluate() can only be used inside a SELECT block, and only inside a WHERE clause, ACCUM clause, POST-ACCUM clause, HAVING clause, or ORDER BY clause. It cannot be used in a LIMIT clause or outside a SELECT block.
  2. The result type must be specified at query installation time: typeStr must be a string literal for a primitive data type, e.g., one of "int", "float", "double", "bool", "string" (case insensitive). The default value is "bool".
  3. In expressionStr, identifiers can refer only to vertex or edge aliases, vertex-attached accumulators, global accumulators, parameters, or scalar function calls involving the above variables. The expression may not refer to local variables, global variables, or to FROM clause vertices or edges by type.
  4. Any accumulators in the expression must be scalar accumulators (e.g., MaxAccum) for primitive-type data. Container accumulators (e.g., SetAccum) or scalar accumulators with non-primitive type (e.g. VERTEX, EDGE, DATETIME) are not supported. Container type attributes are not supported.

  5. evaluate() cannot be nested.

The following situations generate a run-time error:

  1. The expression string expressionStr cannot be compiled (unless the error is due to a non-existent vertex or edge attribute).
  2. The result type of the expression does not match the parameter typeStr.

Silent failure conditions

If any of the following conditions occur, the query may continue running, but the entire clause or statement in which the evaluate() function resides will fail, without producing a run-time error message. For conditional clauses (WHERE, HAVING), a failing evaluate() clause is treated as if the condition is false. An assignment statement with a failing evaluate() will not execute, and an ORDER BY clause with a failing evaluate() will not sort.

  1. The expression references a non-existent attribute of a vertex or edge alias.
  2. The expression uses an operator for non-compatible operation. For example, 123 == "xyz".


The following example employs dynamic expressions in both the WHERE condition and the accumulator value in the POST-ACCUM clause.

Evaluate example
CREATE QUERY evaluateEx (STRING whereCond = "TRUE", STRING postAccumIntExpr = "1") FOR GRAPH socialNet {
  SetAccum<INT> @@timeSet;
  MaxAccum<INT> @latestLikeTime, @latestLikePostTime;

  S = {person.*};
  S2 = SELECT s
       FROM S:s - (liked:e) -> post:t
       WHERE evaluate(whereCond)
       ACCUM s.@latestLikeTime += datetime_to_epoch( e.actionTime ),
             s.@latestLikePostTime += datetime_to_epoch( t.postTime )
       POST-ACCUM @@timeSet += evaluate(postAccumIntExpr, "int")
       ;
  PRINT @@timeSet;
}
Result
GSQL > RUN QUERY evaluateEx(_,_)
{
  "error": false,
  "message": "",
  "results": [{"@@timeSet": [1]}]
}
GSQL > RUN QUERY evaluateEx("s.gender==\"Male\"", "s.@latestLikePostTime")
{
  "error": false,
  "message": "",
  "results": [{"@@timeSet": [1263295325,1296752752,1297054971,1296788551]}]
}
GSQL > RUN QUERY evaluateEx("s.gender==\"Female\"", "s.@latestLikeTime + 1")
{
  "error": false,
  "message": "",
  "results": [{"@@timeSet": [1263293536,1263352566,1263330726]}]
}
GSQL > RUN QUERY evaluateEx("xx", _)
Runtime Error: xx is undefined parameter.
GSQL > RUN QUERY evaluateEx("e.xyz", _)    # The attribute doesn't exist, so the entire condition in WHERE clause is false.
{
  "error": false,
  "message": "",
  "results": [{"@@timeSet": []}]
}
GSQL > RUN QUERY evaluateEx("e.actionTime", _)
Runtime Error: actionTime is not a primitive type attribute.
GSQL > RUN QUERY evaluateEx("s.id", _)
Runtime Error: Expression 's.id' value type is not bool.
GSQL > RUN QUERY evaluateEx("s.gender==\"Female\"", "s.xx")    # The attribute doesn't exist, so the entire assignment is skipped.
{
  "error": false,
  "message": "",
  "results": [{"@@timeSet": []}]
}

Queries as Functions

A query that has been defined (with a CREATE QUERY ... RETURNS statement) can be treated as a callable function. A query can call itself recursively.

The following limitations apply to queries calling queries:

  1. Each parameter of the called query may be one of the following types:
    1. Primitives: INT, UINT, FLOAT, DOUBLE, STRING, BOOL
    2. VERTEX
    3. A Set or Bag of primitive or VERTEX elements
  2. The return value may be one of the following types. See also the "Return Statement" section.
    1. Primitives: INT, UINT, FLOAT, DOUBLE, STRING, BOOL
    2. VERTEX
    3. a vertex set (e.g., the result of a SELECT statement)
    4. An accumulator of primitive types. GroupByAccum and accumulators containing tuples are not supported.
  3. A query which returns a SetAccum or BagAccum may be called with a Set or Bag argument, respectively.
  4. The order of definition matters. A query cannot call a query which has not yet been defined.


Subquery Example 1
CREATE QUERY subquery1 (VERTEX<person> m1) FOR GRAPH socialNet RETURNS(BagAccum<VERTEX<post>>)
{
  Start = {m1};
  L = SELECT t
      FROM Start:s - (liked:e) - post:t;
  RETURN L;
}

CREATE QUERY mainquery1 () FOR GRAPH socialNet
{
  BagAccum<VERTEX<post>> @@testBag;
  Start = {person.*};
  Start = SELECT s FROM Start:s
          ACCUM @@testBag += subquery1(s);
  PRINT @@testBag;
}

User-Defined Functions

Users can define their own expression functions in C++ in <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp. Only bool, int, float, double, and string (NOT std::string) are allowed as the return value type and the function argument type. However, any C++ type is allowed inside a function body. Once defined, the new functions will be added into GSQL automatically the next time GSQL is executed.

If a user-defined struct or a helper function needs to be defined, define it in <tigergraph.root.dir>/dev/gdk/gsql/src/QueryUdf/ExprUtil.hpp.

Here is an example:

new code in ExprFunctions.hpp
#include <algorithm>  // for std::reverse
inline bool greater_than_three (double x) {
  return x > 3;
}
inline string reverse(string str){
  std::reverse(str.begin(), str.end());
  return str;
}
user defined expression function
CREATE QUERY udfExample() FOR GRAPH minimalNet {
  DOUBLE x;
  BOOL y;

  x = 3.5;
  PRINT greater_than_three(x);
  y = greater_than_three(2.5);
  PRINT y;

  PRINT reverse("abc");
}
Result
GSQL > RUN QUERY udfExample() { "results": [ { "greater_than_three(x)": true }, { "y": false }, { "reverse(abc)": "cba" } ], "error": false, "message": "" }


If any code in ExprFunctions.hpp or ExprUtil.hpp causes a compilation error, GSQL cannot install any GSQL query, even if the GSQL query doesn't call any user-defined function. Therefore, please test each new user-defined expression function after adding it. One way to test a function is to create a new file, test.cpp, then compile and run it:
> g++ test.cpp
> ./a.out
You might need to remove the include header #include <gle/engine/cpplib/headers.hpp> in ExprFunctions.hpp and ExprUtil.hpp in order to compile.

test.cpp
#include "ExprFunctions.hpp"
#include <iostream>
int main () {
  std::cout << to_string (123) << std::endl;      // to_string and str_to_int are two built-in functions in ExprFunctions.hpp
  std::cout << str_to_int ("123") << std::endl;
  return 0;
}
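Printing values, as test.cpp does, requires reading the output by eye. A small self-checking harness is one alternative. The sketch below is a standalone illustration that does not include the TigerGraph headers: `greater_than_three` is copied from the example above, while `clamp_int` and `count_udf_failures` are hypothetical names introduced here. It uses only the permitted argument and return types (bool, int, double).

```cpp
#include <cassert>

// Copied from the ExprFunctions.hpp example above.
inline bool greater_than_three(double x) {
  return x > 3;
}

// Hypothetical second user-defined function, added only for illustration.
inline int clamp_int(int value, int low, int high) {
  if (value < low) return low;
  if (value > high) return high;
  return value;
}

// Hypothetical harness: returns the number of failed checks, so a build
// script can treat any nonzero value as a failure.
inline int count_udf_failures() {
  int failures = 0;
  if (!greater_than_three(3.5)) ++failures;
  if (greater_than_three(2.5)) ++failures;
  if (greater_than_three(3.0)) ++failures;      // boundary: 3 is not greater than 3
  if (clamp_int(15, 0, 10) != 10) ++failures;   // above the range
  if (clamp_int(-4, 0, 10) != 0) ++failures;    // below the range
  if (clamp_int(7, 0, 10) != 7) ++failures;     // inside the range
  return failures;
}
```

Calling `count_udf_failures()` from `main` and returning its value as the exit code would let `./a.out` signal failure without any manual inspection.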


Examples of Expressions

Below is a list of examples of expressions. Note that ( argList ) is a set/bag expression, while [ argList ] is a list expression.


Expression Examples
#Show various types of expressions
CREATE QUERY expressionEx() FOR GRAPH workNet {
  TYPEDEF tuple<STRING countryName, STRING companyName> companyInfo;

  ListAccum<STRING> @companyNames;
  SumAccum<INT> @companyCount;
  SumAccum<INT> @numberOfRelationships;
  ListAccum<companyInfo> @info;
  MapAccum< STRING,ListAccum<STRING> > @@companyEmployeeRelationships;
  SumAccum<INT> @@totalRelationshipCount;

  ListAccum<INT> @@valueList;
  SetAccum<INT> @@valueSet;

  SumAccum<INT> @@a;
  SumAccum<INT> @@b;

  #expr := constant
  @@a = 10;

  #expr := ["@@"] name
  @@b = @@a;

  #expr := expr mathOperator expr
  @@b = @@a + 5;

  #expr := "(" expr ")"
  @@b = (@@a + 5);

  #expr := "-" expr
  @@b = -(@@a + 5);

  PRINT @@a, @@b;

  #expr := "[" argList "]"  // a list
  @@valueList = [1,2,3,4,5];
  @@valueList += [24,80];

  #expr := "(" argList ")"  // setBagExpr
  @@valueSet += (1,2,3,4,5);

  #expr := ( COUNT | ISEMPTY | MAX | MIN | AVG | SUM ) "(" setBagExpr ")"
  PRINT MAX(@@valueList);
  PRINT AVG(@@valueList);

  seed = {ANY};

  company1 = SELECT t FROM seed:s -(worksFor)-> :t WHERE (s.id == "company1");
  company2 = SELECT t FROM seed:s -(worksFor)-> :t WHERE (s.id == "company2");

  #expr := setBagExpr
  worksForBoth = company1 INTERSECT company2;
  PRINT worksForBoth;

  #expr := name "." "type"
  employees = SELECT s FROM seed:s WHERE (s.type == "person");

  employees = SELECT s FROM employees:s -(worksFor)-> :t
    ACCUM
      #expr := name "." ["@"] name
      s.@companyNames += t.id,

      #expr := name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
      s.@numberOfRelationships += s.outdegree(),

      #expr := name ["<" type ["," type]* ">"] "(" [argList] ")"
      s.@info += companyInfo(t.country, t.id)

    POST-ACCUM
      #expr := name "." "@" name ("." name "(" [argList] ")")+ ["." name]
      s.@companyCount += s.@companyNames.size(),

      #expr := name "." "@" name ["\'"]
      @@totalRelationshipCount += s.@companyCount,

      FOREACH comp IN s.@companyNames DO
        #expr := "(" argList "->" argList ")"
        @@companyEmployeeRelationships += (s.id -> comp)
      END;

  PRINT employees;
  PRINT @@totalRelationshipCount;
  PRINT @@companyEmployeeRelationships;

  #expr := "@@" name ("." name "(" [argList] ")")+ ["." name]
  PRINT @@companyEmployeeRelationships.size();
}


Result
GSQL > RUN QUERY expressionEx() { "error": false, "message": "", "results": [ { "@@a": 10, "@@b": -15 }, {"max(@@valueList)": 80}, {"avg(@@valueList)": 17}, { "v_set": "worksForBoth", "v_id": "person1", "v": { "interestList": ["management","financial"], "@companyCount": 0, "@numberOfRelationships": 0, "skillSet": [3,2,1,0], "skillList": [0,1,2,3], "locationId": "us", "interestSet": ["financial","management"], "@info": [], "id": "person1", "@companyNames": [] }, "v_type": "person" }, { "v_set": "worksForBoth", "v_id": "person2", "v": { "interestList": ["engineering"], "@companyCount": 0, "@numberOfRelationships": 0, "skillSet": [6,3,2,5,0], "skillList": [0,2,3,5,6], "locationId": "chn", "interestSet": ["engineering"], "@info": [], "id": "person2", "@companyNames": [] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person1", "v": { "interestList": ["management","financial"], "@companyCount": 2, "@numberOfRelationships": 4, "skillSet": [3,2,1,0], "skillList": [0,1,2,3], "locationId": "us", "interestSet": ["financial","management"], "@info": [ { "companyName": "company1", "countryName": "us" }, { "companyName": "company2", "countryName": "chn" } ], "id": "person1", "@companyNames": ["company1","company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person2", "v": { "interestList": ["engineering"], "@companyCount": 2, "@numberOfRelationships": 4, "skillSet": [6,3,2,5,0], "skillList": [0,2,3,5,6], "locationId": "chn", "interestSet": ["engineering"], "@info": [ { "companyName": "company1", "countryName": "us" }, { "companyName": "company2", "countryName": "chn" } ], "id": "person2", "@companyNames": ["company1","company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person3", "v": { "interestList": ["teaching"], "@companyCount": 1, "@numberOfRelationships": 1, "skillSet": [6,1,4,0], "skillList": [0,4,1,6], "locationId": "jp", "interestSet": ["teaching"], "@info": [{ "companyName": "company1", "countryName": "us" }], "id": "person3", 
"@companyNames": ["company1"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person4", "v": { "interestList": ["football"], "@companyCount": 1, "@numberOfRelationships": 1, "skillSet": [10,1,4,0], "skillList": [0,4,1,10], "locationId": "us", "interestSet": ["football"], "@info": [{ "companyName": "company2", "countryName": "chn" }], "id": "person4", "@companyNames": ["company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person5", "v": { "interestList": ["sport","financial","engineering"], "@companyCount": 1, "@numberOfRelationships": 1, "skillSet": [2,8,5,0], "skillList": [0,8,2,5], "locationId": "can", "interestSet": ["engineering","financial","sport"], "@info": [{ "companyName": "company2", "countryName": "chn" }], "id": "person5", "@companyNames": ["company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person6", "v": { "interestList": ["music","art"], "@companyCount": 1, "@numberOfRelationships": 1, "skillSet": [10,7,0], "skillList": [0,7,10], "locationId": "jp", "interestSet": ["art","music"], "@info": [{ "companyName": "company1", "countryName": "us" }], "id": "person6", "@companyNames": ["company1"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person7", "v": { "interestList": ["art","sport"], "@companyCount": 2, "@numberOfRelationships": 4, "skillSet": [6,8,0], "skillList": [0,8,6], "locationId": "us", "interestSet": ["sport","art"], "@info": [ { "companyName": "company2", "countryName": "chn" }, { "companyName": "company3", "countryName": "jp" } ], "id": "person7", "@companyNames": ["company2","company3"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person8", "v": { "interestList": ["management"], "@companyCount": 1, "@numberOfRelationships": 1, "skillSet": [2,1,5,0], "skillList": [0,1,5,2], "locationId": "chn", "interestSet": ["management"], "@info": [{ "companyName": "company1", "countryName": "us" }], "id": "person8", "@companyNames": ["company1"] }, "v_type": "person" }, { "v_set": 
"employees", "v_id": "person9", "v": { "interestList": ["financial","teaching"], "@companyCount": 2, "@numberOfRelationships": 4, "skillSet": [2,7,4,0], "skillList": [0,4,7,2], "locationId": "us", "interestSet": ["teaching","financial"], "@info": [ { "companyName": "company2", "countryName": "chn" }, { "companyName": "company3", "countryName": "jp" } ], "id": "person9", "@companyNames": ["company2","company3"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person10", "v": { "interestList": ["football","sport"], "@companyCount": 2, "@numberOfRelationships": 4, "skillSet": [3,0], "skillList": [0,3], "locationId": "us", "interestSet": ["sport","football"], "@info": [ { "companyName": "company1", "countryName": "us" }, { "companyName": "company3", "countryName": "jp" } ], "id": "person10", "@companyNames": ["company1","company3"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person11", "v": { "interestList": ["sport","football"], "@companyCount": 1, "@numberOfRelationships": 1, "skillSet": [10,0], "skillList": [0,10], "locationId": "can", "interestSet": ["football","sport"], "@info": [{ "companyName": "company5", "countryName": "can" }], "id": "person11", "@companyNames": ["company5"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person12", "v": { "interestList": ["music","engineering","teaching","teaching","teaching"], "@companyCount": 1, "@numberOfRelationships": 1, "skillSet": [2,1,5,0], "skillList": [0,1,5,2,2,2], "locationId": "jp", "interestSet": ["teaching","engineering","music"], "@info": [{ "companyName": "company4", "countryName": "us" }], "id": "person12", "@companyNames": ["company4"] }, "v_type": "person" }, {"@@totalRelationshipCount": 17}, {"@@companyEmployeeRelationships": { "person8": ["company1"], "person11": ["company5"], "person9": ["company2","company3"], "person12": ["company4"], "person6": ["company1"], "person10": ["company1","company3"], "person7": ["company2","company3"], "person4": ["company2"], "person5": 
["company2"], "person2": ["company1","company2"], "person3": ["company1"], "person1": ["company1","company2"] }}, {"@@companyEmployeeRelationships.size()": 12} ] }

Examples of Expression Statements

Expression Statement Examples
#Show various types of expression statements
CREATE QUERY expressionStmntEx() FOR GRAPH workNet {
  TYPEDEF tuple<STRING countryName, STRING companyName> companyInfo;

  ListAccum<companyInfo> @employerInfo;
  SumAccum<INT> @@a;
  ListAccum<STRING> @employers;
  SumAccum<INT> @employerCount;
  SetAccum<STRING> @@countrySet;

  int x;

  #exprStmnt := name "=" expr
  x = 10;

  #gAccumAssignStmt := "@@" name ("+=" | "=") expr
  @@a = 10;

  PRINT x, @@a;

  start = {person.*};

  employees = SELECT s FROM start:s -(worksFor)-> :t
    ACCUM
      #exprStmnt := name "." "@" name ("+="| "=") expr
      s.@employers += t.id,
      #exprStmnt := name ["<" type ["," type]* ">"] "(" [argList] ")"
      s.@employerInfo += companyInfo(t.country, t.id),
      #gAccumAccumStmt := "@@" name "+=" expr
      @@countrySet += t.country
      #exprStmnt := name "." "@" name ["." name "(" [argList] ")"]
    POST-ACCUM
      s.@employerCount += s.@employers.size();

  #exprStmnt := "@@" name ["." name "(" [argList] ")"]+
  PRINT @@countrySet.size();
  PRINT employees;
}


Result
GSQL > RUN QUERY expressionStmntEx() { "error": false, "message": "", "results": [ { "@@a": 10, "x": 10 }, {"@@countrySet.size()": 4}, { "v_set": "employees", "v_id": "person1", "v": { "interestList": ["management","financial"], "skillSet": [3,2,1,0], "skillList": [0,1,2,3], "locationId": "us", "@employerInfo": [ { "companyName": "company1", "countryName": "us" }, { "companyName": "company2", "countryName": "chn" } ], "interestSet": ["financial","management"], "@employerCount": 2, "id": "person1", "@employers": ["company1","company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person2", "v": { "interestList": ["engineering"], "skillSet": [6,3,2,5,0], "skillList": [0,2,3,5,6], "locationId": "chn", "@employerInfo": [ { "companyName": "company1", "countryName": "us" }, { "companyName": "company2", "countryName": "chn" } ], "interestSet": ["engineering"], "@employerCount": 2, "id": "person2", "@employers": ["company1","company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person3", "v": { "interestList": ["teaching"], "skillSet": [6,1,4,0], "skillList": [0,4,1,6], "locationId": "jp", "@employerInfo": [{ "companyName": "company1", "countryName": "us" }], "interestSet": ["teaching"], "@employerCount": 1, "id": "person3", "@employers": ["company1"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person4", "v": { "interestList": ["football"], "skillSet": [10,1,4,0], "skillList": [0,4,1,10], "locationId": "us", "@employerInfo": [{ "companyName": "company2", "countryName": "chn" }], "interestSet": ["football"], "@employerCount": 1, "id": "person4", "@employers": ["company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person5", "v": { "interestList": ["sport","financial","engineering"], "skillSet": [2,8,5,0], "skillList": [0,8,2,5], "locationId": "can", "@employerInfo": [{ "companyName": "company2", "countryName": "chn" }], "interestSet": ["engineering","financial","sport"], "@employerCount": 1, "id": "person5", 
"@employers": ["company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person6", "v": { "interestList": ["music","art"], "skillSet": [10,7,0], "skillList": [0,7,10], "locationId": "jp", "@employerInfo": [{ "companyName": "company1", "countryName": "us" }], "interestSet": ["art","music"], "@employerCount": 1, "id": "person6", "@employers": ["company1"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person7", "v": { "interestList": ["art","sport"], "skillSet": [6,8,0], "skillList": [0,8,6], "locationId": "us", "@employerInfo": [ { "companyName": "company2", "countryName": "chn" }, { "companyName": "company3", "countryName": "jp" } ], "interestSet": ["sport","art"], "@employerCount": 2, "id": "person7", "@employers": ["company2","company3"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person8", "v": { "interestList": ["management"], "skillSet": [2,1,5,0], "skillList": [0,1,5,2], "locationId": "chn", "@employerInfo": [{ "companyName": "company1", "countryName": "us" }], "interestSet": ["management"], "@employerCount": 1, "id": "person8", "@employers": ["company1"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person9", "v": { "interestList": ["financial","teaching"], "skillSet": [2,7,4,0], "skillList": [0,4,7,2], "locationId": "us", "@employerInfo": [ { "companyName": "company2", "countryName": "chn" }, { "companyName": "company3", "countryName": "jp" } ], "interestSet": ["teaching","financial"], "@employerCount": 2, "id": "person9", "@employers": ["company2","company3"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person10", "v": { "interestList": ["football","sport"], "skillSet": [3,0], "skillList": [0,3], "locationId": "us", "@employerInfo": [ { "companyName": "company1", "countryName": "us" }, { "companyName": "company3", "countryName": "jp" } ], "interestSet": ["sport","football"], "@employerCount": 2, "id": "person10", "@employers": ["company1","company3"] }, "v_type": "person" }, { "v_set": 
"employees", "v_id": "person11", "v": { "interestList": ["sport","football"], "skillSet": [10,0], "skillList": [0,10], "locationId": "can", "@employerInfo": [{ "companyName": "company5", "countryName": "can" }], "interestSet": ["football","sport"], "@employerCount": 1, "id": "person11", "@employers": ["company5"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person12", "v": { "interestList": ["music","engineering","teaching","teaching","teaching"], "skillSet": [2,1,5,0], "skillList": [0,1,5,2,2,2], "locationId": "jp", "@employerInfo": [{ "companyName": "company4", "countryName": "us" }], "interestSet": ["teaching","engineering","music"], "@employerCount": 1, "id": "person12", "@employers": ["company4"] }, "v_type": "person" } ] }

End of Operators, Functions, and Expressions Section



Declaration and Assignment Statements



Previous sections focused on the lowest level building blocks of queries: data types (Section 3), operators, functions, and expressions (Section 5), and a special section devoted to accumulators (Section 4). We now begin to look at the types of statements available in GSQL queries. This section focuses on declaration and assignment statements. Later sections will provide a closer look at the all-important SELECT statement, control flow statements and data modification statements. Furthermore, some types of statements can be nested within SELECT, UPDATE, or control flow statements.

This section covers the following subset of the EBNF syntax:

EBNF
## Declarations ##
declAccumStmt := accumType "@"name ["=" constant][, "@"name ["=" constant]]*
               | "@"name ["=" constant][, "@"name ["=" constant]]* accumType
               | [STATIC] accumType "@@"name ["=" constant][, "@@"name ["=" constant]]*
               | [STATIC] "@@"name ["=" constant][, "@@"name ["=" constant]]* accumType

declStmt := baseType name ["=" constant][, name ["=" constant]]*
localVarDeclStmt := baseType name "=" expr

vSetVarDeclStmt := name ["(" vertexEdgeType ")"] "=" (seedSet | simpleSet | selectBlock)

simpleSet := name
           | "(" simpleSet ")"
           | simpleSet (UNION | INTERSECT | MINUS) simpleSet

seedSet := "{" [seed ["," seed ]*] "}"
seed := '_'
      | ANY
      | ["@@"]name
      | name ".*"
      | "SelectVertex" selectVertParams

selectVertParams := "(" filePath "," columnId "," (columnId | name) ","
                    stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]

columnId := "$" (integer | stringLiteral)

## Assignment Statements ##
assignStmt := name "=" expr
            | name "." name "=" expr
            | name "." "@"name ("+="| "=") expr

gAccumAssignStmt := "@@"name ("+=" | "=") expr

loadAccumStmt := "@@"name "=" "{" "LOADACCUM" loadAccumParams
                 ["," "LOADACCUM" loadAccumParams]* "}"

loadAccumParams := "(" filePath "," columnId "," [columnId ","]*
                   stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]

## Function Call Statement ##
funcCallStmt := name ["<" type ["," type]* ">"] "(" [argList] ")"
              | "@@"name ("." name "(" [argList] ")")+

argList := expr ["," expr]*

Declaration Statements

There are five types of variable declarations in a GSQL query:

  • Accumulator
  • Global baseType variable
  • Local baseType variable
  • Vertex set
  • Vertex or Edge aliases

The first four types each have their own declaration statement and are covered in this section. Aliases are declared implicitly in a SELECT statement.

Accumulators

vertexEdgeType := "_" | ANY | name | ( "(" name ["|" name]* ")" )

Accumulator declaration is discussed in Section 4: "Accumulators".

Global Variables

After accumulator declarations, base type variables can be declared as global variables. The scope of a global variable is from the point of declaration until the end of the query.

EBNF for global variable declaration
declStmt := baseType name ["=" constant][, name ["=" constant]]*

A global variable can be accessed (read) anywhere in the query; however, there are restrictions on where it can be updated. See the subsection below on "Assignment Statements".

Global Variable Example
# Assign global variable at various places
CREATE QUERY globalVariable(VERTEX<person> m1) FOR GRAPH socialNet {

  SetAccum<VERTEX<person>> @@personSet;
  SetAccum<Edge> @@edgeSet;

  # Declare global variables
  STRING gender;
  DATETIME dt;
  VERTEX v;
  VERTEX<person> vx;
  EDGE ee;

  allUser = {person.*};

  allUser = SELECT src
            FROM allUser:src - (liked:e) -> post
            ACCUM dt = e.actionTime,
                  ee = e,                # assignment does NOT take effect yet
                  @@edgeSet += ee        # so ee is null
            POST-ACCUM @@personSet += src;

  PRINT @@edgeSet;      # EMPTY because ee was frozen in the SELECT statement.
  PRINT dt;             # actionTime of the last edge e processed.

  v = m1;               # assign a vertex value to a global variable.
  gender = m1.gender;   # assign a vertex's attribute value to a global variable.
  PRINT v, gender;

  FOREACH m IN @@personSet DO
    vx = m;             # global variable assignment inside FOREACH takes place.
    gender = m.gender;  # global variable assignment inside FOREACH takes place.
    PRINT vx, gender;   # display the values for each iteration of the loop.
  END;
}

Multiple global variables of the same type can be declared and initialized on the same line, as in the example below:

Multiple variable declaration example
CREATE QUERY variableDeclaration() FOR GRAPH socialNet {
  INT a=5,b=1;
  INT c,d=10;

  MaxAccum<INT> @@max1 = 3, @@max2 = 5, @@max3;
  MaxAccum<INT> @@max4, @@max5 = 2;

  PRINT a,b,c,d;
  PRINT @@max1, @@max2, @@max3, @@max4, @@max5;
}
Result
GSQL > RUN QUERY variableDeclaration() { "results": [ { "a": 5, "b": 1, "c": 0, "d": 10 }, { "@@max1": 3, "@@max2": 5, "@@max3": -9223372036854776000, "@@max4": -9223372036854776000, "@@max5": 2 } ], "error": false, "message": "", "debug": "" }

Local Variables

A local variable can be declared only in an ACCUM, POST-ACCUM, or UPDATE SET clause, and its scope is limited to that clause. Local variables can only be of base types (e.g. INT, FLOAT, DOUBLE, BOOL, STRING, VERTEX). A local variable must be declared and initialized together in the same statement.

EBNF for local variable declaration and initialization
localVarDeclStmt := baseType name "=" expr

Within a local variable's scope, another local variable with the same name cannot be declared at the same level. However, a new local variable with the same name can be declared at a lower level (i.e., within a nested SELECT or UPDATE statement). The lower declaration takes precedence at the lower level.

In a POST-ACCUM clause, each local variable may only be used in source vertex statements or target vertex statements, not both.

Local Variable Example
# An example showing a local variable succeeds where a global variable fails
CREATE QUERY localVariable(vertex<person> m1) FOR GRAPH socialNet {
  MaxAccum<INT> @@maxDate, @@maxDateGlob;
  DATETIME dtGlob;

  allUser = {person.*};
  allUser = SELECT src
            FROM allUser:src - (liked:e) -> post
            ACCUM
              DATETIME dt = e.actionTime,     # Declare and assign local dt
              dtGlob = e.actionTime,          # dtGlob doesn't update yet
              @@maxDate += datetime_to_epoch(dt),
              @@maxDateGlob += datetime_to_epoch(dtGlob);
  PRINT @@maxDate, @@maxDateGlob, dtGlob;     # @@maxDateGlob will be 0
}
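The outcome of the localVariable query can be understood with a small model of deferred assignment. The C++ sketch below is a simplified illustration, not the GSQL engine's actual mechanism. It assumes that a global variable assigned inside ACCUM keeps its old value until the SELECT finishes, that the old value of dtGlob corresponds to epoch 0, and that edge times are non-negative. `AccumOutcome` and `run_accum_model` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

struct AccumOutcome {
  long long max_date;       // models @@maxDate, fed from the local variable dt
  long long max_date_glob;  // models @@maxDateGlob, fed from the global dtGlob
  long long dt_glob_after;  // models dtGlob once the SELECT has finished
};

// edge_times holds the epoch value of e.actionTime for each traversed edge.
inline AccumOutcome run_accum_model(const std::vector<long long>& edge_times) {
  const long long dt_glob_frozen = 0;          // assumption: value of dtGlob seen inside ACCUM
  long long pending_dt_glob = dt_glob_frozen;  // buffered write to the global variable

  AccumOutcome out;
  out.max_date = LLONG_MIN;       // MaxAccum<INT> starts at the minimum INT value
  out.max_date_glob = LLONG_MIN;

  for (std::size_t i = 0; i < edge_times.size(); ++i) {
    const long long dt = edge_times[i];  // local variable: usable immediately
    pending_dt_glob = edge_times[i];     // global assignment: buffered, not yet visible
    out.max_date = std::max(out.max_date, dt);
    out.max_date_glob = std::max(out.max_date_glob, dt_glob_frozen);
  }

  out.dt_glob_after = pending_dt_glob;   // the buffered write becomes visible after the SELECT
  return out;
}
```

In this model the accumulator fed by the local variable receives the real maximum, the one fed by the global variable receives 0, and the global variable itself holds the last processed edge time once the loop ends.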

Vertex Set Variable Declaration and Assignment

Vertex set variables play a special role within GSQL queries. They are used for both the input and output of SELECT statements. Therefore, before the first SELECT statement in a query, a vertex set variable must be declared and initialized. This initial vertex set is called the seed set.

EBNF for Vertex Set Variable Declaration
vSetVarDeclStmt := name ["(" vertexEdgeType ")"] "=" (seedSet | simpleSet | selectBlock)

## Seed Sets ##
seedSet := "{" [seed ["," seed ]*] "}"
seed := '_'
      | ANY
      | ["@@"]name
      | name ".*"
      | "SelectVertex" selectVertParams

selectVertParams := "(" filePath "," columnId "," (columnId | name) ","
                    stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]

columnId := "$" (integer | stringLiteral)

simpleSet := name
           | "(" simpleSet ")"
           | simpleSet (UNION | INTERSECT | MINUS) simpleSet


The query below lists all ways of assigning a vertex set variable an initial set of vertices (that is, forming a seed set).

  • a vertex parameter, untyped (S1) or typed (S2)
  • a vertex set parameter, untyped (S3) or typed (S4)
  • a global SetAccum<VERTEX> accumulator, untyped (S5) or typed (S6)
  • all vertices of any type (S7, S9) or of one type (S8)
  • a list of vertex ids in an external file (S10)
  • copy of another vertex set (S11)
  • a combination of individual vertices, vertex set parameters, or global variables (S12)
  • union of vertex set variables (S13)
Seed Set Example
CREATE QUERY seedSetExample(VERTEX v1, VERTEX<person> v2, SET<VERTEX> v3, SET<VERTEX<person>> v4) FOR GRAPH socialNet {
  SetAccum<VERTEX> @@testSet;
  SetAccum<VERTEX<person>> @@testSet2;

  S1 = { v1 };
  S2 = { v2 };
  S3 = v3;
  S4 = v4;
  S5 = @@testSet;
  S6 = @@testSet2;
  S7 = ANY;        # All vertices
  S8 = person.*;   # All person vertices
  S9 = _;          # Equivalent to ANY
  S10 = SelectVertex("absolute_path_to_input_file", $0, post, ",", false);   # See Section "SelectVertex()" function
  S11 = S1;
  S12 = {@@testSet, v2, v3};   # S1 is not allowed to be in {}
  S13 = S11 UNION S12;         # but we can use UNION to combine S1
}

When declaring a vertex set variable, you can optionally specify the set of vertex types it may contain. If the type is not specified explicitly, the system infers it from the value assigned to the vertex set variable. The type can be ANY, _ (equivalent to ANY), or any explicit vertex type(s). See the EBNF grammar rule vertexEdgeType.


Declaration syntax difference: vertex set variable vs. base type variable

In a vertex set variable declaration, the type specifier follows the variable name and should be surrounded by parentheses: vSetName (type)
This differs from a base type variable declaration, where the type specifier comes before the variable name: type varName
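The two declaration forms can be put side by side in a minimal sketch. The query name declSyntaxSketch and the variable names hops, Start, and Anyone are hypothetical; socialNet and its person vertex type are the example schema used throughout this document.

```gsql
CREATE QUERY declSyntaxSketch(VERTEX<person> m1) FOR GRAPH socialNet {
  INT hops = 0;            # base type variable: the type comes before the name
  Start (person) = {m1};   # vertex set variable: the type follows the name, in parentheses
  Anyone (ANY) = {m1};     # same seed, but declared to accept vertices of any type
  PRINT hops;
  PRINT Start;
  PRINT Anyone;
}
```

Start and Anyone hold the same vertex initially; they differ only in which vertex types later assignments may put into them.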


After a vertex set variable is declared, the vertex type of the vertex set variable is immutable. Every assignment (e.g. SELECT statement) to this vertex set variable must match the type. The following is an example in which we must declare the vertex set variable type.

Vertex set variable type
CREATE QUERY vertexSetVariableTypeExample(vertex<person> m1) FOR GRAPH socialNet {
  INT ite = 0;
  S (ANY) = {m1};   # ANY is necessary
  WHILE ite < 5 DO
    S = SELECT t
        FROM S:s - (ANY:e) -> ANY:t;
    ite = ite + 1;
  END;
  PRINT S;
}

In the above example, the query returns the set of vertices after a 5-step traversal from the input "person" vertex. If we declare the vertex set variable S without explicitly giving a type, then because the type of vertex parameter m1 is "person", the GSQL engine will implicitly assign S to be "person"-type. However, if S is "person"-type, the SELECT statement inside the WHILE loop causes a type checking error, because the SELECT block will generate all connected vertices, including non-"person" vertices. Therefore, S must be declared as an ANY-type vertex set variable.

Assignment and Accumulate Statements

Assignment statements are used to set or update the value of a variable, after it has been declared. This applies to baseType variables, vertex set variables, and accumulators. Accumulators also have the special += accumulate statement, which was discussed in the Accumulator section.  Assignment statements can use expressions (expr) to define the new value of the variable.

EBNF for Assignment Statements
## Assignment Statement ##
assignStmt := name "=" expr                          # baseType variable, vertex set variable
            | name "." name "=" expr                 # attribute of a vertex or edge
            | name "." "@"name ("+="| "=") expr      # vertex-attached accumulator

gAccumAssignStmt := "@@"name "=" expr                # global accumulator
                  | loadAccumStmt

loadAccumStmt := "@@"name "=" "{" "LOADACCUM" loadAccumParams
                 ["," "LOADACCUM" loadAccumParams]* "}"

Restrictions on Assignment Statements

In general, assignment statements can take place anywhere after the variable has been declared. However, there are some restrictions. These restrictions apply to "inner level" statements which are within the body of a higher-level statement:

  • The ACCUM or POST-ACCUM clause of a SELECT statement
  • The SET clause of an UPDATE statement
  • The body of a FOREACH statement

The restrictions are as follows:

  • Global accumulator assignment "=" is not permitted within the body of SELECT or UPDATE statements.
  • Global variable assignment is permitted in ACCUM or POST-ACCUM clauses, but the change in value will not take place until exiting the clause. Therefore, if there are multiple assignment statements for the same variable, only the final one will take effect.
  • Vertex attribute assignment "=" is not permitted in an ACCUM clause. However, edge attribute assignment is permitted. This is because the ACCUM clause iterates over an edge set.
  • There are additional restrictions within FOREACH loops for the loop variable. See the Data Modification section.
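As a hedged illustration of the first restriction, the sketch below resets a global accumulator with "=" at the top level of the query and uses only "+=" inside the ACCUM clause. The query name assignRestrictionSketch and the accumulator names @@likeCount and @likesGiven are hypothetical; the statement forms follow the assignStmt and gAccumAssignStmt rules above.

```gsql
CREATE QUERY assignRestrictionSketch() FOR GRAPH socialNet {
  SumAccum<INT> @@likeCount;   # global accumulator
  SumAccum<INT> @likesGiven;   # vertex-attached accumulator
  allUser = {person.*};

  @@likeCount = 0;             # "=" on a global accumulator: allowed at the top level
  # Writing "@@likeCount = 0" inside the ACCUM clause below would violate the first restriction.
  allUser = SELECT s
            FROM allUser:s -(liked:e)-> post:t
            ACCUM @@likeCount += 1,      # "+=" on a global accumulator is permitted
                  s.@likesGiven += 1;    # "+=" on a vertex-attached accumulator is permitted
  PRINT @@likeCount;
  PRINT allUser;
}
```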


LOADACCUM Statement

loadAccumStmt := "@@" name "=" "{" "LOADACCUM" loadAccumParams
                 ("," "LOADACCUM" loadAccumParams)* "}"
loadAccumParams := "(" filePath "," columnId "," [columnId ","]*
                   stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]
columnId := "$" (integer | stringLiteral)


LOADACCUM() can initialize a global accumulator by loading data from a file. LOADACCUM() has 3+n parameters explained in the table below: (filePath, fieldColumn_1, ...., fieldColumn_n, separator, header), where n is the number of fields in the accumulator. One assignment statement can have multiple LOADACCUM() function calls. However, every LOADACCUM() referring to the same file in the same assignment statement must use the same separator and header parameter values.

Any accumulator using generic VERTEX as an element type cannot be initialized by LOADACCUM().


parameter name | type | description
filePath | string | The absolute file path of the input file to be read. A relative path is not supported.
accumField1, ..., accumFieldN | $num, or $"column_name" if header is true | The column position(s) or column name(s) of the data file which supply data values to each field of the accumulator.
separator | single-character string | The separator of columns.
header | bool | Whether this file has a header.

Below is an example that loads accumulators from an external file.

loadAccumInput.csv
person1,1,"test1",3
person5,2,"test2",4
person6,3,"test3",5


LoadAccum example
CREATE QUERY loadAccumEx(STRING filename) FOR GRAPH socialNet {
  TYPEDEF TUPLE<STRING aaa, VERTEX<post> ddd> yourTuple;
  MapAccum<VERTEX<person>, MapAccum<INT, yourTuple>> @@testMap;
  GroupByAccum<STRING a, STRING b, MapAccum<STRING, STRING> strList> @@testGroupBy;

  @@testMap = { LOADACCUM (filename, $0, $1, $2, $3, ",", false) };
  @@testGroupBy = { LOADACCUM (filename, $1, $2, $3, $3, ",", true) };

  PRINT @@testMap, @@testGroupBy;
}


Result
GSQL > RUN QUERY loadAccumEx("/file_directory/loadAccumInput.csv")
{
  "error": false,
  "message": "",
  "results": [
    {
      "@@testMap": {
        "person5": { "2": { "aaa": "\"test2\"", "ddd": "4" } },
        "person6": { "3": { "aaa": "\"test3\"", "ddd": "5" } },
        "person1": { "1": { "aaa": "\"test1\"", "ddd": "3" } }
      },
      "@@testGroupBy": [
        { "a": "3", "b": "\"test3\"", "strList": { "5": "5" } },
        { "a": "2", "b": "\"test2\"", "strList": { "4": "4" } }
      ]
    }
  ]
}

Function Call Statements

funcCallStmt := name ["<" type ["," type]* ">"] "(" [argList] ")"
              | "@@"name ("." name "(" [argList] ")")+
argList := expr ["," expr]*

Typically, a function call returns a value and so is part of an expression (see Section 5 - Operators, Functions and Expressions). In some cases, however, the function does not return a value (i.e., returns VOID) or the return value can be ignored, so the function call can be used as an entire statement.  This is a Function Call Statement.

Examples of Function Call statements
ListAccum<STRING> @@listAcc;
BagAccum<INT> @@bagAcc;
...
# examples of function call statements
@@listAcc.clear();
@@bagAcc.removeAll(0);

End of Declaration, Assignment, and Function Call Statements Section



SELECT Statement


This section discusses the SELECT statement in depth and covers the following EBNF syntax:

EBNF for Select Statement
selectStmt := name "=" selectBlock

selectBlock := SELECT name FROM ( edgeSet | vertexSet )
               [sampleClause]
               [whereClause]
               [accumClause]
               [postAccumClause]
               [havingClause]
               [orderClause]
               [limitClause]

vertexSet := name [":" name]

edgeSet := name [":" name] "-" "(" [vertexEdgeType] [":" name] ")" "->" [vertexEdgeType] [":" name]

vertexEdgeType := "_" | ANY | name | ( "(" name ["|" name]* ")" )

sampleClause := SAMPLE ( expr | expr "%" ) EDGE WHEN condition
              | SAMPLE expr TARGET WHEN condition
              | SAMPLE expr "%" TARGET PINNED WHEN condition

whereClause := WHERE condition

accumClause := ACCUM DMLSubStmtList

postAccumClause := POST-ACCUM DMLSubStmtList

DMLSubStmtList := DMLSubStmt ["," DMLSubStmt]*

DMLSubStmt := assignStmt           // Assignment
            | funcCallStmt         // Function Call
            | gAccumAccumStmt      // Assignment
            | vAccumFuncCall       // Function Call
            | localVarDeclStmt     // Declaration
            | DMLSubCaseStmt       // Control Flow
            | DMLSubIfStmt         // Control Flow
            | DMLSubWhileStmt      // Control Flow
            | DMLSubForEachStmt    // Control Flow
            | BREAK                // Control Flow
            | CONTINUE             // Control Flow
            | insertStmt           // Data Modification
            | DMLSubDeleteStmt     // Data Modification
            | logStmt              // Output

vAccumFuncCall := name "." "@"name ("." name "(" [argList] ")")+

localVarDeclStmt := ( baseType name | name baseType ) "=" expr

havingClause := HAVING condition

orderClause := ORDER BY expr [ASC | DESC] ["," expr [ASC | DESC]]*

limitClause := LIMIT ( expr | expr "," expr | expr OFFSET expr )

The SELECT block selects a set of vertices FROM a vertex set or edge set. There are a number of optional clauses that define and/or refine the selection by constraining the vertex or edge set or the result set. There are two types of SELECT, vertex-induced and edge-induced. Both result in a vertex set, known as the result set.

Size limitation

The result set of a SELECT block is limited to a maximum size of 2GB. If the result of the SELECT block is larger than 2GB, the system returns no data, and no error message is produced.

SELECT Statement Data Flow

The SELECT statement is an assignment statement with a SELECT block on the right hand side. The SELECT block has many possible clauses, which fit together in a logical flow. Overall, the SELECT block starts from a source set of vertices and returns a result set that is either a subset of the source vertices or a subset of their neighboring vertices. Along the way, computations can be performed on the selected vertices and edges. The figure below graphically depicts the overall SELECT data flow (with the exception of the SAMPLE clause). While the ACCUM and POST-ACCUM clauses do not directly affect which vertices are included in the result set, they affect the data (accumulators) which are attached to those vertices.

FROM Clause: Vertex and Edge Sets

There are two options for the FROM clause: vertexSet or edgeSet. If vertexSet is used, then the query is a vertex-induced selection. If edgeSet is used, then the query is an edge-induced selection.

FROM clause
### selectBlock := SELECT name FROM ( edgeSet | vertexSet ) ...

Vertex-Induced Selection

EBNF for vertexSet, signaling a vertex-induced selection
vertexSet := name [":" name]

A vertex-induced selection takes an input set of vertices and produces a result set, which is a subset of the input set. The FROM argument has the form Source:s, where Source is a vertex set. Source is optionally followed by :s, where s is a vertex alias which represents any vertex in the set Source.

resultSet = SELECT s FROM Source:s;

This statement can be interpreted as "Select all vertices s, from the vertex set Source." The result is a vertex set.

Below is a simple example of a vertex-induced selection.

Vertex-Induced SELECT example
# displays all 'post'-type vertices
CREATE QUERY printAllPosts() FOR GRAPH socialNet {
  Source = {post.*};                  # Source is initialized with all vertices of type 'post'
  results = SELECT s FROM Source:s;   # select these vertices
  PRINT results;
}


Results
GSQL > RUN QUERY printAllPosts() { "error": false, "message": "", "results": [ { "v_set": "results", "v_id": "0", "v": { "postTime": "2010-01-12 11:22:05", "subject": "Graphs" }, "v_type": "post" }, { "v_set": "results", "v_id": "1", "v": { "postTime": "2011-03-03 23:02:00", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "results", "v_id": "2", "v": { "postTime": "2011-02-03 01:02:42", "subject": "query languages" }, "v_type": "post" }, { "v_set": "results", "v_id": "3", "v": { "postTime": "2011-02-05 01:02:44", "subject": "cats" }, "v_type": "post" }, { "v_set": "results", "v_id": "4", "v": { "postTime": "2011-02-07 05:02:51", "subject": "coffee" }, "v_type": "post" }, { "v_set": "results", "v_id": "5", "v": { "postTime": "2011-02-06 01:02:02", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "results", "v_id": "6", "v": { "postTime": "2011-02-05 02:02:05", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "results", "v_id": "7", "v": { "postTime": "2011-02-04 17:02:41", "subject": "Graphs" }, "v_type": "post" }, { "v_set": "results", "v_id": "8", "v": { "postTime": "2011-02-03 17:05:52", "subject": "cats" }, "v_type": "post" }, { "v_set": "results", "v_id": "9", "v": { "postTime": "2011-02-05 23:12:42", "subject": "cats" }, "v_type": "post" }, { "v_set": "results", "v_id": "10", "v": { "postTime": "2011-02-04 03:02:31", "subject": "cats" }, "v_type": "post" }, { "v_set": "results", "v_id": "11", "v": { "postTime": "2011-02-03 01:02:21", "subject": "cats" }, "v_type": "post" } ] }

Edge-Induced Selection

EBNF for edgeSet, signaling edge-induced selection
edgeSet := name [":" name] "-" "(" [vertexEdgeType] [":" name] ")" "->" [vertexEdgeType] [":" name]
vertexEdgeType := "_" | ANY | name | ("(" name ["|" name]* ")")

Multiple types can also be specified by using delimiter "|". Additionally, the keywords "_" or "ANY" can be used for denoting a set which can include any vertex or edge type.

An edge-induced selection starts from a set of vertices, defines a set of edges incident to that set, and produces a result set of vertices that are also incident to those edges. Typically, this is used to traverse from a set of source vertices over a specific edge type to a set of target vertices. The FROM clause argument (defined formally by the EBNF edgeSet rule) is structured as an edge template: Source:s-(eType:e)->tType:t. The edge template has three parts: the source vertex set (Source), the edge type or types (eType), and the target vertex type or types (tType). Both s and t are vertex aliases, and e is the edge alias. The template defines a pattern s → e → t, from source vertex s, across eType edges, to tType target vertices. The edge alias e represents any edge that fits the complete pattern. Likewise, s and t are aliases that represent any source vertices and target vertices, respectively, that fit the complete pattern.

Either the source vertex set (s) or target vertex set (t) can be used as the SELECT argument, which determines the result of the SELECT statement. Note the small difference in the two SELECT statements below.

Selecting source or target vertices from edge-induced selection
resultSet1 = SELECT s FROM Source:s-(eType:e)->tType:t;   //Select from the source set
resultSet2 = SELECT t FROM Source:s-(eType:e)->tType:t;   //Select from the target set

resultSet1 is based on the source end of the selected edges. resultSet2 is based on the target end of the selected edges. However, resultSet1 is NOT identical to the Source vertex set. It contains only those members of Source which connect to an eType edge and then to a tType vertex. Other clauses (presented later in this "SELECT Statement" section) can do additional filtering of the Source set.
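A concrete sketch on the socialNet example graph may help make the distinction tangible. The query name sourceVsTargetSketch and the set names likers and likedPosts are hypothetical; the edge template reuses the liked edge between person and post from the schema shown earlier.

```gsql
CREATE QUERY sourceVsTargetSketch() FOR GRAPH socialNet {
  allUser = {person.*};

  # Source end: persons that have at least one 'liked' edge to a post.
  # This can be smaller than allUser, since persons with no 'liked' edge drop out.
  likers = SELECT s
           FROM allUser:s -(liked:e)-> post:t;

  # Target end: posts that are reached by at least one 'liked' edge.
  likedPosts = SELECT t
               FROM allUser:s -(liked:e)-> post:t;

  PRINT likers;
  PRINT likedPosts;
}
```

Both statements match the same set of edges; only the end of the edge that is returned differs.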

We strongly suggest that an alias should be declared with every vertex and edge in the FROM clause, as there are several functions and features which are only available to vertex and edge aliases.

Edge Set and Target Vertex Set Options

The FROM clause chooses edges and target vertices by type. The EBNF symbol vertexEdgeType describes the options:


Accepted vertex/edge types:
_                     any type
ANY                   any type
name                  the given vertex/edge type
(name | name ...)     any of the vertex/edge types listed

Note that eType and tType are optional. If eType or tType is omitted (or if ANY or _ is used), then the SELECT will seek out any edge or target vertex that is valid (i.e., there exists a valid path between two vertices over an edge). For the example below, if V1 and V2 are the only possible reachable vertex types via eType, we can omit the target vertex type, making all of the following SELECT statements equivalent. The system will infer the target vertex type at run time.

It is legal to declare an alias without explicitly stating an edge/target type. See the examples below.

Target vertex type inference
resultSet3 = SELECT v FROM Source:v-(eType:e)->(V1|V2):t;
resultSet4 = SELECT v FROM Source:v-(eType:e)->:t;
resultSet5 = SELECT v FROM Source:v-(eType:e)->ANY:t;
resultSet6 = SELECT v FROM Source:v-(eType:e)->_:t;

Type inference is used whenever possible for the edge set and target vertex set to prune ineligible edges and thereby optimize performance. The vertex type in Source is checked against the graph schema to find all incident edge types. The knowledge of the graph schema is combined with the selection's explicit type conditions given by eType and tType, as well as explicit and implicit type conditions in the WHERE clause to determine a final set of eligible edge sets which match the pattern Source → eType → tType.  With type inference, the user has the freedom to express only as much as necessary to select edges.

Similarly, the GSQL engine will infer the edge type at run time. For example, if E1, E2, and E3 are the only possible edge types that can be traversed to reach vertices of type tType, we can omit specifying the edge type, making the following SELECT statements equivalent.

Edge type inference
resultSet7 = SELECT v FROM Source:v-((E1|E2|E3):e)->tType:t;
resultSet8 = SELECT v FROM Source:v-(:e)->tType:t;
resultSet9 = SELECT v FROM Source:v-(_:e)->tType:t;
resultSet10 = SELECT v FROM Source:v-(ANY:e)->tType:t;

The following query demonstrates edge-induced SELECT blocks. The allPostsLiked and allPostsMade statements show how the target vertex type can be omitted. The allPostsLikedOrMade statements use the "|" operator to select multiple types of edges.

Edge induced SELECT example
# uses various SELECT statements (some of which are equivalent) to print out
# either the posts made by the given user, the posts liked by the given
# user, or the posts made or liked by the given user.
CREATE QUERY printAllPosts2(vertex<person> seed) FOR GRAPH socialNet {
  start = {seed};   # initialize starting set of vertices

  # --- statements produce equivalent results
  # select all 'post' vertices which can be reached from 'start' in one hop
  # using an edge of type 'liked'
  allPostsLiked = SELECT targetVertex FROM start -(liked:edgeName)-> post:targetVertex;

  # select all vertices of any type which can be reached from 'start' in one hop
  # using an edge of type 'liked'
  allPostsLiked = SELECT targetVertex FROM start -(liked:edgeName)-> :targetVertex;
  # ----

  # --- statements produce equivalent results
  # start with the vertex set from above, and traverse all edges of type 'posted'
  # (locally those edges are given the name 'e' in case they need to be accessed)
  # and return all vertices of type 'post' which can be reached within one hop of 'start' vertices
  allPostsMade = SELECT targetVertex FROM start -(posted:e)-> post:targetVertex;

  # same traversal, but return all vertices of any type which can be reached
  # within one hop of 'start' vertices
  allPostsMade = SELECT targetVertex FROM start -(posted:e)-> :targetVertex;
  # ----

  # --- statements produce equivalent results
  # select all vertices of type 'post' which can be reached from 'start' in one hop
  # using an edge of any type. Because the edge type is not restricted,
  # this will include any post connected by 'liked' or 'posted' edge types
  allPostsLikedOrMade = SELECT targetVertex FROM start -(:edgeName)-> post:targetVertex;

  # select all vertices of type 'post' which can be reached from 'start' in one hop
  # using an edge of type either 'posted' or 'liked'
  allPostsLikedOrMade = SELECT t FROM start -((posted|liked):e)-> post:t;

  # select all vertices of any type which can be reached from 'start' in one hop
  # using an edge of type either 'posted' or 'liked'
  allPostsLikedOrMade = SELECT t FROM start -((posted|liked):e)-> :t;
  # ----

  PRINT allPostsLiked;
  PRINT allPostsMade;
  PRINT allPostsLikedOrMade;
}


Results
GSQL > RUN QUERY printAllPosts2("person2") { "error": false, "message": "", "results": [ { "v_set": "allPostsLiked", "v_id": "0", "v": { "postTime": "2010-01-12 11:22:05", "subject": "Graphs" }, "v_type": "post" }, { "v_set": "allPostsLiked", "v_id": "3", "v": { "postTime": "2011-02-05 01:02:44", "subject": "cats" }, "v_type": "post" }, { "v_set": "allPostsMade", "v_id": "1", "v": { "postTime": "2011-03-03 23:02:00", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "allPostsLikedOrMade", "v_id": "0", "v": { "postTime": "2010-01-12 11:22:05", "subject": "Graphs" }, "v_type": "post" }, { "v_set": "allPostsLikedOrMade", "v_id": "1", "v": { "postTime": "2011-03-03 23:02:00", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "allPostsLikedOrMade", "v_id": "3", "v": { "postTime": "2011-02-05 01:02:44", "subject": "cats" }, "v_type": "post" } ] } GSQL > RUN QUERY printAllPosts2("person6") { "error": false, "message": "", "results": [ { "v_set": "allPostsLiked", "v_id": "8", "v": { "postTime": "2011-02-03 17:05:52", "subject": "cats" }, "v_type": "post" }, { "v_set": "allPostsMade", "v_id": "5", "v": { "postTime": "2011-02-06 01:02:02", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "allPostsMade", "v_id": "10", "v": { "postTime": "2011-02-04 03:02:31", "subject": "cats" }, "v_type": "post" }, { "v_set": "allPostsLikedOrMade", "v_id": "5", "v": { "postTime": "2011-02-06 01:02:02", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "allPostsLikedOrMade", "v_id": "8", "v": { "postTime": "2011-02-03 17:05:52", "subject": "cats" }, "v_type": "post" }, { "v_set": "allPostsLikedOrMade", "v_id": "10", "v": { "postTime": "2011-02-04 03:02:31", "subject": "cats" }, "v_type": "post" } ] }


This example is another edge selection that uses the "|" operator to select edges that have target vertices of multiple types.

Edge induced SELECT example
# uses a SELECT statement to print out everything related to a given user
# this includes posts that the user liked, posts that the user made, and friends
# of the user
CREATE QUERY printAllRelatedItems(vertex<person> seed) FOR GRAPH socialNet {
  sourceVertex = {seed};

  # -- statements produce equivalent output
  # returns all vertices of type either 'person' or 'post' that can be reached
  # from the sourceVertex set using one edge of any type
  everythingRelated = SELECT v FROM sourceVertex -(:e)-> (person|post):v;

  # returns all vertices of any type that can be reached from the sourceVertex
  # using one edge of any type
  # this statement is equivalent to the above one because the graph schema only
  # has vertex types of either 'person' or 'post'. if there were more vertex
  # types present, these would not be equivalent.
  everythingRelated = SELECT v FROM sourceVertex -(:e)-> :v;
  # --

  PRINT everythingRelated;
}


Results
GSQL > RUN QUERY printAllRelatedItems("person2") { "error": false, "message": "", "results": [ { "v_set": "everythingRelated", "v_id": "person1", "v": { "gender": "Male", "id": "person1" }, "v_type": "person" }, { "v_set": "everythingRelated", "v_id": "person3", "v": { "gender": "Male", "id": "person3" }, "v_type": "person" }, { "v_set": "everythingRelated", "v_id": "0", "v": { "postTime": "2010-01-12 11:22:05", "subject": "Graphs" }, "v_type": "post" }, { "v_set": "everythingRelated", "v_id": "1", "v": { "postTime": "2011-03-03 23:02:00", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "everythingRelated", "v_id": "3", "v": { "postTime": "2011-02-05 01:02:44", "subject": "cats" }, "v_type": "post" } ] } GSQL > RUN QUERY printAllRelatedItems("person6") { "error": false, "message": "", "results": [ { "v_set": "everythingRelated", "v_id": "person4", "v": { "gender": "Female", "id": "person4" }, "v_type": "person" }, { "v_set": "everythingRelated", "v_id": "person8", "v": { "gender": "Male", "id": "person8" }, "v_type": "person" }, { "v_set": "everythingRelated", "v_id": "5", "v": { "postTime": "2011-02-06 01:02:02", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "everythingRelated", "v_id": "8", "v": { "postTime": "2011-02-03 17:05:52", "subject": "cats" }, "v_type": "post" }, { "v_set": "everythingRelated", "v_id": "10", "v": { "postTime": "2011-02-04 03:02:31", "subject": "cats" }, "v_type": "post" } ] }

Vertex and Edge Aliases

Vertex and edge aliases are declared within the FROM clause of a SELECT block, by using the character ":", followed by the alias name. Aliases can be accessed anywhere within the same SELECT block. They are used to reference a single selected vertex or edge of a set. It is through the vertex or edge aliases that attributes of these vertices or edges can be accessed.

For example, the following code snippet shows two different SELECT statements. The first SELECT statement starts from a vertex set called allVertices, and the vertex alias name v can access each individual vertex from allVertices. The second SELECT statement selects a set of edges. It can use the vertex alias s to reference the source vertices, or the alias t to reference the target vertices.

Vertex aliases
results = SELECT v FROM allVertices:v;
results = SELECT t FROM allVertices:s -()-> :t;

The following example shows an edge-based SELECT statement, declaring aliases for all three parts of the edge. In the ACCUM clause, the e and t aliases are assigned to local vertex and edge variables.

Edge variables
results = SELECT t
          FROM allVertices:s -(:e)-> :t
          ACCUM VERTEX v = t,
                EDGE eg = e;
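To illustrate attribute access through aliases, here is a minimal sketch against the socialNet schema. The query name aliasAttributeSketch and the accumulator names @@likedSubjects and @@latestLike are hypothetical; the attributes subject (on post) and actionTime (on liked) come from the schema shown at the start of this document.

```gsql
CREATE QUERY aliasAttributeSketch(VERTEX<person> seed) FOR GRAPH socialNet {
  SetAccum<STRING> @@likedSubjects;   # subjects of the posts the seed person liked
  MaxAccum<INT> @@latestLike;         # epoch time of the most recent 'liked' edge
  start = {seed};

  likedPosts = SELECT t
               FROM start:s -(liked:e)-> post:t
               ACCUM @@likedSubjects += t.subject,                       # vertex attribute via alias t
                     @@latestLike += datetime_to_epoch(e.actionTime);    # edge attribute via alias e
  PRINT @@likedSubjects, @@latestLike;
}
```

Without the aliases t and e, there would be no handle through which the ACCUM clause could read subject or actionTime.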

We strongly suggest that an alias should be declared with every vertex and edge in the FROM clause, as there are several functions and features which are only available to vertex and edge aliases.

SAMPLE Clause

The SAMPLE clause is an optional clause that selects a uniform random sample from the population of edges or vertices specified in the FROM argument. To be clear, the edge population consists of those edges which satisfy all three parts – source set, edge type, and target type – of the FROM clause. This clause is intended to provide a representative sample of the distribution of edges (or vertices) connected to hub vertices, instead of dealing with all edges. A hub vertex is a vertex with a relatively high degree. (The degree of a vertex is the number of edges which connect to it. If edges are directional, one can distinguish between indegree and outdegree.)

Note

Currently, the condition that can be used with a SAMPLE clause is limited strictly to checking whether the result of a function call on a vertex is greater than, or greater than or equal to, some number.

The expression following SAMPLE specifies the sample size, either an absolute number or a percentage of the population. The expression in sampleClause must evaluate to a positive integer. There are two sampling methods. One is sampling based on edge id. The other is based on target vertex id: if a target vertex id is sampled, all edges from this source vertex to the sampled target vertex are sampled.

EBNF for Sample Clause
sampleClause := SAMPLE ( expr | expr "%" ) EDGE WHEN condition    # Sample an absolute number (or a percentage) of edges for each vertex.
              | SAMPLE expr TARGET WHEN condition                 # Sample an absolute number of edges for each vertex based on target id sampling.
              | SAMPLE expr "%" TARGET PINNED WHEN condition      # Sample a percentage of edges for each vertex based on target id selection.

Given that the sampling is random, the results for the following examples may vary.
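For the percentage form of the EDGE variant, a minimal sketch might look like the following. The query name samplePercentSketch, the 50% rate, and the outdegree threshold of 2 are arbitrary choices for illustration; the statement shape follows the first alternative of the sampleClause rule above, and the result will differ from run to run.

```gsql
CREATE QUERY samplePercentSketch() FOR GRAPH workNet {
  SumAccum<INT> @timesTraversedWithSample;
  workers = {person.*};

  # Sample roughly half of the edges of each source vertex
  # whose outdegree is at least 2.
  afterSample = SELECT v
                FROM workers:t -(:e)-> :v
                SAMPLE 50% EDGE WHEN t.outdegree() >= 2
                ACCUM v.@timesTraversedWithSample += 1;
  PRINT afterSample;
}
```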

Below is an example of using SELECT to only traverse one edge for each source vertex. The vertex-attached accumulators @timesTraversedNoSample and @timesTraversedWithSample are used to keep track of the number of times an edge is traversed to reach the target vertex. Without using sampling, this occurs once for each edge; thus @timesTraversedNoSample has the same number as the in-degree of the vertex. With sampling edges, the number of edges is restricted. This is reflected in the @timesTraversedWithSample accumulator. Notice the difference in the result set. Because only one edge per source vertex is traversed when the SAMPLE clause is used, not all target vertices are reached. The vertex company3 has 3 incident edges, but in one instance of the query execution, it is never reached. Additionally, company2 has 6 incident edges, but only 4 source vertices sampled an edge incident to company2.

Example of SAMPLE using an absolute number of edges
CREATE QUERY sampleEx1() FOR GRAPH workNet {
  SumAccum<INT> @timesTraversedNoSample;
  SumAccum<INT> @timesTraversedWithSample;
  workers = {person.*};

  # The 'beforeSample' result set encapsulates the normal functionality of
  # a SELECT statement, where 'timesTraversedNoSample' vertex accumulator is increased for
  # each edge incident to the vertex.
  beforeSample = SELECT v
                 FROM workers:t -(:e)-> :v
                 ACCUM v.@timesTraversedNoSample += 1;

  # The 'afterSample' result set is formed by those vertices which can be
  # reached when for each source vertex, only one edge is used for traversal.
  # This is demonstrated by the values of 'timesTraversedWithSample' vertex accumulator, which
  # is increased for each edge incident to the vertex which is used in the
  # sample.
  afterSample = SELECT v
                FROM workers:t -(:e)-> :v
                SAMPLE 1 EDGE WHEN t.outdegree() >= 1   # only use 1 edge from the source vertex
                ACCUM v.@timesTraversedWithSample += 1;

  PRINT beforeSample;
  PRINT afterSample;
}


Results
GSQL > RUN QUERY sampleEx1() { "error": false, "message": "", "results": [ { "v_set": "beforeSample", "v_id": "company1", "v": { "country": "us", "@timesTraversedNoSample": 6, "@timesTraversedWithSample": 6, "id": "company1" }, "v_type": "company" }, { "v_set": "beforeSample", "v_id": "company2", "v": { "country": "chn", "@timesTraversedNoSample": 6, "@timesTraversedWithSample": 4, "id": "company2" }, "v_type": "company" }, { "v_set": "beforeSample", "v_id": "company3", "v": { "country": "jp", "@timesTraversedNoSample": 3, "@timesTraversedWithSample": 0, "id": "company3" }, "v_type": "company" }, { "v_set": "beforeSample", "v_id": "company4", "v": { "country": "us", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1, "id": "company4" }, "v_type": "company" }, { "v_set": "beforeSample", "v_id": "company5", "v": { "country": "can", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1, "id": "company5" }, "v_type": "company" }, { "v_set": "afterSample", "v_id": "company1", "v": { "country": "us", "@timesTraversedNoSample": 6, "@timesTraversedWithSample": 6, "id": "company1" }, "v_type": "company" }, { "v_set": "afterSample", "v_id": "company2", "v": { "country": "chn", "@timesTraversedNoSample": 6, "@timesTraversedWithSample": 4, "id": "company2" }, "v_type": "company" }, { "v_set": "afterSample", "v_id": "company4", "v": { "country": "us", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1, "id": "company4" }, "v_type": "company" }, { "v_set": "afterSample", "v_id": "company5", "v": { "country": "can", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1, "id": "company5" }, "v_type": "company" } ] }

Since the PRINT statements are placed at the end of query, the two vertex sets beforeSample and afterSample are almost identical, showing the final values of both accumulators @timesTraversedNoSample and @timesTraversedWithSample. There is one difference: company3 is not included in afterSample because none of the sample-selected edges reached company3.

The following is an example of using the SAMPLE clause with an absolute number of target vertices.  Here, the graph is traversed starting with all vertices, and for each source vertex, only one target vertex is selected.  Thus the result set will include at most the same number of vertices as the source vertices.  Because some vertices do not have outgoing edges, the result set has fewer vertices than the starting set.  There are 31 target vertices which are originally selected.  When the SAMPLE clause is used to select only one target vertex per source vertex, only 7 target vertices were selected in this particular SAMPLE instance.

Example #. SAMPLE using an absolute number of target vertices
CREATE QUERY sampleEx2() FOR GRAPH computerNet
{
    SumAccum<INT> @timesTraversedNoSample;
    SumAccum<INT> @timesTraversedWithSample;
    start = {computer.*};

    # The result set encapsulates the normal functionality of a SELECT statement, where
    # 'timesTraversedNoSample' vertex accumulator is increased for each edge incident to the vertex.
    allTargetVertices = SELECT v
        FROM start:t -(:e)-> :v
        ACCUM v.@timesTraversedNoSample += 1;

    # The 'afterSample' result set is formed by those vertices which can be
    # reached when for each source vertex, only one target vertex is used for traversal.
    # This is demonstrated by the values of 'timesTraversedWithSample' vertex accumulator, which
    # is increased for each edge incident to the vertex which is used in the sample.
    afterSample = SELECT v
        FROM start:t -(:e)-> :v
        SAMPLE 1 TARGET WHEN t.outdegree() >= 1    # only use 1 target vertex from the source vertex
        ACCUM v.@timesTraversedWithSample += 1;

    PRINT allTargetVertices;
    PRINT afterSample;
}


Results
GSQL > RUN QUERY sampleEx2() { "error": false, "message": "", "results": [ { "v_id": "c2", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c2", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c3", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c3", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c4", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c4", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c5", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c5", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c6", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c6", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c7", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c7", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c8", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c8", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c9", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c9", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c10", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c10", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c11", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c11", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c12", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c12", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c13", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c13", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c14", "v_type": "computer", "v_set": "allTargetVertices", "v": { 
"id": "c14", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c15", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c15", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c16", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c16", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c17", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c17", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c18", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c18", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c19", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c19", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c20", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c20", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c21", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c21", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c22", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c22", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c23", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c23", "@timesTraversedNoSample": 2, "@timesTraversedWithSample": 2 } }, { "v_id": "c24", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c24", "@timesTraversedNoSample": 2, "@timesTraversedWithSample": 0 } }, { "v_id": "c25", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c25", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c26", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c26", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c27", "v_type": "computer", "v_set": 
"allTargetVertices", "v": { "id": "c27", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c28", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c28", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c29", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c29", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c30", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c30", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 0 } }, { "v_id": "c31", "v_type": "computer", "v_set": "allTargetVertices", "v": { "id": "c31", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c8", "v_type": "computer", "v_set": "afterSample", "v": { "id": "c8", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c11", "v_type": "computer", "v_set": "afterSample", "v": { "id": "c11", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c12", "v_type": "computer", "v_set": "afterSample", "v": { "id": "c12", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c21", "v_type": "computer", "v_set": "afterSample", "v": { "id": "c21", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } }, { "v_id": "c23", "v_type": "computer", "v_set": "afterSample", "v": { "id": "c23", "@timesTraversedNoSample": 2, "@timesTraversedWithSample": 2 } }, { "v_id": "c31", "v_type": "computer", "v_set": "afterSample", "v": { "id": "c31", "@timesTraversedNoSample": 1, "@timesTraversedWithSample": 1 } } ] }

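The per-source sampling described above can be modelled outside of GSQL. The following is a minimal Python sketch, not TigerGraph code: the edge list is a hypothetical toy graph (not the computerNet data), the helper name sample_targets_per_source is invented for illustration, and uniform random choice is an assumption, since the text does not say how the sampled edge is picked.

```python
import random
from collections import defaultdict

def sample_targets_per_source(edges, k, rng):
    """Keep at most k randomly chosen outgoing edges for each source vertex."""
    by_source = defaultdict(list)
    for src, tgt in edges:
        by_source[src].append(tgt)
    sampled = []
    for src, targets in by_source.items():
        # Assumption: uniform choice without replacement among a source's targets.
        chosen = rng.sample(targets, min(k, len(targets)))
        sampled.extend((src, tgt) for tgt in chosen)
    return sampled

# Hypothetical toy edge list of (source, target) pairs.
edges = [("c1", "c2"), ("c1", "c3"), ("c1", "c4"),
         ("c2", "c4"), ("c3", "c4"), ("c4", "c5")]

sampled_edges = sample_targets_per_source(edges, k=1, rng=random.Random(0))
sources = {s for s, _ in edges}
all_targets = {t for _, t in edges}
sampled_targets = {t for _, t in sampled_edges}

print(len(sampled_edges), "sampled edges reach", len(sampled_targets), "distinct targets")
```

Whatever the random choices are, each source with outgoing edges contributes exactly one edge, so the sampled target set can never be larger than the source set, and it can be smaller when several sources pick the same target.
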
WHERE Clause

The WHERE clause is an optional clause that constrains edges and vertices specified in the FROM clause.

EBNF for Where Clause
whereClause := WHERE condition

The WHERE clause uses a boolean condition to test each vertex or edge in the FROM set.

If the condition evaluates to false for vertex/edge X, then X is excluded from further consideration in the result set. The condition may use constants, any variables or parameters within the scope of the SELECT, arithmetic operators (+, -, *, /, %), comparison operators (==, !=, <, <=, >, >=), boolean operators (AND, OR, NOT), set operators (IN, NOT IN), and parentheses to enforce precedence. The variables in scope include global accumulators, vertex set variables, query input parameters, the FROM clause's vertex and edge sets (or their vertex and edge aliases), and any of the attributes or accumulators of those vertex/edge sets. For a more formal explanation of condition, see the EBNF definitions of condition and expr.
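As an illustration of the filtering behavior just described, here is a small Python model (not GSQL). The Edge tuple, the where helper, and the sample edges are all hypothetical; the sketch only shows that the condition is evaluated once per candidate edge and that failing edges are dropped.

```python
from typing import NamedTuple

class Edge(NamedTuple):
    source: str
    edge_type: str
    target: str
    target_type: str

def where(edges, condition):
    """Keep only the edges for which the boolean condition holds."""
    return [e for e in edges if condition(e)]

# Hypothetical candidate edges produced by a FROM clause.
edges = [
    Edge("p1", "posted", "post0", "post"),
    Edge("p1", "liked", "post3", "post"),
    Edge("p1", "friend", "p2", "person"),
]

# A condition combining a set operator, a boolean operator, and a comparison.
kept = where(
    edges,
    lambda e: e.edge_type in ("liked", "friend") and e.target_type != "post",
)
print(kept)
```
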

Using built-in vertex and edge attributes and functions, such as .type and .neighbors(), the WHERE clause can be used to implement sophisticated selection rules for the edge traversal.  In the following example, the selection conditions are completely specified in the WHERE clause, with no edge types or vertex types mentioned in the FROM clause.

WHERE used as a filter
resultSet1 = SELECT v FROM S:v-((E1|E2|E3):e)->(V1|V2):t;
resultSet2 = SELECT v FROM S:v-(:e)->:t
             WHERE t.type IN ("V1", "V2") AND
                   t IN v.neighbors("E1|E2|E3")

The following examples demonstrate using the WHERE clause to limit the resulting vertex set based on a vertex attribute.

Basic SELECT WHERE
CREATE QUERY printCatPosts() FOR GRAPH socialNet {
    posts = {post.*};
    catPosts = SELECT v FROM posts:v        # select only those post vertices
               WHERE v.subject == "cats";   # which have a subject of 'cats'
    PRINT catPosts;
}


Results
GSQL > RUN QUERY printCatPosts() { "error": false, "message": "", "results": [ { "v_set": "catPosts", "v_id": "3", "v": {"subject": "cats"}, "v_type": "post" }, { "v_set": "catPosts", "v_id": "8", "v": {"subject": "cats"}, "v_type": "post" }, { "v_set": "catPosts", "v_id": "9", "v": {"subject": "cats"}, "v_type": "post" }, { "v_set": "catPosts", "v_id": "10", "v": {"subject": "cats"}, "v_type": "post" }, { "v_set": "catPosts", "v_id": "11", "v": {"subject": "cats"}, "v_type": "post" } ] }


SELECT WHERE using IN operator
CREATE QUERY findGraphFocusedPosts() FOR GRAPH socialNet {
    posts = {post.*};
    results = SELECT v FROM posts:v                        # select only post vertices
              WHERE v.subject IN ("Graph", "tigergraph");  # which have a subject of either 'Graph' or 'tigergraph'
    PRINT results;
}


Results
GSQL > RUN QUERY findGraphFocusedPosts() { "error": false, "message": "", "results": [ { "v_set": "results", "v_id": "1", "v": { "postTime": "2011-03-03 23:02:00", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "results", "v_id": "5", "v": { "postTime": "2011-02-06 01:02:02", "subject": "tigergraph" }, "v_type": "post" }, { "v_set": "results", "v_id": "6", "v": { "postTime": "2011-02-05 02:02:05", "subject": "tigergraph" }, "v_type": "post" } ] }



WHERE NOT limitations

The NOT operator may not be used in combination with the .type attribute selector. To check if an edge or vertex type is not equal to a given type, use the != operator. See the example below.

The following example shows the equivalence of using WHERE as a type filter as well as its limitations.

SELECT WHERE using AND/OR
# finds female persons in the social network. all of the following statements
# are equivalent (i.e., produce the same results)
CREATE QUERY findFemaleMembers() FOR GRAPH socialNet
{
    allVertices = {ANY}; # includes all posts and persons
    females = SELECT v FROM allVertices:v
              WHERE v.type == "person" AND
                    v.gender != "Male";

    females = SELECT v FROM allVertices:v
              WHERE v.type == "person" AND
                    v.gender == "Female";

    females = SELECT v FROM allVertices:v
              WHERE v.type == "person" AND
                    NOT v.gender == "Male";

    females = SELECT v FROM allVertices:v
              WHERE v.type != "post" AND
                    NOT v.gender == "Male";

    # does not compile. cannot use NOT operator in combination with type attribute
    #females = SELECT v FROM allVertices:v
    #          WHERE NOT v.type != "person" AND
    #                NOT v.gender == "Male";

    # does not compile. cannot use NOT operator in combination with type attribute
    #females = SELECT v FROM allVertices:v
    #          WHERE NOT v.type == "post" AND
    #                NOT v.gender == "Male";

    personVertices = {person.*};
    females = SELECT v FROM personVertices:v
              WHERE NOT v.gender == "Male";

    females = SELECT v FROM personVertices:v
              WHERE v.gender != "Male";

    females = SELECT v FROM personVertices:v
              WHERE v.gender != "Male" AND true;

    females = SELECT v FROM personVertices:v
              WHERE v.gender != "Male" OR false;

    PRINT females;
}


Results
GSQL > RUN QUERY findFemaleMembers() { "debug": "", "error": false, "message": "", "results": [ { "v_set": "females", "v_id": "person2", "v": { "gender": "Female", "id": "person2" }, "v_type": "person" }, { "v_set": "females", "v_id": "person4", "v": { "gender": "Female", "id": "person4" }, "v_type": "person" }, { "v_set": "females", "v_id": "person5", "v": { "gender": "Female", "id": "person5" }, "v_type": "person" } ] }

The following example uses edge attributes to determine which workers are registered as full time for some company.

WHERE using edge attributes
# find all workers who are full time at some company
CREATE QUERY fullTimeWorkers() FOR GRAPH workNet {
    start = {person.*};
    fullTimeWorkers = SELECT v FROM start:v -(worksFor:e)-> company:t
                      WHERE e.fullTime;    # fullTime is a boolean attribute on the edge
    PRINT fullTimeWorkers;
}


Results
GSQL > RUN QUERY fullTimeWorkers() { "debug": "", "error": false, "message": "", "results": [ { "v_set": "fullTimeWorkers", "v_id": "person1", "v": { "interestList": [ "management", "financial" ], "skillSet": [3,2,1], "skillList": [1,2,3], "locationId": "us", "interestSet": [ "financial", "management" ], "id": "person1" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person2", "v": { "interestList": ["engineering"], "skillSet": [6,3,2,5], "skillList": [2,3,5,6], "locationId": "chn", "interestSet": ["engineering"], "id": "person2" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person3", "v": { "interestList": ["teaching"], "skillSet": [6,1,4], "skillList": [4,1,6], "locationId": "jp", "interestSet": ["teaching"], "id": "person3" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person4", "v": { "interestList": ["football"], "skillSet": [10,1,4], "skillList": [4,1,10], "locationId": "us", "interestSet": ["football"], "id": "person4" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person6", "v": { "interestList": [ "music", "art" ], "skillSet": [10,7], "skillList": [7,10], "locationId": "jp", "interestSet": [ "art", "music" ], "id": "person6" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person8", "v": { "interestList": ["management"], "skillSet": [2,1,5], "skillList": [1,5,2], "locationId": "chn", "interestSet": ["management"], "id": "person8" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person9", "v": { "interestList": [ "financial", "teaching" ], "skillSet": [2,7,4], "skillList": [4,7,2], "locationId": "us", "interestSet": [ "teaching", "financial" ], "id": "person9" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person10", "v": { "interestList": [ "football", "sport" ], "skillSet": [3], "skillList": [3], "locationId": "us", "interestSet": [ "sport", "football" ], "id": "person10" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": 
"person11", "v": { "interestList": [ "sport", "football" ], "skillSet": [10], "skillList": [10], "locationId": "can", "interestSet": [ "football", "sport" ], "id": "person11" }, "v_type": "person" }, { "v_set": "fullTimeWorkers", "v_id": "person12", "v": { "interestList": [ "music", "engineering", "teaching", "teaching", "teaching" ], "skillSet": [2,1,5], "skillList": [1,5,2,2,2], "locationId": "jp", "interestSet": [ "teaching", "engineering", "music" ], "id": "person12" }, "v_type": "person" } ] }


If multiple edge types are specified in an edge-induced selection, the WHERE clause should use OR to separate the conditions for each edge type or each target vertex type. For example:

Multiple Edge Type WHERE clause
CREATE QUERY multipleEdgeTypeWhereEx(vertex<person> m1) FOR GRAPH socialNet {
    allUser = {m1};
    FilteredUser = SELECT s
        FROM allUser:s - ((posted|liked|friend):e) -> (post|person):t
        # WHERE e.actionTime > epoch_to_datetime(1) AND t.gender == "Male";
        WHERE ( e.type == "liked" AND e.actionTime > epoch_to_datetime(1) ) OR
              ( e.type == "friend" AND t.gender == "Male" )
        ;
    PRINT FilteredUser;
}

The above query compiles. However, if the commented-out WHERE clause (line 5) is used instead, the query does not compile. The edge-type conflict check reports an error because that condition uses e.actionTime, which only "liked" edges have, together with t.gender, which is only available through "friend" edges, without separating them with OR.

ACCUM and POST-ACCUM Clauses

The optional ACCUM and POST-ACCUM clauses enable sophisticated aggregation and other computations across the set of vertices or edges selected by the preceding FROM, SAMPLE, and WHERE clauses. A query can contain one or both of these clauses. The statements in an ACCUM clause are applied for every edge in an edge-induced selection or every vertex in a vertex-induced selection.

If there is more than one statement in the ACCUM clause, the statements are separated by commas and executed sequentially for each selected element. However, the TigerGraph system uses parallelism to improve performance. Within an ACCUM clause, each edge is handled by a separate process. As such, there is no fixed order in which the edges are processed within the ACCUM clause, and the edges should not be treated as executing sequentially. The accumulators are mutex variables shared among these processes. The results of any accumulation within the ACCUM clause are not complete until all edges have been traversed. Any inspection of an intermediate result within the ACCUM clause sees incomplete data and may not be meaningful.


The statements within the ACCUM clause are executed sequentially for a given vertex or edge.  However, there is no fixed order in which a vertex set or edge set is processed.
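One consequence of the unordered processing is that only order-insensitive accumulation gives a predictable final value. The Python sketch below (a model, not GSQL) replays the same hypothetical edge list in several shuffled orders and applies a sum-style accumulation, analogous to t.@count += 1, once per edge. The run_accum helper and the edge data are invented for illustration, and real parallel execution is only approximated here by shuffling.

```python
import random

def run_accum(edges, seed):
    """Apply a per-target count increment once per edge, in a shuffled order."""
    order = list(edges)
    random.Random(seed).shuffle(order)   # stand-in for an unknown processing order
    count = {}
    for _source, target in order:
        count[target] = count.get(target, 0) + 1
    return count

# Hypothetical (source, target) edges.
edges = [("a", "x"), ("b", "x"), ("c", "y"), ("a", "y"), ("b", "z")]

results = [run_accum(edges, seed) for seed in range(5)]
print(results[0])
```

Every ordering ends with the same totals, which is why the final accumulator value is meaningful while an intermediate value observed during the clause is not.
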

The optional POST-ACCUM clause enables aggregation and other computations across the set of vertices (but not edges) selected by the preceding clauses. POST-ACCUM can be used without ACCUM. If it is preceded by an ACCUM clause, then it can be used for 2-stage accumulative computation: a first stage in ACCUM followed by a second stage in POST-ACCUM.

Each statement within the POST-ACCUM clause can refer to either source vertices or target vertices but not both.

In edge-induced selection, since the ACCUM clause iterates over edges, and often two edges will connect to the same source vertex or to the same target vertex, the ACCUM clause can be repeated multiple times for one vertex.

Operations that are to be performed exactly once per vertex should be performed in the POST-ACCUM clause.


The primary purpose of the ACCUM or POST-ACCUM clause is to collect information about the graph by updating accumulators (via += or =). See the "Accumulator" section for details on the += operation. However, other kinds of statements (e.g., branching, iteration, local assignments) are permitted to support more complex computations or to log activity. The EBNF syntax below defines the allowable kinds of statements that can occur within an ACCUM or POST-ACCUM.  The DMLSubStmt list is similar to the queryBodyStmt list which applies to statements outside of a SELECT block; it is important to note the differences.  Each of these statement types is discussed in one of the main sections of this reference document.


EBNF for ACCUM and POST-ACCUM Clauses
accumClause := ACCUM DMLSubStmtList
postAccumClause := POST-ACCUM DMLSubStmtList

DMLSubStmtList := DMLSubStmt ["," DMLSubStmt]*

DMLSubStmt := assignStmt         // Assignment (including vertex-attached accumulate)
            | funcCallStmt       // Function Call
            | gAccumAccumStmt    // Assignment (global accumulate)
            | vAccumFuncCall     // Function Call
            | localVarDeclStmt   // Declaration
            | DMLSubCaseStmt     // Control Flow
            | DMLSubIfStmt       // Control Flow
            | DMLSubWhileStmt    // Control Flow
            | DMLSubForEachStmt  // Control Flow
            | BREAK              // Control Flow
            | CONTINUE           // Control Flow
            | insertStmt         // Data Modification
            | DMLSubDeleteStmt   // Data Modification
            | logStmt            // Output


Note that the DML-sub-statements include the global accumulator accumulation statement (gAccumAccumStmt) but not the global accumulator assignment statement (gAccumAssignStmt). Within these clauses, global accumulators may perform accumulation (+=) but not assignment (=).


There are additional restrictions on DML-sub level statements:

  • Global variable assignment is permitted in ACCUM or POST-ACCUM clauses, but the change in value will not take place until exiting the clause. Therefore, if there are multiple assignment statements for the same variable, only the final one will take effect.
  • Vertex attribute assignment "=" is not permitted in an ACCUM clause. However, edge attribute assignment is permitted. This is because the ACCUM clause iterates over an edge set.
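The deferred effect of global variable assignment in the first restriction can be pictured with a small Python model (not GSQL). The run_clause helper and its tuple-based statement encoding are hypothetical; the sketch assumes that reads inside the clause keep seeing the value from before the clause and that the last buffered assignment is the one applied on exit.

```python
_NO_ASSIGNMENT = object()  # sentinel: no assignment has been buffered yet

def run_clause(initial_value, statements):
    """Run ('assign', value) and ('read', None) statements on one global variable.

    Assignments are buffered and applied only when the clause exits.
    """
    visible = initial_value        # value seen by every read inside the clause
    pending = _NO_ASSIGNMENT       # most recent buffered assignment
    reads = []
    for op, arg in statements:
        if op == "assign":
            pending = arg
        elif op == "read":
            reads.append(visible)
    final_value = visible if pending is _NO_ASSIGNMENT else pending
    return reads, final_value

reads, final_value = run_clause(
    0, [("assign", 10), ("read", None), ("assign", 20), ("read", None)]
)
print(reads, final_value)
```
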

Aliases and ACCUM/POST-ACCUM Iteration Model

To reference each element of the selected set, use the aliases defined in the FROM clause.  For example, assume that we have the following aliases:

Example of vertex and edge aliases
FROM Source:s -(edgeTypes:e)-> targetTypes:t   # edge-induced selection
FROM Source:v                                  # vertex-induced selection

Let (V1, V2,... Vn) be the vertices in the vertex-induced selection. The following pseudocode emulates ACCUM clause behavior.

Model for ACCUM behavior in vertex-induced selection
FOREACH v in (V1,V2,...Vn) DO   # iterations may occur in parallel, in unknown order
    DMLSubStmts referencing v
DONE

Let E = (E1, E2,... En) be the edges in the edge-induced selected set. Further, let S = (S1,S2,...Sn) and T = (T1,T2,...Tn) be the multisets (bags) of source vertices and target vertices which correspond to the edge set. S and T are bags, because they can contain repeated elements.

Model for ACCUM behavior in edge-induced selection
FOREACH i in (1..n) DO   # iterations may occur in parallel, in unknown order
    DMLSubStmts referencing e, s, t, which really means e_i, s_i, t_i
DONE

Note that any reference to the source alias s or target alias t is for the endpoint vertices of the current edge.

Similarly, the POST-ACCUM clause acts like a FOREACH loop on the vertex result set specified in the SELECT clause (e.g., either S or T).
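The two loop models can be placed side by side in a short Python sketch (a model of the iteration pattern, not GSQL). The edge list is hypothetical and the selection is assumed to be SELECT s, so the result set is the set of distinct source vertices.

```python
# Hypothetical edge-induced selection: (source, target) pairs.
edges = [("s1", "t1"), ("s1", "t2"), ("s2", "t2")]

accum_runs = 0
visits_per_source = {}
for s, t in edges:                 # ACCUM: one iteration per selected edge
    accum_runs += 1
    visits_per_source[s] = visits_per_source.get(s, 0) + 1

# SELECT s: the vertex result set holds each source vertex once.
result_set = sorted({s for s, _ in edges})

post_accum_runs = 0
for v in result_set:               # POST-ACCUM: one iteration per result vertex
    post_accum_runs += 1

print(accum_runs, post_accum_runs, visits_per_source)
```

Here s1 is visited twice by the ACCUM loop because it is the source of two edges, but only once by the POST-ACCUM loop, which is why once-per-vertex work belongs in POST-ACCUM.
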

Edge/Vertex Type Inference and Conflict

If multiple edge types are specified in an edge-induced selection, each statement in the ACCUM clause is checked for edge-type conflicts. If a statement is effective for only a subset of the edge types, it is not executed on the other edge types. For example:

Multiple Edge Type ACCUM statement check
CREATE QUERY multipleEdgeTypeCheckEx(vertex<person> m1) FOR GRAPH socialNet {
    ListAccum<STRING> @@testList1, @@testList2, @@testList3;
    allUser = {m1};
    allUser = SELECT s
        FROM allUser:s - ((posted|liked|friend):e) -> (post|person):t
        ACCUM @@testList1 += to_string(datetime_to_epoch(e.actionTime)),
              @@testList2 += t.gender,
              @@testList3 += to_string(datetime_to_epoch(e.actionTime)) + t.gender
        ;
    PRINT @@testList1, @@testList2, @@testList3;
}

In the above example, line 6 is executed only on "liked" edges, because actionTime is an attribute of "liked" edges only. Similarly, line 7 is executed only on "friend" edges, because gender is an attribute of "person" vertices only, and only "friend" edges have "person" as the target vertex type. However, line 8 causes a compilation error, because no edge type supports every part of the statement: "liked" edges do not have t.gender, and "friend" edges do not have e.actionTime.
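The reasoning about lines 6, 7, and 8 can be reproduced with a small Python model. This is only a sketch of the rule as described in the text, not the compiler's actual algorithm; the attribute tables are transcribed from the socialNet schema, and the effective_edge_types helper is hypothetical.

```python
# Attribute tables transcribed from the socialNet schema.
EDGE_ATTRS = {"posted": set(), "liked": {"actionTime"}, "friend": set()}
EDGE_TARGET = {"posted": "post", "liked": "post", "friend": "person"}
VERTEX_ATTRS = {"person": {"id", "gender"}, "post": {"subject", "postTime"}}

def effective_edge_types(edge_attrs_used, target_attrs_used):
    """Return the edge types on which a statement can run.

    A statement is effective on an edge type only if that edge type and its
    target vertex type provide every attribute the statement references.
    """
    return sorted(
        etype for etype in EDGE_ATTRS
        if edge_attrs_used <= EDGE_ATTRS[etype]
        and target_attrs_used <= VERTEX_ATTRS[EDGE_TARGET[etype]]
    )

line6 = effective_edge_types({"actionTime"}, set())        # uses e.actionTime
line7 = effective_edge_types(set(), {"gender"})            # uses t.gender
line8 = effective_edge_types({"actionTime"}, {"gender"})   # uses both

print(line6, line7, line8)
```

An empty list for line 8 corresponds to the compilation error: there is no edge type on which the whole statement would be valid.
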

We strongly suggest that if multiple edge types are specified in an edge-induced selection, the ACCUM clause should use a CASE statement (see Section "Control Flow Statements" for more details) to separate the operations for each edge type or each target vertex type (or each combination of target vertex type and edge type). The edge-type conflict check then examines the ACCUM statements inside each THEN/ELSE block based on its condition. For example:

Multiple Edge Type ACCUM statement check 2
CREATE QUERY multipleEdgeTypeCheckEx2(vertex<person> m1) FOR GRAPH socialNet {
    ListAccum<STRING> @@testList1;
    allUser = {m1};
    allUser = SELECT s
        FROM allUser:s - ((posted|liked|friend):e) -> (post|person):t
        ACCUM CASE
                WHEN e.type == "liked" THEN      # for liked edges
                  @@testList1 += to_string(datetime_to_epoch(e.actionTime))
                WHEN e.type == "friend" THEN     # for friend edges
                  @@testList1 += t.gender
                ELSE      # For the remaining edge type, which is posted edges
                  @@testList1 += to_string(datetime_to_epoch(t.postTime))
              END
        ;
    PRINT @@testList1;
}

The above query compiles. However, if lines 8 and 10 are switched, the edge-type conflict check generates errors, because "liked" edges do not support t.gender and "friend" edges do not support e.actionTime.

Similar to the ACCUM clause, if multiple source/target vertex types are specified in an edge-induced selection and the POST-ACCUM clause accesses the source/target vertex, each statement in the POST-ACCUM clause is checked for source/target vertex type conflicts. If a POST-ACCUM statement is effective for only a subset of the source/target vertex types, it is not executed on the other source/target vertex types.

As with the ACCUM clause, we strongly suggest that if multiple source/target vertex types are specified in an edge-induced selection and the POST-ACCUM clause accesses the source/target vertex, the POST-ACCUM clause should use a CASE statement (see Section "Control Flow Statements" for more details) to separate the operations for each source/target vertex type. The vertex-type conflict check then examines the statements inside each THEN/ELSE block based on its condition.

ACCUM and POST-ACCUM Examples

We now show several examples. This example demonstrates how ACCUM or POST-ACCUM can be used to count the number of vertices in the given set.

Accum and PostAccum Semantics
#Show Accum PostAccum Behavior
CREATE QUERY accumPostAccumSemantics() FOR GRAPH workNet {

    SumAccum<INT> @@vertexOnlyAccum;
    SumAccum<INT> @@vertexOnlyPostAccum;

    SumAccum<INT> @@vertexOnlyWhereAccum;
    SumAccum<INT> @@vertexOnlyWherePostAccum;

    SumAccum<INT> @@sourceWithEdgeAccum;
    SumAccum<INT> @@sourceWithEdgePostAccum;

    SumAccum<INT> @@targetWithEdgeAccum;
    SumAccum<INT> @@targetWithEdgePostAccum;

    #Seed start set with all company vertices
    start = {company.*};

    #Select all vertices in source set start
    selectVertexSet = SELECT v from start:v
                      #Happens once for each vertex discovered
                      ACCUM @@vertexOnlyAccum += 1
                      #Happens once for each vertex in the result set "v"
                      POST-ACCUM @@vertexOnlyPostAccum += 1;

    #Select all vertices in source set start with a where constraint
    selectVertexSetWhere = SELECT v from start:v WHERE (v.country == "us")
                           #Happens once for each vertex discovered that also
                           # meets the constraint condition
                           ACCUM @@vertexOnlyWhereAccum += 1
                           #Happens once for each vertex in the result set "v"
                           POST-ACCUM @@vertexOnlyWherePostAccum += 1;

    #Select all source "s" vertices in set start and explore all "worksFor" edge paths
    selectSourceWithEdge = SELECT s from start:s -(worksFor)-> :t
                           #Happens once for each "worksFor" edge discovered
                           ACCUM @@sourceWithEdgeAccum += 1
                           #Happens once for each vertex in result set "s" (source)
                           POST-ACCUM @@sourceWithEdgePostAccum += 1;

    #Select all target "t" vertices found from exploring all "worksFor" edge paths from set start
    selectTargetWithEdge = SELECT t from start:s -(worksFor)-> :t
                           #Happens once for each "worksFor" edge discovered
                           ACCUM @@targetWithEdgeAccum += 1
                           #Happens once for each vertex in result set "t" (target)
                           POST-ACCUM @@targetWithEdgePostAccum += 1;

    PRINT @@vertexOnlyAccum;
    PRINT @@vertexOnlyPostAccum;

    PRINT @@vertexOnlyWhereAccum;
    PRINT @@vertexOnlyWherePostAccum;

    PRINT @@sourceWithEdgeAccum;
    PRINT @@sourceWithEdgePostAccum;

    PRINT @@targetWithEdgeAccum;
    PRINT @@targetWithEdgePostAccum;
}


Result
GSQL > RUN QUERY accumPostAccumSemantics() { "error": false, "message": "", "results": [ {"@@vertexOnlyAccum": 5}, {"@@vertexOnlyPostAccum": 5}, {"@@vertexOnlyWhereAccum": 2}, {"@@vertexOnlyWherePostAccum": 2}, {"@@sourceWithEdgeAccum": 17}, {"@@sourceWithEdgePostAccum": 5}, {"@@targetWithEdgeAccum": 17}, {"@@targetWithEdgePostAccum": 12} ] }

This example uses ACCUM to find all the subjects a user posted about.

Vertex ACCUM Example
# Show each user's post subject
CREATE QUERY userPosts() FOR GRAPH socialNet {
    ListAccum<STRING> @personPosts;
    start = {person.*};

    # Find all user post topics and append them to the vertex list accum
    userPostings = SELECT s FROM start:s -(posted)-> :g
                   ACCUM s.@personPosts += g.subject;

    PRINT userPostings;
}


Result
GSQL > RUN QUERY userPosts() { "error": false, "message": "", "results": [ { "v_set": "userPostings", "v_id": "person1", "v": { "gender": "Male", "@personPosts": ["Graphs"], "id": "person1" }, "v_type": "person" }, { "v_set": "userPostings", "v_id": "person2", "v": { "gender": "Female", "@personPosts": ["tigergraph"], "id": "person2" }, "v_type": "person" }, { "v_set": "userPostings", "v_id": "person3", "v": { "gender": "Male", "@personPosts": ["query languages"], "id": "person3" }, "v_type": "person" } /*** other vertices omitted ***/ ] }


This example shows each person's posted vertices and each person's like behaviors (liked edges).

ACCUM<VERTEX> and ACCUM<EDGE> Example
# Show each user's post and liked post time
CREATE QUERY userPosts2() FOR GRAPH socialNet {
    ListAccum<VERTEX> @personPosts;
    ListAccum<EDGE> @personLikedInfo;
    start = {person.*};

    # Find all user post topics and append them to the vertex list accum
    userPostings = SELECT s FROM start:s -(posted)-> :g
                   ACCUM s.@personPosts += g;

    userPostings = SELECT s from start:s -(liked:e)-> :g
                   ACCUM s.@personLikedInfo += e;

    PRINT start;
}


Result
GSQL > RUN QUERY userPosts2() { "results": [ { "v_id": "person1", "v_type": "person", "v_set": "start", "v": { "id": "person1", "gender": "Male", "@personLikedInfo": [ { "e_type": "liked", "from_id": "person1", "from_type": "person", "to_id": "0", "to_type": "post", "directed": true, "attributes": { "actionTime": "2010-01-11 11:32:00" } } ], "@personPosts": [ "0" ] } }, { "v_id": "person2", "v_type": "person", "v_set": "start", "v": { "id": "person2", "gender": "Female", "@personLikedInfo": [ { "e_type": "liked", "from_id": "person2", "from_type": "person", "to_id": "0", "to_type": "post", "directed": true, "attributes": { "actionTime": "2010-01-12 10:52:15" } }, { "e_type": "liked", "from_id": "person2", "from_type": "person", "to_id": "3", "to_type": "post", "directed": true, "attributes": { "actionTime": "2010-01-11 16:02:26" } } ], "@personPosts": [ "1" ] } }, /*** other vertices omitted ***/ ], "error": false, "message": "" }


This example counts the total number of times each topic is used.

Global ACCUM Example
# Show number of total posts by topic
CREATE QUERY userPostsByTopic() FOR GRAPH socialNet {
    MapAccum<STRING, INT> @@postTopicCounts;
    start = {person.*};

    # Append subject and update the appearance count in the global map accum
    posts = SELECT g FROM start -(posted)-> :g
            ACCUM @@postTopicCounts += (g.subject -> 1);

    PRINT @@postTopicCounts;
}


Result
GSQL > RUN QUERY userPostsByTopic() { "error": false, "message": "", "results": [{"@@postTopicCounts": { "tigergraph": 3, "cats": 5, "coffee": 1, "Graphs": 2, "query languages": 1 }}] }
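The counting behavior visible in the result above, where each (subject -> 1) pair adds to the entry for an existing key, resembles a plain counter. The Python sketch below is an analogy rather than GSQL, and the list of subjects is a hypothetical sample, not the socialNet data.

```python
from collections import Counter

# Hypothetical post subjects, one per traversed "posted" edge.
subjects = ["cats", "Graphs", "cats", "coffee", "tigergraph"]

post_topic_counts = Counter()
for subject in subjects:
    # Analogous to accumulating the pair (subject -> 1) into the map:
    # a new key is created with value 1, an existing key is incremented.
    post_topic_counts[subject] += 1

print(dict(post_topic_counts))
```
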


This is an example of using ACCUM and POST-ACCUM in conjunction. The ACCUM traverses the graph and finds all people who live and work in the same country. After this is determined, POST-ACCUM examines each vertex (person) to see if they work where they live.

Vertex POST-ACCUM Example
#Show all persons who both work and live in the same country
CREATE QUERY residentEmployees() FOR GRAPH workNet {

    ListAccum<STRING> @company;
    OrAccum @worksAndLives;

    start = {person.*};

    employees = SELECT s FROM start:s -(worksFor)-> :c
                #If a person works for a company in the same country where they live
                # add the company to the list
                ACCUM CASE WHEN (s.locationId == c.country) THEN
                          s.@company += c.id
                      END
                #Check each vertex and see if a person works where they live
                POST-ACCUM CASE WHEN (s.@company.size() > 0) THEN
                               s.@worksAndLives += True
                           ELSE
                               s.@worksAndLives += False
                           END;

    PRINT employees WHERE (employees.@worksAndLives == True);
}


Result
GSQL > RUN QUERY residentEmployees() { "error": false, "message": "", "results": [ { "v_set": "employees", "v_id": "person1", "v": { "interestList": ["management","financial"], "skillSet": [3,2,1], "skillList": [1,2,3], "@worksAndLives": true, "locationId": "us", "interestSet": ["financial","management"], "id": "person1", "@company": ["company1"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person2", "v": { "interestList": ["engineering"], "skillSet": [6,3,2,5], "skillList": [2,3,5,6], "@worksAndLives": true, "locationId": "chn", "interestSet": ["engineering"], "id": "person2", "@company": ["company2"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person10", "v": { "interestList": ["football","sport"], "skillSet": [3], "skillList": [3], "@worksAndLives": true, "locationId": "us", "interestSet": ["sport","football"], "id": "person10", "@company": ["company1"] }, "v_type": "person" }, { "v_set": "employees", "v_id": "person11", "v": { "interestList": ["sport","football"], "skillSet": [10], "skillList": [10], "@worksAndLives": true, "locationId": "can", "interestSet": ["football","sport"], "id": "person11", "@company": ["company5"] }, "v_type": "person" } ] }


This is an example of a query that uses only a POST-ACCUM clause. It counts the number of people with a particular gender.

Global POST-ACCUM Example
#Count the number of persons of a given gender
CREATE QUERY personGender(STRING gender) FOR GRAPH socialNet {
  SumAccum<INT> @@genderCount;

  start = {ANY};

  # Select all person vertices and check the gender attribute
  # (the condition refers to the vertex alias v declared in the FROM clause)
  friends = SELECT v
            FROM start:v
            WHERE v.type == "person"
            POST-ACCUM CASE WHEN (v.gender == gender) THEN
                         @@genderCount += 1
                       END;

  PRINT @@genderCount;
}


Result
GSQL > RUN QUERY personGender("Female") { "error": false, "message": "", "results": [{"@@genderCount": 3}] }
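The two-phase behavior used in the residentEmployees example can be modeled outside of GSQL. The following is a minimal Python sketch, not TigerGraph code: it assumes that ACCUM runs once per matched edge and that POST-ACCUM runs once per selected vertex after all edges are processed. The data, the function name resident_employees, and the dictionaries standing in for accumulators are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy data (not the workNet data set).
person_country = {"p1": "us", "p2": "chn", "p3": "jp"}
company_country = {"c1": "us", "c2": "chn", "c3": "us"}
works_for = [("p1", "c1"), ("p1", "c3"), ("p2", "c2"), ("p3", "c1")]


def resident_employees(person_country, company_country, works_for):
    companies = defaultdict(list)  # stands in for ListAccum<STRING> @company
    works_and_lives = {}           # stands in for OrAccum @worksAndLives

    # ACCUM phase: executed once for every matched edge (s, c).
    for s, c in works_for:
        if person_country[s] == company_country[c]:
            companies[s].append(c)

    # POST-ACCUM phase: executed once per source vertex, after all edges.
    for s in {s for s, _ in works_for}:
        works_and_lives[s] = len(companies[s]) > 0

    # PRINT ... WHERE @worksAndLives == True
    return sorted(s for s, flag in works_and_lives.items() if flag)


print(resident_employees(person_country, company_country, works_for))
```

In this toy data, p3 lives in "jp" but works for a company in "us", so it is the only person filtered out by the final condition.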

HAVING Clause

The optional HAVING clause provides constraints on the result set of the SELECT. The constraints are applied after ACCUM and POST-ACCUM actions. This differs from the WHERE clause, which is applied before the ACCUM and POST-ACCUM actions.

EBNF for HAVING Clause
havingClause := HAVING condition

A HAVING clause can only be used if there is an ACCUM or POST-ACCUM clause. The condition is applied to each vertex in the SELECT set (either source or target vertices) which also fulfilled the FROM and WHERE conditions. The HAVING clause is intended to test one or more of the accumulator variables that were updated in the ACCUM or POST-ACCUM clause, though the condition may be anything that evaluates to a boolean value. If the condition is false for a particular vertex, then that vertex is excluded from the result set.
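The ordering of WHERE, ACCUM, and HAVING described above can be illustrated with a small model. This is a hedged Python sketch rather than GSQL; the edge list, the function select_with_having, and the use of a Counter in place of a SumAccum are assumptions made for illustration only.

```python
from collections import Counter

# Hypothetical (person, post) edges, one per "posted" or "liked" action.
edges = [
    ("ann", "post1"), ("ann", "post2"), ("ann", "post3"),
    ("bob", "post1"),
    ("cat", "post2"), ("cat", "post3"),
]


def select_with_having(edges, where, having):
    activity = Counter()  # stands in for SumAccum<INT> @activityAmount
    for v, tgt in edges:
        if where(v, tgt):       # WHERE: filters edges before ACCUM runs
            activity[v] += 1    # ACCUM: v.@activityAmount += 1
    # HAVING: filters vertices after accumulation has finished
    return sorted(v for v in activity if having(activity[v]))


active = select_with_having(
    edges,
    where=lambda v, tgt: True,
    having=lambda amount: amount >= 2,
)
print(active)
```

Because the HAVING predicate sees the final accumulator value, it can express a threshold such as "at least 2 actions", which a WHERE condition cannot, since WHERE is evaluated before any accumulation happens.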

The following example demonstrates using the HAVING clause to constrain a result set based on the vertex accumulator variable which was updated during the ACCUM clause.

Example 1. HAVING
# find all persons meeting a given activityThreshold, based on how many posts or likes a person has made
CREATE QUERY activeMembers(int activityThreshold) FOR GRAPH socialNet {
  SumAccum<int> @activityAmount;

  start = {person.*};

  result = SELECT v
           FROM start:v -(:e)-> post:tgt
           ACCUM v.@activityAmount +=1
           HAVING v.@activityAmount >= activityThreshold;

  PRINT result;
}

If the activityThreshold parameter is set to 3, the query returns 5 vertices:

Example 1 Results
GSQL > RUN QUERY activeMembers(3) { "error": false, "message": "", "results": [ { "v_set": "result", "v_id": "person2", "v": { "gender": "Female", "@activityAmount": 3, "id": "person2" }, "v_type": "person" }, { "v_set": "result", "v_id": "person5", "v": { "gender": "Female", "@activityAmount": 3, "id": "person5" }, "v_type": "person" }, { "v_set": "result", "v_id": "person6", "v": { "gender": "Male", "@activityAmount": 3, "id": "person6" }, "v_type": "person" }, { "v_set": "result", "v_id": "person7", "v": { "gender": "Male", "@activityAmount": 3, "id": "person7" }, "v_type": "person" }, { "v_set": "result", "v_id": "person8", "v": { "gender": "Male", "@activityAmount": 3, "id": "person8" }, "v_type": "person" } ] }

If the activityThreshold parameter is set to 2, the query would return 8 vertices. With activityThreshold = 4, the query would return no vertices.

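The following example shows that a SELECT statement whose HAVING condition is always true is equivalent to the same statement without a HAVING clause. Note that the query runs both statements against the same accumulator, so each @activityAmount value in the output is twice the value that a single statement would produce.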

Example 2. HAVING with literal condition
# print the activity of every person, based on how many posts or likes a person has made
CREATE QUERY printMemberActivity() FOR GRAPH socialNet {
  SumAccum<int> @activityAmount;

  start = {person.*};

  ### --- equivalent statements -----
  result = SELECT v
           FROM start:v -(:e)-> post:tgt
           ACCUM v.@activityAmount +=1
           HAVING true;

  result = SELECT v
           FROM start:v -(:e)-> post:tgt
           ACCUM v.@activityAmount +=1;
  ### -----

  PRINT result;
}


Example 2 Results
GSQL > RUN QUERY printMemberActivity() { "error": false, "message": "", "results": [ { "v_set": "result", "v_id": "person1", "v": { "gender": "Male", "@activityAmount": 4, "id": "person1" }, "v_type": "person" }, { "v_set": "result", "v_id": "person2", "v": { "gender": "Female", "@activityAmount": 6, "id": "person2" }, "v_type": "person" }, { "v_set": "result", "v_id": "person3", "v": { "gender": "Male", "@activityAmount": 4, "id": "person3" }, "v_type": "person" }, { "v_set": "result", "v_id": "person4", "v": { "gender": "Female", "@activityAmount": 4, "id": "person4" }, "v_type": "person" }, { "v_set": "result", "v_id": "person5", "v": { "gender": "Female", "@activityAmount": 6, "id": "person5" }, "v_type": "person" }, { "v_set": "result", "v_id": "person6", "v": { "gender": "Male", "@activityAmount": 6, "id": "person6" }, "v_type": "person" }, { "v_set": "result", "v_id": "person7", "v": { "gender": "Male", "@activityAmount": 6, "id": "person7" }, "v_type": "person" }, { "v_set": "result", "v_id": "person8", "v": { "gender": "Male", "@activityAmount": 6, "id": "person8" }, "v_type": "person" } ] }

The following shows an example of equivalent result sets from using WHERE vs. HAVING. Recall that the WHERE clause is evaluated before the ACCUM and that the HAVING clause is evaluated after the ACCUM. Both constrain the result set based on a condition that vertices must meet.

Example 3. HAVING vs. WHERE
# Consider the amount of activity that each male person has, based on the number of posts he made and the number
# of posts he liked. Because the gender of the vertex does not change, evaluating whether the person vertex is male
# before (WHERE) the ACCUM clause or after (HAVING) the ACCUM clause does not change the result. However, if the
# condition in the HAVING clause could change within the ACCUM clause, these statements would produce different results
CREATE QUERY activeMaleMembers() FOR GRAPH socialNet {
  SumAccum<INT> @activityAmount;

  start = {person.*};

  ### --- statements produce equivalent results
  result = SELECT v
           FROM start:v -(:e)-> post:tgt
           WHERE v.gender == "Male"
           ACCUM v.@activityAmount +=1;

  result = SELECT v
           FROM start:v -(:e)-> post:tgt
           ACCUM v.@activityAmount +=1
           HAVING v.gender == "Male";
  ### -----

  PRINT result;
}


Example 3 Results
GSQL > RUN QUERY activeMaleMembers() { "error": false, "message": "", "results": [ { "v_set": "result", "v_id": "person1", "v": { "gender": "Male", "@activityAmount": 4, "id": "person1" }, "v_type": "person" }, { "v_set": "result", "v_id": "person3", "v": { "gender": "Male", "@activityAmount": 4, "id": "person3" }, "v_type": "person" }, { "v_set": "result", "v_id": "person6", "v": { "gender": "Male", "@activityAmount": 6, "id": "person6" }, "v_type": "person" }, { "v_set": "result", "v_id": "person7", "v": { "gender": "Male", "@activityAmount": 6, "id": "person7" }, "v_type": "person" }, { "v_set": "result", "v_id": "person8", "v": { "gender": "Male", "@activityAmount": 6, "id": "person8" }, "v_type": "person" } ] }


The following example has a compilation error because the result set is taken from the source vertices, but the HAVING condition is checking the target vertices.

Example 4. HAVING the wrong vertex set
# find all person having a post subject about cats
# This query is illegal because the having condition is testing the wrong vertex set
CREATE QUERY printMemberAboutCats() FOR GRAPH socialNet {
  start = {person.*};

  result = SELECT v
           FROM start:v -(:e)-> post:tgt
           HAVING tgt.subject == "cats";

  PRINT result;
}

ORDER BY Clause

The optional ORDER BY clause sorts the result set.

EBNF for ORDER BY Clause
orderClause := ORDER BY expr [ASC | DESC] ["," expr [ASC | DESC]]*

ASC specifies ascending order (least value first), and DESC specifies descending order (greatest value first). If neither is specified, then ascending order is used. Each expr must refer to the attributes or accumulators of a member of the result set, and the expr must evaluate to a sortable value (e.g., a number or a string). ORDER BY offers hierarchical sorting by allowing a comma-separated list of expressions, sorting first by the leftmost expr.  It uses the next expression only to sort items where the current sort expr results in identical values.

The following example demonstrates the use of ORDER BY with multiple expressions. The returned vertex set is first ordered by the number of friends of the vertex, and then ordered by the number of coworkers of that vertex.

Example 1. ORDER BY Descending
# find the most popular people, sorting first based on the number of friends
# and then in case of a tie by the number of coworkers
CREATE QUERY topPopular() FOR GRAPH friendNet {
  SumAccum<INT> @numFriends;
  SumAccum<INT> @numCoworkers;

  start = {person.*};

  result = SELECT v
           FROM start -((friend|coworker):e)-> person:v
           ACCUM CASE WHEN e.type == "friend" THEN
                   v.@numFriends += 1
                 WHEN e.type == "coworker" THEN
                   v.@numCoworkers += 1
                 END
           ORDER BY v.@numFriends DESC, v.@numCoworkers DESC;

  PRINT result;
}



Example 1 Results
GSQL > RUN QUERY topPopular() { "error": false, "message": "", "results": [ { "v_set": "result", "v_id": "person9", "v": { "@numCoworkers": 3, "@numFriends": 5, "id": "person9" }, "v_type": "person" }, { "v_set": "result", "v_id": "person12", "v": { "@numCoworkers": 1, "@numFriends": 4, "id": "person12" }, "v_type": "person" }, { "v_set": "result", "v_id": "person8", "v": { "@numCoworkers": 1, "@numFriends": 4, "id": "person8" }, "v_type": "person" }, { "v_set": "result", "v_id": "person6", "v": { "@numCoworkers": 4, "@numFriends": 3, "id": "person6" }, "v_type": "person" }, { "v_set": "result", "v_id": "person1", "v": { "@numCoworkers": 3, "@numFriends": 3, "id": "person1" }, "v_type": "person" }, { "v_set": "result", "v_id": "person4", "v": { "@numCoworkers": 5, "@numFriends": 2, "id": "person4" }, "v_type": "person" }, { "v_set": "result", "v_id": "person2", "v": { "@numCoworkers": 3, "@numFriends": 2, "id": "person2" }, "v_type": "person" }, { "v_set": "result", "v_id": "person3", "v": { "@numCoworkers": 3, "@numFriends": 2, "id": "person3" }, "v_type": "person" }, { "v_set": "result", "v_id": "person10", "v": { "@numCoworkers": 1, "@numFriends": 2, "id": "person10" }, "v_type": "person" }, { "v_set": "result", "v_id": "person7", "v": { "@numCoworkers": 6, "@numFriends": 1, "id": "person7" }, "v_type": "person" }, { "v_set": "result", "v_id": "person5", "v": { "@numCoworkers": 5, "@numFriends": 1, "id": "person5" }, "v_type": "person" }, { "v_set": "result", "v_id": "person11", "v": { "@numCoworkers": 1, "@numFriends": 1, "id": "person11" }, "v_type": "person" } ] }
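The hierarchical sort performed by ORDER BY with two DESC expressions can be modeled with a tuple sort key. The following is a Python sketch, not GSQL; the rows are a hypothetical subset of (id, numFriends, numCoworkers) values, and the function name order_by_desc_desc is an assumption. Python's sort is stable, so rows with identical keys keep their input order; the source text does not state how GSQL orders such ties.

```python
# Hypothetical rows: (id, numFriends, numCoworkers)
rows = [
    ("person8", 4, 1),
    ("person9", 5, 3),
    ("person6", 3, 4),
    ("person1", 3, 3),
    ("person12", 4, 1),
]


def order_by_desc_desc(rows):
    # ORDER BY numFriends DESC, numCoworkers DESC:
    # the second key is consulted only when the first key ties.
    # Negating the values gives descending order for numeric keys.
    return sorted(rows, key=lambda r: (-r[1], -r[2]))


for row in order_by_desc_desc(rows):
    print(row)
```

Here person6 and person1 tie on the first key (3 friends), so the second key decides their order: person6 (4 coworkers) precedes person1 (3 coworkers).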

LIMIT Clause

The optional LIMIT clause sets constraints on the number and ranking of items included in the final result set.

EBNF for LIMIT Clause
limitClause := LIMIT ( expr | expr "," expr | expr OFFSET expr )

Each of the expr must evaluate to a nonnegative integer. To understand LIMIT, note that the tentative result set is held in the computer as a list of vertices. If the query has an ORDER BY clause, the order is specified; otherwise the list order is unknown. Assume we number the vertices as v1, v2, ..., vn. The LIMIT clause specifies a range of vertices, starting from a lower position in the list to an upper position.

There are three forms:

LIMIT scenarios
result = SELECT v FROM S -(:e)-> :v LIMIT k;           # case 1: k = Count
result = SELECT v FROM S -(:e)-> :v LIMIT j, k;        # case 2: j = Offset from the start of the list, k = Count
result = SELECT v FROM S -(:e)-> :v LIMIT k OFFSET j;  # case 3: k = Count, j = Offset from the start of the list

Case 1: LIMIT k

  • When a single expr is provided, LIMIT returns the first k elements from the tentative result set. If there are fewer than k elements available, then all elements will be returned in the result set. If k=5 and the tentative result set has at least 5 items, then the final result list will be [v1, v2, v3, v4, v5].

Case 2: LIMIT j, k

  • When a comma separates two expressions, LIMIT treats the first expression j as an offset. That is, it skips the first j items in the list. The second expr k gives the maximum number of items to include. If the list has at least 7 items, then LIMIT 2, 5 would return [v3, v4, v5, v6, v7].

Case 3: LIMIT k OFFSET j

  • The behavior of Case 3 is the same as that of Case 2, except that the syntax is different. The keyword OFFSET separates the two expressions, and the count comes before the offset, rather than vice versa. If the list has at least 7 items, then LIMIT 5 OFFSET 2 would return [v3, v4, v5, v6, v7].

If any of the expressions evaluate to a negative integer, the results are undefined.
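The three LIMIT forms reduce to a single "skip, then take" operation on the tentative result list. The following Python sketch models that behavior with a list slice; it is an illustration, not GSQL, and the helper name limit is hypothetical. Since the text leaves negative values undefined, this sketch simply rejects them.

```python
def limit(items, count, offset=0):
    """Model of LIMIT: skip `offset` items, then keep at most `count` items."""
    if count < 0 or offset < 0:
        # The source text says results are undefined here; this model refuses.
        raise ValueError("count and offset must be nonnegative")
    return items[offset:offset + count]


vs = ["v1", "v2", "v3", "v4", "v5", "v6", "v7"]

print(limit(vs, 5))             # Case 1: LIMIT 5
print(limit(vs, 5, offset=2))   # Case 2: LIMIT 2, 5  and  Case 3: LIMIT 5 OFFSET 2
print(limit(vs, 20, offset=5))  # fewer than 20 items remain after the offset
```

The last call shows the situation where the list runs out before the count is reached: only the items after the offset are returned.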

The following examples demonstrate the various forms of the LIMIT clause.

The first example shows the LIMIT clause used as an upper limit. When run with k = 4, it returns a result set of at most 4 elements.

LIMIT by some number
CREATE QUERY limitEx1(INT k) FOR GRAPH friendNet {
  start = {person.*};

  result1 = SELECT v
            FROM start:v
            ORDER BY v.id
            LIMIT k;

  PRINT result1.id;
}


Results
GSQL > RUN QUERY limitEx1(4) { "error": false, "message": "", "results": [ { "v_set": "result1", "v_id": "person1", "v": {"result1.id": "person1"}, "v_type": "person" }, { "v_set": "result1", "v_id": "person10", "v": {"result1.id": "person10"}, "v_type": "person" }, { "v_set": "result1", "v_id": "person11", "v": {"result1.id": "person11"}, "v_type": "person" }, { "v_set": "result1", "v_id": "person12", "v": {"result1.id": "person12"}, "v_type": "person" } ] }

The following example shows how to use the LIMIT clause with an offset.

LIMIT with lower-bound and size
CREATE QUERY limitEx2(INT j, INT k) FOR GRAPH friendNet {
  start = {person.*};

  result2 = SELECT v
            FROM start:v
            ORDER BY v.id
            LIMIT j, k;

  PRINT result2.id;
}


Results
GSQL > RUN QUERY limitEx2(2,3) { "error": false, "message": "", "results": [ { "v_set": "result2", "v_id": "person11", "v": {"result2.id": "person11"}, "v_type": "person" }, { "v_set": "result2", "v_id": "person12", "v": {"result2.id": "person12"}, "v_type": "person" }, { "v_set": "result2", "v_id": "person2", "v": {"result2.id": "person2"}, "v_type": "person" } ] }

The following example shows the alternative syntax for a result size limit with an offset. This time we try larger values for offset and size. In a large data set, limitEx3(5,20) might return 20 vertices, but since the original data has fewer than 25 vertices, the output contains fewer than 20 vertices.

LIMIT with OFFSET
CREATE QUERY limitEx3(INT j, INT k) FOR GRAPH friendNet {
  start = {person.*};

  result3 = SELECT v
            FROM start:v
            ORDER BY v.id
            LIMIT k OFFSET j;

  PRINT result3.id;
}


Results
GSQL > RUN QUERY limitEx3(5,20)
{
  "error": false,
  "message": "",
  "results": [
    { "v_set": "result3", "v_id": "person3", "v": {"result3.id": "person3"}, "v_type": "person" },
    { "v_set": "result3", "v_id": "person4", "v": {"result3.id": "person4"}, "v_type": "person" },
    { "v_set": "result3", "v_id": "person5", "v": {"result3.id": "person5"}, "v_type": "person" },
    { "v_set": "result3", "v_id": "person6", "v": {"result3.id": "person6"}, "v_type": "person" },
    { "v_set": "result3", "v_id": "person7", "v": {"result3.id": "person7"}, "v_type": "person" },
    { "v_set": "result3", "v_id": "person8", "v": {"result3.id": "person8"}, "v_type": "person" },
    { "v_set": "result3", "v_id": "person9", "v": {"result3.id": "person9"}, "v_type": "person" }
  ]
}



End of Select Statement Section



Control Flow Statements



The GSQL Query Language includes a comprehensive set of control flow statements to empower sophisticated graph traversal and data computation: IF/ELSE, CASE, WHILE, and FOREACH.

Note that any of these statements can be used as a query-body statement or as a DML-sub-level statement.

If the control flow statement is at the query-body level, then its block(s) of statements are query-body statements ( queryBodyStmts ). In a queryBodyStmts block, each individual statement ends with a semicolon, so there is always a semicolon at the end.

If the control flow statement is at the DML-sub level, then its block(s) of statements are DML-sub statements ( DMLSubStmtList ). In a DMLSubStmtList block, a comma separates statements, but there is no punctuation at the end.

The "Statement Types" subsection in the Chapter on "CREATE / INSTALL / RUN / SHOW / DROP QUERY" has a more detailed general example of the difference between queryBodyStmts and DMLSubStmts.

IF Statement

The IF statement provides conditional branching: execute a block of statements ( queryBodyStmts or DMLSubStmtList ) only if a given condition is true. The IF statement allows for zero or more ELSE-IF clauses, followed by an optional ELSE clause. The IF statement can be used either at the query-body level or at the DML-sub-statement level. (See the note about differences in block syntax.)

IF syntax
queryBodyIfStmt := IF condition THEN queryBodyStmts
                   [ELSE IF condition THEN queryBodyStmts ]*
                   [ELSE queryBodyStmts ] END
DMLSubIfStmt := IF condition THEN DMLSubStmtList
                [ELSE IF condition THEN DMLSubStmtList ]*
                [ELSE DMLSubStmtList ] END

If a particular IF condition is not true, then the flow proceeds to the next ELSE IF condition. When a true condition is encountered, its corresponding block of statements is executed, and then the IF statement terminates (skipping any remaining ELSE-IF or ELSE clauses). If an ELSE clause is present, its block of statements is executed if none of the preceding conditions is true. Overall, the functionality can be summarized as "execute the first block of statements whose conditional test is true."

IF semantics
# if then
IF x == 5 THEN y = 10; END;   # y is assigned to 10 only if x is 5.

# if then else
IF x == 5 THEN y = 10;        # y is 10 only if x is 5.
ELSE y = 20; END;             # y is 20 only if x is NOT 5.

#if with ELSE IF
IF x == 5 THEN y = 10;        # y is 10 only if x is 5.
ELSE IF x == 7 THEN y = 5;    # y is 5 only if x is 7.
ELSE y = 20; END;             # y is 20 only if x is NOT 5 and NOT 7.

Example 1. Simple IF-ELSE at query-body level
# count the number of friends a person has, and optionally include coworkers in that count
CREATE QUERY countFriendsOf(vertex<person> seed, BOOL includeCoworkers) FOR GRAPH friendNet {
  SumAccum<INT> @@numFriends = 0;
  start = {seed};

  IF includeCoworkers THEN
    friends = SELECT v
              FROM start -((friend | coworker):e)-> :v
              ACCUM @@numFriends +=1;
  ELSE
    friends = SELECT v
              FROM start -(friend:e)-> :v
              ACCUM @@numFriends +=1;
  END;

  PRINT @@numFriends, includeCoworkers;
}


Example 1 Results
GSQL > RUN QUERY countFriendsOf("person2", true)
{
  "debug": "",
  "error": false,
  "message": "",
  "results": [{
    "@@numFriends": 5,
    "includeCoworkers": true
  }]
}
GSQL > RUN QUERY countFriendsOf("person2", false)
{
  "debug": "",
  "error": false,
  "message": "",
  "results": [{
    "@@numFriends": 2,
    "includeCoworkers": false
  }]
}


Example 2. IF-ELSE IF-ELSE at query-body level
# determine if a user is active in terms of social networking (i.e., posts frequently)
CREATE QUERY calculateActivity(vertex<person> seed) FOR GRAPH socialNet {
  SumAccum<INT> @@numberPosts = 0;
  start = {seed};

  result = SELECT postVertex
           FROM start -(posted:e)-> :postVertex
           ACCUM @@numberPosts += 1;

  IF @@numberPosts < 2 THEN
    PRINT "Not very active";
  ELSE IF @@numberPosts < 3 THEN
    PRINT "Semi-active";
  ELSE
    PRINT "Very active";
  END;
}


Example 2 Results
GSQL > RUN QUERY calculateActivity("person1") { "debug": "", "error": false, "message": "", "results": [{"Not very active": "Not very active"}] } GSQL > RUN QUERY calculateActivity("person5") { "debug": "", "error": false, "message": "", "results": [{"Semi-active": "Semi-active"}] }


Example 3. Nested IF at query-body level
# use a more advanced activity calculation, taking into account number of posts
# and number of likes that a user made
CREATE QUERY calculateInDepthActivity(vertex<person> seed) FOR GRAPH socialNet {
  SumAccum<INT> @@numberPosts = 0;
  SumAccum<INT> @@numberLikes = 0;
  start = {seed};

  result = SELECT postVertex
           FROM start -(posted:e)-> :postVertex
           ACCUM @@numberPosts += 1;

  result = SELECT likedPost
           FROM start -(liked:e)-> :likedPost
           ACCUM @@numberLikes += 1;

  IF @@numberPosts < 2 THEN
    IF @@numberLikes < 1 THEN
      PRINT "Not very active";
    ELSE
      PRINT "Semi-active";
    END;
  ELSE IF @@numberPosts < 3 THEN
    IF @@numberLikes < 2 THEN
      PRINT "Semi-active";
    ELSE
      PRINT "Active";
    END;
  ELSE
    PRINT "Very active";
  END;
}


Example 3 Results
GSQL > RUN QUERY calculateInDepthActivity("person1") { "debug": "", "error": false, "message": "", "results": [{"Semi-active": "Semi-active"}] }
Example 4. Nested IF at DML-sub level
# give each user post an accumulated rating based on the subject and how many likes it has
# This query is equivalent to the query ratePosts shown in Example 4 of the CASE Statement section
CREATE QUERY ratePosts2() FOR GRAPH socialNet {
  SumAccum<INT> @rating = 0;
  allPeople = {person.*};

  results = SELECT v
            FROM allPeople -(:e)-> post:v
            ACCUM IF e.type == "posted" THEN
                    IF v.subject == "cats" THEN
                      v.@rating += -1    # -1 if post is about cats
                    ELSE IF v.subject == "Graphs" THEN
                      v.@rating += 2     # +2 if post is about graphs
                    ELSE IF v.subject == "tigergraph" THEN
                      v.@rating += 10    # +10 if post is about tigergraph
                    END
                  ELSE IF e.type == "liked" THEN
                    v.@rating += 3       # +3 each time post was liked
                  END;

  PRINT results;
}

CASE Statement

The CASE statement provides conditional branching: execute a block of statements only if a given condition is true. CASE statements can be used as query-body statements or DML-sub-statements. (See the note about differences in block syntax.)

CASE syntax
queryBodyCaseStmt := CASE (WHEN condition THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
                   | CASE expr (WHEN constant THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
DMLSubCaseStmt := CASE (WHEN condition THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END
                | CASE expr (WHEN constant THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END

A CASE statement contains one or more WHEN-THEN clauses, each WHEN presenting one expression. The CASE statement may also have one ELSE clause whose statements are executed if none of the preceding conditions is true.

There are two syntaxes of the CASE statement: one is equivalent to an if-else statement, and the other is structured like a switch statement. The if-else version evaluates the boolean condition within each WHEN-clause and executes the first block of statements whose condition is true. The optional concluding ELSE-clause is executed only if all WHEN-clause conditions are false.

The switch version evaluates the expression following the keyword WHEN and compares its value to the expression immediately following the keyword CASE. These expressions do not need to be boolean; the CASE statement compares pairs of expressions to see if their values are equal. The first WHEN-THEN clause to have an expression value equal to the CASE expression value is executed; the remaining clauses are skipped. The optional ELSE-clause is executed only if no WHEN-clause expression has a value matching the CASE value.
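The "first matching clause wins" behavior of both CASE forms can be sketched in Python. This is a model of the semantics described above, not GSQL: the helper names case_when and case_switch are hypothetical, and each clause returns a value instead of executing a block of statements.

```python
def case_when(clauses, otherwise=None):
    """if-else version: clauses is a list of (predicate, result) pairs."""
    for predicate, result in clauses:
        if predicate():          # evaluate each boolean condition in order
            return result        # first true condition wins; the rest are skipped
    return otherwise             # optional ELSE


def case_switch(value, clauses, otherwise=None):
    """switch version: clauses is a list of (constant, result) pairs."""
    for constant, result in clauses:
        if value == constant:    # compare the CASE value with each WHEN value
            return result
    return otherwise


def calories_if_else(drink):
    return case_when(
        [(lambda: drink == "Juice", 50), (lambda: drink == "Soda", 120)],
        otherwise=0,
    )


def calories_switch(drink):
    return case_switch(drink, [("Juice", 50), ("Soda", 120)], otherwise=0)


print(calories_if_else("Juice"), calories_switch("Juice"))
print(calories_if_else("Water"), calories_switch("Water"))
```

Both helpers give the same answer for the same input here, which mirrors the point that the two syntaxes differ in form rather than in what they can select.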

CASE Semantics
STRING drink = "Juice";

# CASE statement: if-else version
CASE
  WHEN drink == "Juice" THEN @@calories += 50
  WHEN drink == "Soda" THEN @@calories += 120
  ...
  ELSE @@calories = 0    # Optional else-clause
END
# Since drink = "Juice", 50 will be added to calories

# CASE statement: switch version
CASE drink
  WHEN "Juice" THEN @@calories += 50
  WHEN "Soda" THEN @@calories += 120
  ...
  ELSE @@calories = 0    # Optional else-clause
END
# Since drink = "Juice", 50 will be added to calories


Example 1. CASE as IF-ELSE
# Display the total number of times connected users posted about a certain subject
CREATE QUERY userNetworkPosts (vertex<person> seedUser, STRING subjectName) FOR GRAPH socialNet {
  SumAccum<INT> @@topicSum = 0;
  OrAccum @visited;
  reachableVertices = {};              # empty vertex set
  visitedVertices (ANY) = {seedUser};  # set that can contain ANY type of vertex

  WHILE visitedVertices.size() !=0 DO  # loop terminates when all neighbors are visited
    visitedVertices = SELECT s         # s is all neighbors of visitedVertices which have not been visited
                      FROM visitedVertices-(:e)->:s
                      WHERE s.@visited == false
                      ACCUM s.@visited = true,
                            CASE
                              WHEN s.type == "post" and s.subject == subjectName THEN @@topicSum += 1
                            END;
  END;

  PRINT @@topicSum;
}


Example 1 Results
GSQL > RUN QUERY userNetworkPosts("person1", "Graphs") { "debug": "", "error": false, "message": "", "results": [{"@@topicSum": 3}] }

Example 2. CASE as switch
# tally male and female friends of the starting vertex
CREATE QUERY countGenderOfFriends(vertex<person> seed) FOR GRAPH socialNet {
  SumAccum<INT> @@males = 0;
  SumAccum<INT> @@females = 0;
  SumAccum<INT> @@unknown = 0;
  startingVertex = {seed};

  people = SELECT v
           FROM startingVertex -(friend:e)->:v
           ACCUM CASE v.gender
                   WHEN "Male" THEN @@males += 1
                   WHEN "Female" THEN @@females +=1
                   ELSE @@unknown += 1
                 END;

  PRINT @@males, @@females, @@unknown;
}


Example 2 Results
GSQL > RUN QUERY countGenderOfFriends("person4") { "debug": "", "error": false, "message": "", "results": [{ "@@males": 2, "@@unknown": 0, "@@females": 1 }] }

Example 3. Multiple CASE statements
# give each social network user a social impact score which accumulates
# based on how many friends and posts they have
CREATE QUERY scoreSocialImpact() FOR GRAPH socialNet {
  SumAccum<INT> @socialImpact = 0;
  allPeople = {person.*};

  people = SELECT v
           FROM allPeople:v
           ACCUM CASE WHEN v.outdegree("friend") > 1 THEN v.@socialImpact +=1 END,  # +1 point for having > 1 friend
                 CASE WHEN v.outdegree("friend") > 2 THEN v.@socialImpact +=1 END,  # +1 point for having > 2 friends
                 CASE WHEN v.outdegree("posted") > 1 THEN v.@socialImpact +=1 END,  # +1 point for having > 1 post
                 CASE WHEN v.outdegree("posted") > 3 THEN v.@socialImpact +=2 END;  # +2 points for having > 3 posts

  PRINT people;
}


Example 3 Results
GSQL > RUN QUERY scoreSocialImpact() { "debug": "", "error": false, "message": "", "results": [ { "v_set": "people", "v_id": "person1", "v": { "gender": "Male", "@socialImpact": 1, "id": "person1" }, "v_type": "person" }, { "v_set": "people", "v_id": "person2", "v": { "gender": "Female", "@socialImpact": 1, "id": "person2" }, "v_type": "person" }, { "v_set": "people", "v_id": "person3", "v": { "gender": "Male", "@socialImpact": 1, "id": "person3" }, "v_type": "person" }, { "v_set": "people", "v_id": "person4", "v": { "gender": "Female", "@socialImpact": 2, "id": "person4" }, "v_type": "person" }, { "v_set": "people", "v_id": "person5", "v": { "gender": "Female", "@socialImpact": 2, "id": "person5" }, "v_type": "person" }, { "v_set": "people", "v_id": "person6", "v": { "gender": "Male", "@socialImpact": 2, "id": "person6" }, "v_type": "person" }, { "v_set": "people", "v_id": "person7", "v": { "gender": "Male", "@socialImpact": 2, "id": "person7" }, "v_type": "person" }, { "v_set": "people", "v_id": "person8", "v": { "gender": "Male", "@socialImpact": 3, "id": "person8" }, "v_type": "person" } ] }


Example 4. Nested CASE statements
# give each user post an accumulated rating based on the subject and how many likes it has
CREATE QUERY ratePosts() FOR GRAPH socialNet {
  SumAccum<INT> @rating = 0;
  allPeople = {person.*};

  results = SELECT v
            FROM allPeople -(:e)-> post:v
            ACCUM CASE e.type
                    WHEN "posted" THEN
                      CASE
                        WHEN v.subject == "cats" THEN v.@rating += -1        # -1 if post is about cats
                        WHEN v.subject == "Graphs" THEN v.@rating += 2       # +2 if post is about graphs
                        WHEN v.subject == "tigergraph" THEN v.@rating += 10  # +10 if post is about tigergraph
                      END
                    WHEN "liked" THEN v.@rating += 3                         # +3 each time post was liked
                  END;

  PRINT results;
}


Example 4 Results
GSQL > RUN QUERY ratePosts() { "debug": "", "error": false, "message": "", "results": [ { "v_set": "results", "v_id": "0", "v": { "postTime": "2010-01-12 11:22:05", "subject": "Graphs", "@rating": 11 }, "v_type": "post" }, { "v_set": "results", "v_id": "1", "v": { "postTime": "2011-03-03 23:02:00", "subject": "tigergraph", "@rating": 10 }, "v_type": "post" }, { "v_set": "results", "v_id": "2", "v": { "postTime": "2011-02-03 01:02:42", "subject": "query languages", "@rating": 0 }, "v_type": "post" }, { "v_set": "results", "v_id": "3", "v": { "postTime": "2011-02-05 01:02:44", "subject": "cats", "@rating": 2 }, "v_type": "post" }, { "v_set": "results", "v_id": "4", "v": { "postTime": "2011-02-07 05:02:51", "subject": "coffee", "@rating": 6 }, "v_type": "post" }, { "v_set": "results", "v_id": "5", "v": { "postTime": "2011-02-06 01:02:02", "subject": "tigergraph", "@rating": 10 }, "v_type": "post" }, { "v_set": "results", "v_id": "6", "v": { "postTime": "2011-02-05 02:02:05", "subject": "tigergraph", "@rating": 13 }, "v_type": "post" }, { "v_set": "results", "v_id": "7", "v": { "postTime": "2011-02-04 17:02:41", "subject": "Graphs", "@rating": 2 }, "v_type": "post" }, { "v_set": "results", "v_id": "8", "v": { "postTime": "2011-02-03 17:05:52", "subject": "cats", "@rating": 2 }, "v_type": "post" }, { "v_set": "results", "v_id": "9", "v": { "postTime": "2011-02-05 23:12:42", "subject": "cats", "@rating": -1 }, "v_type": "post" }, { "v_set": "results", "v_id": "10", "v": { "postTime": "2011-02-04 03:02:31", "subject": "cats", "@rating": 2 }, "v_type": "post" }, { "v_set": "results", "v_id": "11", "v": { "postTime": "2011-02-03 01:02:21", "subject": "cats", "@rating": -1 }, "v_type": "post" } ] }

WHILE Statement

The WHILE statement provides unbounded iteration over a block of statements. WHILE statements can be used as query-body statements or DML-sub-statements. (See the note about differences in block syntax.)

WHILE syntax
queryBodyWhileStmt := WHILE condition [LIMIT (name | integer)] DO queryBodyStmts END
DMLSubWhileStmt := WHILE condition [LIMIT (name | integer)] DO DMLSubStmtList END

The WHILE statement iterates over its body ( queryBodyStmts or DMLSubStmtList ) until the condition evaluates to false or until the iteration limit is met.  A condition is any expression that evaluates to a boolean.  The condition is evaluated before each iteration. CONTINUE statements can be used to change the control flow within the while block. BREAK statements can be used to exit the while loop.

A WHILE statement may have an optional LIMIT clause. The LIMIT clause takes a constant positive integer value or an integer variable that constrains the maximum number of loop iterations. The example below demonstrates how the LIMIT behaves.

If a limit value is not specified, it is possible for a WHILE loop to iterate infinitely. It is the responsibility of the query author to design the condition logic so that it is guaranteed to eventually become false (or to set a limit).


WHILE LIMIT semantics
# These three WHILE statements behave the same. Each terminates when
# (v.size() == 0) or after 5 iterations of the loop.
WHILE v.size() !=0 LIMIT 5 DO
  # Some statements
END;

INT iter = 0;
WHILE (v.size() !=0) AND (iter < 5) DO
  # Some statements
  iter = iter + 1;
END;

INT iter = 0;
WHILE v.size() !=0 DO
  IF iter == 5 THEN BREAK; END;
  # Some statements
  iter = iter + 1;
END;
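The equivalence of the three loop forms above can be checked with a small model. The following Python sketch is not GSQL: Python has no LIMIT clause, so the first form is approximated with a bounded for loop, and "Some statements" is replaced by a hypothetical body that shrinks a set size by one per iteration. Each function returns the number of times the body ran.

```python
def form_limit(size, limit=5):
    # Models: WHILE v.size() != 0 LIMIT 5 DO ... END
    done = 0
    for _ in range(limit):       # at most `limit` iterations
        if size == 0:            # condition is checked before each iteration
            break
        size -= 1                # hypothetical loop body
        done += 1
    return done


def form_condition(size, limit=5):
    # Models: WHILE (v.size() != 0) AND (iter < 5) DO ... END
    done = 0
    it = 0
    while size != 0 and it < limit:
        size -= 1
        done += 1
        it += 1
    return done


def form_break(size, limit=5):
    # Models: WHILE v.size() != 0 DO IF iter == 5 THEN BREAK; END; ... END
    done = 0
    it = 0
    while size != 0:
        if it == limit:
            break
        size -= 1
        done += 1
        it += 1
    return done


for start_size in (0, 3, 5, 8):
    print(start_size, form_limit(start_size), form_condition(start_size), form_break(start_size))
```

Under this model, each form runs the body min(size, 5) times: the loop stops either because the set is empty or because the iteration cap is reached, whichever comes first.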


Below are a number of examples that demonstrate the use of WHILE statements.

Example 1. Simple WHILE loop
# find all vertices which are reachable from a starting seed vertex (i.e., breadth-first search)
CREATE QUERY reachable(vertex<person> seed) FOR GRAPH workNet {
  OrAccum @visited;
  reachableVertices = {};          # empty vertex set
  visitedVertices (ANY) = {seed};  # set that can contain ANY type of vertex

  WHILE visitedVertices.size() !=0 DO  # loop terminates when all neighbors are visited
    visitedVertices = SELECT s         # s is all neighbors of visitedVertices which have not been visited
                      FROM visitedVertices-(:e)->:s
                      WHERE s.@visited == false
                      POST-ACCUM s.@visited = true;
    reachableVertices = reachableVertices UNION visitedVertices;
  END;

  PRINT reachableVertices;
}


Example 1. Results
GSQL > RUN QUERY reachable("person1") { "debug": "", "error": false, "message": "", "results": [ { "v_set": "reachableVertices", "v_id": "person1", "v": { "interestList": [ "management", "financial" ], "skillSet": [ 3, 2, 1, 0 ], "skillList": [ 0, 1, 2, 3 ], "locationId": "us", "interestSet": [ "financial", "management" ], "@visited": true, "id": "person1" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person2", "v": { "interestList": ["engineering"], "skillSet": [ 6, 3, 2, 5, 0 ], "skillList": [ 0, 2, 3, 5, 6 ], "locationId": "chn", "interestSet": ["engineering"], "@visited": true, "id": "person2" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person3", "v": { "interestList": ["teaching"], "skillSet": [ 6, 1, 4, 0 ], "skillList": [ 0, 4, 1, 6 ], "locationId": "jp", "interestSet": ["teaching"], "@visited": true, "id": "person3" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person4", "v": { "interestList": ["football"], "skillSet": [ 10, 1, 4, 0 ], "skillList": [ 0, 4, 1, 10 ], "locationId": "us", "interestSet": ["football"], "@visited": true, "id": "person4" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person5", "v": { "interestList": [ "sport", "financial", "engineering" ], "skillSet": [ 2, 8, 5, 0 ], "skillList": [ 0, 8, 2, 5 ], "locationId": "can", "interestSet": [ "engineering", "financial", "sport" ], "@visited": true, "id": "person5" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person6", "v": { "interestList": [ "music", "art" ], "skillSet": [ 10, 7, 0 ], "skillList": [ 0, 7, 10 ], "locationId": "jp", "interestSet": [ "art", "music" ], "@visited": true, "id": "person6" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person7", "v": { "interestList": [ "art", "sport" ], "skillSet": [ 6, 8, 0 ], "skillList": [ 0, 8, 6 ], "locationId": "us", "interestSet": [ "sport", "art" ], "@visited": true, "id": "person7" }, "v_type": "person" }, { "v_set": 
"reachableVertices", "v_id": "person8", "v": { "interestList": ["management"], "skillSet": [ 2, 1, 5, 0 ], "skillList": [ 0, 1, 5, 2 ], "locationId": "chn", "interestSet": ["management"], "@visited": true, "id": "person8" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person9", "v": { "interestList": [ "financial", "teaching" ], "skillSet": [ 2, 7, 4, 0 ], "skillList": [ 0, 4, 7, 2 ], "locationId": "us", "interestSet": [ "teaching", "financial" ], "@visited": true, "id": "person9" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person10", "v": { "interestList": [ "football", "sport" ], "skillSet": [ 3, 0 ], "skillList": [ 0, 3 ], "locationId": "us", "interestSet": [ "sport", "football" ], "@visited": true, "id": "person10" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "company1", "v": { "country": "us", "@visited": true, "id": "company1" }, "v_type": "company" }, { "v_set": "reachableVertices", "v_id": "company2", "v": { "country": "chn", "@visited": true, "id": "company2" }, "v_type": "company" }, { "v_set": "reachableVertices", "v_id": "company3", "v": { "country": "jp", "@visited": true, "id": "company3" }, "v_type": "company" } ] }


Example 2. WHILE loop using a LIMIT
# find all vertices which are reachable within two hops from a starting seed vertex (i.e., breadth-first search)
CREATE QUERY reachableWithinTwo(vertex<person> seed) FOR GRAPH workNet
{
  OrAccum @visited;
  reachableVertices = {};          # empty vertex set
  visitedVertices (ANY) = {seed};  # set that can contain ANY type of vertex

  WHILE visitedVertices.size() != 0 LIMIT 2 DO   # loop terminates when all neighbors within 2-hops of the seed vertex are visited
    visitedVertices = SELECT s                   # s is all neighbors of visitedVertices which have not been visited
        FROM visitedVertices-(:e)->:s
        WHERE s.@visited == false
        POST-ACCUM s.@visited = true;
    reachableVertices = reachableVertices UNION visitedVertices;
  END;
  PRINT reachableVertices;
}


Example 2. Results
GSQL > RUN QUERY reachableWithinTwo("person1") { "debug": "", "error": false, "message": "", "results": [ { "v_set": "reachableVertices", "v_id": "person1", "v": { "interestList": [ "management", "financial" ], "skillSet": [ 3, 2, 1, 0 ], "skillList": [ 0, 1, 2, 3 ], "locationId": "us", "interestSet": [ "financial", "management" ], "@visited": true, "id": "person1" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person2", "v": { "interestList": ["engineering"], "skillSet": [ 6, 3, 2, 5, 0 ], "skillList": [ 0, 2, 3, 5, 6 ], "locationId": "chn", "interestSet": ["engineering"], "@visited": true, "id": "person2" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person3", "v": { "interestList": ["teaching"], "skillSet": [ 6, 1, 4, 0 ], "skillList": [ 0, 4, 1, 6 ], "locationId": "jp", "interestSet": ["teaching"], "@visited": true, "id": "person3" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person4", "v": { "interestList": ["football"], "skillSet": [ 10, 1, 4, 0 ], "skillList": [ 0, 4, 1, 10 ], "locationId": "us", "interestSet": ["football"], "@visited": true, "id": "person4" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person5", "v": { "interestList": [ "sport", "financial", "engineering" ], "skillSet": [ 2, 8, 5, 0 ], "skillList": [ 0, 8, 2, 5 ], "locationId": "can", "interestSet": [ "engineering", "financial", "sport" ], "@visited": true, "id": "person5" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person6", "v": { "interestList": [ "music", "art" ], "skillSet": [ 10, 7, 0 ], "skillList": [ 0, 7, 10 ], "locationId": "jp", "interestSet": [ "art", "music" ], "@visited": true, "id": "person6" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person7", "v": { "interestList": [ "art", "sport" ], "skillSet": [ 6, 8, 0 ], "skillList": [ 0, 8, 6 ], "locationId": "us", "interestSet": [ "sport", "art" ], "@visited": true, "id": "person7" }, "v_type": "person" }, { 
"v_set": "reachableVertices", "v_id": "person8", "v": { "interestList": ["management"], "skillSet": [ 2, 1, 5, 0 ], "skillList": [ 0, 1, 5, 2 ], "locationId": "chn", "interestSet": ["management"], "@visited": true, "id": "person8" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person9", "v": { "interestList": [ "financial", "teaching" ], "skillSet": [ 2, 7, 4, 0 ], "skillList": [ 0, 4, 7, 2 ], "locationId": "us", "interestSet": [ "teaching", "financial" ], "@visited": true, "id": "person9" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "person10", "v": { "interestList": [ "football", "sport" ], "skillSet": [ 3, 0 ], "skillList": [ 0, 3 ], "locationId": "us", "interestSet": [ "sport", "football" ], "@visited": true, "id": "person10" }, "v_type": "person" }, { "v_set": "reachableVertices", "v_id": "company1", "v": { "country": "us", "@visited": true, "id": "company1" }, "v_type": "company" }, { "v_set": "reachableVertices", "v_id": "company2", "v": { "country": "chn", "@visited": true, "id": "company2" }, "v_type": "company" } ] }


FOREACH Statement

The FOREACH statement provides bounded iteration over a block of statements. FOREACH statements can be used as query-body statements or DML-sub-statements. (See the note about differences in block syntax.)

FOREACH syntax
queryBodyForEachStmt := FOREACH forEachControl DO queryBodyStmts END
DMLSubForEachStmt := FOREACH forEachControl DO DMLSubStmtList END
forEachControl := (name | "(" name [, name]+ ")") IN setBagExpr
                | name IN RANGE "[" expr , expr "]" [".STEP(" expr ")"]

The formal syntax for forEachControl appears complex.  It can be broken down into the following cases:

  • name IN setBagExpr
  • tuple IN setBagExpr
  • name IN RANGE [ expr, expr ]
  • name IN RANGE [ expr, expr ].STEP ( expr )

Note that setBagExpr includes container accumulators and explicit sets.

The FOREACH statement has the following restrictions:

  • In a DML-sub level FOREACH, it is never permissible to update the loop variable (the variable declared before IN, e.g., var in "FOREACH var IN setBagExpr").
  • In a query-body level FOREACH, in most cases it is not permissible to update the loop variable.  The following exceptions apply:
    • If the iteration is over a ListAccum, its values can be updated.
    • If the iteration is over a MapAccum, its values can be updated, but its keys cannot.
  • If the iteration is over a set of vertices, it is not permissible to access (read or write) their vertex-attached accumulators.

  • A query-body-level FOREACH cannot iterate over a set or bag of constants. For example, FOREACH i in (1,2,3) is not supported. However, DML-sub FOREACH does support this.


FOREACH ... IN RANGE

The FOREACH statement has an optional RANGE clause RANGE[expr, expr], which can be used to define the iteration collection. Optionally, the range may specify a step size:
RANGE[expr, expr].STEP(expr)

Each expr must evaluate to an integer. Any of the integers may be negative, but the step expr may not be 0.

The clause RANGE[a,b].STEP(c)  produces the sequence of integers from a to b, inclusive, with step size c.  That is,
(a, a+c, a+2*c, a+3*c, ... a+k*c), where k = the largest integer such that |k*c| ≤ |b-a|.

If the .STEP method is not given, then the step size c = 1.
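The sequence definition above can be checked with a small model. The following is a minimal Python sketch, not TigerGraph code; the function name gsql_range is hypothetical. It assumes the step points from a toward b. The text above does not say what happens otherwise, so the sketch rejects that case as a choice of its own.

```python
def gsql_range(a, b, c=1):
    """Model of RANGE[a, b].STEP(c) as defined in the text (illustrative only)."""
    if c == 0:
        raise ValueError("the step expr may not be 0")
    if (b - a) * c < 0:
        # Assumption of this sketch: steps pointing away from b are not modeled.
        raise ValueError("step points away from b; not covered by this sketch")
    k_max = abs(b - a) // abs(c)  # largest k such that |k*c| <= |b-a|
    return [a + k * c for k in range(k_max + 1)]


print(gsql_range(100, 0, -9))  # [100, 91, 82, 73, 64, 55, 46, 37, 28, 19, 10, 1]
print(gsql_range(0, 2))        # [0, 1, 2]
```

The endpoint b appears in the sequence only when |b-a| is a multiple of |c|; with RANGE[100, 0].STEP(-9) the last value is 1, not 0.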

Nested FOREACH IN RANGE with ListAccum
CREATE QUERY foreachRangeEx() FOR GRAPH socialNet {
  ListAccum<INT> @@t;
  Start = {person.*};
  FOREACH i IN RANGE[0, 2] DO
    @@t += i;
    L = SELECT Start
        FROM Start
        WHERE Start.id == "person1"
        ACCUM
          FOREACH j IN RANGE[0, i] DO
            @@t += j
          END
        ;
  END;
  PRINT @@t;
}
Results
GSQL > RUN QUERY foreachRangeEx() { "results": [ { "@@t": [ 0, 0, 1, 0, 1, 2, 0, 1, 2 ] } ], "error": false, "message": "", "debug": "" }
FOREACH IN RANGE with step
CREATE QUERY foreachRangeStep(INT a, INT b, INT c) FOR GRAPH minimalNet {
  ListAccum<INT> @@t;
  FOREACH i IN RANGE[a,b].STEP(c) DO
    @@t += i;
  END;
  PRINT @@t;
}
Results
GSQL > RUN QUERY foreachRangeStep(100,0,-9) { "results": [ { "@@t": [ 100, 91, 82, 73, 64, 55, 46, 37, 28, 19, 10, 1 ] } ], "error": false, "message": "", "debug": "" }


Query-body-level FOREACH Examples

Example 1 - FOREACH with ListAccum
# Count the number of companies whose country matches the provided string
CREATE QUERY companyCount(STRING countryName) FOR GRAPH workNet {
  ListAccum<STRING> @@companyList;
  INT countryCount;
  start = {ANY};                         # start will have a set of all vertex types

  s = SELECT v FROM start:v              # get all vertices
      WHERE v.type == "company"          # that have a type of "company"
      ACCUM @@companyList += v.country;  # append the country attribute from all company vertices to the ListAccum

  # Iterate the ListAccum and compare each element to the countryName parameter
  FOREACH item in @@companyList DO
    IF item == countryName THEN
      countryCount = countryCount + 1;
    END;
  END;
  PRINT countryCount;
}


Query Results
GSQL > RUN QUERY companyCount("us") { "debug": "", "error": false, "message": "", "results": [{"countryCount": 2}] }


Example 2 - FOREACH with a seed set
# For each company, find the employees who live in a given country
CREATE QUERY employeesByCompany(STRING country) FOR GRAPH workNet {
  ListAccum<VERTEX<company>> @@companyList;
  start = {ANY};

  # Build a list of all company vertices
  # (these are vertex IDs only)
  s = SELECT v FROM start:v
      WHERE v.type == "company"
      ACCUM @@companyList += v;

  # Use the vertex IDs as Seeds for vertex sets
  FOREACH item IN @@companyList DO
    companyItem = {item};
    employees = SELECT t FROM companyItem -(worksFor)-> :t
                WHERE (t.locationId == country);
    PRINT employees;
  END;
}


Result
GSQL > RUN QUERY employeesByCompany("us"){ "debug": "", "error": false, "message": "", "results": [ { "v_set": "employees", "v_id": "person1", "v": { "interestList": [ "management", "financial" ], "skillSet": [ 3, 2, 1, 0 ], "@companyList": [], "skillList": [ 0, 1, 2, 3 ], "locationId": "us", "interestSet": [ "financial", "management" ], "id": "person1" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person10", "v": { "interestList": [ "football", "sport" ], "skillSet": [ 3, 0 ], "@companyList": [], "skillList": [ 0, 3 ], "locationId": "us", "interestSet": [ "sport", "football" ], "id": "person10" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person1", "v": { "interestList": [ "management", "financial" ], "skillSet": [ 3, 2, 1, 0 ], "@companyList": [], "skillList": [ 0, 1, 2, 3 ], "locationId": "us", "interestSet": [ "financial", "management" ], "id": "person1" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person4", "v": { "interestList": ["football"], "skillSet": [ 10, 1, 4, 0 ], "@companyList": [], "skillList": [ 0, 4, 1, 10 ], "locationId": "us", "interestSet": ["football"], "id": "person4" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person7", "v": { "interestList": [ "art", "sport" ], "skillSet": [ 6, 8, 0 ], "@companyList": [], "skillList": [ 0, 8, 6 ], "locationId": "us", "interestSet": [ "sport", "art" ], "id": "person7" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person9", "v": { "interestList": [ "financial", "teaching" ], "skillSet": [ 2, 7, 4, 0 ], "@companyList": [], "skillList": [ 0, 4, 7, 2 ], "locationId": "us", "interestSet": [ "teaching", "financial" ], "id": "person9" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person7", "v": { "interestList": [ "art", "sport" ], "skillSet": [ 6, 8, 0 ], "@companyList": [], "skillList": [ 0, 8, 6 ], "locationId": "us", "interestSet": [ "sport", "art" ], "id": "person7" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person9", 
"v": { "interestList": [ "financial", "teaching" ], "skillSet": [ 2, 7, 4, 0 ], "@companyList": [], "skillList": [ 0, 4, 7, 2 ], "locationId": "us", "interestSet": [ "teaching", "financial" ], "id": "person9" }, "v_type": "person" }, { "v_set": "employees", "v_id": "person10", "v": { "interestList": [ "football", "sport" ], "skillSet": [ 3, 0 ], "@companyList": [], "skillList": [ 0, 3 ], "locationId": "us", "interestSet": [ "sport", "football" ], "id": "person10" }, "v_type": "person" }, { "v_set": "s", "v_id": "company1", "v": { "s.@companyList": [], "@@intBag": [ 1, 2, 3 ] }, "v_type": "company" }, { "v_set": "s", "v_id": "company2", "v": { "s.@companyList": [], "@@intBag": [ 1, 2, 3 ] }, "v_type": "company" }, { "v_set": "s", "v_id": "company3", "v": { "s.@companyList": [], "@@intBag": [ 1, 2, 3 ] }, "v_type": "company" }, { "v_set": "s", "v_id": "company4", "v": { "s.@companyList": [], "@@intBag": [ 1, 2, 3 ] }, "v_type": "company" }, { "v_set": "s", "v_id": "company5", "v": { "s.@companyList": [], "@@intBag": [ 1, 2, 3 ] }, "v_type": "company" } ] }


Example 3 - Nested FOREACH with MapAccum
# Count the number of employees from a given country and list their ids
CREATE QUERY employeeByCountry(STRING countryName) FOR GRAPH workNet {
  MapAccum <STRING, ListAccum<STRING>> @@employees;

  # start will have a set of all person type vertices
  start = {person.*};

  # Build a map using person locationId as a key and a list of strings to hold multiple person ids
  s = SELECT v FROM start:v
      ACCUM @@employees += (v.locationId -> v.id);

  # Iterate the map using (key,value) pairs
  FOREACH (key,val) in @@employees DO
    IF key == countryName THEN
      PRINT val.size();

      # Nested foreach to iterate over the list of person ids
      FOREACH employee in val DO
        PRINT employee;
      END;

      # MapAccum keys are unique so we can BREAK out of the loop
      BREAK;
    END;
  END;
}


Results
GSQL > RUN QUERY employeeByCountry("us") { "debug": "", "error": false, "message": "", "results": [ {"val.size()": 5}, {"employee": "person1"}, {"employee": "person4"}, {"employee": "person7"}, {"employee": "person9"}, {"employee": "person10"} ] }

DML-sub FOREACH Examples

ACCUM FOREACH
# Show post topics liked by users and show total likes per topic
CREATE QUERY topicLikes() FOR GRAPH socialNet {
  SetAccum<STRING> @@personPosts;
  SumAccum<INT> @postLikes;
  MapAccum<STRING,INT> @@likesByTopic;

  start = {person.*};

  # Find all user posts and generate a set of post topics
  # (set has no duplicates)
  posts = SELECT g FROM start - (posted) -> :g
          ACCUM @@personPosts += g.subject;

  # Use set of topics to increment how many times a specific
  # post is liked by other users
  likedPosts = SELECT f FROM start - (liked) -> :f
               ACCUM FOREACH x in @@personPosts DO
                       CASE WHEN (f.subject == x) THEN
                         f.@postLikes += 1
                       END
                     END
               # Aggregate all liked totals by topic
               POST-ACCUM @@likesByTopic += (f.subject -> f.@postLikes);

  # Display the number of likes per topic
  PRINT @@likesByTopic;
}


Result
GSQL > RUN QUERY topicLikes() { "debug": "", "error": false, "message": "", "results": [{"@@likesByTopic": { "tigergraph": 1, "cats": 3, "coffee": 2, "Graphs": 3 }}] }


Example 1 - POST-ACCUM FOREACH
# Show a summary of the number of friends all persons have by gender
CREATE QUERY friendGender() FOR GRAPH socialNet {
  ListAccum<STRING> @friendGender;
  SumAccum<INT> @@maleGenderCount;
  SumAccum<INT> @@femaleGenderCount;

  start = {person.*};

  # Record a list showing each friend's gender
  socialMembers = SELECT s from start:s -(friend)-> :g
                  ACCUM s.@friendGender += (g.gender)

                  # Loop over each list of genders and total them
                  POST-ACCUM FOREACH x in s.@friendGender DO
                               CASE WHEN (x == "Male") THEN
                                 @@maleGenderCount += 1
                               ELSE
                                 @@femaleGenderCount += 1
                               END
                             END;

  PRINT @@maleGenderCount;
  PRINT @@femaleGenderCount;
}


Result
GSQL > RUN QUERY friendGender() { "debug": "", "error": false, "message": "", "results": [ {"@@maleGenderCount": 11}, {"@@femaleGenderCount": 7} ] }


CONTINUE and BREAK Statements

The CONTINUE and BREAK statements can only be used within a block of a WHILE or FOREACH statement.  The CONTINUE statement branches control flow to the end of the loop, skipping any remaining statements in the current iteration, and proceeding to the next iteration. That is, everything in the loop block after the CONTINUE statement will be skipped, and then the loop will continue as normal.

The BREAK statement branches control flow out of the loop, i.e., it will exit the loop and stop iteration.

Below are a number of examples that demonstrate the use of BREAK and CONTINUE.

Continue and Break Semantics
# While with a continue
INT i = 0;
INT nCount = 0;
WHILE i < 10 DO
  i = i + 1;
  IF (i % 2 == 0) THEN CONTINUE; END;
  nCount = nCount + 1;
END;
# i is 10, nCount is 5 (skips the increment for every even i).

# While with a break
i = 0;
WHILE i < 10 DO
  IF (i == 5) THEN BREAK; END;  # When i is 5 the loop is exited
  i = i + 1;
END;
# i is now 5
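The final values stated in the comments above can be confirmed by transliterating the two loops. This is a minimal Python sketch and not GSQL; the function names are hypothetical. Python's continue and break behave the same way as the CONTINUE and BREAK semantics described in this section.

```python
def while_with_continue():
    """Mirror of the first loop: skip the counter increment for every even i."""
    i = 0
    n_count = 0
    while i < 10:
        i = i + 1
        if i % 2 == 0:
            continue  # jump to the next iteration; n_count is not incremented
        n_count = n_count + 1
    return i, n_count


def while_with_break():
    """Mirror of the second loop: leave the loop as soon as i reaches 5."""
    i = 0
    while i < 10:
        if i == 5:
            break  # exit the loop entirely
        i = i + 1
    return i


print(while_with_continue())  # (10, 5)
print(while_with_break())     # 5
```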


Example 1. Break
# find posts of a given person, and posts of friends of that person, friends of friends, etc.
# until a post about cats is found. The number of friend-hops to reach it is the 'degree' of cats
CREATE QUERY findDegreeOfCats(vertex<person> seed) FOR GRAPH socialNet
{
  SumAccum<INT> @@degree = 0;
  OrAccum @@foundCatPost = false;
  OrAccum @visited = false;

  friends (ANY) = {seed};
  WHILE @@foundCatPost != true AND friends.size() > 0 DO
    posts = SELECT v
        FROM friends-(posted:e)->:v
        ACCUM CASE WHEN v.subject == "cats" THEN @@foundCatPost += true END;

    IF @@foundCatPost THEN
      BREAK;
    END;

    friends = SELECT v
        FROM friends-(friend:e)->:v
        WHERE v.@visited == false
        ACCUM v.@visited = true;
    @@degree += 1;
  END;
  PRINT @@degree;
}


Results
GSQL > RUN QUERY findDegreeOfCats("person2")
{ "debug": "", "error": false, "message": "", "results": [{"@@degree": 2}] }
GSQL > RUN QUERY findDegreeOfCats("person4")
{ "debug": "", "error": false, "message": "", "results": [{"@@degree": 0}] }


Example 2. While loop using continue statement
# find all 3-hop friends of a starting vertex. count coworkers as friends
# if there are not enough friends
CREATE QUERY findEnoughFriends(vertex<person> seed) FOR GRAPH friendNet
{
  SumAccum<INT> @@distance = 0;   # keep track of the distance from the seed
  OrAccum @visited = false;
  visitedVertices = {seed};
  WHILE true LIMIT 3 DO
    @@distance += 1;
    # traverse from visitedVertices to its friends
    friends = SELECT v
        FROM visitedVertices -(friend:e)-> :v
        WHERE v.@visited == false
        POST-ACCUM v.@visited = true;
    PRINT friends;

    # if visitedVertices already holds at least 2 vertices, move on to the friends and finish this iteration
    IF visitedVertices.size() >= 2 THEN
      visitedVertices = friends;
      CONTINUE;
    END;
    # otherwise (fewer than 2 vertices in visitedVertices), add in coworkers
    coworkers = SELECT v
        FROM visitedVertices -(coworker:e)-> :v
        WHERE v.@visited == false
        POST-ACCUM v.@visited = true;
    visitedVertices = friends UNION coworkers;
    PRINT coworkers;
  END;
}


Example 2 Results
GSQL > RUN QUERY findEnoughFriends("person1") { "debug": "", "error": false, "message": "", "results": [ { "v_set": "friends", "v_id": "person2", "v": { "@visited": true, "id": "person2" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person3", "v": { "@visited": true, "id": "person3" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person4", "v": { "@visited": true, "id": "person4" }, "v_type": "person" }, { "v_set": "coworkers", "v_id": "person5", "v": { "@visited": true, "id": "person5" }, "v_type": "person" }, { "v_set": "coworkers", "v_id": "person6", "v": { "@visited": true, "id": "person6" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person1", "v": { "@visited": true, "id": "person1" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person8", "v": { "@visited": true, "id": "person8" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person9", "v": { "@visited": true, "id": "person9" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person7", "v": { "@visited": true, "id": "person7" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person10", "v": { "@visited": true, "id": "person10" }, "v_type": "person" }, { "v_set": "friends", "v_id": "person12", "v": { "@visited": true, "id": "person12" }, "v_type": "person" } ] }


Example 3. While loop using break statement
# find at least the top-k companies closest to a given seed vertex, if they exist
CREATE QUERY topkCompanies(vertex<person> seed, INT k) FOR GRAPH workNet
{
  ListAccum<vertex<company>> @@companyList;
  OrAccum @visited = false;
  visitedVertices (ANY) = {seed};
  WHILE true DO
    visitedVertices = SELECT v                   # traverse from visitedVertices to its unvisited neighbors
        FROM visitedVertices -(:e)-> :v
        WHERE v.@visited == false
        ACCUM CASE
                WHEN (v.type == "company") THEN  # count the number of company vertices encountered
                  @@companyList += v
              END
        POST-ACCUM v.@visited += true;           # mark vertices as visited

    # exit loop when at least k companies have been counted
    IF @@companyList.size() >= k OR visitedVertices.size() == 0 THEN
      BREAK;
    END;
  END;
  PRINT @@companyList;
}


Example 3. Results
GSQL > RUN QUERY topkCompanies("person1", 2) { "debug": "", "error": false, "message": "", "results": [{"@@companyList": [ "company1", "company2" ]}] }


Example 4 - Usage of CONTINUE in FOREACH
# List out all companies from a given country
CREATE QUERY companyByCountry(STRING countryName) FOR GRAPH workNet {
  MapAccum <STRING, ListAccum<STRING>> @@companies;
  start = {company.*};   # start will have a set of all company type vertices

  # Build a map using company country as a key and a list of strings to hold multiple company ids
  s = SELECT v FROM start:v
      ACCUM @@companies += (v.country -> v.id);

  # Iterate the map using (key,value) pairs
  FOREACH (key,val) IN @@companies DO
    IF key != countryName THEN
      CONTINUE;
    END;

    PRINT val.size();

    # Nested foreach to iterate over the list of company ids
    FOREACH comp IN val DO
      PRINT comp;
    END;
  END;
}


Result
GSQL > RUN QUERY companyByCountry("us") { "debug": "", "error": false, "message": "", "results": [ {"val.size()": 2}, {"comp": "company1"}, {"comp": "company4"} ] }


Example 5 - Usage of BREAK in FOREACH
# List the persons located in a given country
CREATE QUERY employmentByCountry(STRING countryName) FOR GRAPH workNet {
  MapAccum < STRING, ListAccum<STRING> > @@employees;
  start = {person.*};   # start will have a set of all person type vertices

  # Build a map using person locationId as a key and a list of strings to hold multiple person ids
  s = SELECT v FROM start:v
      ACCUM @@employees += (v.locationId -> v.id);

  # Iterate the map using (key,value) pairs
  FOREACH (key,val) IN @@employees DO
    IF key == countryName THEN
      PRINT val.size();

      # Nested foreach to iterate over the list of person ids
      FOREACH employee IN val DO
        PRINT employee;
      END;

      BREAK;
    END;
  END;
}


Result
GSQL > RUN QUERY employmentByCountry("us") { "debug": "", "error": false, "message": "", "results": [ {"val.size()": 5}, {"employee": "person1"}, {"employee": "person4"}, {"employee": "person7"}, {"employee": "person9"}, {"employee": "person10"} ] }

End of Control Flow Statements Section



Data Modification Statements



The GSQL language provides full support for vertex and edge insertion, deletion, and attribute update. Therefore, the language is more than just a "query" language.

All modifications to the graph data become effective after the query is executed. Accordingly, any modification statement does not affect any other statements inside the same query. The actual time point the modification becomes effective depends on the type of data modification and the queue of modifications to be fulfilled.
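The deferred-effect rule can be pictured with a toy model. The following is a minimal Python sketch, not TigerGraph internals: the class DeferredGraph and its methods are hypothetical, and it simplifies by applying every staged change the moment the query ends, whereas the text notes that the actual time point depends on the modification type and the pending queue.

```python
class DeferredGraph:
    """Toy model: changes made by a query are staged, not applied immediately."""

    def __init__(self, vertices):
        # vertex id -> attribute dict (copied so the caller's data is untouched)
        self.vertices = {vid: dict(attrs) for vid, attrs in vertices.items()}
        self._pending = []

    def update(self, vid, attr, value):
        # Stage the change; statements in the same query will not see it.
        self._pending.append((vid, attr, value))

    def read(self, vid, attr):
        return self.vertices[vid][attr]

    def finish_query(self):
        # Simplifying assumption: all staged changes take effect right here.
        for vid, attr, value in self._pending:
            self.vertices[vid][attr] = value
        self._pending.clear()


g = DeferredGraph({"person1": {"locationId": "us"}})
g.update("person1", "locationId", "USA")
seen_inside_query = g.read("person1", "locationId")   # still "us"
g.finish_query()
seen_after_query = g.read("person1", "locationId")    # now "USA"
print(seen_inside_query, seen_after_query)
```

The sketch mirrors what the UPDATE example later in this section observes: a PRINT inside the modifying query still shows the old attribute value.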

Query-body DELETE Statement

The query-body DELETE statement deletes a given set of edges or vertices. This statement can only be used as a query-body statement. (Deletion at the DML-sub level is served by the DML-sub DELETE statement, described next.)


EBNF
QueryBodyDeleteStmt := DELETE name FROM ( edgeSet | vertexSet ) [whereClause]

The vertexSet and edgeSet terms in the FROM clause follow the same rules as those in the FROM clause in a SELECT statement. The WHERE clause can filter the items in the vertexSet or edgeSet. Below is an example of removing vertices and edges.

DELETE statement example
# Delete all "person" vertices with location equal to "us"
CREATE QUERY deleteEx() FOR GRAPH workNet {
  S = {person.*};
  DELETE s FROM S:s
    WHERE s.locationId == "us";
}
DELETE statement example 2
# Delete all "worksFor" edges where the person's location is "us"
CREATE QUERY deleteEx2() FOR GRAPH workNet {
  S = {person.*};
  DELETE e FROM S:s - (worksFor:e) -> company:t
    WHERE s.locationId == "us";
}

The following query prints the person vertices located in the US ("us") and the worksFor edges of those persons. When the initial workNet test data is loaded, there are 5 persons and 9 worksFor edges for locationId = "us". If query deleteEx2 is run, the worksAtUS query will then find the 5 persons but 0 worksFor edges. If the deleteEx query is then run, the worksAtUS query will find 0 persons and 0 worksFor edges.

Query to check the results of deleteEx and deleteEx2
CREATE QUERY worksAtUS() FOR GRAPH workNet {
  SetAccum<EDGE> @@selEdge;
  Start = {person.*};
  SV = SELECT s FROM Start:s
       WHERE s.locationId == "us";
  PRINT SV.id;

  SE = SELECT s FROM Start:s-(worksFor:e)->company:t
       WHERE s.locationId == "us"
       ACCUM @@selEdge += e;
  PRINT @@selEdge;
}


DML-sub DELETE Statement

DML-sub DELETE is a DML-substatement which deletes one vertex or edge each time it is called.  (Deletion at the query-body level is served by the Query-body DELETE statement described above.) In practice, this statement resides within the body of a SELECT...ACCUM/POST-ACCUM clause, so it is called once for each member of a selected vertex set or edge set.

The ACCUM clause iterates over an edge set, which can encounter the same vertex multiple times. If you wish to delete a vertex, it is best practice to place the DML-sub DELETE statement in the POST-ACCUM clause rather than in the ACCUM clause.
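The difference between the two clauses can be seen by counting how often a statement would run for each target vertex. The following is a minimal Python sketch with a hypothetical edge list (it is not data from the example graphs), and the function name is illustrative only: ACCUM is modeled as one call per edge, POST-ACCUM as one call per distinct vertex.

```python
from collections import Counter


def count_delete_calls(edges):
    """Return (calls per target in ACCUM, calls per target in POST-ACCUM)."""
    # ACCUM model: the statement runs once for every (source, target) edge.
    accum_calls = Counter(target for _source, target in edges)
    # POST-ACCUM model: the statement runs once for every distinct target vertex.
    post_accum_calls = Counter({target for _source, target in edges})
    return accum_calls, post_accum_calls


# Hypothetical edges: post1 is reached from two different persons.
edges = [("person1", "post1"), ("person2", "post1"), ("person2", "post2")]
accum_calls, post_accum_calls = count_delete_calls(edges)
print(accum_calls["post1"], post_accum_calls["post1"])  # 2 1
```

In this model a DELETE placed in ACCUM is requested twice for post1, while the same DELETE in POST-ACCUM is requested once.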


EBNF
DMLSubDeleteStmt := DELETE "(" name ")"

The following example uses and modifies the graph data for socialNet.

DELETE within ACCUM vs. POST-ACCUM
# Remove any post vertices posted by the given user
CREATE QUERY removePosts(vertex<person> seed) FOR GRAPH socialNet {
  start = {seed};

  # possible, but not recommended as the DML-sub DELETE statement occurs
  # once for each edge of the vertex v
  deletedPosts = SELECT v FROM start -(posted:e)-> post:v
                 ACCUM DELETE (v);

  # best practice is to delete a vertex in a POST-ACCUM, which only
  # occurs once for each vertex v, guaranteeing that a vertex is not
  # deleted more than once
  deletedPosts = SELECT v FROM start -(posted:e)-> post:v
                 POST-ACCUM DELETE (v);
  PRINT deletedPosts;
}


Results
GSQL > RUN QUERY removePosts("person3") { "debug": "", "error": false, "message": "", "results": [{ "v_set": "deletedPosts", "v_id": "2", "v": {"subject": "query languages"}, "v_type": "post" }] }

INSERT INTO Statement

The INSERT INTO statement adds edges or vertices to the graph. However, if the ID value(s) for the inserted vertex/edge match those of an existing vertex/edge, then the new values will overwrite the old values. The INSERT INTO statement can be used as a query-body-level statement or a DML-substatement.

EBNF
insertStmt := INSERT INTO name ["(" ( PRIMARY_ID | FROM "," TO ) ("," name)* ")"] VALUES "(" ( "_" | expr ) [name] ["," ( "_" | expr ) [name] ("," ("_" | expr))*] ")"


The formal syntax is complex because it encompasses several options, and even so, it requires additional explanation. The first name symbol is the vertex type or edge type. The user then has two options:

1) Provide a value for the ID(s) and then each attribute, in the canonical order for the vertex or edge type.  This format is similar to that of a LOAD statement.  In this case, it is not necessary to explicitly name the attributes, since it is assumed that every one is being referenced, in order.

INSERT with implicit attribute names
INSERT INTO name VALUES (full_list_of_parameter_values)

2) Name the specific attributes to be set, and then provide a corresponding list of values. The attributes can be in any order, with the exception that the IDs must come first.  That is, to insert a vertex, the first attribute name must be PRIMARY_ID.  To insert an edge, the first two attribute names must be FROM and TO.

INSERT with explicit attribute names
INSERT INTO name (IDs, specified_attributes) VALUES (values_for_specified_attributes)

For each attribute value, provide either an expression expr or "_", which means the default value for that attribute type.  The optional name which follows the first two (id) values is to specify the source vertex type and target vertex type, if the edge type had been defined with wildcard vertex types.
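The two INSERT forms and the overwrite rule can be modeled with a dictionary keyed by the ID values. The following is a minimal Python sketch, not GSQL. The attribute order and the defaults (startYear 0, startMonth 0, fullTime false) are taken from the comments in the insertEx example in this section; the helper name insert_edge and the DEFAULT marker (standing in for "_") are hypothetical. As a simplification, re-inserting an existing edge replaces its whole attribute record.

```python
WORKSFOR_ATTRS = ["startYear", "startMonth", "fullTime"]  # assumed canonical order
WORKSFOR_DEFAULTS = {"startYear": 0, "startMonth": 0, "fullTime": False}
DEFAULT = object()  # stands in for "_" in a VALUES list


def insert_edge(store, source, target, names=None, values=()):
    """Model of INSERT INTO worksFor, with implicit or explicit attribute names."""
    if names is None:
        names = WORKSFOR_ATTRS  # implicit form: every attribute, in order
    if len(names) != len(values):
        raise ValueError("one value is required per attribute name")
    attrs = dict(WORKSFOR_DEFAULTS)  # unnamed attributes keep their defaults
    for name, value in zip(names, values):
        if value is not DEFAULT:
            attrs[name] = value
    store[(source, target)] = attrs  # matching IDs overwrite the old record
    return attrs


store = {}
insert_edge(store, "tic", "gsql", values=(DEFAULT, DEFAULT, DEFAULT))
insert_edge(store, "tac", "gsql", ["startYear", "fullTime"], (2017, True))
insert_edge(store, "tic", "gsql", ["startYear"], (2000 + 17,))  # same IDs: overwrite
print(len(store), store[("tic", "gsql")])
```

After the three calls the store holds two edges, because the third call reuses the IDs of the first one.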


The query insertEx illustrates query-body level INSERT statements: it inserts new company vertices and worksFor edges into the workNet graph.

INSERT statement
CREATE QUERY insertEx(STRING name, STRING name2, STRING name3, STRING comp) FOR GRAPH workNet {
  # Vertex insertion
  # Adds two 'company' vertices: one located in the US, and a child company located in Japan.
  INSERT INTO company VALUES ( comp, comp, "us" );
  INSERT INTO company (PRIMARY_ID, country) VALUES ( comp + "_jp", "jp" );

  # Edge insertion
  # Adds a 'worksFor' edge from person given by 'name' to the company given by 'comp', filling in
  # default values for startYear (0), startMonth (0), and fullTime (false).
  # Requires that 'name' and 'comp' vertices already exist.
  INSERT INTO worksFor VALUES (name person, comp company, _, _, _);

  # Adds a 'worksFor' edge from person given by 'name2' to the company given by 'comp', filling in
  # the default value for startMonth (0), but specifically naming values for startYear and fullTime.
  # Requires that 'name2' and 'comp' vertices already exist.
  INSERT INTO worksFor (FROM, TO, startYear, fullTime) VALUES (name2 person, comp company, 2017, true);

  # Adds a 'worksFor' edge from person given by 'name3' to the company given by 'comp', filling in
  # default values for startMonth (0) and fullTime (false), but specifically naming a value for startYear (2017).
  # Requires that 'name3' and 'comp' vertices already exist.
  INSERT INTO worksFor (FROM, TO, startYear) VALUES (name3 person, comp company, 2000 + 17);
}

The query whoWorksForCompany can be used to check the effect of query insertEx. Prior to running insertEx, running whoWorksForCompany("gsql") will find 0 companies called "gsql" and 0 worksFor edges for company "gsql". If we then run the query insertEx("tic", "tac", "toe", "gsql"), then whoWorksForCompany("gsql") will find a company called "gsql" and another one called "gsql_jp". Moreover, it will find 3 worksFor edges, from tic, tac, and toe, with different values for the startMonth, startYear, and fullTime attributes.

Query to check the results of insertEx
CREATE QUERY whoWorksForCompany(STRING comp) FOR GRAPH workNet {
  SetAccum<EDGE> @@setEdge;
  Comps = {company.*};
  PRINT Comps.id;
  Pers = {person.*};
  S = SELECT s
      FROM Pers:s -(worksFor:e)-> :t
      WHERE t.id == comp
      ACCUM @@setEdge += e;
  PRINT @@setEdge;
}


The following example shows a DML-sub level INSERT. Because the INSERT statement sits in the ACCUM clause, it is executed once for each vertex in allCompanies that passes the WHERE filter.

DML-sub INSERT statement
# Add a child company of a given company name. The new child company is in Japan
CREATE QUERY addNewChildCompany(STRING name) FOR GRAPH workNet {
  allCompanies = {company.*};
  X = SELECT s
      FROM allCompanies:s
      WHERE s.id == name
      ACCUM INSERT INTO company VALUES ( name + "_jp", name + "_jp", "jp" );
}

UPDATE Statement

The UPDATE statement updates the attribute of each vertex or edge in a vertex set or edge set, respectively, with new attribute values.

EBNF
updateStmt := UPDATE name FROM ( edgeSet | vertexSet ) SET DMLSubStmtList [whereClause]

The set of vertices or edges to update is described in the FROM clause, following the same rules as the FROM clause in a SELECT block. In the SET clause, the DMLSubStmtList may contain assignment statements to update the attributes of a vertex or edge. These assignment statements use the vertex or edge aliases declared in the FROM clause. The optional WHERE clause supports boolean conditions to filter the items in the vertexSet or edgeSet.

UPDATE statement example
# Change all "person" vertices with location equal to "us" to "USA"
CREATE QUERY updateEx() FOR GRAPH workNet {
  S = {person.*};

  UPDATE s FROM S:s
    SET s.locationId = "USA"
    WHERE s.locationId == "us";

  # The update does not become effective within this query, so PRINT S still shows "us".
  PRINT S;
}


The UPDATE statement can only be used as a query-body-level statement. However, DML-sub level updates are still possible by using other statement types. A vertex attribute's value can be updated within the POST-ACCUM clause of a SELECT block by using the assignment operator (=). An edge attribute's value can be updated within the ACCUM clause of a SELECT block by using the assignment operator. In fact, the UPDATE statement is equivalent to a SELECT statement with ACCUM and/or POST-ACCUM to update the vertex or edge attribute values. Below is an example.

Updating a vertex's attribute value in an ACCUM clause is not allowed, because the update can occur multiple times in parallel and possibly result in a nondeterministic value. If the vertex attribute value update depends on an edge attribute value, use a vertex-attached accumulator to save the value, and then update the vertex attribute's value in the POST-ACCUM clause.

UPDATE statement example 2
# The second example is equivalent to the above updateEx
CREATE QUERY updateEx2() FOR GRAPH workNet {
  S = {person.*};

  # Use the vertex alias s declared in the FROM clause
  X = SELECT s
      FROM S:s
      WHERE s.locationId == "us"
      POST-ACCUM s.locationId = "USA";
  PRINT S;
}


Below is an example of an edge update with two attribute changes, including an incremental change (e.startYear = e.startYear + 1):

UPDATE statement example 3
CREATE QUERY updateEx3() FOR GRAPH workNet {
  S = {person.*};

  # Update two attributes of each selected worksFor edge
  UPDATE e FROM S:s - (worksFor:e) -> :t
    SET e.startYear = e.startYear + 1,
        e.fullTime = false
    WHERE s.locationId == "us";
  PRINT S;
}


Other Update Methods

In addition to the above UPDATE statement and SELECT statement, a simple assignment statement at the query-body level can be used to update the attribute value of a single vertex/edge, if the vertex/edge has been assigned to a variable or parameter.

update by assignment
# Change the given person's location to the new location
CREATE QUERY updateByAssignment(VERTEX<person> v, STRING newLocation) FOR GRAPH workNet {
  v.locationId = newLocation;
}

End of Data Modification Statements Section



Output Statements



PRINT Statement

The PRINT statement provides data output from inside a query. Each PRINT statement adds a JSON object to the "results" array in the standard output. A PRINT statement can appear anywhere as a query-body statement within the scope of variable(s) being referenced in the statement.

EBNF
printStmt := PRINT argList [WHERE condition] [">" filePath]


Each single PRINT statement outputs a set of key-value pairs enclosed in a JSON object. A PRINT statement takes a comma-separated list of expressions to be evaluated and printed. The optional WHERE clause enables the PRINT only if the condition is true.  Each expression can represent one of the following items.

  1. A literal value
  2. An expression using operators listed below:

    Numeric Arithmetic: + - * / . %
    Bit: << >> & |
    String concatenation: +
    Set: UNION INTERSECT MINUS

    Parentheses can be used for controlling order of precedence.

  3. A global or local variable (including VERTEX and EDGE variables)
  4. An attribute of a vertex variable
  5. A global accumulator
  6. A vertex set variable
  7. An attribute or a vertex-attached accumulator of a vertex set variable.

The JSON format for each of these types is explained below:

Print argument | Output format | Example
---------------|---------------|--------
literal | "literal": literal | PRINT 5.0      → "5.0": 5
local or global variable for a primitive type | "var": value | PRINT x        → "x": 3
local or global variable for a vertex | "var": vertex_id | PRINT v        → "v": "person1"
attribute of a vertex variable | "var.attr": value | PRINT v.gender → "v.gender": "Male"
global accumulator | single-valued (e.g., SumAccum): "accumName": value; container (e.g., ListAccum): "accumName": [val, val, ... val] | Depends on whether the accumulator is single-valued or a container (e.g., ListAccum) and whether the element type is primitive or a vertex/edge.
vertex set variable | vertex_object1, ..., vertex_objectN | See the detailed structure plus the query and results example below.
attribute or vertex-attached accumulator of a vertex set variable | Similar to the output for a vertex set variable, except that the "v" field is replaced by only the specified attribute/accumulator. | See query and results example below.


Vertex Set Variable (e.g., SELECT statement output)

When a vertex set variable is printed, it appears as a sequence of vertex objects, each vertex having the following form:

{
  "v_set": the vertex set variable name to which this vertex belongs,
  "v_id": the id of this vertex,
  "v_type": the type of this vertex,
  "v": set of key/value pairs for attributes and accumulators associated with this vertex
       {"attr1": value1, ..., "attrN": valueN}
}


Single Vertex or Vertex Collection other than Vertex Set Variable

When a vertex value (except a vertex set variable) is printed, only the vertex id is included in the JSON result.

Edge

When an edge value is printed, it appears as a JSON object with the following form:

{
  "from_type": the type of the source vertex,
  "to_type": the type of the target vertex,
  "directed": true or false,
  "from_id": the id of the source vertex,
  "to_id": the id of the target vertex,
  "attributes": set of key/value pairs for attributes and accumulators associated with this edge
                {"attr1": value1, ..., "attrN": valueN},
  "e_type": the type of the edge
}


Vertex set variables cannot be printed together with other types of expressions. To print a mix of expression types, use multiple PRINT statements to print them.

Each single PRINT statement outputs a set, so any duplicated components (i.e., vertices) are only printed once. For example,

PRINT aSet, bSet, cSet;

will output the union of the three sets. If duplication is desired, use multiple PRINT statements instead. For example,

PRINT aSet; PRINT bSet;

prints each set separately.


An example which shows all of the cases described above, in combination, is shown below.

Print Basic Example
CREATE QUERY printBasicExample(VERTEX<person> v) FOR GRAPH socialNet {
  SetAccum<VERTEX> @@testSet;
  SetAccum<EDGE> @likedSet;
  int x = 3;

  Seed = person.*;
  A = SELECT s
      FROM Seed:s
      WHERE s.gender == "Female"
      ACCUM @@testSet += s;
  B = SELECT t
      FROM Seed:s - (liked:e) -> post:t
      ACCUM s.@likedSet += e;

  Print @@testSet, 5.0, "test", x, v, v.gender;
  Print A;
  Print A.@likedSet;
}

Note how the results of the three PRINT statements are grouped in the JSON "results" field below:

  1. None of the six expressions in the first PRINT statement is a vertex set variable (SetAccum<VERTEX> @@testSet is not a vertex set variable, but A and B, the results of the SELECT statements, are.) Therefore, all six can be combined into one JSON object, i.e., {output1, output2, ..., output6}.
  2. The 2nd PRINT statement is for one vertex set variable. The set happens to contain three vertices, so the output is a sequence of three vertex objects: {vertex1}, {vertex2}, {vertex3}.
  3. The 3rd PRINT statement is for one vertex-attached accumulator of the same vertex set variable. The output format is nearly identical to that of the 2nd PRINT, except that less information about each vertex is given: {partial_vertex1}, {partial_vertex2}, {partial_vertex3}.

Overall, the query's three PRINT statements produce one JSON array with the following format:
"results": [ {output1, output2, ..., output6}, {vertex1}, {vertex2}, {vertex3}, {partial_vertex1}, {partial_vertex2}, {partial_vertex3} ]

Result (WITH COMMENTS ADDED)
GSQL > RUN QUERY printBasicExample("person1")
{
  "error": false,
  "message": "",
  "results": [                  # The combined output of the query
    {                           # 1st PRINT statement, with 6 expressions
      "test": "test",           # 3. literal string
      "v": "person1",           # 5. vertex variable (displays vertex id only)
      "v.gender": "Male",       # 6. attribute of vertex variable
      "5.0": 5,                 # 2. literal number
      "x": 3,                   # 4. primitive variable
      "@@testSet": [            # 1. global accumulator: SetAccum<VERTEX>
        "person4",              #    (displays vertex ids only)
        "person5",
        "person2"
      ]
    },
    {                           # 2nd PRINT statement: vertex set variable = list of JSON objects
      "v_set": "A",             # "v_set": name of the vertex set variable
      "v_id": "person4",        # "v_id": id of 1st vertex in the set
      "v": {                    # "v": id and all attributes of this vertex
        "gender": "Female",     # gender attribute
        "@likedSet": [{         # @likedSet vertex-attached accumulator SetAccum<EDGE>
          "from_type": "person",
          "to_type": "post",
          "directed": true,
          "from_id": "person4",
          "to_id": "4",
          "attributes": {"actionTime": "2010-01-13 03:16:05"},
          "e_type": "liked"
        }],
        "id": "person4"         # id (same as "v_id")
      },
      "v_type": "person"        # "v_type": type of 1st vertex
    },
    {                           # next vertex in the vertex set
      "v_set": "A",
      "v_id": "person5",
      "v": {
        "gender": "Female",
        "@likedSet": [{
          "from_type": "person",
          "to_type": "post",
          "directed": true,
          "from_id": "person5",
          "to_id": "6",
          "attributes": {"actionTime": "2010-01-12 21:12:05"},
          "e_type": "liked"
        }],
        "id": "person5"
      },
      "v_type": "person"
    },
    {                           # next vertex in the vertex set
      "v_set": "A",
      "v_id": "person2",
      "v": {
        "gender": "Female",
        "@likedSet": [
          {
            "from_type": "person",
            "to_type": "post",
            "directed": true,
            "from_id": "person2",
            "to_id": "0",
            "attributes": {"actionTime": "2010-01-12 10:52:15"},
            "e_type": "liked"
          },
          {
            "from_type": "person",
            "to_type": "post",
            "directed": true,
            "from_id": "person2",
            "to_id": "3",
            "attributes": {"actionTime": "2010-01-11 16:02:26"},
            "e_type": "liked"
          }
        ],
        "id": "person2"
      },
      "v_type": "person"
    },
    {                           # 3rd PRINT statement: attribute/accumulator of a vertex set variable
      "v_set": "A",             # Output is nearly identical to 2nd PRINT statement's
      "v_id": "person4",
      "v": {"A.@likedSet": [{   # Difference: "v"'s value is only the specified variable/accumulator,
        "from_type": "person",  # not all of the vertex's attributes
        "to_type": "post",
        "directed": true,
        "from_id": "person4",
        "to_id": "4",
        "attributes": {"actionTime": "2010-01-13 03:16:05"},
        "e_type": "liked"
      }]},
      "v_type": "person"
    },
    {
      "v_set": "A",
      "v_id": "person5",
      "v": {"A.@likedSet": [{
        "from_type": "person",
        "to_type": "post",
        "directed": true,
        "from_id": "person5",
        "to_id": "6",
        "attributes": {"actionTime": "2010-01-12 21:12:05"},
        "e_type": "liked"
      }]},
      "v_type": "person"
    },
    {
      "v_set": "A",
      "v_id": "person2",
      "v": {"A.@likedSet": [
        {
          "from_type": "person",
          "to_type": "post",
          "directed": true,
          "from_id": "person2",
          "to_id": "0",
          "attributes": {"actionTime": "2010-01-12 10:52:15"},
          "e_type": "liked"
        },
        {
          "from_type": "person",
          "to_type": "post",
          "directed": true,
          "from_id": "person2",
          "to_id": "3",
          "attributes": {"actionTime": "2010-01-11 16:02:26"},
          "e_type": "liked"
        }
      ]},
      "v_type": "person"
    }
  ]
}


To save the JSON output to a file, simply redirect the output of the entire query to a file, e.g.,

GSQL > RUN QUERY printBasicExample("person1") > printBasicExample.person1.json
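
Once the JSON output has been captured, a client program may want to separate the vertex objects from the plain key-value objects in the "results" array. The Python sketch below shows one possible way to do this. It assumes the key names described above ("v_set", "v_id", "v_type", "v"); the helper name split_results and the trimmed sample response are hypothetical and are not part of GSQL.

```python
import json

VERTEX_KEYS = {"v_set", "v_id", "v_type", "v"}

def split_results(response_text):
    """Split a query response into plain key-value objects and vertex objects.

    Assumption: vertex objects carry all of "v_set", "v_id", "v_type", and "v";
    every other object in "results" is treated as plain key-value output.
    """
    response = json.loads(response_text)
    if response.get("error"):
        raise RuntimeError(response.get("message", ""))
    plain, vertices = [], []
    for obj in response["results"]:
        if VERTEX_KEYS.issubset(obj.keys()):
            vertices.append(obj)
        else:
            plain.append(obj)
    return plain, vertices

# Hypothetical, trimmed-down response in the same shape as the result above.
sample = json.dumps({
    "error": False,
    "message": "",
    "results": [
        {"x": 3, "v": "person1"},
        {"v_set": "A", "v_id": "person4", "v_type": "person",
         "v": {"gender": "Female", "id": "person4"}},
    ],
})

plain, vertices = split_results(sample)
print(plain[0]["x"])                    # 3
print([v["v_id"] for v in vertices])    # ['person4']
```

Note that the first object also has a "v" key (the vertex variable v), which is why the sketch checks for all four keys rather than for "v" alone.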


Printing to a CSV File

The optional clause ">" filePath saves the output to a file in CSV (comma-separated value) format, instead of JSON format.  The file path can be either absolute or relative (relative to <TigerGraph_root_dir>/logs/). Each PRINT statement appends one line to the file. The printed expressions are separated by commas, and multiple values in a set or list are delimited by spaces. Due to the simpler format of CSV vs. JSON, only data with a simple one- or two-dimensional structure is supported for this feature.

Limitations of PRINT > File

  • Printing a full Vertex set variable is not supported.
  • Printing an accumulator with the element type as VERTEX or printing a SET<VERTEX> or LIST<VERTEX> parameter results in printing GSQL internal IDs, which may not be useful to most users.
  • If printing a vertex set's vertex-attached accumulator or a vertex set's attribute, the output is a folder containing multiple files. Each file name is the vertex's internal ID, and each file's content is the accumulator's or attribute's value.



PRINT WHERE and PRINT > File Examples
CREATE QUERY printExample() FOR GRAPH socialNet {
  SetAccum<VERTEX> @@testSet, @@testSet2;
  ListAccum<STRING> @@strList;
  int x = 3;

  Seed = person.*;
  A = SELECT s
      FROM Seed:s
      WHERE s.gender == "Female"
      ACCUM @@testSet += s, @@strList += s.gender;
  A = SELECT s
      FROM Seed:s
      WHERE s.gender == "Male"
      ACCUM @@testSet2 += s;

  PRINT @@testSet, @@testSet2 > "test.txt";  # test.txt is 1 3 4 ,0 5 2 6 7 (The order of internal IDs can change within each set.)
  PRINT x WHERE x < 0 > "test2.txt";          # test2.txt is empty
  PRINT x WHERE x > 0 > "test3.txt";          # test3.txt is 3
  PRINT @@strList > "test4.txt";              # test4.txt is Female Female Female
  PRINT A.gender > "test5";                   # test5 is a folder, containing n files representing n vertices. Each file shows the vertex's gender attribute value
}
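
To illustrate the file format described above, the following Python sketch splits one output line into its comma-separated expressions and each expression into its space-delimited values. The function name parse_print_line is hypothetical, and the sketch assumes that no printed value itself contains a comma or a space.

```python
def parse_print_line(line):
    """Split one PRINT > file line into expressions and their values.

    Commas separate the printed expressions; spaces separate the values
    inside a set or list. Values are returned as strings.
    """
    return [field.split() for field in line.split(",")]

# The test.txt content from the example above: two sets of internal IDs.
fields = parse_print_line("1 3 4 ,0 5 2 6 7")
print(fields)  # [['1', '3', '4'], ['0', '5', '2', '6', '7']]
```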

LOG Statement

The LOG statement is another means to output data.  It works as a function that outputs information to a log file.

BNF
logStmt := LOG "(" condition "," argList ")"

The first argument of the LOG statement is a boolean condition that enables or disables logging.  This allows logging to be easily turned on/off, for uses such as debugging.  After the condition, LOG takes one or more expressions (separated by commas).  These expressions are evaluated and output to the log file.

Unlike the PRINT statement, which can only be used as a query-body statement, the LOG statement can be used as both a query-body statement and a DML-sub-statement.

The values will be recorded in the GPE log. To find the log file after the query has completed, open a Linux shell and use the command  "gadmin log gpe".  It may show you more than one log file name; use the one ending in "INFO".  Search this file for "UDF_".

Examples
BOOLEAN debug = TRUE;
INT x = 10;

LOG(debug, 20);
LOG(debug, 10, x);

RETURN Statement

BNF
createQuery := CREATE QUERY name "(" [parameterList] ")" FOR GRAPH name
               [RETURNS "(" baseType | accumType ")"]
               "{" [typedefs] [declStmts] queryBodyStmts "}"

returnStmt := RETURN expr

The RETURN statement is used to create a sub-query which can be called by other queries (super-queries). If a CREATE QUERY statement includes the optional RETURNS clause, the body must end with a RETURN statement. Exactly one type is allowed in the RETURNS clause, and thus the RETURN statement can only return one expression. The returned expression must have the same type as the RETURNS clause indicates. A sub-query must be created before its corresponding super-query.  A sub-query must be installed either before or in the same INSTALL QUERY command as its super-query.

The return type can be any base type or any accumulator type, except GroupByAccum and any accumulator containing any tuple type. For the purposes of return type, SetAccum is equivalent to SET, and BagAccum is equivalent to BAG.  A vertex set variable can also be returned if SET<VERTEX<type>> or SetAccum<VERTEX<type>> (<type> is optional) is used in the RETURNS clause.

See also Section 5.11 - Queries as Functions.

Subquery Example 1
CREATE QUERY subquery1 (VERTEX<person> m1) FOR GRAPH socialNet RETURNS(BagAccum<VERTEX<post>>) {
  Start = {m1};
  L = SELECT t
      FROM Start:s - (liked:e) - post:t;
  RETURN L;
}

CREATE QUERY mainquery1 () FOR GRAPH socialNet {
  BagAccum<VERTEX<post>> @@testBag;
  Start = {person.*};
  Start = SELECT s
          FROM Start:s
          ACCUM @@testBag += subquery1(s);
  PRINT @@testBag;
}


Result
GSQL > RUN QUERY mainquery1()
{
  "results": [
    {
      "@@testBag": [ "0", "0", "0", "3", "4", "4", "6", "8", "10" ]
    }
  ],
  "error": false,
  "message": "",
  "debug": ""
}


Subquery Example 2
CREATE QUERY subquery2 (VERTEX<person> m1) FOR GRAPH socialNet RETURNS(INT) {
  int x;
  Start = {m1};
  Start = SELECT t
          FROM Start:t
          ACCUM CASE
                  WHEN t.gender == "Male" THEN x = 5
                  WHEN t.gender == "Female" THEN x = 10
                  ELSE x = -1
                END;
  RETURN x;
}

CREATE QUERY mainquery2 (SET<VERTEX<person>> m1) FOR GRAPH socialNet {
  SumAccum<INT> @@sum1;
  Start = {m1};
  Start = SELECT t
          FROM Start:t
          ACCUM @@sum1 += subquery2(t);
  PRINT @@sum1;
}


Result
GSQL > RUN QUERY mainquery2(["person1","person2"])
{
  "results": [
    {
      "@@sum1": 15
    }
  ],
  "error": false,
  "message": "",
  "debug": ""
}





End of Output Statement Section



Exception Statements



This section describes how the GSQL language responds to exceptions and supports user-defined exception handling. An exception is a run-time error. The GSQL language supports both built-in system exceptions and user-defined exceptions. Built-in exceptions include GSQL language exceptions (such as out-of-range value, wrong data type, and illegal operation), and errors arising in other TigerGraph components or from the operating system.

The GSQL query language also supports user-defined exception responses, also known as exception handling.  This section covers the following syntax for user-defined exception behavior:

#########################################################
## Exception Statements ##

declExceptStmt := EXCEPTION exceptVarName "(" errorCode ")"
exceptVarName := name
errorCode := integer

raiseStmt := RAISE exceptVarName [errorMsg]
errorMsg := "(" expr ")"

tryStmt := TRY queryBodyStmts EXCEPTION caseExceptBlock+ [elseExceptBlock] END ";"
caseExceptBlock := WHEN exceptVarName THEN queryBodyStmts
elseExceptBlock := ELSE queryBodyStmts

Default Exception Response

When an exception occurs during the execution of a query, the default response is the following:

  • The query will not execute any more statements; it will exit.
  • If the query was run using the RUN QUERY command, then an error message will be displayed.
  • If the query was run by invoking the GET /query REST++ endpoint, then the output will be a simple JSON object. Some errors have an error "code" field; others do not:

    Output of Unhandled Exception (query run as REST Endpoint)
    {
      "error": true,
      "message": "<errorMsg>",
      "code": "<errType><errorCode>"
    }

The examples below show two common errors: wrong data type and divide-by-zero. First we define a simple query that divides 100.0 by the query's input parameter.

Example: query excpBuiltin
CREATE QUERY excpBuiltin (INT n1) FOR GRAPH minimalNet {
  PRINT 100.0/n1;
}

We then test three cases:

  1. A valid input (such as n1 = 7)
  2. Wrong data type (n1 = "A")
  3. Divide by zero (n1 = 0)

First we test using the GSQL interface. When the query runs without error, the output is in JSON format. When there is a built-in exception, however, only an error message is displayed.

Exception response for RUN QUERY
$ RUN QUERY excpBuiltin(7)
{
  "error": false,
  "message": "",
  "results": [{"100.0/n1": 14.28571}]
}

$ RUN QUERY excpBuiltin("a")
Values of parameter n1 must be INT64 type, invalid value [a] provided.

$ RUN QUERY excpBuiltin(0)
Runtime Error: divider is zero.

The situation is a little different when running the query as a REST++ endpoint. The output is always in JSON format.

Exception response for GET /query request
$ curl -X GET "http://localhost:9000/query/excpBuiltin?n1=7"
{
  "error": false,
  "message": "",
  "results": [
    {
      "100.0/n1": 14.28571
    }
  ]
}

$ curl -X GET "http://localhost:9000/query/excpBuiltin?n1=a"
{
  "code": "RES-30000",
  "error": true,
  "message": "Values of parameter n1 must be INT64 type, invalid value [a] provided."
}

$ curl -X GET "http://localhost:9000/query/excpBuiltin?n1=0"
{
  "error": true,
  "message": "Runtime Error: divider is zero."
}
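
Because the REST++ output is always JSON, a client can inspect the "error" field and treat "code" as optional. The Python sketch below illustrates one way to do this using the response bodies shown above. The function name describe_response is hypothetical, and the sketch only parses strings; it does not contact a server.

```python
import json

def describe_response(body):
    """Return a one-line summary of a query response body (a JSON string)."""
    response = json.loads(body)
    if not response.get("error", False):
        return "ok: " + json.dumps(response.get("results", []))
    code = response.get("code")  # present for some errors only
    prefix = f"error [{code}]" if code is not None else "error"
    return f"{prefix}: {response.get('message', '')}"

# Response bodies copied from the examples above.
type_error = ('{"code": "RES-30000", "error": true, "message": '
              '"Values of parameter n1 must be INT64 type, invalid value [a] provided."}')
zero_error = '{"error": true, "message": "Runtime Error: divider is zero."}'

print(describe_response(type_error))
print(describe_response(zero_error))
```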

User-Defined Exception Behavior

A query author can specify what should be the response if a particular type of exception occurs within a particular specified block of statements.

The following statement types are available to specify a user-defined exception condition or a user-defined exception response.

  • The EXCEPTION Declaration Statement names a user-defined exception.
  • The RAISE Statement indicates that one of the user-defined exceptions has occurred.
  • The TRY…EXCEPTION Statement is used to define and apply user-defined exception handling to a block of query-body statements. This can be used with or without preceding user-defined EXCEPTION and RAISE statements.

Built-in exceptions always take precedence over user-defined exceptions. Therefore, user-defined exceptions can only be used to catch conditions that would not be caught by a built-in exception. This means that user-defined exceptions are best used to capture situations which are legal according to the general syntax and semantics of the GSQL query language, but which are illegal or undesirable for a particular user application.

EXCEPTION Declaration Statement

declExceptStmt := EXCEPTION exceptVarName "(" errorCode ")"
exceptVarName := name
errorCode := integer

A user-defined exception must be declared before it can be used. An exception declaration statement declares a user-defined exception type, assigning a name and identification number. The id number errorCode must be greater than 40,000.  Numbers 40,000 and lower are reserved for system exceptions. Exception declaration statements must be placed before any query-body statements and after accumulator declaration statements. A query can declare multiple exception types.

RAISE Statement

raiseStmt := RAISE exceptVarName [errorMsg]
errorMsg := "(" expr ")"

The RAISE statement announces that a user-defined exception has just occurred.  The exceptVarName must match one of the exceptions that was previously declared.  An optional error message can be specified. Once the RAISE statement is executed, the flow of execution changes. If the RAISE statement is not within a TRY clause, then the query ends with the default exception response, using the error code and error message defined by the exception type and RAISE statements. If the RAISE is within a TRY statement, then execution jumps to the EXCEPTION handling clause of the TRY statement.

A RAISE statement itself does not include the conditions that define the exception. Typically, the user will use an IF…THEN statement and place the RAISE statement within the THEN clause.

In the current version, a RAISE statement can only be used as a query-body-statement. It cannot be used as a DML-sub-statement. In particular, you cannot RAISE an exception inside a SELECT statement.

The example below defines and checks for two types of exceptions: an empty input set (40001) and no matching edges (40002). Remember that the minimum allowed code number is 40001.

Example: Unhandled User-Defined Exceptions
CREATE QUERY excpCountActivity(SET<VERTEX<person>> vSet, STRING eType) FOR GRAPH socialNet {
  # Count how many edges there are from each member of the input person set to posts,
  # along the specified edge type.
  MapAccum<STRING,INT> @@allCount;
  EXCEPTION emptyList (40001);
  EXCEPTION noEdges (40002);

  IF ISEMPTY(vSet) THEN   ## Raise 40001
    RAISE emptyList ("Error: Input parameter 'vSet' (type SET<VERTEX<person>>) is empty");
  END;

  Start = vSet;
  Results = SELECT s
            FROM Start:s -(:e)-> post:t
            WHERE e.type == eType
            ACCUM @@allCount += (t.subject -> 1);

  IF Results.size() == 0 THEN   ## Raise 40002
    RAISE noEdges ("Error: No '" + eType + "' edges from the vertex set");
  END;

  PRINT @@allCount;
}
Results
// Valid input: no exceptions
$ curl -X GET "http://localhost:9000/query/excpCountActivity?vSet=person2&vSet=person6&eType=posted"
{
  "error": false,
  "message": "",
  "results": [
    {
      "@@allCount": {
        "cats": 1,
        "tigergraph": 2
      }
    }
  ]
}

// empty input set (due to spelling error in parameter name)
$ curl -X GET "http://localhost:9000/query/excpCountActivity?vset=person2&vset=person6&eType=posted"
{
  "code": "40001",
  "error": true,
  "message": "Error: Input parameter 'vSet' (type SET<VERTEX<person>>) is empty"
}

// no edges (due to unknown edge type)
$ curl -X GET "http://localhost:9000/query/excpCountActivity?vSet=person2&vSet=person6&eType=commented"
{
  "code": "40002",
  "error": true,
  "message": "Error: No 'commented' edges from the vertex set"
}
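
The second request above returns the emptyList exception because the parameter name was misspelled as vset. One way to reduce such mistakes is to build the query string programmatically from a single dictionary of parameters. The Python sketch below uses the standard library function urllib.parse.urlencode with doseq=True so that a list value is repeated once per element. The helper name query_url is hypothetical, and its default host and port are simply taken from the curl examples; the sketch only constructs the URL and sends no request.

```python
from urllib.parse import urlencode

def query_url(query_name, params, host="localhost", port=9000):
    """Build a GET /query URL; list values become repeated parameters."""
    query_string = urlencode(params, doseq=True)
    return f"http://{host}:{port}/query/{query_name}?{query_string}"

url = query_url("excpCountActivity",
                {"vSet": ["person2", "person6"], "eType": "posted"})
print(url)
# http://localhost:9000/query/excpCountActivity?vSet=person2&vSet=person6&eType=posted
```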


TRY...EXCEPTION Statement for Custom Error Handling

tryStmt := TRY queryBodyStmts EXCEPTION caseExceptBlock+ [elseExceptBlock] END ";"
caseExceptBlock := WHEN exceptVarName THEN queryBodyStmts
elseExceptBlock := ELSE queryBodyStmts

The TRY…EXCEPTION Statement is used to define and apply user-defined exception handling to a block of query-body statements. A TRY...EXCEPTION statement can be nested within a TRY block or EXCEPTION block.

The current version of GSQL does not support custom handling of built-in exceptions. Therefore, if a built-in exception occurs, it ignores the TRY...EXCEPTION blocks and simply applies the default handling, and the query aborts. In future updates, we plan to support custom handling of both custom exceptions (RAISE) and built-in exceptions with the TRY...EXCEPTION block.

The TRY…EXCEPTION Statement is a compound statement containing two blocks. The first block (TRY) consists of the query-body statements for which custom error handling should be applied. The second block (EXCEPTION) contains a series of WHEN…THEN exception handling clauses.  Each exception handling clause names an exception type and specifies what actions to take in the event of the exception. An optional ELSE clause contains handling statements for all other exceptions. The following text and visual flowchart detail how the TRY...EXCEPTION block handles an exception.

When an exception occurs within a TRY block, the flow of execution skips the remainder of the TRY block and jumps to the EXCEPTION block. The GSQL flow now seeks to match the exception type with a handler. After executing the handling statements in the THEN or ELSE clause, the flow skips the remainder of the EXCEPTION block and continues with the statement following the END statement. However, if there is no matching WHEN or ELSE handler, then the exception is propagated. That is, the RAISE state is maintained after exiting the EXCEPTION block. If the TRY...EXCEPTION block is nested inside another TRY block, then the handling process is repeated at this upper level. This repeats until either the exception is handled or there are no more TRY...EXCEPTION blocks.

Finally, if the unhandled exception is not within a TRY block, then the query is aborted, and the default exception response is the output.

Case 1: If cond1 is true in the outer TRY block,

  • RAISE A and jump to the outer EXCEPTION block.

Handled by ELSE handStmtsZ.

Case 2: If cond2 is true in the inner TRY block,

  • RAISE A and jump to the inner EXCEPTION block.

Handled by handStmtsX.

Case 3: If cond3 is true in the inner TRY block,

  • RAISE B and jump to the inner EXCEPTION block. There is no matching handler here, so propagate the exception. Jump to the outer EXCEPTION block. Handled by handStmtsY.
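
The three cases can be modeled with ordinary nested try/except blocks. The Python sketch below is only an analogy, not GSQL: the block layout (an inner handler for A only, an outer handler for B plus an ELSE) is inferred from the case descriptions above, and the classes A and B and the function run are hypothetical stand-ins.

```python
class A(Exception):
    """Stands in for user-defined exception A."""

class B(Exception):
    """Stands in for user-defined exception B."""

def run(cond1=False, cond2=False, cond3=False):
    """Return the name of the handler that ends up handling the exception."""
    try:                      # outer TRY block
        if cond1:
            raise A()
        try:                  # inner TRY block
            if cond2:
                raise A()
            if cond3:
                raise B()
        except A:             # inner EXCEPTION: WHEN A THEN handStmtsX
            return "handStmtsX"
        # The inner block has no handler for B, so B propagates outward.
    except B:                 # outer EXCEPTION: WHEN B THEN handStmtsY
        return "handStmtsY"
    except Exception:         # outer EXCEPTION: ELSE handStmtsZ
        return "handStmtsZ"
    return "no exception"

print(run(cond1=True))  # handStmtsZ  (Case 1)
print(run(cond2=True))  # handStmtsX  (Case 2)
print(run(cond3=True))  # handStmtsY  (Case 3)
```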

Custom Handling Example:

The following example is a modified shortest path query.  It looks for all paths from a source to a target in a computer network. It uses breadth-first search and stops at depth N when it has found at least one path at depth N, or it has searched the entire graph. There are three conditions which will cause it to RAISE an exception and abort the search:

  1. Seeing an edge with a negative connection speed (because the graph has bad data).
  2. Seeing an edge with a very slow connection speed (again because the graph has bad data).
  3. If no path was found in the graph (the search is already over, but we skip printing results).

Note that cases 1 and 2 do NOT mean that a negative or slow speed edge is actually on a shortest path, only that the query noticed a bad edge during its search. Also, because we cannot RAISE within the SELECT block, we use a workaround: set an integer variable with an error code.  Immediately after the SELECT block, test the integer variable and RAISE exceptions as needed.

Example: Path Search with Exceptions
CREATE QUERY compPathValid (vertex<computer> src, vertex<computer> tgt, BOOL enExcp) FOR GRAPH computerNet {
  # Find valid paths in a computer network from a source to a target.
  # Stop search once you have found some paths.
  # 3 Exceptions: (1) Negative connection speed,
  #   (2) Slow connection speed, (3) No path.
  OrAccum @@reached, @visited;
  ListAccum<STRING> @paths;
  DOUBLE minSpeed = 0.4;
  INT err;
  EXCEPTION negSpeed (40001);
  EXCEPTION slowSpeed (40002);
  EXCEPTION notReached (40003);

  TRY
    Start = {src};
    # Initialize: path to src is itself.
    Start = SELECT s
            FROM Start:s
            ACCUM s.@paths = s.id;

    WHILE Start.size() != 0 AND NOT @@reached DO
      Start = SELECT t
              FROM Start:s -(:e)-> :t
              WHERE t.@visited == false
              ACCUM CASE
                      WHEN e.connectionSpeed < 0 THEN err = 1
                      WHEN e.connectionSpeed < minSpeed THEN err = 2
                      WHEN t == tgt THEN @@reached += true
                    END,
                    t.@paths += (s.@paths + "~") + t.id
              POST-ACCUM t.@visited = true;
      IF err == 1 AND enExcp THEN
        RAISE negSpeed ("Negative Speed");
      ELSE IF err == 2 AND enExcp THEN
        RAISE slowSpeed ("Slow Speed");
      END;
    END; # WHILE

    IF NOT @@reached AND enExcp THEN
      RAISE notReached ("No path to target");
    END;
  EXCEPTION
    WHEN negSpeed THEN PRINT "bad path: negative speed";
    WHEN slowSpeed THEN PRINT "bad path: slow speed";
    WHEN notReached THEN PRINT "no path from source to target";
  END;

  Result = {tgt};
  PRINT Result.@paths;
}

As the data in Appendix D show:

  • Any search passing through c1 will see negative edges.
  • Any search passing through c12 will see negative and slow edges.
  • Any search passing through c14 will see negative edges.

The results for 4 cases are shown: 1 valid search plus each of the 3 exception conditions.

Results
$ GSQL 'RUN QUERY compPathValid("c10","c12",true)'
{
  "error": false,
  "message": "",
  "results": [{
    "v_set": "Result",
    "v_id": "c12",
    "v": {"Result.@paths": ["c10~c11~c12"]},
    "v_type": "computer"
  }]
}

$ GSQL 'RUN QUERY compPathValid("c1","c12",true)'
{
  "error": false,
  "message": "",
  "results": [{"bad path: negative speed": "bad path: negative speed"}]
}

$ GSQL 'RUN QUERY compPathValid("c10","c13",true)'
{
  "error": false,
  "message": "",
  "results": [{"bad path: slow speed": "bad path: slow speed"}]
}

$ GSQL 'RUN QUERY compPathValid("c24","c25",true)'
{
  "error": false,
  "message": "",
  "results": [{"no path from source to target": "no path from source to target"}]
}


Exception Handling Flowchart

The flowchart below summarizes all the cases for triggering and handling exceptions, both user-defined and built-in.







End of Exception Statement Section



Comments

A comment is a section of text that is ignored by the language parser; its purpose is to provide information to human readers.  The comment markers follow the conventions used in C++ and SQL:

  • Single-line or partial-line comments begin with either # or // and end at the end of the line (with the newline character).
  • Multi-line comment blocks begin with /* and end with */



Appendices

Appendix A. Common Errors and Problems

Floating Point Precision Limits

No computer can store all floating point numbers (i.e., non-integers) with perfect precision. The float data type offers about 7 decimal digits of precision; the double data type offers about 15 decimal digits of precision. Comparing two float or double values by using operators involving exact equality (==, <=, >=, BETWEEN ... AND ...) might lead to unexpected behavior. If the GSQL language parser detects that the user is attempting an exact equivalence test with float or double data types, it will display a warning message and suggestion. For example, if there are two float variables v and v2, the expression v == v2 causes the following warning message:

The comparison 'v==v2' may lead to unexpected behavior because it involves equality test between float/double numeric values. We suggest to do such comparison with an error margin, e.g. 'abs((v) - (v2)) < epsilon', where epsilon is a very small positive value of your choice, such as 0.0001.
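
The same issue exists in most programming languages, so the suggested error-margin comparison can be illustrated outside of GSQL. The Python sketch below is an analogy only; the function name nearly_equal is hypothetical, and the default epsilon of 0.0001 is simply the value suggested in the warning message.

```python
def nearly_equal(v, v2, epsilon=0.0001):
    """Compare two floating point values with an error margin."""
    return abs(v - v2) < epsilon

total = 0.1 + 0.2                  # not exactly 0.3 in binary floating point
print(total == 0.3)                # False
print(nearly_equal(total, 0.3))    # True
```

The appropriate epsilon depends on the magnitude of the values being compared and on the precision the application needs.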

Response to Non-existent vertex ID

If a query has a vertex parameter (VERTEX or VERTEX<vType>), and if the ID for a nonexistent vertex is given when running the query, an error message is shown, and the query won't run. This is also the response when calling a function to convert a single vertex ID string to a vertex:

  • to_vertex(): See Section "Miscellaneous Functions".

However, if the parameter is a vertex set (SET<VERTEX> or SET<VERTEX<vType>>), and one or more nonexistent IDs are given when running the query, a warning message is shown, but the query still runs, ignoring those nonexistent IDs. Therefore, if all given IDs are nonexistent, the parameter becomes an empty set. This is also the response when calling a function to convert a set of vertex IDs to a set of vertices:

  • to_vertex_set(): See Section "Miscellaneous Functions".
  • SelectVertex(): See Section "Miscellaneous Functions".
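The two behaviors can be contrasted with a small Python model. This is a sketch of the rules described above, not TigerGraph code: the names resolve_vertex, resolve_vertex_set, VertexNotFound, and known_ids are all hypothetical, and the warning is modeled as a returned list of the ignored IDs.

```python
class VertexNotFound(Exception):
    """Models the error raised for a single nonexistent vertex ID."""


def resolve_vertex(vertex_id, known_ids):
    # Single vertex parameter: a nonexistent ID is an error, nothing runs.
    if vertex_id not in known_ids:
        raise VertexNotFound(f"vertex {vertex_id!r} does not exist")
    return vertex_id


def resolve_vertex_set(vertex_ids, known_ids):
    # Vertex set parameter: nonexistent IDs are reported and then ignored.
    found = {v for v in vertex_ids if v in known_ids}
    ignored = sorted(set(vertex_ids) - found)
    return found, ignored


known_ids = {"person1", "person2", "person3"}

partial, partial_ignored = resolve_vertex_set(["person1", "ghost"], known_ids)
empty, empty_ignored = resolve_vertex_set(["ghost", "phantom"], known_ids)

try:
    resolve_vertex("ghost", known_ids)
    single_failed = False
except VertexNotFound:
    single_failed = True
```

In this model, partial holds only person1, empty is an empty set, and the single-vertex lookup fails outright.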


Appendix B. Complete Formal Syntax for Query Language

Version 0.8.1

This is the definition for the GSQL Query Language syntax.  It is defined as a set of rules expressed in EBNF notation.

Notation Used to Define Syntax

This defines the EBNF notation used to describe the syntax. Rules contain terminal and non-terminal symbols. A terminal symbol is a base-level symbol which expresses literal output. All symbols in quotes (e.g., "+", "=", ")", "10") are terminal symbols. A non-terminal symbol is defined as some combination of terminal and non-terminal symbols. The left-hand side of a rule is always a non-terminal; this rule defines the non-terminal. The example rule below defines assignmentStmt (that is, an Assignment Statement) to be a name followed by an equal sign followed by an expression, operator, and expression with a terminating semicolon. assignmentStmt, name, expr, and op are all non-terminals. Additionally, all KEYWORDS are in all-capitals and are terminal symbols. The ":=" is part of EBNF and states that the left-hand side can be expanded to the right-hand side.

EBNF Syntax example: A rule
assignmentStmt := name "=" expr op expr ";"

A vertical bar | in EBNF indicates choice.  Choose either the symbol on the left or on the right.  A sequence of vertical bars means choose any one of the symbols in the sequence.

EBNF Syntax: vertical bar
op := "+" | "-" | "*" | "/"

Square brackets [ ] indicate an optional part or group of symbols. Parentheses ( ) group symbols together.  The rule below defines a constant to be one, two, or three digits preceded by an optional plus or minus sign.

EBNF Syntax: Square brackets and parentheses
constant := ["+" | "-"] (digit | (digit digit) | (digit digit digit))

Star * and plus + are symbols in EBNF for closure. Star means zero or more occurrences, and plus means one or more occurrences. The following defines intConstant to be an optional plus or minus followed by one or more digits. It also defines floatConstant to be an optional plus or minus followed by zero or more digits followed by a decimal point followed by one or more digits. The star and plus can also be applied to groups of symbols, as in the definition of list. The non-terminal list is defined as a parenthesized list of comma-separated expressions (expr). The list has at least one expression, which can be followed by zero or more comma-expression pairs.

EBNF Syntax: star and plus
intConstant := ["+" | "-"] digit+
floatConstant := ["+" | "-"] digit* "." digit+
list := "(" expr ["," expr]* ")"
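One way to check your reading of these example rules is to transcribe them into regular expressions. The Python sketch below does this for constant, intConstant, and floatConstant. The variable names and the matches helper are hypothetical; the list rule is omitted because expr is a non-terminal that a single regular expression cannot capture.

```python
import re

DIGIT = r"[0-9]"
SIGN = r"[+-]?"  # ["+" | "-"]: an optional plus or minus

# constant := ["+" | "-"] (digit | (digit digit) | (digit digit digit))
CONSTANT = re.compile(rf"{SIGN}{DIGIT}{{1,3}}")

# intConstant := ["+" | "-"] digit+
INT_CONSTANT = re.compile(rf"{SIGN}{DIGIT}+")

# floatConstant := ["+" | "-"] digit* "." digit+
FLOAT_CONSTANT = re.compile(rf"{SIGN}{DIGIT}*\.{DIGIT}+")


def matches(rule, text):
    """True if the whole text can be derived from the rule."""
    return rule.fullmatch(text) is not None


checks = {
    "constant +123": matches(CONSTANT, "+123"),
    "constant 1234": matches(CONSTANT, "1234"),
    "intConstant -42": matches(INT_CONSTANT, "-42"),
    "floatConstant .5": matches(FLOAT_CONSTANT, ".5"),
    "floatConstant 5.": matches(FLOAT_CONSTANT, "5."),
}
```

Under these rules ".5" is a valid floatConstant because digit* allows zero leading digits, while "5." is not because digit+ requires at least one digit after the decimal point.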


GSQL Query Language EBNF

#########################################################
## EBNF for GSQL Query Language (GQuery)

createQuery := CREATE [OR REPLACE] QUERY name "(" [parameterList] ")" FOR GRAPH name
               [RETURNS "(" baseType | accumType ")"]
               "{" [typedefs] [declAccumStmts] [declStmts] [declExceptStmts] queryBodyStmts "}"

parameterValueList := parameterValue [, parameterValue]*
parameterValue := parameterConstant
                | "[" parameterValue [, parameterValue]* "]"  // BAG or SET
                | "(" stringLiteral, stringLiteral ")"        // a generic VERTEX value
parameterConstant := numeric | stringLiteral | TRUE | FALSE
parameterList := parameterType name ["=" constant] ["," parameterType name ["=" constant]]*

typedefs := (typedef ";")+
declAccumStmts := (declAccumStmt ";")+
declExceptStmts := (declExceptStmt ";")+
declStmts := (declStmt ";")+
queryBodyStmts := (queryBodyStmt ";")+

queryBodyStmt := assignStmt            // Assignment
               | vSetVarDeclStmt       // Declaration
               | gAccumAssignStmt      // Assignment
               | gAccumAccumStmt       // Assignment
               | funcCallStmt          // Function Call
               | selectStmt            // Select
               | queryBodyCaseStmt     // Control Flow
               | queryBodyIfStmt       // Control Flow
               | queryBodyWhileStmt    // Control Flow
               | queryBodyForEachStmt  // Control Flow
               | BREAK                 // Control Flow
               | CONTINUE              // Control Flow
               | updateStmt            // Data Modification
               | insertStmt            // Data Modification
               | queryBodyDeleteStmt   // Data Modification
               | printStmt             // Output
               | logStmt               // Output
               | returnStmt            // Output
               | raiseStmt             // Exception
               | tryStmt               // Exception

installQuery := INSTALL QUERY [installOptions] ( "*" | ALL |name [, name]* )
runQuery := RUN QUERY [runOptions] name "(" parameterValueList ")"
showQuery := SHOW QUERY name
dropQuery := DROP QUERY ( "*" | ALL | name [, name]* )

#########################################################
## Types and names

lowercase := [a-z]
uppercase := [A-Z]
letter := lowercase | uppercase
digit := [0-9]
integer := ["-"]digit+
real := ["-"]("." digit+) | ["-"](digit+ "." digit*)
numeric := integer | real
stringLiteral := '"' [~["] | '\\' ('"' | '\\')]* '"'

name := (letter | "_") [letter | digit | "_"]*  // Can be a single "_" or start with "_"

type := baseType | name | accumType | STRING COMPRESS

baseType := INT
          | UINT
          | FLOAT
          | DOUBLE
          | STRING
          | BOOL
          | VERTEX ["<" name ">"]
          | EDGE
          | JSONOBJECT
          | JSONARRAY
          | DATETIME

filePath := name | stringLiteral

typedef := TYPEDEF TUPLE "<" tupleType ">" name

tupleType := (baseType name) | (name baseType) ["," (baseType name) | (name baseType)]*

parameterType := baseType
               | [ SET | BAG ] "<" baseType ">"

#########################################################
## Accumulators

declAccumStmt := accumType "@"name ["=" constant][, "@"name ["=" constant]]*
               | "@"name ["=" constant][, "@"name ["=" constant]]* accumType
               | [STATIC] accumType "@@"name ["=" constant][, "@@"name ["=" constant]]*
               | [STATIC] "@@"name ["=" constant][, "@@"name ["=" constant]]* accumType

accumType := "SumAccum" "<" ( INT | FLOAT | DOUBLE | STRING ) ">"
           | "MaxAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
           | "MinAccum" "<" ( INT | FLOAT | DOUBLE ) ">"
           | "AvgAccum"
           | "OrAccum"
           | "AndAccum"
           | "BitwiseOrAccum"
           | "BitwiseAndAccum"
           | "ListAccum" "<" type ">"
           | "SetAccum" "<" elementType ">"
           | "BagAccum" "<" elementType ">"
           | "MapAccum" "<" elementType "," type ">"
           | "HeapAccum" "<" name ">" "(" (integer | name) "," name [ASC | DESC] ["," name [ASC | DESC]]* ")"
           | "GroupByAccum" "<" elementType name ["," elementType name]* , accumType name ["," accumType name]* ">"
           | "ArrayAccum" "<" name ">"

elementType := baseType | name | STRING COMPRESS

gAccumAccumStmt := "@@"name "+=" expr

###############################################################################
## Operators, Functions, and Expressions

constant := numeric | stringLiteral | TRUE | FALSE | GSQL_UINT_MAX | GSQL_INT_MAX | GSQL_INT_MIN | TO_DATETIME "(" stringLiteral ")"

mathOperator := "*" | "/" | "%" | "+" | "-" | "<<" | ">>" | "&" | "|"

condition := expr
           | expr comparisonOperator expr
           | expr [ NOT ] IN setBagExpr
           | expr IS [ NOT ] NULL
           | expr BETWEEN expr AND expr
           | "(" condition ")"
           | NOT condition
           | condition (AND | OR) condition
           | (TRUE | FALSE)

comparisonOperator := "<" | "<=" | ">" | ">=" | "==" | "!="

expr := ["@@"]name
      | name "." "type"
      | name "." ["@"]name
      | name "." "@"name ["\'"]
      | name "." name "." name "(" [argList] ")"
      | name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
      | name ["<" type ["," type"]* ">"] "(" [argList] ")"
      | name "." "@"name ("." name "(" [argList] ")")+ ["." name]
      | "@@"name ("." name "(" [argList] ")")+ ["." name]
      | COALESCE "(" [argList] ")"
      | ( COUNT | ISEMPTY | MAX | MIN | AVG | SUM ) "(" setBagExpr ")"
      | expr mathOperator expr
      | "-" expr
      | "(" expr ")"
      | "(" argList "->" argList ")"  // key value pair for MapAccum
      | "[" argList "]"               // a list
      | constant
      | setBagExpr
      | name "(" argList ")"          // function call or a tuple object

setBagExpr := ["@@"]name
            | name "." ["@"]name
            | name "." "@"name ("." name "(" [argList] ")")+
            | name "." name "(" [argList] ")" [ "." FILTER "(" condition ")" ]
            | "@@"name ("." name "(" [argList] ")")+
            | setBagExpr (UNION | INTERSECT | MINUS) setBagExpr
            | "(" argList ")"
            | "(" setBagExpr ")"

#########################################################
## Declarations and Assignments ##

## Declarations ##
declStmt := baseType name ["=" constant][, name ["=" constant]]*
localVarDeclStmt := baseType name "=" expr

vSetVarDeclStmt := name ["(" vertexEdgeType ")"] "=" (seedSet | simpleSet | selectBlock)

simpleSet := name
           | "(" simpleSet ")"
           | simpleSet (UNION | INTERSECT | MINUS) simpleSet

seedSet := "{" [seed ["," seed ]*] "}"
seed := '_'
      | ANY
      | ["@@"]name
      | name ".*"
      | "SelectVertex" selectVertParams

selectVertParams := "(" filePath "," columnId "," (columnId | name) "," stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]

columnId := "$" (integer | stringLiteral)

## Assignment Statements ##
assignStmt := name "=" expr
            | name "." name "=" expr
            | name "." "@"name ("+="| "=") expr

gAccumAssignStmt := "@@"name ("+=" | "=") expr

loadAccumStmt := "@@"name "=" "{" "LOADACCUM" loadAccumParams ["," "LOADACCUM" loadAccumParams]* "}"

loadAccumParams := "(" filePath "," columnId "," [columnId ","]* stringLiteral "," (TRUE | FALSE) ")" ["." FILTER "(" condition ")"]

## Function Call Statement ##
funcCallStmt := name ["<" type ["," type"]* ">"] "(" [argList] ")"
              | "@@"name ("." name "(" [argList] ")")+

argList := expr ["," expr]*

#########################################################
## Select Statement

selectStmt := name "=" selectBlock

selectBlock := SELECT name FROM ( edgeSet | vertexSet )
               [sampleClause]
               [whereClause]
               [accumClause]
               [postAccumClause]
               [havingClause]
               [orderClause]
               [limitClause]

vertexSet := name [":" name]

edgeSet := name [":" name] "-" "(" [vertexEdgeType] [":" name] ")" "->" [vertexEdgeType] [":" name]

vertexEdgeType := "_" | ANY | name | ( "(" name ["|" name]* ")" )

sampleClause := SAMPLE ( expr | expr "%" ) EDGE WHEN condition
              | SAMPLE expr TARGET WHEN condition
              | SAMPLE expr "%" TARGET PINNED WHEN condition

whereClause := WHERE condition

accumClause := ACCUM DMLSubStmtList

postAccumClause := POST-ACCUM DMLSubStmtList

DMLSubStmtList := DMLSubStmt ["," DMLSubStmt]*

DMLSubStmt := assignStmt         // Assignment
            | funcCallStmt       // Function Call
            | gAccumAccumStmt    // Assignment
            | vAccumFuncCall     // Function Call
            | localVarDeclStmt   // Declaration
            | DMLSubCaseStmt     // Control Flow
            | DMLSubIfStmt       // Control Flow
            | DMLSubWhileStmt    // Control Flow
            | DMLSubForEachStmt  // Control Flow
            | BREAK              // Control Flow
            | CONTINUE           // Control Flow
            | insertStmt         // Data Modification
            | DMLSubDeleteStmt   // Data Modification
            | logStmt            // Output

vAccumFuncCall := name "." "@"name ("." name "(" [argList] ")")+

localVarDeclStmt := ( baseType name | name baseType ) "=" expr

havingClause := HAVING condition

orderClause := ORDER BY expr [ASC | DESC] ["," expr [ASC | DESC]]*

limitClause := LIMIT ( expr | expr "," expr | expr OFFSET expr )

#########################################################
## Control Flow Statements ##

queryBodyIfStmt := IF condition THEN queryBodyStmts
                   [ELSE IF condition THEN queryBodyStmts ]*
                   [ELSE queryBodyStmts ] END
DMLSubIfStmt := IF condition THEN DMLSubStmtList
                [ELSE IF condition THEN DMLSubStmtList ]*
                [ELSE DMLSubStmtList ] END

queryBodyCaseStmt := CASE (WHEN condition THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
                   | CASE expr (WHEN constant THEN queryBodyStmts)+ [ELSE queryBodyStmts] END
DMLSubCaseStmt := CASE (WHEN condition THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END
                | CASE expr (WHEN constant THEN DMLSubStmtList)+ [ELSE DMLSubStmtList] END

queryBodyWhileStmt := WHILE condition [LIMIT (name | integer)] DO queryBodyStmts END
DMLSubWhileStmt := WHILE condition [LIMIT (name | integer)] DO DMLSubStmtList END

queryBodyForEachStmt := FOREACH forEachControl DO queryBodyStmts END
DMLSubForEachStmt := FOREACH forEachControl DO DMLSubStmtList END

forEachControl := ( name | "(" name (, name)+ ")") (IN | ":") setBagExpr
                | name IN RANGE "[" expr , expr"]" [".STEP(" expr ")"]

#########################################################
## Other Data Modifications Statements ##

queryBodyDeleteStmt := DELETE name FROM ( edgeSet | vertexSet ) [whereClause]
DMLSubDeleteStmt := DELETE "(" name ")"

updateStmt := UPDATE name FROM ( edgeSet | vertexSet ) SET DMLSubStmtList [whereClause]

insertStmt := INSERT INTO name ["(" ( PRIMARY_ID | FROM "," TO ) ["," name]* ")"]
              VALUES "(" ( "_" | expr ) [name] ["," ( "_" | expr ) [name] ["," ("_" | expr)]*] ")"

#########################################################
## Output Statements ##

printStmt := PRINT argList [WHERE condition] [">" filePath]
logStmt := LOG "(" condition "," argList ")"
returnStmt := RETURN expr

#########################################################
## Exception Statements ##

declExceptStmt := EXCEPTION exceptVarName "(" errorInt ")"
exceptVarName := name
errorInt := integer

raiseStmt := RAISE exceptVarName [errorMsg]
errorMsg := "(" expr ")"

tryStmt := TRY queryBodyStmts EXCEPTION caseExceptBlock+ [elseExceptBlock] END ";"
caseExceptBlock := WHEN exceptVarName THEN queryBodyStmts
elseExceptBlock := ELSE queryBodyStmts
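The lexical rules in the "Types and names" part of the grammar (name, integer, real) can be read as regular expressions. The Python sketch below transcribes them and classifies a few tokens. It is an informal aid for reading the rules, not the GSQL lexer; the classify helper and the pattern names are hypothetical.

```python
import re

# name := (letter | "_") [letter | digit | "_"]*
NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

# integer := ["-"]digit+
INTEGER = re.compile(r"-?[0-9]+")

# real := ["-"]("." digit+) | ["-"](digit+ "." digit*)
REAL = re.compile(r"-?\.[0-9]+|-?[0-9]+\.[0-9]*")


def classify(token):
    """Return the rule a token matches in full, or None if it matches none."""
    if INTEGER.fullmatch(token):
        return "integer"
    if REAL.fullmatch(token):
        return "real"
    if NAME.fullmatch(token):
        return "name"
    return None


results = {t: classify(t) for t in ["-42", "5.", "-.5", "_", "+5", "9lives"]}
```

Two details stand out when the rules are written this way: the numeric rules allow only a leading minus sign, so "+5" matches none of them, and a real may end with a bare decimal point ("5.").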





Appendix C. Query Language Reserved Words

The following words are reserved for use by the GSQL query language. This includes words which are currently keywords (such as GRAPH), as well as words which might be used in the future (such as EXTERN).

Query Language Reserved Words
ACCUM ALIGNAS ALIGNOF AND AND_EQ ANY ASC ASM AUTO AVG BAG BETWEEN BITAND BITOR BOOL BREAK BY CASE CATCH CHAR CHAR16_T CHAR32_T CLASS COALESCE COMPL COMPRESS CONCEPT CONST CONSTEXPR CONST_CAST CONTINUE COUNT CREATE DATETIME DATETIME_ADD DATETIME_SUB DECLTYPE DEFAULT DELETE DESC DO DONE DOUBLE DYNAMIC_CAST EDGE ELSE END ENUM ESCAPE EXCEPTION EXPLICIT EXPORT EXTERN FALSE FILTER FLOAT FOR FOREACH FRIEND FROM GOTO GRAPH GSQL_INT_MAX GSQL_INT_MIN GSQL_UINT_MAX HAVING IF IN INLINE INSERT INT INTERSECT INTERVAL INTO IS ISEMPTY JSONARRAY JSONOBJECT LIKE LIMIT LIST LOADACCUM LOG LONG MAP MAX MIN MINUS MUTABLE NAMESPACE NEW NOEXCEPT NOT NOT_EQ NOW NULL NULLPTR OFFSET OPERATOR OR ORDER OR_EQ PINNED PRIMARY_ID PRINT PRIVATE PRODUCT PROTECTED PUBLIC QUERY RAISE RANGE REGISTER REINTERPRET_CAST REQUIRES RETURN RETURNS SAMPLE SELECT SELECTVERTEX SET SHORT SIGNED SIZEOF STATIC STATIC_ASSERT STATIC_CAST STRING STRUCT SUM SWITCH TARGET TEMPLATE THEN THIS THREAD_LOCAL THROW TO TO_DATETIME TRUE TRY TUPLE TYPEDEF TYPEID TYPENAME TYPES UINT UNION UNSIGNED UPDATE USER USING VALUES VERTEX VIRTUAL VOID VOLATILE WCHAR_T WHEN WHERE WHILE XOR XOR_EQ

Appendix D. Example Graphs - command and data files


Below are the graph creation and loading command files and the data files used to generate the six example graphs in this document: workNet, socialNet, friendNet, computerNet, minimalNet, and investmentNet. The tar-gzip file gsql_example_graphs_v1.0.gz contains all of these files. Each graph has its own folder. To create a particular graph, go to its folder and run the following command:
gsql graph_create.gsql

workNet

graph_create.gsql for workNet
DROP ALL
CREATE VERTEX person(PRIMARY_ID personId STRING, id STRING, locationId STRING, skillSet SET<INT>, skillList LIST<INT>, interestSet SET<STRING COMPRESS>, interestList LIST<STRING COMPRESS>)
CREATE VERTEX company(PRIMARY_ID clientId STRING, id STRING, country STRING)
CREATE UNDIRECTED EDGE worksFor(FROM person, TO company, startYear INT, startMonth INT, fullTime BOOL)
CREATE GRAPH workNet(*)
CREATE ONLINE_POST JOB loadMember FOR GRAPH workNet {
  LOAD TO VERTEX person VALUES($0, $0, $1, _, _, SPLIT($3,"|"), SPLIT($3,"|") ),
       TO TEMP_TABLE t2(id, skill) VALUES ($0, flatten($2,"|",1));
  LOAD TEMP_TABLE t2
       TO VERTEX person VALUES($0, _, _, $"skill", $"skill", _, _);
}
CREATE ONLINE_POST JOB loadCompany FOR GRAPH workNet {
  LOAD TO VERTEX company VALUES($0, $0, $1);
}
CREATE ONLINE_POST JOB loadMemberCompany FOR GRAPH workNet {
  LOAD TO EDGE worksFor VALUES($0, $1, $2, $3, $4);
}
RUN JOB loadMember USING FILENAME="./persons", SEPARATOR = ",", EOL = "\n"
RUN JOB loadCompany USING FILENAME="./companies", SEPARATOR = ",", EOL = "\n"
RUN JOB loadMemberCompany USING FILENAME="./person_company", SEPARATOR = ",", EOL = "\n"
file: persons (vertices)
person1,us,1|2|3,management|financial
person2,chn,2|3|5|6,engineering
person3,jp,4|1|6,teaching
person4,us,4|1|10,football
person5,can,|8|2|5,sport|financial|engineering
person6,jp,7|10,music|art
person7,us,8|6,art|sport
person8,chn,1|5|2,management
person9,us,4|7|2,financial|teaching
person10,us,3,football|sport
person11,can,10,sport|football
person12,jp,1|5|2|2|2,music|engineering|teaching|teaching|teaching
file: companies (vertices)
company1,us
company2,chn
company3,jp
company4,us
company5,can
file: person_company (edges)
person1,company1,2016,1,1
person1,company2,2014,3,0
person2,company1,2015,7,1
person2,company2,2012,6,0
person3,company1,2016,6,1
person4,company2,2013,2,1
person5,company2,2016,4,0
person6,company1,2015,1,1
person7,company2,2016,3,0
person7,company3,2014,1,0
person8,company1,2013,2,1
person9,company2,2015,12,1
person9,company3,2016,11,1
person10,company1,2016,2,1
person10,company3,2014,5,0
person11,company5,2016,5,1
person12,company4,2014,1,1

socialNet

graph_create.gsql for socialNet
DROP ALL
CREATE VERTEX person(PRIMARY_ID personId UINT, id STRING, gender STRING) WITH STATS="OUTDEGREE_BY_EDGETYPE"
CREATE UNDIRECTED EDGE friend(FROM person, TO person)
CREATE VERTEX post(PRIMARY_ID postId UINT, subject STRING, postTime DATETIME)
CREATE DIRECTED EDGE posted(FROM person, TO post)
CREATE DIRECTED EDGE liked(FROM person, TO post, actionTime DATETIME)
CREATE GRAPH socialNet(*)
CREATE ONLINE_POST JOB loadMember FOR GRAPH socialNet {
  LOAD TO VERTEX person VALUES($0, $0, $1) ;
}
CREATE ONLINE_POST JOB loadFriend FOR GRAPH socialNet {
  LOAD TO EDGE friend VALUES($0, $1) ;
}
CREATE ONLINE_POST JOB loadPost FOR GRAPH socialNet {
  LOAD TO VERTEX post VALUES($0, $1, $2);
}
CREATE ONLINE_POST JOB loadPosted FOR GRAPH socialNet {
  LOAD TO EDGE posted VALUES($0, $1) ;
}
CREATE ONLINE_POST JOB loadLiked FOR GRAPH socialNet {
  LOAD TO EDGE liked VALUES($0, $1, $2) ;
}
RUN JOB loadMember USING FILENAME="./persons", SEPARATOR=",", EOL="\n"
RUN JOB loadFriend USING FILENAME="./friends", SEPARATOR=",", EOL="\n"
RUN JOB loadPost USING FILENAME="./posts", SEPARATOR=",", EOL="\n"
RUN JOB loadPosted USING FILENAME="./posted", SEPARATOR=",", EOL="\n"
RUN JOB loadLiked USING FILENAME="./liked", SEPARATOR=",", EOL="\n"
file: persons (vertices)
person1,Male
person2,Female
person3,Male
person4,Female
person5,Female
person6,Male
person7,Male
person8,Male
file: friends (edges)
person1,person2
person2,person3
person3,person4
person4,person5
person4,person6
person5,person7
person6,person8
person7,person8
person8,person1
file: posts (vertices)
0,Graphs,2010-01-12 11:22:05
1,tigergraph,2011-03-03 23:02:00
2,query languages,2011-02-03 01:02:42
3,cats,2011-02-05 01:02:44
4,coffee,2011-02-07 05:02:51
5,tigergraph,2011-02-06 01:02:02
6,tigergraph,2011-02-05 02:02:05
7,Graphs,2011-02-04 17:02:41
8,cats,2011-02-03 17:05:52
9,cats,2011-02-05 23:12:42
10,cats,2011-02-04 03:02:31
11,cats,2011-02-03 01:02:21
file: posted (edges)
person1,0
person2,1
person3,2
person4,3
person5,4
person5,11
person6,5
person6,10
person7,6
person7,9
person8,7
person8,8
file: liked (edges)
person1,0,2010-01-11 11:32:00
person2,0,2010-01-12 10:52:15
person2,3,2010-01-11 16:02:26
person3,0,2010-01-16 05:15:53
person4,4,2010-01-13 03:16:05
person5,6,2010-01-12 21:12:05
person6,8,2010-01-14 11:23:05
person7,10,2010-01-12 11:22:05
person8,4,2010-01-11 03:26:05

friendNet

graph_create.gsql for friendNet
DROP ALL
CREATE VERTEX person(PRIMARY_ID personId UINT, id STRING)
CREATE UNDIRECTED EDGE friend(FROM person, TO person)
CREATE UNDIRECTED EDGE coworker(FROM person, TO person)
CREATE GRAPH friendNet(*)
CREATE ONLINE_POST JOB loadMember FOR GRAPH friendNet {
  LOAD TO VERTEX person VALUES($0, $0);
}
CREATE ONLINE_POST JOB loadFriend FOR GRAPH friendNet {
  LOAD TO EDGE friend VALUES($0, $1);
}
CREATE ONLINE_POST JOB loadCoworker FOR GRAPH friendNet {
  LOAD TO EDGE coworker VALUES($0, $1);
}
RUN JOB loadMember USING FILENAME="./persons", SEPARATOR=",", EOL="\n"
RUN JOB loadFriend USING FILENAME="./friends", SEPARATOR=",", EOL="\n"
RUN JOB loadCoworker USING FILENAME="./coworkers", SEPARATOR=",", EOL="\n"
file: persons (vertices)
person1
person2
person3
person4
person5
person6
person7
person8
person9
person10
person11
person12
file: friends (edges)
person1,person2
person1,person3
person1,person4
person2,person8
person3,person9
person4,person6
person5,person6
person6,person9
person7,person9
person8,person10
person9,person8
person10,person12
person11,person12
person12,person8
person12,person9
file: coworkers (edges)
person1,person4
person1,person5
person1,person6
person2,person3
person2,person4
person3,person5
person3,person6
person4,person5
person4,person6
person5,person6
person6,person5
person7,person9
person7,person5
person7,person4
person8,person9
person9,person2
person10,person7
person11,person7
person12,person7

computerNet

graph_create.gsql for computerNet
DROP ALL
CREATE VERTEX computer(PRIMARY_ID compID UINT, id STRING)
CREATE DIRECTED EDGE connected(FROM computer, TO computer, connectionSpeed DOUBLE, securityLevel INT)
CREATE GRAPH computerNet(*)
CREATE ONLINE_POST JOB loadComputer FOR GRAPH computerNet {
  LOAD TO VERTEX computer VALUES($0, $0);
}
CREATE ONLINE_POST JOB loadConnection FOR GRAPH computerNet {
  LOAD TO EDGE connected VALUES($0, $1, $2, $3) ;
}
RUN JOB loadComputer USING FILENAME="./computers", SEPARATOR=",", EOL="\n"
RUN JOB loadConnection USING FILENAME="./connections", SEPARATOR=",", EOL="\n"
file: computers (vertices)
c1
c2
c3
c4
c5
c6
c7
c8
c9
c10
c11
c12
c13
c14
c15
c16
c17
c18
c19
c20
c21
c22
c23
c24
c25
c26
c27
c28
c29
c30
c31
file: connections (edges)
c1,c2,16.0,3
c1,c3,64.0,3
c1,c4,64.0,2
c1,c5,16.5,3
c1,c6,64.3,3
c1,c7,3.2,3
c1,c8,-3.5,3
c1,c9,-5.1,1
c1,c10,15.5,3
c1,c10,.5,1
c1,c10,126,3
c10,c11,16,3
c11,c12,.5,3
c12,c13,-0.5,3
c12,c14,0.16,4
c12,c15,1e2,3
c12,c16,3.516e3,3
c12,c17,5.12e-3,2
c12,c18,-2.34e-5,1
c12,c19,-0.000000000234,5
c12,c20,0.000123e-5,4
c12,c21,1000e3,1
c12,c22,0.000123e10,1
c14,c23,123456e-6,1
c14,c24,123456e5,3
c23,c24,64,2
c23,c25,16,2
c23,c26,32,2
c23,c27,16,2
c23,c28,3,1
c23,c29,32,2
c23,c30,16,2
c23,c25,3,2
c23,c26,3,2
c23,c27,64,2
c23,c28,32,2
c23,c29,3,2
c23,c30,3,2
c23,c31,32,2
c4,c23,16,2
c4,c23,32,2
c4,c23,64,2
c4,c23,3,2

minimalNet

graph_create.gsql for minimalNet
DROP ALL
CREATE VERTEX testV(PRIMARY_ID id STRING)
CREATE UNDIRECTED EDGE testE(FROM testV, TO testV)
CREATE GRAPH minimalNet(*)


There is no loading job or data for minimalNet (hence the name "minimal").

investmentNet

graph_create.gsql for investmentNet
DROP ALL
TYPEDEF TUPLE <age UINT (4), mothersName STRING(20) > SECRET_INFO
CREATE VERTEX person(PRIMARY_ID personId STRING, portfolio MAP<STRING, DOUBLE>, secretInfo SECRET_INFO)
CREATE VERTEX stockOrder(PRIMARY_ID orderId STRING, ticker STRING, orderSize UINT, price FLOAT)
CREATE UNDIRECTED EDGE makeOrder(FROM person, TO stockOrder, orderTime DATETIME)
CREATE GRAPH investmentNet (*)
CREATE ONLINE_POST JOB loadPerson FOR GRAPH investmentNet {
  LOAD TO VERTEX person VALUES($0, SPLIT($1, ":", ";"), SECRET_INFO( $2, $3 ) );
}
CREATE ONLINE_POST JOB loadOrder FOR GRAPH investmentNet {
  LOAD TO VERTEX stockOrder VALUES($1, $3, $4, $5),
       TO EDGE makeOrder VALUES($0, $1, $2);
}
RUN JOB loadPerson USING FILENAME="./persons", SEPARATOR=",", EOL="\n"
RUN JOB loadOrder USING FILENAME="./orders", SEPARATOR=",", EOL="\n"
file: persons (vertices)
person1,AAPL:3142.24;G:6112.23;MS:5000.00,25,JAMES
person2,A:5242.62;GCI:5331.21;BAH:3200.00,67,SMITH
person3,AA:5223.73;P:7935.00;BAK:6923.52,45,WILLIAMS
person4,ACH:3542.62;S:6521.55;BABA:4030.52,51,ANTHONY
file: orders (vertices and edges)
person1,0,1488566548,AAPL,500,34.42
person1,1,1488566549,A,210,50.55
person1,2,1488566550,B,211,202.32
person2,3,1488566555,S,2,42.44
person3,4,1488566155,ABC,2,52.44
person4,5,1488566255,Z,2,62.34
person4,6,1488566655,S,2,10.01
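To make the column mapping in the loadPerson job for investmentNet easier to follow, the Python sketch below parses one row of the investmentNet persons file into the attributes of the person vertex. It is an illustration, not the TigerGraph loader: the function parse_person_row is hypothetical, and the reading of SPLIT($1, ":", ";") as "pairs separated by ';', key and value separated by ':'" is inferred from the sample data.

```python
def parse_person_row(line: str) -> dict:
    """Split one persons row into the fields used by the loadPerson job."""
    # $0 = personId, $1 = portfolio text, $2 = age, $3 = mothersName
    person_id, portfolio_raw, age, mothers_name = line.split(",")

    # Assumed behavior of SPLIT($1, ":", ";"): build a key -> value map.
    portfolio = {}
    for pair in portfolio_raw.split(";"):
        ticker, value = pair.split(":")
        portfolio[ticker] = float(value)

    return {
        "personId": person_id,
        "portfolio": portfolio,                      # MAP<STRING, DOUBLE>
        "secretInfo": (int(age), mothers_name),      # SECRET_INFO(age, mothersName)
    }


row = parse_person_row("person1,AAPL:3142.24;G:6112.23;MS:5000.00,25,JAMES")
```

The resulting dictionary mirrors the vertex definition: a string ID, a ticker-to-value map, and a two-field tuple.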