Introduction

The DataLinks Query Language is a domain-specific language for querying and manipulating datasets within the DataLinks ecosystem. It lets you access data, apply filters, traverse relationships between datasets, and sort results.

This language is particularly useful for:

  • Retrieving data from specific datasets
  • Filtering data based on various conditions
  • Discovering relationships between different datasets
  • Traversing complex data structures
  • Sorting and limiting result sets

Basic Syntax

The query language follows a fluent interface pattern, where methods are chained together to build complex queries. The basic structure of a query is:

Ontology("<dataset>").<operation1>().<operation2>()...

or

OntologyObject("<dataset>").<operation1>().<operation2>()...

Both Ontology and OntologyObject are valid entry points for creating a query.
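The chaining pattern can be illustrated with a small Python helper that assembles a query string one operation at a time. This is only a sketch: the Query class and its op() and build() methods are hypothetical names invented for this example and are not part of DataLinks.

```python
class Query:
    """Hypothetical helper that assembles a query string; not a DataLinks API."""

    def __init__(self, dataset: str, entry_point: str = "Ontology") -> None:
        # The entry point is either "Ontology" or "OntologyObject".
        self._parts = [f'{entry_point}("{dataset}")']

    def op(self, name: str, *args: str) -> "Query":
        # Append one chained operation; arguments are inserted verbatim.
        self._parts.append(f"{name}({', '.join(args)})")
        return self  # returning self is what allows the calls to be chained

    def build(self) -> str:
        # Join the entry point and every operation with dots.
        return ".".join(self._parts)


query = Query("Movies").op("filter", "year == 2012").op("limit", "5").build()
print(query)  # Ontology("Movies").filter(year == 2012).limit(5)
```

Each call returns the same object, so operations can be appended in any order before the final string is produced.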

Querying Datasets

Basic Dataset Query

To query a dataset, specify its name:

Ontology("Movies")

Dataset names are case-insensitive, so Ontology("movies") and Ontology("Movies") refer to the same dataset.

Namespaced Datasets

Datasets can be organized in namespaces. To query a namespaced dataset, use the format:

Ontology("Namespace/Dataset")

For example:

Ontology("Companies/Orange")

If a dataset name is ambiguous (exists in multiple namespaces), you must specify the namespace to avoid errors.
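The lookup rules above (case-insensitive names, optional namespace, error on ambiguity) can be sketched in Python as follows. The resolve_dataset function and the catalog contents are hypothetical, and the sketch assumes that namespaces are matched case-insensitively as well, which the rules above do not state.

```python
def resolve_dataset(name: str, catalog: list[str]) -> str:
    """Resolve a dataset reference against hypothetical 'Namespace/Dataset' entries."""
    wanted = name.lower()
    if "/" in wanted:
        # Fully qualified reference: compare the whole path.
        matches = [entry for entry in catalog if entry.lower() == wanted]
    else:
        # Bare reference: compare only the part after the last slash.
        matches = [entry for entry in catalog if entry.lower().split("/")[-1] == wanted]
    if not matches:
        raise LookupError(f"unknown dataset: {name}")
    if len(matches) > 1:
        raise LookupError(f"ambiguous dataset {name!r}; specify a namespace: {matches}")
    return matches[0]


# Invented catalog: "Orange" exists in two namespaces.
catalog = ["Films/Movies", "Companies/Orange", "Fruits/Orange"]

print(resolve_dataset("movies", catalog))            # Films/Movies
print(resolve_dataset("Companies/Orange", catalog))  # Companies/Orange
```

Calling resolve_dataset("Orange", catalog) raises LookupError in this sketch, mirroring the requirement to qualify ambiguous names.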

Filtering Data

Basic Filtering

To filter data, use the filter() method with a comparison expression:

Ontology("Movies").filter(year == 2012)

or

Ontology("Movies").filter().column("year == 2012")

Comparison Operators

The query language supports the following comparison operators:

  • == (equal to)
  • != (not equal to)
  • < (less than)
  • <= (less than or equal to)
  • > (greater than)
  • >= (greater than or equal to)

Examples:

Ontology("Movies").filter(year < 2000)
Ontology("Movies").filter(budget >= 10000000)
Ontology("Movies").filter(name != "Titanic")

Logical Operators

Combine multiple conditions using logical operators:

  • && (AND)
  • || (OR)

Examples:

Ontology("Movies").filter(year == 2005 && director == "Christopher Nolan")
Ontology("Movies").filter(year == 2005 || year == 2006)

Nested Conditions

Use parentheses to create complex nested conditions:

Ontology("Movies").filter(director == "Mark Osborne" && (year == 2005 || year == 2006))
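To see what the parentheses change, the same condition can be written as a plain Python predicate over in-memory rows. The rows are invented for illustration, and the unparenthesized variant shows Python's precedence rules only; whether the query language gives && and || the same precedence is an assumption that should be checked before relying on it.

```python
# Invented sample rows; titles and values are placeholders.
rows = [
    {"name": "Film A", "director": "Mark Osborne", "year": 2005},
    {"name": "Film B", "director": "Mark Osborne", "year": 2010},
    {"name": "Film C", "director": "Someone Else", "year": 2006},
]


def matches(row: dict) -> bool:
    # Mirrors: director == "Mark Osborne" && (year == 2005 || year == 2006)
    return row["director"] == "Mark Osborne" and (row["year"] == 2005 or row["year"] == 2006)


def matches_without_parentheses(row: dict) -> bool:
    # In Python, "and" binds tighter than "or", so this reads as
    # (director == ... and year == 2005) or year == 2006.
    return row["director"] == "Mark Osborne" and row["year"] == 2005 or row["year"] == 2006


selected = [row["name"] for row in rows if matches(row)]
loose = [row["name"] for row in rows if matches_without_parentheses(row)]
print(selected)  # ['Film A']
print(loose)     # ['Film A', 'Film C']
```

Writing the parentheses explicitly avoids depending on operator precedence at all.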

Filter Expression Syntax

The query language provides flexible syntax for filter expressions:

  1. Quoting Field Names: Field names can be unquoted, single-quoted, or double-quoted:
name == "value"
"name" == "value"
'name' == "value"
  2. Quoting Values: String values can be unquoted (for simple strings) or quoted:
title == dog
title == "dog"
title == 'dog'
  3. Multi-word Values: Multi-word string values must be quoted:
title == "The Matrix"
Note: Unquoted multi-word values (e.g., title == The Matrix) are not supported.
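The quoting rules above can be checked with a small regular-expression sketch for a single comparison. This is not the DataLinks parser: the pattern, the parse_comparison name, and the choice to return the value as an unconverted string are all assumptions made for this example, and escaped quotes inside values are not handled.

```python
import re

# One comparison: field, operator, value. Longer operators are listed first.
COMPARISON = re.compile(
    r"""^\s*
    (?:"(?P<f_dq>[^"]+)"|'(?P<f_sq>[^']+)'|(?P<f_bare>\w+))
    \s*(?P<op>==|!=|<=|>=|<|>)\s*
    (?:"(?P<v_dq>[^"]*)"|'(?P<v_sq>[^']*)'|(?P<v_bare>[^\s"']+))
    \s*$""",
    re.VERBOSE,
)


def parse_comparison(expr: str) -> tuple[str, str, str]:
    """Split 'field op value' into its parts, honouring the quoting rules."""
    m = COMPARISON.match(expr)
    if m is None:
        raise ValueError(f"unsupported filter expression: {expr!r}")
    field = m["f_dq"] or m["f_sq"] or m["f_bare"]
    # Exactly one value group matched; pick it (an empty quoted string is allowed).
    value = next(v for v in (m["v_dq"], m["v_sq"], m["v_bare"]) if v is not None)
    return field, m["op"], value


print(parse_comparison('title == "The Matrix"'))  # ('title', '==', 'The Matrix')
print(parse_comparison("'name' != dog"))          # ('name', '!=', 'dog')
```

An unquoted multi-word value such as title == The Matrix fails to match and raises ValueError, which corresponds to the unsupported case in the note.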

Linking Datasets

To find related datasets, use the searchAround() and find() methods:

Ontology("Movies").searchAround().find("Actor")

This searches for actors related to movies based on defined links between the datasets.

Specifying Search Depth

You can specify the depth of the search (how many hops to traverse):

Ontology("Movies").searchAround(2).find("Actor")

A depth of 2 means the search looks for relationships that are two links away from the starting dataset.
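The idea of hops can be pictured as a breadth-first walk over the links between datasets. The sketch below is a plain Python illustration, not how DataLinks implements searchAround(): the link graph is invented, the hops_within function is hypothetical, and it reports every dataset reachable in at most the given number of links together with its hop count, since the text above does not say whether nearer datasets are also included.

```python
from collections import deque


def hops_within(links: dict[str, list[str]], start: str, depth: int) -> dict[str, int]:
    """Return each dataset reachable from start in at most `depth` links, with its hop count."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if distance[current] == depth:
            continue  # do not expand beyond the requested depth
        for neighbour in links.get(current, []):
            if neighbour not in distance:
                distance[neighbour] = distance[current] + 1
                queue.append(neighbour)
    del distance[start]  # the starting dataset is not a result
    return distance


# Invented link graph: Movies links to Studios, and Studios links to Actor.
links = {"Movies": ["Studios"], "Studios": ["Movies", "Actor"], "Actor": ["Studios"]}

print(hops_within(links, "Movies", 1))  # {'Studios': 1}
print(hops_within(links, "Movies", 2))  # {'Studios': 1, 'Actor': 2}
```

In this invented graph, Actor only becomes reachable from Movies once the depth is 2, which is the kind of situation where a larger depth is needed.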

To follow specific columns when linking datasets, chain follow() after find():

Ontology("Movies").searchAround().find("Actor").follow("director")

This specifies that the link between Movies and Actor should be through the “director” column.

You can specify multiple columns to follow:

Ontology("Movies").searchAround().find("Actor").follow("director", "producer")

Sorting and Limiting Results

Sorting Results

Sort results in ascending or descending order:

Ontology("Movies").sortAscending("year")
Ontology("Movies").sortDescending("budget")

Alternative syntax:

Ontology("Movies").sort("year", "ASC")
Ontology("Movies").sort("budget", "DESC")

Multiple Sort Criteria

Sort by multiple fields:

Ontology("Actor").sortDescending("nationality").sortAscending("age")

This sorts actors first by nationality in descending order, then by age in ascending order.
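The same ordering can be reproduced in plain Python to check what to expect from a two-level sort. The rows are invented, and this illustrates the intended result only, not how DataLinks evaluates the chain. Because Python's sort is stable, the secondary key is applied first and the primary key last.

```python
# Invented sample rows.
actors = [
    {"name": "Actor A", "nationality": "US", "age": 50},
    {"name": "Actor B", "nationality": "UK", "age": 40},
    {"name": "Actor C", "nationality": "US", "age": 30},
    {"name": "Actor D", "nationality": "UK", "age": 60},
]

# Secondary key first: age ascending.
by_age = sorted(actors, key=lambda actor: actor["age"])
# Primary key last: nationality descending; ties keep their age order.
ordered = sorted(by_age, key=lambda actor: actor["nationality"], reverse=True)

names = [actor["name"] for actor in ordered]
print(names)  # ['Actor C', 'Actor A', 'Actor B', 'Actor D']
```

Both US actors come first (descending nationality), and within each nationality the younger actor precedes the older one.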

Limiting Results

Limit the number of results returned:

Ontology("Movies").limit(5)

You can also limit results at different stages of a complex query:

Ontology("Movies").limit(10).searchAround().find("Actor").limit(5)

This limits the movies to 10 and the related actors to 5.
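A rough Python analogy shows how the two limits act at different stages. The data is invented, and the sketch assumes that the second limit caps the total number of related actors rather than the number per movie, which the description above does not spell out.

```python
# 12 invented movie titles, each with two invented related actors.
movies = [f"Movie {i}" for i in range(1, 13)]
cast = {title: [f"{title} / Actor {j}" for j in (1, 2)] for title in movies}

# Stage 1: corresponds to .limit(10) on Movies.
first_movies = movies[:10]

# Stage 2: corresponds to searchAround().find("Actor") on the limited movies.
related = [actor for title in first_movies for actor in cast[title]]

# Stage 3: corresponds to .limit(5) on the related actors (assumed to be a total cap).
first_actors = related[:5]

print(len(first_movies), len(related), len(first_actors))  # 10 20 5
```

Only the first 10 movies contribute related actors, so the final 5 actors are drawn from a smaller pool than they would be without the first limit.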

Natural Language Query Generation

The system can generate queries from natural language questions. For example:

Natural language: “What is the age of the director of Braveheart?”

Generated query:

OntologyObject("Movies").filter(name == "Braveheart").searchAround().find("Actor")

Examples

Basic Query Examples

  1. Find all movies:
Ontology("Movies")
  2. Find a specific movie:
Ontology("Movies").filter(name == "Argo")
  3. Find movies released before 2000:
Ontology("Movies").filter(year < 2000)

Relationship Query Examples

  1. Find actors who directed movies:
Ontology("Movies").searchAround().find("Actor")
  2. Find movies directed by actors under 60:
Ontology("Actor").filter(age < 60).searchAround().find("Movies")
  3. Start from actors under 60, find the movies they directed, then find the actors related to those movies:
Ontology("Actor").filter(age < 60).searchAround().find("Movies").searchAround().find("Actor")

Sorting and Limiting Examples

  1. Find the 3 oldest movies:
Ontology("Movies").sortAscending("year").limit(3)
  2. Find the 5 highest budget movies:
Ontology("Movies").sortDescending("budget").limit(5)

Error Handling

If a query returns unexpected results:

  1. Check field names and table relationships for correctness.

  2. Ensure the depth in searchAround is sufficient for your query.

  3. Debug by breaking the query into smaller parts and testing them individually.
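One way to apply step 3 is to generate progressively longer prefixes of a chained query and run them one at a time. The helper below is a hypothetical Python sketch, not a DataLinks tool: it splits only on dots outside parentheses and quotes, and it does not handle escaped quotes. Some prefixes, such as one ending in searchAround() without find(), may not be valid queries on their own and can be skipped.

```python
def query_prefixes(query: str) -> list[str]:
    """Return progressively longer prefixes of a chained query, one per operation."""
    cuts = []      # positions of the dots that separate chained operations
    depth = 0      # current parenthesis nesting level
    quote = None   # the quote character currently open, if any
    for i, ch in enumerate(query):
        if quote:
            if ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "." and depth == 0:
            cuts.append(i)
    return [query[:cut] for cut in cuts] + [query]


full = 'Ontology("Actor").filter(age < 60).searchAround().find("Movies")'
for prefix in query_prefixes(full):
    print(prefix)
```

Running each prefix in turn shows at which operation the result set stops matching expectations.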