This document will work through a simple example using the Graql shell to show how to get started with GRAKN.AI.

Summary

This example takes a simple genealogy dataset and briefly reviews its ontology, then illustrates how to query, extend and visualise the graph, before demonstrating reasoning and analytics with Graql.

Introduction

If you have not yet set up GRAKN.AI, please see the Setup guide. In this tutorial, we will load a simple ontology and some data from a file, basic-genealogy.gql, and test it in the Graql shell and Grakn Visualiser. The basic-genealogy.gql file is included in the /examples folder of the Grakn installation zip from release 0.11.0 onwards. It can also be downloaded from the Grakn repo on GitHub. In the code below, we assume that it is in the /examples folder.

The Graql Shell

The first few steps mirror those in the Setup Guide, and you can skip to The Ontology if you have already run through that example. Start Grakn and load the example graph:

./bin/grakn.sh start
./bin/graql.sh -f ./examples/basic-genealogy.gql

Then start the Graql shell in its interactive (REPL) mode:

./bin/graql.sh

You will see a >>> prompt. Type in a query to check that everything is working:

match $x isa person, has identifier $n;

You should see a printout of a number of lines of text, each of which includes a name, such as “William Sanford Titus” or “Elizabeth Niesz”.
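
If the printout is longer than you need for a quick check, the result set can be capped. The sketch below reuses the limit modifier that appears in the visualiser query later on this page; the value 5 is an arbitrary choice, and which five people are returned depends on the loaded data.

```graql
match $x isa person, has identifier $n; limit 5;
```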

The Ontology

You can find out much more about the Grakn ontology in our documentation about the Grakn knowledge model, which states that

“The ontology is a formal specification of all the relevant concepts and their meaningful associations in a given application domain. It allows objects and relationships to be categorised into distinct types, and for generic properties about those types to be expressed”.

For the purposes of this guide, you can think of the ontology as a schema that describes items of data and defines how they relate to one another. You need to have a basic understanding of the ontology to be able to make useful queries on the data, so let’s review the chunks of it that are important for our initial demonstration:

insert

# Entities

person sub entity
  plays-role parent
  plays-role child
  plays-role spouse1
  plays-role spouse2

  has-resource identifier
  has-resource firstname
  has-resource surname
  has-resource middlename
  has-resource picture
  has-resource age
  has-resource birth-date
  has-resource death-date
  has-resource gender;

# Resources

identifier sub resource datatype string;
firstname sub resource datatype string;
surname sub resource datatype string;
middlename sub resource datatype string;
picture sub resource datatype string;
age sub resource datatype long;
birth-date sub resource datatype string;
death-date sub resource datatype string;
gender sub resource datatype string;

# Roles and Relations

marriage sub relation
  has-role spouse1
  has-role spouse2
  has-resource picture;

spouse1 sub role;
spouse2 sub role;

parentship sub relation
  has-role parent
  has-role child;

parent sub role;
child sub role;

There are a number of things we can say about the ontology shown above:

  • there is one entity, person, which represents a person in the family whose genealogy data we are studying.
  • the person entity has a number of resources to describe aspects of them, such as their name, age, dates of birth and death, gender and a URL to a picture of them (if one exists). Those resources are all expressed as strings, except for the age, which is of datatype long.
  • there are two relations that a person can participate in: marriage and parentship.
  • the person can play different roles in those relations, as a spouse (spouse1 or spouse2 - we aren’t assigning them by gender to be husband or wife) and as a parent or child (again, we are not assigning a gender such as mother or father).
  • the marriage relation has a resource, which is a URL to a wedding picture, if one exists.

The Data

The data is rather cumbersome, so we will not reproduce it all here. It is part of our genealogy-graph project, and you can find out much more about the Niesz family in our CSV migration and Graql reasoning example documentation. Here is a snippet of some of the data that you added to the graph when you loaded the basic-genealogy.gql file:

$57472 isa person has firstname "Mary" has identifier "Mary Guthrie" has surname "Guthrie" has gender "female";
$86144 has surname "Dudley" isa person has identifier "Susan Josephine Dudley" has gender "female" has firstname "Susan" has middlename "Josephine";
$118912 has age 74 isa person has firstname "Margaret" has surname "Newman" has gender "female" has identifier "Margaret Newman";
...
$8304 (parent: $57472, child: $41324624) isa parentship;
$24816 (parent: $81976, child: $41096) isa parentship;
$37104 isa parentship (parent: $49344, child: $41127960);
...
$122884216 (spouse2: $57472, spouse1: $41406488) isa marriage;
$40972456 (spouse2: $40964120, spouse1: $8248) isa marriage;
$81940536 (spouse2: $233568, spouse1: $41361488) has picture "http:\/\/1.bp.blogspot.com\/-Ty9Ox8v7LUw\/VKoGzIlsMII\/AAAAAAAAAZw\/UtkUvrujvBQ\/s1600\/johnandmary.jpg" isa marriage;

Don’t worry about the numbers such as $57472. These are variables in Graql, and happen to have randomly assigned numbers to make them unique. Each statement adds either a person, a parentship or a marriage to the graph. We will show how to add more data to the graph shortly in the Extending the Graph section. First, however, it is time to query the graph in the Graql shell.

Querying the Graph

Having started Grakn engine and the Graql shell in its interactive mode, we are ready to make a number of queries. First, we will make a couple of match queries.

Find all the people in the graph, and list their identifier resources (a string that represents their full name):

match $p isa person, has identifier $i;

Find all the people who are married:

match (spouse1: $x, spouse2: $y) isa marriage; $x has identifier $xi; $y has identifier $yi;  

List parent-child relations with the names of each person:

match (parent: $p, child: $c) isa parentship; $p has identifier $pi; $c has identifier $ci; 

Find all the people who are named ‘Elizabeth’:

match $x isa person, has identifier $y; $y value contains "Elizabeth"; 
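
The patterns above can also be combined in a single match query. As a sketch built only from constructs already shown in this section, the following lists the children of any parent whose identifier contains "Elizabeth". The variable names are arbitrary, and the results depend on the data that was loaded.

```graql
match (parent: $p, child: $c) isa parentship; $p has identifier $pi; $pi value contains "Elizabeth"; $c has identifier $ci;
```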

Querying the graph is more fully described in the Graql documentation.

Extending the Graph

Besides making match queries, it is also possible to insert (see further documentation) and delete (see further documentation) items in the graph through the Graql shell. To illustrate, we insert a fictional person:

insert $g isa person has firstname "Titus" has identifier "Titus Groan" has surname "Groan" has gender "male";
commit

To find your inserted person:

match $x isa person has identifier "Titus Groan"; 

To delete the person again:

match $x isa person has identifier "Titus Groan"; delete $x;
commit

Alternatively, we can use match...insert syntax to insert additional data associated with something already in the graph. For example, to add some fictional information (middle name, birth date, death date and age at death) for one member of our family, Mary Guthrie:

match $p has identifier "Mary Guthrie"; insert $p has middlename "Mathilda"; $p has birth-date "1902-01-01"; $p has death-date "1952-01-01"; $p has age 50;
commit
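
To confirm that the new resources were attached, a match query along the following lines should return them once the insert has been committed. This is a sketch that uses only syntax shown earlier on this page; $m, $b, $d and $a are arbitrary variable names.

```graql
match $p has identifier "Mary Guthrie", has middlename $m, has birth-date $b, has death-date $d, has age $a;
```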

Using the Grakn Visualiser

You can open the Grakn visualiser by navigating to localhost:4567 in your web browser. The visualiser allows you to make queries or simply browse the knowledge ontology within the graph. The screenshot below shows a basic query (match $x isa person; offset 0; limit 100;) typed into the form at the top of the main pane, and visualised by pressing “>”:

[Screenshot: Person query]

You can zoom the display in and out, and move the nodes around for better visibility. Please see our Grakn visualiser documentation for further details.

Using Inference

We will move on to discuss the use of GRAKN.AI to infer new information about a dataset. In the ontology, so far, we have dealt only with a person, not a man or woman, and the parentship relations were simply between parent and child roles. We did not directly add information about the nature of the parent and child in each relation - they could be father and son, father and daughter, mother and son or mother and daughter.

However, the person entity does have a gender resource, and we can use Grakn to infer more information about each relationship by using that property. The ontology accommodates the more specific roles of mother, father, daughter and son:

person
  plays-role son
  plays-role daughter
  plays-role mother
  plays-role father;

parentship sub relation
  has-role mother
  has-role father
  has-role son
  has-role daughter;

mother sub parent;
father sub parent;
son sub child;
daughter sub child;

Included in basic-genealogy.gql is a set of Graql rules that instruct Grakn’s reasoner on how to label each parentship relation:

$genderizeParentships1 isa inference-rule
lhs
{(parent: $p, child: $c) isa parentship;
$p has gender "male";
$c has gender "male";
}
rhs
{(father: $p, son: $c) isa parentship;};

$genderizeParentships2 isa inference-rule
lhs
{(parent: $p, child: $c) isa parentship;
$p has gender "male";
$c has gender "female";
}
rhs
{(father: $p, daughter: $c) isa parentship;};

$genderizeParentships3 isa inference-rule
lhs
{(parent: $p, child: $c) isa parentship;
$p has gender "female";
$c has gender "male";
}
rhs
{(mother: $p, son: $c) isa parentship;};

$genderizeParentships4 isa inference-rule
lhs
{(parent: $p, child: $c) isa parentship;
$p has gender "female";
$c has gender "female";
}
rhs
{(mother: $p, daughter: $c) isa parentship;};

If you’re unfamiliar with the syntax of rules, don’t worry too much about it just now. It is sufficient to know that, for each parentship relation, Graql checks whether the pattern in the first block (left hand side or lhs) can be verified and, if it can, infers the statement in the second block (right hand side or rhs) to be true, so inserts a relation between gendered parents and children.
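
One way to get a feel for a rule is to run its lhs pattern as an ordinary match query, which needs no inference. The sketch below takes the lhs of the first rule and adds identifier lookups (the variables $n1 and $n2 are arbitrary names) so that the output is readable. The pairs it returns are those to which the first rule applies.

```graql
match (parent: $p, child: $c) isa parentship; $p has gender "male"; $c has gender "male"; $p has identifier $n1; $c has identifier $n2;
```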

Let’s test it out!

First, try making a match query to find parentship relations between fathers and sons in the Graql shell:

match (father: $p, son: $c) isa parentship; $p has identifier $n1; $c has identifier $n2;

Did you get any results? Probably not, because reasoning is not enabled by default at present, although as Grakn develops, we expect that to change. If you didn’t see any results, you need to exit the Graql shell and restart it, passing -n and -m flags to switch on reasoning (see our documentation for more information about flags supported by the Graql shell).

./bin/graql.sh -n -m

Try the query again:

match (father: $p, son: $c) isa parentship; $p has identifier $n1; $c has identifier $n2;

There may be a pause, and then you should see a stream of results as Grakn infers the parentships between male parent and child entities. It is, in effect, building new information about the family which was not explicit in the dataset.
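
The same pattern should work for the other gendered roles defined in the ontology. As a sketch, the mother and daughter variant, run with reasoning switched on, would look like this:

```graql
match (mother: $p, daughter: $c) isa parentship; $p has identifier $n1; $c has identifier $n2;
```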

You may want to take a look at the results of this query in the Grakn visualiser. As with the shell, you will need to activate inference before you see any results. Browse to the visualiser at localhost:4567 and open the Config settings on the left-hand side of the screen. When the page opens you will see the “Activate Inference” checkbox. Check it, and try submitting the query above or a variation of it for mothers and sons, fathers and daughters, etc. You can even go one step further and find fathers who have the same first name as their sons:

match (father: $p, son: $c) isa parentship; $p has firstname $n; $c has firstname $n;

[Screenshot: Father-Son Shared Names query]

If you want to find out more about the Graql reasoner, we have a detailed example. An additional discussion on the same topic can be found in our “Family Matters” blog post.

Using Analytics

Use of Grakn Analytics is covered in Analytics.

Data Migration

In this example we loaded data from basic-genealogy.gql directly into a graph. However, data isn’t often conveniently stored in .gql files and, indeed, the data that we used was originally in CSV format. Our CSV migration example explains in detail the steps we took to migrate the CSV data into Grakn.

Migrating data in formats such as CSV, SQL, OWL and JSON into Grakn is a key use case. More information about each of these can be found in the migration documentation.

Where Next?

This page was a very high-level overview of some of the key use cases for Grakn, and has barely scratched the surface. The rest of our developer documentation and examples are more in-depth and should answer any questions that you may have, but if you need extra information, please get in touch.

A good place to start is to explore our additional example code and the rest of the documentation.
