In this guide, we load a single file to set up a simple graph and insert some data. The example is loosely based on the genealogy-graph example that we use throughout our documentation. You can find the full example, including data and ontology, in the sample-datasets repo on GitHub. We do not load the complete data and ontology here: this is a basic “quick start” example that illustrates a simple ontology and some basic data.
If you have not yet set up GRAKN.AI, please see the Setup guide.
Using the Graql Shell
We will first make sure we are working with a clean ‘keyspace’. From the terminal:
We then start the Grakn engine:
Finally, we load the ontology and data into Grakn, so we have a graph to work with, using the basic-genealogy.gql file stored in the examples directory of the Grakn distribution zip. You can also find this file on GitHub. Invoke the Graql shell with the -f flag, which names the file to load into a graph (if you are interested, please see our documentation about the flags supported by the Graql shell):
<relative-path-to-Grakn>/bin/graql.sh -f <relative-path-to-Grakn>/examples/basic-genealogy.gql
You can find out much more about the Grakn ontology in our documentation about the Grakn knowledge model, which states that
“The ontology is a formal specification of all the relevant concepts and their meaningful associations in a given application domain. It allows objects and relationships to be categorised into distinct types, and for generic properties about those types to be expressed”.
For the purposes of this guide, you can think of the ontology as a schema that describes items of data and defines how they relate to one another.
There are a number of things we can say about the ontology shown below:
- there is one entity, person, which represents a person in the family whose genealogy data we are studying.
- the person entity has a number of resources to describe aspects of the person, such as their name, age, dates of birth and death, gender and a URL to a picture of them (if one exists). Those resources are all expressed as strings, except for the age, which is of datatype long.
- there are two relations that a person can participate in: marriage and parentship.
- the person can play different roles in those relations, as a spouse (spouse1 or spouse2 - we aren’t assigning them by gender to be husband or wife in this example) and as a parent or child.
- the marriage relation has a resource, which is a URL to a wedding picture, if one exists.
person sub entity
  plays-role parent
  plays-role child
  plays-role spouse1
  plays-role spouse2
  has-resource identifier
  has-resource firstname
  has-resource surname
  has-resource middlename
  has-resource picture
  has-resource age
  has-resource birth-date
  has-resource death-date
  has-resource gender;

# Resources
identifier sub resource datatype string;
firstname sub resource datatype string;
surname sub resource datatype string;
middlename sub resource datatype string;
picture sub resource datatype string;
age sub resource datatype long;
birth-date sub resource datatype string;
death-date sub resource datatype string;
gender sub resource datatype string;

# Roles and Relations
marriage sub relation
  has-role spouse1
  has-role spouse2
  has-resource picture;
spouse1 sub role;
spouse2 sub role;
parentship sub relation
  has-role parent
  has-role child;
parent sub role;
child sub role;
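To make the "ontology as a schema" idea concrete, the ontology above can be mirrored in an ordinary data structure. The following Python sketch is purely illustrative: it is not part of Grakn or its API, and the names ONTOLOGY and check_relation are hypothetical. It only copies the types, roles and resources listed above and checks whether a proposed relation uses roles that the ontology allows.

```python
# Illustrative model of the ontology above. Not a Grakn API: the ONTOLOGY
# dict and check_relation() are hypothetical names used only for this sketch.
ONTOLOGY = {
    "entities": {
        "person": {
            "plays_roles": {"parent", "child", "spouse1", "spouse2"},
            "resources": {
                "identifier", "firstname", "surname", "middlename",
                "picture", "age", "birth-date", "death-date", "gender",
            },
        },
    },
    "relations": {
        "marriage": {"roles": {"spouse1", "spouse2"}, "resources": {"picture"}},
        "parentship": {"roles": {"parent", "child"}, "resources": set()},
    },
}


def check_relation(relation_type, role_players):
    """Return a list of problems for a proposed relation.

    role_players maps a role name to the entity type that would play it,
    for example {"spouse1": "person", "spouse2": "person"}.
    """
    relation = ONTOLOGY["relations"].get(relation_type)
    if relation is None:
        return [f"unknown relation type: {relation_type}"]

    problems = []
    for role, entity_type in role_players.items():
        # The relation must declare the role (has-role in the ontology).
        if role not in relation["roles"]:
            problems.append(f"{relation_type} has no role {role}")
        # The entity type must be allowed to play the role (plays-role).
        entity = ONTOLOGY["entities"].get(entity_type)
        if entity is None or role not in entity["plays_roles"]:
            problems.append(f"{entity_type} cannot play role {role}")
    return problems
```

For example, check_relation("marriage", {"spouse1": "person", "spouse2": "person"}) reports no problems, while using the parent role inside a marriage is flagged because marriage does not declare that role.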
The data is rather cumbersome, so we will not reproduce it all here. It is part of our genealogy-graph project, and you can find out much more about the Niesz family in our CSV migration and Graql reasoning example documentation. Here is a snippet of some of the data that you added to the graph when you loaded the basic-genealogy.gql file:
$57472 isa person has firstname "Mary" has identifier "Mary Guthrie" has surname "Guthrie" has gender "female";
$86144 has surname "Dudley" isa person has identifier "Susan Josephine Dudley" has gender "female" has firstname "Susan" has middlename "Josephine";
$118912 has age 74 isa person has firstname "Margaret" has surname "Newman" has gender "female" has identifier "Margaret Newman";
...
$8304 (parent: $57472, child: $41324624) isa parentship;
$24816 (parent: $81976, child: $41096) isa parentship;
$37104 isa parentship (parent: $49344, child: $41127960);
...
$122884216 (spouse2: $57472, spouse1: $41406488) isa marriage;
$40972456 (spouse2: $40964120, spouse1: $8248) isa marriage;
$81940536 (spouse2: $233568, spouse1: $41361488) has picture "http:\/\/1.bp.blogspot.com\/-Ty9Ox8v7LUw\/VKoGzIlsMII\/AAAAAAAAAZw\/UtkUvrujvBQ\/s1600\/johnandmary.jpg" isa marriage;
Don’t worry about the numbers such as $57472. These are variables in Graql, and happen to have randomly assigned numbers to make them unique. Each statement adds either a person, a parentship or a marriage to the graph. We will show how to add more data to the graph shortly in the Extending the Graph section. First, however, it is time to check the graph in the Graql shell.
Querying the Graph
To start the Graql shell, type the following from the terminal:
You will see a >>> prompt, at which point you can make a number of queries to explore the graph, as more fully described in the Graql documentation. Here, we will make a couple of match queries.
Find all people in the graph, and list their identifier resources (a string that represents their full name):
match $p isa person, has identifier $i;
Find all people who are married:
match (spouse1: $x, spouse2: $y) isa marriage; $x has identifier $xi; $y has identifier $yi;
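As a rough mental model, the marriage query above behaves like a join: each marriage binds $x and $y to its two role players, and the identifier of each player is then looked up. The Python sketch below imitates that over a tiny in-memory structure. It is an analogy only, not how Grakn evaluates queries; the function name married_pairs is hypothetical, and the ids and names in the toy data are made up rather than taken from the genealogy dataset.

```python
# Made-up toy data for illustration; not taken from basic-genealogy.gql.
people = {
    "p1": {"identifier": "Alice Example"},
    "p2": {"identifier": "Bob Example"},
    "p3": {"identifier": "Carol Example"},
}
marriages = [{"spouse1": "p1", "spouse2": "p2"}]


def married_pairs(people, marriages):
    """Imitate the match query: one result per marriage whose two role
    players both have an identifier resource."""
    results = []
    for marriage in marriages:
        x, y = marriage["spouse1"], marriage["spouse2"]
        xi = people.get(x, {}).get("identifier")
        yi = people.get(y, {}).get("identifier")
        if xi is None or yi is None:
            continue  # the pattern requires both identifiers to be present
        results.append({"x": x, "y": y, "xi": xi, "yi": yi})
    return results
```

In this toy data only p1 and p2 take part in a marriage, so p3 never appears in the results, just as unmarried people are not returned by the query.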
Variables in Graql start with a $. They act as wildcards and are returned as results in match queries. A variable name can contain alphanumeric characters, dashes and underscores.
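The naming rule for variables can be written as a small check. This Python sketch is an assumption-laden reading of the sentence above, not the Graql grammar itself: it assumes ASCII letters and digits only and at least one character after the $. The function name is_valid_variable_name is hypothetical.

```python
import re

# Assumed reading of the rule: "$" followed by one or more ASCII letters,
# digits, dashes or underscores. The real Graql grammar may differ.
_VARIABLE = re.compile(r"\$[A-Za-z0-9_-]+")


def is_valid_variable_name(name: str) -> bool:
    """Return True if name looks like a Graql variable under the rule above."""
    return _VARIABLE.fullmatch(name) is not None
```

Under these assumptions, names such as $x, $57472 and $gormenghast pass the check, while a name without the leading $ or one containing a space does not.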
Extending the Graph
insert $gormenghast isa person has firstname "Titus" has identifier "Titus Groan" has surname "Groan" has gender "male";
commit;
Nothing you have entered into the Graql shell has yet been committed to the graph, nor has it been validated. To save any changes you make to a graph, you need to type commit in the shell. It is a good habit to commit regularly.
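The idea that changes are held back until you commit can be pictured with a small staging buffer. The Python sketch below is a conceptual analogy only and says nothing about how Grakn implements commits or validation; the class StagedGraph and its methods are hypothetical.

```python
class StagedGraph:
    """Toy model of staged writes: inserts and deletes only take effect
    in the committed state when commit() is called."""

    def __init__(self):
        self._committed = set()
        self._pending_inserts = set()
        self._pending_deletes = set()

    def insert(self, item):
        # A later insert cancels an earlier pending delete of the same item.
        self._pending_deletes.discard(item)
        self._pending_inserts.add(item)

    def delete(self, item):
        # A later delete cancels an earlier pending insert of the same item.
        self._pending_inserts.discard(item)
        self._pending_deletes.add(item)

    def commit(self):
        # Apply all pending changes at once, then clear the staging area.
        self._committed = (self._committed | self._pending_inserts) - self._pending_deletes
        self._pending_inserts.clear()
        self._pending_deletes.clear()

    def committed(self):
        return frozenset(self._committed)


graph = StagedGraph()
graph.insert("Titus Groan")   # staged only; committed() is still empty
graph.commit()                # now "Titus Groan" is part of the committed state
```

The same applies to deletions in this model: a delete is staged first and only removes the item from the committed state at the next commit.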
To find your inserted person:
match $x isa person has identifier "Titus Groan";
To delete the person:
match $x isa person has identifier "Titus Groan"; delete $x;
Using the Grakn Visualiser
The Grakn visualiser provides a graphical tool to inspect and query your graph data. You can open the visualiser by navigating to localhost:4567 in your web browser. The visualiser allows you to make queries or simply browse the types within the graph. The screenshot below shows a basic query (match $x isa person;) typed into the form at the top of the main pane, and visualised by pressing “Submit”:
The help tab on the main pane shows a set of key combinations that you can use to further drill into the data. You can zoom the display in and out, and move the nodes around for better visibility. Please see our Grakn visualiser documentation for further details.
Inferring new information about a dataset lies at the core of GRAKN.AI. We have a detailed example of using the Grakn reasoner to infer information about the genealogy dataset. An additional discussion on the same topic can be found in our “Family Matters” blog post.
Use of Grakn Analytics is covered in Analytics.
Migrating data in formats such as CSV, SQL, OWL and JSON into Grakn is a key use case. We have used a simple example here, which loaded basic-genealogy.gql data directly into a graph. However, that file is based on CSV data that can be migrated into Grakn to provide a more complex graph. The CSV migration example explains the steps in further detail. There are a number of other formats besides CSV that are supported, and more information can be found in the migration documentation.
This page is a very high-level overview of some of the key use cases for Grakn, and has barely scratched the surface. The rest of our developer documentation and examples are more in-depth and should answer any questions that you may have, but if you need extra information, please get in touch.