[go: up one dir, main page]

Getting Started with Neosemantics

We’re assuming that you have read the Installation Guide. If you haven’t already done so, head back and read the Installation instructions first.

Now that you have installed the plugin, it’s time to get started with importing data. In this tutorial, we will begin with a basic RDF document containing a few triples in Turtle format. This file is hosted on the Neosemantics Github Repository.

@prefix neo4voc: <http://neo4j.org/vocab/sw#> .
@prefix neo4ind: <http://neo4j.org/ind#> .

neo4ind:nsmntx3502 neo4voc:name "NSMNTX" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.2" ;
			   neo4voc:releaseDate "03-06-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:apoc3502 neo4voc:name "APOC" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.4" ;
			   neo4voc:releaseDate "05-31-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:graphql3502 neo4voc:name "Neo4j-GraphQL" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.3" ;
			   neo4voc:releaseDate "05-05-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:neo4j355 neo4voc:name "neo4j" ;
			   a neo4voc:GraphPlatform , neo4voc:AwesomePlatform ;
			   neo4voc:version "3.5.5" .

The RDF fragment describes four elements (resources) including:

  • Neo4j - a Graph Platform with a version property of 3.5.5

  • Neosemantics (n10s) - A Neo4j Plugin with a version 3.5.0.2 released on 3 June 2019 which runs on the the Neo4j platform

  • APOC - A Neo4j Plugin with a version 3.5.0.4 released on 31 May 2019 which runs on the the Neo4j platform

  • Neo4j-GraphQL - A Neo4j Plugin with a version 3.5.0.3 released on 5 May 2019 which runs on the the Neo4j platform

In Neo4j’s Property Graph Model, resources will become Nodes in the database with properties to represent the versions. The neo4voc:runsOn predicate will become a Relationship.

The general idea is that triples with literal values as objects will become node properties an triples with resources as objects will become relationships.

Creating a Constraint

When Neosemantics imports an RDF resource into Neo4j, it creates a node and automatically adds a :Resource label to it. In order to make our queries efficient, a unique constraint is required on the URI property for nodes with a label of :Resource. Creating a constraint will ensure that a URI will be unique across the database, and also that lookups on a :Resource node by it’s uri property will use an index rather than scanning through all nodes to find the relevent record.

CREATE CONSTRAINT n10s_unique_uri FOR (r:Resource)
REQUIRE r.uri IS UNIQUE

Graph Config

Before running any import operations with Neosemantics, we should create a Graph Config using the n10s.graphconfig.init procedure. A Graph Config defines the way that our RDF data is persisted in Neo4j.

For this example, we will use the default values except for handleVocabUri option. For more information on the configuration options, see the Neosemantics Reference Guide.

CALL n10s.graphconfig.init({
  handleVocabUris: 'MAP'
})

By default the handleVocabUri is set to SHORTEN - meaning that the namespace is prefixed with an automatically generated string, for example http://neo4j.org/vocab/sw#Neo4jPlugin would be shortened to ns0__Neo4jPlugin and at the same time the ns0 prefix will be associated with the namespace http://neo4j.org/vocab/sw#. This is useful if we want to be able to regenerate the RDF again.

In this example we will set the handleVocabUri param to MAP, which means that instead of shortening the namespaces, we will map the long URIs to custom simple names, making the Property Graph more readable.

Choosing between Inline and Fetch

Each of the functions in this tutorial have similar signatures. Depending on the source of your RDF query, you can run suffix each procedure with .fetch or .inline. If your RDF is hosted at a remote URL, you can call the .fetch procedure with the first argument as the URL and the format second parameter.

Fetch Example
CALL n10s.rdf.preview.fetch(
  '<< YOUR URL >>',
  '<< FORMAT >>'
)

You can copy and paste RDF into the query using the .inline procedures. This procedure expects an RDF query or fragment as the first parameter, and the format as the second parameter.

Inline Example
CALL n10s.rdf.preview.inline(
  '<< YOUR RDF QUERY HERE >>',
  '<< FORMAT >>'
)

You can view the difference between the two types of procedure by toggling the code examples below.

Previewing Data

Now that the Graph Config has been created, we can start to manipulate the RDF into a suitable format for the property graph by changing the import configuration parameters. After each change we can use the n10s.rdf.preview.* procedures to preview the effect on the resulting property graph. As our RDF document is hosted online, we can use n10s.rdf.preview.fetch to fetch the RDF from a remote URL.

Fetch
CALL n10s.rdf.preview.fetch(
  'https://raw.githubusercontent.com/neo4j-labs/neosemantics/3.5/docs/rdf/nsmntx.ttl',
  'Turtle'
)
Inline
CALL n10s.rdf.preview.inline(
  '
@prefix neo4voc: <http://neo4j.org/vocab/sw#> .
@prefix neo4ind: <http://neo4j.org/ind#> .

neo4ind:nsmntx3502 neo4voc:name "NSMNTX" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.2" ;
			   neo4voc:releaseDate "03-06-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:apoc3502 neo4voc:name "APOC" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.4" ;
			   neo4voc:releaseDate "05-31-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:graphql3502 neo4voc:name "Neo4j-GraphQL" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.3" ;
			   neo4voc:releaseDate "05-05-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:neo4j355 neo4voc:name "neo4j" ;
			   a neo4voc:GraphPlatform , neo4voc:AwesomePlatform ;
			   neo4voc:version "3.5.5" .
  ',
  'Turtle'
)

preview 3

Neosemantics has performed the following transformations on the RDF data:

  • neo4j355 has become a Node with two labels: GraphPlatform and AwesomePlatform, one for each rdf:type statement

    • The node has a URI property of http://neo4j.org/ind#neo4j355 (the URI of the resource)

    • neo4voc:name and neo4voc:version have been stripped of the namespace and assigned as properties on the Node

  • nsmntx3502, apoc3502 and graphql3502 have become Nodes with a label of Neo4jPlugin based on their rdf:type

    • neo4voc:version, neo4voc:releaseDate and neo4voc:runsOn have been stripped of the namespace and converted to properties on these nodes

  • A Relationship has been created between the GraphPlatform node and each of the Neo4jPlugin nodes with the type of runsOn

You can also use the n10s.rdf.preview.inline(rdf: string, format: string), passing the RDF as the first parameter and the format as the second parameter.

Mappings

We can change the names that Neosemantics assigns to a Graph schema element (property name, rel type or node label) by creating a Mapping. For example, you may want to change the type of the Relationship that is created between the Neo4jPlugin and GraphPlatform.

First, we will need to ensure that Neosemantics is aware of the namespace. To do this, we use the n10s.nsprefixes.add procedure. This procedure takes two parameters:

  1. The prefix for the namespace

  2. The full URI of the namespace

In this example, the http://neo4j.org/vocab/sw# namespace will be mapped to neo4voc.

CALL n10s.nsprefixes.add(
    'neo4voc', (1)
    'http://neo4j.org/vocab/sw#' (2)
)
1 The prefix that we will use from now on to refer to the namespace is neo4voc
2 The full namespace is http://neo4j.org/vocab/sw#

Now we can create the mapping for vocabulary elements in this namespace. We do this using the n10s.mapping.add procedure. This procedure takes two parameters:

  1. The full URI of the Schema Element (including the namespace part)

  2. The value that the Schema Element should be renamed to in the graph

As we would like to rename the neo4voc:runsOn element to RUNS_ON, we can run the following procedure:

CALL n10s.mapping.add(
  'http://neo4j.org/vocab/sw#runsOn', (1)
  'RUNS_ON' (2)
)
1 The fully URI of the runsOn RDF predicate
2 The name of the the relationship that will be persisted in Neo4j

Re-running the preview query above should now show that the Relationship Type has been renamed from runsOn to RUNS_ON.

preview 4

You can also extract the namespaces from an RDF document using the n10s.nsprefixes.addFromText procedure and passing the header part of the document where the namespace definitions are normally found.

Importing Data

The signature of the import procedure is similar to the preview - all you need to do is swap preview for import. The difference is that this one will persist the graph in Neo4j while the preview one would just show what the result of importing RDF would be but without any effect on the graph. Running the following Cypher query should run the import and return a summary of the execution including whether the import was successful or failed (terminationStatus), the number of triples from the RDF source that were effectively persisted in Neo4j (triplesLoaded), the number of triples that were parsed in the RDF source (triplesParsed). Note that some triples may be filtered based on the configuration options.

Fetch
CALL n10s.rdf.import.fetch(
  'https://raw.githubusercontent.com/neo4j-labs/neosemantics/3.5/docs/rdf/nsmntx.ttl',
  'Turtle'
)
Inline
CALL n10s.rdf.import.inline(
  '
@prefix neo4voc: <http://neo4j.org/vocab/sw#> .
@prefix neo4ind: <http://neo4j.org/ind#> .

neo4ind:nsmntx3502 neo4voc:name "NSMNTX" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.2" ;
			   neo4voc:releaseDate "03-06-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:apoc3502 neo4voc:name "APOC" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.4" ;
			   neo4voc:releaseDate "05-31-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:graphql3502 neo4voc:name "Neo4j-GraphQL" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.3" ;
			   neo4voc:releaseDate "05-05-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:neo4j355 neo4voc:name "neo4j" ;
			   a neo4voc:GraphPlatform , neo4voc:AwesomePlatform ;
			   neo4voc:version "3.5.5" .
  ',
  'Turtle'
)
terminationStatus triplesLoaded triplesParsed namespaces extraInfo callParams

OK

19

19

{}

Exporting Data

Neosemantics offers two methods for exporting data out of Neo4j back into RDF format.

Using the Cypher n10s.rdf.export Procedure

This is an experimental feature. The procedure may change in future.

You can export the results of a Cypher query as RDF triples using the n10s.rdf.export.cypher procedure. The procedure takes a Cypher statement as the first parameter and an optional map of params. The result of the query is stream of triples.

CALL n10s.rdf.export.cypher("MATCH (p:GraphPlatform) RETURN p")
subject predicate object isLiteral literalType literalLang

"http://neo4j.org/ind#neo4j355"

"http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

"neo4j://vocabulary#AwesomePlatform"

false

null

null

"http://neo4j.org/ind#neo4j355"

"http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

"neo4j://vocabulary#GraphPlatform"

false

null

null

"http://neo4j.org/ind#neo4j355"

"neo4j://vocabulary#version"

"3.5.5"

true

"http://www.w3.org/2001/XMLSchema#string"

null

"http://neo4j.org/ind#neo4j355"

"neo4j://vocabulary#name"

"neo4j"

true

"http://www.w3.org/2001/XMLSchema#string"

null

Using the HTTP Endpoint

The neosemantics plugin also exposes a HTTP endpoint on the Neo4j Server. To enable this, you must add the following line to the neo4j.conf file in the conf/ directory of your Neo4j installation.

dbms.unmanaged_extension_classes=n10s.endpoint=/rdf

After restarting the database, an endpoint will be mounted to the Neo4j Server. Provided your server is hosted on localhost, you should be able to send a GET request to DATABASE/describe/[URI], replacing [DATABASE] with the name of the database and [URI] with a URL encoded version of the full URI of the RDF resource.

For example, to export an RDF representation of the http://neo4j.org/ind#neo4j355 resource, we could run the following in Neo4j Browser

:GET http://localhost:7474/rdf/neo4j/describe/http%3A%2F%2Fneo4j.org%2Find%23neo4j355

This will provide us with the following RDF output:

@prefix neovoc: <neo4j://vocabulary#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://neo4j.org/ind#neo4j355> a neovoc:AwesomePlatform, neovoc:GraphPlatform;
  neovoc:name "neo4j";
  neovoc:version "3.5.5" .

<http://neo4j.org/ind#apoc3502> neovoc:RUNS_ON <http://neo4j.org/ind#neo4j355> .

<http://neo4j.org/ind#graphql3502> neovoc:RUNS_ON <http://neo4j.org/ind#neo4j355> .

<http://neo4j.org/ind#nsmntx3502> neovoc:RUNS_ON <http://neo4j.org/ind#neo4j355> .

Deleting Data as Triples

To delete RDF data from our Neo4j Graph, we can use the n10s.rdf.delete.fetch procedure. As with the preview and import procedures, this accepts two arguments: the URL of the RDF query and the format.

For example, to delete the data that we loaded in the Import step, we can run the following query:

Fetch
CALL n10s.rdf.delete.fetch(
  'https://raw.githubusercontent.com/neo4j-labs/neosemantics/3.5/docs/rdf/nsmntx.ttl',
  'Turtle'
)
Inline
CALL n10s.rdf.delete.inline(
  '
@prefix neo4voc: <http://neo4j.org/vocab/sw#> .
@prefix neo4ind: <http://neo4j.org/ind#> .

neo4ind:nsmntx3502 neo4voc:name "NSMNTX" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.2" ;
			   neo4voc:releaseDate "03-06-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:apoc3502 neo4voc:name "APOC" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.4" ;
			   neo4voc:releaseDate "05-31-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:graphql3502 neo4voc:name "Neo4j-GraphQL" ;
			   a neo4voc:Neo4jPlugin ;
			   neo4voc:version "3.5.0.3" ;
			   neo4voc:releaseDate "05-05-2019" ;
			   neo4voc:runsOn neo4ind:neo4j355 .

neo4ind:neo4j355 neo4voc:name "neo4j" ;
			   a neo4voc:GraphPlatform , neo4voc:AwesomePlatform ;
			   neo4voc:version "3.5.5" .
  ',
  'Turtle'
)
terminationStatus triplesDeleted namespaces extraInfo

"OK"

19

null

""

Conclusion

In this tutorial we have learned how to:

  • Prepared Neo4j and Neosemantics to import RDF Data

  • Created a Graph Config

  • Previewed RDF data from a remote source using n10s.rdf.preview.fetch

  • Imported the RDF data using n10s.rdf.import.fetch

  • Exported graph data back into RDF using a Cypher statement (n10s.rdf.export.cypher) and the HTTP endpoint

  • Deleted RDF data from the graph using n10s.rdf.delete.fetch

Now that you are familiar with the procedures available in Neosemantics, we can now move on to a more concrete example, Importing Wikidata into Neo4j