
Appendix

Glossary

Pending migration

See Resolved migration.

Resolved migration

A migration that has been resolved on the classpath or the filesystem but has not yet been applied.

Schema database

A database inside a Neo4j enterprise instance or cluster that stores the schema information from Neo4j-Migrations.

Target database

A database inside a Neo4j enterprise instance or cluster that is refactored by Neo4j-Migrations.

XML Schemas

migration.xsd

Before we jump into the pure joy of an XML Schema, let’s read in plain English what our schema can do:

  • A <migration /> can have zero or exactly one <catalog /> element.

  • A <catalog /> consists of zero or one <constraints /> and zero or one <indexes /> elements. In addition, it can carry a reset attribute, which replaces the currently known catalog content with the catalog being defined.

  • Both of them can contain zero or more of their individual elements, according to their definition.

  • A <migration /> can have zero or one <verify /> operations and the <verify /> operation must be the first operation.

  • A <migration /> can then have zero or more <create /> and <drop /> operations or exactly one <apply /> operation. The <apply /> operation is mutually exclusive with all operations working on single items.

  • Operations that work on a single item (create and drop) are allowed to define a single item locally. This item won’t participate in the global catalog.

  • Operations that work on a single item can refer to this item by either using the attribute item (a free form string) or ref (an xs:IDREF). While the latter is useful for referring to items defined in the same migration (it will usually be validated by your tooling), the former is handy to refer to items defined in other migrations.

A catalog item has either a child element <label />, in which case it always refers to nodes, or a mutually exclusive child element <type />, in which case it always refers to relationships. The type attribute is unrelated to the target entity; it defines the kind of the item (such as unique or existential constraints).
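
The rules above can be sketched as a small migration document. This is an illustration only: all item names, labels, types and properties (unique_isbn, Book, READ and so on) are made up, and old_title_index is assumed to have been defined by an earlier migration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">

  <!-- Global catalog: at most one <constraints /> and one <indexes /> element -->
  <catalog>
    <constraints>
      <!-- <label /> means this constraint refers to nodes -->
      <constraint name="unique_isbn" type="unique">
        <label>Book</label>
        <properties>
          <property>isbn</property>
        </properties>
      </constraint>
    </constraints>
    <indexes>
      <!-- <type /> means this index refers to relationships -->
      <index name="read_finished_on" type="property">
        <type>READ</type>
        <properties>
          <property>finishedOn</property>
        </properties>
      </index>
    </indexes>
  </catalog>

  <!-- Optional, but must be the first operation -->
  <verify />

  <!-- ref is an xs:IDREF to an item defined in this document -->
  <create ref="unique_isbn" />
  <create ref="read_finished_on" />

  <!-- item is a free-form string, here naming an item assumed to come from an earlier migration -->
  <drop item="old_title_index" />

  <!-- Locally defined item; it does not participate in the global catalog -->
  <create>
    <index name="book_title">
      <label>Book</label>
      <properties>
        <property>title</property>
      </properties>
    </index>
  </create>
</migration>
```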

We do support the following processing instructions:

  • <?assert followed by a valid precondition ?>

  • <?assume followed by a valid precondition ?>
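
As a sketch, a precondition is written as a processing instruction inside the <migration /> element, ahead of the operations. The precondition "that version is ge 4.1" is the one used in the refactoring listing of this appendix; the index name, label and property are hypothetical. An <?assert ?> instruction would be placed the same way. It is assumed here that an unmet assumption causes the migration to be skipped while an unmet assertion aborts, so please check the preconditions documentation for the exact semantics.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">

  <!-- Precondition taken from the refactoring listing in this appendix -->
  <?assume that version is ge 4.1 ?>

  <!-- Hypothetical, locally defined index -->
  <create>
    <index name="movie_title">
      <label>Movie</label>
      <properties>
        <property>title</property>
      </properties>
    </index>
  </create>
</migration>
```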

Look up valid preconditions here. The full XML schema for catalog-based migrations looks like this:

Listing 1. migration.xsd
<?xml version="1.0" encoding="UTF-8" ?>
<!--

    Copyright 2020-2023 the original author or authors.

    Licensed under the Apache License, Version 2.0 (the "License");
    you may not use this file except in compliance with the License.
    You may obtain a copy of the License at

         https://www.apache.org/licenses/LICENSE-2.0

    Unless required by applicable law or agreed to in writing, software
    distributed under the License is distributed on an "AS IS" BASIS,
    WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    See the License for the specific language governing permissions and
    limitations under the License.

-->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
       targetNamespace="https://michael-simons.github.io/neo4j-migrations"
       xmlns="https://michael-simons.github.io/neo4j-migrations"
       elementFormDefault="qualified">

  <xs:element name="migration" type="migration"/>

  <xs:complexType name="migration">
    <xs:sequence>
      <xs:element name="catalog" minOccurs="0" type="catalog"/>
      <xs:element name="verify" minOccurs="0" type="verifyOperation" />
      <xs:choice>
        <xs:choice maxOccurs="unbounded">
          <xs:element name="refactor" minOccurs="0" maxOccurs="unbounded" type="refactoring"/>
          <xs:choice maxOccurs="unbounded">
            <xs:element name="create" minOccurs="0" maxOccurs="unbounded" type="createOperation"/>
            <xs:element name="drop" minOccurs="0" maxOccurs="unbounded" type="dropOperation"/>
          </xs:choice>
        </xs:choice>
        <xs:element name="apply" minOccurs="0" type="applyOperation"/>
      </xs:choice>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="refactoring">
    <xs:sequence minOccurs="0">
      <xs:element name="parameters">
        <xs:complexType>
          <xs:sequence maxOccurs="unbounded">
            <xs:any processContents="lax"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:sequence>
    <xs:attribute name="type">
      <xs:simpleType>
        <xs:restriction base="xs:string">
          <xs:enumeration value="merge.nodes"/>
          <xs:enumeration value="migrate.createFutureIndexes"/>
          <xs:enumeration value="migrate.replaceBTreeIndexes"/>
          <xs:enumeration value="normalize.asBoolean"/>
          <xs:enumeration value="rename.label"/>
          <xs:enumeration value="rename.type"/>
          <xs:enumeration value="rename.nodeProperty"/>
          <xs:enumeration value="rename.relationshipProperty"/>
          <xs:enumeration value="addSurrogateKeyTo.nodes"/>
          <xs:enumeration value="addSurrogateKeyTo.relationships"/>
        </xs:restriction>
      </xs:simpleType>
    </xs:attribute>
  </xs:complexType>

  <xs:complexType name="catalog">
    <xs:all>
      <xs:element name="constraints" minOccurs="0">
        <xs:complexType>
          <xs:sequence>
            <xs:element type="constraint" name="constraint"
                  maxOccurs="unbounded" minOccurs="0"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
      <xs:element name="indexes" minOccurs="0">
        <xs:complexType>
          <xs:sequence>
            <xs:element type="index" name="index"
                  maxOccurs="unbounded" minOccurs="0"/>
          </xs:sequence>
        </xs:complexType>
      </xs:element>
    </xs:all>
    <xs:attribute name="reset" type="xs:boolean" default="false"/>
  </xs:complexType>

  <xs:complexType name="operation" />

  <xs:complexType name="applyOperation">
    <xs:complexContent>
      <xs:extension base="operation" />
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="verifyOperation">
    <xs:complexContent>
      <xs:extension base="operation" >
        <xs:attribute name="useCurrent" type="xs:boolean" default="false"/>
        <xs:attribute name="allowEquivalent" type="xs:boolean" default="true"/>
        <xs:attribute name="includeOptions" type="xs:boolean" default="false"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="itemOperation">
    <xs:complexContent>
      <xs:extension base="operation">
        <xs:sequence>
          <xs:choice minOccurs="0">
            <xs:element name="constraint" type="constraint"/>
            <xs:element name="index" type="index"/>
          </xs:choice>
        </xs:sequence>
        <xs:attribute name="item" type="xs:string"/>
        <xs:attribute name="ref" type="xs:IDREF"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="createOperation">
    <xs:complexContent>
      <xs:extension base="itemOperation">
        <xs:attribute name="ifNotExists" type="xs:boolean" default="true"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="dropOperation">
    <xs:complexContent>
      <xs:extension base="itemOperation">
        <xs:attribute name="ifExists" type="xs:boolean" default="true"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="properties">
    <xs:sequence>
      <xs:element type="xs:string" name="property" maxOccurs="unbounded"/>
    </xs:sequence>
  </xs:complexType>

  <xs:complexType name="catalogItem">
    <xs:attribute name="name" use="required" type="xs:ID"/>
  </xs:complexType>

  <xs:complexType name="constraint">
    <xs:complexContent>
      <xs:extension base="catalogItem">
        <xs:sequence>
          <xs:choice>
            <xs:element name="label" type="xs:string"/>
            <xs:element name="type" type="xs:string"/>
          </xs:choice>
          <xs:element type="properties" name="properties"/>
          <xs:element type="xs:string" name="options" minOccurs="0"/>
        </xs:sequence>
        <xs:attribute name="type" use="required">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="unique"/>
              <xs:enumeration value="exists"/>
              <xs:enumeration value="key"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>

  <xs:complexType name="index">
    <xs:complexContent>
      <xs:extension base="catalogItem">
        <xs:sequence>
          <xs:choice>
            <xs:element name="label" type="xs:string"/>
            <xs:element name="type" type="xs:string"/>
          </xs:choice>
          <xs:element type="properties" name="properties"/>
          <xs:element type="xs:string" name="options" minOccurs="0"/>
        </xs:sequence>
        <xs:attribute name="type">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:enumeration value="property" />
              <xs:enumeration value="fulltext"/>
              <xs:enumeration value="text"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:attribute>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
</xs:schema>
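
To make the schema more tangible, here is a sketch of a document it would accept that combines a reset catalog with <verify /> and <apply />. The constraint name, label and property are hypothetical; whether such a migration succeeds against a given database depends on that database's state.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">

  <!-- reset="true" replaces the known catalog content with this definition -->
  <catalog reset="true">
    <constraints>
      <constraint name="person_name_unique" type="unique">
        <label>Person</label>
        <properties>
          <property>name</property>
        </properties>
      </constraint>
    </constraints>
  </catalog>

  <!-- Schema defaults apply: useCurrent=false, allowEquivalent=true, includeOptions=false -->
  <verify />

  <!-- <apply /> cannot be combined with <create />, <drop /> or <refactor /> -->
  <apply />
</migration>
```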

Refactorings

Neo4j-Migrations contains a set of ready-to-use database refactorings. These refactorings are all modelled very closely on those available in APOC, but none of them requires APOC to be installed in your database. The refactorings are mostly designed to work from within a catalog, but they work very well on their own too. While they are part of the Core API, they don’t depend on a Migration instance. Their API is subject to the same versioning guarantees as the rest of Neo4j-Migrations. Refactorings might evolve into their own module at a later point in time.

Some refactorings require certain Neo4j versions. If you support multiple Neo4j versions, define those refactorings as single-item migrations and add assumptions as in the following example:

Listing 2. Normalize boolean properties when running Neo4j 4.1+
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">

  <?assume that version is ge 4.1 ?>

  <refactor type="normalize.asBoolean">
    <parameters>
      <parameter name="property">watched</parameter>
      <parameter name="trueValues">
        <value>y</value>
        <value>YES</value>
      </parameter>
      <parameter name="falseValues">
        <value>n</value>
        <value>NO</value>
      </parameter>
    </parameters>
  </refactor>
</migration>

Applying refactorings programmatically

While you would normally use the declarative approach of applying refactorings from within XML / catalog based migrations, Neo4j-Migrations offers an API for it as well:

Listing 3. Rename one type and normalize attributes to boolean in a programmatic fashion
try (Session session = driver.session()) {
  session.run("CREATE (m:Person {name:'Michael'}) -[:LIKES]-> (n:Person {name:'Tina', klug:'ja'})"); (1)
}

Migrations migrations = new Migrations(MigrationsConfig.defaultConfig(), driver); (2)

Counters counters = migrations.apply(
  Rename.type("LIKES", "MAG"), (3)
  Normalize.asBoolean("klug", List.of("ja"), List.of("nein"))
);

try (Session session = driver.session()) {
  long cnt = session
    .run("MATCH (m:Person {name:'Michael'}) -[:MAG]-> (n:Person {name:'Tina', klug: true}) RETURN count(m)")
    .single().get(0).asLong();
  assert cnt == 1;
}
1 The graph that will be refactored
2 You can create the instance as shown here or use the existing one when you already use the Spring Boot starter or the Quarkus extensions
3 Build as many refactorings as needed, they will be applied in order. You can use the counters to check for the numbers of modifications

Merging nodes

Merge.nodes(String source, List<PropertyMergePolicy> mergePolicies) merges all the nodes, their properties and relationships onto a single node (the first in the list of matched nodes). It is important that your query uses an ordered return for this to work properly.

The Merge refactoring requires Neo4j 4.4+.

As catalog item:

<refactor type="merge.nodes">
  <parameters>
    <parameter name="sourceQuery">MATCH (p:Person) RETURN p ORDER BY p.name ASC</parameter>
    <!-- Repeat as often as necessary -->
    <parameter name="mergePolicy">
      <pattern>name</pattern>
      <strategy>KEEP_LAST</strategy>
    </parameter>
    <parameter name="mergePolicy">
      <pattern>.*</pattern>
      <strategy>KEEP_FIRST</strategy>
    </parameter>
  </parameters>
</refactor>

Normalizing

Normalizing is the process of taking a humongous set of properties and other graph items and applying a scheme to them. The normalizing refactoring requires at least Neo4j 4.1; running it with batches requires Neo4j 4.4 or higher.

Normalize properties as boolean

Database schemas often evolve over time, and you find properties with a boolean meaning but a string datatype, with content such as ja, HiddenB, yes, NO or literal null. To use them properly in queries, you might want to normalize them into a real boolean value. This is done with Normalize.asBoolean.

Normalize.asBoolean takes the name of a property, a list of values that are treated as true and a list of values that are treated as false. A property with a value that is not in either of those lists will be deleted. A null value is treated as a non-existent property. However, if either list contains literal null, a property will be created with the corresponding value.

By default, all properties of all nodes and relationships will be normalized. To apply this refactoring only to a subset, e.g. only to nodes, use a custom query.

A Java example looks like this:

Normalize.asBoolean(
    "watched",
    List.of("y", "YES", "JA"),
    // List.of does not support literal null,
    // so we need to do this the old-school way
    Arrays.asList("n", "NO", null)
);

The same as a catalog item:

<refactor type="normalize.asBoolean">
  <parameters>
    <parameter name="property">watched</parameter>
    <parameter name="trueValues">
      <value>y</value>
      <value>YES</value>
      <value>JA</value>
    </parameter>
    <parameter name="falseValues">
      <value>n</value>
      <value>NO</value>
      <value />
    </parameter>
    <!-- Optional custom query and batch size -->
    <!--
    <parameter name="customQuery">MATCH (n:Movie) return n</parameter>
    <parameter name="batchSize">42</parameter>
    -->
  </parameters>
</refactor>
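
If only a subset should be normalized, the optional parameters that are commented out above can be enabled. The following is a minimal sketch of a complete migration, assuming Neo4j 4.4 or higher because batching is used; the label Movie and the batch size of 50 are arbitrary choices.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">

  <!-- Batching requires Neo4j 4.4+, so the migration is guarded accordingly -->
  <?assume that version is ge 4.4 ?>

  <refactor type="normalize.asBoolean">
    <parameters>
      <parameter name="property">watched</parameter>
      <parameter name="trueValues">
        <value>y</value>
        <value>YES</value>
      </parameter>
      <parameter name="falseValues">
        <value>n</value>
        <value>NO</value>
      </parameter>
      <!-- Restrict the refactoring to Movie nodes -->
      <parameter name="customQuery">MATCH (n:Movie) RETURN n</parameter>
      <!-- Process the matched entities in batches of 50 -->
      <parameter name="batchSize">50</parameter>
    </parameters>
  </refactor>
</migration>
```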

Renaming labels, types and properties

ac.simons.neo4j.migrations.core.refactorings.Rename renames labels, types and properties and requires in its default form only Neo4j 3.5 to work. Custom queries for filtering target entities require Neo4j 4.1, batches Neo4j 4.4.

Common methods

inBatchesOf

Enables or disables batching, requires Neo4j 4.4

withCustomQuery

Provides a custom query matching an entity (Node or Relationship) for renaming. The query must return zero or more rows, each containing one item. This feature requires Neo4j 4.1.

Renaming labels

Rename.label(String from, String to) renames all labels on all nodes that are equal to the value of from to the value of to.

As catalog item:

<refactor type="rename.label">
  <parameters>
    <parameter name="from">Engineer</parameter>
    <parameter name="to">DevRel</parameter>
    <!-- Optional custom query -->
    <!--
    <parameter name="customQuery"><![CDATA[
      MATCH (person:Engineer)
      WHERE person.name IN ["Mark", "Jennifer", "Michael"]
      RETURN person
    ]]></parameter>
    -->
    <!-- Optional batch size (requires Neo4j 4.4+) -->
    <!--
    <parameter name="batchSize">23</parameter>
    -->
  </parameters>
</refactor>

Renaming types

Rename.type(String from, String to) renames all types on all relationships that are equal to the value of from to the value of to.

As catalog item:

<refactor type="rename.type">
  <parameters>
    <parameter name="from">COLLEAGUES</parameter>
    <parameter name="to">FROLLEAGUES</parameter>
    <!-- Optional custom query -->
    <!--
    <parameter name="customQuery"><![CDATA[
      MATCH (:Engineer {name: "Jim"})-[rel]->(:Engineer {name: "Alistair"})
      RETURN rel
    ]]></parameter>
    -->
    <!-- Optional batch size (requires Neo4j 4.4+) -->
    <!--
    <parameter name="batchSize">23</parameter>
    -->
  </parameters>
</refactor>

Renaming node properties

Rename.nodeProperty(String from, String to) renames all properties on all nodes that are equal to the value of from to the value of to.

As catalog item:

<refactor type="rename.nodeProperty">
  <parameters>
    <parameter name="from">released</parameter>
    <parameter name="to">veröffentlicht im Jahr</parameter>
    <!-- Optional custom query -->
    <!--
    <parameter name="customQuery"><![CDATA[
      MATCH (n:Movie) WHERE n.title =~ '.*Matrix.*' RETURN n
    ]]></parameter>
    -->
    <!-- Optional batch size (requires Neo4j 4.4+) -->
    <!--
    <parameter name="batchSize">23</parameter>
    -->
  </parameters>
</refactor>

Renaming type properties

Rename.typeProperty(String from, String to) renames all properties on all relationships that are equal to the value of from to the value of to.

As catalog item:

<refactor type="rename.relationshipProperty">
  <parameters>
    <parameter name="from">roles</parameter>
    <parameter name="to">rollen</parameter>
    <!-- Optional custom query -->
    <!--
    <parameter name="customQuery"><![CDATA[
      MATCH (n:Movie) <-[r:ACTED_IN] -() WHERE n.title =~ '.*Matrix.*' RETURN r
    ]]></parameter>
    -->
    <!-- Optional batch size (requires Neo4j 4.4+) -->
    <!--
    <parameter name="batchSize">23</parameter>
    -->
  </parameters>
</refactor>
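
Since the schema allows any number of <refactor /> elements in one migration, several renamings can be grouped together. The following sketch reuses the names from the examples above; it is assumed here that the refactorings are applied in document order.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">

  <!-- Rename the label Engineer to DevRel on all nodes -->
  <refactor type="rename.label">
    <parameters>
      <parameter name="from">Engineer</parameter>
      <parameter name="to">DevRel</parameter>
    </parameters>
  </refactor>

  <!-- Rename the relationship type COLLEAGUES to FROLLEAGUES -->
  <refactor type="rename.type">
    <parameters>
      <parameter name="from">COLLEAGUES</parameter>
      <parameter name="to">FROLLEAGUES</parameter>
    </parameters>
  </refactor>

  <!-- Rename the relationship property roles to rollen -->
  <refactor type="rename.relationshipProperty">
    <parameters>
      <parameter name="from">roles</parameter>
      <parameter name="to">rollen</parameter>
    </parameters>
  </refactor>
</migration>
```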

Adding surrogate keys

You can use Neo4j-Migrations to add Surrogate Keys aka technical keys to your Nodes and Relationships. This is especially helpful to migrate away from internal Neo4j ids, such as id() (Neo4j 4.4 and earlier) or elementId(). While these functions are useful and several Object-Graph-Mappers can use them right out of the box, they are often not what you want:

  • You expose database internals as proxy for your own technical keys

  • Your business now is dependent on the way the database generates them

  • They might get reused (inside Neo4j), leaving you with no good guarantees for an identifier

Our built-in refactorings use randomUUID() to assign a UUID to a property named id for Nodes with a given set of labels or Relationships with a matching type for which such a property does not exist. Both the generator and the name of the property can be individually configured. Also, both types of entities can be matched with a custom query.

Listing 4. Adding random UUIDs as ids to Movie Nodes (XML)
<refactor type="addSurrogateKeyTo.nodes">
  <parameters>
    <parameter name="labels">
      <value>Movie</value>
    </parameter>
  </parameters>
</refactor>
Listing 5. Adding random UUIDs as ids to Movie Nodes (Java)
var addSurrogateKey = AddSurrogateKey.toNodes("Movie");
Listing 6. Adding random UUIDs as ids to ACTED_IN relationships (XML)
<refactor type="addSurrogateKeyTo.relationships">
  <parameters>
    <parameter name="type">ACTED_IN</parameter>
  </parameters>
</refactor>
Listing 7. Adding random UUIDs as ids to ACTED_IN relationships (Java)
var addSurrogateKey = AddSurrogateKey.toRelationships("ACTED_IN");

The following examples use a different target property and hard-copy the internal id into a property. Of course, you can use your own user-defined functions for generating keys. A single %s will be replaced with a variable holding the matched entity. The syntax for relationships is the same (as demonstrated above):

Listing 8. Using a different property and generator function (XML)
<refactor type="addSurrogateKeyTo.nodes">
  <parameters>
    <parameter name="labels">
      <value>Movie</value>
    </parameter>
    <parameter name="property">movie_pk</parameter>
    <parameter name="generatorFunction">id(%s)</parameter>
  </parameters>
</refactor>
Listing 9. Using a different property and generator function (Java)
var addSurrogateKey = AddSurrogateKey.toNodes("Movie")
  .withProperty("movie_pk")
  .withGeneratorFunction("id(%s)");
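
For relationships, the same customization might look like the following sketch. It assumes that the property and generatorFunction parameters carry over unchanged from the node variant, as the text above suggests; the property name acted_in_pk is made up.

```xml
<refactor type="addSurrogateKeyTo.relationships">
  <parameters>
    <!-- Relationships are matched by type instead of labels -->
    <parameter name="type">ACTED_IN</parameter>
    <!-- Hypothetical target property instead of the default id -->
    <parameter name="property">acted_in_pk</parameter>
    <!-- %s is replaced with a variable holding the matched relationship -->
    <parameter name="generatorFunction">id(%s)</parameter>
  </parameters>
</refactor>
```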

Migrating BTREE indexes to "future" indexes

Neo4j 4.4 introduces future indexes, RANGE and POINT, which replace the well-known BTREE indexes of Neo4j 4.x. These new indexes are available from Neo4j 4.4 onwards but will not participate in any query planning in Neo4j 4.4. They exist merely for migration purposes in Neo4j 4.4: Neo4j 5.0 does not support BTREE indexes at all. This means a database that contains BTREE indexes cannot be upgraded to Neo4j 5.0. Existing BTREE indexes need to be dropped prior to attempting the upgrade. The class ac.simons.neo4j.migrations.core.refactorings.MigrateBTreeIndexes has been created for this purpose. It allows creating matching new indexes and optionally dropping the indexes that are no longer supported in Neo4j 5.0 and higher prior to upgrading the store.

As with all the other refactorings, it can be used programmatically in your own application or through Neo4j-Migrations.

Preparing an upgrade to Neo4j 5.0 by creating future indexes in parallel

Listing 10. Creating future indexes in parallel to old indexes
<refactor type="migrate.createFutureIndexes">
    <parameters> (1)
        <parameter name="suffix">_future</parameter> (2)
        <parameter name="excludes"> (3)
            <value>a</value>
            <value>b</value>
        </parameter>
        <parameter name="typeMapping"> (4)
            <mapping>
                <name>c</name>
                <type>POINT</type>
            </mapping>
            <mapping>
                <name>d</name>
                <type>TEXT</type>
            </mapping>
        </parameter>
    </parameters>
</refactor>
1 All parameters are optional
2 The default suffix is _new
3 An excludes list can be used to exclude items from being processed by name. Its counterpart is the includes list. If the latter is not empty, only the items in the list will be processed
4 By default, RANGE indexes are created. The type mapping allows mapping specific old indexes to either RANGE, POINT or TEXT. The type mappings are not consulted when migrating constraint-backing indexes.

When the above refactoring is applied, new indexes and constraints will be created in parallel to the old ones. The refactoring will log statements for dropping the old constraints.
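
The includes list mentioned in the callouts can be sketched as follows. The index names x and y are placeholders, and the migration is guarded with an assumption because future indexes are only available from Neo4j 4.4 onwards.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">

  <!-- Future indexes exist from Neo4j 4.4 onwards -->
  <?assume that version is ge 4.4 ?>

  <refactor type="migrate.createFutureIndexes">
    <parameters>
      <!-- A non-empty includes list restricts processing to the named items -->
      <parameter name="includes">
        <value>x</value>
        <value>y</value>
      </parameter>
      <!-- No suffix given, so the default suffix _new applies -->
    </parameters>
  </refactor>
</migration>
```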

Preparing an upgrade to Neo4j 5.0 by replacing BTREE indexes with future indexes

The advantage of this approach is the fact that it won’t need additional manual work before doing a store upgrade. However, the store upgrade should follow closely after dropping the old indexes and creating the replacement indexes as the latter won’t participate in planning at all prior to the actual upgrade to Neo4j 5.0 or higher.

Listing 11. Replacing BTREE indexes with future indexes
<refactor type="migrate.replaceBTreeIndexes">
    <parameters>
        <parameter name="includes">
            <value>x</value>
            <value>y</value>
        </parameter>
    </parameters>
</refactor>

The suffix parameter is not supported as it is not needed. The other parameters have the same meaning as with migrate.createFutureIndexes. The above example shows the includes parameter.
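
A variant using the other supported parameters might look like this sketch. The names a and c are placeholders, and it is assumed that excludes and typeMapping are written exactly as for migrate.createFutureIndexes.

```xml
<refactor type="migrate.replaceBTreeIndexes">
    <parameters>
        <!-- Leave the item named a untouched -->
        <parameter name="excludes">
            <value>a</value>
        </parameter>
        <!-- Replace the old index c with a POINT index instead of the default RANGE -->
        <parameter name="typeMapping">
            <mapping>
                <name>c</name>
                <type>POINT</type>
            </mapping>
        </parameter>
    </parameters>
</refactor>
```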

Annotation processing

Neo4j-Migrations offers annotation processing for SDN 6 and generates catalogs containing unique constraints for all @Node entities using either assigned or externally generated ids (via @Id plus an optional external @GeneratedValue or without further annotation).

This is in line with recommended best practices for SDN 6:

  • Use externally assigned or generated IDs instead of Neo4j internal id values (especially when making those ids available to external systems)

  • Create at least indexes for them, or better, unique constraints, to ensure that any assigned value is fit for its purpose

For more ideas and ruminations around that, please have a look at How to choose an unique identifier for your database entities. While that article is still written from an SDN5+OGM perspective, its core ideas still apply.

The annotation processor is available under the following coordinates:

Listing 12. Annotation processor as Maven dependency
<dependency>
    <groupId>eu.michael-simons.neo4j</groupId>
    <artifactId>neo4j-migrations-annotation-processor</artifactId>
    <version>2.0.3</version>
</dependency>

It has no dependencies apart from Neo4j-Migrations itself (neither SDN6 nor Neo4j-OGM), so it is safe to use it either directly as a dependency, in which case it will be picked up by all recent Java compilers, or as a dedicated processor for the compiler:

Listing 13. Annotation processor configured as processor for the compiler plugin inside a Maven pom
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <configuration>
        <annotationProcessorPaths>
            <annotationProcessorPath>
                <groupId>eu.michael-simons.neo4j</groupId>
                <artifactId>neo4j-migrations-annotation-processor</artifactId>
                <version>2.0.3</version>
            </annotationProcessorPath>
        </annotationProcessorPaths>
        <compilerArgs>
            <arg>-Aorg.neo4j.migrations.catalog_generator.default_catalog_name=R${next-migration-version}__Create_sdn_constraints.xml</arg>
            <arg>-Aorg.neo4j.migrations.catalog_generator.output_dir=my-generated-migrations</arg>
        </compilerArgs>
    </configuration>
</plugin>

The latter approach allows for passing additional configuration to the processor, such as the output location relative to target/generated-sources and various name generators. The processor has a limited API living in the neo4j-migrations-annotation-processor-api module, such as ac.simons.neo4j.migrations.annotations.proc.ConstraintNameGenerator and the CatalogNameGenerator. You can provide implementations, but they must live outside the project that is subject to compilation, as otherwise those classes can’t be loaded by us. All implementations must provide a default, publicly accessible constructor or, if they take in any nested options, a public constructor taking in exactly one argument of type Map<String, String>.

The scope of the generator is limited on purpose: It will generate a valid catalog declaration and by default an <apply /> operation. The latter is safe to do because catalogs are internally bound to their migration version: elements added or changed in v2 of a catalog will be appended, and no elements will be deleted from the known catalog. Optionally, the generator can be configured to generate a reset catalog, which will start the catalog fresh at the given version.

The generator does not generate a migration in a known migrations directory, nor does it use a name that will be picked up by Neo4j-Migrations by default. It is your task to configure the build system in such a way that any generated migration will

  • have a recognized naming schema

  • have a name that evaluates to a correctly ordered version number

  • be part of the directories in the target that are configured to be picked by Neo4j-Migrations

Building on the processor configuration above, one exemplary way to take this further is this:

Listing 14. Adding generated migrations to the actual target dir
<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-resources-plugin</artifactId>
    <executions>
        <execution>
            <id>copy-resources</id>
            <goals>
                <goal>copy-resources</goal>
            </goals>
            <phase>process-classes</phase>
            <configuration>
                <outputDirectory>${project.build.outputDirectory}/neo4j/migrations/</outputDirectory>
                <resources>
                    <resource>
                        <directory>${project.build.directory}/generated-sources/annotations/my-generated-migrations</directory>
                        <filtering>false</filtering>
                    </resource>
                </resources>
            </configuration>
        </execution>
    </executions>
</plugin>

This works in our examples but bear in mind: The migration will always be regenerated. This is fine as long as you don’t change your annotated model in any capacity that results in a new or modified index (renaming attributes, labels etc.).

The generator will always use idempotent versions of indexes if available in your database. They work well with repeatable migrations. So one solution is to configure the generator so that it generates a name like R1_2_3__Create_domain_indexes.xml.

One approach is to add the processor to your build and run a diff with the last "good" generated catalog and the new one. If it is different, add the new catalog under an incremented version number.

A simpler approach is to use a name generator that is connected to your target dev-database through a Migrations instance and our API (MigrationChain info = migrations.info(MigrationChain.ChainBuilderMode.REMOTE);). It retrieves the latest applied version from the info instance (via .getLastAppliedVersion), increments it, and adds the catalog under that new version if it has changed; otherwise it reuses the old name.

APIs are provided for the name generation; for the rest, maven-resources-plugin and maybe build-helper-maven-plugin are helpful. The decision to delegate that work has been made as it is rather difficult to propose a one-size-fits-all solution within this tool for all the combinations of different setups and build-systems out there.

Options can be passed to name generators via -Aorg.neo4j.migrations.catalog_generator.naming_options=<nestedproperties> with nestedproperties following a structure like a=x,b=y and so on. If you want to use that, your own name generator must provide a public constructor taking in one single Map<String, String> argument.
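
In the Maven setup shown earlier, such options might be passed as one more compiler argument, as in the following fragment. The keys a and b are the placeholders from the sentence above; which keys are meaningful depends entirely on your own name generator, and how that generator is registered with the processor is not shown here.

```xml
<compilerArgs>
    <!-- Nested properties are given as comma-separated key=value pairs -->
    <arg>-Aorg.neo4j.migrations.catalog_generator.naming_options=a=x,b=y</arg>
</compilerArgs>
```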

Our recommended approach is to use javac directly and script its invocation in your CI/CD system, as shown in the section “Using Javac and our annotation processor” below.

Additional annotations

We offer a set of additional annotations, @Unique and @Required, that can be used standalone or together with SDN6 or OGM to specify constraints on classes. Please check the JavaDoc of those annotations about their usage. The module as shown below has no dependencies, neither on Neo4j-Migrations, nor SDN6 or OGM. While it works excellently with SDN6 for specifying additional information, all annotations offer a way to define labels and relationship types.

Listing 15. Annotation catalog as Maven dependency
<dependency>
    <groupId>eu.michael-simons.neo4j</groupId>
    <artifactId>neo4j-migrations-annotation-catalog</artifactId>
    <version>2.0.3</version>
</dependency>

Combined with SDN6, a valid definition would look like this:

import java.util.UUID;

import org.springframework.data.neo4j.core.schema.GeneratedValue;
import org.springframework.data.neo4j.core.schema.Id;
import org.springframework.data.neo4j.core.schema.Node;

import ac.simons.neo4j.migrations.annotations.catalog.Required;
import ac.simons.neo4j.migrations.annotations.catalog.Unique;

@Node
public record Organization(
	@Id @GeneratedValue @Unique UUID id, (1)
	@Required String name) {
}
1 Technically, the @Unique annotation isn’t necessary here and the processor will generate a constraint for that field out of the box, but we think it reads better that way.
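
For orientation, the catalog generated for the record above might look roughly like the following sketch. The constraint names are hypothetical, since the actual names depend on the configured ConstraintNameGenerator, and the exact shape of the generated file may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">
  <catalog>
    <constraints>
      <!-- Derived from @Id / @Unique on the id component -->
      <constraint name="organization_id_unique" type="unique">
        <label>Organization</label>
        <properties>
          <property>id</property>
        </properties>
      </constraint>
      <!-- Derived from @Required on the name component -->
      <constraint name="organization_name_exists" type="exists">
        <label>Organization</label>
        <properties>
          <property>name</property>
        </properties>
      </constraint>
    </constraints>
  </catalog>

  <!-- The generator emits an <apply /> operation by default -->
  <apply />
</migration>
```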

Using Javac and our annotation processor

The annotation processor itself is made of 3 artifacts:

neo4j-migrations-2.0.3.jar

Needed to generate the catalogs

neo4j-migrations-annotation-processor-api-2.0.3.jar

Contains the API and built-in annotations

neo4j-migrations-annotation-processor-2.0.3.jar

The processor itself

You need to make sure to include all of them in the processor path, otherwise you will most likely read something like error: Bad service configuration file, or exception thrown while constructing Processor object: javax.annotation.processing.Processor: ac.simons.neo4j.migrations.annotations.proc.impl.CatalogGeneratingProcessor Unable to get public no-arg constructor, which is a bit misleading.

For OGM entities

You need at least neo4j-ogm-core as a dependency for processing Neo4j-OGM entities, and most likely all libraries that you use in addition to the OGM annotations in those entities. The following statement generates V01__Create_OGM_schema.xml in a directory named output. It only does annotation processing:

Listing 16. Generating a catalog from Neo4j-OGM entities
javac -proc:only \
-processorpath neo4j-migrations-2.0.3.jar:neo4j-migrations-annotation-processor-api-2.0.3.jar:neo4j-migrations-annotation-processor-2.0.3.jar \
-Aorg.neo4j.migrations.catalog_generator.output_dir=output \
-Aorg.neo4j.migrations.catalog_generator.default_catalog_name=V01__Create_OGM_schema.xml \
-cp neo4j-ogm-core-4.0.0.jar \
extensions/neo4j-migrations-annotation-processing/processor/src/test/java/ac/simons/neo4j/migrations/annotations/proc/ogm/*

For SDN Entities

The only difference here is that you must pass SDN 6.0+ and its dependencies as dependencies to javac:

Listing 17. Generating a catalog from SDN6 entities
javac -proc:only \
-processorpath neo4j-migrations-2.0.3.jar:neo4j-migrations-annotation-processor-api-2.0.3.jar:neo4j-migrations-annotation-processor-2.0.3.jar \
-Aorg.neo4j.migrations.catalog_generator.output_dir=output \
-Aorg.neo4j.migrations.catalog_generator.default_catalog_name=V01__Create_SDN6_schema.xml \
-cp apiguardian-api-1.1.2.jar:spring-data-commons-2.7.2.jar:spring-data-neo4j-6.3.2.jar \
extensions/neo4j-migrations-annotation-processing/processor/src/test/java/ac/simons/neo4j/migrations/annotations/proc/sdn6/movies/*

For classes annotated with catalog annotations

No additional jars apart from the dedicated annotations are necessary:

Listing 18. Generating a catalog from plain annotated classes
javac -proc:only \
-processorpath neo4j-migrations-2.0.3.jar:neo4j-migrations-annotation-processor-api-2.0.3.jar:neo4j-migrations-annotation-processor-2.0.3.jar \
-Aorg.neo4j.migrations.catalog_generator.output_dir=output \
-Aorg.neo4j.migrations.catalog_generator.default_catalog_name=R01__Create_annotated_schema.xml \
-cp neo4j-migrations-annotation-catalog-2.0.3.jar \
extensions/neo4j-migrations-annotation-processing/processor/src/test/java/ac/simons/neo4j/migrations/annotations/proc/catalog/valid/CoffeeBeanPure*
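The javac invocations above differ only in the catalog name, the classpath and the sources. If you would rather script them from Java than from a shell, the argument list could be assembled roughly as follows. This is a sketch under assumptions: the class `CatalogGeneratorInvocation`, its method `buildArguments` and the source file path in `main` are hypothetical, while the javac flags and `-A` options are the ones used in the listings.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class CatalogGeneratorInvocation {

    // Builds the javac arguments used in the listings above.
    public static List<String> buildArguments(List<String> processorJars, String outputDir,
            String catalogName, List<String> classpathJars, List<String> sources) {
        List<String> args = new ArrayList<>();
        args.add("-proc:only"); // annotation processing only, no class files
        args.add("-processorpath");
        args.add(String.join(File.pathSeparator, processorJars));
        args.add("-Aorg.neo4j.migrations.catalog_generator.output_dir=" + outputDir);
        args.add("-Aorg.neo4j.migrations.catalog_generator.default_catalog_name=" + catalogName);
        if (!classpathJars.isEmpty()) {
            args.add("-cp");
            args.add(String.join(File.pathSeparator, classpathJars));
        }
        args.addAll(sources);
        return args;
    }

    public static void main(String[] ignored) {
        List<String> args = buildArguments(
            List.of("neo4j-migrations-2.0.3.jar",
                "neo4j-migrations-annotation-processor-api-2.0.3.jar",
                "neo4j-migrations-annotation-processor-2.0.3.jar"),
            "output",
            "V01__Create_OGM_schema.xml",
            List.of("neo4j-ogm-core-4.0.0.jar"),
            List.of("src/main/java/com/example/Movie.java")); // hypothetical source file
        System.out.println("javac " + String.join(" ", args));
    }
}
```

The resulting list can be handed to a `ProcessBuilder` or to the `run` method of `javax.tools.JavaCompiler`. Note that shell globs such as `*` are not expanded in either case, so the source files have to be listed explicitly.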

Extensions

CSV Support (Experimental)

What does it do?

This module consists of some abstract base classes that help you to use data in CSV files during migration. The idea is that you have some CSV data you want to use with LOAD CSV. Depending on whether the data has changed or not, you may or may not want to repeat the migration.

We have basically everything in place:

  • Java based migrations that can be repeated or not

  • Check-summing based on whatever.

What we can do for you is check-summing CSV data on HTTP URLs. What you need to do is make the data available to both Neo4j and this tool and provide a query to deal with it. Our tooling brings it together. Essentially, you want to inherit from ac.simons.neo4j.migrations.formats.csv.AbstractLoadCSVMigration like this:

Listing 19. R050__LoadBookData.java
import java.net.URI;

import org.neo4j.driver.Query;

import ac.simons.neo4j.migrations.formats.csv.AbstractLoadCSVMigration;

public class R050__LoadBookData extends AbstractLoadCSVMigration {

    public R050__LoadBookData() {
        super(URI.create("https://raw.githubusercontent.com/michael-simons/goodreads/master/all.csv"), true);
    }

    @Override
    public Query getQuery() {
        // language=cypher
        return new Query("""
            LOAD CSV FROM '%s' AS row FIELDTERMINATOR ';'
            MERGE (b:Book {title: trim(row[1])})
            SET b.type = row[2], b.state = row[3]
            WITH b, row
            UNWIND split(row[0], '&') AS author
            WITH b, split(author, ',') AS author
            WITH b, ((trim(coalesce(author[1], '')) + ' ') + trim(author[0])) AS author
            MERGE (a:Person {name: trim(author)})
            MERGE (a)-[r:WROTE]->(b)
            WITH b, a
            WITH b, collect(a) AS authors
            RETURN b.title, b.state, authors
            """);
    }
}

In the above example, we decide that the CSV data might change and therefore we indicate in the constructor call that this migration is repeatable. If this is the case, we suggest using a class name reflecting that. If you use false during construction, migrations will fail if the data changes. The Cypher being used here does a merge, and therefore we added constraints on the book titles and person names beforehand. You may choose to omit the %s in the query template, but we suggest using it for the URI.
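To make the interplay of check-summing and the `%s` placeholder more tangible, here is a minimal sketch using only the JDK. It assumes a CRC32 checksum over the raw bytes and `String.format` for filling the template. Both are illustrative choices and not necessarily what `AbstractLoadCSVMigration` does internally. The class `CsvChecksumSketch`, its methods and the example URL are hypothetical.

```java
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

public class CsvChecksumSketch {

    // Computes a checksum over the CSV content; CRC32 is an assumption for illustration.
    public static String checksum(String csvContent) {
        CRC32 crc = new CRC32();
        crc.update(csvContent.getBytes(StandardCharsets.UTF_8));
        return Long.toString(crc.getValue());
    }

    // Fills the URI into the %s placeholder of a query template.
    // A template without %s is returned unchanged.
    public static String resolveQuery(String template, URI csvLocation) {
        return String.format(template, csvLocation);
    }

    public static void main(String[] args) {
        String before = "King, Stephen;The Dark Tower\n";
        String after = before + "Blakeley, Grace;Stolen\n";

        // A changed checksum is what signals that a repeatable migration should run again.
        System.out.println(checksum(before).equals(checksum(after)));

        URI location = URI.create("https://example.org/books.csv"); // hypothetical URL
        System.out.println(resolveQuery("LOAD CSV FROM '%s' AS row RETURN row", location));
    }
}
```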

AsciiDoctor Support (Experimental)

What does it do?

Please open this README.adoc not only in a rendered view, but have a look at the raw asciidoc version!

When added to one of the supported use-case scenarios as an external library, it allows Neo4j-Migrations to discover AsciiDoctor files and use them as sources of Cypher statements for defining refactorings.

An AsciiDoctor based migration can have zero to many code blocks of type cypher with an id matching our versioning scheme and valid inline Cypher content. The block definition looks like this:

[source,cypher,id=V1.0__Create_initial_data]
----
// Your Cypher based migration
----
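As a rough illustration of what "an id matching our versioning scheme" means, the sketch below splits such an id into a version and a description. The regular expression is a simplified assumption (a `V` or `R` prefix, a version, two underscores, a description) and not the exact pattern used by Neo4j-Migrations. The class `BlockIdSketch` and its members are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BlockIdSketch {

    // Simplified, assumed pattern: V or R, a version, two underscores, a description.
    private static final Pattern ID_PATTERN =
        Pattern.compile("^([VR])(\\d+(?:[._]\\d+)*)__(\\w+)$");

    public static final class ParsedId {
        public final boolean repeatable;
        public final String version;
        public final String description;

        ParsedId(boolean repeatable, String version, String description) {
            this.repeatable = repeatable;
            this.version = version;
            this.description = description;
        }
    }

    // Returns the parsed id, or an empty Optional if the id does not match the assumed scheme.
    public static Optional<ParsedId> parse(String id) {
        Matcher m = ID_PATTERN.matcher(id);
        if (!m.matches()) {
            return Optional.empty();
        }
        // Underscores in the description are shown as spaces.
        return Optional.of(new ParsedId("R".equals(m.group(1)), m.group(2), m.group(3).replace('_', ' ')));
    }

    public static void main(String[] args) {
        parse("V1.0__Create_initial_data")
            .ifPresent(p -> System.out.println(p.version + " / " + p.description));
        // A block without a matching id would not be picked up as a migration.
        System.out.println(parse("Create_initial_data").isPresent());
    }
}
```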

In fact, this README.adoc is a source of migrations on its own. It contains the following refactorings:

CREATE (a:Author {
  id: randomUUID(),
  name: 'Stephen King'
})
CREATE (b:Book {
  id: randomUUID(),
  name: 'The Dark Tower'
})
CREATE (a)-[:WROTE]->(b)

We can have as many migrations as we want.

MATCH (a:Author {
  name: 'Stephen King'
})
CREATE (b:Book  {
  id: randomUUID(),
  name: 'Atlantis'
})
CREATE (a)-[:WROTE]->(b);


CREATE (a:Author {
  id: randomUUID(),
  name: 'Grace Blakeley'
})
CREATE (b:Book {
  id: randomUUID(),
  name: 'Stolen: How to Save the World From Financialisation'
})
CREATE (a)-[:WROTE]->(b);

And to make queries on people's names perform fast, we should add some indexes and constraints. We do this with a separate document, V1.2__Create_id_constraints.xml, which is included here:

<?xml version="1.0" encoding="UTF-8"?>
<migration xmlns="https://michael-simons.github.io/neo4j-migrations">
  <catalog>
    <indexes>
      <index name="idx_author_name">
        <label>Author</label>
        <properties>
          <property>name</property>
        </properties>
      </index>
      <index name="idx_book_name">
        <label>Book</label>
        <properties>
          <property>name</property>
        </properties>
      </index>
    </indexes>
    <constraints>
      <constraint name="unique_id_author" type="unique">
        <label>Author</label>
        <properties>
          <property>id</property>
        </properties>
      </constraint>
      <constraint name="unique_id_book" type="unique">
        <label>Book</label>
        <properties>
          <property>id</property>
        </properties>
      </constraint>
    </constraints>
  </catalog>

  <apply/>
</migration>
Includes are not processed. To make the system process the above XML content, or any included Cypher file, these files must live in a configured location, as described in the manual.
We opted against resolving includes for two reasons: it's easier to reason about the sources of migrations when only inline code is processed, and the inclusion of arbitrary URLs may pose a security risk.
Please have a look at the source of this file itself to understand what works and what does not.

The following block is an example of an included Cypher file, that will be used from its own location when this changeset is applied, but can still be referenced in this documentation:

CREATE (m:User {
  name: 'Michael'
})
WITH m
MATCH (a:Author {
  name: 'Stephen King'
})-[:WROTE]->(b)
WITH m, a, collect(b) AS books
CREATE (m)-[:LIKES]->(a)
WITH m, books
UNWIND books AS b
CREATE (m)-[:LIKES]->(b);

The checksum of AsciiDoctor based migrations is computed individually per Cypher block, not for the whole file. So one AsciiDoctor file basically behaves as a container for many migrations.
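The per-block behaviour can be sketched with the JDK alone. The following example extracts `[source,cypher,id=...]` blocks with a regular expression and computes one checksum per block. It is an illustration only: the actual extension does not necessarily parse files this way, and CRC32 is an assumed checksum. The class `AdocBlockChecksumSketch` and its method are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.CRC32;

public class AdocBlockChecksumSketch {

    // Matches a Cypher source block with an id and captures the id and the block content.
    private static final Pattern CYPHER_BLOCK = Pattern.compile(
        "\\[source,cypher,id=([^\\],]+)\\]\\R----\\R(.*?)\\R----", Pattern.DOTALL);

    // Returns one checksum per block id, in document order.
    public static Map<String, Long> checksumsPerBlock(String adoc) {
        Map<String, Long> result = new LinkedHashMap<>();
        Matcher m = CYPHER_BLOCK.matcher(adoc);
        while (m.find()) {
            CRC32 crc = new CRC32();
            crc.update(m.group(2).getBytes(StandardCharsets.UTF_8));
            result.put(m.group(1), crc.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        String adoc = "= Migrations\n\n"
            + "[source,cypher,id=V1.0__initial_data]\n----\n"
            + "CREATE (a:Author {name: 'Stephen King'});\n----\n\n"
            + "Some prose that is not part of any migration.\n\n"
            + "[source,cypher,id=V1.2__more_data]\n----\n"
            + "CREATE (b:Book {name: 'Atlantis'});\n----\n";
        checksumsPerBlock(adoc).forEach((id, sum) -> System.out.println(id + " -> " + sum));
    }
}
```

With this approach, editing the prose between blocks leaves all checksums untouched, and editing one block changes only the checksum of that block.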

How to use it?

The extension is loaded via service loader. In a standard Spring Boot or Quarkus application you just need to add one additional dependency:

Listing 20. AsciiDoctor extension as Maven dependency
<dependency>
    <groupId>eu.michael-simons.neo4j</groupId>
    <artifactId>neo4j-migrations-formats-adoc</artifactId>
    <version>2.0.3</version>
</dependency>

Or in case you fancy Gradle:

Listing 21. AsciiDoctor extension as Gradle dependency
dependencies {
    implementation 'eu.michael-simons.neo4j:neo4j-migrations-formats-adoc:2.0.3'
}

And that’s all.

For the CLI, you should download the -all artifact from Maven Central: neo4j-migrations-formats-adoc-2.0.3-all.jar. This will work only with the JVM based CLI version, which is available here.

A full example looks like this:

curl -LO https://github.com/michael-simons/neo4j-migrations/releases/download/2.0.3/neo4j-migrations-2.0.3.zip
curl -LO https://repo.maven.apache.org/maven2/eu/michael-simons/neo4j/neo4j-migrations-formats-adoc/2.0.3/neo4j-migrations-formats-adoc-2.0.3-all.jar
unzip neo4j-migrations-2.0.3.zip
cd neo4j-migrations-2.0.3
CLASSPATH_PREFIX=../neo4j-migrations-formats-adoc-2.0.3-all.jar \
  bin/neo4j-migrations --password secret \
  --location file:///path/to/neo4j/adoc-migrations \
  info

Which will result in:

neo4j@localhost:7687 (Neo4j/4.4.4)
Database: neo4j

+---------+---------------------------+---------+---------+----------------------------------------------+
| Version | Description               | Type    | State   | Source                                       |
+---------+---------------------------+---------+---------+----------------------------------------------+
| 1.0     | initial data              | CYPHER  | PENDING | initial_schema_draft.adoc#V1.0__initial_data |
| 1.2     | more data                 | CYPHER  | PENDING | initial_schema_draft.adoc#V1.2__more_data    |
| 2.0     | lets rock                 | CYPHER  | PENDING | more_content.adoc#V2.0__lets_rock            |
| 3.0     | We forgot the constraints | CATALOG | PENDING | V3.0__We_forgot_the_constraints.xml          |
| 4.0     | Plain cypher              | CYPHER  | PENDING | V4.0__Plain_cypher.cypher                    |
+---------+---------------------------+---------+---------+----------------------------------------------+

(Note: empty columns have been omitted for brevity.)

Markdown Support (Experimental)

What does it do?

When added to one of the supported use-case scenarios as an external library, it allows Neo4j-Migrations to discover Markdown files and use them as sources of Cypher statements for defining refactorings.

A Markdown based migration can have zero to many fenced code blocks with an id matching our versioning scheme and valid inline Cypher content. The block definition looks like this:

```id=V1.0__Create_initial_data
// Your Cypher based migration
```

How to use it?

The extension is loaded via service loader. In a standard Spring Boot or Quarkus application you just need to add one additional dependency:

Listing 22. Markdown extension as Maven dependency
<dependency>
    <groupId>eu.michael-simons.neo4j</groupId>
    <artifactId>neo4j-migrations-formats-markdown</artifactId>
    <version>2.0.3</version>
</dependency>

Or in case you fancy Gradle:

Listing 23. Markdown extension as Gradle dependency
dependencies {
    implementation 'eu.michael-simons.neo4j:neo4j-migrations-formats-markdown:2.0.3'
}

And that’s all.

For the CLI, you should download the -all artifact from Maven Central: neo4j-migrations-formats-markdown-2.0.3-all.jar. This will work only with the JVM based CLI version, which is available here.

A full example looks like this:

curl -LO https://github.com/michael-simons/neo4j-migrations/releases/download/2.0.3/neo4j-migrations-2.0.3.zip
curl -LO https://repo.maven.apache.org/maven2/eu/michael-simons/neo4j/neo4j-migrations-formats-markdown/2.0.3/neo4j-migrations-formats-markdown-2.0.3-all.jar
unzip neo4j-migrations-2.0.3.zip
cd neo4j-migrations-2.0.3
CLASSPATH_PREFIX=../neo4j-migrations-formats-markdown-2.0.3-all.jar \
  bin/neo4j-migrations --password secret \
  --location file:///path/to/neo4j/markdown-migrations \
  info

Which will result in:

neo4j@localhost:7687 (Neo4j/4.4.8)
Database: neo4j

+---------+---------------------+--------+---------+--------------------------------------------+
| Version | Description         | Type   | State   | Source                                     |
+---------+---------------------+--------+---------+--------------------------------------------+
| 1.0     | initial data        | CYPHER | PENDING | initial_schema_draft.md#V1.0__initial_data |
| 1.2     | more data           | CYPHER | PENDING | initial_schema_draft.md#V1.2__more_data    |
| 1.3     | something different | CYPHER | PENDING | more_content.md#V1.3__something_different  |
+---------+---------------------+--------+---------+--------------------------------------------+

(Note: empty columns have been omitted for brevity.)