Karthick Sankarachary http://github.com/karthicks
The gremlin-objects module defines a library that puts an object-oriented spin on the Gremlin property graph.
It aims to make it much easier to specify business domain specific languages around Gremlin, without any loss of expressive power.
While it targets the Gremlin-Java variant, the concept itself is language-independent.
Every element in the property graph, whether it be a vertex (property) or an edge, is made up of properties.
Each such property is a String key and an arbitrary Java value.
It seems only fitting, then, to represent that property as a strongly-typed Java field.
The class in which that field is defined then becomes the vertex (property) or edge that the property describes.
A gremlin object model such as this would need abstractions to query and update the graph in terms of those objects.
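As a rough illustration of the idea that fields stand in for properties, the following sketch reads an object's instance fields into key/value pairs through reflection. It is not the library's implementation; the SimplePerson class and the propertiesOf helper are hypothetical names used only for this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model class for illustration; it is not part of gremlin-objects.
class SimplePerson {
    private String name = "marko";
    private int age = 29;
    // Static members describe the class, not an element, so they are skipped below.
    static final String KIND = "person";
}

final class PropertyMappingSketch {
    // Collects each instance field as a (String key, Java value) pair, sorted by key.
    static Map<String, Object> propertiesOf(Object element) {
        Map<String, Object> properties = new TreeMap<>();
        for (Field field : element.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            try {
                field.setAccessible(true);
                properties.put(field.getName(), field.get(element));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + field.getName(), e);
            }
        }
        return properties;
    }

    public static void main(String[] args) {
        System.out.println(propertiesOf(new SimplePerson()));
    }
}
```

A real mapper would also need to handle inheritance, annotations and nested elements, which this sketch leaves out.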
To get the library that facilitates all of this, add this dependency to your pom.xml:
<dependency>
<groupId>com.github.karthicks</groupId>
<artifactId>gremlin-objects</artifactId>
<version>3.3.1-RC1</version>
</dependency>
A reference use case of this library is available in the following tinkergraph-test module:
<dependency>
<groupId>com.github.karthicks</groupId>
<artifactId>tinkergraph-test</artifactId>
<version>3.3.1-RC1</version>
</dependency>
In this section, we go over how gremlin elements may be modeled, and how those models may be queried and stored.
Let’s consider the example of the person vertex, taken from the "modern" and "the crew" graphs defined in the TinkerFactory.
In our object world, it would be defined as a Person class that extends Vertex.
By default, the vertex’s label matches its simple class name, so we use the @Alias annotation to un-capitalize it.
The person’s name and age properties become primitive fields in the class.
The @PrimaryKey and @OrderingKey annotations on them not only indicate that they are mandatory,
but also allow the person to be found easily through the HasKeys.of(person) SubTraversal.
Think of the SubTraversal as a reusable function that takes a GraphTraversal, performs a few steps on it, and returns it back (to allow for chaining).
The KnowsPeople field in this class is an example of an in-line SubTraversal, albeit a more strongly typed version called ToVertex, which indicates that it ends up selecting vertices.
Note that these traversal functions are not stored in the graph.
@Data
@Alias(label = "person")
public class Person extends Vertex {
public static ToVertex KnowsPeople = traversal -> traversal
.out(Label.of(Knows.class))
.hasLabel(Label.of(Person.class));
@PrimaryKey
private String name;
@OrderingKey
private int age;
private Set<String> titles;
private List<Location> locations;
}
Next, we look at its titles field, which is defined to be a Set.
As you might expect, the cardinality of the underlying property becomes set.
Similarly, the locations field takes on the list cardinality.
Further, each element in the locations list has its own meta-properties, and ergo deserves a Location class of its own.
@Data
@Alias(label = "location")
public class Location extends Element {
@OrderingKey
@PropertyValue
private String name;
@OrderingKey
private Instant startTime;
private Instant endTime;
}
Note: The value of the location is stored in name, due to the placement of the @PropertyValue annotation.
Every other field in the Location class becomes a meta-property of the location.
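The convention that a Set field yields set cardinality and a List field yields list cardinality can be sketched with plain reflection. This is only an illustration under stated assumptions: PersonShape and cardinalityOf are hypothetical names, the enum merely mirrors the names of TinkerPop's cardinality values, and falling back to single for every other field type is an assumption rather than something the library is confirmed to do.

```java
import java.util.List;
import java.util.Set;

final class CardinalitySketch {
    // Mirrors the names of TinkerPop's VertexProperty.Cardinality values.
    enum Cardinality { single, list, set }

    // Hypothetical model with the same field shapes as the Person class above.
    static class PersonShape {
        String name;
        Set<String> titles;
        List<String> locations;
    }

    // Derives the cardinality from the declared type of the named field.
    static Cardinality cardinalityOf(Class<?> model, String fieldName) {
        Class<?> fieldType;
        try {
            fieldType = model.getDeclaredField(fieldName).getType();
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException("No such field: " + fieldName, e);
        }
        if (Set.class.isAssignableFrom(fieldType)) {
            return Cardinality.set;
        }
        if (List.class.isAssignableFrom(fieldType)) {
            return Cardinality.list;
        }
        // Assumption: anything that is not a Set or List is treated as single.
        return Cardinality.single;
    }

    public static void main(String[] args) {
        for (String field : new String[] {"name", "titles", "locations"}) {
            System.out.println(field + " -> " + cardinalityOf(PersonShape.class, field));
        }
    }
}
```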
An edge is defined much like the vertex, except it extends the Edge class.
By default, an edge’s label is its un-capitalized simple class name, and hence no @Alias is needed:
@Data
public class Knows extends Edge {
private Double weight;
private Instant since;
}
The Graph interface lets you update the graph using Vertex or Edge objects.
You can get it via dependency injection, assuming you have an Object-qualified provider for GraphTraversalSource:
@Inject @Object
private Graph graph;
Or, the good old-fashioned way, using the GraphFactory:
// This gets you the factory for TinkerGraph.
private GraphFactory graphFactory =
    GraphFactory.of(TinkerGraph.open().traversal());
private Graph graph = graphFactory.graph();
Now that we know how to obtain a Graph instance, let’s see how to change it using Java objects.
Here, we create software vertices for tinkergraph and gremlin, and add a traverses edge from gremlin to tinkergraph.
graph
    .addVertex(Software.of("tinkergraph")).as("tinkergraph")
    .addVertex(Software.of("gremlin")).as("gremlin")
    .addEdge(Traverses.of(), "tinkergraph");
Below, a person vertex containing a list of locations is added, along with three outgoing edges.
graph
    .addVertex(
        Person.of("marko",
            Location.of("san diego", 1997, 2001),
            Location.of("santa cruz", 2001, 2004),
            Location.of("brussels", 2004, 2005),
            Location.of("santa fe", 2005))).as("marko")
    .addEdge(Develops.of(2010), "tinkergraph")
    .addEdge(Uses.of(Proficient), "gremlin")
    .addEdge(Uses.of(Expert), "tinkergraph");
To see how the modern and the crew reference graphs may be created using the object Graph interface, go here.
Tip: Since the object being added may already exist in the graph, we provide various options to resolve "merge conflicts", such as MERGE, REPLACE, CREATE, IGNORE and INSERT.
There are two ways to get a handle to the Query interface. You can inject it like so:
@Inject @Object
private Query query;
Otherwise, you can create it using the GraphFactory like so:
private GraphFactory graphFactory = GraphFactory.of(TinkerGraph.open().traversal());
private Query query = graphFactory.query();
Next, let’s see how to use the Query interface.
The following snippet queries the graph by chaining two SubTraversals (a function denoting a partial traversal), and parses the result into a list of Person vertices.
List<Person> friends = query
    .by(HasKeys.of(modern.marko), Person.KnowsPeople)
    .list(Person.class);
Below, we query by an AnyTraversal (a function on the GraphTraversalSource), and get a single Person back.
Person marko = Person.of("marko");
Person actual = query
    .by(g -> g.V().hasLabel(marko.label()).has("name", marko.name()))
    .one(Person.class);
The result type may be a primitive too, which is handled as shown below.
long count = query
    .by(HasKeys.of(crew.marko), Count.of())
    .one(Long.class);
Last, we show a traversal involving select steps, which requires special handling as it may return a map.
Selections selections = query
    .by(g -> g.V().as("a").
        properties("locations").as("b").
        hasNot("endTime").as("c").
        order().by("startTime").
        select("a", "b", "c").by("name").by(T.value).by("startTime").dedup())
    .as("a", String.class)
    .as("b", String.class)
    .as("c", Instant.class)
    .select();
To see more examples showcasing how the object Query interface may be used, go here.
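The way a query chains several SubTraversals can be pictured as plain function composition. The sketch below does not use TinkerPop or gremlin-objects: Steps is a stand-in that merely records step names, and SubStep, by and the two constants are hypothetical names chosen for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class SubTraversalSketch {
    // Stand-in for a GraphTraversal: it only records the names of the steps applied.
    static final class Steps {
        final List<String> applied = new ArrayList<>();

        Steps step(String name) {
            applied.add(name);
            return this; // returned to allow chaining
        }
    }

    // Hypothetical analogue of SubTraversal: takes a traversal, adds steps, returns it.
    interface SubStep extends UnaryOperator<Steps> {}

    static final SubStep HAS_NAME_MARKO = t -> t.step("has(name, marko)");
    static final SubStep KNOWS_PEOPLE = t -> t.step("out(knows)").step("hasLabel(person)");

    // Starts from all vertices and applies each partial traversal in order.
    static Steps by(SubStep... subSteps) {
        Steps traversal = new Steps().step("V()");
        for (SubStep subStep : subSteps) {
            traversal = subStep.apply(traversal);
        }
        return traversal;
    }

    public static void main(String[] args) {
        System.out.println(by(HAS_NAME_MARKO, KNOWS_PEOPLE).applied);
    }
}
```

Because each partial traversal returns what it was given, the same function can be reused across many queries and combined in any order.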
In this section, we talk about how the gremlin-objects library can be customized for a graph system provider.
A provider that wishes to plug into gremlin-objects through dependency injection will need to provide a GraphTraversalSource of its choice, through the Object qualifier.
Users who don’t use dependency injection may manually pass the GraphTraversalSource to the GraphFactory.
Typically, gremlin property values are Java primitives.
Sometimes, a provider treats a custom type as a primitive.
For instance, DataStax lets you define property keys of the primitive geometric type Point.
Such types can be registered using the Primitives#registerPrimitiveClass methods.
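To make the idea of registering a custom primitive concrete, here is a minimal sketch of such a registry. It is an assumption-laden illustration, not the library's Primitives class: the registry class, the isPrimitive check, the Point type and the method signature are all hypothetical, and only the method name echoes the one mentioned above.

```java
import java.util.HashSet;
import java.util.Set;

final class PrimitiveRegistrySketch {
    // Hypothetical provider-specific value type that should be treated as a primitive.
    static final class Point {
        final double x;
        final double y;

        Point(double x, double y) {
            this.x = x;
            this.y = y;
        }
    }

    private final Set<Class<?>> primitiveClasses = new HashSet<>();

    PrimitiveRegistrySketch() {
        // A few built-in types; a real registry would cover many more.
        primitiveClasses.add(String.class);
        primitiveClasses.add(Integer.class);
        primitiveClasses.add(Long.class);
        primitiveClasses.add(Double.class);
    }

    // Assumed signature: marks the given class as a primitive property value type.
    void registerPrimitiveClass(Class<?> type) {
        primitiveClasses.add(type);
    }

    boolean isPrimitive(Object value) {
        return value != null && primitiveClasses.contains(value.getClass());
    }

    public static void main(String[] args) {
        PrimitiveRegistrySketch registry = new PrimitiveRegistrySketch();
        Point point = new Point(1.0, 2.0);
        System.out.println(registry.isPrimitive(point)); // not yet registered
        registry.registerPrimitiveClass(Point.class);
        System.out.println(registry.isPrimitive(point)); // registered now
    }
}
```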
When a GraphTraversal is completed, it usually returns (a list of) gremlin Element(s).
However, when some providers execute a traversal, the result comprises custom element types.
For instance, when DataStax executes a graph query, it returns a result set made up of GraphNode(s), a proprietary element type.
We give such providers a way to tell us how to parse such custom elements using the Parsers#registerElementParser method.
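One way to picture an element parser registry is a map from a provider's result type to a function that converts it into a common shape. The sketch below is hypothetical throughout: ProviderNode stands in for a proprietary type such as a GraphNode, the property map stands in for a parsed element, and the registerElementParser signature is assumed rather than taken from the library.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class ElementParserSketch {
    // Hypothetical provider-specific result type.
    static final class ProviderNode {
        final String label;
        final String name;

        ProviderNode(String label, String name) {
            this.label = label;
            this.name = name;
        }
    }

    private final Map<Class<?>, Function<Object, Map<String, Object>>> parsers = new HashMap<>();

    // Assumed signature: associates a custom element type with its parser.
    <T> void registerElementParser(Class<T> type, Function<T, Map<String, Object>> parser) {
        parsers.put(type, raw -> parser.apply(type.cast(raw)));
    }

    // Looks up the parser by the runtime class of the raw result and applies it.
    Map<String, Object> parse(Object raw) {
        Function<Object, Map<String, Object>> parser = parsers.get(raw.getClass());
        if (parser == null) {
            throw new IllegalArgumentException("No parser registered for " + raw.getClass().getName());
        }
        return parser.apply(raw);
    }

    // Example parser that flattens a ProviderNode into a property map.
    static Map<String, Object> toProperties(ProviderNode node) {
        Map<String, Object> properties = new HashMap<>();
        properties.put("label", node.label);
        properties.put("name", node.name);
        return properties;
    }

    public static void main(String[] args) {
        ElementParserSketch sketch = new ElementParserSketch();
        sketch.registerElementParser(ProviderNode.class, ElementParserSketch::toProperties);
        System.out.println(sketch.parse(new ProviderNode("person", "marko")));
    }
}
```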
While similar OGM libraries exist, this one differs from them in a few key ways. Let’s consider the alternatives:
The gremlin-core module defines a GremlinDsl annotation that lets you define custom traversals by extending the GraphTraversal and GraphTraversalSource.
However, it requires some familiarity with gremlin-core internals.
Peopod represents elements as annotated interfaces or abstract classes. While it generates boilerplate for traversals to adjacent vertices, it doesn’t let you co-locate arbitrary traversals. This library is less intrusive and more flexible.
An older version of TinkerPop allowed you to define custom steps using Closures, not unlike the AnyTraversal and SubTraversal functions.
However, closures aren’t as developer-friendly as the functional interfaces provided here.
Moreover, that approach doesn’t allow for co-locating the traversal logic with the element model, as we do here.
So far, we have the gremlin-objects library, and a tinkergraph-test reference use case for it.
Here, we list a few directions in which we see the library evolving:
The concept of lifting the property graph into objects is language-independent.
To quote the TinkerPop docs, "with JSR-223, any language compiler written for the JVM can directly access the JVM and any of its libraries", and that would include gremlin-objects.
For GLVs not written for the JVM, the library can be ported over as long as the host language supports basic reflection.
Case in point, the Gremlin-Python variant could achieve the object mapping through the dir, getattr and setattr built-in functions.
In reality, it is fairly easy for a provider to plug into gremlin-objects simply by supplying a GraphTraversalSource of their choosing.
The ability to register custom primitive types and traversal result parsers allows for further customization.
Since Neo4j already has its own Neo4jGraph, it’s a good candidate to become the next test case.
Some providers use GraphFrames to execute bulk operations and graph algorithms on top of TinkerPop.
Assuming they can work with DataFrames, one could build a GraphTraversalSource
that translates the object Graph and Query operations into DataFrame tables and adapts them to the provider’s GraphFrame.
The AnyTraversal and SubTraversal interfaces extend Formattable so that the steps defined in their bodies can be revealed.
Let’s say that we stored the bytecode of these types of functional fields as a hidden property in the element.
That could potentially allow us to execute user-defined traversals using, say, a traversal.call('function-name') step.
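How a functional interface can extend java.util.Formattable and still be usable as a lambda can be sketched as follows. The trick shown here is to supply formatTo as a default method that applies the function to a recording stand-in and prints what was recorded. Whether the library does it this way is an assumption; Steps, SubStep and KNOWS_PEOPLE are hypothetical names and no TinkerPop types are involved.

```java
import java.util.ArrayList;
import java.util.Formattable;
import java.util.Formatter;
import java.util.List;
import java.util.function.UnaryOperator;

final class FormattableTraversalSketch {
    // Stand-in for a GraphTraversal: it only records the names of the steps applied.
    static final class Steps {
        final List<String> applied = new ArrayList<>();

        Steps step(String name) {
            applied.add(name);
            return this;
        }
    }

    // Still a functional interface: apply is the only abstract method left,
    // because formatTo is given a default implementation.
    interface SubStep extends UnaryOperator<Steps>, Formattable {
        @Override
        default void formatTo(Formatter formatter, int flags, int width, int precision) {
            // Run the body against a fresh recorder and print the recorded steps.
            formatter.format("%s", String.join(".", apply(new Steps()).applied));
        }
    }

    static final SubStep KNOWS_PEOPLE = t -> t.step("out(knows)").step("hasLabel(person)");

    public static void main(String[] args) {
        // %s invokes formatTo because the argument implements Formattable.
        System.out.println(String.format("%s", KNOWS_PEOPLE));
    }
}
```

The formatted text is derived by actually running the function body, so it reflects the steps the lambda would add rather than its source code.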