Welcome to the Elasticsearch Beyonder project.
Historically, this project comes from the spring-elasticsearch project.
The goal of this project is to provide a simple Java library which helps create indices, mappings, templates and more when you start your application.
elasticsearch-beyonder | elasticsearch | Release date |
---|---|---|
9.0-SNAPSHOT | 8.x | |
8.17 | 8.x | 2025-03-06 |
7.16 | 7.x | 2022-01-13 |
7.15 | 7.x | 2021-10-14 |
7.13.2 | 7.x | 2021-07-22 |
7.13.1 | 7.x | 2021-06-21 |
7.13 | 7.x | 2021-06-03 |
7.5 | 7.x | 2020-01-15 |
7.0 | 7.0 -> 7.x | 2019-04-04 |
6.5 | 6.5 -> 6.x | 2019-01-04 |
6.3 | 6.3 -> 6.4 | 2018-07-21 |
6.0 | 6.0 -> 6.2 | 2018-02-05 |
5.1 | 5.x, 6.x | 2017-07-12 |
5.0 | 5.x, 6.x | 2017-07-11 |
2.1.0 | 2.0, 2.1 | 2015-11-25 |
2.0.0 | 2.0 | 2015-10-24 |
1.5.0 | 1.5 | 2015-03-27 |
1.4.1 | 1.4 | 2015-03-02 |
1.4.0 | 1.4 | 2015-02-27 |
- For 9.x elasticsearch versions, you are reading the latest documentation.
- For 8.x elasticsearch versions, look at es-8.x branch.
- For 7.x elasticsearch versions, look at es-7.x branch.
- For 6.x elasticsearch versions, look at es-6.x branch.
- For 5.x elasticsearch versions, look at es-5.x branch.
- For 2.x elasticsearch versions, look at es-2.1 branch.
- Update project to Elasticsearch 9.0.0-SNAPSHOT.
- Update required JVM to Java 17
- Update project to Elasticsearch 8.17.2.
- Remove the deprecated Transport Client.
- The `_pipeline` dir is not supported anymore. Use the `_pipelines` dir.
- The `_template` and `_templates` dirs are not supported anymore. Use the `_index_templates` and `_component_templates` dirs.
- The method `start(RestClient client, String root, boolean merge, boolean force)` is now `start(RestClient client, String root, boolean force)`.
- Support for sample datasets has been added. If we detect a directory named `_data` in the classpath, we will try to load sample data from it. We support both `ndjson` files, which will be loaded using the Bulk API, and `json` files, which will be loaded using the Index API.
- Update Log4J (optional) dependency to 2.17.1.
- Add support for Index Lifecycles.
- Added back support for Java 8.
- The `_pipeline` dir has been deprecated by the `_pipelines` dir.
- The `_template` dir has been deprecated by the `_templates` dir.
- The `force` parameter is not applied anymore to pipelines, so pipelines are always updated.
- The `force` parameter is not applied anymore to templates, component templates and index templates, so they are always updated.
- The method `start(RestClient client, String root, boolean merge, boolean force)` is now deprecated as the `merge` parameter is not used anymore. Use the `start(RestClient client, String root, boolean force)` method instead (see the migration sketch after this list).
- Support for the aliases API has been added.
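For existing callers, the migration is mechanical: drop the `merge` argument. A minimal sketch (the root directory name and the `force` value are illustrative):

```java
// Before (deprecated): the merge parameter is ignored
// ElasticsearchBeyonder.start(client, "elasticsearch", true, false);

// After: same behavior, without the merge parameter
ElasticsearchBeyonder.start(client, "elasticsearch", false);
```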
Import elasticsearch-beyonder in your project `pom.xml` file:
<dependency>
<groupId>fr.pilato.elasticsearch</groupId>
<artifactId>elasticsearch-beyonder</artifactId>
<version>9.0-SNAPSHOT</version>
</dependency>
You also need to import the elasticsearch client you want to use by adding one of the following dependencies to your `pom.xml` file.
For example, here is how to import the REST Client to your project:
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-client</artifactId>
<version>9.0.0-SNAPSHOT</version>
</dependency>
For example, here is how to import the Transport Client to your project (deprecated):
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>transport</artifactId>
<version>9.0.0-SNAPSHOT</version>
</dependency>
If you are using a SNAPSHOT version of elasticsearch-beyonder, you need to add the Sonatype repository to your pom.xml
file:
<repositories>
<repository>
<name>Central Portal Snapshots</name>
<id>central-portal-snapshots</id>
<url>https://central.sonatype.com/repository/maven-snapshots/</url>
<releases>
<enabled>false</enabled>
</releases>
<snapshots>
<enabled>true</enabled>
</snapshots>
</repository>
</repositories>
Elasticsearch provides a Low Level Rest Client. You can create it like this:
RestClient client = RestClient.builder(HttpHost.create("http://127.0.0.1:9200")).build();
Once you have the client, you can use it to manage automatic creation of index, mappings, templates and aliases. To activate those features, you only need to pass to Beyonder the Rest Client instance:
ElasticsearchBeyonder.start(client);
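Putting these pieces together, a minimal bootstrap could look like the following sketch. The URL is illustrative, and the `ElasticsearchBeyonder` class is assumed to live in the `fr.pilato.elasticsearch.tools` package:

```java
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import fr.pilato.elasticsearch.tools.ElasticsearchBeyonder;

public class BeyonderBootstrap {
    public static void main(String[] args) throws Exception {
        // Build the Low Level Rest Client pointing at a local cluster (illustrative URL)
        try (RestClient client = RestClient.builder(HttpHost.create("http://127.0.0.1:9200")).build()) {
            // Let Beyonder create indices, mappings, templates, aliases... from classpath resources
            ElasticsearchBeyonder.start(client);
        }
    }
}
```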
By default, Beyonder will try to locate resources in the `elasticsearch` directory within your classpath.
We will use this default value for the rest of the documentation.
But you can change this using:
ElasticsearchBeyonder.start(client, "models/myelasticsearch");
In that case, Beyonder will search for resources under `models/myelasticsearch`.
There is also a more complete version of the `start` method:
ElasticsearchBeyonder.start(client, "models/myelasticsearch", true);
This last parameter is known as `force`. It removes any existing index managed by Beyonder.
This is very useful for integration testing, but very dangerous in production.
When your cluster is secured, you can for example use Basic Authentication:
CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials("elastic", "changeme"));
RestClient client = RestClient.builder(HttpHost.create("http://127.0.0.1:9200"))
.setHttpClientConfigCallback(hcb -> hcb.setDefaultCredentialsProvider(credentialsProvider)).build();
ElasticsearchBeyonder.start(client);
When Beyonder starts, it tries to find index names and settings in the classpath.
If you add to your classpath a directory named `elasticsearch/twitter`, the `twitter` index will be automatically created
at startup if it does not exist yet.
If you add to your classpath a file named `elasticsearch/twitter/_settings.json`, it will be automatically applied to define
the settings for your `twitter` index.
For example, create the following file `src/main/resources/elasticsearch/twitter/_settings.json` in your project:
{
"settings": {
"number_of_shards": 3,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"message": { "type": "text" },
"foo": { "type": "text" }
}
}
}
By default, Beyonder will not overwrite an index if it already exists.
This can be overridden by setting `force` to `true` in the expanded factory method `ElasticsearchBeyonder.start()`.
You can also provide a file named `_update_settings.json` to update your index settings,
and a file named `_update_mapping.json` if you want to update an existing mapping.
Note that Elasticsearch does not allow updating all settings and mappings.
You can for example add a new field or change the `search_analyzer` for a given field, but you cannot modify
the field `type`.
Building on the previous example, you can create an `elasticsearch/twitter/_update_settings.json` file to update the
number of replicas:
{
"number_of_replicas" : 1
}
And you can create `elasticsearch/twitter/_update_mapping.json`:
{
"properties": {
"message" : {"type" : "text", "search_analyzer": "keyword" },
"bar" : { "type" : "text" }
}
}
This will change the `search_analyzer` for the `message` field and add a new field named `bar`.
All other existing fields (like `foo` in the previous example) won't be changed.
If you would like to use date math expressions for the index name,
you can use the URI encoded version of the expression.
For example, the following directory structure will end up creating an index named `<my-index-{now/d}>`,
which Elasticsearch resolves to the current date:
.
└── %3Cmy-index-%7Bnow%2Fd%7D%3E
├── _data
│ └── bulk.ndjson
└── _settings.json
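If you need to generate such a directory name, the standard URL encoder produces the expected form. A small illustrative sketch:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class EncodeIndexName {
    public static void main(String[] args) {
        // Encode the date math expression so it can be used as a directory name
        String encoded = URLEncoder.encode("<my-index-{now/d}>", StandardCharsets.UTF_8);
        // Prints %3Cmy-index-%7Bnow%2Fd%7D%3E
        System.out.println(encoded);
    }
}
```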
The aliases feature is helpful to add or remove an alias on a given index. You could also use index templates to do that
automatically when the index is created, but you can also define a file `elasticsearch/_aliases.json`:
{
"actions" : [
{ "remove": { "index": "test_1", "alias": "test" } },
{ "add": { "index": "test_2", "alias": "test" } }
]
}
When Beyonder starts, it will automatically send the content to the Aliases API.
Since version 7.13, the new index template management API is supported. It allows defining both component templates and index templates.
To define component templates, you can create json files within the `elasticsearch/_component_templates/` dir.
Let's first create an `elasticsearch/_component_templates/component1.json`:
{
"template": {
"mappings": {
"properties": {
"@timestamp": {
"type": "date"
}
}
}
}
}
Then create a second component template as `elasticsearch/_component_templates/component2.json`:
{
"template": {
"mappings": {
"runtime": {
"day_of_week": {
"type": "keyword",
"script": {
"source": "emit(doc['@timestamp'].value.dayOfWeekEnum.getDisplayName(TextStyle.FULL, Locale.ROOT))"
}
}
}
}
}
}
When Beyonder starts, it will create two component templates in elasticsearch, named respectively `component1` and `component2`.
To define index templates, you can create json files within the `elasticsearch/_index_templates/` dir.
Let's create an `elasticsearch/_index_templates/template_1.json`:
{
"index_patterns": ["te*", "bar*"],
"template": {
"settings": {
"number_of_shards": 1
},
"mappings": {
"_source": {
"enabled": true
},
"properties": {
"host_name": {
"type": "keyword"
},
"created_at": {
"type": "date",
"format": "EEE MMM dd HH:mm:ss Z yyyy"
}
}
},
"aliases": {
"mydata": { }
}
},
"priority": 500,
"composed_of": ["component1", "component2"],
"version": 3,
"_meta": {
"description": "my custom"
}
}
When Beyonder starts, it will create the index template named `template_1` in elasticsearch.
Note that this index template references two component templates which must either be available before Beyonder starts
or be defined within the `_component_templates` dir as we saw just before.
A pipeline is a definition of a series of processors that are executed, in the order they are declared, while documents are being indexed. Please note that this feature is only supported when you use the REST client, not the Transport client.
For example, to set one field's value based on another field by using a Set processor, you can add a file named `elasticsearch/_pipelines/set_field_processor.json` in your project:
{
"description" : "Twitter pipeline",
"processors" : [
{
"set" : {
"field": "copy",
"value": "{{otherField}}"
}
}
]
}
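Once Beyonder has created the pipeline (its name is presumably derived from the filename, here `set_field_processor`, following the same convention as templates), you can reference it when indexing documents. A minimal sketch using the Low Level Rest Client, with an illustrative index name:

```java
import org.elasticsearch.client.Request;
import org.elasticsearch.client.RestClient;

public class IndexThroughPipeline {
    public static void index(RestClient client) throws Exception {
        // Index a document into the twitter index, applying the ingest pipeline at index time
        Request request = new Request("POST", "/twitter/_doc");
        request.addParameter("pipeline", "set_field_processor");
        request.setJsonEntity("{\"otherField\": \"a value to copy\"}");
        client.performRequest(request);
    }
}
```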
To define an index lifecycle, you can create json files within the `elasticsearch/_index_lifecycles/` dir.
Let's create an `elasticsearch/_index_lifecycles/my_lifecycle.json`:
{
"policy": {
"phases": {
"warm": {
"min_age": "10d",
"actions": {
"forcemerge": {
"max_num_segments": 1
}
}
},
"delete": {
"min_age": "30d",
"actions": {
"delete": {}
}
}
}
}
}
When Beyonder starts, it will create the index lifecycle policy named `my_lifecycle` in elasticsearch.
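For the policy to have any effect, an index needs to reference it in its settings. For example, a hypothetical `elasticsearch/logs/_settings.json` could point at the policy using the standard `index.lifecycle.name` setting:

```json
{
  "settings": {
    "index.lifecycle.name": "my_lifecycle"
  }
}
```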
If you want to load sample data into your cluster, you can create a directory named `_data` in your classpath.
We support both `ndjson` files, which will be loaded using the Bulk API (recommended), and `json` files, which will be loaded
using the Index API (slower).
If the `_data` directory is created within an index directory, the data will be loaded only for this index, meaning
that you don't need to define the index name in the bulk headers.
For example, let's assume that you have the following files under your `elasticsearch` directory:
.
├── _data
│ ├── bulk-001.ndjson
│ └── bulk-002.ndjson
├── person
│ ├── _data
│ │ ├── doc001.json
│ │ ├── doc002.json
│ │ ├── doc003.json
│ │ └── doc004.json
│ └── _index_templates
│ └── person.json
├── test_1
│ ├── _data
│ │ ├── bulk-001.ndjson
│ │ └── bulk-002.ndjson
│ └── _settings.json
├── test_2
│ ├── _data
│ │ └── abcd.ndjson
│ └── _settings.json
└── twitter
└── _settings.json
The _data/bulk-001.ndjson
file contains:
{ "index" : { "_index" : "twitter" } }
{ "message" : "message 1" }
{ "index" : { "_index" : "twitter" } }
{ "message" : "message 2" }
{ "index" : { "_index" : "twitter" } }
{ "message" : "message 3" }
{ "index" : { "_index" : "twitter" } }
{ "message" : "message 4" }
{ "index" : { "_index" : "twitter" } }
{ "message" : "message 5" }
The person/_data/doc001.json
file contains something like:
{
"name": "John Doe"
}
Note that it contains no header and can be pretty-printed, unlike the `ndjson` format.
JSON documents can only be added within a given index directory, not at the root level of the `_data` directory.
The test_1/_data/bulk-001.ndjson
file contains:
{ "index" : { } }
{ "message" : "message 1" }
{ "index" : { } }
{ "message" : "message 2" }
{ "index" : { } }
{ "message" : "message 3" }
{ "index" : { } }
{ "message" : "message 4" }
{ "index" : { } }
{ "message" : "message 5" }
And the test_2/_data/abcd.ndjson
file contains:
{ "index" : { } }
{ "message" : "message 1" }
{ "index" : { } }
{ "message" : "message 2" }
{ "index" : { } }
{ "message" : "message 3" }
{ "index" : { } }
{ "message" : "message 4" }
{ "index" : { } }
{ "message" : "message 5" }
When Beyonder starts, it will:
- Create the `person` index with the mapping defined in `elasticsearch/person/_index_templates/person.json`.
- Create the `test_1` index with the settings defined in `elasticsearch/test_1/_settings.json`.
- Create the `test_2` index with the settings defined in `elasticsearch/test_2/_settings.json`.
- Create the `twitter` index with the settings defined in `elasticsearch/twitter/_settings.json`.
- Load the data from `elasticsearch/test_1/_data/bulk-001.ndjson` and then `elasticsearch/test_1/_data/bulk-002.ndjson` into the `test_1` index.
- Load the data from `elasticsearch/test_2/_data/abcd.ndjson` into the `test_2` index.
- Load the data from `elasticsearch/_data/bulk-001.ndjson` and `elasticsearch/_data/bulk-002.ndjson` into the indices specified within the bulk files.
- Load the data from the `elasticsearch/person/_data/doc*.json` files into the `person` index.
Note that files are sorted by name before being loaded, which means that a file `bulk-001.ndjson` will be loaded before
`bulk-002.ndjson`.
If the index already exists when Beyonder starts, the data won't be loaded unless you are using the `force` option.
This does not apply to the root `_data` directory, whose data is always loaded at every startup.
This project comes with unit tests and integration tests.
You can disable running them by using the `skipTests` option as follows:
mvn clean install -DskipTests
If you want to skip only the unit tests, use the `skipUnitTests` option:
mvn clean install -DskipUnitTests
Integration tests launch a Docker instance using TestContainers, so you need to have Docker installed.
If you want to skip the integration tests, use the `skipIntegTests` option:
mvn clean install -DskipIntegTests
If you wish to run integration tests against a cluster which is already running externally, you can configure the following settings to locate your cluster:
setting | default |
---|---|
tests.cluster | https://127.0.0.1:9200 |
tests.cluster.user | elastic |
tests.cluster.pass | changeme |
For example:
mvn clean install -Dtests.cluster=https://127.0.0.1:9200
If you want to run your tests against an Elastic Cloud instance, you can use something like:
mvn clean install \
-Dtests.cluster=https://CLUSTERID.us-central1.gcp.cloud.es.io:443 \
-Dtests.cluster.user=elastic \
-Dtests.cluster.pass=GENERATEDPASSWORD
To release the project, you simply need to run:
./release.sh
If you want to just test the release without deploying anything, run:
DRY_RUN=1 ./release.sh
I was looking for a cool name in the list of Marvel characters and found that Beyonder was actually a very powerful character.
This project gives some features beyond elasticsearch itself. :)
This software is licensed under the Apache 2 license, quoted below.
Copyright 2011-2025 David Pilato
Licensed under the Apache License, Version 2.0 (the "License"); you may not
use this file except in compliance with the License. You may obtain a copy of
the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations under
the License.