Elasticsearch Data Seeding

This guide walks you through seeding an Elasticsearch cluster deployed in your Velocity Environment with data dumped from an existing Elasticsearch cluster using the Elasticsearch Dump utility (elasticdump).
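
For reference, a dump file of this kind could be produced along the following lines. This is a minimal sketch and not part of the example repo: the dump_index helper, the source URL and the index name are hypothetical placeholders, and it assumes elasticdump is installed locally and that the source cluster is reachable without authentication.

```shell
# Hypothetical helper: dump the documents of one index to a local file.
# Arguments: <source cluster URL> <index name> <output file>
dump_index() {
  source_url="$1"
  index="$2"
  out_file="$3"
  # --type=data exports the documents themselves (not mappings or settings)
  elasticdump --input="${source_url}/${index}" --output="${out_file}" --type=data
}

# Example call (placeholder values, adjust to your cluster):
# dump_index http://localhost:9200 my_index index_data.json
```

The resulting file plays the same role as the index_data.json sample used in the steps below.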

1. Clone the example repo

First, clone the example repository, which contains the sample data to be seeded and a Velocity-specific Kubernetes Job that seeds the Elasticsearch cluster.
git clone https://github.com/techvelocity/velocity-blueprints.git
cd velocity-blueprints/examples/aws-elasticsearch-data-seeding

2. Upload your data snapshot

Next, you'll need to upload the provided sample data to Velocity by running:
veloctl snapshot put --target es-seeding-example -f index_data.json --default
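
Before uploading, it can be worth sanity-checking the file. The sketch below assumes the dump contains one JSON object per line, which is how elasticdump typically writes --type=data output. The check_dump_file helper is hypothetical and only does a rough shape check, not full JSON validation.

```shell
# Hypothetical helper: rough sanity check for an elasticdump data file.
# Assumes one JSON object per line (an assumption about the dump format).
check_dump_file() {
  file="$1"
  if [ ! -s "$file" ]; then
    echo "missing or empty file: $file" >&2
    return 1
  fi
  # Count non-empty lines that do not start with '{' and end with '}'.
  bad=$(grep -v '^[[:space:]]*$' "$file" | grep -cv '^[{].*[}][[:space:]]*$' || true)
  if [ "$bad" -ne 0 ]; then
    echo "$bad line(s) in $file do not look like JSON objects" >&2
    return 1
  fi
  echo "$file looks like line-delimited JSON"
}

# Example call:
# check_dump_file index_data.json
```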

3. Deploy and seed Elasticsearch in Velocity

Finally, you can deploy and seed an Elasticsearch cluster in Velocity by running the following:
helm repo add elastic https://helm.elastic.co
{ helm template elastic/elasticsearch --set replicas=1 --set tests.enabled=false \
&& cat elasticsearch-seed-job.yaml; } | veloctl env create -f -

4. View the result in the Velocity UI

5. Update the provided examples for your data

To deploy and seed an Elasticsearch cluster in Velocity with your own data, you'll need to make two small changes. First, update the veloctl snapshot put command with the name of your local data file:
veloctl snapshot put --target es-seeding-example -f <your_data_file> --default
Next, set the ELASTIC_INDEX_NAME environment variable in the elasticsearch-seed-job.yaml file to the name of your Elasticsearch index:
---
apiVersion: batch/v1
kind: Job
metadata:
  name: "elasticsearch-seeding-job"
  annotations:
    velocity.tech.v1/id: es-seeding-example # Velocity service identifier
    velocity.tech.v1/dependsOn: elasticsearch-master # Velocity dependencies declaration
  labels:
    app: seeding
spec:
  template:
    metadata:
      name: "elasticsearch-seeding-job"
      labels:
        app: seeding
    spec:
      containers:
        - env:
            - name: ELASTIC_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: elasticsearch-master-credentials
                  key: password
            - name: ELASTIC_INDEX_NAME
              value: kibana_sample_data_flights
          name: seeding
          image: elasticdump/elasticsearch-dump:latest
          command: ["/bin/bash", "-c"]
          args:
            - NODE_TLS_REJECT_UNAUTHORIZED=0 elasticdump --input=/mnt/seeds/data.json --output=https://elastic:$ELASTIC_PASSWORD@elasticsearch-master:9200/$ELASTIC_INDEX_NAME --type=data
          volumeMounts:
            - name: seed-files
              mountPath: /mnt/seeds/
              readOnly: false
      initContainers:
        - name: download-es-json
          image: amazon/aws-cli:2.0.6
          command: [ "/bin/sh", "-c" ]
          env:
            - name: SNAPSHOT_FILE
              value: "{velocity.v1.snapshot}"
          args:
            - aws s3 cp $SNAPSHOT_FILE /mnt/seeds/data.json
          volumeMounts:
            - name: seed-files
              mountPath: /mnt/seeds/
              readOnly: false
      volumes:
        - name: seed-files
          emptyDir: { }
      restartPolicy: Never
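
The index name change described above can also be scripted instead of edited by hand. The following is a rough sketch: the set_index_name helper and the my_index name are hypothetical, and it assumes the value: line directly follows the "- name: ELASTIC_INDEX_NAME" line, as it does in the example manifest.

```shell
# Hypothetical helper: print the job manifest with the ELASTIC_INDEX_NAME value replaced.
# Arguments: <manifest file> <index name>
set_index_name() {
  manifest="$1"
  index="$2"
  awk -v idx="$index" '
    # Rewrite the first "value:" line seen after the ELASTIC_INDEX_NAME entry.
    seen && /^[ \t]*value:/ { sub(/value:.*/, "value: " idx); seen = 0 }
    /name:[ \t]*ELASTIC_INDEX_NAME/ { seen = 1 }
    { print }
  ' "$manifest"
}

# Example call (my_index is a placeholder); writes to a temp file, then replaces the original:
# set_index_name elasticsearch-seed-job.yaml my_index > seed-job.tmp && mv seed-job.tmp elasticsearch-seed-job.yaml
```

Writing to a temporary file first avoids truncating the manifest while it is still being read.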

6. Seed Elasticsearch with your own data

Finally, with your data snapshot uploaded to Velocity and the index name updated in your local elasticsearch-seed-job.yaml file, you can seed Elasticsearch in Velocity with your own data by running the following:
helm repo add elastic https://helm.elastic.co
{ helm template elastic/elasticsearch --set replicas=1 --set tests.enabled=false \
&& cat elasticsearch-seed-job.yaml; } | veloctl env create -f -