Search
⌃K

Advanced Data Seeding Configurations

Create Non-Default Snapshot

If you want to create a new Velocity Environment with a DB that is seeded with data that is different from the default, you can achieve that by providing a different data file.
Let’s continue the Blueprints for Data Seeding flow and create another text file called special-data.sql with the following SQL:
CREATE TABLE public.users (
first_name text,
last_name text,
user_id smallint
);
ALTER TABLE public.users OWNER TO postgres;
COPY public.users (first_name, last_name, user_id) FROM stdin;
ps '1' 1
ls '2' 2
vim '3' 3
\.
Note that we are creating the same table we created in the previous tutorial, but we fill it with different values.
Now run the following command:
veloctl snapshot put -f special-data.sql --target seeding --name special-data
The --target is still seeding, because we are using the same Blueprints with the same seeding job as we used before.
However, this time we are using the --name special-data flag. That means that our data file will be associated with a different snapshot named special-data.
Next, let’s create a new Velocity Environment with the special-data snapshot like this:
veloctl env create -f https://raw.githubusercontent.com/techvelocity/velocity-blueprints/main/getting-started/aws/data-seeding/postgres-single-job.yaml --env-version v2 --snapshot special-data
Note the --snapshot flag. This command means: create a new environment and don’t use the default snapshot but rather a snapshot data named special-data.

Using multiple seeding jobs/targets

In the previous example, we had to repeat the SQL for creating a table. This isn’t ideal and not DRY. We will try to solve this issue with the following example.
Let's create two text files:
migrate.sql
CREATE TABLE IF NOT EXISTS tasks (
task_id INT PRIMARY KEY NOT NULL,
title TEXT NOT NULL,
start_date DATE,
due_date DATE,
status INT NOT NULL,
priority INT NOT NULL,
description TEXT,
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
data.sql
INSERT INTO tasks (task_id, title, status, priority) VALUES
(1, 'ps', 1, 2),
(2, 'ls', 2, 1),
(3, 'vim', 3, 1);
We will refer to these file names in the next steps of the guide, so be sure to use these exact names.
Upload the data files:
veloctl snapshot put -f migrate.sql --target migrate-job --default
veloctl snapshot put -f data.sql --target data-job --default
Note that each file is associated with a target: migrate-job and data-job respectively.
Now, let’s create a new Velocity Environment from a multi-job sample:
veloctl env create -f https://raw.githubusercontent.com/techvelocity/velocity-blueprints/main/getting-started/aws/data-seeding/postgres-multi-job.yaml --env-version v2
And the expected output should be similar to:
Requesting the creation of environment surprised-bart-rozum with services at 2022-09-13 10:43:28 IDT...
Environment 'surprised-bart-rozum' status:
Point in time: 2022-09-13 10:43:28 IDT
Service Status Version Public URI Tunneled
psql Ready ...postgresql:13.2.0
migrate Ready ...postgresql:13.2.0
data Ready ...postgresql:13.2.0
the-app Ready ...postgresql:13.2.0
Overall status: Ready
We can see that this time there are 4 Velocity Services provisioned.
Now, if we want to override only the values seeded, but keep the same table definitions, we will create another file:
data-2.sql
INSERT INTO tasks (task_id, title, status, priority) VALUES
(10, 'cp', 1, 2),
(20, 'rm', 2, 1),
(30, 'top', 3, 1);
And upload it like this:
veloctl snapshot put -f data-2.sql --target data-job --default
And create another Velocity Environment to demonstrate an interesting behavior:
veloctl env create -f https://raw.githubusercontent.com/techvelocity/velocity-blueprints/main/getting-started/aws/data-seeding/postgres-multi-job.yaml --snapshot special-data,default
Note that we pass a comma-separated list of snapshot names here. The order of names is important. First, we include all the data files that are associated with the special-data snapshot name. Note that we only uploaded one file for special-data that is associated with the data job.
Next, we take more data files from the default snapshot name. The default snapshot includes two files, each associated with data and migrate jobs respectively.
Since the data job is already using the data file from the special-data snapshot, it will not be replaced with the data from default. However, the migrate job is still “free,” because there wasn’t any file in the special-data snapshot associated with it, so that K8s Job will use the data from the default snapshot.
You can use as many snapshot names as you want when creating a Velocity Environment. The default snapshot name is a special name that is always added to the end of the list, so you don’t have to add it yourself, so the following command has the same result:
veloctl env create -f https://raw.githubusercontent.com/techvelocity/velocity-blueprints/main/getting-started/aws/data-seeding/postgres-multi-job.yaml --snapshot special-data

Summary

You can use as many seeding jobs as you need and divide the work as is relevant to your application. It is possible to create multiple jobs that seed a single DB, as was demonstrated in this guide. It is also possible to create multiple jobs, each seeding a different DB, and it is also possible to create one job that seeds multiple databases; however, this is outside the scope of this guide.