
Data Seeding with Default Snapshots

Seed your Velocity environment with production-like data.
In this guide, we will create a Velocity Environment that does the following:
  1. starts a database service
  2. starts a job that will seed the DB from a data file (a snapshot)
  3. starts an application that queries the DB and uses the seed data
Data seeding code samples: PostgreSQL, MySQL

1. Upload a Default Snapshot to Velocity

First, we will create a text file called migrate.sql that contains the following SQL:
CREATE TABLE public.users (
first_name text,
last_name text,
user_id smallint
);
ALTER TABLE public.users OWNER TO postgres;
COPY public.users (first_name, last_name, user_id) FROM stdin;
bob smith 1
tom smith 2
rob smith 3
\.
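Here, we create a new table named users and populate it with some sample data. Note that the data rows under COPY ... FROM stdin use PostgreSQL's default text format, so the values in each row must be separated by a single tab character, not spaces.
Next, run the following command to upload the data file to Velocity's snapshot storage: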
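Because COPY's default text format expects tab-separated columns, copying the file from a web page can silently turn the tabs into spaces. As a minimal sketch, assuming a POSIX shell, the same migrate.sql can be generated with printf so that the separators are guaranteed to be real tabs. This script is only an illustration and is not part of Velocity's tooling.

```shell
#!/bin/sh
# Build migrate.sql so that the COPY data rows contain real tab characters.
{
cat <<'EOF'
CREATE TABLE public.users (
first_name text,
last_name text,
user_id smallint
);
ALTER TABLE public.users OWNER TO postgres;
COPY public.users (first_name, last_name, user_id) FROM stdin;
EOF
# printf reuses the format for every group of three arguments,
# emitting one tab-separated row per user.
printf '%s\t%s\t%s\n' \
  bob smith 1 \
  tom smith 2 \
  rob smith 3
# End-of-data marker for COPY ... FROM stdin.
printf '%s\n' '\.'
} > migrate.sql
```

The resulting file has the same content as the SQL shown above and can be uploaded in the same way.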
veloctl snapshot put -f migrate.sql --target seeding --default
The --default flag means that this data will be used by default whenever a new Velocity Environment that contains a K8s Job called seeding is created.
NOTE: see Advanced Data Seeding Configurations for non-default snapshot data seeding examples.

2. Deploy and Seed a Postgres DB with Snapshot Data

Next, we will create a Velocity Environment with a Postgres DB, and we will seed it with the snapshot we just uploaded. For that, we will use a sample Velocity Blueprint from our samples repository, which includes:
  • A K8s Deployment, PVC and Service for the Postgres DB
  • A K8s Job that inserts our uploaded snapshot into the DB
  • A sample application that queries the DB to confirm success in seeding the DB
NOTE: We're passing a URL to Velocity's env create command to deploy each of the above elements, but you can review the full YAML manifest here.
veloctl env create -f https://raw.githubusercontent.com/techvelocity/velocity-blueprints/main/getting-started/aws/data-seeding/postgres-single-job.yaml --env-version v2
Running the above command should result in output similar to this:
Requesting the creation of environment surprised-bart-rozum with services at 2022-09-13 10:43:28 IDT...
Environment 'surprised-bart-rozum' status:
Point in time: 2022-09-13 10:43:28 IDT
Service   Status   Version                Public URI   Tunneled
psql      Ready    ...postgresql:13.2.0
seeding   Ready    ...postgresql:13.2.0
the-app   Ready    ...postgresql:13.2.0
Overall status: Ready
You can see that three Velocity Services were successfully deployed: psql is the Postgres DB itself, seeding is the K8s Job responsible for seeding the data, and the-app is an example application that queries the DB.
Recall that in the previous step, the value of the --target flag had to be seeding. The target of a snapshot is the name of the Velocity Service (K8s Job) that uses it.
Each data file has exactly one target, which means that each data file is handled by exactly one seeding job.

3. Confirm Success

Finally, we can confirm that the DB was seeded with our sample data by viewing the logs of the-app, the Velocity Service that queries the DB:
veloctl env logs -s the-app
We should see output similar to the following:
 first_name | last_name | user_id
------------+-----------+---------
 bob        | smith     |       1
 tom        | smith     |       2
 rob        | smith     |       3
(3 rows)
In this example, the app repeatedly runs the query select * from users. The output above shows that the table exists and contains the three rows inserted by the seeding job.
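The manual check can also be scripted. The following is a minimal sketch, assuming a POSIX shell and assuming the logs contain psql's usual "(N rows)" footer as in the output above. The check_seeded helper is a hypothetical name introduced here for illustration and is not part of veloctl.

```shell
# Succeeds if the log text read from stdin contains the "(N rows)" footer
# for the expected number of rows. In a basic regular expression the
# parentheses are matched literally.
check_seeded() {
  expected_rows="$1"
  grep -q "(${expected_rows} rows)"
}
```

It could then be used as veloctl env logs -s the-app | check_seeded 3 && echo "seed data found". Whether the logs command exits on its own or keeps streaming is not covered in this guide. Because grep -q stops reading at the first match, a positive result returns promptly in either case, but if the footer never appears and the command keeps streaming, the check waits indefinitely.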
