When a new Velocity environment is created, we want it to be usable out of the box. Most environments include at least one database, and starting with an empty database is awkward at best; in some cases, it renders the environment completely unusable.
The purpose of data seeding is to initialize the database(s) within your environment with production-like data. This allows your services, while in development, to interact with the database(s) in the same way that they will in production.
Database migration (a.k.a. schema migration)
Database migration involves making incremental and reversible changes to a database schema: the organizational structure of the database, which consists of elements such as tables, columns, and data types. Because each change can be undone, it is a safe way to update that database's schema to a newer version or revert it to an older one.
When a new environment is created, we usually want the database to have the latest available schema. As such, database migration is typically the first step in the process of data seeding.
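To make "incremental and reversible" concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not how Velocity or any particular migration tool works; the table names, the `MIGRATIONS` list, and the use of SQLite's `user_version` pragma as a schema version counter are all assumptions chosen for illustration.

```python
import sqlite3

# Each migration pairs an "up" statement with the "down" statement that undoes it.
# The tables are hypothetical and exist only for this example.
MIGRATIONS = [
    ("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)",
     "DROP TABLE users"),
    ("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)",
     "DROP TABLE orders"),
]


def current_version(conn):
    # SQLite stores a free-form integer in PRAGMA user_version;
    # this sketch uses it as the schema version.
    return conn.execute("PRAGMA user_version").fetchone()[0]


def migrate(conn, target):
    """Move the schema up or down, one step at a time, until it reaches `target`."""
    if not 0 <= target <= len(MIGRATIONS):
        raise ValueError(f"unknown schema version: {target}")
    version = current_version(conn)
    while version < target:  # apply newer migrations
        conn.execute(MIGRATIONS[version][0])
        version += 1
        conn.execute(f"PRAGMA user_version = {version}")
    while version > target:  # revert the most recent migrations
        conn.execute(MIGRATIONS[version - 1][1])
        version -= 1
        conn.execute(f"PRAGMA user_version = {version}")
    conn.commit()


def table_names(conn):
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Calling `migrate(conn, len(MIGRATIONS))` on a fresh database brings it to the latest schema, which corresponds to the first step of data seeding described above; calling it with a lower number reverts the later changes.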
A database snapshot is a file that represents the data stored in a given database at a particular point in time.
For example, in the case of a SQL database, this file will consist of a series of SQL statements. Alternatively, for NoSQL databases, the snapshot will consist of JSON, BSON, or any other format supported by the tooling for your specific NoSQL database.
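As an illustration of a SQL snapshot, the sketch below uses SQLite through Python's standard library, where `Connection.iterdump()` yields the statements needed to recreate a database. The `users` table and its rows are made-up sample data, and other databases use their own dump tools to produce an equivalent file.

```python
import sqlite3


def build_example_db():
    """Create a small in-memory database with hypothetical sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)]
    )
    conn.commit()
    return conn


def take_snapshot(conn):
    """Return the database contents as a single string of SQL statements."""
    return "\n".join(conn.iterdump())


snapshot_sql = take_snapshot(build_example_db())
```

The resulting string contains the `CREATE TABLE` statement followed by one `INSERT` per row, so replaying it against an empty database reproduces the data as it was when the snapshot was taken.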
Data seeding job
In Velocity, data seeding is carried out by a Kubernetes (K8s) job that accesses an uploaded snapshot and loads that file into the targeted database within your environment.
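The core of what such a job does can be sketched in a few lines. This is only an approximation under stated assumptions: a local file stands in for the uploaded snapshot, SQLite stands in for the targeted database, and `seed_database` and `demo` are hypothetical names, not part of Velocity.

```python
import sqlite3
import tempfile
from pathlib import Path


def seed_database(snapshot_path, conn):
    """Read a SQL snapshot file and execute its statements against the target database."""
    sql = Path(snapshot_path).read_text(encoding="utf-8")
    conn.executescript(sql)


def demo():
    # A temporary file plays the role of the uploaded snapshot.
    with tempfile.TemporaryDirectory() as tmp:
        snapshot = Path(tmp) / "users.sql"
        snapshot.write_text(
            "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);\n"
            "INSERT INTO users (name) VALUES ('alice');\n"
            "INSERT INTO users (name) VALUES ('bob');\n",
            encoding="utf-8",
        )
        conn = sqlite3.connect(":memory:")  # stands in for the targeted database
        seed_database(snapshot, conn)
        return conn.execute("SELECT name FROM users ORDER BY id").fetchall()
```

In a real seeding job, the same two steps (fetch the snapshot, then load it with the database's own restore tool) would run inside the job's container.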
Database snapshots are stored in a cloud storage bucket associated with your Velocity environment.
Each snapshot file is associated with exactly one target: the name of the K8s job that will carry out the seeding process. Because a single set of Velocity Blueprints can contain multiple seeding jobs, the target identifies which job consumes a given snapshot.
Because an environment may have multiple data seeding jobs, the process of data seeding may require more than one snapshot file. For example, an environment that contains three separate databases may require three or more database snapshots.
The snapshot group is used to specify a collection of snapshot files required for data seeding for a specific environment configuration.
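The relationship between snapshots, targets, and groups can be modelled as a small data structure. The sketch below is an assumption about how one might represent it, not Velocity's internal model; the class names, job names, and paths are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    path: str    # location of the file in the storage bucket (hypothetical)
    target: str  # name of the one seeding job that consumes this file


@dataclass
class SnapshotGroup:
    name: str
    snapshots: list = field(default_factory=list)

    def for_target(self, target):
        """Return the snapshots that a given seeding job should load."""
        return [s for s in self.snapshots if s.target == target]

    def missing_targets(self, seeding_jobs):
        """Return the seeding jobs that have no snapshot in this group."""
        covered = {s.target for s in self.snapshots}
        return sorted(set(seeding_jobs) - covered)


group = SnapshotGroup(
    name="default",
    snapshots=[
        Snapshot(path="snapshots/users.sql", target="seed-users"),
        Snapshot(path="snapshots/orders.sql", target="seed-orders"),
    ],
)
```

Because each `Snapshot` carries a single `target` field, a snapshot cannot be attached to more than one job, while a group can hold as many snapshots as the environment has seeding jobs. A check such as `group.missing_targets(...)` shows which jobs would still start from an empty database.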
A snapshot template is a dynamically-resolved reference within your Velocity Blueprint to a snapshot file that lives in your cloud storage bucket.
Instead of explicitly referring to the full path of the snapshot that is stored inside the cloud storage bucket, we can use the snapshot template, which will be automatically replaced with the relevant snapshot path during the creation of an environment.
Snapshot templates can only be used within the context of a K8s seeding job definition.
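The substitution idea behind snapshot templates can be sketched as follows. The `{{snapshot:<target>}}` placeholder syntax shown here is invented for illustration and is not Velocity's actual template syntax; the job names and paths are likewise hypothetical.

```python
import re

# Hypothetical placeholder syntax: {{snapshot:<target>}}
PLACEHOLDER = re.compile(r"\{\{\s*snapshot\s*:\s*([\w-]+)\s*\}\}")


def resolve_templates(job_definition, snapshot_paths):
    """Replace every placeholder with the snapshot path registered for its target."""

    def replace(match):
        target = match.group(1)
        if target not in snapshot_paths:
            raise KeyError(f"no snapshot registered for target {target!r}")
        return snapshot_paths[target]

    return PLACEHOLDER.sub(replace, job_definition)


resolved = resolve_templates(
    "restore --from {{snapshot:seed-users}}",
    {"seed-users": "snapshots/users.sql"},
)
```

The job definition never hard-codes a bucket path: the same text resolves to a different snapshot when the mapping passed in at environment creation changes.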