
# Database Snapshots

Similar to traditional backups, database snapshots are copies taken at a point in time. In this context, however, they're not intended for disaster recovery, but rather as a means to publish database copies that can be used elsewhere for development and testing purposes.

Since these aren't intended for disaster recovery, they don't need to preserve the source folder structure or any environment-specific files, nor do they need to be copied in a manner that strictly guarantees data integrity. Instead, they can use the same folder structure as database images and can be copied in a manner that prioritises speed over strict data integrity guarantees.

Using this pattern provides a decoupled approach where snapshots can be published from one place and consumed elsewhere in a consistent manner.

The initial need is likely to be publishing snapshots of production databases, taken while they're quiesced or offline and containing just the system data files, which can be used to prepare empty database images.

For teams looking to go further, pipelines can be set up to automate the backup and refresh of test environments using full snapshots. This supports self-service scenarios such as a tester taking a snapshot of a database where a problem has been encountered, which can then be used to refresh another test database where a developer is ready to investigate. When snapshots are stored in the cloud and self-hosted agents run wherever databases are hosted, this approach can also support environments running in the cloud, on-premises, and even on local workstations if desired.

## Code-Only

A code-only snapshot is a copy of a database that doesn't contain any application data, just the system metamodel data which defines application schemas, providing the basis for empty database images.

These are produced by copying only the system data files from a snapshot taken when the database was quiesced or offline, and using `jdbutilb` to reset the database state to reflect that application data files are not present.
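The copy step described above can be sketched as follows. This is a minimal illustration only, not the tool's actual implementation: the function name `copy_code_only` is hypothetical, and it assumes system data files live in a `system` folder and match `_*.dat`, as the storage example on this page suggests. Resetting the database state with `jdbutilb` is a separate step that is not shown.

```python
import shutil
from pathlib import Path
from typing import List


def copy_code_only(source: Path, target: Path) -> List[str]:
    """Copy only the system data files from a snapshot folder (hypothetical helper).

    Assumption: system data files are the files in `system/` matching `_*.dat`.
    Application data files and journals are left behind.
    """
    system_target = target / "system"
    system_target.mkdir(parents=True, exist_ok=True)

    copied = []
    for data_file in sorted((source / "system").glob("_*.dat")):
        if data_file.is_file():
            shutil.copy2(data_file, system_target / data_file.name)
            copied.append(data_file.name)
    return copied
```

The returned list of file names makes it easy for a pipeline to log what was copied, or to fail early if nothing matched.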

## Storage

Like images, snapshots can be located anywhere common to their consumers. Cloud-hosted blob storage is recommended, but a traditional filesystem can be used instead.

It's recommended snapshots be published using the structure:

```
<environment>/<database>/<name>
```

Where `environment` identifies the source environment the snapshot was taken from (e.g. `production`, `test`), `database` is the canonical database name (e.g. `finance`, `inventory`), and `name` is the snapshot folder name, consisting of a timestamp in `yyyyMMddHHmmss` format with an optional reference suffix separated by a hyphen (e.g. `20250101120123`, `20250101120123-code-only`).
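As a rough illustration of this naming convention, the sketch below builds and parses snapshot names in Python. The function names are hypothetical and not part of any tooling described here. It assumes the default reference of `latest` produces no suffix, as described in this section.

```python
import re
from datetime import datetime
from typing import Optional, Tuple

# 14-digit timestamp (yyyyMMddHHmmss), optionally followed by "-<reference>".
_NAME_RE = re.compile(r"([0-9]{14})(?:-(.+))?")
_STAMP_FORMAT = "%Y%m%d%H%M%S"


def snapshot_name(taken_at: datetime, reference: str = "latest") -> str:
    """Build a snapshot folder name; the default reference adds no suffix."""
    stamp = taken_at.strftime(_STAMP_FORMAT)
    return stamp if reference == "latest" else f"{stamp}-{reference}"


def parse_snapshot_name(name: str) -> Tuple[datetime, Optional[str]]:
    """Split a snapshot folder name into its timestamp and optional reference."""
    match = _NAME_RE.fullmatch(name)
    if match is None:
        raise ValueError(f"Not a valid snapshot name: {name!r}")
    stamp, reference = match.groups()
    return datetime.strptime(stamp, _STAMP_FORMAT), reference


def snapshot_path(environment: str, database: str, name: str) -> str:
    """Join the three parts into the recommended <environment>/<database>/<name> path."""
    return f"{environment}/{database}/{name}"
```

For example, `snapshot_name(datetime(2025, 1, 1, 12, 1, 23), "code-only")` yields `20250101120123-code-only`.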

A suffix is added when a reference other than the default of `latest` is specified when the snapshot is taken. A corresponding reference file with the same name is then created or updated to store the snapshot name, allowing it to be resolved by reference rather than by name. The special reference `code-only` indicates that a snapshot need only contain system data files and must be taken while the database is quiesced or offline, ready to be used as the basis for creating an empty database image.

The following special references are also maintained automatically:

* `latest.txt` — the most recent full snapshot
* `code.txt` — the most recent offline or quiesced snapshot, suitable as the basis for a code-only download

For example:

```
production/
  finance/
    20250101120123-code-only/ (offline)
      bin/
      system/
        _*.dat
    20250131235959-monthend/ (offline)
      bin/
      system/
        *.dat
        journals/
    20250301090456/ (online)
      bin/
      system/
        *.dat
        journals/
    code-only.txt (20250101120123-code-only)
    code.txt (20250131235959-monthend)
    latest.txt (20250301090456)
    monthend.txt (20250131235959-monthend)
  inventory/
    ...

test/
  ...
```
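Resolving a reference against a layout like this one could look something like the sketch below, which targets a traditional filesystem. It is only an illustration: `resolve_snapshot` is a hypothetical name, and it assumes each reference file contains nothing but the snapshot folder name as plain text. Blob storage would need the equivalent calls from the relevant storage client.

```python
from pathlib import Path


def resolve_snapshot(root: Path, environment: str, database: str,
                     reference: str = "latest") -> Path:
    """Resolve a reference (or an explicit snapshot name) to a snapshot folder."""
    database_dir = root / environment / database
    reference_file = database_dir / f"{reference}.txt"

    if reference_file.is_file():
        # Assumption: the reference file holds just the snapshot folder name.
        name = reference_file.read_text(encoding="utf-8").strip()
    else:
        # No reference file, so treat the value as an explicit snapshot name.
        name = reference

    snapshot_dir = database_dir / name
    if not snapshot_dir.is_dir():
        raise FileNotFoundError(f"Snapshot not found: {snapshot_dir}")
    return snapshot_dir
```

With the example layout, resolving `monthend` for `production/finance` would return the `20250131235959-monthend` folder, while passing `20250301090456` directly would return that folder without consulting a reference file.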

## Scrambled

Teams working with sensitive data may be required to use scrambled snapshots.

This is particularly relevant when using a cloud platform, for which networks may be used to define strict boundaries for scrambled and unscrambled data usage. For example, a production and non-production network may both have access to a common services network, but neither have sight of the other. The production network may contain pre-production environments that use unscrambled or scrambled data, whereas environments in the non-production network may only use scrambled data.

To support this kind of architecture, two storage locations are recommended: one for unscrambled snapshots, located within the production network and inaccessible from other networks, and one for scrambled snapshots, located within the common services network and accessible from other networks. Within the production network, a scrambling pipeline would download unscrambled snapshots, scramble them, and publish them to the scrambled storage location, where snapshots would be named and organised in the same way, the only difference being that the contents have been scrambled.
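The shape of such a pipeline step might be sketched as below, using local folders to stand in for the two storage locations. Everything here is hypothetical: `publish_scrambled` is an illustrative name, and the `scramble` callable is supplied by the caller because how data is scrambled is outside the scope of this page. The snapshot is scrambled in a temporary working folder so that unscrambled data is never written to the scrambled location. Updating reference files is not shown.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def publish_scrambled(unscrambled_root: Path, scrambled_root: Path,
                      relative_path: str,
                      scramble: Callable[[Path], None]) -> Path:
    """Download, scramble, and publish a snapshot under the same relative path."""
    source = unscrambled_root / relative_path
    destination = scrambled_root / relative_path

    with tempfile.TemporaryDirectory() as work:
        working_copy = Path(work) / "snapshot"
        shutil.copytree(source, working_copy)      # download
        scramble(working_copy)                     # scramble in place (caller-supplied)
        shutil.copytree(working_copy, destination) # publish; fails if it already exists
    return destination
```

Keeping the relative path identical in both locations means consumers in either network can resolve snapshots the same way, differing only in which storage location they point at.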
