Merge branch 'BackupAndRestore' into 'main'
Document Backup and Restore

See merge request d_mcknight/blog-content!3
This commit is contained in: commit 8cc63bdd4c
2 changed files with 80 additions and 0 deletions
80
2024-01-19_Backup-and-Restore.md
Normal file
@@ -0,0 +1,80 @@
---
date: 2024-01-20
title: Recovering from a Backup
tags:
- homelab
- truenas
- backup
- restoration
---
As part of moving data and hard drives between servers, I have reached
a situation where I need to move data off of my primary storage server, destroy
the array, and restore my data. Incidentally, this is basically simulating what
I will do in the event my storage server or ZFS array fails. I'll document the
process here to serve as a reference for what to do when upgrading the pool in
the future, or in case too many drives fail to rebuild the array normally.

## Save any Encryption Keys
This should be done as part of setting up TrueNAS datasets, since without the keys
any encrypted datasets are inaccessible. Even if the array is fine, the data on it
will be inaccessible if TrueNAS fails or gets re-installed without first backing up
the encryption keys. This is [documented](https://www.truenas.com/docs/scale/22.12/scaleuireference/storage/datasets/encryptionuiscale/#pool-encryption),
so just be sure to export all of the keys and put them somewhere safe (NOT anywhere
on the TrueNAS filesystem). For me, this means a copy saved to my laptop and another
copy on an SSD I keep in a fireproof safe.

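The web UI export is the documented path; as a sanity check, something like the following
(run from a shell on the TrueNAS host, with `tank` standing in for your pool name) will
show which datasets are encrypted and whether their keys are currently loaded:

```bash
# "tank" is a placeholder pool name.
# List which datasets are encrypted and whether their keys are loaded.
zfs get -r -t filesystem encryption,keyformat,keystatus tank
```

Any dataset where `encryption` is not `off` should have its key exported from the web UI
before the array is touched.
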
## Configure Replication
I talked about replication tasks [in a previous post](https://blog.mcknight.tech/2024/01/18/NAS_Setup/#Data-Protection),
but the important thing here is to make sure the entire array is replicated, or at
least the data you care to restore. Since this is a planned migration for me, I
created a manual snapshot of my whole dataset and ran the replication manually to
be sure I have a completely up-to-date copy of my data before destroying the old
array. Make sure that manual backup is included in replication; in my case, I had
configured only "auto" snapshots to replicate, so I updated the task to include this
manual one. I also checked the `Full Filesystem Replication` box so that I can mount the
replicated dataset on my target device.

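I took the snapshot and ran the replication from the TrueNAS UI; purely for reference, a
rough CLI equivalent of the manual snapshot step would be (pool and snapshot names are
placeholders):

```bash
# Recursive manual snapshot of the whole pool before it gets destroyed.
zfs snapshot -r tank@pre-rebuild-2024-01-19

# Confirm the snapshot exists on every dataset that needs to be restored later.
zfs list -t snapshot -r tank -o name,used,creation | grep pre-rebuild
```
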
## Mount and Validate the Replica
I don't consider this step optional: go to the machine that received the snapshots and
mount every dataset. If you have encrypted datasets, the keys will need to be imported;
I did this manually with [`zfs-load-key`](https://openzfs.github.io/openzfs-docs/man/master/8/zfs-load-key.8.html)
on the backup system. After mounting, I spot-checked that recently modified files were
present and decided I was satisfied that all of my data was safe.

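For reference, the key loading and mounting on the backup machine looked roughly like this
(the `backup/tank` dataset name and mount point are placeholders, and how the key is
supplied depends on the dataset's `keylocation` and `keyformat`):

```bash
# Load keys for the replicated dataset and its children (use -L prompt to
# paste a passphrase instead of reading from the configured keylocation).
zfs load-key -r backup/tank

# Mount everything whose key is now loaded, then spot-check recent files.
zfs mount -a
find /mnt/backup/tank -type f -mtime -7 | head
```
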
### Update References to the Backup Shares
This is an optional step, but replication can take some time, so it may be worthwhile to
mount the backup shares and update any containers or clients to reference the backup.
Usually, a backup should be read-only, but in this scenario the "live" data is about to
be wiped out, so I mounted my backup as read-write.

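Purely as an illustration (this assumes the backup machine exports the dataset over NFS;
the host name and paths are placeholders), pointing a client at the backup copy can be as
simple as:

```bash
# Mount the backup copy read-write in place of the primary share.
sudo mount -t nfs -o rw backup-host:/mnt/backup/tank/media /mnt/media
```

Containers or other clients can then use that mount until the primary pool is rebuilt and
restored.
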
## Perform Restoration
On the system being restored to, disable any scheduled replication tasks and then create
the VDEV(s) to restore to. Configure a replication task that is essentially the one used
to create the backup, but with source/destination reversed. If a manual backup was created,
as in my case, make sure it is included in the restored snapshots. Also, if the backup has
any changes, a manual snapshot will need to be taken or else changes synced back (e.g.
via rsync) after the replication from the backup to the storage server is complete.

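For the rsync option, a minimal sketch (paths and host names are placeholders, and the
trailing slashes matter to rsync) is to dry-run first and then repeat for real:

```bash
# Preview what changed on the backup copy relative to the restored pool.
rsync -aHX --dry-run --itemize-changes /mnt/backup/tank/ root@primary-nas:/mnt/tank/

# Re-run without --dry-run once the itemized list looks right.
```
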
## Validate Restored Data
After the replication back to the primary storage server is complete, make sure encryption
keys are all present, data is available and up-to-date, and permissions are correct.
It will likely be necessary to restore encryption keys, but the TrueNAS web UI makes it
easy to unlock shares with downloaded backup keys.

If everything looks good, then proceed with re-configuring shares and enabling/re-configuring
backups as appropriate.

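A couple of shell checks (the pool name and share path are placeholders) cover most of this
validation:

```bash
# Every encrypted dataset should report its key as available and be mounted.
zfs get -r keystatus,mounted tank

# Spot-check ownership and permissions on a share the clients depend on.
ls -ln /mnt/tank/media | head
```
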
## Complete Restoration
With all of the data restored, re-enable any disabled services and backup tasks. Note that after
re-enabling services, each individual share will have to be re-enabled. The replication task used to
restore data can be disabled, but I prefer to remove it entirely so it isn't accidentally run.

I found that permissions were preserved, so there was nothing left to do; at this point, my TrueNAS
instance is in the same state it was before, just with an updated pool configuration.

## Some Random Notes
I have both machines involved here connected through a 10Gig switch and they transferred data at
up to around 4Gbps. At those speeds, my local backup restoration took about 12 hours; backups are
less bandwidth-constrained since only changes have to be transferred. I eventually want to have
an offsite backup; based on my local backup/restore experience, this would actually be doable within
a day or two, assuming my remote backup location also has gigabit or better upload speeds.