---
date: 2024-01-20
title: Recovering from a Backup
tags:
  - homelab
  - truenas
  - backup
  - restoration
---
As part of moving data and hard drives around between servers, I have reached
a situation where I need to move data off of my primary storage server, destroy
the array, and restore my data. Incidentally, this is basically a simulation of
what I would do in the event my storage server or ZFS array fails. I'll document
the process here to serve as a reference for upgrading the pool in the future,
or in case too many drives fail for the array to rebuild normally.

## Save any Encryption Keys
This should be done as part of setting up TrueNAS datasets, since without the keys
any encrypted datasets are inaccessible. Even if the array is fine, the data on it
will be inaccessible if TrueNAS fails or gets re-installed without the encryption
keys having been backed up first. This is [documented](https://www.truenas.com/docs/scale/22.12/scaleuireference/storage/datasets/encryptionuiscale/#pool-encryption),
so just be sure to export all of the keys and put them somewhere safe (NOT anywhere
on the TrueNAS filesystem). For me, this means one copy saved to my laptop and another
copy on an SSD I keep in a fireproof safe.

## Configure Replication
I talked about replication tasks [in a previous post](https://blog.mcknight.tech/2024/01/18/NAS_Setup/#Data-Protection),
but the important thing here is to make sure the entire array is replicated, or at
least the datasets you care to restore. Since this is a planned migration for me, I
created a manual snapshot of my whole dataset and ran the replication manually to
be sure I had a completely up-to-date copy of my data before destroying the old
array. Make sure that manual snapshot is included in the replication; in my case, I had
configured only "auto" snapshots to replicate, so I updated the task to include this
manual one. I also checked the `Full Filesystem Replication` box so that I could mount the
replicated dataset on my target device.

## Mount and Validate the Replica
I don't consider this step optional: go to the machine that received the snapshots and
mount every dataset. If you have encrypted datasets, the keys will need to be imported;
I did this manually with [`zfs-load-key`](https://openzfs.github.io/openzfs-docs/man/master/8/zfs-load-key.8.html)
on the backup system. After mounting, I spot-checked that recently modified files were
present, and then decided I was satisfied that all of my data was safe.
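For reference, this is roughly what the manual key import and mount looked like on the
backup machine. Treat it as a sketch: the pool/dataset names (`backup/tank/...`) and the
key file path are placeholders for my actual layout, and the exact `zfs load-key`
invocation depends on each dataset's `keyformat` and `keylocation` settings.

```sh
# See which datasets are encrypted and whether their keys are loaded yet
zfs get -r encryption,keystatus backup/tank

# Load the key for one dataset from an exported key file
# (dataset name and key path are placeholders)
zfs load-key -L file:///root/keys/tank-documents.key backup/tank/documents

# Mount it, then spot-check that recently modified files made it over
zfs mount backup/tank/documents
find /mnt/backup/tank/documents -type f -mtime -7 | head
```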
### Update References to the Backup Shares
This step is optional, but replication can take some time, so it may be worthwhile to
mount the backup shares and update any containers or clients to reference the backup.
Usually a backup should be read-only, but in this scenario the "live" data is about to
be wiped out, so I mounted my backup as read-write.

## Perform Restoration
On the system being restored to, disable any scheduled replication tasks and then create
the VDEV(s) to restore to. Configure a replication task that is essentially the one used
to create the backup, but with the source and destination reversed. If a manual backup was
created, as in my case, make sure it is included in the restored snapshots. Also, if the
backup has any changes, a manual snapshot will need to be taken, or else the changes synced
back (e.g. via rsync) after the replication from the backup to the storage server is complete.

## Validate Restored Data
After the replication back to the primary storage server is complete, make sure the encryption
keys are all present, the data is available and up-to-date, and permissions are correct.
It will likely be necessary to restore encryption keys, but the TrueNAS web UI makes it
easy to unlock shares with the downloaded backup keys.

If everything looks good, proceed to re-configuring shares and enabling/re-configuring
backups as appropriate.

## Complete Restoration
With all of the data restored, re-enable any disabled services and backup tasks. Note that after
re-enabling services, each individual share will have to be re-enabled. The replication task used
to restore the data can be disabled, but I prefer to remove it entirely so it isn't run by accident.

I found that permissions were preserved, so there was nothing left to do; at this point, my TrueNAS
instance was in the same state it was in before, just with an updated pool configuration.

## Some Random Notes
Both machines involved here are connected through a 10GbE switch, and they transferred data at
up to around 4 Gbps. At those speeds, my local restoration took about 12 hours; backups are
less bandwidth-constrained since only changes have to be transferred. I eventually want to have
an offsite backup; based on my local backup/restore experience, this would actually be doable within
a day or two, assuming my remote backup location also has gigabit or better upload speeds.
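As a rough sanity check on that "day or two" figure: I only noted the peak transfer rate,
so assume the 12-hour restore averaged somewhere between 2 and 4 Gbps.

```text
12 h × 2–4 Gbps              ≈ 11–22 TB transferred
11–22 TB ÷ 1 Gbps (125 MB/s) ≈ 24–48 h
```

So a full restore over a saturated gigabit uplink would indeed land in the one-to-two-day
range. The initial offsite seed would take about as long, but subsequent backups only send
incremental changes, so they should be far quicker.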