---
date: 2024-01-18
title: Configuring TrueNAS
tags:
- homelab
- truenas
- storage
- backup
---
I [previously](https://blog.mcknight.tech/2024/01/11/Storage-Server/) talked about my physical storage server
build and choice of TrueNAS SCALE as an operating system. Since then, I configured a single storage pool and
copied my data over to experiment with settings and see how things work. I had enough spare drives to do this
without touching my Unraid server, which made for an easier and less stressful transition. I'll go over my
storage architecture plan, accounting for what I learned in this initial setup.
## Storage Pools
I plan on creating two pools, one pool of large HDDs with a Metadata VDEV for general file storage and another
pool of SSDs for application data and VMs.
Initially, I started with a pool of HDDs named "Rusty" and have been working out the details for share
settings and permissions. My current containers and VMs in Unraid are still using the SSD cache on that
box for application data, but I've moved all other data to my new NAS and have applications like Plex and
Nextcloud pointed at the new TrueNAS shares. My plan is to add a pool of SSDs and move my application data
and VM disks there before re-purposing my Unraid server.
### VDEV Configuration
I have my primary storage pool configured with a 5 disk RAIDZ2 Data VDEV and a 2 disk Mirrored Metadata VDEV.
This means I have data on an array that will tolerate two drive failures and the file metadata on an array
that will tolerate one drive failure. I'm fairly comfortable with this configuration since the metadata VDEV
uses SSDs, which should be less prone to failure and should rebuild much more quickly if one does fail.
My plan is to change this to a RAIDZ configuration with the same 2 metadata disks. I already have a comprehensive
backup plan (more on that below), so I'm comfortable with the risk of only tolerating one drive failure. I also
intend to add another drive or two to my primary storage pool to keep my primary and backup arrays about
the same size.
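For reference, the rough CLI equivalent of my current layout looks like the sketch below. I built the pool
through the TrueNAS UI, and the device names here are just placeholders, but it shows how the data VDEV and
metadata (special) VDEV fit together:
```shell
# 5-disk RAIDZ2 data VDEV plus a 2-disk mirrored special (metadata) VDEV.
# Device names are hypothetical; TrueNAS normally builds this through the UI.
zpool create Rusty \
  raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd /dev/sde \
  special mirror /dev/sdf /dev/sdg
```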
## Datasets
Without going into too much detail, a dataset can be thought of as a directory in a ZFS pool. Each dataset
can have different permissions, encryption, and compression settings applied and datasets may be nested. In
addition to datasets, Zvols can be created to provision block storage for iSCSI targets (more on that later).
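As a quick sketch (the dataset names here are just examples, not my real layout), creating nested datasets and
a Zvol from the CLI looks like this:
```shell
# Datasets act like directories with their own properties; -p creates parent datasets.
zfs create -p -o compression=lz4 Rusty/Users/Example
# A Zvol is a fixed-size block device, e.g. to back an iSCSI extent.
zfs create -V 100G Rusty/ExampleVol
```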
### "Rusty" Dataset Configuration
I created the root dataset as unencrypted with LZ4 compression and no deduplication. Under that, I
re-created my Unraid shares as datasets (except for the "Games" share, which I configured as a Zvol backing an
iSCSI share for better compatibility). I'll work through each of my datasets to cover my configuration choices
and the reasoning behind them; not everyone will structure data the same way, but I believe this covers most
common use cases for network shares.
#### Backup
This share contains manual backups and archived data; for example, I will back up user directories before
recycling or repurposing an old computer. I also take a manual backup of my pictures before trying
new software like Lightroom or Darktable that has the ability to remove files it manages. This share also
includes files that I might otherwise just delete, like a local Nextcloud directory with sync conflicts or
old configuration/application data backups I made before making changes.
I have this set to use ZSTD-19 Compression (highest compression at the expense of CPU usage) with ZFS
Deduplication enabled and Key-based encryption. Deduplication on this share is VERY beneficial for me; for
example, I have multiple backups of my photos and Lightroom library which will always overlap previous backups.
Arguably, I should just clear out old backups when I create new ones, but with deduplication it doesn't cost
anything to keep the duplicated backup.
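For anyone curious, those settings map roughly to the following ZFS properties. TrueNAS sets all of this
through the UI, and the key handling shown here is simplified (the key file path is hypothetical):
```shell
# ZSTD-19 compression, dedup, and key-based encryption on the Backup dataset.
zfs create -o compression=zstd-19 -o dedup=on \
  -o encryption=aes-256-gcm -o keyformat=hex \
  -o keylocation=file:///root/backup.key \
  Rusty/Backup
# Check what dedup and compression are actually saving:
zpool list -o name,size,alloc,dedupratio Rusty
zfs get compressratio Rusty/Backup
```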
#### Games
This is an unencrypted Zvol that I can mount to a computer and treat like a local hard drive. This is useful
when you need to expose block storage to a client rather than a network share, e.g. for program installation.
This volume is where I install games and keep save data. I followed a tutorial from Craft Computing
[on YouTube](https://www.youtube.com/watch?v=9JL-RVUHj6o) which shows the exact process and also provides
more information about Zvols.
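The Zvol itself is a one-liner; I created my actual volume in the TrueNAS UI and the size here is hypothetical.
A sparse (`-s`) Zvol only consumes pool space as blocks are actually written:
```shell
# Sparse 1T Zvol; volblocksize must be set at creation time.
zfs create -s -V 1T -o volblocksize=64K Rusty/Games
```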
#### Installers
I keep driver installers, purchased software, and OS images here. Basic LZ4 compression, deduplication, and no
encryption since nothing here is sensitive. I don't know that deduplication is really necessary here, but it
shouldn't cost much performance since this share is primarily read from and rarely written to.
#### Media
Media libraries; unencrypted with no compression and no deduplication. Most media will not compress much if at
all, so I think it would be a waste of CPU cycles to enable it here. I also manage my media libraries and know
there are no duplicate files.
#### Nextcloud
Nextcloud user share data, encrypted with LZ4 compression and no deduplication. I may update this to enable
deduplication since user shares might include duplicate files, but I didn't see a benefit when setting up the
share and I only have a few Nextcloud users.
#### Security
Security camera recordings, encrypted with no compression and no deduplication. Nothing else to note here.
#### Personal User Share
Encrypted, compressed, and with deduplication. This is where I put my pictures and documents, and it's also
the home directory for my TrueNAS user. I have deduplication enabled here because I have some software
projects/downloads duplicated under different names, as well as some duplicated pictures from having multiple
backups (Plex upload, Nextcloud upload, manual backup before moving to a new phone).
##### Private Dataset
Same as the personal share it's nested under, except this dataset is encrypted with a passphrase. Data in this
share is generally inaccessible unless I log in to the TrueNAS Web UI and unlock it with the passphrase; useful
for sensitive data that isn't often accessed.
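The Web UI handles this for me, but the CLI equivalent (with a hypothetical dataset path) is just a key load
followed by a mount:
```shell
# Unlock: prompts for the passphrase, then mount the dataset.
zfs load-key Rusty/Users/Example/Private
zfs mount Rusty/Users/Example/Private
# Lock it again when finished.
zfs unmount Rusty/Users/Example/Private
zfs unload-key Rusty/Users/Example/Private
```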
## Shares
With all of my data organized into datasets, shares need to be configured so I can access that data from
client systems. For the most part, this is a 1:1 mapping where each dataset gets an SMB and/or NFS share
but I will highlight my personal user share since it's a little unique.
My personal share is exposed with the Private dataset nested inside it, available whenever it's unlocked.
If the Private dataset is locked, clients just see an empty directory. This is a nice feature: even
authenticated SMB connections can't leak sensitive information as long as I keep the dataset locked behind its
passphrase.
## Data Protection
I have a couple of backups that run locally with plans to include an offsite backup in the future.
### Manual Backup Target
I have an old 1U server that I turn on manually every week or two to run a replication task. This server has
1Gig networking, and I found that SSH alone only transferred at around 40Mbps; following some advice in the
TrueNAS forums, I enabled SSH + Netcat, which got transfers back up to 1Gig pretty consistently. My backup
server runs Ubuntu Server, so I did have to build [libzfs](https://github.com/truenas/py-libzfs) from source
for this. I don't have enough capacity here for a full replication, so I exclude the `Backup` share.
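Conceptually, the replication task boils down to a recursive snapshot plus a `zfs send` stream. TrueNAS manages
the snapshots and the Netcat data channel itself, and the host, pool, and snapshot names below are placeholders
(my real task also excludes the `Backup` dataset, which a plain recursive send would include):
```shell
# Snapshot everything, then stream it to the backup box over SSH.
zfs snapshot -r Rusty@manual-2024-01-18
zfs send -R Rusty@manual-2024-01-18 | ssh backup-server zfs receive -F backuppool/Rusty
```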
### Scheduled Backup Target
I set up a replication task to my Unraid server, where I configured a RAIDZ pool using the drives previously
assigned to an Unraid data array. I configured replication using SSH only for this task, and I am seeing
transfer rates around 3Gbps, which is about the best I would expect from HDDs.
## Users
I decided to use service accounts for containers to access shares, as well as a user account for myself. I
created dedicated accounts for Nextcloud, Security, and Media access and used those user groups to grant
myself access to the relevant shares.
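TrueNAS manages users and groups through its UI, but the idea maps to ordinary Unix groups, roughly like this
sketch (all names here are hypothetical):
```shell
# One group per share; service accounts and my own user join the groups they need.
groupadd media
useradd --no-create-home -G media svc_plex   # service account for Plex
usermod -aG media zak                        # my own user gets media access
chown -R root:media /mnt/Rusty/Media
chmod -R 770 /mnt/Rusty/Media
```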
## Other Config
I didn't do too much manual configuration and mostly left default settings outside of enabling services for
the various shares. I enabled 2FA to require an authenticator code at login, though I don't intend on exposing
the admin UI outside my network so this is probably overkill.
I also increased the ARC memory limit after observing fairly constant memory usage by other services. I will
note that this seems to be a contentious topic on the forums, and some claim that increasing ARC can cause
system instability. Personally, I know that I'm not running any services that will consume more memory
(i.e. containers and VMs), so I feel comfortable allocating much more of my RAM to ARC. I added the startup
script below to allocate most of my 32GB of RAM to ARC while leaving a small buffer beyond what I see services
using.
```shell
# 28 GiB in bytes (30064771072 = 28 * 1024^3), leaving ~4GB for everything else
echo 30064771072 > /sys/module/zfs/parameters/zfs_arc_max
```
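After a reboot, it's worth confirming the limit actually took; both of these should reflect the new value:
```shell
# Value is in bytes; arc_summary also reports the current target.
cat /sys/module/zfs/parameters/zfs_arc_max
arc_summary | grep -A 3 "ARC size"
```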
## Next Up
Since this post covered both current and planned configuration, I still have some work to do to get my setup
to match what I've described. At the moment, I'm replicating my new ZFS dataset to my Unraid server before
re-creating the pool on my NAS with the changes I discussed here (and with an additional drive in the data
VDEV).
Migrating my data to Unraid and back is essentially a test run of my data recovery plan, so I'll plan a
dedicated post describing that process and any complications I run into.
Once I have my data comfortably in its new home, I will flesh out my backup strategy and be ready to look into
how I want to structure compute. I mentioned [in my first post](https://blog.mcknight.tech/2023/11/29/Homelab-Upgrades/)
that I was considering Proxmox as a solution here, but I have since been hearing more about
[XCP-ng](https://xcp-ng.org/) and will have to do some more research and probably trial both before making a
decision.