Discussion:
choosing OpenBSD for fileserver instead of FreeBSD + ZFS
Miles Keaton
2016-07-20 11:52:04 UTC
Got a fileserver with a few terabytes of important personal media, like all
old home movies, baby photos, etc. Files that I want my family to have
access to when I die.

Really it's more of a file archive. A backup. Just rsync + ssh. Serving
it isn't the point. Just preserving it forever.

(It's all unencrypted. It's not that kind of private. Private and offline
from the outside world, but public within the family.)

For years it's been on a Synology, Linux ext4 filesystem. Now I'm making a
new clone of it (new PC) to be in a different location.

I assumed I'd use FreeBSD + ZFS because of ZFS's checksum features. But
really I love and prefer OpenBSD for everything else, and don't want any
other ZFS features: just that checksum.

So I figure if I use OpenBSD + softraid RAID 5 (across 4 disks) and then
write my own little shell script to track the MD5 (find . -type f -exec md5
{} \;) whenever I make changes, that should be enough to see if a file has
been changed due to disk corruption.
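
An illustrative sketch of the checksum log described above, written in Python rather than shell so it runs the same on any system. It swaps MD5 for SHA-256, which is an assumption of this sketch and not part of the original plan (MD5 is adequate for spotting accidental corruption). The function names and the manifest layout (path mapped to mtime, size, digest) are hypothetical.

```python
import hashlib
import os


def file_digest(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each regular file (path relative to root) to (mtime, size, digest)."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.islink(full) or not os.path.isfile(full):
                continue  # skip symlinks, sockets and other special files
            st = os.stat(full)
            rel = os.path.relpath(full, root)
            manifest[rel] = (int(st.st_mtime), st.st_size, file_digest(full))
    return manifest
```

Saving the result of `build_manifest()` after every intentional change gives a baseline that later runs can be compared against.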

(Which makes me realize I don't know a damn thing about disk corruption,
only that it's happened a few times in the past. The occasional JPG or MP3
from the late 90s that used to work but now doesn't, and who-knows-why.)

Before I embark on this direction for a fileserver, I thought I should
check with the smart people here on misc:

Any tips from anyone who's done something similar?

Or would anyone advise me against OpenBSD or this MD5 log approach for a
fileserver like this?

Thank you.
Solène
2016-07-20 12:00:33 UTC
Hello,

I built a NAS with OpenBSD and I use aide to track checksum changes. I
have a tutorial about this in progress, but it's not finished yet; I
will let you know when it is.

In short: with aide, create a database recording each file's
modification time, checksum and size. When you run a check with aide,
look for files whose checksum differs but whose modification time
didn't change. This can be done with a one-line awk command.
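
The rule above (checksum differs, modification time unchanged) can be sketched in a few lines of Python. This is not aide's database format or output; the manifests here are hypothetical dictionaries mapping a path to (mtime, size, digest), and the sample values are made up.

```python
def find_silent_corruption(old, new):
    """Return paths whose digest changed while the mtime stayed the same."""
    suspects = []
    for path, (mtime, _size, digest) in old.items():
        if path not in new:
            continue  # deleted files belong in a separate report
        new_mtime, _new_size, new_digest = new[path]
        if new_digest != digest and new_mtime == mtime:
            suspects.append(path)
    return sorted(suspects)


# Hypothetical manifests taken at two different times.
old = {
    "photos/baby.jpg": (1468000000, 2048, "aaa"),
    "movies/1999.avi": (1468000000, 4096, "bbb"),
    "notes.txt": (1468000000, 10, "ccc"),
}
new = {
    "photos/baby.jpg": (1468000000, 2048, "aaa"),  # untouched
    "movies/1999.avi": (1468000000, 4096, "XXX"),  # digest changed, mtime same
    "notes.txt": (1469000000, 12, "ddd"),          # edited on purpose
}

print(find_silent_corruption(old, new))  # ['movies/1999.avi']
```

A file edited on purpose gets a new mtime and is ignored; only the file whose content changed "behind the filesystem's back" is reported.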

Also, make backups. RAID 5 will prevent data loss when one disk fails,
but if two disks fail or if the filesystem gets corrupted, you will lose
your data. When you have multiple terabytes of data on disks that were
manufactured at the same time, chances are that they can fail at the
same time; also, rebuilding a few terabytes can take a long time.
Keeping backups with rsnapshot to track a few days of changes can be a
good idea, or at least save the very important data if you can't afford
to back up everything (maybe losing the music or video files is
acceptable?)
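
To illustrate why RAID 5 survives one failed disk but not two, here is a toy Python sketch of XOR parity over a single stripe. It is a simplification for illustration only, not how softraid lays out data; the block contents are made up.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks plus one parity block, as on one stripe of a 4-disk array.
data = [b"home", b"baby", b"pics"]
parity = xor_blocks(data)

# Lose the second disk: XOR of the survivors and the parity gives it back.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt)  # b'baby'
```

With two blocks missing, the single parity equation has two unknowns and nothing can be rebuilt, which is why a second failure during a long rebuild is fatal.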
Francois Pussault
2016-07-20 12:03:20 UTC
Hello,

I still have my personal NAS working as storage, using the standard
OpenBSD filesystem + rsync to a backup machine and a very simple md5sum on
that partition. (The partition is mounted read-only by default; I remount it
rw only to rebuild the md5sum each time I add new files to that storage.)

I don't use any versioning on top of that (neither svn nor git).

But the size is far smaller than yours: about 375 GB on a 500 GB physical
disk.
Regards,
Francois Pussault
10 chemin de négo saoumos
apt 202 - bat 2
31300 Toulouse
+33 6 17 230 820
***@contactoffice.fr
Kamil Cholewiński
2016-07-20 12:08:21 UTC
Post by Miles Keaton
So I figure if I use OpenBSD + softraid RAID 5 (across 4 disks) and then
write my own little shell script to track the MD5 (find . -type f -exec md5
{} \;) whenever I make changes, that should be enough to see if a file has
been changed due to disk corruption.
This will detect corruption, but won't fix it. ZFS fixes corrupted files
on the fly, when possible, and updates on-disk parity to sustain another
hit on the same file.

Also I would rather recommend you use RAID10, with drives from two
different batches.
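
The difference between detecting and fixing can be sketched as follows: a checksum alone says a copy is bad, but a checksum plus a redundant copy lets you pick the good one and overwrite the bad one. This Python toy only illustrates that idea; it is not ZFS's implementation, and the function names and sample bytes are hypothetical.

```python
import hashlib


def sha256_hex(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def heal(copies, expected_digest):
    """Repair mirrored copies of one block from any copy that still verifies.

    Returns the repaired list, or raises if no copy matches the checksum.
    """
    good = next((c for c in copies if sha256_hex(c) == expected_digest), None)
    if good is None:
        raise ValueError("every copy is corrupt; restore from backup")
    return [good for _ in copies]


original = b"baby photo bytes"
digest = sha256_hex(original)           # recorded when the data was written
mirror = [b"baby photo bytEs", original]  # first copy has one flipped bit

print(heal(mirror, digest) == [original, original])  # True
```

With an MD5 log on top of plain RAID there is no automatic equivalent of `heal()`: the log tells you which file is bad, and the repair is a manual restore from backup.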
Theodoros
2016-07-20 12:28:55 UTC
+1, zfs and hammer are great filesystems for such a use.

Looking forward to RAID10 support on softraid (!).
Kamil Cholewiński
2016-07-20 12:42:52 UTC
Post by Theodoros
+1, zfs and hammer are great filesystems for such a use.
Looking forward to RAID10 support on softraid (!).
Been running "manually stacked" RAID10 with 6 drives, on a low-traffic
production system, for half a year. System boots off the first RAID1
array. The second RAID1 provides altroot. Script in rc.local assembles
the RAID0 volume with the data pool.

However, I didn't try an upgrade yet. ;)
Christian Weisgerber
2016-07-20 12:43:59 UTC
Post by Miles Keaton
So I figure if I use OpenBSD + softraid RAID 5 (across 4 disks) and then
write my own little shell script to track the MD5 (find . -type f -exec md5
{} \;)
Note that mtree(8) can checksum files.
--
Christian "naddy" Weisgerber ***@mips.inka.de
Scott Bonds
2016-07-20 13:15:48 UTC
Take a look at par2. https://en.wikipedia.org/wiki/Parchive
Bryan C. Everly
2016-07-20 13:37:10 UTC
Interesting. Seems to be in our ports tree as well. Now I know what I'm
doing this evening. :)
Post by Scott Bonds
Take a look at par2. https://en.wikipedia.org/wiki/Parchive
Karel Gardas
2016-07-20 13:48:08 UTC
Post by Kamil Cholewiński
Post by Miles Keaton
So I figure if I use OpenBSD + softraid RAID 5 (across 4 disks) and then
write my own little shell script to track the MD5 (find . -type f -exec md5
{} \;) whenever I make changes, that should be enough to see if a file has
been changed due to disk corruption.
This will detect corruption, but won't fix it. ZFS fixes corrupted files
on the fly, when possible, and updates on-disk parity to sustain another
hit on the same file.
Yes, similar functionality is in the RAID1C patch I posted on tech@ in
the past. Life is too intense right now, so I barely get to work on
updating it to meet Joel Sing's requirements. Anyway, someday I hope to
push it to tech@ again.
Kenneth Gober
2016-07-20 18:07:31 UTC
Post by Miles Keaton
Got a fileserver with a few terabytes of important personal media, like all
old home movies, baby photos, etc. Files that I want my family to have
access to when I die.
Really it's more of a file archive. A backup. Just rsync + ssh. Serving
it isn't the point. Just preserving it forever.
When you die, will there be somebody around who knows how to access
these files? I have a file server running OpenBSD and I have both NFS
and Samba configured. Samba is the important one if you want people
who are less technically savvy to be able to access the data. Samba
makes the files easily accessible from a Windows system. Make sure
your survivors know how to access this data or your efforts are for nothing.

Even with RAID10, your data is still at risk. A fire, for example, can trash
everything at once. I back up my server to tape. As businesses upgrade
to the latest tape technology the older stuff becomes available relatively
cheaply, especially used. I got a used SAS LTO4 tape drive and a SAS
controller for it (one OpenBSD supports) on eBay for a good price. LTO4
tapes have a nominal capacity of 800GB uncompressed, or 1600GB
compressed; in practice with my particular data I get about 950GB. To
protect from loss due to fire, I keep a full set of tapes stored someplace
over 100 miles away from my home.

Use cron jobs to automate tape backups, then all you have to do is
remember to change the tape in the drive. I have a cron job that does
a level 1 dump every Sunday so the only thing I normally have to do
is change tapes once per week. Periodically I will do a level 0 (full)
dump but I do these manually because they will span multiple tapes
and a cron job doesn't work well for those.

FFS/FFS2 filesystems in OpenBSD work reliably, but fsck can take
a while to run on bootup if the server didn't shut down cleanly (e.g.
after a power failure). My server is running on fairly old hardware
and it takes between 30 and 60 minutes to fsck 6TB of space after
an unclean shutdown. Putting your server on a UPS will help you
avoid unclean shutdowns due to short power failures, but extended
outages will eventually exhaust the batteries. It's possible to have
the server automatically shut down when the UPS batteries get
low but I don't do this because I'm sure that as soon as I start the
shutdown process, the power will come back on. I hold out until
the bitter end even if it means a longer fsck later.

-ken
Tinker
2016-07-21 01:09:05 UTC
On 2016-07-20 21:48, Karel Gardas wrote:
...
Post by Karel Gardas
past. Life is too intense now so I barely work on this to update it
following Joel Sing requirements. Anyway, someday I hope to push it
+1 for RAID1C!
Chris Bennett
2016-07-21 17:08:09 UTC
Post by Solène
Also, make backups. RAID 5 will prevent data loss when one disk fails, but
if two disks fail or if the filesystem gets corrupted, you will lose your
data. When you have multiple terabytes of data on disks that were
manufactured at the same time, chances are that they can fail at the same
time; also, rebuilding a few terabytes can take a long time. Keeping backups
with rsnapshot to track a few days of changes can be a good idea, or at
least save the very important data if you can't afford to back up everything
(maybe losing the music or video files is acceptable?)
As I understand it, the worst thing you can do to your hardware and your
disks is to power them off. It shocks the power supply and motherboard
components, and the disks have to spin up again, which oftentimes they
can't do, even though they would have kept spinning reliably for another
couple of years if never powered down.

So would it be best to keep a system like this up 24/7?
How does life expectancy compare using home PC versus server PC?

Are there hard drives out there that stop spinning on their own after a
certain time if inactive?

SSDs are getting much bigger now. Are they now considered more reliable
or less reliable than spinning disks, or is that not decided yet?

Chris Bennett
Boris Goldberg
2016-07-21 15:21:01 UTC
Hello Miles,

I researched this about 18 months (or maybe 2 years) ago for the
business, and even asked the list. I decided in favor of FreeNAS (based on
FreeBSD + ZFS, in case someone doesn't know). I can't tell how it went
because the project died for reasons unrelated to the storage.

If you decide to go with OpenBSD, I'd strongly suggest using a good
hardware RAID controller (rather than relying on softraid). Make sure it's
supported. I've had good experience with the HP Smart Array Pxx series. You
can buy older models quite cheaply on eBay (if you trust eBay). I haven't
checked it on a "generic" PC, though. Install the battery and replace it
when the system complains (on boot or otherwise); batteries are also sold
on eBay.

RAID5 might not be enough when dealing with a "few terabytes": there is a
risk of a second disk getting corrupted due to the high activity during
recovery (google the subject). Consider RAID6 or RAID10 (1E, 1C, etc.);
both require a minimum of four disks.

I was told that fsck requires about 1 GB of memory per 1 TB of space. This
can be dealt with by splitting the space into multiple partitions (labels).
The ZFS memory requirements aren't lower anyway.

You need some sort of snapshotted (!) backup. Even if the RAID saves you
from disk corruption (the "if" here is bigger than most people think), a
human error (or a virus on someone's computer or phone) can destroy all your
data, and then an rsync can propagate the "changes" to the backup (also
destroying it if you don't have proper snapshots). The snapshots don't need
to be called "snapshots": any sort of backup that can be restored to an
older date will do.
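
A minimal Python sketch of a dated snapshot scheme, assuming a POSIX filesystem with hard links: unchanged files are hard-linked to the previous snapshot, so each snapshot looks complete but only changed files use new space. The function and directory names are hypothetical, and a tool such as rsnapshot (mentioned earlier in the thread) is a more robust choice than this toy.

```python
import filecmp
import os
import shutil


def snapshot(source, snapshots_dir, name, previous=None):
    """Copy source into snapshots_dir/name, hard-linking unchanged files."""
    target = os.path.join(snapshots_dir, name)
    for dirpath, _dirnames, filenames in os.walk(source):
        rel_dir = os.path.relpath(dirpath, source)
        dst_dir = os.path.normpath(os.path.join(target, rel_dir))
        os.makedirs(dst_dir, exist_ok=True)
        for filename in filenames:
            src = os.path.join(dirpath, filename)
            dst = os.path.join(dst_dir, filename)
            old = None
            if previous:
                old = os.path.join(snapshots_dir, previous, rel_dir, filename)
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, dst)       # unchanged: share the old snapshot's data
            else:
                shutil.copy2(src, dst)  # new or changed: store a fresh copy
    return target
```

A file deleted by mistake on the source simply does not appear in the next snapshot, while the earlier snapshot still holds it. Hard-linked files share the same data on disk, so this guards against human error, not against disk corruption.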


--
Best regards,
Boris mailto:***@prodigy.net
Liviu Daia
2016-07-21 20:08:57 UTC
Post by Miles Keaton
Got a fileserver with a few terabytes of important personal media,
like all old home movies, baby photos, etc. Files that I want my
family to have access to when I die.
Really it's more of a file archive. A backup. Just rsync + ssh.
Serving it isn't the point. Just preserving it forever.
[...]

    Don't rely on your machines alone. As other people have pointed
out, a fire can ruin your backup in a few minutes. There are online
storage services; copy your backups to two or more separate services
like this, and make sure your family knows about them and knows how to
restore your files from them. Only when you have that sorted out should
you spend time optimizing your local backup system.

Regards,

Liviu Daia
