this post was submitted on 02 Jun 2024
50 points (100.0% liked)

Asklemmy

I have about 500GB of data (photos, documents, videos etc.) that I have accumulated over the years. Currently, I keep them on my computer and rsync all additions / changes once a month or so to an external hard drive. Do I need to be worried about data loss (sectors going bad, bit rot, bit flip, whatever it is called)?

To clarify,

  1. None of this is commercially important; I just don't want to get into a situation where I look up an old family photo or video twenty years down the line and it has got corrupted.

  2. Both my computer and the external HD are HDDs. They are fairly cheap here (and very cheap if second hand). Buying SSDs or dedicated hardware would be expensive.

top 28 comments
[–] [email protected] 19 points 5 months ago (2 children)

The 3 2 1 rule is always the gold standard.

I'd recommend at least adding an offsite backup. Set up rclone with a mounted folder (client side encryption is recommended) and sync the files to that as well.

I use Backblaze for about $6/TB/mo, pro-rated for whatever amount is actually used.
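
As a rough sanity check on that price for a 500GB collection, here is a small sketch. It assumes 1 TB = 1000 GB and takes the $6/TB/month figure quoted above at face value; real billing may include other charges, and the function name is just illustrative.

```python
def prorated_monthly_cost(stored_gb: float, usd_per_tb_month: float = 6.0) -> float:
    """Monthly bill in USD when storage is charged pro rata per TB.

    Assumes 1 TB = 1000 GB and ignores any download or transaction fees.
    """
    return stored_gb / 1000 * usd_per_tb_month


# 500 GB at the quoted rate
print(f"${prorated_monthly_cost(500):.2f} per month")  # prints "$3.00 per month"
```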

[–] [email protected] 6 points 5 months ago

Seconded. For that small amount of data, a Backblaze account would be cheap and more than enough. If OP is worried about security, enabling a crypt remote in rclone is fairly trivial.

3-2-1, OP: 3 copies of your data, across 2 different storage media, with at least 1 offsite.
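
The rule is simple enough to write down as a check. This is only a sketch: the `Copy` type and the media labels are made up for illustration, and whether an internal and an external HDD count as "different media" is a judgment call (here they do not).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    medium: str    # illustrative label, e.g. "hdd", "ssd", "cloud"
    offsite: bool


def satisfies_3_2_1(copies: list[Copy]) -> bool:
    """True if there are >= 3 copies, on >= 2 media types, with >= 1 offsite."""
    return (
        len(copies) >= 3
        and len({c.medium for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


# OP's setup as described: computer plus one external drive, both HDDs at home.
current = [Copy("computer", "hdd", False), Copy("external drive", "hdd", False)]
print(satisfies_3_2_1(current))                                           # False
print(satisfies_3_2_1(current + [Copy("cloud bucket", "cloud", True)]))   # True
```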

[–] [email protected] 3 points 5 months ago (1 children)

$6 is about 500 rupees. I can get another HDD for double that price.

I do copy some important files to Google Drive, but I don't pay for it, and I don't rely on it.

[–] [email protected] 3 points 5 months ago (1 children)

If you don't pay for it, you can't rely on it.

[–] [email protected] 1 points 5 months ago (3 children)

Right, which is why I prefer to rely on local backups. Much cheaper in the long run.

[–] [email protected] 7 points 5 months ago

I used to work with a guy who was religious about backing up his files to external drives. Until someone broke into his house and stole his computer AND his external drives. He lost everything.

[–] [email protected] 3 points 5 months ago

It's always a good idea to have an off-site backup (e.g. in case of fires, robbery, natural disasters, etc). If you prefer to manage them yourself, an option is to find someone else who also needs an off-site backup and exchange disk space. You do your off-site on their machine, and they do theirs on yours. With external HDDs, you can just have someone else hold on to it for you at a different location. You can come up with fancier schemes to reduce the chances of data loss or to make the process simpler if you care to do so.

[–] [email protected] 2 points 5 months ago (1 children)

I also like local-only with a setup similar to yours: rsync to an HDD and to an SSD.
But I would also recommend following that suggestion. You need an external backup managed by someone else (encrypted, of course) so that you have options if anything happens to everything you keep locally.
It's up to you how much you're willing to pay to be sure you can retrieve your data.

I'm using iDrive e2. It says it's a limited offer, but it's been there for over a year.

I'm basically paying $1.25 per month for 2TB (it's charged all at once for 24 months): https://www.idrive.com/s3-storage-e2/pricing

[–] [email protected] 1 points 5 months ago

I see, I'll look into it then. Thank you.

[–] [email protected] 9 points 5 months ago (1 children)

In my experience, a well-treated, lightly used physical hard drive can and likely will hold up for over 15 years.

I haven't had any problems with any of my HDDs, but I don't stress them out with daily gaming or video production, and I don't toss them around like footballs, obviously.

Just speaking from my own experiences though..

[–] [email protected] 1 points 5 months ago

My external HD is working well, but the computer's HD seems to be of poor quality. I'm worried that once the primary copy gets corrupted, the mistakes will then be copied to the external HD as well. (Although if I understand rsync correctly, this shouldn't happen.)
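
For what it's worth, here is a rough sketch of why that is usually true. By default rsync decides whether to re-copy a file with a "quick check" on size and modification time, and silent corruption normally changes neither. The function below is a simplified imitation of that check (not rsync's actual code), and the file names are made up. Note that running rsync with `--checksum` compares contents instead, so a corrupted source file would then be treated as changed and copied over the good backup.

```python
import os
import shutil
import tempfile


def quick_check_differs(src: str, dst: str) -> bool:
    """Simplified size + mtime comparison, in the spirit of rsync's default check."""
    if not os.path.exists(dst):
        return True
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "photo.jpg")
    dst = os.path.join(tmp, "backup.jpg")
    with open(src, "wb") as f:
        f.write(b"original bytes")
    shutil.copy2(src, dst)  # copy2 preserves the modification time

    # Simulate silent corruption: overwrite one byte, then put the old
    # timestamps back, because real bit rot does not touch the mtime.
    before = os.stat(src)
    with open(src, "r+b") as f:
        f.write(b"X")
    os.utime(src, ns=(before.st_atime_ns, before.st_mtime_ns))

    # False here means the corrupted file would be skipped, not propagated.
    print(quick_check_differs(src, dst))
```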

[–] [email protected] 7 points 5 months ago

I started using restic for backups.

Pro:

  • Encryption
  • Deduplication
  • Flexible backup location
  • Data integrity checks

Con:

  • No good GUI

[–] [email protected] 5 points 5 months ago

Hard drives can fail. A strong magnetic field could scramble the data on the platters. HDDs are usually pretty reliable, though. The biggest concern with external HDDs would be fall damage.

I would say to check random files from time to time and you should be fine. Every 2 or 3 years, replace your backup drive. A backup program like Borg could help detect if you have a problem with your files, but you lose a bit of the simplicity of your current rsync method.

Anything you're truly worried about should follow the 3-2-1 standard: minimum 3 copies, on 2 separate media types, with 1 copy offsite. That said, your current setup is already better than 95% of the general population and probably 70% of the Fedi.

[–] [email protected] 4 points 5 months ago (1 children)

I also just do this. However I have already found 2 photos that got randomly corrupted, and I don't know how to prevent that.

So far my only idea was using md5sum, but checking all files like that takes a loooooooooong time.
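
One way to make the checksum idea more useful is to keep a manifest and only raise an alarm when a file's content changed while its size and modification time did not, which is the signature of silent corruption rather than a deliberate edit. The sketch below is an assumption about how one might do that, not an existing tool; the manifest layout and function names are made up. It still has to read every file, so it will not be faster than md5sum, but it can run unattended.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in 1 MiB chunks so large videos do not need to fit in RAM."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def scan(root: str, manifest: dict) -> list[str]:
    """Update `manifest` in place and return the paths that look silently corrupted.

    A file is suspicious when its hash differs from the recorded one although
    its size and mtime are unchanged. Suspicious entries keep their old record
    so the alarm repeats until the file is restored.
    """
    suspicious = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            digest = sha256_of(path)
            old = manifest.get(path)
            unchanged_metadata = old is not None and (
                (old["size"], old["mtime_ns"]) == (st.st_size, st.st_mtime_ns)
            )
            if unchanged_metadata and old["sha256"] != digest:
                suspicious.append(path)
                continue
            manifest[path] = {
                "size": st.st_size,
                "mtime_ns": st.st_mtime_ns,
                "sha256": digest,
            }
    return suspicious
```

The manifest is a plain dict, so it could be saved between runs with `json.dump` and loaded again with `json.load`, ideally stored somewhere other than the drive being checked.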

I am paranoid about cloud. I do have my music backed up on OneDrive, encrypted with GPG using AES256, but I don't even fully trust that. I know, it sounds stupid, but maybe in the future it will be quite easy to break.

But I don't know much about encryption. Just reading the man page, I put these options together:

--s2k-cipher-algo AES256 --s2k-digest-algo SHA512 --s2k-mode 3 --s2k-count 65011712

but whether I can consider that safe enough, I don't know.

And since I don't know enough about it, I prefer not to trust it.

[–] [email protected] 4 points 5 months ago

> I also just do this. However I have already found 2 photos that got randomly corrupted, and I don't know how to prevent that.

If you are fine with changing your file system, check out ZFS. It stores checksums with your data and, if configured to store multiple copies, can repair corruption.
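
To illustrate the idea only, here is a toy model of "checksum plus redundant copies". It is not how ZFS is actually implemented on disk; the class name, the use of SHA-256, and the in-memory storage are all assumptions made for the sketch.

```python
import hashlib


class MirroredBlock:
    """Toy model: one stored checksum plus N redundant copies of a data block."""

    def __init__(self, data: bytes, copies: int = 2):
        self.checksum = hashlib.sha256(data).digest()
        self.copies = [bytearray(data) for _ in range(copies)]

    def read(self) -> bytes:
        # Find the first copy that still matches the stored checksum.
        good = next(
            (bytes(c) for c in self.copies
             if hashlib.sha256(c).digest() == self.checksum),
            None,
        )
        if good is None:
            raise IOError("all copies fail their checksum")
        # Self-heal: overwrite any damaged copy with the verified one.
        for i, c in enumerate(self.copies):
            if bytes(c) != good:
                self.copies[i] = bytearray(good)
        return good


block = MirroredBlock(b"family photo")
block.copies[0][0] ^= 0x01           # flip one bit in the first copy
print(block.read())                  # b'family photo'
print(bytes(block.copies[0]))        # b'family photo' (repaired)
```

With a single copy, the checksum can still detect the damage but there is nothing to repair from, which is why the redundancy matters.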

[–] [email protected] 3 points 5 months ago (1 children)

Worried in the sense that having a backup is a good idea, yes. Most filesystems offer little protection against a file becoming corrupted, but random corruption is rare. Personally I like an automated, regular cloud backup to B2 and also do a local one that is easier (faster) to restore from. For local, I prefer Borg (or rather the Pika Backup frontend) because you can easily keep snapshots from different dates while also benefiting from file deduplication.

[–] [email protected] 1 points 5 months ago (1 children)

So which filesystems are better for archiving?

[–] [email protected] 2 points 5 months ago

I think ZFS with redundancy is typically the gold standard.

[–] [email protected] 3 points 5 months ago (1 children)

Instead of a single external HD, set up a NAS with a RAID configuration so that even if a drive fails the data is safe.

[–] [email protected] 5 points 5 months ago (1 children)

I want to point out that RAID doesn't actively prevent bit rot and data degradation. You'll want ZFS/RAIDZ for that.

[–] [email protected] 3 points 5 months ago* (last edited 5 months ago) (1 children)

I'm not super familiar with ZFS/RAIDZ. I guess it does extra data scrubbing and stuff to prevent data issues?

That's cool. But a traditional RAID setup still gives you redundancy and fault tolerance, which is the important part, right?

[–] [email protected] 5 points 5 months ago (1 children)

It's a software RAID integrated in the filesystem, as I understand it. This video helped me understand it a bit more and it's why I'm saying ZFS is a better idea. afaik you get the good parts of raid and some more. Obviously I have very superficial knowledge on all of this though, so I recommend doing your own reading :P

[–] [email protected] 3 points 5 months ago (1 children)

I recommend kopia. It lets you back up automatically to a primary location, copy that data periodically to a secondary location, and it has a command that you can use to verify all the data is actually what it was when the backup was created.

[–] [email protected] 2 points 5 months ago (1 children)

Thank you. On that note, when backing up, is there a way to compare the two versions, see if one has become corrupted, and copy the good version to both? It would be sad if your primary copy got corrupted, and you overwrote all other copies with it.
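
Something like the following is what I have in mind. It is only a sketch and assumes a checksum of each file was recorded while the file was known to be good, because with just two copies and no reference there is no way to tell which one is the damaged one. The function names are made up.

```python
import hashlib
import shutil


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def reconcile(primary: str, backup: str, expected_sha256: str) -> str:
    """Compare both copies against a known-good hash and repair the bad one."""
    primary_ok = sha256_of(primary) == expected_sha256
    backup_ok = sha256_of(backup) == expected_sha256
    if primary_ok and backup_ok:
        return "both copies OK"
    if primary_ok:
        shutil.copy2(primary, backup)
        return "restored backup from primary"
    if backup_ok:
        shutil.copy2(backup, primary)
        return "restored primary from backup"
    return "both copies damaged"
```

If both copies fail the check, nothing is overwritten, so a third copy could still be used to recover.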

[–] [email protected] 1 points 5 months ago* (last edited 5 months ago)

Kopia uses content-addressable storage. So basically when it copies things, it only copies data that is new. Files that haven't changed will not be overwritten.
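
As a toy illustration of the concept (not Kopia's actual format, which as far as I know also chunks, packs and encrypts the data), a content-addressable store keys every blob by the hash of its contents. Identical data is stored once, and verification is just re-hashing each blob and comparing it with its key. The class below is made up for the example.

```python
import hashlib


class ContentStore:
    """Toy content-addressable store: the key of a blob is the SHA-256 of its bytes."""

    def __init__(self):
        self.blobs: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        key = hashlib.sha256(data).hexdigest()
        # Existing content is never rewritten; new content is added once.
        self.blobs.setdefault(key, data)
        return key

    def verify(self) -> list[str]:
        """Return the keys whose stored bytes no longer hash to the key."""
        return [
            key for key, data in self.blobs.items()
            if hashlib.sha256(data).hexdigest() != key
        ]


store = ContentStore()
first = store.put(b"holiday photo")
second = store.put(b"holiday photo")   # same content, so nothing new is stored
print(first == second, len(store.blobs))   # True 1
print(store.verify())                      # []
```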

You kind of need to run the verification command on both the source and the "backup copy" for maximum paranoia. If you're running it on a local copy, that should be a relatively fast process as you don't need to download stuff.

You'd basically connect on the command line to the copy you just updated via sync-to and then ask kopia to verify 100% of the file integrity ... it should then run through everything and make sure it matches what's supposed to be there. I'm not sure how you fix it if it detects something wrong, I've yet to run into that ... I'm sure there's a way 🙂

You could also use two backup drives and sync to both; then, if you get an error restoring a particular file from one, you could in theory restore it from the other. A ZFS pool with redundant copies and/or a RAID-1, RAID-5 or RAID-6 style setup could also help ... but most people aren't going to run an entire NAS just to turn it on periodically and back up their data "offline". Most people are going to be better served (IMO) by cloud storage like B2 (where bitflips aren't really a concern) or a NAS (where bitflips are similarly a minimal concern, ideally in another location). Either of those, plus a periodically updated offline copy (on, say, an external hard drive), should be enough to protect most people's data well.

Also going to link to what I'm talking about:

[–] [email protected] 2 points 5 months ago

I recommend encrypting them locally and backing up the encrypted copy to a cloud drive.

[–] [email protected] 1 points 5 months ago

I like Duplicacy, which is $20 but worth it for something as important as backups.