So, I've been running offsite copies to an OVH S3 bucket via PBS running as a VM, but I ran into an issue: verification of the backups is so slow that they're practically unusable.

Copies run nightly and I've set the storage to keep the last 4 copies in place. Bigger VMs, like my immich instance with a bit over a terabyte of data, take several days to verify, and logs show data rates at around 5 MB/s or less. So with the current schedule, in practice backups expire before they're verified.
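As a rough back-of-the-envelope check of those numbers, here is a minimal sketch. It assumes the data is rounded down to 1 TB (10^12 bytes), that verification reads everything once at a sustained rate, and that keeping the last 4 nightly copies means a snapshot lives for about 4 days. The function name `verify_days` and the sample rates are only for illustration.

```python
def verify_days(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Days needed to read size_bytes once at a sustained rate."""
    return size_bytes / rate_bytes_per_s / 86_400  # 86,400 seconds per day


SIZE = 1e12      # assumption: "a bit over a terabyte" rounded down to 1 TB
KEEP_LAST = 4    # nightly copies kept, so a snapshot lives about 4 days

# 5 MB/s is the best case seen in the logs; the lower rates are hypothetical.
for rate_mb in (5, 2.5, 1):
    days = verify_days(SIZE, rate_mb * 1e6)
    verdict = "finishes in time" if days < KEEP_LAST else "expires first"
    print(f"{rate_mb} MB/s: {days:.1f} days -> {verdict}")
```

Even at the full 5 MB/s a single 1 TB verify takes more than two days, and once the rate drops to half of that it no longer fits inside the 4-day retention window.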

I could keep the copies longer, but that'd cost more, or run copies less frequently, which risks losing data if hardware fails at an unfortunate moment (which it most likely will). Tuning settings are at their defaults and, based on what I've read, adding more runners wouldn't really help that much.

The PBS VM itself shows very little load in Proxmox monitoring and I've got plenty of bandwidth to use, so the verification shouldn't have any bottlenecks on my end at those speeds. Cache usage is at around 60% with ~30 GB of total space available.

Does anyone have any ideas on how to speed that up? Or should I just give up and do something totally different? I attempted to run backups to a Hetzner Storage Box over a CIFS mount, but performance there is pretty much the same or worse.

[-] vk6flab@lemmy.radio 2 points 2 weeks ago* (last edited 2 weeks ago)

Have a look at your AWS billing console, since data egress is charged and downloading to verify is considered egress.

AWS S3 supports data checksums where a checksum is calculated at AWS, which you can compare against a checksum that you calculate locally.

This article goes into how it works. I've not (yet) tested it, but I'll be following in your footsteps pretty soon.

https://medium.com/@maureenosaghae86/check-the-integrity-of-data-in-amazon-s3-with-additional-checksums-3e51fe45f530
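For the local half of that comparison, here is a minimal sketch using only the Python standard library. It assumes the object was uploaded in a single part with a SHA-256 additional checksum, which S3 reports as a base64-encoded digest. Multipart uploads produce a composite checksum that will not match a plain file hash. The function name `s3_style_sha256` and the file path are hypothetical, and whether a given S3-compatible provider exposes these checksums is not something this sketch can tell you.

```python
import base64
import hashlib


def s3_style_sha256(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Return the base64-encoded SHA-256 digest of a local file.

    This is the encoding S3 uses for its SHA-256 additional checksum
    on single-part uploads (assumption: the object was not multipart).
    """
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large backup files do not need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return base64.b64encode(digest.digest()).decode("ascii")
```

Calling `s3_style_sha256("/path/to/local/file")` gives a string that can be compared against the checksum the storage side reports for the same object, so nothing has to be downloaded for the check.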

As an aside, make sure that versioning is OFF on your backup bucket unless you specifically require and understand it, because even when you delete objects, they persist as a previous, all but invisible, and charged(!), version.

My former backup software "helpfully" enabled versioning and I was left with a $600 monthly bill for six months while there was no actual backup being done due to a local hardware failure, until I figured out what was happening. I used that software for years and shudder to think just how much extra it actually cost.

I will note that while I had a catastrophic hardware failure, I didn't lose any data.

Finally, if you're storing data in Glacier, retrieval is charged at different rates, depending on timelines of access, so it might be that your backup software is using the slow tier to "save" you money.

Edit: OP advises that they're not using AWS, instead they're using OVH. The object storage solutions appear to be mostly compatible, but I was unable to discover if the OVH implementation supports checksums.

[-] IsoKiero@sopuli.xyz 2 points 2 weeks ago

I'm using OVH, not AWS. Their console gives an estimate of ~20€/month for the ~2TB I have stored. Versioning is disabled and I'm currently running on their signup offer of 200€ credit, so I'm good to go for a few more weeks. The storage I'm using includes the traffic, it's just practically unusable due to verification speeds.

[-] vk6flab@lemmy.radio 3 points 2 weeks ago

I apologise, I saw S3, never even noticed the "OVH", nor had I ever heard of it.

I'll leave my original reply as is with an added disclaimer for anyone who follows down the same path.

this post was submitted on 21 May 2026
32 points (97.1% liked)
