GREAT NEWS about Lemmy Server Performance, another major SQL mistake has been discovered today: every single comment & post create (INSERT) is updating ~1700 rows in the site_aggregates table (lemmy.ml)

submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

Details here: https://github.com/LemmyNet/lemmy/issues/3165

Fixing this will VASTLY decrease the I/O load on PostgreSQL: the mistaken code writes ~1700 rows (one per known Lemmy instance in the database) on every single comment & post creation. Because these are writes, they also create record-locking issues, which are harsh on the system. Once this is fixed, some site operators will be able to downgrade their hardware! ;)
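
To make the cost concrete, here is a minimal sketch using Python's built-in sqlite3 rather than PostgreSQL. It assumes the problem boils down to an UPDATE on site_aggregates with no filter for the local site; the simplified schema, the `site_id` and `comments` columns, and the `rows_touched` helper are hypothetical stand-ins, so check the linked issue for the actual trigger code.

```python
import sqlite3

# Hypothetical, simplified schema: one aggregate row per known instance.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE site_aggregates ("
    "site_id INTEGER PRIMARY KEY, "
    "comments INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO site_aggregates (site_id) VALUES (?)",
    [(i,) for i in range(1, 1701)],  # ~1700 known instances
)


def rows_touched(sql, params=()):
    """Run a write statement and return how many rows it modified."""
    return conn.execute(sql, params).rowcount


# Assumed shape of the mistake: no WHERE clause, so every row is rewritten.
unfiltered = rows_touched(
    "UPDATE site_aggregates SET comments = comments + 1"
)

# Assumed shape of the fix: only the local site's row is rewritten.
filtered = rows_touched(
    "UPDATE site_aggregates SET comments = comments + 1 WHERE site_id = ?",
    (1,),
)

print(f"unfiltered update touched {unfiltered} rows")
print(f"filtered update touched {filtered} rows")
```

Each new comment costing ~1700 row writes instead of one is where the extra I/O and lock contention described above would come from.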

[-] [email protected] 38 points 2 years ago

This is fascinating

My biggest takeaway from reading through the GitHub comments, though, is that it seems like no one actually knows where much of the SQL comes from. As in, it's possible that the bug in question is just one manifestation of old, handwritten Postgres code that may or may not be optimized (or even logical?).

I don't mean this in a critical way, as things like this are bound to happen in an open-source, federated world. However, I would think a comprehensive audit of the Lemmy Postgres triggers, queries, etc could potentially save us all from some future headaches.

[+] [email protected] -46 points 2 years ago

That's not fascinating, that's depressing. Lemmy team lacks development skill.

[-] [email protected] 33 points 2 years ago* (last edited 2 years ago)

I am always fascinated by these types of comments, specifically about free and open-source software. There are Lemmy instances supporting hundreds of thousands of users and their traffic, and feedback from both server owners and Lemmy devs is almost instantaneous.

A platform like lemmy requires client side knowledge to build both desktop and mobile UI (that are performant), it requires ActivityPub knowledge to integrate with the Fediverse, it requires backend knowledge to build APIs for 100% feature compatibility with 3rd party apps. It requires DB knowledge to optimize queries, it requires devops / platform knowledge to deploy it.

And all of this is built in public.

BuT LEMmY tEaM lAcks dEvEloPmenT SkiLL – sure buddy.

[-] [email protected] 13 points 2 years ago

Or there's just room for improvement and optimization. Each developer has their strengths and weaknesses, like any other professional, and a system like Lemmy is very complex and has to cover a lot, from backend to front end.

And there used to be only 2 developers.

I once checked an open-source implementation of a niche product from Microsoft, and it was a nightmare of unoptimized code. And Microsoft spent a lot of development resources there.

Creating Lemmy as a two-person job is quite impressive. Luckily, there are now resources for optimization.

[-] [email protected] 11 points 2 years ago

This is such a dumb take. Using a database efficiently is not some binary, once-off thing: you build what works based on the data you have at the time. When it works, you move on to other features. It takes analysis of real operation over time to find the bottlenecks, and discipline to focus on fixing the things that will have the most benefit to your users.

There are many successful tech companies who introduce features that create dogshit performance impacts regularly. They work because there are people looking at metrics and catching issues to fix. This is healthy.

[-] [email protected] 7 points 2 years ago

Luckily, with open source, new team members or a fork can address that.

[-] [email protected] -2 points 2 years ago

Yep.

this post was submitted on 23 Jul 2023

240 points (99.2% liked)

Lemmy Server Performance

lemmy_server uses the Diesel ORM that automatically generates SQL statements. There are serious performance problems in June and July 2023 preventing Lemmy from scaling. Topics include caching, PostgreSQL extensions for troubleshooting, Client/Server Code/SQL Data/server operator apps/server operator API (performance and storage monitoring), etc.

founded 2 years ago

MODERATORS

[email protected]