ChatGPT, write a post for the stuff that I have in my head and want to get out as an update.
Hmm. No brain implant yet. Guess I'll have to write this the hard way.
Syncing update
It has been an eventful week. I successfully deployed the initial version of smarter content syncing, and have made some adjustments to algorithm since then. Most notably, communities with only 1 subscriber (the bot) will no longer receive updates, and communities with fewer than 5 subscribers or with a low posting frequency will only be updated twice a day. Furthermore, for the highest update priority (every 10 minutes), a community must have a minimum of 50 subscribers. Implementation details can be found in the decide_interval()
method over here.
Being a developer is fun
Meanwhile... Damnit, bot is stuck again.
2023-07-08 10:13:39,945 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time 2:30:48 ago, interval 120 minutes
2023-07-08 10:13:40,653 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
2023-07-08 10:13:45,324 - utils.syncer - ERROR - Error trying to retrieve post details, try again in a bit; Couldn't retrieve post detail page
2023-07-08 10:13:46,333 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time 2:30:54 ago, interval 120 minutes
2023-07-08 10:13:48,581 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
2023-07-08 10:13:51,227 - utils.syncer - ERROR - Error trying to retrieve post details, try again in a bit; Couldn't retrieve post detail page
...
1 bugfix and deployment later:
2023-07-08 10:46:42,836 - utils.syncer - INFO - Scraping subreddit: bustynaturals. Last time 3:03:51 ago, interval 120 minutes
2023-07-08 10:46:43,573 - utils.syncer - INFO - 'latina bodies are the best' at https://www.reddit.com/r/BustyNaturals/comments/14twww8/latina_bodies_are_the_best/ updated: 2023-07-08 07:14:13+00:00
2023-07-08 10:46:48,327 - utils.syncer - ERROR - Couldn't find post on https://old.reddit.com/r/BustyNaturals/comments/14told8/latina_bodies_are_the_best/, skipping.
Defederation
Meanwhile, the folks at https://lemmy.world reached out to me to tell me they're defederating Lemmit. They are not fond of high volume of posts made by the bot, and the fact that there are now (quick check) 462 communities on this server all being moderated by a single person. They have already received a couple of complaints about spam, and it didn't help that some requests for NSFW subreddits were not marked as NSFW. Occasionally, those subreddits had explicit thumbnails that appeared in the 'All feed' without warning.
I had a good talk with the LemmyWorld admin, wherein they explained their point of view, and I explained mine. I understand their decision to disassociate with Lemmit, and appreciate their attempt to contact me. Other instances like Beehaw, and some smaller ones have also reached the same decision.
This does mean that you will no longer be able to get new community updates on those servers. So make sure to check the blocked instances list on your home server if you were subscribed to Lemmit. At the same time I have removed all the subscriptions of users from those servers, in order to not affect the sync priority mentioned above. This does mean, that if LemmyWorld, Beehaw, etc ever decide to connect to Lemmit again (however unlikely), you will need to un- and re-subscribe from there.
Meanwhile, I've added a feature in the bot that will remove request posts for NSFW subreddits, if the post itself is not marked for NSFW. This should prevent explicit thumbnails showing up where they are not wanted.
Server growth
Last night I got an alert from my server monitoring that the disk is 80% full. Unfortunately, the disk is only 60 GB, so that doesn't leave much room for expansion. On the bright side, a good chunk of that is from Lemmys very verbose logging (like, 4 GB a day, which gets cleaned up daily), so it should last throughout the weekend if I tune that down.
Furthermore, most of the storage growth is from from pictrs
, the image upload part of Lemmy, and that can utilize an S3 bucket, rather than using the VM's storage like it is now. Using an S3 bucket offers a cost-efficient solution for expanding storage. Initial estimates indicate a monthly cost of around $5 for 1000 GB of storage, which should be sufficient for a while *fingers crossed*.
In the early days of Lemmit (literally, as the server is less than a month old) image uploads were limited to a default setting, which was something around 40 megabytes. That did add up quickly (thanks to half-minute porn gifs), and so I had to limit the max filesize to 1 MB, and later 0.5 MB. Once the server has switched to S3 storage, I can probably up that limit a little, although not too much.
Finally, Lemmy v0.18.1 has been released, and it contains even more performance boosts compared to v0.18.0, so if there's time left this weekend (and I can verify the Lemmit Bot is compatible), I will probably perform the upgrade.
It's done: [email protected].