Pushing back against the wave of bot accounts on Lemmy (sh.itjust.works)

submitted 2 years ago* (last edited 1 year ago) by [email protected] to c/[email protected]

Hi everyone. I wanted to share some Lemmy-related activism I’ve been up to. I got really interested in the apparent surge of bot accounts that happened in June. Recently, I was able to play a small part in removing some of them. Hopefully by getting the word out we can ensure Lemmy is a place for actual human users and not legions of spam bots.

First some background. This won't be new to many of you, but I'll include it anyway. During the week of June 18 to June 25, as the Reddit migration to Lemmy was in full swing, there was a surge of suspicious account creation on Lemmy instances that had open registration and no captcha or email verification. Hundreds of thousands of accounts appeared and then sat inactive. We can only guess what they’re for, but I assume they are being planted for future malicious use (spamming ads, subversive electioneering, influencing upvotes to drive content to our front pages, etc.)

If you look at the stats on The Federation you might notice that even the shape of the Total Users graphs is the same across many instances. User numbers ramped up on June 18, grew almost linearly throughout the week, and peaked on June 24. (I’m puzzled by the slight drop at the end. I assume it's due to some smoothing or rate-sensitive averaging that The Federation uses for the graphs?)

Here are total user graphs for a few representative instances showing the typical shape:

Clearly this is suspicious, and I wasn’t the only one to notice. Lemmy.ninja documented how they discovered and removed suspicious accounts from this time period: (https://lemmy.ninja/post/30492). Several other posts detailed how admins were trying to purge suspicious accounts. From June 24 to June 30 The Federation showed a drop in the total number of Lemmy users from 1,822,313 to 1,589,412. That’s 232,901 suspicious accounts removed! Great success! Right?

Well, no, not yet. There are still dozens of instances with wildly suspicious user numbers. I took data from The Federation and compared total users to active users on all listed instances. The instances in the screenshot below collectively have 1.22 million accounts but only 46 active users. These look like small self-hosted instances that have been infected by swarms of bot accounts.

As of this writing The Federation shows approximately 1.9 million total Lemmy accounts. That means the majority of all Lemmy accounts are sitting dormant on these instances, potentially to be used for future abuse.

This bothers me. I want Lemmy to be a place where actual humans interact. I don’t want it to become another cesspool of spam bots and manipulative shenanigans. The internet has enough places like that already.

So, after stewing on it for a few days, I decided to do something. I started messaging admins at some of these instances, pointing out their odd account numbers and referencing the lemmy.ninja post above. I suggested they consider removing the suspicious accounts. Then I waited.

And they responded! Some admins were simply unaware of their inflated user counts. Some had noticed but assumed it was a bug causing Lemmy to report an incorrect number. Others weren’t sure how to purge the suspicious accounts without nuking their instances and starting over. In any case, several instance admins checked their databases, agreed the accounts were suspicious, and managed to delete them. I’m told that the lemmy.ninja post was very helpful.

Check out these early results!

Awesome! Another 144k suspicious accounts are gone. A few other admins have said they are working on doing the same on their instances. I plan to message the admins at all the instances where the total accounts to active users ratio is above 10,000. Maybe, just maybe, scrubbing these suspected bot accounts will reduce future abuse and prevent this place from becoming the next internet cesspool.
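For anyone who wants to repeat the total-vs-active comparison, here is a minimal sketch of the ratio check in Python. It is only an illustration: the `InstanceStats` record, its field names, and the sample domains are hypothetical and do not come from The Federation's data format. The only number taken from this post is the 10,000 threshold.

```python
from dataclasses import dataclass


@dataclass
class InstanceStats:
    # Hypothetical record; field names are not from The Federation.
    domain: str
    total_users: int
    active_users: int


def suspicious_instances(stats, ratio_threshold=10_000):
    """Return (domain, ratio) pairs whose total-to-active ratio exceeds the threshold."""
    flagged = []
    for s in stats:
        # Treat zero active users as one so the ratio is always defined.
        ratio = s.total_users / max(s.active_users, 1)
        if ratio > ratio_threshold:
            flagged.append((s.domain, ratio))
    # Largest ratios first.
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)


# Made-up sample data for illustration only.
sample = [
    InstanceStats("small.example", 60_000, 2),
    InstanceStats("busy.example", 120_000, 9_000),
    InstanceStats("empty.example", 45_000, 0),
]
flagged = suspicious_instances(sample)
for domain, ratio in flagged:
    print(f"{domain}: {ratio:,.0f} accounts per active user")
```

A high ratio only says an instance deserves a closer look. It does not prove that any particular account is a bot.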

That’s all for now. Thanks for reading! Also, special thanks to the following people:

@[email protected] for your helpful post!

@[email protected], @[email protected], and @[email protected] for being so quick to take action on your instances!

top 41 comments

[-] [email protected] 38 points 2 years ago

That's great. Thank you for your hard work. It's a worthy cause.

[-] [email protected] 31 points 2 years ago

Nice! Please don't remove me tho, I migrated from reddit during that wave and I don't interact or comment much, I mostly like looking at posts and scrolling mindlessly while searching for interesting communities to join.

[-] [email protected] 8 points 2 years ago

So just like everyone else

[-] [email protected] 6 points 2 years ago

I generally lurk as well. Usually I'm finding out about something too late to be the first post about it or someone else has already commented what I was thinking. In 13 years on Reddit, I probably have under 100 contributions.

[-] [email protected] 3 points 2 years ago

Don't remove me either pls :(

[-] [email protected] 21 points 2 years ago

How do you tell if these suspicious accounts are actually bots? What if they're users who are actually just inactive?

[-] [email protected] 7 points 2 years ago

To be honest I don't know of a way to objectively distinguish a legit user's inactive account from an automated account created via software. I'm looking at graphs and playing the odds. It's possible there are a small number of legit accounts in there, though I think that's very unlikely.

Looking back, I suppose I didn't elaborate why I think the user counts on the highlighted instances look suspicious. The ones with huge total-vs-active user ratios look like pure bot pools to me due to two characteristics: (1) The number of active users doesn't change as the number of accounts increases. 30,000 or 60,000 new accounts appear and none of them show any activity? No way. (2) The user count growth occurs evenly within a certain date range and then abruptly stops. If these instances were really being used by the general public then I would expect accounts to be generated before June 18 and after June 25. And the growth within that window should be uneven. The fact that multiple instances saw the same growth pattern on the same dates smells like automation at work.
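Those two characteristics could be turned into a rough automated check. The sketch below is not how the graphs were actually evaluated (that was done by eye). The function name, the input lists, and both thresholds are made-up assumptions. It tests only whether activity stays flat and whether the daily increases are nearly uniform.

```python
def looks_like_bot_pool(daily_totals, daily_active, min_growth=10_000, evenness=0.25):
    """Rough check for the two characteristics described above.

    daily_totals and daily_active are equal-length lists of per-day counts.
    The thresholds are illustrative guesses, not tuned values.
    """
    growth = daily_totals[-1] - daily_totals[0]
    if growth < min_growth:
        return False
    # (1) Active users barely move while accounts pile up.
    active_flat = max(daily_active) - min(daily_active) <= 0.001 * growth
    # (2) Daily increases during the growth window are nearly uniform.
    steps = [b - a for a, b in zip(daily_totals, daily_totals[1:])]
    growing = [s for s in steps if s > 0]
    mean_step = sum(growing) / len(growing)
    even = all(abs(s - mean_step) <= evenness * mean_step for s in growing)
    return active_flat and even


# Made-up series: flat activity with near-linear growth, then a hard stop.
bot_like = looks_like_bot_pool(
    [100, 100, 5100, 10050, 15100, 20000, 25100, 25100],
    [3, 3, 3, 4, 3, 3, 3, 3],
)
# Made-up series: uneven growth with activity rising alongside it.
organic = looks_like_bot_pool(
    [1000, 1200, 3000, 3500, 9000, 9400, 12000],
    [200, 260, 700, 800, 2100, 2200, 2900],
)
print(bot_like, organic)
```

Comparing the growth dates across several instances, which is the strongest signal mentioned above, is left out of this sketch.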

[-] [email protected] 19 points 2 years ago

Thank you for your service o7

[-] [email protected] 17 points 2 years ago

This is brilliant, but I think you might be overlooking something here. I’m part of the reddit migration and like a lot of new users I didn’t know what I was doing or really understand instances when I joined. So I ended up signing up on a few different instances before I understood. 3 of my accounts are inactive, but I don’t want to delete them necessarily - having alt accounts makes sense.

Lemmy.world is my main account, but it was completely overwhelmed for a couple of days at the start of the migration and was pretty much unusable. Some instances have already defederated from other instances, or are debating doing so in future. And then there was the hack that rendered a bunch of instances unusable. Not to mention I might want a separate porn account, professional account etc… I wouldn’t be surprised if there were double the number of genuine (but inactive) accounts created 3 weeks ago as there were new sign ups.

From your numbers the bot accounts still far outnumber the genuine accounts, even if every new user made 4 like me. But I’d be concerned about genuine inactive accounts being chucked out with the bots. Although maybe that’s better than the alternative?

[-] bdonvr 8 points 2 years ago

The way I did it was just deleting any account that signed up, but didn't complete email verification within one hour.

The bots weren't completing email verification.

Though that was only during 0.18, when captcha didn't work. I don't have email verification on now and didn't before 0.18, so I don't know how they identify them now.
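As a sketch of the rule described above (delete accounts that never completed email verification within one hour of signing up), something like the following would select the candidates. The `Account` record and its field names are hypothetical and are not Lemmy's actual database schema. A real purge would have to query the instance's own database.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Account:
    # Hypothetical record, not Lemmy's actual schema.
    name: str
    created_at: datetime
    email_verified_at: Optional[datetime]


def accounts_to_purge(accounts, now, grace=timedelta(hours=1)):
    """Select accounts still unverified after the grace period has elapsed."""
    purge = []
    for acct in accounts:
        unverified = acct.email_verified_at is None
        expired = now > acct.created_at + grace
        if unverified and expired:
            purge.append(acct.name)
    return purge


# Made-up accounts for illustration only.
now = datetime(2023, 7, 1, 12, 0)
accounts = [
    Account("alice", now - timedelta(hours=3), now - timedelta(hours=2, minutes=50)),
    Account("bot123", now - timedelta(hours=3), None),
    Account("newbie", now - timedelta(minutes=20), None),
]
print(accounts_to_purge(accounts, now))
```

Accounts younger than the grace period are left alone, so someone who signed up a few minutes ago still has time to verify.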

[-] [email protected] 2 points 2 years ago

There's a bug currently going around that allows people to gain your account info, there is no way I'm giving lemmy my email address.

[-] bdonvr 1 points 2 years ago

Well with bots running rampant and captchas gone, I didn't have much other choice.

I told people to feel free to use a burner/temp email.

Doesn't matter anyway, captchas are back and email verification isn't required.

[-] [email protected] 1 points 2 years ago

Oh yeah, I wasn't dissing the method. Just explaining to people that it's currently not a great idea to link your email.

[-] [email protected] 1 points 2 years ago

That sounds like a good compromise. Is there a way to message all inactive accounts and ask them to complete a captcha before a certain date, or the accounts will be deleted?

[-] [email protected] 7 points 2 years ago

That's me. I've been creating multiple accounts and hopping around. It's only been a week so nothing permanently planted quite yet.

[-] [email protected] 8 points 2 years ago

I’d like to wait a month or so and let everything settle before I decide which instance I make my main account. There’s so much going on at the moment that the admins, hosts, mods and developers/programmers deservedly need a minute to get on top of everything. I’d like to retain my inactive accounts until then, especially as we’re talking about what these inactive accounts might do in the future, not an army of malicious bots attacking now.

[-] [email protected] 2 points 2 years ago

Exactly. I know in "internet time" it's been forever. Reddit kicked me out. Lemmy (and similar sites) exploded with users. Twitter closed out strangers. Threads launched. Twitter opened the door a little. Lemmy instances experienced an XSS and were vandalized. Some are still down or in various states.

It's been a crazy week. One fucking week, and it was a vacation week (for many).

And I know this entire investigation is into accounts before all that. I just hope the folks (admins) take a minute to really let this all play out a little bit and sink in.

[-] [email protected] 5 points 2 years ago

I did this too. I initially signed up at beehaw but they then defederated from many of the larger instances. So it's already become a very specific vibe and focus when I log in there. I have a local instance I joined that I use to keep up with sports and local news. And another for general scrolling and discovering new communities. I tried to create a specific use for each one rather than having several abandoned accounts I made before I understood how it worked.

[-] [email protected] 2 points 2 years ago

Same here. I signed up with the same username on 9 different instances. Had no idea how it all worked. 8 of the 9 accounts are mostly inactive now as I stick to just this one

[-] [email protected] 16 points 2 years ago

Why not simply ask if they are robots, they wouldn't lie, their programming won't allow it.

[-] [email protected] 4 points 2 years ago

as an AI language model, I am not able to respond to this.

[-] [email protected] 1 points 2 years ago

A shibboleth for sentience.

[-] [email protected] 16 points 2 years ago

A bunch of lurkers are probably sweating after reading this lol

[-] [email protected] 5 points 2 years ago

Quick! Make a comment to show activity!

[-] [email protected] 4 points 2 years ago

Better get a reply in just to make sure then

[-] [email protected] 15 points 2 years ago

Great job man. I alerted voltage.vn about their bot problem back when this was all going down and their admin was able to remove over 10,000 accounts pretty easily.

I've also been following the bot activity with some interest, and to be honest I'm not sure whether it's actually coming from malicious actors or people wanting to help Lemmy by creating news/attention.

However, regardless of the motives, I still feel uneasy about sharing the Threadiverse with a bunch of bot accounts. It just seems like a problem waiting to happen. Thanks for putting in the legwork to fix the biggest offenders. By targeting the servers on your list we can clean up most of the problem with comparatively little effort. Nice work.

[-] [email protected] 7 points 2 years ago

I'm not deep into the tech behind lemmy, so forgive my simple question. What counts as an active user? Someone who posts, comments or votes? What about lurkers?

[-] [email protected] 4 points 2 years ago

An active user is anyone who posts or comments within a specified timeframe. I used 6 months for my spreadsheet.

It's at the bottom of this page in the Lemmy documentation:

https://join-lemmy.org/docs/contributors/07-ranking-algo.html

[-] [email protected] 7 points 2 years ago

Good idea, although to add to the above some of us are just here to read and don't post often. It'd be a shame to be deleted as a bot

[-] [email protected] 6 points 2 years ago

Good job 👍

[-] [email protected] 6 points 2 years ago

You're a real one! 🙏

[-] [email protected] 6 points 2 years ago

Amazing job. This whole post deserves way more upvotes.

[-] [email protected] 6 points 2 years ago

Does this take into consideration accounts that are marked as bots? .. I mean I have my friend: @[email protected] .. he's a nice guy.

[-] [email protected] 5 points 2 years ago

I salute you for your service 🫡

[-] [email protected] 5 points 2 years ago

@kersploosh I’m a real boy!

[-] [email protected] 4 points 2 years ago

When I was on Reddit I felt that I was at the whim of the company or the admins/mods. On Lemmy I feel like I have the power to make a positive change like you just did. Thanks

[-] [email protected] 4 points 2 years ago

Great post. Half of the time, when I come across a bot, it's not for malicious purposes. It's usually because they want to help grow an instance. The wheels begin to fall off when they start to populate (sometimes spam) communities with unwanted posts, thinking they're helping.

The ad bots are the worst. I've started to see an increase in those. I guess it means Lemmy and the Fediverse are becoming more popular.

[-] [email protected] 3 points 2 years ago

Great work!!

[-] [email protected] 3 points 2 years ago

As for the reason to create bot accounts, maybe some people tried to increase the number of lemmy users to gain momentum and trigger a (more) massive migration from reddit?

[-] [email protected] 2 points 2 years ago

Thank you very much for such excellent work! I like Lemmy a lot, and it would be a shame to see that many bots ruining instances.

[-] [email protected] 1 points 2 years ago

Would it be possible to do a certificate-based authentication scheme?

The idea is that Lemmy instances could collaboratively maintain API keys that grant access to posting.

this post was submitted on 11 Jul 2023

385 points (98.5% liked)
