submitted 1 week ago by [email protected] to c/[email protected]
[-] [email protected] 2 points 4 days ago

I just saw a Python scraper in my server logs earlier today. Strangest thing to see.

[-] [email protected] 45 points 1 week ago

As long as the scrapers follow robots.txt
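
For anyone wondering what "following robots.txt" looks like in practice, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content, the `example-scraper` user agent, and the example.com URLs are all made up for illustration, and the rules are inlined so it runs without touching the network.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, inlined so the sketch needs no network access.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

# Hypothetical user agent string for the scraper.
USER_AGENT = "example-scraper"


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str) -> bool:
    """Return True if USER_AGENT may fetch the given URL under the parsed rules."""
    return parser.can_fetch(USER_AGENT, url)


parser = build_parser(ROBOTS_TXT)
public_ok = is_allowed(parser, "https://example.com/public/page")
private_ok = is_allowed(parser, "https://example.com/private/data")
delay = parser.crawl_delay(USER_AGENT)  # seconds to wait between requests, if declared
```

A polite scraper would skip any URL where `is_allowed` returns False and sleep for `delay` seconds between requests. Against a real site you would call `set_url(...)` and `read()` on the parser instead of `parse(...)`.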

[-] [email protected] 36 points 1 week ago

It's equivalent to "the code."

[-] [email protected] 2 points 14 hours ago

It really should be "parley.txt".

[-] [email protected] 26 points 1 week ago

beautiful soup

[-] [email protected] 15 points 1 week ago

I feel like there should be a third box with Wall Street raider types, for scrapers that use Selenium browser automation.

I don’t think it’s entirely unblockable (AdSense seems to know to serve only unmonetized PSA ads), but I think it’s very difficult to discriminate between “this is a real browser controlled by an end user” and “this is a real browser being controlled by automated test software”.

[-] [email protected] 5 points 1 week ago

Fourth panel as well, with those bots collecting data for AI training that don't respect your robots.txt, change user agents, and overload your servers.

[-] [email protected] 1 points 11 hours ago

War boys from Fury Road?

[-] [email protected] 1 points 1 week ago

Love me some Scrapy spiders

this post was submitted on 30 May 2025
414 points (99.1% liked)
