this post was submitted on 11 May 2025
15 points (100.0% liked)

TechTakes


Big brain tech dude got yet another clueless take over at HackerNews etc? Here's the place to vent. Orange site, VC foolishness, all welcome.

This is not debate club. Unless it’s amusing debate.

For actually-good tech, you want our NotAwfulTech community


Need to let loose a primal scream without collecting footnotes first? Have a sneer percolating in your system but not enough time/energy to make a whole post about it? Go forth and be mid: Welcome to the Stubsack, your first port of call for learning fresh Awful you’ll near-instantly regret.

Any awful.systems sub may be subsneered in this subthread, techtakes or no.

If your sneer seems higher quality than you thought, feel free to cut’n’paste it into its own post — there’s no quota for posting and the bar really isn’t that high.

The post-Xitter web has spawned so many “esoteric” right-wing freaks, but there’s no appropriate sneer-space for them. I’m talking redscare-ish, reality-challenged “culture critics” who write about everything but understand nothing. I’m talking about reply-guys who make the same 6 tweets about the same 3 subjects. They’re inescapable at this point, yet I don’t see them mocked (as much as they should be).

Like, there was one dude a while back who insisted that women couldn’t be surgeons because they didn’t believe in the moon or in stars? I think each and every one of these guys is uniquely fucked up and if I can’t escape them, I would love to sneer at them.

(Credit and/or blame to David Gerard for starting this.)

all 21 comments
[–] [email protected] 12 points 7 hours ago (3 children)

A German lawyer is upset because open-source projects don't like it when he pastes chatbot summaries into bug reports. If this were the USA, he would be a debit to any bar which admits him, because the USA's judges have started to disapprove of using chatbots for paralegal work.

[–] [email protected] 8 points 2 hours ago* (last edited 2 hours ago) (1 children)

I just read the GitHub issue comment thread he links; what an entitled chode.

Love that the laughing-face reactions to his AI-slop-laden replies stung so much he ended up posting through it on his blog.

[–] [email protected] 2 points 17 minutes ago* (last edited 29 seconds ago)

Oh god, that thread. And what is it with 'law professionals' like this? I also recall a client in a project who had a law background and who was quite a pain to work with. (Also amazing that he doesn't get that getting any reaction at all, where somebody tries out your very specific problem, is already quite something. 25k open issues, ffs.)

E: I also saw drama like this unfold a few times in the C:DDA development stuff (a long time ago), which was probably driven by young kids/adults and not lawyers. My kneejerk reaction is to get rid of people like this from the project: they will just produce more and more drama, and will eventually burn valuable developers out.

[–] [email protected] 4 points 3 hours ago* (last edited 3 hours ago)

the message I get is to preemptively ban user ms178 from any project I'm on

[–] [email protected] 15 points 10 hours ago (3 children)

I know it’s very, very, very petty, but this article about how basic Sam Altman’s kitchen skills are did make me laugh: https://www.ft.com/content/b1804820-c74b-4d37-b112-1df882629541

[–] [email protected] 12 points 9 hours ago

The coda is a top-tier sneer:

Maybe it’s useful to know that Altman uses a knife that’s showy but incohesive and wrong for the job; he wastes huge amounts of money on olive oil that he uses recklessly; and he has an automated coffee machine that claims to save labour while doing the exact opposite because it can’t be trusted. His kitchen is a catalogue of inefficiency, incomprehension, and waste. If that’s any indication of how he runs the company, insolvency cannot be considered too unrealistic a threat.

[–] [email protected] 8 points 8 hours ago

It starts out seeming like a funny but petty and irrelevant criticism of his kitchen skills and product choices, but then beautifully transitions into an accurate criticism of OpenAI.

[–] [email protected] 7 points 9 hours ago

It's definitely petty, but making Altman's all-consuming incompetence known to the world is something I strongly approve of.

Definitely goes a long way to show why he's an AI bro.

[–] [email protected] 7 points 9 hours ago (1 children)
[–] [email protected] 7 points 8 hours ago (1 children)

Despite the snake-oil flavor of Vending-Bench, GeminiPlaysPokemon, and ClaudePlaysPokemon, I've found them to be a decent antidote to agentic LLM hype. The insane transcripts of Vending-Bench and the inability of an LLM to play Pokemon at the level of a 9-year-old are hard to argue with, and the snake-oil flavoring makes them easier to swallow.

[–] [email protected] 7 points 8 hours ago (1 children)

I now wonder how that compares to earlier non-LLM AI attempts to create bots that can play games in general. I used to hear bits of that kind of research every now and then, but LLM/genAI has sucked the air out of the room.

[–] [email protected] 6 points 7 hours ago* (last edited 6 hours ago) (1 children)

In terms of writing bots to play Pokemon specifically (which, given the prompting and custom tools written, I think is the fairest comparison)... not very well. According to this reddit comment, a bot from 11 years ago could beat the game in 2 hours and was written in about 7.5k lines of Lua, while an open-source LLM scaffold for playing Pokemon, roughly similar to Claude's or Gemini's, is 4.8k lines (and still missing many of the tools Gemini had by the end; and Gemini took weeks of constant play instead of 2 hours).

So basically it takes about the same number of lines of code to do a much, much worse job. Pokebot probably required relatively more skill to implement... but OTOH, Gemini's scaffold took thousands of dollars in API calls to develop by trial and error and to run. So you can write a bot from scratch that substantially outperforms the LLM agent for moderately more programming effort and substantially less overall cost.

In terms of gameplay with reinforcement learning... still not very well. I've watched this video before on using RL directly on the pixel output (with just a touch of memory hacking to set the rewards); it uses substantially less compute than the LLMs playing Pokemon, and the resulting trained NN benefits from all previous training. The developer hadn't gotten it to play through the whole game... a few more tweaks to the reward function might manage a lot more progress? OTOH, LLMs playing Pokemon benefit from being able to use NPC dialog more directly (even if their CoT "reasoning" often goes off on erroneous tangents or completely batshit leaps of logic), while the RL approach is almost outright blind... one big problem the RL approach might run into is backtracking in the later stages, since it uses an exploration reward to drive the model forward. OTOH, the LLMs also had a lot of problems with backtracking.
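For anyone curious what "exploration as the reward" looks like in practice, here's a minimal sketch (the class, names, and tile granularity are my own invention, not from the video): the agent gets a small bonus the first time it visits each map cell, which drives it forward early on but gives it no reason to backtrack later.

```python
# Minimal sketch of a novelty/exploration reward for an RL agent playing
# a tile-based game. The agent earns a bonus only the first time it
# visits a (map, x, y) cell, so the incentive dries up once an area is
# explored -- which is exactly why backtracking through already-visited
# areas becomes a problem in the late game.

class ExplorationReward:
    def __init__(self, bonus: float = 1.0):
        self.bonus = bonus
        self.visited: set[tuple[int, int, int]] = set()

    def step(self, map_id: int, x: int, y: int) -> float:
        cell = (map_id, x, y)
        if cell in self.visited:
            return 0.0       # nothing new: no reward for backtracking
        self.visited.add(cell)
        return self.bonus    # first visit: pay the exploration bonus


rewarder = ExplorationReward()
print(rewarder.step(0, 1, 1))  # new cell -> 1.0
print(rewarder.step(0, 1, 1))  # revisit  -> 0.0
print(rewarder.step(0, 1, 2))  # new cell -> 1.0
```

(In the actual project the cell coordinates would come from the "touch of memory hacking" mentioned above; everything else here is illustrative.)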

My (wildly optimistic by sneerclubbing standards) expectation for "LLM agents" is that people figure out how to use them as a "creative" component in more conventional bots and AI approaches, where a more conventional bot prompts the LLM for "plans" that it uses when it gets stuck. AlphaGeometry2 is a good demonstration of this: it solved 42/50 problems with a hybrid neurosymbolic-and-LLM approach, but notably it could solve 16 problems with just the symbolic portion, without the LLM. So the LLM is contributing something, but the actual rigorous verification is handled by the symbolic AI.

(edit: Looking at more discussion of AlphaGeometry, the addition of an LLM is even less impressive than that; it's doing something you could do without an LLM at all. On a set of 30 problems discussed, the full AlphaGeometry can do 25/30; without the LLM at all, 14/30; but using alternative methods in place of the LLM, it can do 18/30 or even 21/30 (depending on the exact method). So... the LLM is doing something, which is more than my most cynical sneering would suspect, but not much, and not necessarily much better than alternative non-LLM methods.)
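The "symbolic engine does the rigorous work, LLM only proposes things when stuck" division of labor described above can be caricatured in a few lines. To be clear, everything here is hypothetical structure I made up, not the actual AlphaGeometry code or API:

```python
# Hedged sketch of the hybrid loop described above: a rigorous symbolic
# solver owns verification, and a "creative" component (an LLM in
# AlphaGeometry, but any suggestion source works) only proposes new
# auxiliary constructions when the solver gets stuck. All names invented.

def solve(problem, symbolic_solver, suggest, max_hints=5):
    for _ in range(max_hints + 1):
        proof = symbolic_solver(problem)
        if proof is not None:
            return proof            # the symbolic engine owns correctness
        hint = suggest(problem)     # creative step: propose a construction
        if hint is None:
            return None             # nothing left to try
        problem = problem + [hint]  # add the construction and retry
    return None


# Toy demo: the "solver" only succeeds once a needed fact is present.
solver = lambda p: "proof" if "aux-point" in p else None
suggest = lambda p: "aux-point"
print(solve(["givens"], solver, suggest))  # -> proof
```

The point of the shape, and why the sneering still mostly lands, is that the LLM never gets to declare victory: whatever it suggests only matters if the symbolic solver can subsequently close the proof.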

[–] [email protected] 3 points 4 hours ago

Cool thanks for doing the effort post.

My (wildly optimistic by sneerclubbing standards) expectations for “LLM agents” is that people figure out how to use them as a “creative” component in more conventional bots and AI approaches

This is a bit how I feel it's already being used in security fields, with less focus on the conventional bots/AI, where they still use the LLMs for some things. But it's hard to separate fact from PR, and some of the things they say they do don't seem like a great fit for LLMs, especially considering what I've heard from people who aren't on the hype train. (The example that comes to mind is using LLMs to standardize some sort of reporting/test writing, while somebody I trust has seen people try that and fail because the LLM couldn't keep to a consistent standard.)

[–] [email protected] 7 points 13 hours ago (3 children)

Beff back at it again threatening his doxxer. Nitter link

[–] [email protected] 4 points 8 hours ago

what a strange way to sell your grift: no one knows what it's for. "Bad people want to force me to tell you what it is we're building."

[–] [email protected] 7 points 10 hours ago

That whole 'haters just like incomplete information' thing makes no sense, btw. Amazing to get that out of people basically going 'we hate Beff for x, y, z' (which I assume is what happens; I don't keep up with this Beff stuff, as I don't like these 'e/acc' people).

[–] [email protected] 8 points 12 hours ago (1 children)

Unrelated to this: man, there should be a parody account called “Based Beff Jeck” which is just a guy trying to promote Beck’s vast catalogue as the future of music. Also minus any mention of Johnny Depp.

[–] [email protected] 5 points 9 hours ago

Also minus any mention of johnny depp.

Depp v. Heard was my generation's equivalent to the OJ Simpson trial, so chances are he'll end up conspicuous in his absence.

[–] [email protected] 6 points 17 hours ago* (last edited 16 hours ago) (1 children)

That Keeper AI dating app has an admitted pedo running its Twitter PR (Hunter Ash; old username was Psikey, the receipts are under that).

[–] [email protected] 5 points 11 hours ago

🎶 Certified Keeper guy? Certified pedophile

(the joke presented itself, I had to :P)