[-] [email protected] 10 points 1 week ago* (last edited 1 week ago)

Old McDonald had a startup, iyo io o[4-mini].

It's funny how, just today, in a completely unrelated context, a generative AI enthusiast cited OpenAI getting sued by the NYT as a reason they wouldn't commit some other malfeasance: they'd get caught if they did.

[-] [email protected] 11 points 1 week ago

Having worked in computer graphics myself, it is spot on that this shit is uncontrollable.

I think the reason is fundamental: if you could control it more precisely, you would push the output too far from any of the training samples.

That being said, video enhancement along the lines of applying this as a filter to 3D-rendered CGI or to another video could, to some extent, work. I think the perception of realism will fade as it gets more familiar; it is pretty bad at lighting, but in a new way.

[-] [email protected] 10 points 1 week ago

Oh, and also, for the benefit of our AI fanboys who can't understand why we would expect this upcoming super-intelligence to do something as mundane as math, here's why:

[-] [email protected] 11 points 1 week ago

Thing is, it has tool integration. Half of the time it uses Python to calculate it. If it uses a tool, that means it writes a string that isn't shown to the user; that string invokes the tool, and the tool's results are appended to the stream.

What is curious is that instead of a request for precision (or just any request to do math) causing it to use the tool, and the presence of tool tokens then causing it to claim a tool was used, the requests for precision cause it to claim that a tool was used, directly.

Also, all of this is highly unnatural text, so it is coming either from fine-tuning or from training data contamination.
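The tool-call flow described above can be sketched roughly as follows. All names here are hypothetical; real tool-calling APIs differ in detail, but the shape is the same: the model emits a hidden tool-call message, the runtime executes it, and the result is appended to the conversation stream the model sees but the user does not.

```python
def run_python(code: str) -> str:
    # Stand-in for a sandboxed Python executor; here we just
    # evaluate a pure expression with builtins stripped out.
    return str(eval(code, {"__builtins__": {}}))

def chat_with_tools(model_generate, user_msg: str) -> str:
    """Minimal tool loop: hidden tool calls are executed and their
    results appended to the stream until the model produces a
    visible answer."""
    stream = [{"role": "user", "content": user_msg}]
    while True:
        reply = model_generate(stream)           # model's next message
        if reply.get("tool_call"):               # hidden from the user
            call = reply["tool_call"]            # e.g. {"name": "python", ...}
            result = run_python(call["args"]["code"])
            stream.append(reply)                 # the hidden tool-call string
            stream.append({"role": "tool", "content": result})
            continue                             # model now sees the result
        return reply["content"]                  # the visible answer

# Fake model for illustration: requests the tool once, then answers
# by echoing the tool result back to the user.
def fake_model(stream):
    last = stream[-1]
    if last["role"] == "user":
        return {"role": "assistant",
                "tool_call": {"name": "python", "args": {"code": "2**32"}}}
    return {"role": "assistant", "content": f"2^32 = {last['content']}"}
```

The point about the claim-without-call failure mode is visible here: nothing in this loop forces the model's visible answer to mention the tool only when `role: "tool"` messages are actually present in the stream.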

[-] [email protected] 10 points 2 weeks ago* (last edited 2 weeks ago)

I am also presuming this is about purely non-fiction technical books

He has Dune on his list of worlds to live in, though...

edit: I know. He fed his post to AI and asked it to list the fictional universes he'd want to live in, and that's how he got Dune. Precisely the information he needed.

[-] [email protected] 10 points 3 weeks ago* (last edited 3 weeks ago)

Yeah, any time it's regurgitating an IMO problem, that's proof it's almost superhuman; but any time it actually faces a puzzle with an unknown answer, "that's not what it's for."

[-] [email protected] 11 points 1 month ago* (last edited 1 month ago)

I think it could work as a minor gimmick, like the terminal-hacking minigame in Fallout. You have to convince the LLM to tell you the password, or you get to talk to a demented robot whose brain was fried by radiation exposure, or the like. Relatively inconsequential stuff, like being able to talk your way through or just shoot your way through.

Unfortunately, this shit is too slow and too huge to embed a local copy into a game, and you would need broad hardware compatibility. And running it in the cloud would cost too much.

[-] [email protected] 11 points 1 month ago

> is somewhere between 0 and 100%.

That really pins it down, doesn't it?

[-] [email protected] 10 points 1 month ago

Yeah. I'd love to see the prompt; Gab's Nazi AI prompt was utterly pathetic, and this one has got to be pretty bad as well.

[-] [email protected] 11 points 2 months ago* (last edited 2 months ago)

Yeah I think the best examples are everyday problems that people solve all the time but don't explicitly write out solutions step by step for, or not in the puzzle-answer form.

It's not even a novel problem at all; I'm sure there are plenty of descriptions of solutions to it as part of stories and such. Just not framed as "logic puzzles," due to their triviality.

What really annoys me is when they claim high performance on benchmarks consisting of fairly difficult problems. This is basically fraud, since they know full well it is still entirely "knowledge" reliant, and even take steps to augment it with generated problems and solutions.

I guess the big sell is that it could use bits and pieces of logic gleaned from other solutions to solve a "new" problem. Except it can not.

[-] [email protected] 9 points 3 months ago* (last edited 3 months ago)

Yeah, exactly. There's no trick to it at all, unlike the original puzzle.

I also tested OpenAI's offerings a few months back with similarly nonsensical results: https://awful.systems/post/1769506

The all-vegetables, no-duck variant is solved correctly now, but I doubt it is due to improved reasoning as such; I think they may have augmented the training data with variants of the river crossing. The river crossing is one of the best-known puzzles, and various people have been posting hilarious bot failures with variants of it. So it wouldn't be unexpected that their training data augmentation includes river crossing variants.

Of course, there's very many ways in which the puzzle can be modified, and their augmentation would only cover obvious stuff like variation on what items can be left with what items or spots on the boat.
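Templating out such variants is mechanically trivial, which is part of why benchmark gains on them prove so little. A minimal sketch of the kind of augmentation described above (all names and templates hypothetical), varying what eats what and how many spots the boat has:

```python
import random

# Aligned triples: the first animal eats the second, the second eats the third.
TRIPLES = [
    ("wolf", "goat", "cabbage"),
    ("fox", "chicken", "grain"),
    ("cat", "mouse", "cheese"),
]

def make_variant(seed: int) -> str:
    """Generate one deterministic river-crossing puzzle variant."""
    rng = random.Random(seed)
    a, b, c = rng.choice(TRIPLES)
    boat_cap = rng.choice([1, 2])  # vary spots on the boat
    return (
        f"A farmer must ferry a {a}, a {b}, and a {c} across a river. "
        f"The boat holds the farmer and {boat_cap} item(s). Unattended, "
        f"the {a} eats the {b}, and the {b} eats the {c}. "
        f"How can everyone cross safely?"
    )
```

Note that some generated variants (e.g. a two-seat boat) make the puzzle trivial, so answers would have to be regenerated per variant; and the template space is still tiny compared to all the ways a human can twist the puzzle, which is the point made above.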

[-] [email protected] 11 points 1 year ago

YOU CAN DO THAT WITHOUT AI.

Can they, though? Sure, in theory Google could hire millions of people to write overviews that are equally idiotic, but obviously that is not something they would actually do.

I think there's an underlying ethical theory at play here, which goes something like: it is fine to fill internet with half-plagiarized nonsense, as long as nobody dies, or at least, as long as Google can't be culpable.


diz

joined 2 years ago