[-] Architeuthis@awful.systems 24 points 9 months ago

So if you knew the average lawyer made 3.6 mistakes per case and the AI only made 1.2, it’s still a net gain.

thats-not-how-any-of-this-works.webm

[-] Architeuthis@awful.systems 22 points 10 months ago

It should be noted that the only person in the article to lose his life did so because the police, who were explicitly told to be ready to use non-lethal means to subdue him because he was in the middle of a mental episode, gunned him down on sight when he came at them with a kitchen knife.

But here's the thrice cursed part:

“You want to know the ironic thing? I wrote my son’s obituary using ChatGPT,” Mr. Taylor said. “I had talked to it for a while about what had happened, trying to find more details about exactly what he was going through. And it was beautiful and touching. It was like it read my heart and it scared the shit out of me.”

[-] Architeuthis@awful.systems 24 points 11 months ago* (last edited 11 months ago)

To get a bit meta for a minute, you don't really need to.

The first time an LLM makes a substantial, no-caveats contribution to a serious issue in an important FOSS project, the PR people of the company that trained it are going to make absolutely sure everyone and their fairy godmother knows about it.

Until then it's probably OK to treat claims that chatbots can handle a significant bulk of non-boilerplate coding tasks in enterprise projects by themselves the same as claims of haunted houses: you don't really need to debunk every separate witness testimony; it's self-evident that a world where there is an afterlife that also freely intertwines with daily reality would be notably and extensively different from the one we are currently living in.

[-] Architeuthis@awful.systems 23 points 1 year ago

thinkers like computer scientist Eliezer Yudkowsky

That's gotta sting a bit.

[-] Architeuthis@awful.systems 22 points 1 year ago

In case anybody skips the article, it's a six-year-old cybernetically force-grown into the body of a horny 13- to 14-year-old.

The rare sentence that makes me want to take a shower for having written it.

[-] Architeuthis@awful.systems 23 points 1 year ago

Maybe Momoa's PR agency forgot to send an appropriate tribute to Alphabet this month.

[-] Architeuthis@awful.systems 21 points 2 years ago* (last edited 2 years ago)

On each step, one part of the model applies reinforcement learning, with the other one (the model outputting stuff) “rewarded” or “punished” based on the perceived correctness of their progress (the steps in its “reasoning”), and altering its strategies when punished. This is different to how other Large Language Models work in the sense that the model is generating outputs then looking back at them, then ignoring or approving “good” steps to get to an answer, rather than just generating one and saying “here ya go.”

Every time I've read how chain-of-thought works in o1 it's been completely different, and I'm still not sure I understand what's supposed to be going on. Apparently you get a strike notice if you try too hard to find out how the chain-of-thinking process goes, so one might be tempted to assume it's something that's readily replicable by the competition (and they need to prevent that as long as they can) instead of any sort of notably important breakthrough.

From the detailed o1 system card pdf linked in the article:

According to these evaluations, o1-preview hallucinates less frequently than GPT-4o, and o1-mini hallucinates less frequently than GPT-4o-mini. However, we have received anecdotal feedback that o1-preview and o1-mini tend to hallucinate more than GPT-4o and GPT-4o-mini. More work is needed to understand hallucinations holistically, particularly in domains not covered by our evaluations (e.g., chemistry). Additionally, red teamers have noted that o1-preview is more convincing in certain domains than GPT-4o given that it generates more detailed answers. This potentially increases the risk of people trusting and relying more on hallucinated generation.

Ballsy to just admit your hallucination benchmarks might be worthless.

The newsletter also mentions that the price of output tokens has quadrupled compared to the previous newest model, but the awesome part is, remember all that behind-the-scenes self-prompting that goes on while it arrives at an answer? Even though you're not allowed to see those tokens, according to Ed Zitron you sure as hell are paying for them (i.e., they're billed as output tokens), which is hilarious if true.
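To make the "paying for tokens you can't see" point concrete, here's a back-of-the-envelope sketch. The price and token counts are made-up illustrative assumptions, not OpenAI's actual figures:

```python
# Back-of-the-envelope: if hidden reasoning tokens are billed at the
# output-token rate, the visible answer understates what you pay.
# ASSUMPTION: illustrative price, not an official figure.
PRICE_PER_1M_OUTPUT = 60.00  # assumed $ per 1M output tokens


def completion_cost(visible_tokens: int, hidden_reasoning_tokens: int) -> float:
    """Cost when visible and hidden tokens both bill at the output rate."""
    billed = visible_tokens + hidden_reasoning_tokens
    return billed * PRICE_PER_1M_OUTPUT / 1_000_000


# A 500-token visible answer preceded by 4,000 hidden reasoning tokens
# means you are billed for 4,500 tokens, nine times the answer you see.
print(completion_cost(500, 4_000))
```

Under those assumed numbers, almost 90% of the bill is for text the user never gets to read.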

[-] Architeuthis@awful.systems 21 points 2 years ago

If I remember correctly, SBF taking the stand was completely against his lawyers' recommendations, and in general he seems to have a really hard time doing what people who know better tell him to: don't DM journalists about your crimes, definitely don't start a Substack detailing why you felt justified in committing them, and trying to "explain yourself" to prosecution witnesses is witness tampering and will get your bail revoked.

[-] Architeuthis@awful.systems 23 points 2 years ago

conflict averse and probably low testosterone German Catholics [...] overcivilized and effete Teutons

Kind of off topic, but this piece of wall-to-wall insanity reminded me how Steven Pinker tried to explain away southern US crime rates that didn't fit his Violence Is Declining And In Fact Everything's Improving Inexorably (As Long As You Don't Rock The Boat) thesis by randomly blaming Irish-Catholic sheepherder genealogy.

[-] Architeuthis@awful.systems 24 points 2 years ago

Had to google shit-test, apparently it's a PUA term, what a surprise.

[-] Architeuthis@awful.systems 21 points 2 years ago* (last edited 2 years ago)

On one hand it's encouraging that the comments are mostly pushing back.

On the other hand a lot of them do so on the basis of a disagreement over the moral calculus of how many chickens a first trimester fetus should be worth, and whether that makes pushing for abortion bans inefficient compared to efforts to reduce the killing of farm animals for food.

Which, while pants-on-head bizarre in any other context, seems fairly normal by EA standards.
