
because it encodes semantics.
if it really did so, performance wouldn't swing up or down when you change syntactic or symbolic elements of problems. the only information encoded is language-statistical
I didn't read the post at all
rather refreshing to have someone come out and just say it. thank you for the chuckle
"rat furry" :3
"(it's short for rationalist)" >:(
a thought on this specifically:
Google Cloud Chief Evangelist Richard Seroter said he believes the desire to use tools like Gemini for Google Workspace is pushing organizations to do the type of data management work they might have been sluggish about in the past.
“If you don’t have your data house in order, AI is going to be less valuable than it would be if it was,” he said.
we're right back to "you're holding it wrong" again, i see
i'm definitely imagining Google re-whipping up their "Big Data" sales pitches in response to Gemini being borked or useless. "oh, see your problem is that you haven't modernized and empowered yourself by dumping all your databases into a (our) cloud native synergistic Data Sea, available for only $1.99/GB"
good longpost, i approve
honestly i wouldn't be surprised if some AI companies weren't cheating at AI metrics with little classically-programmed, find-and-replace programs. if for no other reason than i think the idea of some programmer somewhere being paid to browse twitter on behalf of OpenAI and manually program exceptions for "how many months does it take 9 women to make 1 baby" is hilarious
data scientists can have little an AI doomerism, as a treat
the upside: we can now watch "disruptive startups" go through the aquire funding -> slapdash development -> catastrophic failure -> postmortem cycle at breakneck speeds
48th percentile is basically "average lawyer".
good thing all of law is just answering multiple-choice tests
I don't need a Supreme Court lawyer to argue my parking ticket.
because judges looooove reading AI garbage and will definitely be willing to work with someone who is just repeatedly stuffing legal-sounding keywords into google docs and mashing "generate"
And if you train the LLM with specific case law and use RAG can get much better.
"guys our keyword-stuffing techniques aren't working, we need a system to stuff EVEN MORE KEYWORDS into the keyword reassembler"
In a worst case scenario if my local lawyer can use AI to generate a letter
oh i would love to read those court documents
and just quickly go through it to make sure it didn't hallucinate
wow, negative time saved! okay so your lawyer has to read and parse several paragraphs of statistical word salad, scrap 80+% of it because it's legalese-flavored gobbledygook, and then try to write around and reformat the remaining 20% into something that's syntactically and legally coherent -- you know, the thing their profession is literally on the line for. good idea
what promptfondlers continuously seem to fail to understand is that verification is the hard step. literally anyone on the planet can write a legal letter if they don't care about its quality or the ramifications of sending it to a judge in their criminal defense trial. part of being a lawyer is being able to tell actual legal arguments from bullshit, and when you hire an attorney, that is the skill you are paying for. not how many paragraphs of bullshit they can spit out per minute
they can process more clients, offer faster service and cheaper prices. Maybe not a revolution but still a win.
"but the line is going up!! see?! sure we're constantly losing cases and/or getting them thrown out because we're spamming documents full of nonsense at the court clerk, but we're doing it so quickly!!"
correlation? between the rise in popularity of tools that exclusively generates bullshit en masse and the huge swelling in volume of bullshit on the Internet? it's more likely than you think
it is a little funny to me that they're taking about using AI to detect AI garbage as a mechanism of preventing the sort of model/data collapse that happens when data sets start to become poisoned with AI content. because it seems reasonable to me that if you start feeding your spam-or-real classification data back into the spam-detection model, you'd wind up with exactly the same degredations of classification and your model might start calling every article that has a sentence starting with "Certainly," a machine-generated one. maybe they're careful to only use human-curated sets of real and spam content, maybe not
it's also funny how nakedly straightforward the business proposition for SEO spamming is, compared to literally any other use case for "AI". you pay $X to use this tool, you generate Y articles which reach the top of Google results, you generate $(X+P) in click revenue and you do it again. meanwhile "real" business are trying to gauge exactly what single digit percent of bullshit they can afford to get away with putting in their support systems or codebases while trying to avoid situations like being forced to give refunds to customers under a policy your chatbot hallucinated (archive.org link) or having to issue an apology for generating racially diverse Nazis (archive).
Actually, that email exchange isn’t as combative as I expected.
i suppose the CEO completely barreling forward past multiple attempts to refuse conversation while NOT screaming slurs at the person they're attempting to lecture, is, in some sense, strictly better than the alternative
ebu
0 post score0 comment score
i think you're missing the point that "Deepseek was made for only $6M" has been the trending headline for the past while, with the specific point of comparison being the massive costs of developing ChatGPT, Copilot, Gemini, et al.
to stretch your metaphor, it's like someone rolling up with their car, claiming it only costs $20 (unlike all the other cars that cost $20,000), when come to find out that number is just how much it costs to fill the gas tank up once