AI agents wrong ~70% of time: Carnegie Mellon study
(www.theregister.com)
I'd just like to point out that, from the perspective of somebody watching AI develop for the past 10 years, completing 30% of automated tasks successfully is pretty good! Ten years ago they could not do this at all. Overlooking all the other issues with AI, I think we are all irritated with the AI hype people for saying things like they can be right 100% of the time -- Amazon's new CEO actually said they would be able to achieve 100% accuracy this year, lmao. But being able to do 30% of tasks successfully is already useful.
It doesn't matter if you need a human to review. AI has no way of distinguishing between success and failure. Either way a human will have to review 100% of those tasks.
A human can review something close to correct a lot better than starting the task from zero.
It is a lot harder to notice incorrect information in review than to make sure it is correct when writing it.
Depends on the context. There is a lot of work in the scientific methods community trying to use NLP to augment traditionally fully human processes, such as thematic analysis and systematic literature reviews, and you can have protocols for validation there without 100% human review.
That depends entirely on your writing method and attention span for review.
Most people make stuff up off the cuff and skim anything longer than 75 words when reviewing, so the bar for AI improving over that is really low.
In university I knew a lot of students who knew all the material but "just didn't know where to start". If I gave them a little direction about where to start, they could run it to the finish all on their own.