...and I still don't get it. I paid for a month of Pro to try it out, and it is consistently and confidently producing subtly broken junk. I had tried this before, but gave up because it didn't work well. I thought that maybe this time it would be far enough along to be useful.

The task was relatively simple, and it involved doing some 3D math. The solutions it generated were almost right every time, but critically broken in subtle ways, and any attempt to fix the problems would either introduce new bugs or reintroduce old ones.

I spent nearly the whole day yesterday going back and forth with it, and felt like I was in a mental fog. It wasn't until I had a full night's sleep and reviewed the chat log this morning that I realized how much I was going in circles. I tried prompting a bit more today, but stopped when it kept doing the same crap.

The worst part of this is that, throughout all of this, Claude was confidently responding. When I said there was a bug, it would "fix" the bug, and provide a confident explanation of what was wrong... Except it was clearly bullshit because it didn't work.

I still want to keep an open mind. Is anyone having success with these tools? Is there a special way to prompt it? Would I get better results during certain hours of the day?

For reference, I used Opus 4.6 Extended.

[-] ozymandias@sh.itjust.works 7 points 18 hours ago

you need to fully be able to program to work with these things, in my experience.
you have to explain what you want very specifically, in precise programming terms.

i tried a preview of chatgpt codex and it’s working better than my free version of claude, but codex creates a whole virtual programming environment, you have to connect it to a github repository, then it spins up an instance with tools you include and actually tests the code and fixes bugs before sending it back to you.
but you still need to be able to find the bugs and fix them yourself.

oh and i think they work best with python, but i’ve also used ruby and dart and it’s decent.
it’s kinda like a power tool, it’ll definitely help you a lot to fix a car but if you can’t do it with wrenches it won’t help very much.

[-] quixote84@midwest.social 0 points 17 hours ago* (last edited 17 hours ago)

I've never been able to program in anything more complex than BASIC and command line batch files, but I'm able to get useful output from Claude.

I'm an IT Infrastructure Manager by trade, and I got there through 20 years of supporting everything from desktop to datacenter including weird use cases like controlling systems in a research lab. On top of that, I've gotten under the hood of software in the form of running game servers in my spare time.

What you need to get good programs out of AI boils down to 3 things:

  1. The ability to show an entity whose mistakes resemble those of a gifted child where it went wrong, a step or ten back from where it's currently looking.
  2. The ability to provide useful beta test / debug output regarding programs which aren't behaving as expected. This does include looking at an error log and having some idea what that error means.
  3. Comfort using (either executing or compiling, depending on the language) source code in the language you're working in. This might be as simple as "How do I run a PowerShell script or verify that I meet the version and module requirements for the script in question?", or it might be as complicated as building an executable in Visual Studio. Either way, whatever the pipeline is from source to execution, it must be a pipeline you're comfortable working with. If you're doing things anywhere outside the IT administration space, it's reasonable to look at Python as the best first path rather than PowerShell. Personally, I must go where supported first-party modules exist for the types of work I'm developing around. In IT administration, that's PowerShell.

I've made tools which automate and improve my entire department's approach to user data, device data, application inventory, patch management, and vulnerability management. I started making these changes with a free product three months ago, and switched to the paid version two months ago.

Programming is sort of like conversation in an alien language. For that reason, if you can give precise instructions, sometimes you really can pull something new into existence using LLM coding. It's the same reason that you could say words which have never been said in that specific order before, and have an LLM translate them to Portuguese.

I always used to talk about how everything in a computer was math, and that what interested me more than quantum computing would be a machine which starts performing the same sorts of operations on words or concepts that computers of that day ('90s and '00s, when "quantum" was being slapped on everything to mean "fast" or "powerful") were doing on math. I said that the best indicator that linguistic computing had arrived would be that, without ever learning to program, I'd start being able to program. I was looking at "Dragon NaturallySpeaking" when I had this idea. It was one of the earliest effective speech-to-text programs. I stopped learning to program immediately and focused exclusively on learning operations from that point forward.

I've been testing the code generation abilities of LLMs for about three years. Within the last six months I feel like I'm starting to see evidence that the associations being made internally by LLMs are complex enough to begin considering them the fulfillment of my childhood dream of a "word computer".

The shitty stuff about the environment and theft of art is all there too, which sucks, but more because our economic model sucks than because LLMs either do or do not suck. If we had a framework for meeting everybody's basic needs, this software in its current state has the potential to turn everyone with a passion for grammatical and technical precision into a concept-based developer practically overnight.

[-] eneff@discuss.tchncs.de 2 points 6 hours ago

I have no qualifications to judge the quality of the generated results, yet the generated results are always of great quality.

Do you seriously not realize how out of touch this sounds?

[-] Feyd@programming.dev 7 points 17 hours ago

I’ve never been able to program in anything more complex than BASIC and command line batch files, but I’m able to get useful output from Claude.

Chatbots being deemed useful in tasks by people unqualified to make those judgments is a running problem.

this post was submitted on 11 Apr 2026
236 points (90.4% liked)