programming

1

6

GitHub Actions Is Slowly Killing Your Engineering Team - Ian Duncan (iankduncan.com)

submitted 6 hours ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

2

4

The RCE that AMD won't fix! (mrbruh.com)

submitted 6 hours ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

3

Tower of Flaws: Dismantling Tower of Fantasy's Anti-Cheat Driver While Waiting for The Game to Install (vespalec.com)

submitted 6 hours ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

4

2

Decompiling Xbox games using PDB debug info (i686.me)

submitted 1 week ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

5

3

How To Hack A Denuvo Game (www.youtube.com)

submitted 2 weeks ago by git@hexbear.net to c/programming@hexbear.net

1 comments fedilink

6

10

I took a shot at implementing the idea from the Recursive Language Model paper (git.sr.ht)

submitted 3 weeks ago* (last edited 3 weeks ago) by yogthos@lemmygrad.ml to c/programming@hexbear.net

1 comments fedilink

Despite context windows expanding to millions of tokens, LLMs still struggle with a fundamental task: precision.

When you ask an LLM to "analyze this report," it often glances at the text and simply hallucinates a plausible-sounding answer based on probability.

A good example of the problem can be seen in asking a model to sum sales figures from a financial report. Left to its own devices, it will not bother reading the whole document and will simply give you a hallucinated answer. This is especially a problem with smaller models you can run locally.

The Recursive Language Model paper comes up with a clever technique that forces the LLM to stop guessing and start coding.

The standard approach to this problem is Retrieval Augmented Generation (RAG), which relies on semantic similarity (embeddings). If you ask for "sales figures," a vector DB retrieves chunks of text that sound like sales figures. But semantic similarity is fuzzy and limited in what it can do.

For example, embeddings can't count, so you can't ask "count the number of times X happens." They can't handle information that's scattered across a bunch of unrelated lines. And they can't distinguish between concepts like "Projected Sales" and "Actual Sales" when they appear in similar contexts.
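To make the contrast concrete, here is a minimal sketch of the kind of exact query that similarity search can't express but a few lines of code can. The sample lines and the `countMatching` helper are made up for illustration and are not from the project.

```typescript
// Hypothetical sample lines; not taken from the demo document.
const lines: string[] = [
  "Projected Sales Q1: $1,200,000",
  "Actual Sales Q1: $1,050,000",
  "Lorem ipsum dolor sit amet",
  "Actual Sales Q2: $1,310,000",
];

// Count the lines that match a pattern exactly. An anchored pattern keeps
// "Actual Sales" and "Projected Sales" apart even though they read alike.
function countMatching(text: string[], pattern: RegExp): number {
  return text.filter((line) => pattern.test(line)).length;
}

const actualCount = countMatching(lines, /^Actual Sales\b/);
const projectedCount = countMatching(lines, /^Projected Sales\b/);
console.log({ actualCount, projectedCount });
```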

It would be nice to have a system that treats text as a dataset to be queried as opposed to a prompt to be completed, and this is where RLMs come in.

Here, the model acts as a programmer: it writes code to explore the document, verifies the execution results, and only then formulates an answer based on them.

The core insight is that code execution provides grounding for the model. When an LLM guesses a number, it might be wrong. When an LLM writes regex.match() and the computer returns ['$2,340,000'], that result is a hard fact.

The process works as follows:

The document is loaded into a secure, isolated Node.js environment as a read-only context variable.
The model is given exploration tools: text_stats(), fuzzy_search(), and slice().
The Loop:

The model writes TypeScript to probe the text.
The Sandbox executes it and returns the real output.
The model reads the output and refines its next step.

The model iterates until it has enough proven data to answer FINAL("...").
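The steps above can be sketched as a small driver loop. This is a rough sketch and not the project's actual code: the `Model` and `Sandbox` types, `runLoop`, the turn limit, and the way `FINAL("...")` is detected are all assumptions, and a real model call would be asynchronous. A scripted stub stands in for the LLM so the example runs on its own.

```typescript
// Result of running one snippet in the sandbox.
type Execution = { output: string };

// Hypothetical interfaces; a real model call would be async and go over HTTP.
type Model = (transcript: string[]) => string;
type Sandbox = (code: string) => Execution;

// Matches a FINAL("...") call and captures the answer text.
const FINAL_PATTERN = /FINAL\("([^"]*)"\)/;

function runLoop(model: Model, sandbox: Sandbox, question: string, maxTurns = 10): string | null {
  const transcript: string[] = [`QUESTION: ${question}`];
  for (let turn = 0; turn < maxTurns; turn++) {
    const code = model(transcript);
    const final = FINAL_PATTERN.exec(code);
    if (final) return final[1]; // the model has committed to an answer
    const { output } = sandbox(code); // real output, not a guess
    transcript.push(`CODE: ${code}`, `OUTPUT: ${output}`);
  }
  return null; // turn budget exhausted without an answer
}

// Stub model and sandbox so the sketch runs without an LLM.
const scripted = ["text_stats()", 'fuzzy_search("SALES_DATA")', 'FINAL("done")'];
const stubModel: Model = (transcript) => scripted[(transcript.length - 1) / 2];
const stubSandbox: Sandbox = (code) => ({ output: `ran ${code}` });

const answer = runLoop(stubModel, stubSandbox, "What is the total?");
```

The turn limit is there so that a model that never calls FINAL can't keep the loop running forever.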

The system can work entirely locally using something like Ollama with Qwen-Coder, or with DeepSeek, which is much smarter by default.

Allowing an LLM to write and run code directly on your system is obviously a security nightmare, so the implementation uses isolated-vm to create a secure sandbox for it to play in.

Even if the model hallucinates rm -rf / or tries to curl a URL, the code has no way to do it from inside the sandbox. The sandbox also guards against infinite loops and memory leaks. And since the document is immutable, the model can read it but cannot alter the source of truth.

I also used Universal Tool Calling Protocol (UTCP) patterns from code-mode to generate strict TypeScript interfaces. This provides the LLM with a strict contract:

// The LLM sees exactly this signature in its system prompt
declare function fuzzy_search(query: string, limit?: number): Array<{
  line: string;
  lineNum: number;
  score: number; // 0 to 1 confidence
}>;
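For illustration, here is one way a function with this signature could be implemented. The post doesn't say how the real scoring works, so the bigram-overlap score, the 1-based lineNum, and the `makeFuzzySearch` and `bigrams` helpers are all assumptions.

```typescript
type Match = { line: string; lineNum: number; score: number };

// Distinct lowercase character bigrams of a string.
function bigrams(text: string): Set<string> {
  const lower = text.toLowerCase();
  const result = new Set<string>();
  for (let i = 0; i < lower.length - 1; i++) result.add(lower.slice(i, i + 2));
  return result;
}

// Builds a fuzzy_search bound to one document. Scoring is an assumption:
// the fraction of the query's bigrams that also occur in the line.
function makeFuzzySearch(text: string) {
  const lines = text.split("\n");
  return function fuzzy_search(query: string, limit = 10): Match[] {
    const wanted = bigrams(query);
    if (wanted.size === 0) return [];
    return lines
      .map((line, index) => {
        const present = bigrams(line);
        let hits = 0;
        wanted.forEach((gram) => {
          if (present.has(gram)) hits++;
        });
        return { line, lineNum: index + 1, score: hits / wanted.size };
      })
      .filter((match) => match.score > 0) // drop lines with nothing in common
      .sort((a, b) => b.score - a.score) // best matches first
      .slice(0, limit);
  };
}

const search = makeFuzzySearch(
  ["Lorem ipsum dolor sit amet", "SALES_DATA_NORTH: $2,340,000", "sales dta south"].join("\n"),
);
const results = search("SALES_DATA", 2);
```

Here the exact line scores 1, the misspelled "sales dta south" gets a lower but nonzero score, and the filler line is dropped entirely.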

Another problem is that LLMs are messy coders. They forget semicolons, use hallucinated imports, etc. The way around that is to have a self-healing layer. If the sandbox throws a syntax error, a lightweight intermediate step attempts to fix imports and syntax before re-running. This keeps the reasoning chain alive and minimizes round trips to the model.
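A minimal sketch of what such a layer might look like, assuming the only repair is stripping import lines and that one retry is allowed. The `heal` and `runWithHealing` names are hypothetical, and `new Function` is only a stand-in executor for the demo, not a sandbox.

```typescript
// Remove import lines, which the sandbox could not resolve anyway.
// Which repairs the real project applies is not stated in the post.
function heal(code: string): string {
  return code
    .split("\n")
    .filter((line) => !/^\s*import\s/.test(line))
    .join("\n")
    .trim();
}

type Executor = (code: string) => unknown;

function runWithHealing(code: string, execute: Executor): unknown {
  try {
    return execute(code);
  } catch (error) {
    if (!(error instanceof SyntaxError)) throw error; // only heal syntax problems
    return execute(heal(code)); // one retry, no round trip to the model
  }
}

// Stand-in executor for the demo only; it offers no isolation at all.
const unsafeExecute: Executor = (code) => new Function(code)();

const healed = runWithHealing('import fs from "fs";\nreturn 1 + 2;', unsafeExecute);
```

Runtime errors are deliberately rethrown here: those usually mean the model's logic is wrong, and the model needs to see them to correct course.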

As a demo of the concept, I made a document with a bunch of scattered data: 5 distinct sales figures hidden inside 4,700 characters of Lorem Ipsum filler and unrelated business jargon.

Feeding the text into a standard context window and asking for the total will almost certainly give you a hallucinated total like $480,490. It just grabs numbers that look like currency from unrelated sections and mashes them together.

Running the same query through RLM took around 4 turns on average in my tests, but the difference was night and day.

The model didn't guess. It first checked the file size.

const stats = text_stats();
console.log(`Document length: ${stats.length}, Lines: ${stats.lineCount}`);

Next, it used fuzzy search to locate relevant lines, ignoring the noise.

const matches = fuzzy_search("SALES_DATA");
console.log(matches);
// Output: [
//   { line: "SALES_DATA_NORTH: $2,340,000", ... },
//   { line: "SALES_DATA_SOUTH: $3,120,000", ... }
// ]

And finally, it wrote a regex to parse the strings into integers and summed them programmatically to get the correct result.

// ...regex parsing logic...
console.log("Calculated Total:", total); // Output: 13000000
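The parsing step is elided above, so here is a rough sketch of what it could look like. The `parseAmount` helper and its regex are assumptions, and only the two lines visible in the fuzzy_search output are used, so the total here is not the 13000000 from the full five-figure document.

```typescript
// The two lines shown in the fuzzy_search output; the demo document has five.
const matchedLines = [
  "SALES_DATA_NORTH: $2,340,000",
  "SALES_DATA_SOUTH: $3,120,000",
];

// Pull the dollar amount out of a line and convert it to an integer.
function parseAmount(line: string): number | null {
  const found = /\$([\d,]+)/.exec(line);
  return found ? parseInt(found[1].replace(/,/g, ""), 10) : null;
}

const total = matchedLines
  .map(parseAmount)
  .filter((amount): amount is number => amount !== null) // skip lines with no amount
  .reduce((sum, amount) => sum + amount, 0);

console.log("Calculated Total:", total);
```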

Only after the code output confirmed the math did the model give its final answer.

The key difference is that the traditional approach asks the model "what does this document say?", while the recursive coding approach asks it to write a program to find out what the document says. The logic is now expressed using actual code, and the role of the LLM is to write the code and read the results as opposed to working with the document directly.

As with all things, there is a trade-off here: the RLM approach is slower, since it takes multiple turns and can generate more tokens as a result.

edit: it does look like for smaller models, you kinda have to tweak things to be model specific, or they get confused with general prompts

7

4

C3 Programming Language (c3-lang.org)

submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net

2 comments fedilink

8

3

Salmon Recipe (waveinscriber.com)

submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

9

6

No Graphics API — Sebastian Aaltonen (www.sebastianaaltonen.com)

submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net

1 comments fedilink

10

4

Top Gun's Carrier Landing: Exposed (relaxing.run)

submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

11

27

4 billion if statements (andreasjhkarlsson.github.io)

submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net

7 comments fedilink

12

Super Mario 64 for the PS1 (github.com)

submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net

3 comments fedilink

13

15

Spinlocks vs. Mutexes: When to Spin and When to Sleep (howtech.substack.com)

submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

14

9

SVG Filters - Clickjacking 2.0 [Lyra Rebane] (lyra.horse)

submitted 2 months ago* (last edited 2 months ago) by AernaLingus@hexbear.net to c/programming@hexbear.net

0 comments fedilink

As usual, some absolutely diabolical CSS wizardry from Lyra.

15

9

In defense of lock poisoning in Rust · sunshowers (sunshowers.io)

submitted 2 months ago by AernaLingus@hexbear.net to c/programming@hexbear.net

0 comments fedilink

16

7

Advanced, Overlooked Python Typing (martynassubonis.substack.com)

submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net

1 comments fedilink

17

7

Running unsupported iOS on deprecated devices (nyansatan.github.io)

submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

18

2

Why Castrol Honda Superbike crashes on (most) modern systems (seri.tools)

submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

19

You no longer need JavaScript [Lyra Rebane] (lyra.horse)

submitted 3 months ago by AernaLingus@hexbear.net to c/programming@hexbear.net

2 comments fedilink

Informative post (with tons of examples!) from a genuine CSS wizard. Definitely check out her projects css clicker and antonymph, or if cybersecurity is more your thing, her recent talk on CSS exploits.

20

12

Arcpy (hexbear.net)

submitted 3 months ago by invalidusernamelol@hexbear.net to c/programming@hexbear.net

3 comments fedilink

Anyone here have experience with arcpy? I've been working on a wrapper library that tries its best to bring it into compliance with the Python data model and would like some testers.

I'll dm you a link to the repository if you have any interest. It's still pretty barebones and focused mainly on simplifying interaction with file databases and project files.

21

Making Linked Lists on C-likes makes me appreciate Lisp even more (hexbear.net)

submitted 3 months ago by hello_hello@hexbear.net to c/programming@hexbear.net

33 comments fedilink

I get dizzy looking at C-like languages, they just feel incredibly hard to follow compared to an S-expression.

Everything is just so verbose and there's so much negative space between the lines. To be fair, this course is making us program using Java so maybe it has to do more with that.

22

6

Casey Muratori – The Big OOPs: Anatomy of a Thirty-five-year Mistake – BSC 2025 (www.youtube.com)

submitted 4 months ago by AernaLingus@hexbear.net to c/programming@hexbear.net

1 comments fedilink

Don't let the title mislead you into thinking this is a polemic—it's an excellent talk that traces the lineage of object-oriented programming (which goes back way further than you might imagine) and helps elucidate why it became such a dysfunctional pattern. It's really well-researched, citing a bunch of primary sources, and the speaker is clear and engaging. Never would have thought a nearly 2 hour lecture^[There's about ~35 minutes of Q&A, hence the runtime] on OOP would engender such rapt attention—there are so many fascinating tidbits that were dug up from ancient academic papers.

So much of computing is about the newest and shiniest thing that it's easy to lose sight of how—and importantly, why—we got here, and I think reexamining these foundations can help prevent us from propagating the mistakes of the past.

23

40

Hosting a WebSite on a Disposable Vape (bogdanthegeek.github.io)

submitted 4 months ago by git@hexbear.net to c/programming@hexbear.net

2 comments fedilink

24

2

Video Game Blurs (and how the best one works) (blog.frost.kiwi)

submitted 5 months ago by git@hexbear.net to c/programming@hexbear.net

0 comments fedilink

25

1

I'm looking for Rust guides aimed at beginners/people new to low level languages (hexbear.net)

submitted 5 months ago* (last edited 5 months ago) by BountifulEggnog@hexbear.net to c/programming@hexbear.net

11 comments fedilink

A lot of what I've found seems to assume C knowledge, or other things that I just don't understand. I know a little bit of Python; I've mostly made small scripts for web scraping, data management, and Discord bots. My ultimate goal is a Game Boy emulator, but I plan on making a CHIP-8 interpreter first, along with any other test programs etc. I need. There's a lot of guides and resources for making either of those projects but I need to understand Rust first.

Picking Rust because it's modern, memory management seems a lot easier, and I've generally heard good things about it. I've also heard that some errors that cause runtime issues in something like C are caught by Rust's compiler.