1
6
submitted 6 hours ago by git@hexbear.net to c/programming@hexbear.net
2
4
submitted 6 hours ago by git@hexbear.net to c/programming@hexbear.net
3
3
submitted 6 hours ago by git@hexbear.net to c/programming@hexbear.net
4
2
submitted 1 week ago by git@hexbear.net to c/programming@hexbear.net
5
3
How To Hack A Denuvo Game (www.youtube.com)
submitted 2 weeks ago by git@hexbear.net to c/programming@hexbear.net
6
10
submitted 3 weeks ago* (last edited 3 weeks ago) by yogthos@lemmygrad.ml to c/programming@hexbear.net

Despite context windows expanding to millions of tokens, LLMs still struggle with something fundamental: precision.

When you ask an LLM to "analyze this report," it often glances at the text and simply hallucinates a plausible sounding answer based on probability.

A good example of the problem can be seen in asking a model to sum sales figures from a financial report. Left to its own devices, it will not bother reading the whole document, and simply give you a hallucinated answer. This is especially a problem with smaller models you can run locally.

The Recursive Language Model paper comes up with a clever technique that forces the LLM to stop guessing and start coding.

The standard way to deal with the problem is Retrieval Augmented Generation (RAG), which relies on semantic similarity (embeddings). If you ask for "sales figures," a vector DB retrieves chunks of text that sound like sales figures. But semantic similarity is fuzzy, and limited in what it can do.

For example, embeddings can't count, so you can't ask "count the number of times X happens." They can't handle information that's scattered across a bunch of unrelated lines. And they can't distinguish between concepts like "Projected Sales" and "Actual Sales" when they appear in similar contexts.
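
To make the counting limitation concrete, here is a minimal sketch. The sample lines and the helper name countExact are made up for illustration and don't come from the paper or the implementation. A few lines of code answer "how many times does X appear" exactly, and keep "Actual Sales" apart from "Projected Sales", which a similarity score alone can't promise.

```typescript
// Hypothetical sample lines; not taken from the demo document.
const sampleLines: string[] = [
  "Projected Sales Q1: $1,000",
  "Actual Sales Q1: $900",
  "Lorem ipsum dolor sit amet",
  "Actual Sales Q2: $1,100",
];

// Count lines that start with an exact label (hypothetical helper).
function countExact(lines: string[], label: string): number {
  return lines.filter((line) => line.startsWith(label)).length;
}

console.log(countExact(sampleLines, "Actual Sales"));    // 2
console.log(countExact(sampleLines, "Projected Sales")); // 1
```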

It would be nice to have a system that treats text as a dataset to be queried, as opposed to a prompt to be completed. This is where RLMs come in.

Here, the model acts as a programmer, and writes code to explore the document, verify its execution results, and only then formulate an answer based on them.

The core insight is that code execution provides grounding for the model. When an LLM guesses a number, it might be wrong. When an LLM writes regex.match() and the computer returns ['$2,340,000'], that result is a hard fact.

The process works as follows:

  1. The document is loaded into a secure, isolated Node.js environment as a read-only context variable.
  2. The model is given exploration tools: text_stats(), fuzzy_search(), and slice().
  3. The Loop:
  • The model writes TypeScript to probe the text.
  • The Sandbox executes it and returns the real output.
  • The model reads the output and refines its next step.
  4. The model iterates until it has enough proven data to answer FINAL("...").
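
The loop in the steps above can be sketched roughly like this. Everything here is an assumption for illustration: the types Turn, ModelFn and SandboxFn, the helper names, and the turn limit are hypothetical, and a real model call would be async rather than a plain function. The fake model and fake sandbox are scripted stand-ins so the sketch runs without an LLM.

```typescript
// Hypothetical types: the real implementation's interfaces may differ.
type Turn = { code: string; output: string };
type ModelFn = (question: string, turns: Turn[]) => string; // returns code or FINAL("...")
type SandboxFn = (code: string) => string;                  // returns captured output

// Pull the answer out of FINAL("...") if the model has finished.
function parseFinal(reply: string): string | null {
  const m = reply.match(/FINAL\("([\s\S]*)"\)/);
  return m?.[1] ?? null;
}

function runLoop(question: string, model: ModelFn, sandbox: SandboxFn, maxTurns = 8): string {
  const turns: Turn[] = [];
  for (let i = 0; i < maxTurns; i++) {
    const reply = model(question, turns);
    const answer = parseFinal(reply);
    if (answer !== null) return answer; // the model decided it has enough evidence
    // Otherwise treat the reply as code, run it, and feed the real output back.
    turns.push({ code: reply, output: sandbox(reply) });
  }
  throw new Error("No FINAL answer within the turn limit");
}

// Scripted stand-ins so the sketch runs without an LLM or a real sandbox.
const fakeModel: ModelFn = (_question, turns) => {
  const first = turns[0];
  return first ? `FINAL("${first.output}")` : "text_stats()";
};
const fakeSandbox: SandboxFn = () => "lines: 3";

console.log(runLoop("How many lines?", fakeModel, fakeSandbox)); // lines: 3
```

The point of the shape is that the model never sees anything it didn't get back from the sandbox, and the turn limit keeps a confused model from looping forever.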

The system can work entirely locally using something like Ollama with Qwen-Coder, or with DeepSeek which is much smarter by default.

Allowing an LLM to write and run code directly on your system is obviously a security nightmare, so the implementation uses isolated-vm to create a secure sandbox for it to play in.

Inside the sandbox, a hallucinated rm -rf / or curl call can't do anything, because the code has no access to the shell, filesystem, or network. The sandbox also contains infinite loops and runaway memory use. And since the document is immutable, the model can read it but cannot alter the source of truth.

I also used Universal Tool Calling Protocol (UTCP) patterns from code-mode to generate strict TypeScript interfaces. This provides the LLM with a strict contract:

// The LLM sees exactly this signature in its system prompt
declare function fuzzy_search(query: string, limit?: number): Array<{
  line: string;
  lineNum: number;
  score: number; // 0 to 1 confidence
}>;
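
For a sense of what could sit behind that signature, here is a minimal sketch that scores each line by the share of the query's character trigrams it contains. This is an assumption for illustration, not the actual implementation: the scoring method, the makeFuzzySearch factory, the default limit of 10, and 1-based lineNum are all hypothetical.

```typescript
type Match = { line: string; lineNum: number; score: number };

// Character trigrams of a lowercased string.
function trigrams(s: string): Set<string> {
  const t = s.toLowerCase();
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= t.length; i++) grams.add(t.slice(i, i + 3));
  return grams;
}

// Build a fuzzy_search bound to one document (hypothetical factory).
function makeFuzzySearch(text: string) {
  const lines = text.split("\n");
  return function fuzzy_search(query: string, limit = 10): Match[] {
    const q = trigrams(query);
    if (q.size === 0) return []; // queries shorter than 3 chars have no trigrams
    return lines
      .map((line, i) => {
        const g = trigrams(line);
        let hits = 0;
        q.forEach((gram) => { if (g.has(gram)) hits++; });
        // score = fraction of query trigrams found in the line, so 0 to 1
        return { line, lineNum: i + 1, score: hits / q.size };
      })
      .filter((m) => m.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit);
  };
}

const searchDemo = makeFuzzySearch(
  "Lorem ipsum\nSALES_DATA_NORTH: $2,340,000\nRegional sales data summary"
);
// Best match first: the SALES_DATA_NORTH line (lineNum 2) scores 1.
console.log(searchDemo("SALES_DATA", 2));
```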

Another problem is that LLMs are messy coders. They forget semicolons, use hallucinated imports, etc. The way around that is to have a self-healing layer. If the sandbox throws a syntax error, a lightweight intermediate step attempts to fix imports and syntax before re-running. This keeps the reasoning chain alive and minimizes round trips to the model.
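
A rough sketch of what such a repair pass could look like, under my own assumptions: healCode and runWithHealing are hypothetical names, and the only repairs shown are stripping stray markdown fences and import/require lines. The real layer may do more or work differently.

```typescript
const FENCE = "`".repeat(3); // markdown code fence marker

// Hypothetical repair pass: drop fence lines and import/require lines,
// since a locked-down sandbox could not resolve those modules anyway.
function healCode(code: string): string {
  return code
    .split("\n")
    .filter((line) => !line.trim().startsWith(FENCE))
    .filter((line) => !/^\s*import\s.+from\s+['"].+['"];?\s*$/.test(line))
    .filter((line) => !/^\s*(const|let|var)\s+.+=\s*require\(.+\);?\s*$/.test(line))
    .join("\n")
    .trim();
}

// Run once; on a SyntaxError, repair and retry a single time without
// going back to the model (hypothetical wrapper).
function runWithHealing(code: string, run: (c: string) => string): string {
  try {
    return run(code);
  } catch (err) {
    if (!(err instanceof SyntaxError)) throw err;
    return run(healCode(code));
  }
}

// Stand-in runner that rejects import statements, like a script-mode sandbox would.
const fakeRun = (c: string): string => {
  if (c.includes("import ")) throw new SyntaxError("Cannot use import statement");
  return "ok";
};

console.log(runWithHealing('import fs from "fs";\nconsole.log(1);', fakeRun)); // ok
```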

As a demo of the concept, I made a document with scattered data: 5 distinct sales figures hidden inside 4,700 characters of Lorem Ipsum filler and unrelated business jargon.

Feeding the text into a standard context window and asking for the total will almost certainly give you a hallucinated total like $480,490. It just grabs numbers that look like currency from unrelated sections and mashes them together.

Running the same query through RLM took around 4 turns on average in my tests, but the difference was night and day.

The model didn't guess. It first checked the file size.

const stats = text_stats();
console.log(`Document length: ${stats.length}, Lines: ${stats.lineCount}`);
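
For reference, a helper like text_stats() can be very small. This is a sketch under assumptions: only the length and lineCount fields appear in the call above, so the TextStats type and the makeTextStats factory are hypothetical.

```typescript
// Hypothetical shape: only `length` and `lineCount` appear in the post.
type TextStats = { length: number; lineCount: number };

// Bind the helper to one document so the model-facing call takes no arguments.
function makeTextStats(text: string): () => TextStats {
  return () => ({ length: text.length, lineCount: text.split("\n").length });
}

const text_stats = makeTextStats("first line\nsecond line\nthird");
const demoStats = text_stats();
console.log(`Document length: ${demoStats.length}, Lines: ${demoStats.lineCount}`);
// Document length: 28, Lines: 3
```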

Next, it used fuzzy search to locate relevant lines, ignoring the noise.

const matches = fuzzy_search("SALES_DATA");
console.log(matches);
// Output: [
//   { line: "SALES_DATA_NORTH: $2,340,000", ... },
//   { line: "SALES_DATA_SOUTH: $3,120,000", ... }
// ]

And finally, it wrote a regex to parse the strings into integers and summed them programmatically to get the correct result.

// ...regex parsing logic...
console.log("Calculated Total:", total); // Output: 13000000
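
The parsing step is elided above, so here is a hypothetical sketch of what it might look like. It uses only the two lines visible in the fuzzy_search output, so it produces a partial total rather than the full 13000000. The names parseDollars and sumDollars are mine.

```typescript
// Only the two lines shown in the fuzzy_search output; the rest are elided in the post.
const matchedLines: string[] = [
  "SALES_DATA_NORTH: $2,340,000",
  "SALES_DATA_SOUTH: $3,120,000",
];

// Parse "$2,340,000" into 2340000; returns null when no dollar amount is present.
function parseDollars(line: string): number | null {
  const m = line.match(/\$([\d,]+)/);
  const digits = m?.[1];
  return digits ? parseInt(digits.replace(/,/g, ""), 10) : null;
}

// Sum every line that contains a dollar amount, skipping the ones that don't.
function sumDollars(lines: string[]): number {
  let sum = 0;
  for (const line of lines) {
    const value = parseDollars(line);
    if (value !== null) sum += value;
  }
  return sum;
}

console.log("Partial total:", sumDollars(matchedLines)); // Partial total: 5460000
```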

Only after the code output confirmed the math did the model commit to the answer.

The key difference is that the traditional approach asks the model "what does this document say?", while the recursive coding approach asks it to write a program to find out what the document says. The logic is now expressed in actual code, and the role of the LLM is to write the code and read the results rather than work with the document directly.

As with all things, there is a trade-off: the RLM approach is slower, since it takes multiple turns and can generate more tokens as a result.

edit: it does look like for smaller models, you kinda have to tweak things to be model specific, or they get confused with general prompts

7
4
submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net
8
3
Salmon Recipe (waveinscriber.com)
submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net
9
6
No Graphics API — Sebastian Aaltonen (www.sebastianaaltonen.com)
submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net
10
4
submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net
11
27
4 billion if statements (andreasjhkarlsson.github.io)
submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net
12
12
submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net
13
15
submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net
14
9
submitted 2 months ago* (last edited 2 months ago) by AernaLingus@hexbear.net to c/programming@hexbear.net

As usual, some absolutely diabolical CSS wizardry from Lyra.

15
9
16
7
Advanced, Overlooked Python Typing (martynassubonis.substack.com)
submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net
17
7
submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net
18
2
submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net
19
19

Informative post (with tons of examples!) from a genuine CSS wizard. Definitely check out her projects css clicker and antonymph, or if cybersecurity is more your thing, her recent talk on CSS exploits.

20
12
Arcpy (hexbear.net)

Anyone here have experience with arcpy? I've been working on a wrapper library that tries its best to bring it into compliance with the Python data model and would like some testers.

I'll dm you a link to the repository if you have any interest. It's still pretty barebones and focused mainly on simplifying interaction with file databases and project files.

21
21

I get dizzy looking at C-like languages, they just feel incredibly hard to follow compared to an S-expression.

Everything is just so verbose and there's so much negative space between the lines. To be fair, this course is making us program using Java so maybe it has to do more with that.

22
6

Don't let the title mislead you into thinking this is a polemic—it's an excellent talk that traces the lineage of object-oriented programming (which goes back way further than you might imagine) and helps elucidate why it became such a dysfunctional pattern. It's really well-researched, citing a bunch of primary sources, and the speaker is clear and engaging. Never would have thought a nearly 2 hour lecture^[There's about ~35 minutes of Q&A, hence the runtime] on OOP would engender such rapt attention—there are so many fascinating tidbits that were dug up from ancient academic papers.

So much of computing is about the newest and shiniest thing that it's easy to lose sight of how—and importantly, why—we got here, and I think reexamining these foundations can help prevent us from propagating the mistakes of the past.

23
40
submitted 4 months ago by git@hexbear.net to c/programming@hexbear.net
24
2
submitted 5 months ago by git@hexbear.net to c/programming@hexbear.net
25
1
submitted 5 months ago* (last edited 5 months ago) by BountifulEggnog@hexbear.net to c/programming@hexbear.net

A lot of what I've found seems to assume C knowledge, or other things that I just don't understand. I know a little bit of Python; I've mostly made small scripts for web scraping, data management, and Discord bots. My ultimate goal is a Game Boy emulator, but I plan on making a CHIP-8 interpreter first, along with any other test programs etc. I need. There are a lot of guides and resources for making either of those projects, but I need to understand Rust first.

Picking Rust because it's modern, memory management seems a lot easier, and I've generally heard good things about it. I've also heard that some errors that cause runtime issues in something like C are caught by Rust's compiler.


programming

291 readers
9 users here now

  1. Post about programming, interesting repos, learning to program, etc. Let's try to keep free software posts in the c/libre comm unless the post is about the programming/is to the repo.

  2. Do not doxx yourself by posting a repo that is yours and in any way leads to your personally identifying information. Use reports if necessary to alert mods to a potential doxxing.

  3. Be kind, keep struggle sessions focused on the topic of programming.

founded 2 years ago