3
2

Crossposting here because I found the fastmem and floating point arithmetic improvements to be technically quite interesting!

cross-posted from: https://hexbear.net/post/7926055

Two blog posts in less than a month? They're spoiling us!

This is a real humdinger of a progress report, too. High-level summary:

  • (link) Massive performance improvements to the two Rogue Squadron games through a combination of emulation improvements and settings changes, which now allow them to run at full speed on high-end hardware (and at very playable speeds on low-end hardware)
  • (link) Further improvements to the newly-added Triforce arcade emulation (check out the previous blog post for more info about Triforce)
  • (link) Core emulation improvement to an edge case of floating-point arithmetic that fixes a desync in Mario Strikers Charged; now, Dolphin can play online with real Wiis in that game. I think this was my favorite bit in the post—a real team effort with perseverance over many years!
  • (link) Rough timings implemented for Wii NAND management to allow for better performance on that menu
  • (link) The ability to preload entire games into RAM, a long-requested feature. The reason it hadn't been implemented earlier is that it's completely unnecessary with any modern storage, since even a crappy USB stick is faster than disc access on a GC/Wii, but this is apparently helpful for people who have their games stored on a NAS where disks might actually spin down, causing lag spikes.
  • (link) New GUI settings for SDL controller tweaks, specifically SDL hinting (apparently helpful for using Joy-Cons as a separate Nunchuk + Wiimote, as well as fixing DS4 connectivity issues).
  • (link) Performance patches for a half dozen games, most notably Need For Speed: Hot Pursuit 2 (my beloved) and 007: Quantum of Solace. For those not in the know, there's a relatively new feature in Dolphin which allows for games to be patched on-the-fly to fix issues like uncapped framerates and complex idle loops that can bring the emulator to its knees even though it can otherwise run the games fine.

Aside from the interesting technical details, reading these progress reports always gives me the warm fuzzies. I love hearing about how all these different people come together and use their unique talents to improve emulation for everyone.

5
14
So you want to write an "app" (arcanenibble.github.io)
submitted 2 weeks ago by git@hexbear.net to c/programming@hexbear.net
10
6
microgpt (karpathy.github.io)
submitted 3 weeks ago by git@hexbear.net to c/programming@hexbear.net
16
12

The muse of Python thinks Claude is his friend/colleague :(

21
3
How To Hack A Denuvo Game (www.youtube.com)
submitted 1 month ago by git@hexbear.net to c/programming@hexbear.net
22
11
submitted 2 months ago* (last edited 2 months ago) by yogthos@lemmygrad.ml to c/programming@hexbear.net

Despite context windows expanding to millions of tokens, LLMs still struggle with a fundamental task: precision.

When you ask an LLM to "analyze this report," it often glances at the text and simply hallucinates a plausible-sounding answer based on probability.

A good example of the problem can be seen in asking a model to sum sales figures from a financial report. Left to its own devices, it will not bother reading the whole document, and simply give you a hallucinated answer. This is especially a problem with smaller models you can run locally.

The Recursive Language Model paper comes up with a clever technique that forces the LLM to stop guessing and start coding.

The standard approach to the problem is Retrieval-Augmented Generation (RAG), which relies on semantic similarity (embeddings). If you ask for "sales figures," a vector DB retrieves chunks of text that sound like sales figures. But semantic similarity is fuzzy and limited in what it can express.

For example, embeddings can't count, so you can't ask "count the number of times X happens." They can't handle information that's scattered across a bunch of unrelated lines. And they can't distinguish between concepts like "Projected Sales" and "Actual Sales" when they appear in similar contexts.

It would be nice to have a system that treats text as a dataset to be queried rather than a prompt to be completed, and this is where RLMs come in.

Here, the model acts as a programmer, and writes code to explore the document, verify its execution results, and only then formulate an answer based on them.

The core insight is that code execution provides grounding for the model. When an LLM guesses a number, it might be wrong. When an LLM writes regex.match() and the computer returns ['$2,340,000'], that result is a hard fact.
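A toy illustration of that grounding (the document string and the regex here are made up for the example):

```typescript
// A made-up snippet standing in for the report text.
const doc = "Summary of Q3. SALES_DATA_NORTH: $2,340,000. Misc notes follow.";

// The model can guess a number, or it can ask the runtime:
const hit = doc.match(/\$[\d,]+/);
console.log(hit?.[0]); // "$2,340,000" -- a hard fact, not a guess
```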

The process works as follows:

  1. The document is loaded into a secure, isolated Node.js environment as a read-only context variable.
  2. The model is given exploration tools: text_stats(), fuzzy_search(), and slice().
  3. The loop:
  • The model writes TypeScript to probe the text.
  • The sandbox executes it and returns the real output.
  • The model reads the output and refines its next step.
  4. The model iterates until it has enough proven data to answer FINAL("...").
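The loop can be sketched as a small driver; askModel and runInSandbox here are hypothetical interfaces standing in for the real model and sandbox calls:

```typescript
// Hypothetical driver for the explore/execute/refine loop.
// askModel and runInSandbox are assumed interfaces, not a real API.
async function rlmLoop(
  askModel: (transcript: string) => Promise<string>,
  runInSandbox: (code: string) => Promise<string>,
  maxTurns = 8
): Promise<string> {
  let transcript = "";
  for (let i = 0; i < maxTurns; i++) {
    const code = await askModel(transcript);        // model writes code
    const done = code.match(/FINAL\("([\s\S]*?)"\)/);
    if (done) return done[1];                       // model committed to an answer
    const output = await runInSandbox(code);        // sandbox runs it for real
    transcript += `\n> ${code}\n${output}`;         // model sees the actual output
  }
  throw new Error("no FINAL answer within the turn budget");
}
```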

The system can work entirely locally using something like Ollama with Qwen-Coder, or with DeepSeek which is much smarter by default.

Allowing an LLM to write and run code directly on your system is obviously a security nightmare, so the implementation uses isolated-vm to create a secure sandbox for it to play in.

The model cannot hallucinate rm -rf / or curl a URL. Having a sandbox also prevents infinite loops or memory leaks. And since the document is immutable, the model can read it but cannot alter the source truth.
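A rough sketch of the sandbox idea (the real implementation uses isolated-vm; Node's built-in vm module shown here has a similar shape but is not an actual security boundary):

```typescript
import * as vm from "node:vm";

// Illustration only: the post uses isolated-vm for real isolation.
const documentText = "SALES_DATA_NORTH: $2,340,000"; // stand-in document
const results: string[] = [];

// The document is exposed as a context variable; JS strings are immutable,
// so sandboxed code can read it but never alter the source text.
const sandbox = vm.createContext({
  context: documentText,
  console: { log: (...args: unknown[]) => results.push(args.join(" ")) },
});

// "Model-written" code, executed with a timeout to guard against loops.
const modelCode = 'console.log(context.match(/\\$[\\d,]+/)[0]);';
vm.runInContext(modelCode, sandbox, { timeout: 1000 });
console.log(results[0]); // "$2,340,000"
```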

I also used Universal Tool Calling Protocol (UTCP) patterns from code-mode to generate strict TypeScript interfaces. This gives the LLM an explicit contract:

// The LLM sees exactly this signature in its system prompt
declare function fuzzy_search(query: string, limit?: number): Array<{
  line: string;
  lineNum: number;
  score: number; // 0 to 1 confidence
}>;
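The post doesn't show fuzzy_search's internals; as one minimal way to satisfy that signature, a toy implementation could score lines by query-token overlap:

```typescript
// Toy implementation of the declared interface above: scores each line by
// the fraction of query tokens it contains (a stand-in for real fuzzy matching).
function makeFuzzySearch(text: string) {
  const lines = text.split("\n");
  return function fuzzy_search(query: string, limit = 5) {
    const tokens = query.toLowerCase().split(/[\s_]+/).filter(Boolean);
    return lines
      .map((line, i) => ({
        line,
        lineNum: i + 1, // 1-based, matching typical editor conventions
        score: tokens.filter(t => line.toLowerCase().includes(t)).length / tokens.length,
      }))
      .filter(r => r.score > 0)
      .sort((a, b) => b.score - a.score)
      .slice(0, limit);
  };
}
```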

Another problem is that LLMs are messy coders: they forget semicolons, use hallucinated imports, etc. The way around that is a self-healing layer. If the sandbox throws a syntax error, a lightweight intermediate step attempts to fix imports and syntax before re-running. This keeps the reasoning chain alive and minimizes round trips to the model.
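The healing step isn't spelled out in the post; a sketch of the idea, with a trivial rule-based fix standing in for the lightweight intermediate step:

```typescript
// Hypothetical self-healing wrapper: if execution throws, apply cheap fixes
// and retry, falling back to the original error if nothing helps.
function runWithHealing(
  run: (code: string) => string,
  code: string,
  fixes: Array<(code: string) => string>
): string {
  try {
    return run(code);
  } catch (err) {
    for (const fix of fixes) {
      try {
        return run(fix(code)); // retry with the patched code
      } catch {
        /* try the next fix */
      }
    }
    throw err; // no fix worked; surface the original failure
  }
}

// One cheap fix: drop hallucinated import statements, since the sandbox
// has no module system anyway.
const stripImports = (code: string) => code.replace(/^import .*$/gm, "");
```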

As a demo of the concept, I made a document with a bunch of scattered data: 5 distinct sales figures hidden inside 4,700 characters of Lorem Ipsum filler and unrelated business jargon.

Feeding the text into a standard context window and asking for the total will almost certainly give you a hallucinated total like $480,490. The model just grabs numbers that look like currency from unrelated sections and mashes them together.

Running the same query through RLM took around 4 turns on average in my tests, but the difference was night and day.

The model didn't guess. It first checked the file size.

const stats = text_stats();
console.log(`Document length: ${stats.length}, Lines: ${stats.lineCount}`);

Next, it used fuzzy search to locate relevant lines, ignoring the noise.

const matches = fuzzy_search("SALES_DATA");
console.log(matches);
// Output: [
//   { line: "SALES_DATA_NORTH: $2,340,000", ... },
//   { line: "SALES_DATA_SOUTH: $3,120,000", ... }
// ]

And finally, it wrote a regex to parse the strings into integers and summed them programmatically to get the correct result.

const total = matches
  .map(m => parseInt(m.line.replace(/[^0-9]/g, ""), 10))
  .reduce((sum, n) => sum + n, 0);
console.log("Calculated Total:", total); // Output: 13000000

Only after the code output confirmed the math did the model commit to a final answer.

The key difference: the traditional approach asks the model "what does this document say?", while the recursive coding approach asks it to write a program that finds out what the document says. The logic is now expressed as actual code, and the LLM's role is to write that code and read the results rather than to work with the document directly.

As with all things, there's a trade-off: the RLM approach is slower, since it takes multiple turns and can generate more tokens as a result.

edit: it does look like for smaller models, you kinda have to tweak things to be model-specific, or they get confused by general prompts

24
3
Salmon Recipe (waveinscriber.com)
submitted 2 months ago by git@hexbear.net to c/programming@hexbear.net
25
6
No Graphics API — Sebastian Aaltonen (www.sebastianaaltonen.com)
submitted 3 months ago by git@hexbear.net to c/programming@hexbear.net

programming

297 readers

  1. Post about programming, interesting repos, learning to program, etc. Let's try to keep free software posts in the c/libre comm unless the post is about the programming or links to the repo.

  2. Do not doxx yourself by posting a repo that is yours and in any way leads to your personally identifying information. Use reports if necessary to alert mods to a potential doxxing.

  3. Be kind, keep struggle sessions focused on the topic of programming.

founded 2 years ago