this post was submitted on 11 Sep 2023
Technology
I think the fundamental difference in our perspectives is that I want to see neural expansion capabilities that are not limited by a static state and dedicated compilation. I think this is the only way to achieve a real AGI. If the neural network is static, you ultimately have a state machine whose output is fully determined by its input and its fixed weights. It can be ultra complex for sure, but it is still deterministic. I expect an AGI to be able to expand in any direction at any time, according to circumstances and needs; in other words, adaptability beyond any preprogrammed algorithm.
Forth is very old, from an era when most compute hardware was tailor-made. It was originally created as a way to get professional astronomy observatories online much more quickly. The fundamental concept with Forth is to create the simplest possible looping interpreter on any given system, using assembly or any supported API. The interpreter can then build on the Forth dictionary of words. Words are the fundamental building block of Forth. A word can be anything from a pointer to a variable or a function, up to an entire operating system and GUI. Anything can be assigned to a word, and a word can be any combination of data, types, and other words. The syntax is extremely simple. It is a stack-based language that sits very close to the bare metal. It is so simple and small that there are versions of Forth that run on tiny old 8-bit AVRs and other microcontrollers.
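To make the "looping interpreter plus dictionary of words" idea concrete, here is a minimal sketch in Python. It is not real Forth and not how any particular Forth is implemented; the class name `MiniForth` and the handful of primitives are my own picks for illustration. It only shows the shape: a data stack, a dictionary, and a `: name ... ;` definition that builds new words from existing ones.

```python
class MiniForth:
    """A toy Forth-like interpreter: one data stack, one dictionary of words."""

    def __init__(self):
        self.stack = []
        # Primitive words: each name maps to a callable that acts on the stack.
        self.words = {
            "+": lambda: self._binary(lambda a, b: a + b),
            "*": lambda: self._binary(lambda a, b: a * b),
            "dup": lambda: self.stack.append(self.stack[-1]),
            "swap": self._swap,
            "drop": lambda: self.stack.pop(),
        }

    def _binary(self, op):
        b = self.stack.pop()
        a = self.stack.pop()
        self.stack.append(op(a, b))

    def _swap(self):
        self.stack[-1], self.stack[-2] = self.stack[-2], self.stack[-1]

    def run(self, source):
        tokens = source.split()
        i = 0
        while i < len(tokens):
            tok = tokens[i].lower()
            if tok == ":":
                # ": name body ;" adds a new word made of existing words.
                end = tokens.index(";", i)
                name = tokens[i + 1].lower()
                body = " ".join(tokens[i + 2:end])
                self.words[name] = lambda body=body: self.run(body)
                i = end + 1
                continue
            if tok in self.words:
                self.words[tok]()          # known word: execute it
            else:
                self.stack.append(int(tok))  # otherwise treat it as a number
            i += 1
        return self.stack
```

Running `MiniForth().run(": square dup * ; 7 square 1 +")` defines `square` on the fly and leaves 50 on the stack. A real Forth does the same thing with threaded code and a few hundred bytes of assembly instead of Python lambdas.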
Anyway, the idea is that a threaded interpreter like Forth could be made to compile tensor layers. The API for the network would be part of the Forth dictionary. Another key aspect of Forth is that the syntax for creating new words is so simple that a word can be made that creates whatever formatting is required. This could make it possible for a model to provide arbitrary data for incorporation or modification, and allow Forth to attempt to add it to the network in real time. It could also be used to modify specific tensor weights when the user flags a bad output and provides a correction.
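A rough sketch of what "the network API is part of the dictionary" might look like, again in Python rather than Forth. Everything here is hypothetical: the class `TensorForth` and the words `layer`, `w!`, and `w@` are names I made up, and the "tensor" is just a nested list. The point is only that a defining word can create a new layer at run time, and that a single weight can be patched by another word without recompiling anything.

```python
class TensorForth:
    """Toy interpreter whose dictionary exposes a (hypothetical) weight API."""

    def __init__(self):
        self.stack = []
        self.layers = {}  # layer name -> weight matrix (list of rows)
        self.words = {
            "layer": self._layer,  # ( rows cols -- ) next token names the layer
            "w!": self._store,     # ( value row col layer -- )
            "w@": self._fetch,     # ( row col layer -- value )
        }

    def run(self, source):
        self.tokens = iter(source.split())
        for tok in self.tokens:
            if tok in self.words:
                self.words[tok]()
            else:
                self.stack.append(float(tok))
        return self.stack

    def _layer(self):
        # Defining word: consumes the next token as the new word's name.
        name = next(self.tokens)
        cols = int(self.stack.pop())
        rows = int(self.stack.pop())
        weights = [[0.0] * cols for _ in range(rows)]
        self.layers[name] = weights
        # The new word pushes a reference to its weight matrix.
        self.words[name] = lambda: self.stack.append(weights)

    def _store(self):
        weights = self.stack.pop()
        col = int(self.stack.pop())
        row = int(self.stack.pop())
        weights[row][col] = self.stack.pop()

    def _fetch(self):
        weights = self.stack.pop()
        col = int(self.stack.pop())
        row = int(self.stack.pop())
        self.stack.append(weights[row][col])
```

With this, `2 3 layer hidden` creates a 2x3 layer named `hidden` while the interpreter is running, `0.5 1 2 hidden w!` overwrites one weight, and `1 2 hidden w@` reads it back. Whether this scales to real model weights is exactly the open question; this only shows the mechanism.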
If we put aside text formatting, settings, and user interface elements, the main reason an LLM needs external code for interfacing is the propensity for errors caused by the syntax complexity of languages like Python or C. No current model can generate complex code reliable enough to execute internally without intervention. Forth is so flexible that a dictionary could even be a tensor table of weights, with words as the values. Forth is probably the most anti-standards, anti-syntax language ever created.
Conceptually, the interpreter is like a compiler, command line, task scheduler, and init/process manager all built into one ultra simple system. Words are built up from registers, flags, and interrupts to anything of arbitrary complexity. A model does not need this low-level interface with compute hardware, but that is not my point. Models are built on tensors and tokens. Forth can be made to speak these natively, in near real time, as prompted internally and without a separate compilation step; a true learning machine. Many Forth implementations also have a bookmarking mechanism, such as the MARKER word, that lets the dictionary roll back to a known good state when newly created words turn out to be broken.
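The rollback idea is simple enough to sketch in a few lines of Python. This is an illustration of the concept only: `RollbackDictionary` and `try_define` are hypothetical names, and a real Forth restores a dictionary pointer rather than copying a dict the way this does.

```python
class RollbackDictionary:
    """Toy dictionary of words with named restore points."""

    def __init__(self):
        self.words = {}    # word name -> callable
        self.markers = {}  # marker name -> snapshot of the dictionary

    def define(self, name, fn):
        self.words[name] = fn

    def marker(self, name):
        # Remember the dictionary exactly as it is right now.
        self.markers[name] = dict(self.words)

    def rollback(self, name):
        # Forget everything defined (or redefined) since the marker.
        self.words = self.markers.pop(name)

    def try_define(self, name, fn, probe):
        """Define a word, exercise it once, and roll back if it raises."""
        self.marker("checkpoint")
        self.define(name, fn)
        try:
            fn(probe)
        except Exception:
            self.rollback("checkpoint")
            return False
        self.markers.pop("checkpoint")
        return True
```

For example, after `d.define("double", lambda x: x * 2)`, calling `d.try_define("inverse", lambda x: 1 / x, 0)` fails on the probe value and leaves only `double` in the dictionary, while the same call with a probe of 4 keeps the new word.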
A word of warning: full desktop systems such as Gforth, which implements the ANS Forth standard, are intimidating at first glance. It is far better to look at something like FlashForth for microcontrollers to see the raw power of the basic system without the giant dictionaries present in modern desktop implementations.
The key book on the concepts behind Forth and threaded interpretive languages is here: https://archive.org/details/R.G.LoeligerThreadedInterpretiveLanguagesTheirDesignAndImplementationByteBooks1981