this post was submitted on 04 Oct 2024
38 points (86.5% liked)
Showerthoughts
A "Showerthought" is a simple term used to describe the thoughts that pop into your head while you're doing everyday things like taking a shower, driving, or just daydreaming. The best ones are thoughts that many people can relate to and they find something funny or interesting in regular stuff.
Rules
- All posts must be showerthoughts
- The entire showerthought must be in the title
- Avoid politics (NEW RULE as of 5 Nov 2024, trying it out)
- Posts must be original/unique
- Adhere to Lemmy's Code of Conduct
founded 1 year ago
::: spoiler An LLM is like a reflection of your prompt in the mirror of the training data, distorted by how the QKV alignment bias is implemented and configured, all inside a simulacrum. The underlying profile the model builds of you ultimately forms your ideal informational counterpart. It is the alignment that does much of the biasing.
In the case of the gender of doctors, it is probably premature to call it a bias in the model as opposed to a bias in the implementation of the interface. The first port of call would likely be the sampling techniques used in the zero-shot and embedding models. These models process the image and text and convert them into numbers (conditioning). Then there are a ton of potential issues in the sigma/guidance/sampling algorithm and how it is constrained.
I tend to favor ADM adaptive sampling. I can get away with a few general PID settings, but I need to dial them in for specific imagery when I find something I like. This is the same kind of PID tuning you might find in a precision temperature sensor and controller.
The range of ways the noise can be constrained will largely determine the path traveled through the neural layers of the model. For example, if I'm using an exponential constraint for guidance, that exponential shape sets how much of the image is derived at which point. With exponential, very little of the image comes from the early layers of the model, and the contribution builds until the later layers of the neural network resolve the majority of the image. The point at which this ends is largely just a setting. This timing also affects how many layers of alignment the image is subjected to in practice. Alignment enforces our cultural norms, but it is largely a form of overtraining and causes a lot of peripheral issues. For instance, the actual alignment is on the order of a few thousand parameters per layer, whereas each model layer is on the order of tens of millions of parameters.
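To put rough numbers on the exponential idea, here is a toy sketch. It is not code from any real sampler or model loader; the function name `exponential_weights` and the `rate` knob are made up for illustration. It only shows how an exponential curve leaves very little weight at the early steps and piles most of it onto the late ones.

```python
import math


def exponential_weights(num_steps: int, rate: float = 5.0) -> list[float]:
    """Return per-step weights that grow exponentially and sum to 1.

    `rate` is a hypothetical knob: larger values push more weight to late steps.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    # Position of each step on a 0..1 scale, fed through exp().
    raw = [math.exp(rate * i / (num_steps - 1)) for i in range(num_steps)]
    total = sum(raw)
    return [w / total for w in raw]


weights = exponential_weights(num_steps=20)
early_share = sum(weights[:10])   # share of the total in the first half
late_share = sum(weights[10:])    # share of the total in the second half
print(f"first half: {early_share:.1%}, second half: {late_share:.1%}")
```

Changing `rate` or cutting the list short is the toy equivalent of "the point at which this ends is largely just a setting".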
When the noise is constrained, it is basically like an audio sine wave getting attenuated. The sampling and guidance control the overshoot and undershoot of the waveform to bring it into a desired shape. These undulations pass through the model to find a path of least resistance. Only, with tensor ranks, there are far more dimensions than the 4 of Cartesian space plus time. These undulations and the sampling techniques used may have a large impact on the consistency of the imagery generated. Maybe all the female doctors present in the model sit in a region of that space where the waveform has the opposite polarity. Simply altering the sampling may alter the outcome. This pattern is not necessarily present in the model itself; it can instead be an artifact of the technique used to sample and guide the output.
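The attenuated sine wave analogy can be drawn out with a few lines of Python. This is only the analogy itself, not anything a diffusion sampler actually computes; `damped_wave`, `cycles`, and `decay` are names invented for the sketch.

```python
import math


def damped_wave(num_samples: int = 50, cycles: float = 4.0, decay: float = 3.0) -> list[float]:
    """Sample a sine wave whose amplitude shrinks exponentially over time."""
    samples = []
    for i in range(num_samples):
        t = i / (num_samples - 1)          # time on a 0..1 scale
        envelope = math.exp(-decay * t)    # attenuation: 1.0 at the start, smaller later
        samples.append(envelope * math.sin(2 * math.pi * cycles * t))
    return samples


wave = damped_wave()
half = len(wave) // 2
first_peak = max(abs(s) for s in wave[:half])   # biggest swing early on
last_peak = max(abs(s) for s in wave[half:])    # biggest swing near the end
print(f"early swing: {first_peak:.2f}, late swing: {last_peak:.2f}")
```

The swings above and below zero are the over and undershoot; a larger `decay` pulls them in faster, and the sign of the wave at any given sample is the "polarity" mentioned above.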
There are similar types of nuances present in the text embedding and zero-shot models.
There is also some potential for issues in the randomization of noise seeds. Computers are notoriously bad at generating truly random numbers.
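A quick way to see the point about seeds, using only Python's standard `random` module (the helper `make_noise` is a made-up name, and real image generators use their own tensor libraries for this): the "random" noise is fully determined by the seed, so the same seed always reproduces the same noise.

```python
import random


def make_noise(seed: int, count: int = 4) -> list[float]:
    """Draw `count` Gaussian noise values from a generator seeded with `seed`."""
    rng = random.Random(seed)  # independent pseudo-random generator
    return [rng.gauss(0.0, 1.0) for _ in range(count)]


same_a = make_noise(seed=1234)
same_b = make_noise(seed=1234)
different = make_noise(seed=1235)

print(same_a == same_b)      # same seed, same noise
print(same_a == different)   # neighbouring seed, different noise
```

So any pattern in how seeds get picked carries straight through into the starting noise.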
I'm no expert. In abstract simplification, this is my present understanding, but I'm no citable source and could easily be wrong in some aspects of this. It is however my functional understanding while using, tweaking, and modding some basic aspects of the model loader source code.
I would like to see the performance of an FFT-optimized AI. I imagine the CPU performance would be amazing.