I don't understand why Gemini is such a disaster. DeepMind's Gemma works better, and that's a 27B model. It's like there are two separate companies inside Google fucking off and doing their own thing (which is probably true)
For example, "how many Rs are in strawberry?" or shit like that
(Although that one is a bad example, because token-based models will fundamentally make mistakes like that: they see token IDs, not letters. There's a newer technique that lets LLMs process byte-level information, which fixes it)
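To make the tokenization point concrete, here's a minimal sketch. The token split shown is illustrative only, not any specific tokenizer's actual output; the point is that counting letters is trivial at the character level, but the model never sees characters:

```python
def count_letter(word: str, letter: str) -> int:
    """Character-level count -- the thing the model is asked to do."""
    return word.lower().count(letter.lower())

# Hypothetical BPE-style split (an assumption for illustration; real
# tokenizers vary, but they typically merge common words into 1-2 tokens).
illustrative_tokens = ["straw", "berry"]

print(count_letter("strawberry", "r"))  # character level: 3

# The model instead receives opaque IDs, something like [4321, 8765],
# so answering "how many r's?" means it must have effectively memorized
# the spelling hidden inside each token -- which is where it slips up.
```

A byte-level model sidesteps this because its input units are the raw bytes of "strawberry", so the letters are directly visible to it.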
The most recent Qwen model supposedly handles cases like that really well, but I haven't tested that one myself; I'm going off what some dude on Reddit reported
🤣