GPT4All might be your answer. Desktop, open source, and it supports using GPT-4 through the API.
Thank you! Have just installed GPT4All, waiting for Llama-2 to download. Keen to see how it goes!
You can study the source code and build the client yourself. To do that, just install a free edition of the development environment and pull in a couple of free packages from the package manager.
P.S. You shouldn't assume that everyone is telling you the absolute truth. It could just as easily be a purely American repository owned by an attacker. And the fact that I'm Russian shouldn't matter here. I don't agree with what my government is doing, but that's beside the point.
One more interesting fact: if you place an application in Program Files, there's no unsigned-application warning at startup. Most software in the world isn't signed (including mine).
Why do you want a desktop app if I may ask? Doesn't the webapp work just as well?
The web app is great and I’ll definitely keep using it, but I was after something with decent UI/UX that runs locally / offline (which I foolishly didn’t mention in my OP).
"Runs locally" is a very different requirement and not one you'll likely find anything for. There are smaller open-source LLMs, but if you're looking for GPT-4-level performance your device won't be able to handle it. Llama is probably your best bet, but unless you have more VRAM than any consumer GPU currently offers, you'll have to go with smaller models, which have lower-quality output.
Thank you, I realised that once I installed GPT4All! I’ve got Llama going now, and will look at upgrading my RAM to accommodate a larger model like Falcon if I feel I need it. I’ve learned a lot this morning, it’s been great!
Do keep in mind that upgrading your regular RAM only benefits models running on the CPU, which are far slower than models on the GPU. So with more RAM you may be able to run bigger models, but when you run them they will also be more than an order of magnitude slower. If you want a response within seconds, you'd want to run the model on the GPU, where only VRAM counts.
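To get a feel for why model size runs into RAM/VRAM limits so quickly, here's a rough back-of-the-envelope sketch (my own illustration, not from the thread): weight memory is roughly parameter count times bytes per weight, and real runtimes need extra headroom on top for activations and the KV cache. The model names and quantization levels below are just examples.

```python
# Rough memory-footprint arithmetic for LLM weights.
# This is only the storage for the weights themselves; actual runtimes
# (e.g. GPT4All / llama.cpp) need additional memory for activations,
# the KV cache, and the runtime itself.

def weight_memory_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate GiB needed just to hold the model weights."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / (1024 ** 3)

# Example models (parameter counts in billions) at fp16 vs 4-bit quantization.
for name, params in [("Llama-2 7B", 7), ("Llama-2 13B", 13), ("Falcon 40B", 40)]:
    fp16 = weight_memory_gib(params, 16)
    q4 = weight_memory_gib(params, 4)
    print(f"{name}: ~{fp16:.1f} GiB at fp16, ~{q4:.1f} GiB at 4-bit")
```

This is why a 7B model at 4-bit quantization (~3–4 GiB) fits on a typical consumer GPU while the same model at fp16 (~13 GiB) already exceeds most cards' VRAM, and why larger models get pushed onto system RAM and the much slower CPU path.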
Probably in the near future there will be models that perform much better at consumer-device scale, but for now it's unfortunately still a pretty steep tradeoff, especially since large amounts of VRAM haven't been a consumer priority and are therefore hard to come by.