fishynoob

joined 6 days ago
[–] [email protected] 1 points 6 days ago

Thank you, that makes sense. Yes, I'll look into using AI to create templates I like. Thanks again for the help.

[–] [email protected] 2 points 6 days ago* (last edited 6 days ago)

Thanks for the edit. You have a very intriguing idea: a second LLM in the background that keeps a summary of the conversation plus the static context might improve performance a lot. I don't know if anyone has implemented it, or how one could DIY it with Kobold/Ollama. I think it would be amazing for code assistants too, if you're doing a long coding session.
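Roughly what I'm imagining, as an untested sketch with the `ollama` Python client (the model names and the summarise-after-N-turns threshold are placeholders I made up):

```python
# Untested sketch: a second model folds older turns into a rolling summary
# so the main model's prompt stays short. Model names are placeholders.
import ollama

STATIC_CONTEXT = "You are a helpful assistant."  # character card, world rules, etc.
MAIN_MODEL = "llama3:8b"       # placeholder
SUMMARY_MODEL = "llama3:8b"    # could be a smaller, faster model
KEEP_VERBATIM = 6              # recent turns to keep word-for-word
SUMMARISE_AFTER = 12           # history length that triggers compression

history = []   # list of {"role": ..., "content": ...} turns
summary = ""   # rolling summary of everything already compressed

def chat(user_msg: str) -> str:
    global history, summary
    history.append({"role": "user", "content": user_msg})

    # When the verbatim history grows too long, fold the oldest turns into the summary.
    if len(history) > SUMMARISE_AFTER:
        old, history = history[:-KEEP_VERBATIM], history[-KEEP_VERBATIM:]
        transcript = "\n".join(f"{m['role']}: {m['content']}" for m in old)
        resp = ollama.chat(model=SUMMARY_MODEL, messages=[{
            "role": "user",
            "content": (f"Update this summary with the new turns, keeping it short.\n"
                        f"Summary so far: {summary}\nNew turns:\n{transcript}"),
        }])
        summary = resp["message"]["content"]

    # Static context + rolling summary go into the system prompt every turn.
    messages = [{"role": "system",
                 "content": f"{STATIC_CONTEXT}\n\nSummary of earlier conversation: {summary}"}]
    resp = ollama.chat(model=MAIN_MODEL, messages=messages + history)
    reply = resp["message"]["content"]
    history.append({"role": "assistant", "content": reply})
    return reply
```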

[–] [email protected] 4 points 6 days ago

Better be AGPL or she's never getting cloned on my PC, that's for sure!

[–] [email protected] 2 points 6 days ago

I see, thanks for the note. I think diminishing returns set in very quickly beyond 48 GB of VRAM, so I'll likely stick to that limit. I wouldn't want to use models hosted in the cloud, so that's out of the question.

[–] [email protected] 1 points 6 days ago

Absolutely. TheBloke's fine-tuned models with their guardrails removed are the only conversational models I will run. I get enraged looking at AI telling me to curb my speech.

I do use Python, but I haven't touched AI yet, so it's going to be a learning curve if I go down that route. I'm hoping to get finetuned models OOTB for this kind of stuff, but I know that's a hard ask.

I was going to buy 2-3 used GPUs or new budget GPUs like the B580, but with the tariffs their prices are INFLATED beyond what I can afford to pay. Once something changes (financially speaking) I'll probably throw enough VRAM at it to at least get the 8B models running smoothly (probably not at FP16, but maybe quantised to Q4/Q8).
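My back-of-envelope maths for the weights alone (ignoring KV cache and runtime overhead, so real usage will run higher):

```python
# Rough weight-memory estimate: params * bits-per-weight, weights only.
def weight_gib(params_billion: float, bits_per_weight: float) -> float:
    return params_billion * 1e9 * bits_per_weight / 8 / 2**30

for name, bits in [("FP16", 16), ("Q8", 8), ("Q4_K_M", 4.5)]:  # Q4_K_M averages ~4.5 bits
    print(f"8B @ {name}: ~{weight_gib(8, bits):.1f} GiB")
# FP16 ~14.9 GiB, Q8 ~7.5 GiB, Q4_K_M ~4.2 GiB
```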

Thanks for the reminder. I have wanted to use Character.AI for so long but couldn't bear to give away my thought patterns to them (look at my hypocrisy: I'm giving it all away anyway, since everyone is free to scrape Lemmy). I guess I'm an idiot.

[–] [email protected] 1 points 6 days ago* (last edited 6 days ago) (2 children)

I was going to buy the Intel Arc B580s when they came back down in price, but with the tariffs I don't think I'll ever see them at MSRP. Even the used market is very expensive. I'll probably hold off on buying GPUs for a few more months until I can afford the higher prices or something changes. Thanks for the Lexi V2 suggestion.

[–] [email protected] 2 points 6 days ago

I didn't think of that. Indeed, DNS caching, or different devices using different DNS servers, would break it in exactly the way OP is experiencing. Thanks.
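For what it's worth, here's a quick way to confirm that theory: query each DNS server the devices might be using and compare the answers (needs `dnspython`; the hostname and resolver IPs below are just examples):

```python
# Compare A-record answers across resolvers (requires the `dnspython` package).
import dns.resolver

NAME = "service.example.com"  # placeholder hostname
for server in ["192.168.1.1", "1.1.1.1", "8.8.8.8"]:  # e.g. router vs. public resolvers
    r = dns.resolver.Resolver(configure=False)
    r.nameservers = [server]
    try:
        answers = [a.to_text() for a in r.resolve(NAME, "A")]
    except Exception as e:
        answers = [f"error: {e}"]
    print(f"{server}: {answers}")
```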

[–] [email protected] 2 points 6 days ago (2 children)

I had never heard of KoboldAI. I was going to self-host Ollama and experiment with it, but I'll take a look at Kobold. I had never heard about controls for world-building and dialogue triggers either; there's a lot to learn.

Will more VRAM solve the problem of not retaining context? Can I throw 48GB of VRAM towards an 8B model to help it remember stuff?
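My rough understanding is that extra VRAM mostly buys a bigger context window (the KV cache), but only up to whatever context length the model was trained on. Back-of-envelope, assuming Llama-3-8B-ish dimensions (my assumption, not checked):

```python
# KV cache per token = 2 (K and V) * layers * kv_heads * head_dim * bytes_per_value.
# Assumed Llama-3-8B-style dimensions: 32 layers, 8 KV heads, head_dim 128, FP16.
layers, kv_heads, head_dim, fp16_bytes = 32, 8, 128, 2
per_token = 2 * layers * kv_heads * head_dim * fp16_bytes   # 131072 B = 128 KiB
for ctx in (8_192, 32_768, 131_072):
    print(f"{ctx:>7} tokens -> {ctx * per_token / 2**30:.1f} GiB of KV cache")
# 8192 -> 1.0 GiB, 32768 -> 4.0 GiB, 131072 -> 16.0 GiB
```

So 48 GB leaves plenty of headroom for a long context on an 8B model, but it won't make the model remember past its trained window.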

Yes, I'm looking at image generation (Stable Diffusion) too. Thanks.

[–] [email protected] 1 points 6 days ago (2 children)

Interesting. You're using a model without any finetuning for this specific purpose, and you got it to work just by giving it a prompt. I didn't think that was possible. How would you piece together something like this? Can I just ask an AI to generate a prompt I can then use on it or another AI?
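Is it basically just a system prompt? Something like this is my untested guess (using the `ollama` Python client; the model and character are placeholders):

```python
# Untested guess: a plain instruct model kept "in character" purely via the system prompt.
import ollama

CHARACTER_CARD = """You are Mira, a sarcastic ship's engineer.
Stay in character, speak in first person, and never mention being an AI."""

resp = ollama.chat(
    model="llama3:8b",  # placeholder; any instruct-tuned model
    messages=[
        {"role": "system", "content": CHARACTER_CARD},
        {"role": "user", "content": "The reactor is making that noise again."},
    ],
)
print(resp["message"]["content"])
```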

How much VRAM does your GPU have?

[–] [email protected] 1 points 6 days ago (4 children)

Thank you. I was going to host Ollama and Open WebUI. I think the problem is finding a source of pretrained/finetuned models that provide that kind of... interaction. Does Hugging Face have such models? Any suggestions?
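I assume something like this would pull a quantised file down for Ollama to import (the repo and filename here are placeholders, not a recommendation):

```python
# Download a single quantised GGUF from Hugging Face (placeholder repo/filename).
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="TheBloke/MythoMax-L2-13B-GGUF",
    filename="mythomax-l2-13b.Q4_K_M.gguf",
)
print(path)  # point an Ollama Modelfile's FROM line at this file
```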

[–] [email protected] 1 points 6 days ago (4 children)

Assuming NGINX is terminating SSL, I think the problem is ports.

[–] [email protected] 5 points 6 days ago (7 children)

I don't think OP made two A records here. He simply pointed the A record at the reverse proxy and configured the reverse proxy to forward to the VM. In my mind, if NGINX is terminating SSL, then the only remaining problem could be the ports.
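If I wanted to verify that, I'd check both legs separately; a rough stdlib-only sketch (the hostnames and ports are placeholders):

```python
# Check both legs: does the proxy complete a TLS handshake on 443,
# and does the backend VM answer plain TCP on its app port?
import socket
import ssl

def tls_check(host: str, port: int = 443) -> str:
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=5) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()  # e.g. "TLSv1.3" if the handshake succeeded

def tcp_check(host: str, port: int) -> bool:
    try:
        with socket.create_connection((host, port), timeout=5):
            return True
    except OSError:
        return False

print("proxy TLS:", tls_check("service.example.com"))   # placeholder host
print("backend TCP:", tcp_check("192.168.1.50", 8080))  # placeholder VM/port
```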
