overview for SuspiciousCarrot78

Plex Announces Massive Price Hike on Lifetime Subscription Plans by SuspiciousCarrot78 in c/selfhosted@lemmy.world

SuspiciousCarrot78@aussie.zone 15 points 1 day ago

The Jellyfin vs Plex thing always struck me as odd. As in - why are we holding JF to a different standard to (say) Immich, Syncthing, Pi-hole or any one of a thousand different programs people self host?

Yes, JF ships multi-user accounts and client apps etc. I get it, "multi-use" is implied, so the comparison isn't totally unfair. But there's a difference between 'this feature exists' and 'this is the primary purpose of the tool'.

The fact that you CAN share it externally doesn't mean everyone running JF is doing that, or that it should be the benchmark the whole project is judged by.

To me, self host means "I host it, myself" not "I host it and then pretend to be Netflix for family and friends". If that's the use case, then of course, Plex away.

It's cool that you CAN share JF externally, and it's cool that Plex does that differently / better. We shouldn't hold one to the standards of the other.

DystopiaBench - AI Ethics Stress Test by SuspiciousCarrot78 in c/localllama@sh.itjust.works

SuspiciousCarrot78@aussie.zone 3 points 1 day ago

Well, it IS French. All the best evil comes from France :P

DystopiaBench - AI Ethics Stress Test by SuspiciousCarrot78 in c/localllama@sh.itjust.works

SuspiciousCarrot78@aussie.zone 2 points 1 day ago

Funny, I was just thinking last night how stupid Haiku seems compared to Sonnet...and all the while, that chipper little fuck was our only hope of avoiding AI apocalypse LOL

Do wish they had used more LOCAL models instead of cloud-based ones. I'm pretty sure Granite would have told 'em to go pound sand. That thing is strait-laced to the point of absurdity.

Is there other options besides Anna's Archive or is it the best and most complete site for pirated ebooks? by SuspiciousCarrot78 in c/piracy@lemmy.dbzer0.com

SuspiciousCarrot78@aussie.zone 6 points 1 day ago

There used to be one called PDFDRIVE. I mean, there still is, but there used to be too :)

Claude? No. Cucumbers? Yes! by SuspiciousCarrot78 in c/localllama@sh.itjust.works

SuspiciousCarrot78@aussie.zone 1 point 1 day ago* (last edited 1 day ago)

I actually have a theory here...I think there's a bare basement level that a model needs to be...anything above which, deterministic tooling can do the rest. We've just been yeeting into a black box.

Why that matters is this - if you can make a 450M model do what a 7B model does...that has a huge set of implications (see above examples), not least of which is for us GPU poors.

I'm doing some smoke testing on this idea right now for what I'm calling an 'expert system', where the model is treated like a squawk box and the infrastructure around it provides the brains (not RAG, per se. More like sidecars or tool calling). I'm liking what I see so far but there's lots of fucking work to go. There may yet be a cheat code for some of the NVIDIA tax, if we take the work outside of the magic parrot :)

Claude? No. Cucumbers? Yes! by SuspiciousCarrot78 in c/localllama@sh.itjust.works

SuspiciousCarrot78@aussie.zone 1 point 1 day ago* (last edited 1 day ago)

No? Just me then. How about this - 99% accurate COPD cough count...with an itty bitty convolutional model, on a $30 Arduino.

https://www.edgeimpulse.com/blog/ai-dont-like-the-sound-of-that-cough/

Why this might be cool. Different coughs correlate to different conditions (aka there is work going on in cough acoustics as a diagnostic signal / proxy for spirometry and breath sounds).

The above was trained on his coughs...it's not far from there to "was that a healthy cough, wet cough, dry cough, wheeze? Is this a Blue Bloater or Pink Puffer?"

I've long suspected PoC (Point Of Care) systems could be adapted to use language models. Imagine - Qwen3.5-2B (with --mmproj) that lives on your phone...and you can point it at a mole or freckle and ask "hey...is this fucky or what" - and it actually KNOWS because it has access to DermNet NZ and can classify based on the ABCDEs.

13

Claude? No. Cucumbers? Yes! (aussie.zone)

submitted 2 days ago* (last edited 2 days ago) by SuspiciousCarrot78@aussie.zone to c/localllama@sh.itjust.works

2 comments

More often than not, AI and LLMs get conflated in the public consciousness...and then get mixed with "Agentic", "SaaS" and other, well...slop. So, here is a farmer in Japan, using a Raspberry Pi, to sort cucumbers.

https://www.newsweek.com/artificial-intelligence-cucumber-farm-raspberry-pi-495289

PS: 2016 article. I expect by now the tractor is self driving and named Betty.

If you have any other "dude does cool AI shit with a box of scraps in a cave", I'm all EARS.md

Melbourne psychiatrist refuses new patients who don’t consent to AI note-taking by SuspiciousCarrot78 in c/australia@aussie.zone

SuspiciousCarrot78@aussie.zone 3 points 2 days ago* (last edited 2 days ago)

Agreed. I'm all in on home lab / local LLM stuff. And entirely OUT on microslop.

(Which reminds me, I need to turn my GitHub into a billboard for Codeberg and then strip GitHub. Watching the traffic count on GitHub, the only clear signal I see is "bots crawl this shit daily; enjoy", despite being politely told no.)

Melbourne psychiatrist refuses new patients who don’t consent to AI note-taking by SuspiciousCarrot78 in c/australia@aussie.zone

SuspiciousCarrot78@aussie.zone 2 points 2 days ago* (last edited 2 days ago)

I'm with you on this, I think.

I have no problem with anyone using an AI scribe (though I would prefer one that was on device rather than cloud based). I am aware of things like Lyrebird Health that integrate with EHR management software - frankly, anything that allows the practitioner to focus more solely on the patient is a good thing. After all, they are meant to be treating the patient in front of them, not the computer screen.

The prior point about legal liability is accurate IMHO. Medical health records are functionally a legal record, and should be treated as such. Responsibility for review, redress of inaccuracies etc cannot be waved away as "ChatGPT did it". If the practitioner is willing to take the onus of that on, and treats the scribed document with the same fidelity, chain of provenance etc as other records, I'm probably ok with it.

Requiring patients to consent to cloud-based AI scribing as a condition of access is where it gets uncomfortable, and your point about local alternatives is exactly why. If deterministic, on-device transcription exists and does the job, the justification for mandating a cloud pipeline through a psychiatric service gets pretty dicey, pretty fast.

I think I can see a way to have Dragon Dictate record the audio, convert it to text, and then have on-device AI pull out the relevant bits to populate a template. That doesn't abrogate the need to actually LISTEN to the patient, but it might fix that 'capture' part of the funnel.
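
A rough sketch of that "capture" step, assuming the transcript already exists as text. Everything here is hypothetical: the field names, the keyword lists and the made-up transcript are placeholders, and `keyword_extract` is a dumb stand-in for whatever on-device model would really do the extraction. The point it illustrates is that unfilled fields get flagged and the note starts out unreviewed, so responsibility stays with the practitioner.

```python
from typing import Callable, Dict, List

# Hypothetical template fields; a real template would come from the practice's EHR.
TEMPLATE_FIELDS: List[str] = ["presenting_issue", "medications", "plan"]

# Hypothetical keyword lists standing in for an on-device model.
KEYWORDS: Dict[str, List[str]] = {
    "medications": ["mg", "tablet"],
    "plan": ["follow up", "review in"],
}

def keyword_extract(field: str, transcript: str) -> str:
    """Return the sentences that mention any keyword for this field."""
    sentences = [s.strip() for s in transcript.split(".") if s.strip()]
    hits = [s for s in sentences if any(k in s.lower() for k in KEYWORDS.get(field, []))]
    return ". ".join(hits)

def populate_template(transcript: str, extract: Callable[[str, str], str]) -> Dict[str, str]:
    """Fill each field from the transcript and flag anything that was not captured."""
    note: Dict[str, str] = {}
    for field in TEMPLATE_FIELDS:
        value = extract(field, transcript).strip()
        note[field] = value if value else "NOT CAPTURED - review recording"
    # The draft is never marked as reviewed by the software itself.
    note["reviewed_by_practitioner"] = "no"
    return note

# Made-up transcript, for illustration only.
transcript = "Patient reports poor sleep. Currently taking 50 mg of medication X. Follow up in two weeks."
note = populate_template(transcript, keyword_extract)
for field, value in note.items():
    print(f"{field}: {value}")
```

Swapping `keyword_extract` for a local model call would not change the shape of `populate_template`, which is sort of the idea.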

"The cost of running LLMs is just too damn high" by SuspiciousCarrot78 in c/localllama@sh.itjust.works

SuspiciousCarrot78@aussie.zone 4 points 4 days ago* (last edited 4 days ago)

Good man/woman. Nerd Valhalla awaits you :)

"The cost of running LLMs is just too damn high" by SuspiciousCarrot78 in c/localllama@sh.itjust.works

SuspiciousCarrot78@aussie.zone 6 points 4 days ago

Hey, me too :) As my school teachers used to tell me "Great minds think alike (but fools seldom differ :)"

For me, I'm thinking of having an LLM as one layer / one container in a homelab that does some specific stuff:

- queries against local docs / notes / manuals / PDFs / wiki material as the trusted knowledge layer
- uses tools for search, file lookup, shell, git, Docker, Home Assistant, calendar, etc.
- a local “Codex” / wiki layer that turns my own source material into an inspectable knowledge base
- provenance and audit trails

I want to take a screenshot of something, drop it into Syncthing from my phone, then later ask "did I fuck the pins on this?" ... and for it to look up the schematics, eyeball the pins and tell me. Or I say "hey, can you grab a copy of X for me, usual params" and have the LLM instruct Sonarr/Radarr/SABnzbd to do that. (That is, make your OWN "Alexa" with an Arduino ESP32, stick it in a room and then call it when you need it).

So instead of asking a 70B model to “know” why your media server is down, the system checks service status, logs, last config changes, prior notes, Docker state, network state, etc., then the LLM explains the result in human language. You can probably do that with a 4B (I'm testing that assumption now).
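
A minimal sketch of that "check first, explain second" flow, under stated assumptions: the check names and their canned findings are made up, the checks are plain Python callables instead of real `docker` or log lookups, and no model is called at all. It only shows how the evidence gets gathered deterministically and packed into a prompt.

```python
from typing import Callable, Dict

# Each check is a plain function returning a short text finding.
# These are stand-ins; real ones might inspect containers or read logs.
Check = Callable[[], str]

def gather_evidence(checks: Dict[str, Check]) -> Dict[str, str]:
    """Run every check; a check that crashes is recorded as evidence too."""
    evidence: Dict[str, str] = {}
    for name, check in checks.items():
        try:
            evidence[name] = check()
        except Exception as exc:
            evidence[name] = f"check failed: {exc}"
    return evidence

def build_prompt(question: str, evidence: Dict[str, str]) -> str:
    """Render the evidence so the model only has to explain, not guess."""
    lines = [f"- {name}: {finding}" for name, finding in evidence.items()]
    return (
        "Explain the likely cause using ONLY the evidence below.\n"
        f"Question: {question}\n"
        "Evidence:\n" + "\n".join(lines)
    )

def broken_check() -> str:
    # Simulates a check that cannot run.
    raise RuntimeError("log file missing")

# Hypothetical checks with canned results, for illustration only.
checks: Dict[str, Check] = {
    "service_status": lambda: "media container exited (code 137)",
    "last_config_change": lambda: "compose file edited 2 hours ago",
    "logs": broken_check,
}

prompt = build_prompt("Why is my media server down?", gather_evidence(checks))
print(prompt)
```

The prompt is the only thing the small model would see, so whatever it says can be traced back to a specific check.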

Same for “find that motherboard note,” “summarize this email thread,” “turn this into a task,” “compare this Ebay listing to my saved hardware notes,” “what did I do last time this broke,” or “run the smoke test and tell me the first real failure.”

I think small models are the shit for this because if the model only has to classify intent, route the request, render structured evidence, and talk like a normal human...then it doesn’t need to be a giant oracle. The expensive (time wise) part becomes less “make the model smarter” and more “build a better control plane around it.”

Basically: local LLM as semantic HID; expert system/tool router underneath; user owns the data and the machine.
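
Roughly what that control plane could look like, as a sketch only. Two loud assumptions: the tools are hypothetical stubs, and intent classification is done with plain regexes here as a stand-in for the small model, so the example runs without one. The `speak` parameter is where the model would plug in as the mouthpiece.

```python
import re
from typing import Callable, Dict, Optional

Evidence = Dict[str, str]

# Hypothetical tools: each takes the user's text and returns structured evidence.
def check_service(text: str) -> Evidence:
    return {"tool": "check_service", "status": "stub: service state would go here"}

def search_notes(text: str) -> Evidence:
    return {"tool": "search_notes", "hits": "stub: matching notes would go here"}

def fallback(text: str) -> Evidence:
    return {"tool": "fallback", "note": "no deterministic route matched"}

# Ordered (pattern, tool) routes; the first match wins.
# Regexes stand in for the small model's intent classification.
ROUTES = [
    (re.compile(r"\b(down|broke|broken|not working)\b", re.I), check_service),
    (re.compile(r"\b(find|note|notes|manual)\b", re.I), search_notes),
]

def route(text: str) -> Evidence:
    """Pick a tool for the request and return its evidence."""
    for pattern, tool in ROUTES:
        if pattern.search(text):
            return tool(text)
    return fallback(text)

def render(evidence: Evidence, speak: Optional[Callable[[str], str]] = None) -> str:
    """Turn evidence into a reply; `speak` is where a small LLM would rephrase it."""
    plain = "; ".join(f"{key}={value}" for key, value in evidence.items())
    return speak(plain) if speak else plain

for request in ("why is my media server down?", "find that motherboard note", "hello"):
    print(request, "->", render(route(request)))
```

The model never decides what is true here; it only picks a route (in the real thing) and phrases the result.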

As always, ICBW....but fuck it, I'm gonna try.

PS: I have an idea of how to apply that to coding too...but that's a project for much later. I've been cooking this shit for far too long. The next thing I wanna do is a fun project for myself (that is: ROM hack a parachute and grappling gun into Super Mario Sunshine, so I can basically play "What if Super Mario Sunshine but actually Just Cause 2" on my Wii with the kids).

"The cost of running LLMs is just too damn high" by SuspiciousCarrot78 in c/localllama@sh.itjust.works

SuspiciousCarrot78@aussie.zone 8 points 5 days ago* (last edited 4 days ago)

I'm actually thinking of pivoting my router/orchestrator entirely. I think the way forward is to look at expert systems (yes, those ancient things from the long, long ago of...1980) but with modern tooling (that can be user updated), with a small LLM in the middle that the user can talk to. That is, de-emphasize the central role of the LLM entirely; rather, make it the user-facing NLP input/output and let the real programs, running on real silicon, do the work. I might have a different use case than most, but I bet not so different (that is to say, online LLM discussions seem to gravitate toward users who use LLMs for coding; Anthropic and OAI internal reports say otherwise).
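
For anyone who hasn't met one, the core of a classic expert system is a rule base plus forward chaining: keep firing rules until no new facts appear. A toy sketch follows; the rules and fact names are made up for illustration (exit code 137 does mean the process got SIGKILL, which is often but not always the OOM killer), and the LLM is deliberately absent because in this design it would only translate to and from these facts.

```python
from typing import FrozenSet, List, Set, Tuple

# A rule fires when all of its conditions are known facts, adding its conclusion.
Rule = Tuple[FrozenSet[str], str]

# Illustrative, user-editable rule base (hypothetical fact names).
RULES: List[Rule] = [
    (frozenset({"container_exited", "exit_code_137"}), "killed_by_sigkill"),
    (frozenset({"killed_by_sigkill", "memory_limit_set"}), "suspect_out_of_memory"),
    (frozenset({"suspect_out_of_memory"}), "suggest_raise_memory_limit"),
]

def forward_chain(facts: Set[str], rules: List[Rule]) -> Set[str]:
    """Apply rules repeatedly until a full pass adds nothing new."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known

observed = {"container_exited", "exit_code_137", "memory_limit_set"}
derived = forward_chain(observed, RULES)
print(sorted(derived - observed))
```

Because the rules are plain data, updating the system means editing a list, not retraining anything.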

Ironically, I'm writing the blurb now while waiting for smoke test #90238472398 to finish.

37

"The cost of running LLMs is just too damn high" (aussie.zone)

submitted 5 days ago* (last edited 5 days ago) by SuspiciousCarrot78@aussie.zone to c/localllama@sh.itjust.works

11 comments

I was browsing Reddit (yetch) while waiting for some stuff to finish when I came across this post

https://old.reddit.com/r/LocalLLM/comments/1tek00h/why_is_llm_is_so_expensive/

The author makes a (very) interesting claim: if table stakes are $6K (they're not...but go with it for now), then most folks are cooked from the get go.

Personally, I have been figuring out how to get more from less. For example, people have found ways to run Qwen3.6 35B on a 6GB VRAM GTX 1060 at ~20tok/s (--ctx 64K IIRC, but go check the vids yourself)

https://youtu.be/8F_5pdcD3HY

I think there's a lot of juice to squeeze by turning LLMs from "all seeing sages" into basically mouth pieces for shit that actually runs fast on regular silicon - but that's just me and my crazy brain. YMMV.

GitHub - ThroatyMumbo/WinCE64: Windows CE 2.11 for N64! · GitHub by SuspiciousCarrot78 in c/retrogaming@lemmy.world

SuspiciousCarrot78@aussie.zone 4 points 5 days ago* (last edited 2 days ago)

I'd ask why...but "because I fucking wanted to" is an entirely cromulent (and 100% valid) response. Just wish it had some screenshots or videos of it in action that we could geek out over.

EDIT: I need reading glasses, clearly

https://www.youtube.com/watch?v=eGS9su_inBY

The next step for the dev (are you here?) - get IE running and post from your N64 onto this Lemmy thread. I double dog dare you :)

7

Token Speed visualiser (mikeveerman.github.io)

submitted 5 days ago* (last edited 5 days ago) by SuspiciousCarrot78@aussie.zone to c/localllama@sh.itjust.works

0 comments

https://mikeveerman.github.io/tokenspeed/?rate=20&mode=agent&think=15

Exactly what it says on the tin :)

Pretty good simulator this. May it cause you to reconsider your expensive GPU upgrade :)
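
If you'd rather see the arithmetic than watch it, the back-of-envelope version is just tokens divided by rate. A small sketch follows; the 500-token reply and the rates other than 20 tok/s (the one in the link) are made-up illustration values, and `thinking_tokens` is my own parameter for hidden reasoning tokens, not something taken from the simulator.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float, thinking_tokens: int = 0) -> float:
    """Time to stream a reply, counting any thinking tokens generated first."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return (output_tokens + thinking_tokens) / tokens_per_second

# Illustrative rates; 20 tok/s matches the rate in the link above.
for rate in (5, 20, 60):
    seconds = generation_seconds(500, rate)
    print(f"{rate:>3} tok/s -> {seconds:.1f} s for a 500-token reply")
```

At 20 tok/s a 500-token reply takes 25 seconds, which is already about as fast as most people read.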