Using AI for image transcripts, yay or nay?
Those are different mechanisms. Object recognition doesn't mean the AI is now trained on the image and can reproduce it (which is btw why AI can still "visually" recognise what's in an image that has been nightshaded/glazed).
This is true, but it’s also important to remember that if you use an AI model hosted by the same party that trains it, they will likely pass any data you input on to the training stage, unless you have an enterprise contract regulating training use.
OP mentioned he will use a self-hosted LLM though, and in that case there’s no risk of the data being used for training.
I mean, if you put any image online that hasn't been protected/poisoned in some way, you have to (unfortunately) assume it's in some AI's training data anyway. If the tradeoff for a useful description (though see my other comments about the lack of usefulness) is that an image is also fed into one more training corpus, that would be worth a thought, imho. If the image is protected/poisoned, I'd indeed encourage this whole hypothetical process, just to further sabotage the data.