this post was submitted on 28 Nov 2023

120 points (100.0% liked)

Technology

A nice place to discuss rumors, happenings, innovations, and challenges in the technology sphere. We also welcome discussions on the intersections of technology and society. If it’s technological news or discussion of technology, it probably belongs here.

Remember the overriding ethos on Beehaw: Be(e) Nice. Each user you encounter here is a person, and should be treated with kindness (even if they’re wrong, or use a Linux distro you don’t like). Personal attacks will not be tolerated.

This community's icon was made by Aaron Schneider, under the CC-BY-NC-SA 4.0 license.

ethinically ambigaus (feddit.de)

submitted 2 years ago by [email protected] to c/[email protected]

top 41 comments

[–] [email protected] 17 points 2 years ago

I asked DALL-E 2 for a "wide shot of a delivery driver in a Louisiana bayou with bagged food" and it gave me this:

That's certainly a fascinating way to interpret "bagged food."

[–] [email protected] 13 points 2 years ago* (last edited 2 years ago) (1 children)

I think this makes a bit of sense though, doesn't it? They wrote "guy". Given that the training data is probably predominantly white, "guy" would give you a white guy nine times out of ten without clarification of what the word means to the AI, i.e. ethnically ambiguous. Because that's what "guy" is: ethnically ambiguous. The spelling is because DALL-E suuuuucks at text, but it's slowly getting better at least.

But they should 100% tweak it so that when a defined character is asked for, stuff like that gets dropped. I think the prompt structure is what makes this one slip through. Had they put quotes around "guy with swords pointed at him" to clearly mark that as its own thing, this wouldn't have happened.

[–] [email protected] 12 points 2 years ago (3 children)

But I don't think the software can differentiate between the ideas of defined and undefined characters. It's all just association between words and aesthetics, right? It can't know that "Homer Simpson" is a more specific subject than "construction worker" because there's no actual conceptualization happening about what these words mean.

I can't imagine a way to make the tweak you're asking for that isn't just a database of every word or phrase that refers to a specific known individual that the users' prompts get checked against and I can't imagine that'd be worth the time it'd take to create.

[–] [email protected] 5 points 2 years ago (2 children)

If they're inserting random race words in, presumably there's some kind of preprocessing of the prompt going on. That preprocessor is what would need to know if the character is specific enough to not apply the race words.

[–] [email protected] 2 points 2 years ago (1 children)

Yeah, but replace("guy", "ethnically ambiguous guy") is a different problem from "does this sentence reference any possible specific character?"

[–] [email protected] 5 points 2 years ago (1 children)

I don't think it's literally a search and replace, but a part of the prompt that is hidden from the user and inserted either before or after the user's prompt. Something like [all humans, unless stated otherwise, should be ethnically ambiguous]. Then, when generating, it got confused and took that to mean he should be labeled "ethnically ambiguous".
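To make the distinction concrete, here is a rough sketch of the two approaches being discussed. Nothing here is known about the real pipeline: the function names and the exact instruction text are assumptions made up for illustration.

```python
# Hypothetical sketch of two ways a preprocessor could alter a prompt.
# Neither is confirmed to be what any real image generator does.

# Assumed wording, taken from the bracketed guess in the comment above.
HIDDEN_INSTRUCTION = (
    "[all humans, unless stated otherwise, should be ethnically ambiguous]"
)


def naive_replace(prompt: str) -> str:
    """Literal search and replace on the word "guy"."""
    return prompt.replace("guy", "ethnically ambiguous guy")


def append_instruction(prompt: str) -> str:
    """Leave the user's words alone and append a general instruction."""
    return f"{prompt} {HIDDEN_INSTRUCTION}"


print(naive_replace("Homer Simpson as a guy with swords pointed at him"))
print(append_instruction("Homer Simpson as a guy with swords pointed at him"))
```

In the first version the extra words land right next to "guy"; in the second they sit at the end of the prompt, where the image model is free to attach them to anything, including a name tag.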

[–] [email protected] 2 points 2 years ago

It’s not hidden from the user. You can see the prompt used to generate the image, to the right of the image.

[–] [email protected] 1 points 2 years ago

Gee, I wonder if there’s any way to use GPT-4 to detect whether a prompt includes reference to any specific characters. 🤔

[–] [email protected] 2 points 2 years ago (1 children)

Let’s see about that: https://imgur.com/a/QCGRrUb

[–] [email protected] 2 points 2 years ago (1 children)

Let's say hypothetically I had given you that question and that instruction on how to format your response. You would presumably have arrived at the same answer the AI did.

What steps would you have taken to arrive at that being your response?

[–] [email protected] 1 points 1 year ago (1 children)

Honestly my eyes glommed onto the capital letters first. I brought to mind images from the words, and Homer Simpson is clearer and brighter, and somehow that’s the internal representation of coherence or something. That aspect of using the brightness to indicate the match/answer/solution/better bet might be an instruction I gave my brain at some point too. I’m autistic and I’ve built a lot of my shit like code. It’s kinda like the Iron Man mask in here to be honest. But so much more elaborate. I often wish I could project it onto a screen. It’s like K’Nex models doing transformer jiu jitsu and me flicking those little battles off into the darkness to run on their own. I’m afraid I might not be a good candidate for questions of how human cognition normally works. Though I’ve done a lot of zen and drugs and enjoy watching it and analyzing it too.

I’m curious, why do you ask? What does that tell you?

[–] [email protected] 1 points 1 year ago

I will admit this is almost entirely gibberish to me, but I don't really have to understand it. What's important here is that you had any process at all by which you determined which answer was correct before writing an answer. The LLM cannot do any version of that.

You find a way to answer a question and then provide the answer you arrive at. It never saw the prompt as a question or its own text as an answer in the first place.

An LLM is only ever guessing which word probably comes next in a sequence. When the sequence was the prompt you gave it, it determined that Homer was the most likely word to say. And then it ran again. When the sequence was your prompt plus the word Homer, it determined that Simpson was the next most likely word to say. And then it ran again. When the sequence was your prompt plus Homer plus Simpson, it determined that the next most likely word in the sequence was nothing at all. That triggered it to stop running.

It did not assign any sort of meaning or significance to the words before it began answering, and did not have a complete idea in mind before it began answering. It had no intent to continue past the word Homer when writing the word Homer because it only works one word at a time. ChatGPT is a very well-made version of hitting the predictive text suggestions on your phone over and over. You have ideas. It guesses words.
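The loop described above can be sketched as a toy. This is only an illustration: the lookup table is a made-up stand-in for a trained model, the names are hypothetical, and a real LLM scores sub-word tokens against the entire sequence rather than looking up the last word.

```python
# Toy version of one-word-at-a-time generation. NEXT_WORD is a hypothetical
# hard-coded table standing in for a trained model's predictions.

END = "<end>"  # stands in for "the next most likely word is nothing at all"

NEXT_WORD = {
    "character?": "Homer",
    "Homer": "Simpson",
    "Simpson": END,
}


def most_likely_next(sequence: list[str]) -> str:
    """Guess the next word. This toy only looks at the last word."""
    return NEXT_WORD.get(sequence[-1], END)


def generate(prompt: list[str], max_steps: int = 10) -> list[str]:
    """Append one guessed word at a time until the end marker comes up."""
    sequence = list(prompt)
    output = []
    for _ in range(max_steps):
        word = most_likely_next(sequence)
        if word == END:
            break  # the "nothing at all" case: stop running
        output.append(word)
        sequence.append(word)  # the next run sees prompt plus words so far
    return output


print(generate(["Which", "of", "these", "is", "a", "character?"]))
# prints ['Homer', 'Simpson']
```

Each pass through the loop only ever picks one next word; "Simpson" is not chosen until "Homer" is already part of the sequence.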

[–] [email protected] 2 points 2 years ago

ChatGPT was just able to parse a list of fictional characters out of concepts, nouns, and historical figures.

It wasn’t perfect, but if it can take the prompt and check if any mention of a fictional or even defined historical character is in there it could be made to not apply additional tags to the prompt.
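As a rough sketch of that gate: the hard-coded set of names below is a hypothetical stand-in for whatever would actually do the detection (an LLM call, in the idea above), and the tag text and function names are assumptions for illustration only.

```python
# Sketch of "don't add tags when a specific character is named".
# KNOWN_CHARACTERS is a tiny hypothetical stand-in for a real detector.

KNOWN_CHARACTERS = {"homer simpson", "sailor moon", "katniss everdeen"}
DIVERSITY_TAG = "ethnically ambiguous"  # assumed tag text


def mentions_known_character(prompt: str) -> bool:
    """Return True if the prompt names any character in the stand-in list."""
    lowered = prompt.lower()
    return any(name in lowered for name in KNOWN_CHARACTERS)


def preprocess(prompt: str) -> str:
    """Append the tag only when no specific character is mentioned."""
    if mentions_known_character(prompt):
        return prompt
    return f"{prompt}, {DIVERSITY_TAG}"


print(preprocess("Homer Simpson with swords pointed at him"))
print(preprocess("a construction worker"))
```

The hard part is the detector, not the gate: a fixed list like this misses almost everyone, which is why the suggestion is to let the language model do the check.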

[–] [email protected] 8 points 2 years ago* (last edited 2 years ago) (1 children)

"Ethnically ambiguous" is the last thing I'd call that. From 🧍🏻‍♀️ to 🧍🏿‍♂️, I still think the most "ambiguous" is 🧍 (lacking 🧞 and 🦄).

[–] [email protected] 6 points 2 years ago

You don't understand. It says "ethnically ambiguous" right there! It is impossible to associate any race with this picture!

[–] [email protected] 5 points 2 years ago (7 children)

That's actually a pretty smart way to combat racial bias.

[–] [email protected] 8 points 2 years ago

Except when it does this

[–] [email protected] 8 points 2 years ago (1 children)

No, it's an incredibly dumb way because fucking with people's prompts will make the tech unreliable

[–] [email protected] 10 points 2 years ago (1 children)

will make the tech unreliable

Man, do I have some bad news for you

[–] [email protected] 4 points 2 years ago

Lol fair enough. I guess I could say "make the tech even less reliable"

[–] [email protected] 7 points 2 years ago (1 children)

The smarter way would be using balanced training data.

[–] [email protected] 1 points 2 years ago

You can't balance every single aspect of the training data. You will always run into some searches favoring one race over another.

[–] [email protected] 5 points 2 years ago

It's not, the underlying data is still just as biased. Taking a bunch of white people and saying they are "ethnically ambiguous" is just statistical blackface.

[–] [email protected] 3 points 2 years ago (1 children)

If a request is for a generic person, sure. But when the request is for a specific character, not really.

Like make one of the undefined arms black.

[–] [email protected] 2 points 2 years ago (1 children)

I agree with you, but there is a lot of gray area. What about Spider-man? 95% of the pictures it ingests are probably Peter Parker so it would have a strong bias towards making him white when there are several ethnicities that might apply. What about Katniss Everdeen? Is she explicitly white in the book or is she just white because she's played by a white actress? I truly don't know so maybe that is a bad example. What about Santa? What about Jesus? Of all characters, Jesus absolutely shouldn't be white but I'll bet the vast majority of AI depicts him that way.

I'm not disagreeing with you so much as I'm pointing out the line isn't really all that clear. I don't like this ham-handed way of going about it, but I agree with and support the goal of making sure the output isn't white biased just because preserved history tends to be.

[–] [email protected] 4 points 2 years ago (1 children)

It's tricky because the data itself is going to be biased here. Think about it - even the video game is specifically called "Spider-Man Miles Morales" while the one with Peter Parker is just called "Spider-Man."

Katniss is actually a good example. I was not aware of the details, but the books apparently describe her as having "olive skin". The problem though is that if you image search her all you get is Jennifer Lawrence.

That said, Homer is yellow.

[–] [email protected] 1 points 2 years ago (1 children)

Absolutely. There is only a single depiction of Homer and I agree that unless you specifically ask for a race-bent Homer it shouldn't do this. I was just pointing out that you can't draw the line at "identifiable character" because clearly that's also a problem. Maybe there is a better place to draw the line, or maybe it's going to be problematic regardless of where it is drawn, including not doing anything at all.

I would say if you can't do it right just do nothing at all, except as a white guy in a white biased world, that's self-serving. I'm not the right person to say it's fine to just let it be.

[–] [email protected] 2 points 1 year ago

Someone at BuzzFeed is reading our Lemmy conversations:

https://www.buzzfeed.com/laurengarafano/the-hunger-games-characters-ai-vs-the-movies

[–] [email protected] 1 points 2 years ago

Not when the prompt includes a named character.

[–] [email protected] 0 points 2 years ago (1 children)

Can you explain to me how racial bias in a general-purpose LLM is a problem to begin with?

[–] [email protected] 6 points 2 years ago* (last edited 2 years ago)

If you were really curious about the answer, you practically gave yourself the right search term there: "racial bias in general purpose LLM" and you'll find answers.

However, the way your question is phrased, you just seem to be trolling (= secretly disagreeing and pretending to want to know, just to then object).

[–] [email protected] 4 points 2 years ago

This is bloody hilarious 🤣

[–] [email protected] 4 points 2 years ago (2 children)

I don't like the idea of a prompt being subtly manipulated like this to "force" inclusion. Instead the training data should be augmented and the AI re-trained on a more inclusive dataset.

The prompt given by the user shouldn't be prefixed or suffixed by additional words, sentences or phrases; except to remind the AI what it is not allowed to generate.

Instead of forcing "inclusivity" on the end user in such a manner, we should allow the user to pick skin tone preferences in an easy-to-understand manner, and allow the AI to process that signal as a part of its natural prompt.

Obviously, where specific characters are concerned, the default skin tone of the character named in the prompt should be encoded and respected. If multiple versions of that character exist, it should take the user's skin tone setting into account and select the closest matching version.

If the prompt is requesting a skin tone alteration of a character, that prompt should obviously be honored as well, and executed with the requested skin tone, not the skin tone setting. As an example, I can select "Prefer lighter skin tones" in the UI and still request that the AI generate me a "darker skinned version" of a typically fairer-skinned character.

Instead of focusing on forcing "diversity" into prompts that didn't ask for it, let's just make sure that the AI has the full range of human traits available to it to pull from.
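The precedence described above (explicit request in the prompt, then the named character's own versions, then the UI setting) could be sketched roughly like this. The function and parameter names are hypothetical, and "closest matching version" is simplified here to an exact match with a fallback to the character's default.

```python
from typing import Optional

# Hypothetical sketch of the precedence rules; not any real product's logic.


def resolve_skin_tone(
    prompt_request: Optional[str],
    character_versions: list[str],
    ui_preference: Optional[str],
) -> Optional[str]:
    """Pick a skin tone for the output.

    prompt_request: tone explicitly asked for in the prompt, if any.
    character_versions: known tones for a named character, default first;
        empty when the prompt names no specific character.
    ui_preference: the user's skin tone setting, if any.
    """
    # 1. An explicit request in the prompt always wins.
    if prompt_request is not None:
        return prompt_request
    # 2. A named character keeps one of its own versions. Simplification:
    #    exact match on the UI setting, else the character's default.
    if character_versions:
        if ui_preference in character_versions:
            return ui_preference
        return character_versions[0]
    # 3. For a generic person, fall back to the UI setting.
    return ui_preference


print(resolve_skin_tone("darker", ["lighter"], "lighter"))  # prints darker
print(resolve_skin_tone(None, ["yellow"], "lighter"))       # prints yellow
```

The point of ordering the checks this way is that the setting never overrides something the user typed, and never overrides a character who only has one canonical look.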

[–] [email protected] 3 points 2 years ago

Yes. But this would probably cause friction with the overall public, as the AI would then give a full range of human traits, but people would still expect very narrow default outputs. And thinking more about it, what is the full range of human traits anyways? Does such a thing exist? Can we access it? Like, if we only looked at the societies the AI is present in, we still don't get all the people to actually be documented for AI to be trained upon. That's partially the cause for the racist bias of AI in the first place, isn't it? Because white cishet ablebodied people are proportionally much more frequently depicted in media.

If you gave the AI a prompt, e.g. "a man with a hat". What would you expect a good AI to produce? You have a myriad of choices to make and a machine, i.e. the AI, will not be able to make all these choices by itself. Will the result be a black person? Visibly queer or trans? In a wheelchair?

I guess the problem really is, there is no default output for anything. But when people draw something, they do have a default option ready in their mind because of societal biases and personal experiences. So I would probably draw a white cishet man with a boring hat if I were to execute that prompt. Because I'm unfortunately heavily biased, like we all are. And an AI, based on our biases, would draw the same.

But repeating the question from before, what would we expect a "fair" and "neutral" AI to draw? This is really tricky. In the meantime your solution is probably good, i.e. training the AI with more diverse data.

(Oh and I ignored the whole celebrity or known people thingy, your solution is definitely the way to go.)

[–] [email protected] 3 points 2 years ago

But that's expensive and they can't sell it

[–] [email protected] 4 points 2 years ago

I asked for a Sailor Moon 'be gay, do crime' meme and got this

[–] [email protected] 4 points 2 years ago

wow

[–] [email protected] 3 points 2 years ago

This thread really showcases white fragility.

[–] [email protected] 1 points 2 years ago (2 children)

I get that the Simpsons are coded as white, but technically they are yellow.

[–] [email protected] 3 points 2 years ago* (last edited 2 years ago)

IMHO one of the biggest mistakes in the Simpsons was adding non-yellow skin tones. "Yellow is white, but brown is brown"... should've stuck with yellow for everyone, green for alien, and could have added some blue. As it is, OP's image is a "white" guy (yellow hand) in blackface.

[–] [email protected] 2 points 2 years ago

technically they're cartoons as well, not real people ;)