stupid fucking bot (mander.xyz)
submitted 3 weeks ago by [email protected] to c/[email protected]
[-] [email protected] 180 points 3 weeks ago* (last edited 3 weeks ago)

i see this all the time with software designed by americans. on an old job we used a tool called "officevibe" where you'd enter your current impression of your role and workplace once a month. you got some random questions to answer on a 10-point scale.

when we were presented with the result the stats were terrible because the scale was weighted so that everything below 7 was counted as negative. we were all just answering 5 for "it's okay", 3-4 for "could use improvement", and 6-7 for "better than expected". there had never been a 10 in the stats, and the software took that as "this place sucks".

like, of course you downvote a bad response. you're supposed to help the model get better, right?

[-] [email protected] 30 points 3 weeks ago

Recently, I saw a survey that explicitly said 1-7 is "poor", 7-8 is "OK", and 9-10 is "great". Wild, not sure what the point of the scale is then.

Same with book ratings. Looking at StoryGraph, the average rating I see is somewhere between 3.5 and 4.5, while I would rate a decent book a 3.

Born in Eastern Europe, live in the US, maybe that's why.

[-] [email protected] 10 points 3 weeks ago

I wonder if it's like the grading system we use in school? <60% is F for fail, 60% to <70% is D which depending on the class can be barely passing or barely failing. >=70% would be A, B, and C grades which are all usually passing, and A in particular means doing extremely well or perfect (>=90%). I just noticed that that rating scale kind of lines up with the typical American grading scale, maybe that's just a coincidence
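To make the comparison concrete, here is a rough sketch of that grading scale applied to a 0-10 rating. The F, D, and A cutoffs are the ones described above; the 80% boundary between B and C is an assumption (the comment doesn't give one), and `letter_grade` and `rating_to_grade` are made-up names for illustration.

```python
def letter_grade(percent):
    """Map a percentage to a letter grade on a typical US-style scale."""
    if percent >= 90:
        return "A"
    if percent >= 80:  # assumed B/C boundary, not stated in the thread
        return "B"
    if percent >= 70:
        return "C"
    if percent >= 60:
        return "D"
    return "F"


def rating_to_grade(rating):
    """Treat a 0-10 rating as a percentage (rating * 10) and grade it."""
    return letter_grade(rating * 10)


for rating in (3, 5, 7, 9):
    print(rating, rating_to_grade(rating))
```

Read this way, a 5 meant as "it's okay" lands on an F, and only a 7 or higher counts as a passing grade.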

[-] [email protected] 11 points 3 weeks ago

most countries i know mark <50% as a failing grade

[-] [email protected] 2 points 3 weeks ago

i was unaware most countries still use this terrible score system at all

[-] [email protected] 1 points 3 weeks ago

Apples and watermelons. The all-time highest major league batting average is only .371, nowhere near .500 which would correspond to 50% of the max possible.

[-] [email protected] 11 points 3 weeks ago

i have no idea what that means or why it's relevant.

[-] [email protected] 1 points 3 weeks ago* (last edited 3 weeks ago)

I believe you. On a rating scale of 0-10 a value of 5 doesn't usually represent a failure or anything negative, it's usually a middle concept such as "neither like nor dislike". Batting average is another example where 50% isn't a "failing grade". Hope that helps clear it up for you.

[-] [email protected] 6 points 3 weeks ago

no i mean i don't know what a "batting average" is or why it's apples to oranges to compare it to test scores.

i'm assuming you mean that comparing a pure gaussian distribution to a weighted system is unproductive?

[-] [email protected] 15 points 3 weeks ago

From the looks of it, what they're calculating is a net promoter score. The idea is that, in some context, what you actually want to know is whether your target audience would be willing to actually promote your business to their friends and family or not.

It's very common in retail and other competitive markets, because a customer that had an "okay" experience could still go to a competitor, so only customers who had a great experience (7+ out of ten) are actually loyal, returning clients.

Don't know if that's the best method to gather impressions on workplace environment though, I don't think many people would consider their workplace "amazing"
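For reference, a minimal sketch of how a net promoter score is usually computed, assuming the common convention on a 0-10 scale: 9-10 counts as a promoter, 7-8 as passive, and 0-6 as a detractor. A given tool may use different cutoffs, and the sample ratings below are made up.

```python
def net_promoter_score(ratings):
    """Return the NPS (from -100 to 100) for ratings on a 0-10 scale.

    Assumes the common convention: 9-10 promoter, 7-8 passive, 0-6 detractor.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives only affect the result through the total count.
    return 100 * (promoters - detractors) / len(ratings)


# Hypothetical team that answers 5 for "it's okay" and never gives a 10.
ratings = [5, 5, 6, 7, 4, 3, 5, 6]
print(net_promoter_score(ratings))
```

With answers clustered around the middle of the scale, nearly everyone is counted as a detractor and the score comes out strongly negative, even if nobody is actually unhappy.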

[-] [email protected] 82 points 3 weeks ago
[-] [email protected] 56 points 3 weeks ago

"objective distribution" yeah right

[-] [email protected] 26 points 3 weeks ago

and the main problem with gpt are the em-dashes

[-] [email protected] 13 points 3 weeks ago

Yeah, nothing about the blatant hallucinations to even basic questions, the dashes are the major problem

[-] [email protected] 3 points 3 weeks ago

They can fix the styling, they can't fix the hallucinations.

[-] [email protected] 1 points 2 weeks ago

should look more like a Boltzmann distribution?

[-] [email protected] 48 points 3 weeks ago

“Optimizing for things people love” aka talking to you like an hr team building seminar

It’s frustrating to see the evolution of ChatGPT’s language processing, or maybe it’s a good thing, given the tendency of some people to form weird pseudo-social relationships with LLMs.

Public ChatGPT only had the 3.5, 4, and 4o models, but you can play with earlier models like 2 and 3 on huggingface. These were far weirder, often robotic and stilted, but sometimes they mirrored natural colloquial English more closely, depending on the input.

Rather than make something that is authentic and more natural to interact with they instead go for the ultra sanitized HR corporate speak bullshit. Completely bland and inoffensive with constant encouragement and reinforcement to drive engagement that feels so inauthentic (unless you are desperate for connection with anything, I guess). It’s mirrored in other models to some degree, deepseek, llama, etc (I don’t know about grok, fuck going on twitter).

3-5 years until it’s ruined by advertising, tops. If that

[-] [email protected] 3 points 3 weeks ago* (last edited 3 weeks ago)

i don't understand how people can find it appealing when computers speak like humans, i genuinely find HAL-9000 more appealing.

the ideal computer response style is how it works in star trek voyager

[-] [email protected] 35 points 3 weeks ago* (last edited 3 weeks ago)

It's hard to imagine how horrible 'early gpt' versions were at Croatian if they constantly invented words and grammar for much more popular languages, at the time.

[-] [email protected] 21 points 3 weeks ago* (last edited 3 weeks ago)

GPT aside, even Google Translate regularly invents words, especially for agglutinative languages.

Put google translate on Turkish to English and try something like "teakmezliyorlacaklarasacisinimislaslarin Charlie"

[-] [email protected] 9 points 3 weeks ago
[-] [email protected] 14 points 3 weeks ago

There is nothing objective about that 'objective distribution'. Why would the output automatically center on good?

[-] [email protected] 4 points 3 weeks ago

Because it's the center duh...

I genuinely think this will be the reasoning: since they divided the [0,1] interval into 5 equidistant intervals, I think they believe that is what the regular distribution of ratings should look like, and then they compare how much different regions deviate from this norm.
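A rough sketch of that guess, assuming scores normalized to [0, 1], five equal-width bins, and a uniform 20% share per bin as the "norm". None of this is confirmed as the actual method, and the function names and sample scores are made up.

```python
from collections import Counter


def bin_shares(scores, n_bins=5):
    """Share of scores falling in each of n_bins equal-width bins on [0, 1]."""
    if not scores:
        raise ValueError("need at least one score")
    # A score of exactly 1.0 is clamped into the last bin.
    counts = Counter(min(int(s * n_bins), n_bins - 1) for s in scores)
    return [counts[i] / len(scores) for i in range(n_bins)]


def deviation_from_uniform(scores, n_bins=5):
    """Difference between each bin's share and the uniform share 1/n_bins."""
    expected = 1 / n_bins
    return [share - expected for share in bin_shares(scores, n_bins)]


# Hypothetical ratings from one region, already scaled to [0, 1].
scores = [0.1, 0.5, 0.55, 0.7, 0.9]
print(bin_shares(scores))
print(deviation_from_uniform(scores))
```

The catch is the same one raised above: the uniform baseline is a choice, not an objective fact, so any region whose ratings cluster in the middle will show up as "deviating".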

[-] [email protected] 12 points 3 weeks ago

Self-defense against the AI: tell them they suck until they stop talking to you :D

this post was submitted on 29 May 2025
488 points (99.0% liked)
