this post was submitted on 10 Nov 2023
so they've defined ambivalent typologies based on their framework in table 1, and use that to impose 4 clusters onto the data
so there's really only 3 clusters but they've decided to set k=4 anyway, and then k-means just minimizes the variance within each cluster around its mean. each observation gets assigned to whichever mean is "closest" (smallest squared euclidean distance), but that doesn't mean it's really the best choice.
even after they "merge" the ambivalent classes and set k=3, assigning each observation to the nearest cluster mean doesn't make that the best way to define each class. it only means that mean was the nearest one by squared distance.
the natopedia article has a good illustration:
https://en.wikipedia.org/wiki/K-means_clustering#/media/File:K-means_convergence.gif
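to make the point concrete, here's a rough sketch of plain k-means (lloyd's algorithm) in python. the data points, starting centroids, and function name are all made up for illustration and have nothing to do with the paper's data. the toy data has three obvious groups, and asking for k=4 still hands back four clusters by splitting one real group:

```python
def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm on 2-D points, starting from the given centroids."""
    clusters = []
    for _ in range(iterations):
        # assignment step: each point goes to the centroid with the
        # smallest squared Euclidean distance
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # update step: move each centroid to the mean of its assigned points
        new_centroids = []
        for c, members in zip(centroids, clusters):
            if members:
                new_centroids.append((
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                ))
            else:
                new_centroids.append(c)  # keep an empty cluster's centroid in place
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# hypothetical toy data: three tight, well-separated groups of four points each
points = [
    (1, 1), (1, 2), (2, 1), (2, 2),
    (8, 1), (8, 2), (9, 1), (9, 2),
    (5, 8), (5, 9), (6, 8), (6, 9),
]

# k=3: one starting centroid per real group
_, clusters_k3 = kmeans(points, [(1, 1), (9, 1), (5, 8)])
sizes_k3 = [len(c) for c in clusters_k3]

# k=4: the extra centroid has to land somewhere, so one real group gets split
_, clusters_k4 = kmeans(points, [(1, 1), (9, 1), (5, 8), (6, 9)])
sizes_k4 = [len(c) for c in clusters_k4]

print(sizes_k3)  # [4, 4, 4]
print(sizes_k4)  # [4, 4, 3, 1]
```

the algorithm never complains that the fourth cluster is fake. it returns k clusters because you asked for k, and the split it picks depends on where the centroids started.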
Doesn't it seem strange to apply this kind of statistical analysis to a four-point survey?
yeah, the more i think about it the stranger it seems. they've already defined groups, so they could just sort respondents into them based on what their answers were. running an iterative means clustering algorithm reads like they felt they needed some fancy math to make it look better.
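for comparison, grouping directly on the answers could look something like this. the responses and the low/high rule are invented for the example (they are not the paper's table 1), it's only meant to show that no iterative algorithm is needed when the groups are defined up front:

```python
from collections import Counter

# hypothetical responses: two survey items, each answered on a 1-4 scale
responses = [(1, 1), (1, 2), (4, 4), (4, 3), (1, 1), (2, 4), (4, 4), (1, 1)]


def label(answers):
    """Made-up rule: an answer of 1-2 counts as 'low', 3-4 as 'high'."""
    return tuple("low" if a <= 2 else "high" for a in answers)


# every respondent falls into exactly one predefined group, no iteration needed
counts = Counter(label(r) for r in responses)

for group, n in counts.most_common():
    print(group, n)
```

with two four-point items there are only 16 possible answer patterns, so the whole table can be tallied and read directly.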