Comments on Should we start displaying the score of a post instead of the raw votes?
Parent
Should we start displaying the score of a post instead of the raw votes?
Currently, when viewing a post, Codidact will show you the raw votes on a post, with the breakdown into upvotes and downvotes:
There's been some feedback that this is a bit too much to show, especially coming from platforms like Stack Exchange where they generally just show the aggregate score of upvotes and downvotes as one number (with the option to expand the votes to see the split). We decided to show both counts automatically to better show when there's controversy.
However, we now also have another option. We have a method for scoring posts that assigns a score between 0 and 1 to each post.
Perhaps instead of showing the raw votes on each post, we should show the post score (e.g. 0.81363... or 0.3793...), rounded to two or three decimal places (so that those would show as 0.814 or 0.379), with the raw votes available on request, perhaps either on click or in the tools menu.
This would take people a bit of time to get used to, but it might be worth that initial adjustment time, since this... is our scoring system and we want people to be familiar with it quickly.
This has the added benefit of making it much clearer why answers are sorted the way they are by displaying their score (that's currently computed without being displayed) for everyone to see. The raw votes matter less than the computed score.
Post
Why show scores at all?
When I was at Stack Exchange, we spent a good deal of time discussing sort orders in the context of obsolete answers. One suggestion was to change the sort order to use Wilson scoring. The objections were:
- The algorithm is confusing so will raise more questions than shed light.
- For many questions, sorting by the sum of positive and negative votes gives the same result as using a more complicated sorting mechanism.
- Answer scores are part of the Stack Overflow brand.
The final point doesn't apply here, except to the degree that Codidact's strategy might be to either distance itself from or embrace a connection to Stack Overflow.
The goal of a number is to give people a quick view into how good the answer is. If nobody has voted, the answer is ¯\_(ツ)_/¯. The more people who have voted, the greater confidence we have in the voting. For instance, which toaster would you buy?
Even though we would normally pick a 5-star toaster over a 4½ star toaster, we know that 46 ratings probably means more than 1 rating. The single 5-star rating could very well have been from someone associated with the product.
There are some other things it would be helpful to know, such as how old the ratings are and whether the product has changed after getting a handful of ratings. This also applies to answers. An answer with many upvotes might not be good anymore if the world has changed and the answer hasn't. Or maybe someone edited the answer in a way that would have caused early voters to vote differently if the edit had been part of the original.
We have some intuition when it comes to the 5 star review system Amazon uses. The simple sum of positive and negative votes is also easy to understand. Codidact's display isn't for most people. Looking at answers to this very question right at the moment, I see:
- 6/1
- 3/0
- 6/2
With some help from a calculator, I can see that's the right sorting:
- 0.4869
- 0.4385
- 0.4093
But it kinda breaks my brain to think about it. I just don't have the right intuition. (Yet?)
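For anyone who wants to check the numbers without a calculator, here is a minimal sketch. It assumes the figures above are the lower bound of the Wilson score interval at roughly 95% confidence (z = 1.96), which does reproduce them to four decimal places. The function name and the choice to return 0 for an unvoted post are my own, not something taken from Codidact's code.

```python
import math

def wilson_lower_bound(upvotes: int, downvotes: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction.

    z = 1.96 (about 95% confidence) is an assumption; it matches the
    figures quoted in the post.
    """
    n = upvotes + downvotes
    if n == 0:
        return 0.0  # assumption: no votes means no evidence either way
    p = upvotes / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (centre - margin) / (1 + z * z / n)

for up, down in [(6, 1), (3, 0), (6, 2)]:
    print(f"{up}/{down}: {wilson_lower_bound(up, down):.4f}")
# 6/1: 0.4869
# 3/0: 0.4385
# 6/2: 0.4093
```

Note how 3/0 lands between 6/1 and 6/2: a clean record with few votes earns less confidence than a slightly blemished record with more votes, which is the toaster intuition in numeric form.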
Show rank instead?
A simple change would be to display ranks instead of scores:
- This is the top ranked answer by Wilson score, so it's #1!
- Still a good answer.
- Not great, but better than #4.
- Just good enough to avoid getting deleted.
If you click on the rank, you could see details such as the number of up/down votes, the confidence interval and maybe some indication of age of votes. It really doesn't matter how you calculate rank as long as it's documented somewhere. That's helpful because it could let you explore using other signals.
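To make the "it doesn't matter how you calculate rank" point concrete, here is a rough sketch in which the scoring function is just a parameter. Everything in it (the `Answer` type, `rank_answers`, `net_score`) is hypothetical naming for illustration, not an existing API.

```python
from typing import Callable, List, NamedTuple, Tuple

class Answer(NamedTuple):
    title: str
    upvotes: int
    downvotes: int

def net_score(answer: Answer) -> float:
    """Plain upvotes-minus-downvotes score, used here as one possible signal."""
    return answer.upvotes - answer.downvotes

def rank_answers(
    answers: List[Answer],
    score: Callable[[Answer], float],
) -> List[Tuple[int, Answer]]:
    """Return (rank, answer) pairs, best first, for any scoring function."""
    ordered = sorted(answers, key=score, reverse=True)
    return list(enumerate(ordered, start=1))

answers = [Answer("A", 6, 1), Answer("B", 3, 0), Answer("C", 6, 2)]
for rank, answer in rank_answers(answers, net_score):
    # The reader sees only the rank; vote details stay available on click.
    print(f"#{rank} {answer.title} (+{answer.upvotes}/-{answer.downvotes})")
```

Swapping `net_score` for a Wilson-style or age-aware function changes the order but not the display, which is what leaves room to experiment with other signals.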
As Evan Miller wrote in "Bayesian Average Ratings":
Bayesian average ratings are an excellent way to sort items with up-votes and down-votes, and lets us incorporate a desired level of caution directly into the model. It is eminently “hackable” in the sense of affording opportunities for adjusting the model to accommodate new features (prior beliefs, belief decay) without violating the model’s consistency. As long as we make a judicious choice of belief structure (namely a beta distribution), it is feasible to compute.
As with other hackable systems, it is possible to take the Bayesian framework and produce something utterly useless. But as long as you set aside time for a little experimentation, I think you’ll eventually arrive at a sorting method that helps people find the good stuff quickly.
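As a rough illustration of the "beta distribution" idea in that quote, the sketch below treats the prior as pseudo-votes added to the real ones and scores by the posterior mean. This is a simplification of my own and not the exact ranking criterion from the article; the function name and the default prior of one pseudo-upvote and one pseudo-downvote are assumptions.

```python
def bayesian_average(
    upvotes: int,
    downvotes: int,
    prior_up: float = 1.0,    # assumed prior: one pseudo-upvote
    prior_down: float = 1.0,  # assumed prior: one pseudo-downvote
) -> float:
    """Posterior mean of a Beta(prior_up, prior_down) belief after the votes."""
    return (upvotes + prior_up) / (upvotes + downvotes + prior_up + prior_down)

print(bayesian_average(0, 0))    # unvoted answers sit at the prior mean, 0.5
print(bayesian_average(1, 0))    # one upvote moves the estimate only a little
print(bayesian_average(10, 0))   # ten upvotes move it much further
print(bayesian_average(0, 10))   # ten downvotes push it towards 0
```

Raising both prior parameters makes the score more cautious, since more real votes are needed to move it away from the prior. That is the "hackable" knob.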
Now you do lose something, because there's still a vast difference between answers scored +10/-0 and +0/-10, yet with only two answers the second one will be ranked second. (By definition!) Sometimes second answers are great, but not in this case. So maybe some indication of the strength of the system's belief in the quality of the answer would be helpful. (I'm partial to Isaac Moses' signal strength indicator.) But the primary signal is simply where the answer is placed on the page and maybe a number showing that rank.
What about random, weighted placement?
If you remove the numbers (whether score or rank), that opens up another possibility: randomly place each answer on the page, with better answers weighted more heavily to be on top. When there are several answers with no votes, this is an honest method of display since the system can't know which is the best. Chronological sorting is usually a default, but is the best answer the first or the last? On a programming site, the more recent answer might have incorporated updates to the tools. But the older answer might be standard and the newer answer a speculative variant. Only a human can tell.
Once votes start to roll in, the system can estimate which answer is better, but it's still an estimate. Bumping up based on upvotes and down based on downvotes gives a more accurate view, of course. But random placement gives downvoted answers a chance of getting support and avoids the fastest gun bias.
Random sorting has an impact on performance because of caching. You either need to rebuild the page every time it loads or be stuck with just one random sorting until something triggers a rebuild. Building this is probably not worthwhile unless there can be some experimentation showing that random order is useful.
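One way the weighted placement could work, as a sketch only: give each answer a random key biased by its weight and sort by that key (the Efraimidis-Spirakis trick for weighted sampling without replacement). The function name, the `floor` that keeps unvoted answers in play, and the idea of passing in a seeded generator so a cached page keeps one order are all assumptions of mine.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

def weighted_random_order(
    items: Sequence[T],
    weights: Sequence[float],
    rng: Optional[random.Random] = None,
    floor: float = 0.05,  # assumed minimum weight so zero-score answers can still surface
) -> List[T]:
    """Shuffle items so that higher-weight items tend to land nearer the top."""
    if rng is None:
        rng = random.Random()
    keyed = []
    for item, weight in zip(items, weights):
        w = max(weight, floor)
        # u ** (1 / w) is closer to 1 for larger w, so heavy items usually sort first.
        keyed.append((rng.random() ** (1.0 / w), item))
    keyed.sort(key=lambda pair: pair[0], reverse=True)
    return [item for _, item in keyed]

# A fixed seed (for example, one per cache rebuild) gives a stable order
# until the page is rebuilt; an unseeded generator reshuffles on every load.
print(weighted_random_order(["A", "B", "C"], [0.49, 0.44, 0.41], random.Random(42)))
```

With weights this close together the order is nearly a coin toss, which seems honest. With weights like 0.9 versus 0.1 the stronger answer should land on top about nine times in ten.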
How about bins?
Another option would be to group answers in "quality bins". Pick a Wilson score threshold to be "great answers" and put them in a section nearest the top of the page. Answers the system is less confident about would be in a lower group. Maybe you only need two groups:
- Good answers
- Other
Or maybe a bin for bad answers that aren't yet deleted. I'm less sure that's helpful, though, because:
Most readers want the top answer and maybe the second
Based on my experience, the key indicator of answer quality for visitors is the top answer. (I thought I did a study at Stack Exchange, but I can't find any evidence. Maybe it was internal?) This might be less true on more philosophical communities (like meta), but people don't have much patience for reading past one or maybe two answers. That means sort order is really important. I find the current +/- display distracting. I want a discoverable way to see the breakout, but my intuition about what the numbers mean (perhaps from years of seeing scores) isn't formed yet. Sort order probably tells me all I need anyway.