Welcome to Codidact Meta!

Codidact Meta is the meta-discussion site for the Codidact community network and the Codidact software. Whether you have bug reports or feature requests, support questions or rule discussions that touch the whole network – this is the site for you.

Comments on Should we start displaying the score of a post instead of the raw votes?


Should we start displaying the score of a post instead of the raw votes?

+10
−2

Currently, when viewing a post, Codidact will show you the raw votes on a post, with the breakdown into upvotes and downvotes:

Screenshot of the voting buttons, showing +12 and -1.

There's been some feedback that this is a bit too much to show, especially coming from platforms like Stack Exchange where they generally just show the aggregate score of upvotes and downvotes as one number (with the option to expand the votes to see the split). We decided to show both counts automatically to better show when there's controversy.

However, we now also have another option. We have a method for scoring posts that assigns a score between 0 and 1 to each post.

Perhaps instead of showing the raw votes on each post, we should instead show the post score (e.g. 0.81363... or 0.3793...), rounded to two or three decimal places (so that it would show as 0.814 or 0.379), with the raw votes available on request, perhaps on click or in the tools menu.

This would take people a bit of time to get used to, but it might be worth that initial adjustment time, since this... is our scoring system and we want people to be familiar with it quickly.

This has the added benefit of making it much clearer why answers are sorted the way they are, by displaying the score (which is currently computed but never shown) for everyone to see. The raw votes matter less than the computed score.


1 comment thread

Two degrees of freedom (2 comments)
+4
−0

Why show scores at all?

When I was at Stack Exchange, we spent a good deal of time discussing sort orders in the context of obsolete answers. One suggestion was to change the sort order to use Wilson scoring. The objections were:

  1. The algorithm is confusing, so it would raise more questions than it answers.
  2. For many questions sorting by the sum of positive and negative votes gives the same result as using a more complicated sorting mechanism.
  3. Answer scores are part of the Stack Overflow brand.

The final point doesn't apply here, except to the degree that Codidact's strategy is to distance itself from, or embrace, a connection to Stack Overflow.

The goal of a number is to give people a quick view into how good the answer is. If nobody has voted, the answer is ¯\_(ツ)_/¯. The more people who have voted, the greater confidence we have in the voting. For instance, which toaster would you buy?

Two toaster listings: one with 5 stars and 1 rating, one with 4½ stars and 46 ratings.

Even though we would normally pick a 5-star toaster over a 4½ star toaster, we know that 46 ratings probably means more than 1 rating. The single 5-star rating could very well have been from someone associated with the product.

There are some other things it would be helpful to know, such as how old the ratings are and if the product has changed after getting a handful of ratings. And this also applies to answers. An answer with many upvotes might not be good anymore if the world has changed and the answer hasn't. Or maybe someone edited the answer in a way that would have caused early voters to vote differently if the edit had been original.

We have some intuition when it comes to the 5-star review system Amazon uses. The simple sum of positive and negative votes is also easy to understand. Codidact's score display, though, isn't intuitive for most people. Looking at the answers to this very question right now, I see:

  • 6/1
  • 3/0
  • 6/2

With some help from a calculator, I can see that's the right sorting:

  • 0.4869
  • 0.4385
  • 0.4093

But it kinda breaks my brain to think about it. I just don't have the right intuition. (Yet?)
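For anyone who wants to check the calculator work, here is a short sketch of the lower bound of the Wilson score interval (with the conventional z = 1.96). It reproduces the three values above; Codidact's actual scoring code may differ in details.

```python
import math

def wilson_lower_bound(up: int, down: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the upvote fraction.

    This reproduces the calculator math above; Codidact's production
    scoring code may differ in details.
    """
    n = up + down
    if n == 0:
        return 0.0
    p = up / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / (1 + z * z / n)

# The three vote splits listed above, in page order:
for up, down in [(6, 1), (3, 0), (6, 2)]:
    print(f"{up}/{down} -> {wilson_lower_bound(up, down):.4f}")
# prints 0.4869, 0.4385, 0.4093: the same ordering as on the page
```

Note that 6/2 ranks below 3/0 even though it has more net votes; the downvotes widen the interval enough to drop the lower bound.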

Show rank instead?

A simple change would be to display ranks instead of scores:

  1. This is the top ranked answer by Wilson score, so it's #1!
  2. Still a good answer.
  3. Not great, but better than #4.
  4. Just good enough to avoid getting deleted.

If you click on the rank, you could see details such as the number of up/down votes, the confidence interval, and maybe some indication of the age of the votes. It doesn't much matter how you calculate rank as long as it's documented somewhere; that's helpful because it leaves room to explore other signals later.
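As a minimal sketch of the idea, using the three vote splits from earlier in this answer (the field names are invented): sort by Wilson score, show only the rank, and keep the raw votes for the click-through details.

```python
# Hypothetical sketch: derive display ranks from precomputed Wilson scores.
# The scores are the three computed above; field names are invented.
answers = [
    {"id": "a", "up": 6, "down": 1, "wilson": 0.4869},
    {"id": "b", "up": 3, "down": 0, "wilson": 0.4385},
    {"id": "c", "up": 6, "down": 2, "wilson": 0.4093},
]
answers.sort(key=lambda a: a["wilson"], reverse=True)
for rank, a in enumerate(answers, start=1):
    # The rank is the primary display; the vote split stays behind a click.
    print(f"#{rank} (click for details: +{a['up']}/-{a['down']})")
```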

As Evan Miller wrote in "Bayesian Average Ratings":

Bayesian average ratings are an excellent way to sort items with up-votes and down-votes, and lets us incorporate a desired level of caution directly into the model. It is eminently “hackable” in the sense of affording opportunities for adjusting the model to accommodate new features (prior beliefs, belief decay) without violating the model’s consistency. As long as we make a judicious choice of belief structure (namely a beta distribution), it is feasible to compute.

As with other hackable systems, it is possible to take the Bayesian framework and produce something utterly useless. But as long as you set aside time for a little experimentation, I think you’ll eventually arrive at a sorting method that helps people find the good stuff quickly.

Now you do lose something, because there's still a vast difference between an answer scored +10/-0 and one scored +0/-10, yet with only two answers the second one will be ranked second. (By definition!) Sometimes second answers are great, but not in this case. So maybe some indication of the strength of the system's belief in the quality of the answer would be helpful. (I'm partial to Isaac Moses' signal-strength indicator.) But the primary signal is simply where the answer is placed on the page, plus maybe a number showing that rank.

What about random, weighted placement?

If you remove the numbers (whether score or rank) that opens up another possibility: randomly place each answer on the page with better answers weighted more heavily to be on top. When there are several answers with no votes, this is an honest method of display since the system can't know which is the best. Chronological sorting is usually a default, but is the best answer the first or the last? On a programming site, the more recent answer might have incorporated updates to the tools. But the older answer might be standard and the newer answer a speculative variant. Only a human can tell.

Once votes start to roll in, the system can estimate which answer is better, but it's still an estimate. Bumping up based on upvotes and down based on downvotes gives a more accurate view, of course. But random placement gives downvoted answers a chance of getting support and avoids the fastest gun bias.
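A sketch of one possible mechanism (the weights here are invented, and Efraimidis-Spirakis weighted sampling is just one way to implement a weighted shuffle):

```python
import random

def weighted_shuffle(items, weights, rng=random):
    """Randomly order items so that higher-weight items tend to land on top.

    Sorting by rng.random() ** (1 / weight), descending, is the
    Efraimidis-Spirakis trick: each item's chance of coming first is
    proportional to its weight.
    """
    keyed = [(rng.random() ** (1.0 / w), item) for item, w in zip(items, weights)]
    keyed.sort(reverse=True)
    return [item for _, item in keyed]

# Invented weights: a floor keeps unvoted or downvoted answers in the
# running rather than pinned to the bottom of the page.
print(weighted_shuffle(["answer-1", "answer-2", "answer-3"], [0.49, 0.44, 0.41]))
```

With weights that close together, the order genuinely varies between page loads; a heavily upvoted answer would get a much larger weight and sit on top most of the time.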

Random sorting has an impact on performance because of caching. You either need to rebuild the page every time it loads or be stuck with just one random sorting until something triggers a rebuild. Building this is probably not worthwhile unless there can be some experimentation showing that random order is useful.

How about bins?

Another option would be to group answers in "quality bins". Pick a Wilson score threshold to be "great answers" and put them in a section nearest the top of the page. Answers the system is less confident about would be in a lower group. Maybe you only need two groups:

  • Good answers
  • Other

Or maybe a bin for bad answers that aren't yet deleted. I'm less sure that's helpful, though, because:

Most readers want the top answer and maybe the second

Based on my experience, the key indicator of answer quality for visitors is the top answer. (I thought I did a study on this at Stack Exchange, but I can't find any evidence. Maybe it was internal?) This might be less true on more philosophical communities (like meta), but people don't have much patience for reading past one or maybe two answers. That means sort order is really important. I find the current +/- display distracting. I want a discoverable way to see the breakout, but my intuition about what the numbers mean (which perhaps takes years of seeing scores to form) isn't there yet. Sort order probably tells me all I need anyway.


1 comment thread

Breaks down for sub-zero net votes on the best answer. (4 comments)
Michael‭ wrote 11 months ago

If there's a question that has only attracted poor answers, and the top one is +0-2 (or +1-6 or whatever), "Wow! Rank #1" is kind of misleading. If you mean "rank" meaning "cohort" like Isaac's pentiles, I'm interested. But if you mean RANK() OVER (ORDER BY upvotes - downvotes), I don't want that.

Jon Ericson‭ wrote 11 months ago

I think showing voting solves the problem of obviously poor answers. (But so does the SO score in all but the most extreme cases.) It would make a lot of sense to have a threshold below which an answer is visibly signalled as "no good". Maybe even hide such answers behind a link. (Cribbing this from Discourse.) That way you don't even need to rely on users interpreting the numbers.

I like Isaac's system for answers that are on the more-likely-to-be-helpful-than-not side of the ledger. For +1/-6 I'm not sure you'd want that answer on the same scale. If it's the only answer, the question basically is unanswered.

Jon Ericson‭ wrote 11 months ago

To put it another way, the tricky problem is showing the difference between +0/-0 and +X/-X. Dealing with +0/-X is relatively easy. I'm not sure you necessarily need to use the same solution to both problems.

Michael‭ wrote 11 months ago

"To put it another way, the tricky problem is showing the difference between +0/-0 and +X/-X." Agreed. It's a problem that SO chose to obscure behind the 1000-rep click-to-see-raw-votes.

"I'm not sure you necessarily need to use the same solution to both problems." That could be, but I'm going to push for all the information to be presented up front to make a determination on both post value and post interaction, simultaneously.

"I like Isaac's system for answers that are on the more-likely-to-be-helpful-than-not side of the ledger. For +1/-6 I'm not sure you'd want that answer on the same scale. If it's the only answer, the question basically is unanswered." At least half of Isaac's answer deals with negatively voted posts. I explicitly do not want different "scales."

I think a lot of your post (and comments) have good insight, but I disagree with your conclusions.