Scoring System for Trust Level Requirements

+13
−1

Currently, we're planning to implement a system for user privileges based on Trust Levels.

These take the form 'if you satisfy [these requirements], you get [these perks]', where [these requirements] are generally something like "at least 50 accepted edits".

Continuing this example, what this doesn't take into account is the number of rejected edits - if a user has 50 accepted edits out of a total of 200 suggested edits (i.e. has 150 rejected edits), then I for one would be hesitant to give that user the ability to directly edit.

At this point, it appears that the solution would be to come up with some method of figuring out the number of accepted edits in order to 'balance out' the rejected edits, or having some system of '> x accepted edits and < y rejected edits within the past [some time-scale]'. However, we're already using a system that estimates the probability of a successful outcome of a binary choice (e.g. 'accept' and 'reject'), given some data. That is, our post scoring system.

I'm therefore proposing that we use this same scoring system to 'score' each individual requirement in the Trust Levels: (accepts + N) / (accepts + rejects + 2N), for N=2. The requirement of 'at least 50 accepted edits' could then be replaced with a requirement of 'an edit-score of at least 0.95'. This could similarly be applied to create a user post-score, where 'accepts' is the total number of upvotes across all posts and 'rejects' is the total number of downvotes.
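To make the formula concrete, here is a minimal Python sketch. The function name `smoothed_score` is made up for illustration; only the formula, N=2, and the example counts come from the text above.

```python
from fractions import Fraction

def smoothed_score(accepts: int, rejects: int, n: int = 2) -> Fraction:
    """(accepts + N) / (accepts + rejects + 2N), kept as an exact fraction."""
    if accepts < 0 or rejects < 0:
        raise ValueError("counts must be non-negative")
    return Fraction(accepts + n, accepts + rejects + 2 * n)

# The user from the example: 50 accepted out of 200 suggested edits.
mixed = smoothed_score(50, 150)   # 52/204, roughly 0.25
# A user with 50 accepted edits and no rejections.
clean = smoothed_score(50, 0)     # 52/54, roughly 0.96
# A user with no history at all sits at exactly 1/2.
fresh = smoothed_score(0, 0)

print(float(mixed), float(clean), float(fresh))
```

Against an edit-score requirement of 0.95, the first user falls far short even though they meet "at least 50 accepted edits", while the second user passes.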

As we're planning on getting rid of rep and not replacing it with any number (other than trust levels), an individual user's score should perhaps only be visible to that user. For easy visualisation, it could also be displayed in a radar chart such as this one:

[Image: radar chart for user scores]


6 comments

Can you please reupload the image in a higher resolution? The only thing I can read is "Score for (user)". ‭Zerotime‭ 3 months ago

@Zerotime The formatting, font size etc. would need to be improved, but here's a link to another one ‭Mithrandir24601‭ 3 months ago

I think this approach would simplify things for all involved. ‭Monica Cellio‭ 3 months ago

What is N conceptually here? Is it just some constant? What is its purpose? Also, don't we still need to take timescale into account, or maybe just the last X accepts/rejects, to help people who've gotten better/worse at edits over time? Cazadorro 3 months ago

N is just some constant, yep. My opinion on the timescale thing is that it doesn't matter if someone rarely edits, as long as they're consistently good at it, while someone who had a bad string of edits to start with, but improves, will see a gradual increase in score. Maybe your idea of 'last X number of...' is the way to go here? ‭Mithrandir24601‭ 3 months ago


2 answers

+6
−0

Update: Based on this question, this answer, feedback on both, the original spec for trust levels, and lots of discussion in chat, I've posted a new specification for privileges on the wiki. Key differences:

  • All privileges are derived from scores based on your posts, suggested edits, and/or flags. Thresholds remain configurable per-site.
  • More decoupling (the old TL3 was overloaded and that was a symptom).
  • We've nailed down some interactions with other site features (categories, rate limits, ability suspensions).

Our original trust levels grew out of long discussions on the forum, where we tried to work out baseline thresholds for different privileges and identified where communities could customize. We also lined those privileges up in a sequence, ordered trust "levels". It all ended up being a little complicated, and with apologies to the team member who has already put a lot of work into writing documentation about them (at my request!)... I think you have a really good point here and I'd like to revisit them.

We already had a problem brewing at Trust Level 3, which says:

Requirements:
At least 50 accepted flags.
At least 50 accepted edits.
At least 15 well-received questions/answers.

There was discussion somewhere of an either/or approach to flags and edits, but I can't find it right now.

I wrote, or compiled, those words, and when I went back to look at it later, I asked myself: what do flags and edits have to do with each other? On the dev team I've already brought up that we need a 3A and 3B.

But even that doesn't seem quite right. We're trying to pack too much into a few bundles. People at this trust level, according to our spec, get these privileges: edit directly, review edits, vote to close, create tags. Yeah, I think I just jammed tags in there because they had to go somewhere.

Not everything on the site can be "scored" in the way the question proposes, but posts (up/down votes), suggested edits (accept/reject), and flags (accept/reject) can be. Let's take advantage of that.

My radical, haven't-consulted-with-anybody-else-on-the-team proposal for a revamp of trust levels follows.


Privileges are earned independently. For each one I list the factor to score and the resulting abilities. I do not specify what the score threshold is; we would need to work out defaults for these that emulate our original trust-level spec, but also, sites could change these thresholds.

Note: higher scores automatically require more events; it's impossible to get an edit score of 0.9 on a single edit. So if we set the thresholds right we don't also need to specify minimum numbers of events.
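As a rough check of that claim, the sketch below counts how many accepted events are needed before the score from the question, with N=2, reaches a given threshold. The helper names are hypothetical, and the thresholds are examples only, not proposed defaults.

```python
from fractions import Fraction

def score(accepts: int, rejects: int, n: int = 2) -> Fraction:
    """Scoring formula from the question: (a + N) / (a + r + 2N)."""
    return Fraction(accepts + n, accepts + rejects + 2 * n)

def min_accepts_for(threshold: Fraction, rejects: int = 0, n: int = 2) -> int:
    """Smallest number of accepts that reaches `threshold` given `rejects`."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must be strictly between 0 and 1")
    accepts = 0
    # The score approaches 1 as accepts grow, so this loop terminates.
    while score(accepts, rejects, n) < threshold:
        accepts += 1
    return accepts

print(score(1, 0))                                   # a single accepted edit
print(min_accepts_for(Fraction(9, 10)))              # spotless record
print(min_accepts_for(Fraction(9, 10), rejects=1))   # one rejection
```

A single accepted edit scores 3/5. Reaching 0.9 takes 16 accepted edits with no rejections and 25 if one edit was rejected, so the threshold itself acts as a minimum event count.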

Also note: assume there are daily rate-limits on most things -- comments, votes, flags, etc. Eventually we could have rate limits scale with your activity/score -- if you have a great track record with flags you get more, etc. I think we should defer that until after we have privileges in use for a while so we can collect data.

Anybody (no privilege required) can:

  • flag (can be revoked; see note)
  • suggest edits (can be revoked; see note)
  • suggest a duplicate question
  • post N top-level posts (questions, articles) per day (3ish, if we go with what we worked out in the previous spec)
  • post any number of answers per day (within rate limits)
  • vote on answers to own questions
  • comment on answers to own questions
  • comment on own posts

Note: to prevent floods of bad flags/suggestions, we could set a score threshold at which you temporarily lose these abilities. I think it's important that anybody be able to participate in these ways by default, even though we don't know anything about you -- but if you build up a bad record, that's different.
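One way such a floor could work, sketched under assumptions: the value 0.25 and the name `may_flag` are placeholders, and the sketch does not model how the ability would later be restored.

```python
def score(accepts: int, rejects: int, n: int = 2) -> float:
    """Scoring formula from the question: (a + N) / (a + r + 2N)."""
    return (accepts + n) / (accepts + rejects + 2 * n)

# Hypothetical floor; no value has been decided.
REVOKE_BELOW = 0.25

def may_flag(accepted_flags: int, rejected_flags: int,
             floor: float = REVOKE_BELOW) -> bool:
    """Allowed by default; lost only once the track record is bad enough."""
    return score(accepted_flags, rejected_flags) >= floor

print(may_flag(0, 0))   # new user, score 0.5
print(may_flag(0, 5))   # five rejected flags and nothing accepted
```

Because a user with no history scores 0.5, any floor below that keeps flagging and suggesting edits open to everyone by default.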

Privilege: Remove new-poster restrictions (usually the first one earned but doesn't have to be!)

  • Criteria: score for posts (all types, all categories)
  • Can: make any number of posts, subject to rate-limiting, daily thresholds, and category restrictions; upvote and downvote anywhere; comment anywhere

Question: Should downvotes have a higher requirement than upvotes? That is, should we insert a "Downvote" privilege with a criterion of "higher score for posts"?

Privilege: Edit

  • Criteria: score for accepted edits
  • Can: edit directly, review edits

Privilege: Edit tags

  • Criteria: a higher score for accepted edits
  • Can: create tags, create tag descriptions

Privilege: Vote to close or keep open

  • Criteria: score for posts (to demonstrate some site familiarity) and score for flags on posts (all flags, not just "should be closed" flags; comment flags do not count)
  • Can: vote to close, vote to open/keep open (review close votes)

Privilege: Review flags

  • Criteria: score for posts and a higher score for flags (posts and comments, unless we decide these should be separate privileges)
  • Can: handle flags except "other/for moderators" (which are reserved for moderators)

Privilege: Moderator

  • Criteria: be chosen by the community :-) (or be appointed as a pro-tem until the community is ready to choose its own)
  • Can: do all the things, pretty much
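The score-based privileges above could be expressed as per-site configuration along these lines. This is only a sketch: the `Privilege` class, the `earned` helper, and every threshold number are placeholders for illustration, since no defaults have been set. Moderator is left out because it is chosen or appointed rather than scored.

```python
from dataclasses import dataclass

def score(positive: int, negative: int, n: int = 2) -> float:
    """Scoring formula from the question: (a + N) / (a + r + 2N)."""
    return (positive + n) / (positive + negative + 2 * n)

@dataclass(frozen=True)
class Privilege:
    name: str
    # Minimum score per factor ("posts", "edits", "flags"); all must be met.
    thresholds: dict

# Placeholder thresholds, not proposed defaults; a site could override them.
PRIVILEGES = [
    Privilege("remove new-poster restrictions", {"posts": 0.6}),
    Privilege("edit", {"edits": 0.9}),
    Privilege("edit tags", {"edits": 0.95}),
    Privilege("vote to close or keep open", {"posts": 0.7, "flags": 0.8}),
    Privilege("review flags", {"posts": 0.7, "flags": 0.9}),
]

def earned(counts: dict, privileges=PRIVILEGES) -> list:
    """counts maps a factor to its (positive, negative) event totals."""
    scores = {f: score(*counts.get(f, (0, 0)))
              for f in ("posts", "edits", "flags")}
    return [p.name for p in privileges
            if all(scores[f] >= t for f, t in p.thresholds.items())]

# Someone who only suggests edits: 40 accepted, none rejected.
print(earned({"edits": (40, 0)}))
```

With these placeholder numbers, the edit-only user earns "edit" and "edit tags" but nothing that depends on post or flag scores, which illustrates privileges being earned independently.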

I've had trouble fitting some actions into this score-based scheme, which might be how our trust levels became somewhat eclectic to begin with. These "stray" privileges are:

  • deleting posts (outside of flag-handling)
  • locking posts

We could initially make these moderator privileges, though I'd like to find a way to push them down to the community too.

We have use cases and wireframes for many of these activities, in various stages of development. For example, you might have noticed that I separated "suggest duplicate" from "close", because on Codidact those will be two different things -- poke around in the "wireframes for review" column there, and sorry for the clutter.


Responses to questions and concerns:

  • I added post score as an additional criterion for close votes and handling flags, so people have to have had some constructive first-class activity on the site to moderate in these ways. I did not do so for editing, because if you've actually had enough suggested edits approved, that's already a good indicator that you can be trusted to edit.

  • Categories can set posting and viewing restrictions based on old-style trust levels. I would suggest changing this to specify the privilege(s) needed. A site that wants to minimally restrict a category could require "remove new-user restrictions"; one that wants a higher bar for its blog could require one of "edit" or "vote to close". For the meta blog category (currently set at TL5), the posting requirement would be "moderator". Category configuration is infrequent and restricted to admins, so if we create this system I'm not too worried about people messing it up.

  • I have not built in any score decay over time. That's something we should look at, but I propose to look at it separately. It'll be a while before it will be relevant, so let's first get some practical experience with the system without decay and then decide what to do there.

  • Users do not need to do these calculations, though they're free to check our math. We would provide a clear indication of progress, whether graphically as suggested in the question or textually (or, probably, both).

  • New sites should be able to automatically grant certain privileges (such as "remove new-poster restrictions") and/or temporarily lower the thresholds, so that they are not unduly hindered in getting up and running.

6 comments

I think this is a good idea. One question, though: what should happen with posting/reading restrictions in categories? Our current idea is that we can set a minimum trust level, but that would no longer be possible. We could instead add two checkboxes: "Prevent posting for new users" (requires the No New User Restrictions privilege) and "Prevent posting for non-moderators" (requires the Moderator privilege). Same for reading, of course. luap42 3 months ago

Also, with regard to thresholds: we'll probably want to allow communities to change them, but we'll need good defaults, of course. Furthermore, with regard to starting a new site: we could mark some privileges as "automatically granted" (e.g. Remove new-poster restrictions, Edit, Edit Tags and Close) for some time. luap42 3 months ago

Is the intention for these scores also to decay over time? ‭Sigma‭ 3 months ago

@Sigma Doing that in some way makes sense. Maybe we could also just consider actions within the last year (or some other time frame), which would avoid "complex" decay math and would be easier to explain. luap42 3 months ago

"So if we set the thresholds right we don't also need to specify minimum numbers of events." But that's less intuitive. A threshold is easier to understand and set. Easy understanding and control should be more important than internal math. Olin Lathrop 3 months ago

+1
−1

"I'm therefore proposing that we use this same scoring system to 'score' each individual requirement in the Trust Levels: (accepts + N) / (accepts + rejects + 2N), for N=2."

This is good enough for measuring the success of some specific activity. However, trust should be earned with a user's broader participation also in mind.

There are really two classes of privileges, the merely mechanical ones, and those that exercise some level of policy. For example, editing is a mechanical privilege, whereas opening/closing questions is a policy privilege.

It might be OK to be allowed to edit posts without review by demonstrating a good edit history alone. However, I wouldn't want someone opening/closing questions without having shown broader site participation. To exercise policy, one needs to really understand the site. That can't be done just by watching, or by having completed a particular task successfully a few times. You really want a measure of being invested in the site. I don't want someone making policy decisions who doesn't have skin in the game, so to speak.

You keep saying you don't like rep, but some measure of having provided widely accepted value is useful for lots of reasons. I won't go into the others here, but this should be one of the factors in allowing policy privileges.

The open/close privilege is a good example. I would say that successfully answering lots of questions is necessary for deciding whether a particular question should be allowed. Actually answering questions gives you a different perspective than someone merely viewing them as a bystander. Bystanders shouldn't be allowed to make policy decisions.

So to answer your question, your formula could be OK for some types of privileges, but not others. There still needs to be an overall measure of having provided value and being active on the site that is necessary for other types of privileges.

About your specific formula: It's probably effective enough for the mechanical privileges, but it's rather unintuitive, and it makes easy per-site controls difficult to provide. Ease and clarity of the controls are also important. In fact, I think they are more important than mathematical elegance, or even theoretical "rightness" (within limits). For example, for the edit privilege, I'd prefer a set of rules like:

  1. Must have at least AA accepted edits.
  2. No more than BB percent of edits rejected.

This is very easy to compute, but more importantly, it is easy and intuitive to control and adjust.
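A minimal sketch of that two-rule check, assuming hypothetical values AA = 50 and BB = 10 purely for illustration:

```python
def passes_simple_rule(accepted: int, rejected: int,
                       min_accepted: int, max_reject_pct: float) -> bool:
    """Rule 1: at least `min_accepted` accepted edits.
    Rule 2: no more than `max_reject_pct` percent of all edits rejected."""
    if accepted < min_accepted:
        return False
    total = accepted + rejected
    # Compare without dividing, so an empty history cannot divide by zero.
    return rejected * 100 <= max_reject_pct * total

# Hypothetical settings: AA = 50, BB = 10.
print(passes_simple_rule(50, 5, 50, 10))     # 5 of 55 rejected
print(passes_simple_rule(50, 150, 50, 10))   # 150 of 200 rejected
print(passes_simple_rule(49, 0, 50, 10))     # spotless, but one edit short
```

Each of the two numbers maps directly to one site setting, which is the ease of control being argued for here.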

5 comments

In our current spec for trust levels, trust for close votes is based on successful flagging. That could possibly be refined; on SE you wouldn't want the people running Smoke Detector auto-flaggers for spam to earn privileges on sites they're not otherwise on. But the basic idea is that they're both content moderation, and question flags include flags to close. ‭Monica Cellio‭ 3 months ago

@Monica: Yes, how well accepted close flags are is a good metric. But, it's just one metric. There still needs to be some participation threshold, "skin in the game". You don't want a close-class of user emerging where all they do is close questions without actually participating in the site in a more meaningful way. ‭Olin Lathrop‭ 3 months ago

I agree that the current formula is somewhat complex and fragmenting it across a bunch of different metrics makes it even less intuitive. I have three semesters of stats and have to think through the math carefully - how is a casual user expected to grasp what is going on without frustration? ‭Sigma‭ 3 months ago

"You don't want a close-class of user emerging where all they do is close questions without actually participating in the site in a more meaningful way." Sure, it would be nice if they did more, but shouldn't people be able to choose the level of contribution they're comfortable at? If they want to act as a glorified spambot, why should we prevent that? Especially if it no longer affects their access to higher privileges on other actions, I don't see the issue. ‭Sigma‭ 3 months ago

@Sigma: Because when you don't participate for real, you will have a different view of what a good question is. ‭Olin Lathrop‭ 3 months ago
