
Welcome to Codidact Meta!

Codidact Meta is the meta-discussion site for the Codidact community network and the Codidact software. Whether you have bug reports or feature requests, support questions or rule discussions that touch the whole network – this is the site for you.

Comments on Let's improve how we handle duplicates

Parent

Let's improve how we handle duplicates

+11
−0

Currently, marking a question as a duplicate is part of question closure. Duplicates are a little different from other close reasons, though -- often the question itself is clear, complete, and otherwise solid, but it happens to have been asked before. Question closure can leave people feeling judged (as we learned Somewhere Else), but finding a duplicate should make the asker feel happy -- "we already have an answer for you". I've been wanting to change how we handle duplicates for a while -- the semantics are different, so why should they be part of the same workflow?

Here's a proposal; please provide feedback and help refine it.

Goals

  • Address duplicates as promptly as possible, to get askers to their answers and to reduce effort spent on what turn out to be duplicate answers.

  • Help authors to differentiate their dupe-nominated posts (if they disagree) and expedite resolution when they do.

  • Enable the community to have an ongoing evaluation by collecting all types of feedback including disagreement.

  • As already noted, counter the impression that duplicates are bad.

  • Test some ideas that would apply to closures too (which we also want to improve).

The main ideas

Someone who thinks a top-level post[1] is a duplicate can propose it, including an optional comment with the link. The suggestion is shown on the post and a comment thread is created for discussion. Other people who see this notice can agree, disagree, or propose other duplicate targets. We keep a running tally of votes in both directions, as opposed to going through close/reopen cycles.

The author is given specific editing guidance (or can accept a dupe suggestion). If the author edits in response to the dupe suggestion, and has the Edit ability, we (initially) trust that the edit resolved the issue -- clear the dupe suggestions, record everything in the history, and otherwise reset. Question: To avoid abuse or "dupe wars", should we only do this once (per post)?

If the author doesn't have the Edit ability, the edit still takes effect (you can always edit your own posts), but the dupe suggestion remains. People who can review suggested edits see a notice on the post asking them to review the edit and decide if it resolved the duplicate suggestion. If yes, proceed as for the author edit.

If "enough" people (score threshold still TBD[2]) agree that a post is a duplicate, it's marked as such. A duplicate designation can be reversed by the community.
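
To make the running tally concrete, here is a minimal Python sketch. It is an illustration only, not part of the spec: the class name `DuplicateTally`, the default threshold of 2 (the real value is still TBD and would be a community setting), and the rule that a user's later vote replaces their earlier one are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class DuplicateTally:
    """Running agree/disagree tally for one suggested duplicate target (hypothetical model)."""

    target_id: int
    threshold: int = 2  # assumed value; the proposal leaves the threshold TBD
    agree: set = field(default_factory=set)      # user ids who suggested or agreed
    disagree: set = field(default_factory=set)   # user ids who disagreed

    def vote(self, user_id: int, agrees: bool) -> None:
        # Assumption: one vote per user, and a later vote replaces an earlier one.
        self.agree.discard(user_id)
        self.disagree.discard(user_id)
        (self.agree if agrees else self.disagree).add(user_id)

    @property
    def net_score(self) -> int:
        # Net score = votes for the duplicate minus votes against it.
        return len(self.agree) - len(self.disagree)

    def is_duplicate(self) -> bool:
        # The post is marked as a duplicate once the net score reaches the threshold.
        return self.net_score >= self.threshold
```

Because both directions are tallied continuously, reversing a duplicate designation needs no separate close/reopen cycle in this model: enough "disagree" votes simply bring the net score back under the threshold.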

Duplicate identification and resolution is democratized much more than other closures. I propose that anybody with the Participate Generally ability can participate in these votes.

In more detail

The following is taken from the draft specification. That spec also talks a little about closure ("hold"), which isn't very far along and will probably change so please don't focus on it.

Functional specification

Codidact supports duplicate suggestions and hold suggestions. Duplicates are not a type of hold -- the focus of a duplicate is to "get to an answer more quickly" and to link posts together, while a hold is more about closing a question down until problems are addressed. We think the user experience of duplicates can be improved if they're not treated as closures/holds.

Duplicates are, intentionally, more "democratic"; while holds require the Curate ability, anybody with Participate Generally can participate in duplicate resolutions.

Suggesting a duplicate

Anybody with the Participate Generally ability can propose that a top-level post is a duplicate of another top-level post in the same community. (This could be a different category.) This spec also covers "superseded" or other duplicate-like phrasings -- the behavior is the same, even if a community customizes its wording.

To suggest a duplicate, any user (with the ability) can:

  • select the Tools menu under the post
  • select "suggest duplicate" from the menu (move "close" to this menu at the same time to reduce confusion)
  • fill out an in-page form with a required link and an optional comment (the comment can be helpful when it's not obvious why the other question is a duplicate)

Question: Should we disable the option if you have a suggestion pending, i.e. one suggestion per user at a time?

On submission:

  • A "Possible duplicate" comment thread is created or updated. A comment is added with the link and (if provided) additional comment. These comments are attributed (duplicate suggestions are not anonymous).
  • If there are now enough votes for the same duplicate target ("enough" to be defined), the question is marked as a duplicate. The author receives an inbox notification.
  • Otherwise, we display a notice of the suggested dupe, including links to the target and the comment thread, with action buttons (see below).
  • The author and everybody who has already answered the question receive inbox notifications of suggested duplicates.
  • State changes (marking a question as a duplicate or reversing it) are recorded in the post history.

Notice and actions

The notice is something like the following:

This question might be a duplicate of (other title with link) (could be multiple).
Community members provided the following feedback: (comment text that accompanied votes, unsigned here, and link to thread)

The author additionally sees:

Please read the linked question and its answers. If your question is different, you can edit to clarify.

And two buttons: "Yes, it's a duplicate" and "No, I will edit". See "author response" for how these buttons are handled.

Question: Should there be a third option, for "no, I disagree and don't need to edit" (spurious suggestions, etc), which would be treated as an ordinary "disagree" vote?

Everybody else who has Participate Generally sees two buttons next to each suggested duplicate: "agree" and "disagree". Choosing either prompts for a comment to add to the thread (like the initial suggestion).

Question: Should each dupe suggestion show its tallies (suggest + agree versus disagree)? Or should people who want to know the details have to go to the comment thread?

Answering a possible duplicate

While duplicate suggestions are pending, starting an answer generates a "hey, this might be a duplicate" alert, form and wording to be determined. This serves two purposes: (a) if you know enough to answer the question you probably know enough to contribute to the evaluation of whether it's a duplicate, and (b) you might want to answer that other question instead (or in addition).

Author response

If the author agrees it's a duplicate, the question is so marked (author's vote is binding). A notice is added and "[duplicate]" is added to the title. If there are multiple suggestions, the author selects one or more.

If the author disagrees and begins an edit (either via the button or the usual way):

  • It's the usual edit interface, except that "My question is not a duplicate of (link) because" has been inserted at the bottom and (ideally) the cursor is positioned there. If there's more than one dupe suggestion, do this for each and position cursor at the first.

  • If the author has the Edit ability, when the author submits the edit, the duplicate notice is removed from the question (for all viewers) and this review/resolution is logged in the history. (We can talk about yo-yo cases, where the author keeps rejecting duplicate votes this way, but I think it's something we should consider later. Let's not over-complicate it to start. Perhaps we only allow one author-edit resolution per question.)

  • If the author does not yet have the Edit ability, the duplicate notice remains and is updated to add a message along the lines of "thanks for your edit; the community will review to see if it's not a dupe any more" (not those words). The community sees something like "the author edited this post in response to duplicate suggestions" and, for those who can review edits, an invitation to do so.
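
The two edit outcomes above can be sketched as a small Python function. This is a rough model under stated assumptions, not the actual implementation: the `Post` fields, the `handle_author_edit` name, and the returned status strings are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """Hypothetical, simplified post model for illustration."""

    suggestions: list = field(default_factory=list)  # pending duplicate target ids
    history: list = field(default_factory=list)      # post-history entries
    awaiting_review: bool = False                    # edit needs community review


def handle_author_edit(post: Post, author_has_edit_ability: bool) -> str:
    """Apply the effect of an author edit made in response to duplicate suggestions."""
    if not post.suggestions:
        return "no_pending_suggestions"
    if author_has_edit_ability:
        # Trusted author: log the resolution, then clear the notice for all viewers.
        post.history.append(("dupe_suggestions_resolved_by_author", tuple(post.suggestions)))
        post.suggestions.clear()
        return "notice_removed"
    # Author without the Edit ability: the notice stays and reviewers are invited in.
    post.awaiting_review = True
    return "pending_community_review"
```

A "one author-edit resolution per question" limit, if adopted, would be a single extra check at the top of the trusted-author branch.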

Review: problem solved?

Users who can review edits see a notice on the post (similar to the "suggested edit pending" one) that says something like: "This question was suggested as a duplicate of (link) and the author has edited to address the suggestion. (review button)".

Entering the review shows the diff (like for a suggested edit) and includes a link to each suggested duplicate.

The options for the review are "Not a duplicate" and "Still a duplicate".

  • Choosing "still a duplicate" prompts for a comment and is treated like a duplicate vote. If there are multiple duplicate suggestions, the reviewer checks off which ones apply (maybe it's not a dupe of A any more but still is of B).

  • Choosing "not a duplicate" resolves the suggestions -- the question is reset to its "ordinary" state, with the resolution being logged in the post history, and the "possible duplicate" comment thread is archived. (Subsequent duplicate suggestions start over with a new thread.)
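
As a hedged illustration of the two review outcomes, the following Python sketch models the reviewer's choice as the set of suggestions that still apply. The function name `apply_review` and the result keys are assumptions, and the spec does not say what happens to suggestions the reviewer leaves unchecked, so this sketch simply casts no vote on them.

```python
def apply_review(pending: set, still_duplicate_of: set) -> dict:
    """Resolve a "problem solved?" review (hypothetical helper).

    pending            -- ids of the suggested duplicate targets
    still_duplicate_of -- targets the reviewer checked as still applying;
                          an empty set means "Not a duplicate"
    """
    unknown = still_duplicate_of - pending
    if unknown:
        raise ValueError(f"not pending suggestions: {sorted(unknown)}")

    if not still_duplicate_of:
        # "Not a duplicate": reset the question, log it, archive the thread.
        return {
            "state": "ordinary",
            "log_resolution": True,
            "archive_thread": True,
            "duplicate_votes": set(),
        }

    # "Still a duplicate": counts as a duplicate vote for each checked target.
    return {
        "state": "possible_duplicate",
        "log_resolution": False,
        "archive_thread": False,
        "duplicate_votes": set(still_duplicate_of),
    }
```

For example, `apply_review({1, 2}, {2})` models a reviewer who decides the post is no longer a dupe of A (id 1) but still is of B (id 2).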

Reopening

If a post was marked as a duplicate, everybody sees the duplicate notice. Those with the Participate Everywhere ability also get the "disagree" button, like when duplicate votes are still pending. Here the comment is required -- explain why the duplicate status should be removed. The comment is added to a "possibly not a duplicate" thread. The duplicate notice is updated to add something like:

This question might have been incorrectly marked as a duplicate. Community members provided the following feedback: (comment text, link to thread).

Unaddressed issues

  • Retracting votes
  • Third-party edit from someone trying to help -- how does that affect the flow?
  • Vote threshold

  1. Usually questions, but there's no reason an article couldn't be a duplicate. A community that uses articles for sandboxing could mark those as duplicates of the resulting questions, clearly signaling that the sandbox phase is done and linking to the live question.

  2. I think the score threshold -- the net score to mark a duplicate -- should be relatively low, 2 or 3. It should also be a community setting.


2 comment threads

How does reopening work? (2 comments)
Conflict between overview and functional specification (3 comments)
Post
+4
−0

Question closure can leave people feeling judged (as we learned Somewhere Else)

...counter the impression that duplicates are bad

To solve that problem, one needs to address the source. Somewhere Else would instantly make a conclusion like "aha it's the evil community being rude again" and then come up with some misguided system to counter that. But by applying a slight bit of empathy, we can get to the root source:

The people who cast duplicate close votes Somewhere Else are fed up with endless duplicates. Newbies asking the same question over and over again, with little to no research effort made. Therefore the regulars get tired of that behavior and just close the posts without providing much feedback to the person who asked the question.

There exists a somewhat rare phenomenon though: sometimes when a high-quality question that is a dupe gets asked, it is left open long enough for good answers to pop up. When this happens, the new question might actually turn into the "canonical" target for duplicates, and the older, existing duplicates get closed with the new post as target. It's a very good thing when this happens. The old duplicates of diverse quality are not necessarily the best ones.

But most of the time, new questions that are duplicates just get closed with an old post as target, because that's how the system was designed.

Somewhere Else is suffering from the results of this: there are a lot of old posts with canonical status but so-so quality. Such posts also tend to attract a whole lot of answers over time, where everyone and their mother feels inclined to contribute even though they aren't adding anything new. Or, if they are adding something new, they add only that part and not a complete answer. So over time the canonical post "fragments" into several answers and the result as a whole is not very good.

It would be better if these old posts were recompiled into complete answers and one natural way to do that is to close them when something better and more complete shows up. But the duplicate system often doesn't let that happen.

We shouldn't close posts as duplicates unless there exists a high quality duplicate target. If a question has been asked before, then that alone should not be a reason to close it.


1 comment thread

Some good points raised. (2 comments)
Fie wrote over 2 years ago

I think you've made a good observation that adds to the bigger picture here. The problem we have with the 'fragmentation' (and general mess) you describe is that what we'd really like at a certain point is for a wiki-style self-answer writeup to try to distill the problem into the 'best' form to help people asking (and seeking) all of the 'similar' questions. At the same time, that idea makes me uneasy. I think that it would be optimal to focus on preventing this kind of 'mediocre pile-up/stagnation' by drawing more attention to the issue and asking style (see the first thread on my answer), and look into some kind of merging or cleanup system to tidy things up when it seems appropriate.

Rather than pruning questions, the solution to this problem may be to prune partial or otherwise-lacking answers!

Monica Cellio wrote over 2 years ago

I posted an updated proposal based on your and Fie's feedback.