Comments on Proposal: tool for user-requested import of a single question and its answers from SE

Post

Proposal: tool for user-requested import of a single question and its answers from SE

+8
−1

We have a tool for doing bulk imports of questions (and answers) from SE. It has several limitations:

  • Running it requires direct DB access.
  • The resulting data is sometimes wrong in various ways -- duplicates, answers not getting wired up to the right questions, something that blocks voting sometimes until something gets tweaked in the DB (I don't know why), probably other things.
  • Special characters/encodings tend to get messed up; all the Hebrew on Judaism and at least some of the MathJax on Scientific Speculation had to be hand-corrected.
  • It's resource-intensive.
  • And did I mention that it requires direct DB access, meaning only a very small number of people can do it?

In addition, we've found that bulk imports have not served us well on the communities that used them (Writing, Outdoors, Scientific Speculation). Many of us feel that bringing in a big pile of Q&A from SE, when there's no specific request and the community isn't prepared to moderate it all here, is counter-productive. This is why, on Judaism, we asked people to make specific requests, of which we've processed about 20. Even those 20 caused enough problems that we haven't done a second run.

I'd like to have a tool that imports one question and its answers, that users can directly invoke (perhaps gated by an ability and almost certainly rate-limited). People might do this to bring over their own work, or to add an answer here, or because it could then be a target for duplicates for questions that have been asked here. These are good use cases that are different from the use cases that drove the creation of the bulk-import tool.[1]
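To make the rate-limiting idea concrete, here is a minimal sketch of a per-user sliding-window limiter. Everything in it is hypothetical: the class name `ImportRateLimiter`, the limits, and the in-memory storage are assumptions for illustration, not anything that exists in the Codidact software.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class ImportRateLimiter:
    """Hypothetical sliding-window limiter: at most max_requests per user per window."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock  # injectable so the limiter can be tested without sleeping
        self._requests: Dict[int, Deque[float]] = defaultdict(deque)

    def allow(self, user_id: int) -> bool:
        """Return True and record the request if the user is under the limit."""
        now = self._clock()
        recent = self._requests[user_id]
        # Forget requests that have aged out of the window.
        while recent and now - recent[0] >= self._window:
            recent.popleft()
        if len(recent) >= self._max:
            return False
        recent.append(now)
        return True
```

A real deployment would presumably keep these counters somewhere persistent and might also check the user's abilities first; the sketch only shows the counting logic.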

When a user requests an import (through some tool into which the user drops an SE URL), the following things should happen:
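As a rough illustration of the "drop in an SE URL" step, the sketch below pulls the site host and question id out of a question URL. The names `ImportRequest` and `parse_se_url` are made up for this example, and it assumes question URLs of the form `/questions/<id>/...` or `/q/<id>`; it does not check that the host really is an SE site.

```python
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class ImportRequest(NamedTuple):
    site: str         # host name taken from the URL
    question_id: int  # numeric id taken from the URL path


# Assumed path shapes: /questions/<id>/<slug> and the short form /q/<id>.
_QUESTION_PATH = re.compile(r"^/(?:questions|q)/(\d+)(?:/|$)")


def parse_se_url(url: str) -> Optional[ImportRequest]:
    """Return the site and question id, or None if the URL is not a question link."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return None
    match = _QUESTION_PATH.match(parsed.path)
    if match is None:
        return None
    return ImportRequest(site=parsed.hostname, question_id=int(match.group(1)))
```

Rejecting anything that doesn't look like a question link up front keeps bad input from spending any of the API budget.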

  • The question and all answers should be brought over. We don't care about comments or full edit history. (We can add a link to the edit history on SE to our post history but needn't recreate all those events here.)
  • For any post (Q or A) where the SE account is associated with an existing Codidact user, make that user the owner. This applies to both "native" users (accounts that a human created) and placeholder users created from previous imports.
  • For any post where the SE account is not associated with an existing Codidact user, create a placeholder user, make the user the owner, and add the "imported" notice with attribution/license info. (See examples on any of the communities I've mentioned that have imports.)
  • If the import is immediate, take the requestor to the imported post page. If the import is not immediate, notify the requestor with a link when it's done. (It's fine if imports aren't immediate for API-budget reasons, but I assume it would happen within several hours.)
  • Notify native (not placeholder) Codidact users when posts of theirs are imported. (It's not nice to silently attribute stuff to people, even if it's their work and they probably don't mind. Maybe they want to make edits. Maybe they forgot about that old wrong answer and want to delete it. Etc.)
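The ownership and notification rules in the list above can be sketched roughly as follows. This is an illustration only: the data classes, the `plan_import` function, and the SE-id-to-user mapping are all hypothetical and do not reflect the actual Codidact schema. The proposal doesn't say whether posts assigned to a placeholder left over from an earlier import also get the "imported" notice, so the sketch adds it only where a new placeholder is created.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class User:
    name: str
    is_placeholder: bool


@dataclass
class SEPost:
    se_user_id: int
    se_user_name: str
    body: str
    is_question: bool


@dataclass
class ImportedPost:
    owner: User
    body: str
    is_question: bool
    imported_notice: bool  # attribution/license notice shown on the post


@dataclass
class ImportResult:
    posts: List[ImportedPost] = field(default_factory=list)
    notify: List[User] = field(default_factory=list)  # native users to notify


def plan_import(se_posts: List[SEPost], users_by_se_id: Dict[int, User]) -> ImportResult:
    """Assign an owner to each post and collect the native users to notify."""
    result = ImportResult()
    new_placeholder_ids: Set[int] = set()
    for post in se_posts:
        owner = users_by_se_id.get(post.se_user_id)
        if owner is None:
            # No linked account: create a placeholder and remember it, so other
            # posts by the same SE user in this import share one placeholder.
            owner = User(name=post.se_user_name, is_placeholder=True)
            users_by_se_id[post.se_user_id] = owner
            new_placeholder_ids.add(post.se_user_id)
        result.posts.append(
            ImportedPost(
                owner=owner,
                body=post.body,
                is_question=post.is_question,
                imported_notice=post.se_user_id in new_placeholder_ids,
            )
        )
        # Native (human-created) accounts are told about the import, once each.
        if not owner.is_placeholder and not any(u is owner for u in result.notify):
            result.notify.append(owner)
    return result
```

For example, importing a question by a linked native user plus two answers by an unknown SE user would yield three posts, one new placeholder owning both answers, and a single notification for the native user.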

I'm not proposing to bump the post. If the people involved get notified and anybody edits anything it will get bumped then; otherwise, there's probably no reason to move older posts to the top of the list.

TopAnswers has a tool like this. While our implementation languages, tech stacks, and database schemas are different, so there's little that could be reused, we[2] could look at their implementation to see how they handle things like data corruption or staying within API limits. I don't remember if their imports are immediate or queued or, if queued, if there's a notification.

  1. If you see a question Somewhere Else that has no good answers, and you want to answer it here, then it's better to just re-ask the question here (or if it's yours, copy it directly). Imports should be for cases where there are good answers but we want to make improvements or otherwise use them here.

  2. For values of "we" that include PHP fluency, sadly not a set that includes me.


1 comment thread

General comments (6 comments)
manassehkatz‭ wrote over 3 years ago

I have the PHP fluency, just don't have the time to do this. But I do think it is a very good idea. In particular it could help for people who want to move from different sites - e.g., there are several different sites that might match to Software Development.

Lundin‭ wrote over 3 years ago · edited over 3 years ago

While this might be useful, I'd still be very restrictive with what to import. I've been re-reading various canonical posts on SE that I've used as duplicate targets there for years, and found that most of them are not actually that good. They suffer from "fragmented answers" where answer A says one good thing and answer B says another, but they ought to be merged in order to actually create a really good answer. ->

Lundin‭ wrote over 3 years ago · edited over 3 years ago

So a complete re-write of these canonical "patchworks" would be ideal, not just importing them out of habit. Codidact offers a fresh start to do just that. Also, some of the old canonical posts might be outdated, particularly when it comes to tech communities.

Monica Cellio‭ wrote over 3 years ago

@Lundin agreed; people need to be judicious in using the tool. Sometimes some of the answers are good (or good starting points for edits) and others should be deleted, which the community could do here but can't do there.

Lundin‭ wrote over 3 years ago

@Monica Cellio But because of licensing, we can't really edit them into shape during import, right? Like merging two answers into one. It's either grab the answer(s) as-is or leave them be?

Monica Cellio‭ wrote over 3 years ago

@Lundin not as part of the import process, no. But once the post is here, people can edit to merge answers, preserving attribution. This would be a good use case for the "community wiki" proposal, too, as it's truly a combined effort that no one person gets "credit" for.