Welcome to Codidact Meta!

Codidact Meta is the meta-discussion site for the Codidact community network and the Codidact software. Whether you have bug reports or feature requests, support questions or rule discussions that touch the whole network – this is the site for you.

Comments on Could Codidact provide a data dump?

Post

Could Codidact provide a data dump?

+4
−0

FR: Could Codidact provide a data dump (e.g., an archive with all Q&A, comments, etc.)? It could be hosted on https://archive.org/ if that would save money.


1 comment thread

On demand or regular backups? (7 comments)
trichoplax wrote 9 days ago

Were you thinking of something you could request at any time, or a daily/weekly backup available to the public?

Depending on how much of the data you want at once (everything or something narrower like just the comments for a particular question) there may be some overlap with work on the requirements for a Codidact API. If you want to add an answer there too, you're still early as implementing it hasn't started yet.

Not suggesting a duplication - I can imagine both an API and data dump.

Thanks trichoplax, a regular backup available to the public.

Zoe wrote 9 days ago

Quarterly would probably be enough at the scale Codidact currently operates at. Monthly might be nice once post volume increases, but anything more frequent than that is arguably unnecessary regardless of volume, especially since the computational requirements grow with database size, particularly for compression, and even more so if it can't be multithreaded. For the record, Wikipedia (which has an insanely good data dump system [1]) does 1-2 per month with no real upper limit, with one per month as the minimum and two per month as the usual practice.

[1] Seriously, poke around their data dump site and documentation: they even include logs for each site. This would be absolute overkill for an initial implementation, of course, but worth keeping in mind for Some Point™ in the future.

If we're being futuristic and all, we might even get to include images in the dumps! :O

Zoe wrote 9 days ago

Note that an image data dump would be significantly larger, since image files are typically already compressed and shrink very little inside an archive. An image data dump therefore has very different considerations and drastically higher upload requirements.

trichoplax wrote 8 days ago

Good point. I guess even if we had an image data dump, we'd want to make sure that the much smaller version without images is still available for people who don't need the images and have a limited download speed / monthly limit.
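
To illustrate the compression point raised in this thread, here is a minimal sketch in Python. It is not based on Codidact's actual data or tooling: the repeated sentence is only a stand-in for post and comment text, random bytes are an assumed stand-in for already-compressed image files, and `compression_ratio` is a hypothetical helper written for this example.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower means smaller)."""
    return len(zlib.compress(data, 9)) / len(data)


# Stand-in for post and comment text. A repeated sentence exaggerates the
# redundancy of real text, but real prose also compresses well.
text_like = ("Could Codidact provide a data dump? " * 2000).encode("utf-8")

# Assumed stand-in for image files: random bytes behave much like data that
# has already been compressed, so a second compression pass gains nothing.
image_like = os.urandom(len(text_like))

text_ratio = compression_ratio(text_like)
image_ratio = compression_ratio(image_like)

print(f"text-like data:  {text_ratio:.3f} of original size")
print(f"image-like data: {image_ratio:.3f} of original size")
```

The text-like input shrinks to a small fraction of its size while the image-like input stays essentially the same size, which is one reason a text-only dump and a dump with images would likely be best offered as separate downloads.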