
Welcome to Codidact Meta!

Codidact Meta is the meta-discussion site for the Codidact community network and the Codidact software. Whether you have bug reports or feature requests, support questions or rule discussions that touch the whole network – this is the site for you.

What is the policy for Codidact data use by other, especially for-profit organizations?

+5
−0

SE recently made a deal allowing OpenAI to train their LLMs on community-generated content (user-created questions and answers). Many active SE users, myself included, were rather dismayed about this; some users saw it as a betrayal of trust.

Given that Codidact is a not-for-profit organization, and given that users of Codidact can include their own copyright and license to responses, you might conclude that this could never happen on Codidact.

But is it safe to assume that Codidact does not and will never allow a for-profit organization to crawl community-created content and use it as data to be monetized? If it's not totally safe to assume this, is it then at least safe to assume that, if the Codidact board were in some dismal future to consider making such a deal, they would first consult the actual user community and try to reach a broad consensus?

Last question. If the answer to the first and/or second question is "yes", would it then not be more in line with Codidact policies or guidelines, to use the CC By-NC 4.0 license as default (and leave it up to any user to perhaps change this to CC By-SA 4.0)?


1 comment thread

related: https://meta.codidact.com/posts/291474 (7 comments)

2 answers

+8
−0

IMHO we need to distinguish two things:

  1. It's not really possible per se to prevent for-profit companies from just accessing our web pages and downloading the content from them.

    We can try to block crawlers of known bad-faith actors, as has been suggested before and should, in my personal opinion, very much be implemented. There are probably legal means to defend users' content when somebody fails to abide by the license a post has been published under. However, we need to keep in mind that we are all volunteers and have quite limited resources.

    Also, I'd like to point out that not every for-profit use of our content is necessarily bad for our communities. E.g. we probably want for-profit search engines to crawl our sites so that communities become discoverable.

  2. Actively enabling such community-adverse conduct by the Codidact Foundation is a different matter, of course, and IMO something that should not happen.

    There are good reasons we chose to incorporate as a non-profit, and one of them is that it legally binds us to support the interests of the community (instead of monetary interests). It's always been our principle that Codidact is "Community First" and "by the community, for the community", and in that spirit any major decision needs consultation with the community and generally also its agreement.

Last question. If the answer to the first and/or second question is "yes", would it then not be more in line with Codidact policies or guidelines, to use the CC By-NC 4.0 license as default (and leave it up to any user to perhaps change this to CC By-SA 4.0)?

IIRC this can be changed per community, so this should probably be a per-community decision. Also, FWIW, you can set a default license for your posts in your profile preferences.


+4
−0

We have just posted an official statement from the board of directors. Key part (click through for the whole thing):

Companies such as OpenAI and Google are forming partnerships with some organizations that host user-contributed content, for use of that content in for-profit ventures such as building Large Language Models. We share the concerns we have heard from our community about how these companies will use such content. We do not see how such a partnership could benefit our users, and we prioritize the needs of our users and the online community in general.

If in the future we were to receive a proposal that we feel could benefit the community, we commit to discussing it openly and transparently with our users before proceeding. We have not received any proposals of this type so far.
