What is the policy for Codidact data use by other, especially for-profit organizations?
SE recently made a deal allowing OpenAI to train its LLMs on community-generated content (user-created questions and answers). Many active SE users, myself included, were rather dismayed about this; some saw it as a betrayal of trust.
Given that Codidact is a not-for-profit organization, and given that Codidact users can attach their own copyright and license to their posts, you might conclude that this could never happen on Codidact.
But is it safe to assume that Codidact does not and will never allow a for-profit organization to crawl community-created content and use it as data to be monetized? If it's not totally safe to assume this, is it then at least safe to assume that if the Codidact board were, in some dismal future, to consider making such a deal, it would first consult the actual user community and try to reach a broad consensus?
Last question. If the answer to the first and/or second question is "yes", would it then not be more in line with Codidact policies or guidelines to use the CC BY-NC 4.0 license as default (and leave it up to any user to perhaps change this to CC BY-SA 4.0)?
2 answers
We have just posted an official statement from the board of directors. Key part (click through for the whole thing):
Companies such as OpenAI and Google are forming partnerships with some organizations that host user-contributed content, for use of that content in for-profit ventures such as building Large Language Models. We share the concerns we have heard from our community about how these companies will use such content. We do not see how such a partnership could benefit our users, and we prioritize the needs of our users and the online community in general.
If in the future we were to receive a proposal that we feel could benefit the community, we commit to discussing it openly and transparently with our users before proceeding. We have not received any proposals of this type so far.
IMHO we need to distinguish two things:
- It's not really possible per se to prevent for-profit companies from just accessing our web pages and downloading the content from them.
  We can try to block crawlers of known bad-faith actors, as has been suggested before and should, in my personal opinion, very much be implemented. There are probably legal means to defend users' content when somebody fails to abide by the license a post has been published under. However, we need to keep in mind that we are all volunteers and have quite limited resources.
  Also, I'd like to point out that not every for-profit use of our content is necessarily bad for our communities. E.g., we probably want for-profit search engines to crawl our sites so that communities become discoverable.
- Actively enabling such community-adverse conduct by the Codidact Foundation is a different matter, of course, and IMO something that should not happen.
  There are good reasons we chose to incorporate as a non-profit, and one of them is that it legally binds us to support the interests of the community (instead of monetary interests). It's always been our principle that Codidact is "Community First" and "by the community, for the community", and in that spirit any major decision needs consultation with, and generally also agreement by, the community.
Last question. If the answer to the first and/or second question is "yes", would it then not be more in line with Codidact policies or guidelines to use the CC BY-NC 4.0 license as default (and leave it up to any user to perhaps change this to CC BY-SA 4.0)?
IIRC this can be changed per-community, so this should probably be a per-community decision. Also, FWIW, you can set a default license for your posts in your profile preferences.