Codidact Meta

Olin Lathrop wrote about 1 year ago

An activity should be prohibited only if it is harmful, but you haven't provided any evidence that crawling the site for AI training data is harmful. Also, we can't know what a crawler does with the data it collects. You can either prohibit crawling or allow it; you can't prohibit crawling for AI training data while allowing crawling to index web pages, gather word-usage frequencies, or serve any number of other purposes we might never know about. Some crawlers might be "nice" in that they state their purpose and honor a request not to crawl for a particular use, but those are probably the ones least likely to cause whatever problem you are trying to avoid.
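
To make the point about "nice" crawlers concrete, here is a minimal sketch of how a robots.txt rule is evaluated, using Python's standard `urllib.robotparser`. The crawler names `ExampleAIBot` and `SomeOtherBot`, the robots.txt contents, and the URL are all made up for illustration; nothing here reflects Codidact's actual configuration. Rules are keyed on the name a crawler declares for itself, not on what it does with the data, and the check only happens if the crawler chooses to run it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt. "ExampleAIBot" is an invented crawler name.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""

# Hypothetical page URL used only for the check below.
URL = "https://example.com/posts/1"


def allowed(user_agent: str, url: str) -> bool:
    """Return what a well-behaved crawler would conclude from ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


# A crawler that honestly identifies itself is told to stay out.
print(allowed("ExampleAIBot", URL))

# Any crawler using a different name falls through to the wildcard rule,
# whatever it intends to do with the content.
print(allowed("SomeOtherBot", URL))
```

Under these assumptions the first check is refused and the second is permitted, so the file only filters crawlers that identify themselves truthfully and decide to comply.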

Lundin wrote about 1 year ago

Olin Lathrop: The main harm in the case of GenAI would be that it steals licensed content, bakes it into the training, and then uses that stolen content without any attribution to the original author (since the AI itself normally doesn't even know where its training data came from). Or worse: it uses the content in a hallucination, where half of the AI output is stolen from an original author and the rest is complete nonsense and lies.

Comments on What can be done to block Codidact content from getting used by crawlers/for AI training?
