Disclaimer: I don’t have a background in computer science.
I recently heard about a lightweight open-source software called Anubis.
Anubis is designed to stop AI crawlers that download a lot of data to train artificial intelligence models.
Several websites have deployed Anubis:
https://gitlab.gnome.org/GNOME
10 websites have deployed it. 10 websites. Out of millions of websites.
My question is extremely simple.
If this software is so damn great, why isn’t it everywhere?
Seriously. Why isn’t it used on Lemmy? On Wikipedia? On CBC?
Because it’s not a perfect solution and other sites have other solutions in place.
- not every website has the problem it solves
- not everyone who does likes the solution it offers
- web development moves fast but not that fast
Weird that this is getting downvotes. It’s a legit question.
My guess is it’s the tone/wording that implies that OP doesn’t think it’s great.
I agree that it’s a legit question, though.
People get sensitive when religion is mentioned!
Set is so much cooler.
Pffff stop with your superstitions. Anybody serious knows that Teutates is the best.
Seems like some crawlers already know how to bypass it: https://social.anoxinon.de/@Codeberg/115033782514845941
Yeah. It’s an arms race. Any technological defense will be countered eventually. In the long run, I’m not sure a technical defense like this will be sufficient - we’ll need legal defenses that are enforced.
It isn’t so great, and it is everywhere.
I believe (far too) much of the commercial world relies on Cloudflare to solve that problem.
And as for Wikipedia, any AI trainer worth their salt should know that they don’t need to crawl it, because you can actually just download the whole Wikipedia dataset.