don’t they already?
Whenever I looked for old Reddit threads on the Internet Archive, they never showed up.
Protect against AI scraping that they can’t monetize, even though it uses none of their own server time
But not LLM training 🤔
This could potentially destroy existing archived data
How so? Do archive services not also archive content from linked CDNs?
Maybe I’m mistaken but I have heard the Internet Archive applies robots.txt retroactively