Re: Forum Resource Limits

Posted: Fri Jan 31, 2025 1:07 pm
by NormanDunbar
I wonder if this article on "poisoning" AI crawlers that don't follow the rules in robots.txt would be a fun punishment for them?

https://arstechnica.com/tech-policy/202 ... obots-txt/


Cheers,
Norm.

Re: Forum Resource Limits

Posted: Fri Jan 31, 2025 1:53 pm
by dilwyn
Well, the companies that complain about red tape and campaign to cut it are usually the very ones whose misbehaviour causes the red tape in the first place.

While I don't normally approve of this kind of thing, taking revenge on companies and bots that behave like this is a perfectly understandable response when you consider what they do, especially if you've been on the receiving end. Growing spikes is a legitimate natural defence.

Re: Forum Resource Limits

Posted: Fri Jan 31, 2025 5:36 pm
by pjw
NormanDunbar wrote: Fri Jan 31, 2025 1:07 pm I wonder if this article on "poisoning" AI crawlers that don't follow the rules in robots.txt would be a fun punishment for them?

https://arstechnica.com/tech-policy/202 ... obots-txt/

Cheers,
Norm.
I do what I can to poison the well, but only in a minor way. Today I added a robots.txt to knoware.no. It says:

Code:

User-agent: *
Disallow: /
Allow: /htm/
I'm hoping this means: don't go through my actual stuff, but feel free to trawl the indexes.
However, all the indexes point to the stuff I don't want them poking around in. Is what I say what I mean? I couldn't get a clear answer.

Re: Forum Resource Limits

Posted: Fri Jan 31, 2025 11:01 pm
by NormanDunbar
Hi Per,

Some info here: https://www.cloudflare.com/en-gb/learni ... obots-txt/

Disallow: / prevents good bots from accessing any of your site.
Allow: /htm/ allows them access to the entire /htm/ directory and anything beneath it.
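
To make the interaction of those two rules concrete, here is a minimal Python sketch of the "longest matching path wins" precedence described in RFC 9309, which is the rule well-behaved crawlers are expected to follow. It is hand-rolled so the matching logic is visible, it ignores wildcards and per-bot groups, and the example paths are made up for illustration rather than taken from knoware.no.

```python
# Rules from the robots.txt quoted earlier, as (directive, path prefix) pairs.
RULES = [
    ("disallow", "/"),
    ("allow", "/htm/"),
]


def is_allowed(path, rules=RULES):
    """Longest matching prefix wins; on a tie, allow wins; no match means allowed."""
    best_len = -1
    allowed = True
    for directive, prefix in rules:
        # An empty prefix matches nothing; skip rules that do not apply to this path.
        if not prefix or not path.startswith(prefix):
            continue
        is_allow = directive == "allow"
        if len(prefix) > best_len or (len(prefix) == best_len and is_allow):
            best_len = len(prefix)
            allowed = is_allow
    return allowed


# Hypothetical paths, purely for illustration.
for path in ("/htm/index.htm", "/files/program.zip", "/"):
    print(path, is_allowed(path))
# /htm/index.htm True
# /files/program.zip False
# /
# is reported as False as well, because only "Disallow: /" matches it.
```

Under this reading, a compliant crawler may fetch the index pages under /htm/, but each link it finds there is checked against the same rules, so links pointing outside /htm/ should still be skipped. Crawlers that ignore robots.txt are, of course, unaffected.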

Cheers,
Norm.

Re: Forum Resource Limits

Posted: Sat Feb 01, 2025 3:28 pm
by pjw
NormanDunbar wrote: Fri Jan 31, 2025 11:01 pm <>
Some info here: https://www.cloudflare.com/en-gb/learni ... obots-txt/

Disallow: / prevents good bots from accessing any of your site.
Allow: /htm/ allows them access to the entire /htm/ directory and anything beneath it.
<>
Thanks Norm, that helped a little.

But working out which bots are good or bad would take a lot more time than I'm willing to spend, so it seems a hopeless task! And the bad guys will, as usual, not give a sh*t whatever I do.

I ended up scraping the robots.txt files of a few sites I trust, weeding out duplicates and combining them. My main objective, apart from not feeding the beast, is to minimise unwanted traffic so as to spare the server.
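
For anyone wanting to try something similar, here is a rough Python sketch of that kind of merge. It is only one possible approach, not the exact procedure used here: it assumes the goal is to collect every user-agent that the source files block site-wide with "Disallow: /" and to emit one de-duplicated list. The two sample texts and the function names are invented for the example.

```python
def blocked_agents(robots_txt):
    """Return the user-agents that a robots.txt text blocks site-wide (Disallow: /)."""
    blocked = set()
    current = []      # user-agents of the group being read
    in_rules = False  # True once the current group's rule lines have started
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value)
        elif field in ("allow", "disallow"):
            in_rules = True
            if field == "disallow" and value == "/":
                blocked.update(current)
    return blocked


def combine(*texts):
    """Merge several robots.txt texts into one de-duplicated block list."""
    agents = {}
    for text in texts:
        for agent in blocked_agents(text):
            agents.setdefault(agent.lower(), agent)  # names compare case-insensitively
    agents.pop("*", None)  # never copy a blanket ban on every bot
    lines = []
    for key in sorted(agents):
        lines += [f"User-agent: {agents[key]}", "Disallow: /", ""]
    return "\n".join(lines)


# Invented sample inputs standing in for files fetched from trusted sites.
site_a = "User-agent: GPTBot\nUser-agent: CCBot\nDisallow: /\n"
site_b = "User-agent: CCBot\nDisallow: /\n\nUser-agent: *\nDisallow: /private/\n"

print(combine(site_a, site_b))
```

The result only lists bots that at least one source blocks completely; partial rules such as "Disallow: /private/" are site-specific and are deliberately left out.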

Re: Forum Resource Limits

Posted: Sat Feb 08, 2025 5:45 am
by NormanDunbar
Morning all,

Just a heads up: I got the "Resource limit reached" error twice this morning at 05:43. There was only me and 0 guests on the forum at the time!

Cheers,
Norm.

Re: Forum Resource Limits

Posted: Sat Feb 08, 2025 9:50 am
by RalfR
What are you doing on the Internet at this time of day? ;)

By the way, I had the same error a minute ago.

Re: Forum Resource Limits

Posted: Sat Feb 08, 2025 10:10 am
by XorA
RalfR wrote: Sat Feb 08, 2025 9:50 am What are you doing on the Internet at this time of day? ;)

By the way, I had the same error a minute ago.
There are some changes coming which should improve this. Please wait for an announcement from the admins!

Re: Forum Resource Limits

Posted: Sat Feb 08, 2025 9:27 pm
by NormanDunbar
RalfR wrote: Sat Feb 08, 2025 9:50 am What are you doing on the Internet at this time of day? ;)
I was surprisingly wide awake at 03:45 this morning. No reason, I just was. :( :( :( :( :(

Cheers,
Norm.

Re: Forum Resource Limits

Posted: Sat Feb 08, 2025 9:42 pm
by RalfR
NormanDunbar wrote: Sat Feb 08, 2025 9:27 pm I was surprisingly wide awake at 03:45 this morning. No reason, I just was. :( :( :( :( :(
I know that...unfortunately.