Bots and Scrapers

I am not a big fan of bots and scrapers. About half of my site traffic comes from them, but I have been using PowerShell recently to help automate some tasks, like updating the site's SSL certificates.

Towards the end of each month I spend a lot of the day updating the list of webrings I keep, and I thought maybe I could save some time by automating that. All the script does is visit the ring page of each webring and collect a little bit of information. Normally I'd just leave it at that, but then I realised there was another use for it.
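The ring-page visit looks roughly like this in PowerShell. Treat it as a sketch rather than the exact script: the ring addresses are made-up placeholders and the link matching is deliberately loose.

    # A sketch, not the exact script: the ring pages below are made-up examples.
    $ringPages = @(
        'https://example-webring.net/members',
        'https://another-ring.example/list'
    )

    foreach ($page in $ringPages) {
        try {
            $response = Invoke-WebRequest -Uri $page -UseBasicParsing
            # Ring pages normally list their members as ordinary links,
            # so collect every outbound URL on the page.
            $members = $response.Links.href | Where-Object { $_ -match '^https?://' }
            [PSCustomObject]@{
                Ring        = $page
                MemberCount = @($members).Count
            }
            $members | Out-File -Append 'members.txt'   # keep the list for later
        }
        catch {
            Write-Warning "Could not fetch ${page}: $_"
        }
    }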

Only about 20% (one in five) of webrings actually work properly. For whatever reason, member sites don't add the code, or they forget about it when rewriting the page. The script can easily be altered to visit each of the member sites and check that the code is there, so I thought I'd share what I did. Or the ring manager can use something similar in Bash with the sed and grep commands.
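In PowerShell, the member-site check is only a small change on top of the above. This is a sketch, assuming the members.txt from the previous step (one URL per line) and that finding the ring's domain anywhere in the page source is a good enough sign the code is present; the domain here is a placeholder.

    # Assumes members.txt (one URL per line) and a placeholder ring domain.
    $ringDomain = 'example-webring.net'
    $members    = Get-Content 'members.txt'

    foreach ($site in $members) {
        try {
            $html    = (Invoke-WebRequest -Uri $site -UseBasicParsing).Content
            $hasCode = $html -match [regex]::Escape($ringDomain)
        }
        catch {
            $hasCode = $false   # site down or blocking the request
        }
        [PSCustomObject]@{
            Site    = $site
            HasCode = $hasCode
        }
    }

Matching on the ring's domain rather than the exact snippet tolerates the small changes people make to fit their page styles.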

There are plenty of other scrapers around, but they can be a bit awkward to use, as people (like me) change the code a little to suit their page styles. PowerShell is available for both Linux and Macs.

6 Likes

I wrote too much in one post so the bottom of the last one is missing.

I went on to say that it's not difficult to write your own on Linux and Macs using the sed and grep commands in Bash.
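Something along these lines, again assuming a members.txt with one URL per line and that the ring's domain (a placeholder below) is enough to spot the code:

    #!/usr/bin/env bash
    # Rough Bash sketch: check each member page for the ring's domain.
    ring_domain="example-webring.net"

    # sed can pull the member URLs out of the ring page in the first place:
    # curl -sL 'https://example-webring.net/members' \
    #   | sed -n 's/.*href="\(https\?:\/\/[^"]*\)".*/\1/p' > members.txt

    while read -r site; do
        if curl -sL "$site" | grep -qi "$ring_domain"; then
            echo "ok       $site"
        else
            echo "missing  $site"
        fi
    done < members.txt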

2 Likes

This is super interesting, thanks for sharing! There are a lot of great applications for web scrapers, it’s a shame they get used so often for nefarious purposes.

2 Likes

It’s amazing how many bots will visit a site. I run AWStats on the server logs and here’s what it found for last month.

I really must settle down one day and see what some of these are doing.

1 Like