ForumPostersUnion.com


   


Spiders, Crawlers and web robots: Intelligence on search engine spider bots and their identification, bad bots from spam botnets, content scrapers, tools to identify web robots, and blocking malicious bots.

  #1  
02-07-2010, 05:20 PM
AnthonyCea
Publisher
 
Join Date: Feb 2006
Location: Deep South, USA
Posts: 29,611
grvcrawler/0.3

174.129.158.205 ec2-174-129-158-205.compute-1.amazonaws.com
grvcrawler/0.3


grvcrawler/0.3 is a bot running on the same IP that the Omgili.com forum search engine crawler has been run on, so I assume this might be a name change for that spider bot. However, its user agent carries no link to a spider bot identification page, which would give webmasters and server administrators some transparency, so this bot will be banned by the uninformed. This is a great way for a botmaster or new search engine to shoot themselves in the foot.
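
For anyone who wants to automate the "does the user agent link to an ID page" check, here is a minimal sketch. It assumes that a spider ID link is simply any http(s) URL embedded in the user agent string (the common "+http://..." convention); the function name `spider_id_url` is made up for illustration, and the check says nothing about whether the linked page is actually informative.

```python
import re
from typing import Optional

# Assumption: a "spider ID link" is any http(s) URL embedded in the user agent,
# as in the common "+http://example.com/bot.html" convention.
_URL_IN_UA = re.compile(r"https?://[^\s;)]+", re.IGNORECASE)


def spider_id_url(user_agent: str) -> Optional[str]:
    """Return the first URL found in the user agent, or None if there is none."""
    match = _URL_IN_UA.search(user_agent)
    return match.group(0) if match else None
```

Called on `"grvcrawler/0.3"` this returns `None`, while a user agent such as `"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"` yields the embedded URL.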
  #2  
02-07-2010, 06:54 PM
miqrogroove
Senior Member
 
Join Date: Dec 2008
Posts: 306
You haven't blacklisted AWS yet? :P At what point do you say enough is enough?
  #3  
02-07-2010, 06:57 PM
AnthonyCea
Publisher
 
Join Date: Feb 2006
Location: Deep South, USA
Posts: 29,611
Well, if I ban the entire data center I would most likely block Alexa and all of Amazon too. The way things are going, I think I will soon have blocked every IP c-network they have.

Once we get our new firewall system finalized I may have a better way to do things, but it is not done yet.
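
To illustrate the c-network approach mentioned above, here is a rough sketch using Python's standard `ipaddress` module. It assumes "c-network" means the /24 block containing the offending address; the names `c_network`, `banned` and `is_banned` are hypothetical, and this is not the firewall system referred to in the post.

```python
import ipaddress


def c_network(ip: str) -> ipaddress.IPv4Network:
    """Return the /24 ("class C" sized) network that contains the given IPv4 address."""
    # strict=False lets us pass a host address and have the host bits masked off.
    return ipaddress.ip_network(f"{ip}/24", strict=False)


# Hypothetical ban list holding the one network discussed in this thread.
banned = {c_network("174.129.158.205")}


def is_banned(ip: str) -> bool:
    """True if the address falls inside any banned /24."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in banned)
```

Banning per /24 blocks only 256 addresses at a time, which is why it avoids taking out the whole data center in one go.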
  #4  
02-08-2010, 05:52 PM
AnthonyCea
Publisher
 
Join Date: Feb 2006
Location: Deep South, USA
Posts: 29,611
174.129.158.205 ec2-174-129-158-205.compute-1.amazonaws.com
grvcrawler/0.3


grvcrawler was back again tonight, quickly scanning our thread content. In keeping with our policy of banning bots that do not include a link back to a comprehensive spider ID page in their user agent, the IP c-network was banned.

Sorry guys, but you will have to do a much better job of providing transparency before we will allow you to continue scanning content.
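
The policy described above could be applied to a batch of bot visits roughly as follows. This is only a sketch under stated assumptions: the input is a list of (IP, user agent) pairs already filtered down to bot traffic, "no ID link" means no http(s) URL anywhere in the user agent, and the c-network is taken to be the /24 around the IP. The `visits` sample and the `networks_to_ban` helper are hypothetical; only the grvcrawler entry comes from this thread.

```python
import ipaddress
import re
from collections import Counter

# Hypothetical sample of (ip, user_agent) pairs; only the grvcrawler entries
# reflect the thread, the last one is an invented well-behaved bot.
visits = [
    ("174.129.158.205", "grvcrawler/0.3"),
    ("174.129.158.205", "grvcrawler/0.3"),
    ("192.0.2.10", "ExampleBot/1.0 (+http://example.com/bot.html)"),
]


def networks_to_ban(visits):
    """Count hits per /24 for bot visits whose user agent carries no URL."""
    hits = Counter()
    for ip, user_agent in visits:
        if re.search(r"https?://", user_agent, re.IGNORECASE):
            continue  # the bot links to some page, so the policy does not apply
        hits[ipaddress.ip_network(f"{ip}/24", strict=False)] += 1
    return hits
```

Counting hits per network, rather than just collecting a set, makes it easier to see which blocks keep coming back before deciding on a ban.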
All times are GMT -7.