How to Block SEMrushBot From Crawling Your Website

  • Thread starter EmeraldHike
EmeraldHike

Active member
  • #1
I was recently looking through my website's log files and noticed that SEMrushBot is crawling my site like crazy. It's out of control and I am concerned that it's using up too much bandwidth. I'm afraid that I'm going to go over my limit and my web hosting company is going to bill me for it. What's the best way to block this bot? I am not using them for anything and I have never used their service. Personally, I find their aggressive crawling very rude. Thank you.
 
15Katey

Active member
  • #2
I have battled SEMrushBot crawling my sites in the past. They're very aggressive and so very annoying. Most site owners don't know that they need to go in and analyze their log files, so how would they ever know that a crawler is sucking up all their bandwidth? You are correct when you say that they are rude. I hate crawlers like this.

Okay, here's how you do it. If you aren't using any of their services, you can go ahead and safely block their bots or crawlers from accessing your website. If you are using their services, such as their backlink audit tool or some other SEO tool that they offer, you shouldn't block them at all. For this post, I'll assume you want to block them.

In your robots.txt file, add these lines. This first one will block them completely:

User-agent: SemrushBot
Disallow: /

This one will tell them to slow way down with their crawling:

User-agent: SemrushBot
Crawl-delay: 60

If you would only like to block their backlink audit tool but allow the rest of their crawlers to access your site, add these lines:

User-agent: SemrushBot-BA
Disallow: /
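If you want to sanity-check the rules before uploading them, here's a minimal sketch using Python's built-in urllib.robotparser. The example.com URL and the helper names are just placeholders I made up. Keep in mind this only shows how Python's parser reads the file; it says nothing about whether SEMrushBot itself will honor it.

```python
from urllib.robotparser import RobotFileParser

# The two rule sets from this post, as they would appear in robots.txt.
ROBOTS_BLOCK = """\
User-agent: SemrushBot
Disallow: /
"""

ROBOTS_SLOW = """\
User-agent: SemrushBot
Crawl-delay: 60
"""

# Hypothetical URL, used only for illustration.
TEST_URL = "https://example.com/some-page"


def load_rules(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(robots_txt: str, bot_name: str, url: str = TEST_URL) -> bool:
    # Pass the short bot token ("SemrushBot"), not the full
    # "Mozilla/5.0 (compatible; ...)" string: the parser only looks at
    # the text before the first "/" when matching User-agent lines.
    return load_rules(robots_txt).can_fetch(bot_name, url)


def crawl_delay(robots_txt: str, bot_name: str):
    """Return the Crawl-delay for the bot, or None if none is set."""
    return load_rules(robots_txt).crawl_delay(bot_name)
```

Calling is_allowed(ROBOTS_BLOCK, "SemrushBot") tells you whether the first rule set shuts the bot out, and crawl_delay(ROBOTS_SLOW, "SemrushBot") shows the delay the second one asks for.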

I've heard about people having some huge bandwidth overages because of SEMrushBot. This is appalling to hear. I've also heard about robots.txt not working, or that it takes up to 100 requests or two weeks for those changes to take effect with this crawler. If you're getting slammed with bandwidth overages, the last thing you want to do is wait two weeks for a block to take effect.
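One way to see whether a block is actually working is to tally the bot's requests and bytes per day from your access log. Below is a rough sketch, assuming your server writes the common Apache/Nginx "combined" log format; if your format differs, the regular expression will need adjusting. The function name and the log path in the usage note are my own placeholders.

```python
import re
from collections import defaultdict

# Matches the tail of a "combined" format line:
# [day:time zone] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'\[(?P<day>[^:\]]+):[^\]]*\] "[^"]*" \d{3} (?P<size>\d+|-) '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)


def tally_bot_traffic(lines, needle="semrushbot"):
    """Return {day: (request_count, total_bytes)} for matching user agents.

    `needle` is compared case-insensitively against the user agent string.
    Lines that do not look like combined-format entries are skipped.
    """
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = LOG_LINE.search(line)
        if match is None or needle not in match["ua"].lower():
            continue
        size = match["size"]
        totals[match["day"]][0] += 1
        # A "-" in the size field means no response body was sent.
        totals[match["day"]][1] += 0 if size == "-" else int(size)
    return {day: tuple(counts) for day, counts in totals.items()}
```

You could run it with something like tally_bot_traffic(open("access.log")) (the path is hypothetical) and watch whether the daily numbers drop after you add the block.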

This company also states that you shouldn't bother with trying to block their crawlers by IP because they don't use unified IP blocks. So what's left? What can you do?

I would suggest you first block them in your robots.txt file. Then, if your server has a firewall or if you're using a CDN such as Cloudflare, you can block their user agent. I've had good luck with this.

In Cloudflare, go to the Firewall area. Then, click Tools and scroll down until you see the User Agent Blocking area. Go ahead and set up new blocks for SEMrushBot's different user agents. So far, I've identified three and blocked them in my account, which has reduced the crawling tremendously. Here they are. If you know of more, please let us know below.

Mozilla/5.0 (compatible; SemrushBot-BM/1.0; +http://www.semrush.com/bot.html)

Mozilla/5.0 (compatible; SemrushBot/6~bl; +http://www.semrush.com/bot.html)

Mozilla/5.0 (compatible; SemrushBot/1.0~bm; +http://www.semrush.com/bot.html)

And while you're at it, you may as well block MJ12Bot too.

Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)
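If you don't have a CDN in front of your site, the same idea can be applied in the application itself. Here's a minimal sketch for a Python WSGI app. To be clear, this is not how Cloudflare does it; it's just an application-level analogue. The names BLOCKED_TOKENS, is_blocked and block_bots are my own, and I'm matching on a lowercase substring rather than the full strings above so that a version bump in the user agent doesn't slip through.

```python
# Lowercase substrings to look for in the User-Agent header.
# These tokens come from the user agent strings listed above.
BLOCKED_TOKENS = ("semrushbot", "mj12bot")


def is_blocked(user_agent: str) -> bool:
    """True if the user agent contains any blocked token (case-insensitive)."""
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_TOKENS)


def block_bots(app):
    """Wrap a WSGI app so blocked user agents get a 403 instead of a page."""
    def middleware(environ, start_response):
        if is_blocked(environ.get("HTTP_USER_AGENT", "")):
            start_response("403 Forbidden", [("Content-Type", "text/plain")])
            return [b"Forbidden\n"]
        # Everyone else is passed through to the real application.
        return app(environ, start_response)
    return middleware
```

You'd wrap your existing app with something like application = block_bots(application). The bot still makes the request, but it gets a tiny 403 response instead of a full page, which is where most of the bandwidth goes.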

I hope this helps. You can read more about this and other annoying bots in this forum post that I found:

https://www.knownhost.com/forums/threads/fighting-semrushbot.4461/

Let me know if you have any other questions.
 