How Do 301 Redirects Affect Crawl Efficiency?

JGaulard

I've been reading a lot lately about how 301 redirects affect how search engines crawl websites. From what I've seen in my log files, it's not a pretty picture. Let me explain.

For this forum, I'm using Xenforo forum software. It's incredible software and probably one of my favorite packages to date, and I've been working online for a long time. The issue I'm finding with it, though, is that it's chock full of 301 redirects. Each thread has three URLs that lead to it. For example, this is what you'd see for a demo thread:

https://mysite.com/threads/1/
https://mysite.com/threads/1/post-1
https://mysite.com/threads/1/latest

The first URL is the canonical one. The other two 301 redirect to it with a fragment (the part after the #) added at the end, so the destination still counts as just the canonical URL. And not only that, each post that's written inside of the thread gets its own URL that redirects to the canonical. Here's an example of one of them:

https://mysite.com/threads/1/post-4357

The above URL would lead to the post with ID 4357 inside of the canonical thread in question. Since every post gets one of these, the number of redirecting URLs on a site grows with every post that's written.

Now, while people tend to think of 301 redirects as the best thing since sliced bread, I'm noticing in my log files that Google is spending a heck of a lot of its crawling on these URLs and leaving a lot out. In other words, it's not crawling many of my more important pages. As you may have guessed, I don't really need these URLs crawled at all. I'd much prefer that my other threads and forums be crawled. My conclusion is that having many, many 301 redirects written into a website's code is horrible for search engine crawler efficiency.
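
For anyone who wants to check their own logs for the same pattern, here's a minimal sketch of how it could be done in Python. It assumes the server writes the common "combined" access log format and that Googlebot can be spotted by its user agent string. The sample lines are made up for illustration and aren't from my real logs, and the function names are just ones I picked.

```python
import re
from collections import Counter

# Pulls the requested path and the status code out of a combined-format log line.
LOG_LINE = re.compile(r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) ')

def googlebot_status_counts(lines):
    """Count response status codes for requests whose line mentions Googlebot."""
    counts = Counter()
    for line in lines:
        if "Googlebot" not in line:
            continue  # skip regular visitors and other bots
        match = LOG_LINE.search(line)
        if match:
            counts[match.group("status")] += 1
    return counts

def redirect_share(counts):
    """Fraction of the counted requests that were answered with a 301."""
    total = sum(counts.values())
    return counts["301"] / total if total else 0.0

# Invented sample lines, only here to show the shape of the input.
BOT = '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
sample = [
    '66.249.66.1 - - [10/Oct/2020:13:55:36 +0000] "GET /threads/1/ HTTP/1.1" 200 5120 "-" ' + BOT,
    '66.249.66.1 - - [10/Oct/2020:13:55:37 +0000] "GET /threads/1/post-1 HTTP/1.1" 301 0 "-" ' + BOT,
    '66.249.66.1 - - [10/Oct/2020:13:55:38 +0000] "GET /threads/1/latest HTTP/1.1" 301 0 "-" ' + BOT,
    '66.249.66.1 - - [10/Oct/2020:13:55:39 +0000] "GET /goto/post?id=17 HTTP/1.1" 301 0 "-" ' + BOT,
    '203.0.113.9 - - [10/Oct/2020:13:55:40 +0000] "GET /threads/1/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
]

counts = googlebot_status_counts(sample)
share = redirect_share(counts)
print(counts, f"{share:.0%} of Googlebot requests were 301s")
```

On a real site you'd feed it the lines of the actual log file instead of the sample list. Keep in mind that matching on the user agent alone will also count anything that merely pretends to be Googlebot.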

Add to this, if someone highlights text that's been written inside of a thread to reply to it, the link attached to that highlighted text is also a 301 redirect that leads to the canonical URL for the overall thread. Here's an example of one of these URLs:

https://mysite.com/goto/post?id=17

Following this link would eventually lead to a page like:

https://mysite.com/threads/1/

So let me ask you a question. Why would having a website full of 301 redirects be bad? Well, first, as I said earlier, they're terrible for crawl efficiency because the search engines may choose to spend much of their time crawling the redirects as opposed to the more valuable pages on a website. Also, think about how much time it takes to perform one of these redirects, since each one is an extra request the crawler has to make before it gets to the real page. Now, multiply that time by all the redirects on the website. In Google's eyes, having slow redirects like this can't be a good thing. Again, there are two redirects for every thread URL and one more for every post and reply inside of a given thread.
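
To put some rough numbers on that multiplication, here's a back-of-the-envelope sketch. Every figure in it is hypothetical: the thread count, the reply count, and the 200 milliseconds per redirect are assumptions chosen only to show the arithmetic, not measurements from this forum.

```python
def redirect_url_count(threads, replies):
    """Two redirecting URLs per thread, plus one for every reply."""
    return 2 * threads + replies

def total_redirect_seconds(redirects, ms_per_redirect):
    """Time spent if every redirect URL is requested once."""
    return redirects * ms_per_redirect / 1000

# Hypothetical forum: 10,000 threads, 90,000 replies, 200 ms per redirect.
redirects = redirect_url_count(threads=10_000, replies=90_000)
seconds = total_redirect_seconds(redirects, ms_per_redirect=200)
print(redirects, "redirect URLs,", round(seconds / 3600, 1), "hours to request each once")
```

Under those assumptions, that's over a hundred thousand requests that never land on a page directly.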

So what's the answer? I honestly don't know, but I'll tell you what I did. I used the robots.txt file to block these redirects. Here's what the lines in my file look like:

Disallow: /goto
Disallow: /threads/*/post
Disallow: /threads/*/latest
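
Before trusting rules like these, it's worth checking which URLs they'd actually catch. Below is a minimal sketch in Python that treats * as "any run of characters" and a trailing $ as "end of the URL", which is how Google describes its wildcard matching. The helper names are my own, and the sketch ignores Allow lines and rule precedence, so it's only a sanity check and not a full robots.txt parser. As far as I know, Python's built-in urllib.robotparser doesn't interpret * wildcards, which is why this uses a hand-rolled matcher instead.

```python
import re

def rule_to_regex(rule):
    """Turn one robots.txt path rule into a regex anchored at the start of the path."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

# The three Disallow rules from the robots.txt lines above.
DISALLOW = ["/goto", "/threads/*/post", "/threads/*/latest"]

def is_blocked(path, rules=DISALLOW):
    """True if any Disallow rule matches the beginning of the path."""
    return any(rule_to_regex(rule).match(path) for rule in rules)

for path in ["/threads/1/", "/threads/1/post-1", "/threads/1/latest",
             "/threads/1/post-4357", "/goto/post?id=17"]:
    print(path, "blocked" if is_blocked(path) else "allowed")
```

The canonical thread URL comes out as allowed while the redirecting variants come out as blocked, which is the behavior I'm after.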

I know this is going to create a lot of blocked pages, but hopefully Google will get the hint and drop them eventually. I've seen many very successful Xenforo forums that have a lot of pages blocked and they seem to be doing well, so I've got my fingers crossed. I'm not sure if it's the right thing to do or not, but we'll see, I suppose. If you've got any advice about crawler efficiency and 301 redirects, please let me know. Thanks!

Oh, by the way, someone else on the Xenforo forums has picked up on this issue here:

https://xenforo.com/community/threads/remove-all-thoses-useless-301-redirect.181525/

And here's what Yoast has to say about crawler efficiency:

https://yoast.com/crawl-efficiency/
 