301 Redirects: Inefficient Crawling & Duplicate Content

KristinaW

Member
I've been working online for a long time and I've seen lots of things, but one that completely bewilders and frustrates me is the world of search engine optimization, otherwise known as SEO. The reason I get so frustrated is that very few people on the entire internet seem to know anything beyond what they've read that's been written by someone else. The entire thing is a gigantic telephone game. It's maddening. Of course, I've found a few sources of information that have been deemed quite good and those sources do actual research and report their findings. Those are few and far between though.

Something that's been bothering me of late is the 301 redirect, the HTTP status code for "Moved Permanently." In theory, if you have one web page that you'd like to forward to another, you can use a 301 redirect to make a seamless transition. A user who lands on the original page won't even notice that they've been sent to another one. It's remarkable really.
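For anyone who hasn't seen one on the wire, here's a minimal sketch in Python of the response head a server might send. The paths and the `REDIRECTS` table are made up for illustration; real software builds this in its own way.

```python
# Hypothetical table of old paths and where they now live (not from any real site).
REDIRECTS = {"/old-page": "/new-page"}


def build_response_head(path: str) -> str:
    """Return the raw HTTP/1.1 response head a server might send for `path`."""
    target = REDIRECTS.get(path)
    if target is not None:
        # 301 Moved Permanently: the Location header tells the client where to go next.
        return (
            "HTTP/1.1 301 Moved Permanently\r\n"
            f"Location: {target}\r\n"
            "Content-Length: 0\r\n"
            "\r\n"
        )
    # Anything not in the table is served normally in this sketch.
    return "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n"


if __name__ == "__main__":
    print(build_response_head("/old-page"))
```

A browser follows the `Location` header on its own, which is why the visitor never notices. A crawler has to make a second request to reach the real page.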

The issue arises when you've got many, many 301 redirects. This happens after a website migration, or when the software that runs a site generates these redirects on its own. On this very website, there are hundreds, if not thousands of redirects. They're insidious and difficult to see. But to the trained eye, they're as clear as day.

So my question is this: does a high volume of 301 redirects affect the efficiency of search engine crawling on a website? Also, can redirects cause duplicate content?

I ask because I've been noticing some strange phenomena lately. First, I operate a website that has tons of redirects programmed right into it, and there's nothing I can do about them. Whenever Google crawls the site, it spends tons of time crawling these redirects, which really doesn't help anything. The way I see it, if Google is crawling the redirects, it's not crawling something else that is most likely more important, such as pages that have been added or updated recently. After scouring the website log files, I've noticed that many pages are, in fact, not being crawled because these redirected links are. It's making the crawling very inefficient and I'd like to do something about that. I'm not sure whether to block these redirects, which I can do easily enough, or leave them and let Google sort it all out by itself. Which brings me to my next situation.
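If you want to run the same kind of log check, here's a rough Python sketch. It assumes the common "combined" access log format and simply trusts the user agent string, which can be spoofed. The sample lines are invented, not taken from my logs.

```python
import re
from collections import Counter

# Pulls the path, status code, and user agent out of a combined-format log line.
LINE_RE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample lines for illustration only.
SAMPLE_LINES = [
    '66.249.66.1 - - [01/Jan/2024:00:00:01 +0000] "GET /old-page HTTP/1.1" 301 0 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '66.249.66.1 - - [01/Jan/2024:00:00:02 +0000] "GET /new-page HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.9 - - [01/Jan/2024:00:00:03 +0000] "GET /new-page HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
    '66.249.66.1 - - [01/Jan/2024:00:00:04 +0000] "GET /goto/42 HTTP/1.1" 301 0 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
]


def googlebot_status_counts(lines):
    """Count Googlebot requests per HTTP status code."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and "Googlebot" in match.group("agent"):
            counts[match.group("status")] += 1
    return counts


def redirect_share(counts):
    """Fraction of counted requests that were answered with a 3xx status."""
    total = sum(counts.values())
    redirects = sum(n for status, n in counts.items() if status.startswith("3"))
    return redirects / total if total else 0.0


if __name__ == "__main__":
    counts = googlebot_status_counts(SAMPLE_LINES)
    print(counts, f"{redirect_share(counts):.0%} of Googlebot hits were redirects")
```

On a real log you'd pass the open file in place of `SAMPLE_LINES`. A high redirect share is what I mean by inefficient crawling.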

I've been peeking around the internet at other websites that run the same software and I've noticed that no one else blocks these redirected links. When I check to see if they've got any of these links indexed in Google, I see that they do. Hundreds, if not thousands, of them. And the worst part is, many of these links are a decade old. I've read official word from Google that states that "it may take some time to consolidate link signals and canonicalize multiple links together," but c'mon. A decade? Really?

301 redirects have been the gold standard for forwarding pages since the early days of the web. To now learn that some redirects never actually get consolidated is shocking. On my own site, some redirected links that shouldn't be visible at all are showing up as the canonical version right in Google Search Console. Are these multiple links that target the same page causing duplicate content? I sure hope not. They can't be helping.
 
My guess is that if your website has enough redirects, your SEO would suffer. I'm not sure about duplicate content or whether or not all that redirect crawling is inefficient, but I do know that each redirect adds an extra request and so increases the time it takes to reach the page. Google says page speed is a ranking factor, so I would be careful with all of those. Maybe blocking them in the robots.txt file is a good idea so Google doesn't crawl them at all.
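If you go the robots.txt route, it's worth testing the rule before you publish it. Here's a small Python sketch using the standard library's `urllib.robotparser`. The `/goto/` prefix and the example.com URLs are assumptions for illustration; substitute whatever pattern your software uses for its redirect links.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks every URL under /goto/ for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /goto/
"""


def is_crawlable(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/goto/42", "https://example.com/threads/abc"):
        print(url, is_crawlable(ROBOTS_TXT, "Googlebot", url))
```

Keep in mind that a crawler that isn't allowed to fetch a URL never sees the 301 behind it, so blocking trades away the redirect itself in exchange for the saved crawling.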
 

CaptainDan

Member
If you read or listen to what Google has to say, you'll find that it's total BS. I've heard on a number of occasions that only those with websites of around one million pages or more need concern themselves with crawl budget or crawl efficiency. From what I've found, even sites with as few as 800 pages need to be concerned with this. I currently have a website of approximately 800 pages and before I removed irrelevant links and killed off useless pages, Google wasn't crawling my good pages at all. But since I got rid of those links and pages, Googlebot has been doing a lot more crawling of the regular site and it's making a big difference. I watch my log files and the stats in Google Search Console like a hawk, so I'm confident in my findings. If you have useless links and pages that don't help the search engines at all, you really do need to remove them. Don't listen to what the big boys say. Trust your gut.
 