Do Orphaned Pages Get Dropped From Google's Index?

CraigHardy

Member
Here's my question: if a website has a lot of orphaned pages (live pages that have no links pointing to them), do those pages eventually fall out of Google's index? I ask because I recently restructured my website. It now has a huge number of new pages, but also a huge number of old pages that are no longer used. The old pages were generated by the software I used to run, and there are thousands upon thousands of them. If I had to estimate, I'd say approximately 100,000. Strictly speaking, they aren't "orphaned," because they're gone: orphaned pages are still live but have no links to them, while these pages have been deleted and return 404 status codes. To Google, though, they still exist, because it hasn't yet recrawled them to find out they aren't there, and as far as I can tell, it isn't recrawling them very quickly.

I looked this issue up and it seems to be quite common. I saw one guy whose site currently has 50,000 pages; he plans to get rid of 48,000 of them and focus on the remaining 2,000. He wanted to know if deleting those 48,000 pages, or removing the links to them, would affect his ranking. I can answer that question: yes, in my experience it will have a huge impact. Once the links to those 48,000 pages are severed, no PageRank will flow to them, and they'll drag the entire site down until they're completely out of Google's index. What I'd like to know is whether pages eventually get dropped from the index if they aren't linked to for long enough. Live pages, dead pages, I don't care which.

There are actually questions like mine all over the place. Apparently, it's a common scenario to have lots of orphaned pages after a website redesign or restructuring.

Here's one thing I do know: if you've got a number of pages that have no parent pages linking to them and that Google has never crawled, then there's no problem. Yes, they're considered orphan pages, but since they've never been assigned any PageRank and have never been factored into the overall ranking of the website, they're irrelevant. They won't harm your site's ranking. Now, if you have a number of pages that Google has discovered, crawled, and assigned PageRank, and the links from their parent pages have since been cut, then yes, your site's ranking will be affected. Those pages are still live on the website with no links to them, and they'll drag the overall site's ranking down.
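
For anyone who wants to find these pages on their own site, here's a rough sketch of the idea in Python. It assumes you already have two things: a set of every URL you know about (say, from an old sitemap) and a map of which pages link to which (say, from a crawl). The function name `find_orphans` and the sample paths are made up for illustration; this isn't how Google or any particular crawler tool does it.

```python
def find_orphans(known_pages, links, home="/"):
    """Return known pages that no other page links to.

    known_pages: set of URL paths you know exist (hypothetical sample data).
    links: dict mapping a page to the list of pages it links to.
    home: entry page, never reported as an orphan.
    """
    linked = set()
    for source, targets in links.items():
        for target in targets:
            if target != source:  # a page linking to itself doesn't count
                linked.add(target)
    return sorted(p for p in known_pages if p not in linked and p != home)


# Hypothetical site: two old pages lost their inbound links in a redesign.
known_pages = {"/", "/about", "/blog/new-post", "/old/widget-1", "/old/widget-2"}
links = {
    "/": ["/about", "/blog/new-post"],
    "/about": ["/"],
    "/blog/new-post": ["/", "/about"],
    "/old/widget-1": ["/old/widget-1"],  # only links to itself
}

orphans = find_orphans(known_pages, links)
print(orphans)  # ['/old/widget-1', '/old/widget-2']
```

This only tells you which known pages have no internal links pointing at them. It doesn't tell you whether Google has crawled or indexed them, which is the part that matters for the question above.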

If you're interested in what orphaned website pages are and how a website ends up acquiring them, you can check out this post from OnCrawl. It's pretty informative.
 

KodyWallice

Member
I'm not sure if orphaned pages actually get dropped from the index or not, but I do know they go somewhere. I've got a blog with a whole bunch of orphaned pages that were live for years but not linked to or used, and I thought they'd been dropped. They had all but disappeared. I couldn't find them using the Google "site:" operator and they weren't in Google Search Console. Oddly enough, when I began deleting these pages, they started to show up when I used the site: operator. It had been years, maybe even a decade, since I'd last seen them, so they must have stayed in the index someplace.

I firmly believe that Google still uses its supplemental index. I think it's where they store websites' very low-quality pages. They're buried in there and I'm telling ya, if a website has enough of these pages in that secondary index, the site will get penalized. That's why it's so important to dredge the pages out of it by deleting them altogether. It's a simple ratio. Enough bad pages and, bam, you get hit by Google Panda.

So really, if you've got enough orphaned pages that Google knows about, you'd better block them in the robots.txt file. Don't bother using noindex, because I think that'll just push them into the supplemental index.
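
If you go the robots.txt route, it's worth checking that your rule actually covers the pages you mean to block. Here's a minimal sketch using Python's standard `urllib.robotparser`. The `/old/` directory and the example.com URLs are made up for illustration; swap in wherever your own orphaned pages live.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks everything under /old/ for all crawlers.
robots_txt = """\
User-agent: *
Disallow: /old/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# can_fetch() returns True when the given user agent may crawl the URL.
blocked = not parser.can_fetch("Googlebot", "https://example.com/old/widget-1")
allowed = parser.can_fetch("Googlebot", "https://example.com/blog/new-post")

print(blocked)  # True
print(allowed)  # True
```

Keep in mind this only checks whether a crawler is permitted to fetch a URL. It says nothing about whether that URL is currently in the index.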
 