XenForo: Blocking /posts/ in Robots.txt

  • Thread starter KristinaW
KristinaW

Active member
  • #1
I've been reading through a few threads in the XenForo community and I'm slightly confused about something. In one thread in particular, the original poster asks why someone might disallow the /posts/ directory in their robots.txt file. That thread began back in 2013, so I'm wondering whether something has changed in the forum software since then. I'm a fairly new user and, from what I can tell, there aren't any URLs with /posts/ in them, except for links to the reactions someone may have applied to a certain post.

In the thread, the user questions why the developers have disallowed /posts/ in their own robots.txt file. A developer named Mike replied that there wasn't any SEO reason, per se; they simply didn't want Google, or any other search engine for that matter, to follow useless links. Since the URLs in question are merely 301 redirects that send the user who clicks on them to the most recent post in a thread, it's a waste of time for the search engines to crawl them. The person who created the thread kept talking about how these URLs exist on the homepage, which confused me. I don't see any of them in any of my installs.

So my question is: what exactly is being discussed here? Have the URLs that had /posts/ in them changed to something else over the years? Were they once like this?

https://mysite.com/posts/1234/

From what I can see, the only 301 redirects that exist on the homepage look like this:

https://mysite.com/threads/thread-name.1234/post-1234

Has XenForo updated the old URLs to the new ones?

I know people who use this forum software are sometimes slow to upgrade to the latest version because of all their template and code modifications, and that they can be even slower to update their robots.txt files. As I cruise around the web looking at other sites' robots.txt files, almost everyone has the /posts/ directory blocked. About 90% of the largest forums that use XenForo have that directory, among others, blocked. I'm curious whether they disallowed it because of this 301 redirect and duplicate content issue, or whether they're merely blocking the reactions link. It couldn't be just the reactions link.

In my own robots.txt file, I have these two lines included:

Disallow: /threads/*/post
Disallow: /threads/*/latest
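
For anyone who wants to check what those two lines actually cover, here is a rough Python sketch. It assumes the crawler treats * as "any run of characters" and a trailing $ as "end of URL", which is how Google documents its robots.txt matching; other crawlers may differ. The helper names (rule_to_regex, is_disallowed) are made up for this example and aren't part of XenForo or any library.

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path rule into a regex.

    Assumption: '*' means any run of characters and a trailing '$'
    anchors the end of the URL (Google-style matching).
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal chunk, then join the chunks with '.*' for each '*'
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_disallowed(path, rules):
    # re.match anchors at the start of the path, like a robots.txt rule does
    return any(rule_to_regex(rule).match(path) for rule in rules)

# The two Disallow patterns quoted above
RULES = ["/threads/*/post", "/threads/*/latest"]

for path in [
    "/threads/thread-name.1234/post-1234",
    "/threads/thread-name.1234/latest",
    "/threads/thread-name.1234/",
    "/posts/1234/",
]:
    print(path, "->", "blocked" if is_disallowed(path, RULES) else "allowed")
```

Under those assumptions the two redirect-style thread URLs are blocked while the plain thread page stays crawlable. A URL like /posts/1234/ is not covered by either pattern, so it would need its own Disallow: /posts/ line.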

Once I blocked these two URL patterns, I saw marked changes in crawling and indexing. As the original thread creator on the XenForo site stated, Google definitely doesn't handle 301 redirects the way it's supposed to. When you link to a redirected URL on your own website, that URL might stay indexed and the target URL might never get indexed. It's a weird situation. If you have any input, please let me know. Thanks!
 
LukeLewis

Active member
  • #2
As far as I can tell, the only two places the /posts/ directory is used are the thread pages themselves and the alert emails that are sent to users. On the thread pages, if someone clicks a reaction emoji, that emoji (as well as others') will appear in a "reaction row" at the bottom of that post. People can click that reaction row link, and its URL has the /posts/ directory in it. Also, if someone replies to a post, as I am doing right now, the person they're replying to will receive an alert email. In that email, the title of the thread is a clickable link, and it has the /posts/ directory in it too. That's all I know of off the top of my head. There are probably more locations, though. I'll update this thread if I find them.

UPDATE: I just noticed this: as an admin, you have the ability to approve threads and posts before they go live on the website. The title of the thread in the approval area uses a URL that contains the /posts/ directory.
 
KristinaW

Active member
  • #3
I just found where these /posts/ URLs are coming from: the "What's New" area. At the top of this site, you'll see a What's New link. Inside that area is a Latest Activity section, and the links to posts there use the /posts/ URL. There is another section that uses them too, but I can't see it right now because there are no new posts. My guess is that people are using the "New Posts" widget on their homepage and that's what is showing these URLs. Maybe. That's as far as I can tell right now.
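
If you'd rather not hunt through pages by eye, something like the Python sketch below could list every link on a saved page whose path starts with /posts/. It only uses the standard library. The PostLinkFinder class, the find_post_links helper, and the sample HTML are all invented for illustration; a real XenForo page will have different markup, so treat this as a starting point rather than a finished tool.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class PostLinkFinder(HTMLParser):
    """Collects href values of <a> tags whose URL path starts with /posts/."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # An <a> tag may have no href, or an href with no value
        href = dict(attrs).get("href") or ""
        # urlparse lets us handle both relative and absolute links
        if urlparse(href).path.startswith("/posts/"):
            self.found.append(href)

def find_post_links(html):
    finder = PostLinkFinder()
    finder.feed(html)
    return finder.found

# Hypothetical snippet standing in for a saved "What's New" page
SAMPLE = """
<a href="/threads/thread-name.1234/">Thread</a>
<a href="/posts/1234/">Latest activity item</a>
<a href="https://mysite.com/posts/5678/reactions">Reactions</a>
"""

print(find_post_links(SAMPLE))
```

Saving the homepage, the What's New page, and a thread page and running each through find_post_links would show which of them actually expose /posts/ links to a crawler.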
 