Post by jcgadgets » Fri Jan 14, 2011 2:56 pm

Hi,

Since OpenCart allows the same page to be accessed via so many different URLs (even more if SEO URLs are enabled), Google Webmaster Tools is telling me that I've got duplicate content all over the place.

To remedy this, I used 301 redirects on certain URLs in my .htaccess file as per a post here.

Examples:
Redirected: domain/index.php?route=information/information&information_id=7
To: domain/Help-Terms

Redirected: domain/Apple-MacBook?sort=p.model&order=ASC
To: domain/Apple-MacBook

Don't know how to redirect: domain/Factory-Unlocked-Apple-iPhone-4-32GB (this is in the "Featured" box and of course has a different breadcrumb than the URL below, which is the one I want)
To: domain/Factory-Unlocked-iPhones/Factory-Unlocked-Apple-iPhone-4-32GB

Will this hurt my rankings in any way? I've also added the following in my robots.txt as per Chones' recommendation:
User-agent: *
Disallow: /*?sort
Disallow: /*?route=checkout/
Disallow: /*?route=account/
Disallow: /*?route=product/search
Disallow: /*?page=1
Disallow: /*&create=1
Allow: /

All of my product pages have the rel="canonical" such as here:
<link href="http://domain/Factory-Unlocked-Apple-iPhone-4-32GB" rel="canonical" />
I think this is making things (more or less) good enough. However, none of the information pages or category pages have the canonical "thing" (I don't know what to call it) in the source. Should I add it? If so, exactly how / where?

What does everyone think of all of this? I am hoping that this will eliminate any perception of "duplicate" content by search engines, improve SEO (??), and make the URLs more appealing to the eye and more easily understood and remembered by customers. My main concern is: does using 301 redirects hurt me for SEO?

Just want to make sure that I am doing this all right and am not shooting myself in the foot in any way.


Thank you!
Jared
Last edited by jcgadgets on Fri Jan 14, 2011 3:43 pm, edited 1 time in total.

Active Member

Posts

Joined
Sun Oct 31, 2010 4:49 pm

Post by Chones » Fri Jan 14, 2011 3:42 pm

Hi Jared,

I did it differently when I realised I was getting a lot of duplicate content in Google. If you redirect ?sort pages back to the original page the sort function won't work for users.

Firstly, I put that robots.txt file in, which should stop a lot of new duplicate pages being indexed.

Then in Webmaster Tools I asked Google to remove all the ?sort URLs that it had already indexed. It takes a while to do that, but is probably worth it. As ?sort is now blocked, Google should remove them.

To remove pages that were indexed before you added SEO keywords you can also block them with robots.txt and ask Google to remove them. So you could add Disallow: /*?route=information/ to robots.txt then ask Google to remove all the route=information/ pages too.

I think that's better than 301 redirects. Others may disagree.

http://scarletandjones.com/
http://sharpdressedman.co.uk/
http://coffincompany.co.uk/
http://horsesculptures.co.uk/
If I've helped you out, why not buy me a beer? http://craigmurray.me.uk


User avatar
Active Member

Posts

Joined
Wed Mar 24, 2010 9:07 pm
Location - London

Post by jcgadgets » Fri Jan 14, 2011 3:58 pm

Hey again,

I appreciate your continued assistance. I hadn't 301 redirected any ?sort URLs, but it's a good thing you mentioned it because I probably would have without realizing what I was doing.

Do you know of any way to SEO other URLs such as the contact page? I ask because it is of course at a route=information/ URL. For the time being, I will just disallow the former specific pages.

I also added the following line to my robots.txt, and would like your take on it:
Disallow: /*&sort

I did so because Webmaster Tools was also seeing the product specials page as accessible via many different URLs (because of sorting), and it uses an & instead of a ?. Will this work correctly?

I assume, though, that I would still want the 301 redirects (at least for pages such as the information page, or for products in multiple categories), correct? In order to make the URL more customer friendly and more accurate.

Also, I'm not really sure how to tell Google in Webmaster Tools to remove a page?


Thank you!!
Jared

Post by jcgadgets » Fri Jan 14, 2011 4:41 pm

Found out how to remove URLs in Google Webmaster Tools.

New question now though. Originally, Attracta had a link for a sitemap for me in the first line of my robots.txt file. This was all Google was showing, even though there was content beneath this line.

I have since removed that line, and submitted the sitemap to Google...however, in Webmaster Tools, it is still showing just that one line for the robots.txt file. Any ideas on how I can fix that??


Thank you,
Jared

Post by Chones » Sat Jan 15, 2011 2:58 am

The robots.txt file in Google is not your robots.txt file, it is a testing area - ignore it.

You can use it to test your file by putting the contents of your robots.txt file in there and then putting some URLs in the box below. It will tell you if your robots.txt file blocks them or not.

Looks like &sort will be fine - but, of course, now you can test it by using the method above.
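
For anyone who wants to check patterns offline as well, here is a rough Python sketch of the wildcard matching that Google documents for robots.txt: `*` matches any run of characters, a trailing `$` anchors the end, and everything else is a prefix match. The function names and the specials URL with `sort=p.price` are made up for illustration. The sketch ignores Allow lines and differences between crawlers, so treat the Webmaster Tools tester as the final word.

```python
import re

def robots_pattern_to_regex(pattern):
    """Translate a Google-style robots.txt path pattern into a compiled regex."""
    anchored = pattern.endswith("$")      # a trailing '$' means "must end here"
    if anchored:
        pattern = pattern[:-1]
    # Escape the literal pieces and let each '*' match any run of characters.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_blocked(path_and_query, disallow_patterns):
    """True if any Disallow pattern matches the path plus query string."""
    return any(
        robots_pattern_to_regex(p).match(path_and_query)
        for p in disallow_patterns
    )

SORT_RULES = ["/*?sort", "/*&sort"]

# Sort link on a category page (URL taken from the first post).
print(is_blocked("/Apple-MacBook?sort=p.model&order=ASC", SORT_RULES))           # True
# Specials page, where sort is not the first parameter (hypothetical query).
print(is_blocked("/index.php?route=product/special&sort=p.price", SORT_RULES))  # True
# The plain category page stays crawlable.
print(is_blocked("/Apple-MacBook", SORT_RULES))                                  # False
```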

No need to SEO or redirect the contact URL - it has "contact" in it - leave it as is. It's only URLs with variables such as product_id=23 that you need to SEO.

You don't need redirects for products in multiple categories either as Daniel has implemented rel=canonical.
If you look in the meta data of a product there is a rel=canonical link that doesn't feature any category info - that is the link Google will index.
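
A quick way to see which pages carry the tag is to look for it in the page source. The sketch below uses only Python's standard library; `find_canonical` is a hypothetical helper name, and the sample markup is the tag quoted in the first post of this thread rather than a real page.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link ... /> also end up here.
        if tag != "link" or self.canonical is not None:
            return
        attributes = dict(attrs)
        if (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")

def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical

product_page = (
    '<head><link href="http://domain/Factory-Unlocked-Apple-iPhone-4-32GB" '
    'rel="canonical" /></head>'
)
print(find_canonical(product_page))
# http://domain/Factory-Unlocked-Apple-iPhone-4-32GB

# A page without the tag, like the information pages mentioned earlier.
print(find_canonical("<head><title>Help</title></head>"))  # None
```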

Post by jcgadgets » Sat Jan 15, 2011 8:06 am

Ok, that's some very good information.

So:
- Are 301 redirects bad for SEO in any way?
- How do I make the links from something linked from the "featured" page go to the proper page with the right breadcrumbs? I know that with the canonical entry it probably isn't too big of a deal, but if it's possible I'd like to do it. That way the customer always has those nice breadcrumbs to follow.


Thank you,
Jared

Post by Chones » Sat Jan 15, 2011 5:19 pm

Hi Jared,

301s aren't bad for SEO, but you should only really use them if you change the URL of a page that people are linking to. That way, when people click on the link they will be sent to the new URL.

If nobody is linking to the page then you can just change the URL, remove the old link from Google, and submit a new sitemap.

You can't make a breadcrumb trail from something like the "Featured" box. It links straight to the product. This is because a product could be in more than one category, so how would you choose which one to output?

When I first started using OpenCart I tried to find some sort of solution for this by listing all categories a product was in on the Product page - solution is here http://forum.opencart.com/viewtopic.php?t=13347

However, I didn't use it in the end.

Post by jcgadgets » Sun Jan 16, 2011 5:46 am

Ok, this post got larger and larger as I thought more, so I've organized it into numbers. If someone does decide to take up this post, please answer in the same numbered format so that we can keep things straight :)

1. Thanks, good to know that 301 redirects do not hurt SEO.

2. It's too bad that there isn't really a good way to make the breadcrumbs and URLs the same for the Featured box. I suppose that I will just block the plain product one in my robots.txt, and leave the proper breadcrumbed one accessible.

3. What if the canonical entry points to the plain product URL? Should I change this to the category/product URL, since this is what I would rather have it be?

4. What do you think of the most recent post made by andyspartan?

5. I thought I had everything worked out in my .htaccess...but apparently, I was quite mistaken.

This is what I have:
RewriteEngine On
RewriteBase /StoreFront/
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^([^?]*) index.php?_route_=$1 [L,QSA]

RewriteCond %{QUERY_STRING} ^route=product/product&product_id=49$
RewriteCond %{REQUEST_METHOD} !^POST$
RewriteRule ^index\.php$ http://domain/StoreFront/category/product? [R=301,L]

What I want:
To be able to type in any of the following:
http://www.domain/StoreFront/route=prod ... duct_id=49
http://www.domain/StoreFront/index.php? ... duct_id=49
http://www.domain/route=product/product&product_id=49
http://www.domain/index.php?route=produ ... duct_id=49

and be redirected to:
http://domain/StoreFront/category/product

What I get:
http://www.domain/StoreFront/route=prod ... duct_id=49 -->> no change
http://www.domain/StoreFront/index.php? ... duct_id=49 -->> http://www.domain/StoreFront/category/product
http://www.domain/route=product/product&product_id=49 -->> HostGator 404 error page
http://www.domain/index.php?route=produ ... duct_id=49 -->> http://www.domain/StoreFront/category/product

Can I do this? If so, how?

I know that it may not all be 100% necessary, since I will be submitting new sitemaps to Google, but I'd still rather have it all nice and clean so that every product is always accessed via the most correct URL if possible.
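
One part of this can be checked without touching Apache: a RewriteCond on %{QUERY_STRING} only sees what comes after a `?`. The Python sketch below suggests that a URL typed without `index.php?` has an empty query string, so that condition cannot match it. The helper name is hypothetical, and the second URL is assembled from the query in the RewriteCond above rather than copied from the shortened links. It does not model anything else mod_rewrite does.

```python
from urllib.parse import urlsplit

# Query string copied from the RewriteCond in the .htaccess above.
TARGET_QUERY = "route=product/product&product_id=49"

def query_condition_matches(url):
    """True if the URL's query string equals TARGET_QUERY exactly."""
    return urlsplit(url).query == TARGET_QUERY

# No "?" in the URL: everything after the host is path, and the query is empty.
no_question_mark = "http://www.domain/route=product/product&product_id=49"
# Assumed full form of an index.php request carrying the same parameters.
with_question_mark = "http://www.domain/index.php?" + TARGET_QUERY

print(urlsplit(no_question_mark).query == "")        # True
print(query_condition_matches(no_question_mark))     # False
print(query_condition_matches(with_question_mark))   # True
```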

6. One interesting thing to note is that the robots.txt file in Google Webmaster Tools now reflects the contents of my actual file. It says that it has the URL for it (domain.com/robots.txt), and that it downloaded it 4 hours ago...?


Thank you,
Jared

Post by Chones » Sun Jan 16, 2011 6:47 am

RE: 2 and 3 - If you are sure you are never going to have a product in more than one category, you could block the product only URL - you'd also have to remove the rel=canonical meta tag as that directs Google to index the product only URL. But if you do that and then have a product in two categories you've got duplicate content.

However, I'm not sure why you are doing this, or if it is worth the effort. The way OpenCart is set up means you have no duplicate content, ever, and you have great SEO URLs. I think you are overanalysing things and putting too much effort into this.

If you want a product indexed as www.yoursite.com/category/product-name because the category is important for SEO why not just rename the product? For example, if you're selling Nike T-Shirts and the URL is currently www.yoursite.com/nike/t-shirt, just change the SEO keyword to nike-t-shirt and the URL that will be indexed, due to rel=canonical, will be www.yoursite.com/nike-t-shirt.

That's what I did on my site and I get great Google rankings. It may actually be much better if you don't include the category, as some people think that PageRank dilutes as you descend a folder structure - so site/cat/product is worth half as much as site/product.

4. Not read it yet - but will look it up after this.

5. Why would you type in anything ending with product_id=49? If you have SEO URLs Google won't index any URLs ending in product_id=49, no one will ever know you have URLs ending in product_id=49. No one will ever try to visit that link, so why redirect it? Even if they do (and the one place they do come from is Google shopping because OpenCart doesn't supply the SEO URL), the rel=canonical takes care of it.

Once again - I think you're putting too much effort into this. It's admirable, but you should maybe put some more energy into writing great product titles and descriptions, that's where the real SEO happens, not blocking a few URLs.

Post by jcgadgets » Sun Jan 16, 2011 12:38 pm

Hey Chones,

I do legitimately appreciate you taking the time to educate me (especially so many times and so thoroughly).

I think you're right - and that it's not something that I need to pour too much effort into.

I actually have every product in two categories. As I only have just a couple products (ten at the moment), I have an all products category. So I guess I will have to unblock the product only URL as a result of the canonical reference.

You make a very interesting point concerning the possible dilution of the "SEOness" as you go farther down the file structure...I was not aware of this as a possibility. I guess it may then be a good thing when the featured products lead to product only URLs.


Thanks again for all of your help! I think I've got things mostly straight in my mind, have just got to implement it all now.
Jared

Post by Chones » Sun Jan 16, 2011 5:44 pm

Hi Jared,

No probs. Just read andyspartan's solution and it is much better than mine - but then I was only beginning with OpenCart at the time ;)

If you have every product in 2 categories then I would urge you to rely on rel=canonical and Google will index the product only url. To go back to my example, if you sell T-shirts by brand and colour, you may have the same product at both
www.yoursite.com/nike/t-shirt
www.yoursite.com/white/t-shirt

Google would see that as duplicate content - they won't penalise you but they will only index one and not the other. Best to change the SEO keyword to take care of both, so to
www.yoursite.com/white-nike-t-shirt

You can also tell Google in Webmaster Tools to index http://yoursite.com rather than http://www.yoursite.com - you have to verify both addresses separately first - although with the same Google code, so it's quite quick.

That really should be all you need to do.

Craig

Post by jcgadgets » Tue Jan 18, 2011 6:42 am

Chones,

I've got Google indexing http://www.mysite.com, http://mysite.com, and http://www.mysite.com/store-sub-folder. Does it do more good to have one or another?

I put a line in my robots.txt to disallow domain/all-electronics/* (the category with everything in it) - this should stop Google from indexing the URLs there (correct)? Or perhaps I do not want this, because of the rel=canonical?

I've got my SEO URL keywords set up now so they will work best (I think).


Thank you,
Jared

Post by Chones » Tue Jan 18, 2011 7:06 am

As I said, you can ask Google to index http://www.yoursite.com or http://yoursite.com

You do that in Webmaster Tools. It doesn't matter which one you choose. I prefer to remove www.

I would not disallow domain/all-electronics/* - I don't think that's going to do anything. Because of rel="canonical" Google will index all products as domain/product-name

If you block robots from everything after /all-electronics/ you may stop Google indexing any of your products - you may have blocked all access to your site - although I'm not 100% sure about that.

Post by jcgadgets » Fri Jan 21, 2011 3:20 am

Craig,

Good to know - I'll re-allow my electronics folder. I was also curious to know - how often does Google re-check / update my robots.txt? Every time a sitemap is submitted, or on some other schedule?

About the www. or non www., what I was asking was - is it a best practice to have one or the other, or does it really not matter which?


Thank you,
Jared

Post by Chones » Fri Jan 21, 2011 3:28 am

I have no idea how often Google looks at your robots.txt file, but basically it will be when Google visits your site - which could be daily or weekly depending on how much you update it. Submitting a sitemap does not force Google to go to your site - it just tells it where to look next time it visits.

As for www I prefer to leave it off - some people prefer it though. It's not needed anymore so why bother with it? And on your business card "craigmurray.me.uk" looks a lot better than "www.craigmurray.me.uk"

Post by jcgadgets » Fri Jan 21, 2011 3:33 am

Craig,

Good points - thanks for the tips.

Just came up with yet another question / issue...
Google Webmaster Tools was complaining that I had lots of duplicate content, as per having not blocked the sorting of each page. However, I have long since made the following entries in my robots.txt:
Disallow: /*?sort
Disallow: /*&sort

Is there something more that I need to do? It still shows this duplicate content under HTML suggestions.


Thank you,
Jared

Post by Chones » Fri Jan 21, 2011 4:08 am

It just takes a bit of time. If you have the robots.txt file set up, and requested the removal of the URLs, it will be done eventually - but it can take a few weeks before Webmaster Tools updates - as long as you know you've blocked it I wouldn't worry about it.

Post by jcgadgets » Fri Jan 21, 2011 4:24 am

Craig,

Thanks for the info. Wow - I figured it would take a few days at most.

I was also wondering if:
Disallow: /*sort*

Wouldn't be better than:

Disallow: /*?sort
Disallow: /*&sort

?


Thank you,
Jared

Post by Chones » Fri Jan 21, 2011 4:26 am

I haven't tried /*sort, but you can test it in Webmaster Tools - let me know if it works.
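
One thing worth checking before switching: a pattern with no ? or & in front of "sort" would probably also match any SEO keyword that happens to contain those letters. A small Python sketch of Google-style wildcard matching, assuming a hypothetical product keyword "assorted-cables" (the `blocks` helper is made up for illustration and is not how Google implements it):

```python
import re

def blocks(pattern, path_and_query):
    """Google-style robots.txt match: '*' is a wildcard, a trailing '$' anchors the end."""
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    regex = ".*".join(re.escape(piece) for piece in core.split("*"))
    return re.match(regex + ("$" if anchored else ""), path_and_query) is not None

# The broad pattern catches the sort links...
print(blocks("/*sort*", "/Apple-MacBook?sort=p.model&order=ASC"))  # True
# ...but also a hypothetical product page with "sort" inside its keyword.
print(blocks("/*sort*", "/assorted-cables"))                       # True
# The two narrower patterns leave that product page alone.
print(blocks("/*?sort", "/assorted-cables"))                       # False
print(blocks("/*&sort", "/assorted-cables"))                       # False
```

If that holds up in the Webmaster Tools tester, the two narrower lines look like the safer choice.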

Post by jcgadgets » Fri Jan 21, 2011 4:29 am

Craig,

Will do!


Thank you,
Jared
