|Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/2/12 2:12 AM|
We are a large international ecommerce brand operating in 10+ countries. We have a .com site and use a subdomain for each country we operate in. Our site is centrally hosted in Spain and a CDN delivers all of our content including HTML.
Webmaster Tools accounts have been set up and the correct geotarget selected for each country subdomain. A Webmaster Tools account has also been set up for our WWW. subdomain (which contains only a 'choose your location' entry page) and its geotarget has been left unselected.
We have major geotargeting problems whereby the incorrect country subdomain appears in the search results. For example, US. appears in google.co.uk and UK. appears in Google's US results.
WWW. also outranks a lot of country subdomains - for example, in the US search results our WWW. outranks US. Our branded results show sitelinks which contain links to the wrong country - for example, a US user clicking on a sitelink will often find themselves on our Swedish site viewing prices in euros, not dollars. We've added hreflang alternate tags to the site and have seen a marked improvement; however, it's not been good enough. I'm going to change the Webmaster Tools geotarget for our WWW. page from unselected to 'unlisted' to see if this has an impact; however, I'm not confident that this is the root cause of all the problems.
As I understand it, Google primarily uses the server IP to geolocate a website and then falls back on GWT settings, on-page signals and inbound links. These fallback signals aren't working for us, so I believe we would be best suited to direct our efforts into improving the IP signal. Presumably sites that rely on the fallback methods also find it harder to rank highly in search results, as Google won't have as great a confidence level in the site's relevance?
Could an option be to have the CDN develop a special rule for Google IPs - to serve UK. requests from UK servers, US. requests from a US server and so forth? Our CDN delivers the content from the nearest server to the user's IP. Because Google crawls entirely from the US, all of our country subdomains must appear to be hosted in the US. Matt Cutts says to 'treat Googlebot as you would a regular user'; however, Googlebot doesn't replicate the actions of a regular user because it crawls entirely from the US. I'd view this new CDN rule as a way of treating Google closer to how we would treat the majority of our users (the majority of UK. requests come from UK IPs so we'd deliver from a local UK server). However, please let me know if the Google team would view this as a breach of TOS. I don't want to break any rules but at the same time we need to escape the Geolocation hell that we're currently in!
Unfortunately the business is very sensitive about publishing their woes so I can't publish the website details, however I can send this info to Google off-forum if it helps.
Thanks for any help or advice you can offer.
|Stuck in geolocation hell. Full pages via CDN||cristina||7/2/12 3:55 AM|
Without the site URL it is difficult to comment.
Look with Fetch as Googlebot and see if the alternate hreflang attribute is in the source code and is correct.
Look at the source code of Google cache of a few pages that are not working well in search results, see if they were cached after you added hreflang.
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/2/12 4:39 AM|
Yes, I'm trying to convince the business that we need to publish the URL in order to secure the necessary assistance from Google; however, it's a very large FTSE brand and they feel uncomfortable doing this. I can send details to Google employees as it will remain private, but I'm not able to share with the wider community publicly, which is a real pain.
Fetch as Googlebot is showing the hreflang code being correct and is reflected in Google's cached copy too. Hreflang has been on the site for two full weeks now so the majority of the site has been recached and there has been a vast improvement but problems still remain.
For example one sitelink in the UK search results is a US URL which has been cached with hreflang. The equivalent UK URL has also been cached with hreflang so the reciprocal annotation is there but it's not working.
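A minimal sketch of how the reciprocal annotation described above could be checked offline, assuming the hreflang annotations have already been extracted from each page's source. The URLs, the `pages` structure, and the function name `missing_return_links` are hypothetical illustrations, not anything Google provides.

```python
from typing import Dict, List, Tuple

# Hypothetical data: each page URL maps to the hreflang annotations found in
# its source, as {hreflang value: alternate URL}. The URLs are made up.
Annotations = Dict[str, Dict[str, str]]


def missing_return_links(pages: Annotations) -> List[Tuple[str, str]]:
    """Return (source, target) pairs where source lists target as an
    alternate but target does not list source back."""
    problems = []
    for source, alternates in pages.items():
        for target in alternates.values():
            if target == source:
                continue  # a self-reference needs no confirmation
            back_links = pages.get(target, {}).values()
            if source not in back_links:
                problems.append((source, target))
    return sorted(problems)


pages = {
    "http://uk.example.com/blue-widgets": {
        "en-GB": "http://uk.example.com/blue-widgets",
        "en-US": "http://us.example.com/blue-widgets",
    },
    "http://us.example.com/blue-widgets": {
        "en-US": "http://us.example.com/blue-widgets",
        # the en-GB return annotation is deliberately left out here
    },
}

print(missing_return_links(pages))
```

An empty list only shows that the annotations confirm each other; it says nothing about whether Google has already reprocessed both pages.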
Thanks anyway for your help Cristina, I appreciate your time
|Re: Stuck in geolocation hell. Full pages via CDN||Steven Lockey||7/2/12 5:07 AM|
>>As I understand it Google primarily uses the server IP to geolocate a website and then falls back on GWT settings, on-page signals and inbound links. These fallback signals aren't working for us so I believe we would be best suited to direct our efforts into improving the IP signal. Presumably sites who rely on the fallback methods also find it harder to rank highly in search results as Google won't have as great a confidence level in the sites relevancy?
I would say this is a lot of work for a small amount of gain, if any gain. Since the Googlebots all come from a US IP, I assume they would only see the US IP anyway even if you did this.
Does EACH of the sub-domains etc. have entirely unique and useful content? Or are large chunks of it from the main site or simply refer to the main site? Either way this explains the situation perfectly. This would include product descriptions etc.
Also do you actually have a reason to display the sub-sites at all if this is the case? Wouldn't a single collaborated site for each language rather than separating them via country be the most useful thing here? It would probably only be the local laws that are different for each site I believe?
I can't say much more than that without seeing the site and having details I'm afraid but they are my general thoughts on the issues you are having.
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/3/12 3:30 AM|
>>Does EACH of the sub-domains e.t.c. have entirely unique and useful content? ... Wouldn't a single collaborated site for each language rather than separating them via country be the most useful thing here?
No, it's almost entirely duplicate content across the subdomains - product descriptions are the same, and category level pages only vary by product mix. I'd love to have a single collaborated site for each language; however, in practice it isn't that straightforward. Take our US & UK sites as an example: both sites are English language so in theory could be condensed into a single site, with prices dynamically changing based on IP location, however... a large number of products are exclusive to each country. The business categorically does not want to show a US exclusive product to a UK audience and vice versa - but a single site would a) show the products in both US & UK search results and b) I don't think we could get away with changing the HTML by IP for each market.
Also, stock levels differ across countries - product A may have stock in US but not in UK - again the business categorically does not want to show UK Out of Stock items to a UK audience. Currently the separate UK & US sites only list in-stock products. Up to 20% of UK products may be out of stock when compared against US. A single site would require us to have 20% of unavailable stock being listed on our UK site (and a similar equivalent on the US site). Clicking on products which then say 'out of stock' is a really bad user experience (especially as the majority of products are unlikely to come back in stock again).
I can't see a way around these differing product and stock problems except to keep them on separate sites, but would love to hear any suggestions if you have any. The above example is US & UK but we have other English language subdomains for Canada and Australia - with upwards of 5,000 products on each it's not possible to write differing, unique content for each - it's a) a lot of resource and b) there are only so many ways that product descriptions can be written and still make sense.
If you know of any sites who're overcoming these difficulties I'd love to know. In the mean time I'm going to continue chasing down the IP based options.
|Re: Stuck in geolocation hell. Full pages via CDN||Steven Lockey||7/3/12 5:06 AM|
The IP solution won't solve it due to the duplicate content.
Google will basically pick the one it likes best from the two of them. The IP signal isn't a big enough factor IMHO to make much of a difference here. I very much doubt this will address the problem.
The only realistic way I can see of doing this then would be to eliminate all the duplicate content. This will mean a lot of work, as you will have to rewrite all the content on all the sites which share the same language. The text on each one can then refer specifically to the country it is related to, which should make it more prominent in searches relating to that country.
Basically, your current plan seems doomed to failure to me. It's not an approach I would suggest as I don't believe it will achieve the results you are looking for. Many UK sites hosted in the UK still rank below their .com equivalents hosted in the US, regardless of the hosting. The larger, more popular of the sites will almost certainly have more links etc. back to it, so will display more prominently either way.
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/4/12 6:18 AM|
Thanks very much for your time and thoughts. You're entirely correct that we have a duplicate content nightmare and I completely agree that the Duplicate Content is a major problem - it's something that I'm actively pursuing, but I don't think this should be pursued exclusively and the Hosting/CDN aspect needs investigating too, as there's some interesting activity going on...
I do see both copies (eg. UK and US) of our pages stored in Google's index, with the cached copies of each URL confirming this. Google aren't looking at both copies and deciding to only store one in the Index, they are storing both. Don't get me wrong - there are instances when Google does remove one and, for example, returns the US version in both UK & US serps - this is definitely a Duplicate Content caused issue.
Why I'm looking at the IP/CDN setup is because, when Google are indexing both the UK and US versions of a page, they often (5-10% of the time) show the UK listing in the US serps, or the US listing in the UK serps. This feels like very strange behaviour and not a typical duplicate content caused scenario, hence wanting to look more closely at the IP/CDN setup. I've read on these forums about some problems relating to CDNs and Geolocation (I decided to use 'Geolocation Hell' in this thread's title after reading it on a similar thread about CDNs & Geolocation) and wanted to find out if anyone else has ever experienced this - but I can assure you the actual Duplicate Content issue is being pursued too, and I share your feelings on it.
Thanks again Steven, really appreciate your time and thoughts.
|Re: Stuck in geolocation hell. Full pages via CDN||Steven Lockey||7/4/12 6:36 AM|
Yep, that's how Google handles duplicate content: they store both pages, but when it comes to choosing a page to display, they only display one of the duplicates and the 2nd page simply gets filtered from the search results. Which one it chooses is generally based on the page rating. Unless it's a deliberately local search, it's very unlikely the geo-location of the server will actually influence this enough to put the one you want ahead of the main page, simply due to the many more high-quality links the main page is likely to have.
That's actually pretty standard behaviour: it's just picking out the page that has the best quality signals out of the two pages. You'll probably find the ones you are seeing pop up more often have more high quality websites linking directly to that page, or similar quality signals.
I'm trying to think of a site structure for a company like yours that would incorporate this into an individual site, but unfortunately it's actually quite difficult. I was thinking of doing something like the single site you mentioned above but filtering both prices AND product by user location, so it would only display the products for your location BUT if it was Google or Bing bot then display both.
I'm not sure, however, whether Google might class that as a type of cloaking, which would be bad. I would assume if you had a click-able section called 'products available in other countries' that contained the other products, that would actually solve this, so basically it would just change which section each product appears in (the main section or the 'hidden' other countries section) depending on the user's IP.
For any IP based solution however I would strongly advise giving the user the ability to select a country manually to over-ride the automatic detection, simply due to the use of VPNs/Proxies as well as mobile traffic which may appear to come from a location other than the user's actual location.
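A minimal sketch of the manual override suggested above, assuming the IP lookup has already produced a country code (or nothing) and that the user's explicit choice is stored somewhere such as a cookie. The country list, the `www` fallback and the function name `resolve_country` are hypothetical.

```python
from typing import Optional

# Hypothetical set of country subdomains; the real list is not given here.
SUPPORTED = {"uk", "us", "se", "ca", "au"}
DEFAULT = "www"  # stands in for the 'choose your location' entry page


def resolve_country(ip_country: Optional[str], user_choice: Optional[str]) -> str:
    """Pick the country site to serve. An explicit, valid user choice always
    wins over the IP-based guess, which may be wrong for VPN, proxy or
    mobile traffic."""
    for candidate in (user_choice, ip_country):
        if candidate and candidate.lower() in SUPPORTED:
            return candidate.lower()
    return DEFAULT


print(resolve_country("us", "uk"))   # user override beats the IP guess
print(resolve_country("us", None))   # no override: fall back to the IP guess
print(resolve_country(None, None))   # nothing known: entry page
```

Checking the user's choice first means a wrong IP guess costs the visitor at most one click on the country selector.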
Hope that helps!
|Re: Stuck in geolocation hell. Full pages via CDN||zihara||7/4/12 7:07 AM|
Have you tried putting this in your various headers:
<meta name="geo.position" content="lat;long">
<meta name="geo.placename" content="pick-a-city">
<meta name="geo.region" content="pick-a-country">
Schema.org coding might be helpful, too. Google's "rich snippet tool" picks all these up; I would expect the SE to, too, especially the schema.org...
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/5/12 3:25 AM|
Thanks Zihara. It's not currently on the site but it's in the Development stack along with meta language tags for Bing. Schema is also in there - pretty much anything related to Geo-location is in the stack. These are mostly meta related, but I don't want to ignore the IP/CDN impact as I understand it's the primary method for Geolocating sites/users.
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/5/12 3:25 AM|
Thank you so much for the time you've spent on this, I really do appreciate it. On the Dupe content getting filtered out - if I search for 'blue widgets' in the UK then uk.example.com/blue-widgets ranks, if I search the same in the US then us.example.com/blue-widgets ranks. However, when you search for our brand name in the UK and the blue widgets page appears in the sitelinks, it's the us.example.com/blue-widgets version that gets used. The US version does have more inbound links than the UK version so I think you are correct in terms of linking signals and believe it is the most likely cause, however...
In the above 'blue widgets' example, the US version also appears in our Australian and Canadian sitelinks - so I'd need to link build each version of the page for each country, which irks me as I'm a "spend your time improving the website for users and the rest will follow" kind of guy, and link building every country version both goes against the grain for me and doesn't feel like a sustainable way to resolve this.
I also am struggling to put together a suitable site structure - I'd also considered filtering prices and products by IP (with a country selector option) and although I know we could do this from a development perspective, it's too similar to the BMW situation of showing different content to users and search engines for my liking. As it stands the 'best fit' is an approach similar to how L'occitane deal with this - they take a perfectly good site and then slightly tweak it for each market to avoid duplicate content. For example: http://uk.loccitane.com/angelica,83,1,29954,0.htm & http://usa.loccitane.com/angelica,82,1,29278,0.htm
It just feels wrong doing this, you end up spending your time writing and developing for Search Engines rather than users. We spend our time writing good product descriptions, making them the best we can - I really don't want to spend $250k on a team of people to rewrite them for each market, making each version slightly different from every other version, when they are already the best descriptions they can be. Purposefully making tweaked descriptions doesn't feel like a step forward. As Google get better and better at detecting duplicate content, how sustainable is this option going to be? Making a page 30% different might work now, but in 6 months time this might be 40% - it's a game I really don't want to play. It'll work in the short term but it doesn't feel a long term solution.
I don't think I'm wrong in thinking it SHOULD be fine to have a US & UK version of the same page if there are valid scenarios for doing so - international companies do sell varied product stock, and have differing stock levels across markets. Google normally get most things right in the end so I think focusing on Geo-Targeting the site correctly might be the best long term approach, if not short term. In which case the best course of action is to work on all Geo-related areas and lobby Google to either a) state their view of how we should structure our site or b) consider making it acceptable to have duplicates across sites.
If you think up any other ideas for a suitable site structure please let me know!
Thanks again Steven, I've really appreciated the time you've put into my problem
|Re: Stuck in geolocation hell. Full pages via CDN||Steven Lockey||7/5/12 3:54 AM|
While I agree with you that it should be possible, unfortunately it puts too much emphasis on the webmaster to make sure every country is covered, or it's too exploitable by MFA sites. They could just copy your page, fill it full of ads, then add geo-location tags for a country you don't have a specific site for, and then that would display instead of the legitimate site for searches from that country.
It's a bit of a nightmare when you have to think about how the scammers/spammers could hack even the simplest thing like that.
I would agree the best option would be for Google to allow you to make a single international site, then you could simply include tags at the top of each page that basically say 'If country = x use this page instead' which would provide specific instructions to the search engines about which page you would rather display for each country, however I think this is probably a while off yet! Having the code on the main international page telling the SEs about the other pages is the only way I can see of doing it for the search engines that isn't fairly easy for the bad people to exploit.
Hope my thoughts helped anyway :)
|Re: Stuck in geolocation hell. Full pages via CDN||walter.luetisch||7/5/12 4:49 AM|
I am very new to this topic of SEO in-country page ranking and CDNs.
My site uses a ccTLD (.de).
The site is hosted with a German ISP (1&1).
Most of my visitors are from Germany.
I would like to understand whether there is a negative impact from serving my site via a CDN.
The page (HTML) and the embedded objects (gif, js, css, ...) would be served via a CDN.
My domain would be CNAMEd via DNS to point to a CDN domain.
My site is currently hosted in Germany but, as far as I understand, if I use a CDN it will be visible to Googlebot via a US IP address.
Would that have a negative impact on my so-called in-country ranking?
On the other hand, I had believed that using a CDN would improve my page download time and therefore also my SERP ranking, but based on this discussion I am not sure anymore.
Would the page download time/performance improvement help to lower the risk (in case there is a risk)?
It would be great if somebody could help me understand this in more detail.
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/5/12 12:43 PM|
There already is a solution for 'If country = x use this page instead' in the form of the hreflang alternate meta tags. We've implemented this and seen some really good results, it just hasn't done it perfectly (blue widgets example).
I think Google must go through a process of asking "is this a legitimate site? does it have a minimum level of quality links? does it have a minimum level of social 'follows'?" If the answers to these questions are all yes then the whole domain (and subdomains) should be trusted, and the hreflang alternate tags trusted above the other signals such as server IP/Inbound link location. If a webmaster actively states X site is for X country then that should be trusted, rather than suspecting that the webmaster doesn't know what they're doing or has copied it from a friend's template. Even that can be worked around with a GA/GWT file upload & verification process.
If a scammer/MFA does clone a site including the content and tags on their own site, then that site should have to go through the "is this a legitimate site?" process.
I've really enjoyed your thoughts, thanks again for taking the time to offer them. If you're ever in the big smoke looking for a SEO position send me a message :)
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/5/12 12:48 PM|
As you've got a .de site, that's already a very explicit instruction to Google that you're targeted to DE. I don't think you should have any worries about using a CDN, but... as with everything SEO related - make sure you can measure everything.
Use software to measure ranking positions, make sure your analytics is working properly and you have a GWT account then make the change confident that if anything unexpected does happen, you'll be able to spot it.
I'd recommend a code freeze whilst you do this. You don't want to make other changes that may affect your site's performance at the same time, as this will cloud your analysis and decrease your confidence in knowing what has happened.
Also - let us know how you get on afterwards, it's always good to hear from other webmasters.
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/6/12 1:38 AM|
Google - do you have a recommended approach for international sites like ours that have slightly differing product offerings, and differing stock levels? That doesn't involve showing 'Not for sale in this country' or 'Out of Stock in this Country' messages?
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||7/10/12 2:54 AM|
Bump - is the L'occitane method of marginally differentiating the pages the way to go?
|Re: Stuck in geolocation hell. Full pages via CDN||Steven Lockey||7/10/12 3:17 AM|
Nope, they need to be very different. Google is getting better and better every day at picking up when it's the same content just slightly re-worded.
IMHO full rewrite or not worth it.
|Re: Stuck in geolocation hell. Full pages via CDN||JohnMu||7/11/12 2:58 PM|
Backing up to your original question, in short:
- Using subdomains like that is fine
- There's no need to artificially use local IP addresses for each country-site
- Duplicate content for multi-regional sites is fine & common (it's not perfect, but we live with it)
- Use the hreflang where-ever possible
- Geotargeting does not guarantee that only one location's sites are visible in search
With the hreflang, keep in mind that you need to do this on a per-page basis, and it has to be confirmed from the other language/location pages. When using that markup, we try to swap out the URLs for better, local versions, where we have them. This doesn't change ranking though.
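A minimal sketch of the per-page approach described above, assuming each product path exists on some subset of country subdomains. The `SITES` mapping, the hreflang values chosen, and the function name `hreflang_links` are hypothetical; the output is the standard `<link rel="alternate" hreflang="...">` element.

```python
from html import escape
from typing import Dict, List

# Hypothetical mapping of hreflang values to country subdomains.
SITES: Dict[str, str] = {
    "en-GB": "http://uk.example.com",
    "en-US": "http://us.example.com",
    "en-AU": "http://au.example.com",
}


def hreflang_links(path: str, available_in: List[str]) -> List[str]:
    """Build the <link> elements for one page. Every country version of the
    page should carry this same set, so that each annotation is confirmed by
    the pages it points to."""
    links = []
    for code in sorted(available_in):
        href = escape(SITES[code] + path, quote=True)
        links.append(f'<link rel="alternate" hreflang="{code}" href="{href}" />')
    return links


# A product sold only in the US and UK lists just those two versions.
for line in hreflang_links("/blue-widgets", ["en-US", "en-GB"]):
    print(line)
```

Passing the list of markets per page keeps country-exclusive products from being annotated with versions that do not exist.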
If you have specific search results where you're seeing bad results due to geotargeting, I'd love to pass those on to the team to review.
Hope this helps!
|Re: Stuck in geolocation hell. Full pages via CDN||vasugudala||7/16/12 10:24 AM|
We have multilingual websites on different ccTLDs. Content across all the websites is almost the same, and we have geotargeted each ccTLD to its country using Google Webmaster Tools. All the websites are served from a CDN.
Now the problem is that when I search in Google for site:http://wego.co.in/ , we can see the co.in domain indexed in Google, but if we look at the cached page it shows the wego.com.au version. This problem appears in Google but not in other search engines. Is there any reason why Google is showing the wrong URL in its cached version? I checked with my tech team and, honestly, we are not doing any redirection aimed at search engines.
|Re: Stuck in geolocation hell. Full pages via CDN||Christopher Semturs||7/16/12 3:55 PM|
so what is the underlying question? I guess the URL being displayed doesn't really disturb you, what is the problem you experience which you set in connection to the display of the cached URL?
P.S: Posting once is sufficient. :-)
|Re: Stuck in geolocation hell. Full pages via CDN||vasugudala||7/16/12 9:11 PM|
Thanks for your time. My problem is that when users try to land on the Wego portal from the Google SRP, they see the Australian website's metadata in the SRP, which will affect my CTR from the Google SRP. Why can't Google understand the correct pages when it crawls my portal?
|Re: Stuck in geolocation hell. Full pages via CDN||Christopher Semturs||7/17/12 7:41 AM|
can you post a concrete example query so that I can understand the problem better?
|Re: Stuck in geolocation hell. Full pages via CDN||SummerBarker85||8/29/12 4:53 AM|
Thanks for your time on this, John. As time has progressed we've seen Google get better and better at displaying the correct URLs in their respective markets (it must have just taken a while for it to have made its way into Google's system). At this stage I'm not concerned about the SEO benefits or costs of doing this, only that users are seeing the relevant results for their market.