|AJAX Crawling working with hash but not meta tag||tmoyer||3/19/11 4:41 PM|
I have read the FAQs and checked for similar issues: YES
My site's URL (web address) is: toddmoyer.net/blog
Description (including timeline of any changes made):
I have an AJAX site, and <meta name="fragment" content="!"> has been added to all the pages (all the pages of the site are root pages with hash marks). My snapshot mechanism seems to be working fine. Here's an example:
AJAX page: http://toddmoyer.net/blog
SNAPSHOT page: http://toddmoyer.net/blog?_escaped_fragment_=
But when I use Fetch as Googlebot, the page it gets is the AJAX version.
However, when I test the site with this URL: http://toddmoyer.net/blog#!
The snapshot version is shown by Fetch as Googlebot.
So it would seem my meta tag is not being recognized. Any help would be appreciated.
*all the pages of the site are root pages with-OUT hash marks
|Re: AJAX Crawling working with hash but not meta tag||webado||3/19/11 5:07 PM|
This is what web-sniffer.net sees:
<HTML xmlns="http://www.w3.org/1999/xhtml" xmlns:og="http://ogp.me/ns#" xmlns:fb="http://www.facebook.com/2008/fbml" xml:lang="en" lang="en">
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="fragment" content="!"></meta>
var _gaq = _gaq || [];
ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js';
var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);
<link type="text/css" href="/css/jqueryui/jquery-ui-1.8.6.custom.css" rel="stylesheet" />
<!-- start padding
end padding -->
|Re: AJAX Crawling working with hash but not meta tag||webado||3/19/11 5:08 PM|
Which is the same as what I see in the browser.
|Re: AJAX Crawling working with hash but not meta tag||tmoyer||3/19/11 5:16 PM|
What URL is that for? That looks like the source for the AJAX version, not the SNAPSHOT version.
Are you familiar with the AJAX crawling spec? http://code.google.com/web/ajaxcrawling/docs/getting-started.html
The AJAX version: http://toddmoyer.net/blog
The SNAPSHOT version: http://toddmoyer.net/blog?_escaped_fragment_=
|Re: AJAX Crawling working with hash but not meta tag||webado||3/19/11 7:00 PM|
I tested http://toddmoyer.net/blog#!
This would be good: http://toddmoyer.net/blog?_escaped_fragment_= as it has actual content.
No, not familiar with Ajax crawling. Would using a canonical link element that gives the url of what you call the snapshot page help?
But this may be an issue too:
|Re: AJAX Crawling working with hash but not meta tag||webado||3/19/11 7:02 PM|
|Re: AJAX Crawling working with hash but not meta tag||tmoyer||3/19/11 7:24 PM|
Thanks for the validator tip. It looks like I have a little issue with the case of my <html> and <head> tags.
Here's how I know that the validity of the HTML is not the problem: I'm using the "Fetch as Googlebot" feature for testing, and when I put #! on my URLs (to denote an AJAX page), the bot shows the content of my static snapshot page (as expected).
The only problem is that I would prefer not to have to submit all my URLs to Google with #! on them. The specification says the addition of the <meta name="fragment" content="!"> tag will work, and Googlebot will respond by fetching the page with ?_escaped_fragment_= on the query string, which provides the snapshot.
Everything checks out ok on my end, so I'm wondering if there's something I'm missing or there's a problem with Googlebot.
Do you know if it would be possible to escalate my issue to someone at Google knowledgeable about the AJAX crawling system?
|Re: AJAX Crawling working with hash but not meta tag||tmoyer||3/19/11 7:26 PM|
>> I'd say that's good, if by that you mean that when you try Fetch...
Yes, that part is working.
|Re: AJAX Crawling working with hash but not meta tag||webado||3/19/11 7:33 PM|
I will see if I can get somebody's attention on this.
|Re: AJAX Crawling working with hash but not meta tag||cristina||3/20/11 9:08 AM|
Can you look in your server access logs to see what URLs are requested by the real Googlebot (not only by Fetch as Googlebot)?
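One way to run that check is sketched below. It assumes the server writes the common "combined" access log format and only filters on the user-agent string, which can be spoofed, so treat matches as a hint rather than proof. The sample log lines, IP addresses, and the function name `googlebot_requests` are hypothetical.

```python
import re

# Matches the request, status, size, referrer and user-agent fields of a
# "combined" format access log line (an assumption about the server setup).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_requests(lines):
    """Return (path, is_snapshot_request) for lines whose user agent mentions Googlebot."""
    hits = []
    for line in lines:
        match = LOG_LINE.search(line)
        if match and "Googlebot" in match.group("agent"):
            path = match.group("path")
            # A snapshot request carries the _escaped_fragment_ query parameter.
            hits.append((path, "_escaped_fragment_=" in path))
    return hits

# Hypothetical log excerpt, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [20/Mar/2011:09:15:02 +0000] "GET /blog HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.10 - - [20/Mar/2011:09:15:03 +0000] "GET /blog?_escaped_fragment_= HTTP/1.1" 200 8042 "-" '
    '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '203.0.113.7 - - [20/Mar/2011:09:16:40 +0000] "GET /blog HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (Windows NT 6.1)"',
]

for path, is_snapshot in googlebot_requests(SAMPLE_LOG):
    print(path, "snapshot" if is_snapshot else "plain")
```

If the real log shows a plain request for a page followed by a request for the same page with `_escaped_fragment_=`, that would suggest the meta tag is being picked up during crawling.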
|Re: AJAX Crawling working with hash but not meta tag||JohnMu||3/20/11 2:54 PM|
It's good to see more sites using the AJAX crawling proposal :-)!
Looking at your blog's homepage, one thing to keep in mind is that the Fetch as Googlebot feature does not parse the content that it fetches. So when you submit http://toddmoyer.net/blog/ , it fetches that URL. After fetching the URL, it doesn't parse it to check for the "fragment" meta tag, it just returns it to you. However, if you fetch http://toddmoyer.net/blog/#! , then it should rewrite the URL and fetch the URL http://toddmoyer.net/blog/?_escaped_fragment_= .
When we crawl and index your pages, we'll notice the meta-tag and act accordingly. It's just the Fetch as Googlebot feature that doesn't check for meta-tags, and instead just returns the raw content.
I hope that makes it a bit clearer!
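The URL rewrite described above can be sketched in a few lines. This is only an illustration of the mapping from a `#!` URL to its `_escaped_fragment_` form, not Google's code; the function name `to_escaped_fragment_url` is made up, and the set of characters left unescaped is an assumption based on one reading of the AJAX crawling proposal.

```python
from urllib.parse import quote, urlsplit, urlunsplit

def to_escaped_fragment_url(url: str) -> str:
    """Map a #! URL to its _escaped_fragment_ form; return other URLs unchanged."""
    parts = urlsplit(url)
    if not parts.fragment.startswith("!"):
        return url  # no hash-bang, so nothing to rewrite
    # Everything after "#!" becomes the parameter value. Characters such as
    # '#', '%', '&', '+' and spaces are percent-encoded (assumed escaping set).
    value = quote(parts.fragment[1:], safe="!$'()*,/:;=?@")
    param = "_escaped_fragment_=" + value
    query = parts.query + "&" + param if parts.query else param
    # Rebuild the URL with the new query string and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))

print(to_escaped_fragment_url("http://toddmoyer.net/blog/#!"))
print(to_escaped_fragment_url("http://www.example.com/ajax.html#!ajax1"))
```

A URL without `#!`, such as http://toddmoyer.net/blog/ , passes through unchanged, which matches the Fetch as Googlebot behavior reported in this thread.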
|Re: AJAX Crawling working with hash but not meta tag||tmoyer||3/20/11 10:11 PM|
Ah ha! Thanks for the info. I had a feeling that Fetch As Googlebot might just not be parsing them.
It would probably be a good idea to make a note of this on the Fetch page so this doesn't come up again and again with other folks. Or better yet, make it parse the meta tags. ;)
Regarding use of the AJAX proposal, happy to oblige! It's great that the system exists.
While I have your attention, I have a proposal of my own:
|Re: AJAX Crawling working with hash but not meta tag||ramya.krishna2525||3/25/11 3:04 AM|
Facing a similar kind of issue:
In Fetch as Googlebot, the URL without the hash fragment does not return the AJAX content, but explicitly adding #! to the URL does.
http://www.example.com/ajax.html does not return the AJAX content,
whereas http://www.example.com/ajax.html#!ajax1 does.
How do we confirm the Ajax crawling is successful?
i) http://www.example.com/ajax.html contains <a href="#!ajax1"/>. When the Google crawler actually crawls this page, will it identify this hash fragment, encode the URL, and request it?
ii) Should the HTML snapshot be the same as what the user sees in the browser? Or, for the Google crawler, is it the AJAX content that matters rather than how the content is laid out on the HTML page?
Any help will be appreciated.
|Re: AJAX Crawling working with hash but not meta tag||markthesnowman||3/26/11 2:53 AM|
Thanks for posting your comments about "Fetch as GoogleBot" ignoring the metatag. I can confirm that this also happens for me. The "Fetch as GoogleBot" is working well for the pages with #! fragments but not for the home page with the metatag. (it seems to simply ignore the metatag).
I did a test where I incorporated the <meta name="fragment" content="!"> into a page, used "Fetch as GoogleBot", and then wrote out the expected query string using a PHP command: <h2><?php echo "Values in Query string are".$_SERVER['QUERY_STRING']; ?></h2>. The value of $_SERVER['QUERY_STRING'] was null.
The way "Fetch as GoogleBot" handles the #! pages versus the metatag pages is a bit inconsistent. Hopefully, "Fetch as GoogleBot" can be made to work for the metatag pages, but in the interim a note regarding this limitation would be appreciated.
|Re: AJAX Crawling working with hash but not meta tag||JohnMu||3/26/11 4:00 AM|
As I mentioned above, the Fetch as Googlebot feature was purposely designed to show the data in the raw format, so that you can diagnose issues that come directly from those requests. By using a meta-tag on your pages, the "real Googlebot" would also have to fetch it normally first, and afterwards fetch the #!-version. This is similar to redirects, where we'd fetch the original URL first, see the redirect, and then fetch the final URL (Fetch as Googlebot also shows the redirecting page, not the final one).
That said, I'm happy to pass your feedback on to the team here, perhaps there's a way we could optionally do both :-).
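The two-step behavior described above (fetch the page normally, notice the meta tag, then request the snapshot) can be sketched as follows. This is a rough illustration under stated assumptions, not a description of how Googlebot is implemented; `FragmentMetaFinder` and `snapshot_url_for` are hypothetical names, and the HTML is passed in as a string rather than fetched over the network.

```python
from html.parser import HTMLParser

class FragmentMetaFinder(HTMLParser):
    """Records whether the page contains <meta name="fragment" content="!">."""

    def __init__(self):
        super().__init__()
        self.opted_in = False

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        attr = dict(attrs)
        if tag == "meta" and attr.get("name") == "fragment" and attr.get("content") == "!":
            self.opted_in = True

def snapshot_url_for(url, html):
    """Return the snapshot URL to request next, or None if the page has not opted in."""
    finder = FragmentMetaFinder()
    finder.feed(html)
    if not finder.opted_in or "#" in url or "_escaped_fragment_=" in url:
        return None
    # Step two: request the same URL with an empty _escaped_fragment_ parameter.
    separator = "&" if "?" in url else "?"
    return url + separator + "_escaped_fragment_="

page = '<html><head><meta name="fragment" content="!"></head><body></body></html>'
print(snapshot_url_for("http://toddmoyer.net/blog", page))
```

A raw fetch corresponds to step one only, which is why the output shown for a page with the meta tag is the AJAX version.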
|Re: AJAX Crawling working with hash but not meta tag||webado||3/26/11 4:03 AM|
However, if your fragment is handled internally by URL rewriting to a URL with a query string, then you will get that with the QUERY_STRING server variable.