Google's index corrupt?
Google's index corrupt? sgbotsford 5/5/12 7:58 AM
I have read the FAQs and checked for similar issues: 

My site's URL (web address) is:

I'm puzzled. I just got a call from a would-be customer asking about prescription drugs. My web site comes up in a search for Alberta prescription drugs. To narrow the search I tried {prescription drugs sherwoods forests} and got several hits.

Following the link shows my normal page. Viewing the source for that page and searching for the word "drugs" gets no hits.

Executing the SAME search on Bing doesn't show this problem.  Ditto

Logging into my server, changing to the top level of my site, and executing " grep -r drugs . " (recursive grep) gets no hits.

It appears that the index for my site on Google has gotten scrambled.

Is this the best place to report this?

Is there an alternative explanation?

Re: Google's index corrupt? luzie 5/5/12 8:13 AM
Hi sgbotsford,

I'm afraid your site could have been hacked ...

Check your URL with the "Fetch as Googlebot" feature in Webmaster Tools and look at the result. You may find the keywords in question in the code shown there.

Re: Google's index corrupt? redleg-redleg 5/5/12 8:17 AM
Unfortunately there is an alternative explanation: your site has been hacked with a "Pharmacy" hack. It looks like you have a conditional hack: when the user agent is Googlebot, the "page(s)" returned by your site are spammy pharmaceutical pages.

When a request is made for a page on your site, the request provides your server with some additional information beyond which page is being requested. The information provided varies, but typically the request will include the user agent, i.e. the browser being used to make the request, and the referring page, i.e. the page that contains the link being clicked on, such as a search results page. Hackers place code on the site to detect the values of the user agent and/or referring page and use these values as conditions for their hacks. These types of conditional hacks are frequently used by hackers because it normally takes longer for site owners to discover and remove them.
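To make the idea of a conditional hack concrete, here is a minimal Python sketch of the kind of check such injected code might perform on the request headers. It is an illustration only, not the code found on this site: the function name `serves_spam` and the exact substrings it tests are assumptions, and real hacks are usually written in the site's own server-side language and heavily obfuscated.

```python
def serves_spam(user_agent: str, referer: str = "") -> bool:
    """Hypothetical condition a cloaking hack might evaluate per request.

    Returns True when the request looks like it comes from Google's crawler
    or from a visitor who clicked through from a Google results page.
    """
    ua = user_agent.lower()
    ref = referer.lower()
    is_crawler = "googlebot" in ua      # condition on the user agent
    from_search = "google." in ref      # condition on the referring page
    return is_crawler or from_search


googlebot = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
browser = "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20100101 Firefox/12.0"

print(serves_spam(googlebot))                                    # crawler request
print(serves_spam(browser))                                      # direct browser visit
print(serves_spam(browser, "http://www.google.com/search?q=x"))  # click from a results page
```

A site owner who types the URL directly into a browser matches neither condition, which is why the page looks normal when you check it yourself.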

You can use one of the tools like  to request the page with the user agent set to Googlebot and see the results, then go back and request the page with the user agent set to a browser. Another good check is to use the Fetch as Googlebot utility in your Webmaster Tools account to request the pages and see the content that is being returned to Google for indexing. The hack is a bit unusual in that when the user agent is a browser you get an HTTP status code of 404 and your 404 page is returned, but when the user agent is Googlebot you get an HTTP status code of 200, then a div of the spammy content, then your 404 page.
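If you would rather script the comparison yourself, something along these lines should work. It is a rough sketch using only the Python standard library; the helper names `fetch` and `looks_cloaked`, the keyword list, and the example URL in the comment are assumptions for illustration, not part of any Google tool.

```python
import urllib.error
import urllib.request

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20100101 Firefox/12.0"


def fetch(url, user_agent, timeout=15):
    """Request url with the given User-Agent; return (status_code, body_text)."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # Error responses such as 404 still carry a body worth inspecting.
        return err.code, err.read().decode("utf-8", errors="replace")


def looks_cloaked(bot_result, browser_result,
                  keywords=("viagra", "cialis", "pharmacy")):
    """Flag a difference in status code, or spam keywords seen only by the bot."""
    bot_status, bot_body = bot_result
    browser_status, browser_body = browser_result
    bot_hits = {k for k in keywords if k in bot_body.lower()}
    browser_hits = {k for k in keywords if k in browser_body.lower()}
    return bot_status != browser_status or bool(bot_hits - browser_hits)


# Example use (hypothetical URL; substitute a page from your own site):
# url = "http://example.com/some-page"
# print(looks_cloaked(fetch(url, GOOGLEBOT_UA), fetch(url, BROWSER_UA)))
```

Some hacks also check whether the request comes from one of Google's IP addresses, so a clean result from a spoofed user agent does not prove the site is clean. Fetch as Googlebot remains the more reliable test.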

A search using the search operator site: viagra    will give you an idea of the URLs involved. Check the cached versions of the pages and you can see the content being returned to Googlebot for indexing. Unfortunately the "Pharma" hack is fairly common. I have a blog post at that provides some tips on what to look for in your code.