
The number of pages/URLs in the SERPs/GWMT/Sitemaps decreasing/going down/fluctuating/don't match :: Auto-Response ::

The number of pages/URLs in the SERPs/GWMT/Sitemaps decreasing/going down/fluctuating/don't match :: Auto-Response :: Autocrat 9/24/09 2:34 AM
This is an Auto-Response for the (very) common question regarding the Total number of indexed pages decreasing/not matching the Sitemaps/SERPs figures.
This is an attempt to compile information for questions/issues such as:

Q: Why is the total number of pages in the SERPs changing/decreasing?
Q: Why is the total number of pages in the Sitemap URLs changing/decreasing?
Q: Why is the total number of pages in the GWMT changing/decreasing?
Q: Why am I seeing changes/fluctuations in the figures in the SERPs/GWMT/Sitemaps?
Q: Why don't the numbers in GWMT/Sitemaps/SERPs match?
Q: Why are pages/URLs being deindexed?
Q: Why am I losing pages?
Q: Why is Google deleting my pages from the SERPs?
etc. etc. etc.


=============   DataCenters   =============

First on the "you should know" list ... the DataCenters.
Google holds tons of data.  That data is held in various places - DataCenters (DC's).

When you make a search - it gets passed to several DC's. 
The results you see will come from one DC.

DC's may hold different data, depending on when they were updated etc.
This means you may be talking to a DC that is more/less up to date than some others ... so the results may/can vary.

(Note: In some cases, the DC's used may vary - things like browsers and ISPs/Location etc. seem to influence it)


=============   SERPs may depend on you   =============

You have to remember that Google attempts to be very helpful, and personalise your searches.
So if you are logged into a G account - you may see different results than if you are logged out.
This may affect the figures as well.


=============   SERPs are Estimates!   =============

Okay - let's get this one dealt with too.
The SERPs - the figures you see when you do a search ... those are NOT accurate in many cases.
In fact - Google does tell you this ... it clearly says    "of about"  - it does Not say "exactly".

To the point - as you click through the pages (using the pager at the bottom), the chances are damned good that the number you see quoted will change.  It may start off on page1 saying "of about 1270" ... and yet on page2 it may say "of about 421".
So Click Through the Pager and watch the figures.



=============   SERPs show what you tell them to!   =============


Make sure you are searching "correctly".
Doing a search for    www.somesite.com    is NOT the same as searching for    site:www.somesite.com.
The 1st search will show ANY reference to that bit of text.
The 2nd search will show content FROM that Domain.

There may be 1000 pages on the net that mention/contain/refer to your Domain.
But your site may only have 40 pages!
So make sure you are searching properly!!!

And remember - there is a difference between    site:www.somesite.com    and    site:somesite.com   - so check both (helps to see canonical issues)
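
To make the difference between those searches concrete, here is a minimal sketch that builds the three queries side by side so you can open each one and compare the figures. The function names are made up for illustration, and www.somesite.com is just the placeholder domain used above - swap in your own.

```python
from urllib.parse import urlencode


def search_url(query: str) -> str:
    """Build a Google web search URL for the given query text."""
    return "https://www.google.com/search?" + urlencode({"q": query})


def queries_to_compare(host: str) -> dict:
    """Return the three searches discussed above for one hostname."""
    # Strip a leading "www." so both host variants can be built.
    bare = host[4:] if host.startswith("www.") else host
    return {
        "any mention of the text": f"www.{bare}",
        "content from the www host": f"site:www.{bare}",
        "content from the bare domain": f"site:{bare}",
    }


for label, query in queries_to_compare("www.somesite.com").items():
    print(f"{label:30} {search_url(query)}")
```

Open each URL and note the "of about" figure - if the two site: searches give very different numbers, that is a hint to go look for canonical issues.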

For more info - please read;
   My site has been Deindexed // Google has removed my Site // I cannot find my Site ::: Auto Response :::
   http://www.google.com/support/forum/p/Webmasters/thread?tid=086d3f2cddff2869&hl=en
+
   Google search basics: More search help
   http://www.google.com/support/websearch/bin/answer.py?hl=en&answer=136861


=============   SERPs may be Filtered   =============


Google wants to please its users.
That means that it reserves the right to be a little fussy/picky with what it shows.
So it will filter the results.

Common things to get filtered are;

* Duplicated Content :
      If it sees/has indexed the same content on multiple URLs - it may only count 1 version!
      This includes Internal and External Duplication (Canonical issues, Multiple DomainNames, Multiple URLs, Copied/Scraped content etc.)

* Highly Similar/Repeated Content :
      If it sees content that is very (VERY) similar to other content on your site, or content that is little more than a slightly changed copy of content from other sites - it may not bother showing it.
      This includes "cookie cutter", "cookie cutout" and "boilerplate" content - like changing the main keyword/location, but leaving everything the same (or very similar).

* Blank Content :
      If it sees pages that are "empty" - just the general page template/design and no real content, or very short/small content, it may not bother showing it.
      This includes things termed as "stubs" and "placeholder" pages - basically things that have no/little value to users.

* Weak Content :
      If it sees pages/URLs that are "unpopular" and "buried" - then it may decide not to bother with them.
      This may include pages that possess no internal links (orphans), or that can only be reached by numerous clicks (you have to click through 8 pages to get to it!).

* Unreliable Content :
      If Google tries to access a URL and gets timedout, refused or a poor response ... and it keeps happening, it may not want to bother.
      This can include pages that seem to take too long to load/respond, that aren't always sending 200/304 responses, if the site/server becomes unstable etc.

So if you have some form of Duplicated/Repeated content and/or Blank Content and/or Weak Content and/or Unreliable Content - and Google finally figures this out ... the results may change as it starts applying Filters and cleaning up your mess.
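
If you want a rough idea of whether your own site has the Duplicated or Blank Content problems described above, a small script can help. This is only a sketch - it is NOT how Google does its filtering. It assumes you have already pulled the main text of each page into a dict (the `pages` data below is invented), and the `min_words` threshold is an arbitrary number picked for illustration.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def normalise(text: str) -> str:
    """Collapse whitespace and lowercase, so trivial differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def duplicate_groups(pages: Dict[str, str]) -> List[List[str]]:
    """Group URLs whose normalised text is identical."""
    by_hash = defaultdict(list)
    for url, text in pages.items():
        digest = hashlib.sha256(normalise(text).encode("utf-8")).hexdigest()
        by_hash[digest].append(url)
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]


def thin_pages(pages: Dict[str, str], min_words: int = 20) -> List[str]:
    """List URLs with fewer than min_words words (assumed threshold)."""
    return sorted(
        url for url, text in pages.items()
        if len(normalise(text).split()) < min_words
    )


# Hypothetical data: URL -> main text of the page.
pages = {
    "http://somesite.com/widgets": "Blue widgets for sale in London",
    "http://www.somesite.com/widgets": "Blue   widgets for sale in LONDON",
    "http://www.somesite.com/coming-soon": "Coming soon",
}
print(duplicate_groups(pages))
print(thin_pages(pages, min_words=3))
```

Exact-match hashing only catches straight copies (the www/non-www pair above). It will not catch "cookie cutter" pages where a keyword or location has been swapped - that needs some form of similarity comparison.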

Following the WebMaster Guidelines and good practice should help prevent/limit the effects of Filtering.
Original/Unique and Useful content on a well structured site that responds correctly and in a timely fashion is the way to go!
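
To check the "responds correctly and in a timely fashion" part yourself, something like the following sketch can be pointed at your own URLs. It uses only the Python standard library. The 3 second "slow" cut-off is an assumption made for illustration, not a figure from Google, and a single request from your own machine tells you nothing about what Googlebot saw last week - run it repeatedly if you suspect instability.

```python
import time
import urllib.error
import urllib.request


def classify(status, seconds, slow_after=3.0):
    """Turn a status code (or None) and response time into a short verdict."""
    if status is None:
        return "no response"
    if status not in (200, 304):
        return f"unexpected status {status}"
    if seconds > slow_after:  # assumed threshold, adjust to taste
        return "slow"
    return "ok"


def check(url, timeout=10.0):
    """Fetch one URL and classify the result. Redirects are followed."""
    start = time.monotonic()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # Non-2xx answers (404, 500, ...) arrive here with a status code.
        status = error.code
    except (urllib.error.URLError, OSError):
        # Timeouts, refused connections, DNS failures etc.
        status = None
    return classify(status, time.monotonic() - start)


# Example use (placeholder URL, replace with your own pages):
#     print(check("http://www.somesite.com/"))
```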


=============   GWMT/Sitemap figures   =============

For Google to list something here - it must have found it or been told about it.
This means it may take some time for it to find things.

As with the SERPs - you may see some form(s) of Filtering - such as if you have Duplication issues.
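
Before comparing the Submitted figure against anything else, it can be worth checking what your Sitemap actually contains. The sketch below counts the `<loc>` entries in a standard XML Sitemap and how many of them are unique. The sample XML is invented for illustration, and the code assumes a plain urlset file (not a Sitemap index).

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> value listed in a urlset Sitemap, in order."""
    root = ET.fromstring(xml_text)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical Sitemap with one URL listed twice.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.somesite.com/</loc></url>
  <url><loc>http://www.somesite.com/page1</loc></url>
  <url><loc>http://www.somesite.com/page1</loc></url>
</urlset>"""

urls = sitemap_urls(SAMPLE)
print("submitted:", len(urls), "unique:", len(set(urls)))
```

If the unique count is lower than the submitted count, you already have one reason for the figures not to line up - before Google has done any Filtering at all.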


=============   The figures don't match!   =============

I cannot believe you have just read all the above - and are still trying to figure out why the numbers do Not match!

What - with all the variances due to DC's, the Estimates, some Filtering etc. - I'm Not surprised they don't match!


---------------------------------------------------------------------------------------------------------------

   NOTE:

This is a "general auto-response" post.

This is not a Topic for discussion - it is a point of reference, to save having to type the same answer repeatedly given the sheer number of times this question is asked. It is meant as an aid for people who don't seem to search/read the various other posts regarding this topic.
Thank you for taking the time to read this Auto-Response.


2010-07-27


[Append]


=============   0 Indexed / # Submitted, Zero Indexed   =============

This seems to be a common question now...

--- If the Site is new ---
Then you are Not gonna have anything Indexed yet.
Give it some time and you will see it change.

--- If the Verification is new ---
Then give Google the chance/time to go look at its DB,
then figure what to report back with

--- If the Sitemap was ReSubmitted ---
Then give Google the chance/time to go look at its DB,
then figure what to report back with

--- If the Sitemap was recently regrabbed/redownloaded by G ---
Then give Google the chance/time to go look at its DB,
then figure what to report back with

Getting it ?

(Note: Yes - I know ... there was a previous bug/issue ... but it Still wasn't something to worry about, nor did it ever have an effect on what was indexed nor how it ranked!  GWMT is a Reporting tool (mainly) - so such things don't directly affect you/your site!)


2011-01-31

[Append]

=============   How do I find out what is Indexed then?   =============

Well - here's another AR that should help answer that question;

How to know what Google has indexed / How to find out what is in the Index / List of indexed URLs ::: Auto-Response :::
http://www.google.com/support/forum/p/Webmasters/thread?tid=378b3cdf485ee3a7&hl=en


Re: The number of pages/URLs in the SERPs/GWMT/Sitemaps decreasing/going down/fluctuating/don't match :: Auto-Response :: attabatta 6/4/10 11:36 PM
General question : Why number of pages submitted in sitemap increases indexed page with Google?
Distinct question : My sitemap file contains  91,037 but my indexed files 18,888 ,why?
Re: The number of pages/URLs in the SERPs/GWMT/Sitemaps decreasing/going down/fluctuating/don't match :: Auto-Response :: Autocrat 6/5/10 4:41 AM
???




???