Google Product Forums

Re: Approximate number of pages indexed

oslegov Aug 27, 2009 8:37 AM
Posted in group: Google Custom Search

Categories: Indexing and Results

Hello Audiophile:

We've run into this issue too. I've been at this since about April, with well over 100 URLs being crawled. At our high point we had nearly 200,000 pages listed in our indexing statistic. We're now at zero. I have asked about this twice in this forum in the past 4 months and have not received an answer from a Google employee. My guess is that they're very reluctant to get concrete about how the indexing works, even though those of us depending on this tool for enterprise or near-enterprise purposes would appreciate some answers. As there are now three of us discussing this very issue, it would be nice to get some sort of official answer from Google, even along the lines of "this is a question about proprietary practices that we're not able to answer at this time."  ARE YOU LISTENING, GOOGLE?

Here's my best guess based on experience and limited testing: There's a CSE index and a Google Main index. Over time, and as long as you have pages that are eligible for PageRank, Google grabs pages from your CSE and adds them to its Main index. As Google's crawlers pass through and grab these pages for Google Main, your CSE's indexing statistic decreases. Performance-wise you're not going to notice much of a difference, as Google's Main index is always available to your CSE (this is documented in a couple of places). In other words, I've learned to stop worrying about that CSE indexing stat because it just doesn't reflect what's available results-wise to my users. My CSE is used by a giant domain with thousands of users a day for a pretty large entity. We just switched from maintaining our own server-based solution, and the CSE performs far better than that far more expensive solution; we have tested it with objective criteria over time. That said, I'd sleep a little better at night if this indexing process were explained better officially.