5,507 Not found errors. Anyone know the cause?

5,507 Not found errors. Anyone know the cause? Filmonic 8/28/12 6:00 AM
I've just checked Webmaster Tools and found I have the following error:

Increase in not found errors
Aug 23, 2012

Google detected a significant increase in the number of URLs that return a 404 (Page Not Found) error. Investigating these errors and fixing them where appropriate ensures that Google can successfully crawl your site's pages. 

When checking I found they were all random numbers such as below. There's 5000+ of them. Can anyone help detect the cause? My website is filmonic.com


#    URL              Response code   Detected
59   1345130958000    404             8/20/12
60   1345128268000    404             8/20/12
61   1345127495000    404             8/20/12
62   1345128296000    404             8/20/12
63   1345128652000    404             8/20/12
64   1345128593000    404             8/20/12
65   1345128692000    404             8/20/12
Re: 5,507 Not found errors. Anyone know the cause? Ron 2 9/1/12 4:00 AM
I am getting a sudden drastic increase in 404's too and have no idea why.  Did you eventually find the cause of your increase?

Re: 5,507 Not found errors. Anyone know the cause? William Rock 9/1/12 9:15 AM
Have you run a crawl tool like Xenu's Link Sleuth to see which pages may be generating the bad links? There may be a broken page on your website causing this... Not sure without the URL.
Re: 5,507 Not found errors. Anyone know the cause? William Rock 9/1/12 9:16 AM
my bad.. I see the link to your site.. Sorry..
Re: 5,507 Not found errors. Anyone know the cause? webado 9/1/12 9:48 AM
When tested with web-sniffer.net, using web-sniffer as the user agent, the site does not actually respond with 404 for all non-existent URLs like http://filmonic.com/1345130958000 . It responds with a 200 and an error message. That's a soft 404.

It does respond with 404 when the user agent is Googlebot though.

So you should fix this because it's not just Googlebot you have to satisfy with the server response. Googlebot sometimes masquerades as a different user agent in order to catch improper behaviours.
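
A check like the one webado describes can be scripted. What follows is a minimal sketch using only the Python standard library. The user-agent strings, the helper names, and the error-phrase heuristic are all assumptions made for illustration; this is not how web-sniffer.net itself works.

```python
import urllib.error
import urllib.request

# Illustrative user-agent strings (assumptions for this sketch).
USER_AGENTS = {
    "generic": "Mozilla/5.0 (compatible; ExampleChecker/1.0)",
    "googlebot-like": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

# Phrases that hint at an error page. A rough heuristic, not a standard.
ERROR_PHRASES = ("not found", "nothing found", "page doesn't exist")


def classify(status, body):
    """Label a response as 'hard 404', 'soft 404', 'ok', or 'other (<status>)'."""
    if status in (404, 410):
        return "hard 404"
    if status == 200:
        if any(phrase in body.lower() for phrase in ERROR_PHRASES):
            return "soft 404"
        return "ok"
    return f"other ({status})"


def fetch(url, user_agent):
    """Return (status, body) for url without raising on 4xx/5xx responses."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        return error.code, error.read().decode("utf-8", "replace")


def compare_user_agents(url):
    """Fetch url once per user agent and classify each response."""
    return {name: classify(*fetch(url, ua)) for name, ua in USER_AGENTS.items()}
```

If `compare_user_agents()` returns different labels for the same non-existent URL, the server is varying its status code by user agent, which is the behaviour reported above.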

Re: 5,507 Not found errors. Anyone know the cause? Filmonic 9/2/12 3:13 AM
I haven't been able to find the problem. There are now over 20,000 of them. I think it may have something to do with my images as all the codes begin with '1345' and some of my images have those numbers in the URL


I'm thinking it could also be a plugin I installed 2 weeks ago and removed. Trying to remember what it was called. 
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 9/2/12 3:41 AM
Click on one then click on the Linked from tab. That should tell you where the problems are coming from.
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 9/2/12 4:46 AM
Well I'll be,

I just checked out my website and exactly the same thing has happened to me. Same links starting with 1345!!!!

It also started on the 20th.

The linked from info implies it's my own pages that are the source of the links.

I looked at the source and used Chrome's element inspector. No exact matches, but Disqus is using 1345-based numbers in its links. Hmmmm.

I believe Google supports crawling Disqus comments, so could it be a bug due to a recent update?

@Ron, I see your site also uses Disqus. Evidence is mounting.

It would be great if some others could check their stats to see if they are getting this, especially if you are using Disqus.




Re: 5,507 Not found errors. Anyone know the cause? zihara 9/2/12 6:47 AM
Several months ago I was seeing 80,000 404s - 80,000 pages that never existed. It was a straight html5 site, no php, no CMS or database connection of any sort, no Disqus, no user-generated-content of any sort. Over time that has reduced to about 1,200 pages now - all pages that were 301'd ten years ago. Now I'm seeing that the bot is trying to index the individual slides in jQuery slideshows... Seems like the Googlebot is kept in a perpetual alpha state... kinda like Chrome OS in this Cr-48 I'm using: worked like a charm yesterday but updated something this morning that changed aspects of the display and wiped wireless. A hard reboot fixed it but what happens with next week's update?
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/2/12 2:43 PM
Hey all, Just checked my site on WMT, and yup, just over 300 crawl errors and a couple of soft 404s, plus server errors. Yesterday all was fine, and most of the errors start with that “1345” number.

I'm also using Disqus.

Not sure what to do next, but might ride it out and see.
Re: 5,507 Not found errors. Anyone know the cause? webado 9/2/12 2:55 PM
Ensure they respond with 404 not with a soft 404.
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/2/12 3:49 PM
Yeah, the bulk of them (300+) are 404s. It must have happened overnight. I had a couple of 404s there before, but I knew what they were. These new ones all have the URL but then this 1345 sequence appended after the URL.

Strange, since none of my URLs look like this. Technically, they are 404s, but the URLs being listed don't exist since the 1345 shouldn't be there.
Re: 5,507 Not found errors. Anyone know the cause? webado 9/2/12 4:10 PM
They MUST respond with 404. Test using web-sniffer.net .
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/2/12 4:35 PM
Yup, they do. Here's one example: http://www.********.com/this-is-the-article-name/1345510526000


Now, that number at the end of the URL (1345510526000) shouldn't be there, so all these non-existent links are showing up as 404s.

Not sure what's going on, but so many URLs in WMT showing as 404s are polluted with this number at the end of the URL.
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 9/2/12 5:21 PM
So it's a relative link Google is picking up. 

I've just done a deeper check on the files involved in loading one of my suspect pages. No sign of 1345.

I wonder if Google uses an API or some other separate request to gather comment data from Disqus?
Re: 5,507 Not found errors. Anyone know the cause? webado 9/2/12 5:31 PM

Normal if they don't exist.
Re: 5,507 Not found errors. Anyone know the cause? Filmonic 9/3/12 1:25 AM
Thanks for doing some detective work. It seems Disqus may be the cause. 
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/3/12 1:34 AM
Seems like it. It's the common element here, unless there might be a plugin that's doing it, which we all use? Only plugin that was updated recently was my All in one SEO, but all looking fine with that.

Too much of a hassle disabling the whole comment system. Hope Google/Disqus can wrangle this one out soon.
Afternoon_tea 9/3/12 1:35 AM <This message has been deleted.>
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/3/12 1:47 AM
Not sure, but Disqus has been an issue for me in the past when it comes to crawl errors. This time it's a bit more severe. I'm pretty much the same: none of the links contain that number, and checking the source, the generated links, articles, etc., it seems like Disqus is rendering this on the fly and Google is picking up on it. God knows where it's coming from, though.

Wonder if someone from Google could pick up on this?
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 9/3/12 2:24 AM
Mine's a fully custom website. External stuff I use:
  • Disqus Comments
  • ShareThis Buttons (G+, Twitter, ShareThis, Facebook, Stumble, LinkedIn)
  • Google+ Share Button
  • Google Analytics (Async)
  • Linkstant
  • The odd embedded YouTube video
Otherwise I have full control over my content. Barring some server rewriting it!
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/3/12 2:50 AM
That's good to know, thanks! I'm using WordPress and the only plugin or addition that we have in common, is Disqus. Overnight, the number of 404s has doubled, but not as high as the numbers that some others have mentioned here.

Must be quite a new situation, since I can only find one other mention of it via Google search (some other person is reporting the same thing on their blog.)

Kind of a good sign that it's probably not something on your custom setup, except that Disqus plugin.
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/3/12 5:32 AM
Yes, but we need to know why Google is picking up on these phantom links, and hopefully resolve it.
Afternoon_tea 9/3/12 6:42 AM <This message has been deleted.>
Re: 5,507 Not found errors. Anyone know the cause? webado 9/3/12 7:50 AM
Nosy Googlebot has taken to crawling through on-page JS content and picks up anything that even remotely looks like it might be a URL.
If you've not done it yet, add html comment tags, it might work. Or not. Break up some of the js code so it doesn't resemble urls. Not sure if possible with Disqus scripts, but maybe they can be placed in external js files and included on pages.


Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/3/12 9:02 AM
Thanks for the input! Not sure how it's going to be done with Disqus, but I'll do a scan of what I can do with their system. I'm also looking at URL parameters to filter out these links in the meantime.

Cheers!
Re: 5,507 Not found errors. Anyone know the cause? webado 9/3/12 9:12 AM
It's not a query string parameter so not much you can do that way.

Maybe some .htaccess directive can strip off everything that matches some pattern at the preceding / ?
Re: 5,507 Not found errors. Anyone know the cause? JohnMu 9/4/12 5:14 AM
Hi guys

Thanks for posting -- I'll double-check what's happening there. In the meantime, it's always useful to have specific URLs to look at -- so those of you who are also seeing this kind of problem, it would be great to get your site's URL so that we can check there. Feel free to use a URL shortener if you prefer. 
Thanks!
John
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 9/4/12 6:11 AM
Nice to see you here John,

In my case I'm seeing about 550 new 404 errors being sourced from what looks like most pages on my website (about 100). So I'm seeing more 404s from this than pages I have indexed. An example of a page cited as a "linked from" is this:


All error pages are purely numerical using the pattern /134xxxxxxx000. It looks like an incrementing number padded with 3 zeros, so from the same system.

Some have stated they see them in subfolders so the causing reference is relative and not absolute.

Indexing of these errors for me started on the 22nd and jumped up again on the 29th (Australia time). Others reported similar time-frames.

I suspect it's some sort of Google<->Disqus interaction problem. So far every case has confirmed Disqus is in use. There is also a confirmation on the WordPress forum:


I'm not using WordPress so that can be eliminated.

I'm happy to help in any way I can.
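
The /134xxxxxxx000 pattern described above can be used to separate these phantom URLs from ordinary 404s in an exported crawl-error list. This is a minimal sketch; the regular expression is an assumption based only on the examples posted in this thread, and the function names are made up for illustration.

```python
import re

# Assumed pattern: a 13-digit path segment starting with "134" and ending in
# "000", matching the examples reported in this thread.
TIMESTAMP_SEGMENT = re.compile(r"134\d{7}000")


def is_phantom_url(path):
    """True if any path segment is entirely a timestamp-like number."""
    return any(TIMESTAMP_SEGMENT.fullmatch(segment) for segment in path.split("/"))


def split_errors(paths):
    """Separate timestamp-style 404s from all other reported URLs."""
    phantom = [p for p in paths if is_phantom_url(p)]
    other = [p for p in paths if not is_phantom_url(p)]
    return phantom, other
```

Because the check looks at each path segment, it also catches URLs where the number is appended to an article path or appears twice.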




Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 9/4/12 6:21 AM
I just got this message in WMT...

Increase in not found errors 
Google detected a significant increase in the number of URLs that return a 404 (Page Not Found) error.  
Investigating these errors and fixing them where appropriate ensures that Google can successfully crawl your site's pages. 

No shit Sherlock, I'm on it :-)

This may be the point where a bunch more webmasters become aware of the problem. Shows the value of these new warning messages.
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/4/12 8:54 AM
Hi John!

Thanks for popping in! As of today, the number of 404s with this particular issue has risen to around 785. Some of the URLs contain this number string twice, and the linked from info in WMT shows that it's linking from the URL with just one of the 1345 strings.

Many thanks for any help!

Here's a few of the links that are showing this appended sequence and turning up as 404s.

This one has the number sequence twice in the URL:


Same thing again, but the second sequence of numbers starts with 1346:

This URL has just /27/ before the 1345 string:

Re: 5,507 Not found errors. Anyone know the cause? Filmonic 9/5/12 1:57 AM
I got some feedback from Disqus support:

We believe these 404 crawl errors are due to search engines like Google trying to crawl dynamic JavaScript-assembled links coded in the Disqus WordPress plugin. We’ve recommended reaching out to Google directly about this, as enough feedback may help them resolve this. In the past someone from Google has had to tweak it to get them to go away, and this looks similar to the previous case.
Kind Regards,
Re: 5,507 Not found errors. Anyone know the cause? barryhunter 9/6/12 5:53 AM
Direct example if it helps:
(I've put a special 'small' error handler on such URLs, so it doesn't bother trying to look up blog entry no. 1345548361000 in our database :)

Claims to be found here:
which redirects to

(Checking Fetch as Googlebot on that URL, after following the redirect, doesn't show anything containing such a number. Googlebot must be executing the Disqus JS.)

Another

I have several hundred of them in total, but don't have any after 8/31/12, so maybe Disqus have changed their JS?

Interestingly, 1345548361000 is 1000x the Unix timestamp for 2012-08-21 12:26:01, which suggests it's picking up some sort of timestamp.
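
The timestamp reading is easy to check. A minimal sketch in Python follows (the function name is made up for illustration). Decoded as UTC the value is 11:26:01; the 12:26:01 quoted above corresponds to a UTC+1 zone such as British Summer Time.

```python
from datetime import datetime, timezone


def decode_ms_timestamp(value):
    """Interpret an integer as milliseconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(value / 1000, tz=timezone.utc)


print(decode_ms_timestamp(1345548361000).isoformat())
# 2012-08-21T11:26:01+00:00
```

Values of this size are what JavaScript's millisecond clock produces, which is consistent with the number coming from a script rather than from the page's HTML.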


Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/6/12 6:00 AM
That's great to hear! Let's hope they can get it all wrangled out. I'm still showing high 404s, but given the nature of the issue, it's understandable that it might take some time.

Thanks for all the info and getting Disqus to look at it!
Re: 5,507 Not found errors. Anyone know the cause? JohnMu 9/6/12 8:41 AM
Thanks for the examples, everyone! It does look like we're picking up something funny via JavaScript there. We're looking into what can be done in this particular case. In the meantime, keep in mind that 404 errors of URLs that are invalid (like these appear to be) are not something that would affect your site's indexing or ranking, so while I understand that they may be confusing (and I'm sorry for the number of URLs here), it's not something that you'd need to take action on.

Cheers
John
Re: 5,507 Not found errors. Anyone know the cause? Afternoon_tea 9/7/12 4:11 AM
Thank you so much for all the help, investigation and information! Much appreciated and as long as I kinda know what's happening, that's good for me.

Enjoy the weekend when it comes!

Cheers.
Re: 5,507 Not found errors. Anyone know the cause? Blaziran 9/10/12 8:26 AM
Hello. My problem is very similar on http://bit.ly/18ayJh. The "not selected" green line in my Index/Health on WMT showed several hundred thousand (!) pages. Investigation revealed that Google was crawling URLs that never existed by appending directories onto a file. The result was directory/goodfile.htm/other-directory/other-directory/other directory/another directory/anothergoodfile.htm. The path never existed and a number of tools revealed no such link on the site. Nevertheless a 200 was returned as the page redirected back to goodfile.htm without the css styling. We reconfigured htaccess to return a 410 in these circumstances and now I have a thousand of them (and climbing) this week in WMT. I do not have Disqus on this site. While the 200 response was returned, I'm afraid that Google viewed this as duplicate content and has penalized me accordingly. Should I submit a reconsideration request? 
Re: 5,507 Not found errors. Anyone know the cause? cristina 9/10/12 9:26 AM
Hi Blaziran, maybe open a new thread with your question. Problems that might look similar can have different causes.
Can you see in Google Webmaster Tools the URLs that the links are coming from?




Re: 5,507 Not found errors. Anyone know the cause? Blaziran 9/10/12 11:17 AM
The links are coming from other fictitious and created URLs. The first time the created page appeared was on a crawl from msnbot in June. It was not linked from anywhere. Can I create a new thread without making a duplicate post?

Thanks!
Re: 5,507 Not found errors. Anyone know the cause? ukcleaner 9/14/12 12:45 AM
1) Thanks for checking with Disqus (I've just added Disqus to my site and am getting the same errors)
2) has anyone seen an update from Google or Disqus on this?
Thanks
 
Re: 5,507 Not found errors. Anyone know the cause? maxumer 9/15/12 5:19 AM
We had the same problem and, to try to resolve it, removed Disqus, but Google is still reporting new increases in 404 errors after we removed it, so I don't think it is Disqus. Is anyone else who doesn't have Disqus, or who removed it, still seeing this problem?
Re: 5,507 Not found errors. Anyone know the cause? maxumer 9/15/12 5:27 AM
John, the site we are seeing this problem on is Cheapism.com. As I wrote below, we removed Disqus but are still seeing an increase. Any ideas on how we can resolve it?
Re: 5,507 Not found errors. Anyone know the cause? robo62 9/18/12 12:07 AM
I've never used Disqus and have been having this problem for about a month on a WP site.
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 9/18/12 1:44 AM
@robo62

It would be interesting to confirm if this is the same issue. It could shed some light on it. Are you getting 404 errors in WMT that are numeric and start with 134?

If so, what plugins or customisations do you use?


Re: 5,507 Not found errors. Anyone know the cause? Webmaster Tools 9/18/12 12:42 PM
I am experiencing the same issue. I get 404 Not Found errors in Webmaster Tools, and the numbers in the search IDs start with 1343 or 1345; almost all of these pages show these IDs. It is a job site and jobs do get deleted, but the frequency of not found errors is unbelievable: it is increasing by 1,000 to 2,000 every day and has reached 36,000. The number of these errors is massive compared to the number of deleted jobs. My SEO man says it is affecting my rank; he says a small number of errors wouldn't matter, but a number this large does affect ranking, so it is better to resolve it.

I have changed the 404 title of these pages to "nothing found for this page" and marked them all as fixed in Google Webmaster Tools, and I will see if they reappear. I will keep you guys posted on this.
Re: 5,507 Not found errors. Anyone know the cause? Rob W Baker 10/1/12 11:37 PM
Anybody got an update on this issue? We're still getting tons of 404 errors, and have noticed a large decline in traffic on our website, www.mumbai.me. We have disabled Disqus to see whether it improves things.
inbusiness 10/2/12 2:52 AM <This message has been deleted.>
Re: 5,507 Not found errors. Anyone know the cause? inbusiness 10/2/12 3:18 AM
Hi

I am having the exact same issue. Several thousand 404 not found errors.

I do have Disqus installed, and the problem appeared early last month

I also have the number 1347 in each URL with the 404 error

I hope we find a solution to this issue soon.

Thanks
Re: 5,507 Not found errors. Anyone know the cause? JohnMu 10/2/12 5:51 AM
Hi everyone

I wouldn't worry about these 404 errors -- when we discover URLs in various places, we just want to double-check them to make sure that we're not missing anything. These crawl errors won't affect the rest of your site's crawling, indexing, or ranking. If you are seeing changes in your site's performance in search, they would not be due to these crawl errors & I'd recommend starting a separate thread for those issues.

Cheers
John
Re: 5,507 Not found errors. Anyone know the cause? TRGGEO 10/2/12 8:58 AM
Hi John,

First of all, thanks for the support!

I'm having a similar problem, however when I did a site: search (trgpeak.com), I found that some of these crawl errors are being indexed.  It's not even close to all of them, but there are still enough that I'm concerned.  What would be the best course of action?  I have them canonicalized properly as far as I can tell, but should I use the URL removal tool in WMT since there aren't a ton of them?  I'm not sure how to proceed here...

Thanks,

G

Re: 5,507 Not found errors. Anyone know the cause? JohnMu 10/2/12 2:52 PM
Hi G

It looks like your server is accepting URLs like http://trgpeak. com/montana-minimum-wage-increase/1235466789/ and returning a page with "200 OK" for them. Theoretically -- assuming the URL is invalid -- they should be returning "404" so that they don't get indexed separately. With the rel=canonical, what generally happens is that we crawl and index the page first, and then follow the specified canonical, so it can happen that we'd visibly index the URL like that temporarily. All in all, this isn't a critical problem, though it can result in us crawling those URLs from time to time; it's not something that you'd absolutely need to forcibly resolve (and the URL removal tool isn't suited for this). 

So my recommendation would be to try to have these invalid URLs return a proper 404, which would be a clean solution -- and if that's not possible in the short run, using the rel=canonical like you have it now is a reasonable workaround. 

Cheers
John
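
To illustrate what "return a proper 404" means at the application level, here is a minimal sketch written as Python WSGI middleware. It only shows the logic: a WordPress site would implement the equivalent rule in its web server or PHP layer. The class name and the pattern are assumptions, and the pattern presumes that no legitimate URL on the site ends in a numeric segment of ten or more digits.

```python
import re

# Assumption: no legitimate URL on the site ends in a numeric segment of
# ten or more digits, so such paths can be treated as invalid.
BOGUS_SUFFIX = re.compile(r"/\d{10,}/?$")


class BogusSuffix404:
    """WSGI middleware that answers 404 for paths with a bogus numeric suffix."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        if BOGUS_SUFFIX.search(path):
            body = b"Not Found"
            start_response("404 Not Found", [
                ("Content-Type", "text/plain; charset=utf-8"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Anything else is passed through to the wrapped application.
        return self.app(environ, start_response)
```

With a wrapper like this, a request for /montana-minimum-wage-increase/1235466789/ gets a 404 status instead of a "200 OK" copy of the article, while the real article URL is served as before.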
Re: 5,507 Not found errors. Anyone know the cause? Suppazone 10/5/12 4:32 AM
I'm seeing the exact same problem. I hope Google will see this thread and work out the problem.
Re: 5,507 Not found errors. Anyone know the cause? TRGGEO 10/5/12 2:29 PM
Thanks for the reply John, that was incredibly helpful.  I'll keep an eye out in case the number of these being indexed rises, and make sure I'm not serving these pages.

Have a great weekend 
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/7/12 8:17 AM
Hi John

Thanks for the info. You say in your post that these 404 errors will not affect indexing or ranking, but my site has lost almost all of its traffic.
Early in September the 404's started to appear and at the end of the month (28th to be exact) the site lost almost all of its traffic, do you think the sudden drop in traffic is due to another issue?

Hope you can help.

Regards

James
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/7/12 8:35 AM
Has anyone else lost rankings?
Re: 5,507 Not found errors. Anyone know the cause? Mr Pinks 10/7/12 2:41 PM
I had the same exact problem. Redirected all of the bad urls to my main page and then lost 90% of my traffic on the 28th.
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/8/12 5:41 AM
Mr Pinks, do you have a Disqus plugin installed?
Re: 5,507 Not found errors. Anyone know the cause? Mr Pinks 10/8/12 5:43 AM
I did. Removed it after I found this thread and found out that Disqus was the cause of the problem.
Re: 5,507 Not found errors. Anyone know the cause? Luigi Gambardella 10/8/12 6:43 AM
Hey Mr Pinks, are you sure it's Disqus? My ranking is now destroyed with over 20K 404s...
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/8/12 7:10 AM
Did your ranking come back Mr Pinks or anyone else that removed the plugin?
Re: 5,507 Not found errors. Anyone know the cause? Mr Pinks 10/8/12 8:00 AM
Positive that it was Disqus. My rankings never came back. I was using Disqus on ExpressionEngine.
Re: 5,507 Not found errors. Anyone know the cause? webmaster123 10/8/12 8:24 AM
I do not have this plugin and I have a huge amount of 404 errors like these:


search/Nova%25252BAndradina/feed/rss2/
search/Salto%25252Bde%25252BPirapora/feed/rss2/
search/S%252525C3%252525A3o%25252BBernardo%25252Bdo%25252BCampo/feed/rss2/
search/Nova%252BAndradina/feed/rss2/
search/Porto%2525252BSeguro/feed/rss2/
search/Salto%25252Bde%25252BPirapora/
search/S%C3%83O+PAULO/page/116/
search/S%C3%83O+PAULO/page/21/
search/%C3%A0+Venda/page/74/
search/SAO%2BPAULO/page/2/
search/S%C3%A3o+Paulo/page/75/
search/S%C3%A3o+Paulo/page/72/
search/S%C3%A3o+Paulo/page/50/


I'm going crazy and I can not solve the problem. Can anyone help?
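
The %2525... sequences in the list above look like the same search term percent-encoded several times over (%25 is the encoding of "%" itself). A minimal sketch that peels the layers off one of the listed values follows; the helper name is made up, and this only illustrates the symptom. It does not identify which component is re-encoding the URLs.

```python
from urllib.parse import unquote


def decode_layers(value, max_rounds=10):
    """Percent-decode repeatedly and return every intermediate form."""
    layers = [value]
    for _ in range(max_rounds):
        decoded = unquote(layers[-1])
        if decoded == layers[-1]:
            break  # nothing left to decode
        layers.append(decoded)
    return layers


for layer in decode_layers("Nova%25252BAndradina"):
    print(layer)
# Nova%25252BAndradina
# Nova%252BAndradina
# Nova%2BAndradina
# Nova+Andradina
```

Each extra "25" is one more round of encoding, so URLs like these usually point to a link or redirect that re-encodes an already encoded query on every pass.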
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/8/12 9:20 AM
Which plugins are you using?
Re: 5,507 Not found errors. Anyone know the cause? webmaster123 10/8/12 11:51 AM
James,
I'm using the following plugins:

WordPress SEO
W3 Total Cache
User Photo
Subscribe2
Contact Form 7
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/8/12 12:59 PM
The only plugin I'm using that's the same as yours is Contact Form 7.
I'm sure that can't be the issue. I'm stumped too.

I've got nearly 3000 of these now
les/1346759283000/1347980998000
Re: 5,507 Not found errors. Anyone know the cause? webmaster123 10/8/12 2:55 PM
I'm desperate. Before realizing these errors I was on the first page (fourth place) for my main keyword. Today I'm on the third page for that keyword.

Lucky for me, I'm still well positioned in other keywords, but I fear that my site is penalized because of these errors.
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 10/9/12 1:51 AM
@Sandro,

That's a different problem from the one we're talking about here. I'd suggest you start a new post so you get fresh eyes on it.

@James W, you do have the same problem and it is caused by your Disqus plugin. 


Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 4:53 AM
I have the same problem, disabled the discus plugin last night, awaiting results
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 4:55 AM
Or even Disqus :)
Re: 5,507 Not found errors. Anyone know the cause? Lysis 10/9/12 5:05 AM
404s do not cause ranking drop. However, if you 404 89% of your site and Google has nothing to crawl or rank you anymore for whatever it was using to rank on the old 404 pages, then it naturally makes sense that you would lose ranking due to removed pages.

If these are random links found on the web that are pointing to a non-existing page, then no, they don't matter.

Yes, I know you see a correlation in the 404s and rank loss.
No, they are not related unless you've deleted a ton of pages and Google hasn't crawled new ones.
The weird links are being generated from your site or an external site.
Use Xenu's Link Sleuth to find broken links on your site.
Use Webmaster Tools to find weird random links found on the web pointing to your site.
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 5:05 AM
Has anyone got positive results after disabling the plugin?
Has anyone regained rankings if they dropped.
It's a shame this issue came in at the same time as the Google EMD update as I can't tell if I've been hit or not.
Any help would be very much appreciated.
Re: 5,507 Not found errors. Anyone know the cause? Luigi Gambardella 10/9/12 7:06 AM
Hey James,
I am still in your situation, but it seems unrelated to Disqus...

The wordpress.org site is affected by this bug too (try adding a number after any post URL and see), so I don't know why Google crawled these...
All I can say is that my site lost a lot of traffic for no apparent reason...
Still waiting for a resolution...

Re: 5,507 Not found errors. Anyone know the cause? Lysis 10/9/12 7:26 AM
I feel like I'm talking to myself :-\

There is no "issue." Google crawls sites based on what they find in web pages. If the page is invalid, returning a 404 is the proper way to do it. It's normal and does not affect rank.
Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 10/9/12 7:26 AM

On Tuesday, 9 October 2012 22:35:25 UTC+10:30, Lysis wrote:
404s do not cause ranking drop. However, if you 404 89% of your site and Google has nothing to crawl or rank you anymore for whatever it was using to rank on the old 404 pages, then it naturally makes sense that you would lose ranking due to removed pages.


This may be why some people are suffering ranking issues. Since this Disqus issue began, WMT has reported 1,600 missing pages on my website, which has just over 100 pages. So 404s have become 95% of my identified site. If nothing else, Google is spending its allotted time checking out these bogus URLs and not my real pages.
Re: 5,507 Not found errors. Anyone know the cause? Luigi Gambardella 10/9/12 7:44 AM
OK Lysis, talk to yourself, but for me there is a problem...
If you run a header-check tool on this type of URL, http://wordpress.org/news/2012/09/wordpress-3-5-beta-1/3153135135131/ , you'll see 200 OK, not 404...

And also: why does this happen? Why does Google crawl non-existent URLs?

It's impossible to find those numbers anywhere on my site (or in any WordPress installation)...
Re: 5,507 Not found errors. Anyone know the cause? Lysis 10/9/12 8:24 AM
If the page doesn't exist, then return a 404. If you are returning a 200, then that's an error on your site.

>> So 404s have become 95% of my identified site

Did you convert to a new site? If you converted, you should be (optimally...this can be too much of a pain for most people) 301 redirecting old pages to new.

If this plugin is incorrectly returning server status codes or generating bogus URLs, then yeah, time to get rid of that plugin.
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 10:28 AM
I'm going to 301 all the unfound pages to my homepage and see what happens, this is going to be a long painful job

Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 10:35 AM
Whatever you think about this issue there must be a fix for it, some bright spark must have a resolution 

Re: 5,507 Not found errors. Anyone know the cause? Lysis 10/9/12 10:40 AM
There is: it's a 404. You can't control how people link to you, and Google will crawl what it finds as a link. 

I don't think you are understanding how server messages work and what they mean to a browser or a bot.
Re: 5,507 Not found errors. Anyone know the cause? Mr Pinks 10/9/12 1:55 PM
I wouldn't redirect them. I really wouldn't. That's what I did and it killed my site. I would remove Disqus and just mark them as fixed in Tools then see if they return.
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 1:56 PM
I agree you can't control who links to your site, but my site is creating these 404s, 3,000 of them and going up each day. My rankings are gone and I'm getting no traffic. I've only got 2,500 pages, so to have 3,000 not found is a little over the top, I think...

Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 2:00 PM
Message in Webmaster Tools............
Google detected a significant increase in the number of URLs that return a 404 (Page Not Found) error. Investigating these errors and fixing them where appropriate ensures that Google can successfully crawl your site's pages.
Re: 5,507 Not found errors. Anyone know the cause? Lysis 10/9/12 2:03 PM
If your pages are 404ing and they are real pages, then that's an issue with your site not with Google.

You'd have to give an example at this point. I don't know disqus so maybe it's that.
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/9/12 3:19 PM
Here is an example:         
easyjet-holidays/1346675945000/134749744700
On this page within my site none of the numbers exist, something is creating them.
Should have a bit more of an idea Thursday, hopefully the number of 404s will start coming down now I've uninstalled the plugin.

Thanks for taking time to comment by the way.

Re: 5,507 Not found errors. Anyone know the cause? Tiggerito2 10/9/12 6:55 PM
@Lysis

This is a problem both acknowledged by Google and Disqus but neither have provided a solution. It's more core than a plugin issue as I use the Disqus code directly.

The only solution I can think of is to remove Disqus and wait. Maybe 410 those pages to help speed up their removal.

I'm personally leaving it in and hoping the mass 404s have no real negative effect, as @JohnMu stated earlier in this thread.

To identify the problem, you will be getting bursts of new 404 errors in WMT that have numeric folders in them. Currently they start with 134 and end in some zeros.
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/11/12 12:31 PM
Is there a fix for this yet? I've had enough of redirecting 600-odd pages each day. It's obviously not the Disqus plugin, as that has been disabled since Tuesday.
Re: 5,507 Not found errors. Anyone know the cause? adeb1 10/12/12 3:01 AM
I've been getting this for the past month or so on one of my sites (adrianbold.com) but don't understand why not on others when they also have the Disqus plug-in installed. 

I hope this is resolved soon!
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/12/12 3:10 AM
Has this affected your traffic/rankings?
Re: 5,507 Not found errors. Anyone know the cause? barryhunter 10/12/12 1:18 PM
The change should be done once at the server level. You don't need to do each one separately.


Even though you've removed the plugin, Google will still have a backlog of URLs to crawl (that they found when you had it installed).

So it could take months for them to disappear, even after Google/Disqus fix the crawling issue. They will still take time to 'flush' through the system.


Like others have said, they shouldn't affect your site in any way, other than clogging up the Site Errors graphs of course :)
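
To illustrate what "once at the server level" could look like, here is a rough sketch of a single rule that 301-redirects any request containing the stray numeric folders to the same path without them. It is written as Python WSGI middleware purely as an example; the same idea could be expressed as one rewrite rule in whatever server you run. The pattern (all-digit segments of 12 or 13 characters starting with 134) and the names clean_path and StripStrayFolders are assumptions for illustration, and the sketch ignores query strings.

```python
import re

# Assumption: stray folders are all-digit segments of 12-13 characters
# starting with "134", as in the URLs reported in this thread.
STRAY_SEGMENT = re.compile(r"^134\d{9,10}$")


def clean_path(path):
    """Drop stray numeric segments; return the path unchanged if there are none."""
    parts = path.split("/")
    kept = [part for part in parts if not STRAY_SEGMENT.match(part)]
    if len(kept) == len(parts):
        return path
    # If only stray segments were present, fall back to the site root.
    return "/".join(kept) or "/"


class StripStrayFolders:
    """WSGI middleware: 301-redirect requests whose path contains stray folders."""

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        target = clean_path(path)
        if target != path:
            headers = [("Location", target), ("Content-Length", "0")]
            start_response("301 Moved Permanently", headers)
            return [b""]
        # Everything else is passed through to the wrapped application.
        return self.app(environ, start_response)
```

With this in place a request for /easyjet-holidays/1346675945000/134749744700 would be answered with a 301 to /easyjet-holidays, so there is nothing to redirect by hand each day.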


Re: 5,507 Not found errors. Anyone know the cause? webado 10/12/12 5:07 PM
And all you need to do then is mark them all as fixed and you won't see those again in the history. You may see some again as they are re-discovered on old caches.
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/13/12 5:41 AM
Are you saying the issue is fixed now?
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/13/12 5:45 AM
And i can mark all the crawl errors as fixed?
Re: 5,507 Not found errors. Anyone know the cause? webado 10/13/12 6:48 AM
You mark as fixed those you know to be fixed, those that have an explanation which is beyond your control, and those you cannot fix any better (for example, by 301 redirecting them).


Re: 5,507 Not found errors. Anyone know the cause? TheBigK 10/17/12 9:50 PM
I'm quite happy that I found this discussion after about a month of extensive searching on the Internet. I'm in the same boat as you all, and have the following observations to share.

I removed a large number of unwanted tags (contributed by our author team) from our WordPress blog. I removed about 25k of them, each resulting in a 404 page. On September 4, 2012, Google Webmaster Tools reported a sudden increase in 404 pages and the count kept growing to 99k errors! I noticed that around the same time (on 4-5 September), traffic to our overall website began sinking. Our highly popular community forums suddenly got pushed down in rankings and the website lost about 40% of its Google traffic. Having run a successful website for 7 years now, I'm well aware of the dos and don'ts for webmasters. That's the reason our website never got affected by any of the Panda or Penguin algorithm updates.

However beginning September, the unfortunate began happening. I first noticed that it was the sudden removal of unwanted tags that led Google bot to discover page not found errors. After much thinking and reading, I decided to redirect the removed tags to homepage; else it'll make Google think that the site still has a lot of 404 errors. As a result, the number of errors came down from 99k to about 94k now. 

But now I notice that a large number of URLs reported in GWT as 404 errors have the same "1345.....000" string attached. I too used the Disqus system and have now switched over to LiveFyre.

Observations:

1. Though Google says that 404s don't affect site rankings, I've observed that a large number of those *DO* affect overall site rankings.
2. Google takes a LOT of time to drop the 404 pages. I found that by redirecting the removed tag pages, the errors dropped at the rate of 1000 per day. I'm not sure if that's a good strategy though.
3. Disqus system could be the real issue here. I hope LiveFyre does a better job.

It'd be great if JohnMu can clarify if there's a limit to the 404 pages beyond which the domain gets pushed down in rankings. It will help us know that we're going in the right direction in fixing the errors and getting our traffic back. 

Re: 5,507 Not found errors. Anyone know the cause? TheBigK 10/17/12 9:51 PM
Yes, it has, for us. About 40% traffic lost.
Re: 5,507 Not found errors. Anyone know the cause? Luigi Gambardella 10/18/12 2:57 AM
Thanks TheBigK for reporting your situation...
Because of this problem my site has lost 50-60% of its traffic since 2 September. I uninstalled Disqus 3 days ago...
The number of urls with 13 digits at the end is now at 23K
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/18/12 3:14 AM
I've uninstalled the Disqus plugin and the number of not-found URLs is still going up, but only by about 30 a day. All my traffic has gone due to this issue.
I've redirected (301) 60% of the unfound urls and should get the rest done by the end of the week, I'll let you all know if anything changes with regards to my traffic.
This is a massive issue for me and I believe there must be a fix for this somewhere.

Someone help ... PLEASE

Re: 5,507 Not found errors. Anyone know the cause? TheBigK 10/18/12 9:06 PM
Well, I've removed LiveFyre too, as they use rendering similar to Disqus. I've decided to disable comments on my blog until this issue is resolved. The most logical step here seems to be to inform Google that those bad URLs actually come from a JavaScript error, so I've redirected all the bad URLs to their respective correct URLs. This plugin here -> https://github.com/echovoice/Wordpress-Disqus-Google-404-Issue-Fix (for WordPress) does that. I've installed it and have begun marking the errors as 'Fixed'. So far, Google's only resolving ~1000 errors per day, so it may take us about 90 days to get rid of all those errors.

I'm wondering if there's a way to let Google crawl all those bad pages again and determine that they've been fixed. It's too heavy on us to wait for ~90 days to get rid of all those errors and get our traffic back. 

Yes - it's a massive issue for all of us; for no fault from our side. 
Re: 5,507 Not found errors. Anyone know the cause? James Willment 10/22/12 3:16 PM
Looks like the issue has been fixed but my rankings are gone:(
Such a shame, I had a lot of happy users.
Not sure what was to blame for this but they've upset quite a few webmasters.
Re: 5,507 Not found errors. Anyone know the cause? TheBigK 10/22/12 8:30 PM
Disqus has fixed the issue in their recent updates, but I'm skeptical about enabling it on my site. I've redirected all those error URLs to the correct ones (by stripping out the problem code). Not a single day passes without me making sure that we aren't doing anything that attracts Panda or Penguin. I now see a lot of lower quality pages outranking me.

What worries me more is that I'm left with no clear strategy on how to deal with this problem.
Re: 5,507 Not found errors. Anyone know the cause? webado 10/24/12 7:29 PM
If all urls are now 301 redirected to the correct ones, then this subject is no longer an issue. 

Now you can concentrate on your site and its content and your linking profile - ensure everything abides by the guidelines.

If you haven't done so, optimize a WP site with tips from this tutorial:

It's pretty basic but it does tend to resolve a lot of basic issues.
Re: 5,507 Not found errors. Anyone know the cause? VenomVoid 11/29/12 10:04 AM

800 here.. where does google come up with links such as:

mysite.net/products/bananas/car-for-sale/lease-old-inventory-now

when the actual link is, and has been for over 15 years:
mysite.net/products/bananas

 
Re: 5,507 Not found errors. Anyone know the cause? Blaziran 11/29/12 10:13 AM
Looks like you've been hacked. Someone probably placed a malicious file on your server.
Re: 5,507 Not found errors. Anyone know the cause? VenomVoid 11/29/12 10:54 AM
yeah! right..

after 15 years on the same site I still don't know how many files there are in our servers.... especially the site in question: 15 files...
had not thought of that!




Re: 5,507 Not found errors. Anyone know the cause? Chris Liversidge 12/17/12 3:28 AM
Hi John,

We're still seeing this issue for uk.queryclick.com across all of the news post URLs live today. We are now triggering warnings from webmaster tools about the increased number of errors (in the thousands now), 99.99% of them conform to this issue with the string of numbers appended created by Disqus. The others are typo errors we can clean up ourselves.

Can you confirm that these 404 reports will not cause a penalty - as errors backed by warnings can be seen as a negative quality indicator - and also that there is a fix coming to remove these 'false' errors (presumably caused by headerless JS execution on Googlebot's part?).

Thanks for any clarity you can shed John, this issue seems to have gone quiet since October.

Chris
Re: 5,507 Not found errors. Anyone know the cause? TheBigK 12/17/12 3:59 AM
The answer is 'No'. 404 errors won't affect your rankings. It's a different story that several webmasters on the Internet have found a strange relation between rise of these errors and their traffic going down. It's better that you fix those errors right away. 

By the way, how many of your pages are indexed in Google and what's the current error count?
Re: 5,507 Not found errors. Anyone know the cause? Chris Liversidge 12/17/12 4:21 AM
I'd equate traffic going down with a fall in rankings myself - but I take your point about the official line being 'no'.

c. 2,000 indexed, with the same number again as 404s from this issue. We've upgraded to the latest version of Disqus and have started the process of marking the errors fixed, but I'm yet to be convinced they won't come back, as many of the 404s are recent.
Re: 5,507 Not found errors. Anyone know the cause? TheBigK 12/17/12 4:46 AM
Marking errors as fixed is going to be a long process. I've been doing it for the past 2.5 months and got the count down from 99k to 15k now. There will be big drops in error count in the process, but unless that process is followed religiously, the error count won't lower. 

I removed Disqus right away. Can't trust them anymore. I'd recommend trying LiveFyre - though they do the rendering almost the same as Disqus. Engadget and Mashable are on LiveFyre.


Re: 5,507 Not found errors. Anyone know the cause? TheBigK 12/17/12 4:47 AM
Forgot to add: we too received a warning the day errors shot up and our traffic dropped. Have you noticed a traffic drop already?
Re: 5,507 Not found errors. Anyone know the cause? brokenbat 12/20/12 3:05 PM
It seems to me the solution here would be to make WordPress return a 404 error. The problem on my site is that Google is indexing a ton of pages like blog/12312/ that are all exactly the same. Why does WordPress return 200 on these? It shouldn't be a page at all.
Anyone have a solution?
Re: 5,507 Not found errors. Anyone know the cause? VenomVoid 12/20/12 3:42 PM

@brokenbat

 
   You might wanna take a look at your mod_rewrite rules.
   The fact that Apache returns 200 OK means it treats it as a valid link. URL rewriting is usually the cause
   of this behavior.
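
Before digging into rewrite rules it may help to confirm what the server actually sends for a URL that should not exist. Below is a small sketch using only the Python standard library. The function names are made up for illustration, and it deliberately does not follow redirects so that a 301 is reported as 301 and not as the status of the page it points to.

```python
import urllib.error
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from following redirects so 3xx codes stay visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # urllib then raises HTTPError carrying the 3xx code


def fetch_status(url, user_agent):
    """Return the raw HTTP status code the server sends for `url`."""
    opener = urllib.request.build_opener(_NoRedirect)
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with opener.open(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses arrive here


def classify_missing_url(status):
    """Label the status code returned for a URL that should NOT exist."""
    if status in (404, 410):
        return "ok: real not-found"
    if status == 200:
        return "soft 404"
    if 300 <= status < 400:
        return "redirect"
    return "other"
```

For example, classify_missing_url(fetch_status("http://example.com/blog/12312/", "Mozilla/5.0")) on a made-up URL like that one would report "soft 404" if the server answers 200. It is worth repeating the check with more than one User-Agent string, since earlier in this thread a site was found to answer differently depending on the user agent.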
