Well - we'll cover that in a minute (Duplication).
or showing it some other page's content (sending a 302 redirect etc.).
You should ensure that your server/script/site sends the correct 404
response for URLs that do Not exist!
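As a rough illustration of "send a real 404", here is a minimal Python sketch. It assumes a tiny WSGI-style app and a made-up set of known paths (`KNOWN_PATHS`, `app` and `status_for` are hypothetical names, not part of any particular system); your own server or script will look different.

```python
# Hypothetical set of URLs that really exist on the site.
KNOWN_PATHS = {"/", "/this/that/theother/somefile.html"}


def app(environ, start_response):
    """Tiny WSGI app: 200 for known paths, a real 404 for everything else."""
    path = environ.get("PATH_INFO", "/")
    if path in KNOWN_PATHS:
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"page content"]
    # No redirect and no error page served with a 200: send an honest 404.
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def status_for(path):
    """Call the app the way a server would and return the status line."""
    seen = []
    app({"PATH_INFO": path}, lambda status, headers: seen.append(status))
    return seen[0]


print(status_for("/"))
print(status_for("/no/such/page.html"))
```

The point is only that the unknown URL gets its own 404 status, rather than a 200 or a 302 to some other page.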
Okay ... Parameters are not the same as Paths (FilePath).
is a Path. (You can also tack on a file too (FilePath):
/this/that/theother/somefile.html)
Whereas what you are seeing is a Parameter and Value pairing;
?this=that
Parameters are used by Scripts as a method of passing data and deciding
what to do/show, what to get from a DataBase etc.
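To make the Path vs Parameter split concrete, here is a small Python sketch using the standard `urllib.parse` module. The URL is just the example one from this post with the example parameter tacked on.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.example.com/this/that/theother/somefile.html?this=that"
parts = urlsplit(url)

# Everything between the host and the "?" is the Path (FilePath).
file_path = parts.path

# Everything after the "?" is the Parameter/Value pairing(s).
parameters = parse_qs(parts.query)

print(file_path)
print(parameters)
```

A script typically reads the values out of something like `parameters` to decide what to show.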
Some systems are set up to permit such requests - regardless of the
filetype.
Some systems are set up to show scripted pages/files as normal
non-scripted ones.
Makes little difference ... it makes a "different" URL.
And it is the URL that counts ... that is what G looks at.
As far as G is concerned;
http://www.example.com/this/that/theother/somefile.html
http://www.example.com/this/that/theother/somefile.html?this=1
http://www.example.com/this/that/theother/somefile.html?this=2
are 3 different URLs - 3 different pages and should show 3 different
bits of content,
else it causes Duplication (See next bit).
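If you want to spot this in your own logs or crawl reports, one rough approach (a sketch only, with hypothetical helper names `strip_query` and `group_variants`) is to group URLs by what is left once the parameters are stripped. Any group with more than one member is a candidate for Duplication.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string or fragment."""
    scheme, netloc, path, _query, _fragment = urlsplit(url)
    return urlunsplit((scheme, netloc, path, "", ""))


def group_variants(urls):
    """Map each parameter-free URL to all the URLs that reduce to it."""
    groups = defaultdict(list)
    for url in urls:
        groups[strip_query(url)].append(url)
    return dict(groups)


urls = [
    "http://www.example.com/this/that/theother/somefile.html",
    "http://www.example.com/this/that/theother/somefile.html?this=1",
    "http://www.example.com/this/that/theother/somefile.html?this=2",
]
groups = group_variants(urls)

for clean, variants in groups.items():
    print(clean, "has", len(variants), "variant URLs")
```

This only flags URLs that share a Path - whether they really serve the same content is something you still have to check.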
======= It's causing Duplication! =======
That is something to worry about.
Google doesn't really want to see lots of the same content - no matter
the variant URL.
If it sees the same content under different URLs - then it may Filter
... picking 1 version out of the 2+ versions it sees.
The one it chooses may Not be the better ranking one - and thus may
impact your rankings/traffic.
======= Where are they coming from? =======
These things can occur due to various reasons/sources.
* Some are "trackers" ... the linked to URL is set so that you can see
where the link originates from.
* Others may be "look at me's" - some people watch inbound traffic - and
if they see traffic from an unknown site - may go look.
* I've seen some people state them as "hijacks" - apparently some
systems are set up to automatically do a redirect on certain parameters
... so people may intentionally include such a URL hoping to jack some
of your traffic. (No idea if true!)
* Then there's simply "bad links" - sometimes people mistype, other
times they stuff up the copy and paste process etc.
======= What can I do? =======
Well - there are several options ... and which you do/use depends on
your setup/skills etc.
--- Setup Server based Redirects ---
If you have Apache with htaccess (no idea if IIS can do similar),
you can tell the server to examine URL requests for specific parameters,
and if found, remove them from the URL and redirect to the "cleaned"
version.
If G encounters a 301 - then it will remove the bad URL from the SERPs.
--- Setup Script based Redirects ---
Pretty much like the Server based method,
but you use your Script (php, asp, cfm etc.) to examine the URL, and
send a redirect response if needed.
If G encounters a 301 - then it will remove the bad URL from the SERPs.
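The "examine the URL, strip the junk, redirect" logic can be sketched in a few lines. This is Python rather than php/asp/cfm, and it is only an illustration: `ALLOWED_PARAMS`, `clean_url` and `redirect_for` are made-up names, and the whitelist approach (keep only the parameters you know you use) is my assumption - you could just as well blacklist specific ones.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical list of the parameters the site genuinely uses.
ALLOWED_PARAMS = {"page"}


def clean_url(url):
    """Drop every query parameter that is not in ALLOWED_PARAMS."""
    scheme, netloc, path, query, _fragment = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(query, keep_blank_values=True)
        if name in ALLOWED_PARAMS
    ]
    return urlunsplit((scheme, netloc, path, urlencode(kept), ""))


def redirect_for(url):
    """Return (status, target) if the URL needs cleaning, else None."""
    target = clean_url(url)
    if target != url:
        return ("301 Moved Permanently", target)
    return None


print(redirect_for("http://www.example.com/somefile.html?this=that"))
print(redirect_for("http://www.example.com/somefile.html?page=2"))
```

Your script would send the returned status and put the target in the `Location` header - and only for URLs that actually changed, otherwise you create a redirect loop.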
--- Start using the Canonical Link Element ---
By including the CLE on your pages - you will be telling G which URL you
would prefer that content to appear on in the SERPs - regardless of what
URL G sees it on.
That means though G may see the same content under the normal URL and
the one with the strange parameters - it will know to use the "clean"
one.
If G encounters a CLE - then it will remove the bad URL from the SERPs.
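The CLE itself is just a `<link rel="canonical" href="...">` tag in the page's `<head>`. As a rough sketch of generating it from whatever URL was requested (Python here, with a hypothetical `canonical_link` helper, and assuming the parameter-free URL is the one you want as canonical):

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link(url):
    """Build a canonical link element pointing at the parameter-free URL."""
    scheme, netloc, path, _query, _fragment = urlsplit(url)
    clean = urlunsplit((scheme, netloc, path, "", ""))
    # Escape the URL so it is safe inside an HTML attribute.
    return '<link rel="canonical" href="%s">' % escape(clean, quote=True)


print(canonical_link(
    "http://www.example.com/this/that/theother/somefile.html?this=1"
))
```

If some of your parameters really do change the content, you would keep those in the canonical URL rather than stripping everything as this sketch does.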
--- Start using the Parameter Handling Tool ---
In Google WebMaster Tools (GWMT) there is the PHT.
You can tell G to ignore certain Parameters (or Parameters and Values).
This works along the same lines as the CLE method - but instead of
telling G which URL you want it to pay attention to,
you tell it parts of a URL to ignore.
If G sees a match for the PHT - then it will remove the bad URL from
the SERPs.
--- robots.txt [Last Resort] ---
And I'll say it again - this is the Last Resort Method!
Robots.txt does Not stop Indexing - it stops Crawling.
Blocking such URLs
a) may Not stop G showing them in the SERPs
b) will Not automatically remove the Bad URL from the SERPs
(If you want the URL removed from the SERPs - then you will have to use
the URL Removal Request tool as well)