In my opinion, the issue goes beyond duplicate file detection, which would only be an efficiency improvement. I could accept Google Drive uploading the same files multiple times to the cloud, as long as I still end up with a single copy of each file at the end. What we see instead is that every upload creates a duplicate file in the cloud! Here are the exact steps to reproduce the issue:
1. Install Google Drive on machine A and let it sync the files to the cloud.
2. Install Google Drive on machine B. Since Google Drive doesn't support LAN sync, many users will exit Google Drive right away to speed up the sync, manually copy the files into the "Google Drive" folder on machine B, and then start Google Drive again.
Once Google Drive starts up on machine B, the following happens:

- Google Drive does not recognize that the files on machine B are exactly the same as the ones already in the cloud (a sketch of the kind of content-based check I would expect follows this list).
- It would be acceptable, though very inefficient, if Google Drive re-uploaded the files and REPLACED the originals in the cloud with the newly uploaded copies.
- The deal-breaker is that, instead of replacing the files, Google Drive uploads them as NEW COPIES. From the web interface, you end up with two files with the same name for every file you have (yes, you can see "a.txt" and "a.txt" again in the same directory!).
- These duplicate files can then get synced back to the local drive too (e.g. you end up with "a.txt" and "a (1).txt" in the same local folder).
- At times, when manually trying to delete the duplicate files, you may get the "Primary Key" error that some other users have reported.
This is a serious issue because I see this as corrupting the
integrity of my directory structure. It's not just an efficiency issue.