OK, you have several questions...
- First, a simple script to find all duplicate filenames.
The problem is that you need a list of every file on your system, then you compare the names with the path stripped off. So I would try something like this (not fully tested):
#!/bin/bash
find -P / -type f > /tmp/files.txt
sed -i -e 's#.*/\(.*\)$#\1#' /tmp/files.txt
sort /tmp/files.txt > /tmp/files1.txt
rm /tmp/files.txt
uniq -D /tmp/files1.txt > /tmp/files.txt
rm /tmp/files1.txt
My logic:
First, get a list of all regular files, ignoring symlinks (which are duplicates by definition).
Next, strip the path from each name in the temp file.
Now that you have only bare filenames, sort the list into a second temp file.
Delete the original file.
Now seek out all duplicate lines, and write those names back into the original file.
Delete the second temp file.
You should now have a list of all duplicated filenames.
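The steps above can also be collapsed into a single pipeline. Here is a sketch (assuming GNU coreutils/findutils), run against a sandbox directory instead of / so it is safe to try; the /tmp/dupdemo paths and filenames are just for illustration:

```shell
# Set up a sandbox with two files sharing a name and one unique file.
mkdir -p /tmp/dupdemo/a /tmp/dupdemo/b
echo one   > /tmp/dupdemo/a/same.txt
echo two   > /tmp/dupdemo/b/same.txt
echo three > /tmp/dupdemo/a/unique.txt

# Strip the path with sed, sort, then print every repeated name (-D).
find -P /tmp/dupdemo -type f | sed 's#.*/##' | sort | uniq -D
```

This prints same.txt twice (one line per copy) and omits unique.txt, without needing the intermediate temp files.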
- How can I tell if they are just duplicate filenames, or if they are actually duplicate files?
For each filename, find all copies with the find command and run them through sha1sum, like so:
find /tmp -name '<filename to check>' -exec sha1sum {} \;
(Using -exec instead of a for loop keeps filenames with spaces from being mangled.) Files with the same sha1sum should have identical contents.
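To see that in action, here is a small self-contained sketch (assuming sha1sum from coreutils is installed; the /tmp/hashdemo directory and filenames are made up for the demo):

```shell
# Two files with identical contents, one with different contents.
mkdir -p /tmp/hashdemo
printf 'hello\n' > /tmp/hashdemo/copy1.txt
printf 'hello\n' > /tmp/hashdemo/copy2.txt
printf 'world\n' > /tmp/hashdemo/different.txt

# Hash every file; sorting groups identical digests together,
# so true duplicates end up on adjacent lines.
find /tmp/hashdemo -type f -exec sha1sum {} + | sort
```

In the output, copy1.txt and copy2.txt share the same leading digest while different.txt does not, so you can tell duplicate contents apart from mere duplicate names at a glance.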
You may need to check my syntax on some of this, but it should get the job done.
Kevin Fries
What command syntax can I use to locate all duplicate files (filenames) on
my system? Or, more specifically, within any specified directory on the
system?
Also, how can I tell which duplicates have identical contents and which
duplicates have different content (or at least different file sizes)?
---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss