Topic "duplicate file finder on remote server (script or new feature)"

smitty123

Joined: 2014-10-26
Posts: 36
Location: Canada
I've been cleaning out my NAS these past few days, and it occurred to me that a duplicate search feature would be a great asset. I suggested the idea to the NAS manufacturer, but they have not responded.

I looked at a few Windows programs that would work, but then I realized that at 30 MB/s it would take days to read, transfer and compare 5-6 terabytes (2 x 3 TB HDDs).

I wonder if there's a way to find duplicate files directly on the server, with a script maybe, rather than using a Windows/Linux program on a client machine, which would force every file to be read from the server over the network as it scans for dupes.

I've thought about listing all the files and then sorting by size, but that isn't easy to do by hand, and I don't know enough about the Linux command line to do all that.

A quick way would be to make a list of all files on the NAS shares, sorted by size as they are added. Whenever a size already exists in the list, copy both file names to a second list that contains only files with matching sizes, and show that list to the user. The user can then have the program do a byte-by-byte comparison of the same-size files locally on the server to confirm they are dupes. I read up on the Linux cmp command; it can do that comparison on the server itself, provided the server has that command installed.

ex: cmp /share/HDA_DATA/hdd1/123.mp4 /share/HDB_DATA/hdd2/123.mp4
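
The size-then-compare idea above can be sketched in Python. This is only a rough sketch: it assumes the script runs on the NAS itself (or on a machine where the shares are mounted locally) and that Python 3 is available there, which the thread does not confirm. The function names are made up for illustration, and the two paths are the ones from the cmp example. `filecmp.cmp(..., shallow=False)` performs the same kind of byte-by-byte check that `cmp` does.

```python
import filecmp
import os
from collections import defaultdict


def group_by_size(roots):
    """Map file size -> paths, keeping only sizes that occur more than once."""
    by_size = defaultdict(list)
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    # Skip symlinks so the same file is not listed twice.
                    if os.path.isfile(path) and not os.path.islink(path):
                        by_size[os.path.getsize(path)].append(path)
                except OSError:
                    continue  # unreadable entry: skip it
    return {size: paths for size, paths in by_size.items() if len(paths) > 1}


def confirm_duplicates(paths):
    """Split same-size candidates into groups of byte-identical files."""
    groups = []
    for path in paths:
        for group in groups:
            # Compare against one representative of each existing group.
            if filecmp.cmp(group[0], path, shallow=False):
                group.append(path)
                break
        else:
            groups.append([path])
    return [group for group in groups if len(group) > 1]


def find_duplicates(roots):
    """Return a list of groups; each group holds paths with identical content."""
    result = []
    for _size, paths in sorted(group_by_size(roots).items()):
        result.extend(confirm_duplicates(paths))
    return result


if __name__ == "__main__":
    # Paths taken from the cmp example; adjust to the actual shares.
    for group in find_duplicates(["/share/HDA_DATA/hdd1", "/share/HDB_DATA/hdd2"]):
        print(" == ".join(group))
```

Because only files that share a size are ever opened, most of the disk is never read, which is where the time saving comes from.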

Then display the true duplicates in an interface for management: delete the file, or view/open the file or its folder so we can handle it manually.

If there is such a way, it would be very helpful for disk space management and would really take the sting out of scanning so many terabytes.
martin
Site Admin
Joined: 2002-12-10
Posts: 24530
Location: Prague, Czechia
You can script this using the WinSCP .NET assembly. Some SFTP/FTP servers even support checksum calculation for remote files, so you would not have to download the files.
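
To illustrate the checksum approach: files with the same checksum can be treated as duplicates without comparing them byte by byte, and only the short digest has to cross the network. The sketch below computes SHA-256 locally in Python purely to show the grouping logic. With a server that supports remote checksums, the digest would come from the server instead. The function names are hypothetical and are not part of WinSCP.

```python
import hashlib
from collections import defaultdict


def file_checksum(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_checksum(paths, checksum=file_checksum):
    """Group paths by checksum and keep only groups with more than one file.

    The checksum argument is a hook (an assumption of this sketch): a function
    that asks the server for a checksum could be passed in instead.
    """
    by_digest = defaultdict(list)
    for path in paths:
        by_digest[checksum(path)].append(path)
    return [group for group in by_digest.values() if len(group) > 1]
```

Checksumming only the files that already share a size keeps the amount of data read to a minimum.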
smitty123

Joined: 2014-10-26
Posts: 36
Location: Canada
prikryl wrote:
You can script this using the WinSCP .NET assembly. Some SFTP/FTP servers even support checksum calculation for remote files, so you would not have to download the files.


I'm sure it's doable, but I'm not a programmer.
martin
Site Admin
Joined: 2002-12-10
Posts: 24530
Location: Prague, Czechia
Maybe I'll write the script one day.
_________________
Martin Prikryl
martin
Site Admin
Joined: 2002-12-10
Posts: 24530
Location: Prague, Czechia
I wrote the script:
http://winscp.net/eng/docs/library_example_find_duplicate_files

Can you please test it?