I am writing a script that scans my system and deletes all duplicate files. To avoid problems while testing, I replaced the delete with a copy, and I am running it on a specific folder to make sure everything works. However, it raises an error that I do not know how to fix, and it always complains about the last file, whichever file that is.
This is what appears:
Traceback (most recent call last):
  File "...\Users\Unknown\AppData\Local\Programs\Python\Python37-32\lib\filecmp.py", line 51, in cmp
    s1 = _sig(os.stat(f1))
FileNotFoundError: [WinError 2] The system cannot find the file specified: '82110225_2471280396417255_4533531125607301120_n - Copia.jpg'
This is the code:
    import os, shutil, filecmp, itertools

    files = os.listdir('D:\\Scripts_Python\\Nova pasta\\')
    extension = ('.jpg')
    for filename in files:
        if filename.endswith(extension):
            for f1, f2 in itertools.combinations(files, 2):
                comp = filecmp.cmp(f1, f2, shallow=False)
                if comp == True:
                    shutil.copy(f2, 'D:\\Scripts_Python\\Nova pasta\\Nova pasta\\')
                    break
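A likely cause, offered as a sketch rather than a confirmed diagnosis: os.listdir returns bare file names, so filecmp.cmp looks for them relative to the current working directory instead of the folder that was listed. The example below joins each name to the folder before comparing. The function name find_duplicate_pairs and its parameters are hypothetical and do not come from the original post.

```python
import filecmp
import itertools
import os


def find_duplicate_pairs(folder, extension=".jpg"):
    """Return (first, second) full-path pairs whose contents are identical."""
    # os.listdir() yields bare names, so join each one to the folder;
    # otherwise filecmp.cmp() resolves them against the current directory.
    paths = [
        os.path.join(folder, name)
        for name in sorted(os.listdir(folder))
        if name.lower().endswith(extension)
    ]
    # Keep regular files only (a subfolder name could also match the extension).
    paths = [p for p in paths if os.path.isfile(p)]
    return [
        (f1, f2)
        for f1, f2 in itertools.combinations(paths, 2)
        if filecmp.cmp(f1, f2, shallow=False)
    ]
```

The extension filter is applied once, before pairing, so only matching files are ever compared; the second element of each returned pair is the candidate to copy (or, later, delete).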
I was also going to suggest hashlib, so that in a folder with 30 images the program does not end up reading about 900 files, but Python's filecmp is smart and already handles that. Look at its help: "Return value: True if the files are the same, False otherwise. This function uses a cache for past comparisons and the results, with cache entries invalidated if their stat information changes."
– jsbueno
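For reference, the hashlib tip mentioned in the comment above could look roughly like this: each file is read and hashed once, and files sharing a digest are grouped, instead of every pair being compared. This is only a sketch; the helper names file_digest and group_by_digest are hypothetical, and SHA-256 is an assumed choice of hash.

```python
import hashlib
from collections import defaultdict


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_by_digest(paths):
    """Map each digest to the paths sharing it; each file is read once."""
    groups = defaultdict(list)
    for path in paths:
        groups[file_digest(path)].append(path)
    # Only digests seen more than once indicate duplicates.
    return {d: ps for d, ps in groups.items() if len(ps) > 1}
```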
Also, it is not a good idea to recommend "os.path.abspath" at this stage of the game. Code like this should use everything it can from "pathlib": with the methods of "pathlib.Path" it becomes much more practical (and everything is in one place). The listdir, the extension check, and even reading a file and the "stat" are methods and properties of pathlib.Path objects.
– jsbueno
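A minimal sketch of what the pathlib suggestion above might look like, assuming the same kind of folder of .jpg files. The function names list_images and same_content are hypothetical; the listing, extension check, stat, and file read all go through pathlib.Path objects, which carry the full path with them.

```python
from pathlib import Path


def list_images(folder, suffix=".jpg"):
    """Return full Path objects for regular files with the given suffix."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


def same_content(p1, p2):
    """Compare two files by size first, then by their bytes."""
    # Different sizes mean different contents, so no read is needed.
    if p1.stat().st_size != p2.stat().st_size:
        return False
    return p1.read_bytes() == p2.read_bytes()
```

Because iterdir yields paths that already include the folder, there is no separate step to join names or make them absolute.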
@jsbueno: thanks for the comments! An edit follows.
– Lacobus