
How to Identify Files with Identical Content on Linux

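Published: 2019-06-10 18:08:30 Category: Windows Source: Sandra Henry-stocker
Summary: File copies can amount to a huge waste of disk space and can cause confusion when you want to update a file. Here are six commands for identifying such files. In a recent post, we looked at how to identify and locate hard-linked files (that is, files that point to the same disk content and share an inode). In this article, we look at commands that can find files that have the same...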

You can run this command in dry-run mode (in other words, it only reports the changes that would otherwise be made).

  $ rdfind -dryrun true ~
  (DRYRUN MODE) Now scanning "/home/shark", found 12 files.
  (DRYRUN MODE) Now have 12 files in total.
  (DRYRUN MODE) Removed 1 files due to nonunique device and inode.
  (DRYRUN MODE) Total size is 699352 bytes or 683 KiB
  Removed 9 files due to unique sizes from list.2 files left.
  (DRYRUN MODE) Now eliminating candidates based on first bytes:removed 0 files from list.2 files left.
  (DRYRUN MODE) Now eliminating candidates based on last bytes:removed 0 files from list.2 files left.
  (DRYRUN MODE) Now eliminating candidates based on sha1 checksum:removed 0 files from list.2 files left.
  (DRYRUN MODE) It seems like you have 2 files that are not unique
  (DRYRUN MODE) Totally, 223 KiB can be reduced.
  (DRYRUN MODE) Now making results file results.txt
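
The dry-run output above shows the candidate list being narrowed in stages, and one of the early stages drops files whose size is unique. As a rough illustration of that single stage (not rdfind's actual implementation), the sketch below lists only files whose byte size is shared with at least one other file. The function name same_size_candidates is hypothetical, and the sketch assumes GNU find (for -printf) and file names without embedded newlines.

```shell
# Hypothetical helper: print files under a directory whose size in bytes is
# shared with at least one other file. A file with a unique size cannot have
# a duplicate, so it is dropped from the candidate list.
same_size_candidates() {
    dir="${1:-.}"
    # GNU find: %s is the size in bytes, %p is the path.
    find "$dir" -type f -printf '%s\t%p\n' |
        awk -F'\t' '
            { count[$1]++; line[NR] = $0; size[NR] = $1 }
            END {
                for (i = 1; i <= NR; i++)
                    if (count[size[i]] > 1) print line[i]
            }
        ' |
        cut -f2-    # strip the size column, keep the path
}
```

Running same_size_candidates ~ prints one path per line. The files it leaves are only candidates; their contents still have to be compared, which is what the later first-bytes, last-bytes, and checksum stages in the output do.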

The rdfind command also offers options such as ignoring empty files (-ignoreempty) and following symbolic links (-followsymlinks). See the man page for explanations.

  -ignoreempty ignore empty files
  -minsize ignore files smaller than specified size
  -followsymlinks follow symbolic links
  -removeidentinode remove files referring to identical inode
  -checksum identify checksum type to be used
  -deterministic determines how to sort files
  -makesymlinks turn duplicate files into symbolic links
  -makehardlinks replace duplicate files with hard links
  -makeresultsfile create a results file in the current directory
  -outputname provide name for results file
  -deleteduplicates delete/unlink duplicate files
  -sleep set sleep time between reading files (milliseconds)
  -n, -dryrun display what would have been done, but don't do it

Note that the rdfind command offers a -deleteduplicates true setting for deleting duplicates. Hopefully this small quirk in the command's syntax won't annoy you. ;-)

  $ rdfind -deleteduplicates true .
  ...
  Deleted 1 files. <==
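
Because deletion cannot be undone, it may be worth confirming that two files really match before removing one by hand. The following is a minimal sketch of that check using the standard cmp and rm utilities, not rdfind; the function name remove_if_identical is hypothetical.

```shell
# Hypothetical helper: delete "$2" only when it is a byte-for-byte copy of "$1".
remove_if_identical() {
    keep="$1"
    dup="$2"
    # Same path, or hard links to one inode: removing it would either lose
    # the only copy or reclaim no space, so leave it alone.
    if [ "$keep" -ef "$dup" ]; then
        echo "skipped (same file): $dup"
        return 1
    fi
    # cmp -s prints nothing and exits 0 only if the contents are identical.
    if cmp -s -- "$keep" "$dup"; then
        rm -- "$dup" && echo "deleted: $dup"
    else
        echo "kept (contents differ): $dup"
        return 1
    fi
}
```

For example, remove_if_identical original.txt copy.txt removes copy.txt only if the comparison succeeds; both file names here are placeholders.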

You may need to install the rdfind command on your system. Experimenting with it to get comfortable with how it works is probably a good idea.

Using the fdupes command

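The fdupes command also makes it easy to identify duplicate files. It provides a number of useful options as well, such as -r for recursion. In this example, it groups duplicate files together like this: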

  $ fdupes ~
  /home/shs/UPGRADE
  /home/shs/mytwin

  /home/shs/lp.txt
  /home/shs/lp.man

  /home/shs/penguin.png
  /home/shs/penguin0.png
  /home/shs/hideme.png
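
To make the grouping above more concrete, here is a small sketch that produces similar blank-line-separated groups by hashing each file with sha256sum. It is an illustration under stated assumptions, not how fdupes works internally: the function name group_duplicates is hypothetical, GNU find and coreutils are assumed, and file names containing newlines or backslashes are not handled.

```shell
# Hypothetical helper: print groups of files in a directory (not its
# subdirectories) that share a SHA-256 hash, one path per line, with a
# blank line after each group.
group_duplicates() {
    dir="${1:-.}"
    find "$dir" -maxdepth 1 -type f -exec sha256sum {} + |
        LC_ALL=C sort |
        awk '
            {
                hash = $1
                path = substr($0, 67)   # 64 hex digits + 2 separator characters
                if (hash != prev) {
                    if (n > 1) printf "%s\n", group
                    group = ""
                    n = 0
                }
                group = group path "\n"
                n++
                prev = hash
            }
            END { if (n > 1) printf "%s\n", group }
        '
}
```

Calling group_duplicates ~ checks only the top level of the home directory; dropping -maxdepth 1 from the find call would make it descend into subdirectories as well.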

Here is an example that uses recursion. Note that many of the duplicates are important files (users' .bashrc and .profile files) and should not be deleted.

  # fdupes -r /home
  /home/shark/home.html
  /home/shark/index.html

  /home/dory/.bashrc
  /home/eel/.bashrc

  /home/nemo/.profile
  /home/dory/.profile
  /home/shark/.profile

  /home/nemo/tryme
  /home/shs/tryme

  /home/shs/arrow.png
  /home/shs/PNGs/arrow.png

  /home/shs/11/files_11.zip
  /home/shs/ERIC/file_11.zip

  /home/shs/penguin0.jpg
  /home/shs/PNGs/penguin.jpg
  /home/shs/PNGs/penguin0.jpg

  /home/shs/Sandra_rotated.png
  /home/shs/PNGs/Sandra_rotated.png
