Make remove-duplicates behaviour reproducible and add test
samueltardieu committed Jan 5, 2020
1 parent 7faba0c commit f0ea524
Showing 4 changed files with 46 additions and 11 deletions.
7 changes: 3 additions & 4 deletions doc/remove-duplicates.md
@@ -1,6 +1,6 @@
 %REMOVE-DUPLICATES(1) User Manuals
 %Samuel Tardieu <[email protected]>
-%November 12, 2016
+%January 5, 2019

 # NAME

@@ -13,16 +13,15 @@ remove-duplicate [*-f*]
 # DESCRIPTION

 Removes duplicates of the same file in the current directory if *-f*
-is given. If *-f* is not given, duplicate will be identified twice
-(once in every direction).
+is given. If *-f* is not given, duplicates will be identified.

 # OPTIONS

 -f

 # COPYRIGHT

-Copyright (c) 2004-2016 Samuel Tardieu <[email protected]>.
+Copyright (c) 2004-2019 Samuel Tardieu <[email protected]>.
 This is free software; see the source for copying conditions. There is
 NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR
 PURPOSE.
19 changes: 13 additions & 6 deletions scripts/remove-duplicates
@@ -11,9 +11,9 @@

 import os

-def check_duplicate(orig, copy):
+def check_duplicate(orig, content, copy):
     try:
-        if open(orig).read() == open(copy).read():
+        if content == open(copy).read():
             print("Removing %s which is a copy of %s" % (copy, orig))
             os.unlink(copy)
     except:
@@ -28,10 +28,17 @@ def aggregate():
     return d

 def remove_duplicates(d):
-    for v in d.values():
-        while v:
-            del v[0]
-            for c in v[1:]: check_duplicate(v[0], c)
+    for v in sorted(d.values()):
+        if len(v) < 2:
+            continue
+        v.sort()
+        for (i, f1) in enumerate(v[:-1]):
+            try:
+                content = open(f1).read()
+                for f2 in v[i+1:]:
+                    check_duplicate(f1, content, f2)
+            except IOError:
+                continue

 if __name__ == '__main__':
     remove_duplicates(aggregate())
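
Note: the following is a minimal, non-destructive sketch of the deterministic ordering this change introduces; it is not the script itself. The body of `aggregate()` is not shown in the diff, so `group_by_size` is a hypothetical stand-in that assumes candidates are grouped by file size. `find_duplicates` is also an illustrative name. It only reports (copy, original) pairs instead of calling `os.unlink`, and it sorts each group before ordering the groups, which is an assumption that goes slightly beyond what the diff does.

```python
import os
import tempfile


def group_by_size(directory):
    """Hypothetical stand-in for aggregate(): map file size to file paths."""
    groups = {}
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            groups.setdefault(os.path.getsize(path), []).append(path)
    return groups


def find_duplicates(groups):
    """Return (copy, original) pairs in an order independent of listing order."""
    pairs = []
    removed = set()  # files already reported as copies are never reused
    # Sort each group, then the groups themselves, so the traversal is stable.
    for paths in sorted(sorted(p) for p in groups.values()):
        if len(paths) < 2:
            continue
        for i, first in enumerate(paths[:-1]):
            if first in removed:
                continue
            # Read the reference file once, as the new check_duplicate() expects.
            with open(first, "rb") as f:
                content = f.read()
            for second in paths[i + 1:]:
                if second in removed:
                    continue
                with open(second, "rb") as f:
                    if f.read() == content:
                        pairs.append((second, first))
                        removed.add(second)
    return pairs
```

Because every group is traversed in sorted order, the lexicographically smallest name in each set of identical files is always the one treated as the original, whatever order `os.listdir` returns.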
2 changes: 1 addition & 1 deletion tests/Makefile.am
@@ -1,4 +1,4 @@
-TESTS = chdir-ok.test chdir-not-ok.test
+TESTS = chdir-ok.test chdir-not-ok.test remove-duplicates.test
 XFAIL_TESTS = chdir-not-ok.test

 EXTRA_DIST = *.test
29 changes: 29 additions & 0 deletions tests/remove-duplicates.test
@@ -0,0 +1,29 @@
+#! /bin/sh
+#
+
+topsrcdir=$(cd $(dirname $0)/.. && pwd)
+REMOVE_DUPLICATES="$topsrcdir/scripts/remove-duplicates"
+trap "rm -rf $PWD/$0.dir" INT QUIT TERM EXIT
+mkdir "$0.dir"
+cd "$0.dir"
+
+echo foo > foo1.txt
+echo foo > foo2.txt
+echo foo > foo3.txt
+echo foo > foo4.txt
+echo bar > bar1.txt
+echo bar > bar2.txt
+echo baz > baz1.txt
+echo zab > baz2.txt
+
+[ $("$REMOVE_DUPLICATES" -f | wc -l) = 4 ] || exit 1
+
+[ $(ls foo?.txt | wc -l) = 1 ] || exit 2
+
+[ -f foo1.txt ] || exit 3
+
+[ $(ls bar?.txt | wc -l) = 1 ] || exit 4
+
+[ -f bar1.txt ] || exit 5
+
+[ $(ls baz?.txt | wc -l) = 2 ] || exit 6
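
Note: the values this test checks (four removal messages, `foo1.txt` and `bar1.txt` surviving, both `baz?.txt` files kept) can be derived from the fixture alone. The sketch below is a hypothetical model of that reasoning, not part of the commit. It assumes the first name in sorted order survives in each set of identical files, and the names `FIXTURE` and `expected_outcome` are illustrative.

```python
from collections import defaultdict

# File name -> content, mirroring the echo commands in the test script.
FIXTURE = {
    "foo1.txt": "foo\n", "foo2.txt": "foo\n",
    "foo3.txt": "foo\n", "foo4.txt": "foo\n",
    "bar1.txt": "bar\n", "bar2.txt": "bar\n",
    "baz1.txt": "baz\n", "baz2.txt": "zab\n",
}


def expected_outcome(files):
    """Return (kept, removed) name lists, assuming the sorted-first name survives."""
    by_content = defaultdict(list)
    for name, content in files.items():
        by_content[content].append(name)
    kept, removed = [], []
    for names in by_content.values():
        names.sort()
        kept.append(names[0])      # the smallest name is treated as the original
        removed.extend(names[1:])  # each remaining name yields one removal message
    return sorted(kept), sorted(removed)
```

Under these assumptions the model yields four removals (three `foo` copies and one `bar` copy), which matches the `wc -l` check, and leaves `baz1.txt` and `baz2.txt` untouched because their contents differ.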
