Dear devs,

2 small Unix-related questions. Does anyone know about:

- a quick and easy way to remove duplicate lines from a text file?
- a quick and easy way to decide if two image (binary) files are the same image/picture?

Thanks in advance for your help!

-g.
on 2008-02-04 17:11
on 2008-02-04 17:32
Hi,

> - a quick and easy way to remove duplicate lines from a text files?

sort file | uniq > file.new && mv file.new file   # should work (if the sorting doesn't matter in the file)
uniq file > file.new && mv file.new file          # works if the duplicate lines are next to each other

(file.new is just a scratch name. Redirecting straight back with "> file" would empty the file before it is read, so write to a second file and rename it.)

> - a quick and easy way to decide if two image (binary) files are the same
> image/picture?

findimagedupes << search for that in google, I'm using one of those perl scripts.

> thanks in advance for your help!

Hope that helps,
Jo
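If the order of the lines does matter, sorting is not an option. A minimal sketch of an order-preserving alternative, assuming awk and mktemp are available; the function name dedupe_keep_order is made up for this example:

```shell
# dedupe_keep_order FILE (hypothetical helper name)
# Drops repeated lines from FILE, keeping the first occurrence of each
# line in its original position. Duplicates need not be adjacent.
dedupe_keep_order() {
    file=$1
    tmp=$(mktemp) || return 1
    # seen[$0]++ evaluates to 0 the first time a line is met, so
    # !seen[$0]++ is true only on first sight and awk prints the line.
    if awk '!seen[$0]++' "$file" > "$tmp"; then
        # Copy back instead of mv so FILE keeps its permissions.
        cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

Usage would be something like `dedupe_keep_order notes.txt`. awk keeps every distinct line in memory, so this suits files that fit comfortably in RAM.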
on 2008-02-04 17:37
On Feb 4, 2008 5:09 PM, George Moschovitis <george.moschovitis@gmail.com> wrote:
> Dear devs,
>
> 2 small unix related questions.
>
> does anyone know about:
>
> - a quick and easy way to remove duplicate lines from a text files?

uniq < filename

(assuming that the duplicate lines are consecutive; otherwise, you need to do a sort first, or keep the lines in a hash in either Ruby or Perl)

> - a quick and easy way to decide if two image (binary) files are the same
> image/picture?

Do you mean if they are identical? cmp should do it. Another approach is to get a checksum of the files (e.g. using "openssl rmd160 <filename>") and see if the checksum is the same.

Eivind.
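For use in a script, both suggestions can be wrapped so that the exit status carries the answer. This is only a sketch: the function names are invented here, and it uses POSIX cksum (a CRC) as a stand-in for the openssl digest, purely because cksum is present on nearly every system.

```shell
# same_bytes A B (hypothetical helper name)
# Succeeds (exit 0) when the two files are byte-for-byte identical.
# cmp -s prints nothing and only sets the exit status.
same_bytes() {
    cmp -s "$1" "$2"
}

# same_checksum A B (hypothetical helper name)
# Compares the cksum output (CRC plus byte count) of the two files.
# Reading from stdin keeps the file name out of the output.
same_checksum() {
    sum_a=$(cksum < "$1") || return 2
    sum_b=$(cksum < "$2") || return 2
    [ "$sum_a" = "$sum_b" ]
}
```

For example: `if same_bytes a.png b.png; then echo identical; fi`. Both helpers only detect identical files. The same picture saved in another format or at another size will compare as different. A CRC can also collide, so a cryptographic digest is the safer choice when the files come from untrusted sources.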
on 2008-02-04 17:58
On Mon, 2008-02-04 at 18:09 +0200, George Moschovitis wrote:
> Dear devs,
>
> 2 small unix related questions.
>
> does anyone know about:
>
> - a quick and easy way to remove duplicate lines from a text files?

from http://www.student.northpark.edu/pemente/sed/sed1line.txt

# delete duplicate, nonconsecutive lines from a file. Beware not to
# overflow the buffer size of the hold space, or else use GNU sed.
sed -n 'G; s/\n/&&/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'

> - a quick and easy way to decide if two image (binary) files are the
> same image/picture?

diff will also tell you if the files are different; it prints nothing when they are identical:

rthompso@raker ~ $ diff 10000_Galaxies,_HST_Ultra_Deep.png Deathvalleysky_nps_big.png
Files 10000_Galaxies,_HST_Ultra_Deep.png and Deathvalleysky_nps_big.png differ
rthompso@raker ~ $ cp 10000_Galaxies,_HST_Ultra_Deep.png junk.png
rthompso@raker ~ $ diff 10000_Galaxies,_HST_Ultra_Deep.png junk.png