bash - Why does the diff utility show similar text in the result file?
I am using diff to find the differences between two text files. It works great, but when the order of the lines in the text files changes, it shows similar text in the result file.
Here is file1.txt:
>gi17 aaaaaa >gi30 bbbbbb >gi40 cccccc >gi92 dddddd >gi50 eeeeee >gi81 ffffff
file2.txt:
>gi40 cccccc >gi01 bbbbbb >gi02 aaaaaa >gi30 bbbbbb
result.txt:
>gi17 aaaaaa >gi30 ??? bbbbbb ??? >gi92 dddddd >gi01 bbbbbb >gi50 eeeeee >gi81 ffffff >gi02 aaaaaa >gi30 ??? bbbbbb ???
The diff command:
$ diff c:/users/user/desktop/file1.txt c:/users/user/desktop/file2.txt > c:/users/user/desktop/result.txt
Why does it display
>gi30 bbbbbb
as different?
Edit 1: I want to search for each line of file 1 in the whole of file 2, because the two files are not ordered and I cannot touch them (genetic data).
Edit 2: I want to execute the join command from PHP code. It runs in the Cygwin command-line application, but it did not run from PHP:
shell_exec("c:\\cygwin64\\bin\\bash.exe --login -c 'join -v 1 <(sort $olddatabasefile.txt) <(sort $newdatabasefile.txt) > $text_files_path/delseqgi.txt 2>&1'");
thanks.
To find the differences between the files, use the join utility in bash, as shown below:
DESCRIPTION
The join utility performs an ``equality join'' on the specified files and writes the result to the standard output. The ``join field'' is the field in each file by which the files are compared. The first field in each line is used by default. There is one line in the output for each pair of lines in file1 and file2 which have identical join fields. Each output line consists of the join field, the remaining fields from file1 and then the remaining fields from file2.

-v file_number
    Do not display the default output, but display a line for each unpairable line in file file_number. The options -v 1 and -v 2 may be specified at the same time.
-1 field
    Join on the field'th field of file1.
-2 field
    Join on the field'th field of file2.

join -v 1 <(sort file1.txt) <(sort file2.txt) # lines in file1.txt that file2.txt does not have
join -v 2 <(sort file1.txt) <(sort file2.txt) # vice versa of the above
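
To see what the two join commands report, here is a small sketch run against the sample data from the question. It rests on some assumptions that are not stated in the original: each record is taken to sit on one line with the ID as the first field (for example ">gi17 aaaaaa"), the files are recreated in a scratch directory, and the sorted copies are written to temporary files instead of using <(sort ...), which should be equivalent but also works in shells without process substitution. The original files are never modified.

```shell
# Sketch only: sample data recreated from the question, assuming one
# record per line with the ID as the first whitespace-separated field.
workdir=$(mktemp -d)              # scratch directory (hypothetical location)
trap 'rm -rf "$workdir"' EXIT     # clean up the scratch copies on exit

printf '%s\n' '>gi17 aaaaaa' '>gi30 bbbbbb' '>gi40 cccccc' \
              '>gi92 dddddd' '>gi50 eeeeee' '>gi81 ffffff' > "$workdir/file1.txt"
printf '%s\n' '>gi40 cccccc' '>gi01 bbbbbb' '>gi02 aaaaaa' \
              '>gi30 bbbbbb' > "$workdir/file2.txt"

# sort and join must agree on the collation order, so pin the locale.
export LC_ALL=C

# join needs sorted input; sort into copies so the originals stay untouched.
sort "$workdir/file1.txt" > "$workdir/file1.sorted"
sort "$workdir/file2.txt" > "$workdir/file2.sorted"

# Records whose ID appears only in file1.txt
only_in_file1=$(join -v 1 "$workdir/file1.sorted" "$workdir/file2.sorted")
# Records whose ID appears only in file2.txt
only_in_file2=$(join -v 2 "$workdir/file1.sorted" "$workdir/file2.sorted")

echo "Only in file1.txt:"
echo "$only_in_file1"
# >gi17 aaaaaa
# >gi50 eeeeee
# >gi81 ffffff
# >gi92 dddddd

echo "Only in file2.txt:"
echo "$only_in_file2"
# >gi01 bbbbbb
# >gi02 aaaaaa
```

Note that ">gi30 bbbbbb" and ">gi40 cccccc" do not appear in either list, regardless of where they sit in the files. Because join pairs lines on the first field only, a record with the same ID but a different sequence would also count as paired under this sketch.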
original answer/credits:- https://stackoverflow.com/a/4544780/5291015