Finding a new line in a file
Apr. 18th, 2014 02:40 pm
Today I helped a colleague who came with the question: I have two files, how do I find which lines were added to one file, but not to the other?
He was thinking of writing a program. I'm more of a KISS person: why waste time writing a program when brute force will do just fine?
So:
We have two files, a and b:

    pevaneyn@mac-book:/tmp :) $ cat a
    1
    2
    3
    4
    5
    pevaneyn@mac-book:/tmp :) $ cat b
    1
    2
    3
    4
    5
    7
    8

We want to see the lines in b which are not in a:

    pevaneyn@mac-book:/tmp :) $ cat a b | sort | uniq -u
    7
    8

So we take the two files, sort them, and then print the unique lines, that is, the lines that occur exactly once in the combined output.
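
The sort step matters because uniq only compares neighbouring lines. A small sketch of the difference, using made-up input rather than the files a and b (the variable names are just for illustration):

```shell
#!/bin/sh
# Unsorted input: the two "1" lines are not adjacent, so uniq -u keeps them both.
unsorted=$(printf '%s\n' 1 7 1 | uniq -u)

# Sorted input: the two "1" lines end up next to each other and are dropped.
sorted=$(printf '%s\n' 1 7 1 | sort | uniq -u)

echo "without sort:"
echo "$unsorted"
echo "with sort:"
echo "$sorted"
```

Without the sort, all three lines come back; with it, only the 7 survives.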
But what if there are also unique lines in a which we don't need? So let's add a line 0 to a, which we do not want to see in the output:

    pevaneyn@mac-book:/tmp :) $ cat >> a
    0
    pevaneyn@mac-book:/tmp :) $ cat a b | sort | uniq -u
    0
    7
    8

How do we remove this 0?
A trick is to include a twice; then every line of a occurs at least twice and will never be unique:

    pevaneyn@mac-book:/tmp :) $ cat a a b | sort | uniq -u
    7
    8

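
If you need this more than once, the trick fits in a small shell function. This is only a sketch, not what was typed in the session above: the function name lines_only_in_b and the scratch directory are made up for illustration. It also adds one step the session did not have, running the second file through sort -u first, because a line that occurs twice in b would otherwise not be unique either and would silently disappear from the output.

```shell
#!/bin/sh
# Hypothetical helper: print the lines that occur in file $2 but not in file $1.
lines_only_in_b() {
    # $1 is listed twice, so each of its lines occurs at least twice and
    # uniq -u (print only lines that occur exactly once) drops them.
    # sort -u collapses repeated lines inside $2 so they are not dropped too.
    sort -u "$2" | cat "$1" "$1" - | sort | uniq -u
}

# Rebuild the example files (a already including the extra 0) in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' 1 2 3 4 5 0 > "$tmpdir/a"
printf '%s\n' 1 2 3 4 5 7 8 > "$tmpdir/b"

result=$(lines_only_in_b "$tmpdir/a" "$tmpdir/b")
echo "$result"

rm -rf "$tmpdir"
```

For the example files this prints 7 and 8, the same as the one-liner. Note that the output comes out in sort order, not in the order the lines have in b.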
I used a similar method today to find which interface gave the CRC errors...
no subject
Date: 2014-04-18 04:22 pm (UTC)
Because that would be a la bash

    for LINE in `cat b`; do grep "$LINE" a 1>/dev/null 2>/dev/null || echo "$LINE"; done

(adjust IFS to suit) ;)
no subject
Date: 2014-04-19 07:52 am (UTC)