pvaneynd: (Default)
[personal profile] pvaneynd
Today I helped a collegue who came with the question: I have two files, how do I find which lines were added to one file, but not to the other?

He was thinking of a program to write. I'm more a KISS person, why waste time writing a program when brute force will do just fine.

So:

We have two files a and b:

pevaneyn@mac-book:/tmp :) $ cat a
1
2
3
4
5
pevaneyn@mac-book:/tmp :) $ cat b
1
2
3
4
5
7
8


We want to see the lines in b which are not in a:

pevaneyn@mac-book:/tmp :) $ cat a b | sort | uniq -u
7
8


So we take the two files, sort then and then print the unique lines.

But what if there are also unique lines in a which we don't need? So let's add a line to 0 which we do not want to see in the output:

pevaneyn@mac-book:/tmp :) $ cat >> a
0
pevaneyn@Pmac-book:/tmp :) $ cat a b | sort | uniq -u
0
7
8


How do we remove this 0?

A trick is to include a twice, then a line in a will never be unique:

pevaneyn@mac-book:/tmp :) $ cat a a b | sort | uniq -u
7
8


I used a similar method today to find which interface gave the CRC errors...

Date: 2014-04-18 04:22 pm (UTC)
rbarclay: (laughingcat)
From: [personal profile] rbarclay
Nonono, that is elegant, but not real Brute Force(tm).

Because that would be a la bash for LINE in `cat b`; do grep "$LINE" a 1>/dev/null 2>/dev/null || echo "$LINE"; done (adjust IFS to suite) ;)

Profile

pvaneynd: (Default)
pvaneynd

March 2017

S M T W T F S
   1234
567891011
12131415 161718
19202122232425
26272829 3031 

Most Popular Tags

Page Summary

Style Credit

Expand Cut Tags

No cut tags
Page generated May. 24th, 2017 05:39 pm
Powered by Dreamwidth Studios