regex - Vim Ex command to remove non-duplicate records
I have a large file that I am trying to reduce to the neighboring lines with duplicated record IDs. (It has already been sorted.)
Example:
ab12345  10987654321 andy male
ab12345  10987654321 andrea female
cd34567  98765432100 andrea female
ef45678  54321098765 bobby tables
This should remove lines 3-4, leaving lines 1-2.
The following regex pattern finds the duplicate lines successfully, but the subsequent command does not remove the non-matching lines.
:/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+
:g!/\v^(\a{2}\d{5}\s{2}\d{11}).*\n(\1.*)+/d
Why aren't the non-matching lines being deleted?
Not a Vim solution, but this should work:
$ fgrep -f <(awk -v OFS='  ' '{print $1, $2}' data.txt | sort | uniq -d) data.txt
The <(...) is a bashism, and OFS='  ' has 2 spaces.
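If bash is not available, the same filtering can probably be done without process substitution. The following is only a sketch, not the method from the answer above: it swaps the fgrep/sort/uniq pipeline for a two-pass awk that reads the file twice, counting each (field 1, field 2) key on the first pass and printing only lines whose key occurred more than once on the second. The function name keep_dups is hypothetical, and the sample data.txt assumes the data from the question with two spaces between the first two fields.

```shell
#!/bin/sh
# Sample data from the question (assumed: two spaces after the record ID).
printf '%s\n' \
  'ab12345  10987654321 andy male' \
  'ab12345  10987654321 andrea female' \
  'cd34567  98765432100 andrea female' \
  'ef45678  54321098765 bobby tables' > data.txt

# keep_dups FILE (hypothetical helper name)
# Pass 1 (NR == FNR): count every (field 1, field 2) key.
# Pass 2: print only the lines whose key was counted more than once.
keep_dups() {
  awk 'NR == FNR { seen[$1, $2]++; next } seen[$1, $2] > 1' "$1" "$1"
}

keep_dups data.txt
```

On the sample data this prints the two ab12345 lines. Because awk splits on runs of whitespace by default, the key does not depend on the exact number of spaces, and the comparison is made on the first two fields only rather than on a fixed string anywhere in the line.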