remove duplicate lines in a file


jack777
25th April 2008, 07:00
Hi,
I have a file as follows:

==================================================
Date Time error code server
==================================================
12-29-08 10:10 121221 server A
12-29-08 10:12 121221 server A
12-29-08 10:10 121221 server B
12-29-08 10:10 121221 server c


I need to remove the duplicate lines under the following condition:

If two lines have the same date (e.g. 12-29-08) and the same server (e.g. server A),
the line with the older time must be deleted and the one with the latest time kept.

Can anyone suggest how to do this?

Thanks in advance

burschik
15th May 2008, 13:24
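#!/usr/bin/awk -f
# Assumes lines with the same date and server are adjacent and sorted by time,
# so the last line of each run is the latest one.

# A dynamic regex has to be stored as a string; a bare /.../ here would be
# evaluated as a match against $0 and yield 0 or 1.
BEGIN { pattern = "^[[:digit:]]+-[[:digit:]]+-[[:digit:]]+$"; }

# Header, separator and blank lines are printed unchanged.
$1 !~ pattern { print; }

$1 ~ pattern {
    # A new date or server starts a new run: print the last line of the old one.
    if (seen && ($1 != previous_date || $5 != previous_server)) {
        print previous_line;
    }

    previous_date = $1;
    previous_server = $5;
    previous_line = $0;
    seen = 1;
}

# Print the last line of the final run, if there was any data at all.
END { if (seen) print previous_line; }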
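
The script above relies on lines for the same date and server being adjacent and in time order. If that cannot be guaranteed, a variant that remembers the latest line per (date, server) pair might look like the sketch below. It is only an illustration under some assumptions: the function name dedupe_latest is made up, the time in $2 is taken to be zero-padded 24-hour HH:MM (so plain string comparison orders it correctly), and the server name is taken to be $5 as in the sample.

```shell
#!/bin/sh
# Keep only the latest line for each (date, server) pair; input need not be sorted.
# Assumed fields: $1 = date, $2 = zero-padded HH:MM time, $5 = server name.
dedupe_latest() {
    awk '
    # Header, separator and blank lines pass through unchanged.
    $1 !~ /^[0-9]+-[0-9]+-[0-9]+$/ { print; next }
    {
        key = $1 SUBSEP $5
        if (!(key in latest_time)) {
            order[++count] = key          # remember first-seen order of the pairs
            latest_time[key] = $2
            latest_line[key] = $0
        } else if (($2 "") >= (latest_time[key] "")) {
            # Forced string comparison; valid for zero-padded HH:MM values.
            latest_time[key] = $2
            latest_line[key] = $0
        }
    }
    END {
        for (i = 1; i <= count; i++) print latest_line[order[i]]
    }
    ' "$@"
}
```

It can be called as dedupe_latest logfile or fed from a pipe. Note that non-data lines are printed as soon as they are read, while the surviving data lines are only printed at the end, so any header is expected at the top of the file.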