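I need to delete lines once a pattern has already matched twice, so that each pattern appears at most 2 times.
Sample input: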
5006719,9845861877,"2014-10-01 07:53:26","2014-10-01 11:52:15",Expired
5006720,9845885761,"2014-10-01 07:53:11","2014-10-01 11:52:00",Recieved
5006720,9845885761,"2014-10-01 07:53:26","2014-10-01 11:52:15",Expired
5006720,9845885761,"2014-10-01 07:53:27","2014-10-01 11:52:16",Expired
5006720,9845885761,"2014-10-01 10:36:24","2014-10-01 12:35:13",Expired
5006721,9845888313,"2014-10-01 07:53:11","2014-10-01 11:52:01",Expired
5006721,9845888313,"2014-10-01 07:53:27","2014-10-01 11:52:16",Expired
5006722,9848157771,"2014-10-01 07:53:13","2014-10-01 11:52:02",Expired
5006722,9848157771,"2014-10-01 07:53:28","2014-10-01 11:52:17",Expired
5006722,9848157771,"2014-10-01 07:53:29","2014-10-01 11:52:18",Expired
5006723,9848497273,"2014-10-01 07:53:13","2014-10-01 11:52:03",Expired
5006723,9848497273,"2014-10-01 07:53:29","2014-10-01 11:52:18",Expired
5006723,9848497273,"2014-10-01 07:53:30","2014-10-01 11:52:19",Expired
5006723,9848497273,"2014-10-01 10:36:25","2014-10-01 12:35:14",Expired
5006724,9848788789,"2014-10-01 07:53:14","2014-10-01 11:52:04",Expired
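The pattern to match is the first column, e.g. 5006719. Any record beyond the second one with the same first-column value should be removed. The result set should be: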
5006719,9845861877,"2014-10-01 07:53:26","2014-10-01 11:52:15",Expired
5006720,9845885761,"2014-10-01 07:53:11","2014-10-01 11:52:00",Recieved
5006720,9845885761,"2014-10-01 07:53:26","2014-10-01 11:52:15",Expired
5006721,9845888313,"2014-10-01 07:53:11","2014-10-01 11:52:01",Expired
5006721,9845888313,"2014-10-01 07:53:27","2014-10-01 11:52:16",Expired
5006722,9848157771,"2014-10-01 07:53:13","2014-10-01 11:52:02",Expired
5006722,9848157771,"2014-10-01 07:53:28","2014-10-01 11:52:17",Expired
5006723,9848497273,"2014-10-01 07:53:13","2014-10-01 11:52:03",Expired
5006723,9848497273,"2014-10-01 07:53:29","2014-10-01 11:52:18",Expired
5006724,9848788789,"2014-10-01 07:53:14","2014-10-01 11:52:04",Expired
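A single entry should stay single, a double entry should stay double, and three or more entries should be trimmed down to two. Note: we cannot match on the whole line here, only on the first column.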
Answer 0 (score: 1)
I'm not familiar with shell scripting, so I solved the problem in PHP:
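<?php
$inputFile = 'sort.csv';  // input: CSV sorted by the first column
$outputFile = 'f1.csv';   // output: at most two rows per first-column value
$in = fopen($inputFile, 'r');
$out = fopen($outputFile, 'w');  // opened once, instead of reopening it for every row
if ($in === false || $out === false) {
    exit("Cannot open input or output file\n");
}
$last = null;  // first-column value of the previous row
$count = 0;    // rows already written for the current value
while (($row = fgets($in)) !== false) {  // loop through each record
    $key = explode(',', $row)[0];
    if ($key !== $last) {  // a new value starts: reset the counter
        $last = $key;
        $count = 0;
    }
    if ($count < 2) {  // keep only the first two rows per value
        fwrite($out, $row);
        $count++;
    }
}
fclose($in);
fclose($out);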
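Since the question was asked with shell in mind, here is a minimal sketch of the same idea using awk. It is not part of the PHP answer above. The function name `keep_first_two` is a hypothetical helper chosen for illustration; the file names `sort.csv` and `f1.csv` are the ones used in the PHP code. One difference to be aware of: the PHP loop counts consecutive rows and so relies on the input being sorted by the first column, while this sketch counts every occurrence of a value across the whole input.

```shell
# Keep at most the first two lines for each value of the first
# comma-separated field. keep_first_two is a hypothetical helper name;
# it reads from the files given as arguments, or from stdin if none are given.
keep_first_two() {
    # seen[$1] counts how often the first field has appeared so far;
    # a line is printed only while that count is 1 or 2.
    awk -F, '++seen[$1] <= 2' "$@"
}

# Usage, assuming the file names from the PHP answer:
#   keep_first_two sort.csv > f1.csv
```

The fields are split on every comma, which is safe here only because the first column never contains a quoted comma.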