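How do I read one csv file into an array, compare it against the entries in another csv file, and replace the matches?

Time: 2019-01-21 06:21:06

Tags: csv awk sed

I have two csv files, file1.csv and file2.csv.
file1.csv contains 4 columns.

File 1: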

Header1,Header2,Header3,Header4
aaaaaaa,bbbbbbb,ccccccc,ddddddd
eeeeeee,fffffff,ggggggg,hhhhhhh
iiiiiii,jjjjjjj,kkkkkkk,lllllll
mmmmmmm,nnnnnnn,ooooooo,ppppppp

File 2:

"Header1","Header2","Header3"
"aaaaaaa","cat","dog"
"iiiiiii","doctor","engineer"
"mmmmmmm","sky","blue"

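What I want to do is read file1.csv line by line, put each entry into an array, and compare the first element of that array with the first column of file2.csv. If there is a match, column 1 and column 2 of file1.csv should be replaced with the corresponding columns of file2.csv.

So my desired output is: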

cat,dog,ccccccc,ddddddd
eeeeeee,fffffff,ggggggg,hhhhhhh
doctor,engineer,kkkkkkk,lllllll
sky,blue,ooooooo,ppppppp

I was able to do this when there was only one column to replace.
Here is my code:

awk -F'"(,")?' '
NR==FNR { r[$2] = $3; next }
{ for (n in r) gsub(n,r[n]) } IGNORECASE=1' file2.csv file1.csv>output.csv

My final step is to dump the whole array to a file with 10 columns. Any suggestions to improve or correct my code?
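As a side note on the attempt above, here is a minimal sketch of how the separator regex "(,")? splits one quoted line of file2.csv. It is only meant to show why the code uses $2 as the key and $3 as the replacement; the variable name split_demo is an assumption made for this illustration.

```shell
# Print every field that the separator regex "(,")? produces for a quoted line.
# split_demo is a hypothetical variable name used only in this sketch.
split_demo=$(printf '%s\n' '"aaaaaaa","cat","dog"' |
  awk -F'"(,")?' '{ for (i = 1; i <= NF; i++) printf "%d=[%s]\n", i, $i }')
printf '%s\n' "$split_demo"
```

The leading quote yields an empty $1, so the key lands in $2 and the first replacement in $3. The second replacement column ends up in $4, which the code never stores, and gsub(n,r[n]) treats the key as a regex that may match anywhere in the line rather than only in column 1.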

3 answers:

Answer 0: (score: 3)

Edit: Considering that your Input_file2 has data in a format such as

"ytest","test2"

please try the following. (Thanks to Tiw for providing this sample in his/her post.)

awk '
BEGIN{
  FS=OFS=","
}
FNR==NR{
  gsub(/\"/,"")
  a[tolower($1)]=$0
  next
}
a[tolower($1)]{
  print a[tolower($1)],$NF
  next
}
1' file2.csv file1.csv

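Alternatively, if the fields in Input_file2 are not quoted and the keys in both files always match in case, the simpler version below is enough. (The version above is the one to use when the Input_file may contain a mix of lowercase and uppercase letters, since it compares keys through tolower.)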

awk '
BEGIN{
  FS=OFS=","
}
FNR==NR{
  a[$1]=$0
  next
}
a[$1]{
  print a[$1],$NF
  next
}
1'  Input_file2  Input_file1

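Answer 1: (score: 2)

Given your sample data and the descriptions in your comments, please try this:
(Judging from your own code, you may have quotes around the fields, so the commands below strip them from file2.csv before matching.)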

awk 'BEGIN{FS=OFS=","}
    NR==FNR{gsub(/^"|"$/,"");gsub(/","/,",");a[$1]=$2;b[$1]=$3;next}
    $1 in a{$2=b[$1];$1=a[$1];}
    1' file2.csv file1.csv

For example:

$ cat file1.csv
Header1,Header2,Header3,Header4
aaaaaaa,bbbbbbb,ccccccc,ddddddd
eeeeeee,fffffff,ggggggg,hhhhhhh
iiiiiii,jjjjjjj,kkkkkkk,lllllll
mmmmmmm,nnnnnnn,ooooooo,ppppppp

$ cat file2.csv
"Header1","Header2","Header3"
"aaaaaaa","cat","dog"
"iiiiiii","doctor","engineer"
"mmmmmmm","sky","blue"

$ awk 'BEGIN{FS=OFS=","}
NR==FNR{gsub(/^"|"$/,"");gsub(/","/,",");a[$1]=$2;b[$1]=$3;next}
$1 in a{$2=b[$1];$1=a[$1];}
1' file2.csv file1.csv
Header2,Header3,Header3,Header4
cat,dog,ccccccc,ddddddd
eeeeeee,fffffff,ggggggg,hhhhhhh
doctor,engineer,kkkkkkk,lllllll
sky,blue,ooooooo,ppppppp

Another way, more verbose, but I think easier to understand (GNU awk):

awk 'BEGIN{FS=OFS=","}
    NR==FNR{for(i=1;i<=NF;i++)$i=gensub(/^"(.*)"$/,"\\1",1,$i);a[$1]=$2;b[$1]=$3;next}
    $1 in b{$2=b[$1];}
    $1 in a{$1=a[$1];}
    1' file2.csv file1.csv

Note one pitfall: because $1 is the lookup key, it has to be changed last.

Case-insensitive solution:
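The pitfall can be seen in a small sketch that hard-codes two one-entry lookup tables and applies the two assignments in both orders. The input row and the names wrong_order and right_order are made up for illustration.

```shell
# Two tiny lookup tables keyed on the original first field ("k").
# wrong_order and right_order are hypothetical names used only in this sketch.
wrong_order=$(printf 'k,x,y\n' | awk 'BEGIN{FS=OFS=","; a["k"]="A"; b["k"]="B"}
    $1 in a{$1=a[$1]; $2=b[$1]}   # $1 is already "A" here, so b["A"] is empty
    1')
right_order=$(printf 'k,x,y\n' | awk 'BEGIN{FS=OFS=","; a["k"]="A"; b["k"]="B"}
    $1 in a{$2=b[$1]; $1=a[$1]}   # use the key first, overwrite it last
    1')
printf 'wrong: %s\nright: %s\n' "$wrong_order" "$right_order"
```

With the key overwritten first, the second lookup misses and column 2 becomes empty (A,,y); with the key overwritten last, both columns are replaced (A,B,y).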

awk 'BEGIN{FS=OFS=","}
    NR==FNR{gsub(/^"|"$/,"");gsub(/","/,",");k=tolower($1);a[k]=$2;b[k]=$3;next}
    {k=tolower($1);if(a[k]){$2=b[k];$1=a[k]}}
    1' file2.csv file1.csv

For brevity, a variable k was added and the "if" was moved inside the action.

Answer 2: (score: 2)

With any awk and any number of fields in either file:
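One detail of the if(a[k]) test may be worth knowing: it checks the stored value, not whether the key exists, so a key whose replacement is an empty string is skipped. A possible variant, sketched here under the assumption that empty replacements should still be applied, uses the membership test k in a instead. The sample rows, the temporary directory, and the name in_test are invented for this example.

```shell
# Hypothetical sample data: the first replacement value for "aaaaaaa" is empty.
tmpdir=$(mktemp -d)
printf '%s\n' '"aaaaaaa","","dog"' > "$tmpdir/file2.csv"
printf '%s\n' 'AAAAAAA,bbbbbbb,ccccccc' 'eeeeeee,fffffff,ggggggg' > "$tmpdir/file1.csv"

in_test=$(awk 'BEGIN{FS=OFS=","}
    NR==FNR{gsub(/^"|"$/,"");gsub(/","/,",");k=tolower($1);a[k]=$2;b[k]=$3;next}
    {k=tolower($1);if(k in a){$2=b[k];$1=a[k]}}   # membership test, not truthiness
    1' "$tmpdir/file2.csv" "$tmpdir/file1.csv")
printf '%s\n' "$in_test"
rm -rf "$tmpdir"
```

Here the first row comes out as ,dog,ccccccc, whereas the if(a[k]) form would leave it unchanged. The membership test also avoids creating empty array entries for keys that are only looked up.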

$ cat tst.awk
BEGIN { FS=OFS="," }
{
    gsub(/"/,"")
    key = tolower($1)
}
NR==FNR {
    for (i=2; i<=NF; i++) {
        map[key,i] = $i
    }
    next
}
{
    for (i=2; i<=NF; i++) {
        $(i-1) = ((key,i) in map ? map[key,i] : $(i-1))
    }
    print
}

$ awk -f tst.awk file2 file1
Header2,Header3,Header3,Header4
cat,dog,ccccccc,ddddddd
eeeeeee,fffffff,ggggggg,hhhhhhh
doctor,engineer,kkkkkkk,lllllll
sky,blue,ooooooo,ppppppp
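To illustrate the "any number of fields" claim, the sketch below runs the same logic inline on two made-up files of different widths. The file contents, the temporary directory, and the name wide_out are assumptions for this example only.

```shell
# Hypothetical inputs: file2 has 4 quoted columns, file1 has rows of 5 and 3 columns.
tmpdir=$(mktemp -d)
printf '%s\n' '"K1","x","y","z"' > "$tmpdir/file2"
printf '%s\n' 'k1,b,c,d,e' 'k2,b,c' > "$tmpdir/file1"

wide_out=$(awk 'BEGIN { FS=OFS="," }
{ gsub(/"/,""); key = tolower($1) }
NR==FNR { for (i=2; i<=NF; i++) map[key,i] = $i; next }
{
    # Column i of file2 replaces column i-1 of file1 when the key matches.
    for (i=2; i<=NF; i++) $(i-1) = ((key,i) in map ? map[key,i] : $(i-1))
    print
}' "$tmpdir/file2" "$tmpdir/file1")
printf '%s\n' "$wide_out"
rm -rf "$tmpdir"
```

The matching row becomes x,y,z,d,e: three columns are replaced because file2 supplies three values, and the rest are kept. The row with the unknown key k2 passes through unchanged.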