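I am trying to copy one column from a CSV file into another .csv file. However, that single column is very messy: it contains double quotes and commas. For example: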
fileA.csv
A,B,C,D,E,F,G,H
I,J,K,L,M,N,O,P
...
and
fileB.csv
1,2,3,4,5,"has "commas," and \"quotes\"",7,8
10,11,12,13,14,"another "commas," and \"quotes\"",15,16
I want to replace the sixth column (F and N) with the column at the same position in fileB.csv,
so the result would be:
A,B,C,D,E,"has "commas," and "\"quotes\""",G,H
I,J,K,L,M,"another "commas," and \"quotes\"",O,P
I tried using
paste -d' ' 123.csv <(awk '{print $6}' realfinalfile.csv) > finalwoot.csv
but I only got the contents of 123.csv, without the column from realfinalfile.csv.
Here is an example of one of the rows in the actual fileB.csv:
"R111_Bellca_LiveContent_SHP","bell.ca","BCACXB-6912","No_Request_Validation","20","*No_Request_Validation* issue exists @ *Views/Search/Web.config*
Request validation is explicitly disabled by version="1.0"?> in file Views\Search\Web.config at line 1.
*Application:* R111_Bellca_LiveContent_SHP
*Cx-Project:* R111_Bellca_LiveContent_SHP
*Cx-Team:* CxServer\Bell\DCX\Bell.ca
*Severity:* Medium
*CWE:* 20
*Addition Info*
----
[Checkmarx|https://cwypwa-368.bell.corp.bce.ca/CxWebClient/ViewerMain.aspx?scanid=1000353&projectid=136&pathid=184]
[Mitre Details|https://cwe.mitre.org/data/definitions/20.html]
[Training|https://cxa.codebashing.com/courses/]
[Guidance|https://custodela.atlassian.net/wiki/spaces/AS/pages/79462432/Remediation+Guidance]
Lines: 41
----
Line #41
{code}
validateRequest=""false""
{code}
----
","3-Medium","https://cwe.mitre.org/data/definitions/20.html"
So I want to take the contents of the cell that looks like
*No_Request_Validation* issue exists @ *Views/Search/Web.config*
Request validation is explicitly disabled by version...
and put it into the sixth column of FileA.csv.
Answer 0 (score: 0)
Here is how to do what you asked for:
$ cat tst.awk
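BEGIN { FS=OFS="," }
# First file (fileB.csv): strip the first 5 fields and the last 2,
# leaving only the messy 6th field, and store it by line number.
NR==FNR {
    gsub(/^([^,]*,){5}|(,[^,]*){2}$/,"")
    val[FNR] = $0
    next
}
# Second file (fileA.csv): overwrite field 6 with the stored value.
{
    $6 = val[FNR]
    print
}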
$ awk -f tst.awk fileB.csv fileA.csv
A,B,C,D,E,"has "commas," and \"quotes\"",G,H
I,J,K,L,M,"another "commas," and \"quotes\"",O,P
However, just like your input, that output is still not valid CSV. If you want the output to be valid CSV, change the script to:
$ cat tst.awk
BEGIN { FS=OFS=","; escQ="\\\"" }
NR==FNR {
    gsub(/^([^,]*,){5}|(,[^,]*){2}$/,"")
    gsub(/^"|"$/,"")
    gsub(/\\?"/,escQ)
    val[FNR] = "\"" $0 "\""
    next
}
{
    $6 = val[FNR]
    print
}
$ awk -f tst.awk fileB.csv fileA.csv
A,B,C,D,E,"has \"commas,\" and \"quotes\"",G,H
I,J,K,L,M,"another \"commas,\" and \"quotes\"",O,P
Or (just change escQ="\\\"" to escQ="\"\""):
$ cat tst.awk
BEGIN { FS=OFS=","; escQ="\"\"" }
NR==FNR {
    gsub(/^([^,]*,){5}|(,[^,]*){2}$/,"")
    gsub(/^"|"$/,"")
    gsub(/\\?"/,escQ)
    val[FNR] = "\"" $0 "\""
    next
}
{
    $6 = val[FNR]
    print
}
$ awk -f tst.awk fileB.csv fileA.csv
A,B,C,D,E,"has ""commas,"" and ""quotes""",G,H
I,J,K,L,M,"another ""commas,"" and ""quotes""",O,P
Which of the two you use depends on whether the CSV "standard" you follow represents a double quote inside a field as \" or as "".
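For comparison, here is a minimal sketch of the same two escaping conventions using Python's standard csv module instead of awk. The helper name write_row and the style labels "doubled" and "backslash" are made up for illustration; only the csv.writer options (doublequote, escapechar, lineterminator) come from the library.

```python
import csv
import io


def write_row(row, style):
    """Serialize one row as CSV text, escaping embedded quotes per `style`."""
    buf = io.StringIO()
    if style == "doubled":
        # Embedded " is written as "" (the csv module's default behavior).
        writer = csv.writer(buf, doublequote=True, lineterminator="\n")
    else:
        # Embedded " is written as \" instead.
        writer = csv.writer(buf, doublequote=False, escapechar="\\",
                            lineterminator="\n")
    writer.writerow(row)
    return buf.getvalue()


row = ["E", 'has "commas," and "quotes"', "G"]
print(write_row(row, "doubled"), end="")    # E,"has ""commas,"" and ""quotes""",G
print(write_row(row, "backslash"), end="")  # E,"has \"commas,\" and \"quotes\"",G
```

Whichever style you pick, the program that later reads the file has to be configured the same way, otherwise the field will be split in the wrong place.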
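Note: the above only works if you have a known number of "fields" in each record, each record is on a single line, and only one of the "fields" contains quotes and commas (as in your example).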
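If the real fileB.csv has records that span several lines, as the longer sample in the question suggests, the line-based awk approach will not apply. One possible alternative, sketched here under the assumption that both files are valid CSV using doubled quotes (""), is to let a CSV parser do the splitting. The function name replace_column and the in-memory sample data are hypothetical; with real files you would pass file objects opened with newline="".

```python
import csv
import io


def replace_column(file_a, file_b, out, col=5):
    """Copy rows of file_a to out, taking column `col` (0-based) from file_b.

    Rows are paired by position; pairing stops at the end of the shorter file.
    """
    writer = csv.writer(out, lineterminator="\n")
    for row_a, row_b in zip(csv.reader(file_a), csv.reader(file_b)):
        row_a[col] = row_b[col]
        writer.writerow(row_a)


# Illustrative data: the 6th field of the first fileB record spans two lines.
file_a = io.StringIO("A,B,C,D,E,F,G,H\nI,J,K,L,M,N,O,P\n")
file_b = io.StringIO(
    '1,2,3,4,5,"line one\nsays ""hi"", ok",7,8\n'
    "10,11,12,13,14,plain,15,16\n"
)
merged = io.StringIO()
replace_column(file_a, file_b, merged)
print(merged.getvalue(), end="")
```

This does not help with the simplified fileB.csv sample from the question, since its unescaped inner quotes are not valid CSV to begin with.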