Here is a brief sample of my .csv file:
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","","N"
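In the third column (the "Z" column), some cells are empty (rows 3, 6, and 9). Preferably using awk or sed, I want to target column 3 specifically and delete the entire row whenever that cell is blank. My end result would be: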
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
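For my actual project, https://github.com/drphillgood/riotapidata/blob/master/csv/game3.csv is an exact copy of one of my files. In column 28 (participants__participantId) you will see that only some cells contain data (the same goes for the last column, participants__playerName). If a cell in this column is blank, I want to delete the entire row using a .sh script. The final file would look like this: https://github.com/drphillgood/riotapidata/blob/master/csv/game3_v2.csv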
Answer 0 (score: 3)
A simpler AWK command:
awk -F , '$3 != "\"\"" {print}' inputfile > outputfile
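This sets the field separator to a comma and prints every line whose third field is anything other than an empty pair of quotes ("").
It is not robust enough to handle CSV files whose fields contain commas, and it expects empty fields to be written as empty quotes.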
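As a minimal sketch of one way to relax the second limitation, the comparison could treat both a quoted empty field ("") and a completely empty field as blank. The function name drop_blank_rows and its arguments are hypothetical and not part of the answer. The sketch still splits on every comma, so fields that contain commas remain unsupported.

```shell
#!/bin/sh
# drop_blank_rows (hypothetical helper): print only the rows whose
# chosen column is neither empty nor an empty pair of quotes.
#   $1 = 1-based column number
#   $2 = input CSV file
drop_blank_rows() {
    awk -F, -v col="$1" '$col != "" && $col != "\"\""' "$2"
}

# Possible usage with the sample from the question:
#   drop_blank_rows 3 inputfile > outputfile
```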
Answer 1 (score: 1)
Here is an awk script that does the job.
awk -F '","' '!$3{next}1' input.csv
Output:
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
Explanation:
BEGIN {FS = "\",\""} # input line field separator ","
!$3{next} # if empty string in 3rd input field, skip
1 # print current line
Following the comments, here is a test against the CSV file provided in the link.
Testing field $28:
awk -F '","' '!$28{next}1' input.txt | awk -F '","' '{print $28}'
Output:
participants__participantId
1
2
3
4
5
6
7
8
9
10
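Since the question asks for a .sh script, the filter might be wrapped as follows. This is a sketch under assumptions: the function name filter_col is hypothetical, and it compares the field with the empty string explicitly instead of negating it, because a negated field is also true for a cell that contains 0, which would drop that row as well. With "," (quotes included) as the separator, the first and last fields keep their outer quote, so the approach only suits interior columns.

```shell
#!/bin/sh
# filter_col (hypothetical wrapper): keep the rows whose chosen interior
# column is non-empty, splitting fields on the three characters ",".
#   $1 = 1-based column number
#   $2 = input CSV file
filter_col() {
    awk -F '","' -v col="$1" '$col != ""' "$2"
}

# Possible usage with the file names from the question:
#   filter_col 28 game3.csv > game3_v2.csv
```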
Answer 2 (score: 1)
It can be done with a sed command:
sed -r -n '/^([^,]*,){27}""/! p' yourfile
Use 27 for the full file and 2 for the minimal example: the number in braces is the count of fields that precede the column to be checked.
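The rule just stated (fields before the column equals the column number minus one) can be sketched as a small shell function. The name sed_drop_blank is hypothetical. -E is used here as the more portable spelling of GNU sed's -r, and the trailing (,|$) is an added assumption that requires the empty quotes to make up the whole field.

```shell
#!/bin/sh
# sed_drop_blank (hypothetical helper): print the rows whose chosen
# column is not an empty pair of quotes.
#   $1 = 1-based column number (1 or greater)
#   $2 = input CSV file
sed_drop_blank() {
    n=$(( $1 - 1 ))   # number of fields before the column to check
    # The count is spliced in between two single-quoted pieces so that
    # the shell leaves $ and ! in the sed script untouched.
    sed -E -n '/^([^,]*,){'"$n"'}""(,|$)/!p' "$2"
}

# Possible usage: column 28 of the full file, column 3 of the sample.
#   sed_drop_blank 28 yourfile
```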
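The script '/^([^,]*,){27}""/! p' prints (the p command together with the -n option) only the lines that do not match the condition (the ! after /.../ negates the match):
^ anchors the match at the start of the line
([^,]*,){27} matches 27 comma-terminated fields, which may degenerate to 27 bare commas
"" matches a pair of double quotes at the start of the next field
Answer 3 (score: 1)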
awk -F, '$3 ~ /"Z"/{print $0}' file
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"