Using bash (awk / sed) to delete rows that have no data in column 28

Time: 2019-06-23 10:17:36

Tags: bash shell csv awk sed

Here is a brief example of my .csv file:

"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","","N"

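In the third column (the "Z" column), some cells are missing (rows 3, 6 and 9). Preferably using awk or sed, I want to target column 3 specifically and delete the entire row whenever that cell is blank. My final result would be: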

"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"

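For my actual project, here is an exact copy of one of my files: https://github.com/drphillgood/riotapidata/blob/master/csv/game3.csv. In column 28 (participants__participantId) you will see that only some of the cells contain data (the same goes for the last column, participants__playerName). If a cell in this column is blank, I want to delete the entire row using a .sh script. The final file would look like this: https://github.com/drphillgood/riotapidata/blob/master/csv/game3_v2.csv

4 answers:

Answer 0 (score: 3)

A simpler AWK command: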

awk -F , '$3 != "\"\"" {print}' inputfile > outputfile

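This sets the field separator to a comma and prints every line whose third field is anything other than "".

It is not robust enough to handle CSV files that contain commas inside fields, and it expects an empty field to consist of exactly two double quotes.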
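Since the question asks for a .sh script and the real file needs column 28 rather than column 3, the command above can be wrapped in a small function that takes the column number as a parameter. This is only a sketch: the function name drop_blank_rows and the temporary sample file are made up for illustration, and the same limitation about commas inside fields still applies.

```shell
#!/bin/sh
# drop_blank_rows COL FILE: hypothetical helper, not part of the original answer.
# Prints every line of FILE whose COL-th comma-separated field is not exactly "".
drop_blank_rows() {
    awk -F , -v col="$1" '$col != "\"\""' "$2"
}

# Build a shortened version of the question's sample in a temporary file.
sample=$(mktemp)
printf '%s\n' '"X","Y","Z","N"' '"X","Y","","N"' '"X","Y","Z","N"' > "$sample"

# Column 3 for the sample; the real file would use 28.
result=$(drop_blank_rows 3 "$sample")
printf '%s\n' "$result"
rm -f "$sample"
```

For the real file the call would presumably be drop_blank_rows 28 game3.csv > game3_v2.csv, assuming no field in the first 28 columns contains a comma.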

Answer 1 (score: 1)

Here is an awk script that does the job.

awk -F '","' '!$3{next}1' input.csv

Output:

"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"

Explanation:

BEGIN {FS = "\",\""}  # input line field separator ","
!$3{next}             # if empty string in 3rd input field, skip
1                     # print current line
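
One caveat about the !$3 test: awk evaluates a field that looks like a number numerically, so a third field containing 0 is also treated as false and that row is skipped as well. If that is not wanted, comparing against the empty string is stricter. A small sketch, using made-up sample data that includes a 0:

```shell
# Sample with a literal 0 in the third column (made-up data for illustration).
input=$(printf '%s\n' '"X","Y","0","N"' '"X","Y","","N"' '"X","Y","Z","N"')

# Truthiness test from the answer: drops the empty row AND the row containing 0.
loose=$(printf '%s\n' "$input" | awk -F '","' '!$3{next}1')

# Explicit string comparison: drops only the row whose third field is empty.
strict=$(printf '%s\n' "$input" | awk -F '","' '$3 != ""')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

For the linked file this makes no difference, since the participant IDs shown in the update run from 1 to 10.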

Update:

Following up on the comments, using the test CSV file provided in the link.

Testing the $28 field

Output of awk -F '","' '!$28{next}1' input.txt | awk -F '","' '{print $28}':

participants__participantId
1
2
3
4
5
6
7
8
9
10

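Answer 2 (score: 1)

This can be done with a sed command: sed -r -n '/^([^,]*,){27}""/! p' yourfile

Use 27 for the full file and 2 for the minimal example: the number is how many fields come before the column you need to check.

The command prints (p together with the -n option) every line that does not match (/.../! negates the match) the following pattern:

  • ^ anchors the match at the start of the line
  • ([^,]*,){27} matches 27 comma-terminated fields, which may degenerate to 27 bare commas
  • "" matches a next field that starts with two double quotes, i.e. an empty quoted field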
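
The pattern above can be tried on the minimal example with the count set to 2. This is a sketch rather than a tested drop-in for the real file: the variable n is introduced here for illustration, and -E is used in place of -r because both GNU and BSD sed accept it.

```shell
# n = number of fields before the column being checked
# (2 for the minimal example, 27 for the full file).
n=2

# Keep only the lines where field n+1 does not start with two double quotes.
kept=$(printf '%s\n' '"X","Y","Z","N"' '"X","Y","","N"' '"X","Y","Z","N"' |
    sed -E -n '/^([^,]*,){'"$n"'}""/!p')

printf '%s\n' "$kept"
```

Like the comma-separated awk approach, this counts raw commas, so it assumes that none of the first n fields contains a comma inside its quotes.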

Answer 3 (score: 1)

awk -F, '$3 ~ /"Z"/{print $0}' file

"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"
"X","Y","Z","N"