I have a large tab-delimited CSV file, but some of the data is missing:
1 cat The cat ate the fish.
dog The dog played in the yard.
fish The fish went to the river.
2 eagle The eagle flew in the sky.
The eagle stopped in the mountains.
bear The bear ate the honey.
I need to fill every empty cell with whatever data appears in the preceding rows. The output should look like this:
1 cat The cat ate the fish.
1 dog The dog played in the yard.
1 fish The fish went to the river.
2 eagle The eagle flew in the sky.
2 eagle The eagle stopped in the mountains.
2 bear The bear ate the honey.
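Is there a way to fill the empty cells in a CSV with the contents of the nearest preceding cell in the same column that contains data?
Answer 0: (score: 1)
An awk solution that processes the whole file: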
awk -F\\t '
{
for (i=1;i<=NF;++i) if ($i != "") a[i] = $i;
if (na < NF) na = NF;
for (i=1;i<na;++i) printf "%s\t", a[i]
printf "%s\n", a[na];
}
' file.tsv
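Since the sample in the question lost its tabs when it was pasted, here is a minimal self-contained sketch of the same fill-down idea on input with explicit tabs. It assumes a missing cell appears as an empty field (two consecutive tabs, or a leading tab); `fill_down` and `last` are hypothetical names used only for illustration. Unlike the script above, it writes the remembered value back into the field and lets awk rebuild the line with a tab `OFS`.

```shell
# Hypothetical helper: fill each empty cell with the last non-empty
# value seen in the same column. Reads files given as arguments, or stdin.
fill_down() {
  awk 'BEGIN { FS = OFS = "\t" }     # split and re-join on tabs
  {
    for (i = 1; i <= NF; ++i) {
      if ($i != "") last[i] = $i     # remember the latest non-empty value in column i
      else $i = last[i]              # empty cell: reuse the remembered value
    }
    print
  }' "$@"
}

# Sample input with explicit tabs; empty cells are consecutive tabs.
printf '1\tcat\tThe cat ate the fish.\n\tdog\tThe dog played in the yard.\n2\teagle\tThe eagle flew in the sky.\n\t\tThe eagle stopped in the mountains.\n' | fill_down
```

This sketch only fills cells that are actually present in a row; rows with fewer fields than the widest row are not padded, which the `na` bookkeeping in the script above does handle.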
To fill only one specified column:
awk -F\\t -v OFS=\\t -v COL=2 '
$COL=="" {$COL = saved}
{saved = $COL; print}
' file.tsv
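One detail matters for the single-column version: assigning to a field makes awk rebuild the record using `OFS`, which defaults to a single space, so the output separator has to be set to a tab for the filled rows to stay tab-delimited. A small sketch, assuming that is the desired behavior; `fill_column` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: fill_column N [file...] fills empty cells in column N only.
fill_column() {
  col=$1; shift
  awk -v COL="$col" 'BEGIN { FS = OFS = "\t" }   # keep tabs when $0 is rebuilt
    $COL == "" { $COL = saved }                  # empty cell: reuse the previous value
    { saved = $COL; print }                      # remember this row value for the next row
  ' "$@"
}

# Second row has an empty column 2, which is filled from the first row.
printf '1\tcat\tThe cat ate the fish.\n2\t\tThe dog played in the yard.\n' | fill_column 2
```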
Answer 1: (score: 1)
This works for columns 1 and 2:
awk -F '\t' '$1 != ""{p1=$1} NF==3{p2=$2} p1 && $1 == ""{$1=p1} p2 && NF==2{$0=$1 OFS p2 OFS $2} 1' OFS='\t' file
1 cat The cat ate the fish.
1 dog The dog played in the yard.
1 fish The fish went to the river.
2 eagle The eagle flew in the sky.
2 eagle The eagle stopped in the mountains.
2 bear The bear ate the honey.
Answer 2: (score: 1)
This works for any missing column:
awk -F\\t '
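{ for (i=1;i<=NF;++i)
    { if ($i != "") a[i] = $i;                  # remember the last non-empty value in column i
      printf "%s%s", a[i], (i<NF ? "\t" : "")   # tab between fields, none after the last one
    }
  printf "\n"
}' file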