Ubuntu 16.04
Bash 4.3.3
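I need a way to add a space after each comma in the sixth column if one isn't already there. I had to comment out the line that did this in my script because it was putting a space after every comma in the CSV file, not just those in the sixth column.
Wrong: "This is 6th column,Hey guys,Red White & Blue,I know it,Right On"
Perfect: "This is 6th column, Hey guys, Red White & Blue, I know it, Right On"
I can almost picture awk printing out the sixth column and then letting sed do the rest: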
awk '{ print $6 }' "$feed " | sed -i 's/|/,/g; s/,/, /g; s/,\s\+/, /g'
This is what I have so far:
for feed in *; do
sed -r -i 's/([^,]{0,10})[^,]*/\1/5' "$feed"
sed -i '
s/<b>//g; s/*//g;
s/\([0-9]\)""/\1inch/g;
# s/|/,/g; s/,/, /g; s/,\s\+/, /g;
s/"one","drive"/"onetext","drive"/;
s/"comments"/"description"/;
s/"features"/"optiontext"/;
' "$feed"
done
s/|/,/g; s/,/, /g; s/,\s\+/, /g;
works, but it is global rather than limited to the one column.
Answer 0 (score: 2)
It sounds like this is all you need (using GNU awk for FPAT):
awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","} {gsub(/, ?/,", ",$6)} 1'
For example:
$ cat file
1,2,3,4,5,"This is 6th column,Hey guys,Red White & Blue,I know it,Right On",7,8
$ awk 'BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","} {gsub(/, ?/,", ",$6)} 1' file
1,2,3,4,5,"This is 6th column, Hey guys, Red White & Blue, I know it, Right On",7,8
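Actually, it looks like your whole shell script (including the multiple calls to GNU sed) could be done more efficiently with a single call to GNU awk and no surrounding shell loop, e.g. (untested):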
awk -i inplace '
BEGIN{FPAT="[^,]*|\"[^\"]+\""; OFS=","}
{
$0 = gensub(/([^,]{0,10})[^,]*/,"\\1",5)
$0 = gensub(/([0-9])""/,"\\1inch","g")
sub(/"one","drive"/,"\"onetext\",\"drive\"")
sub(/"comments"/,"\"description\"")
sub(/"features"/,"\"optiontext\"")
gsub(/, ?/,", ",$6)
}
' *