I currently have a problem with the following CSV data.
COLUMN1,COLUMN2,COLUMN3,COLUMN4
apple1,apple2,apple3,apple4
banana1,banana2,banana3,
caimito1,"caimito21
caimito22","caimito31
caimito32",caimito4
It looks like this:
╔══════════╦═══════════╦═══════════╦══════════╗
║ COLUMN1  ║ COLUMN2   ║ COLUMN3   ║ COLUMN4  ║
╠══════════╬═══════════╬═══════════╬══════════╣
║ apple1   ║ apple2    ║ apple3    ║ apple4   ║
║ banana1  ║ banana2   ║ banana3   ║          ║
║ caimito1 ║ caimito21 ║ caimito31 ║ caimito4 ║
║          ║ caimito22 ║ caimito32 ║          ║
╚══════════╩═══════════╩═══════════╩══════════╝
My plan is to add a COLUMN5 whose value is "FRUIT" on every row.
Command used:
sed "1 s/$/,COLUMN5/g" FILE.csv | sed "2,$ s/$/,FRUIT/g" > OUTPUT.csv
Output:
╔══════════╦════════════════╦════════════════╦══════════╦═════════╗
║ COLUMN1  ║ COLUMN2        ║ COLUMN3        ║ COLUMN4  ║ COLUMN5 ║
╠══════════╬════════════════╬════════════════╬══════════╬═════════╣
║ apple1   ║ apple2         ║ apple3         ║ apple4   ║ FRUIT   ║
║ banana1  ║ banana2        ║ banana3        ║          ║ FRUIT   ║
║ caimito1 ║ caimito21FRUIT ║ caimito31FRUIT ║ caimito4 ║ FRUIT   ║
║          ║ caimito22      ║ caimito32      ║          ║         ║
╚══════════╩════════════════╩════════════════╩══════════╩═════════╝
Is there a way to add "FRUIT" without affecting the "caimito" row?
I also tried the following command, which adds "," before "$", but it did not work.
sed "1 s/$/,COLUMN5/g" FILE.csv | sed "2,$ s/,$/,FRUIT/g" > OUTPUT.csv
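Answer 0 (score: 2)
Sed is probably not the right tool for processing CSV files, because the rules are more complex than they look (it may be possible, but such scripts are usually error-prone). You can, however, use csvtool to handle this: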
file="FILE.csv"
nr=$(csvtool height "$file")
ot=$(perl -e "print \"COLUMN5\\n\";for\$i(2..$nr){print \"FRUIT\\n\";}")
echo "$ot" | csvtool paste "$file" -
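The script works as follows: csvtool height gives the number of rows n in the file, and the Perl one-liner prints COLUMN5 followed by FRUIT n-1 times to generate the additional column. csvtool paste then appends that column to the original file.
Answer 1 (score: 2)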
Well, here goes. This is a way to do it in sed:
sed ':a $!{ N; ba }; s/"[^"]*"/{&}/g; :b s/\({"[^"]*\)\n\([^"]*"}\)/\1~"~\2/g; tb; s/\n\|$/,FRUIT&/g; s/,FRUIT\(\n\|$\)/,COLUMN5\1/; :c s/\({"[^"]*\)~"~/\1\n/g; tc; s/{"\|"}/"/g' filename
This is going to be a bit of a ride. It works as follows:
:a $!{ N; ba } # assemble the whole file in the
# pattern space
s/"[^"]*"/{&}/g # encase all "-enclosed fields in
# {"..."} to make matching the beginning
# and end separately possible.
:b # jump mark for looping
s/\({"[^"]*\)\n\([^"]*"}\)/\1~"~\2/g # replace the first newline in all
# {"..."} fields with ~"~
tb # loop until all were replaced
s/\n\|$/,FRUIT&/g # Put FRUIT at the end of all lines
s/,FRUIT\(\n\|$\)/,COLUMN5\1/ # Replace the first ,FRUIT with ,COLUMN5
# The \(\n\|$\) bit is so that this
# works with empty files (that only
# have a header line)
:c # Jump mark for looping
s/\({"[^"]*\)~"~/\1\n/g # replace the first ~"~ in all {"..."}
# fields with a newline
tc # loop until all were replaced
s/{"\|"}/"/g # replace all {", "} markers with "
# again.
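To see what the placeholder step produces, the first half of the script can be run on its own. This is only a sketch: the function name mark_embedded_newlines is made up for illustration, and the script is split into -e fragments purely so that each label ends at a fragment boundary. It stops right after the tb loop, before any FRUIT is added.

```shell
#!/bin/sh
# Hypothetical helper (not part of the answer): runs only the first three
# steps of the walkthrough and prints the intermediate pattern space.
# Reads CSV on stdin.
mark_embedded_newlines() {
    sed -e ':a' -e '$!{N;ba' -e '}' \
        -e 's/"[^"]*"/{&}/g' \
        -e ':b' \
        -e 's/\({"[^"]*\)\n\([^"]*"}\)/\1~"~\2/g' \
        -e 'tb'
}

# Feed it the three physical lines that make up the "caimito" record.
printf '%s\n' 'caimito1,"caimito21' 'caimito22","caimito31' 'caimito32",caimito4' \
    | mark_embedded_newlines
# prints: caimito1,{"caimito21~"~caimito22"},{"caimito31~"~caimito32"},caimito4
```

Once the embedded newlines are hidden behind ~"~, the only newlines left are the ones that end a record, which is why the later s/\n\|$/,FRUIT&/g step can treat every remaining newline as a record boundary.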
Answer 2 (score: 1)
sed '1 {
s/$/,COLUMN5/
b
}
:load
/^\([^"]*"[^"]*"\)*[^"]*"[^"]*$/ {
N
b load
}
s/$/,,,,/;s/^\(\([^,]*,\)\{4\}\).*/\1FRUIT/' YourFile
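On the first line, append ,COLUMN5 and end the cycle with b instead of running the rest of the script.
While the pattern space holds an odd number of " characters (a quoted field is still open), load a new line with N and retry this test.
Pad the record with , characters, keep the first four fields, and add FRUIT.
This is the POSIX version; use --posix on GNU sed.
For "valid" CSV (one record per line, all fields separated by ,), just remove the load loop section.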
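The same quote-counting idea can also be sketched in awk, which avoids hard-coding the number of fields. This is not from any of the answers above: the function name add_column and its arguments are illustrative, and the sketch assumes that double quotes appear only as field delimiters or as doubled escapes, so that an odd running count always means a quoted field is still open.

```shell
#!/bin/sh
# Hypothetical helper: append one column to a CSV whose quoted fields
# may contain newlines.
#   $1 = input file, $2 = header name, $3 = value for every data record
add_column() {
    awk -v header="$2" -v value="$3" '
        {
            # gsub returns the number of replacements, so replacing each
            # quote with itself on a copy counts the quotes on this line.
            line = $0
            quotes += gsub(/"/, "&", line)
            if (quotes % 2 == 1) {
                # Odd running total: a quoted field is still open, so this
                # physical line does not end the record. Print it unchanged.
                print
                next
            }
            # Even total: the record ends on this line.
            records++
            print $0 "," (records == 1 ? header : value)
            quotes = 0
        }
    ' "$1"
}
```

With the sample data, add_column FILE.csv COLUMN5 FRUIT > OUTPUT.csv appends ,COLUMN5 to the header and ,FRUIT to the last physical line of each record, leaving the lines inside the "caimito" fields untouched. It does not validate the field count, so malformed input passes through unchanged apart from the extra column.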