I'm trying to parse a .csv file and I'm having some trouble with IFS. The file contains lines like this:
"Hello","World","this","is, a boring","line"
The columns are separated by commas, so I tried to split the line with this code:
IFS=, read -r -a tempArr <<< "$line"
But I get this output:
"Hello"
"World"
"this"
"is
a boring"
"line"
I understand why this happens, so I tried a few other IFS settings, but none of them gave the expected output:
IFS=\",\"
IFS=\",
IFS=',\"'
IFS=,\"
Every time, the fourth element is split into two parts. How can I use IFS to split the string into 5 parts like this?
"Hello"
"World"
"this"
"is, a boring"
"line"
Answer 0 (score: 0)
Try this:
sed 's/","/"\n"/g' <<<"${line}"
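sed has a search-and-replace command, s, which searches for a pattern using a regular expression. Here the pattern is "," (quote, comma, quote), and the comma in it is replaced with a newline character, so each element ends up on its own line.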
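If the fields are needed in a bash array rather than just printed, the sed output can be read back with mapfile. The following is only a minimal sketch, assuming bash 4+ (for mapfile) and GNU sed (which treats \n in the replacement as a newline); the array name fields is illustrative. It also assumes that every field is quoted and that no field itself contains the sequence ",". The reason sed can do this where IFS cannot is that IFS is a set of single-character delimiters, so it has no way to express a three-character separator.

```shell
#!/usr/bin/env bash
line='"Hello","World","this","is, a boring","line"'

# Split on the three-character sequence "," so that commas inside a
# field survive. Assumes GNU sed (\n in the replacement is a newline).
mapfile -t fields < <(sed 's/","/"\n"/g' <<<"$line")

# Optionally strip the surrounding double quotes from each field.
fields=("${fields[@]#\"}")
fields=("${fields[@]%\"}")

printf '%s\n' "${#fields[@]}"   # number of fields
printf '%s\n' "${fields[@]}"    # one field per line
```

With the sample line this should report 5 fields, with "is, a boring" kept in one piece.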
Answer 1 (score: 0)
You may want to use gawk with FPAT, which lets you define what a valid field looks like:
Input:
"Hello","World","this, is"
Script:
gawk -n 'BEGIN{FS=",";OFS="\n";FPAT="([^,]+)|(\"[^\"]+\")"}{$1=$1;print $0}' somefile.csv
Output:
"Hello"
"World"
"this, is"
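To tie this back to the line from the question, the same FPAT pattern can be applied to a shell variable and the result collected in a bash array. This is a hedged sketch rather than a definitive recipe: it assumes gawk is installed (FPAT is a gawk extension and is not available in other awk implementations) and bash 4+ for mapfile; the array name fields is illustrative. Note that the quotes stay part of each field.

```shell
#!/usr/bin/env bash
line='"Hello","World","this","is, a boring","line"'

if command -v gawk >/dev/null 2>&1; then
    # FPAT describes what a field looks like: either a run of non-comma
    # characters, or a double-quoted string that may contain commas.
    mapfile -t fields < <(
        gawk 'BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" }
              { for (i = 1; i <= NF; i++) print $i }' <<<"$line"
    )
    printf '%s\n' "${fields[@]}"
else
    echo 'gawk not found; FPAT is a gawk extension' >&2
fi
```

Because the pattern uses + rather than *, empty fields (two commas in a row) are not matched by it; that case would need a different pattern.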
Answer 2 (score: 0)
bashlib provides a csvline function. Assuming you have installed it somewhere in your PATH:
line='"Hello","World","this","is, a boring","line"'
source bashlib
csvline <<<"$line"
printf '%s\n' "${CSVLINE[@]}"
...the above outputs:
Hello
World
this
is, a boring
line
To quote the implementation (copyright lhunath; the following text is taken from this specific revision of the relevant git repo):
# _______________________________________________________________________
# |__ csvline ____________________________________________________________|
#
# csvline [-d delimiter] [-D line-delimiter]
#
# Parse a CSV record from standard input, storing the fields in the CSVLINE array.
#
# By default, a single line of input is read and parsed into comma-delimited fields.
# Fields can optionally contain double-quoted data, including field delimiters.
#
# A different field delimiter can be specified using -d. You can use -D
# to change the definition of a "record" (eg. to support NULL-delimited records).
#
csvline() {
    CSVLINE=()
    local line field quoted=0 delimiter=, lineDelimiter=$'\n' c
    local OPTIND=1 arg
    while getopts :d: arg; do
        case $arg in
            d) delimiter=$OPTARG ;;
        esac
    done
    IFS= read -d "$lineDelimiter" -r line || return
    while IFS= read -rn1 c; do
        case $c in
            \")
                (( quoted = !quoted ))
                continue ;;
            $delimiter)
                if (( ! quoted )); then
                    CSVLINE+=( "$field" ) field=
                    continue
                fi ;;
        esac
        field+=$c
    done <<< "$line"
    [[ $field ]] && CSVLINE+=( "$field" ) ||:
} # _____________________________________________________________________
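One thing to be aware of when reading the loop above: it toggles quoted on every double quote, so a doubled quote inside a quoted field (the "" escape used by many CSV writers) is dropped instead of being kept as a literal quote, and a trailing empty field is discarded by the final [[ $field ]] test. If that matters for your data, here is a minimal sketch of the same character-by-character idea that handles both cases. The function parse_csv_line and the array CSV_FIELDS are hypothetical names made up for this example; they are not part of bashlib. It takes the line as its first argument and only supports a comma delimiter.

```shell
#!/usr/bin/env bash
# parse_csv_line is a hypothetical helper, NOT part of bashlib.
# It fills the global array CSV_FIELDS from the string given as $1.
parse_csv_line() {
    local line=$1 field= c next
    local -i i quoted=0
    CSV_FIELDS=()
    for (( i = 0; i < ${#line}; i++ )); do
        c=${line:i:1}
        next=${line:i+1:1}
        if (( quoted )); then
            if [[ $c == '"' && $next == '"' ]]; then
                field+='"'      # doubled quote inside quotes: literal "
                (( i += 1 ))    # skip the second quote of the pair
            elif [[ $c == '"' ]]; then
                quoted=0        # closing quote
            else
                field+=$c       # anything else, including commas
            fi
        elif [[ $c == '"' ]]; then
            quoted=1            # opening quote
        elif [[ $c == ',' ]]; then
            CSV_FIELDS+=( "$field" )
            field=
        else
            field+=$c
        fi
    done
    CSV_FIELDS+=( "$field" )    # last field, kept even when it is empty
}

parse_csv_line '"Hello","World","this","is, a boring","line"'
printf '%s\n' "${CSV_FIELDS[@]}"
```

Like csvline, this handles a single line only; quoted fields that span several lines would need the reading loop to continue across newlines.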