I have a CSV file with the following columns:
Year,113 Cause Name,Cause Name,State,Deaths,Age-adjusted Death Rate
Here are some sample rows from the file:
2016,Malignant neoplasms (C00-C97),Cancer,Missouri,12696,167
2015,Malignant neoplasms (C00-C97),Cancer,Missouri,12965,173.4
2014,Malignant neoplasms (C00-C97),Cancer,Missouri,13067,177.7
2013,Malignant neoplasms (C00-C97),Cancer,Missouri,12955,179.4
2012,Malignant neoplasms (C00-C97),Cancer,Missouri,12919,182.3
I am trying to build a CSV parser in Bash that takes arguments from the user and displays the rows matching those arguments. This is my code so far:
#!/bin/sh
# set up the arguments
for i in "$@"
do
case $i in
-y=*|--year=*)
YEAR="${i#*=}"
shift # past argument=value
;;
-c=*|--cause=*)
CAUSE="${i#*=}"
shift # past argument=value
;;
-s=*|--state=*)
STATE="${i#*=}"
shift # past argument=value
;;
-d=*|--deaths=*)
DEATHS="${i#*=}"
shift # past argument=value
;;
-ad=*|--age_adjusted=*)
AGE_ADJUSTED="${i#*=}"
shift # past argument=value
;;
*)
# unknown option
;;
esac
done
# print out the values of the passed arguments
echo $YEAR
echo $CAUSE
echo $STATE
echo $DEATHS
echo $AGE_ADJUSTED
# read the file, segregating value in each column
while IFS='' read -r year cause1 cause2 state deaths age_adj; do
if [ -z "$DEATHS" ]; then # user did not pass a "number of deaths" argument
if [ -z "$AGE_ADJUSTED" ]; then # user also did not pass an age "adjusted death rate" argument
echo "$year $cause1 $cause2 $state $deaths $age_adj" | grep "$YEAR" | grep "$CAUSE" | grep "$STATE"
else # user passed an age "adjusted death rate" argument, check against that value
if [[ $age_adj -ge $AGE_ADJUSTED ]]; then
echo "$year $cause1 $cause2 $state $deaths $age_adj" | grep "$YEAR" | grep "$CAUSE" | grep "$STATE"
fi
fi
else # user passed a "number of deaths" argument
if [ -z "$AGE_ADJUSTED" ]; then # user did not pass an "age adjusted death rate" argument
echo "$year $cause1 $cause2 $state $deaths $age_adj" | grep "$YEAR" | grep "$CAUSE" | grep "$STATE"
else # user passed both "number of deaths" and "age adjusted death rate" arguments
if [[ $deaths -ge $DEATHS && $age_adj -ge $AGE_ADJUSTED ]]; then
echo "$year $cause1 $cause2 $state $deaths $age_adj" | grep "$YEAR" | grep "$CAUSE" | grep "$STATE"
fi
fi
fi
done < "$1"
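The problem occurs when I try to compare the deaths column ($deaths) against the passed argument value ($DEATHS), and the age-adjusted death rate column ($age_adj) against its passed argument value ($AGE_ADJUSTED). The comparison is never triggered; instead, the script prints every row that matches the other arguments (if any were passed).
Any help is appreciated. Thanks in advance.
I pass the arguments in the following format: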
./main.sh -y=2015 -d=50000 <additional arguments if I want to> ./file.csv
Answer 0 (score: 1)
Use awk.
YEAR="2015"
CAUSE=""
STATE=""
DEATHS=""
AGE_ADJUSTED=""
awk \
-vFS=, -vOFS=, \
-vYEAR="$YEAR" \
-vCAUSE="$CAUSE" \
-vSTATE="$STATE" \
-vDEATHS="$DEATHS" \
-vAGE_ADJUSTED="$AGE_ADJUSTED" \
'{
if (length(YEAR) != 0) {
if ($1 != YEAR) {
next;
}
}
# columns: $1 Year, $2 113 Cause Name, $3 Cause Name, $4 State, $5 Deaths, $6 Age-adjusted Death Rate
if (length(CAUSE) != 0) {
if ($3 != CAUSE) {
next;
}
}
if (length(STATE) != 0) {
if ($4 != STATE) {
next;
}
}
if (length(DEATHS) != 0) {
if ($5 != DEATHS) {
next;
}
}
if (length(AGE_ADJUSTED) != 0) {
if ($6 != AGE_ADJUSTED) {
next;
}
}
print
}' file.csv
A live version is available at tutorialspoint.
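The question compares with -ge, that is, it treats the deaths and rate arguments as minimums, and the rate column holds decimals such as 173.4 that shell integer tests cannot handle. A minimal sketch of that threshold variant follows. It assumes the column order from the header in the question and that the first line of the file is that header; filter_csv and its parameter names are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Hypothetical helper: print rows whose year matches exactly and whose
# deaths and age-adjusted rate are at least the given minimums.
# An empty argument means "do not filter on this column".
filter_csv() {
    local file=$1 year=$2 min_deaths=$3 min_rate=$4
    awk -F, \
        -v year="$year" \
        -v min_deaths="$min_deaths" \
        -v min_rate="$min_rate" '
        NR == 1 { next }                                   # assumption: line 1 is the header
        year != "" && $1 != year { next }                  # exact match on Year
        min_deaths != "" && $5 + 0 < min_deaths + 0 { next }  # "+ 0" forces a numeric comparison
        min_rate != "" && $6 + 0 < min_rate + 0 { next }      # works for decimals like 173.4
        { print }
    ' "$file"
}

# Example call (the file name is a placeholder):
#   filter_csv file.csv 2015 12000 170
```

Because awk compares floating-point values natively, no separate tool such as bc is needed for the rate column.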
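For every filter variable that is not empty, the script compares the corresponding field and skips to the next line if it does not match. If all filters match or are empty, it prints the current line.
-vVAR=VAL sets an awk variable. -vFS=, and -vOFS=, set awk's input and output field separators, respectively.
-y=*|--year=*): for portability and readability, I recommend following the POSIX utility conventions and/or the GNU argument syntax. Just use GNU getopt (which I prefer) or the Bash getopts builtin (widely available, but without support for long options).
for i in "$@"; do .... shift; ...: shift does not affect the iteration. "$@" is expanded once before the loop starts, so shifting inside the body cannot change which arguments the loop visits; it only drops positional parameters from the front, which is not what the "past argument=value" comments suggest. I prefer while (($#)); do .... shift; done, or simply for i; do ... done.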
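while IFS='' read -r is the usual idiom for reading whole lines without splitting them. The IFS variable controls how the read command splits a line into variables: read consumes input up to the delimiter given by -d (a newline by default), then splits what it read on any of the characters found in IFS. With an empty IFS nothing is split, so the whole line ends up in the first variable (year) and $deaths and $age_adj never hold the column values. You meant while IFS=, read -r ...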
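The two shell points above, consuming arguments with while (($#)) ... shift and splitting columns with IFS=, can be put together in a short sketch. This is only an illustration under stated assumptions: parse_and_filter is a hypothetical function, only the year and state options are handled, any other argument is taken to be the CSV path, and fields are assumed to contain no quoted commas.

```shell
#!/bin/bash
# Hypothetical sketch: parse -y=/-s= style options, then filter a CSV file.
parse_and_filter() {
    local year="" state="" file=""
    local y cause113 cause st deaths rate

    # Handle one argument per iteration; shift removes the one just handled.
    while (($#)); do
        case $1 in
            -y=*|--year=*)  year=${1#*=} ;;
            -s=*|--state=*) state=${1#*=} ;;
            *)              file=$1 ;;   # assumption: anything else is the CSV path
        esac
        shift
    done

    # IFS=, makes read split each line on commas, one variable per column.
    while IFS=, read -r y cause113 cause st deaths rate; do
        [[ -n $year  && $y  != "$year"  ]] && continue
        [[ -n $state && $st != "$state" ]] && continue
        printf '%s,%s,%s,%s,%s,%s\n' "$y" "$cause113" "$cause" "$st" "$deaths" "$rate"
    done < "$file"
}

# Example call (the file name is a placeholder):
#   parse_and_filter -y=2015 --state=Missouri ./file.csv
```

Numeric thresholds on the rate column are left out on purpose: [[ ... -ge ... ]] only accepts integers, so decimal values like 173.4 are better compared in awk.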