Question

Here is a short summary of what I'm trying to do:

Suppose I have a CSV stored in a shell variable, $variable. It looks like this:

account,index,quantity
100,AAPL,10
105,NFLX,25
110,TSLA,50
120,TWTR,45

Now I query a PostgreSQL database from the shell like this:

accounts=$(psql -d mydb -h mydb -c "SELECT account_num FROM accounts WHERE is_relevant")

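Now I have a list of accounts stored in a seemingly unstructured variable. Put simply, I want to filter the original CSV down to the rows whose account appears in the result of this new query.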

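1) When I call echo on the variable that stores the query result, I get one long line of output: just all the relevant accounts concatenated together.

2) When I call head on the variable, every account number raises an error: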

head: cannot open '100' for reading: No such file or directory

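I see this and think, "the shell isn't treating these entries as strings to print, but as commands to run", and I'm not sure how to fix that. Trying to use sed to insert quotes or commas to separate the strings has raised similar errors about missing files or nonexistent commands.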

Although I suspect grep is ultimately the right tool for this, I wanted to pose the question in an open-ended way. How would you do it?

Edit: To clarify, given the original accounts table, if the PSQL query returns:

100
105
120

I want to filter the original table by those values to get:

account,index,quantity
100,AAPL,10
105,NFLX,25
120,TWTR,45

(The row for account 110 has been filtered out.)

Answer 1

After the query, you can try the following:

# Create a filtered_variable to store the filtered results
# and add the first line from the original variable (the CSV header)
filtered_variable=$(echo "$variable" | head -n 1)

# For each account in the accounts obtained in the query
for account in $accounts
do
    # Create a filtered_line variable to store the line where the account
    # appears in the CSV, or an empty line if the account is not in the CSV
    filtered_line=$(echo "$variable" | grep "^$account,")

    # If $filtered_line is not empty (the account is in the CSV) ...
    if [ -n "$filtered_line" ]
    then
        # ... add the line to the filtered_variable (filtered CSV)
        filtered_variable+=$'\n'"$filtered_line"
    fi
done

The filtered table is now in the variable filtered_variable. If you want it in the original variable, just run variable="$filtered_variable" after the loop.
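As an aside on the two symptoms described in the question, here is a minimal sketch. It assumes the query output is just account numbers, one per line (psql's -A and -t options produce that shape), and it simulates that output with a hard-coded value instead of calling psql. An unquoted expansion is split on whitespace by the shell, which is why echo shows one long line; head treats its arguments as file names, which is why it complains about '100'. Quoting the variable and piping it into the tool avoids both.

```shell
#!/usr/bin/env bash
# Simulated query result (assumption: one account number per line).
accounts=$'100\n105\n120'

# Unquoted: word splitting collapses the newlines into single spaces.
one_line=$(echo $accounts)

# Quoted and piped: head reads the text on stdin instead of
# interpreting each account number as a file name.
first_account=$(printf '%s\n' "$accounts" | head -n 1)

echo "$one_line"        # 100 105 120
echo "$first_account"   # 100
```

The loop above relies on the unquoted form on purpose: for account in $accounts iterates over one account per word.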

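Alternative solution

You can also use egrep (equivalent to grep -E) with a regular expression that includes all the accounts returned by the query. For example,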

echo "$variable" | egrep -e "^100,|^110,"

returns

100,AAPL,10
110,TSLA,50

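This regular expression matches lines that begin with 100, or 110,. I included the trailing comma to avoid false positives: without it, ^100 would also match an account such as 1000.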
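To illustrate the point about the trailing comma, here is a small sketch using grep -E, which is equivalent to egrep. The row for account 1000 is made up for the demonstration and is not part of the original data.

```shell
#!/usr/bin/env bash
# Sample CSV; the 1000,MSFT,5 row is hypothetical, added to provoke a false match.
csv=$'account,index,quantity\n100,AAPL,10\n1000,MSFT,5'

# Without the comma the pattern is only a prefix match, so 1000 matches too.
without_comma=$(printf '%s\n' "$csv" | grep -E '^100')

# With the comma the whole first field must equal 100.
with_comma=$(printf '%s\n' "$csv" | grep -E '^100,')

printf 'without comma:\n%s\n' "$without_comma"
printf 'with comma:\n%s\n' "$with_comma"
```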

So all you need to do is build that regular expression from all the accounts returned by the query. This is easy with sed:

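# $accounts is left unquoted on purpose so that newline-separated values collapse onto one space-separated line
filter=$(echo $accounts | sed -e 's/ /,|^/g' -e 's/^/^/' -e 's/$/,/')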

Now you have the filter as a regular expression in the variable filter, and all that is left is to run egrep:

filtered_variable=$(echo "$variable" | egrep "$filter")

Again, you will have the filtered accounts in the auxiliary variable filtered_variable (don't forget to add the CSV header line first).
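Putting the pieces of the alternative together, one possible end-to-end sketch is shown below. It assumes the account list is whitespace- or newline-separated, hard-codes the sample data from the question in place of the real psql call, and uses grep -E, which is equivalent to egrep. The names header and rows are introduced here for illustration only.

```shell
#!/usr/bin/env bash
# Sample data from the question; accounts stands in for the psql output.
variable=$'account,index,quantity\n100,AAPL,10\n105,NFLX,25\n110,TSLA,50\n120,TWTR,45'
accounts=$'100\n105\n120'

# Build the pattern ^100,|^105,|^120, from the account list.
# $accounts is unquoted on purpose so the values end up on one line.
filter=$(echo $accounts | sed -e 's/ /,|^/g' -e 's/^/^/' -e 's/$/,/')

# Keep the header line, then append only the matching data rows.
header=$(printf '%s\n' "$variable" | head -n 1)
rows=$(printf '%s\n' "$variable" | tail -n +2 | grep -E "$filter")
filtered_variable="$header"$'\n'"$rows"

printf '%s\n' "$filtered_variable"
```

Filtering only the lines after the header (tail -n +2) keeps the header out of the match entirely, so it cannot be dropped or duplicated by the pattern.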
