I have a file named brands_url whose contents look like this:
"relative/url","brand"
"relative/url1","brand"
I want to use the brand value in the second column to look up that brand's domain with this command line:
curl url.json | jq -r '.[] | select(.slug=="brand") | .domain.production' # this would produce >> www.domain.com
I then want to prepend that result to the first column, so that the final result looks like this:
"www.domain.com/relative/url"
"www.domain.com/relative/url1"
The problem with my current script is that it is slow:
BRAND_JSON=$(curl url.json)
while IFS= read -r line
do
BRAND=$(echo $line | awk -F',' '{print $2}' | sed "s/\"//g")
URI=$(echo $line | awk -F',' '{print $1}' | sed "s/\"//g")
echo $BRAND
DOMAIN=$(echo $BRAND_JSON | jq -r ".[] | select(.slug==\"$BRAND\") | .domain.production")
echo $DOMAIN
echo $URI
echo "https://$DOMAIN/$URI" >> urls
done < "brand_urls"
Here $BRAND_JSON holds the JSON returned by curl url.json.
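Answer 0 (score: 2)
Simply using parameter expansion with substring removal eliminates roughly 80% of the subshell overhead. You can replace the 4 calls to awk and sed (and the subshell that each '|' requires) by letting bash handle parsing the lines, e.g.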
while IFS= read -r line
do
BRAND=${line%\"}
BRAND=${BRAND##*\"}
URI=${line#\"}
URI=${URI%%\"*}
echo $BRAND
DOMAIN=$(echo $BRAND_JSON | jq -r ".[] | select(.slug==\"$BRAND\") | \
.domain.production")
echo $DOMAIN
echo $URI
echo "https://$DOMAIN/$URI" >> urls
done < "brand_urls"
Give it a try and let me know. Most of the remaining time is spent in curl's external information retrieval, which bash can do nothing about.
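To make the four expansions above easier to follow, here is a minimal standalone sketch that applies them to a single sample row. The sample line is an assumption taken from the file format shown in the question; the comments show the intermediate value after each step.

```shell
# Sample row in the format shown in the question (assumed input).
line='"relative/url","brand"'

# Second field: drop the trailing quote, then drop everything
# up to and including the last remaining quote.
BRAND=${line%\"}        # "relative/url","brand
BRAND=${BRAND##*\"}     # brand

# First field: drop the leading quote, then drop everything
# from the next quote to the end of the string.
URI=${line#\"}          # relative/url","brand"
URI=${URI%%\"*}         # relative/url

printf 'BRAND=%s URI=%s\n' "$BRAND" "$URI"
```

None of these steps forks a process, which is where the saving over the awk and sed pipelines comes from. Note that this only works while neither field contains an embedded double quote.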
Answer 1 (score: 1)
A short combination of the jq + awk tools:
Sample url.json (should be valid JSON):
[
{
"slug": "brand",
"domain": {
"production": "www.domain.com"
}
},
{
"slug": "brand1",
"domain": {
"production": "www.domain1.com"
}
}
]
Sample brands_urls.csv contents:
"relative/url","brand"
"relative/url1","brand1"
The job:
awk -F, 'NR==FNR{ gsub(/"/,""); a[$2]=$1;next }
$2 in a{ printf "https://%s/%s\n",$1,a[$2] }' brands_urls.csv \
FS='\t' <(jq -r '.[] | [.domain.production,.slug] | @tsv' url.json)
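Output (the backslash before \domain was added deliberately, because SO does not allow www.domain.com to be pasted verbatim in code; the actual output has no backslash):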
https://www.\domain.com/relative/url
https://www.\domain1.com/relative/url1
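One thing to watch with the command above: it stores a[$2]=$1 keyed by brand, so if several rows of the CSV share a brand, only the last URL for that brand survives. A possible variant, sketched below, reads the slug-to-domain table first and then streams the CSV, so every row produces a line. The temporary files and the extra relative/url2 row are hypothetical; domains.tsv merely stands in for what the jq command with @tsv would emit, so the sketch runs without jq or network access.

```shell
# Hypothetical stand-ins for the real inputs.
tmpdir=$(mktemp -d)
# What: jq -r '.[] | [.domain.production,.slug] | @tsv' url.json  would print.
printf 'www.domain.com\tbrand\nwww.domain1.com\tbrand1\n' > "$tmpdir/domains.tsv"
# CSV in which two rows share the slug "brand".
printf '"relative/url","brand"\n"relative/url1","brand1"\n"relative/url2","brand"\n' \
    > "$tmpdir/brands_urls.csv"

result=$(awk '
    # First file (tab-separated): remember slug -> domain.
    NR == FNR { domain[$2] = $1; next }
    # Second file (comma-separated): strip quotes so the fields are re-split cleanly.
    { gsub(/"/, "") }
    # Emit one URL per CSV row whose brand has a known domain.
    $2 in domain { printf "https://%s/%s\n", domain[$2], $1 }
' FS='\t' "$tmpdir/domains.tsv" FS=',' "$tmpdir/brands_urls.csv")

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

With real data, the first file argument would be replaced by the jq output (for example via process substitution, as in the answer). The output order then follows the CSV rather than url.json.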