Extracting information from a string in a shell script

Time: 2015-04-20 13:22:04

Tags: bash shell awk sed

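I'm having trouble extracting the information I need from a string in my shell script. I have read up on awk and sed and tried to come up with the right command, but I can't figure it out. I'm hoping you can help.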

Suppose I have a string like this:

["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]

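What I want to do is pull each of these properties out into its own array of strings. For example, I would like:

an array of ids: 2817262 2262 28182
an array of names: somename somename somename
an array of hasproperty values: false false true

Can anyone help me come up with the command I need for this? Please also keep in mind that the string may be much longer than this one, so it would be helpful if the solution is not specific to 3 records. Thanks very much in advance.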

7 answers:

Answer 0: (score: 2)

You can use grep.

grep -oP '"ids":\K\d+' file

Example:

$ echo '["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]' | grep -oP '"ids":\K\d+'
2817262
2262
28182
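
Since the question asks for arrays, the same grep pattern can be repeated per key and captured with mapfile. This is only a minimal sketch, assuming bash 4 or later and a GNU grep built with -P support; the variable names str, ids, names and props are made up for illustration.

```shell
#!/usr/bin/env bash
# Sample input copied from the question.
str='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'

# grep -o prints one match per line; mapfile -t stores each line as one array element.
mapfile -t ids   < <(grep -oP '"ids":\K\d+' <<<"$str")
mapfile -t names < <(grep -oP '"name":"\K[^"]*' <<<"$str")
mapfile -t props < <(grep -oP '"hasproperty":\K(true|false)' <<<"$str")

echo "ids: ${ids[*]}"
echo "names: ${names[*]}"
echo "hasproperty: ${props[*]}"
```

Nothing here depends on the number of records, so a longer string should work the same way.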

Answer 1: (score: 1)

Since it is tagged awk:

awk '{while(x=match($0,/"ids":([^,]+)/,a)){print a[1];$0=substr($0,x+RLENGTH)}}' file

This simply keeps matching the next id, then truncates the line so that it contains only what follows that id.

Output:

2817262
2262
28182

You could also do this (inspired by Wintermute's comment on another answer):

awk -v RS=",|]" 'sub(/^.*"ids":/,"")' file
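
The three-argument form of match() used in the first command is a GNU awk extension. Where only a POSIX awk is available, the same loop can be sketched with RSTART and RLENGTH instead. This is an illustrative variant, not the answerer's code; the variable names str and ids are assumptions.

```shell
#!/usr/bin/env bash
# Sample input copied from the question.
str='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'

ids=$(printf '%s\n' "$str" | awk '{
  # match() sets RSTART and RLENGTH and returns 0 once nothing matches.
  while (match($0, /"ids":[0-9]+/)) {
    # The key "ids": is 6 characters long; skip it and print only the digits.
    print substr($0, RSTART + 6, RLENGTH - 6)
    # Keep only the text after this match, then look again.
    $0 = substr($0, RSTART + RLENGTH)
  }
}')

printf '%s\n' "$ids"
```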

Answer 2: (score: 0)

The grep solution is pretty. Your question is tagged awk, though, and the awk solution is ugly:

echo '["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]' \
| awk '{n=split(substr($0,2,length($0)-2),x,",");
 for(i=1;i<=n;i++) {split(x[i],a,":");
 if(a[1]=="\"ids\"") print a[1],a[2]}}'

Output:

"ids" 2817262
"ids" 2262
"ids" 28182

Please choose the grep solution as the correct answer.

Answer 3: (score: 0)

Here is a pure bash solution (long-winded, isn't it? I tend to agree with @chepner):

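# Keep the string on one line: an embedded newline would end up inside a key and stop it matching below.
str='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'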

#Remove [ ]
str=${str/[/}
str=${str/]/}

declare -a ids
declare -a names
declare -a properties
oldIFS="$IFS"
IFS=','

for record in $str
do
    type=${record%%:*}
    value=${record##*:}

    if [[ $type == \"ids\" ]]
    then
        ids[ids_i++]="$value"
    elif [[ $type == \"name\" ]]
    then
        names[names_i++]="$value"
    elif [[ $type == \"hasproperty\" ]]
    then
        properties[properties_i++]="$value"
    else
        echo "Ignored type: '$type'" >&2
    fi
done

IFS="$oldIFS"
echo "ids: ${ids[@]}"
echo "names: ${names[@]}"
echo "properties: ${properties[@]}"

The only thing it has going for it is that it spawns no child processes.

Answer 4: (score: 0)

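awk 'BEGIN {
   # FS must be set before the record is read; setting it in the main block is too late for the current line
   FS = ","
   Field = 1
   Index = 0
   }
   {
   gsub( /[][]/,"")
   gsub( /"[a-z]*":/, "")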

   while ( Field < NF) {
      ThisID[ Index]=$Field
      ThisName[ Index]=$(Field + 2)
      ThisProperty [ Index]=$(Field + 3)

      Index+=1
      Field+=4
      }
   }
END {
   for ( Iter=0;Iter<Index;Iter+=1) printf( "%s ", ThisID[Iter])
   printf "\n"
   for ( Iter=0;Iter<Index;Iter++) printf( "%s ", ThisName[Iter])
   printf "\n"
   for ( Iter=0;Iter<Index;Iter++) printf( "%s ", ThisProperty[Iter])
   printf "\n"
   }' YourFile

You still need to assign the resulting arrays to whatever variables you prefer.
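
One way to get three printed lines like these into shell variables is to read each line into a bash array. The following is a sketch only, assuming bash; extract_fields is a hypothetical wrapper around a condensed variant of the awk program (it also strips the quotes around the names), and the array names are illustrative.

```shell
#!/usr/bin/env bash
# Sample input copied from the question.
str='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'

# Hypothetical helper: prints three space-separated lines (ids, names, hasproperty values).
extract_fields() {
  printf '%s\n' "$1" | awk -F, '{
    gsub(/[][]/, "")          # drop the surrounding brackets
    gsub(/"[a-z]*":/, "")     # drop the keys, leaving only the values
    gsub(/"/, "")             # drop the quotes around the names
    # Each record has 4 values: ids, isvalid, name, hasproperty.
    for (f = 1; f < NF; f += 4) {
      ids = ids $f " "; names = names $(f + 2) " "; props = props $(f + 3) " "
    }
    print ids; print names; print props
  }'
}

# The group runs in the current shell, so the arrays survive after it finishes.
{
  read -ra ids
  read -ra names
  read -ra props
} < <(extract_fields "$str")

echo "ids: ${ids[*]}"
echo "names: ${names[*]}"
echo "hasproperty: ${props[*]}"
```

Splitting on whitespace with read -ra assumes that no value contains a space, which holds for the sample string but is not guaranteed in general.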

Answer 5: (score: 0)

unset n
string='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'
while IFS=',' read -ra line
do
    ((n++))
    for i in "${line[@]//\"/}"
    do
        eval ${i%:*}[$n]=${i#*:}
    done
done < <(sed 's/[][]//g;s/,"ids/\n"ids/g' <<<$string)

The above generates 4 arrays (ids, isvalid, name, hasproperty). If you don't need isvalid, just add a test that skips it:

unset n
string='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'
while IFS=',' read -ra line
do
    ((n++))
    for i in "${line[@]//\"/}"
    do
        [ "${i%:*}" != "isvalid" ] && eval ${i/:/[$n]=}
    done
done < <(sed 's/[][]//g;s/,"ids/\n"ids/g' <<<$string)
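
Because eval executes whatever text the input contains, it is only safe when the string is trusted. A possible eval-free variant is sketched below, assuming bash 4.3 or later for namerefs; append_to is a hypothetical helper and the whitelist of keys is taken from the sample string. Unlike the loops above, it fills the arrays from index 0.

```shell
#!/usr/bin/env bash
# Sample input copied from the question.
string='["ids":2817262,"isvalid":true,"name":"somename","hasproperty":false,"ids":2262,"isvalid":false,"name":"somename","hasproperty":false,"ids":28182,"isvalid":true,"name":"somename","hasproperty":true]'

# Hypothetical helper: append value $2 to the array whose name is given in $1.
append_to() {
    local -n target=$1    # nameref to the caller's array (bash 4.3+)
    target+=("$2")
}

ids=() isvalid=() name=() hasproperty=()

clean=${string//\"/}      # drop the double quotes
clean=${clean#\[}         # drop the leading [
clean=${clean%\]}         # drop the trailing ]

IFS=',' read -ra pairs <<<"$clean"
for pair in "${pairs[@]}"; do
    key=${pair%%:*}
    case $key in
        # Only known keys are used as variable names.
        ids|isvalid|name|hasproperty) append_to "$key" "${pair#*:}" ;;
        *) echo "Ignored key: '$key'" >&2 ;;
    esac
done

echo "ids: ${ids[*]}"
echo "isvalid: ${isvalid[*]}"
echo "name: ${name[*]}"
echo "hasproperty: ${hasproperty[*]}"
```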

Answer 6: (score: 0)

Based on what you have posted, if all you want is a list of each type of item, then this is all you need:

$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^ids/{print $2}' file
2817262
2262
28182
$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^name/{print $2}' file
somename
somename
somename
$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^hasproperty/{print $2}' file
false
false
true
$ awk -v RS=, -F: '{gsub(/[[\]"\n]/,"")} /^isvalid/{print $2}' file
true
false
true

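But this is probably not the right way to approach your actual problem. As I mentioned in a comment, if you would like some real help, please edit your question to provide more information.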