How to extract a string field value from a URL using regex in a shell script?

Time: 2015-02-03 17:58:06

Tags: regex linux bash shell

I am working on a project in which I need to call one of my servers from a bash shell script.

http://hostname.domain.com:8080/beat

After hitting the above URL, I get the following response, which I need to parse to extract the value of state:

num_retries_allowed: 3 count: 30 count_behind: 100 state: POST_INIT num_rounds: 60 hour_col: 2 day_col: 0

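Now I want to extract the value of state using a regular expression. I can already extract the count and count_behind values, but I am not sure how to extract the state value.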

#send the request, put response in variable
DATA=$(wget -O - -q -t 1 http://hostname.domain.com:8080/beat)

#grep $DATA for count and count_behind
COUNT=$(echo $DATA | grep -oE 'count: [0-9]+' | awk '{print $2}')
COUNT_BEHIND=$(echo $DATA | grep -oE 'count_behind: [0-9]+' | awk '{print $2}')

# how to extract state variable value here?
STATE= what do I add here?

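Also, if the state field is not present in $DATA, I want to assign 0 to the STATE variable. After that, I want to check a condition and exit the script depending on the result.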

If STATE equals POST_INIT, the shell script should exit successfully; if STATE equals 0, it should also exit successfully.

#verify conditionals (== compares strings; -eq would compare integers)
if [[ $STATE == "POST_INIT" || $STATE == "0" ]]; then exit 0; fi

1 answer:

Answer 0 (score: 1)

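You can use grep -P (PCRE mode), where \K drops the already-matched "state: " prefix so that -o prints only the value: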

state=$(grep -oP 'state: \K\S+' <<< "$DATA")   
[[ -z "$state" ]] && state=0
echo "$state"
POST_INIT
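
If grep -P is not available (some grep builds ship without PCRE support), the same extraction can be sketched with bash's built-in =~ operator. This is only an illustration under stated assumptions: extract_state is a hypothetical helper name that does not appear in the question or the answer, and DATA is hard-coded to the sample response from the question instead of being fetched with wget.

```shell
#!/usr/bin/env bash

# Sample response copied from the question; in the real script this would
# come from the wget call.
DATA='num_retries_allowed: 3 count: 30 count_behind: 100 state: POST_INIT num_rounds: 60 hour_col: 2 day_col: 0'

# extract_state (hypothetical helper): prints the token that follows
# "state: ", or 0 when the field is absent from the input.
extract_state() {
  local data=$1
  # Require start-of-string or whitespace before "state:" so that a field
  # whose name merely ends in "state" is not matched by accident.
  local re='(^|[[:space:]])state:[[:space:]]+([^[:space:]]+)'
  if [[ $data =~ $re ]]; then
    printf '%s\n' "${BASH_REMATCH[2]}"
  else
    printf '0\n'
  fi
}

STATE=$(extract_state "$DATA")
echo "$STATE"

# String comparison needs == inside [[ ]]; -eq compares integers.
if [[ $STATE == "POST_INIT" || $STATE == "0" ]]; then
  echo "state is acceptable"
fi
```

With the sample response this prints POST_INIT followed by the confirmation line. In the actual script the final echo would presumably be replaced by exit 0, as in the question.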