Question

My string is:

sdk_version=ios4.2.4&gender=male&product=JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=44337&app_version=1.5.1

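In this case 'product' comes before 'user_id', but sometimes it is the other way round.

I need to match and remove everything that comes before whichever of these two variables appears first. My current regex is: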

sed 's/.\+\(user_id=\\?\|product=\)//g'

But this doesn't work: it always matches everything up to and including the second element. In my example it matches:

sdk_version=ios4.2.4&gender=male&product=JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=

instead of:

sdk_version=ios4.2.4&gender=male&product=

This regex works here: http://regexr.com/3beh2

but I can't get it to work with sed.

Answer 1

Use a non-greedy quantifier in Perl:

 perl -pe 's/^.*?(product|user_id)=//'
 #              ^
 #              |
 #   match as little as possible

Answer 2

Doing this with sed takes a little trickery, because sed does not support non-greedy matching. The easiest way, I think, is

sed 's/\(user_id=\|product=\)/\n&/; s/.*\n//'

This works in two steps:

s/\(user_id=\|product=\)/\n&/   # Place a newline before the first matching
                                # foo=bar token as a marker
s/.*\n//                        # Remove everything up to the newline
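
The two-step marker approach can be tried on both orderings as follows. This is a sketch that assumes GNU sed (for `\|` and for `\n` in the replacement) and writes the alternation as `\(user_id=\|product=\)`. The string `b` and the helper name `strip_before_first` are hypothetical.

```shell
# a is the string from the question; b is a hypothetical variant with
# user_id placed before product.
a='sdk_version=ios4.2.4&gender=male&product=JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=44337&app_version=1.5.1'
b='sdk_version=ios4.2.4&gender=male&user_id=44337&shop_country=sg&product=JE779SPAKLZ5SGAMZ&app_version=1.5.1'

# Hypothetical helper (assumes GNU sed).
# Step 1 puts a newline in front of the first user_id= or product=;
# step 2 deletes everything up to and including that newline.
strip_before_first() {
    printf '%s\n' "$1" | sed 's/\(user_id=\|product=\)/\n&/; s/.*\n//'
}

strip_before_first "$a"  # product=JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=44337&app_version=1.5.1
strip_before_first "$b"  # user_id=44337&shop_country=sg&product=JE779SPAKLZ5SGAMZ&app_version=1.5.1
```

Unlike the Perl one-liner, this keeps the matched key (`product=` or `user_id=`) at the start of the result, because the marker is placed before the match.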

Answer 3

sed 's/\&user_id=/\
&/
     s/.*\&\(product=.*\)\n/\1/
     s/.*\n\&//'

POSIX sed version.
This assumes that both user_id and product are preceded by &.
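
To see how the three substitutions interact, the script can be wrapped in a function and fed both orderings. This is only a sketch: the string `b` and the helper name `strip_posix` are hypothetical, and the expected outputs were worked out for inputs that contain both keys.

```shell
# a is the string from the question; b is a hypothetical variant with
# user_id placed before product.
a='sdk_version=ios4.2.4&gender=male&product=JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=44337&app_version=1.5.1'
b='sdk_version=ios4.2.4&gender=male&user_id=44337&shop_country=sg&product=JE779SPAKLZ5SGAMZ&app_version=1.5.1'

# Hypothetical helper around the three-command script:
#  1. insert a newline marker before "&user_id="
#  2. if "&product=" sits before the marker, keep from product= on and drop the marker
#  3. otherwise drop everything up to the marker and the "&" that follows it
strip_posix() {
    printf '%s\n' "$1" | sed 's/\&user_id=/\
&/
     s/.*\&\(product=.*\)\n/\1/
     s/.*\n\&//'
}

strip_posix "$a"   # product=JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=44337&app_version=1.5.1
strip_posix "$b"   # user_id=44337&shop_country=sg&product=JE779SPAKLZ5SGAMZ&app_version=1.5.1
```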

Answer 4

You can break it down into the two cases that can occur: product followed by user_id, or user_id followed by product.

sed -e '/.*product=.*user_id=.*/{s/.*product=//}' -e '/.*user_id.*product=.*/{s/.*user_id=//}'

The last -e can be simplified to '/.*product=.*/{s/.*user_id=//}', because if the first command succeeded, there is no product= left in the line.
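
A small sketch of the two-case command applied to both orderings, assuming GNU sed (other implementations may require a `;` before the closing `}`). The string `b` and the helper name `strip_two_cases` are hypothetical.

```shell
# a is the string from the question; b is a hypothetical variant with
# user_id placed before product.
a='sdk_version=ios4.2.4&gender=male&product=JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=44337&app_version=1.5.1'
b='sdk_version=ios4.2.4&gender=male&user_id=44337&shop_country=sg&product=JE779SPAKLZ5SGAMZ&app_version=1.5.1'

# Hypothetical helper (assumes GNU sed).
# The first -e handles "product ... user_id", the second "user_id ... product".
# Each address selects the ordering, so the greedy .* inside s/// can only
# run up to the key that comes first.
strip_two_cases() {
    printf '%s\n' "$1" | sed -e '/.*product=.*user_id=.*/{s/.*product=//}' \
                             -e '/.*user_id.*product=.*/{s/.*user_id=//}'
}

strip_two_cases "$a"   # JE779SPAKLZ5SGAMZ&shop_country=sg&user_id=44337&app_version=1.5.1
strip_two_cases "$b"   # 44337&shop_country=sg&product=JE779SPAKLZ5SGAMZ&app_version=1.5.1
```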

bash, regex stop at first match
