Question

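I have a text file, and I'm trying to use a bash script to get an array of the strings found between $ .. $ delimiters (LaTeX formulas). My current code doesn't work; the result is empty: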

#!/bin/bash
array=($(grep -o '\$([^\$]*)\$' test.txt))
echo ${array[@]}

I tested this regex in an online tester, and it finds matches there. I'm using the following test string:

b5f1e7$bfc2439c621353$d1ce0$629f$b8b5

The expected result is

bfc2439c621353 629f

But echo prints nothing. However, if I use '[0-9]\+' instead, it works:

5 1 7 2439 621353 1 0 629 8 5

What am I doing wrong?

Answer 1

You can use awk with $ as the input field separator:

s='b5f1e7$bfc2439c621353$d1ce0$629f$b8b5'
awk -F '$' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$s"

Note that this awk command does not validate its input. If you want awk to accept only valid input, you can use this GNU awk command with FPAT:

awk -v FPAT='\\$[^$]*\\$' '{for (i=1; i<=NF; i++) {gsub(/\$/, "", $i); print $i}}' <<< "$s"
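
To make the point about validation concrete, here is a minimal sketch using a hypothetical input, a$b$c$d, whose third $ is never closed. The variable name loose is only for illustration and is not part of the answer.

```shell
# Hypothetical input: the third $ opens a segment that is never closed.
s='a$b$c$d'

# Splitting on $ and printing the even-numbered fields also reports "d",
# even though no closing $ follows it.
loose=$(awk -F '$' '{for (i=2; i<=NF; i+=2) print $i}' <<< "$s")

printf '%s\n' "$loose"
# prints:
# b
# d
```

A pattern-based approach such as the FPAT command above only accepts complete $...$ pairs, so it would be expected to skip d for this input.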

Answer 2

How about:

grep -o '\$[^$]*\$' test.txt | tr -d '$'

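This basically runs your original grep, but without the parentheses, which were what kept it from matching: in grep's default basic regular expression syntax, unescaped ( and ) match literal parentheses. It then removes the leading and trailing $ from each match.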
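
Since the question asks for an array, the following sketch shows the effect of the parentheses and then loads the matches into a bash array. It assumes bash 4.0 or newer for mapfile; the variable names (line, bre_out, ere_out, matches) are illustrative only.

```shell
# Sample line from the question; single quotes keep the $ characters literal.
line='b5f1e7$bfc2439c621353$d1ce0$629f$b8b5'

# Basic regex (grep's default): ( and ) are ordinary characters, so this
# pattern requires a literal "(" after the first $. Nothing matches.
bre_out=$(printf '%s\n' "$line" | grep -o '\$([^$]*)\$' || true)

# Extended regex (-E): ( and ) form a group, so the same pattern matches.
ere_out=$(printf '%s\n' "$line" | grep -Eo '\$([^$]*)\$' || true)

# Read one match per line into an array, with the $ delimiters stripped.
mapfile -t matches < <(printf '%s\n' "$line" | grep -o '\$[^$]*\$' | tr -d '$')

printf 'BRE: [%s]\n' "$bre_out"
printf 'ERE: [%s]\n' "${ere_out//$'\n'/ }"
printf 'array element: %s\n' "${matches[@]}"
```

Using mapfile -t instead of array=($(...)) avoids word splitting and glob expansion on the matches, which matters if a formula contains spaces or * characters.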

Answer 3

How about this?

grep -Eo '\$[^$]+\$' a.txt | sed 's/\$//g'

I'm using sed to remove the $ characters.

Answer 4

Try escaping the parentheses:

tst> grep -o '\$\([^\$]*\)\$' test.txt
$bfc2439c621353$
$629f$

Of course, you then have to strip the $ signs (-o prints the whole match). You can try using sed instead:

tst> sed 's/[^\$]*\$\([^\$]*\)\$[^\$]*/\1\n/g' test.txt
bfc2439c621353
629f

Answer 5

Why is your expected output for b5f1e7$bfc2439c621353$d1ce0$629f$b8b5 the two elements bfc2439c621353 629f, rather than the three elements bfc2439c621353 d1ce0 629f?

Here is a grep command to extract them:

$ grep -Po '\$\K[^\$]*(?=\$)' <<<'b5f1e7$bfc2439c621353$d1ce0$629f$b8b5'
bfc2439c621353
d1ce0
629f

(This requires GNU grep compiled with libpcre for -P.)

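This uses \$\K (equivalent to (?<=\$)) to look behind at the first $ and (?=\$) to look ahead to the next $. Since these are lookarounds, grep does not consume them in the process, and d1ce0 can therefore be found.

Here is a single POSIX sed command to extract them:

$ sed 's/^[^$]*\$//; s/\$[^$]*$//; s/\$/\n/g' \
    <<<'b5f1e7$bfc2439c621353$d1ce0$629f$b8b5'
bfc2439c621353
d1ce0
629f

This avoids GNU-only options and should work on most POSIX-compatible systems (e.g. OS X). It removes the unwanted leading and trailing portions, then replaces each $ with a newline. Strictly speaking, POSIX leaves \n in the replacement text unspecified, so on a sed that prints a literal n there, use a backslash followed by an actual newline instead.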
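
The difference between the two readings (two elements versus three) comes down to whether the closing $ of one match may also serve as the opening $ of the next. As a rough sketch, the snippet below contrasts a consuming grep match with a plain bash split on $. The split is a swapped-in technique, not the grep -P or sed command from this answer, and it assumes some text follows the last $. All variable names are illustrative.

```shell
line='b5f1e7$bfc2439c621353$d1ce0$629f$b8b5'

# Consuming match: each hit swallows its closing $, so the search resumes
# at "d1ce0$629f$b8b5" and d1ce0 no longer has an opening $.
paired=$(printf '%s\n' "$line" | grep -o '\$[^$]*\$' | tr -d '$')

# Every segment enclosed by two $ signs: split on $, then drop the first
# and last fields, which lie outside any delimiter pair.
IFS='$' read -r -a fields <<< "$line"
between=()
if (( ${#fields[@]} > 2 )); then
    between=("${fields[@]:1:${#fields[@]}-2}")
fi

echo "paired:  ${paired//$'\n'/ }"   # paired:  bfc2439c621353 629f
echo "between: ${between[*]}"        # between: bfc2439c621353 d1ce0 629f
```

For LaTeX formulas the paired reading is usually the intended one, since d1ce0 sits outside the math delimiters.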

Answer 6

Using bash regex:

var="b5f1e7\$bfc2439c621353\$d1ce0\$629f\$b8b5"  # string to var
while [[ $var =~ ([^$]*\$)([^$]*)\$(.*) ]]       # matching
do 
    echo -n "${BASH_REMATCH[2]} "                # 2nd element has the match
    var="${BASH_REMATCH[3]}"                     # 3rd is the rest of the string
done
echo                                             # trailing newline
bfc2439c621353 629f
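
The loop above prints the matches; to get the array the question asks for, the same idea can be wrapped in a function. This is only a sketch: extract_pairs and the global array matches are hypothetical names, and the regex is kept in a variable so that bash does not reinterpret the $ characters.

```shell
# Hypothetical helper: fill the global array "matches" with the text found
# between successive $...$ pairs in the first argument.
extract_pairs() {
    local rest=$1
    # Skip text up to the first $, capture up to the next $, keep the remainder.
    local re='^[^$]*\$([^$]*)\$(.*)$'
    matches=()
    while [[ $rest =~ $re ]]; do
        matches+=("${BASH_REMATCH[1]}")   # text between the pair
        rest=${BASH_REMATCH[2]}           # continue after the closing $
    done
}

extract_pairs 'b5f1e7$bfc2439c621353$d1ce0$629f$b8b5'
printf '%s\n' "${matches[@]}"
# prints:
# bfc2439c621353
# 629f
```

To process a file such as test.txt, one option is to call the function once per line inside a while IFS= read -r loop and append each result to a larger array.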

Find all text between $ ... $ delimiters using a bash script
