Question

I'm working with LaTeX files and need to delete everything between two $ signs, including line breaks, keeping only the English text.

I'm using this command to process the files:

find "." -name "*.tex" | xargs perl -pi -e 's/\$[^\$].*?\$/ /g' *

Example:

Then use the naturality formula 

    $t_{G^{n-1}M} G^{i+1} (\epsilon_{G^{n-i}M}) 
    = G^{i+1} (\epsilon_{G^{n-i}M}) t_{G^n M}$ on the left-hand side.

Output:

Then use the naturality formula 
 on the left-hand side.

Another example from the file:

Example:

\begin{itemize}
\item $M$ is atomic and finitely generated;
\item $M$ is cancellative;
\item $(M, \le_L)$ and $(M, \le_R)$ are lattices;
\item there exists an element $\Delta \in M$, called {\it Garside element}, such that the set 
$L(\Delta)= \{ x \in M; x\le_L \Delta\}$ generates $M$ and is equal to $R(\Delta)= \{ x\in M; 
x\le_R \Delta\}$.
\end{itemize}

Output:

\begin{itemize}
\item   is atomic and finitely generated;
\item   is cancellative;
\item   and   are lattices;
\item there exists an element  , called {\it Garside element}, such that the set 
  generates   and is equal to $R(\Delta)= \{ x\in M; 
x\le_R \Delta\}$.
\end{itemize}

As you can see, $R(\Delta)= \{ x\in M; x\le_R \Delta\}$ could not be removed!

Example 2, from a different file. Here the output is identical to the input; nothing changes:

    Using the fact that   is atomic and that $L(\Delta)= 
\{x \in M; x \le_L \Delta\} M \pi_L(a) \neq 1 a \neq 
1 k \partial_L^k(a)=1 k$ be the

Answer 1

My guess is that it fails to match wherever the text it should match spans multiple lines.

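Your [^\$].*? uses [^\$] to match one character that is not $, and then .*? lazily matches zero or more of any character except a newline. This works in your single-line case, because the lazy quantifier stops at the first $ it reaches, but the multi-line case fails because . does not match a newline.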

The correct and more efficient [^\$]* matches as many non-$ characters as possible, newlines included.

So your substitution becomes

s/\$[^\$]*\$/ /g

or, cleaner in my opinion, with a non-standard delimiter that avoids the 'fencepost' look of /\

s~\$[^\$]*\$~ ~g


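Perl is also processing your file line by line (that is what -p does), which is another reason the match fails across line breaks. This is already well covered in answers written by people who know Perl better than I do: How to match multiline data in perl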
