Question

(I am using BSD sed.)

This bash script:

sed -E -f parsefile < parsewords.d

with this command file:

# Delete everything before BEGIN RTL and after END RTL
\?/\* BEGIN RTL \*/?,\?/\* END RTL \*/?!d   

# Delete comments unless they begin with /*!
s?/\*[^!].*\*/??g       

# Delete blank lines
/^[[:blank:]]*$/d

# Break line into words
s/[^A-Za-z0-9_]+/ /g 

# Remove leading and trailing spaces and tabs
s/^[[:blank:]]*(.*)[[:blank:]]*$/\1/

and this input file:

any stuff
/* BEGIN RTL */

/*! INPUTS: a  b c d ph1   */ /* Comment */
x = a && b || c && d;

    y = x ? a : b;  /* hello */
z = ph1 ? x : z;
  w = c || x || (z || d);
/* END RTL */

produces this result:

INPUTS a b c d ph1 
x a b c d 
y x a b 
z ph1 x z 
w c x z d

This is fine so far, but what I would really like is this:

x = a && b || c && d; x a b c d
y = x ? a : b; y x a b
z = ph1 ? x : z; z ph1 x z
w = c || x || (z || d); w c x z d

so that each original line is kept alongside the modification the script makes to it.

Is this possible with sed, or should I use something else? (Any other comments are welcome too.)

Edit: This is not a parsing question. It is about keeping the original input line alongside the sed modification.

Answer 1

A solution using sed.

Input file (infile):

any stuff
/* BEGIN RTL */

/*! INPUTS: a  b c d ph1   */ /* Comment */
x = a && b || c && d;

    y = x ? a : b;  /* hello */
z = ph1 ? x : z;
  w = c || x || (z || d);
/* END RTL */

The sed program (script.sed):

# Delete everything before BEGIN RTL and after END RTL
\?/\* BEGIN RTL \*/?,\?/\* END RTL \*/?!d   

# Delete comments unless they begin with /*!
s?/\*[^!].*\*/??g       

# Delete blank lines
/^[[:blank:]]*$/d

# Copy current line in hold space.
h

# Break line into words
s/[^A-Za-z0-9_]+/ /g 

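# Append the word list to the saved line, join the two with a space,
# and collapse the whitespace after ';' to a single space.
# ([[:space:]] is the POSIX class; \s is not portable across sed implementations.)
H ; g ; s/\n/ / ; s/;[[:space:]]+/; /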

# Remove leading and trailing spaces and tabs
s/^[[:blank:]]*(.*)[[:blank:]]*$/\1/

Run it:

$ sed -E -f script.sed infile

Output (I don't understand what the line containing 'INPUTS' is for, so change the script to handle it as you need):

/*! INPUTS: a  b c d ph1   */   INPUTS a b c d ph1 
x = a && b || c && d; x a b c d 
y = x ? a : b; y x a b 
z = ph1 ? x : z; z ph1 x z 
w = c || x || (z || d); w c x z d
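The hold-space idea can also be seen in isolation. The following is only a minimal sketch, not the script above: the function name keep_and_append is made up for illustration, the input is two sample lines rather than the full file, and it leaves out the BEGIN/END range and the comment handling. It saves the line with h, builds the word list, swaps the two with x, and appends the word list with G, using only constructs that should behave the same in BSD and GNU sed.

```shell
#!/bin/sh
# keep_and_append (hypothetical helper): print each input line
# followed by the list of identifiers found in it.
keep_and_append() {
  # h            save the original line in the hold space
  # s/[^...]+/ / reduce the pattern space to space-separated words
  # s/^ +//      trim leading spaces from the word list
  # s/ +$//      trim trailing spaces from the word list
  # x            swap: pattern space = original, hold space = words
  # G            append newline + words to the original line
  # s/\n/ /      join the two parts with a single space
  sed -E \
    -e 'h' \
    -e 's/[^A-Za-z0-9_]+/ /g' \
    -e 's/^ +//' \
    -e 's/ +$//' \
    -e 'x' \
    -e 'G' \
    -e 's/\n/ /'
}

printf '%s\n' 'x = a && b || c && d;' 'w = c || x || (z || d);' | keep_and_append
```

Because the original line is swapped back in untouched, any leading whitespace it had is preserved; trimming it would need one more substitution after the join.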

Answer 2

I'd say this task is going to be difficult to do with sed.

Perhaps you need to look into parsing/lexing?

Can this be done?
