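I am writing a simple translator from John Tromp's Binary Lambda Calculus to De Bruijn notation lambda calculus, so that I can understand how the lambda files work in his 2012 "Most Functional" International Obfuscated C Code Contest winner.
Here is a sample of the language before translation, primes.blc: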
00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110
I am having trouble with the nested regular expression on the commented-out line just before the part of Bruijn.pl that saves primes.txt:
#!/usr/bin/env perl
#use strict;
use warnings;
use IO::File;
use Cwd; my $originalCwd = getcwd()."/";
#primes.blc as argument for test conversion
#______________________________________________________________________open file
my ($name) = @ARGV;
$FILE = new IO::File;
$FILE->open("< ".$originalCwd."primes.blc") || die("Could not open file!");
#$FILE->open("< ".$name) || die("Could not open file!");
while (<$FILE>){ $field .= $_; }
$FILE->close;
#______________________________________________________________________Translate
$field =~ s/(00|01|(1+0))/$1 /gsm;
$field =~ s/00 /\\ /gsm;
$field =~ s/01 /(a /gsm;
$field =~ s/(1+)0 /length($1)." "/gsme;
$RecursParenthesesRegex = m/\(([^()]+|(??{$RecursParenthesesRegex}))*\)/;
#$field =~ 1 while s/(\(a){1}(([\s\\]+?(\d+|$RecursParenthesesRegex)){2})/\($2\)/sm;
#______________________________________________________________________save file
#$fh = new IO::File "> ".$name;
$fh = new IO::File "> ".$originalCwd."primes.txt";
if (defined $fh) { print $fh $field; $fh->close; }
The translated file primes.txt should contain:
\ (\ (1 (1 ((\ (1 1) \ \ \ ((1 \ \ 1) (\ (((4 4) 1) (\ (1 1) \ (2 (1 1)))) \ \ \ \ ((1 3) (2 (6 4)))))) \ \ \ (4 (1 3))))) \ \ ((1 \ \ 2) 2))
Currently, with that line commented out, the script produces an almost readable format like this:
\ (a \ (a 1 (a 1 (a (a \ (a 1 1 \ \ \ (a (a 1 \ \ 1 (a \ (a (a (a 4 4 1 (a \ (a 1 1 \ (a 2 (a 1 1 \ \ \ \ (a (a 1 3 (a 2 (a 6 4 \ \ \ (a 4 (a 1 3 \ \ (a (a 1 \ \ 2 2
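What remains is to find the innermost (a application followed by two operands (each a number, or a matched pair of parentheses with all of its contents), insert the trailing ) and remove the a, repeating until the outermost application has been converted.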
Answer 0 (score: 2)
Although I don't understand your algorithm, this line is suspicious:
$RecursParenthesesRegex = m/\(([^()]+|(??{$RecursParenthesesRegex}))*\)/
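It assigns to an undeclared variable the result of matching the very pattern that contains it against $_.
use strict is intended to catch mistakes like this, but instead of fixing the error you switched it off. That is unwise.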
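I guess you are trying to define a recursive pattern, in which case you need qr// rather than m//, and (?0) or (?R) inside the pattern.
Let's call it $re instead, shall we? Like this: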
my $re = qr/\(([^()]+|(?R))*\)/
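A minimal sketch of how such a pattern behaves when it is used on its own. The helper name first_balanced is hypothetical. Because (?R) recurses into the whole pattern, the sketch binds $re directly rather than interpolating it into a larger regex, where (?R) would refer to the enclosing pattern instead; a named group called with (?&name) avoids that problem.

```perl
use strict;
use warnings;

# The pattern from the answer. (?R) recurses into the whole pattern,
# so it is used here on its own rather than embedded in a larger regex.
my $re = qr/\(([^()]+|(?R))*\)/;

# Hypothetical helper: return the first balanced parenthesised group
# found in $text, or undef if there is none.
sub first_balanced {
    my ($text) = @_;
    return $text =~ $re ? substr($text, $-[0], $+[0] - $-[0]) : undef;
}

print first_balanced('x (a (b c) d) y'), "\n";   # prints "(a (b c) d)"
```

The match offsets in @- and @+ are used instead of a capture group so that the pattern itself stays exactly as written above.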
Also, this line is odd:
$field =~ 1 while s/(\(a){1}(([\s\\]+?(\d+|$RecursParenthesesRegex)){2})/\($2\)/sm
It compares the value of $field against the regex pattern 1 for as long as the substitution succeeds on $_.
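For comparison, the repeat-until-no-change idiom is normally written with the binding on the substitution itself, so that s/// operates on the intended variable rather than on $_. The following is only a toy sketch of that idiom; the helper name strip_empty_pairs and the pattern are made up for illustration and are not the pattern the question needs.

```perl
use strict;
use warnings;

# Hypothetical helper: repeatedly delete empty "()" pairs from $text
# until no substitution succeeds. The loop body is the constant 1;
# all the work happens in the loop condition.
sub strip_empty_pairs {
    my ($text) = @_;
    1 while $text =~ s/\(\)//;
    return $text;
}

print strip_empty_pairs('((()))x'), "\n";   # prints "x"
```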
Beyond that, without a description of the algorithm and how your code relates to it, I can't help you.
Answer 1 (score: 2)
You probably need a regex like this:
# (\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))
( \(a ) # (1)
( # (2 start)
( # (3 start)
[\s\\]*?
(?:
\d+
|
(?&RecursParens)
)
){2} # (3 end)
) # (2 end)
(?(DEFINE)
(?<RecursParens> # (4 start)
(?>
\(
(?>
(?> [^()]+ )
| (?:
(?= . )
(?&RecursParens)
|
)
)+
\)
)
) # (4 end)
)
Use it in Perl code like this:
use strict;
use warnings;
use feature qw{say};
my $field = "00010001100110010100011010000000010110000010010001010111110111101001000110100001110011010000000000101101110011100111111101111000000001111100110111000000101100000110110";
$field =~ s/(00|01|(1+0))/$1 /g;
$field =~ s/00 /\\ /g;
$field =~ s/01 /(a /g;
$field =~ s/(1+)0 /length($1)." "/ge;
1 while $field =~ s/(\(a)(([\s\\]*?(?:\d+|(?&RecursParens))){2})(?(DEFINE)(?<RecursParens>(?>\((?>(?>[^()]+)|(?:(?=.)(?&RecursParens)|))+\))))/\($2\)/g;
$field =~ s/\( /\(/g;
say $field;
This gives you output like this:
\ (\ (1 (1 ((\ (1 1) \ \ \ ((1 \ \ 1) (\ (((4 4) 1) (\ (1 1) \ (2 (1 1)))) \ \ \ \ ((1 3) (2 (6 4)))))) \ \ \ (4 (1 3))))) \ \ ((1 \ \ 2) 2))
It can be formatted like this:
\
( # (1 start)
\
( # (2 start)
1
( # (3 start)
1
( # (4 start)
( # (5 start)
\
( 1 1 ) # (6)
\ \ \
( # (7 start)
( 1 \ \ 1 ) # (8)
( # (9 start)
\
( # (10 start)
( # (11 start)
( 4 4 ) # (12)
1
) # (11 end)
( # (13 start)
\
( 1 1 ) # (14)
\
( # (15 start)
2
( 1 1 ) # (16)
) # (15 end)
) # (13 end)
) # (10 end)
\ \ \ \
( # (17 start)
( 1 3 ) # (18)
( # (19 start)
2
( 6 4 ) # (20)
) # (19 end)
) # (17 end)
) # (9 end)
) # (7 end)
) # (5 end)
\ \ \
( # (21 start)
4
( 1 3 ) # (22)
) # (21 end)
) # (4 end)
) # (3 end)
) # (2 end)
\ \
( # (23 start)
( 1 \ \ 2 ) # (24)
2
) # (23 end)
) # (1 end)
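As a cross-check on the regex approach, the same translation can be sketched as a small recursive-descent parser. This is a different technique from the substitution loop above and not part of the answer's method. The names parse_term and blc_to_debruijn are made up for illustration, and the sketch assumes the encoding that the substitutions above rely on: 00 for an abstraction, 01 for an application, and n ones followed by a zero for variable n.

```perl
use strict;
use warnings;

# Parse one term starting at bit offset $$pos in $bits and return it in
# the De Bruijn text form used above. $$pos is advanced past the term.
sub parse_term {
    my ($bits, $pos) = @_;
    my $tag = substr($bits, $$pos, 2);
    if ($tag eq '00') {                  # abstraction: 00 <body>
        $$pos += 2;
        return '\\ ' . parse_term($bits, $pos);
    }
    if ($tag eq '01') {                  # application: 01 <function> <argument>
        $$pos += 2;
        my $function = parse_term($bits, $pos);
        my $argument = parse_term($bits, $pos);
        return "($function $argument)";
    }
    # variable: a run of n ones closed by a zero encodes index n
    if (substr($bits, $$pos) =~ /\A(1+)0/) {
        $$pos += length($1) + 1;
        return length($1);
    }
    die "Malformed or truncated term at bit $$pos\n";
}

# Hypothetical entry point: translate a string of 0s and 1s.
# Whitespace is dropped first; bits after the first complete term are ignored.
sub blc_to_debruijn {
    my ($bits) = @_;
    $bits =~ s/\s+//g;
    my $pos = 0;
    return parse_term($bits, \$pos);
}

print blc_to_debruijn('0100100010'), "\n";   # prints "(\ 1 \ 1)"
```

Since every application knows exactly where its two operands end, no parenthesis matching is needed, which is why this form avoids the recursive regex altogether.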