Regex problem when porting from Ruby to PHP

Time: 2013-05-20 22:16:55

Tags: php ruby regex porting

I have two pieces of code that look like correct translations of each other. Unfortunately, they seem to return different values.

The code in Ruby:

def separate(text,boundary = nil)
    # returns array of strings and arrays containing all of the parts of the email
    textList = []
    if !boundary #look in the email for "boundary= X"
        text.scan(/(?<=boundary=).*/) do |bound|
            textList = recursiveSplit(text,bound)
            end
    end
    if boundary 
        textList = recursiveSplit(text,boundary)
    end
    puts textList.count
    return textList
end


def recursiveSplit(chunk,boundary)
    if chunk.is_a? String
        searchString = "--" + boundary
        ar = chunk.split(searchString)
        return ar
    elsif chunk.is_a? Array
        chunk do |bit|
            recursiveSplit(bit,boundary);
        end
    end
end

The code in PHP:

function separate($text, $boundary="none"){
    #returns array of strings and arrays containing all the parts of the email
    $textBlock = [];
    if ($boundary == "none") {
        preg_match_all('/(?<=boundary=).*/', $text, $matches);
        $matches = $matches[0];
        foreach ($matches as $match) {
            $textList = recursiveSplit($text,$match);
        }
    }else {
        $textList = recursiveSplit(text,boundary);
    }
    var_dump($textList);
    return $textList;
}

function recursiveSplit($chunk,$boundary){
    if (is_string($chunk)) {
        $ar = preg_split("/--".$boundary."/", $chunk);
        //$ar = explode($searchString, $chunk);
        return $ar;
    }
    elseif (is_array($chunk)) {
        foreach ($chunk as $bit) {
            recursiveSplit($bit,$boundary);
        }
    }
}

var_dump($textList) shows an array of length 3, whereas textList.count in Ruby reports a different number. What gives?

Anonymized $text example:

MIME-Version: 1.0
Received: by 10.112.170.40 with HTTP; Fri, 3 May 2013 05:08:21 -0700 (PDT)
Date: Fri, 3 May 2013 08:08:21 -0400
Delivered-To: me@gmail.com
Message-ID: <CADPp44E47syuXvP1K-aemhcU7vdSijZkfKLu-74QPWs9U9551Q@mail.gmail.com>
Subject: MiB 5/3/13 7:43AM (EST)
From: Me <me@gmail.com>
To: Someone <someone@aol.com>
Content-Type: multipart/mixed; boundary=BNDRY1

--BNDRY1
Content-Type: multipart/alternative; boundary=BNDRY2

--BNDRY2
Content-Type: text/plain; charset=ISO-8859-1

-TEXT STUFF HERE. SAYING THINGS
ABOUT CERTAIN THINGS

--BNDRY2
Content-Type: text/html; charset=ISO-8859-1

<div dir="ltr">-changed signature methods to conform more to working clinic header methods(please test/not testable in simulator)<div style>-confirmed that signature image is showing up in simulator. Awaiting further tests</div>
<div style>-Modified findings spacing/buffer. See if you like it</div></div>

--BNDRY2--
--BNDRY1
Content-Type: application/zip; name="Make it Brief.ipa.zip"
Content-Disposition: attachment; filename="Make it Brief.ipa.zip"
Content-Transfer-Encoding: base64
X-Attachment-Id: f_hg9biuno0

<<FILE DATA>>
--BNDRY1--

Run separate(text) on the example above, or on any Gmail "Show original" email, to reproduce the error.

1 answer:

Answer 0: (score: 0)

BINGO ZINGO, figured it out!

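Apparently, in PHP, if you want to change a variable from inside a loop that iterates over it (for example, reassigning the elements of an array inside foreach), you have to prefix the loop variable with '&' so that it is taken by reference rather than copied.

I added the '&' and fixed some general recursion bugs, and now it runs smoothly.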
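
For anyone hitting the same thing, here is a minimal sketch of what the '&' changes. It is not the exact code I ended up with: splitNested() and the sample strings are made up for illustration, and I use explode() instead of preg_split() on the assumption that the boundary should be matched literally.

```php
<?php
// Hypothetical sample data: two chunks that each contain the marker "--X".
$copied = ["a--X b", "c--X d"];

// Without '&', $bit is a copy of each element, so the assignment is thrown away.
foreach ($copied as $bit) {
    $bit = explode("--X", $bit);
}
// $copied still holds the two original strings.

$referenced = ["a--X b", "c--X d"];

// With '&', $bit refers to the element itself, so the array is updated in place.
foreach ($referenced as &$bit) {
    $bit = explode("--X", $bit);
}
unset($bit); // drop the reference so a later write to $bit cannot touch the array
// $referenced is now [["a", " b"], ["c", " d"]].

// Hypothetical recursive version: strings are split, arrays are walked by
// reference, and the result is returned in both branches.
function splitNested($chunk, string $boundary)
{
    if (is_string($chunk)) {
        // explode() treats the boundary as a literal string, not a pattern.
        return explode("--" . $boundary, $chunk);
    }
    foreach ($chunk as &$bit) {
        $bit = splitNested($bit, $boundary);
    }
    unset($bit);
    return $chunk; // $chunk is a local copy, so the caller gets the rebuilt array
}
```

Note that the array branch of the recursiveSplit in the question never returns anything, so the work done inside that loop is lost even once the '&' is there. Returning the rebuilt array, as in the sketch, is one way to handle that.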