HTML parsing - converting text into links

Time: 2013-03-29 01:42:34

Tags: php javascript html parsing dom

Suppose I have text like this:

  

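Aaron was implicated in the sin of his brother at Meribah (Num. 20:8-13), and on that account was not permitted to enter the Promised Land. When the tribes arrived at Mount Hor, "in the edge of the land of Edom," at the command of God Moses led Aaron and his son Eleazar to the top of that mountain, in the sight of all the people. There he stripped Aaron of his priestly vestments, and put them upon Eleazar; and there Aaron died on the top of the mount, being 123 years old (Num. 20:23-29. Comp. Deut. 10:6; 32:50)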

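What I want to do is convert each bold reference above (Num. 20:8-13 and so on) into a link, where the link is built as follows:

  • Num. 20:8-13 should become: <a href="num20.8-13">Num. 20:8-13</a>
  • Deut. 10:6; 32:50 should become: <a href="deut10.6">Deut. 10:6</a> <a href="deut32.50">Deut. 32:50</a>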

The structure of the text is as follows:

<DIV>
  <B>Aaron</B>
  <SPAN>
    Aaron was implicated in the sin of his brother at Meribah (Num. 20:8-13), and on that account was not permitted to enter the Promised Land. When the tribes arrived at Mount Hor, "in the edge of the land of Edom," at the command of God Moses led Aaron and his son Eleazar to the top of that mountain, in the sight of all the people. There he stripped Aaron of his priestly vestments, and put them upon Eleazar; and there Aaron died on the top of the mount, being 123 years old (Num. 20:23-29. Comp. Deut. 10:6; 32:50)
  </SPAN>
</DIV>

Any good ideas would be appreciated. Thanks :))


Edit

Code:

$chapters = array ("Deut", "Num");

$html = file_get_html($link);

foreach($html->find('div') as $dict) {
    $descr  = $dict->find('SPAN', 0)->innertext;    
    $descrl = preg_replace("/$chapters\. [0-9:-]*/", "<a href=\"$0\">$0</a>", $descr); //--> See description below

    echo $descrl . "<hr/>";
}

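Description: when I replace $chapters in the pattern with a single word such as Num or Deut, it works well, but when I change it back to $chapters, it doesn't return any links.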

1 Answer:

Answer 0: (score: 2)

You didn't specify the rules; you should define and refine them for yourself. I handled your specific case.

//match either book name, followed by a period and a space,
//followed by one or more digits, colons, commas, semicolons, spaces, or dashes
//(txt is assumed to hold the SPAN text; replace() returns a new string, kept in linked)
var linked = txt.replace(/(Num|Deut)\. ([\d:,; -]+)/g, function (match, book, verses) {
    var link = '';
    //split the verses on semicolon + whitespace, as each part must be linked separately
    verses.split(/;\s+/).forEach(function (elem) {
        //create the link; replace : with a period in the href
        link += '<a href="' + book.toLowerCase() + elem.replace(':', '.') + '">'
            + book + '. ' + elem + '</a> ';
    });
    return link;
});

http://jsfiddle.net/XaVXW/
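
The question keeps the book names in an array, and in PHP interpolating an array into a string yields the literal text Array, which would explain why the pattern matched nothing. One possible way around this, sketched here in JavaScript to match the answer, is to join the names into a single alternation group before building the pattern. The names linkReferences and escapeRegExp are hypothetical helpers introduced only for this illustration, and the reference format (book, period, space, chapter:verse parts separated by "; ") is an assumption based on the sample text.

```javascript
// Hypothetical helper: escape regex metacharacters in a book name.
function escapeRegExp(s) {
    return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: wrap each "Book. chapter:verse" reference in a link.
// Assumes references look like "Num. 20:8-13" or "Deut. 10:6; 32:50".
function linkReferences(text, books) {
    // Join the names into one alternation group, e.g. (Deut|Num).
    var group = '(' + books.map(escapeRegExp).join('|') + ')';
    // A reference part starts with a digit and may contain digits, colons, dashes.
    var part = '\\d[\\d:-]*';
    var pattern = new RegExp(group + '\\. (' + part + '(?:; ' + part + ')*)', 'g');

    return text.replace(pattern, function (match, book, verses) {
        // Each semicolon-separated part becomes its own link.
        return verses.split('; ').map(function (ref) {
            return '<a href="' + book.toLowerCase() + ref.replace(':', '.') + '">' +
                book + '. ' + ref + '</a>';
        }).join(' ');
    });
}

var sample = '(Num. 20:23-29. Comp. Deut. 10:6; 32:50)';
console.log(linkReferences(sample, ['Deut', 'Num']));
```

With this sketch, "Deut. 10:6; 32:50" becomes two links (deut10.6 and deut32.50) separated by a space, as the question asks. Requiring each part to start with a digit keeps trailing spaces and punctuation out of the href. The same idea should carry over to the PHP code by joining the array with implode('|', $chapters) before placing it in the pattern.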