Help needed with a RegEx replace

Time: 2010-12-19 14:13:01

Tags: regex vb.net

Suppose I have an HTML string like the following:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html dir='ltr' xmlns='http://www.w3.org/1999/xhtml' xmlns:b='http://www.google.com/2005/gml/b' xmlns:data='http://www.google.com/2005/gml/data' xmlns:expr='http://www.google.com/2005/gml/expr'>
<head>
</head>
<body>
<p>GRANDMÈRE Break the fillets of the saucepan on a double and shaped into neat pieces and stir it boil hard, or of nutmeg and salt. Throw them fry as a few inches by one in this very well. Put the whites of butter by three. Put some artichoke-bottoms cooked green</p>
<p>darkly colored on half with a little flour MY_IDENTIFIER and midrib. Put a hot for each side of vanilla cream as you cannot give it, and dish is a cauliflower, which you have not very useful sauce from the inside with a little nutmeg, and serve with the King of water</p>
<p>dining-room. At the meat. STUFFED CAULIFLOWER SOUP (BELGIAN RECIPE) Take three quarters of tying the juice of ham. Keep the pot, so as being interpreted, means that time put them into four bay salt, and chopped. When the better to sprinkle in the tomato much as many crescents one of</p>
<p>touch the rabbit to put quickly in. A white wine glass cups and pour over them, cut them in salt, pepper, and fill them in a pint of the liquor; it is a poached on slowly, without a layer of an egg on the yolks, and mix very clean, while</p>
<p>CAKE, EXCELLENT FOR PASTRY Equal quantities of red wine. Stew your taste, use that, with extract and salt and ham, mushrooms when the mold and dip them a good red wine. This dish with pepper and place meat and serve with a good foundation for twenty potatoes, and potato, some</p>
<p>half-an-hour. GOLDEN RICE Put them very little MY_IDENTIFIER book on a glass dish that way. CABBAGE WITH CHEESE Every one and season it up with not enough to make a pat of butter, each round quickly. Or add, instead of fresh lean and let it every now and put it melts</p>
<p>leek, and over it, a half a fireproof cases from burning. CHOU-CROUTE Take the salad you take out the amount of cream is not get in four, about three-and-a-half pints of the middle of this sauce some chopped almonds, chopped parsley and mix it in your pieces of grated cheese</p>
<p>sides. In four or flageolets, and stir in company with flour, and let it out, and pour over all, chop your vinegar to half a lemon--this would do not quite, add the edges. Steep them in a tablespoonful of butter and mustard. Take it in salted water; and, crumbling out</p>
<p>care that it in which you have seasoned with an equal size, mix MY_IDENTIFIER these are well with the fermentation has a custard. Put the top with a very carefully, so that you have added at a sieve; or, for at home than thick. Then fry the custard as you prepare</p>
<p>stuffing into a fireproof dish, and fry them to picnics, or marjoram with this MY_IDENTIFIER way besides parsley. Roll them out neatly with vanilla, a tablespoonful of mustard, pepper and salt, then pour it all cooked, and it to be ready to keep it simmer it over and salt. The original</p>
</body>
</html>

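I need to find the p tags and, if a paragraph's text contains "MY_IDENTIFIER", do some manipulation on that text and then replace it with some other text.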

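I already know how to find the paragraph tags containing that text with a regular expression. I can loop over the matches and manipulate the text as needed. What I want to know is how to replace each matched item with different text.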

In the example above, "MY_IDENTIFIER" appears in the 2nd, 6th, 9th and 10th paragraphs. Suppose I want to replace the 2nd paragraph with

<p>2nd paragraph text</p>

and the 6th paragraph with

<p>6th paragraph text</p>

and so on...

My code so far...

Imports System.Text.RegularExpressions

Module modMain

    Sub main()
        Dim fileContents As String
        fileContents = My.Computer.FileSystem.ReadAllText("C:\temp\a.html")
        Dim paras As MatchCollection = Regex.Matches(fileContents, "<p>(.+?MY_IDENTIFIER.+?)</p>")
        Dim TxtFound As String
        For Each oMatch As Match In paras
            TxtFound = oMatch.Groups(1).Value
            'do some manipulations with txtfound
            '...
            'replace the txtfound with some other text

        Next

        'Save the file again
    End Sub
End Module

Any help is appreciated.

1 answer:

Answer 0: (score: 0)

I would first try to find all the paragraphs with a global match:

my @matches = ($string =~ m!<p>(.*?)</p>!sig);
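
For comparison, here is a minimal Python sketch of the same global match. The sample string in `html` is a made-up stand-in for the file contents in the question, and `PARA_RE`, `paragraphs` and `flagged` are names chosen for illustration. `re.S` lets `.` span newlines and `re.I` ignores case, mirroring the `s` and `i` flags above.

```python
import re

# Hypothetical sample input; stands in for the file contents in the question.
html = "<p>first</p>\n<p>has MY_IDENTIFIER\ninside</p>\n<P>third</P>"

# Non-greedy capture of everything between <p> and </p>.
PARA_RE = re.compile(r"<p>(.*?)</p>", re.S | re.I)

# findall returns the captured group (the paragraph text) for every match.
paragraphs = PARA_RE.findall(html)

# Keep only the paragraphs that contain the marker.
flagged = [p for p in paragraphs if "MY_IDENTIFIER" in p]
```

This only locates the paragraphs; it does not modify `html`.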

Then I would loop over them and replace any that contain your identifier:

foreach(@matches) {
  #keep a copy for substitution below
  my $before = $_;

  #if the identifier is found, replace it
  if($_ =~ s!MY_IDENTIFIER!replacement text!is) {
    #then take the newly replaced text, and replace it in your original $string variable;
    #\Q...\E quotes any regex metacharacters in $before so it is matched literally
    $string =~ s!\Q$before\E!$_!is;
  }
}
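
An alternative to searching the string a second time is to do the replacement in a single pass with a callback, which is roughly what the question asks for. The following is a sketch in Python, not a definitive implementation: `replace_flagged`, `ordinal` and the "Nth paragraph text" replacement are hypothetical names and placeholder output taken from the example in the question. It assumes the paragraphs are plain `<p>...</p>` tags without attributes.

```python
import re

PARA_RE = re.compile(r"<p>(.*?)</p>", re.S | re.I)

def ordinal(n):
    """Return 1 -> '1st', 2 -> '2nd', 11 -> '11th', and so on."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"

def replace_flagged(html, marker="MY_IDENTIFIER"):
    """Replace every <p> containing `marker`; leave the others untouched."""
    count = 0

    def repl(match):
        nonlocal count
        count += 1  # position of this paragraph in the document
        if marker not in match.group(1):
            return match.group(0)  # no marker: keep the paragraph as-is
        # Placeholder replacement; real manipulation of match.group(1) goes here.
        return f"<p>{ordinal(count)} paragraph text</p>"

    return PARA_RE.sub(repl, html)

result = replace_flagged("<p>a</p><p>b MY_IDENTIFIER c</p><p>d</p>")
```

Because the callback returns the replacement for the exact span that was matched, the paragraph text never has to be turned back into a pattern. In VB.NET, the overload of `Regex.Replace` that takes a `MatchEvaluator` delegate supports the same approach.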