How to extract a string using a regular expression in C#

Asked: 2015-06-27 10:06:29

Tags: c# regex

My code is as follows:

<owl:ObjectProperty rdf:about="http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#ہیز_پرائس">
    <rdfs:range rdf:resource="http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#پرائس"/>
    <owl:inverseOf rdf:resource="http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#پرائس_فار"/>
    <rdfs:domain rdf:resource="http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#کار_ایڈ"/>
</owl:ObjectProperty>

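This is the RDF/XML format of an ontology. I want to use a regular expression to extract ہیز_پرائس and پرائس_فار. Could someone tell me which regular expression to use?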

3 answers:

Answer 0 (score: 0)

Try this expression:

(?<=#).*?(?=\")

You can use this website to build and test regular expressions easily:

https://regex101.com/
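
The answer gives only the pattern, so here is a minimal sketch of how a lookbehind/lookahead expression of this kind might be applied in C#. The class and method names (FragmentExtractor, ExtractFragments) are hypothetical and not part of the original answer. The sketch also assumes that each attribute value contains exactly one '#'.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class FragmentExtractor
{
    // Hypothetical helper (not from the original answer).
    // Returns every run of text that follows a '#' and ends just before the next '"'.
    public static string[] ExtractFragments(string rdfXml)
    {
        // (?<=#) : the match must be preceded by '#', which is not included in the match
        // .*?    : as few characters as possible
        // (?=\") : the match must be followed by '"', which is not included in the match
        return Regex.Matches(rdfXml, "(?<=#).*?(?=\")")
            .Cast<Match>()
            .Select(m => m.Value)
            .ToArray();
    }
}
```

Applied to the snippet in the question, this returns all four fragments (including پرائس and کار_ایڈ), so the result would still need filtering if only two of them are wanted.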

Answer 1 (score: 0)

Try this

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string input =
                "<owl:ObjectProperty rdf:about=\"http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#ہیز_پرائس\">" +
                "<rdfs:range rdf:resource=\"http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#پرائس\"/>" +
                "<owl:inverseOf rdf:resource=\"http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#پرائس_فار\"/>" +
                "<rdfs:domain rdf:resource=\"http://www.semanticweb.org/rukhsana/ontologies/2015/5/untitled-ontology-21#کار_ایڈ\"/>" +
                "</owl:ObjectProperty>";

            // Named groups: tag (text between ':' and '='), url (text up to '#'), arabic (fragment after '#').
            string pattern = ":(?'tag'[^=]+)=\"(?'url'[^#]+)#(?'arabic'[^\"]+)";

            Regex expr = new Regex(pattern, RegexOptions.Singleline);

            MatchCollection matches = expr.Matches(input);
            foreach(Match match in matches)
            {
                Console.WriteLine("tag : {0}; url : {1}; arabic : {2}", match.Groups["tag"].Value, match.Groups["url"].Value, match.Groups["arabic"].Value);
            }
            Console.ReadLine();

        }
    }
}

Answer 2 (score: 0)

Assuming you want to extract the Arabic text between '21#' and '"' and join the pieces with commas, you can use regex Groups:

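// Requires: using System.Linq; using System.Text.RegularExpressions;
// xml is assumed to hold the RDF/XML text from the question.
var matches = String.Join(",", Regex
    // search for text between 21# and " using a lazy capture group
    .Matches(xml, "21#(.*?)\"")
    // convert MatchCollection to IEnumerable<Match>
    .Cast<Match>()
    // select only the first capture group, which contains just the required text
    .Select(m => m.Groups[1].Value));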
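
The snippet above joins all four fragments, while the question asks only for ہیز_پرائس and پرائس_فار. A possible variation is sketched below. It assumes the wanted values are exactly the fragments in the rdf:about attribute and in the owl:inverseOf element. That is a guess at the asker's intent, not something the thread confirms. The names PropertyNameExtractor and ExtractNames are hypothetical.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class PropertyNameExtractor
{
    // Hypothetical helper. Assumption: only the fragment of rdf:about and the
    // fragment of <owl:inverseOf rdf:resource="..."> are wanted.
    public static string ExtractNames(string rdfXml)
    {
        // Match either attribute, skip the URL up to '#', capture the fragment up to the closing quote.
        const string pattern =
            "(?:rdf:about|<owl:inverseOf\\s+rdf:resource)=\"[^\"#]*#([^\"]+)\"";

        return string.Join(",", Regex.Matches(rdfXml, pattern)
            .Cast<Match>()
            // Group 1 holds only the fragment after '#'.
            .Select(m => m.Groups[1].Value));
    }
}
```

Unlike the '21#' pattern, this does not depend on the ontology name ending in 21, but it is tied to the two element and attribute names instead. For anything beyond a one-off extraction, an XML parser is usually more robust than a regular expression.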