Check a string against multiple regex patterns

Time: 2017-08-27 18:54:32

Tags: c# regex

I'm trying to run two different regex patterns against a string in C#, to check whether the string is a myspace.com or a last.fm URL.

I got it working for a single regex with the following code:

public string check(string value)
{
    // Matches "myspace.com/" followed by an optional profile name.
    Regex DescRegexShortened = new Regex(@"myspace\.com\/([a-zA-Z0-9_-]*)");
    Match DescRegexShortenedMatch = DescRegexShortened.Match(value);

    string url = "";

    if (DescRegexShortenedMatch.Success)
    {
        // Return the matched part of the input, e.g. "myspace.com/name".
        url = DescRegexShortenedMatch.Value;
    }

    return url;
}

Now my question: is there a simpler way to check whether the string is either a myspace.com or a last.fm URL?

For example:

if the string matches regex1 or regex2 then ...

2 answers:

Answer 0: (score: 2)

Maybe this is too obvious:

var rx1 = new Regex(@"myspace\.com\/([a-zA-Z0-9_-]*)");
var rx2 = new Regex(@"last\.fm\/([a-zA-Z0-9_-]*)");

if ( rx1.IsMatch(value) || rx2.IsMatch(value) )
{
    // Do something.
}
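
If the matched URL is also needed, as in the question's check method, the same idea extends to a loop over several patterns. This is only a minimal sketch: the class name UrlChecker and the Patterns array are illustrative assumptions, not part of the original answer.

```csharp
using System.Text.RegularExpressions;

public static class UrlChecker
{
    // Hypothetical pattern list; add further entries as needed.
    private static readonly Regex[] Patterns =
    {
        new Regex(@"myspace\.com\/([a-zA-Z0-9_-]*)"),
        new Regex(@"last\.fm\/([a-zA-Z0-9_-]*)"),
    };

    // Returns the text matched by the first pattern (in array order)
    // that matches, or "" when no pattern matches.
    public static string Check(string value)
    {
        foreach (Regex rx in Patterns)
        {
            Match m = rx.Match(value);
            if (m.Success)
            {
                return m.Value;
            }
        }
        return "";
    }
}
```

Note that the array order decides which pattern wins when a string contains both domains, not the position of the match within the string.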

Answer 1: (score: 0)

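Instead of checking multiple regular expressions, you can use alternation within a single regular expression to check for the different values:

var regex = new Regex(@"(?<domain>myspace\.com|last\.fm)/[a-zA-Z0-9_-]*");
var match = regex.Match(value);
if (match.Success) {
    var domain = match.Groups["domain"].Value;
    // ...
}
else {
    // No match
}

In this case the alternation is myspace\.com|last\.fm, which matches either myspace.com or last.fm. The alternation sits inside a named group, domain, and if the regular expression matches you can access the value of that named group as the code does.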

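Instead of using a regular expression, you could simply check whether the string starts with myspace.com or last.fm. Alternatively, if the URLs are real URLs with proper syntax such as http://myspace.com/bla, you could create an instance of the Uri class and then check its Host property.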
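
The Uri approach could look roughly like the sketch below. The class and method names (HostChecker, IsKnownHost), the restriction to http/https, and the handling of a leading "www." are assumptions made for illustration; the answer itself only mentions the Uri class and its Host property.

```csharp
using System;

public static class HostChecker
{
    // Returns true when value parses as an absolute http(s) URL whose host is
    // myspace.com or last.fm, optionally prefixed with "www." (an assumption).
    public static bool IsKnownHost(string value)
    {
        if (!Uri.TryCreate(value, UriKind.Absolute, out Uri uri))
        {
            return false; // not a well-formed absolute URL
        }
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
        {
            return false; // ignore mailto:, file:, and other schemes
        }

        string host = uri.Host;
        if (host.StartsWith("www.", StringComparison.OrdinalIgnoreCase))
        {
            host = host.Substring(4); // drop the "www." prefix
        }

        return host.Equals("myspace.com", StringComparison.OrdinalIgnoreCase)
            || host.Equals("last.fm", StringComparison.OrdinalIgnoreCase);
    }
}
```

Because the comparison is made on the whole host name, a look-alike such as http://fakemyspace.com/bla is rejected without any extra pattern work.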

If you do want to use a regular expression, you should probably change it so that it does not match domains like fakemyspace.com but still matches MYSPACE.COM.
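
One possible way to do that, sketched here under assumptions, is to add a negative lookbehind in front of the domain group and pass RegexOptions.IgnoreCase. The lookbehind character class and the names DomainMatcher and MatchDomain are choices made for this example, not something the answer specifies.

```csharp
using System.Text.RegularExpressions;

public static class DomainMatcher
{
    // (?<![a-zA-Z0-9-]) rejects a domain that is directly preceded by a letter,
    // digit, or hyphen, so "fakemyspace.com" fails while "www.myspace.com"
    // and "http://myspace.com" still match. IgnoreCase accepts "MYSPACE.COM".
    private static readonly Regex DomainRegex = new Regex(
        @"(?<![a-zA-Z0-9-])(?<domain>myspace\.com|last\.fm)/[a-zA-Z0-9_-]*",
        RegexOptions.IgnoreCase);

    // Returns the domain as it appears in the input, or "" when nothing matches.
    public static string MatchDomain(string value)
    {
        Match match = DomainRegex.Match(value);
        return match.Success ? match.Groups["domain"].Value : "";
    }
}
```

This pattern still matches the domain anywhere in the string, including inside the path of another site's URL, so checking the Host property of a parsed Uri remains the stricter option when the input is a full URL.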