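I am trying to run two different regular expression patterns against a string in C#, to check whether the string is a myspace.com or a last.fm URL.
I already have it working for a single regular expression, using the code shown below.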
My question is: is there a simpler way to check whether a string is a myspace.com or a last.fm URL? This is the single-pattern code I use now:
public string check(string value)
{
    Regex DescRegexShortened = new Regex(@"myspace\.com\/([a-zA-Z0-9_-]*)");
    Match DescRegexShortenedMatch = DescRegexShortened.Match(value);
    string url = "";
    if (DescRegexShortenedMatch.Success)
    {
        url = DescRegexShortenedMatch.Value;
    }
    return url;
}
For example:
if the string matches regex1 or regex2, then ...
Answer 0: (score: 2)
Maybe this is too obvious:
var rx1 = new Regex(@"myspace\.com\/([a-zA-Z0-9_-]*)");
var rx2 = new Regex(@"last\.fm\/([a-zA-Z0-9_-]*)");
if (rx1.IsMatch(value) || rx2.IsMatch(value))
{
    // Do something.
}
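If you need the matched text back, as the check method in the question returns it, the same idea can be wrapped in a loop over the patterns. This is only a rough sketch: the class name UrlMatcher and the method name Check are made up for illustration, and the two patterns are copied unchanged from the snippets above.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original question or answer.
public static class UrlMatcher
{
    // The two patterns from the thread, compiled once and reused.
    private static readonly Regex[] Patterns =
    {
        new Regex(@"myspace\.com\/([a-zA-Z0-9_-]*)"),
        new Regex(@"last\.fm\/([a-zA-Z0-9_-]*)"),
    };

    // Returns the text matched by the first pattern that succeeds,
    // or an empty string when none of the patterns match.
    public static string Check(string value)
    {
        foreach (Regex pattern in Patterns)
        {
            Match match = pattern.Match(value);
            if (match.Success)
            {
                return match.Value;
            }
        }
        return "";
    }
}
```

Adding a third site then only means adding one more entry to the Patterns array.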
Answer 1: (score: 0)
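Instead of checking multiple regular expressions, you can use alternation inside a single regular expression to check for the different values:
var regex = new Regex(@"(?<domain>myspace\.com|last\.fm)/[a-zA-Z0-9_-]*");
var match = regex.Match(value);
if (match.Success)
{
    var domain = match.Groups["domain"].Value;
    // ...
}
else
{
    // No match
}
In this case the alternation is myspace\.com|last\.fm, which matches either myspace.com or last.fm. The alternation sits inside the named group domain, and if the regular expression matches you can read the value of that named group as the code does.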
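Instead of using a regular expression, you could simply check whether the string starts with myspace.com or last.fm. Or, if the URLs are real URLs with proper syntax such as http://myspace.com/bla, you could create an instance of the Uri class and then check its Host property.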
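A minimal sketch of the Uri approach might look like the following. The class and method names (UrlHostCheck, IsMySpaceOrLastFm, IsDomainOrSubdomain) are hypothetical, and accepting subdomains such as www. as well as restricting the scheme to http and https are assumptions on my part, not something the question asks for.

```csharp
using System;

// Hypothetical helper illustrating the Uri.Host check.
public static class UrlHostCheck
{
    public static bool IsMySpaceOrLastFm(string value)
    {
        // TryCreate returns false on malformed input instead of throwing,
        // unlike the Uri constructor.
        if (!Uri.TryCreate(value, UriKind.Absolute, out Uri uri))
        {
            return false;
        }

        // Assumption: only web URLs are of interest.
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
        {
            return false;
        }

        return IsDomainOrSubdomain(uri.Host, "myspace.com")
            || IsDomainOrSubdomain(uri.Host, "last.fm");
    }

    // True for the domain itself or any subdomain of it (for example www.),
    // compared without regard to case.
    private static bool IsDomainOrSubdomain(string host, string domain)
    {
        return host.Equals(domain, StringComparison.OrdinalIgnoreCase)
            || host.EndsWith("." + domain, StringComparison.OrdinalIgnoreCase);
    }
}
```

Note that a string without a scheme, such as myspace.com/bla, is not an absolute URI, so this check rejects it. If your input can look like that, the regular expression or a plain prefix check may fit better.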
If you do want to use a regular expression, you should probably change it so that it does not match domains such as fakemyspace.com but still matches MYSPACE.COM.
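One possible way to do that, as a sketch rather than a definitive pattern: add RegexOptions.IgnoreCase so MYSPACE.COM matches, and put a negative lookbehind in front of the domain so it cannot be the tail of a longer name. The class name DomainRegexSketch and the choice of lookbehind are my own assumptions.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper; the pattern is one possible tightening of the regex above.
public static class DomainRegexSketch
{
    // (?<![\w-]) rejects a domain that is directly preceded by a letter, digit,
    // underscore or hyphen, so fakemyspace.com and fake-myspace.com do not match.
    // A preceding dot or slash is still allowed, so www.myspace.com and
    // http://myspace.com keep matching.
    private static readonly Regex Pattern = new Regex(
        @"(?<![\w-])(?<domain>myspace\.com|last\.fm)/[a-zA-Z0-9_-]*",
        RegexOptions.IgnoreCase);

    // Returns the matched domain as it appears in the input, or null when there is no match.
    public static string FindDomain(string value)
    {
        Match match = Pattern.Match(value);
        return match.Success ? match.Groups["domain"].Value : null;
    }
}
```

This still matches the domain anywhere in the string, including inside the path of an unrelated URL, so for strict validation the Uri and Host check is the safer choice.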