I have been looking for a way to find a string in a sentence that matches a compound (the same words with the whitespace removed) and to remove it from the sentence. For example:
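Sentence - Hello there, how are you?
Compound - howareyou
So I want to use this compound to remove "how are you" from the sentence, which leaves "Hello there, ?".
My current solution splits the sentence into words and checks whether the compound contains each word. It does not work properly, because any other word that happens to occur inside the compound is removed as well. For example, if we look for "foreseenfuture" in the string "I have foreseen future for all of you", my solution also removes "for", because it occurs inside the compound.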
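A minimal sketch of the word-by-word approach described above may make the failure easier to see. It is a reconstruction under assumptions rather than the asker's actual code: the class name NaiveCompoundRemover and the method removeWords are hypothetical, and words are assumed to be separated by whitespace only.

```java
import java.util.ArrayList;
import java.util.List;

final class NaiveCompoundRemover {
    // Hypothetical reconstruction of the flawed approach: drop every
    // whitespace-separated word that occurs anywhere inside the compound.
    static String removeWords(String sentence, String compound) {
        List<String> kept = new ArrayList<>();
        for (String word : sentence.split("\\s+")) {
            // "for" is a substring of "foreseenfuture", so it is dropped too.
            if (!compound.contains(word)) {
                kept.add(word);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        System.out.println(
            removeWords("I have foreseen future for all of you", "foreseenfuture"));
    }
}
```

With the sentence from the question this prints "I have all of you": the standalone "for" disappears along with "foreseen" and "future", because each word is tested on its own instead of as part of one contiguous match.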
So, is there another way to solve this?
Answer 0: (score: 0)
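I'll assume that compounding only removes whitespace. Under that assumption, "for, seen future. for seen future" becomes "for, seen future. ", because the comma breaks up the first occurrence of the compound. In that case, this should work: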
public class CompoundRemoval {
    public static void main(String[] args) {
        String example1 = "how are you?";
        String example2 = "how, are you... here?";
        String example3 = "Madam, how are you finding the accommodations?";
        String example4 = "how are you how are you how are you taco";
        String compound = "howareyou";

        StringBuilder compoundRegexBuilder = new StringBuilder();
        // Match a word boundary before the first character of the compound
        compoundRegexBuilder.append("\\b");
        // Insert each character of the compound into the regex
        for (int i = 0; i < compound.length(); i++) {
            compoundRegexBuilder.append(compound.charAt(i));
            // Between two characters there may be any amount of whitespace
            if (i < compound.length() - 1) {
                compoundRegexBuilder.append("\\s*");
            }
        }
        // Make sure the last character is not part of a larger word
        compoundRegexBuilder.append("\\b");
        String compoundRegex = compoundRegexBuilder.toString();

        System.out.println(compoundRegex);
        System.out.println("Example 1:\n" + example1 + "\n" + example1.replaceAll(compoundRegex, ""));
        System.out.println("\nExample 2:\n" + example2 + "\n" + example2.replaceAll(compoundRegex, ""));
        System.out.println("\nExample 3:\n" + example3 + "\n" + example3.replaceAll(compoundRegex, ""));
        System.out.println("\nExample 4:\n" + example4 + "\n" + example4.replaceAll(compoundRegex, ""));
    }
}
The output is:
\bh\s*o\s*w\s*a\s*r\s*e\s*y\s*o\s*u\b
Example 1:
how are you?
?

Example 2:
how, are you... here?
how, are you... here?

Example 3:
Madam, how are you finding the accommodations?
Madam,  finding the accommodations?

Example 4:
how are you how are you how are you taco
   taco
You can also use this to match any other alphanumeric compound.
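If the compound may contain characters that are special in a regex (a dot, for instance), one possible variation is to quote each character before inserting it. The sketch below is an illustration under assumptions, not part of the answer above: the class name CompoundPattern and its methods are hypothetical, and the compound is still assumed to start and end with a letter or digit so that the \b anchors behave as before.

```java
import java.util.regex.Pattern;

final class CompoundPattern {
    // Builds a pattern that matches the compound's characters in order,
    // allowing optional whitespace between any two of them.
    static Pattern forCompound(String compound) {
        StringBuilder regex = new StringBuilder("\\b");
        int[] codePoints = compound.codePoints().toArray();
        for (int i = 0; i < codePoints.length; i++) {
            if (i > 0) {
                regex.append("\\s*");
            }
            // Quote each character so regex metacharacters are matched literally.
            String ch = new String(Character.toChars(codePoints[i]));
            regex.append(Pattern.quote(ch));
        }
        regex.append("\\b");
        return Pattern.compile(regex.toString());
    }

    // Removes every occurrence of the spaced-out compound from the sentence.
    static String remove(String sentence, String compound) {
        return forCompound(compound).matcher(sentence).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(remove("Hello there, how are you?", "howareyou"));
        System.out.println(remove("I like web 2.0 a lot", "web2.0"));
    }
}
```

Without the quoting, the "." in "web2.0" would match any character, so a string such as "web 2x0" would be removed as well. As in the answer's version, the whitespace around a removed match is left in place.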