Finding multi-line text blocks with regular expressions in R and RStudio

Time: 2015-10-01 19:26:48

Tags: regex r

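I'm comfortable with regular expressions in PHP, but now I'm completely stuck with a new programming language, R. A working example of the expression is at https://regex101.com/r/mO1yR3/2. What I want to do is find and replace the text blocks whose header line contains (Mini). I simply need to remove those blocks from the text and save the result to a file.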

I've spent a day looking for a solution and I'm going around in circles. With PHP, Perl, or Python this would have been much faster. The R code I'm using:

library(readr)
library(stringr)
contentsTosCSV <- read_file("d:/ProVallue/Provalue Group/BackTesting/SPY/2015/test.txt")
contentsTosCSV <- str_replace_all(contentsTosCSV, '\\r', '')#deleting \r 
matches <- grep('|(^.*\\(Mini\\)(?s).*?\\n\\n)|mg', contentsTosCSV, ignore.case = FALSE, perl = TRUE, value = TRUE, fixed = FALSE, useBytes = FALSE, invert = FALSE)
print(matches)

It matches the entire string in contentsTosCSV. Then I tried these:

matches <- grep('(?m)(^.*\\(Mini\\)(?s).*?\\n\\n)', contentsTosCSV, ignore.case = TRUE, perl = TRUE, value = TRUE, fixed = FALSE, useBytes = FALSE, invert = FALSE)
print(matches)

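That attempt uses the inline (?m) and (?s) flags in place of the trailing m modifier. And without the modifiers, using [.\n]*? in place of .*?: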

matches <- grep('(^.*\\(Mini\\)[.\\n]*?\\n\\n)', contentsTosCSV, ignore.case = TRUE, perl = TRUE, value = TRUE, fixed = FALSE, useBytes = FALSE, invert = FALSE)
print(matches)

Sample text:

Last,Net Chng,Volume,Open,High,Low
209.79,-.71,"113,965,728",210.46,210.53,208.65

JUL4 15  (-10)  100 (Weeklys)
,,Mark,Last,Delta,Impl Vol,Open.Int,Volume,Bid,Ask,Exp,Strike,Bid,Ask,Mark,Last,Delta,Impl Vol,Open.Int,Volume,,
,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,

SEP 15  (46)  100
,,Mark,Last,Delta,Impl Vol,Open.Int,Volume,Bid,Ask,Exp,Strike,Bid,Ask,Mark,Last,Delta,Impl Vol,Open.Int,Volume,,
,,129.790,127.28,1.00,0.00%,0,0,129.55,129.87,SEP 15,80,0,.01,.005,.01,.00,81.60%,"2,964",0,,
,,.005,.01,.00,14.71%,"4,563",0,0,.01,SEP 15,245,36.19,36.52,36.355,0,-.94,27.65%,0,0,,
,,.005,.02,.00,16.54%,"2,473",0,0,.01,SEP 15,250,41.19,41.49,41.340,38.87,-.94,30.20%,118,0,,

SEP 15  (46)  10 (Mini)
,start,Mark,Last,Delta,Impl Vol,Open.Int,Volume,Bid,Ask,Exp,Strike,Bid,Ask,Mark,Last,Delta,Impl Vol,Open.Int,Volume,,
,,20.165,15.70,.91,21.64%,1,0,17.75,22.58,SEP 15,190,.52,4.99,2.755,2.22,-.19,32.90%,26,0,,
,,19.165,0,.91,20.79%,0,0,16.80,21.53,SEP 15,191,0,4.99,2.495,0,-.19,30.53%,0,0,,
,,18.230,21.31,.90,20.46%,2,0,15.83,20.63,SEP 15,192,0,4.99,2.495,2.90,-.19,29.45%,6,0,,end

SEP5 15  (58)  100 (Quarterlys)
,,Mark,Last,Delta,Impl Vol,Open.Int,Volume,Bid,Ask,Exp,Strike,Bid,Ask,Mark,Last,Delta,Impl Vol,Open.Int,Volume,,
,,134.790,132.33,1.00,0.00%,0,0,134.54,134.88,SEP5 15,75,0,.02,.010,.01,.00,81.69%,"2,375",0,,
,,129.790,127.37,1.00,0.00%,0,0,129.54,129.88,SEP5 15,80,0,.02,.010,.01,.00,76.86%,620,0,,

OCT 15  (74)  100
,,Mark,Last,Delta,Impl Vol,Open.Int,Volume,Bid,Ask,Exp,Strike,Bid,Ask,Mark,Last,Delta,Impl Vol,Open.Int,Volume,,
,,73.790,0,1.00,0.00%,0,0,73.56,73.89,OCT 15,136,.01,.03,.020,0,.00,33.93%,0,0,,
,,72.790,0,1.00,0.00%,0,0,72.57,72.89,OCT 15,137,.01,.03,.020,0,.00,33.35%,0,0,,
,,71.790,0,1.00,0.00%,0,0,71.57,71.89,OCT 15,138,.01,.04,.025,.04,.00,33.54%,300,0,,
,,70.790,0,1.00,0.00%,0,0,70.57,70.90,OCT 15,139,.02,.04,.030,0,.00,33.62%,0,0,,

1 answer:

Answer 0 (score: 1)

I started from your regex101 example and modified your regular expression using the [\s\S] trick.

You can use a regular expression like this:

[\s\S]
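
A minimal sketch of how that trick might look in R, assuming the file has been read into a single string as in the question. The full pattern below is an assumption assembled from the question's own attempts, not necessarily the exact expression behind the demo link, and `txt`, `pattern`, and `blocks` are illustrative names. Note that `grep(..., value = TRUE)` returns whole elements of the input vector that contain a match, which is why a one-element input comes back in full; `gregexpr()` together with `regmatches()` returns only the matched substrings.

```r
# Small stand-in for the file contents (one string, "\n" line endings).
txt <- paste(
  "SEP 15  (46)  100",
  ",,129.790,127.28",
  "",
  "SEP 15  (46)  10 (Mini)",
  ",start,Mark,Last",
  ",,18.230,21.31,end",
  "",
  "OCT 15  (74)  100",
  ",,73.790,0",
  "",
  sep = "\n"
)

# (?m) lets ^ match at the start of every line; [\s\S] matches any
# character, newlines included, so no (?s) flag is needed.
# Assumed pattern: header line containing "(Mini)" up to the next blank line.
pattern <- "(?m)^.*\\(Mini\\)[\\s\\S]*?\\n\\n"

# Extract only the matching blocks instead of the whole input string.
blocks <- regmatches(txt, gregexpr(pattern, txt, perl = TRUE))[[1]]
cat(blocks)
```

Because the pattern ends in `\n\n`, a (Mini) block at the very end of the file with no trailing blank line would not be matched by this sketch.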

You can find the updated version here:

Working demo
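
To finish the task described in the question (drop the matching blocks and save what remains), one possible sketch uses `gsub()` with `perl = TRUE`. The pattern is an assumed expression built around `[\s\S]`, and the sample string and the temporary output path are placeholders rather than the asker's real data or file.

```r
# Stand-in for the contents read from the input file.
txt <- "A 15  100\n,,1,2\n\nB 15  10 (Mini)\n,start\n,,3,4,end\n\nC 15  100\n,,5,6\n"

# Normalise Windows line endings first, as the question's code does.
txt <- gsub("\r", "", txt, fixed = TRUE)

# Delete every block whose header line contains "(Mini)", up to and
# including the blank line that ends the block (assumed pattern).
pattern <- "(?m)^.*\\(Mini\\)[\\s\\S]*?\\n\\n"
cleaned <- gsub(pattern, "", txt, perl = TRUE)

# Write the result; tempfile() is a placeholder for the real output path.
out_path <- tempfile(fileext = ".txt")
cat(cleaned, file = out_path)
```

`gsub()` is used here rather than `grep()` because `grep()` only reports which vector elements contain a match, while `gsub()` rewrites the text itself.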