Getting java.lang.StackOverflowError on String.replaceAll in Java

Time: 2018-05-09 13:50:52

Tags: java regex string replace

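I need to replace 'OR' with '||' in a given string. It should be replaced only when it appears in the input string as a complete word on its own. In addition, it must not be replaced when it appears inside quotes. For example, if the input string is

application.path = "EXCEL.exe" OR application.path = "EXCELSIOR.exe" OR application.path = "XYZ OR ABC.exe"

the output should be

application.path = "EXCEL.exe" || application.path = "EXCELSIOR.exe" || application.path = "XYZ OR ABC.exe"

Note that the OR in EXCELSIOR.exe and the OR in "XYZ OR ABC.exe" are not replaced.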

The Java code I am using is as follows:

// The literal is split into two concatenated pieces because a Java string literal cannot span lines.
String inputStr = "(quote.AGE was 24 AND (application.path = \"**\\acad.exe\" OR application.path = \"**\\dxfdwg.exe\" OR application.path = \"**\\EXCELSIOR.EXE\" OR application.path = \"**\\iges.exe\" OR application.path = \"**\\notepad.exe\" OR application.path = \"**\\run_journal.exe\" OR application.path = \"**\\AcroRd32.exe\" OR application.path = \"**\\dllhost.exe\" OR application.path = \"**\\powerpnt.exe\" OR application.path = \"**\\Edge.exe\" OR application.path = \"**\\step203ug.exe\" OR application.path = \"**\\step214ug.exe\" OR application.path = \"**\\VisView.exe\" OR application.path = \"**\\Teamcenter.exe\" OR application.path = \"**\\ug_convert_part.exe\" OR application.path = \"**\\ugraf.exe\" OR application.path = \"**\\ugtopv.exe\" OR application.path = \"**\\wmplayer.exe\" OR application.path = \"**\\winword.exe\" OR application.path = \"**\\wordpad.exe\" OR application.path = \"**\\vlc.exe\" OR application.path = \"**\\dwgviewr.exe\" OR application.name = \"RMS\" OR application.path = \"**\\acrobat.exe\" OR application.path = \"**\\Alias.exe\" OR application.path = \"**\\awtessd.exe\" OR application.path = \"**\\proe.exe\" OR application.path = \"**\\STPViewer.exe\" OR application.path = \"**\\gom_inspect.exe\" OR application.path = \"**\\gom_cad_server2.exe\" OR application.path = \"**\\sldworks.exe\" OR application.path = \"**\\sldworks_fs.exe\" OR application.path = \"**\\sldProcMon.exe\" OR application.path = \"**\\AdapplicationMgr.exe\" OR application.path = \"**\\AdapplicationMgrSvc.exe\" OR application.path = \"**\\SE3Dtrans.exe\" OR application.path = \"**\\stamp.exe\" OR application.path = \"**\\psolid.exe\" OR application.path = \"**\\mpid.exe\" OR application.path = \"**\\mpirun.exe\" OR application.path = \"**\\FS.exe\" OR application.path = \"**\\xtop.exe\" OR application.path = \"**\\pro_comm_msg.exe\" OR application.path = \"**\\nmsd.exe\" OR application.path = \"**\\creoagent.exe\" OR application.path = \"**\\parametric.exe\" "
        + "OR application.path = \"**\\PDFEditor.exe\" OR application.path = \"**\\CNEXT.exe\" OR application.path = \"**\\drafter.exe\" OR application.path = \"**\\convert.exe\" OR application.path = \"**\\ActCut3D.exe\" OR application.path = \"**\\ppcbasic.exe\" OR application.path = \"**\\deltamesh_stamping.exe\" OR application.path = \"Xasfsf\" OR application.path = \"sfdsdf\"))";
String replacedStr = inputStr.replaceAll("(?m)\\bOR\\b(?=(?:\"[^\"]*\"|[^\"])*$)", "||");

This works for shorter strings, but once the length exceeds 2000 characters it throws the following error:

  

Exception in thread "main" java.lang.StackOverflowError
    at java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3796)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)
    at java.util.regex.Pattern$GroupHead.match(Pattern.java:4658)
    at java.util.regex.Pattern$Loop.match(Pattern.java:4785)
    at java.util.regex.Pattern$GroupTail.match(Pattern.java:4717)
    at java.util.regex.Pattern$BranchConn.match(Pattern.java:4568)
    at java.util.regex.Pattern$CharProperty.match(Pattern.java:3777)
    at java.util.regex.Pattern$Branch.match(Pattern.java:4604)

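I have read in some other threads (thread1, thread2) that Java does not handle regular expressions on long strings well. Can someone suggest how I can improve my regex to avoid the StackOverflowError?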

1 Answer:

Answer 0: (score: 1)

  

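Can someone suggest how I can improve my regex to avoid the StackOverflowError?

Yes, I can offer you two solutions; you just need to look at your problem from a different angle.

Here is a quick analysis of your problem and a quick solution. You can use this regex: (.*?\"\s+)\bOR\b(\s+application.*?)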

Solution one

String inputStr = "..."; // that long String
String regex = "(.*?\"\\s+)\\bOR\\b(\\s+application.*?)";
String replacedStr = inputStr.replaceAll(regex, "$1||$2");
System.out.println(replacedStr);

I noticed that every OR you want to replace sits between a closing " followed by a space and a space followed by application, so my regex matches exactly that OR and replaces it.

Output for the short example, which gives you the same result:

application.path="EXCEL.exe" || application.path="EXCELSIOR.exe" || application.path="XYZ OR ABC.exe"
                             ^^                                  ^^
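To try the regex from solution one on the short example, a minimal sketch could look like the following. The class name `ReplaceOrDemo` and the helper `replaceOr` are hypothetical names chosen for illustration; only the regex and the `$1||$2` replacement come from the answer. The approach relies on the assumption that every OR to be replaced is followed by the word application, so it is not a general "outside quotes" test.

```java
public class ReplaceOrDemo {
    // Regex from the answer: an OR that follows a closing quote plus whitespace
    // and is followed by whitespace plus the literal text "application".
    static final String REGEX = "(.*?\"\\s+)\\bOR\\b(\\s+application.*?)";

    // Hypothetical helper: $1 and $2 put back the text captured on either side of OR.
    static String replaceOr(String input) {
        return input.replaceAll(REGEX, "$1||$2");
    }

    public static void main(String[] args) {
        String shortInput = "application.path=\"EXCEL.exe\" OR "
                + "application.path=\"EXCELSIOR.exe\" OR "
                + "application.path=\"XYZ OR ABC.exe\"";
        // Prints:
        // application.path="EXCEL.exe" || application.path="EXCELSIOR.exe" || application.path="XYZ OR ABC.exe"
        System.out.println(replaceOr(shortInput));
    }
}
```

The OR inside "XYZ OR ABC.exe" is left alone because it is not preceded by a quote plus whitespace, and the OR inside EXCELSIOR is never a candidate because of the \b word boundaries.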

Solution two

If you are using Java 9+, you can use this regex application.path=(\"(.*?)\") to match everything of the form application.path="something here", and then collect the results joined together with ||.
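One way the match-and-collect idea of solution two might be written is sketched below. It assumes `Matcher.results()` (available since Java 9) together with `Collectors.joining`; the class name `JoinClausesDemo` and the helper `joinClauses` are hypothetical. Note that this rebuilds the string from the matched clauses only, so anything the regex does not match (for example a clause written with spaces around `=`, or any other text between clauses) would be dropped from the result.

```java
import java.util.regex.MatchResult;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class JoinClausesDemo {
    // Regex from the answer: one application.path="..." clause, no spaces around '='.
    static final Pattern CLAUSE = Pattern.compile("application.path=(\"(.*?)\")");

    // Hypothetical helper: stream every match and join the whole clauses with " || ".
    static String joinClauses(String input) {
        return CLAUSE.matcher(input)
                .results()                 // Stream<MatchResult>, Java 9+
                .map(MatchResult::group)   // group() is the entire matched clause
                .collect(Collectors.joining(" || "));
    }

    public static void main(String[] args) {
        String shortInput = "application.path=\"EXCEL.exe\" OR "
                + "application.path=\"EXCELSIOR.exe\" OR "
                + "application.path=\"XYZ OR ABC.exe\"";
        // Prints:
        // application.path="EXCEL.exe" || application.path="EXCELSIOR.exe" || application.path="XYZ OR ABC.exe"
        System.out.println(joinClauses(shortInput));
    }
}
```

Because each quoted value is consumed as part of a clause, an OR inside the quotes is never examined separately, which is why "XYZ OR ABC.exe" survives unchanged.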