How do I use a regular expression on a string in Java?

Date: 2014-04-03 11:00:47

Tags: java regex

Hi, I have a string that I want to break up using a regular expression or any other method.

My string is:

  1 Agra Achhnera NIL
  2 Agra Agra NIL
  3 Agra Fatehabad NIL
  4 Agra Fatehpur Sikri NIL
  5 Aligarh Aligarh 1300.00
  6 Aligarh Khair 1300.00
  7 Ambedkar Nagar Akbarpur NIL
  8 Ambedkar Nagar Tanda Akbarpur 1478.00

I want the resulting string to look like this:

1 Agra Achhnera NIL
2 Agra Agra NIL
3 Agra Fatehabad NIL
4 Agra FatehpurSikri NIL
5 Aligarh Aligarh 1300.00
6 Aligarh Khair 1300.00
7 AmbedkarNagar Akbarpur NIL
8 AmbedkarNagar TandaAkbarpur 1478.00

How can I achieve this?

My Java code:

<%@page import="com.gargoylesoftware.htmlunit.BrowserVersion"%>
<%@page import="org.openqa.selenium.By"%>
<%@page import="org.openqa.selenium.WebDriver"%>
<%@page import="org.openqa.selenium.WebElement"%>
<%@page import="org.openqa.selenium.firefox.FirefoxDriver"%>
<%@page import="org.openqa.selenium.htmlunit.HtmlUnitDriver"%>
<%@page import="org.openqa.selenium.support.ui.Select"%>

<%
    WebDriver driver = new HtmlUnitDriver(BrowserVersion.getDefault());
    String sDate = "27/03/2014";

    String url = "http://www.upmandiparishad.in/commodityWiseAll.aspx";
    driver.get(url);
    Thread.sleep(5000);

    new Select(driver.findElement(By.id("ctl00_ContentPlaceHolder1_ddl_commodity"))).selectByVisibleText("Jo");
    driver.findElement(By.id("ctl00_ContentPlaceHolder1_txt_rate")).sendKeys(sDate);

    Thread.sleep(3000);
    driver.findElement(By.id("ctl00_ContentPlaceHolder1_btn_show")).click();
    Thread.sleep(5000);

    WebElement findElement = driver.findElement(By.id("ctl00_ContentPlaceHolder1_GridView1"));
    String htmlTableText = findElement.getText();
    // Do whatever you want from here; these are the raw table values.
    htmlTableText = htmlTableText.replace("S.No.DistrictMarketPrice", "");
    htmlTableText = htmlTableText.replaceAll("\\s(\\d+\\s[A-Z])", "<br>$1");
    //System.out.println(htmlTableText);
    String data[] = htmlTableText.split("");
    out.println(data[9]);

    driver.close();
    driver.quit();
%>

Thanks in advance.

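1 Answer:

Answer 0: (score: 0)

This will work for your particular case, because the data you are scraping comes in this format and the regex below handles most of it. I'm not saying this is the perfect answer, but it should get you going: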

Find

([a-z])\s([A-Z])

and replace with \1\2 (in Java's String.replaceAll the same replacement is written $1$2)

Demo: http://regex101.com/r/kF8vG6
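
For reference, here is a minimal Java sketch of the find-and-replace described above. The class name JoinNameWords and the method joinWords are made up for illustration and are not part of the question's code. Note that the pattern removes the whitespace at every lowercase-to-uppercase boundary, so running it over a whole row would also glue neighbouring columns together (for example "Agra Fatehpur"). The sketch therefore assumes you apply it to one place name at a time.

```java
import java.util.regex.Pattern;

class JoinNameWords {
    // Group 1: the lowercase letter before the whitespace.
    // Group 2: the uppercase letter after it.
    private static final Pattern LOWER_SPACE_UPPER =
            Pattern.compile("([a-z])\\s([A-Z])");

    // Hypothetical helper: drops the single whitespace character that sits
    // between a lowercase letter and an uppercase letter.
    static String joinWords(String name) {
        // In Java the replacement string refers to groups as $1 and $2.
        return LOWER_SPACE_UPPER.matcher(name).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        System.out.println(joinWords("Fatehpur Sikri"));  // FatehpurSikri
        System.out.println(joinWords("Ambedkar Nagar"));  // AmbedkarNagar
        System.out.println(joinWords("Tanda Akbarpur"));  // TandaAkbarpur
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regex on every call, which is what String.replaceAll would do.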