I need to extract some data from a website and then store some of the values in variables.
Here is the code I have so far:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

public class Principal {
    public static void main(String[] args) throws IOException {
        URL url = new URL("http://www.numbeo.com/cost-of-living/country_result.jsp?country=Turkey");
        URLConnection yc = url.openConnection();
        BufferedReader in = new BufferedReader(new InputStreamReader(yc.getInputStream()));
        String inputLine;
        String valor;
        while ((inputLine = in.readLine()) != null) {
            if (inputLine.contains("Milk")) {
                System.out.println("Encontrei! " + inputLine);
                // Skips past "priceValue"> and the following space; the result still ends with " TL</td>"
                valor = inputLine.substring(inputLine.lastIndexOf("\"priceValue\">") + 14);
                System.out.println("valor:" + valor);
            }
        }
        in.close();
    }
}
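The first matching input line prints: <tr class="tr_standard"><td>Milk (regular), (1 liter) </td> <td style="text-align: right" class="priceValue"> 2.45 TL</td>
Now I need to extract "2.45".
How can I do that? I have tried a few regular expressions but could not get them to work.
Sorry for my poor English.
Thanks in advance.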
Answer 0: (score: 2)
You can try the following regular expression:
(?:class="priceValue">\s*)(\d*\.\d+)
It looks for the string class="priceValue"> followed by the price, which is captured in group 1.
Here are a DEMO and an explanation.
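To show how this expression could be applied from Java, here is a minimal sketch that runs it against the sample line from the question. The class name PriceRegexDemo and the helper extractPrice are hypothetical names chosen for illustration; only the pattern itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PriceRegexDemo {
    // The regex from the answer: a non-capturing prefix, then the price in group 1.
    private static final Pattern PRICE =
            Pattern.compile("(?:class=\"priceValue\">\\s*)(\\d*\\.\\d+)");

    // Hypothetical helper: returns the first price found in the line, or null if none.
    public static String extractPrice(String line) {
        Matcher m = PRICE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Sample line copied from the question.
        String line = "<tr class=\"tr_standard\"><td>Milk (regular), (1 liter) </td> "
                + "<td style=\"text-align: right\" class=\"priceValue\"> 2.45 TL</td>";
        System.out.println("valor: " + extractPrice(line)); // prints "valor: 2.45"
    }
}
```

Note that the pattern requires a decimal point, so a price written without one (for example "15 TL") would not match; if such values can occur, the fractional part would have to be made optional.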
Answer 1: (score: 2)
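I know you asked for a regular expression, but consider making your life easier by parsing the HTML as if it were a structured XML document rather than a plain string. There are libraries that handle this for you, so you do not have to worry about text formatting, legal line breaks, and similar details: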
<dependency>
    <groupId>org.jsoup</groupId>
    <artifactId>jsoup</artifactId>
    <version>1.7.1</version>
</dependency>
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;
import java.io.IOException;

public class HtmlParser {
    public static void main(String[] args) {
        Document doc;
        try {
            doc = Jsoup.connect("http://www.numbeo.com/cost-of-living/country_result.jsp?country=Turkey").get();
            Elements rows = doc.select("table.data_wide_table tr.tr_standard"); // CSS selector to find all table rows
            for (Element row : rows) {
                System.out.println("Item name: " + row.child(0).text()); // Milk will be here somewhere
                System.out.println(" Item price by column number: " + row.child(1).text());
                System.out.println(" Item price by column class: " + row.getElementsByAttributeValue("class", "priceValue").get(0).text());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
/**
Output:
Item name: Meal, Inexpensive Restaurant
Item price by column number: 15.00 TL
Item price by column class: 15.00 TL
Item name: McMeal at McDonalds (or Equivalent Combo Meal)
Item price by column number: 15.00 TL
Item price by column class: 15.00 TL
...
*/
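The price text in the output above still carries the currency suffix (for example "15.00 TL"), while the question asks for the bare number. One possible way to finish the job with only the standard library is sketched below. The class name PriceTextParser and the method parsePrice are hypothetical, and the sketch assumes prices use "." as the decimal separator.

```java
import java.math.BigDecimal;

public class PriceTextParser {
    // Hypothetical helper: keeps only digits and the decimal point, then parses.
    // Assumes "." is the decimal separator; throws NumberFormatException
    // if the text contains no number at all.
    public static BigDecimal parsePrice(String cellText) {
        String numeric = cellText.replaceAll("[^0-9.]", "");
        return new BigDecimal(numeric);
    }

    public static void main(String[] args) {
        System.out.println(parsePrice("15.00 TL")); // prints 15.00
        System.out.println(parsePrice(" 2.45 TL")); // prints 2.45
    }
}
```

BigDecimal is used here instead of double so that monetary values keep their exact decimal representation; the string passed in would be whatever row.child(1).text() returns for the row of interest.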