Java - Retrieving data from a website

Asked: 2012-04-07 23:35:59

Tags: java swing web data-collection

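I'm building an application that retrieves lottery numbers and displays them in a window. However, I'm not sure how to retrieve the data from the website, which in this case is: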

https://www.national-lottery.co.uk/player/p/results.ftl

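How would you go about this? I have done something similar before, but with a website that returned a data string I could use directly. I'm much less sure how to do it here. Any advice is appreciated, and the technique (if there is one) will help me with more projects!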

4 answers:

Answer 0 (score: 3)

Use Jsoup to retrieve and parse the page:

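import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

String url = "https://www.national-lottery.co.uk/player/p/results.ftl";
Document document = Jsoup.connect(url).get(); // fetches and parses the page; throws IOException
Elements tables = document.getElementsByTag("table");
// ... then work with the tables or any other element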

Answer 1 (score: 3)

The site provides a link for downloading the numbers as CSV. Just use:

https://www.national-lottery.co.uk/player/lotto/results/downloadResultsCSV.ftl

The file looks like this:

DrawDate,Ball 1,Ball 2,Ball 3,Ball 4,Ball 5,Ball 6,Bonus Ball,Ball Set,Machine
07-Apr-2012,23,12,42,16,25,31,18,6,LANCELOT
04-Apr-2012,44,23,9,40,33,26,31,2,MERLIN
31-Mar-2012,2,49,40,47,18,5,19,1,MERLIN
28-Mar-2012,16,8,39,22,3,38,26,3,MERLIN
24-Mar-2012,24,27,6,39,31,45,32,4,LANCELOT
21-Mar-2012,10,14,45,25,39,21,40,1,MERLIN
17-Mar-2012,37,40,1,3,20,16,15,2,MERLIN
14-Mar-2012,15,36,26,31,14,18,48,4,MERLIN
10-Mar-2012,12,37,23,43,3,1,33,1,MERLIN
07-Mar-2012,28,44,8,35,11,2,17,3,MERLIN
03-Mar-2012,31,20,40,28,7,23,42,4,MERLIN
29-Feb-2012,41,29,46,14,49,13,43,3,LANCELOT
25-Feb-2012,29,27,26,7,32,25,33,1,LANCELOT
22-Feb-2012,35,12,7,49,43,15,8,4,MERLIN
18-Feb-2012,19,22,30,33,41,2,24,4,LANCELOT
15-Feb-2012,30,40,28,33,9,44,16,3,MERLIN
11-Feb-2012,24,31,23,1,49,45,6,3,LANCELOT
08-Feb-2012,7,13,31,44,36,16,26,8,LANCELOT
04-Feb-2012,41,45,7,40,48,4,46,2,MERLIN
01-Feb-2012,7,39,38,17,22,21,3,2,LANCELOT
28-Jan-2012,10,25,31,40,28,12,1,2,LANCELOT
25-Jan-2012,2,30,8,26,45,39,46,1,MERLIN
21-Jan-2012,17,5,32,39,49,42,19,5,MERLIN
18-Jan-2012,22,43,34,9,31,35,20,6,MERLIN
14-Jan-2012,7,12,10,15,25,42,33,7,LANCELOT
11-Jan-2012,40,33,39,9,2,27,45,6,LANCELOT
07-Jan-2012,47,8,15,17,14,20,38,7,MERLIN
04-Jan-2012,42,43,30,9,28,26,2,8,MERLIN
31-Dec-2011,11,38,42,37,44,7,2,7,LANCELOT
28-Dec-2011,48,11,49,13,17,8,19,6,LANCELOT
24-Dec-2011,43,32,36,15,23,1,19,7,LANCELOT
21-Dec-2011,30,7,28,34,38,45,6,5,MERLIN
17-Dec-2011,42,1,35,48,39,22,12,5,MERLIN
14-Dec-2011,3,43,30,28,10,25,31,8,MERLIN
10-Dec-2011,30,21,29,39,24,16,20,6,LANCELOT
07-Dec-2011,10,31,27,47,32,14,41,5,MERLIN
03-Dec-2011,49,1,35,48,47,30,8,8,MERLIN
30-Nov-2011,30,26,25,24,23,13,4,7,MERLIN
26-Nov-2011,13,36,26,16,25,46,15,6,MERLIN
23-Nov-2011,19,31,48,22,4,11,6,5,MERLIN
19-Nov-2011,32,31,1,34,29,36,45,3,ARTHUR
16-Nov-2011,26,40,39,27,10,12,20,1,GUINEVERE
12-Nov-2011,28,13,12,33,6,38,10,14,ARTHUR
09-Nov-2011,27,2,8,32,23,10,44,1,GUINEVERE
05-Nov-2011,14,24,39,23,16,27,43,8,LANCELOT
02-Nov-2011,12,38,11,33,37,49,3,2,GUINEVERE
29-Oct-2011,49,14,5,28,9,46,45,1,GUINEVERE
26-Oct-2011,4,23,34,41,38,39,27,4,GUINEVERE
22-Oct-2011,20,43,27,44,28,34,1,4,ARTHUR
19-Oct-2011,13,18,34,49,32,14,20,3,GUINEVERE
15-Oct-2011,41,7,12,46,34,27,14,2,ARTHUR
12-Oct-2011,37,26,40,25,13,24,30,3,ARTHUR
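
Each row of that file splits cleanly on commas, so parsing it needs only the standard library. Here is a minimal sketch, assuming the column order shown above stays the same. The `LottoCsv` and `Draw` names are made up for illustration, and downloading the file is left out.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class LottoCsv {
    /** One draw: the date, the six main balls, and the bonus ball. */
    static final class Draw {
        final LocalDate date;
        final int[] balls;
        final int bonus;

        Draw(LocalDate date, int[] balls, int bonus) {
            this.date = date;
            this.balls = balls;
            this.bonus = bonus;
        }
    }

    // Matches dates such as 07-Apr-2012 (English month abbreviations).
    private static final DateTimeFormatter DATE =
            DateTimeFormatter.ofPattern("dd-MMM-uuuu", Locale.ENGLISH);

    /** Parses CSV lines, skipping the header row and blank lines. */
    static List<Draw> parse(List<String> lines) {
        List<Draw> draws = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty() || line.startsWith("DrawDate")) {
                continue;
            }
            String[] fields = line.split(",");
            if (fields.length < 8) {
                throw new IllegalArgumentException("Unexpected row: " + line);
            }
            int[] balls = new int[6];
            for (int i = 0; i < 6; i++) {
                balls[i] = Integer.parseInt(fields[i + 1].trim()); // columns 1..6
            }
            int bonus = Integer.parseInt(fields[7].trim());        // column 7
            draws.add(new Draw(LocalDate.parse(fields[0].trim(), DATE), balls, bonus));
        }
        return draws;
    }
}
```

Calling `LottoCsv.parse(lines)` on the downloaded lines gives a list you could hand to a Swing table model or similar. The Ball Set and Machine columns are ignored here.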

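Answer 2 (score: 2)

Create a URL object for the page address. Open a connection to the URL. Get the input stream. Read all the data in the stream; that is the page source.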

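import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

URL url = new URL("https://www.national-lottery.co.uk/player/p/results.ftl");
URLConnection connection = url.openConnection();

String source;
try (InputStream stream = connection.getInputStream()) {
    // available() is not the total length, so read to end of stream (readAllBytes needs Java 9+).
    // UTF-8 is an assumption; use the charset the server actually declares.
    source = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
}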
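
The "read all the data" step needs some care. `available()` only reports how many bytes can be read without blocking, and a single `read` call may return fewer bytes than the buffer holds, so neither gives you the whole page reliably. Here is a minimal sketch of reading to end of stream with a loop, which also works on Java versions that lack `InputStream.readAllBytes`. The `StreamText` class and `readAll` method are hypothetical names, and the charset is passed in because the right one depends on the server.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

class StreamText {
    /** Reads the stream until end of stream and decodes the bytes with the given charset. */
    static String readAll(InputStream in, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        // read() may fill only part of the buffer; -1 signals end of stream.
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return new String(out.toByteArray(), charset);
    }
}
```

With a `URLConnection`, this would be called as `StreamText.readAll(connection.getInputStream(), StandardCharsets.UTF_8)`, ideally inside a try-with-resources block so the stream is closed.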

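Answer 3 (score: 2)

Unless the site offers an API or web service for querying the lottery numbers, you will probably have to scrape the page's HTML source. It looks like the numbers are stored in a simple HTML list: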

<ul>
  <li>12</li>
  <li>16</li>
  <li>23</li>
  <li>25</li>
  <li>31</li>
  <li>42</li>
  <li class="bonus">18</li>
</ul>
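
For a fragment this small, pulling the numbers out can be sketched with the standard library alone. The example below assumes the markup looks exactly like the list above, and `LottoListSketch` is a hypothetical name. A regular expression like this breaks easily when the page changes, so treat it as an illustration of the idea only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LottoListSketch {
    // Group 1: optional class attribute value; group 2: the number inside the <li>.
    private static final Pattern ITEM =
            Pattern.compile("<li(?:\\s+class=\"([^\"]*)\")?\\s*>\\s*(\\d+)\\s*</li>");

    /**
     * Returns the numbers from the list items. With bonus == true only items
     * marked class="bonus" are returned; with bonus == false only the others.
     */
    static List<Integer> numbers(String html, boolean bonus) {
        List<Integer> result = new ArrayList<>();
        Matcher m = ITEM.matcher(html);
        while (m.find()) {
            boolean isBonus = "bonus".equals(m.group(1)); // group 1 is null without a class
            if (isBonus == bonus) {
                result.add(Integer.parseInt(m.group(2)));
            }
        }
        return result;
    }
}
```

`numbers(html, false)` yields the main balls and `numbers(html, true)` the bonus ball. For the full page, an HTML parser is the sturdier choice.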

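There are plenty of good Java HTML parsers out there. Here are a few projects:

I looked around the site you're interested in, and it seems to have a "history" page listing the lottery numbers for several draws: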

https://www.national-lottery.co.uk/player/lotto/results/results.ftl

That may be a better page to scrape.