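Getting data from a dynamic HTML page using Java

Time: 2016-08-18 11:49:37

Tags: java html

I want to get data from the Live Cyber Attack map. The data I need is: time, attack, attacking country, and target country. I want to capture this dynamic data and then analyze it. I am using Java. How should I do this?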

<div ng-repeat="attack in latestAttacks" class="attackRow">
            <div class="timeCol" inline-animation="{ properties: { opacity : 1 }, duration: 500, easing:'swing'}" style="opacity: 0.0254519;">
                <p>
                    14:40:22
                </p>
            </div>
            <div class="attackCol" inline-animation="{ properties: { opacity : 1 }, duration: 500, easing:'swing'}" style="opacity: 0.0254519;">
                <p class="attackContainer">
                    infecting website.cb
                </p>
            </div>
            <div class="sourceCol" inline-animation="{ properties: { opacity : 1 }, duration: 500, easing:'swing'}" style="opacity: 0.0254519;">
                <p>
                    China
                </p>
            </div>
            <div class="destCol" inline-animation="{ properties: { opacity : 1 }, duration: 500, easing:'swing'}" style="opacity: 0.0254519;">
                <p>
                    India
                </p>
            </div>
        </div>

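1 answer:

Answer 0: (score: 1)

The best thing you can do is to read this site's data directly over its WebSocket. First you need a WebSocket client; here I use Tyrus, the reference implementation of JSR 356 (Java API for WebSocket).

Assuming you use Maven, these are the dependencies to add to your project for the WebSocket client: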

<dependency>
    <groupId>javax.websocket</groupId>
    <artifactId>javax.websocket-api</artifactId>
    <version>1.1</version>
</dependency>
<dependency>
    <groupId>org.glassfish.tyrus.bundles</groupId>
    <artifactId>tyrus-standalone-client</artifactId>
    <version>1.13</version>
</dependency>

The data you receive is in JSON format, so you also need to add a JSON parser to your project:

<dependency>
    <groupId>org.json</groupId>
    <artifactId>json</artifactId>
    <version>20160810</version>
</dependency>

Here is what the code looks like:

import java.net.URI;
import java.util.Calendar;
import java.util.Locale;
import java.util.concurrent.CountDownLatch;

import javax.websocket.ClientEndpointConfig;
import javax.websocket.Endpoint;
import javax.websocket.EndpointConfig;
import javax.websocket.MessageHandler;
import javax.websocket.Session;

import org.glassfish.tyrus.client.ClientManager;
import org.json.JSONObject;

// The class name is arbitrary
public class ThreatMapClient {

    public static void main(String[] args) throws Exception {
        final ClientEndpointConfig cec = ClientEndpointConfig.Builder.create().build();
        URI uri = new URI(
            "wss://threatmap.checkpoint.com/ThreatPortal/websocket" +
            "?X-Atmosphere-tracking-id=0" +
            "&X-Atmosphere-Framework=2.2.5-javascript" +
            "&X-Atmosphere-Transport=websocket" +
            "&X-Atmosphere-TrackMessageSize=true" +
            "&Content-Type=application/json" +
            "&X-atmo-protocol=true"
        );
        ClientManager client = ClientManager.createClient();
        try (Session session = client.connectToServer(new Endpoint() {

            @Override
            public void onOpen(Session session, EndpointConfig config) {
                session.addMessageHandler(new MessageHandler.Whole<String>() {

                    @Override
                    public void onMessage(String message) {
                        // The data is of type "number|JSON Object"
                        // so we remove everything before the JSON Object
                        message = message.substring(message.indexOf('|') + 1);
                        if (!message.startsWith("{")) {
                            // Not a JSON Object so we skip it
                            return;
                        }
                        // Parse the JSON Object
                        JSONObject jsonObject = new JSONObject(message);
                        if (jsonObject.has("attackname")) {
                            // The time printed is the local time of receipt
                            System.out.printf(
                                "Time: %tT Attack: %-40s Attacking Country: %-20s Target Country: %-20s%n",
                                Calendar.getInstance(), jsonObject.getString("attackname"),
                                new Locale("", jsonObject.getString("sourcecountry")).getDisplayName(),
                                new Locale("", jsonObject.getString("destinationcountry")).getDisplayName()
                            );
                        }
                    }
                });
            }
        }, cec, uri)) {
            CountDownLatch messageLatch = new CountDownLatch(1);
            // Wait forever
            messageLatch.await();
        }
    }
}

Output:

Time: 14:53:06 Attack: Trojan-Downloader.Win32.Sohanad.B        Attacking Country: United States        Target Country: Panama              
Time: 14:53:06 Attack: Trojan-Downloader.Win32.Sohanad.B        Attacking Country: United States        Target Country: Panama              
Time: 14:53:06 Attack: Trojan-Downloader.Win32.Sohanad.B        Attacking Country: United States        Target Country: Panama              
Time: 14:53:06 Attack: Trojan-Downloader.Win32.Sohanad.B        Attacking Country: United States        Target Country: Panama              
Time: 14:53:07 Attack: Trojan-Downloader.Win32.Sohanad.B        Attacking Country: United States        Target Country: Panama              
Time: 14:53:07 Attack: Trojan-Downloader.Win32.Sohanad.B        Attacking Country: United States        Target Country: Panama              
Time: 14:53:07 Attack: Trojan-Downloader.Win32.Sohanad.B        Attacking Country: United States        Target Country: Panama              
Time: 14:53:07 Attack: REP.huvcru                               Attacking Country: France               Target Country: Panama              
Time: 14:53:08 Attack: REP.huvcru                               Attacking Country: France               Target Country: Panama
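
If you want to check the two small pieces of message handling without opening a connection, they can be pulled out into plain helper methods. The following is only a sketch: the class and method names (AttackFeedHelpers, extractPayload, countryName) are made up for illustration, the sample frame in main is invented, and it assumes, as the code above does, that frames look like "number|JSON Object" and that the country fields hold two-letter ISO 3166 codes. Unlike the code above, it only strips the prefix when it is purely numeric.

```java
import java.util.Locale;
import java.util.Optional;

// Hypothetical helper class, not part of the original answer.
final class AttackFeedHelpers {

    // Strips a leading "number|" prefix (assumed frame format) and returns
    // the remainder only if it looks like a JSON object.
    static Optional<String> extractPayload(String frame) {
        int bar = frame.indexOf('|');
        String payload = frame;
        // Only treat the text before '|' as a prefix when it is all digits
        if (bar > 0 && frame.substring(0, bar).chars().allMatch(Character::isDigit)) {
            payload = frame.substring(bar + 1);
        }
        payload = payload.trim();
        return payload.startsWith("{") ? Optional.of(payload) : Optional.empty();
    }

    // Assumes isoCode is a two-letter ISO 3166 country code such as "CN".
    // The display language is fixed to English so that the result does not
    // depend on the default locale of the machine.
    static String countryName(String isoCode) {
        return new Locale("", isoCode).getDisplayCountry(Locale.ENGLISH);
    }

    public static void main(String[] args) {
        // Invented sample frame, shaped like the ones described above
        String frame = "44|{\"attackname\":\"demo\",\"sourcecountry\":\"CN\"}";
        extractPayload(frame).ifPresent(System.out::println);
        System.out.println(countryName("CN"));
    }
}
```

In onMessage you would then call extractPayload(message) and hand the result to new JSONObject(...) only when a payload is present.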