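How to get a value from a script's outer HTML

Time: 2017-02-01 12:25:56

Tags: jsoup document outerhtml

I have the outer HTML of a script element, and I need to extract a list of cities from it.

I have been trying to get it by parsing the script back into HTML: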

Document citiesHTML = Jsoup.parse(driver.findElement(By.xpath("/html/body/section/script")).getAttribute("outerHTML"));

That line returns:

<html>
 <head>
  <script>
      NCM.Registry.add('PreHomeStatic', 'PreHome_1485892226002', {
        backgroundColor: '#f1f1f1',
        backgroundImage: '',
        subscriberUrl: '/cliente',
        notSubscriberUrl: '/home', 
        defaultCityName: "sao_paulo",
        defaultCityId: '1366122212339',
        cityNotFoundMessage: 'Os serviços NET não estão disponíveis para sua cidade TEST',
        cityPlaceholder: 'Digite Sua Cidade',
        subscriberLabel: 'Já é <b>cliente NET?</b>',
        footerNote: 'Rodap&eacute;',
        cities: [{"id_wcs":"1374010568098","id":"almirante_tamandare","value":"Almirante Tamandaré","tokens":["almirante","tamandare","Almirante","Tamandaré"]},{"id_wcs":"1374019924528","id":"alvorada","value":"Alvorada"...

I need to get that "cities" array.

1 answer:

Answer 0: (score: 0)

Have you tried something like this?

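    // doc is the jsoup Document built in the question (citiesHTML)
    Document doc = citiesHTML;

    // Inner content of the <script> element
    String html = doc.select("script").html();

    String[] lines = html.split("\n");

    for (String line : lines) {
        // The array sits on the line that starts with "cities:"
        if (line.trim().startsWith("cities:")) {
            System.out.println(line.replaceFirst("cities:", "").trim());
        }
    }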
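
If the array turns out to span more than one line, or you want the individual city names rather than the raw text, something along these lines might help. It is only a sketch using the standard library: ScriptCities, extractCitiesArray and cityNames are made-up names, and it assumes the script text contains a single "cities:" key followed by a JSON-style array with double-quoted strings, as in the output shown in the question. A real JSON parser would be more robust than the regex used for the "value" fields.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup or Selenium.
final class ScriptCities {

    // Returns the text of the array that follows "cities:", brackets included,
    // or null if the key or a balanced array cannot be found.
    static String extractCitiesArray(String script) {
        int key = script.indexOf("cities:");
        if (key < 0) {
            return null;
        }
        int start = script.indexOf('[', key);
        if (start < 0) {
            return null;
        }
        int depth = 0;
        boolean inString = false;
        for (int i = start; i < script.length(); i++) {
            char c = script.charAt(i);
            if (inString) {
                if (c == '\\') {
                    i++; // skip the escaped character
                } else if (c == '"') {
                    inString = false;
                }
            } else if (c == '"') {
                inString = true;
            } else if (c == '[') {
                depth++;
            } else if (c == ']' && --depth == 0) {
                return script.substring(start, i + 1);
            }
        }
        return null; // brackets never balanced, e.g. truncated input
    }

    // Collects every "value":"..." entry found in the array text.
    // Assumes the values contain no escaped double quotes.
    static List<String> cityNames(String citiesArray) {
        List<String> names = new ArrayList<>();
        Matcher m = Pattern.compile("\"value\"\\s*:\\s*\"([^\"]*)\"").matcher(citiesArray);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }
}
```

You would pass it the same string the loop above works on, for example ScriptCities.extractCitiesArray(doc.select("script").html()), and then feed the result to cityNames or to whatever JSON library you already use.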