I have a question: how do I strip the tags from the elements I select?
The site is www.yellowbook.com
My code is:
for (int i = 1; i < 21; i++) {
    String shopNameTemp = "";
    String shopAddressTempA = "";
    String shopAddressTempB = "";
    String shopAddressTempC = "";
    String shopAddressTempD = "";
    String shopTelTemp = "";
    // Each listing is an <li> with id divInAreaSummary_1 .. divInAreaSummary_20
    String divName = "divInAreaSummary_" + String.valueOf(i);
    // Note the closing "]" of the attribute selector
    Elements node = doc.select("li[id=" + divName + "]");
    shopNameTemp = node.first().select("a[class=fn]").toString();
    shopAddressTempA = node.first().select("span[class=street-address]").toString();
    shopAddressTempB = node.first().select("span[class=locality]").toString();
    shopAddressTempC = node.first().select("span[class=region]").toString();
    shopAddressTempD = node.first().select("span[class=postal-code]").toString();
    shopTelTemp = node.first().select("div[class=call phone-number]").toString();
    System.out.println("Name " + shopNameTemp);
    System.out.println("Address " + shopAddressTempA + shopAddressTempB + shopAddressTempC + shopAddressTempD);
    System.out.println("Tel " + shopTelTemp);
}
My output is:
Please input your category and location and Province...
auto repair,Seattle,WA
Name <a class="fn" data-classid="690" href="/profile/76-station-mlk_1861635669.html" onclick="OmAdViewLeadClick('adsource: companyname', false, '8330', ';7;;;;evar33=inArea|evar34=16', 'auto repairing');" title="View more information about 76 Station MLK">76 Station MLK</a>
Address <span itemprop="streetAddress" class="street-address">15 Avenue Nw</span><span itemprop="addressLocality" class="locality">Seattle</span><span itemprop="addressRegion" class="region">WA</span><span itemprop="postalCode" class="postal-code">98102-9810</span>
Tel <div class="call phone-number">
(206) 826-3263
</div>
How can I get this instead?
Name 76 Station MLK
Address 15 Avenue Nw Seattle WA 98102-9810
Tel (206) 826-3263
P.S. I tried using remove, but that deletes the content while the tags are still there.
Answer 0 (score: 1)
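Instead of toString(), use the text() method, which extracts only the text and leaves out the markup. It is available on both Element and Elements, so it works directly on the result of select().
For example: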
shopNameTemp = node.first().select("a[class=fn]").text();
shopAddressTempA = node.first().select("span[class=street-address]").text();
shopAddressTempB = node.first().select("span[class=locality]").text();
shopAddressTempC = node.first().select("span[class=region]").text();
shopAddressTempD = node.first().select("span[class=postal-code]").text();
shopTelTemp = node.first().select("div[class=call phone-number]").text();
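This should produce the correct text when you print it to the console. Note that you may need to add spaces by hand (e.g. + " " +) between shopAddressTempA, shopAddressTempB, and so on; otherwise the address parts are printed back to back with no spaces between them.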
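As a rough sketch of the spacing advice above: rather than chaining + " " + by hand, the address parts can be joined with the standard String.join. The class AddressJoin and the helper joinNonEmpty are hypothetical names made up for this illustration (they are not part of jsoup), and the sample values are copied from the question's desired output rather than fetched from the site.

```java
import java.util.ArrayList;
import java.util.List;

class AddressJoin {
    // Hypothetical helper, not a jsoup API: joins the non-empty parts
    // with a single space so that missing fields leave no double spaces.
    static String joinNonEmpty(String... parts) {
        List<String> kept = new ArrayList<>();
        for (String part : parts) {
            if (part != null && !part.trim().isEmpty()) {
                kept.add(part.trim());
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // In the real loop these would be the values returned by text(),
        // e.g. shopAddressTempA .. shopAddressTempD.
        String address = joinNonEmpty("15 Avenue Nw", "Seattle", "WA", "98102-9810");
        System.out.println("Address " + address);
    }
}
```

Skipping empty parts is a design choice for listings that lack, say, a postal code; if you always want every field, plain String.join(" ", a, b, c, d) is enough.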
I tested this, and my output was:
Name 76 Station MLK
Address 2801 Martin Luther King Jr Way S Seattle WA 98144-6003
Tel (206) 722-4995