<div class="mcsColumnsTwoOne">
<h1> I B D Distribution Ltd </h1>
<p>Certificate Number: NAP 28766</p>
<p>Date Certified: 08/10/2010</p>
<p>Consumer Code: RECC</p>
<p>Membership Number: 00038340</p>
<h2>Company Address</h2>
<p>Unit 11 Enterprise Park,Black Moor Road,Verwood,Dorset, BH31 6YS</p>
<h2>Contact Details</h2>
<p>Telephone: 01202 825682</p>
<p>Website: <a href="http://www.ibd-distribution.com" title=" I B D Distribution Ltd ">www.ibd-distribution.com</a></p>
<p>Email: <span id="cloakc973703bbc5107b52e9fd9a2faf77e96"><a href="mailto:darren@ibd-distribution.com">darren@ibd-distribution.com</a></span><script type="text/javascript">
document.getElementById('cloakc973703bbc5107b52e9fd9a2faf77e96').innerHTML = '';
var prefix = 'ma' + 'il' + 'to';
var path = 'hr' + 'ef' + '=';
var addyc973703bbc5107b52e9fd9a2faf77e96 = 'darren' + '@';
addyc973703bbc5107b52e9fd9a2faf77e96 = addyc973703bbc5107b52e9fd9a2faf77e96 + 'ibd-distribution' + '.' + 'com';
var addy_textc973703bbc5107b52e9fd9a2faf77e96 = 'darren' + '@' + 'ibd-distribution' + '.' + 'com';document.getElementById('cloakc973703bbc5107b52e9fd9a2faf77e96').innerHTML += '<a ' + path + '\'' + prefix + ':' + addyc973703bbc5107b52e9fd9a2faf77e96 + '\'>'+addy_textc973703bbc5107b52e9fd9a2faf77e96+'<\/a>';
</script></p>
<p>Contact: Darren Johnson</p>
<p>Contact Position: Director</p>
<hr>
<h2 style="margin: 10px 0 0 0">Contact Installer</h2>
<form name="contact" action="" method="post" class="formstyle" style="width: 100%">
<fieldset>
<p>(<em>*</em>) Denotes required field </p>
<label for="name">Name <em>*</em></label>
<input type="text" name="name" id="name" class="text" required="">
<br class="clear">
<div class="thepot">
<label for="emailaddress">Email</label>
<input type="text" name="emailaddress" id="emailaddress">
</div>
<label for="email">Email <em>*</em></label>
<input type="email" name="email" id="email" class="text" required="">
<br class="clear">
<label for="telephone">Telephone</label>
<input type="tel" name="telephone" id="telephone" class="text">
<br class="clear">
<label for="enquiry">Enquiry <em>*</em></label>
<textarea name="enquiry" id="enquiry" rows="10" cols="10" required=""></textarea>
<br class="clear">
<input type="hidden" name="loadtime" value="1515943155">
<input id="submitbutton" name="submitbutton" value="Submit" type="submit">
<div class="thepot">
<label for="submitForm">submitForm</label>
<input type="text" name="submitForm" id="submitForm" value="">
</div>
</fieldset>
</form>
</div>
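I am trying to extract data from the code above using the Java jsoup library. However, when the 'Contact' p tag is missing, the output is wrong: the 'Contact Position' text ends up in the 'Contact' column. How can I show a blank in the Contact column when that tag is empty, and keep the Contact Position p text in the last column? Thank you very much for your help.
for (Element d : data) {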
idrow++;
String Consumers = d.select("h1").text();
String CertificateNumberall = d.select("p:eq(1)").text();
String CertificateNumber = CertificateNumberall.substring(CertificateNumberall.lastIndexOf(":") + 1);
String DateCertifiedall = d.select("p:eq(2)").text();
String DateCertified = DateCertifiedall.substring(DateCertifiedall.lastIndexOf(":") + 1);
String ConsumerCodeAll = d.select("p:eq(3)").text();
String ConsumerCode = ConsumerCodeAll.substring(ConsumerCodeAll.lastIndexOf(":") + 1);
String MembershipNumberAll = d.select("p:eq(4)").text();
String MembershipNumber = MembershipNumberAll.substring(MembershipNumberAll.lastIndexOf(":") + 1);
String CompanyAddressAll = d.select("p:eq(6)").text();
String CompanyAddress = CompanyAddressAll.substring(CompanyAddressAll.lastIndexOf(":") + 1);
String TelephoneAll = d.select("p:eq(8)").text();
String Telephone = TelephoneAll.substring(TelephoneAll.lastIndexOf(":") + 1);
String WebsiteAll = d.select("p:eq(9) :not(span)").text();
String Website = WebsiteAll.substring(WebsiteAll.lastIndexOf(":") + 1);
String EmailAll = d.select("p:eq(10) span").text();
String Email = EmailAll.substring(EmailAll.lastIndexOf(":") + 1);
String ContactAll = d.select("p:eq(11)").text();
String Contact = ContactAll.substring(ContactAll.lastIndexOf(":") + 1);
String ContactPositionAll = d.select("p:eq(12)").next("hr").text();
String ContactPosition = ContactPositionAll.substring(ContactPositionAll.lastIndexOf(":") + 1);
}
Answer 0 (score: 0)
You want to get blank data (String contact = null or String contact = "") when the 'Contact' p tag is empty. Is that right?
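On the page you are scraping, the p elements have no class or id, so you cannot identify a specific field such as 'Contact' by its position alone: when one p is missing, every later index shifts by one.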
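My suggestion is to "select elements that contain the specified text" (see the jsoup API selector documentation). If nothing matches, select returns an empty Elements collection, and calling text() on it yields an empty string instead of throwing.
String contact = d.select("p:contains(Contact:)").text();
Or you can use :matches(regex):
String contact = d.select("p:matches(^Contact:.*)").text();
Either way the result still includes the "Contact:" label, so strip it with substring as you already do for the other fields.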
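To illustrate why looking a field up by its label is more robust than looking it up by position, here is a minimal sketch in plain Java. It assumes the paragraph texts have already been collected into a list (for example with jsoup's Elements.eachText()). The class name FieldLookup and the method valueFor are hypothetical helpers invented for this example, not part of jsoup.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: finds a field by its label instead of its position.
final class FieldLookup {

    // Returns the value following "label:" in the first paragraph that starts
    // with that label, or "" when no paragraph carries the label.
    static String valueFor(List<String> paragraphs, String label) {
        String prefix = label + ":";
        for (String text : paragraphs) {
            String trimmed = text.trim();
            if (trimmed.startsWith(prefix)) {
                return trimmed.substring(prefix.length()).trim();
            }
        }
        return "";
    }

    public static void main(String[] args) {
        // Paragraph texts as they might look when the "Contact" line is absent.
        List<String> paragraphs = Arrays.asList(
                "Telephone: 01202 825682",
                "Contact Position: Director");

        System.out.println("[" + valueFor(paragraphs, "Contact") + "]");
        System.out.println("[" + valueFor(paragraphs, "Contact Position") + "]");
    }
}
```

Because the prefix includes the colon, "Contact:" does not match the "Contact Position: ..." line, so a missing Contact yields a blank while Contact Position stays in its own column.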
Also, by Java convention, variable names should start with a lowercase letter, for example 'contactAll' (this style is called lower camel case). These articles may help: The Java™ Tutorials - Variables - Naming / Using Java Naming Conventions
Happy coding!