My HTML looks very similar to this:
<TH CLASS="ddtitle">MovieOne</TH>
<TABLE CLASS="datadisplaytable" ><CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Genre</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Seating</TH>
<TH CLASS="ddheader" scope="col" >Actors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
<TD CLASS="dddefault">10:00 am - 12:00 pm</TD>
<TD CLASS="dddefault">SMTWTHFSA</TD>
<TD CLASS="dddefault">AMC Showplace</TD>
<TD CLASS="dddefault">Aug 20, 2014 - Sept 12, 2014</TD>
<TD CLASS="dddefault">Reservations</TD>
<TD CLASS="dddefault">Will Ferrel (<ABBR title= "Primary">P</ABBR>) target="Will Ferrel" ></TD>
</TR>
</TABLE>
<TH CLASS="ddtitle">MovieTwo</TH>
<TABLE CLASS="datadisplaytable" ><CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col" >Genre</TH>
<TH CLASS="ddheader" scope="col" >Time</TH>
<TH CLASS="ddheader" scope="col" >Days</TH>
<TH CLASS="ddheader" scope="col" >Where</TH>
<TH CLASS="ddheader" scope="col" >Date Range</TH>
<TH CLASS="ddheader" scope="col" >Seating</TH>
<TH CLASS="ddheader" scope="col" >Actors</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
<TD CLASS="dddefault">11:00 am - 12:30 pm</TD>
<TD CLASS="dddefault">SMTWTHFSA</TD>
<TD CLASS="dddefault">Showplace Cinemas</TD>
<TD CLASS="dddefault">Aug 20, 2014 - Sept 12, 2014</TD>
<TD CLASS="dddefault">TBA</TD>
<TD CLASS="dddefault">Zach Galifinakis (<ABBR title= "Primary">P</ABBR>) target="Zach Galifinakis" ></TD>
</TR>
</TABLE>
<TH CLASS="ddtitle">MovieThree</TH>
<BR>
<BR>
Coming Soon
<BR>
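What I want to do is get the individual table data associated with each movie title, and if a movie has no table, I want to report its values as TBA. So far I can get the relevant table information, but I cannot skip a missing table. For example, I use this code to get a movie's genre and the other fields: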
int tcounter = 1;
for (Element elements : li) {
WebElement genre = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[1]"));
WebElement time = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[2]"));
WebElement days = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[3]"));
WebElement where = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[4]"));
WebElement date_range = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[5]"));
WebElement seating = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[6]"));
WebElement actors = driver.findElement(By.xpath("//table[@class='datadisplaytable']/descendant::table["+tcounter+"]//td[7]"));
tcounter++;
}
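Here, elements refers to the list that stores all the links on the web page (the result of [1] is Action, [2] would be 10:00 am - 12:00 pm, and so on). This runs inside a for loop that increments tcounter by 1 so that each pass reads the data of a different table. Is there a way to tell the program to check whether a table exists under the TH class and, if not, assign TBA and skip it?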
This is my second attempt, based on siking's answer:
List<WebElement> linstings = driver.findElements(By.className("ddtitle"));
String genre = "";
String time = "";
String days = "";
String where = "";
String dateRange = "";
String seating = "";
String actors = "";
for(WebElement potentialMovie : linstings) {
try {
WebElement actualMovie = potentialMovie.findElement(By.xpath("//table[@class='datadisplaytable']"));
// System.out.println("Actual: " + actualMovie.getText());
// make all your assignments, for example:
type = actualMovie.findElement(By.xpath("/descendant::table//td")).getText();
time = actualMovie.findElement(By.xpath("/descendant::table//td[2]")).getText();
days = actualMovie.findElement(By.xpath("/descendant::table//td[3]")).getText();
location = actualMovie.findElement(By.xpath("/descendant::table//td[4]")).getText();
dates = actualMovie.findElement(By.xpath("/descendant::table//td[5]")).getText();
schedType = actualMovie.findElement(By.xpath("/descendant::table//td[6]")).getText();
instructor = actualMovie.findElement(By.xpath("/descendant::table//td[7]")).getText();
System.out.println(genre+" "+time+" "+days+" "+where+" "+dateRange+" "+actors);
} catch(Exception ex) {
// there is no table, so:
genre = "TBA";
}
}
The problem with this code is that it keeps returning the values of the first table.
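A likely cause is that an XPath beginning with `/` or `//` is evaluated from the document root, even when `findElement` is called on a specific element; a leading `.` makes the path relative to that element. The sketch below illustrates the difference with the JDK's built-in `javax.xml.xpath` engine as a stand-in for Selenium. The `ROOT` wrapper, the class name, and the cell text are made up for the illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class AbsoluteVsRelativeXPath {
    // Hypothetical two-table document, just enough to show the effect.
    static final String MARKUP = "<ROOT>"
        + "<TABLE><TR><TD>first</TD></TR></TABLE>"
        + "<TABLE><TR><TD>second</TD></TR></TABLE>"
        + "</ROOT>";

    static String[] lookups() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(MARKUP)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        // Use the second table as the context node.
        Node secondTable = (Node) xpath.evaluate("/ROOT/TABLE[2]", doc, XPathConstants.NODE);
        // "//TD" ignores the context node and starts at the document root.
        String absolute = xpath.evaluate("//TD", secondTable);
        // ".//TD" searches only below the context node.
        String relative = xpath.evaluate(".//TD", secondTable);
        return new String[] {absolute, relative};
    }

    public static void main(String[] args) throws Exception {
        String[] r = lookups();
        System.out.println("absolute: " + r[0] + ", relative: " + r[1]);
    }
}
```

In the attempt above, the same reasoning would suggest writing `.//td[2]` and similar relative paths rather than `/descendant::table//td[2]`.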
Answer 0 (score: 1)
I reduced your HTML example to the following:
<TH CLASS="ddtitle">MovieOne</TH>
<TABLE CLASS="datadisplaytable">
<CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col">Genre</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
</TR>
</TABLE>
<TH CLASS="ddtitle">MovieTwo</TH>
<BR/>
<BR/>
Coming Soon
<BR/>
<TH CLASS="ddtitle">MovieThree</TH>
<TABLE CLASS="datadisplaytable">
<CAPTION class="captiontext">Movies</CAPTION>
<TR>
<TH CLASS="ddheader" scope="col">Genre</TH>
</TR>
<TR>
<TD CLASS="dddefault">Action</TD>
</TR>
</TABLE>
Hopefully it represents all of your cases!
Don't use a counter; iterate over the actual WebElements instead:
// find all the listings on the page...
List<WebElement> listings = driver.findElements(By.className("ddtitle"));
// ... and iterate over them
for (WebElement listing : listings) {
// default all your variables to TBA for every listing, like:
String genre = "TBA";
// grab whatever is the _first_ element after the TH ...
List<WebElement> next = listing.findElements(By.xpath("following-sibling::*[1]"));
// ... and check if it has a child element CAPTION; findElements returns an
// empty list, whereas findElement throws NoSuchElementException (never null)
if (!next.isEmpty() && !next.get(0).findElements(By.xpath("caption")).isEmpty()) {
// make all your assignments, for example (".//" also matches rows
// that the browser has wrapped in a TBODY):
genre = next.get(0).findElement(By.xpath(".//tr[2]/td[1]")).getText();
}
}
Note that this code is untested, so your mileage may vary!
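To make the sibling check concrete without a browser, here is a minimal sketch that runs the same XPath idea through the JDK's `javax.xml.xpath` engine instead of Selenium. It assumes the reduced markup is wrapped in a made-up `ROOT` element so it parses as XML, and it changes the third genre to Comedy purely so the two tables can be told apart. The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class SiblingTableDemo {
    // Reduced markup from the answer, wrapped in a hypothetical <ROOT>.
    static final String MARKUP = "<ROOT>"
        + "<TH CLASS=\"ddtitle\">MovieOne</TH>"
        + "<TABLE CLASS=\"datadisplaytable\"><CAPTION>Movies</CAPTION>"
        + "<TR><TH>Genre</TH></TR><TR><TD>Action</TD></TR></TABLE>"
        + "<TH CLASS=\"ddtitle\">MovieTwo</TH><BR/><BR/>Coming Soon<BR/>"
        + "<TH CLASS=\"ddtitle\">MovieThree</TH>"
        + "<TABLE CLASS=\"datadisplaytable\"><CAPTION>Movies</CAPTION>"
        + "<TR><TH>Genre</TH></TR><TR><TD>Comedy</TD></TR></TABLE>"
        + "</ROOT>";

    static List<String> genres() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(MARKUP)));
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList titles = (NodeList) xpath.evaluate(
            "//TH[@CLASS='ddtitle']", doc, XPathConstants.NODESET);
        List<String> result = new ArrayList<>();
        for (int i = 0; i < titles.getLength(); i++) {
            Node title = titles.item(i);
            // Relative path: the element right after this title, only if it is a TABLE.
            Node table = (Node) xpath.evaluate(
                "following-sibling::*[1][self::TABLE]", title, XPathConstants.NODE);
            String genre = "TBA"; // default when no table follows the title
            if (table != null) {
                genre = xpath.evaluate("TR[2]/TD[1]", table);
            }
            result.add(title.getTextContent() + ": " + genre);
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        genres().forEach(System.out::println);
    }
}
```

XML element names are case-sensitive, which is why the paths here use upper-case `TABLE`, `TR`, and `TD`; against a real HTML page in Selenium the lower-case forms are the usual choice.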