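I have a table and want to scrape its data column by column, then print each column's data one at a time. The table looks like this: https://i.stack.imgur.com/7F60c.png
Here is the HTML: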
<div id="ember10911" class="ember-view flowsheet-table">
<div class="header flowsheet-column">
<div class="flowsheet-cell flowsheet-row"> </div>
<div id="ember10912" class="ember-view flowsheet-row header-row">
<h4 class="header4semibold " data-ember-action="10913">
<div class="ellipsis" style="width: 302px">Vitals</div>
</h4>
</div>
<div id="ember10914" class="ember-view flowsheet-row ellipsis"> Height </div>
<div id="ember10915" class="ember-view flowsheet-row ellipsis"> Weight </div>
<div id="ember10916" class="ember-view flowsheet-row ellipsis"> BMI </div>
<div id="ember10917" class="ember-view flowsheet-row ellipsis"> BP </div>
<div id="ember10918" class="ember-view flowsheet-row ellipsis"> Temperature </div>
<div id="ember10919" class="ember-view flowsheet-row ellipsis"> Pulse </div>
<div id="ember10920" class="ember-view flowsheet-row ellipsis"> Respiratory rate </div>
<div id="ember10921" class="ember-view flowsheet-row ellipsis"> O2 Saturation </div>
<div id="ember10922" class="ember-view flowsheet-row ellipsis"> Pain </div>
<div id="ember10923" class="ember-view flowsheet-row ellipsis"> Head Circumference </div>
</div>
<div class="flowsheet-scroll-region">
<div id="ember11083" class="ember-view flowsheet-column">
<div class="flowsheet-cell flowsheet-table-header selectable" data-ember-action="11084">
<div class="p-semibold">08/24/17</div>
<div class="p-semibold">8:44 PM</div>
</div>
<div id="ember11086" class="ember-view flowsheet-cell header-row"> </div>
<div id="ember11088" class="ember-view flowsheet-cell"> </div>
<div id="ember11090" class="ember-view flowsheet-cell selectable">
<span data-element="flowsheet-cell-value" class="">152 </span>
<span class="p-666" data-element="flowsheet-cell-units">lb</span>
</div>
<div id="ember11092" class="ember-view flowsheet-cell"> </div>
<div id="ember11094" class="ember-view flowsheet-cell selectable wide-cell">
<span data-element="flowsheet-cell-value" class="">102/64 </span>
<span class="p-666" data-element="flowsheet-cell-units">mmHg</span>
</div>
<div id="ember11096" class="ember-view flowsheet-cell selectable">
<span data-element="flowsheet-cell-value" class="">97.9 </span>
<span class="p-666" data-element="flowsheet-cell-units">°F</span>
</div>
<div id="ember11098" class="ember-view flowsheet-cell selectable">
<span data-element="flowsheet-cell-value" class="">72 </span>
<span class="p-666" data-element="flowsheet-cell-units">bpm</span>
</div>
<div id="ember11100" class="ember-view flowsheet-cell"> </div>
<div id="ember11102" class="ember-view flowsheet-cell"> </div>
<div id="ember11104" class="ember-view flowsheet-cell"> </div>
<div id="ember11106" class="ember-view flowsheet-cell"> </div>
</div>
</div>
</div>
I tried:
List<WebElement> list = driver.findElements(By.xpath("//*[@id=\"ember4341\"]"));
for (int i = 0; i < list.size(); i++) {
System.out.print("Date" + driver.findElement(By.xpath("//*[@id=\"ember4929\"]/div[1]/div[1]")) + "Height" + driver.findElement(By.xpath("//*[@id=\"ember4944\"]/span[1]")) + "Weight" + driver.findElement(By.xpath("//*[@id=\"ember4946\"]/span[1]"))+......);
System.out.println("");
}
I want to use a list to store each column's information and then go through it with a for loop. What I expect is:
"Date 10/17/16 Height 64 Weight 106........."
"Date 11/17/17 Height 64 Weight 109.99......"
.......
But what I get is:
"Date[[ChromeDriver: chrome on MAC (bf163179196c60704f582de0d421323c)] -> xpath: //*[@id="ember4929"]/div[1]/div[1]]Height[[ChromeDriver: chrome on MAC (bf163179196c60704f582de0d421323c)] -> xpath: //*[@id="ember4944"]/span[1]]Weight[[ChromeDriver: chrome on MAC (bf163179196c60704f582de0d421323c)] -> xpath: //*[@id="ember4946"]/span[1]]"
I also tried fetching and printing the data of the first column only, using this code:
System.out.println("Date: " + driver.findElement(By.cssSelector("#ember5060 > div.flowsheet-cell.flowsheet-table-header.selectable > div:nth-child(1)")).getText() + " " + "Height: " + driver.findElement(By.cssSelector("#ember5075 > div")).getText() + " " + "Weight: " + driver.findElement(By.cssSelector("#ember5077 > div")).getText());
It works. I got:
Date: 10/25/17 Height: 67 in Weight: 139.99 lb
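Now I can print the first column's data, but I don't know how to do this iteratively and print every column in the table. How should I do it? Thanks!
Answer 0 (score: 0)
The HTML in this question went through several revisions; this is the one I used to come up with a solution: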
<div id="ember4342" class="ember-view flowsheet-table">
<div class="header flowsheet-column">
<div class="flowsheet-cell flowsheet-row">
</div>
<div id="ember4352" class="ember-view flowsheet-row header-row">
<h4 class="header4semibold " data-ember-action="4353">
<div class="ellipsis" style="width: 694px">Vitals</div>
</h4>
</div>
<div id="ember4355" class="ember-view flowsheet-row ellipsis"> Height
</div>
<div id="ember4357" class="ember-view flowsheet-row ellipsis"> Weight
</div>
<div id="ember4359" class="ember-view flowsheet-row ellipsis"> BMI
</div>
<div id="ember4361" class="ember-view flowsheet-row ellipsis"> BMI Percentile
</div>
<div id="ember4363" class="ember-view flowsheet-row ellipsis"> BP
</div>
<div id="ember4365" class="ember-view flowsheet-row ellipsis"> Temperature
</div>
<div id="ember4367" class="ember-view flowsheet-row ellipsis"> Pulse
</div>
<div id="ember4369" class="ember-view flowsheet-row ellipsis"> Respiratory rate
</div>
<div id="ember4371" class="ember-view flowsheet-row ellipsis"> O2 Saturation
</div>
<div id="ember4373" class="ember-view flowsheet-row ellipsis"> Pain
</div>
<div id="ember4375" class="ember-view flowsheet-row ellipsis"> Head Circumference
</div>
</div>
<div class="flowsheet-scroll-region">
<div id="ember5636" class="ember-view flowsheet-column">
<div class="flowsheet-cell flowsheet-table-header selectable" data-ember-action="5637">
<!---->
<div class="p-semibold">10/17/16</div>
<div class="p-semibold"><!----></div>
<!----></div>
<div id="ember5639" class="ember-view flowsheet-cell header-row"><!----></div>
<div id="ember5641" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">64 </span><span class="p-666"
data-element="flowsheet-cell-units">in</span>
</div>
<div id="ember5643" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">106 </span><span class="p-666"
data-element="flowsheet-cell-units">lb</span>
</div>
<div id="ember5645" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">18.19 </span><span class="p-666"
data-element="flowsheet-cell-units"><!----></span>
</div>
<div id="ember5647" class="ember-view flowsheet-cell"><!----></div>
<div id="ember5649" class="ember-view flowsheet-cell selectable wide-cell"><!----><!---->
<span data-element="flowsheet-cell-value" class="">111/83 </span><span class="p-666"
data-element="flowsheet-cell-units">mmHg</span>
</div>
<div id="ember5651" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">100.9 </span><span class="p-666"
data-element="flowsheet-cell-units">°F</span>
</div>
<div id="ember5653" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">86 </span><span class="p-666"
data-element="flowsheet-cell-units">bpm</span>
</div>
<div id="ember5655" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">14 </span><span class="p-666"
data-element="flowsheet-cell-units">bpm</span>
</div>
<div id="ember5657" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5659" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5661" class="ember-view flowsheet-cell selectable"><!----></div>
</div>
<div id="ember5663" class="ember-view flowsheet-column editable">
<div class="flowsheet-cell flowsheet-table-header selectable" data-ember-action="5664">
<!---->
<div class="p-link-semibold" data-element="flowsheet-column-date">11/17/17</div>
<div class="p-link-semibold" data-element="flowsheet-column-time"><!----></div>
<!----></div>
<div id="ember5666" class="ember-view flowsheet-cell header-row"><!----></div>
<div id="ember5668" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">64 in</div>
</div>
<div id="ember5670" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">109.99 lb</div>
</div>
<div id="ember5672" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">18.88 </span><span class="p-666"
data-element="flowsheet-cell-units"><!----></span>
</div>
<div id="ember5674" class="ember-view flowsheet-cell"><!----></div>
<div id="ember5676" class="ember-view flowsheet-cell selectable wide-cell"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">116/72 mmHg</div>
</div>
<div id="ember5678" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">101.2 °F</div>
</div>
<div id="ember5680" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">87 bpm</div>
</div>
<div id="ember5682" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">16 bpm</div>
</div>
<div id="ember5684" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5686" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5688" class="ember-view flowsheet-cell selectable"><!----></div>
</div>
<div id="ember5690" class="ember-view flowsheet-column editable">
<div class="flowsheet-cell flowsheet-table-header selectable" data-ember-action="5691">
<!---->
<div class="p-link-semibold" data-element="flowsheet-column-date">01/17/18</div>
<div class="p-link-semibold" data-element="flowsheet-column-time"><!----></div>
<!----></div>
<div id="ember5693" class="ember-view flowsheet-cell header-row"><!----></div>
<div id="ember5695" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">64 in</div>
</div>
<div id="ember5697" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">106 lb</div>
</div>
<div id="ember5699" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">18.19 </span><span class="p-666"
data-element="flowsheet-cell-units"><!----></span>
</div>
<div id="ember5701" class="ember-view flowsheet-cell"><!----></div>
<div id="ember5703" class="ember-view flowsheet-cell selectable wide-cell"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">123/84 mmHg</div>
</div>
<div id="ember5705" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">100.3 °F</div>
</div>
<div id="ember5707" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">77 bpm</div>
</div>
<div id="ember5709" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">14 bpm</div>
</div>
<div id="ember5711" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5713" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5715" class="ember-view flowsheet-cell selectable"><!----></div>
</div>
<div id="ember5717" class="ember-view flowsheet-column editable">
<div class="flowsheet-cell flowsheet-table-header selectable" data-ember-action="5718">
<!---->
<div class="p-link-semibold" data-element="flowsheet-column-date">02/17/18</div>
<div class="p-link-semibold" data-element="flowsheet-column-time"><!----></div>
<!----></div>
<div id="ember5720" class="ember-view flowsheet-cell header-row"><!----></div>
<div id="ember5722" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">64 in</div>
</div>
<div id="ember5724" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">106 lb</div>
</div>
<div id="ember5726" class="ember-view flowsheet-cell selectable"><!----><!---->
<span data-element="flowsheet-cell-value" class="">18.19 </span><span class="p-666"
data-element="flowsheet-cell-units"><!----></span>
</div>
<div id="ember5728" class="ember-view flowsheet-cell"><!----></div>
<div id="ember5730" class="ember-view flowsheet-cell selectable wide-cell"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">121/70 mmHg</div>
</div>
<div id="ember5732" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">100.8 °F</div>
</div>
<div id="ember5734" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">77 bpm</div>
</div>
<div id="ember5736" class="ember-view flowsheet-cell selectable"><!----><!---->
<div data-element="flowsheet-cell-value-and-units" class="p-link">18 bpm</div>
</div>
<div id="ember5738" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5740" class="ember-view flowsheet-cell selectable"><!----></div>
<div id="ember5742" class="ember-view flowsheet-cell selectable"><!----></div>
</div>
<div id="ember5743" class="ember-view flowsheet-column flowsheet-add-column current-context">
<div class="flowsheet-cell selectable" data-ember-action="5744">
<a class="icon icon-add" data-element="flowsheet-add-column-button"></a>
</div>
<div class="flowsheet-cell header-row"></div>
<div class="flowsheet-cell selectable" data-ember-action="5745"></div>
<div class="flowsheet-cell selectable" data-ember-action="5746"></div>
<div class="flowsheet-cell selectable" data-ember-action="5747"></div>
<div class="flowsheet-cell selectable" data-ember-action="5748"></div>
<div class="flowsheet-cell selectable" data-ember-action="5749"></div>
<div class="flowsheet-cell selectable" data-ember-action="5750"></div>
<div class="flowsheet-cell selectable" data-ember-action="5751"></div>
<div class="flowsheet-cell selectable" data-ember-action="5752"></div>
<div class="flowsheet-cell selectable" data-ember-action="5753"></div>
<div class="flowsheet-cell selectable" data-ember-action="5754"></div>
<div class="flowsheet-cell selectable" data-ember-action="5755"></div>
</div>
</div>
<div id="ember4390" class="ember-view"><!----></div>
<div id="ember4399" class="ember-view"><!---->
<!----></div>
<div id="ember4400" class="ember-view"><!----></div>
</div>
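Here is the complete script. It takes a few seconds to run: because the value and units elements can be structured differently from cell to cell, each lookup for an element that may be absent has to sit through a short but necessary timeout: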
import org.openqa.selenium.By;
import org.openqa.selenium.NoSuchElementException;
import org.openqa.selenium.SearchContext;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import org.openqa.selenium.chrome.ChromeOptions;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class WebDriverScratch {

    private static final String CHROME_DRIVER = "C:\\Users\\Joe\\Downloads\\chromedriver_win32\\chromedriver.exe";
    private static final String HTML_FILE = "file:///C:/Users/Joe/IdeaProjects/scratch/src/main/resources/test.html";
    private static final long SHORT_TIMEOUT = 1; // 1 second
    private static final long VERY_SHORT_TIMEOUT_MILLIS = 100; // 100 milliseconds
    private static final String EMPTY = "";

    public static void main(String[] args) {
        WebDriver driver = getDriver();
        driver.get(HTML_FILE);
        driver.manage().timeouts().implicitlyWait(SHORT_TIMEOUT, TimeUnit.SECONDS);

        // row labels (Height, Weight, ...) from the fixed header column
        List<WebElement> labels = driver.findElements(By.cssSelector("div.header div.flowsheet-row.ellipsis"));
        // one date element per data column
        List<WebElement> dateElms = driver.findElements(By.cssSelector("div.flowsheet-table-header div:first-child"));
        List<ColumnData> cols = new ArrayList<>(dateElms.size());

        for (WebElement date : dateElms) {
            // parse the data for this column
            ColumnData col = new ColumnData(date.getText());
            // date -> header cell -> column
            WebElement column = date.findElement(By.xpath("../.."));
            List<WebElement> dataCells = column.findElements(By.cssSelector("div.ember-view.flowsheet-cell:not(.header-row)"));

            for (int i = 0; i < dataCells.size(); i++) {
                // get the header label by index
                WebElement dataCell = dataCells.get(i);
                String label = labels.get(i).getText();

                WebElement valueAndUnits = findElementOrNull(driver, dataCell, By.cssSelector("div[data-element='flowsheet-cell-value-and-units']"));
                if (valueAndUnits != null) {
                    col.addEntry(label, valueAndUnits.getText());
                } else {
                    // could be empty data cell, or separate value and units
                    WebElement value = findElementOrNull(driver, dataCell, By.cssSelector("span[data-element='flowsheet-cell-value']"));
                    if (value != null) {
                        // since value element is present, assume units element is too
                        WebElement units = dataCell.findElement(By.cssSelector("span[data-element='flowsheet-cell-units']"));
                        col.addEntry(label, value.getText(), units.getText());
                    } else {
                        // empty data
                        col.addEntry(label, EMPTY);
                    }
                }
            }

            // done parsing data for this column
            cols.add(col);
        }

        // quit() ends the whole browser session, not just the current window
        driver.quit();

        // print it all out
        printData(cols);
    }

    /**
     * simple container for holding parsed data
     */
    private static class ColumnData {

        public static class Entry {
            public final String label;
            public final String valueAndUnits;

            public Entry(String label, String valueAndUnits) {
                this.label = label;
                this.valueAndUnits = valueAndUnits;
            }
        }

        public final String date;
        public final List<Entry> entries = new ArrayList<>();

        public ColumnData(String date) {
            this.date = date;
        }

        public void addEntry(String label, String valueAndUnits) {
            entries.add(new Entry(label, valueAndUnits));
        }

        public void addEntry(String label, String value, String units) {
            entries.add(new Entry(label, String.format("%s %s", value, units)));
        }
    }

    private static WebDriver getDriver() {
        System.setProperty("webdriver.chrome.driver", CHROME_DRIVER);
        ChromeOptions options = new ChromeOptions();
        options.addArguments("disable-infobars");
        options.addArguments("start-maximized");
        return new ChromeDriver(options);
    }

    /**
     * returns null if element is not present
     */
    private static WebElement findElementOrNull(WebDriver driver, SearchContext parent, By by) {
        // shorten the implicit wait so a missing element fails fast
        driver.manage().timeouts().implicitlyWait(VERY_SHORT_TIMEOUT_MILLIS, TimeUnit.MILLISECONDS);
        try {
            return parent.findElement(by);
        } catch (NoSuchElementException ex) {
            return null;
        } finally {
            // restore the normal implicit wait
            driver.manage().timeouts().implicitlyWait(SHORT_TIMEOUT, TimeUnit.SECONDS);
        }
    }

    private static void printData(List<ColumnData> allData) {
        for (ColumnData data : allData) {
            System.out.println("Date: " + data.date);
            for (ColumnData.Entry entry : data.entries) {
                System.out.println(String.format(" %s: %s", entry.label, entry.valueAndUnits));
            }
        }
    }
}
Output:
Date: 10/17/16
Height: 64 in
Weight: 106 lb
BMI: 18.19
BMI Percentile:
BP: 111/83 mmHg
Temperature: 100.9 °F
Pulse: 86 bpm
Respiratory rate: 14 bpm
O2 Saturation:
Pain:
Head Circumference:
Date: 11/17/17
Height: 64 in
Weight: 109.99 lb
BMI: 18.88
BMI Percentile:
BP: 116/72 mmHg
Temperature: 101.2 °F
Pulse: 87 bpm
Respiratory rate: 16 bpm
O2 Saturation:
Pain:
Head Circumference:
Date: 01/17/18
Height: 64 in
Weight: 106 lb
BMI: 18.19
BMI Percentile:
BP: 123/84 mmHg
Temperature: 100.3 °F
Pulse: 77 bpm
Respiratory rate: 14 bpm
O2 Saturation:
Pain:
Head Circumference:
Date: 02/17/18
Height: 64 in
Weight: 106 lb
BMI: 18.19
BMI Percentile:
BP: 121/70 mmHg
Temperature: 100.8 °F
Pulse: 77 bpm
Respiratory rate: 18 bpm
O2 Saturation:
Pain:
Head Circumference:
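The question asks for one line per column ("Date 10/17/16 Height 64 Weight 106 ...") rather than one line per vital. A minimal sketch of that formatting step follows, assuming the parsed labels and values for one column are already held in an insertion-ordered map; `ColumnLineFormatter` and `formatLine` are hypothetical names, not part of the script above, and skipping empty readings is a design choice of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: formats one parsed column as a single line.
final class ColumnLineFormatter {

    // entries maps label -> "value units"; an empty string means no reading.
    static String formatLine(String date, Map<String, String> entries) {
        StringBuilder sb = new StringBuilder("Date ").append(date);
        for (Map.Entry<String, String> e : entries.entrySet()) {
            String value = e.getValue().trim();
            if (value.isEmpty()) {
                continue; // assumption: vitals without a reading are left out
            }
            sb.append(' ').append(e.getKey()).append(' ').append(value);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the vitals in the order they were added
        Map<String, String> col = new LinkedHashMap<>();
        col.put("Height", "64 in");
        col.put("Weight", "106 lb");
        col.put("BMI Percentile", "");
        // prints: Date 10/17/16 Height 64 in Weight 106 lb
        System.out.println(formatLine("10/17/16", col));
    }
}
```

In the script above, the same loop could run over `data.entries` inside `printData` instead of over a map.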
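Answer 1 (score: 0)
To print rows such as Date 10/17/16 Height 64 Weight 106 ........., you can create three lists: one for the items in the Vitals column, one for the values, and one for the units. You can then wrap this in a function that accepts the desired date as an input parameter and prints the relevant data as follows: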
public void printReport(String myDate) {
    // row labels from the fixed "Vitals" header column
    List<WebElement> rowHeads = driver.findElements(By.xpath("//div[@class='ember-view flowsheet-table']/div[@class='header flowsheet-column']//div[@class='ember-view flowsheet-row ellipsis']"));
    // cells of the column headed by myDate: step up to the header cell and take its
    // following siblings, so that cells of later columns are not matched as well
    String cells = "//div[@class='flowsheet-scroll-region']//div[@class='ember-view flowsheet-column']//div[@class='p-semibold'][.='" + myDate + "']/../following-sibling::div[contains(@class,'ember-view') and contains(@class,'flowsheet-cell') and contains(@class,'selectable')]";
    List<WebElement> rowValues = driver.findElements(By.xpath(cells + "/span[@data-element='flowsheet-cell-value']"));
    List<WebElement> rowUnits = driver.findElements(By.xpath(cells + "//span[@data-element='flowsheet-cell-units']"));
    // the three lists are paired by index, which assumes every vital has a value span
    // and a units span; an empty cell shifts the labels of the rows after it
    for (int i = 0; i < rowValues.size(); i++) {
        System.out.println(rowHeads.get(i).getText() + " : " + rowValues.get(i).getText() + " " + rowUnits.get(i).getText());
    }
}
You can call printReport() from anywhere in your script to print the data for a given date, as follows:
printReport("10/17/16");
printReport("11/17/17");
printReport("01/17/18");
printReport("02/17/18");
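Because printReport() pairs labels with values by position, a column with empty cells can end up with values printed under the wrong label. One possible way around that, sketched here under the assumption that the page uses the markup from the HTML revision shown in answer 0, is to select every data cell of the chosen column (empty ones included) and pair the cell text with the labels. `ColumnReport`, `cellsXPath` and `pair` are hypothetical names, and the date is assumed to contain no single quote.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a column-scoped XPath and pairs labels with cell text.
final class ColumnReport {

    // XPath for all data cells of the one column whose header shows the given date.
    static String cellsXPath(String date) {
        return "//div[contains(@class,'flowsheet-table-header')]"
                + "/div[normalize-space(.)='" + date + "']"
                // nearest enclosing column, so other columns are never matched
                + "/ancestor::div[contains(@class,'flowsheet-column')][1]"
                // data cells only; the header-row spacer is excluded
                + "/div[contains(@class,'ember-view') and contains(@class,'flowsheet-cell')"
                + " and not(contains(@class,'header-row'))]";
    }

    // Pairs each label with the text of the cell at the same position.
    static List<String> pair(List<String> labels, List<String> cellTexts) {
        int n = Math.min(labels.size(), cellTexts.size());
        List<String> lines = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            lines.add(labels.get(i) + " : " + cellTexts.get(i).trim());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(cellsXPath("10/17/16"));
        List<String> labels = Arrays.asList("Height", "Weight", "BMI Percentile", "BP");
        List<String> texts = Arrays.asList("64 in", "106 lb", "", "111/83 mmHg");
        // the empty BMI Percentile cell keeps its slot, so BP stays aligned
        for (String line : pair(labels, texts)) {
            System.out.println(line);
        }
    }
}
```

With Selenium, the string from `cellsXPath` would be passed to `driver.findElements(By.xpath(...))` and the `getText()` of each returned element collected into the `cellTexts` list; a cell's text already includes its units, so no separate units list is needed.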