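I am scraping data from a site and generating a CSV file from it. I am using the opencsv jar together with the Selenium jar in my program. The CSV file is generated, but a blank line is written after every row. I have tried to get rid of these blank lines but could not. Here is my code: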
package automation;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
import com.opencsv.CSVWriter;
import com.opencsv.exceptions.CsvException;
import com.opencsv.exceptions.CsvRequiredFieldEmptyException;
import java.time.ZonedDateTime;
public class RecruitmeentTestApp {
public static void main(String[] args) throws IOException, CsvException, CsvRequiredFieldEmptyException {
//WebDriver driver = new FirefoxDriver();
//Fetching Remote Control driver for Chrome browser
WebDriver driver = new ChromeDriver();
driver.get("https://www.premierleague.com/tables/");
driver.manage().window().maximize();
//Setting the file name in milliseconds of execution as per requirement
long miliSec=ZonedDateTime.now().toInstant().toEpochMilli();
File file = new File(miliSec+ ".csv") ;
WebElement table=driver.findElement(By.xpath("//table"));
List<WebElement> rowsList = table.findElements(By.tagName("tr"));
//List<WebElement> headerList=rowsList.get(0).findElements(By.xpath(".//th"));
List<WebElement> columnsList = null;
BufferedWriter writer = new BufferedWriter(new FileWriter(file));
CSVWriter csvWriter = new CSVWriter(writer,
CSVWriter.DEFAULT_SEPARATOR,
CSVWriter.NO_QUOTE_CHARACTER,
CSVWriter.DEFAULT_ESCAPE_CHARACTER,
CSVWriter.DEFAULT_LINE_END);
//As the table is static so creating headerRecord and writing it to csv file
String[] headerRecord = {"More","Position", "Club", "Played", "Won","Drawn","Lost","GF","GA","GD","Points","Next"};
csvWriter.writeNext(headerRecord);
for (WebElement row : rowsList) {
System.out.println();
columnsList = row.findElements(By.tagName("td"));
String[]colText=new String[columnsList.size()];
int i=0;
for(WebElement column: columnsList) {
colText[i]=column.getText();
i++;
}
//writing the output to csv file
csvWriter.writeNext(colText);
}
//Closing the stream
csvWriter.close();
}
}
Answer 0: (score: 2)
In the HTML of https://www.premierleague.com/tables/, every visible tr is followed by a hidden, collapsed tr (class "expandable") that contains a single td with colspan="13":
<tr class="tableDark" data-compseason="210" data-filtered-entry-size="20" data-filtered-table-row="1" data-filtered-table-row-name="Arsenal" data-filtered-table-row-opta="t3" data-filtered-table-row-abbr="1">
<td class="revealMore" style="display: table-cell;" tabindex="0" role="button">
<div class="icn chevron-down-g"></div>
</td>
<td class="pos" tabindex="0">
<span class="value">1</span>
</td>
<td class="team" scope="row">
<a href="/clubs/1/Arsenal/overview"><span class="badge-25 t3"></span> <span class="long">Arsenal</span><span class="short">ARS</span></a>
</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td>0</td>
<td class="hideSmall">0</td>
<td class="hideSmall">0</td>
<td>
0
</td>
<td class="points">0</td>
<td class="nextMatchCol hideMed">
<span tabindex="0" class="button-tooltip" id="Tooltip">
<span class="nextMatch"><span class="badge-20 t43"><span class="visuallyHidden">Manchester City</span></span></span>
<a href="/match/38308" class="tooltipContainer linkable tooltip-link tooltip-right" role="tooltip">
<span class="tooltip-content">
<div class="matchAbridged">
<span class="matchInfo">Saturday 11 August 2018</span>
<span class="teamName"><abbr title="Arsenal">ARS</abbr></span>
<span class="badge-20 t3"></span>
<time>15:00</time>
<span class="badge-20 t43"></span>
<span class="teamName"><abbr title="Manchester City">MCI</abbr></span>
<span class="icn arrow-right"></span>
</div>
</span>
</a>
</span>
</td>
</tr>
<tr class="expandable" data-filtered-table-row-expander="1">
<td colspan="13">
<a href="/clubs/1/Arsenal/overview" class="expandableTeam">
<span class="badge-50 t3"></span>
<span class="teamName">Arsenal</span>
</a>
<div class="expandableFixtures">
<div class="resultWidget">
<div class="label"><strong>Next Fixture</strong> - Saturday 11 August 2018</div>
<a href="/match/38308" class="matchAbridged pre">
<span class="teamName"><abbr title="Arsenal">ARS</abbr></span>
<span class="badge-20 t3"></span>
<time>15:00</time>
<span class="badge-20 t43"></span>
<span class="teamName"><abbr title="Manchester City">MCI</abbr></span>
<span class="icn arrow-right"></span>
</a>
</div>
<div class="btnContainer">
<a href="/clubs/1/Arsenal/overview" class="btn-highlight" role="btn">Visit <span class="visuallyHidden">Arsenal </span>Club Page<span class="icn arrow-right-w"></span></a>
</div>
</div>
<div class="teamPerformanceStandingsArea" style="display:none;">
<header>
<h3 class="subHeader left">Performance Chart</h3>
<a href="/stats/comparison" class="btn right">Compare against another team<span class="icn arrow-right"></span></a>
</header>
<div class="teamPerformanceStandingsContainer"></div>
</div>
</td>
</tr>
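So skip every alternate tr whose td has the colspan="13" attribute. These hidden rows yield no visible text, so writing them is what adds the blank lines to the CSV file.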
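One way to do the skipping is sketched below. It deliberately leaves Selenium and opencsv out so it runs with the JDK alone: each scraped row is modelled as the `String[]` that the question's loop builds in `colText`. The class name `RowFilterSketch`, the helper names, and the sample cell values are made up for illustration. The expected count of 12 cells is an assumption taken from the header record in the question and the td elements in the HTML above. In the real program the same check would probably go just before `csvWriter.writeNext(colText)`, for example `if (colText.length != headerRecord.length) continue;`. Alternatively, the rows could be selected with something like `table.findElements(By.cssSelector("tbody tr:not(.expandable)"))`, assuming the data rows sit inside a tbody.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original program.
final class RowFilterSketch {

    // Assumption: a real data row has as many <td> cells as the header record (12).
    static final int EXPECTED_COLUMNS = 12;

    // The header <tr> contains only <th> cells, so it yields 0 <td> values.
    // Each hidden "expandable" <tr> yields a single <td colspan="13">.
    // Neither matches the expected count, so both are rejected.
    static boolean isDataRow(String[] cells) {
        return cells.length == EXPECTED_COLUMNS;
    }

    // Returns only the rows that should be written to the CSV file.
    static List<String[]> keepDataRows(List<String[]> rows) {
        List<String[]> kept = new ArrayList<>();
        for (String[] cells : rows) {
            if (isDataRow(cells)) {
                kept.add(cells);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up sample mimicking what the scraping loop collects per <tr>.
        List<String[]> scraped = Arrays.asList(
            new String[0],  // header row: no <td> cells
            new String[] {"", "1", "Arsenal", "0", "0", "0", "0", "0", "0", "0", "0", "MCI"},
            new String[] {""},  // hidden expandable row: one empty cell
            new String[] {"", "2", "Bournemouth", "0", "0", "0", "0", "0", "0", "0", "0", "CAR"},
            new String[] {""}   // hidden expandable row
        );

        // Only the two team rows are printed, with no blank lines in between.
        for (String[] row : keepDataRows(scraped)) {
            System.out.println(String.join(",", row));
        }
    }
}
```

Filtering on the cell count also drops the header tr, which would otherwise write one more blank line right after the header record.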