删除CSV中的空白行

时间:2018-06-26 10:42:34

标签: java selenium opencsv

我正在从站点抓取数据,并使用站点数据生成一个csv文件。我正在将opencsv jar和硒jar文件一起用于我的程序。 CSV文件正在生成,但是在每一行之后,都会生成空白行。我试图消除相同但不能消除。这是我的代码:-

package automation;

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.List;



import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;


import com.opencsv.CSVWriter;

import com.opencsv.exceptions.CsvException;
import com.opencsv.exceptions.CsvRequiredFieldEmptyException;
import java.time.ZonedDateTime;

public class RecruitmeentTestApp {

    public static void main(String[] args) throws IOException, CsvException, CsvRequiredFieldEmptyException {

           //WebDriver driver = new FirefoxDriver();
        //Fetching Remote Control driver for Chrome browser
           WebDriver driver = new ChromeDriver();
           driver.get("https://www.premierleague.com/tables/");
           driver.manage().window().maximize();

           //Setting the file name in milliseconds of execution as per requirement
           long miliSec=ZonedDateTime.now().toInstant().toEpochMilli();
           File file = new File(miliSec+ ".csv") ;



           WebElement table=driver.findElement(By.xpath("//table"));
           List<WebElement> rowsList = table.findElements(By.tagName("tr"));
           //List<WebElement> headerList=rowsList.get(0).findElements(By.xpath(".//th"));

           List<WebElement> columnsList = null;

           BufferedWriter writer = new BufferedWriter(new FileWriter(file));

           CSVWriter csvWriter = new CSVWriter(writer,
                                CSVWriter.DEFAULT_SEPARATOR,
                                CSVWriter.NO_QUOTE_CHARACTER,
                                CSVWriter.DEFAULT_ESCAPE_CHARACTER,
                                CSVWriter.DEFAULT_LINE_END);

           //As the table is static so creating headerRecord and writing it to csv file

           String[] headerRecord = {"More","Postion", "Club", "Played", "Won","Drawn","Lost","GF","GA","GD","Points","Next"};
           csvWriter.writeNext(headerRecord);


           for (WebElement row : rowsList) {
                  System.out.println();
                  columnsList = row.findElements(By.tagName("td"));
                  String[]colText=new String[columnsList.size()];
                  int i=0;
                  for(WebElement column: columnsList) {

                      colText[i]=column.getText();
                      i++;

                  }
                 //writing the output to csv file

                 csvWriter.writeNext(colText);


                       }

                   //Closing the stream
                   csvWriter.close();




}

}

1 个答案:

答案 0 :(得分:2)

https://www.premierleague.com/tables/的HTML中,每个可见tr后面都有一个隐藏的折叠tr(其具有colspan = '13')

<tr class="tableDark" data-compseason="210" data-filtered-entry-size="20" data-filtered-table-row="1" data-filtered-table-row-name="Arsenal" data-filtered-table-row-opta="t3" data-filtered-table-row-abbr="1">

   <td class="revealMore" style="display: table-cell;" tabindex="0" role="button">
       <div class="icn chevron-down-g"></div>
   </td>
    <td class="pos" tabindex="0">
        <span class="value">1</span>
    </td>
    <td class="team" scope="row">
        <a href="/clubs/1/Arsenal/overview"><span class="badge-25 t3"></span> <span class="long">Arsenal</span><span class="short">ARS</span></a>
    </td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td>0</td>
    <td class="hideSmall">0</td>
    <td class="hideSmall">0</td>
    <td> 
    0

</td>
    <td class="points">0</td>

<td class="nextMatchCol hideMed">
        <span tabindex="0" class="button-tooltip" id="Tooltip">
            <span class="nextMatch"><span class="badge-20 t43"><span class="visuallyHidden">Manchester City</span></span></span>
            <a href="/match/38308" class="tooltipContainer linkable tooltip-link tooltip-right" role="tooltip">
                <span class="tooltip-content">
                    <div class="matchAbridged">
                            <span class="matchInfo">Saturday 11 August 2018</span>
<span class="teamName"><abbr title="Arsenal">ARS</abbr></span>
<span class="badge-20 t3"></span>
<time>15:00</time>
<span class="badge-20 t43"></span>
<span class="teamName"><abbr title="Manchester City">MCI</abbr></span>
<span class="icn arrow-right"></span>

                    </div>
                </span>
            </a>
        </span>
</td>
</tr>
<tr class="expandable" data-filtered-table-row-expander="1">
    <td colspan="13">
        <a href="/clubs/1/Arsenal/overview" class="expandableTeam">
            <span class="badge-50 t3"></span>
            <span class="teamName">Arsenal</span>
        </a>
        <div class="expandableFixtures">
                <div class="resultWidget">
                    <div class="label"><strong>Next Fixture</strong> - Saturday 11 August 2018</div>
                    <a href="/match/38308" class="matchAbridged pre">
<span class="teamName"><abbr title="Arsenal">ARS</abbr></span>
<span class="badge-20 t3"></span>
<time>15:00</time>
<span class="badge-20 t43"></span>
<span class="teamName"><abbr title="Manchester City">MCI</abbr></span>
<span class="icn arrow-right"></span>
                    </a>
                </div>
            <div class="btnContainer">
                <a href="/clubs/1/Arsenal/overview" class="btn-highlight" role="btn">Visit <span class="visuallyHidden">Arsenal </span>Club Page<span class="icn arrow-right-w"></span></a>
            </div>
        </div>
        <div class="teamPerformanceStandingsArea" style="display:none;">
            <header>
                <h3 class="subHeader left">Performance Chart</h3>
                <a href="/stats/comparison" class="btn right">Compare against another team<span class="icn arrow-right"></span></a>
            </header>
            <div class="teamPerformanceStandingsContainer"></div>
        </div>
    </td>
</tr>

因此,跳过添加具有colspan = '13'属性的所有备用tr的操作,这将导致将空行添加到csv文件中。