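How can I set the header automatically when the header row is not on the first line of the file?

Time: 2019-06-21 09:07:02

Tags: csv symfony4

The CSV files I process have a varying number of lines before the header row. I need to detect the header automatically for each file.

Here is a sample file: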

            Wine Directory List



            Wine Title  Vintage Country Region  Sub region  Appellation Color   Bottle Size Price   URL FORMAT
Chateau Petrus Pomerol  2011    France  Bordeaux    Pomerol     Red 750ML   2799.99 HTTP://holbrookliquors.com/sku218758.html   1x750ML
Pappy Van Winkle's Bourbon 15 Year Family Reserve       United States   Kentucky                0ML 999.99      1x0ML
Shipping Fee                            0ML 999.99  1x0ML
Heineken Holland Beer       Netherlands                 0ML 999.99  1x0ML

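Here is my converter:

UPDATE: first attempt: getHeaderLine(). The only drawback: once getHeaderLine() has scanned the file, the main loop can no longer read the header row, because getHeaderLine() has already consumed it. Can anyone out there help me, please?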

public function convert($filePath, $feedColumnsMatch)
{

    //this array will contain the elements from the file
    $articles = [];

    $headerRecord = [];

        //if we can open the file on mode "read"
        if (($handle = fopen($filePath, 'r')) !== FALSE) {
            //represents the line we are reading
            $rowCounter = 0;
            $nb = $feedColumnsMatch->getNumberOfColumns();

            $headerLine = $this->getHeaderLine($handle, $nb, $delimiter);

            //as long as there are lines
            while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {
//todo: remove the ugly hard-coded 9
                if ($nb===count($rowData)) {

                    //At x line, are written the keys so we record them in $headerRecord
 //What I had first     if (9 === $rowCounter) {
//What I now have
                        if(0 === $rowCounter) {
                        //trim the titles of columns
                        for ($i = 0; $i < $nb; $i++) {
                            $rowData[$i] = trim($rowData[$i]);
                        }

                        $headerRecord = $rowData;
                    }
                    elseif(9<$rowCounter )
                    {      //for every other lines...
                        foreach ($rowData as $key => $value) {       //in each line, for each value
                            // we set $value to the cell ($key) having the same horizontal position than $value
                            // but where vertical position = 0 (headerRecord[]
                            $articles[$rowCounter][$headerRecord[$key]] = mb_convert_encoding($value, "UTF-8");

                        }
                    }
                }
                $rowCounter++;
            }
            fclose($handle);
        }

    return $articles;
}

 public function getHeaderLine($handle, $nbColumns, $delimiter){
        $rowCounter = 0;
        while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {
            $rowCounter++;
            if ($nbColumns===count($rowData)){
                return $rowCounter;
            }

        }
        return -1;
    } 

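As you can see, I have to hard-code "9" in the if() to parse the data correctly, and I have to change it for every file.

1 Answer:

Answer 0: (score: 0)

When you have csv/tsv files that start with a varying number of invalid lines (blank lines, titles, labels, and so on), one way to get the header is to use a helper function that makes the first pass over the file. The first row with the correct number of cells (you need to know how many columns the csv file has) is your header. The helper returns the header data to the main function, which then continues parsing from the position where the helper stopped reading.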
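
The idea works because both functions share the same file handle, so the read position survives the helper call. Here is a minimal, self-contained sketch of just that mechanism. The function name findHeader(), the in-memory sample data and the ";" delimiter are assumptions made for illustration; they are not part of the converter itself.

```php
<?php
// Hypothetical helper: returns the first row that has exactly $nbColumns
// cells (with each cell trimmed), or null if no such row exists.
function findHeader($handle, int $nbColumns, string $delimiter): ?array
{
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        if (count($row) === $nbColumns) {
            return array_map('trim', $row);
        }
    }
    return null;
}

// Sample data held in memory: a title line and a blank line precede the header.
$handle = fopen('php://memory', 'r+');
fwrite($handle, "Wine Directory List\n\n Title ;Vintage;Country\n"
    . "Chateau Petrus;2011;France\nHeineken;;Netherlands\n");
rewind($handle);

$nbColumns = 3;
$header = findHeader($handle, $nbColumns, ';');

$articles = [];
if ($header !== null) {
    // The handle now points just after the header row,
    // so this loop starts at the first data row.
    while (($row = fgetcsv($handle, 0, ';', '"', '\\')) !== false) {
        if (count($row) === $nbColumns) {
            $articles[] = array_combine($header, $row);
        }
    }
}
fclose($handle);

print_r($articles);
```

Rows that do not have the expected number of cells are skipped in both loops, which is the same filtering rule the converter uses.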

The complete code:

public function convert($filePath, $feedColumnsMatch)
    {

        if(!file_exists($filePath) ) {
            return "file does not exist";
        }
        if(!is_readable($filePath)) {
            return "file is not readable";
        }

        //this array will contain the elements from the file
        $articles = [];

        if($feedColumnsMatch->getFeedFormat()==="tsv" || $feedColumnsMatch->getFeedFormat()==="csv"){
            if($feedColumnsMatch->getFeedFormat()==="csv"){
                $delimiter = $feedColumnsMatch->getDelimiter();
            }else{
                $delimiter = "\t";
            }

            //if we can open the file on mode "read"
            if (($handle = fopen($filePath, 'r')) !== FALSE) {
                //represents the line we are reading

                $nb = $feedColumnsMatch->getNumberOfColumns();
                $headerRecord = $this->getHeader($handle, $nb, $delimiter);           // getHeader() reads up to and including the header row, so the loop below starts at the first data row
                $rowCounter = 0;
                //only parse the rest of the file if a header was found
                if (null!==$headerRecord && false!==$headerRecord) {
                    //as long as there are lines
                    while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {

                        //if it is a line with valid number of cells
                        if ($nb === count($rowData)) {

                            foreach ($rowData as $key => $value) {       //in each line, for each value
                                // we set $value to the cell ($key) having the same horizontal position than $value
                                // but where vertical position = 0 (headerRecord[]
                                $articles[$rowCounter][$headerRecord[$key]] = mb_convert_encoding($value, "UTF-8");
                            }
                        }
                        $rowCounter++;
                    }
                }
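                else{
                    //no header found: close the handle before reporting the problem
                    fclose($handle);
                    throw new \Exception("No header row with " . $nb . " columns found in " . $filePath);
                }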
                fclose($handle);
            }
        }
        return $articles;
    }


    /**
     * is used to get the data of the row containing the header of the csv file or null if no header
     * @param $handle
     * @param $nbColumns
     * @param $delimiter
     * @return array|false|null
     */
    public function getHeader($handle, $nbColumns, $delimiter){
        $rowCounter = 0;
        while (($rowData = fgetcsv($handle, 5000, $delimiter)) !== FALSE) {
            $rowCounter++;
            if ($nbColumns===count($rowData)){
                //trim the titles of columns
                for ($i = 0; $i < $nbColumns; $i++) {
                    $rowData[$i] = trim($rowData[$i]);
                }

                return $rowData;
            }
        }
        return null;
    }

However, I don't know whether this is the right way to do it. It is just one way that works.