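CSV to array, but first remove rows with extra fields

Time: 2018-03-07 07:12:40

Tags: php arrays csv

I have a CSV file with a header row, and occasionally a row contains an extra field. This happens because a comma inside a text field was not escaped.

Is there a way to remove such a row before converting the file to an array?

Sample CSV file: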

CUST_NUMBER,PO_NUMBER,NAME,SERVICE,DATE,BOX_NUMBER,TRACK_NO,ORDER_NO,INV_NO,INV_AMOUNT
757626003,7383281,JACK SMITH,GND,20180306,1,1Z1370750453578430,2018168325,119348,70.70
757626003,7383282,GERALD SMITH, JR.,GND,20180306,1,1Z9R67670395033411,2018168326,119513,63.72
757626003,7383233,SCOTT R SMITH,GND,20180306,1,1Z1370750982624042,2018168329,119349,39.33

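As you can see, line 3 has an extra field because GERALD SMITH, JR. contains a comma that was not escaped. This puts the JR. part in the SERVICE column and shifts GND and every value after it one column to the right, so the last value ends up in a column with no header.

When a row contains more fields than the header, I want to remove the entire row.

After removing those rows, I convert the remaining CSV to an array like this.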

<?php
    $csv = array_map("str_getcsv", file("FILE.CSV",FILE_SKIP_EMPTY_LINES));

    $keys = array_shift($csv);

    foreach ($csv as $i => $row) {
        if(count($keys) == count($row)){
            $csv[$i] = array_combine($keys, $row);
        }
    }
?>

3 answers:

Answer 0: (score: 1)

As @Scuzzy suggested, unset the bad rows:

<?php
    $csv = array_map("str_getcsv", file("FILE.CSV",FILE_SKIP_EMPTY_LINES));

    $keys = array_shift($csv);

    foreach ($csv as $i => $row) {
        if(count($keys) == count($row)){
            $csv[$i] = array_combine($keys, $row);
        }
        else unset($csv[$i]);
    }
?>

Answer 1: (score: 1)

<?php

$data=<<<DATA
NUMBER,NAME,SERVICE
7375536,Ron,GND
7369530,RANDY,GND
7383287,Gilbert, JR.,GND
7383236,SCOTT,GND
DATA;

$data = array_map('str_getcsv', explode("\n", $data));
$keys = array_shift($data);
$data = array_filter($data, function($v) {
    return count($v) == 3;
});

var_export($data);

Output:

array (
  0 => 
  array (
    0 => '7375536',
    1 => 'Ron',
    2 => 'GND',
  ),
  1 => 
  array (
    0 => '7369530',
    1 => 'RANDY',
    2 => 'GND',
  ),
  3 => 
  array (
    0 => '7383236',
    1 => 'SCOTT',
    2 => 'GND',
  ),
)
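
To get the column headers as keys instead of numeric indexes, the filtered rows can be passed through array_combine(). The following is only a sketch under a few assumptions: it reuses the sample data above, compares against count($keys) rather than the hard-coded 3, and passes the delimiter, enclosure and escape arguments to str_getcsv() explicitly (an empty escape string needs PHP 7.4 or later).

```php
<?php

// Same sample data as above; the fourth data line has an unescaped comma in NAME.
$data = <<<DATA
NUMBER,NAME,SERVICE
7375536,Ron,GND
7369530,RANDY,GND
7383287,Gilbert, JR.,GND
7383236,SCOTT,GND
DATA;

// Parse every line into an array of fields.
$rows = array_map(function ($line) {
    return str_getcsv($line, ',', '"', '');
}, explode("\n", $data));

// The first parsed line is the header.
$keys = array_shift($rows);

// Keep only rows whose field count matches the header.
$rows = array_filter($rows, function ($row) use ($keys) {
    return count($row) === count($keys);
});

// Replace the numeric field indexes with the column headers.
$rows = array_map(function ($row) use ($keys) {
    return array_combine($keys, $row);
}, $rows);

var_export($rows);
```

The outer keys stay 0, 1 and 3, because array_filter() and a single-array array_map() both preserve keys.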

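Answer 2: (score: 1)

With array_filter you can drop unwanted items in a callback. This version uses the $keys array for the test (the same check you use) and passes it into the callback with use...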

$csv = array_map("str_getcsv", file("books.csv",FILE_SKIP_EMPTY_LINES));
$keys = array_shift($csv);

$output = array_filter($csv, function($row) use ($keys) {
    return count($row) == count($keys);
});
$output = array_values($output);
print_r($output);

So every row that does not have the same number of columns as the header is removed.

I've just added the array_values() call to re-index the array.
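
Why the re-indexing matters can be seen in a small sketch. The three rows below are made-up sample values in the shape of the question's data, and the fixed count of 3 is only for illustration.

```php
<?php

$rows = [
    ['7375536', 'Ron', 'GND'],
    ['7383287', 'Gilbert', ' JR.', 'GND'], // one field too many
    ['7383236', 'SCOTT', 'GND'],
];

$filtered = array_filter($rows, function ($row) {
    return count($row) === 3;
});

// array_filter() keeps the original keys, so index 1 is now missing.
$keysBefore = array_keys($filtered);

// array_values() renumbers the remaining rows consecutively from 0.
$keysAfter = array_keys(array_values($filtered));

print_r($keysBefore);
print_r($keysAfter);
```

Without the re-index, a plain for loop over 0..count-1 would hit the gap left by the removed row.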

If you can generate the file with quoted fields, the problem goes away...

NUMBER,NAME,SERVICE
7375536,Ron,GND
7369530,RANDY,GND
7383287,"Gilbert, JR.",GND
7383236,SCOTT,GND

You can enclose any text field in the quote character of your choice to make sure this problem doesn't come up again.
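
If the file happens to be produced by PHP, which is an assumption since the question does not say where it comes from, fputcsv() adds the enclosure automatically whenever a field needs it. A minimal sketch of the round trip, using an in-memory stream instead of a real file:

```php
<?php

$fields = ['7383287', 'Gilbert, JR.', 'GND'];

// Write one row to an in-memory stream; fputcsv() quotes the NAME field
// because it contains the delimiter.
$handle = fopen('php://memory', 'w+');
fputcsv($handle, $fields, ',', '"', '');
rewind($handle);
$line = rtrim(stream_get_contents($handle), "\n");
fclose($handle);

echo $line, "\n";

// Reading the quoted line back yields exactly three fields again.
$parsed = str_getcsv($line, ',', '"', '');
```

The empty escape argument requires PHP 7.4 or later; on older versions it can simply be left out.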

...An alternative:

$csv = array_map("str_getcsv", file("FILE.CSV",FILE_SKIP_EMPTY_LINES));

$keys = array_shift($csv);
$out = array();
foreach ($csv as $row) {
    if(count($keys) == count($row)){
        $out[] = array_combine($keys, $row);
    }
}

Last update: while I was waiting to go out, I tried the following. It attempts to repair the data, so you can get every row from the file...

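// $csv and $keys are the parsed rows and the header from the snippet above.
$out = array();
foreach ($csv as $row) {
    if(count($keys) != count($row)){
        // Hard-coded for the 10-column sample, with the stray commas in the
        // third column (NAME): keep the first 2 fields, glue the surplus
        // fields back together, then append the last 7 fields unchanged.
        $row = array_merge(array_slice($row, 0, 2),
                [implode(",", array_slice($row, 2, count($row)-9))],
                array_slice($row, count($row)-7));
    }
    $out[] = array_combine($keys, $row);
}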
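
The same repair can be written without the fixed offsets. The helper below is hypothetical (repairRow and its parameters are names made up for this sketch) and rests on the assumption that every stray comma belongs to one known text column, here NAME at index 2.

```php
<?php

/**
 * Hypothetical helper: fold surplus fields back into a single text column.
 * $expected is the header's field count; $textIndex is the zero-based
 * position of the column assumed to contain the unescaped commas.
 */
function repairRow(array $row, int $expected, int $textIndex): array
{
    $extra = count($row) - $expected;
    if ($extra <= 0) {
        return $row; // nothing to fold back
    }

    return array_merge(
        array_slice($row, 0, $textIndex),
        [implode(',', array_slice($row, $textIndex, $extra + 1))],
        array_slice($row, $textIndex + $extra + 1)
    );
}

// Header and the bad line from the question's sample file.
$keys = str_getcsv('CUST_NUMBER,PO_NUMBER,NAME,SERVICE,DATE,BOX_NUMBER,TRACK_NO,ORDER_NO,INV_NO,INV_AMOUNT', ',', '"', '');
$row = str_getcsv('757626003,7383282,GERALD SMITH, JR.,GND,20180306,1,1Z9R67670395033411,2018168326,119513,63.72', ',', '"', '');

$fixed = array_combine($keys, repairRow($row, count($keys), 2));
print_r($fixed);
```

Rows with too few fields are returned unchanged, so array_combine() would still need a guard for those.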