如何在转换为json之前验证来自csv的数据?

时间:2017-01-06 05:38:47

标签: php json csv

我有一个像这样的csv文件: -

+------+------+------------------------+
| name | mark |        url             |
+------+------+------------------------+
| ABCD |    5 | http://www.example.org |
| BCD  |   -2 | http://www.example.com |
| CD   |    4 | htt://www.c.com        |
+------+------+------------------------+

它包含name,mark和url的标头。我使用PHP将数据从csv转换为json。我想在将它转换为json之前添加验证,就像它应该是UTF-8的名称一样,标记应该是一个正数,在0到5之间,url应该是有效的。如果该行通过所有验证,那么它将存储在errorless.json中,如果任何行有任何问题,那么在error.json中注释错误。 csv到json的PHP代码: -

$fh = fopen("names.csv", "r");

$csvdata = array();

while (($row = fgetcsv($fh, 0, ",")) !== FALSE) {
    $csvdata[] = $row;
}

$fp = fopen('data.json', 'w');
fwrite($fp, json_encode($csvdata));
fclose($fp);

我想知道如何添加这些验证来转换数据。我是这些概念的新手,无法想办法。如果有人能帮助我,我将非常感激。

3 个答案:

答案 0 :(得分:0)

以下是你要做的事情:

// to make life easier on yourself create 
// a function that checks one row of data 
// to make sure it is valid
// if a row is found to be invalid, the
// function will add an `error` field to the 
// array explaining the validation error   
function validateRow(&$data) {
    if (!array_key_exists('mark', $data) && ((int)$data['mark']) >= 0 && ((int)$data['mark']) <= 5) {
        $data['error'] = "No 'mark' value found between 0 and 5";
        return false;
    }
    // do validation for $data['name']
    // do validation for $data['url']
    return true;
}

$fh = fopen("names.csv", "r");

$csvdata = array();

while (($row = fgetcsv($fh, 0, ",")) !== FALSE) {
    $csvdata[] = $row;
}
$header = $csvdata[0];
$n = count($header);
$errorless = array();
$haserrors = array();
// here is where we convert the CSV data into
// an associative array / map of key value
// pairs by treating each row and the header 
// row as parallel arrays.  Start with index 1
// in the for loop to skip over the header row
// in csvdata
for ($row = 1; $row < count($csvdata); ++$row) {
    $data = array();
    for ($x = 0; $x < $n; ++$x) {
        $data[$header[$x]] = $csvdata[$row][$x];
    }
    // if we encounter no errors, 
    // add data to the errorless array
    if (validateRow($data)) {
        $errorless[] = $data;
    }
    else {
        $haserrors[] = $data;
    }
}

// you will want to do a parallel write
// for the error.json file
$fp = fopen('errorless.json', 'w');
fwrite($fp, json_encode($errorless));
fclose($fp);

答案 1 :(得分:0)

你应该使用一种叫做JSON模式的东西。您可以定义文档的形成方式。然后您的文档可以自动验证。

如果您正在寻找CSV验证库的另一个具体实现,您应该首先查看packagist.org。

答案 2 :(得分:0)

这可能会对你有帮助,

标准JSON格式未明确支持文件注释。 RFC 4627 application/json,这是一种用于存储和传输数据的轻量级格式。如果评论真的很重要,您可以将其作为评论

等其他数据字段包含在内

<强> 输入

$ cat test.csv 
name,mark,url
ABCD,5,http://www.example.org
BCD,-2,http://www.example.com
CD,4,htt://www.c.com

<强> 脚本

<?php
function validate_url($url)
{
    return in_array(parse_url($url, PHP_URL_SCHEME),array('http','https')) && filter_var($url, FILTER_VALIDATE_URL);
}
function validate_mark($val)
{
    return ($val >= 0 && $val <= 5);
}

function errors($field)
{
    $errors = array(
        'url' => 'URL should be valid',
        'mark' => 'Mark should be between 0 to 5'
    );

    return ( isset($errors[$field]) ? $errors[$field] : "Unknown");
}

function csv2array_with_validation($filename, $delimiter = ",")
{
    $header = $row = $c_row = $output = $val_func =  array();
    if (($handle = fopen($filename, 'r')) !== FALSE) 
    {
        while (($row = fgetcsv($handle, 0, $delimiter)) !== FALSE) 
        {
            if (empty($header))
            {
                $header = array_map('strtolower', $row);
                foreach ($header as $e) 
                {
                    $val_func[$e] = function_exists('validate_' . $e);
                }
                continue;
            }
            $c_row = array_combine($header, $row);
            $index = 'error_less'; $errors = array();
            foreach ($c_row as $e => $v) 
            {
                if ($val_func[$e]) 
                {
                    if (!call_user_func('validate_' . $e, $v)) 
                    {
                        $index = 'error';
                        $errors[$e] =  errors($e);
                    }
                }
            }
        /* 
          If the comment is truly important, 
          you can include it as another data field like errors,  
          comment below part if you do not wish to create new field (errors) in 
          json file 
        */
        if(!empty($errors))
        {
            $c_row['errors'] = $errors;
        }   
            $output[$index][] = $c_row;
        }
        fclose($handle);
    }
    return $output;
}

$output = csv2array_with_validation('test.csv');

// Write error.json
if (isset($output['error']) && !empty($output['error']))  
{
    file_put_contents('error.json', json_encode($output['error'], JSON_PRETTY_PRINT | JSON_NUMERIC_CHECK));
}

// Write errorless.json
if (isset($output['error_less']) && !empty($output['error_less'])) 
{
    file_put_contents('error_less.json', json_encode($output['error_less'], JSON_PRETTY_PRINT | JSON_NUMERIC_CHECK));
}
?>

<强> 输出

$ php test.php
$ cat error.json 
[
    {
        "name": "BCD",
        "mark": -2,
        "url": "http:\/\/www.example.com",
        "errors": {
            "mark": "Mark should be between 0 to 5"
        }
    },
    {
        "name": "CD",
        "mark": 4,
        "url": "htt:\/\/www.c.com",
        "errors": {
            "url": "URL should be valid"
        }
    }
]
$ cat error_less.json 
[
    {
        "name": "ABCD",
        "mark": 5,
        "url": "http:\/\/www.example.org"
    }
]