Remove rows with duplicate values in a specific column of a CSV file

Time: 2020-07-25 18:34:49

Tags: php csv file filter

I have this data.csv:

id: 10, location: Canada, people: 12
id: 10, location: United States, people: 15
id: 15, location: England, people: 19
id: 16, location: India, people: 20
id: 16, location: Germany, people: 9

I would like PHP to output the following:

id: 10, location: Canada, people: 12
id: 15, location: England, people: 19
id: 16, location: India, people: 20

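by removing rows that have the same value in the first column, keeping only the first occurrence. How can I do this? (I'm new to PHP and don't really know how to go about it; I tried some scripts that others wrote for similar problems, but they didn't seem to work.) Ideally I'd like it to echo the result rather than overwrite the file or create a new one.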

1 answer:

Answer 0: (score: 0)

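Read the csv line by line with fgetcsv and build an array for each row, where the part before the ':' in each field is the key and the part after it is the value.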
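To make the key/value split concrete, here is a minimal sketch that parses a single line of the file. The helper name parseLabeledLine is hypothetical (it is not part of the answer), and the sketch assumes every field has the form "key: value".

```php
<?php

// Hypothetical helper: turn one "key: value, key: value" line into an
// associative array. Assumes each field has the form "key: value".
function parseLabeledLine(string $line): array
{
    $row = [];
    // Passing the escape argument explicitly avoids a deprecation notice
    // on recent PHP versions.
    foreach (str_getcsv($line, ',', '"', '\\') as $field) {
        // A limit of 2 keeps any further ':' characters inside the value.
        $parts = explode(':', (string) $field, 2);
        if (count($parts) === 2) {
            $row[trim($parts[0])] = trim($parts[1]);
        }
    }
    return $row;
}

print_r(parseLabeledLine('id: 10, location: Canada, people: 12'));
```

Fields without a ':' are silently skipped in this sketch, so a blank line yields an empty array rather than a warning.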

Then you can remove the duplicates.

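Once only the deduplicated data is left, you need to rebuild the csv string. You can use it directly (for example, echo it) or store it in an output csv file.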

<?php

$handle = fopen("data.csv", "r");

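// parse the csv line by line and build an array of key => value rows
$data = [];
while (($row = fgetcsv($handle)) !== false) {
  $newRow = [];
  foreach ($row as $field) {
    // a limit of 2 keeps any ':' inside the value;
    // fields without a ':' (e.g. from a blank line) are skipped
    $parts = explode(':', (string) $field, 2);
    if (count($parts) < 2) {
      continue;
    }
    $key = trim($parts[0]);
    $value = trim($parts[1]);

    $newRow[$key] = $value;
  }

  // only keep rows that actually have an id to deduplicate on
  if (isset($newRow['id'])) {
    $data[] = $newRow;
  }
}
fclose($handle);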

// iterate over the data and remove duplicate ids - keep only the first occurrence of each id
$indexedData = [];
foreach ($data as $row) {
  if (!isset($indexedData[$row['id']])) {
    $indexedData[$row['id']] = $row;
  }
}

var_dump($indexedData);

// create csv string with new data
$result = '';
foreach ($indexedData as $row) {
  $fields = [];
  foreach ($row as $key => $value) {
    $fields[] = $key.': '.$value;
  }
  $result .= implode(', ', $fields).PHP_EOL;
}

var_dump($result);

$indexedData:

array(3) {
  [10]=>
  array(3) {
    ["id"]=>
    string(2) "10"
    ["location"]=>
    string(6) "Canada"
    ["people"]=>
    string(2) "12"
  }
  [15]=>
  array(3) {
    ["id"]=>
    string(2) "15"
    ["location"]=>
    string(7) "England"
    ["people"]=>
    string(2) "19"
  }
  [16]=>
  array(3) {
    ["id"]=>
    string(2) "16"
    ["location"]=>
    string(5) "India"
    ["people"]=>
    string(2) "20"
  }
}

$result:

string(111) "id: 10, location: Canada, people: 12
id: 15, location: England, people: 19
id: 16, location: India, people: 20
"
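Since the question asks for the result to be echoed, the same two steps (keep the first row per id, then rebuild the lines) can also be wrapped in functions and printed with echo instead of var_dump. This is only a sketch: keepFirstByColumn and formatLabeledRows are hypothetical names, and the inline $rows array stands in for the parsed contents of data.csv.

```php
<?php

// Hypothetical helper: keep only the first row seen for each value of $column.
// Rows that do not contain $column are dropped.
function keepFirstByColumn(array $rows, string $column): array
{
    $seen = [];
    foreach ($rows as $row) {
        if (isset($row[$column]) && !isset($seen[$row[$column]])) {
            $seen[$row[$column]] = $row;
        }
    }
    return array_values($seen);
}

// Hypothetical helper: render rows back into "key: value, key: value" lines.
function formatLabeledRows(array $rows): string
{
    $out = '';
    foreach ($rows as $row) {
        $fields = [];
        foreach ($row as $key => $value) {
            $fields[] = $key . ': ' . $value;
        }
        $out .= implode(', ', $fields) . PHP_EOL;
    }
    return $out;
}

// Sample rows standing in for the parsed contents of data.csv.
$rows = [
    ['id' => '10', 'location' => 'Canada', 'people' => '12'],
    ['id' => '10', 'location' => 'United States', 'people' => '15'],
    ['id' => '15', 'location' => 'England', 'people' => '19'],
];

echo formatLabeledRows(keepFirstByColumn($rows, 'id'));
```

Passing the column name as a parameter means the same helper could deduplicate on location or people instead of id.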

Alternatively, if you don't need to work with the individual values in the csv (for example, the people count), here is a simpler version:

<?php

$handle = fopen("data.csv", "r");

$data = [];
while (($row = fgetcsv($handle)) !== false) {
  // fgetcsv returns [null] for a blank line; skip those
  if ($row[0] !== null && !isset($data[$row[0]])) {
    $data[$row[0]] = $row;
  }
}
fclose($handle);

$result = '';
foreach ($data as $row) {
  $result .= implode(',', $row).PHP_EOL;
}

var_dump($result);

$result is the same.
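If the file is large, a streaming variant may be preferable: it writes each line out the first time its first field is seen and remembers only the keys, not the rows. The sketch below is an assumption-laden illustration rather than part of the answer: streamUniqueByFirstField is a hypothetical name, the first field is assumed not to be quoted, and an in-memory stream replaces data.csv so the example runs on its own.

```php
<?php

// Hypothetical helper: copy lines from $in to $out, skipping any line whose
// first comma-separated field has already been seen.
function streamUniqueByFirstField($in, $out): void
{
    $seen = [];
    while (($line = fgets($in)) !== false) {
        $line = rtrim($line, "\r\n");
        if ($line === '') {
            continue; // skip blank lines
        }
        // Plain split on the first comma; assumes the first field is not quoted.
        $first = trim(explode(',', $line, 2)[0]);
        if (!isset($seen[$first])) {
            $seen[$first] = true;
            fwrite($out, $line . PHP_EOL);
        }
    }
}

// Demo on an in-memory stream so the sketch runs without data.csv.
$in = fopen('php://memory', 'w+');
fwrite($in, "id: 10, location: Canada, people: 12\n"
          . "id: 10, location: United States, people: 15\n"
          . "id: 16, location: India, people: 20\n");
rewind($in);

// php://output writes to the same place echo does.
$out = fopen('php://output', 'w');
streamUniqueByFirstField($in, $out);
fclose($in);
fclose($out);
```

To run it against the real file, open it with fopen("data.csv", "r") and pass that handle as the first argument.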