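PHP - Create a new file containing all lines from file1 that do not contain the text of any line in file2

Date: 2016-05-09 19:10:41

Tags: php

I have read a lot of posts on StackExchange but could not find what I need. Note: this is not just about removing duplicates. I need to go through File1.csv and create a new file, Results.csv, containing every line that does not contain any of the lines in File2.txt.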

File1.csv contains personal details and email addresses, one record per line:

"mr","Happy","Man","mrhappy@example.com"
"mr","Sad","Man","mrsad@example.com"
"mr","Grumpy","Man","mrgrumpy@example.com"
"mr","Strong","Man","mrstrong@example.com"

File2.txt contains email addresses, one per line:

mrhappy@example.com
mrsomeoneelse@example.com
mrsomeoneelse2@example.com

Expected result: Results.csv should contain:

"mr","Sad","Man","mrsad@example.com"
"mr","Grumpy","Man","mrgrumpy@example.com"
"mr","Strong","Man","mrstrong@example.com"

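Confusingly, my code works as expected when File2.txt contains a single line. But when it contains multiple lines, Results.csv contains every line from File1.csv (including the ones that should have been removed), and each line is repeated as many times as there are lines in File2.txt. I have a feeling I am close, but I can't figure it out.

My code: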

<?php
$to_be_searched = "File1.csv";

$items_to_catch = file("File2.txt");

// create empty array to store lines we want to keep - i.e. lines that dont contain emails we're checking for
$good_lines = array();

// open $to_be_searched
$handle = fopen($to_be_searched, "r");
if ($handle) {
  // go line by line until end of file
  while (($line = fgets($handle)) !== false) {
    // check if line contains any items from $items_to_catch
    foreach($items_to_catch as $key => $value) {
      if(strpos($line, $value) === false) {
        // email wasn't found on the line so we want this line in the results file, therefore add to $good_lines array
        $good_lines[] = $line;
      } 
    }
  }
  fclose($handle);
} else {
  echo "Couldn't open " . $to_be_searched;
  exit();
}

// write $array_of_good_lines into new file
$new_file = "Results.csv";
foreach($good_lines as $key => $value) {
    file_put_contents($new_file, $value, FILE_APPEND | LOCK_EX);
}

?>

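What am I doing wrong?

2 answers:

Answer 0: (score: 1)

It does not work at the moment because, inside your foreach, you add the same line to $good_lines multiple times: once for every item in $items_to_catch that the line does not contain.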
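To make the duplication concrete, here is a small sketch that runs the question's inner loop over in-memory arrays instead of files. The function name collect_with_original_loop and the sample data are assumptions made for illustration, and the items are already free of newlines so that only the loop logic is being exercised.

```php
<?php
// Hypothetical helper: reproduces the question's loop and returns what
// would end up in $good_lines.
function collect_with_original_loop(array $lines, array $items): array
{
    $good_lines = [];
    foreach ($lines as $line) {
        foreach ($items as $value) {
            // Appends once for EVERY item that is absent from the line,
            // not once per line.
            if (strpos($line, $value) === false) {
                $good_lines[] = $line;
            }
        }
    }
    return $good_lines;
}

$lines = [
    '"mr","Happy","Man","mrhappy@example.com"',
    '"mr","Sad","Man","mrsad@example.com"',
];
$items = [
    'mrhappy@example.com',
    'mrsomeoneelse@example.com',
    'mrsomeoneelse2@example.com',
];

// The "Happy" line is still kept twice (two items do not match it),
// and the "Sad" line is kept three times.
print_r(array_count_values(collect_with_original_loop($lines, $items)));
```

A line is only left out entirely if every single item matches it, which is why the decision has to be made once per line, after the inner loop has finished.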

To fix this, you can add a flag variable inside the loop.

while (($line = fgets($handle)) !== false) {
    // Declare our flag variable as false by default
    $found = false;

    // Loop through each item to see if the email has been found
    foreach($items_to_catch as $key => $value) {
        // If the email was found, stop looping in the second file
        if(strpos($line, $value) !== false){
            $found = true;
            break;
        } 
    }

    // If the email was not found in the second file, add it to the good_lines array
    if(!$found)
        $good_lines[] = $line;
}

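Update

Besides the loop, there is another problem in how you read File2.txt: file() leaves the newline at the end of each string, so the later strpos comparison fails. To fix this, pass the FILE_IGNORE_NEW_LINES flag: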

$items_to_catch = file("File2.txt", FILE_IGNORE_NEW_LINES);

Here is the var_dump of $items_to_catch without the flag:

array (size=3)
    0 => string 'mrhappy@example.com
    ' (length=20)
    1 => string 'mrsomeoneelse@example.com
    ' (length=26)
    2 => string 'mrsomeoneelse2@example.com
    ' (length=27)

Here is the var_dump of $items_to_catch with the flag:

array (size=3)
    0 => string 'mrhappy@example.com' (length=19)
    1 => string 'mrsomeoneelse@example.com' (length=25)
    2 => string 'mrsomeoneelse2@example.com' (length=26)

Note the extra character at the end of each email in the first dump: that is the newline.
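Putting the two points of this answer together, a complete version might look like the sketch below. The function name filter_lines and its parameters are hypothetical, and FILE_SKIP_EMPTY_LINES is an extra assumption added here so that blank lines in File2.txt do not become empty search strings.

```php
<?php
// Hypothetical helper combining both fixes: a "found" flag per line and
// newline-free search strings. File names are passed in, not hard-coded.
function filter_lines(string $source, string $excludeList, string $target): int
{
    // Read the search strings without trailing newlines or blank entries.
    $items = file($excludeList, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($items === false) {
        throw new RuntimeException("Couldn't read $excludeList");
    }

    $in = fopen($source, 'r');
    if ($in === false) {
        throw new RuntimeException("Couldn't open $source");
    }

    $good_lines = [];
    while (($line = fgets($in)) !== false) {
        $found = false;
        foreach ($items as $value) {
            if (strpos($line, $value) !== false) {
                $found = true;
                break;
            }
        }
        // Decide once per line, after all items have been checked.
        if (!$found) {
            $good_lines[] = $line;
        }
    }
    fclose($in);

    // One write instead of one append per line; replaces any old results.
    file_put_contents($target, implode('', $good_lines), LOCK_EX);
    return count($good_lines);
}
```

It could be called as filter_lines('File1.csv', 'File2.txt', 'Results.csv'). Because strpos does a substring search, an entry such as mrhappy@example.com would also exclude a line containing notmrhappy@example.com. If that matters, comparing the parsed email column is the safer route.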

Answer 1: (score: 1)

file() returns each line of the file including its line ending. If you dump $items_to_catch, you will see it:

array:3 [
   0 => "mrhappy@example.com\n"
   1 => "mrsomeoneelse@example.com\n"
   2 => "mrsomeoneelse2@example.com\n"
]

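That is not what you want, because your later comparisons then include the trailing line ending. As an aside, Symfony's VarDumper component is orders of magnitude better than print_r or var_dump: I highly recommend adding it to your project.

So, trim the trailing newlines with: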

$items_to_catch = array_map('trim', file('File2.txt'));
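As a small illustration of what trimming does here, the sketch below works on an in-memory array shaped like the return value of file(). The helper name clean_items and the sample data are assumptions. It also drops entries that are empty after trimming, which is an addition to the answer: an empty string is not a useful search value, and on PHP 8 strpos() reports an empty needle as found at offset 0.

```php
<?php
// Hypothetical helper: normalises raw lines as returned by file().
function clean_items(array $rawLines): array
{
    // trim() strips "\n" as well as "\r", so Windows line endings are covered.
    $trimmed = array_map('trim', $rawLines);

    // Drop entries that are empty after trimming, then reindex from 0.
    return array_values(array_filter($trimmed, function ($item) {
        return $item !== '';
    }));
}

$raw = [
    "mrhappy@example.com\r\n",
    "\n",
    "mrsomeoneelse@example.com\n",
    "mrsomeoneelse2@example.com",
];
var_dump(clean_items($raw));
```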

Minimal working example:

// Emails to exclude, with trailing newlines trimmed
$excludedLinesWithTheseEmails = array_map('trim', file('File2.txt'));

$out = fopen('Results.csv', 'w') or die('Cannot open Results.csv');
$in = fopen('File1.csv', 'r') or die('Cannot open File1.csv');
while (false !== ($row = fgetcsv($in))) {
    // The email is the fourth column; blank or short rows have no index 3 and are skipped
    if (isset($row[3]) && ! in_array($row[3], $excludedLinesWithTheseEmails, true)) {
        fputcsv($out, $row);
    }
}
fclose($out);
fclose($in);
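One thing to be aware of with the example above: fputcsv() only encloses a field in quotes when the field needs it, so the surviving rows would be written as mr,Sad,Man,mrsad@example.com rather than with the quotes shown in the expected result. If the original lines must come through unchanged, one possible variant is to parse each line only to read the email and then copy the raw line. The sketch below does that and also flips the exclusion list into array keys so each check is an isset() lookup. The function name filter_csv_by_email, the $emailColumn parameter, and the decision to drop blank or short lines are all assumptions.

```php
<?php
// Hypothetical variant: parse each line only to read the email column,
// then copy the original line through untouched so its quoting survives.
function filter_csv_by_email(
    string $source,
    string $excludeList,
    string $target,
    int $emailColumn = 3
): int {
    $lines = file($excludeList);
    if ($lines === false) {
        throw new RuntimeException("Couldn't read $excludeList");
    }
    // Flip the list so emails become keys; isset() is then a hash lookup.
    $excluded = array_flip(array_map('trim', $lines));

    $in = fopen($source, 'r');
    $out = fopen($target, 'w');
    if ($in === false || $out === false) {
        throw new RuntimeException('Cannot open input or output file');
    }

    $kept = 0;
    while (($line = fgets($in)) !== false) {
        // Delimiter, enclosure and escape are passed explicitly.
        $row = str_getcsv(rtrim($line, "\r\n"), ',', '"', '\\');
        $email = isset($row[$emailColumn]) ? trim($row[$emailColumn]) : null;

        // Skip blank or short lines, and lines whose email is excluded.
        if ($email === null || isset($excluded[$email])) {
            continue;
        }
        fwrite($out, $line);
        $kept++;
    }
    fclose($in);
    fclose($out);
    return $kept;
}
```

Called as filter_csv_by_email('File1.csv', 'File2.txt', 'Results.csv'), this compares whole email values rather than substrings, so an address that merely contains an excluded address is kept.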