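I've read a lot of posts on StackExchange but couldn't find what I need. Note: this is not just about removing duplicates. I need to go through File1.csv and create a new file, Results.csv, containing every line of File1.csv that does not contain any of the lines in File2.txt.
File1.csv contains personal details and an email address, one record per line: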
"mr","Happy","Man","mrhappy@example.com"
"mr","Sad","Man","mrsad@example.com"
"mr","Grumpy","Man","mrgrumpy@example.com"
"mr","Strong","Man","mrstrong@example.com"
File2.txt contains email addresses, one per line:
mrhappy@example.com
mrsomeoneelse@example.com
mrsomeoneelse2@example.com
Expected result: Results.csv should contain:
"mr","Sad","Man","mrsad@example.com"
"mr","Grumpy","Man","mrgrumpy@example.com"
"mr","Strong","Man","mrstrong@example.com"
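Confusingly, my code works as expected when File2.txt contains a single line. But when it contains multiple lines, Results.csv contains all the lines from File1.csv (including the ones that should have been removed) and repeats them multiple times (as many times as there are lines in File2.txt). I have a feeling I'm close, but I can't figure it out.
My code: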
<?php
$to_be_searched = "File1.csv";
$items_to_catch = file("File2.txt");
// create empty array to store lines we want to keep - i.e. lines that dont contain emails we're checking for
$good_lines = array();
// open $to_be_searched
$handle = fopen($to_be_searched, "r");
if ($handle) {
    // go line by line until end of file
    while (($line = fgets($handle)) !== false) {
        // check if line contains any items from $items_to_catch
        foreach($items_to_catch as $key => $value) {
            if(strpos($line, $value) === false) {
                // email wasn't found on the line so we want this line in the results file, therefore add to $good_lines array
                $good_lines[] = $line;
            }
        }
    }
    fclose($handle);
} else {
    echo "Couldn't open " . $to_be_searched;
    exit();
}
// write $array_of_good_lines into new file
$new_file = "Results.csv";
foreach($good_lines as $key => $value) {
    file_put_contents($new_file, $value, FILE_APPEND | LOCK_EX);
}
?>
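What am I doing wrong?
Answer 0: (score: 1)
It doesn't work at the moment because, inside your foreach, you add the same line to $good_lines once for every email that fails to match, so each line can be added several times.
To fix this, you can add a flag variable to the loop.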
while (($line = fgets($handle)) !== false) {
    // Declare our flag variable as false by default
    $found = false;
    // Loop through each item to see if the email has been found
    foreach($items_to_catch as $key => $value) {
        // If the email was found, stop looping in the second file
        if(strpos($line, $value) !== false){
            $found = true;
            break;
        }
    }
    // If the email was not found in the second file, add it to the good_lines array
    if(!$found) {
        $good_lines[] = $line;
    }
}
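Besides the loop, there is another problem with how you read File2.txt: file() keeps the newline at the end of each string, so when you later compare those strings with strpos, the match fails. To fix this: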
$items_to_catch = file("File2.txt", FILE_IGNORE_NEW_LINES);
Here is the var_dump of $items_to_catch without the flag:
array (size=3)
0 => string 'mrhappy@example.com
' (length=20)
1 => string 'mrsomeoneelse@example.com
' (length=26)
2 => string 'mrsomeoneelse2@example.com
' (length=27)
Here is the var_dump of $items_to_catch with the flag:
array (size=3)
0 => string 'mrhappy@example.com' (length=19)
1 => string 'mrsomeoneelse@example.com' (length=25)
2 => string 'mrsomeoneelse2@example.com' (length=26)
Note the extra character in each email in the first dump: that is the newline.
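To see both fixes working together without touching any files, here is a minimal sketch that runs on in-memory arrays. The function name keep_lines_without_emails and the sample arrays are assumptions made for illustration; they are not part of the original code. The guard against empty strings is also an addition: in PHP 8, strpos with an empty needle returns 0 rather than false, so a blank line in File2.txt would otherwise mark every line as found.

```php
<?php
// Hypothetical helper: returns the lines that contain none of the given emails.
function keep_lines_without_emails(array $lines, array $emails): array
{
    $good_lines = [];
    foreach ($lines as $line) {
        $found = false;
        foreach ($emails as $email) {
            // Skip empty entries (e.g. a blank line in the email file).
            if ($email !== '' && strpos($line, $email) !== false) {
                $found = true;
                break;
            }
        }
        // Add the line once, and only after checking every email.
        if (!$found) {
            $good_lines[] = $line;
        }
    }
    return $good_lines;
}

// Sample data standing in for fgets() output from File1.csv.
$lines = [
    '"mr","Happy","Man","mrhappy@example.com"' . "\n",
    '"mr","Sad","Man","mrsad@example.com"' . "\n",
];
// What file() returns without the flag: trailing newlines are kept.
$raw_emails = ["mrhappy@example.com\n", "mrsomeoneelse@example.com\n"];
// What file() returns with FILE_IGNORE_NEW_LINES.
$clean_emails = ["mrhappy@example.com", "mrsomeoneelse@example.com"];

$with_raw = keep_lines_without_emails($lines, $raw_emails);
$with_clean = keep_lines_without_emails($lines, $clean_emails);

echo count($with_raw), "\n";   // 2: nothing matched, because of the newline
echo count($with_clean), "\n"; // 1: the mrhappy line was removed
```

With the raw list nothing is removed, because in the CSV line the email is followed by a closing quote and only then by the newline, so "mrhappy@example.com\n" never appears as a substring.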
Answer 1: (score: 1)
file() returns each line of the file, including the line terminator. If you dump $items_to_catch, you will see it:
array:3 [
0 => "mrhappy@example.com\n"
1 => "mrsomeoneelse@example.com\n"
2 => "mrsomeoneelse2@example.com\n"
]
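This is not what you want, because the values you compare against later do not include the trailing line terminator. As an aside, Symfony's VarDumper component is orders of magnitude better than print_r and var_dump: I highly recommend adding it to your project.
So, trim the trailing newline with: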
$items_to_catch = array_map('trim', file('File2.txt'));
A minimal working example:
$excludedLinesWithTheseEmails = array_map('trim', file('File2.txt'));
$out = fopen('Results.csv', 'w') or die('Cannot open Results.csv');
$in = fopen('File1.csv', 'r') or die('Cannot open File1.csv');
while (false !== ($row = fgetcsv($in))) {
    if (! in_array($row[3], $excludedLinesWithTheseEmails)) {
        fputcsv($out, $row);
    }
}
fclose($out);
fclose($in);
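If File2.txt grows large, in_array has to scan the whole list for every row. One possible variation, sketched here on in-memory strings so it can be run without any files, is to flip the trimmed list into array keys and test membership with isset. The function name filter_csv_lines, its $emailColumn parameter, and the choice to keep rows that lack an email column are all assumptions for illustration, not something the answer above prescribes.

```php
<?php
// Hypothetical helper: parses CSV lines and drops rows whose email is excluded.
function filter_csv_lines(array $csvLines, array $excludedEmails, int $emailColumn = 3): array
{
    // Trim line terminators, then turn emails into array keys for fast lookup.
    $excluded = array_flip(array_map('trim', $excludedEmails));

    $kept = [];
    foreach ($csvLines as $line) {
        // Delimiter, enclosure and escape are passed explicitly.
        $row = str_getcsv($line, ',', '"', '\\');
        $email = $row[$emailColumn] ?? null;
        // Assumption: rows without an email column are kept unchanged.
        if ($email === null || !isset($excluded[$email])) {
            $kept[] = $row;
        }
    }
    return $kept;
}

// Sample data standing in for File1.csv and for file('File2.txt').
$csvLines = [
    '"mr","Happy","Man","mrhappy@example.com"',
    '"mr","Sad","Man","mrsad@example.com"',
];
$excludedEmails = ["mrhappy@example.com\n", "mrsomeoneelse@example.com\n"];

$kept = filter_csv_lines($csvLines, $excludedEmails);
echo count($kept), "\n"; // 1
echo $kept[0][1], "\n";  // Sad
```

Each kept row is an array of fields, so it can be passed to fputcsv in the same way as in the example above.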