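I have a file containing thousands of entries that I am trying to convert into a PHP array, but I have hit a stumbling block because of the conditions that decide what goes into the array. The good news is that the data is predictable and there are only two types of entry: 1) revoked, and 2) revoked with a reason.
Example of entry type #1 (revoked)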
Serial Number: 0E76BE532946EFE890376F0339329A62
Revocation Date: Jun 27 14:46:26 2018 GMT
Example of entry type #2 (revoked with a reason)
Serial Number: 0E17C9648FF25C0FC537D97958E4D449
Revocation Date: Jun 27 14:48:07 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
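An entry that includes a revocation reason spans 5 lines in total; otherwise it has only 2.
Example data file data.txt
This is a sample taken from the list of thousands of entries, which we can use as a sample data file.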
Serial Number: 0E76BE532946EFE890376F0339329A62
Revocation Date: Jun 27 14:46:26 2018 GMT
Serial Number: 0E17C9648FF25C0FC537D97958E4D449
Revocation Date: Jun 27 14:48:07 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 06BB119BAA2ABC21F92B06ED8E14B113
Revocation Date: Jun 27 14:49:12 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 088925C97AC5991CDF5416D07FC5DB00
Revocation Date: Jun 27 15:50:51 2018 GMT
Serial Number: 091E2B2090C7F5DBBCC97EA958B110BC
Revocation Date: Jun 27 15:52:31 2018 GMT
Serial Number: 0E6E9D1E9818221538EA6AF16A279C89
Revocation Date: Jun 27 15:53:12 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 07852DF7D7DD35080DE3604836408ADE
Revocation Date: Jun 27 15:53:38 2018 GMT
Serial Number: 0DEA14237257A6A3049F934840DC2B47
Revocation Date: Jun 27 15:53:40 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Expected output
I would like to build an array that produces the following output:
Array
(
[0] => Array
(
[serial] => 0E76BE532946EFE890376F0339329A62
[date] => Jun 27 14:46:26 2018 GMT
)
[1] => Array
(
[serial] => 0E17C9648FF25C0FC537D97958E4D449
[date] => Jun 27 14:48:07 2018 GMT
[reason] => Key Compromise
)
...
...
)
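Failed attempt
This is my attempt, which only handles the first case (#1). Entries of type #2 have extra lines, and I cannot figure out how to take them into account.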
$arr = array();
$lines = file('data.txt', FILE_IGNORE_NEW_LINES);
$x = 0;
foreach ($lines as $line) {
    if (strpos($line, 'Serial Number: ') !== false) {
        $arr[$x]['serial'] = str_replace('Serial Number: ', '', trim($line));
    }
    if (strpos($line, 'Revocation Date: ') !== false) {
        $arr[$x]['date'] = str_replace('Revocation Date: ', '', trim($line));
        $x++;
    }
}
Answer 0 (score: 1)
Here is a simple solution based on string manipulation:
Input:
Serial Number: 0E76BE532946EFE890376F0339329A62
Revocation Date: Jun 27 14:46:26 2018 GMT
Serial Number: 0E17C9648FF25C0FC537D97958E4D449
Revocation Date: Jun 27 14:48:07 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 06BB119BAA2ABC21F92B06ED8E14B113
Revocation Date: Jun 27 14:49:12 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 088925C97AC5991CDF5416D07FC5DB00
Revocation Date: Jun 27 15:50:51 2018 GMT
Serial Number: 091E2B2090C7F5DBBCC97EA958B110BC
Revocation Date: Jun 27 15:52:31 2018 GMT
Serial Number: 0E6E9D1E9818221538EA6AF16A279C89
Revocation Date: Jun 27 15:53:12 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 07852DF7D7DD35080DE3604836408ADE
Revocation Date: Jun 27 15:53:38 2018 GMT
Serial Number: 0DEA14237257A6A3049F934840DC2B47
Revocation Date: Jun 27 15:53:40 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
PHP code:
<?php
// Read the file into an array of lines, dropping newlines and empty lines.
$filename = 'data.txt';
$file = file($filename, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);

$output = array();
foreach ($file as $row) {
    // Strip any leading or trailing whitespace before comparing prefixes.
    $row = trim($row);
    if ($row === '') {
        continue;
    }
    if (strpos($row, 'Serial Number: ') === 0) {
        // A serial number always starts a new entry.
        $output[] = array('serial' => substr($row, strlen('Serial Number: ')));
        continue;
    }
    if (empty($output)) {
        // Ignore anything that appears before the first serial number.
        continue;
    }
    $n = count($output) - 1; // index of the entry currently being filled
    if (strpos($row, 'Revocation Date: ') === 0) {
        $output[$n]['date'] = substr($row, strlen('Revocation Date: '));
    } elseif (strpos($row, 'CRL entry extensions') === 0
        || strpos($row, 'X509v3 CRL Reason Code') === 0) {
        // Label lines that carry no data of their own.
    } else {
        // Any remaining line is treated as the reason text.
        $output[$n]['reason'] = $row;
    }
}
print_r($output);
?>
Output:
Array (
[0] => Array (
[serial] => 0E76BE532946EFE890376F0339329A62
[date] => Jun 27 14:46:26 2018 GMT
)
[1] => Array (
[serial] => 0E17C9648FF25C0FC537D97958E4D449
[date] => Jun 27 14:48:07 2018 GMT
[reason] => Key Compromise
)
[2] => Array (
[serial] => 06BB119BAA2ABC21F92B06ED8E14B113
[date] => Jun 27 14:49:12 2018 GMT
[reason] => Key Compromise
)
[3] => Array (
[serial] => 088925C97AC5991CDF5416D07FC5DB00
[date] => Jun 27 15:50:51 2018 GMT
)
[4] => Array (
[serial] => 091E2B2090C7F5DBBCC97EA958B110BC
[date] => Jun 27 15:52:31 2018 GMT
)
[5] => Array (
[serial] => 0E6E9D1E9818221538EA6AF16A279C89
[date] => Jun 27 15:53:12 2018 GMT
[reason] => Key Compromise
)
[6] => Array (
[serial] => 07852DF7D7DD35080DE3604836408ADE
[date] => Jun 27 15:53:38 2018 GMT
)
[7] => Array (
[serial] => 0DEA14237257A6A3049F934840DC2B47
[date] => Jun 27 15:53:40 2018 GMT
[reason] => Key Compromise
)
)
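Answer 1 (score: 0)
Depending on the size of the text file you are working with and how comfortable you are with regular expressions, you could use a single pattern to extract the different pieces of information you are looking for.
I put together a short proof of concept that works with the sample you provided: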
$re = '/\W+Serial Number: (?<serial>.*?)$\n\W+Revocation Date: (?<date>.*?)$((?:(?!Serial Number)[\n]*.)+Code: \n\W+(?<reason>.*?$))?/m';
$str = ' Serial Number: 0E76BE532946EFE890376F0339329A62
Revocation Date: Jun 27 14:46:26 2018 GMT
Serial Number: 0E17C9648FF25C0FC537D97958E4D449
Revocation Date: Jun 27 14:48:07 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 06BB119BAA2ABC21F92B06ED8E14B113
Revocation Date: Jun 27 14:49:12 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 088925C97AC5991CDF5416D07FC5DB00
Revocation Date: Jun 27 15:50:51 2018 GMT
Serial Number: 091E2B2090C7F5DBBCC97EA958B110BC
Revocation Date: Jun 27 15:52:31 2018 GMT
Serial Number: 0E6E9D1E9818221538EA6AF16A279C89
Revocation Date: Jun 27 15:53:12 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise
Serial Number: 07852DF7D7DD35080DE3604836408ADE
Revocation Date: Jun 27 15:53:38 2018 GMT
Serial Number: 0DEA14237257A6A3049F934840DC2B47
Revocation Date: Jun 27 15:53:40 2018 GMT
CRL entry extensions:
X509v3 CRL Reason Code:
Key Compromise';
preg_match_all($re, $str, $matches, PREG_SET_ORDER, 0);
// Print the entire match result
var_dump($matches);
You can see this example in action at https://regex101.com/r/7iSBrx/1.
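The example uses named groups to make it easier to pull the desired values out of each match, and to show where in the pattern each value is captured. I am happy to explain why the pattern works if that would help.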
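To illustrate how the named groups could be turned into the array shape asked for in the question, here is a minimal sketch. It is not the pattern above: the regex is a rewritten variant that assumes the five-line layout from the sample and tolerates missing indentation and trailing spaces. The helper name parseCrlEntries() is hypothetical.

```php
<?php
// Hypothetical helper, not part of the original answer.
function parseCrlEntries(string $text): array
{
    // Assumed layout: a serial line, a date line, then an optional
    // three-line reason block. [ \t]* absorbs indentation and trailing spaces.
    $re = '/^[ \t]*Serial Number: (?<serial>\S+)[ \t]*\n'
        . '[ \t]*Revocation Date: (?<date>[^\n]*?)[ \t]*$'
        . '(?:\n[ \t]*CRL entry extensions:[ \t]*\n'
        . '[ \t]*X509v3 CRL Reason Code:[ \t]*\n'
        . '[ \t]*(?<reason>[^\n]*?)[ \t]*$)?/m';

    preg_match_all($re, $text, $matches, PREG_SET_ORDER);

    $entries = [];
    foreach ($matches as $m) {
        $entry = ['serial' => $m['serial'], 'date' => $m['date']];
        // The optional group is missing or empty when no reason was given.
        if (isset($m['reason']) && $m['reason'] !== '') {
            $entry['reason'] = $m['reason'];
        }
        $entries[] = $entry;
    }
    return $entries;
}
```

It could be called as print_r(parseCrlEntries(file_get_contents('data.txt')));, which still reads the whole file into one string.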
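As a caveat, this requires loading the entire file into a single string, which can use a lot of memory if the file is large. Your iteration-based approach is better suited to very large files.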
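For very large files, the line-by-line idea can be sketched with fgets() and a generator, so that only one entry is held in memory at a time. The function name streamCrlEntries() is hypothetical. The sketch assumes every entry starts with a "Serial Number:" line and that any line other than the known labels is the reason text.

```php
<?php
// Hypothetical helper: yields one entry at a time instead of building a large array.
function streamCrlEntries(string $path): Generator
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        $entry = null;
        // fgets() returns false at end of file, which ends the loop.
        while (($line = fgets($handle)) !== false) {
            $line = trim($line);
            if ($line === '') {
                continue;
            }
            if (strpos($line, 'Serial Number: ') === 0) {
                // A new serial number closes the previous entry, if any.
                if ($entry !== null) {
                    yield $entry;
                }
                $entry = ['serial' => substr($line, strlen('Serial Number: '))];
            } elseif ($entry === null) {
                continue; // ignore anything before the first serial number
            } elseif (strpos($line, 'Revocation Date: ') === 0) {
                $entry['date'] = substr($line, strlen('Revocation Date: '));
            } elseif (strpos($line, 'CRL entry extensions:') !== 0
                && strpos($line, 'X509v3 CRL Reason Code:') !== 0) {
                $entry['reason'] = $line; // assumed to be the reason text
            }
        }
        if ($entry !== null) {
            yield $entry; // flush the last entry
        }
    } finally {
        fclose($handle);
    }
}
```

A caller can then process entries one by one with foreach (streamCrlEntries('data.txt') as $entry) { ... }, or collect them with iterator_to_array() when the file is small enough.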
Answer 2 (score: 0)
Try this code:
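$file_handle = fopen("data.txt", "rb");
// fgets() returns false at end of file; checking feof() first would process one extra, failed read.
while (($line_of_text = fgets($file_handle)) !== false) {
    // Split "Label: value" at the first ": " (the data contains no "=" sign).
    $parts = explode(': ', trim($line_of_text), 2);
    print_r($parts);
}
fclose($file_handle);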