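I'm using file() to read the data and iterating over each line. I need to split each line into an array of "columns". The problem is that the columns have uneven widths (60 characters, 40 characters, and so on), and every function I've found for this seems to expect all columns to be the same fixed size.
This will run regularly on large data files, so it needs to perform as well as possible.
Sample data: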
XXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXX XXX XXX X XXX
XXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXX XXX XXX X XXX
XXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXX XXX XXX X XXX
XXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
Answer 0 (score: 1)
The straightforward approach is to slice out the columns with substr:
foreach (file($fn) as $i => $line) {
    $rows[$i] = array(substr($line, 0, 60), substr($line, 60, 40), substr($line, 100, 30));
}
Contrary to what you might expect, though, splitting the string with a PCRE regular expression can be faster:
preg_match_all('/^(.{60})(.{40})(.{30})\K/m', file_get_contents($fn), $rows, PREG_SET_ORDER);
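The downside is that each row contains an empty element at index [0] (which would normally hold the whole matched line, but \K empties it), so the data columns start at index [1].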
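To make the shape of that result concrete, here is a minimal sketch that uses made-up widths of 5, 3, and 2 characters instead of 60/40/30. The sample string and the variable names are assumptions for illustration only. It shows that index [0] of every row is an empty string and drops it with array_slice().

```php
<?php
// Hypothetical sample: two lines, columns of width 5, 3 and 2.
$text = "AAAAABBBCC\nDDDDDEEEFF\n";

// Same pattern shape as the 60/40/30 version, with smaller widths for readability.
preg_match_all('/^(.{5})(.{3})(.{2})\K/m', $text, $rows, PREG_SET_ORDER);

// \K resets the start of the reported match, so $row[0] is an empty string.
// array_slice() drops it and keeps only the captured columns.
$columns = array();
foreach ($rows as $row) {
    $columns[] = array_slice($row, 1);
}

print_r($columns);
```

The extra array_slice() pass costs some time on large files, so if performance is the priority it may be simpler to leave [0] in place and read the columns from index [1] onward.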
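Answer 1 (score: 0)
The only way to do this without knowing the widths is if the file contains a delimiter.
explode() splits a string on a delimiter, so if you know the columns are tab-separated, you can call explode("\t", $string) to get an array of columns. Note the double quotes: in single quotes, '\t' is a literal backslash followed by t, not a tab.
Beyond that, there is no reliable way to pull out variable-width columns without knowing their sizes.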
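PHP only interprets \t as a tab character inside double quotes, which is easy to get wrong when calling explode(). A minimal sketch with a made-up tab-separated line (the data and variable names are assumptions for illustration):

```php
<?php
// Hypothetical tab-separated line.
$line = "alpha\tbeta\tgamma";

// Double quotes: "\t" is a real tab character, so the line splits into three columns.
$columns = explode("\t", $line);

// Single quotes: '\t' is a backslash followed by the letter t, which does not
// occur in the line, so explode() returns the whole line as a single element.
$unsplit = explode('\t', $line);

print_r($columns);
print_r($unsplit);
```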
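Answer 2 (score: 0)
After your comment on my previous answer, it looks like substr() is all you need.
If you know the width of every column in each line, do something like this: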
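// Example offsets and widths taken from the question (60 and 40 characters);
// adjust them to match your file.
$firstColStart = 0;
$firstColLength = 60;
$secondColStart = 60;
$secondColLength = 40;

$rows = array();
foreach ($lines as $line) {
    $columns = array();
    // substr() takes (string, start, length), not (string, start, end).
    $columns[] = substr($line, $firstColStart, $firstColLength);
    $columns[] = substr($line, $secondColStart, $secondColLength);
    // ...one substr() call for each remaining column
    $rows[] = $columns;
}
// Do something with $rows, the array of per-line column arrays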
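When there are many columns, it may be easier to keep the widths in one array and derive each start offset from the running total. The following is only a sketch under that assumption: splitFixedWidth() is a hypothetical helper name, and the widths and sample lines are made up for illustration.

```php
<?php
// Hypothetical helper: split one line into columns of the given widths.
function splitFixedWidth($line, array $widths)
{
    $columns = array();
    $offset = 0;
    foreach ($widths as $width) {
        // substr() takes a start offset and a length (not an end position).
        // The (string) cast guards against substr() returning false on
        // PHP versions before 8 when the line is shorter than the offset.
        $columns[] = rtrim((string) substr($line, $offset, $width));
        $offset += $width;
    }
    return $columns;
}

// Made-up widths and data for illustration.
$widths = array(5, 3, 2);
$lines = array("AB   C  DD", "EEEEEFFFG");

$rows = array();
foreach ($lines as $line) {
    $rows[] = splitFixedWidth($line, $widths);
}

print_r($rows);
```

The rtrim() call strips the padding from each column; remove it if trailing spaces are significant in your data.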
Answer 3 (score: -1)
This is what I came up with. I'm assuming the column widths are not known in advance.
<?php
$data = 'XXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXX XXX XXX X XXX
XXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXX XXX XXX X XXX
XXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXXXXXXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX
XXXXXXXXX XXX XXX X XXX
XXXXXXXXXXXXXXX XXXXXXXXXXXXX XX XXXXXX';
$dataLines = explode("\n", $data);

// detect column breaks
$numDataLines = count($dataLines);
$colBreaks = array();
$c = 0;
while (true) {
    $rowEnds = 0; // count how many rows have terminated in the current col.
    $notSet = 0;  // a special case of $rowEnds, when the line no longer has
                  // chars.

    // run down each column. if there are no X's, then it is a col break.
    for ($r = 0; $r < $numDataLines; ++$r) {
        if (!isset($dataLines[$r][$c])) {
            ++$notSet;
            ++$rowEnds;
        } elseif ($dataLines[$r][$c] != 'X') {
            ++$rowEnds;
        }
    }

    // if no lines have chars left, end the while loop. this counts as a col
    // break.
    if ($notSet == $numDataLines) {
        $colBreaks[] = $c;
        break;
    }

    // if no X's were in the line, this is a col break.
    if ($rowEnds == $numDataLines) {
        $colBreaks[] = $c;
    }

    ++$c; // move on to the next col
}

// now that we have all the col breaks, we simply iterate over them and slice
// out the needed sections from each line to create the columns.
$dataCols = array();
$left = 0;
foreach ($colBreaks as $cb) {
    // skip empty cols
    if ($left == $cb) {
        $left = $cb + 1;
        continue;
    }

    $colLen = $cb - $left;
    $dataCol = array();
    echo "left: $left, len: $colLen, cb: $cb\n";
    foreach ($dataLines as $dl) {
        $dataCol[] = substr($dl, $left, $colLen);
    }
    $dataCols[] = implode("\n", $dataCol);
    $left += $colLen + 1;
}

// tada!
print_r($dataCols);