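Hi everyone, I need some help with an array problem.
I have a flat array created from PDF text. The array contains product data mixed with other text from the page. The pattern appears to be
DescriptionCodePrice
followed by X products.
The data is also not uniform: some products span more than 3 lines.
A product would be
I've used this on a similar project, but it doesn't work well here: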
$transactions = array();
foreach ($array as $row) {
if ($row['0'] === "DescriptionCodePrice") {
$transactions[] = array();
}
$transactions[count($transactions) - 1][0] = $row;
}
I'm trying to extract all of the product data into a tidy array like this:
Array
(
[products] => Array
(
[0] => Array
(
[description] =>
[id] =>
[price] =>
)
[1] => Array
(
[description] =>
[id] =>
[price] =>
)
[2] => Array
(
[description] =>
[id] =>
[price] =>
)
)
)
Here is my data:
Array
(
[0] => 8 SANITARYWARE | HEMSWORTH CLOSE COUPLEDPrices include VAT
[1] => DescriptionCodePrice
[2] => Hemsworth Close Coupled Pan
[3] => 421(h) x 373(w) x 673(d) mm INST02007
[4] => £206.00
[5] => Hemsworth Close Coupled
[6] => Cistern 481(w) mm INST02001
[7] => £170.00
[8] => Hemsworth Basin 605mm
[9] => Two Taphole INST02003
[10] => £172.00
[11] => Hemsworth Pedestal INST02008£84.00
[12] => Hemsworth Soft Close Bar
[13] => Hinge Seat - Solid Natural Oak INST02011
[14] => £147.00
[15] => Total £779.00
[16] => Hemsworth Close Coupled WC Suite
[17] => Description CodePrice
[18] => Hemsworth Close Coupled Pan
[19] => 421(h) x 373(w) x 673(d) mm INST02007
[20] => £206.00
[21] => Hemsworth Close Coupled
[22] => Cistern 481(w) mm INST02001
[23] => £170.00
[24] => Hemsworth Soft Close Bar
[25] => Hinge Seat - Solid Natural Oak INST02011
[26] => £147.00
[27] => Hemsworth Soft Close Bar
[28] => Hinge Seat - White INST02012
[29] => £132.00
[30] => Hemsworth
[31] => Hemsworth Basin 605mm
[32] => Description CodePrice
[33] => Hemsworth Basin 605mm
[34] => Two Taphole INST02003
[35] => £172.00
[36] => Hemsworth Basin 605mm
[37] => One Taphole INST02010
[38] => £172.00
[39] => Hemsworth Cloakroom Basin
[40] => 500 x 305mm Two Taphole INST02013
[41] => £144.00
[42] => Hemsworth Pedestal (Fits
[43] => 605mm and 500mm basin) INST02008
[44] => £84.00
[45] => £256. 00
[46] => Hemsworth Basin
[47] => 605mm Two Taphole
[48] => & Pedestal
[49] => (Taps not included)
[50] => £523. 00
[51] => Hemsworth Close Coupled WC with Oak Seat
[52] => Hemsworth Suite with Close
[53] => Coupled Cistern WC & Basin
)
My current page code is:
require_once("vendor/autoload.php");
// Parse pdf file and build necessary objects.
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile('Instinct-Autumn-Bathroom-Catalogue-2018-pages/page-9.pdf');
// Retrieve all pages from the pdf file.
$pages = $pdf->getPages();
// Loop over each page to extract text.
foreach ($pages as $page) {
$array = explode("\n", $page->getText());
echo "<pre>";
print_r($array);
echo "</pre>";
echo '<br><br>';
}
// Split array into transactions
$transactions = array();
foreach ($array as $row) {
if ($row['0'] === "DescriptionCodePrice") {
$transactions[] = array();
}
$transactions[count($transactions) - 1][0] = $row;
}
Answer 0 (score: 1)
Here's one way to do it:
First, remove the headers. (It looks like you actually have two unwanted header rows, not just "DescriptionCodePrice".)
array_splice($data, 0, 2);
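Note that array_splice($data, 0, 2) only drops the first two entries, while the sample data repeats the header later on, sometimes spelled "Description CodePrice". A minimal sketch of one way to also drop those repeats is below. It assumes the repeated headers differ only in whitespace, and the $data array here is a shortened, hypothetical excerpt of the array in the question. Section titles and "Total" rows are not handled by this step and would still be left in the array.

```php
<?php
// Shortened, hypothetical excerpt of the array from the question.
$data = [
    '8 SANITARYWARE | HEMSWORTH CLOSE COUPLEDPrices include VAT',
    'DescriptionCodePrice',
    'Hemsworth Close Coupled Pan',
    '421(h) x 373(w) x 673(d) mm INST02007',
    '£206.00',
    'Description CodePrice',
    'Hemsworth Basin 605mm',
    'Two Taphole INST02003',
    '£172.00',
];

// Drop the page banner and the first header row.
array_splice($data, 0, 2);

// Assumption: later header rows differ only in whitespace,
// so compare each row with all whitespace removed.
$data = array_values(array_filter($data, function ($row) {
    return preg_replace('/\s+/', '', $row) !== 'DescriptionCodePrice';
}));

print_r($data);
```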
Then split the remaining data into chunks of three and combine each chunk with string keys to produce the result.
$keys = ['title', 'description', 'price'];
$result['products'] = array_map(function ($item) use ($keys) {
    return array_combine($keys, $item);
}, array_chunk($data, 3));
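The chunk-of-three approach assumes every product occupies exactly three rows. The sample data does not always do that: "Hemsworth Pedestal INST02008£84.00" is a single row, and totals and section titles sit between the tables. Also, array_combine needs a chunk with exactly three items. As an alternative, here is a rough sketch that groups rows by content instead of position. It is not part of the answer above. The function name extractProducts is hypothetical, and it assumes product codes always look like "INST" followed by digits and prices always start with "£", as they do in the sample.

```php
<?php
// Assumption: codes look like "INST" + digits; a price may follow on the same line.
const CODE_PATTERN  = '/^(.*?)\s*(INST\d+)\s*(£[\d,]+\.\s?\d{2})?$/u';
// Assumption: a standalone price row is "£" + digits, with an optional space after the dot.
const PRICE_PATTERN = '/^£[\d,]+\.\s?\d{2}$/u';

function extractProducts(array $lines): array
{
    $products = [];
    $buffer   = [];    // description rows seen since the last product or header
    $pending  = null;  // product that has a code but no price yet

    foreach ($lines as $line) {
        $line = trim($line);

        // A header row (with or without stray spaces) starts a new table,
        // so anything collected before it (titles, totals) is discarded.
        if (preg_replace('/\s+/', '', $line) === 'DescriptionCodePrice') {
            $buffer  = [];
            $pending = null;
            continue;
        }

        // A row ending in a code closes the description.
        if (preg_match(CODE_PATTERN, $line, $m)) {
            if ($m[1] !== '') {
                $buffer[] = $m[1];
            }
            $price   = $m[3] ?? '';
            $product = [
                'description' => implode(' ', $buffer),
                'id'          => $m[2],
                'price'       => $price,
            ];
            $buffer = [];
            if ($price !== '') {
                $products[] = $product;   // code and price were on the same row
                $pending    = null;
            } else {
                $pending = $product;      // wait for the price row; an earlier
            }                             // unpriced product is dropped
            continue;
        }

        // A standalone price completes the pending product, if there is one.
        if (preg_match(PRICE_PATTERN, $line)) {
            if ($pending !== null) {
                $pending['price'] = $line;
                $products[]       = $pending;
                $pending          = null;
            }
            $buffer = [];
            continue;
        }

        // Anything else is treated as part of a description.
        $buffer[] = $line;
    }

    return ['products' => $products];
}

// Usage with a short excerpt of the sample data.
$result = extractProducts([
    'Hemsworth Close Coupled WC Suite',
    'Description CodePrice',
    'Hemsworth Close Coupled Pan',
    '421(h) x 373(w) x 673(d) mm INST02007',
    '£206.00',
    'Hemsworth Pedestal INST02008£84.00',
    'Total £779.00',
]);
print_r($result);
```

Prices without a preceding code (such as the "£256. 00" rows near the end of the sample) are ignored by this sketch, which may or may not be what you want.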