Count the frequency in one column... and use another column in a multidimensional array...

Date: 2015-03-12 19:49:00

Tags: php regex

I have a CSV file like this, with many more rows of course. It has 4 columns.

433444 20 2009 Part_description_433444
432222 15 2009 Part_description_432222
535333 10 2010 Part_description_535333
433444 15 2009 Part_description_433444
432222 .4 2012 Part_description_432222
535333 20 2010 Part_description_535333

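I want to count how many times each part number in the first column occurs. The second column is the quantity for the part number on that row, so I want to count the occurrences of each part number and multiply each occurrence by its quantity from the second column (in effect, total the quantities per part number). I'd also like to sort by the number of occurrences/quantity for each year, which is column 3.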

So far, all I have is the data formatted as a multidimensional array.

$formattedArr = array();
$filename = "parts_compile.csv";

if (($handle = fopen($filename, "r")) !== FALSE) {
    $key = 0;    // Set the array key.
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $count = count($data);  //get the total keys in row
        //insert data to our array
        for ($i=0; $i < $count; $i++) {
            $formattedArr[$key][$i] = $data[$i];
        }
        $key++;
    }
    fclose($handle);    //close file handle
}
//csv to multidimensional array in php
echo $formattedArr[2][0];

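I figured I would try to find the number of matches for each element in the first column while keeping a reference to the quantity in the second column, but I can't work out how to do this. Any help is greatly appreciated.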

Thanks, S

3 answers:

Answer 0: (score: 0)

<?php
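// read in CSV file
$filename = "parts_compile.csv";
$file = fopen($filename, "r");
$data = array();
// fgetcsv() returns false at end of file, so test its result instead of feof()
while (($row = fgetcsv($file)) !== false) {
   $data[] = $row;   // append each row rather than overwriting $data
}
fclose($file);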

// Count the number of times a given 1st-column field appears
$cnt_array = array();
foreach ($data as $d) {
   if (!isset($cnt_array[$d[0]]))
      $cnt_array[$d[0]] = 1;
   else
      $cnt_array[$d[0]]++;
}
// display it
foreach ($cnt_array as $k=>$v)
   echo "Part #$k occurs $v times<br />\n";

// Total column 2 for each value in column 1
$sum_array = array();
foreach ($data as $d) {
   if (!isset($sum_array[$d[0]]))
      $sum_array[$d[0]] = 0;
   $sum_array[$d[0]] += $d[1];
}
// display it
foreach ($sum_array as $k=>$v)
   echo "Part #$k has $v units<br />\n";

//Same as above, but keep track of the year column too
// Total column 2 for each value in column 1
$sum_array = array();
foreach ($data as $d) {
   if (!isset($sum_array[$d[0]]))
      $sum_array[$d[0]] = array(0,$d[2]);
   $sum_array[$d[0]][0] += $d[1];
}
// sort it by the first year mentioned for that part #
uasort($sum_array, 'sort2nd');   // the callback name must be passed as a string
// display it
foreach ($sum_array as $k=>$v)
   echo "Part #$k was introduced in $v[1] and has $v[0] units<br />\n";

// Here's the function to do the user-defined sort.
// Each element is array(total quantity, year), so the year is at index 1.
function sort2nd($a, $b)
{
   if ($a[1] == $b[1]) return 0;
   else return ($a[1] < $b[1]) ? -1 : 1;
}
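
If you only need the occurrence count, the first loop above can probably be shortened with two built-in functions. This is just a sketch, assuming PHP 5.5 or later (for array_column) and using the sample rows from the question typed in by hand instead of read from the file; $rows and $counts are names I made up for the example.

```php
<?php
// Sample rows copied from the question (part number, quantity, year, description).
$rows = array(
    array('433444', '20', '2009', 'Part_description_433444'),
    array('432222', '15', '2009', 'Part_description_432222'),
    array('535333', '10', '2010', 'Part_description_535333'),
    array('433444', '15', '2009', 'Part_description_433444'),
    array('432222', '.4', '2012', 'Part_description_432222'),
    array('535333', '20', '2010', 'Part_description_535333'),
);

// Pull out column 0, then count how often each part number occurs.
$counts = array_count_values(array_column($rows, 0));

foreach ($counts as $part => $n) {
    echo "Part #$part occurs $n times\n";
}
```

Note that PHP turns the numeric-string part numbers into integer array keys, which matters only if you compare the keys strictly.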

Answer 1: (score: 0)

Another possible solution:

<?php

$formattedArr = array();
$filename = "parts_compile.csv";

if (($handle = fopen($filename, "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        //insert data to our array
        $formattedArr[] = $data;
    }
    fclose($handle);    //close file handle
}

echo "<pre>";
print_r($formattedArr);
echo "</pre>";

$idx_id = 0;
$idx_amount = 1;
$idx_year = 2;
$idx_name = 3;

$res = array();

foreach($formattedArr as $a) {
    $id = $a[$idx_id];
    $amount = $a[$idx_amount];
    $year = $a[$idx_year];
    $name = $a[$idx_name];
    if(!isset($res[$id])) {
        $res[$id] = array('name'=>$name, 'quant_per_year'=>array());
    }
    if(!isset($res[$id]['quant_per_year'][$year])) {
        $res[$id]['quant_per_year'][$year] = 0;
    }
    $res[$id]['quant_per_year'][$year] += $amount;
}

echo "<pre>";
print_r($res);
echo "</pre>";

The output of $res is:

Array
(
    [433444] => Array
        (
            [name] => Part_description_433444
            [quant_per_year] => Array
                (
                    [2009] => 35
                )

        )

    [432222] => Array
        (
            [name] => Part_description_432222
            [quant_per_year] => Array
                (
                    [2009] => 15
                    [2012] => 0.4
                )

        )

    [535333] => Array
        (
            [name] => Part_description_535333
            [quant_per_year] => Array
                (
                    [2010] => 30
                )

        )

)
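
The code above does not sort anything yet. One way to read "sort by quantity per year" is to regroup the result by year and then order the parts inside each year by quantity. A rough sketch under that assumption follows; $res is filled in by hand from the printed output, and $byYear is a name invented for the example.

```php
<?php
// $res in the shape produced above, filled in by hand from the printed output.
$res = array(
    433444 => array('name' => 'Part_description_433444', 'quant_per_year' => array(2009 => 35)),
    432222 => array('name' => 'Part_description_432222', 'quant_per_year' => array(2009 => 15, 2012 => 0.4)),
    535333 => array('name' => 'Part_description_535333', 'quant_per_year' => array(2010 => 30)),
);

// Regroup as year => (part number => quantity).
$byYear = array();
foreach ($res as $id => $part) {
    foreach ($part['quant_per_year'] as $year => $quantity) {
        $byYear[$year][$id] = $quantity;
    }
}

ksort($byYear);                 // years in ascending order
foreach ($byYear as $year => &$parts) {
    arsort($parts);             // largest quantity first within each year
}
unset($parts);                  // break the reference left behind by foreach

print_r($byYear);
```

If you want a different order (for example newest year first), swapping ksort for krsort should be enough.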

Answer 2: (score: 0)

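I'm not sure how you want it sorted.

"I'd also like to sort by the number of occurrences/quantity for each year, which is column 3."

is a bit vague.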

<?php 

$filename = "parts_compile.csv";

$parts = array();

if (($handle = fopen($filename, "r")) !== FALSE) {
    // the data is delimited with a single space character in the question
    while (($columns = fgetcsv($handle, 0, " ")) !== FALSE) {

        $prod_number = $columns[0];
        $quantity    = $columns[1];
        $year        = $columns[2];

        if (!isset($parts[$prod_number])) {
            $parts[$prod_number] = array('name' => $columns[3], 'occurances' => 0, 'quantity' => 0, 'years' => array());
        }

        $parts[$prod_number]['occurances']++;
        $parts[$prod_number]['quantity'] += $quantity;

        if (!isset($parts[$prod_number]['years'][$year])) {
            $parts[$prod_number]['years'][$year] = 0;
        }

        $parts[$prod_number]['years'][$year] += $quantity;
    }

    fclose($handle);

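    // sort each part's per-year totals, largest quantity first
    foreach ($parts as &$part) {
        arsort($part['years']);
    }
    unset($part);   // break the reference left behind by foreach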
    echo '<pre>';
    print_r($parts);
}  

The output is:

<pre>
Array
(
    [433444] => Array
        (
            [name] => Part_description_433444
            [occurances] => 2
            [quantity] => 35
            [years] => Array
                (
                    [2009] => 35
                )
        )
    [432222] => Array
        (
            [name] => Part_description_432222
            [occurances] => 2
            [quantity] => 15.4
            [years] => Array
                (
                    [2009] => 15
                    [2012] => 0.4
                )
        )
    [535333] => Array
        (
            [name] => Part_description_535333
            [occurances] => 2
            [quantity] => 30
            [years] => Array
                (
                    [2010] => 30
                )
        )
)
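
If what you are after is a flat report ordered by year, one option is to aggregate per (part number, year) pair and sort the rows with usort. This is only a sketch of that interpretation: the sample rows are typed in by hand, and $totals, $report and the 'part|year' composite key are my own choices for the example.

```php
<?php
// Sample rows copied from the question (part number, quantity, year, description).
$rows = array(
    array('433444', '20', '2009', 'Part_description_433444'),
    array('432222', '15', '2009', 'Part_description_432222'),
    array('535333', '10', '2010', 'Part_description_535333'),
    array('433444', '15', '2009', 'Part_description_433444'),
    array('432222', '.4', '2012', 'Part_description_432222'),
    array('535333', '20', '2010', 'Part_description_535333'),
);

// Aggregate per (part number, year): occurrences and total quantity.
$totals = array();
foreach ($rows as $row) {
    list($part, $quantity, $year) = $row;
    $key = $part . '|' . $year;
    if (!isset($totals[$key])) {
        $totals[$key] = array('part' => $part, 'year' => (int) $year,
                              'occurrences' => 0, 'quantity' => 0);
    }
    $totals[$key]['occurrences']++;
    $totals[$key]['quantity'] += (float) $quantity;
}

// Sort by year ascending, then by quantity descending within a year.
$report = array_values($totals);
usort($report, function ($a, $b) {
    if ($a['year'] !== $b['year']) {
        return $a['year'] < $b['year'] ? -1 : 1;
    }
    if ($a['quantity'] == $b['quantity']) {
        return 0;
    }
    return $a['quantity'] > $b['quantity'] ? -1 : 1;
});

foreach ($report as $r) {
    echo "{$r['year']}: part #{$r['part']} occurs {$r['occurrences']} times, {$r['quantity']} units\n";
}
```

Changing the comparison inside the callback is all it takes to sort by occurrences instead of quantity.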