How to delete the first 100 rows from a CSV file using PHP

Asked: 2013-09-30 13:18:53

Tags: php csv syntax cron

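I have a PHP script that runs periodically and processes the first 100 rows of a CSV file. When it finishes, I want it to delete the processed rows from the CSV file.

I tried the code below, but it doesn't delete anything. I'm not sure of the best way to state the condition in PHP, and I'm not sure what to pass for $id. For testing purposes I've set the row count to 5. There may well be syntax errors.

Any suggestions? Tips?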

    function delete_line($id)
{
    if($id) 
    {
        $file_handle = fopen("file.csv", "w+");
        $myCsv = array();
        while (!feof($file_handle) )
        {
            $line_of_text = fgetcsv($file_handle, 1024);    
            if ($id != $line_of_text[0]) 
            {
                fputcsv($file_handle, $line_of_text);
            }
        }
fclose($file_handle);
    }
}

$in = fopen( 'file.csv', 'r');
$out = fopen( 'file.csv', 'w'); 
// Check whether they opened

while( $row = fgetcsv( $in, $cnt)){

  fputcsv( $out, $row);
}

fclose( $in); fclose( $out);

The whole script is here...

<?php

require_once('wp-config.php');



$siteurl = get_site_url();



//print_r($_FILES);

//if($_FILES['file']['tmp_name']) {

  //$upload = ABSPATH . 'file.csv';

  //move_uploaded_file($_FILES['file']['tmp_name'], $upload);

//}



function clearer($str) {

  //$str = iconv("UTF-8", "UTF-8//IGNORE", $str);

  $str = utf8_encode($str);

  $str = str_replace("’", "'", $str);

  $str = str_replace("–", "-", $str);

  return htmlspecialchars($str);

}



//file read

if(file_exists("file.csv")) $csv_lines  = file("file.csv");

if(is_array($csv_lines)) {



  $cnt = 5;

  for($i = 0; $i < $cnt; $i++) {

    $line = $csv_lines[$i];

    $line = trim($line);

    $first_char = true;

    $col_num = 0;

    $length = strlen($line);

    for($b = 0; $b < $length; $b++) {

      if($skip_char != true) {

        $process = true;

        if($first_char == true) {

          if($line[$b] == '"') {

            $terminator = '",';

            $process = false;

          }else

            $terminator = ',';

          $first_char = false;

        }



        if($line[$b] == '"'){

          $next_char = $line[$b + 1];

          if($next_char == '"')

            $skip_char = true;

          elseif($next_char == ',') {

            if($terminator == '",') {

              $first_char = true;

              $process = false;

              $skip_char = true;

            }

          }

        }



        if($process == true){

          if($line[$b] == ',') {

             if($terminator == ',') {

                $first_char = true;

                $process = false;

             }

          }

        }



        if($process == true)

          $column .= $line[$b];



        if($b == ($length - 1)) {

          $first_char = true;

        }



        if($first_char == true) {

          $values[$i][$col_num] = $column;

          $column = '';

          $col_num++;

        }

      }

      else

        $skip_char = false;

    }

  }



  $values = array_values($values);

  //print_r($values);



  /*************************************************/



  if(is_array($values)) {

    //file.csv read

    for($i = 0; $i < count($values); $i++) {

      unset($post);



      //check duplicate

      //$wpdb->show_errors();

      $wpdb->query("SELECT `ID` FROM `" . $wpdb->prefix . "posts`

                            WHERE `post_title` = '".clearer($values[$i][0])."' AND `post_status` = 'publish'");

        //echo $wpdb->num_rows;



      if($values[$i][0] != "Name" && $values[$i][0] != "" && $wpdb->num_rows == 0) {

        $post['name'] = clearer($values[$i][0]);

        $post['Address'] = clearer($values[$i][1]);

        $post['City'] = clearer($values[$i][2]);

        $post['Categories'] = $values[$i][3];

        $post['Tags'] = $values[$i][4];

        $post['Top_image'] = $values[$i][5];

        $post['Body_text'] = clearer($values[$i][6]);



        //details

        for($k = 7; $k <= 56; $k++) {

          $values[$i][$k] != '' ? $post['details'] .= "<em>".clearer($values[$i][$k])."</em>\r\n" : '';

        }



        //cats

        $categoryes = explode(";", $post['Categories']);

        foreach($categoryes AS $category_name) {

          $term = term_exists($category_name, 'category');

          if (is_array($term)) {

            //category exist

            $cats[] = $term['term_id'];

          }else{

            //add category

            wp_insert_term( $category_name, 'category' );

            $term = term_exists($category_name, 'category');

            $cats[] = $term['term_id'];

          }

        }



        //top image

        if($post['Top_image'] != "") {

          $im_name = md5($post['Top_image']).'.jpg';



          $im = @imagecreatefromjpeg($post['Top_image']); 

          if ($im) {

            imagejpeg($im, ABSPATH.'images/'.$im_name);

            $post['topimage'] = '<img class="alignnone size-full" src="'.$siteurl.'/images/'.$im_name.'" alt="" />';

          }

        }



        //bottom images

        for($k = 57; $k <= 76; $k++) {

          if($values[$i][$k] != '') {

            $im_name = md5($values[$i][$k]).'.jpg';



            $im = @imagecreatefromjpeg($values[$i][$k]);

            if ($im) {

              imagejpeg($im, ABSPATH.'images/'.$im_name);

              $post['images'] .= '<a href="'.$siteurl.'/images/'.$im_name.'"><img class="alignnone size-full" src="'.$siteurl.'/images/'.$im_name.'" alt="" /></a>';

            }

          }

        }



        $post = array_map( 'stripslashes_deep', $post );



        //print_r($post);



        //post created

        $my_post = array (

           'post_title' => $post['name'],

           'post_content' => '

              <em>Address: '.$post['Address'].'</em>

              '.$post['topimage'].'

              '.$post['Body_text'].'

              <!--more-->

              '.$post['details'].'

              '.$post['images'].'

           ',

           'post_status' => 'publish',

           'post_author' => 1,

           'post_category' => $cats

        );

        unset($cats);



        //add post

        //echo "ID:" .

        $postid = wp_insert_post($my_post); //post ID



        //tags

        wp_set_post_tags( $postid, str_replace(';',',',$post['Tags']), true ); //tags



        echo $post['name']. ' - added. ';



        //google coords

        $address = preg_replace("!\((.*?)\)!si", " ", $post['Address']).', '.$post['City'];

        $json = json_decode(file_get_contents('http://hicon.by/temp/googlegeo.php?address='.urlencode($address)));

        //print_r($json);



        if($json->status == "OK") {

          //address found

          $google['status'] = $json->status;



          $params = $json->results[0]->address_components;

          if(is_array($params)) {

            foreach($params AS $id => $p) {

              if($p->types[0] == 'locality') $google['locality_name'] = $p->short_name;

              if($p->types[0] == 'administrative_area_level_2') $google['sub_admin_code'] = $p->short_name;

              if($p->types[0] == 'administrative_area_level_1') $google['admin_code'] = $p->short_name;

              if($p->types[0] == 'country') $google['country_code'] = $p->short_name;

              if($p->types[0] == 'postal_code') $google['postal_code'] = $p->short_name;

            }

          }

          $google['address'] = $json->results[0]->formatted_address;

          $google['location']['lat'] = $json->results[0]->geometry->location->lat;

          $google['location']['lng'] = $json->results[0]->geometry->location->lng;



          //print_r($params);



          //print_r($google);



          //insert into DB

          $insert_code = $wpdb->insert( $wpdb->prefix . 'geo_mashup_locations',

                                        array( 'lat' => $google['location']['lat'], 'lng' => $google['location']['lng'], 'address' => $google['address'],

                                               'saved_name' => $post['name'], 'postal_code' => $google['postal_code'],

                                               'country_code' => $google['country_code'], 'admin_code' => $google['admin_code'],

                                               'sub_admin_code' => $google['sub_admin_code'], 'locality_name' => $google['locality_name'] ),

                                        array( '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s' )

                                      );

          if($insert_code) {

            $google_code_id = $wpdb->insert_id;

            $geo_date = date( 'Y-m-d H:i:s' );

            $wpdb->insert(

              $wpdb->prefix . 'geo_mashup_location_relationships',

              array( 'object_name' => 'post', 'object_id' => $postid, 'location_id' => $google_code_id, 'geo_date' => $geo_date ),

              array( '%s', '%s', '%s', '%s' )

            );

          }else{

            //can't insert data

          }



          echo ' address added.<br />';



        }else{

          //echo $json->status;

        }



      }

    } //$values end (for)

  }

}else{

  //not found file.csv

  echo 'not found file.csv';

}

function delete_line($id)
{
    if($id) 
    {
        $file_handle = fopen("file.csv", "w+");
        $myCsv = array();
        while (!feof($file_handle) )
        {
            $line_of_text = fgetcsv($file_handle, 1024);    
            if ($id != $line_of_text[0]) 
            {
                fputcsv($file_handle, $line_of_text);
            }
        }
fclose($file_handle);
    }
}

$in = fopen( 'file.csv', 'r');
$out = fopen( 'file.csv', 'w'); 
// Check whether they opened

while( $row = fgetcsv( $in, $cnt)){

  fputcsv( $out, $row);
}

fclose( $in); fclose( $out);

?>

Update

I tried the code from one of the answers and got a server error. The code is below. Any ideas?

<?php

require_once('wp-config.php');



$siteurl = get_site_url();



//print_r($_FILES);

//if($_FILES['file']['tmp_name']) {

  //$upload = ABSPATH . 'file.csv';

  //move_uploaded_file($_FILES['file']['tmp_name'], $upload);

//}



function clearer($str) {

  //$str = iconv("UTF-8", "UTF-8//IGNORE", $str);

  $str = utf8_encode($str);

  $str = str_replace("’", "'", $str);

  $str = str_replace("–", "-", $str);

  return htmlspecialchars($str);

}



//file read

if(file_exists("file.csv")) $csv_lines  = file("file.csv");

if(is_array($csv_lines)) {



  $cnt = 5;

  for($i = 0; $i < $cnt; $i++) {

    $line = $csv_lines[$i];

    $line = trim($line);

    $first_char = true;

    $col_num = 0;

    $length = strlen($line);

    for($b = 0; $b < $length; $b++) {

      if($skip_char != true) {

        $process = true;

        if($first_char == true) {

          if($line[$b] == '"') {

            $terminator = '",';

            $process = false;

          }else

            $terminator = ',';

          $first_char = false;

        }



        if($line[$b] == '"'){

          $next_char = $line[$b + 1];

          if($next_char == '"')

            $skip_char = true;

          elseif($next_char == ',') {

            if($terminator == '",') {

              $first_char = true;

              $process = false;

              $skip_char = true;

            }

          }

        }



        if($process == true){

          if($line[$b] == ',') {

             if($terminator == ',') {

                $first_char = true;

                $process = false;

             }

          }

        }



        if($process == true)

          $column .= $line[$b];



        if($b == ($length - 1)) {

          $first_char = true;

        }



        if($first_char == true) {

          $values[$i][$col_num] = $column;

          $column = '';

          $col_num++;

        }

      }

      else

        $skip_char = false;

    }

  }



  $values = array_values($values);

  //print_r($values);



  /*************************************************/



  if(is_array($values)) {

    //file.csv read

    for($i = 0; $i < count($values); $i++) {

      unset($post);



      //check duplicate

      //$wpdb->show_errors();

      $wpdb->query("SELECT `ID` FROM `" . $wpdb->prefix . "posts`

                            WHERE `post_title` = '".clearer($values[$i][0])."' AND `post_status` = 'publish'");

        //echo $wpdb->num_rows;



      if($values[$i][0] != "Name" && $values[$i][0] != "" && $wpdb->num_rows == 0) {

        $post['name'] = clearer($values[$i][0]);

        $post['Address'] = clearer($values[$i][1]);

        $post['City'] = clearer($values[$i][2]);

        $post['Categories'] = $values[$i][3];

        $post['Tags'] = $values[$i][4];

        $post['Top_image'] = $values[$i][5];

        $post['Body_text'] = clearer($values[$i][6]);



        //details

        for($k = 7; $k <= 56; $k++) {

          $values[$i][$k] != '' ? $post['details'] .= "<em>".clearer($values[$i][$k])."</em>\r\n" : '';

        }



        //cats

        $categoryes = explode(";", $post['Categories']);

        foreach($categoryes AS $category_name) {

          $term = term_exists($category_name, 'category');

          if (is_array($term)) {

            //category exist

            $cats[] = $term['term_id'];

          }else{

            //add category

            wp_insert_term( $category_name, 'category' );

            $term = term_exists($category_name, 'category');

            $cats[] = $term['term_id'];

          }

        }



        //top image

        if($post['Top_image'] != "") {

          $im_name = md5($post['Top_image']).'.jpg';



          $im = @imagecreatefromjpeg($post['Top_image']); 

          if ($im) {

            imagejpeg($im, ABSPATH.'images/'.$im_name);

            $post['topimage'] = '<img class="alignnone size-full" src="'.$siteurl.'/images/'.$im_name.'" alt="" />';

          }

        }



        //bottom images

        for($k = 57; $k <= 76; $k++) {

          if($values[$i][$k] != '') {

            $im_name = md5($values[$i][$k]).'.jpg';



            $im = @imagecreatefromjpeg($values[$i][$k]);

            if ($im) {

              imagejpeg($im, ABSPATH.'images/'.$im_name);

              $post['images'] .= '<a href="'.$siteurl.'/images/'.$im_name.'"><img class="alignnone size-full" src="'.$siteurl.'/images/'.$im_name.'" alt="" /></a>';

            }

          }

        }



        $post = array_map( 'stripslashes_deep', $post );



        //print_r($post);



        //post created

        $my_post = array (

           'post_title' => $post['name'],

           'post_content' => '

              <em>Address: '.$post['Address'].'</em>

              '.$post['topimage'].'

              '.$post['Body_text'].'

              <!--more-->

              '.$post['details'].'

              '.$post['images'].'

           ',

           'post_status' => 'publish',

           'post_author' => 1,

           'post_category' => $cats

        );

        unset($cats);



        //add post

        //echo "ID:" .

        $postid = wp_insert_post($my_post); //post ID



        //tags

        wp_set_post_tags( $postid, str_replace(';',',',$post['Tags']), true ); //tags



        echo $post['name']. ' - added. ';



        //google coords

        $address = preg_replace("!\((.*?)\)!si", " ", $post['Address']).', '.$post['City'];

        $json = json_decode(file_get_contents('http://hicon.by/temp/googlegeo.php?address='.urlencode($address)));

        //print_r($json);



        if($json->status == "OK") {

          //address found

          $google['status'] = $json->status;



          $params = $json->results[0]->address_components;

          if(is_array($params)) {

            foreach($params AS $id => $p) {

              if($p->types[0] == 'locality') $google['locality_name'] = $p->short_name;

              if($p->types[0] == 'administrative_area_level_2') $google['sub_admin_code'] = $p->short_name;

              if($p->types[0] == 'administrative_area_level_1') $google['admin_code'] = $p->short_name;

              if($p->types[0] == 'country') $google['country_code'] = $p->short_name;

              if($p->types[0] == 'postal_code') $google['postal_code'] = $p->short_name;

            }

          }

          $google['address'] = $json->results[0]->formatted_address;

          $google['location']['lat'] = $json->results[0]->geometry->location->lat;

          $google['location']['lng'] = $json->results[0]->geometry->location->lng;



          //print_r($params);



          //print_r($google);



          //insert into DB

          $insert_code = $wpdb->insert( $wpdb->prefix . 'geo_mashup_locations',

                                        array( 'lat' => $google['location']['lat'], 'lng' => $google['location']['lng'], 'address' => $google['address'],

                                               'saved_name' => $post['name'], 'postal_code' => $google['postal_code'],

                                               'country_code' => $google['country_code'], 'admin_code' => $google['admin_code'],

                                               'sub_admin_code' => $google['sub_admin_code'], 'locality_name' => $google['locality_name'] ),

                                        array( '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s', '%s' )

                                      );

          if($insert_code) {

            $google_code_id = $wpdb->insert_id;

            $geo_date = date( 'Y-m-d H:i:s' );

            $wpdb->insert(

              $wpdb->prefix . 'geo_mashup_location_relationships',

              array( 'object_name' => 'post', 'object_id' => $postid, 'location_id' => $google_code_id, 'geo_date' => $geo_date ),

              array( '%s', '%s', '%s', '%s' )

            );

          }else{

            //can't insert data

          }



          echo ' address added.<br />';



        }else{

          //echo $json->status;

        }



      }

    } //$values end (for)

  }

}else{

  //not found file.csv

  echo 'not found file.csv';

}


function csv_delete_rows($filename='file.csv', $startrow=1, $endrow=5, $inner=false) {
$status = 0;
//check if file exists
if (file_exists($filename)) {
    //end execution for invalid startrow or endrow
    if ($startrow < 0 || $endrow < 0 || $startrow > 0 && $endrow > 0 && $startrow > $endrow) {
        die('Invalid startrow or endrow value');
    }
    $updatedcsv = array();
    $count = 0;
    //open file to read contents
    $fp = fopen($filename, "r");
    //loop to read through csv contents
    while ($csvcontents = fgetcsv($fp)) {
        $count++;
        if ($startrow > 0 && $endrow > 0) {
            //delete rows inside startrow and endrow
            if ($inner) {
                $status = 1;
                if ($count >= $startrow && $count <= $endrow)
                    continue;
                array_push($updatedcsv, implode(',', $csvcontents));
            }
            //delete rows outside startrow and endrow
            else {
                $status = 2;
                if ($count < $startrow || $count > $endrow)
                    continue;
                array_push($updatedcsv, implode(',', $csvcontents));
            }
        }
        else if ($startrow == 0 && $endrow > 0) {
            $status = 3;
            if ($count <= $endrow)
                continue;
            array_push($updatedcsv, implode(',', $csvcontents));
        }
        else if ($endrow == 0 && $startrow > 0) {
            $status = 4;
            if ($count >= $startrow)
                continue;
            array_push($updatedcsv, implode(',', $csvcontents));
        }
        else if ($startrow == 0 && $endrow == 0) {
            $status = 5;
        } else {
            $status = 6;
        }
    }//end while
    if ($status < 5) {
        $finalcsvfile = implode("\n", $updatedcsv);
        fclose($fp);
        $fp = fopen($filename, "w");
        fwrite($fp, $finalcsvfile);
    }
    fclose($fp);
    return $status;
} else {
    die('File does not exist');
}
}


?>

<html>

<body>

<form enctype="multipart/form-data" method="post">

 CSV: <input name="file" type="file" />

 <input type="submit" value="Send File" />

</form>

</body>

</html>

2 Answers:

Answer 0 (score: 1)

Here is the PHP code:

$input = explode("\n", file_get_contents("file.csv"));
foreach ($input as $line) {
 // process all lines.
}

// This function removes first 100 elements.
// More info:
// http://php.net/manual/en/function.array-slice.php
$output = array_slice($input, 100);
file_put_contents("out.csv", implode("\n", $output));

Note that if the CSV file contains a header row, you have to remove the first element from the array $input.
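The idea in this answer can be sketched as a small function that writes the result back to the same file and keeps a header line if there is one. This is only a sketch under assumptions: `drop_first_lines` and its parameters are hypothetical names, not part of the answer, and it splits on physical lines, so it would miscount if a quoted CSV field contains a line break.

```php
<?php
// Hypothetical helper (not from the answer above): removes the first $count
// data lines of a file, optionally preserving a header line at the top.
// Assumes one CSV record per physical line.
function drop_first_lines(string $path, int $count, bool $hasHeader = false): int
{
    // Read all lines without their trailing newline characters.
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Take the header off the front so it is not counted as a data line.
    $header = $hasHeader ? array_splice($lines, 0, 1) : [];

    $removed = min($count, count($lines));
    $remaining = array_merge($header, array_slice($lines, $count));

    // Rebuild the file contents; an empty result yields an empty file.
    $data = $remaining ? implode("\n", $remaining) . "\n" : '';
    if (file_put_contents($path, $data, LOCK_EX) === false) {
        throw new RuntimeException("Cannot write $path");
    }
    return $removed;
}
```

In the asker's script this could be called once the batch has been processed, for example `drop_first_lines('file.csv', $cnt, true);`, since the script treats a first column equal to "Name" as a header.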

Answer 1 (score: 0)

I have used this script, written by Joby Joseph, in the past:

function csv_delete_rows($filename=NULL, $startrow=0, $endrow=0, $inner=true) {
$status = 0;
//check if file exists
if (file_exists($filename)) {
    //end execution for invalid startrow or endrow
    if ($startrow < 0 || $endrow < 0 || $startrow > 0 && $endrow > 0 && $startrow > $endrow) {
        die('Invalid startrow or endrow value');
    }
    $updatedcsv = array();
    $count = 0;
    //open file to read contents
    $fp = fopen($filename, "r");
    //loop to read through csv contents
    while ($csvcontents = fgetcsv($fp)) {
        $count++;
        if ($startrow > 0 && $endrow > 0) {
            //delete rows inside startrow and endrow
            if ($inner) {
                $status = 1;
                if ($count >= $startrow && $count <= $endrow)
                    continue;
                array_push($updatedcsv, implode(',', $csvcontents));
            }
            //delete rows outside startrow and endrow
            else {
                $status = 2;
                if ($count < $startrow || $count > $endrow)
                    continue;
                array_push($updatedcsv, implode(',', $csvcontents));
            }
        }
        else if ($startrow == 0 && $endrow > 0) {
            $status = 3;
            if ($count <= $endrow)
                continue;
            array_push($updatedcsv, implode(',', $csvcontents));
        }
        else if ($endrow == 0 && $startrow > 0) {
            $status = 4;
            if ($count >= $startrow)
                continue;
            array_push($updatedcsv, implode(',', $csvcontents));
        }
        else if ($startrow == 0 && $endrow == 0) {
            $status = 5;
        } else {
            $status = 6;
        }
    }//end while
    if ($status < 5) {
        $finalcsvfile = implode("\n", $updatedcsv);
        fclose($fp);
        $fp = fopen($filename, "w");
        fwrite($fp, $finalcsvfile);
    }
    fclose($fp);
    return $status;
} else {
    die('File does not exist');
}
}

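The function takes 4 parameters:

  1. filename (string): path to the CSV file, e.g. myfile.csv

  2. startrow (int): the first row of the region to delete

  3. endrow (int): the last row of the region to delete

  4. inner (boolean): whether rows are deleted inside the region (true) or outside it (false)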

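Now let's consider various cases, using a CSV file named 'test.csv' (shown as a screenshot in the original post).

Example 1:

    $status = csv_delete_rows('test.csv', 3, 5, true);

This deletes rows 3 through 5 (the part marked in red in the original screenshot).

Example 2:

    $status = csv_delete_rows('test.csv', 3, 5, false);

This deletes every row outside rows 3 through 5, so only rows 3 to 5 remain (the deleted part was marked in red in the original screenshot).

Example 3: As in your case, if you want to delete the first 100 rows, use:

    $status = csv_delete_rows('test.csv', 0, 100);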
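One thing to be aware of: the function above rebuilds each row with `implode(',', ...)`, so a field that itself contains a comma or a quote is written back without quoting. For the "delete the first N rows" case only, a possible alternative is to stream the rows through `fgetcsv` and `fputcsv` into a temporary file and then replace the original. The sketch below assumes the default delimiter and enclosure, and `csv_drop_first_rows` and the `.tmp` suffix are hypothetical choices, not part of the answer.

```php
<?php
// Hypothetical helper: removes the first $count CSV records from $path.
// Rows are re-written with fputcsv, so fields containing commas or quotes
// are quoted again on output.
function csv_drop_first_rows(string $path, int $count): int
{
    $in = fopen($path, 'r');
    if ($in === false) {
        throw new RuntimeException("Cannot open $path for reading");
    }

    // Write to a separate file: opening $path itself with mode 'w'
    // would truncate it before any row had been read.
    $tmpPath = $path . '.tmp';
    $out = fopen($tmpPath, 'w');
    if ($out === false) {
        fclose($in);
        throw new RuntimeException("Cannot open $tmpPath for writing");
    }

    $seen = 0;
    $removed = 0;
    while (($row = fgetcsv($in, 0, ',', '"', '\\')) !== false) {
        $seen++;
        if ($seen <= $count) {
            $removed++;      // one of the first $count records: drop it
            continue;
        }
        fputcsv($out, $row, ',', '"', '\\');
    }

    fclose($in);
    fclose($out);

    // Replace the original file with the shortened copy.
    if (!rename($tmpPath, $path)) {
        throw new RuntimeException("Cannot replace $path");
    }
    return $removed;
}
```

For the question's case this would be `csv_drop_first_rows('file.csv', 100);`. Because the script runs from cron, it may also be worth making sure two runs cannot overlap, which this sketch does not attempt.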