How do I do this file operation in Perl?

Asked: 2011-12-15 16:54:30

Tags: performance perl file-io

So my file looks like this:

--some comments--
--a couple of lines of header info--
 comp:
  name: some_name_A
  type: some_type
  id:   an id_1
  owner: who owns it
  path:  path_A to more data
 end_comp

 comp:
  name: some_name_B
  type: some_type
  id:   an id_2
  owner: who owns it
  path:  path_B to more data
 end_comp  

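What I want to do: get the name from the name field, check whether it matches one of the names we're searching for (already available in an array), then get the path, go to that path, do some Perforce stuff to obtain a new id, and replace the current id with the new id if the two differ.

What I've done (pseudocode only):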

@filedata = <read_file>;   # read the file into an array
$names_to_search = join("|", @some_names);

foreach $line (@filedata)
{
 if( $line =~ /comp:/ )
 {
   $line = <next line>;
   if( $line =~ /name:\s*(?:$names_to_search)\s*$/ )  # group the alternation so "name:" applies to every name
   {
    #loop until we find the id
    #remember this index since we need to change this id

    #loop until we find the path field
    #get the path, go to that path, do some perforce commands and obtain new id
    if( id is same as current id ) no action required
    else replace current id with new id
   }
  }
}

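Problem: my current implementation has three while loops! Is there a better / more efficient / more elegant way to do this?

3 Answers:

Answer 0: (score: 4)

You've written your config file in a custom format and are then trying to parse it by hand. Instead, why not write the file in an established format such as YAML or INI, and then parse it with an existing module?

For example, using YAML: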

use YAML::Any;
my @data = YAML::Any::LoadFile($filename) or die "Could not read from $filename: $!";

# now you have your data structure in @data; parse it using while/for/map loops.

You can use Config::INI or Config::INI::Simple to read INI files.
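
As a rough illustration of the YAML route, here is a minimal sketch. The YAML layout (a list with one mapping per comp block) is an assumption, not something the question's file already uses, and the data is inlined so the snippet stands alone; a real script would read the file with LoadFile instead.

```perl
use strict;
use warnings;
use YAML::Any;    # exports Load and Dump by default

# Assumed YAML layout for the same records: a list of mappings.
my $yaml = <<'END_YAML';
---
- name: some_name_A
  type: some_type
  id: an id_1
  owner: who owns it
  path: path_A to more data
- name: some_name_B
  type: some_type
  id: an id_2
  owner: who owns it
  path: path_B to more data
END_YAML

my $comps = Load($yaml);    # array ref of hash refs
for my $comp (@$comps) {
    print "$comp->{name}: id=$comp->{id}, path=$comp->{path}\n";
}
```

With the records in this shape, finding a name and changing its id is ordinary hash access, and Dump can write the structure back out.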

Answer 1: (score: 1)

Here's some pseudocode:

index = 0;

index_of_id = 0; // this is the index of the line that contains the current company id

have_company = false; // track whether we are processing a company

while (line in @filedata)
{
  if (!have_company)
  {
    if (line is not "company") 
    {
      ++index;
      continue;
    }
    else
    {
      index_of_id = 0;
      have_company = true;
    }
  }
  else
  {
    if (line is "end_comp")
    {
      have_company = false; // force to start looking for new company
      ++index;
      continue;
    }

    if (line is "id")
      index_of_id = index;  // save the index

    if (line is "path")
    {
      // do your stuff then replace the string at the index given by index_of_id
    }
  }
  // line index
  ++index; 
}

// Now write the modified array to file
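
The single-pass idea in the pseudocode above could look roughly like this in Perl. This is only a sketch: lookup_new_id is a hypothetical stand-in for the Perforce commands, update_ids is an illustrative name, and the sample lines are inlined instead of being read from a file.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Perforce lookup; a real script would run
# its p4 commands for the given path here.
sub lookup_new_id {
    my ($path) = @_;
    return "new id for $path";
}

# Walk the lines once. Inside a comp block whose name is wanted, remember
# where the id line is, and rewrite it when the path line is reached.
sub update_ids {
    my ($lines, $wanted) = @_;    # array ref of lines, hash ref of names
    my ($in_comp, $matched, $id_index) = (0, 0, undef);

    for my $i (0 .. $#$lines) {
        my $line = $lines->[$i];
        if (!$in_comp) {
            # Look for the start of the next block.
            if ($line =~ /^\s*comp:\s*$/) {
                ($in_comp, $matched, $id_index) = (1, 0, undef);
            }
        }
        elsif ($line =~ /^\s*end_comp\s*$/) {
            $in_comp = 0;
        }
        elsif ($line =~ /^\s*name:\s*(.*?)\s*$/) {
            $matched = exists $wanted->{$1};
        }
        elsif ($matched && $line =~ /^\s*id:/) {
            $id_index = $i;    # save the index of the id line
        }
        elsif ($matched && defined $id_index
               && $line =~ /^\s*path:\s*(.*?)\s*$/) {
            my $new_id = lookup_new_id($1);
            my ($prefix, $old_id) =
                $lines->[$id_index] =~ /^(\s*id:\s*)(.*?)\s*$/;
            $lines->[$id_index] = "$prefix$new_id\n" if $new_id ne $old_id;
        }
    }
    return $lines;
}

my @filedata = (
    " comp:\n", "  name: some_name_A\n", "  id:   an id_1\n",
    "  path:  path_A to more data\n", " end_comp\n",
);
update_ids(\@filedata, { some_name_A => 1 });
print @filedata;
```

Keeping the wanted names in a hash makes the name check a single lookup, and only the id line is rewritten, so the rest of the file keeps its layout when the array is written back.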

Answer 2: (score: 1)

Since no two blocks can have the same name value, you can use a hash reference of hash references:

{
  "name1"=>{type=>"type1",id=>"id1",owner=>"owner1",path=>"path1"},
  "name2"=>{type=>"type2",id=>"id2",owner=>"owner2",path=>"path2"},
  #etc
}

Something like this should work (warning: untested):

use strict;
use warnings;

open(my $read,"<","input_file.txt") or die $!;

my $data={};
my $current_name=""; #Placeholder for the name that we're currently using.

while(<$read>)
{
  chomp; #get rid of trailing newline character.

  if(/^\s*name:\s*(.+?)\s*$/) #If we hit a line specifying a name, 
                              #then this is the name we're working with
  {
    $current_name=$1;
  }
  elsif(/^\s*(type|id|owner|path):\s*(.+?)\s*$/) #If it's data to go with the name, 
                                                 #then assign it. (.+?) also accepts
                                                 #values that contain spaces.
  {
    $data->{$current_name}->{$1}=$2;
  }
}

close($read);

#Now you can search your given array for each of the names and do what you want from there.
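
To illustrate that last step, here is a minimal sketch of searching the parsed hash for the wanted names and updating ids that differ. lookup_new_id is a hypothetical stand-in for the Perforce commands, and the sample $data and @some_names are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the Perforce commands that produce the new id.
sub lookup_new_id {
    my ($path) = @_;
    return "id for $path";
}

# Same shape as the hash of hashes built by the parsing loop.
my $data = {
    some_name_A => { id => "an id_1",       path => "path_A" },
    some_name_B => { id => "id for path_B", path => "path_B" },
};
my @some_names = ("some_name_A", "some_name_B", "missing_name");

my @changed;
for my $name (@some_names) {
    my $comp = $data->{$name} or next;    # skip names not in the file
    my $new_id = lookup_new_id($comp->{path});
    next if $new_id eq $comp->{id};       # same id: no action required
    $comp->{id} = $new_id;
    push @changed, $name;
}
print "updated: @changed\n";
```

Note that the hash does not remember where each line sat in the file, so the updated records still have to be written back out in the original layout.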

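However, if you can, I'd really recommend storing the data in some standard format (YAML, INI, JSON, XML, etc.) and then parsing it appropriately. I'll also add that this code depends on the name line appearing before the corresponding type, id, owner, and path lines in each block.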