Reading a text file into an array in Perl

Date: 2017-07-21 05:25:22

Tags: perl

    my $file = '/var/tmp/temp_data.txt';      ##################### File for reading
    open(FILE, '<:encoding(UTF-8)', $file) || die("Unable to open file");
    my @fieldss = <FILE>;
    close(FILE);
    chomp @fieldss;

    my @fields = split(':', @fieldss);####################### Splitting the lines in file ##############
####################################### Dereference ############################
    my $names = $fields[0];
    my $rate = $fields[1];
    my $no_of_days = $fields[2];
    my $total_salary = $fields[3];
    my $total = $fields[4];
    my $basic = $fields[5];
    my $da = $fields[6];
    my $hra = $fields[7];
    my $ot_hallowance = $fields[8];
    my $gross = $fields[9];
    my $epf = $fields[10];
    my $nett = $fields[11];

    print "$names\n $rate\n";

        foreach $names (@fields) {
                my $dbh = "DBI:$platform:$database:$host:$port";
                my $connect = DBI->connect($dbh, $user, $pw) || die $DBI::errstr;
                $query = "SELECT * FROM salary WHERE name IN('sssss', 'ffffff', 'dddddd', 'ddededed', 'garaead', 'adgfdfg', 'gadfredg')";
                my $sth = $connect->prepare($query);
                $sth->execute() || die $DBI::errstr;

                my @data2;
                while (@data2 = $sth->fetchrow_array()) {
                    my $name = $data2[0];
                    my $email = $data2[1];

                        if ($names eq $name) {  ######################### Comparing names in file and database ########################

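The code above reads a text file and splits the data with the split function. My problem is that I have made a mistake in reading and splitting the file. I get a 504 error when I run the code, and I think that is because the database is called for every line in the text file. Thanks for your help.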

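2 Answers:

Answer 0 (score: 2)

It would be helpful to see what the data in your input file looks like. But there is definitely something strange going on in your code. The main problem appears to be this line: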

my @fields = split(':', @fieldss);

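split() takes two scalar arguments: the regex to split on and the string to split. So it evaluates @fieldss in scalar context and gets the number of elements in the array. That will, of course, be an integer, which contains no colons. As a result, @fields ends up containing a single element, which is the number of elements in @fieldss. This example demonstrates it: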

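use strict;
use warnings;
use feature 'say';
use Data::Dumper;

my @arr = qw[1:foo 2:bar 3:baz];

# split() puts @arr in scalar context, so it splits the string "3".
my @arr2 = split /:/, @arr;

say Dumper @arr2;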

The output I get is:

$VAR1 = '3';

Your code to set up @fields should look more like this:

my $file = '/var/tmp/temp_data.txt';
open(my $fh, '<:encoding(UTF-8)', $file)
  # Note: Added filename and error code to output.
  || die("Unable to open $file: $!");

my @fields;
while (<$fh>) {
  chomp;
  push @fields, split /:/;
}
close $fh;

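I haven't looked at the rest of the code in detail, but it certainly raises some questions. Copying the array into individual variables is probably pointless, but it can be done in a single statement: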

my ($names, $rate, $no_of_days, $total_salary, $total,
    $basic, $da, $hra, $ot_hallowance, $gross, $epf,
    $nett) = @fields;

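You then iterate over @fields, apparently assuming that every element is a name. Perhaps you need to explain in more detail what you are trying to do.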

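However, none of this explains the 504 status code you are getting. You mention that you are running a database query for every line in the text file. It is true that you run exactly the same query on every iteration of the loop, which seems very wasteful (run it once and store the results in a variable). But we have already established that @fields can only contain one element, so your loop runs only once and you query the database only once.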
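A minimal sketch of the "run it once and store the results" idea. To stay runnable without a database, `@db_rows` is a hypothetical stand-in for the rows a single query would return (for example from DBI's `selectall_arrayref`), and the names and addresses are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the rows one query would return, e.g. from
# DBI's $dbh->selectall_arrayref($sql); each row is [name, email].
my @db_rows = (
    [ 'alice', 'alice@example.com' ],
    [ 'bob',   'bob@example.com'   ],
);

# Build the lookup table once, before looping over the file data.
my %email_for = map { $_->[0] => $_->[1] } @db_rows;

# Hypothetical names taken from the input file.
my @names_from_file = ('bob', 'carol', 'alice');

my @matched;
for my $name (@names_from_file) {
    # A hash lookup replaces the per-line query and the inner while loop.
    next unless exists $email_for{$name};
    push @matched, "$name <$email_for{$name}>";
}

print "$_\n" for @matched;
```

With this shape the database is contacted once, however many lines the file has, and the name comparison becomes a constant-time hash lookup.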

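Strictly speaking, this probably isn't an answer to your question. But your question leaves so much unexplained that it is impossible to answer. Hopefully this gives you some insight into how to improve your program, so that we can start to work on the real problem, which is currently hidden.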

Update: I have just realised what the structure of your input file is. You probably want to read it into an array of hashes.
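A minimal sketch of what reading the file into an array of hashes could look like; it is not the answerer's original code. It assumes each line holds twelve colon-separated fields in the order of the variables in the question. The sample data and the in-memory filehandle are made up for illustration.

```perl
use strict;
use warnings;

# Column names taken from the variables in the question; their order in the
# real file is an assumption.
my @columns = qw(names rate no_of_days total_salary total basic
                 da hra ot_hallowance gross epf nett);

# Hypothetical sample data standing in for /var/tmp/temp_data.txt.
my $sample = "alice:10:20:200:200:100:40:30:10:180:20:160\n"
           . "bob:12:22:264:264:130:50:40:14:234:26:208\n";

# Reading from an in-memory filehandle keeps the sketch self-contained.
open(my $fh, '<', \$sample) or die "Unable to open sample data: $!";

my @records;
while (my $line = <$fh>) {
    chomp $line;
    my %record;
    # Hash slice: pair each column name with the matching field.
    @record{@columns} = split /:/, $line;
    push @records, \%record;
}
close $fh;

print "$_->{names} earns $_->{nett}\n" for @records;
```

Each element of `@records` is then one employee, so a later loop can use `$record->{names}` instead of assuming that every element of a flat array is a name.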

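Answer 1 (score: 0)

I don't even know what your file looks like. But I have done this kind of thing before, when reading Protein Data Bank (PDB) files into an array and then separating them into their whitespace-delimited columns. I can't guarantee that my code will work, because I don't have your file, but in any case the problem is in how you read the file and then split it into columns.

Your split isn't working because you aren't splitting on whatever actually separates your columns, which is probably a comma or at least one space. I have never seen a file with colon-separated columns like the one you describe. Also, yes, you can read the file into an array, but to get all the columns as scalars for each line, I think you need to do that inside a loop; otherwise you get a single value for each column that does not extend across the whole file. OK, you really are splitting on a colon. Sorry for the trouble. Please forgive me.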

#!/usr/bin/env perl
# Program Description: _________ 
use strict;
use warnings;
use feature 'say'; # say is like print, but you don't need to type \n.

my $file = '/var/tmp/temp_data.txt';      ##################### File for reading
my $filehandle; # Just like your bareword filehandle FILE. 
my @fields = (); # My use of fields is the same as yours. 
open($filehandle, '<', $file) or die "Could not open file $file because $!";

my @All_Lines = <$filehandle>; # This is just everything in the file ... all lines. 
my $line; # Our current line which will be useful for the split. 
close $filehandle;

# We will declare all of our variables outside of the loop
# in which we split each line of the file on a colon,
# so that they are not scoped to the loop body.
my $names;
my $rate;
my $no_of_days;
my $total_salary;
my $total;
my $basic;
my $da;
my $hra;
my $ot_hallowance;
my $gross;
my $epf;
my $nett;

# Iterating over every line of the file in a foreach loop. 
foreach $line (@All_Lines)
{

     chomp $line; # Remove the trailing newline so the last field is clean.
     @fields = split(':', $line); # splitting on a colon. 

$names = $fields[0];
$rate = $fields[1];
$no_of_days = $fields[2];
$total_salary = $fields[3];
$total = $fields[4];
$basic = $fields[5];
$da = $fields[6];
$hra = $fields[7];
$ot_hallowance = $fields[8];
$gross = $fields[9];
$epf = $fields[10];
$nett = $fields[11];

say "$names $rate";

# Then I'm pretty sure that I can dump the rest of your code here
# And you should because @fields is now within this foreach loop. It doesn't exist anywhere else. 
# I'm pretty sure I'm justified in removing your foreach loop here
# Because in mine, the column of names will be read and operated on line by line. 
# Also, your goal was to iterate over $names with a foreach loop
# which I'm doing right now, but where are you operating on it? 

       # Perhaps you should declare these variables with "my" before we start looping through all the lines of the file one line at a time.
       # But I think you will be okay having them here. 
        my $dbh = "DBI:$platform:$database:$host:$port";
        my $connect = DBI->connect($dbh, $user, $pw) || die $DBI::errstr;
        # You didn't declare $query with "my". Under "use strict" it has to be declared, so I've added it.
        my $query = "SELECT * FROM salary WHERE name IN('sssss', 'ffffff', 'dddddd', 'ddededed', 'garaead', 'adgfdfg', 'gadfredg')";
        my $sth = $connect->prepare($query);
        $sth->execute() || die $DBI::errstr;

         my @data2;
         # So @data2 gets formed when you start the while loop, right?
        while (@data2 = $sth->fetchrow_array()) 
                {
                    my $name = $data2[0];
                    my $email = $data2[1];

                        if ($names eq $name) 
                        {
                              ######################### Comparing names in file and database ######################## 
                        } # This bracket ends your if conditional
                } # This bracket ends your while loop.

} # This bracket ends my foreach loop that you want to stay within. 

# Note: Commenting where brackets begin and end can be really useful when you are trying to stay within a loop. 

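Note: I tested the split on a file whose columns are separated by one or more spaces. It was a PDB file that looks like the one below. Each line of a PDB file can be handled on its own, so I'm pretty sure this will work for your file.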

ATOM     38  CB  PHE A   7      15.240  41.685  54.772  1.00 10.03           C
ATOM     39  CG  PHE A   7      15.740  42.936  54.063  1.00 12.39           C
ATOM     40  CD1 PHE A   7      15.096  44.166  54.328  1.00 11.59           C
ATOM     41  CD2 PHE A   7      16.851  42.864  53.189  1.00 10.59           C
ATOM     42  CE1 PHE A   7      15.540  45.310  53.626  1.00 12.79           C
ATOM     43  CE2 PHE A   7      17.329  44.058  52.581  1.00 13.07           C
ATOM     44  CZ  PHE A   7      16.651  45.259  52.776  1.00 12.33           C
ATOM    156  CG  GLU A  20      14.635  48.596  50.249  1.00  9.28           C
ATOM    157  CD  GLU A  20      15.173  49.229  51.481  1.00 10.96           C
ATOM    158  OE1 GLU A  20      15.790  50.279  51.395  1.00 11.04           O
ATOM    159  OE2 GLU A  20      14.851  48.625  52.550  1.00 13.31           O 
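A minimal sketch of the whitespace split described in the note, using two of the ATOM records shown above. The choice of which columns to print is only for illustration.

```perl
use strict;
use warnings;

# Two of the ATOM records shown above.
my @lines = (
    "ATOM     38  CB  PHE A   7      15.240  41.685  54.772  1.00 10.03           C\n",
    "ATOM    159  OE2 GLU A  20      14.851  48.625  52.550  1.00 13.31           O \n",
);

my @atoms;
for my $line (@lines) {
    # split ' ' (a single-space string) skips leading whitespace and splits
    # on runs of whitespace, so no separate chomp is needed.
    my @columns = split ' ', $line;
    # Columns 2 and 3 hold the atom name and residue name in these lines.
    push @atoms, "$columns[2] $columns[3]";
}

print "$_\n" for @atoms;
```

PDB is formally a fixed-column format, so splitting on whitespace is a shortcut that works for lines like these but can break when adjacent fields run together. For a colon-separated file the same loop applies with `split /:/, $line` after a `chomp`.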

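Now, I would normally close the file later in the program, but you are right. You can save its contents into an array and keep working on that array line by line even after the file has been closed.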