Question

Given the following CSV data:

name,place,animal
a,,
b,,
a,,
,b,

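The name field has a value in 3 rows but not in 1 row.
The place field has a value in 1 row but not in the other 3.
The animal field is empty in all rows -> get this column name.

I want to get a column name only when that column is empty in every row.

I am trying to write a Perl script, but I'm not sure how to approach this.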

Step 1: Check every column in the first row; if a column is not empty, don't check it again in later rows.
Step 2: Repeat step 1 in a loop over the remaining rows; whatever columns are left at the end are the output. This also brings down the complexity, since we no longer care about columns that have had a value even once.

I will implement the code and post it here.

But if you have any other ideas, please let me know.

Thanks

Answer 1

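For a CSV file without quoting or escaping, keep a hash of the columns that have been empty so far. Read the file line by line and delete every non-empty column from the hash: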

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

chomp( my @column_names = split /,/, <> );
my %empty;
@empty{ @column_names } = ();

while (<>) {
    chomp;
    my @columns = split /,/;
    for my $i (0 .. $#columns) {
        delete $empty{ $column_names[$i] } if length $columns[$i];
    }
}

say for keys %empty;
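
One detail worth noting about the script above: by default, split discards trailing empty fields, so a row such as "a,," yields a single element. The loop over 0 .. $#columns therefore just never visits the trailing empty columns, which is harmless here. The following is a minimal sketch of that behavior; the helper name field_counts is made up for illustration:

```perl
#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

# Hypothetical helper: returns how many fields split produces for a row,
# first with the default behavior, then with a negative limit.
sub field_counts {
    my ($row) = @_;
    my @default = split /,/, $row;        # trailing empty fields are dropped
    my @all     = split /,/, $row, -1;    # a negative limit keeps them
    return (scalar @default, scalar @all);
}

# Sample rows taken from the question's data.
for my $row ('a,,', ',b,') {
    my ($default, $all) = field_counts($row);
    say "'$row': $default field(s) by default, $all with limit -1";
}
```

If you ever need the full field count per row (for example, to validate the row width), passing -1 as the third argument to split is the way to keep the trailing empty fields.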

For real CSV files, use Text::CSV_XS; the approach is the same: populate the hash with the column names, then delete the non-empty ones:

#!/usr/bin/perl
use warnings;
use strict;
use feature qw{ say };

use Text::CSV_XS qw{ csv };

my %empty;

csv(in      => shift,
    out     => \ 'skip',
    headers => sub { undef $empty{ $_[0] }; $_[0] },
    on_in   => sub {
        my (undef, $columns) = @_;
        delete @empty{ grep length $columns->{$_}, keys %$columns }
    },
);

say for keys %empty;

Answer 2

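While processing the rows, update an auxiliary array that tracks the truth value of each field.

If a field in a new row is non-empty, the corresponding element of the array flips to true; otherwise it stays false. At the end, the indices of the false elements of the array identify the empty columns.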

use warnings;
use strict;
use feature 'say';    
use Text::CSV;

my $file = 'cols.csv';
my $csv = Text::CSV->new( { binary => 1 } ) 
    or die "Cannot use CSV: " . Text::CSV->error_diag (); 

open my $fh, '<', $file or die "Can't open $file: $!";

my @col_names = @{ $csv->getline($fh) };

my @mask;
while (my $line = $csv->getline($fh)) {
    @mask = map { $mask[$_] || $line->[$_] ne '' } (0..$#$line);
}

for (0..$#mask) {
    say "Column \"$col_names[$_]\" is empty" if not $mask[$_];
}

Syntax: $#$line is the index of the last element of the array referenced by $line (just as $#ary is for @ary).
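
As a variation on the same mask idea, here is a minimal sketch that updates the mask in place instead of rebuilding it for every row, so a row with fewer fields than the header would not shorten the mask. It works on in-memory data rather than a file; the function name empty_columns and its arguments are assumptions made for illustration, not part of the answer above:

```perl
use warnings;
use strict;
use feature 'say';

# Hypothetical helper: $names is an arrayref of column names,
# $rows is an arrayref of arrayrefs holding the parsed fields of each row.
# Returns the names of the columns that are empty in every row.
sub empty_columns {
    my ($names, $rows) = @_;
    my @mask = (0) x @$names;    # one false flag per column
    for my $row (@$rows) {
        for my $i (0 .. $#$row) {
            # Flip to true once a defined, non-empty value is seen.
            $mask[$i] ||= defined $row->[$i] && length $row->[$i];
        }
    }
    return map { $names->[$_] } grep { !$mask[$_] } 0 .. $#$names;
}

# The data from the question, already split into fields.
my @names = qw(name place animal);
my @rows  = ( ['a', '', ''], ['b', '', ''], ['a', '', ''], ['', 'b', ''] );
say for empty_columns(\@names, \@rows);
```

With the question's data this reports only the animal column. Because the result is built by walking the indices in order, the column names come out in header order.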

Find columns that have only empty values in the whole file
