Question

I have a CSV file that looks like this:

ACDB,this is a sentence
BECD,this is another sentence
BCAB,this is yet another

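Each character in the first column corresponds to a word in the second column. For example, in the first row, A corresponds to "this", C to "is", D to "a", and B to "sentence".

Given a variable character, which can be set to any character that appears in the first column, I need to isolate the words that correspond to the selected letter. For example, if I set character="B", the output for the file above would be:

sentence
this
this another

If I set character="C", the output would be:

is
another
is

How can I output only the words that correspond to the position of the selected letter?

The file contains many UTF-8 characters.
For every line, the number of characters in column 1 equals the number of words in column 2.
The words in column 2 are separated by spaces.

Here is the code I have so far:

character="B"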

Answer 1

This is a mostly complete, but not finished, answer.

Since SO is not a "do my work for me" site, you will need to fill in some trivial blanks.

sub get_index_of_char {
   my ($character, $charset) = @_;
   # Homework: read about index() function
   #http://perldoc.perl.org/functions/index.html
}

sub split_line {
    my ($line) = @_;
    # Separate the line into a charset (before the comma)
    # and a whitespace-separated word list.
    # A repeated capture group keeps only its last match, so use split instead of one regex.
    my ($charset, $rest) = split /,/, $line, 2;
    my @words = split ' ', ($rest // '');
    return ($charset, \@words);
}

sub process_line {
    my ($line, $character) = @_;
    chomp($line);
    my ($charset, $words) = split_line($line);
    my $index = get_index_of_char($character, $charset);
    print $words->[$index] . "\n"; # Could contain an off-by-one bug
}

# Here be the main loop calling process_line() for every line from input

Answer 2

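This seems to do the trick. It reads the data from the source file using the DATA filehandle, whereas you will have to obtain it from your own source. You may also have to cater for there being no words corresponding to a given letter (as with "A" in the second data line here).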

use strict;
use warnings;

my @data;

while (<DATA>) {
  my ($keys, $words) = split /,/;
  my @keys = split //, $keys;
  my @words = split ' ', $words;
  my %index;
  push @{ $index{shift @keys} }, shift @words while @keys;
  push @data, \%index;
}

for my $character (qw/ B C /) {
  print "character = $character\n";
  print join(' ', @{$_->{$character}}), "\n" for @data;
  print "\n";
}

__DATA__
ACDB,this is a sentence
BECD,this is another sentence
BCAB,this is yet another

Output

character = B
sentence
this
this another

character = C
is
another
is

Answer 3

This might work for you:

x=B                                                      # set wanted key variable
sed '
:a;s/^\([^,]\)\(.*,\)\([^ \n]*\) *\(.*\)/\2\4\n\1 \3/;ta # pair keys with values
s/,//                                                    # delete ,
s/\n[^'$x'] [^\n]*//g                                    # delete unwanted keys/values
s/\n.//g                                                 # delete wanted keys
s/ //                                                    # delete first space
/^$/d                                                    # delete empty lines
' file
sentence
this
this another

Or in awk:

awk -F, -vx=B '{i=split($1,a,"");split($2,b," ");c=s="";for(n=1;n<=i;n++)if(a[n]==x){c=c s b[n];s=" "} if(length(c))print c}' file
sentence
this
this another
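
For readability, the awk one-liner above can be unrolled into a small shell function. This is only a sketch under a few assumptions: the name pick_words is made up for illustration, column 2 contains no commas, and substr() is used instead of split($1, a, "") because splitting on an empty string is a common extension rather than something POSIX awk guarantees. Whether substr() and length() count characters or bytes depends on the awk implementation and the locale (gawk counts characters in a UTF-8 locale), so it is worth checking against your own UTF-8 data.

```shell
# pick_words LETTER [FILE...]   (hypothetical helper name, not from the answer)
# For each input line, print the words whose position in column 2 matches
# the position of LETTER in column 1. Lines without LETTER are skipped.
pick_words() {
    letter=$1
    shift
    awk -F, -v x="$letter" '
    {
        # Column 2: split on runs of whitespace into words[1..nwords].
        nwords = split($2, words, " ")
        out = ""
        sep = ""
        # Walk the key column one character at a time.
        for (n = 1; n <= length($1) && n <= nwords; n++) {
            if (substr($1, n, 1) == x) {
                out = out sep words[n]
                sep = " "
            }
        }
        # Print only when at least one word matched.
        if (out != "") print out
    }' "$@"
}
```

Called as pick_words B file, it should print the same three lines as the one-liner. With pick_words A file, the second sample line has no A, so it is skipped and only "this" and "yet" are printed.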

How do I isolate the words that correspond to a letter in a different column of a CSV file?
