打印数组的哈希值数组的哈希值

时间:2018-11-18 13:47:01

标签: arrays perl hash

我正在尝试在给定的文档语料库中创建单词及其位置的反向索引。我要针对的数据结构示例如下:

+----------+--------------------------------------------------------------+
|   Word   |                           Location                           |
+----------+--------------------------------------------------------------+
| 'word 1' | 'doc1' 'title',  'doc4' 'text', 'doc7' 'title' 'text'        |
+----------+--------------------------------------------------------------+

其中“标题”和“文本”是可能的位置

我解析和生成数据的代码是:

while (my $line = <$fh>) { 
    # determine doc no and location within docs
    ....

    #iterate words in a given location within a document 
    foreach my $str ($line =~ /[[:alpha:]]+/g) { 
        push @{ $doc{$docno} }, $location;        
        push @{ $wordlist{$str} }, $doc{$docno}; 
    }
}

我要打印数据的代码是:

foreach my $str (reverse sort { $wordlist{$a} <=> $wordlist{$b} } keys %wordlist) { 
    printf $fo "%-15s %-15s \n", $str, "@{ $wordlist{$str} }";
} 

但是,结果是:

+----------+--------------------------------------------------------------+
|   Word   |                           Location                           |
+----------+--------------------------------------------------------------+
|  'word1' | ARRAY(0x66d4508) ARRAY(0x66d4508) ARRAY(0x66d4508)           |
+----------+--------------------------------------------------------------+

我哪里出错了?

编辑:

我尝试将打印代码更改为:

foreach my $str (reverse sort { $wordlist{$a} <=> $wordlist{$b} } keys %wordlist) { 
    printf "%-15s", $str;

    @arr = @{ $wordlist{$str} };
    foreach $arr (@arr)
    {
        print "@{ $arr }: , ";
    }

    print "\n";
} 

但是结果是:

word101        title title text text text text text text ...

我不知道如何在所述文档中的位置旁边打印文档编号

1 个答案:

答案 0 :(得分:0)

您的数据结构丢弃了您所需要的信息。

只需执行以下操作:

while (my $line = <$fh>) { 
    # determine doc no and location within docs
    ....

    #iterate words in a given location within a document 
    foreach my $str ($line =~ /[[:alpha:]]+/g) { 
        push $worldlist{Sstr}->@*, {
            docno => $docno,
            location => $location
        };
    }
}

这使打印数据结构的工作变得微不足道。