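I have one data set that contains a list of user agents (UAs) and the devices that correspond to those UAs. Another data set contains user agents along with other data. I need a way to identify the devices in that second data set.
So I have to match the UAs across the two files and then pull the corresponding device info from the file that holds the list. I have already built a list of UAs from the first file and matched it against the UAs in the data file. How do I get the corresponding info from the first file, the one with the device info, and write it to a file?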

    #!/usr/bin/perl
    use warnings;
    use strict;

    our $inputfile  = $ARGV[0];
    our $outputfile = "$inputfile" . '.devidx';
    our $devid_file = "devid_master"; # the file that has the UA and the corresponding device info
    our %ua_list_hash = ();

    # Create a list of mobile user agents in the devid_master file
    open DEVID, "$devid_file" or die "can't open $devid_file";
    while (<DEVID>) {
        chomp;
        my @devidfile = split /\t/;
        $ua_list_hash{$devidfile[1]} = 0;
    }

    open IN, "$inputfile" or die "can't open $inputfile";
    while (<IN>) {
        chomp;
        my @hhfile = split /\t/;
        if (exists $ua_list_hash{$hhfile[24]}) {
            # how do I get the rest of the columns from the devidfile, columns 2...10?
        }
    }
    close IN;

Or is there a better way to do this in Perl? That is always welcome :).
Answer 0 (score: 2)
When you build the first lookup hash, couldn't you store a reference to the other column data as the hash value, instead of just 0?

    #!/usr/bin/perl
    use warnings;
    use strict;

    our $inputfile  = $ARGV[0];
    our $outputfile = "$inputfile" . '.devidx';
    our $devid_file = "devid_master"; # the file that has the UA and the corresponding device info
    our %ua_list_hash = ();

    # Create a list of mobile user agents in the devid_master file
    open DEVID, "$devid_file" or die "can't open $devid_file";
    while (<DEVID>) {
        chomp;
        my @devidfile = split /\t/;
        # save the columns you'll want to access later and
        # store a reference to them as the hash value
        my @values = @devidfile[2..$#devidfile];
        $ua_list_hash{$devidfile[1]} = \@values;
    }
    close DEVID;

    open IN, "$inputfile" or die "can't open $inputfile";
    while (<IN>) {
        chomp;
        my @hhfile = split /\t/;
        if (exists $ua_list_hash{$hhfile[24]}) {
            # dereference the stored array reference to get the device columns back
            my @rest_of_vals = @{ $ua_list_hash{$hhfile[24]} };
            # do something with @rest_of_vals
        }
    }
    close IN;

Note: I haven't tested this.
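The idea of storing an array reference as the hash value can be tried in isolation with a minimal sketch like the one below. The sample rows, the column layout, and the name `%device_for` are made up for illustration and are not taken from the real devid_master file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample rows: id, user agent, then device columns (tab-separated).
my @master_rows = (
    "1\tUA-Alpha\tPhoneCo\tModel A\ttouch",
    "2\tUA-Beta\tTabCo\tModel B\tkeyboard",
);

# Map each UA (column 1) to a reference to its remaining columns.
my %device_for;
for my $row (@master_rows) {
    my @cols = split /\t/, $row;
    $device_for{ $cols[1] } = [ @cols[ 2 .. $#cols ] ];
}

# Look up a UA and dereference the stored array; unknown UAs give an empty list.
my $ua     = "UA-Beta";
my @device = exists $device_for{$ua} ? @{ $device_for{$ua} } : ();
print join("\t", $ua, @device), "\n";
```

Because each value is a reference to its own anonymous array, every UA keeps its own copy of the device columns, and the lookup loop only needs `@{ ... }` to get them back.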
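Answer 1 (score: 0)
What do you want the output to look like? A list of all the unique devices that appear in $inputfile? Or, for each line in $inputfile, an output line showing which device it is?
I'll answer the latter, since you can always run a unique sort on the result if needed. Also, it looks like there are multiple devices per UA. As a general approach, you can store the UA name as the key in a hash, and the value can be either an array of device names or a character-delimited string of device names.
If you know that the device names are elements 2..10, you can use a slice and the join operator to build, for example, a comma-separated string of device names. That string becomes the value assigned to the UA name key.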

    #!/usr/bin/perl
    use warnings;
    use strict;

    our $inputfile  = $ARGV[0];
    our $outputfile = "$inputfile" . '.devidx';
    our $devid_file = "devid_master"; # the file that has the UA and the corresponding device info
    our %ua_list_hash = ();

    # Create a list of mobile user agents in the devid_master file
    open DEVID, "$devid_file" or die "can't open $devid_file";
    while (<DEVID>) {
        chomp;
        my @devidfile = split /\t/;
        my @slice = @devidfile[2..10];
        my $deviceString = join(",", @slice);
        $ua_list_hash{$devidfile[1]} = $deviceString;
    }
    close DEVID;

    my $outputfilename = "output.txt";
    open IN, "$inputfile" or die "can't open $inputfile";
    # open the output file for writing (">"); without a mode it is opened read-only
    open OUT, ">", $outputfilename or die "can't open $outputfilename: $!";
    while (<IN>) {
        chomp;
        my @hhfile = split /\t/;
        if (exists $ua_list_hash{$hhfile[24]}) {
            print OUT $ua_list_hash{$hhfile[24]} . "\n";
        }
    }
    close IN;
    close OUT;

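
A variant of the same approach, sketched with lexical filehandles and three-argument open, is shown below. It is only an illustration under stated assumptions: in-memory strings stand in for devid_master and the data file, the UA sits in column 1 rather than column 24, and the names `%devices_for` and `$ua_column` are hypothetical. It also appends the device string to the matching input line instead of printing the device string alone, which may or may not be the output you want.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for devid_master and the data file; real code would
# pass file names to open instead of references to these strings.
my $master_text = "1\tUA-Alpha\tPhoneCo\tModel A\n2\tUA-Beta\tTabCo\n";
my $data_text   = "row1\tUA-Beta\nrow2\tUA-Unknown\nrow3\tUA-Alpha\n";
my $ua_column   = 1;    # assumption for this sample; the question uses column 24
my $output      = '';

# Build the lookup: UA => comma-separated device columns.
open my $master_fh, '<', \$master_text or die "can't open master data: $!";
my %devices_for;
while (my $line = <$master_fh>) {
    chomp $line;
    my @cols = split /\t/, $line;
    # Slice up to the last column actually present, so short rows add no undef values.
    $devices_for{ $cols[1] } = join ",", @cols[ 2 .. $#cols ];
}
close $master_fh;

# Append the device string to every data line whose UA is known.
open my $in_fh,  '<', \$data_text or die "can't open input data: $!";
open my $out_fh, '>', \$output    or die "can't open output: $!";
while (my $line = <$in_fh>) {
    chomp $line;
    my @cols = split /\t/, $line;
    my $ua   = $cols[$ua_column];
    next unless defined $ua && exists $devices_for{$ua};
    print {$out_fh} join("\t", $line, $devices_for{$ua}), "\n";
}
close $in_fh;
close $out_fh;

print $output;
```

Slicing with `2 .. $#cols` instead of a fixed `2..10` avoids "uninitialized value" warnings when a master row has fewer columns than expected.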