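I have some basic code that works with Perl hashes, where I can address elements such as $data{"WV2"}{789}{PP1} (or use that actual text in an assignment)... but I would like to do something similar with Python dictionaries.
Two simple programs, one in Perl and one in Python, illustrate what I have been trying to replicate:-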
So, the Perl code:-
# hash.pl
use strict;
use warnings;
use Data::Dumper;
my %data = ();
my @reg_list = ( "MC1", "CA2", "WV2" );
my @site_list = ( 123, 456, 391, 287 );
$data{MC1}{4564}{PP}{1} = "-15,-15C";
$data{MC1}{4564}{PP}{2} = "5,5C";
$data{MC1}{4564}{PP}{3} = "-19,-19C";
$data{MC1}{4564}{PP}{4} = "-12,-12C";
printf("---- One:\n");
print Dumper(%data); # Ok, shows the full structure
printf("---- Two:\n");
print Dumper($data{"MC2"}); # Shows as undef (sensible)
printf("---- Three:\n");
print Dumper($data{"MC1"}); # Ok, showing the key:values for each "site" key
printf("---- Four:\n");
print Dumper($data{"MC1"}{"4564"}); # Ok, shows the actual equality value above
# ---- This works Ok
my %xdata = ();
$xdata{"MC1"}{123}{"PP"} = "-15,-15C";
$xdata{"MC1"}{456}{"PP"} = "5,5C";
$xdata{"MC1"}{391}{"PP"} = "-19,-19C";
$xdata{"MC1"}{287}{"PP"} = "-12,-12C";
printf("---- One:\n");
print Dumper(%xdata); # Ok, shows the full structure
#pprint.pprint(data["MC2"])
#pprint.pprint(data["MC1"}{391])
# [eof]
...and the Python code:-
# dict.py
import pprint
import collections
reg_list = [ "MC1", "CA2", "WV2" ]
site_list = [ 123, 456, 391, 287 ]
#data = {}
data = collections.defaultdict(dict) # {}
data["MC1"][123] = "-15,-15C"
data["MC1"][456] = "5,5C"
data["MC1"][391] = "-19,-19C"
data["MC1"][287] = "-12,-12C"
print("---- One:")
pprint.pprint(data) # Ok, shows the full structure
print("---- Two:")
pprint.pprint(data["MC2"]) # Shows: {} [...Ok, undefined...]
print("---- Three:")
pprint.pprint(data["MC1"]) # Ok, showing the key:values for each "site" key
print("---- Four:")
pprint.pprint(data["MC1"][391]) # Ok, shows the actual equality value above
# ---- Cannot get the following to work
xdata = collections.defaultdict(dict) # {}
xdata["MC1"][123]["PP"] = "-15,-15C" # ERROR: Key error 123
xdata["MC1"][456]["PP"] = "5,5C"
xdata["MC1"][391]["PP"] = "-19,-19C"
xdata["MC1"][287]["PP"] = "-12,-12C"
#pprint.pprint(data["MC2"])
#pprint.pprint(data["MC1"][391])
# [eof]
The output from the Perl program is as follows:-
# Perl Output:
---- One:
$VAR1 = 'MC1';
$VAR2 = {
    '4564' => {
        'PP' => {
            '4' => '-12,-12C',
            '1' => '-15,-15C',
            '3' => '-19,-19C',
            '2' => '5,5C'
        }
    }
};
---- Two:
$VAR1 = undef;
---- Three:
$VAR1 = {
    '4564' => {
        'PP' => {
            '4' => '-12,-12C',
            '1' => '-15,-15C',
            '3' => '-19,-19C',
            '2' => '5,5C'
        }
    }
};
---- Four:
$VAR1 = {
    'PP' => {
        '4' => '-12,-12C',
        '1' => '-15,-15C',
        '3' => '-19,-19C',
        '2' => '5,5C'
    }
};
---- One:
$VAR1 = 'MC1';
$VAR2 = {
    '391' => {
        'PP' => '-19,-19C'
    },
    '456' => {
        'PP' => '5,5C'
    },
    '123' => {
        'PP' => '-15,-15C'
    },
    '287' => {
        'PP' => '-12,-12C'
    }
};
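I have tried looking up information on nesting dictionaries... but nothing I have looked at clearly explains how the concept works (to my mind, anyway)... especially where more levels of dictionary are in use.
I have been writing Perl code for 25 years, but am only just starting with Python.
Running ActiveState Perl v5.16.3, Build 1603 and Anaconda Python 3.6.5 on Windows 10 x64.
Many thanks for any ideas or suggestions.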
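Answer 0 (score: 1)
Python does not autovivify multi-level dictionaries the way Perl does with hashes. At the second and deeper levels, you have to create the inner dict explicitly (here, another defaultdict(dict)) before adding more keys to it: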
xdata = collections.defaultdict(dict)
xdata["MC1"] = collections.defaultdict(dict)
xdata["MC1"][123]["PP"] = "-15,-15C" # No KeyError now: xdata["MC1"] is itself a defaultdict(dict)
xdata["MC1"][456]["PP"] = "5,5C"
xdata["MC1"][391]["PP"] = "-19,-19C"
xdata["MC1"][287]["PP"] = "-12,-12C"
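The same idea can be pushed to any depth by making the default factory recursive, so every missing level is created on demand. This is only a minimal sketch and not part of the answer above; the helper names `nested_dict` and `to_plain_dict` are hypothetical.

```python
import collections
import pprint


def nested_dict():
    """Return a defaultdict whose missing keys become further nested_dict()s."""
    return collections.defaultdict(nested_dict)


def to_plain_dict(d):
    """Recursively convert nested dicts/defaultdicts to plain dicts for printing."""
    if isinstance(d, dict):
        return {k: to_plain_dict(v) for k, v in d.items()}
    return d


xdata = nested_dict()
xdata["MC1"][123]["PP"] = "-15,-15C"      # intermediate levels are created on demand
xdata["MC1"][456]["PP"] = "5,5C"
xdata["MC1"][4564]["PP"][1] = "-15,-15C"  # a fourth level, as in the Perl example

pprint.pprint(to_plain_dict(xdata))
```

One caveat with this sketch: merely reading a missing key, such as xdata["MC2"], creates an empty entry for it, just as the defaultdict in the question does in its "Two" case.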
Answer 1 (score: 1)
The simple way to solve the problem seems to be:-
xdata = collections.defaultdict(dict) # {}
xdata["MC1"][123] = {} # Define the dict before using it
xdata["MC1"][123]["PP"] = "-15,-15C" # Works Ok
...but that still means I have to "manually" define a dict every time I "discover" a new "value"... and so on.
Despite the inherent gotcha (mistyped keys are accepted silently and can corrupt the dict contents), the approach in "What is the best way to implement nested dictionaries?" looks like a good way to solve this... especially as the values (in my current application, anyway) are "never seen" (they are machine-generated and validated before they get into my application)... so code along those lines may well work.
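The autovivification approach from the nested-dictionaries question mentioned above could be sketched roughly as follows. It assumes a dict subclass that overrides `__missing__`; the class name `AutoVivDict` is hypothetical, and this is not necessarily the exact code given in that question.

```python
import pprint


class AutoVivDict(dict):
    """dict subclass that creates a nested AutoVivDict for any missing key."""

    def __missing__(self, key):
        # dict.__getitem__ calls this only when `key` is absent.
        value = self[key] = type(self)()
        return value


xdata = AutoVivDict()
xdata["MC1"][123]["PP"] = "-15,-15C"
xdata["MC1"][456]["PP"] = "5,5C"
xdata["MC1"][391]["PP"] = "-19,-19C"
xdata["MC1"][287]["PP"] = "-12,-12C"

pprint.pprint(xdata)

# The gotcha: a mistyped key does not raise KeyError, it silently creates entries.
_ = xdata["MC1"][124]["PP"]   # 124 is a typo for 123
print(124 in xdata["MC1"])
```

This gets close to the Perl behaviour, at the cost that lookups with a wrong key quietly add empty dicts instead of failing.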
Thanks, everyone, for the suggestions and pointers.