This is part of a dictionary I created:
defaultdict (int,
{"['por', 'rus']": 80,
"['nld', 'slv']": 4,
"['jpn', 'pol']": 48,
"['ces', 'epo']": 4,
"['oci', 'ron']": 4,
"['lit', 'mkd']": 2,
"['deu', 'ewe']": 2,
"['cat', 'ron']": 4,
"['ces', 'ita']": 18,
"['est', 'fra']": 14,
"['hin', 'mal']": 4,
I want three columns: column1 holds the first key, column2 the second key, and column3 the value.
When I create the DataFrame:
pairs_df = pd.DataFrame(list(pairs.iteritems()), columns = ['column1','column2'])
pairs_df.head()
Output:
column1 column2
0 ['por', 'rus'] 80
1 ['est', 'fra'] 14
2 ['nld', 'slv'] 4
3 ['jpn', 'pol'] 48
4 ['ces', 'epo'] 4
5 ['hin', 'mal'] 4
6 ['oci', 'ron'] 4
7 ['lit', 'mkd'] 2
8 ['deu', 'ewe'] 2
9 ['cat', 'ron'] 4
10 ['ces', 'ita'] 18
The keys end up in a single column, and I can't work out how to split them so that I get three columns.
Answer 0 (score: 3)
import re
mydict= {"['por', 'rus']": 80,
"['nld', 'slv']": 4,
"['jpn', 'pol']": 48,
"['ces', 'epo']": 4,
"['oci', 'ron']": 4,
"['lit', 'mkd']": 2,
"['deu', 'ewe']": 2,
"['cat', 'ron']": 4,
"['ces', 'ita']": 18,
"['est', 'fra']": 14,
"['hin', 'mal']": 4}
# this is where you seem to be stuck
for k, v in mydict.iteritems():
    print k, v  # keys are still strings, not lists
# this is the resolution: split each key into its two strings
for k, v in mydict.iteritems():
    a = re.findall(r'\w{3}', k)  # every run of three word characters
    print a[0], a[1], v
Output:
['por', 'rus'] 80
['nld', 'slv'] 4
['jpn', 'pol'] 48
['ces', 'epo'] 4
['oci', 'ron'] 4
['lit', 'mkd'] 2
['deu', 'ewe'] 2
['cat', 'ron'] 4
['ces', 'ita'] 18
['est', 'fra'] 14
['hin', 'mal'] 4
por rus 80
nld slv 4
jpn pol 48
ces epo 4
oci ron 4
lit mkd 2
deu ewe 2
cat ron 4
ces ita 18
est fra 14
hin mal 4
Now, if you like, you can append them to lists:
x, y, z = [], [], []
for k, v in mydict.iteritems():
    a = re.findall(r'\w{3}', k)
    x.append(a[0])  # first code
    y.append(a[1])  # second code
    z.append(v)     # value
print x, y, z
Or, if you prefer a pandas DataFrame:
import pandas as pd
df = pd.DataFrame({'a': x, 'b': y,'c':z})
print df
Output:
['por', 'nld', 'jpn', 'ces', 'oci', 'lit', 'deu', 'cat', 'ces', 'est', 'hin'] ['rus', 'slv', 'pol', 'epo', 'ron', 'mkd', 'ewe', 'ron', 'ita', 'fra', 'mal'] [80, 4, 48, 4, 4, 2, 2, 4, 18, 14, 4]
a b c
0 por rus 80
1 nld slv 4
2 jpn pol 48
3 ces epo 4
4 oci ron 4
5 lit mkd 2
6 deu ewe 2
7 cat ron 4
8 ces ita 18
9 est fra 14
10 hin mal 4
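The answer's code is Python 2 (print statements, iteritems). As a rough Python 3 sketch of the same regex idea, the rows can be built in one pass. The helper name split_key and the shortened sample dictionary are illustrative assumptions, not part of the answer, and the pattern still assumes every code is exactly three word characters long.

```python
import re

# Shortened sample taken from the question; keys are strings that look like lists.
pairs = {"['por', 'rus']": 80, "['nld', 'slv']": 4, "['jpn', 'pol']": 48}


def split_key(key):
    """Return the two three-letter codes found in a key such as "['por', 'rus']".

    Hypothetical helper: raises ValueError if the key does not contain
    exactly two runs of three word characters.
    """
    first, second = re.findall(r"\w{3}", key)
    return first, second


# One (first, second, value) tuple per dictionary entry.
rows = [split_key(key) + (value,) for key, value in pairs.items()]
print(rows)
```

A list of tuples like rows can then be passed to pd.DataFrame(rows, columns=['a', 'b', 'c']) to get the three-column frame shown above.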
Answer 1 (score: 2)
import pandas as pd
from collections import defaultdict
from ast import literal_eval
pairs = defaultdict (int,
{"['por', 'rus']": 80,
"['nld', 'slv']": 4,
"['jpn', 'pol']": 48,
"['ces', 'epo']": 4,
"['oci', 'ron']": 4,
"['lit', 'mkd']": 2,
"['deu', 'ewe']": 2,
"['cat', 'ron']": 4,
"['ces', 'ita']": 18,
"['est', 'fra']": 14,
"['hin', 'mal']": 4})
df = pd.DataFrame(list(pairs.iteritems()), columns = ['column1','column2'])
print df
column1 column2
0 ['por', 'rus'] 80
1 ['est', 'fra'] 14
2 ['nld', 'slv'] 4
3 ['jpn', 'pol'] 48
4 ['ces', 'epo'] 4
5 ['hin', 'mal'] 4
6 ['oci', 'ron'] 4
7 ['lit', 'mkd'] 2
8 ['deu', 'ewe'] 2
9 ['cat', 'ron'] 4
10 ['ces', 'ita'] 18
print type(df.at[0,'column1'])
<type 'str'>
You can first convert the strings in column1 to lists with literal_eval, then create a new DataFrame with from_records, and finally concat column2:
#change type string to list
df['column1'] = df['column1'].apply(literal_eval)
print df
column1 column2
0 [por, rus] 80
1 [est, fra] 14
2 [nld, slv] 4
3 [jpn, pol] 48
4 [ces, epo] 4
5 [hin, mal] 4
6 [oci, ron] 4
7 [lit, mkd] 2
8 [deu, ewe] 2
9 [cat, ron] 4
10 [ces, ita] 18
print type(df.at[0,'column1'])
<type 'list'>
df1 = pd.DataFrame.from_records([x for x in df['column1']], columns=['a','b'])
print df1
a b
0 por rus
1 est fra
2 nld slv
3 jpn pol
4 ces epo
5 hin mal
6 oci ron
7 lit mkd
8 deu ewe
9 cat ron
10 ces ita
print pd.concat([df1, df['column2']], axis=1)
a b column2
0 por rus 80
1 est fra 14
2 nld slv 4
3 jpn pol 48
4 ces epo 4
5 hin mal 4
6 oci ron 4
7 lit mkd 2
8 deu ewe 2
9 cat ron 4
10 ces ita 18
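For readers on Python 3, the literal_eval route above might look roughly like the sketch below. It uses items() instead of iteritems() and a shortened sample dictionary; the variable names parsed, codes and result are my own and do not come from the answer.

```python
from ast import literal_eval

import pandas as pd

# Shortened sample taken from the question.
pairs = {"['por', 'rus']": 80, "['nld', 'slv']": 4, "['jpn', 'pol']": 48}

df = pd.DataFrame(list(pairs.items()), columns=["column1", "column2"])

# Parse each key string into a real Python list.
parsed = df["column1"].apply(literal_eval)

# Expand the two-element lists into two columns, keeping the original index
# so that the concat below lines the rows up correctly.
codes = pd.DataFrame(parsed.tolist(), columns=["a", "b"], index=df.index)

result = pd.concat([codes, df["column2"]], axis=1)
print(result)
```

Passing index=df.index matters only when df does not have the default 0..n-1 index, but it keeps the concat step safe in that case.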
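Or clean column1 with str.strip and str.replace, then create a new DataFrame with str.split and concat column2. This starts from the original string column1, before literal_eval is applied:
df['column1'] = df['column1'].str.strip('[]').str.replace("'","")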
print df
column1 column2
0 por, rus 80
1 est, fra 14
2 nld, slv 4
3 jpn, pol 48
4 ces, epo 4
5 hin, mal 4
6 oci, ron 4
7 lit, mkd 2
8 deu, ewe 2
9 cat, ron 4
10 ces, ita 18
df1 = df['column1'].str.split(", ", expand=True)  # split on ", " so column b has no leading space
df1.columns = ['a','b']
print df1
a b
0 por rus
1 est fra
2 nld slv
3 jpn pol
4 ces epo
5 hin mal
6 oci ron
7 lit mkd
8 deu ewe
9 cat ron
10 ces ita
print pd.concat([df1, df['column2']], axis=1)
a b column2
0 por rus 80
1 est fra 14
2 nld slv 4
3 jpn pol 48
4 ces epo 4
5 hin mal 4
6 oci ron 4
7 lit mkd 2
8 deu ewe 2
9 cat ron 4
10 ces ita 18
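As a possible alternative to the strip/replace/split chain, and not something the answer itself uses, Series.str.extract with named groups can pull both codes out in one step. This is a sketch in Python 3 with a shortened sample dictionary; the pattern assumes each key has the form ['xxx', 'yyy'] with single-quoted codes.

```python
import pandas as pd

# Shortened sample taken from the question.
pairs = {"['por', 'rus']": 80, "['nld', 'slv']": 4, "['jpn', 'pol']": 48}

df = pd.DataFrame(list(pairs.items()), columns=["column1", "column2"])

# Named groups become the column names of the extracted DataFrame.
pattern = r"'(?P<a>\w+)',\s*'(?P<b>\w+)'"
codes = df["column1"].str.extract(pattern)

# join aligns on the index, which codes inherits from df.
result = codes.join(df["column2"])
print(result)
```

Because the whitespace after the comma is matched by the pattern rather than kept, column b comes out without a leading space.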
Answer 2 (score: 0)
This solution uses a list comprehension together with eval and the from_records function:
import pandas as pd
from collections import defaultdict
pairs = defaultdict (int,
{"['por', 'rus']": 80,
"['nld', 'slv']": 4,
"['jpn', 'pol']": 48,
"['ces', 'epo']": 4,
"['oci', 'ron']": 4,
"['lit', 'mkd']": 2,
"['deu', 'ewe']": 2,
"['cat', 'ron']": 4,
"['ces', 'ita']": 18,
"['est', 'fra']": 14,
"['hin', 'mal']": 4})
rec = [ eval(x[0]) + [x[1]] for x in pairs.iteritems()]
print rec
[['por', 'rus', 80], ['est', 'fra', 14], ['nld', 'slv', 4], ['jpn', 'pol', 48],
['ces', 'epo', 4], ['hin', 'mal', 4], ['oci', 'ron', 4], ['lit', 'mkd', 2],
['deu', 'ewe', 2], ['cat', 'ron', 4], ['ces', 'ita', 18]]
print pd.DataFrame.from_records(rec, columns=['a','b','c'])
a b c
0 por rus 80
1 est fra 14
2 nld slv 4
3 jpn pol 48
4 ces epo 4
5 hin mal 4
6 oci ron 4
7 lit mkd 2
8 deu ewe 2
9 cat ron 4
10 ces ita 18
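A Python 3 variant of the comprehension above could swap eval for ast.literal_eval, which only accepts Python literals and so will not execute arbitrary code hidden in a key. This is a sketch with a shortened sample dictionary; the swap is my suggestion rather than part of the answer.

```python
from ast import literal_eval

# Shortened sample taken from the question.
pairs = {"['por', 'rus']": 80, "['nld', 'slv']": 4, "['jpn', 'pol']": 48}

# literal_eval turns each key string into a list; the value is appended
# as a third element so every record is [first, second, value].
records = [literal_eval(key) + [value] for key, value in pairs.items()]
print(records)
```

The resulting records list can be handed to pd.DataFrame.from_records(records, columns=['a', 'b', 'c']) exactly as in the answer.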