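Python pandas - Trouble zipping 2 DataFrame columns

Time: 2017-12-13 07:13:51

Tags: python pandas dataframe

I'm having trouble using zip() to zip two DataFrame columns together, and I think I'm somehow messing up how I retrieve the columns. My understanding is that retrieving a column should be as simple as df[['column']], but is it different in this case for some reason?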

Instructions:

Use zip() to zip together the 'Total Population' and 'Urban population (% of total)' columns of df_pop_ceb. Assign the resulting zip object to pops.

My attempt, which fails with an error:

# Initialize reader object: urb_pop_reader
urb_pop_reader = pd.read_csv('ind_pop_data.csv', chunksize = 1000)

# Get the first DataFrame chunk: df_urb_pop
df_urb_pop = next(urb_pop_reader)

# Check out the head of the DataFrame
print(df_urb_pop.head())

# Check out specific country: df_pop_ceb
df_pop_ceb = df_urb_pop[df_urb_pop['CountryCode' == 'CEB']

# Zip DataFrame columns of interest: pops
pops = zip(df_pop_ceb[['Total Population']],df_pop_ceb[['Urban population (% of total)']])

# Turn zip object into list: pops_list
pops_list = list(pops)

# Print pops_list
print(pops_list)

1 answer:

Answer 0 (score: 0)

There is a typo: a closing ] is missing after 'CountryCode'.

#                                      missing  ]      
df_pop_ceb = df_urb_pop[df_urb_pop['CountryCode'] == 'CEB']
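To see why the position of that bracket matters, here is a minimal sketch of boolean-mask filtering. The small DataFrame is a made-up stand-in for the first chunk of ind_pop_data.csv (the real values are not shown in the question), and the names label_comparison and mask are only for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the first chunk of ind_pop_data.csv
df_urb_pop = pd.DataFrame({
    'CountryCode': ['ARB', 'CEB', 'CEB', 'CSS'],
    'Total Population': [100, 200, 300, 400],
    'Urban population (% of total)': [30.0, 40.0, 50.0, 60.0],
})

# With the bracket misplaced, two plain strings are compared.
# That is an ordinary Python comparison and evaluates to False.
label_comparison = ('CountryCode' == 'CEB')

# With the bracket closed after the label, the column is compared
# element-wise, giving a boolean Series (the mask).
mask = df_urb_pop['CountryCode'] == 'CEB'

# Indexing with the mask keeps only the rows where it is True.
df_pop_ceb = df_urb_pop[mask]
print(df_pop_ceb)
```

Writing the mask on its own line is optional, but it makes the two indexing steps (select the column, then filter the rows) easier to tell apart.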

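Then, to zip the 2 columns you need only single brackets [], which return a Series, if you want a list of tuples of the values in the columns:

pops = zip(df_pop_ceb['Total Population'],df_pop_ceb['Urban population (% of total)'])

Double brackets [[]] return a one-column DataFrame, and zipping DataFrames gives you the column names instead: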

pops = zip(df_pop_ceb[['Total Population']],df_pop_ceb[['Urban population (% of total)']])
print (list(pops))
[('Total Population', 'Urban population (% of total)')]

Sample showing both cases:

df_pop_ceb = pd.DataFrame({'Total Population':[1,2,3,4,5,9],
                           'Urban population (% of total)':[5,3,6,9,2,4]})
print (df_pop_ceb)
   Total Population  Urban population (% of total)
0                 1                              5
1                 2                              3
2                 3                              6
3                 4                              9
4                 5                              2
5                 9                              4

pops = zip(df_pop_ceb[['Total Population']],df_pop_ceb[['Urban population (% of total)']])
print (list(pops))
[('Total Population', 'Urban population (% of total)')]

pops = zip(df_pop_ceb['Total Population'],df_pop_ceb['Urban population (% of total)'])
print (list(pops))
[(1, 5), (2, 3), (3, 6), (4, 9), (5, 2), (9, 4)]
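The difference comes from how each object iterates, since zip() simply iterates over its arguments. The sketch below uses a shortened version of the sample data. The itertuples() variant at the end is an alternative that is not part of the answer above, and the variable names are only illustrative.

```python
import pandas as pd

df_pop_ceb = pd.DataFrame({'Total Population': [1, 2, 3],
                           'Urban population (% of total)': [5, 3, 6]})

single = df_pop_ceb['Total Population']     # single brackets: a Series
double = df_pop_ceb[['Total Population']]   # double brackets: a one-column DataFrame

# Iterating a Series yields its values;
# iterating a DataFrame yields its column labels.
series_items = list(single)
frame_items = list(double)
print(series_items)
print(frame_items)

# So zip() over two Series pairs up the row values.
pops_list = list(zip(df_pop_ceb['Total Population'],
                     df_pop_ceb['Urban population (% of total)']))
print(pops_list)

# Alternative (not from the answer): itertuples() builds the same
# row tuples directly from a two-column DataFrame.
cols = ['Total Population', 'Urban population (% of total)']
alt_list = list(df_pop_ceb[cols].itertuples(index=False, name=None))
print(alt_list)
```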
