Too many values to unpack (expected 2) [list]

Time: 2019-04-15 11:17:32

Tags: python pandas

I have a dataset with the columns "Date", "Num_week", and "Calendar".

df.head() looks like this:

    Date    Num_week    Calendar
412 2012-01-01  1      (2012, 1)
413 2012-01-02  2      (2012, 1)
414 2012-01-03  2      (2012, 1)
415 2012-01-04  2      (2012, 1)
416 2012-01-05  2      (2012, 1)

The distinct values stored in the Calendar column can be listed with:

sorted(list(set(date_week['calendar'])))

Result:

['(2012, 1)',
 '(2012, 10)',
 '(2012, 11)',
 '(2012, 12)',
 '(2012, 2)',
 '(2012, 3)', etc.

I am trying to loop over the values with the year and the month as separate variables.

for year, month in list(set(date_week['calendar'])):
    print(year, month)

But I get a ValueError:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-168-cf01e0d2888e> in <module>()
----> 1 for year, month in list(set(date_week['calendar'])):
      2     print(year, month)

ValueError: too many values to unpack (expected 2)

I have already tried using .items(), but it gave the wrong result.

Could you help me solve this?

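2 answers:

Answer 0: (score: 1)

The problem is that the column does not contain tuples but string representations of tuples, so they need to be converted first: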

import ast
date_week['Calendar'] = date_week['Calendar'].apply(ast.literal_eval)
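To make the diagnosis concrete, here is a minimal sketch on a small made-up frame (the sample values are an assumption chosen to mirror the question, not the asker's real data). It shows that unpacking a cell fails while it is still a string, and that the loop works once ast.literal_eval has turned each string into a real tuple:

```python
import ast

import pandas as pd

# Hypothetical sample mirroring the question: tuples stored as strings.
date_week = pd.DataFrame({"Calendar": ["(2012, 1)", "(2012, 1)", "(2012, 2)"]})

# Each cell is a str, so unpacking iterates over its characters
# and there are far more than two of them.
first = date_week["Calendar"].iloc[0]
try:
    year, month = first
except ValueError as exc:
    error_message = str(exc)  # mentions "too many values to unpack"

# Parse each string into a real tuple of ints.
date_week["Calendar"] = date_week["Calendar"].apply(ast.literal_eval)

# Now every distinct value unpacks cleanly into two names.
pairs = sorted(set(date_week["Calendar"]))
for year, month in pairs:
    print(year, month)
```

A side effect of the conversion is that sorting becomes numeric: as strings, '(2012, 10)' sorts before '(2012, 2)', whereas the tuples (2012, 2) and (2012, 10) sort in calendar order.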

After that you can use your own solution, or this alternative:

for year, month in date_week['Calendar'].unique():
    print(year, month)

2012 1

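Edit: an alternative solution that uses Series.str.findall on the original string column and converts each result to a tuple (note that the tuple elements are then strings such as '2012', not integers):

date_week['Calendar'] = date_week['Calendar'].str.findall(r'\d+').apply(tuple)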
print (date_week)
           Date  Num_week   Calendar
412  2012-01-01         1  (2012, 1)
413  2012-01-02         2  (2012, 1)
414  2012-01-03         2  (2012, 1)
415  2012-01-04         2  (2012, 1)
416  2012-01-05         2  (2012, 1)
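If integer years and months are wanted from the regex approach, one possible variation is to convert each matched piece with int() while building the tuple. This is only a sketch on made-up sample values, not part of the original answer:

```python
import pandas as pd

# Hypothetical sample: the column still holds strings at this point.
date_week = pd.DataFrame({"Calendar": ["(2012, 1)", "(2012, 10)"]})

# findall returns a list of digit strings per row, e.g. ['2012', '10'].
parts = date_week["Calendar"].str.findall(r"\d+")

# Convert each piece to int so the tuples hold numbers, not strings.
date_week["Calendar"] = parts.apply(lambda pieces: tuple(int(p) for p in pieces))

calendar_values = date_week["Calendar"].tolist()
print(calendar_values)
```

Compared with ast.literal_eval, the regex only picks out runs of digits, so it would silently ignore a minus sign or any other non-digit content in the cell.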

Answer 1: (score: 0)

date_week

           Date  Num_week   Calender
412  2012-01-01         1  (2012, 1)
413  2012-01-02         2  (2012, 1)
414  2012-01-03         2  (2012, 1)
415  2012-01-04         2  (2012, 1)
416  2012-01-05         2  (2012, 1)

Solution 1: get the output as lists

l = list(zip(*df['Calender']))
[(2012, 2012, 2012, 2012, 2012), (1, 1, 1, 1, 1)]

OR

y,m = list(zip(*df['Calender']))
year = list(y)
month = list(m)

Output:

print(year)
[2012, 2012, 2012, 2012, 2012]

print(month)
[1, 1, 1, 1, 1]
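The zip(*...) idiom used above transposes a sequence of pairs into one sequence per position. A small standalone sketch with a plain list (the values are made up for illustration) shows the mechanics without pandas:

```python
# Hypothetical (year, month) pairs, already real tuples.
calendar = [(2012, 1), (2012, 2), (2012, 3)]

# zip(*calendar) is the same as zip((2012, 1), (2012, 2), (2012, 3)):
# it groups the first items together, then the second items.
years, months = zip(*calendar)

# zip yields tuples; convert to lists if mutable sequences are needed.
year = list(years)
month = list(months)

print(year)
print(month)
```

This assumes every element has exactly two items and that the input is not empty; with an empty input, zip yields nothing and the two-name unpacking raises a ValueError.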

Solution 2: you can create separate DataFrame columns

ym = pd.DataFrame(df['Calender'].values.tolist(), columns=['year','month'], index=date_week.index)
ym

     year  month
412  2012      1
413  2012      1
414  2012      1
415  2012      1
416  2012      1

and concatenate them with the existing DataFrame

date_week_new = pd.concat([df, ym],axis=1)
date_week_new 

           Date  Num_week   Calender  year  month
412  2012-01-01         1  (2012, 1)  2012      1
413  2012-01-02         2  (2012, 1)  2012      1
414  2012-01-03         2  (2012, 1)  2012      1
415  2012-01-04         2  (2012, 1)  2012      1
416  2012-01-05         2  (2012, 1)  2012      1
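As a variation on Solution 2, DataFrame.join can be used in place of pd.concat, since both align on the index. The sketch below uses a made-up three-row frame with the question's column spelling; passing index=df.index when building the helper frame is what keeps the new columns aligned with the original rows:

```python
import pandas as pd

# Hypothetical sample with a non-default index, as in the question.
df = pd.DataFrame(
    {"Calendar": [(2012, 1), (2012, 1), (2012, 2)]},
    index=[412, 413, 414],
)

# Expand the tuples into two columns, reusing the original index so that
# the rows line up when the frames are combined.
ym = pd.DataFrame(
    df["Calendar"].tolist(),
    columns=["year", "month"],
    index=df.index,
)

# join aligns on the index and appends the new columns.
result = df.join(ym)
print(result)
```

Without index=df.index, the helper frame would get a default 0..n-1 index, and combining it with rows labelled 412, 413, ... would produce missing values instead of the expected years and months.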