In the DataFrame below, I have a collection of (year, month) values stored as tuples in a list:
state
alabama [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0), (2018.0, 1.0)]
arkansas [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0)]
colorado [(2017.0, 9.0), (2017.0, 10.0), (2017.0, 11.0)]
How can I extract the superset list of year and month combinations? In this case, the solution would be:
[(2017.0, 9.0), (2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0), (2018.0, 1.0)]
I could do this with a for loop, but that would be slow. What is a more Pythonic way?
Here is my attempt:
for row in df:
    if all(y in row for x, y in df):
        tmp = row
But I got this error:
ValueError: too many values to unpack (expected 2)
Answer 0 (score: 1)
Using the sample DF from your previous question:
In [109]: df[['Year','Month']].sort_values(['Year','Month']).drop_duplicates().values.tolist()
Out[109]:
[[2017.0, 10.0],
[2017.0, 11.0],
[2017.0, 12.0],
[2018.0, 1.0],
[2018.0, 2.0],
[2018.0, 3.0],
[2018.0, 4.0],
[2018.0, 5.0],
[2018.0, 6.0],
[2018.0, 7.0]]
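If the data is only available in the shape shown in the question (one list of tuples per state), the same idea can be applied directly to those lists: flatten them, deduplicate, and sort. The following is a minimal sketch, not necessarily the approach the answer above had in mind. The helper name `union_of_periods` and the dictionary `periods_by_state` are hypothetical, and a plain dict stands in for the DataFrame column so that the example runs without pandas.

```python
from itertools import chain


def union_of_periods(rows):
    """Return the sorted union of all (year, month) tuples in `rows`.

    `rows` is any iterable of lists of tuples (hypothetical helper).
    """
    # Flatten the per-state lists, deduplicate with a set,
    # then sort the tuples by year first and month second.
    return sorted(set(chain.from_iterable(rows)))


# Stand-in for the column from the question, keyed by state (assumed layout).
periods_by_state = {
    "alabama": [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0), (2018.0, 1.0)],
    "arkansas": [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0)],
    "colorado": [(2017.0, 9.0), (2017.0, 10.0), (2017.0, 11.0)],
}

superset = union_of_periods(periods_by_state.values())
print(superset)
# [(2017.0, 9.0), (2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0), (2018.0, 1.0)]
```

Since the helper accepts any iterable of lists, passing the relevant DataFrame column should work the same way, assuming each cell holds a list of tuples as in the question. Note that this returns tuples, whereas the `values.tolist()` call above returns lists.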