Convert a pandas DataFrame to contain a dictionary or a list of lists

Time: 2017-12-04 20:16:27

Tags: python pandas

I have the following dataframe:

      state      Year  Month  count
0       alabama  2017.0   10.0     31
1       alabama  2017.0   11.0     30
2       alabama  2017.0   12.0     31
3       alabama  2018.0    1.0     31
4       alabama  2018.0    2.0     28
5       alabama  2018.0    3.0     31
6       alabama  2018.0    4.0     30
7       alabama  2018.0    5.0     31
8       alabama  2018.0    6.0     30
9       alabama  2018.0    7.0     14
10     arkansas  2017.0   10.0     31
11     arkansas  2017.0   11.0     30
12     arkansas  2017.0   12.0     31

Related: converting pandas dataframe to contain a list

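I would like one row per state, as sketched below. Based on @Vaishali's comment, a dictionary cannot contain duplicate keys (2017 repeats here), so a list of [Year, Month] pairs per state would be fine too: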

                                                            Month
state                                                        
alabama         {2017:10.0, 2017:11.0, 2017:12.0, 2018:1.0, 2018:2.0, 2018:3.0, 2018:4.0, 2018:5.0, 2018:6.0, 2018:7.0}
arkansas        {2017:10.0, 2017:11.0, 2017:12.0}
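To make the duplicate-key point concrete, here is a minimal sketch in plain Python. The values are a hand-typed subset of the alabama rows, and the variable names `wanted` and `pairs` are only illustrative.

```python
# A dict literal with repeated keys keeps only the last value for each key,
# so the layout sketched above would silently lose rows.
wanted = {2017: 10.0, 2017: 11.0, 2017: 12.0, 2018: 1.0, 2018: 2.0}
print(wanted)  # {2017: 12.0, 2018: 2.0}

# A list of (year, month) pairs keeps every row.
pairs = [(2017, 10.0), (2017, 11.0), (2017, 12.0), (2018, 1.0), (2018, 2.0)]
print(len(wanted), len(pairs))  # 2 5
```

This is why the answers below build lists of pairs rather than a flat dictionary.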

4 Answers:

Answer 0 (score: 5)

Try:

df.groupby('state').apply(lambda x: list(zip(x['Year'], x['Month'])))


state
alabama     [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0...
arkansas     [(2017.0, 10.0), (2017.0, 11.0), (2017.0, 12.0)]
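For anyone who wants to run this end to end, the following is a small self-contained sketch. The sample frame is a hand-typed subset of the question's rows. Selecting the two columns before `apply` and casting the floats to `int` are optional additions that are not part of the answer itself.

```python
import pandas as pd

# Hand-typed subset of the question's data (assumption: same column names).
df = pd.DataFrame({
    "state": ["alabama", "alabama", "alabama", "arkansas", "arkansas"],
    "Year": [2017.0, 2017.0, 2018.0, 2017.0, 2017.0],
    "Month": [10.0, 11.0, 1.0, 10.0, 11.0],
    "count": [31, 30, 31, 31, 30],
})

# Group by state, then zip Year and Month into a list of tuples per group.
# The int casts only tidy the display (2017 instead of 2017.0).
pairs = (
    df.groupby("state")[["Year", "Month"]]
      .apply(lambda g: list(zip(g["Year"].astype(int), g["Month"].astype(int))))
)
print(pairs)
```

The result is a Series indexed by state, so `pairs["alabama"]` gives that state's list directly.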

Answer 1 (score: 2)

In [73]: (df.groupby('state')[['Year','Month']]
            .apply(lambda x: x.values.tolist())
            .to_frame('Month')
            .reset_index())
Out[73]:
      state                                              Month
0   alabama  [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0...
1  arkansas   [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0]]
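If a dictionary is still preferred over a list of lists, one possible workaround for the duplicate-key problem is to map each year to a list of its months. The sketch below is not from any of the answers. The sample frame is a hand-typed subset of the question's rows, and `nested` is a hypothetical name.

```python
import pandas as pd

# Hand-typed subset of the question's data (assumption: same column names).
df = pd.DataFrame({
    "state": ["alabama", "alabama", "alabama", "arkansas", "arkansas"],
    "Year": [2017.0, 2017.0, 2018.0, 2017.0, 2017.0],
    "Month": [10.0, 11.0, 1.0, 10.0, 11.0],
})

# Cast to int so the dictionary keys read 2017 rather than 2017.0.
df = df.astype({"Year": int, "Month": int})

# Outer dict: one entry per state. Inner dict: year -> list of months.
nested = {
    state: group.groupby("Year")["Month"].apply(list).to_dict()
    for state, group in df.groupby("state")
}
print(nested)
```

Because every year appears once per state in the inner dictionary, no rows are lost.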

Answer 2 (score: 1)

I think this would work.

d = {}
for index, row in df.iterrows():
    # Year is stored as a float (2017.0); cast to int so the label reads "2017"
    label = str(int(row['Year'])) + " : " + str(row['Month'])
    # setdefault creates the list the first time a state is seen; append mutates it in place
    d.setdefault(row['state'], []).append(label)

This produces something like:
arkansas        ["2017 : 10.0", "2017 : 11.0", "2017 : 12.0"]
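Since the question asks for a dataframe rather than a plain dictionary, a dictionary of this shape can be turned back into one. The sketch below assumes a dictionary like the one built by the loop above. The values are typed in by hand for illustration, and `result` is a hypothetical name.

```python
import pandas as pd

# Hand-typed stand-in for the dictionary built by the loop (illustrative values).
d = {
    "alabama": ["2017 : 10.0", "2017 : 11.0"],
    "arkansas": ["2017 : 10.0", "2017 : 11.0", "2017 : 12.0"],
}

# Each dictionary value becomes one cell; the keys become the index,
# which is then named "state" and moved into a regular column.
result = pd.Series(d, name="Month").rename_axis("state").reset_index()
print(result)
```

The resulting frame has one row per state, with the list of strings in the `Month` column.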

Answer 3 (score: 1)

Or:

df.groupby('state').apply(lambda x:x[['Year','Month']].values)

state
alabama     [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0...
arkansas     [[2017.0, 10.0], [2017.0, 11.0], [2017.0, 12.0]]