pandas: joining two kinds of data, a value_counts result and a DataFrame

Time: 2015-03-03 03:22:26

Tags: python pandas dataframe

I have a Series produced by the value_counts function that looks like this:

(10, 9, 2011)     668
(10, 8, 2011)     584
(9, 4, 2011)      505
(8, 13, 2011)     502
(9, 5, 2011)      497
(8, 6, 2011)      489
(5, 27, 2012)     480

I also have a pandas DataFrame whose date column holds tuples in the same format, like this:

        date          avgtemp  num_rides
0     (7, 27, 2011)    76.01        NaN
1     (7, 28, 2011)    71.51        NaN
2     (7, 29, 2011)    72.50        NaN
3     (7, 30, 2011)    81.05        NaN
4     (7, 31, 2011)    80.51        NaN
5      (8, 1, 2011)    82.49        NaN
6      (8, 2, 2011)    78.98        NaN

How can I join the two so that they end up in the same DataFrame? Values such as "668" should go under num_rides.

1 Answer:

Answer 0: (score: 1)

Given:

In [245]: series
Out[245]: 
(10, 9, 2011)    668
(10, 8, 2011)    584
(9, 4, 2011)     505
(8, 13, 2011)    502
(9, 5, 2011)     497
(8, 6, 2011)     489
(5, 27, 2012)    480
dtype: int64

In [246]: df
Out[246]: 
            date  avgtemp
0  (7, 27, 2011)    76.01
1  (7, 28, 2011)    71.51
2  (7, 29, 2011)    72.50
3  (7, 30, 2011)    81.05
4  (7, 31, 2011)    80.51
5   (8, 1, 2011)    82.49
6   (8, 2, 2011)    78.98

Rename the Series to num_rides, so that the merged column gets that name:

In [247]: series.name = 'num_rides'

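Then convert the Series to a DataFrame and merge it with df. With how='outer', rows from both sides are kept, so a date that appears on only one side gets NaN in the other side's columns: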

In [248]: pd.merge(series.to_frame(), df, left_index=True, right_on='date', how='outer')
Out[248]: 
   num_rides           date  avgtemp
6        668  (10, 9, 2011)      NaN
6        584  (10, 8, 2011)      NaN
6        505   (9, 4, 2011)      NaN
6        502  (8, 13, 2011)      NaN
6        497   (9, 5, 2011)      NaN
6        489   (8, 6, 2011)      NaN
6        480  (5, 27, 2012)      NaN
0        NaN  (7, 27, 2011)    76.01
1        NaN  (7, 28, 2011)    71.51
2        NaN  (7, 29, 2011)    72.50
3        NaN  (7, 30, 2011)    81.05
4        NaN  (7, 31, 2011)    80.51
5        NaN   (8, 1, 2011)    82.49
6        NaN   (8, 2, 2011)    78.98
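
None of the dates in the sample above appear on both sides, which is why every row has a NaN. If the goal is only to fill num_rides in df for the dates it already contains, one possible alternative to the merge is a plain dictionary lookup. The following is a minimal sketch, not the method from the answer: the `(8, 6, 2011)` row and its avgtemp value are made up so that one date matches, and the name `counts` is introduced here for illustration.

```python
import pandas as pd

# Counts keyed by (month, day, year) tuples, shaped like the value_counts result.
# tupleize_cols=False keeps each tuple as a single label instead of a MultiIndex.
series = pd.Series(
    [668, 489],
    index=pd.Index([(10, 9, 2011), (8, 6, 2011)], tupleize_cols=False),
    name="num_rides",
)

# The (8, 6, 2011) row and its avgtemp are hypothetical, added so one date matches.
df = pd.DataFrame(
    {
        "date": [(8, 1, 2011), (8, 2, 2011), (8, 6, 2011)],
        "avgtemp": [82.49, 78.98, 80.00],
    }
)

# Tuples are hashable, so they can be used directly as dictionary keys.
counts = dict(zip(series.index, series.values))

# Dates with no count get NaN, which makes the new column float.
df["num_rides"] = df["date"].map(lambda d: counts.get(d, float("nan")))
print(df)
```

This keeps df's rows and index unchanged and ignores dates that exist only in the Series, whereas the outer merge keeps those as extra rows.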