Pandas: split a DataFrame by the values in certain columns

Asked: 2016-12-14 08:05:46

Tags: python pandas scikit-learn

I have a DataFrame:

date     city    brand      model     count
2016-02  abakan  audi       a6        1
2016-02  abakan  bmw        5-series  2
2016-02  abakan  bmw        x5        2
2016-02  abakan  chery      a15       1
2016-02  abakan  chevrolet  cruze     3
2016-02  abakan  chevrolet  cruze     10

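I need to split it into smaller DataFrames so that I can use LinearRegression from sklearn on each one. Is there a way to do this, or is there some way to tell the linear regression to take the distinct values in these columns into account?

Desired output: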

date     city    brand  model  count
2016-02  abakan  audi   a6     1

date     city    brand  model     count
2016-02  abakan  bmw    5-series  2

date     city    brand  model  count
2016-02  abakan  bmw    x5     2

date     city    brand  model  count
2016-02  abakan  chery  a15    1

date     city    brand      model  count
2016-02  abakan  chevrolet  cruze  3
2016-02  abakan  chevrolet  cruze  10

How can I do that?

1 Answer:

Answer 0 (score: 1)

A pandas solution is to use groupby with a list comprehension. The output is a list of DataFrames:

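# One DataFrame per unique (date, city, brand, model) combination;
# groupby yields (key, group) pairs and only the group is kept.
dfs = [g for _, g in df.groupby(['date', 'city', 'brand', 'model'])]
print(dfs)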
[      date    city brand model  count
0  2016-02  abakan  audi    a6      1,       date    city brand     model  count
1  2016-02  abakan   bmw  5-series      2,       date    city brand model  count
2  2016-02  abakan   bmw    x5      2,       date    city  brand model  count
3  2016-02  abakan  chery   a15      1,       date    city      brand  model  count
4  2016-02  abakan  chevrolet  cruze      3
5  2016-02  abakan  chevrolet  cruze     10]

print(dfs[0])
      date    city brand model  count
0  2016-02  abakan  audi    a6      1
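
If the groups need to be looked up by their key rather than by position, the same groupby can fill a dict instead of a list. The following is a minimal, self-contained sketch: it rebuilds the sample data from the question, and the names `keys`, `groups` and `cruze` are illustrative only, not part of the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2016-02"] * 6,
    "city": ["abakan"] * 6,
    "brand": ["audi", "bmw", "bmw", "chery", "chevrolet", "chevrolet"],
    "model": ["a6", "5-series", "x5", "a15", "cruze", "cruze"],
    "count": [1, 2, 2, 1, 3, 10],
})

# Columns whose distinct value combinations define the groups.
keys = ["date", "city", "brand", "model"]

# Map each (date, city, brand, model) tuple to its sub-DataFrame.
groups = {key: g for key, g in df.groupby(keys)}

# Look up one group directly by its key tuple.
cruze = groups[("2016-02", "abakan", "chevrolet", "cruze")]

print(len(groups))              # 5
print(cruze["count"].tolist())  # [3, 10]
```

Each value in `groups` can then be passed to a model on its own. Note that most groups in this sample hold a single row, which is too few to fit a regression.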