I have a DataFrame:
date city brand model count
2016-02 abakan audi a6 1
2016-02 abakan bmw 5-series 2
2016-02 abakan bmw x5 2
2016-02 abakan chery a15 1
2016-02 abakan chevrolet cruze 3
2016-02 abakan chevrolet cruze 10
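I need to split it into smaller DataFrames so that I can use linear regression from sklearn. Is there a way to do this, or is there some way to tell linear regression to take the different values in the columns into account? The desired result is: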
date city brand model count
2016-02 abakan audi a6 1
date city brand model count
2016-02 abakan bmw 5-series 2
date city brand model count
2016-02 abakan bmw x5 2
date city brand model count
2016-02 abakan chery a15 1
date city brand model count
2016-02 abakan chevrolet cruze 3
2016-02 abakan chevrolet cruze 10
How can I do this?
Answer 0 (score: 1)
A pandas solution using groupby and a list comprehension. The output is a list of DataFrames:
dfs = [g for _, g in df.groupby(['date', 'city', 'brand', 'model'])]
print(dfs)
[ date city brand model count
0 2016-02 abakan audi a6 1, date city brand model count
1 2016-02 abakan bmw 5-series 2, date city brand model count
2 2016-02 abakan bmw x5 2, date city brand model count
3 2016-02 abakan chery a15 1, date city brand model count
4 2016-02 abakan chevrolet cruze 3
5 2016-02 abakan chevrolet cruze 10]
print(dfs[0])
date city brand model count
0 2016-02 abakan audi a6 1
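
If you need to look a group up by its values rather than by position in the list, one option is to keep the group keys. The following is a minimal sketch, not part of the original answer: it rebuilds the sample data from the question, and the names `keys` and `groups` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "date": ["2016-02"] * 6,
    "city": ["abakan"] * 6,
    "brand": ["audi", "bmw", "bmw", "chery", "chevrolet", "chevrolet"],
    "model": ["a6", "5-series", "x5", "a15", "cruze", "cruze"],
    "count": [1, 2, 2, 1, 3, 10],
})

keys = ["date", "city", "brand", "model"]

# Grouping by a list of columns yields (key_tuple, sub_frame) pairs,
# so each sub-DataFrame can be stored under its key tuple.
groups = {key: g for key, g in df.groupby(keys)}

# Iterate over the groups, e.g. to fit one model per group.
for key, g in groups.items():
    print(key, len(g))
```

Each value in `groups` is an ordinary DataFrame, so it can be passed on to whatever per-group processing you need, for example `groups[("2016-02", "abakan", "chevrolet", "cruze")]` for the two cruze rows.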