I have this DataFrame:
Outlook Temperature PlayTennis Value
0 Sunny 60 Yes 1
1 Sunny 70 Yes 1
2 Sunny 40 No 1
3 Overcast 40 No 1
4 Overcast 60 Yes 1
5 Overcast 50 Yes 1
6 Overcast 70 Yes 1
7 Overcast 80 Yes 1
8 Rain 65 No 1
9 Rain 70 Yes 1
I would like to get this:
Outlook Yes No
Sunny 2 1
Overcast 4 1
Rain 1 1
I'm not sure which command to use to count the Yes and No values for each of Sunny / Overcast / Rain.
Answer 0 (score: 0)
Here is something to get you started:
forecasts = [
    ["sunny", "yes"],
    ["sunny", "yes"],
    ["sunny", "no"],
    ["overcast", "no"],
    # more forecasts ...
]
# Maps each outlook to a [yes_count, no_count] pair.
myForecasts = {}
for forecast in forecasts:
    if forecast[0] not in myForecasts:
        myForecasts[forecast[0]] = [0, 0]
    if forecast[1] == "yes":
        myForecasts[forecast[0]][0] += 1
    else:
        myForecasts[forecast[0]][1] += 1
print("Outlook | Yes | No")
for myForecast in myForecasts:
    print("{} | {} | {}".format(myForecast, myForecasts[myForecast][0], myForecasts[myForecast][1]))
I hope this helps. Next time, please show us that you have done your homework.
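The same counting idea can be sketched with `collections.Counter` from the standard library, which avoids tracking list positions for yes and no. This is only a sketch: the `count_by_outlook` helper and the tuple layout are hypothetical names chosen for illustration, and the rows are copied from the question's DataFrame.

```python
from collections import Counter

# (Outlook, PlayTennis) pairs copied from the question's ten rows.
forecasts = [
    ("Sunny", "Yes"), ("Sunny", "Yes"), ("Sunny", "No"),
    ("Overcast", "No"), ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Overcast", "Yes"), ("Overcast", "Yes"),
    ("Rain", "No"), ("Rain", "Yes"),
]


def count_by_outlook(rows):
    """Hypothetical helper: map each outlook to a Counter of its answers."""
    counts = {}
    for outlook, answer in rows:
        # setdefault creates an empty Counter the first time an outlook is seen.
        counts.setdefault(outlook, Counter())[answer] += 1
    return counts


counts = count_by_outlook(forecasts)
print("Outlook | Yes | No")
for outlook, tally in counts.items():
    # A Counter returns 0 for a missing key, so absent answers print as 0.
    print("{} | {} | {}".format(outlook, tally["Yes"], tally["No"]))
```

Because a plain dict keeps insertion order, the rows print in the order the outlooks first appear in the data.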
Answer 1 (score: 0)
How about this?
df.groupby('Outlook').apply(lambda g: g['PlayTennis'].value_counts())
Or, to match your exact specification:
df.groupby('Outlook').apply(lambda g: g['PlayTennis'].value_counts()).unstack(1)
Or even shorter:
df.groupby('Outlook')['PlayTennis'].value_counts().unstack(1)
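For anyone who wants to run the shortest version end to end, here is a minimal sketch that rebuilds the question's DataFrame first. The `fill_value=0` argument and the final `reindex` are additions not present in the answer above: the first guards against an outlook that has no Yes or no No rows, and the second only restores the row and column order the question asked for.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "Outlook": ["Sunny"] * 3 + ["Overcast"] * 5 + ["Rain"] * 2,
    "Temperature": [60, 70, 40, 40, 60, 50, 70, 80, 65, 70],
    "PlayTennis": ["Yes", "Yes", "No", "No", "Yes",
                   "Yes", "Yes", "Yes", "No", "Yes"],
    "Value": [1] * 10,
})

# Count PlayTennis values within each Outlook, then pivot the inner
# index level (PlayTennis) into columns.
counts = (
    df.groupby("Outlook")["PlayTennis"]
      .value_counts()
      .unstack(fill_value=0)
)

# Optional: put rows and columns in the order shown in the question.
counts = counts.reindex(index=["Sunny", "Overcast", "Rain"],
                        columns=["Yes", "No"])
print(counts)
```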
Answer 2 (score: 0)
You can use pd.pivot_table to solve this:
In [88]: pd.pivot_table(df, index='Outlook', columns='PlayTennis',
                        values='Value', aggfunc='sum')
Out[88]:
PlayTennis No Yes
Outlook
Overcast 1 4
Rain 1 1
Sunny 1 2
Alternatively, you can groupby on 'Outlook', 'PlayTennis' to get the counts and then use unstack('PlayTennis'):
In [87]: df.groupby(['Outlook', 'PlayTennis']).size().unstack('PlayTennis')
Out[87]:
PlayTennis No Yes
Outlook
Overcast 1 4
Rain 1 1
Sunny 1 2
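Both approaches from this answer can be checked side by side with a small self-contained sketch. It assumes a recent pandas release, where the `pivot_table` keyword is `columns` (older releases accepted `cols`, which has since been removed). The `fill_value=0` arguments are an addition for safety and do not change the result for this data.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "Outlook": ["Sunny"] * 3 + ["Overcast"] * 5 + ["Rain"] * 2,
    "Temperature": [60, 70, 40, 40, 60, 50, 70, 80, 65, 70],
    "PlayTennis": ["Yes", "Yes", "No", "No", "Yes",
                   "Yes", "Yes", "Yes", "No", "Yes"],
    "Value": [1] * 10,
})

# Approach 1: sum the constant Value column for each Outlook/PlayTennis cell.
pivot = pd.pivot_table(df, index="Outlook", columns="PlayTennis",
                       values="Value", aggfunc="sum", fill_value=0)

# Approach 2: count the rows in each group; this does not need the Value column.
sized = (
    df.groupby(["Outlook", "PlayTennis"])
      .size()
      .unstack("PlayTennis", fill_value=0)
)

print(pivot)
print(sized)
```

The `groupby(...).size()` variant may be preferable when the data has no helper column of ones, since it counts rows directly.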