I'm really stuck. I have a DataFrame with a column that looks like this:
Dailychange:
1
2
3
0
-1
-2
-3
1
2
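I want to count the lengths of the consecutive positive and negative runs and collect them in lists, with the output pos [3,2], neutral [1], neg [3]. I tried to solve it with a simple loop like this:
# for i in symbol: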
#     if (symbol['Dailychange']>0):
#         counter+=1
#         cons_list.append(counter)
#     else:
#         counter=0
#         cons_list.append(counter)
# print(cons_list)
but it raises an error because of my if statement. I then tried the where function:
symbol['positive']=symbol.where(symbol['Dailychange']>0,'positive','Negative')
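That did not work either. I would really appreciate your help.
Answer 0 (score: 2)
We need a new helper column, which I build with np.where (here df['Num'] plays the role of the Dailychange column from the question):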
df['New']=np.where(df['Num']>0,'positive',np.where(df['Num']==0,'Nutral','Negative'))
s=df.groupby([df['New'],(df['New']!=df['New'].shift()).cumsum()]).size().reset_index(level=1,drop=True)
s
Out[41]:
New
Negative 3
Nutral 1
positive 3
positive 2
dtype: int64
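To get from this result to the exact pos/neutral/neg lists asked for in the question, the same idea can be sketched end to end as below. This is only a sketch under assumptions: the column is named Dailychange as in the question, and the names label, run_id, and streaks are made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"Dailychange": [1, 2, 3, 0, -1, -2, -3, 1, 2]})

# Label every row by the sign of its value.
label = pd.Series(
    np.where(df["Dailychange"] > 0, "pos",
             np.where(df["Dailychange"] == 0, "neutral", "neg")),
    index=df.index,
)

# A new run starts wherever the label differs from the previous row;
# the cumulative sum of those change flags numbers the runs 1, 2, 3, ...
run_id = (label != label.shift()).cumsum()

# One entry per run: which label it carries and how many rows it spans.
run_label = label.groupby(run_id).first()
run_length = label.groupby(run_id).size()

# Collect the run lengths per label, in order of appearance.
streaks = {}
for lab, n in zip(run_label, run_length):
    streaks.setdefault(lab, []).append(int(n))

print(streaks)  # {'pos': [3, 2], 'neutral': [1], 'neg': [3]}
```

Building the dictionary with a plain loop keeps the runs in their original order, so the two positive streaks come out as [3, 2] rather than sorted.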
More details:
(df['New']!=df['New'].shift()).cumsum()
Out[804]:
0 1
1 1
2 1
3 2
4 3
5 3
6 3
7 4
8 4
Name: New, dtype: int32
(df['New']!=df['New'].shift())
Out[805]:
0 True
1 False
2 False
3 True # here is the status change
4 True # here is the status change
5 False # unchanged rows carry over the same group number as before
6 False
7 True # here is the status change
8 False
Name: New, dtype: bool
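We treat each stretch of consecutive positives or negatives as one group; as soon as the label changes, the next group begins.
One more thing: True + False = 1, which is why cumsum() over the boolean mask produces the group numbers.
Answer 1 (score: 0)
pd.cut and groupby are exactly what you are looking for: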
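The shift-and-cumsum trick can be checked in isolation on a tiny series. The sketch below uses made-up labels and the hypothetical names changed and group_id; it is only meant to show how the boolean flags turn into group numbers.

```python
import pandas as pd

# Made-up labels, chosen only to illustrate the trick.
labels = pd.Series(["pos", "pos", "neutral", "neg", "neg", "pos"])

# True wherever a row differs from the row before it. The first row is
# compared with a missing value, so it always counts as a change.
changed = labels != labels.shift()

# Booleans add up like integers (True + False == 1), so the running sum
# increases by one at every change and stays flat otherwise.
group_id = changed.cumsum()

print(changed.tolist())   # [True, False, True, True, False, True]
print(group_id.tolist())  # [1, 1, 2, 3, 3, 4]
```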
import numpy as np
import pandas as pd

x = pd.DataFrame([1, 2, 3, 0, -1, -2, -3, 1, 2], columns=['Dailychange'])
col = x['Dailychange']
# Bin each value by sign: (-inf, -0.1] -> neg, (-0.1, 0.1] -> neutral, (0.1, inf] -> pos
x['Labels'] = list(pd.cut(x['Dailychange'], [-float("inf"), -0.1, 0.1, float("inf")], labels=['neg', 'neutral', 'pos']))
# for i, e in enumerate(x['Labels']):
#     print(col[i], x['Labels'][i])
# Start a new chunk number whenever the label differs from the previous row
x['chunk_number'] = (x['Labels'] != x['Labels'].shift()).cumsum()
grouped_df = x.groupby('chunk_number')
# Print the values that belong to each consecutive chunk
for i in grouped_df.groups.keys():
    print(list(grouped_df.get_group(i)['Dailychange']))
See also: Documentation | Related question | Another Related question
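As an aside, the loop in the question fails for two reasons: iterating over a DataFrame yields its column names rather than its rows, and symbol['Dailychange']>0 compares the whole column at once, which gives a boolean Series that an if statement cannot evaluate. A pandas-free sketch that walks the values one at a time is shown below. Note that it swaps the manual counter for itertools.groupby, and that sign_label and streak_lengths are hypothetical helper names.

```python
from itertools import groupby


def sign_label(value):
    """Map a number to 'pos', 'neg', or 'neutral' by its sign."""
    if value > 0:
        return "pos"
    if value < 0:
        return "neg"
    return "neutral"


def streak_lengths(values):
    """Return the lengths of consecutive same-sign runs, keyed by label."""
    streaks = {"pos": [], "neutral": [], "neg": []}
    # itertools.groupby yields one group per run of equal adjacent keys.
    for label, run in groupby(values, key=sign_label):
        streaks[label].append(len(list(run)))
    return streaks


print(streak_lengths([1, 2, 3, 0, -1, -2, -3, 1, 2]))
# {'pos': [3, 2], 'neutral': [1], 'neg': [3]}
```

With a DataFrame, the column can be passed in directly, for example streak_lengths(symbol['Dailychange']), since iterating over a Series yields its values.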