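I am trying to add a new column to a dataset based on a condition, but the resulting DataFrame is not what I expect.
I have tried a few approaches, and this is the closest I have gotten.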
import pandas as pd
data = {'Date' : ['3-Mar', '20-Mar', '20-Apr', '21-Apr', '29-Apr', '7-May',
                  '30-May', '31-May', '7-Jun', '16-Jun',
                  '1-Jul', '2-Jul', '10-Jul'],
        'Value' : [0.5840, 0.8159, 0.7789, 0.7665, 0.8510, 0.7428, 0.7124,
                   0.6820, 0.8714, 0.8902, 0.8596, 0.8289, 0.6877],}
frame = pd.DataFrame(data)
for counter, value in enumerate(frame['Value']):
    if value >= 0.7:
        frame = frame.append({'result': 'High'}, ignore_index=True)
    else:
        frame = frame.append({'result': 'Low'}, ignore_index=True)
print(frame)
The result is:
      Date   Value result
0    3-Mar  0.5840    NaN
1   20-Mar  0.8159    NaN
2   20-Apr  0.7789    NaN
3   21-Apr  0.7665    NaN
4   29-Apr  0.8510    NaN
5    7-May  0.7428    NaN
6   30-May  0.7124    NaN
7   31-May  0.6820    NaN
8    7-Jun  0.8714    NaN
9   16-Jun  0.8902    NaN
10   1-Jul  0.8596    NaN
11   2-Jul  0.8289    NaN
12  10-Jul  0.6877    NaN
13     NaN     NaN    Low
14     NaN     NaN   High
15     NaN     NaN   High
16     NaN     NaN   High
17     NaN     NaN   High
18     NaN     NaN   High
19     NaN     NaN   High
20     NaN     NaN    Low
21     NaN     NaN   High
22     NaN     NaN   High
23     NaN     NaN   High
24     NaN     NaN   High
25     NaN     NaN    Low
However, I want these values to be placed next to the existing values, not in new rows.
Thanks!
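Answer 0 (score: 1)
If you look at the documentation for the append function, you will see that it appends rows to the end of the DataFrame, which is not what you want:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.append.html
You can achieve this with a lambda function, which applies the desired logic to each value in the column.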
frame['result'] = frame['Value'].apply(lambda x: 'High' if x >= 0.7 else 'Low')
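For completeness, here is a minimal, self-contained sketch of this approach using the data from the question. It assumes the same `>= 0.7` threshold as the question's loop. As a side note, `DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so the loop from the question no longer runs on current versions.

```python
import pandas as pd

data = {
    'Date': ['3-Mar', '20-Mar', '20-Apr', '21-Apr', '29-Apr', '7-May', '30-May',
             '31-May', '7-Jun', '16-Jun', '1-Jul', '2-Jul', '10-Jul'],
    'Value': [0.5840, 0.8159, 0.7789, 0.7665, 0.8510, 0.7428, 0.7124,
              0.6820, 0.8714, 0.8902, 0.8596, 0.8289, 0.6877],
}
frame = pd.DataFrame(data)

# Series.apply calls the lambda once per element and returns a Series with the
# same index as frame['Value'], so each label lands next to its own row.
frame['result'] = frame['Value'].apply(lambda x: 'High' if x >= 0.7 else 'Low')

print(frame.shape)  # (13, 3): a column was added, no rows
print(frame)
```

Because the new column is assigned in one step, there is no need to loop over the rows or to build the frame up incrementally.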
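Answer 1 (score: 0)
If I understand correctly, this may already have been answered, but here you go.
You need to create a new column, result.
Define a function (for readability) that takes a value and returns the result: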
def udf(value):
    if value >= .7:
        return "High"
    else:
        return "Low"
Then apply this function to the column values:
frame['result'] = frame['Value'].apply(udf)
I suggest you read the documentation for DataFrame.apply.
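Note that the line above actually calls `Series.apply` (on a single column), while `DataFrame.apply` works on whole rows or columns. The sketch below contrasts the two on a small subset of the question's data; the variable names `by_element` and `by_row` are only illustrative.

```python
import pandas as pd

def udf(value):
    """Classify a single number as 'High' (>= 0.7) or 'Low'."""
    return 'High' if value >= 0.7 else 'Low'

# Small illustrative frame (a subset of the question's data).
frame = pd.DataFrame({'Date': ['3-Mar', '20-Mar', '31-May'],
                      'Value': [0.5840, 0.8159, 0.6820]})

# Series.apply: udf receives one scalar from the 'Value' column at a time.
by_element = frame['Value'].apply(udf)

# DataFrame.apply with axis=1: the function receives a whole row (a Series),
# so the column has to be looked up by name inside it.
by_row = frame.apply(lambda row: udf(row['Value']), axis=1)

print(by_element.tolist())  # ['Low', 'High', 'Low']
print(by_row.tolist() == by_element.tolist())
```

When the logic depends on one column only, the `Series.apply` form is usually the simpler choice; the row-wise form becomes useful once the function needs several columns.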
Answer 2 (score: 0)
Using pandas.Series can solve your problem:
import pandas as pd
data = {'Date' : ['3-Mar', '20-Mar', '20-Apr', '21-Apr', '29-Apr', '7-May',
                  '30-May', '31-May', '7-Jun', '16-Jun', '1-Jul', '2-Jul', '10-Jul'],
        'Value' : [0.5840, 0.8159, 0.7789, 0.7665, 0.8510, 0.7428, 0.7124,
                   0.6820, 0.8714, 0.8902, 0.8596, 0.8289, 0.6877]}
frame = pd.DataFrame(data)
frame['result'] = pd.Series(['High' if x >= 0.7 else 'Low' for x in frame['Value']])
Output:
      Date   Value result
0    3-Mar  0.5840    Low
1   20-Mar  0.8159   High
2   20-Apr  0.7789   High
3   21-Apr  0.7665   High
4   29-Apr  0.8510   High
5    7-May  0.7428   High
6   30-May  0.7124   High
7   31-May  0.6820    Low
8    7-Jun  0.8714   High
9   16-Jun  0.8902   High
10   1-Jul  0.8596   High
11   2-Jul  0.8289   High
12  10-Jul  0.6877    Low
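One caveat with wrapping the list in `pd.Series`: the assignment aligns on index labels, so it only works as shown because the frame has the default 0..n-1 index. The sketch below uses a hypothetical frame with a different index to illustrate this, and also shows `numpy.where` as a vectorized alternative that is not part of the answer above. The column names `via_series`, `via_list` and `via_where` are made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index does not start at 0 (e.g. after filtering).
frame = pd.DataFrame({'Value': [0.5840, 0.8159, 0.6820]}, index=[10, 11, 12])

labels = ['High' if x >= 0.7 else 'Low' for x in frame['Value']]

# A fresh pd.Series gets the index 0, 1, 2, which shares no labels with
# 10, 11, 12, so the label-aligned assignment leaves only NaN.
frame['via_series'] = pd.Series(labels)

# A plain list is assigned by position, so the index does not matter.
frame['via_list'] = labels

# numpy.where evaluates the condition on the whole column at once.
frame['via_where'] = np.where(frame['Value'] >= 0.7, 'High', 'Low')

print(frame)
```

If the frame may have a non-default index, assigning the plain list or the `numpy.where` result is probably the safer option.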