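I am trying to filter on 3 columns of a DataFrame, set a condition across those 3 columns, and return a binary value: 1 if all conditions are met, 0 otherwise. An example is shown below.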
import pandas as pd
from numpy import array

data = {'PassengerId': array([2255, 2257, 2258, 2256, 2257, 2258, 2255, 2258, 2257, 2257, 2255,
2255, 2257, 2256, 2257, 2256, 2255, 2258, 2258, 2256, 2256, 2257,
2258, 2258, 2257]),
'Pclass': array([3, 2, 2, 2, 4, 3, 3, 4, 3, 1, 1, 1, 1, 2, 4, 3, 1, 2, 4, 3, 2, 3,
1, 1, 2]),
'Age': array([40, 33, 32, 40, 48, 24, 33, 29, 29, 31, 45, 47, 28, 32, 54, 39, 28,
50, 40, 31, 51, 26, 41, 46, 27]),
'SibSp': array([11, 13, 12, 19, 22, 17, 23, 12, 12, 12, 12, 24, 16, 21, 12, 15, 20,
18, 10, 17, 20, 12, 17, 17, 10]),
'Comf' : array([236.66883531, 235.46750709, 235.64574546, 241.16838089,
239.40728836, 239.95592634, 236.67806901, 237.73350635,
238.74497849, 235.17486552, 235.8457374 , 236.85133744,
240.9359547 , 236.27703374, 237.81871052, 241.62788018,
241.29185342, 235.0058136 , 240.69989317, 238.8073828 ,
238.08841364, 236.55259788, 237.58108419, 239.66916186,
241.97479544]),
'Parch': array([232.37686437, 232.39153096, 230.56566556, 232.77980061,
232.19436342, 232.2165835 , 232.28145641, 231.26988217,
230.55287196, 232.26528521, 230.45185855, 230.87525326,
231.38775744, 232.80960083, 232.33105822, 232.65782351,
231.64457366, 230.45225829, 231.05404057, 232.38229998,
232.57354117, 232.08690375, 230.40414215, 230.14361969,
231.40414745]),
'Fare': array([238.80427104, 239.32031287, 238.02212358, 238.40333494,
238.85929097, 239.51666683, 239.87771029, 238.06772515,
238.22734658, 238.54682118, 238.68880278, 239.79658425,
238.2642908 , 239.22884058, 239.84423352, 239.69438831,
238.85871719, 238.64632848, 238.7085097 , 239.5700877 ,
239.06199698, 238.37341378, 239.16126748, 239.01280153,
239.77047796])}
df = pd.DataFrame(data)
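First, I am trying to set a condition: if "Pclass" == 1 and "Comf" lies between "Parch" and "Fare", create a new column "Survived" and assign 1; otherwise assign 0.
Then do the same for "Pclass" == 2, 3, and so on.
I would like to do this with pandas, but any approach that solves the problem is welcome.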
Answer 0 (score: 0)
Use assign: just compute the condition and convert it to the int type:
df = pd.DataFrame(data=data)
df = df.assign(Survived=lambda x: x['Comf'].between(x['Parch'], x['Fare']).astype(int))
print(df.to_string())
Or with plain assignment (=):
df = pd.DataFrame(data=data)
df['Survived'] = df['Comf'].between(df['Parch'], df['Fare']).astype(int)
print(df.to_string())
Output:
    PassengerId  Pclass  Age  SibSp        Comf       Parch        Fare  Survived
0          2255       3   40     11  236.668835  232.376864  238.804271         1
1          2257       2   33     13  235.467507  232.391531  239.320313         1
2          2258       2   32     12  235.645745  230.565666  238.022124         1
3          2256       2   40     19  241.168381  232.779801  238.403335         0
4          2257       4   48     22  239.407288  232.194363  238.859291         0
5          2258       3   24     17  239.955926  232.216584  239.516667         0
6          2255       3   33     23  236.678069  232.281456  239.877710         1
7          2258       4   29     12  237.733506  231.269882  238.067725         1
8          2257       3   29     12  238.744978  230.552872  238.227347         0
9          2257       1   31     12  235.174866  232.265285  238.546821         1
10         2255       1   45     12  235.845737  230.451859  238.688803         1
11         2255       1   47     24  236.851337  230.875253  239.796584         1
12         2257       1   28     16  240.935955  231.387757  238.264291         0
13         2256       2   32     21  236.277034  232.809601  239.228841         1
14         2257       4   54     12  237.818711  232.331058  239.844234         1
15         2256       3   39     15  241.627880  232.657824  239.694388         0
16         2255       1   28     20  241.291853  231.644574  238.858717         0
17         2258       2   50     18  235.005814  230.452258  238.646328         1
18         2258       4   40     10  240.699893  231.054041  238.708510         0
19         2256       3   31     17  238.807383  232.382300  239.570088         1
20         2256       2   51     20  238.088414  232.573541  239.061997         1
21         2257       3   26     12  236.552598  232.086904  238.373414         1
22         2258       1   41     17  237.581084  230.404142  239.161267         1
23         2258       1   46     17  239.669162  230.143620  239.012802         0
24         2257       2   27     10  241.974795  231.404147  239.770478         0
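To see what between is doing here, a minimal sketch on a tiny made-up frame may help. The frame name toy and its values are assumptions for illustration, not part of the question's data. When the bounds are Series, the comparison is done row by row, and both endpoints are inclusive by default.

```python
import pandas as pd

# Tiny hypothetical frame (values invented for illustration only).
toy = pd.DataFrame({
    "Comf": [5.0, 12.0, 3.0],
    "Parch": [1.0, 1.0, 3.0],
    "Fare": [10.0, 10.0, 8.0],
})

# between() accepts Series as bounds and compares row by row;
# both endpoints are inclusive by default.
in_range = toy["Comf"].between(toy["Parch"], toy["Fare"])

# Convert the boolean result to a 0/1 integer flag.
toy["Survived"] = in_range.astype(int)
print(toy)
```

The third row has Comf equal to Parch, so it counts as inside the range under the default inclusive behaviour.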
Answer 1 (score: 0)
If you want to do this for all rows, regardless of the Pclass value, you can use
df["Survived"] = df["Comf"].between(df["Parch"], df["Fare"]).astype(int)
But if you want to do it only for a specific Pclass, you can use
df["Survived"] = ((df["Pclass"] == 1) & df["Comf"].between(df["Parch"], df["Fare"])).astype(int)
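One detail worth spelling out: in Python, & binds more tightly than ==, so a comparison such as Pclass == 1 needs its own parentheses before it is combined with another mask. The sketch below uses a tiny made-up frame (toy, wanted_classes, and the SurvivedAny column are hypothetical names). It also assumes that "the same for Pclass == 2, 3" could mean flagging rows whose class is in a chosen set, which is only one possible reading of the question.

```python
import pandas as pd

# Tiny hypothetical frame (values invented for illustration only).
toy = pd.DataFrame({
    "Pclass": [1, 1, 2, 3],
    "Comf": [5.0, 12.0, 5.0, 5.0],
    "Parch": [1.0, 1.0, 1.0, 1.0],
    "Fare": [10.0, 10.0, 10.0, 10.0],
})

in_range = toy["Comf"].between(toy["Parch"], toy["Fare"])

# Parenthesize the comparison: & binds tighter than ==.
toy["Survived"] = ((toy["Pclass"] == 1) & in_range).astype(int)

# Assumption: "the same for Pclass == 2, 3" means any class in a chosen set.
wanted_classes = [1, 2]  # hypothetical selection
toy["SurvivedAny"] = (toy["Pclass"].isin(wanted_classes) & in_range).astype(int)
print(toy)
```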
Answer 2 (score: 0)
Try this.
Steps:
indexesOfTrue = df[(df["Pclass"]==1) & (df["Comf"] > df["Parch"]) & (df["Comf"] < df["Fare"])].index
df.loc[indexesOfTrue, "Survived"] = 1
df.loc[~df.index.isin(indexesOfTrue), "Survived"] = 0
Output (excerpt; the rows shown with 2 are the ones the code above sets to 0):
    PassengerId  Pclass  Age  SibSp        Comf       Parch        Fare  Survived
5          2258       3   24     17  239.955926  232.216584  239.516667         2
6          2255       3   33     23  236.678069  232.281456  239.877710         2
7          2258       4   29     12  237.733506  231.269882  238.067725         2
8          2257       3   29     12  238.744978  230.552872  238.227347         2
9          2257       1   31     12  235.174866  232.265285  238.546821         1
10         2255       1   45     12  235.845737  230.451859  238.688803         1
11         2255       1   47     24  236.851337  230.875253  239.796584         1
12         2257       1   28     16  240.935955  231.387757  238.264291         2
13         2256       2   32     21  236.277034  232.809601  239.228841         2
14         2257       4   54     12  237.818711  232.331058  239.844234         2
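As a possible variation on the steps above, the boolean mask can be used directly instead of collecting index labels first. This is a minimal sketch on a tiny made-up frame (toy and mask are hypothetical names), using the same strict inequalities as this answer and numpy.where to write 1 or 0 in a single assignment, which also keeps the new column as integers.

```python
import numpy as np
import pandas as pd

# Tiny hypothetical frame (values invented for illustration only).
toy = pd.DataFrame({
    "Pclass": [1, 1, 2, 1],
    "Comf": [5.0, 12.0, 5.0, 1.0],
    "Parch": [1.0, 1.0, 1.0, 1.0],
    "Fare": [10.0, 10.0, 10.0, 10.0],
})

# Strict bounds, as in the steps above: Parch < Comf < Fare, and Pclass == 1.
mask = (
    (toy["Pclass"] == 1)
    & (toy["Comf"] > toy["Parch"])
    & (toy["Comf"] < toy["Fare"])
)

# 1 where the mask is True, 0 everywhere else.
toy["Survived"] = np.where(mask, 1, 0)
print(toy)
```

Note that the last row has Comf equal to Parch, so with strict inequalities it is flagged 0, whereas between with its default inclusive bounds would flag it 1.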