I have a silly question. My df looks like this:
        FID_2      STA_SID           s2       s1  Qh_STA  Qh_FID2
14  222143.00  26040713.00         0.00     0.00    8.00    17.00
15  222143.00  26040713.00         0.00     8.00    6.00    17.00
13  222143.00  26040713.00         6.00     8.00    3.00    17.00
17        NaN  26033594.00  29445425.00  1707.00    5.00      NaN
I defined the following function and command:
A = 0.8
def seekDO(row):
    if row['Qh_STA'] / row['Qh_FID2'] < A:
        return 1
    if (row['Qh_STA'] + row['s1']) / row['Qh_FID2'] < A:
        return 1
    if (row['Qh_STA'] + row['s1'] + row['s2']) / row['Qh_FID2'] < A:
        return 1
    return 0
df['DO'] = df.apply(lambda row: seekDO(row), axis=1)
The problem is that for DO I get
DO
14 1
15 1
13 1
17 0
instead of
DO
14 1
15 0
13 0
17 0
Can you see where I went wrong?
Answer 0: (score: 2)
I believe you can test each condition on whole columns instead of looping over rows, which is slow:
A = 0.8
m1 = df['Qh_STA']/df['Qh_FID2'] < A
m2 = (df['Qh_STA'] + df['s1'])/df['Qh_FID2'] < A
m3 = (df['Qh_STA'] + df['s1'] + df['s2']) / df['Qh_FID2'] < A
If all conditions must hold (AND), chain the masks with & and convert the resulting True/False values to 1/0 with astype(int):
df['DO'] = (m1 & m2 & m3).astype(int)
print(df)
       FID_2     STA_SID          s2      s1  Qh_STA  Qh_FID2  DO
14  222143.0  26040713.0         0.0     0.0     8.0     17.0   1
15  222143.0  26040713.0         0.0     8.0     6.0     17.0   0
13  222143.0  26040713.0         6.0     8.0     3.0     17.0   0
17       NaN  26033594.0  29445425.0  1707.0     5.0      NaN   0
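As a self-contained sketch of the same AND logic, the three numerators can also be built with a cumulative sum across the columns. This is an alternative formulation, not the code from the answer above; the names `cum` and `ratios` are made up for illustration, and the sample rows are copied from the question (only the columns the checks need).

```python
import numpy as np
import pandas as pd

# Sample rows from the question, restricted to the columns used in the checks.
df = pd.DataFrame(
    {
        "s2": [0.0, 0.0, 6.0, 29445425.0],
        "s1": [0.0, 8.0, 8.0, 1707.0],
        "Qh_STA": [8.0, 6.0, 3.0, 5.0],
        "Qh_FID2": [17.0, 17.0, 17.0, np.nan],
    },
    index=[14, 15, 13, 17],
)
A = 0.8

# Running totals per row: Qh_STA, Qh_STA + s1, Qh_STA + s1 + s2.
cum = df[["Qh_STA", "s1", "s2"]].cumsum(axis=1)

# Divide every running total by Qh_FID2, aligned on the row index.
ratios = cum.div(df["Qh_FID2"], axis=0)

# 1 only if all three ratios are below A; comparisons with NaN are False.
df["DO"] = (ratios < A).all(axis=1).astype(int)
print(df["DO"].tolist())
```

Row 17 ends up as 0 because its Qh_FID2 is NaN, so every ratio is NaN and every comparison is False.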
Answer 1: (score: 1)
Perhaps with np.where:
import numpy as np
condition = (df['Qh_STA'] / df['Qh_FID2'] < A) | ((df['Qh_STA'] + df['s1']) / df['Qh_FID2'] < A) | ((df['Qh_STA'] + df['s1'] + df['s2']) / df['Qh_FID2'] < A)
df['DO'] = np.where(condition, 1, 0)
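A quick way to see what an OR-combined np.where does is to run it next to the row-wise function from the question. The sketch below assumes each sum is meant to be divided by Qh_FID2 as a whole, as in the question's function, and uses made-up names (`row_wise`, `vectorized`) plus the question's sample rows.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, restricted to the columns used in the checks.
df = pd.DataFrame(
    {
        "s2": [0.0, 0.0, 6.0, 29445425.0],
        "s1": [0.0, 8.0, 8.0, 1707.0],
        "Qh_STA": [8.0, 6.0, 3.0, 5.0],
        "Qh_FID2": [17.0, 17.0, 17.0, np.nan],
    },
    index=[14, 15, 13, 17],
)
A = 0.8

def seekDO(row):
    # Returns 1 as soon as any one of the three checks passes.
    if row["Qh_STA"] / row["Qh_FID2"] < A:
        return 1
    if (row["Qh_STA"] + row["s1"]) / row["Qh_FID2"] < A:
        return 1
    if (row["Qh_STA"] + row["s1"] + row["s2"]) / row["Qh_FID2"] < A:
        return 1
    return 0

row_wise = df.apply(seekDO, axis=1)

# Same three checks combined with | (OR), evaluated on whole columns.
condition = (
    (df["Qh_STA"] / df["Qh_FID2"] < A)
    | ((df["Qh_STA"] + df["s1"]) / df["Qh_FID2"] < A)
    | ((df["Qh_STA"] + df["s1"] + df["s2"]) / df["Qh_FID2"] < A)
)
vectorized = pd.Series(np.where(condition, 1, 0), index=df.index)

print(row_wise.tolist())
print(vectorized.tolist())
```

Both lists come out identical, so OR reproduces what the function already returns; getting the 1, 0, 0, 0 the question expects requires & (AND) instead.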
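Answer 2: (score: 1)
But you should indeed get
DO
14 1
15 1
13 1
17 0
Look at your values again:
8 / 17 ≈ 0.47, which is < 0.8
6 / 17 ≈ 0.35, which is < 0.8
3 / 17 ≈ 0.18, which is < 0.8
The first check, Qh_STA / Qh_FID2 < A, already holds for rows 14, 15 and 13, so the function returns 1 there before the other checks are even evaluated.
The output is correct; it is your expected output that is wrong.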
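To make this visible for every row, one can tabulate the three checks side by side. This is only an illustrative sketch on the question's sample rows; the column names `c1`, `c2`, `c3`, `any` and `all` are made up here.

```python
import numpy as np
import pandas as pd

# Sample rows from the question, restricted to the columns used in the checks.
df = pd.DataFrame(
    {
        "s2": [0.0, 0.0, 6.0, 29445425.0],
        "s1": [0.0, 8.0, 8.0, 1707.0],
        "Qh_STA": [8.0, 6.0, 3.0, 5.0],
        "Qh_FID2": [17.0, 17.0, 17.0, np.nan],
    },
    index=[14, 15, 13, 17],
)
A = 0.8

# One boolean column per check in the question's function.
checks = pd.DataFrame(
    {
        "c1": df["Qh_STA"] / df["Qh_FID2"] < A,
        "c2": (df["Qh_STA"] + df["s1"]) / df["Qh_FID2"] < A,
        "c3": (df["Qh_STA"] + df["s1"] + df["s2"]) / df["Qh_FID2"] < A,
    }
)

# "any" mirrors the early returns in the function; "all" requires every check.
checks["any"] = checks[["c1", "c2", "c3"]].any(axis=1).astype(int)
checks["all"] = checks[["c1", "c2", "c3"]].all(axis=1).astype(int)
print(checks)
```

The `any` column matches the DO values the function produces, while the `all` column matches the values the question expected.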