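Convert a pandas DataFrame column based on a condition

Time: 2018-07-16 12:37:04

Tags: python pandas

I have a pandas column whose values range from 0.0 to 1.0.

I want to convert this column to a binary column (0 or 1) based on a threshold: a value <= threshold becomes 0, and any other value becomes 1.

3 answers:

Answer 0: (score: 4)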

Create a boolean mask by comparing the column against the threshold with gt(), then convert the mask to integers.
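A minimal sketch of the mask-then-convert approach this answer describes. The column name "col", the sample values, and the threshold of 0.5 are assumptions made for illustration, not taken from the original answer:

```python
import pandas as pd

# Hypothetical data: the column name "col" and the threshold 0.5 are assumptions.
df = pd.DataFrame({"col": [0.0, 0.25, 0.5, 0.75, 1.0]})
threshold = 0.5

# gt() returns a boolean mask that is True where the value is > threshold.
mask = df["col"].gt(threshold)

# Casting the mask to int turns False into 0 and True into 1.
df["col"] = mask.astype(int)

print(df["col"].tolist())  # [0, 0, 0, 1, 1]
```

Because the comparison is strictly greater-than, a value exactly equal to the threshold maps to 0, which matches the rule stated in the question.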

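Answer 1: (score: 2)

# Values > threshold become 1; values <= threshold become 0.
# The astype(int) result must be assigned back, otherwise it is discarded.
df["column"] = (df["column"] > threshold).astype(int)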

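Answer 2: (score: -1)

I would create a helper column, then iterate over the rows and set the value for each cell. Like this:

import numpy as np
import pandas as pd

a = np.random.random_sample(5)  # five random floats in [0.0, 1.0)
df = pd.DataFrame({"A": a})
df["Helper"] = 0  # helper column, initialised with integers
for i in range(len(df)):
    # Values <= 0.5 map to 0; everything else maps to 1.
    if df.loc[i, "A"] <= 0.5:
        df.loc[i, "Helper"] = 0
    else:
        df.loc[i, "Helper"] = 1

This produces output like the following (the values in A are random, so they differ on each run):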

          A  Helper
0  0.114089       0
1  0.309759       0
2  0.158169       0
3  0.444199       0
4  0.645443       1
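For comparison, the same Helper column can be built without an explicit loop. This is a minimal sketch that swaps the row-by-row loop for NumPy's np.where; the fixed values are copied from the output above so the result is reproducible, and the 0.5 threshold is the one used in the loop:

```python
import numpy as np
import pandas as pd

# Fixed values (copied from the sample output) instead of random ones.
df = pd.DataFrame({"A": [0.114089, 0.309759, 0.158169, 0.444199, 0.645443]})

# np.where picks 0 where A <= 0.5 and 1 everywhere else, for the whole column at once.
df["Helper"] = np.where(df["A"] <= 0.5, 0, 1)

print(df["Helper"].tolist())  # [0, 0, 0, 0, 1]
```

On large frames a vectorized expression like this is usually much faster than assigning cell by cell with df.loc.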