Find specific field values (numbers) in a CSV and convert them to text values

Time: 2018-10-26 13:57:37

Tags: python pandas csv dataframe

My CSV file has the following format:

sidebars,notes,riskOthers,seriousEvents,goodCatches,harms
,SAFE; 2 moveouts; 0 discharges; ED patient awaiting bed in MAT,0,0,0,0
,Staffing,0,0,0,0
,,1,0,0,0
,,0,0,0,0
,,0,0,0,0
,Staffing needs,0,0,0,0
,Safe,1,0,0,0
,1- 1-1/ Staffing @ 3p- 7a,0,0,0,0
SB- Central Stores,,2,0,0,0
SB - ED Dr. G,,0,0,0,0
,,0,0,0,0
,1 pt in restraints,0,0,0,0
,1 Pt in Restraints,0,0,0,0
SB- Pharmacy,@ Risk - Staffing/ Security with Pt who had drug paraphernalia/ 1-1-1,1,0,0,0

I want to select the values greater than 1 in the last four columns and replace them with 1. Here is the code I tried, but it fails.

import pandas as pd

data = pd.read_csv('reordered.csv')
df = pd.DataFrame(data, columns = ['sidebars','notes','riskOthers','seriousEvents', 'goodCatches', 'harms'])

# Values to find and their replacements
findL = ['3', '2', '4', '5', '6']
replaceL = ['1', '1', '1', '1', '1']

# Select column (can be A,B,C,D)
col = 'riskOthers';

# Find and replace values in the selected column
df[col] = df[col].replace(findL, replaceL)

In this code I am trying to replace every value greater than 1 with 1, but I get a type mismatch error.
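A likely cause of the mismatch, offered as an assumption rather than a confirmed diagnosis: pd.read_csv parses all-digit columns as integers, while findL and replaceL hold strings, so the comparison is between int and str. The sketch below uses three rows taken from the sample above, loaded from an in-memory string instead of reordered.csv, and passes integers to replace:

```python
import io

import pandas as pd

# Three rows copied from the sample CSV; an in-memory stand-in for reordered.csv.
csv_text = """sidebars,notes,riskOthers,seriousEvents,goodCatches,harms
,Staffing,0,0,0,0
,,1,0,0,0
SB- Central Stores,,2,0,0,0
"""

df = pd.read_csv(io.StringIO(csv_text))

# The count columns are inferred as an integer dtype, not as strings.
print(df["riskOthers"].dtype)

# Use integer targets so they are compared with integer cells.
fixed = df["riskOthers"].replace([2, 3, 4, 5, 6], 1)
print(fixed.tolist())
```

Listing every possible value still breaks down for counts above 6, which is why a comparison-based approach is usually simpler.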

2 Answers:

Answer 0: (score: 1)

Here is a vectorized approach using pd.DataFrame.mask:

values = df.iloc[:, -4:]
df.iloc[:, -4:] = values.mask(values > 1, 1)

print(df.iloc[:, -4:])

    riskOthers  seriousEvents  goodCatches  harms
0            0              0            0    0.0
1            0              0            0    0.0
2            1              0            0    0.0
3            0              0            0    0.0
4            0              0            0    0.0
5            0              0            0    0.0
6            1              0            0    0.0
7            0              0            0    0.0
8            1              0            0    0.0
9            0              0            0    0.0
10           0              0            0    0.0
11           0              0            0    0.0
12           0              0            0    0.0
13           1              0            0    NaN
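For anyone who wants to try the mask approach without the original file, here is a minimal self-contained sketch. It assumes a few rows from the question's sample, loaded from a string, and compares the result with DataFrame.clip(upper=1), an alternative that is not part of this answer but should give the same values for this data:

```python
import io

import pandas as pd

# A few rows copied from the question's sample; stand-in for reordered.csv.
csv_text = """sidebars,notes,riskOthers,seriousEvents,goodCatches,harms
,Staffing,0,0,0,0
,Safe,1,0,0,0
SB- Central Stores,,2,0,0,0
SB - ED Dr. G,,0,0,0,0
"""

df = pd.read_csv(io.StringIO(csv_text))
cols = df.columns[-4:]  # the last four columns, selected by label

# mask: wherever the condition is True, substitute 1; otherwise keep the value.
masked = df[cols].mask(df[cols] > 1, 1)

# clip: cap every value at an upper bound of 1 (alternative, same effect here).
clipped = df[cols].clip(upper=1)

df[cols] = masked
print(df[cols])
```

Both calls leave missing values as NaN, since NaN > 1 evaluates to False and clip does not touch NaN.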

Answer 1: (score: 0)

Try mapping df[col] with a lambda function. For example:

df[col] = df[col].map(lambda x: 1 if x > 1 else x)
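A small sketch of the element-wise idea on a standalone Series. The values here are made up for illustration and are not from the question's file. Note that map returns a new Series, so the result has to be assigned back, and that the fallback branch should return x itself if values of 0 and 1 are meant to stay unchanged:

```python
import pandas as pd

# Hypothetical counts, for illustration only.
s = pd.Series([0, 1, 2, 0, 5], name="riskOthers")

# Cap anything above 1 at 1; leave every other value as it is.
capped = s.map(lambda x: 1 if x > 1 else x)

print(capped.tolist())
```

On large frames this is typically slower than the vectorized mask approach, because the lambda runs once per element in Python.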