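I need the value in the first column, first row (iloc[0]) of df1, and I want to set every matching value in df2 to 0.
I am trying something like this, but I know the syntax is off: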
df2[df1['ID'].iloc[0]] = 0
DF1:
ID Name X
6539 CM 20
9999 FM 30
DF2:
Out 1 Out 2 Out 3 Out 4 Out 5
7000 8000 6539 6539 6539
So the output would be:
Out 1 Out 2 Out 3 Out 4 Out 5
7000 8000 0 0 0
Answer 0 (score: 2)
I think you need to change the values with a boolean mask:
df = df2.mask(df2 == df1.iloc[0,0], 0)
Or:
df2[df2 == df1.iloc[0, 0]] = 0
Or:
df = pd.DataFrame(np.where(df2 == df1.iloc[0,0], 0, df2),index=df2.index,columns=df2.columns)
print (df)
Out 1 Out 2 Out 3 Out 4 Out 5
0 7000 8000 0 0 0
Detail:
print (df2 == df1.iloc[0,0])
Out 1 Out 2 Out 3 Out 4 Out 5
0 False False True True True
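As a minimal sketch of the mask approach on the question's own data: the names `target`, `matches`, and `result` are illustrative and do not come from the original posts. For comparison, the question's attempt `df2[df1['ID'].iloc[0]] = 0` uses 6539 as a column label, so it would add a new column named 6539 rather than change the matching cells.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({"ID": [6539, 9999], "Name": ["CM", "FM"], "X": [20, 30]})
df2 = pd.DataFrame(
    [[7000, 8000, 6539, 6539, 6539]],
    columns=["Out 1", "Out 2", "Out 3", "Out 4", "Out 5"],
)

# Illustrative names (not from the original posts).
target = df1.iloc[0, 0]   # first row, first column of df1
matches = df2 == target   # boolean DataFrame with the same shape as df2

# mask() returns a new DataFrame; df2 itself is left unchanged.
result = df2.mask(matches, 0)
print(result)
```

If the change should happen in df2 itself, the `df2[df2 == df1.iloc[0, 0]] = 0` form shown above does that instead.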
Timings:
np.random.seed(100)
df1 = pd.DataFrame({
'a': [1,2],
'b': [3,4]
})
df2 = pd.DataFrame(np.random.randint(10, size=(1000,1000)))
In [106]: %timeit df2.replace(df1.iloc[0,0],0, inplace=True)
2.99 ms ± 324 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [107]: %timeit df2.mask(df2 == df1.iloc[0,0], 0)
22.8 ms ± 1.71 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [108]: %timeit df2[df2 == df1.iloc[0, 0]] = 0
19.6 ms ± 497 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [109]: %timeit df = pd.DataFrame(np.where(df2 == df1.iloc[0,0], 0, df2),index=df2.index,columns=df2.columns)
5.81 ms ± 91.7 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
With only a few columns and many rows:
np.random.seed(100)
df1 = pd.DataFrame({
'a': [1,2],
'b': [3,4]
})
df2 = pd.DataFrame(np.random.randint(5, size=(1000,10)))
In [116]: %timeit df2.replace(df1.iloc[0,0],0, inplace=True)
856 µs ± 12.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [117]: %timeit df2.mask(df2 == df1.iloc[0,0], 0)
1.23 ms ± 25.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [118]: %timeit df2[df2 == df1.iloc[0, 0]] = 0
1.21 ms ± 4.26 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [119]: %timeit df = pd.DataFrame(np.where(df2 == df1.iloc[0,0], 0, df2),index=df2.index,columns=df2.columns)
445 µs ± 13.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
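The timings only matter if the four approaches produce the same result, so here is a small sketch that checks this on the second benchmark's data. The `via_*` names and `target` are illustrative. Each approach starts from an untouched df2, because the in-place statements (`replace(..., inplace=True)` and the boolean assignment) modify df2 on their first run, which may mean the figures above are not strictly like-for-like.

```python
import numpy as np
import pandas as pd

np.random.seed(100)
df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame(np.random.randint(5, size=(1000, 10)))
target = df1.iloc[0, 0]  # illustrative name for the value to replace

# Non-mutating variants: each returns a new object and leaves df2 alone.
via_replace = df2.replace(target, 0)
via_mask = df2.mask(df2 == target, 0)
via_where = pd.DataFrame(
    np.where(df2 == target, 0, df2), index=df2.index, columns=df2.columns
)

# The boolean assignment mutates its frame, so run it on a copy.
via_setitem = df2.copy()
via_setitem[via_setitem == target] = 0

# Compare the raw values of every variant against the replace() result.
for other in (via_mask, via_where, via_setitem):
    assert (via_replace.to_numpy() == other.to_numpy()).all()
```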
Answer 1 (score: 2)
I think you need:
df2.replace(df1.iloc[0,0], 0)
Full example:
import pandas as pd
df1 = pd.DataFrame({
'a': [1,2],
'b': [3,4]
})
df2 = pd.DataFrame({
'a': [1,1],
'b': [1,1]
})
df2.replace(df1.iloc[0,0],0)
Returns:
a b
0 0 0
1 0 0
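One detail worth spelling out: without `inplace=True`, `replace` returns a new DataFrame and leaves df2 as it was, so the result has to be assigned to keep it. A minimal sketch, where the name `replaced` is illustrative:

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df2 = pd.DataFrame({"a": [1, 1], "b": [1, 1]})

# replace() builds a new DataFrame; df2 still holds its original values here.
replaced = df2.replace(df1.iloc[0, 0], 0)
print(df2.to_numpy().tolist())
print(replaced.to_numpy().tolist())

# Assign the result back to keep the change under the name df2.
df2 = df2.replace(df1.iloc[0, 0], 0)
```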
df1 = pd.concat([df1]*10000)
df2 = pd.concat([df2]*10000)
%timeit df2.replace(df1.iloc[0,0],0, inplace=True)
%timeit df2[df2 == df1.iloc[0, 0]] = 0
Returns:
341 µs ± 9.36 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.24 ms ± 39 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)