I am trying to replace the values in one column with those from another column wherever the first column equals a given string. The string is "wo": wherever it appears in column y, it should be replaced with the value from column x in the same row. At the moment I use the following code:
df.y.replace("wo",df.x)
This takes a very long time (millions of observations, which amounts to several days of computation).
Is there a more efficient way to do the replacement?
In case it helps, the data looks like this:
y     x     other variables
1     mo    something
2     2     something
3     3     something
wo    >5    something
4     4     something
wo    7     something
Answer 0 (score: 5)
Try this:
df.loc[(df.y == 'wo'), 'y'] = df.x
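This first selects only the rows where df.y == 'wo', then assigns the values of column x to column 'y' for those rows. The right-hand side is aligned on the index, so each selected row receives the x value from the same row.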
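As a minimal sketch of the assignment above, the following builds a small frame modelled on the table in the question and applies the boolean mask. The column name `other` and the choice to store every value as a string are assumptions made for illustration; they are not taken from the asker's real data.

```python
import pandas as pd

# Hypothetical sample data modelled on the question's table.
df = pd.DataFrame({
    "y": ["1", "2", "3", "wo", "4", "wo"],
    "x": ["mo", "2", "3", ">5", "4", "7"],
    "other": ["something"] * 6,
})

# Boolean mask: True only for the rows whose y value is the string "wo".
mask = df["y"] == "wo"

# Assign x to y on the masked rows; pandas aligns df["x"] on the index,
# so each selected row takes the x value from that same row.
df.loc[mask, "y"] = df["x"]

print(df)
```

Rows that do not match the mask keep their original y value, and column x is left unchanged.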
Timings:
In [304]: %timeit df.y.replace("wo",df.x)
100 loops, best of 3: 13.9 ms per loop
In [305]: %timeit df.loc[(df.y == 'wo'), 'y'] = df.x
100 loops, best of 3: 3.31 ms per loop
In [306]: %timeit df.ix[(df.y == 'wo'), 'y'] = df.x
100 loops, best of 3: 3.31 ms per loop
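The figures above come from IPython's %timeit. Outside IPython, a comparable measurement can be sketched with the standard timeit module. Everything here is an assumption for illustration: the synthetic data from `make_frame`, the helper names, and the use of `Series.where` as a second variant (an alternative added for comparison, not part of the original answer). The `.ix` variant is left out because `.ix` was removed in pandas 1.0.

```python
import timeit

import numpy as np
import pandas as pd


def make_frame(n=100_000, seed=0):
    """Build a hypothetical frame in which roughly one y value in five is 'wo'."""
    rng = np.random.default_rng(seed)
    y = rng.choice(["1", "2", "3", "4", "wo"], size=n)
    x = rng.integers(0, 10, size=n).astype(str)
    return pd.DataFrame({"y": y, "x": x})


def via_loc(df):
    # The answer's approach, applied to a copy so repeated timing runs
    # always start from unmodified data.
    out = df.copy()
    out.loc[out["y"] == "wo", "y"] = out["x"]
    return out["y"]


def via_where(df):
    # Alternative: keep y where it is not "wo", otherwise take x.
    return df["y"].where(df["y"] != "wo", df["x"])


df = make_frame()

# Both variants should produce the same column.
assert via_loc(df).equals(via_where(df))

for fn in (via_loc, via_where):
    # Best of 3 repeats, 10 calls per repeat, reported per call.
    seconds = min(timeit.repeat(lambda: fn(df), number=10, repeat=3)) / 10
    print(f"{fn.__name__}: {seconds * 1e3:.2f} ms per call")
```

Absolute numbers depend on the machine, the pandas version, and the column dtypes, so they are only meaningful relative to each other on the same data.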
UPDATE: starting with Pandas 0.20.1, the .ix indexer is deprecated in favor of the more strict .iloc and .loc indexers.
Answer 1 (score: 2)