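Setting new values for a group of columns in Pandas

Time: 2019-05-10 00:27:37

Tags: python pandas data-science

I have a problem that should be solvable, but my solution is far too slow (it takes several hours):

I have a dataset in which the values of some columns need to be changed based on a condition.

I wrote this code: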

for i, row in df.iterrows():
    if row.specific_column_value == 1:
        continue

    col_names = ['A1', 'A2', ..., 'An']
    new_values = [1, 2, 3, 4, .., n]

    for j, col in enumerate(col_names):
        df.loc[i, col] = new_values[j]

This is very slow.

How can I speed it up?

2 Answers:

Answer 0: (score: 1)

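You can put the condition inside .loc to select the rows, and assign the new values to all of the columns in one statement. The loop in the question skips rows where specific_column_value is 1, so the condition selects the remaining rows:

df.loc[df.specific_column_value != 1, col_names] = new_values

Update

If you keep the loop, assign all of the columns in a single .loc call per row instead of one call per column:

for i, row in df.iterrows():
    if row.specific_column_value != 1:
        col_names = ['A1', 'A2', ..., 'An']
        new_values = [1, 2, 3, 4, .., n]
        df.loc[i, col_names] = new_values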
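
For reference, here is a minimal runnable sketch of the masked .loc assignment from this answer. The sample frame, its three A columns and the replacement values are made up for illustration; only the column name specific_column_value comes from the question. Following the loop in the question, rows whose specific_column_value is 1 are left untouched.

```python
import pandas as pd

# Hypothetical sample data; only the name specific_column_value is from the question.
df = pd.DataFrame({
    "specific_column_value": [1, 0, 2, 1],
    "A1": [10, 10, 10, 10],
    "A2": [20, 20, 20, 20],
    "A3": [30, 30, 30, 30],
})

col_names = ["A1", "A2", "A3"]   # assumed column names
new_values = [1, 2, 3]           # assumed values, one per column

# Rows that the loop in the question would NOT skip.
mask = df["specific_column_value"] != 1

# One vectorized assignment: every selected row receives new_values across col_names.
df.loc[mask, col_names] = new_values

print(df)
```

The list on the right-hand side has one entry per selected column, so pandas applies it to each selected row. No Python-level loop over rows is involved.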

Answer 1: (score: 0)

If you have a limited number of columns (n), you may be able to reduce the search
operation to O(n) instead of the O(m x n) complexity of your current approach:

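inx_collection = set()
value_looking_for = 1
col_values = [1, 2, 3, 4, .., n]  # one new value per column
for col in df.columns:
    # Index labels of the rows where this column holds the value
    inx = df.index[df[col] == value_looking_for]
    inx_collection.update(inx)  # This set collects all indices containing the value
# Pass a list: recent pandas versions reject a set as an indexer
df.loc[list(inx_collection), :] = col_values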
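
The per-column loop above can probably be collapsed into a single boolean mask. The sketch below is an alternative technique, not the code from this answer: it uses DataFrame.eq with any(axis=1) to mark the rows that contain the value in at least one column. The sample frame and the replacement values are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "A1": [1, 5, 7],
    "A2": [4, 1, 8],
    "A3": [6, 9, 3],
})

value_looking_for = 1
col_values = [10, 20, 30]  # assumed values, one per column

# True for each row that holds the value in at least one column.
row_mask = df.eq(value_looking_for).any(axis=1)

# Overwrite every column of the matching rows in one assignment.
df.loc[row_mask, :] = col_values

print(df)
```

This avoids building a set of index labels, and it also behaves predictably when the index contains duplicate labels, since the mask is positional per row.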