Add a new column to a pandas DataFrame whose values depend on the previous row

Time: 2016-11-24 11:55:56

Tags: python pandas dataframe

I have an existing pandas DataFrame, and I want to add a new column in which each row's value depends on the previous row. For example:

df1 = pd.DataFrame(np.random.randint(10, size=(4, 4)), columns=['a', 'b', 'c', 'd'])

df1
Out[31]: 
   a  b  c  d
0  9  3  3  0
1  3  9  5  1
2  1  7  5  6
3  8  0  1  7

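Now I want to create a column e where, for each row i, the value is the difference between d in that row and d in the previous row: df1['e'][i] = df1['d'][i] - df1['d'][i-1]

Expected output: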

df1:
   a  b  c  d  e
0  9  3  3  0  0
1  3  9  5  1  1
2  1  7  5  6  5
3  8  0  1  7  1

How can I do this?

1 answer:

Answer 0: (score: 1)

You can use sub with shift:

df['e'] = df.d.sub(df.d.shift(), fill_value=0)
print (df)
   a  b  c  d    e
0  9  3  3  0  0.0
1  3  9  5  1  1.0
2  1  7  5  6  5.0
3  8  0  1  7  1.0

If you need the result as int, convert the new column to int.
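
The int conversion mentioned in the answer is not shown, so here is a minimal sketch of one way to do it, assuming a cast with `astype(int)` is acceptable. The frame is rebuilt from the values printed in the question rather than from random data, and the variable name `alt` is only illustrative. The sketch also computes the same column with `Series.diff()` for comparison; that is a swapped-in alternative, not the method the answer proposes.

```python
import pandas as pd

# Sample data copied from the table printed in the question.
df = pd.DataFrame({'a': [9, 3, 1, 8],
                   'b': [3, 9, 7, 0],
                   'c': [3, 5, 5, 1],
                   'd': [0, 1, 6, 7]})

# shift() moves d down by one row and leaves NaN in row 0.
# fill_value=0 makes sub() treat that NaN as 0, so row 0 becomes d[0] - 0.
# The intermediate result is float, hence the cast to int at the end.
df['e'] = df['d'].sub(df['d'].shift(), fill_value=0).astype(int)

# Alternative (an assumption, not from the answer): diff() computes
# d[i] - d[i-1] directly; row 0 is NaN, which is replaced with 0 here.
alt = df['d'].diff().fillna(0).astype(int)

print(df)
```

Note that the two approaches only agree on the first row because d starts at 0 in this data: `sub(..., fill_value=0)` yields d[0] in row 0, whereas `diff().fillna(0)` always yields 0 there.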