Question

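I have a pandas DataFrame in Python. It has 7 columns of raw data that are all updated together at regular intervals. Each time new data is appended to the bottom of columns 1-7, I need to fill in the values of the other 84 columns for the new rows. I want to do this without recomputing every value in those 84 columns, because they contain millions of rows.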

Answer 1

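After running the calculation once on the main DataFrame, try running it separately on just the new data, then concatenate the two at the end (provided both frames have the same columns before concatenation).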

import pandas as pd

columns = ['c1','c2','c3','c4','c5','c6','c7']

main = pd.read_csv('file.csv', names=columns)
# ... do your calculation

new = pd.read_csv('new_file.csv', names=columns)
# ... do your calculation

# named `combined` rather than `all` to avoid shadowing the built-in all()
combined = pd.concat([main, new])

# if you need to reset the index, use the following line instead:
# combined = pd.concat([main, new], ignore_index=True)
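
To make the "do your calculation" step concrete, here is a minimal sketch of the same idea using small in-memory frames instead of CSV files. The helper `add_derived` and the derived columns `d1` and `d2` are hypothetical stand-ins for the 84 computed columns, and the sketch assumes each derived value depends only on the raw values in its own row. If a calculation looks at earlier rows (a rolling mean or cumulative sum, for example), the new batch alone would not be enough.

```python
import pandas as pd

RAW_COLUMNS = ['c1', 'c2', 'c3', 'c4', 'c5', 'c6', 'c7']


def add_derived(df):
    """Return a copy of df with the derived columns filled in.

    Hypothetical calculation: stands in for the 84 derived columns and
    assumes every derived value depends only on its own row.
    """
    out = df.copy()
    out['d1'] = out['c1'] + out['c2']
    out['d2'] = out['c3'] * out['c4']
    return out


# One-time calculation on the existing data.
main = add_derived(pd.DataFrame([[1, 2, 3, 4, 5, 6, 7]], columns=RAW_COLUMNS))

# Later: a new batch of raw rows arrives; compute only for those rows.
new_raw = pd.DataFrame([[10, 20, 30, 40, 50, 60, 70]], columns=RAW_COLUMNS)
new = add_derived(new_raw)

# Append the freshly computed rows without touching the old ones.
combined = pd.concat([main, new], ignore_index=True)
```

Keeping the calculation in one function means the initial load and every later batch go through the same code path, so the old and new rows always end up with the same columns.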
