Question

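I want to compute the rate of change between the numeric columns of two dataframes, based on a common unique row identifier and unique row-column combinations.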

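Here is an example. I chose to show the tables as images so that I could use color to highlight the particularities of the two datasets: each dataframe contains both numeric and non-numeric columns, and the order of the rows and columns may differ. In addition, the numeric columns on which the calculation should be performed are always those after the "Time" column.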

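The df.divide() method does not work here because the rows and columns are in a different order. I have also seen the top answer in this thread, but that approach does not generalize.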

Answer 1

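If your problem is essentially that the columns and rows are not in the same order, you can solve it by reordering the columns and rows so that they match.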

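# Identify the columns for which the difference is to be computed.
# Since 'Time' is the 4th column, take all columns after it.
valCols = list(df1.columns)[4:]

# Sort both datasets by the shared identifier so that the rows align.
# Resetting the index makes later element-wise operations match by position
# (this assumes both dataframes contain the same set of IDs).
df1 = df1.sort_values('ID').reset_index(drop=True)
df2 = df2.sort_values('ID').reset_index(drop=True)

# Keep only the value columns. Selecting with the same list also ensures
# that the columns are in the same order in both dataframes.
df1 = df1[valCols]
df2 = df2[valCols]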
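
The same idea can also be expressed through label alignment instead of sorting. The following is only a minimal sketch under stated assumptions: the sample data, the column names other than 'ID' and 'Time', and the helper value_columns are all hypothetical, and the rate of change is assumed to mean (new - old) / old, which the question does not define.

```python
import pandas as pd

# Hypothetical sample data: only 'ID' and 'Time' come from the question.
df1 = pd.DataFrame({
    "ID": ["a", "b", "c"],
    "Name": ["x", "y", "z"],
    "Group": ["g1", "g1", "g2"],
    "Time": ["t0", "t0", "t0"],
    "v1": [10.0, 20.0, 40.0],
    "v2": [1.0, 2.0, 4.0],
})
# Same IDs and value columns, but with rows and columns in a different order.
df2 = pd.DataFrame({
    "ID": ["c", "a", "b"],
    "Group": ["g2", "g1", "g1"],
    "Name": ["z", "x", "y"],
    "Time": ["t1", "t1", "t1"],
    "v2": [6.0, 1.5, 2.0],
    "v1": [50.0, 15.0, 20.0],
})


def value_columns(df, marker="Time"):
    """Return the names of all columns located after the marker column."""
    start = df.columns.get_loc(marker) + 1
    return list(df.columns[start:])


val_cols = value_columns(df1)

# Index both frames by the shared identifier. Arithmetic between dataframes
# aligns on row and column labels, so the original ordering does not matter.
old = df1.set_index("ID")[val_cols]
new = df2.set_index("ID")[val_cols]

# Assumed definition of the rate of change: (new - old) / old.
rate = (new - old) / old
print(rate)
```

Looking up the position of 'Time' by name avoids hard-coding the index 4, and using 'ID' as the index means that an ID present in only one dataframe shows up as NaN in the result rather than being silently paired with the wrong row.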

Python / Pandas: divide numeric columns from different dataframes based on a common row identifier and unique row-column combinations
