Question

I am working with tables in Pandas.

[table1]
  sample1 sample2 sample3
A 11      22      33
B 1       2       3

[table2]
  sample3 sample4 sample2
D 333     444     222

[Result]
  sample1 sample2 sample3
A 11      22      33
B 1       2       3
D NaN     222     333

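I have two tables, and I want to append row D (from table2) to table1 while matching on column names. If a column of table1 also exists in table2, the corresponding D value should be added to table1, as with sample2 and sample3. If a column of table1 does not exist in table2, as with sample1, the value for D should be set to NaN or ignored.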

Is there a simple way to do this in Pandas?

Answer 1

I think you can use concat and then remove the column sample4 with drop:

print(pd.concat([table1, table2]).drop('sample4', axis=1))
   sample1  sample2  sample3
A     11.0       22       33
B      1.0        2        3
D      NaN      222      333

Alternatively, you can use intersection to select the columns of table2 that also exist in table1, and then concatenate only that subset:

print(table2.columns.intersection(table1.columns))
Index(['sample2', 'sample3'], dtype='object')

print(pd.concat([table1, table2[table2.columns.intersection(table1.columns)]]))
   sample1  sample2  sample3
A     11.0       22       33
B      1.0        2        3
D      NaN      222      333

Then, if you need to remove rows that contain NaN, use dropna.
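The dropna step has no code of its own in this answer, so here is a minimal sketch of what it might look like. It assumes the two frames from the question; the variable names `shared`, `combined`, `rows_dropped` and `cols_dropped` are only illustrative. Note that with this data a plain dropna() removes row D itself, because D has no value for sample1.

```python
import pandas as pd

# Frames rebuilt from the question's tables.
table1 = pd.DataFrame([[11, 22, 33], [1, 2, 3]], index=list('AB'),
                      columns=['sample1', 'sample2', 'sample3'])
table2 = pd.DataFrame([[333, 444, 222]], index=['D'],
                      columns=['sample3', 'sample4', 'sample2'])

# Keep only the columns of table2 that table1 also has, then append.
shared = table2.columns.intersection(table1.columns)
combined = pd.concat([table1, table2[shared]])

# Default dropna() drops every row containing at least one NaN (row D here).
rows_dropped = combined.dropna()

# dropna(axis=1) drops every column containing a NaN instead (sample1 here).
cols_dropped = combined.dropna(axis=1)

print(rows_dropped)
print(cols_dropped)
```

Which of the two you want depends on whether the incomplete row or the incomplete column is the thing to discard.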

Answer 2

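You can generalize jezrael's answer by first selecting from table2 only the columns that are present in table1. This is done with numpy.in1d. It also avoids building a potentially huge temporary DataFrame that holds the columns of both DataFrames. For example: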

import numpy as np
import pandas as pd

table1 = pd.DataFrame([[11, 22, 33], [1, 2, 3]], index=list('AB'), columns=['sample1', 'sample2', 'sample3'])
table2 = pd.DataFrame([[333, 444, 222]], index=['D'], columns=['sample3', 'sample4', 'sample2'])

# Sub-select columns...
cols_in_table1 = table2.columns[np.in1d(table2.columns, table1.columns)]

# ... and concatenate.
results = pd.concat((table1, table2[cols_in_table1]))

print(results)

Which prints:

   sample1  sample2  sample3
A     11.0       22       33
B      1.0        2        3
D      NaN      222      333
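As a possible variation on the same idea, the column selection can also be written with pandas alone. This is only a sketch, not part of the original answer: it uses `Index.isin` in place of numpy.in1d (recent NumPy releases steer users toward `np.isin` instead of `in1d`), and it also shows `DataFrame.reindex`, which aligns table2 to table1's columns in one step. The names `cols`, `result_isin`, `aligned` and `result_reindex` are illustrative.

```python
import pandas as pd

table1 = pd.DataFrame([[11, 22, 33], [1, 2, 3]], index=list('AB'),
                      columns=['sample1', 'sample2', 'sample3'])
table2 = pd.DataFrame([[333, 444, 222]], index=['D'],
                      columns=['sample3', 'sample4', 'sample2'])

# Variant 1: boolean mask over table2's columns using Index.isin.
cols = table2.columns[table2.columns.isin(table1.columns)]
result_isin = pd.concat([table1, table2[cols]])

# Variant 2: reindex table2 to table1's columns. Columns that table2
# lacks (sample1) are filled with NaN; extra ones (sample4) are dropped.
aligned = table2.reindex(columns=table1.columns)
result_reindex = pd.concat([table1, aligned])

print(result_isin)
print(result_reindex)
```

Both variants should give the same three-row frame as above, with NaN for D under sample1.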

Adding attributes from different tables using Pandas
