Question

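I want to add new rows to a pandas DataFrame without regard to the order or the number of columns in each new row.

After adding the new rows, I want the DataFrame to look like this. Each row can have a different number of columns.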

---- | 1    | 2    | 3    | 4 
row1 | data | data | 
row2 | data | data | data 
row3 | data | 
row4 | data | data | data | data

Answer 1

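Building a pandas DataFrame one row at a time is usually slow. One solution is to collect the data in a dictionary first, then convert it to a DataFrame for further processing: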

import pandas as pd

d = {
    'att1': ['a', 'b'],
    'att2': ['c', 'd', 'e'],
    'att3': ['f'],
    'att4': ['g', 'h', 'i', 'j'],
}
df = pd.DataFrame.from_dict(d, orient='index')

where df contains the following:

        0    1    2    3
att1    a    b    None None
att2    c    d    e    None
att3    f    None None None
att4    g    h    i    j

Or, closer to the typical pandas format, store the data in one long Series, where "att1" serves as the index for the values "a" and "b", and so on:

series = df.stack().reset_index(level=1, drop=True)

Individual attributes can then be selected easily:

series.loc[['att1', 'att3']]

which returns:

att1    a
att1    b
att3    f
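The steps in this answer can be put together as one runnable sketch. The row labels and values below are hypothetical sample data, and the explicit dropna() is an assumption added here: whether stack() discards missing cells on its own depends on the pandas version, so dropping them explicitly keeps the result the same either way.

```python
import pandas as pd

# Hypothetical ragged input: each key is a row label, each list a row of any length.
rows = {
    'row1': ['a', 'b'],
    'row2': ['c', 'd', 'e'],
    'row3': ['f'],
}

# Build the frame once instead of appending row by row.
wide = pd.DataFrame.from_dict(rows, orient='index')

# Long form: one entry per existing value, indexed by row label.
# dropna() is explicit because stack() may or may not drop missing
# cells depending on the pandas version.
long_form = wide.stack().dropna().reset_index(level=1, drop=True)

print(wide.shape)                                  # (3, 3)
print(long_form.loc[['row1', 'row3']].tolist())    # ['a', 'b', 'f']
```

The wide frame is as wide as the longest row, while the long Series holds only the six values that actually exist.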

Answer 2

In pandas, you can concatenate a new row with an existing DataFrame, even if the new row has a different number of columns, as shown below.

import pandas as pd

df = pd.DataFrame([list(range(5))])
new_row = pd.DataFrame([list(range(4))])
pd.concat([df,new_row], ignore_index=True, axis=0)

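In the snippet above, pd.concat combines the two DataFrames. Columns are aligned by label and, with the default join='outer', all of them are kept, so the column missing from the shorter row is filled with NaN. The argument ignore_index=True only renumbers the rows of the result as 0, 1, ...; it is not what allows the rows to have different lengths.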
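As a quick check of what ignore_index does, here is a minimal sketch that reuses df and new_row from the snippet above and compares the result with and without the argument. The names combined and kept are illustrative only.

```python
import pandas as pd

df = pd.DataFrame([list(range(5))])       # one row, columns 0..4
new_row = pd.DataFrame([list(range(4))])  # one row, columns 0..3

# Columns are matched by label; the default join='outer' keeps all of them.
combined = pd.concat([df, new_row], ignore_index=True)

print(combined.shape)               # (2, 5)
print(combined.index.tolist())      # [0, 1]
print(combined.isna().sum().sum())  # 1: only column 4 of the new row is missing

# Without ignore_index the original row labels are kept, so 0 appears twice.
kept = pd.concat([df, new_row])
print(kept.index.tolist())          # [0, 0]
```

Both calls produce the same two-row, five-column data; only the row labels differ.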

How do I add new rows with different numbers of columns to a pandas DataFrame?
