pandas adds a column instead of a row

Asked: 2015-07-03 06:12:23

Tags: python pandas append

I'm new to pandas, and all I want to do is add a row.

import pandas as pd

class Security:
    def __init__(self):
        self.structure = ['timestamp', 'open', 'high', 'low', 'close', 'vol']
        self.df = pd.DataFrame(columns=self.structure)  # index =
    def whats_inside(self):
        return self.df
    """
    Some skipped code...
    """
    def add_data(self, timestamp, open, high, low, close, vol):
        data = [timestamp, open, high, low, close, vol]
        self.df = self.df.append(data)

sec = Security()
print(sec.whats_inside())
sec.add_data('2015/06/01', '1', '2', '0.5', '1', '100')
print(sec.whats_inside())

But the output is:

            0 close high  low open timestamp  vol
0  2015/06/01   NaN  NaN  NaN  NaN       NaN  NaN
1           1   NaN  NaN  NaN  NaN       NaN  NaN
2           2   NaN  NaN  NaN  NaN       NaN  NaN
3         0.5   NaN  NaN  NaN  NaN       NaN  NaN
4           1   NaN  NaN  NaN  NaN       NaN  NaN
5         100   NaN  NaN  NaN  NaN       NaN  NaN

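This means I'm adding a column instead of a row. Yes, I've tried googling, but I still haven't found how to do this in a simple, Pythonic way.

P.S. I know this is simple, but I'm just missing something important.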

2 answers:

Answer 0 (score: 7)

There are several ways to add a new row. Perhaps the simplest, if you want to add the row at the end, is to use loc:

df.loc[len(df)] = ['val_a', 'val_b', .... ]

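loc expects an index label. len(df) returns the number of rows in the DataFrame, so with a default 0..n-1 index the new row is added at the end of the DataFrame.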

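['val_a', 'val_b', .... ] is the list of values for the row, in the same order as the columns, so the length of the list must equal the number of columns; otherwise you get a ValueError. The one exception is when you want every column to hold the same value: you may then pass that value as the only element of the list, e.g. df.loc[len(df)] = ['aa']. This shortcut may depend on the pandas version; assigning the bare scalar, df.loc[len(df)] = 'aa', also fills every column.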
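To make the length rule concrete, here is a minimal sketch. The column names `a`, `b`, `c` and the variable `mismatch_error` are made up for illustration, and the behavior shown is what I would expect from a recent pandas release; older versions may differ.

```python
import pandas as pd

# Hypothetical three-column frame, empty to start with.
df = pd.DataFrame(columns=["a", "b", "c"])

# One value per column: the row is added under label 0.
df.loc[len(df)] = ["x", "y", "z"]

# Too few values for the columns: pandas rejects the assignment.
try:
    df.loc[len(df)] = ["x", "y"]
    mismatch_error = None
except ValueError as exc:
    mismatch_error = exc

# A bare scalar is broadcast to every column of the new row.
df.loc[len(df)] = "aa"
```

After this runs, `df` should hold two rows: the explicit one and the broadcast one. The failed assignment leaves the frame unchanged.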

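Note: it is a good idea to always call reset_index before using this method. If you have deleted rows or are working with a filtered DataFrame, the row index labels are not guaranteed to be in sync with the number of rows, so len(df) may match a label that already exists and the assignment would overwrite that row instead of adding a new one.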
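The pitfall described in the note can be sketched as follows. The frame and the names `stale` and `fresh` are invented for the example; `reset_index(drop=True)` is used so that the old labels are discarded rather than kept as a column.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3, 4], "b": ["w", "x", "y", "z"]})

# Filtering keeps the original labels: the two remaining rows are labeled 2 and 3.
filtered = df[df["a"] > 2]

# Without reset_index, len(stale) == 2 is an existing label,
# so the assignment overwrites the row with a == 3 instead of adding one.
stale = filtered.copy()
stale.loc[len(stale)] = [99, "new"]

# With a fresh 0..n-1 index, label 2 is unused and the row is added at the end.
fresh = filtered.reset_index(drop=True)
fresh.loc[len(fresh)] = [99, "new"]
```

Here `stale` still has two rows (one of them silently replaced), while `fresh` has three.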

Answer 1 (score: 3)

You should append a Series or a DataFrame. (In your case a Series is the better fit.)

import pandas as pd

class Security:
    def __init__(self):
        self.structure = ['timestamp', 'open', 'high', 'low', 'close', 'vol']
        self.df = pd.DataFrame(columns=self.structure)  # index =
    def whats_inside(self):
        return self.df
    """
    Some skipped code...
    """
    def add_data(self, timestamp, open, high, low, close, vol):
        data = [timestamp, open, high, low, close, vol]
        # Append a Series whose index matches the column names, so each value
        # lands in its own column. Note: DataFrame.append was deprecated in
        # pandas 1.4 and removed in 2.0, so this requires an older pandas.
        self.df = self.df.append(pd.Series(data, index=self.structure), ignore_index=True)
        # Or append a one-row DataFrame:
        # self.df = self.df.append(pd.DataFrame([data], columns=self.structure), ignore_index=True)

sec = Security()
print(sec.whats_inside())
sec.add_data('2015/06/01', '1', '2', '0.5', '1', '100')
sec.add_data('2015/06/02', '1', '2', '0.5', '1', '100')
print(sec.whats_inside())

Output:

    timestamp open high  low close  vol
0  2015/06/01    1    2  0.5     1  100
1  2015/06/02    1    2  0.5     1  100
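On pandas 2.0 and later, where DataFrame.append no longer exists, the same idea can be expressed with pd.concat. This is a sketch rather than part of the original answer: it keeps the class layout from the question, and the guard for the empty frame is my own addition to avoid concatenating with an empty DataFrame.

```python
import pandas as pd


class Security:
    def __init__(self):
        self.structure = ["timestamp", "open", "high", "low", "close", "vol"]
        self.df = pd.DataFrame(columns=self.structure)

    def whats_inside(self):
        return self.df

    def add_data(self, timestamp, open, high, low, close, vol):
        # Wrap the values in a one-row DataFrame with the same columns,
        # so each value lands in its own column.
        row = pd.DataFrame(
            [[timestamp, open, high, low, close, vol]], columns=self.structure
        )
        if self.df.empty:
            # First row: nothing to concatenate with yet.
            self.df = row
        else:
            # ignore_index=True renumbers the rows 0..n-1.
            self.df = pd.concat([self.df, row], ignore_index=True)


sec = Security()
sec.add_data("2015/06/01", "1", "2", "0.5", "1", "100")
sec.add_data("2015/06/02", "1", "2", "0.5", "1", "100")
print(sec.whats_inside())
```

Each call copies the whole frame, so if many rows arrive it is usually cheaper to collect them in a plain list and build the DataFrame once at the end.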