Problem with pandas append inside a for loop

Time: 2020-04-17 20:27:02

Tags: python-3.x pandas numpy

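I am trying to loop over my data and append the results to a pandas DataFrame I created. The goal is for the DataFrame to contain all of the results from the loop.

I can't get DataFrame.append to work properly. Right now it doesn't seem to append; instead it overwrites the existing row, and I am left with only the last row from the loop. I know the data is all correct, because I can print it inside the loop and see the right values. Hopefully I am missing something simple.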

for year in dfClose['year'].unique():
        tempYearDF = dfClose[dfClose['year'] == year]
        for i in dfClose['month'].unique():
            tempOpenDF = tempYearDF.loc[tempYearDF["month"] == i, "open"]
            tempCloseDF = tempYearDF.loc[tempYearDF["month"] == i, "close"]
            # If statement below is stopping loops on months that hasnt happened yet for the latest year.
            if len(tempOpenDF) > 0:
                othernumpyopen = tempOpenDF.to_numpy()
                othernumpyclose = tempCloseDF.to_numpy()
                aroundOpen = np.around(othernumpyopen[0],3)
                aroundClose = np.around(othernumpyclose[-1],3)
                month_pd = pd.DataFrame (columns=["YEAR", "MONTH", "MONTH OPEN", "MONTH CLOSE"])
                month_pd = month_pd.append({'YEAR' : year , 'MONTH' : i , 'MONTH OPEN' : aroundOpen , "MONTH CLOSE" : aroundClose} , ignore_index=True)

This is what I am left with after execution. I am trying to get all of the rows added to the df.

     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    4.0       246.5       286.69

Sample output when I add a print inside the loop.

     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    1.0      296.24       309.51
     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    2.0       304.3       273.36
     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    3.0      282.28       254.29
     YEAR  MONTH  MONTH OPEN  MONTH CLOSE
0  2020.0    4.0       246.5       286.69

A sample of dfClose and its dtypes, in case they are needed

        open  year  month  day        date
0  30.490000  2010      1    4  2010-01-04
1  30.657143  2010      1    5  2010-01-05
2  30.625713  2010      1    6  2010-01-06
3  30.250000  2010      1    7  2010-01-07
4  30.042856  2010      1    8  2010-01-08


open     float64
close    float64
year       int64
month      int64
day        int64
date      object
dtype: object

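1 Answer:

Answer 0 (score: 1)

You are redefining month_pd on every pass through the loop, which overwrites the previous version, so only the last row survives. Instead, collect the per-month DataFrames in a list and concatenate them once at the end.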

import numpy as np
import pandas as pd

dfs = []
for year in dfClose['year'].unique():
    tempYearDF = dfClose[dfClose['year'] == year]
    for i in dfClose['month'].unique():
        tempOpenDF = tempYearDF.loc[tempYearDF["month"] == i, "open"]
        tempCloseDF = tempYearDF.loc[tempYearDF["month"] == i, "close"]
        # Skip months that have not happened yet in the latest year.
        if len(tempOpenDF) > 0:
            othernumpyopen = tempOpenDF.to_numpy()
            othernumpyclose = tempCloseDF.to_numpy()
            aroundOpen = np.around(othernumpyopen[0], 3)
            aroundClose = np.around(othernumpyclose[-1], 3)
            # Wrap the dict in a list: a dict of scalars alone raises
            # "ValueError: If using all scalar values, you must pass an index".
            dfs.append(pd.DataFrame([{'YEAR': year, 'MONTH': i, 'MONTH OPEN': aroundOpen, 'MONTH CLOSE': aroundClose}]))

# Build the final frame once, after the loop, and keep the result.
month_pd = pd.concat(dfs, ignore_index=True)
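
As a possible alternative to looping, the same monthly open/close table can probably be built with a single groupby. This is a minimal sketch, not the asker's or the answerer's method. It assumes dfClose is already sorted by date and has no missing values in "open" or "close" (the "first" and "last" aggregations skip NaN, unlike positional indexing). The small dfClose below is hypothetical data invented for illustration. It may also be worth knowing that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so an approach that avoids it is more future-proof.

```python
import pandas as pd

# Hypothetical stand-in for dfClose; the values are invented for illustration.
dfClose = pd.DataFrame({
    "open":  [30.49, 30.66, 31.10, 31.25, 29.80, 29.95],
    "close": [30.57, 30.63, 31.20, 31.05, 29.90, 30.10],
    "year":  [2010, 2010, 2010, 2010, 2011, 2011],
    "month": [1, 1, 2, 2, 1, 1],
})

# One row per (year, month): first "open" and last "close" of each group.
# sort=False keeps the groups in their order of first appearance.
month_pd = (
    dfClose.groupby(["year", "month"], sort=False)
    .agg(**{"MONTH OPEN": ("open", "first"), "MONTH CLOSE": ("close", "last")})
    .round(3)
    .reset_index()
    .rename(columns={"year": "YEAR", "month": "MONTH"})
)

print(month_pd)
```

Months with no rows simply produce no group, so the len(...) > 0 check from the loop version is not needed here.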