How do I append values from an existing DataFrame to an empty DataFrame inside a for loop?

Asked: 2019-05-06 07:35:20

Tags: python pandas dataframe for-loop

I have a DataFrame:

Index  city_code  date     sector  price
1      1          2010-01  A       50000
2      1          2010-01  B       100000
3      2          2010-01  A       150000
4      3          2010-01  A       322222
5      1          2010-01  C       124555
6      2          2010-01  C       30000
7      2          2010-01  B       20000
8      1          2010-02  A       45000
9      1          2010-02  B       120000
10     2          2010-02  A       30000
11     3          2010-02  A       1222400
12     1          2010-02  C       20000
13     2          2010-02  C       50000
14     2          2010-02  B       360000

I want to append the rows to different DataFrames according to their sector.

I tried to solve this with the following code, but unfortunately it does not work.

df = pd.read_csv('dataset.csv', sep=';')

area_list = pd.DataFrame(df['sector'].unique())
columns = df.columns
df_A = pd.DataFrame(columns=columns)
df_B = pd.DataFrame(columns=columns)
df_C = pd.DataFrame(columns=columns)

for i in area_list:
    x = df[df['sector'] == i]
    if i == 'A':
        df_A.append(x)
    elif i == 'B':
        df_B.append(x)
    elif i == 'C':
        df_C.append(x)

This code does not append any values to the empty DataFrames (df_A, df_B, df_C). How can I fix this?

1 Answer:

Answer 0: (score: 0)

You can split the data by the sector column this way.

for sector, df_sector in df.groupby('sector'):
    if (sector == 'A'):
        df_A = df_sector
    elif (sector == 'B'):
        df_B = df_sector
    elif (sector == 'C'):
        df_C = df_sector
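
For context, two things likely keep the loop in the question from filling the frames. Iterating over a DataFrame yields its column labels, so with area_list wrapped in pd.DataFrame the loop variable is 0 and never matches 'A', 'B' or 'C'. In addition, DataFrame.append returned a new frame instead of modifying the caller, so its result was discarded (the method was deprecated in pandas 1.4 and removed in 2.0). The sketch below illustrates the first point and one possible loop-based fix; the small inline frame and the names labels_seen, sectors_seen and pieces_A are assumptions made for illustration, not part of the original post.

```python
import pandas as pd

# Small stand-in for the question's data (first five rows, typed inline
# instead of read from dataset.csv).
df = pd.DataFrame({
    "city_code": [1, 1, 2, 3, 1],
    "date": ["2010-01"] * 5,
    "sector": ["A", "B", "A", "A", "C"],
    "price": [50000, 100000, 150000, 322222, 124555],
})

# Wrapping the unique values in a DataFrame makes iteration walk over
# column labels, so the loop variable is the label 0, not a sector name.
area_list = pd.DataFrame(df["sector"].unique())
labels_seen = list(area_list)

# Iterating over the unique values directly yields the sector names.
sectors_seen = list(df["sector"].unique())

# Collect matching pieces in a plain list (list.append mutates in place),
# then build the frame once with pd.concat, which returns a new object.
pieces_A = []
for sector in df["sector"].unique():
    part = df[df["sector"] == sector]
    if sector == "A":
        pieces_A.append(part)
df_A = pd.concat(pieces_A)
```

The groupby loop shown above avoids both problems, because it hands back each sector's rows directly and assigns them by name.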

>>> import pandas as pd
>>> df = pd.read_csv('dataset.csv')
>>> df.head()
   city        date sector        price
0     1  2010-01-01      A   53675300.0
1     1  2010-01-01      B   13415070.0
2     1  2010-01-01      C  474007000.0
3     1  2010-01-01      D  218028700.0
4     1  2010-01-01      E    2073598.0
>>> for sector, df_sector in df.groupby('sector'):
...     if (sector == 'A'):
...             df_A = df_sector
...     elif (sector == 'B'):
...             df_B = df_sector
...     elif (sector == 'C'):
...             df_C = df_sector
...     elif (sector == 'D'):
...             df_D = df_sector
...     else:
...             df_E = df_sector
... 
>>> df_A
   city        date sector       price
0     1  2010-01-01      A  53675300.0
>>> df_B
   city        date sector       price
1     1  2010-01-01      B  13415070.0
>>> df_C
   city        date sector        price
2     1  2010-01-01      C  474007000.0
>>> df_D
   city        date sector        price
3     1  2010-01-01      D  218028700.0
>>> df_E
   city        date sector      price
4     1  2010-01-01      E  2073598.0
>>> 
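
If the number of sectors is not known in advance, the if/elif chain needs a new branch for every value. One possible alternative, sketched here under the assumption that a dictionary keyed by sector is acceptable in place of separate df_A, df_B, ... variables, is to build the mapping straight from groupby. The name frames is hypothetical, and the data is the table from the question typed inline.

```python
import pandas as pd

# The 14 rows from the question, typed inline instead of read from a file.
df = pd.DataFrame({
    "city_code": [1, 1, 2, 3, 1, 2, 2, 1, 1, 2, 3, 1, 2, 2],
    "date": ["2010-01"] * 7 + ["2010-02"] * 7,
    "sector": ["A", "B", "A", "A", "C", "C", "B",
               "A", "B", "A", "A", "C", "C", "B"],
    "price": [50000, 100000, 150000, 322222, 124555, 30000, 20000,
              45000, 120000, 30000, 1222400, 20000, 50000, 360000],
})

# One sub-frame per sector, keyed by the sector value.
frames = {sector: group for sector, group in df.groupby("sector")}

# Individual frames can still be pulled out by name when needed.
df_A = frames["A"]
df_B = frames["B"]
df_C = frames["C"]
```

Each value in frames keeps the original row index, so frames["A"] still refers back to the rows it came from in df.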