I have a data frame:
Index city_code date sector price
1 1 2010-01 A 50000
2 1 2010-01 B 100000
3 2 2010-01 A 150000
4 3 2010-01 A 322222
5 1 2010-01 C 124555
6 2 2010-01 C 30000
7 2 2010-01 B 20000
8 1 2010-02 A 45000
9 1 2010-02 B 120000
10 2 2010-02 A 30000
11 3 2010-02 A 1222400
12 1 2010-02 C 20000
13 2 2010-02 C 50000
14 2 2010-02 B 360000
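I want to append the rows to different data frames according to their sector.
I tried to solve this with the following code, but unfortunately it does not work.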
df = pd.read_csv('dataset.csv', sep=';')
area_list = pd.DataFrame(df['sector'].unique())
columns = df.columns
df_A = pd.DataFrame(columns=columns)
df_B = pd.DataFrame(columns=columns)
df_C = pd.DataFrame(columns=columns)
for i in area_list:
    x = df[df['sector'] == i]
    if i == 'A':
        df_A.append(x)
    elif i == 'B':
        df_B.append(x)
    elif i == 'C':
        df_C.append(x)
This code does not append the values to the empty data frames (df_A, df_B, df_C). How can I fix this?
Answer 0 (score: 0)
You can split the data by the sector column like this.
for sector, df_sector in df.groupby('sector'):
    if sector == 'A':
        df_A = df_sector
    elif sector == 'B':
        df_B = df_sector
    elif sector == 'C':
        df_C = df_sector
Or, as an interactive session:
>>> import pandas as pd
>>> df = pd.read_csv('dataset.csv')
>>> df.head()
city date sector price
0 1 2010-01-01 A 53675300.0
1 1 2010-01-01 B 13415070.0
2 1 2010-01-01 C 474007000.0
3 1 2010-01-01 D 218028700.0
4 1 2010-01-01 E 2073598.0
>>> for sector, df_sector in df.groupby('sector'):
...     if sector == 'A':
...         df_A = df_sector
...     elif sector == 'B':
...         df_B = df_sector
...     elif sector == 'C':
...         df_C = df_sector
...     elif sector == 'D':
...         df_D = df_sector
...     else:
...         df_E = df_sector
...
>>> df_A
city date sector price
0 1 2010-01-01 A 53675300.0
>>> df_B
city date sector price
1 1 2010-01-01 B 13415070.0
>>> df_C
city date sector price
2 1 2010-01-01 C 474007000.0
>>> df_D
city date sector price
3 1 2010-01-01 D 218028700.0
>>> df_E
city date sector price
4 1 2010-01-01 E 2073598.0
>>>
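As a follow-up to the groupby approach above, here is a minimal sketch of why the question's loop leaves df_A, df_B and df_C empty, and of a dictionary-based variant that avoids the if/elif chain. Two things go wrong in the original: iterating over a DataFrame yields its column labels rather than its values, and DataFrame.append returned a new frame instead of modifying the caller in place (it was removed in pandas 2.0). The inline CSV_TEXT and the names labels_seen and frames_by_sector are illustrative assumptions, not part of the original post; the data is copied from the question so the sketch runs without dataset.csv.

```python
import io

import pandas as pd

# Inline copy of the question's sample data (assumption: ';' separated,
# matching the sep=';' used in the question's read_csv call).
CSV_TEXT = """city_code;date;sector;price
1;2010-01;A;50000
1;2010-01;B;100000
2;2010-01;A;150000
3;2010-01;A;322222
1;2010-01;C;124555
2;2010-01;C;30000
2;2010-01;B;20000
1;2010-02;A;45000
1;2010-02;B;120000
2;2010-02;A;30000
3;2010-02;A;1222400
1;2010-02;C;20000
2;2010-02;C;50000
2;2010-02;B;360000
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), sep=";")

# Iterating over a DataFrame yields its column labels, not its values,
# so the question's loop over area_list only ever sees the label 0.
area_list = pd.DataFrame(df["sector"].unique())
labels_seen = list(area_list)

# One sub-frame per sector, keyed by the sector value (hypothetical name).
frames_by_sector = {sector: group for sector, group in df.groupby("sector")}

# Individual frames can still be pulled out by name if needed.
df_A = frames_by_sector["A"]
df_B = frames_by_sector["B"]
df_C = frames_by_sector["C"]
```

Keeping the frames in a dictionary means a new sector value in the data needs no extra branch; frames_by_sector["A"] plays the role of df_A.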