How to create multiple DataFrames from an existing DataFrame based on a condition in Python

Time: 2019-02-26 14:05:30

Tags: python pandas

I have a DataFrame as shown below. I want to create multiple DataFrames from it, based on the ID column.

df = pd.DataFrame(results)
print(df)

The result is:

       ID  NAME    COLOR
    0  01   ABC      RED                               
    1  01   ABC      ORANGE                  
    2  01   ABC      WHITE   
    3  02   DEF      RED
    4  02   DEF      PURPLE
    5  02   DEF      GREEN
    6  02   DEF      ORANGE
    7  02   DEF      BLACK
    8  03   GHI      RED
    9  03   GHI      BLACK
   10  03   GHI      GREEN
   11  03   GHI      ORANGE
   12  04   JKL      RED

The resulting DataFrames should look like the ones below. I can't work out how to write this in Python code; please help.

       ID  NAME    COLOR
    0  01   ABC      RED
    1  01   ABC      ORANGE
    2  01   ABC      WHITE

       ID  NAME    COLOR
    0  02   DEF      RED
    1  02   DEF      PURPLE
    2  02   DEF      GREEN
    3  02   DEF      ORANGE
    4  02   DEF      BLACK

       ID  NAME    COLOR
    0  03   GHI      RED
    1  03   GHI      BLACK
    2  03   GHI      GREEN
    3  03   GHI      ORANGE

       ID  NAME    COLOR
    0  04   JKL      RED

3 answers:

Answer 0 (score: 0)

You have to filter on the "NAME" column:

df_DEF = df[df.NAME == "DEF"]
df_GHI = df[df.NAME == "GHI"]

Sorry for the hard-coded solution. Here is my other solution:

import numpy as np 
import pandas as pd 


d = {'NAME': ["ABC", "ABC","ABC","GHI","GHI"], 'VALUE': [3, 4,5,6,7]}
df = pd.DataFrame(data=d)

# Get all unique names
cat = np.unique(df.NAME)

# create empty list of dataframes 
listOfDf = []

# for each unique name, create df_i with df filter by name, and append the list 
for i in cat:
    df_i = df[df.NAME == i].reset_index(drop = True)
    listOfDf.append(df_i)

# now you have a list of DataFrames and can work with each element of the list
# as a DataFrame

print(listOfDf)

[  NAME  VALUE
0  ABC      3
1  ABC      4
2  ABC      5,   NAME  VALUE
0  GHI      6
1  GHI      7]


for x in range(len(listOfDf)):
    print(listOfDf[x])
    print("------")

  NAME  VALUE
0  ABC      3
1  ABC      4
2  ABC      5
------
  NAME  VALUE
0  GHI      6
1  GHI      7
------
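
One detail of the loop above worth noting: np.unique returns the names sorted, so the list follows alphabetical order, not the order in which the names first appear in the data. The following is a minimal sketch, using made-up sample data and the hypothetical names sorted_names, appearance_names and list_of_df, that builds the same kind of list with Series.unique(), which keeps the order of first appearance.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data in which the names are NOT in sorted order
df = pd.DataFrame({"NAME": ["GHI", "GHI", "ABC", "ABC", "ABC"],
                   "VALUE": [6, 7, 3, 4, 5]})

# np.unique sorts its result; Series.unique keeps order of first appearance
sorted_names = list(np.unique(df["NAME"]))
appearance_names = list(df["NAME"].unique())

# One DataFrame per name, each with its index renumbered from 0
list_of_df = [df[df["NAME"] == name].reset_index(drop=True)
              for name in appearance_names]

for df_i in list_of_df:
    print(df_i)
    print("------")
```

Which of the two orders is preferable depends on how the list is used afterwards; if the position in the list matters, a dictionary keyed by name may be easier to work with.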

Answer 1 (score: 0)

You can do this:

data_dict = {'df' + str(i): grp for i, grp in df.groupby('ID')}

Which gives a dictionary:

{'df1':    ID NAME   COLOR
 0   1  ABC     RED
 1   1  ABC  ORANGE
 2   1  ABC   WHITE, 'df2':    ID NAME   COLOR
 3   2  DEF     RED
 4   2  DEF  PURPLE
 5   2  DEF   GREEN
 6   2  DEF  ORANGE
 7   2  DEF   BLACK, 'df3':     ID NAME   COLOR
 8    3  GHI     RED
 9    3  GHI   BLACK
 10   3  GHI   GREEN
 11   3  GHI  ORANGE, 'df4':     ID NAME COLOR
 12   4  JKL   RED}

Now just look up each key to access the group for each ID:

print(data_dict['df2'])

   ID NAME   COLOR
3   2  DEF     RED
4   2  DEF  PURPLE
5   2  DEF   GREEN
6   2  DEF  ORANGE
7   2  DEF   BLACK
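
The groups in this dictionary keep the row labels of the original DataFrame (3 to 7 for ID 2), while the question shows each piece numbered from 0. A possible variation, sketched here on a shortened, made-up version of the question's data with the IDs assumed to be strings such as "01", adds reset_index(drop=True) to each group. With string IDs the keys come out as 'df01', 'df02' and so on.

```python
import pandas as pd

# Shortened sample data modelled on the question; IDs are assumed to be strings
df = pd.DataFrame({
    "ID": ["01", "01", "01", "02", "02", "03", "04"],
    "NAME": ["ABC", "ABC", "ABC", "DEF", "DEF", "GHI", "JKL"],
    "COLOR": ["RED", "ORANGE", "WHITE", "RED", "PURPLE", "RED", "RED"],
})

# One DataFrame per ID, each with its index renumbered from 0
data_dict = {
    "df" + str(key): grp.reset_index(drop=True)
    for key, grp in df.groupby("ID")
}

print(data_dict["df02"])
```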

Answer 2 (score: 0)

You can try this:

import pandas as pd
data = {'ID': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4], 'NAME': ['ABC', 'ABC', 'ABC', 'DEF', 'DEF', 'DEF', 'GHI', 'GHI', 'GHI', 'JKL']}
df = pd.DataFrame(data=data)

Solution 1:

    myList = []
    # groupby yields (group key, sub-DataFrame) pairs, one per distinct ID
    for group_id, df_id in df.groupby('ID'):
        print(df_id)
        myList.append(df_id)

    Result:
       ID NAME
    0   1  ABC
    1   1  ABC
    2   1  ABC
       ID NAME
    3   2  DEF
    4   2  DEF
    5   2  DEF
       ID NAME
    6   3  GHI
    7   3  GHI
    8   3  GHI
       ID NAME
    9   4  JKL

You can access the individual DataFrames by index, for example myList[2]:

   ID   NAME
6   3   GHI
7   3   GHI
8   3   GHI

Solution 2:

{k: v for k, v in df.groupby('ID')}

    Result:
    {1:    ID NAME
     0   1  ABC
     1   1  ABC
     2   1  ABC, 2:    ID NAME
     3   2  DEF
     4   2  DEF
     5   2  DEF, 3:    ID NAME
     6   3  GHI
     7   3  GHI
     8   3  GHI, 4:    ID NAME
     9   4  JKL}
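
If only one of the groups is needed, building the whole list or dictionary may be unnecessary. As a rough sketch on the same sample data as this answer, GroupBy.get_group returns the rows for a single key; the names grouped and df_3 are chosen here for illustration only.

```python
import pandas as pd

data = {'ID': [1, 1, 1, 2, 2, 2, 3, 3, 3, 4],
        'NAME': ['ABC', 'ABC', 'ABC', 'DEF', 'DEF', 'DEF', 'GHI', 'GHI', 'GHI', 'JKL']}
df = pd.DataFrame(data=data)

grouped = df.groupby('ID')

# Fetch only the rows with ID == 3 and renumber the index from 0
df_3 = grouped.get_group(3).reset_index(drop=True)
print(df_3)
```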