How do I assign values to a dataframe's columns based on a condition?

Asked: 2016-11-18 06:34:16

Tags: python python-2.7 pandas

I have a dataframe that looks like this:

POSITION    Code_Count
   S1       {"[471E;1]"}
   S2       {"[471E;1]"}
   S3       {"[471E;1]"} 
   S4       {"[471E;1]"}
   S5       {"[471E;1]"}
   S6       {"[5812;1]"}
   S7       {"[471E;1]"}
   S8       {"[471E;1]"}
   T1       {"[7A2A;1]"}
   T2       {"[471E;1]"}
   T3       {"[7C95;1]"}
   T4       {"[471E;1]"}
   T5       {"[471E;1]"}
   T6       {"[471E;1]"}
   T7       {"[471E;1]"}
   T8       {"[471E;1]"}

In the Code_Count column, the first string is the code and the number is the count. The codes fall into four categories, A through D, as follows:

Category A contains the following codes: 7749 7783 7784 7786 7A14 7AC5 7C88 7C92 7C93 7C95 C749 C783 C784 C786 CA14 CAC5 CC88 CC92 CC93 CC95 442A 49C2

Category B contains the following codes: 1D 32 430B 4415 448E 4490 4492 457A 457B 496C 4970 778A 7A09 7A2A 7A2C 7C7C 7C80 C78A CA09 CA2A CA2C

Category C contains the following codes: 7A7F 7A80 7C7E CA7F CA80 CAC8 7AC8 C77E 445A 496E 471E 49CA

Category D contains the following codes: 7AF0 7AF1 7AF2 7AF3 CAF0 CAF1 CAF2 CAF3 4616 4617 4618 5812
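The four lists above can also be held as a single code-to-category lookup, which makes it easy to check which category a given code belongs to. This is only a minimal sketch: the names `categories` and `code_to_category` are illustrative and do not come from the original post.

```python
# Category lists copied from the question; the variable names are hypothetical.
categories = {
    'A': ('7749 7783 7784 7786 7A14 7AC5 7C88 7C92 7C93 7C95 C749 '
          'C783 C784 C786 CA14 CAC5 CC88 CC92 CC93 CC95 442A 49C2').split(),
    'B': ('1D 32 430B 4415 448E 4490 4492 457A 457B 496C 4970 778A '
          '7A09 7A2A 7A2C 7C7C 7C80 C78A CA09 CA2A CA2C').split(),
    'C': '7A7F 7A80 7C7E CA7F CA80 CAC8 7AC8 C77E 445A 496E 471E 49CA'.split(),
    'D': '7AF0 7AF1 7AF2 7AF3 CAF0 CAF1 CAF2 CAF3 4616 4617 4618 5812'.split(),
}

# Invert the mapping so each code points at its category letter.
code_to_category = {code: cat
                    for cat, codes in categories.items()
                    for code in codes}

print(code_to_category['471E'])
```

A flat lookup like this assumes that no code appears in more than one category, which holds for the lists as given.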

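I want the final dataframe to sort the codes present in the initial dataframe by the category they belong to, placing each code's count in the column for that category. For example, the output dataframe for the dataframe above should be: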

POSITION    Category A     Category B      Category C     Category D
   S1           0              0               1              0
   S2           0              0               1              0
   S3           0              0               1              0
   S4           0              0               1              0
   S5           0              0               1              0
   S6           0              0               0              1
   S7           0              0               1              0
   S8           0              0               1              0
   T1           0              1               0              0
   T2           0              0               1              0
   T3           1              0               0              0
   T4           0              0               1              0
   T5           0              0               1              0
   T6           0              0               1              0
   T7           0              0               1              0
   T8           0              0               1              0           

I tried using the str.contains method, but without success. Any help would be greatly appreciated. Thanks in advance!

1 Answer:

Answer 0 (score: 1)

I think you can first extract the values with strip and split, then add Count to each category column using a boolean mask created by isin together with loc. Finally, drop the unnecessary columns and replace the missing values with 0 using fillna:

catA = ['7749','7783','7784','7786','7A14','7AC5','7C88','7C92','7C93','7C95','C749','C783','C784','C786','CA14','CAC5','CC88','CC92','CC93','CC95','442A','49C2']
catB = ['1D','32','430B','4415','448E','4490','4492','457A','457B','496C','4970','778A','7A09','7A2A','7A2C','7C7C','7C80','C78A','CA09','CA2A','CA2C']
catC = ['7A7F','7A80','7C7E','CA7F','CA80','CAC8','7AC8','C77E','445A','496E','471E','49CA']
catD = ['7AF0','7AF1','7AF2','7AF3','CAF0','CAF1','CAF2','CAF3','4616','4617','4618','5812']

# Strip the surrounding {"[ ... ]"} characters, then split "code;count" into two columns
df[['Code','Count']] = df.Code_Count.str.strip('{["]}').str.split(';', expand=True)
# For each category, copy Count only on rows whose code is in that category (other rows become NaN)
df['Category A'] = df.loc[df.Code.isin(catA), 'Count']
df['Category B'] = df.loc[df.Code.isin(catB), 'Count']
df['Category C'] = df.loc[df.Code.isin(catC), 'Count']
df['Category D'] = df.loc[df.Code.isin(catD), 'Count']
# Remove the helper columns
df.drop(['Code_Count', 'Code', 'Count'], axis=1, inplace=True)

# Rows that did not match a category are NaN; replace them with 0
cols = ['Category A','Category B','Category C','Category D']
df[cols] = df[cols].fillna(0)
print(df)
   POSITION Category A Category B Category C Category D
0        S1          0          0          1          0
1        S2          0          0          1          0
2        S3          0          0          1          0
3        S4          0          0          1          0
4        S5          0          0          1          0
5        S6          0          0          0          1
6        S7          0          0          1          0
7        S8          0          0          1          0
8        T1          0          1          0          0
9        T2          0          0          1          0
10       T3          1          0          0          0
11       T4          0          0          1          0
12       T5          0          0          1          0
13       T6          0          0          1          0
14       T7          0          0          1          0
15       T8          0          0          1          0
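
As a possible alternative to one column assignment per category, the code can be mapped to a category label and the counts pivoted into columns. The following is only a sketch under stated assumptions: the sample holds four of the question's rows, `code_to_category` is a hypothetical, abbreviated lookup (a real one would cover every code in the four lists), and the regular expression assumes every value has the form {"[CODE;COUNT]"}.

```python
import pandas as pd

# Four rows in the question's format, one per category.
df = pd.DataFrame({
    'POSITION': ['S1', 'S6', 'T1', 'T3'],
    'Code_Count': ['{"[471E;1]"}', '{"[5812;1]"}',
                   '{"[7A2A;1]"}', '{"[7C95;1]"}'],
})

# Hypothetical, abbreviated lookup used only for this illustration.
code_to_category = {'7C95': 'A', '7A2A': 'B', '471E': 'C', '5812': 'D'}

# Capture the code and the count from strings like {"[471E;1]"}.
parsed = df['Code_Count'].str.extract(r'\[(\w+);(\d+)\]', expand=True)
parsed.columns = ['Code', 'Count']

df['Count'] = parsed['Count'].astype(int)
# Codes missing from the lookup map to NaN and are left out of the pivot.
df['Category'] = 'Category ' + parsed['Code'].map(code_to_category)

wanted = ['Category A', 'Category B', 'Category C', 'Category D']
result = (df.pivot_table(index='POSITION', columns='Category',
                         values='Count', aggfunc='sum', fill_value=0)
            .reindex(columns=wanted, fill_value=0)  # keep all four columns
            .reset_index())
print(result)
```

Unlike the answer above, this version stores the counts as integers rather than strings, and a position whose code is not in the lookup would disappear from the result instead of showing a row of zeros.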