How to convert a pandas DataFrame column of strings into lists

Time: 2020-08-18 13:07:01

Tags: python-3.x pandas

We have a DataFrame, df_1:

CODE    DESCRIPTION                           COUNT     DEVICES_ID
00      Electrical surges                       21      SAT1, SAT3, SAT5, SAT11, SAT13, SAT15 
01      Overloading                              1      SAT1, SAT3, SAT5, SAT11, SAT13 
02      Power sags and dips                     12      SAT1, SAT3, SAT5, SAT11 
03      A junction box that is uncovered         2      SAT1, SAT3, SAT5 
04      Switches of light not working            1      SAT1, SAT3 
05      Flickering light                         4      SAT31, SAT33, SAT35, SAT41, SAT43 
06      Tripping circuit breaker                 5      SAT31, SAT33, SAT35, SAT41 
07      Less outlets                            20      SAT31, SAT33, SAT35 
08      Electric shocks                         21      SAT31, SAT33 
09      Frequent burning out of light bulbs     22      SAT31 
10      Overcircuited panel                     12      SAT31, SAT33, SAT35, SAT41, SAT43, SAT45 

We want to get a DataFrame, df_2, in which each DEVICES_ID value is a list:

CODE    DESCRIPTION                           COUNT     DEVICES_ID
00      Electrical surges                       21      [SAT1, SAT3, SAT5, SAT11, SAT13, SAT15] 
01      Overloading                              1      [SAT1, SAT3, SAT5, SAT11, SAT13] 
02      Power sags and dips                     12      [SAT1, SAT3, SAT5, SAT11] 
03      A junction box that is uncovered         2      [SAT1, SAT3, SAT5] 
04      Switches of light not working            1      [SAT1, SAT3] 
05      Flickering light                         4      [SAT31, SAT33, SAT35, SAT41, SAT43] 
06      Tripping circuit breaker                 5      [SAT31, SAT33, SAT35, SAT41] 
07      Less outlets                            20      [SAT31, SAT33, SAT35]
08      Electric shocks                         21      [SAT31, SAT33]
09      Frequent burning out of light bulbs     22      [SAT31]
10      Overcircuited panel                     12      [SAT31, SAT33, SAT35, SAT41, SAT43, SAT45]

How can this be done in pandas?

1 answer:

Answer 0 (score: 1)

Use Series.str.split, which splits each string on the separator and returns a list per row:

df['DEVICES_ID'] = df['DEVICES_ID'].str.split(', ')
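As a self-contained illustration of the line above, here is a minimal sketch on a hypothetical two-row sample (not the full df_1). It also assumes that some values might carry stray whitespace, in which case splitting on the exact separator ', ' leaves that whitespace inside the last item. Splitting on ',' and stripping each piece is one way to tolerate it. The column names AS_LIST and AS_LIST_CLEAN are made up for this example.

```python
import pandas as pd

# Hypothetical sample mirroring df_1; the second value has a trailing space.
df = pd.DataFrame({
    "CODE": ["00", "04"],
    "DEVICES_ID": ["SAT1, SAT3, SAT5", "SAT1, SAT3 "],
})

# Plain split on the exact separator ", " (as in the answer).
df["AS_LIST"] = df["DEVICES_ID"].str.split(", ")

# Whitespace-tolerant variant: split on "," and strip every piece.
df["AS_LIST_CLEAN"] = df["DEVICES_ID"].str.split(",").apply(
    lambda parts: [p.strip() for p in parts]
)

print(df["AS_LIST"].iat[1])        # last item keeps the trailing space
print(df["AS_LIST_CLEAN"].iat[1])  # every item is stripped
```

If the data is known to be clean, the plain split is enough.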

-

EDIT: The quotes ('') are not shown when the DataFrame is printed, but they are visible if you select the first list:

# Keep the original column and store the lists in a new one
df['DEVICES_ID1'] = df['DEVICES_ID'].str.split(', ')
print(df['DEVICES_ID1'].iat[0])
# ['SAT1', 'SAT3', 'SAT5', 'SAT11', 'SAT13', 'SAT15']
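To double-check that the missing quotes are only a display matter, the element types can be inspected directly. This is a small sketch on a hypothetical two-row frame, reusing the column name DEVICES_ID1 from the answer.

```python
import pandas as pd

# Hypothetical sample; only the column relevant to the check is included.
df = pd.DataFrame({"DEVICES_ID": ["SAT1, SAT3", "SAT31"]})
df["DEVICES_ID1"] = df["DEVICES_ID"].str.split(", ")

first = df["DEVICES_ID1"].iat[0]

# Each item in the cell is an ordinary Python string.
print([type(item).__name__ for item in first])

# repr() of a list shows the quotes that the DataFrame display omits.
print(repr(list(first)))  # ['SAT1', 'SAT3']

# The DataFrame display, for comparison.
print(df)
```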

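If you need the quotes to appear in the printed DataFrame, you can add another pair of '' to each value, but then every string is quoted twice ("'...'"):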

def f(s1):
    # Wrap the value in literal single quotes
    return "'{}'".format(s1)

df['DEVICES_ID2'] = df['DEVICES_ID'].str.split(', ').apply(lambda x: [f(y) for y in x])

print(df['DEVICES_ID2'].iat[0])
# ["'SAT1'", "'SAT3'", "'SAT5'", "'SAT11'", "'SAT13'", "'SAT15'"]
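The cost of this approach is that the quote characters become part of the data itself. The sketch below, on a hypothetical one-row frame, writes the same idea with an f-string instead of the helper f and compares the string lengths before and after. The variable names plain and quoted are made up for this example.

```python
import pandas as pd

# Hypothetical one-row sample.
df = pd.DataFrame({"DEVICES_ID": ["SAT1, SAT3"]})

# Lists of plain strings, as produced by the answer's first step.
plain = df["DEVICES_ID"].str.split(", ")

# Same effect as the helper f: wrap every item in literal single quotes.
quoted = plain.apply(lambda items: [f"'{item}'" for item in items])

print(plain.iat[0][0], len(plain.iat[0][0]))    # SAT1 4
print(quoted.iat[0][0], len(quoted.iat[0][0]))  # 'SAT1' 6
```

Because the quoted strings no longer equal the original IDs, comparisons such as 'SAT1' in a row's list would fail on the quoted column. It is probably best kept for display only.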

Output with the split columns:

print(df)
   CODE                       DESCRIPTION  COUNT  \
0     0                 Electrical surges     21   
1     1                       Overloading      1   
2     2               Power sags and dips     12   
3     3  A junction box that is uncovered      2   

                              DEVICES_ID  \
0  SAT1, SAT3, SAT5, SAT11, SAT13, SAT15   
1         SAT1, SAT3, SAT5, SAT11, SAT13   
2                SAT1, SAT3, SAT5, SAT11   
3                       SAT1, SAT3, SAT5   

                               DEVICES_ID1  \
0  [SAT1, SAT3, SAT5, SAT11, SAT13, SAT15]   
1         [SAT1, SAT3, SAT5, SAT11, SAT13]   
2                [SAT1, SAT3, SAT5, SAT11]   
3                       [SAT1, SAT3, SAT5]   

                                         DEVICES_ID2  
0  ['SAT1', 'SAT3', 'SAT5', 'SAT11', 'SAT13', 'SA...  
1         ['SAT1', 'SAT3', 'SAT5', 'SAT11', 'SAT13']  
2                  ['SAT1', 'SAT3', 'SAT5', 'SAT11']  
3                           ['SAT1', 'SAT3', 'SAT5']