How to remove 0 from strings without affecting other cells in a pandas DataFrame?

Time: 2018-11-16 20:31:21

Tags: python pandas

I have a DataFrame that contains "0" values, like this:

import pandas as pd

df = pd.DataFrame({
    'WARNING':['4402,43527,0,7628,54337',4402,0,0,'0,1234,56437,76252',0,3602],
    'FAILED':[0,0,'5555,6753,0','4572,0,8764,8753',9876,0,'0,4579,7514']
})

I want to remove the zeros from the strings that hold multiple values, so that the resulting df looks like this:

df = pd.DataFrame({
    'WARNING':['4402,43527,7628,54337',4402,0,0,'1234,56437,76252',0,3602],
    'FAILED':[0,0,'5555,6753','4572,8764,8753',9876,0,'4579,7514']
})

However, cells that contain only a single 0 should stay intact. How can I achieve this?

3 Answers:

Answer 0 (score: 3)

df = pd.DataFrame({
    'WARNING':['0,0786,1230,01234,0',4402,0,0,'0,1234,56437,76252',0,3602],
    'FAILED':[0,0,'5555,6753,0','4572,0,8764,8753',9876,0,'0,4579,7514']
})
df.apply(lambda x: x.str.strip('0,|,0')).replace(",0,", ",")

Output:

            WARNING            FAILED
0    786,1230,01234               NaN
1               NaN               NaN
2               NaN         5555,6753
3               NaN  4572,0,8764,8753
4  1234,56437,76252               NaN
5               NaN               NaN
6               NaN         4579,7514
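
The output above can be traced to how the calls behave. `str.strip` treats its argument as a set of characters rather than a pattern, the `.str` accessor yields NaN for non-string cells, and `replace` without `regex=True` only matches a cell whose entire value equals `",0,"`. A minimal sketch of the first point in plain Python, using the test string from this answer:

```python
# str.strip removes any leading/trailing character found in the given set
# ('0', ',' and '|' here), so the leading zero of '0786' is lost as well.
cell = '0,0786,1230,01234,0'
stripped = cell.strip('0,|,0')
print(stripped)  # 786,1230,01234

# A zero in the middle of the string is never reached by strip.
middle = '4572,0,8764,8753'.strip('0,|,0')
print(middle)  # 4572,0,8764,8753
```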

Answer 1 (score: 2)

I would solve it with a list comprehension.

In [1]: df.apply(lambda col: col.astype(str).apply(lambda x: ','.join([y for y in x.split(',') if y != '0']) if ',' in x else x), axis=0)
Out[1]:  
           FAILED                WARNING
0               0  4402,43527,7628,54337
1               0                   4402
2       5555,6753                      0
3  4572,8764,8753                      0
4            9876       1234,56437,76252
5               0                      0
6       4579,7514                   3602

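Breaking it down:

  1. Iterate over all columns with df.apply(lambda col: ..., axis=0)
  2. Convert the values of each column to strings with col.astype(str)
  3. Apply a function to each "cell" of col with .apply(lambda x: ...)
  4. The lambda first checks whether ',' occurs in x; if not, it returns x unchanged
  5. If ',' in x, it splits x on ',' to create a list of elements y
  6. It keeps only the elements where y != '0'
  7. It joins everything back together with ','.join(...)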
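
The steps above can also be sketched as a named helper, which may be easier to read and test than the nested lambda. The function name `drop_zero_items` is hypothetical, and the sample list is the WARNING column from the question:

```python
def drop_zero_items(cell):
    """Drop '0' entries from a comma-separated cell; return other cells as strings."""
    text = str(cell)                        # work on the string form of the cell
    if ',' not in text:                     # single values pass through
        return text
    items = text.split(',')                 # split into individual entries
    kept = [y for y in items if y != '0']   # keep only the non-zero entries
    return ','.join(kept)                   # join them back together

warning = ['4402,43527,0,7628,54337', 4402, 0, 0, '0,1234,56437,76252', 0, 3602]
cleaned = [drop_zero_items(cell) for cell in warning]
print(cleaned)
```

With pandas, the same helper could be applied per column, for example `df.apply(lambda col: col.map(drop_zero_items))`. Under this sketch, a cell made up only of zeros, such as '0,0', would come back as an empty string.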

Answer 2 (score: 1)

You can use a regular expression with a negative lookbehind to replace "0," only when it is not preceded by a digit:

import re
df.applymap(lambda x: re.sub(r'(?<![0-9])0,', '', str(x)))

                 WARNING          FAILED
0  4402,43527,7628,54337               0
1                   4402               0
2                      0     5555,6753,0
3                      0  4572,8764,8753
4       1234,56437,76252            9876
5                      0               0
6                   3602       4579,7514

For the test case W-B pointed out:

s = '0,0999,9990,999'
re.sub(r'(?<![0-9])0,', '', s)
#'0999,9990,999'
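
The pattern in this answer only removes a zero that is followed by a comma, so a trailing `,0` (as in `5555,6753,0`) survives. A possible extension, not part of the original answer, is to add a second branch with a negative lookahead. The names `ZERO_ITEM` and `strip_zero_items` are hypothetical:

```python
import re

# Assumed pattern: the original lookbehind branch, plus a branch that removes
# ",0" when the zero is not followed by another digit.
ZERO_ITEM = re.compile(r'(?<![0-9])0,|,0(?![0-9])')

def strip_zero_items(cell):
    """Remove standalone zero entries from a comma-separated cell."""
    return ZERO_ITEM.sub('', str(cell))

samples = ['5555,6753,0', '4572,0,8764,8753', '0,4579,7514', 0, '0,0999,9990,999']
results = [strip_zero_items(s) for s in samples]
print(results)
```

Under the same assumptions, it could be applied to the DataFrame in the same way as above, by passing `strip_zero_items` to `df.applymap`.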
