Replace every N specific entries in a list

Time: 2017-05-29 19:46:19

Tags: python

I have the following list of strings:

['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']

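This was parsed from a very messy .txt file. Each group of "blank" entries corresponds to a zero, but I need those zeros recorded as 999 (basically, I need to replace each group of 4 consecutive '' with '999'). What is the most Pythonic way to do this?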

5 answers:

Answer 0 (score: 1)

>>> from itertools import groupby
... 
... 
... def group_blanks_by_n(lst, n=4):
...     result = []
...     for k, g in groupby(lst):
...         if k == '':
...             quo, rem = divmod(sum(1 for _ in g), n)
...             result.extend(['999'] * quo)
...             result.extend([''] * rem)
...         else:
...             result.extend(g)
...     return result
... 
>>> test = ['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']
>>> group_blanks_by_n(test, n=4)
['17', ' 5', ' 6', ' 0', ' 0', '999', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '999', '999', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '999', '999', '999', '999', '999', ' 28.68', '999', '999', '999', '999', '999', '999', '999', '999', '', '1.40']

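Edit:

Added the n parameter to account for different group sizes (it does not have to default to 4; that value was chosen only to match the question).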
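To see how n interacts with leftover blanks, here is a minimal sketch that repeats the function above and runs it on a short list. The `sample` list is made up for illustration and is not data from the question.

```python
from itertools import groupby


def group_blanks_by_n(lst, n=4):
    result = []
    for k, g in groupby(lst):
        if k == '':
            # Count the run of blanks, then split it into full groups and leftovers.
            quo, rem = divmod(sum(1 for _ in g), n)
            result.extend(['999'] * quo)
            result.extend([''] * rem)
        else:
            result.extend(g)
    return result


# Hypothetical sample: a run of 5 blanks between two values.
sample = ['a', '', '', '', '', '', 'b']

by_two = group_blanks_by_n(sample, n=2)   # 5 blanks: 2 groups of 2, 1 left over
by_four = group_blanks_by_n(sample)       # 5 blanks: 1 group of 4, 1 left over
print(by_two)
print(by_four)
```

In both calls the one blank that does not fill a complete group stays in the result as ''.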

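Answer 1 (score: 1)

Another approach is to convert the list to a string with join(), replace each group of blanks with 999, and then convert it back to a list with split():
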
a = ['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']

b = '*'.join(a).replace(4*'*',' 999 ').replace('*','')
c = b.split()
print(c)

['17', '5', '6', '0', '0', '999', '10.11', '10.57', '18.34', '16.41', '13.23', '11.55', '11.56', '999', '999', '12.77', '11.99', '21.88', '22.46', '26.82', '25.71', '27.43', '27.73', '29.44', '999', '999', '999', '999', '999', '28.68', '999', '999', '999', '999', '999', '999', '999', '999', '1.40']
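The intermediate strings make it easier to see why this works. The sketch below walks through the same steps on a short, made-up list (`sample` is an assumption for illustration).

```python
# Hypothetical short list: one value, a run of 4 blanks, another value.
sample = [' 0', '', '', '', '', ' 10.11']

joined = '*'.join(sample)                    # 5 separators surround the 4 blanks
replaced = joined.replace(4 * '*', ' 999 ')  # each run of 4 '*' becomes ' 999 '
cleaned = replaced.replace('*', '')          # drop whatever '*' is left over
result = cleaned.split()                     # split on whitespace

print(repr(joined))
print(result)
```

Because split() splits on whitespace, the leading spaces of the values are stripped and any leftover blanks disappear. The approach also assumes that no value contains '*' or internal whitespace.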

Answer 2 (score: 0)

Using izip_longest (called zip_longest in Python 3):

Code:

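import itertools as it

# data is the list of strings from the question
new_list = []
N = 4
blanks = ('',) * N
# zip_longest is the Python 3 name; on Python 2 use it.izip_longest
an_iter = it.zip_longest(*[data[i:] for i in range(N)])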
for x in an_iter:
    if x == blanks:
        new_list.append('999')
        for i in range(N-1):
            next(an_iter)
    else:
        new_list.append(x[0])

Result:

['17', ' 5', ' 6', ' 0', ' 0', '999', ' 10.11', ' 10.57', ' 18.34', 
 ' 16.41', ' 13.23', ' 11.55', ' 11.56', '999', '999', ' 12.77', ' 11.99',
 ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44',
 '999', '999', '999', '999', '999', ' 28.68', '999', '999', '999', '999',
 '999', '999', '999', '999', '', '1.40']
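The key step is zipping the list against shifted copies of itself, which yields every window of N consecutive items. A minimal sketch of just that step, using a short made-up list in place of `data` and the Python 3 name zip_longest:

```python
from itertools import zip_longest  # izip_longest on Python 2

# Hypothetical short list standing in for `data`.
data = ['a', '', '', '', '', 'b']
N = 4

# Zipping the list against itself shifted by 0..N-1 positions yields every
# window of N consecutive items; windows near the end are padded with None.
windows = list(zip_longest(*[data[i:] for i in range(N)]))
for w in windows:
    print(w)
```

Only the second window equals ('', '', '', ''), so that is where the loop above appends '999' and then skips the next N-1 windows.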

Answer 3 (score: 0)

Here is a small function that does what you need.

a = ['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']

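def f(items):
    r = []
    c = 0  # number of consecutive blanks seen so far
    for item in items:
        if item == '':
            c += 1
            if c == 4:
                # a full group of 4 blanks becomes one '999'
                r.append('999')
                c = 0
        else:
            # a value ends the current run; fewer than 4 leftover blanks are dropped
            c = 0
            r.append(item)
    return r

print(f(a))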

['17', ' 5', ' 6', ' 0', ' 0', '999', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '999', '999', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '999', '999', '999', '999', '999', ' 28.68', '999', '999', '999', '999', '999', '999', '999', '999', '1.40']
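Note that the function above drops blanks that do not fill a complete group of 4. If those should be kept instead, one possible variant is sketched below. The name `replace_blank_runs` and its `n` and `marker` parameters are hypothetical and are not part of the original answer.

```python
def replace_blank_runs(items, n=4, marker='999'):
    """Hypothetical variant: keep blanks that do not fill a full group."""
    result = []
    pending = 0  # blanks seen since the last value or completed group
    for item in items:
        if item == '':
            pending += 1
            if pending == n:
                result.append(marker)
                pending = 0
        else:
            result.extend([''] * pending)  # flush leftover blanks before the value
            pending = 0
            result.append(item)
    result.extend([''] * pending)  # leftover blanks at the very end
    return result


# Hypothetical sample: 5 blanks, a value, then 2 trailing blanks.
sample = ['a', '', '', '', '', '', 'b', '', '']
kept = replace_blank_runs(sample)
print(kept)
```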

Answer 4 (score: 0)

The most Pythonic way, I guess?

from itertools import groupby, repeat

L = ['17', ' 5', '6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']

def your_function(L):
    grouped_L = [(k, len(list(g))) for k, g in groupby(L)]
    final_list = [item
                  for x, y in grouped_L
                  for item in repeat(x if y < 4 else '999', y if y < 4 else y // 4)]
    return final_list

print(your_function(L))

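This uses groupby and repeat from itertools. The first comprehension,

[(k, len(list(g))) for k, g in groupby(L)]

produces a list of tuples like this:

[('17', 1), (' 5', 1), ('6', 1), (' 0', 2), ('', 4), ...] and so on,

where each output tuple is (item, number_of_its_consecutive_occurrences).

Then a list comprehension is used again.

Note: (x, y) => (item, number_of_its_consecutive_occurrences)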

final_list = [item
              for x, y in grouped_L
              for item in repeat(x if y < 4 else '999', y if y < 4 else y // 4)]
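
Running the two steps on a short list shows what each comprehension produces. The `sample` list below is made up for illustration, and the variable names `grouped` and `final` are only for this sketch.

```python
from itertools import groupby, repeat

# Hypothetical short list: two equal values, then a run of 4 blanks.
sample = ['17', ' 0', ' 0', '', '', '', '', '1.40']

# Step 1: collapse consecutive equal items into (item, run_length) pairs.
grouped = [(k, len(list(g))) for k, g in groupby(sample)]

# Step 2: expand each pair again; runs of 4 or more become y // 4 copies of '999'.
final = [item
         for x, y in grouped
         for item in repeat(x if y < 4 else '999', y if y < 4 else y // 4)]

print(grouped)
print(final)
```

The condition tests only the run length y, so any value repeated 4 or more times in a row would also be replaced, and blanks beyond a multiple of 4 are dropped. Checking x == '' as well would be one way to restrict the replacement to blanks.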