I have the following list of strings:
['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']
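This was parsed from a very messy .txt file. Each group of "blank" entries corresponds to a zero, but I need to record those zeros as 999 (basically, I need to replace every group of 4 consecutive '' with a single '999'). What is the most Pythonic way to do this?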
Answer 0 (score: 1)
>>> from itertools import groupby
>>> def group_blanks_by_n(lst, n=4):
...     result = []
...     # groupby yields one (key, group) pair per run of equal items
...     for k, g in groupby(lst):
...         if k == '':
...             # every n blanks become one '999'; leftover blanks are kept
...             quo, rem = divmod(sum(1 for _ in g), n)
...             result.extend(['999'] * quo)
...             result.extend([''] * rem)
...         else:
...             result.extend(g)
...     return result
...
>>> test = ['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']
>>> group_blanks_by_n(test, n=4)
['17', ' 5', ' 6', ' 0', ' 0', '999', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '999', '999', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '999', '999', '999', '999', '999', ' 28.68', '999', '999', '999', '999', '999', '999', '999', '999', '', '1.40']
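Edit:
Added an n parameter to account for different group sizes (it does not have to default to 4; that value was chosen only to match the problem statement).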
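To make the effect of the n parameter and the divmod step concrete, here is a small sketch that repeats the function above as a plain script and runs it on a short made-up list (the `sample` data is purely illustrative, not from the question). A run of five blanks with n=4 yields one '999' plus one leftover blank, while n=2 yields two '999' entries plus one leftover blank.

```python
from itertools import groupby


def group_blanks_by_n(lst, n=4):
    result = []
    for k, g in groupby(lst):
        if k == '':
            # quo full groups become '999'; rem leftover blanks are kept
            quo, rem = divmod(sum(1 for _ in g), n)
            result.extend(['999'] * quo)
            result.extend([''] * rem)
        else:
            result.extend(g)
    return result


# Illustrative data: a run of 5 blanks and a run of 2 blanks.
sample = ['a', '', '', '', '', '', 'b', '', '']
default_result = group_blanks_by_n(sample)
pairs_result = group_blanks_by_n(sample, n=2)
print(default_result)  # ['a', '999', '', 'b', '', '']
print(pairs_result)    # ['a', '999', '999', '', 'b', '999']
```

Runs shorter than n pass through unchanged, which is why the trailing two blanks survive with the default n=4.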
Answer 1 (score: 1)
Another approach is to use join() to turn the list into a string, replace each run of four blanks with 999, and then use split() to turn it back into a list:
a = ['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']
b = '*'.join(a).replace(4*'*',' 999 ').replace('*','')
c = b.split()
print(c)
['17', '5', '6', '0', '0', '999', '10.11', '10.57', '18.34', '16.41', '13.23', '11.55', '11.56', '999', '999', '12.77', '11.99', '21.88', '22.46', '26.82', '25.71', '27.43', '27.73', '29.44', '999', '999', '999', '999', '999', '28.68', '999', '999', '999', '999', '999', '999', '999', '999', '1.40']
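One caveat: the one-liner above only keeps neighbouring numbers apart because the parsed values happen to carry leading spaces; two adjacent values without them (for example '5' followed by '6') would be glued together once the '*' separators are removed. The sketch below is one possible variant, not the answerer's code: the helper name `replace_blank_runs` and its `marker` parameter are hypothetical, and it assumes no real value contains the marker character. It also treats whitespace-only entries as blanks, which is a change from the original.

```python
def replace_blank_runs(items, n=4, marker='*'):
    # Assumption: no real value contains the marker character.
    # Each blank becomes its own marker token; real values are stripped.
    tokens = [item.strip() or marker for item in items]
    # n consecutive marker tokens, e.g. '* * * *' for n=4
    pattern = ' '.join([marker] * n)
    text = ' '.join(tokens).replace(pattern, '999').replace(marker, '')
    return text.split()


# The original one-liner merges neighbours that lack leading spaces:
merged = '*'.join(['5', '6']).replace('*', '').split()
print(merged)  # ['56']

# Illustrative data without leading spaces.
values = ['17', '5', '6', '', '', '', '', '10.11', '', '', '', '', '', '1.40']
cleaned = replace_blank_runs(values)
print(cleaned)  # ['17', '5', '6', '999', '10.11', '999', '1.40']
```

As with the original, leftover blanks that do not fill a complete group are dropped by the final split().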
Answer 2 (score: 0)
Using izip_longest (named zip_longest in Python 3):
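import itertools as it

# `data` is the list of strings from the question.
new_list = []
N = 4
blanks = ('',) * N
# Sliding windows of N consecutive items, padded with None at the end.
# zip_longest is the Python 3 name; on Python 2 use it.izip_longest.
an_iter = it.zip_longest(*[data[i:] for i in range(N)])
for x in an_iter:
    if x == blanks:
        new_list.append('999')
        # Skip the N-1 windows that overlap the group just replaced.
        for i in range(N - 1):
            next(an_iter)
    else:
        new_list.append(x[0])
print(new_list)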
['17', ' 5', ' 6', ' 0', ' 0', '999', ' 10.11', ' 10.57', ' 18.34',
' 16.41', ' 13.23', ' 11.55', ' 11.56', '999', '999', ' 12.77', ' 11.99',
' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44',
'999', '999', '999', '999', '999', ' 28.68', '999', '999', '999', '999',
'999', '999', '999', '999', '', '1.40']
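To see why comparing each tuple against `blanks` works, it may help to print the sliding windows that zip_longest produces. The following is a small sketch on a made-up six-item list (the data is illustrative only), using the Python 3 name zip_longest.

```python
from itertools import zip_longest

# Illustrative data: one value, a run of four blanks, one value.
data = ['a', '', '', '', '', 'b']
N = 4

# Each window starts one position later; short tails are padded with None.
windows = list(zip_longest(*[data[i:] for i in range(N)]))
for w in windows:
    print(w)
# ('a', '', '', '')
# ('', '', '', '')       <- equals blanks, so it becomes '999'
# ('', '', '', 'b')
# ('', '', 'b', None)
# ('', 'b', None, None)
# ('b', None, None, None)
```

After the all-blank window is found, the next three windows still overlap the same four blanks, which is why the loop in the answer advances the iterator N-1 times before continuing.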
Answer 3 (score: 0)
Here is a small function that does what you need.
a = ['17', ' 5', ' 6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']
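def f(items):
    r = []
    c = 0  # number of consecutive blanks seen so far
    for item in items:
        if item == '':
            c += 1
            if c == 4:
                # a full group of four blanks becomes a single '999'
                r.append('999')
                c = 0
        else:
            # a non-blank item resets the count; leftover blanks are dropped
            c = 0
            r.append(item)
    return r

print(f(a))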
['17', ' 5', ' 6', ' 0', ' 0', '999', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '999', '999', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '999', '999', '999', '999', '999', ' 28.68', '999', '999', '999', '999', '999', '999', '999', '999', '1.40']
Answer 4 (score: 0)
The most Pythonic way, I guess?
from itertools import groupby, repeat
L = ['17', ' 5', '6', ' 0', ' 0', '', '', '', '', ' 10.11', ' 10.57', ' 18.34', ' 16.41', ' 13.23', ' 11.55', ' 11.56', '', '', '', '', '', '', '', '', ' 12.77', ' 11.99', ' 21.88', ' 22.46', ' 26.82', ' 25.71', ' 27.43', ' 27.73', ' 29.44', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', ' 28.68', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '', '1.40']
def your_function(L):
    grouped_L = [(k, len(list(g))) for k, g in groupby(L)]
    final_list = [item
                  for x, y in grouped_L
                  for item in repeat(x if y < 4 else '999', y if y < 4 else y // 4)]
    return final_list

print(your_function(L))
This uses groupby and repeat from itertools. The first comprehension produces a list of tuples like this:
[(k, len(list(g))) for k, g in groupby(L)]
[('17', 1), (' 5', 1), ('6', 1), (' 0', 2), ('', 4), ...] and so on,
where each element is a tuple => (item, number_of_its_consecutive_occurrences)
Then a second list comprehension builds the result.
Note: (x, y) => (item, number_of_its_consecutive_occurrences)
final_list = [item
              for x, y in grouped_L
              for item in repeat(x if y < 4 else '999', y if y < 4 else y // 4)]
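Note that the comprehension above tests only the run length y, so a non-blank value repeated four or more times in a row (say four consecutive ' 0' entries) would also be turned into '999'. The sample data has no such run, but a guarded variant might look like the sketch below. The function name `replace_blank_groups` and the `n` parameter are hypothetical additions, not part of the answer.

```python
from itertools import groupby, repeat


def replace_blank_groups(values, n=4):
    # (item, run_length) for every run of equal consecutive items
    grouped = [(k, len(list(g))) for k, g in groupby(values)]
    return [item
            for x, y in grouped
            # only runs of blanks with at least n entries are replaced
            for item in (repeat('999', y // n) if x == '' and y >= n
                         else repeat(x, y))]


# Illustrative data: four repeated zeros followed by four blanks.
values = [' 0', ' 0', ' 0', ' 0', '', '', '', '', '1.40']
guarded = replace_blank_groups(values)
print(guarded)  # [' 0', ' 0', ' 0', ' 0', '999', '1.40']
```

Like the answer it is based on, this drops leftover blanks from runs of n or more (y // n discards the remainder) and keeps runs shorter than n unchanged.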