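The main goal is to eliminate rows whose two adjacent numbers are equal (for example, if the first row were T1000,8,8,Error, the original program would eliminate it). Unfortunately, the CSV file sometimes contains rows like the ones shown in my CSV file, and I want to handle those values separately. Here, the first and second rows, both T1000,8,4,Error, add up to T1000,8,8,Error, which is then removed. Likewise, the last three rows, S5214,20,8,Error, S5214,20,4,Error and S5214,20,8,Error, become S5214,20,20,Error, which is then removed. I have done some work on this but can't seem to solve it. Any help is much appreciated!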
with open('target.csv', 'r') as new:
    yes = []
    y = []
    for line in new:
        y.append(line)
        print(line)
    for n in range(0,len(y)-1):
        x1 = int(y[n].split(',')[1]) + int(y[n+1].split(',')[1])
        x2 = int(y[n].split(',')[2]) + int(y[n+1].split(',')[2])
        y1 = int(y[n].split(',')[2])
        y2 = int(y[n].split(',')[1])
        y3 = int(y[n+1].split(',')[2])
        y4 = int(y[n+1].split(',')[1])
        if y[n].split(',')[0] == y[n+1].split(',')[0]:
            if x1 == y1:
                print(x2)
                t = y[n].split(',')
                t[1] = t[2]
                y[n] = ",".join(t)
                print(y[n])
            elif x1 == y2:
                print(x2)
                t = y[n].split(',')
                t[2] = t[1]
                y[n] = ",".join(t)
                print(y[n])
            elif x2 == y1:
                print(x2)
                t = y[n].split(',')
                t[1] = t[2]
                y[n] = ",".join(t)
                print(y[n])
            elif x2 == y2:
                print(x2)
                t = y[n].split(',')
                t[2] = t[1]
                y[n] = ",".join(t)
                print(y[n])
                print(y)
            elif x1 == y3:
                print(x2)
                t = y[n].split(',')
                t[1] = t[2]
                y[n] = ",".join(t)
                print(y[n])
            elif x1 == y4:
                print(x2)
                t = y[n].split(',')
                t[2] = t[1]
                y[n] = ",".join(t)
                print(y[n])
            elif x2 == y3:
                print(x2)
                t = y[n].split(',')
                t[1] = t[2]
                y[n] = ",".join(t)
                print(y[n])
            elif x2 == y4:
                print(x2)
                t = y[n].split(',')
                t[2] = t[1]
                y[n] = ",".join(t)
                print(y[n])
My CSV file looks like this:
T1000,8,4,Error
T1000,8,4,Error
S1234,2,4,Error
C1234,3,2,Error
S1348,4,2,Error
S5214,20,8,Error
S5214,20,4,Error
S5214,20,8,Error
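Answer 0 (score: 1)
First, where you can help it, avoid reading and processing inside the same context manager; it makes the code harder to read. (Flat is better than nested, after all.)
To begin, some setup. You can ignore this part.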
inp='''
T1000,8,4,Error
T1000,8,4,Error
S1234,2,4,Error
C1234,3,2,Error
S1348,4,2,Error
S5214,20,8,Error
S5214,20,4,Error
S5214,20,8,Error
'''.strip().splitlines(True)
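The block above gives you the input as a list of lines, just as reading the file would. In your case, use the following instead.
with open("target.txt","r") as f:
    inp = f.readlines()
Now you need to collect all lines that start with the same key and sum their values. We can use a defaultdict for this.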
from collections import defaultdict
temp = defaultdict(int) #this makes the default value 0.
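Now sum the values. At this step it also helps to store each line's second field in the key itself; that is the value the sum will be checked against later.
for line in inp:
    k1, k2, v1, v2 = line.strip().split(',')
    temp[k1, int(k2)] += int(v1)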
#Output:
defaultdict(int,
            {('T1000', 8): 8,
             ('S1234', 2): 4,
             ('C1234', 3): 2,
             ('S1348', 4): 2,
             ('S5214', 20): 20})
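The summing step above combines rows with the same key wherever they appear in the file. If rows should only be combined when they sit next to each other (an assumption; the question does not state this explicitly), `itertools.groupby` could do the grouping instead. A minimal sketch, where `sum_consecutive` is a hypothetical helper name and the last sample row is added only to show the difference:

```python
from itertools import groupby


def sum_consecutive(lines):
    """Sum the third field over each run of consecutive lines
    that share the same first two fields.

    Returns a list of (name, target, total) tuples, one per run.
    """
    def key(line):
        name, target, _value, _status = line.strip().split(",")
        return name, int(target)

    runs = []
    # groupby only merges neighbours, so a key that reappears later
    # starts a new run instead of being added to the earlier one.
    for (name, target), group in groupby(lines, key=key):
        total = sum(int(line.split(",")[2]) for line in group)
        runs.append((name, target, total))
    return runs


sample = [
    "T1000,8,4,Error\n",
    "T1000,8,4,Error\n",
    "S1234,2,4,Error\n",
    "T1000,8,4,Error\n",  # same key again, but not adjacent to the first run
]
print(sum_consecutive(sample))
# [('T1000', 8, 8), ('S1234', 2, 4), ('T1000', 8, 4)]
```

With the defaultdict approach, the three T1000 rows would instead be summed together to 12.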
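Now get the names of the keys whose summed value equals the second field stored in the key. These are the ones to remove.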
to_remove = [k[0] for k, v in temp.items() if k[1] == v]
#Output:
['T1000', 'S5214']
Finally, write the filter condition according to the output you want. I am assuming you want all other lines left unchanged.
output = [line for line in inp if not any(line.startswith(s) for s in to_remove)]
#Output:
['S1234,2,4,Error\n', 'C1234,3,2,Error\n', 'S1348,4,2,Error\n']
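One caveat with `startswith`: it is a prefix test, so a name such as `T1000` would also match a line for a different ID like `T10001`. A possible alternative is to compare the parsed fields exactly. In this sketch, `filter_lines` is a hypothetical helper, and `T10001` is an invented ID used only to show the difference:

```python
from collections import defaultdict


def filter_lines(lines):
    """Drop every line whose (name, target) group sums to its target."""
    totals = defaultdict(int)
    for line in lines:
        name, target, value, _status = line.strip().split(",")
        totals[name, int(target)] += int(value)

    # Keep the full (name, target) pair so the comparison below is exact.
    to_remove = {key for key, total in totals.items() if key[1] == total}

    kept = []
    for line in lines:
        name, target, _value, _status = line.strip().split(",")
        if (name, int(target)) not in to_remove:
            kept.append(line)
    return kept


lines = [
    "T1000,8,4,Error\n",
    "T1000,8,4,Error\n",
    "T10001,8,4,Error\n",  # invented ID sharing the prefix "T1000"
]
print(filter_lines(lines))
# ['T10001,8,4,Error\n']
```

Using a set for the keys to remove also keeps each membership check cheap when the file is large.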
Then just join the lines and write them back to a file.
with open("output.txt", "w") as f:
    f.write("".join(output))
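Putting the steps together, the whole procedure could be wrapped in one function. This is a sketch under assumptions, not the only way to do it: the name `remove_balanced_rows` is hypothetical, every non-blank row is assumed to have exactly four fields, and the standard `csv` module replaces the manual `split` and `join`.

```python
import csv
from collections import defaultdict


def remove_balanced_rows(src_path, dst_path):
    """Copy src_path to dst_path, skipping every group of rows whose
    third-field values sum to the group's second field.

    Returns the list of rows that were kept.
    """
    # Read first, then process outside the context manager.
    with open(src_path, newline="") as f:
        rows = [row for row in csv.reader(f) if row]  # skip blank lines

    # Sum the third field per (name, target) pair.
    totals = defaultdict(int)
    for name, target, value, _status in rows:
        totals[name, int(target)] += int(value)

    # Keep only rows whose group total differs from its target.
    kept = [row for row in rows if totals[row[0], int(row[1])] != int(row[1])]

    with open(dst_path, "w", newline="") as f:
        csv.writer(f).writerows(kept)
    return kept
```

Called as `remove_balanced_rows("target.txt", "output.txt")` on the sample data from the question, this should keep only the S1234, C1234 and S1348 rows.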