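I have a CSV file containing the data below. I need to use a regular expression to treat a string such as 'B08-1506' as the start of a record, running until the next line that matches the same pattern. I want to append the three lines that follow each such string to it, so that each record becomes a single line.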
B08-1506,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
B08-1606,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
B09-0680,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
B09-0681,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
The output should look like this:
B08-1506,324873, st, $0.0,ljkflka,jksdfhjfhjk,jkdsfh
B08-1606,324873, st, $0.0,ljkflka,jksdfhjfhjk,jkdsfh
B09-0680,324873, st, $0.0,ljkflka,jksdfhjfhjk,jkdsfh
B09-0681,324873, st, $0.0,ljkflka,jksdfhjfhjk,jkdsfh
Answer 0 (score: 1)
As Nisarg said, it is best to fix the format of the source CSV. If you cannot, the snippet below may help.
Demo (without regex):
s = """B08-1506,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
B08-1606,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
B09-0680,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
B09-0681,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,"""
res = []
for i in s.split("\n"):
    if i.startswith("B0"):  # Check if the line starts with "B0"
        res.append(i)
    else:  # Otherwise, concatenate it to the previous element in res.
        res[-1] = res[-1] + i
res = [filter(None, i.split(",")) for i in res]  # Filter to remove all empty elements
for i in res:
    print(", ".join(i))
Output:
B08-1506, 324873, st, $0.0, ljkflka, 1 of 37 jksdfhjfhjkjkdsfh
B08-1606, 324873, st, $0.0, ljkflka, 1 of 37 jksdfhjfhjkjkdsfh
B09-0680, 324873, st, $0.0, ljkflka, 1 of 37 jksdfhjfhjkjkdsfh
B09-0681, 324873, st, $0.0, ljkflka, 1 of 37 jksdfhjfhjkjkdsfh
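Note that the output above is not quite what the question asked for: the "1 of 37" text is kept, and "jkdsfh" is fused onto the previous field because the third line of each record has no trailing comma. If you do want the regex route, here is a minimal sketch that reproduces the requested output on the sample data. The two patterns are assumptions read off the sample (record IDs shaped like "B08-1506", and a counter prefix shaped like "1 of 37 "), and `merge_records` is just an illustrative name, so adjust them to your real file.

```python
import re

# Assumed record-start pattern: "B", two digits, "-", four digits (e.g. "B08-1506").
RECORD_START = re.compile(r"^B\d{2}-\d{4}\b")
# Assumed counter prefix such as "1 of 37 ", which the requested output drops.
COUNTER_PREFIX = re.compile(r"^\d+ of \d+\s+")


def merge_records(text):
    """Merge each record-start line with the continuation lines that follow it."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if RECORD_START.match(line):
            records.append([line])  # start a new record
        elif records:
            records[-1].append(line)  # continuation of the current record
        # lines that appear before the first record start are ignored

    merged = []
    for parts in records:
        fields = []
        for part in parts:
            part = COUNTER_PREFIX.sub("", part)
            # Drop empty fields; keep inner spacing such as " st" untouched.
            fields.extend(f for f in part.split(",") if f.strip())
        merged.append(",".join(fields))
    return merged


sample = """B08-1506,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,
B08-1606,324873, st, $0.0,
ljkflka,,,,,
1 of 37 jksdfhjfhjk
jkdsfh,,,,,,,"""

for row in merge_records(sample):
    print(row)
# B08-1506,324873, st, $0.0,ljkflka,jksdfhjfhjk,jkdsfh
# B08-1606,324873, st, $0.0,ljkflka,jksdfhjfhjk,jkdsfh
```

Grouping the lines per record before splitting on commas is what keeps "jksdfhjfhjk" and "jkdsfh" as separate fields. For a real file, you could pass the result of reading the file as text to `merge_records` instead of `sample`.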