Remove rows from a CSV file that contain certain characters

Asked: 2021-04-28 13:06:41

Tags: python string csv row remove

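I want to remove every row from a CSV file that contains a specific string anywhere in the row.

I'd like to write the result to a new output file rather than overwrite the original.

Any row containing "py-board" or "coffee" needs to be removed.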

Example:

Input:

173.20.1.1,2-base
174.28.2.2,2-game
174.27.3.109,xyz-b13-coffee-2
174.28.32.8,2-play
175.31.4.4,xyz-102-o1-py-board
176.32.3.129,xyz-b2-coffee-1
177.18.2.8,six-jump-walk

Expected output:

173.20.1.1,2-base
174.28.2.2,2-game
174.28.32.8,2-play
177.18.2.8,six-jump-walk

I tried the approach from "Deleting rows with Python in a CSV file":

import csv
with open('input_csv_file.csv', 'rb') as inp, open('purged_csv_file', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[1] != "py-board" or if row[1] != "coffee":
            writer.writerow(row)

I also tried:

import csv
with open('input_csv_file.csv', 'rb') as inp, open('purged_csv_file', 'wb') as out:
    writer = csv.writer(out)
    for row in csv.reader(inp):
        if row[1] != "py-board":
            if row[1] != "coffee":
                writer.writerow(row)

And this:

        if row[1][-8:] != "py-board":
            if row[1][-8:] != "coffee-1":
                if row[1][-8:] != "coffee-2":

But I get this error:

  File "C:\testing\syslogyamlclean.py", line 6, in <module>
    for row in csv.reader(inp):
_csv.Error: iterator should return strings, not bytes (did you open the file in text mode?)

2 answers:

Answer 0 (score: 1)

I wouldn't actually use the csv package for this. It can be done easily with plain file reading and writing.

Try this code (I've added some comments to make it self-explanatory):

# We open the source file and get its lines
with open('input_csv_file.csv', 'r') as inp:
    lines = inp.readlines()

# We open the target file in write-mode
with open('purged_csv_file.csv', 'w') as out:
    # We go line by line writing in the target file
    # if the original line does not include the
    # strings 'py-board' or 'coffee'
    for line in lines:
        if 'py-board' not in line and 'coffee' not in line:
            out.write(line)
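
If you would rather stay with the csv module, note that the error in the question comes from opening the files in binary mode ('rb' / 'wb'), which the csv reader in Python 3 rejects. Below is a minimal sketch of a text-mode version, assuming Python 3. The function name purge_rows and the KEYWORDS tuple are illustrative names, not part of the original post.

```python
import csv

# Substrings taken from the question; adjust as needed.
KEYWORDS = ("py-board", "coffee")


def purge_rows(src_path, dst_path, keywords=KEYWORDS):
    """Copy src_path to dst_path, skipping rows where any field contains a keyword."""
    # Text mode with newline='' is what the csv module expects in Python 3.
    with open(src_path, newline="") as inp, open(dst_path, "w", newline="") as out:
        writer = csv.writer(out)
        for row in csv.reader(inp):
            # Substring test on every field, not an equality test on row[1].
            if not any(k in field for field in row for k in keywords):
                writer.writerow(row)
```

Calling purge_rows('input_csv_file.csv', 'purged_csv_file.csv') leaves the original file untouched and writes the filtered rows to the new file. The key difference from the attempts in the question is the use of `in` (substring match) instead of `!=` (exact match).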

Answer 1 (score: 0)

# pandas helps to read and manipulate .csv file
import pandas as pd

# read .csv file
df = pd.read_csv('input_csv_file.csv', sep=',', header=None)
df
              0                    1
0    173.20.1.1               2-base
1    174.28.2.2               2-game
2  174.27.3.109     xyz-b13-coffee-2
3   174.28.32.8               2-play
4    175.31.4.4  xyz-102-o1-py-board
5  176.32.3.129      xyz-b2-coffee-1
6    177.18.2.8        six-jump-walk

# filter rows
result = df[~(df[1].str.contains('py-board') | df[1].str.contains('coffee'))]
print(result)
             0              1
0   173.20.1.1         2-base
1   174.28.2.2         2-game
3  174.28.32.8         2-play
6   177.18.2.8  six-jump-walk

# save to result.csv file
result.to_csv('result.csv', index=False, header=False)
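
The two str.contains calls can also be folded into one regular-expression pattern. The sketch below assumes pandas is installed and reads the sample data from an in-memory string rather than from disk. The names sample, mask, and result are illustrative.

```python
import io

import pandas as pd

# Sample rows from the question, held in memory for the demonstration.
sample = io.StringIO(
    "173.20.1.1,2-base\n"
    "174.28.2.2,2-game\n"
    "174.27.3.109,xyz-b13-coffee-2\n"
    "174.28.32.8,2-play\n"
    "175.31.4.4,xyz-102-o1-py-board\n"
    "176.32.3.129,xyz-b2-coffee-1\n"
    "177.18.2.8,six-jump-walk\n"
)
df = pd.read_csv(sample, header=None)

# One regex with alternation; na=False treats missing values as non-matches.
mask = df[1].str.contains("py-board|coffee", na=False)

# Keep only the rows that did not match.
result = df[~mask]
```

Here `~mask` inverts the boolean Series, so no NumPy import is needed.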