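I want to search for substrings in the rows of a CSV file. This is what I have so far. I know it does not perform the search, and I am not writing the output correctly.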
import csv

def filterCSVfile (path):
    filterSubstrings = ['signal1', 'signal2']
    csvData = open (path)
    filereader = csv.reader(csvData, delimiter=',')
    rows = [row for row in filereader if row in filterSubstrings]
    outFileHandle = open("output.csv", "w")
    outFileHandle.write(rows)
    outFileHandle.close()

filterCSVfile('history.csv')
Edit

The CSV file contains two columns: a human-readable date and time, and a URL, for example:
2016-02-12 15:37:15,http://www.youtube.com/watch?v=wt60lVB8sHo
2016-02-12 15:37:15,https://www.youtube.com/watch?v=wt60lVB8sHo
2016-02-12 15:54:33,http://kizi.com/games/paintworld-2-monsters
2016-02-12 16:12:56,http://kizi.com/games/u/icycle
2016-02-12 16:13:03,http://kizi.com/games/u/iron-turtle
2016-02-12 16:13:46,http://www.armorgames.com/
2016-02-12 16:13:46,http://armorgames.com/
I want to extract the rows whose URL contains 'signal1' or 'signal2', for example http://signal1.com.

Answer 0 (score: 0)

Replace the line
rows = [row for row in filereader if row in filterSubstrings]
with
rows = [row for row in filereader if any([word in row[1] for word in filterSubstrings])]
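To see why the original condition never matches, here is a minimal sketch comparing the two tests on a single row. The sample row and the variable names are made up for illustration; `csv.reader` yields each row as a list of strings, so `row in filterSubstrings` asks whether that whole list is equal to one of the substrings.

```python
filter_substrings = ['signal1', 'signal2']
row = ['date1', 'http://signal1.com']  # hypothetical row, as csv.reader would yield it

# Original test: is the whole row (a list) equal to one of the strings?
# A list never equals a string, so this is always False.
list_membership = row in filter_substrings

# Corrected test: does any substring occur inside the URL column (index 1)?
substring_match = any(word in row[1] for word in filter_substrings)

print(list_membership, substring_match)
```

The list brackets inside `any([...])` are optional; a generator expression lets `any` stop at the first match.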
Source code
import csv

def filterCSVfile(path):
    filterSubstrings = ['signal1', 'signal2']
    # newline='' is what the csv module recommends when opening files (Python 3)
    with open(path, 'r', newline='') as csvData:
        filereader = csv.reader(csvData, delimiter=',')
        # keep rows whose URL column (index 1) contains any of the substrings
        rows = [row for row in filereader if any(word in row[1] for word in filterSubstrings)]
    with open('output.csv', 'w', newline='') as outFileHandle:
        writer = csv.writer(outFileHandle)  # get a writer object
        writer.writerows(rows)

filterCSVfile('history.csv')
Test

history.csv:
date1,http://signal1.com
2016-02-12 15:37:15,http://www.youtube.com/watch?v=wt60lVB8sHo
2016-02-12 15:37:15,https://www.youtube.com/watch?v=wt60lVB8sHo
date2,http://signal2.com
Output rows:
[['date1', 'http://signal1.com'], ['date2', 'http://signal2.com']]
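The test above can be reproduced without touching the disk. The following is a rough sketch, not the answer's exact code: `filter_rows` and its `url_column` parameter are hypothetical names, and the input is fed through `io.StringIO` in place of a real file. It also skips blank or short rows, on the assumption that a real history.csv may contain them; `row[1]` would raise an `IndexError` on such rows.

```python
import csv
import io

def filter_rows(lines, substrings, url_column=1):
    """Return CSV rows whose URL column contains any of the substrings."""
    matches = []
    for row in csv.reader(lines):
        # Skip blank or short rows instead of failing on row[url_column].
        if len(row) <= url_column:
            continue
        if any(word in row[url_column] for word in substrings):
            matches.append(row)
    return matches

# Same data as the test file above, plus one blank line.
sample = io.StringIO(
    "date1,http://signal1.com\n"
    "2016-02-12 15:37:15,http://www.youtube.com/watch?v=wt60lVB8sHo\n"
    "\n"
    "date2,http://signal2.com\n"
)
result = filter_rows(sample, ['signal1', 'signal2'])
print(result)
```

Because `filter_rows` accepts any iterable of lines, an open file object can be passed in the same way as the `StringIO` buffer.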