This is my file:
#This is TEST-data
2020-09-07T00:00:03.230+02:00,ID-10,3,Lon,Man,Lon,1,1,1
2020-09-07T00:00:03.230+02:00,ID-10,3,Lon,Lon,Man,1,1
2020-09-07T00:00:03.230+02:00,ID-20,2,Lon,Lon,1,1
2020-09-07T00:00:03.230+02:00,ID-20,2,Lon,Lon,1
2020-09-07T00:00:03.230+02:00,ID-30,3,Mad,Sev,Sev,1,1,1
2020-09-07T00:00:03.230+02:00,ID-30,GGG,Mad,Sev,Mad1
2020-09-07T00:00:03.230+02:00,ID-40,GGG,Mad,Bar,1,1,1,1
2020-09-07T00:00:03.230+02:00
2020-09-07T00:00:03.230+02:00
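When I run the code below, I get no output at all. That is probably because my code does not know that Man stands for Manchester and Sev stands for Sevilla.
I think the problem is in condition_1.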
import csv

path = r'c:\data\ELK\Desktop\test_data_countries.txt'
cities_to_filter = ['Sevilla', 'Manchester']

def filter_row(row):
    if len(row) > 2 and row[2].isdigit():
        amount_of_cities = int(row[2])
        cities_to_check = row[3:3+amount_of_cities]
        condition_1 = any(city in cities_to_check for city in cities_to_filter)
        return condition_1

with open(path, 'r') as output_file:
    reader = csv.reader(output_file, delimiter=',')
    next(reader)
    for row in reader:
        if filter_row(row):
            print(row)
This is my expected output:
2020-09-07T00:00:03.230+02:00,ID-10,3,Lon,Man,Lon,1,1,1
2020-09-07T00:00:03.230+02:00,ID-10,3,Lon,Lon,Man,1,1
2020-09-07T00:00:03.230+02:00,ID-30,3,Mad,Sev,Sev,1,1,1
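Answer 0 (score: 0)
You can split each line with the .split() function, which splits a string on the separator you pass it; if you pass no argument, it splits on whitespace. It returns a list, so assign the result to a variable. Then check whether "Man" or "Sev" is in that list. The comparison is case-sensitive, so the strings must be written exactly as they appear in the file.
with open(path) as file:  # path as defined in the question
    for line in file:
        myList = line.strip().split(",")
        if "Man" in myList or "Sev" in myList:
            print(line.strip())  # handle the matching line here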
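Answer 1 (score: 0)
The problem is that you are trying to match the string Man against Manchester. Your condition checks whether the full name is one of the items in cities_to_check, and the file only contains the abbreviations, so it is never true.
You can use the following to match on the first three characters only: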
import csv

path = 'Pdata.txt'
cities_to_filter = ['Sevilla', 'Manchester']

def filter_row(row):
    if len(row) > 2 and row[2].isdigit():
        amount_of_cities = int(row[2])
        cities_to_check = row[3:3+amount_of_cities]
        #print(cities_to_check)
        condition_1 = any(city[:3] in cities_to_check for city in cities_to_filter)
        return condition_1

with open(path, 'r') as output_file:
    reader = csv.reader(output_file, delimiter=',')
    next(reader)
    for row in reader:
        if filter_row(row):
            print(row)
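As a quick check of the three-character approach, here is a minimal sketch that runs the same filter over the sample rows from the question, read from an in-memory string instead of a file. The names SAMPLE and filter_rows are made up for this illustration, and the explicit return False is my addition so the function always returns a boolean.

```python
import csv
import io

# Sample rows copied from the question; an in-memory stand-in for the real file.
SAMPLE = """#This is TEST-data
2020-09-07T00:00:03.230+02:00,ID-10,3,Lon,Man,Lon,1,1,1
2020-09-07T00:00:03.230+02:00,ID-10,3,Lon,Lon,Man,1,1
2020-09-07T00:00:03.230+02:00,ID-20,2,Lon,Lon,1,1
2020-09-07T00:00:03.230+02:00,ID-20,2,Lon,Lon,1
2020-09-07T00:00:03.230+02:00,ID-30,3,Mad,Sev,Sev,1,1,1
2020-09-07T00:00:03.230+02:00,ID-30,GGG,Mad,Sev,Mad1
2020-09-07T00:00:03.230+02:00,ID-40,GGG,Mad,Bar,1,1,1,1
2020-09-07T00:00:03.230+02:00
2020-09-07T00:00:03.230+02:00
"""

cities_to_filter = ['Sevilla', 'Manchester']

def filter_row(row):
    # Only rows whose third field is a numeric city count are considered.
    if len(row) > 2 and row[2].isdigit():
        amount_of_cities = int(row[2])
        cities_to_check = row[3:3 + amount_of_cities]
        # Compare the first three letters of each full name with the codes.
        return any(city[:3] in cities_to_check for city in cities_to_filter)
    return False

def filter_rows(text):
    # Hypothetical helper: parse the text as CSV and keep the matching rows.
    reader = csv.reader(io.StringIO(text), delimiter=',')
    next(reader)  # skip the comment line at the top
    return [row for row in reader if filter_row(row)]

matches = filter_rows(SAMPLE)
for row in matches:
    print(','.join(row))
```

With the sample data this prints the two ID-10 rows and the first ID-30 row, which is the expected output given in the question.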
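To match any part of the name rather than just the first three characters, you can use a list comprehension over the two lists to find every full name that contains one of the codes.
So if cities_to_check is ["Man", "vil"], then matching will contain both names, ['Sevilla', 'Manchester'], and you can use len(matching) != 0 as the return condition to get the desired result.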
cities_to_filter = ['Sevilla', 'Manchester']
cities_to_check = ["Man", "vil"]
matching = [city2 for city2 in cities_to_filter if any(city1 in city2 for city1 in cities_to_check)]
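One way this substring idea could be plugged back into filter_row is sketched below. It is an illustration rather than a definitive version: the `code and` guard is my addition, because an empty string counts as a substring of every string in Python, and the row values in the two calls are shortened placeholders.

```python
cities_to_filter = ['Sevilla', 'Manchester']

def filter_row(row):
    if len(row) > 2 and row[2].isdigit():
        amount_of_cities = int(row[2])
        cities_to_check = row[3:3 + amount_of_cities]
        # Keep every full name that contains at least one non-empty code.
        matching = [city for city in cities_to_filter
                    if any(code and code in city for code in cities_to_check)]
        return len(matching) != 0
    return False

print(filter_row(['ts', 'ID-10', '3', 'Lon', 'Man', 'Lon', '1', '1', '1']))  # True
print(filter_row(['ts', 'ID-20', '2', 'Lon', 'Lon', '1', '1']))              # False
```

Substring matching is looser than the three-character prefix: a very short code such as "a" would match almost any city name, so it is probably only safe when the codes in the file are reasonably distinctive.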