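I have a csv file containing a dataset (addresses, in this case). I want to make a second csv file that contains only the entries that have one of a set of phrases in a particular column. For example, I want to return everyone who currently lives in "Viridian", but not people who used to live there or never lived there.
The sample data is: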
First Name,Second Name,ID,Home Town,County,Current Town,Street
Sam,Smith,1234,Pallet,North,Orange,Lemon
Jenny,Walton,1456,Viridian,West,York,High View
Alan,Kirk,2378,Orange,West,Viridian,High street
Reese,Small,9840,Minsk,East,Viridian,Ocean Avenue
Audry,Owen,7865,York,South,Blackmarsh,8th Street
Marco,Jefferson,1580,Amsterdam,Central,Oxford,Church Road
Jim,Lowe,5218,Windy City,East,Windy City,Oak
Gillian,Pope,3217,Rome,Central,Rome,Low road
I previously used this code:
    town = ["Viridian", "Rome"]
    with open("addresses.csv",) as oldfile, open("Filtered addresses.csv", "w") as newfile:
        for line in oldfile:
            if any(town in line.strip().lower() for town in town):
                newfile.write(line)
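However, this returns rows that have one of the specified towns in any column. I only want rows that have one of the specified towns in the "Current Town" column.
I tried the following: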
    import csv
    town = ["Viridian", "Rome"]
    with open("Filtered addresses.csv", "w", encoding="Latin-1") as newfile:
        reader = csv.reader(open("addresses.csv", 'r', encoding="Latin-1"))
        for data in reader:
            if any(town in data[6] for town in town):
                newfile.write(data)
But this results in an error:
TypeError: write() argument must be str, not list
When I change the code to the following:
newfile.write(str(data))
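it returns some entries, but they are all written on a single line instead of one row per line.
What is the best way to achieve my goal? I want to keep the whole row of data in each case.
Thanks!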
Answer 0 (score: 1)
Pandas makes this very simple:
    import pandas as pd

    town = ["Viridian", "Rome"]
    # Read csv as pandas dataframe
    original = pd.read_csv("addresses.csv", index_col=False)
    # Select rows where `Current Town` column's value is in `town`
    filtered = original[original['Current Town'].isin(town)]
    # Save the filtered dataframe to a file (index=False omits the row-index column)
    filtered.to_csv("Filtered addresses.csv", index=False)
If pandas is not installed, you can easily install it by running:
pip install pandas
on your command line.
Answer 1 (score: -1)
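    import csv

    town = ["Viridian", "Rome"]
    # newline="" lets the csv module control line endings (avoids blank rows on Windows)
    with open("addresses.csv", "r", encoding="Latin-1", newline="") as oldfile, \
         open("Filtered addresses.csv", "w", encoding="Latin-1", newline="") as newfile:
        reader = csv.reader(oldfile)
        csvwriter = csv.writer(newfile)
        csvwriter.writerow(next(reader))  # copy the header row
        for data in reader:
            # "Current Town" is the sixth column, i.e. index 5 (index 6 is "Street")
            if any(t in data[5] for t in town):
                csvwriter.writerow(data)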
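The TypeError in the question arises because a file object's write() accepts only a string, whereas csv.reader yields each row as a list; a csv writer turns the list back into a properly delimited line. Below is a minimal sketch of the same idea that selects the column by its header name rather than by position, so a miscounted index cannot slip in. It assumes the header row shown in the question; the helper name `filter_rows` and the in-memory `io.StringIO` objects are purely illustrative stand-ins for the real files, and the match is an exact comparison after stripping whitespace.

```python
import csv
import io

# A few sample rows copied from the question; io.StringIO stands in for the real files.
SAMPLE = """First Name,Second Name,ID,Home Town,County,Current Town,Street
Sam,Smith,1234,Pallet,North,Orange,Lemon
Jenny,Walton,1456,Viridian,West,York,High View
Alan,Kirk,2378,Orange,West,Viridian,High street
Reese,Small,9840,Minsk,East,Viridian,Ocean Avenue
Gillian,Pope,3217,Rome,Central,Rome,Low road
"""


def filter_rows(infile, outfile, column, wanted):
    """Copy the header plus every row whose `column` value is in `wanted`.

    Hypothetical helper for illustration; returns the number of rows kept.
    """
    reader = csv.DictReader(infile)
    writer = csv.DictWriter(outfile, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    kept = 0
    for row in reader:
        # Exact match on the named column only; other columns are ignored.
        if row[column].strip() in wanted:
            writer.writerow(row)
            kept += 1
    return kept


towns = {"Viridian", "Rome"}
source = io.StringIO(SAMPLE)
result = io.StringIO()
count = filter_rows(source, result, "Current Town", towns)
print(count)
print(result.getvalue())
```

With real files, the two StringIO objects would be replaced by files opened with newline="" (and whatever encoding the data uses), as in the code above. Jenny Walton's row is skipped here because "Viridian" appears only in her "Home Town" column.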