data = open("state_towns.txt")
for line in data:
    print(line)
This returns the following list:
Colorado[edit]
Alamosa (Adams State College)[2]
Boulder (University of Colorado at Boulder)[12]
Durango (Fort Lewis College)[2]
Connecticut[edit]
Fairfield (Fairfield University, Sacred Heart University)
Middletown (Wesleyan University)
New Britain (Central Connecticut State University)
I want to return a DataFrame with two columns, State and Town, like this:
         State         Town
0     Colorado      Alamosa
1     Colorado      Boulder
2     Colorado      Durango
3  Connecticut    Fairfield
4  Connecticut   Middletown
5  Connecticut  New Britain
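How do I split the list so that any line containing "[edit]" goes into the State column?
How do I remove all the text in parentheses from the town entries?
Thanks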
Answer 0 (score: 0)
import pandas as pd

d = {"State": [], "Town": []}  # dictionary to hold the data
state = ""  # placeholder for the most recent state header
with open("state_towns.txt") as data:
    for line in data:
        line = line.strip()  # drop the trailing newline
        if not line:
            continue  # skip blank lines
        if "[edit]" in line:
            state = line.replace("[edit]", "")  # a state header: remember it
        elif state != "":
            # keep only the text before the first "(", so "New Britain" stays intact
            town = line.split("(")[0].strip()
            d["State"].append(state)
            d["Town"].append(town)
df = pd.DataFrame(d)
print(df)
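It's a bit clunky, but it does the job.
A placeholder holds the current state, and the town is set inside the loop. When both are set, they are appended to the dictionary; once the loop finishes, the dictionary is converted to a DataFrame.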
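For the second part of the question (removing the text in parentheses), a regular expression is another option. The following is only a sketch, not the answer's original method: the function name parse_state_towns and the in-memory sample list are made up for illustration, and it assumes parentheses are never nested and that footnote markers always look like [2] or [12].

```python
import re

def parse_state_towns(lines):
    """Return a list of (state, town) tuples from lines in the question's format."""
    rows = []
    state = None  # most recent state header seen so far
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.endswith("[edit]"):
            # A state header such as "Colorado[edit]"
            state = line[:-len("[edit]")].strip()
        elif state is not None:
            # Strip "(...)" groups and numeric footnote markers like "[12]"
            town = re.sub(r"\s*\(.*?\)|\[\d+\]", "", line).strip()
            rows.append((state, town))
    return rows

# Hypothetical sample standing in for the contents of state_towns.txt
sample = [
    "Colorado[edit]",
    "Alamosa (Adams State College)[2]",
    "Boulder (University of Colorado at Boulder)[12]",
    "Durango (Fort Lewis College)[2]",
    "Connecticut[edit]",
    "Fairfield (Fairfield University, Sacred Heart University)",
    "Middletown (Wesleyan University)",
    "New Britain (Central Connecticut State University)",
]
rows = parse_state_towns(sample)
```

To get the DataFrame, pass the result to pandas: pd.DataFrame(rows, columns=["State", "Town"]). With a real file, an open file object can be passed to the function in place of the sample list.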