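I have a text file, university_towns.txt, whose entries are separated by newlines. I want two columns, State and RegionName (the town), but because the data is laid out vertically, I tried: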
df = pd.read_csv('university_towns.txt', delimiter= '\n', index_col=False, names = ["State", "RegionName"])
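This puts all the data in the State column. Instead, I want the code to tell states and towns apart and fill each column accordingly.
Answer 0 (score: 0)
I took the data from Wikipedia and tried it. If the goal is just to separate the state or town name from the university names, I think it can be done as follows: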
data = '''
Alabama [edit]
Auburn (Auburn University, Edward Via College of Osteopathic Medicine)[6]
Birmingham (University of Alabama at Birmingham, Birmingham School of Law, Birmingham Southern College, Cumberland School of Law, Miles Law School)[7]
Dothan (Fortis College, Troy University Dothan Campus, Alabama College of Osteopathic Medicine)
Florence (University of North Alabama)
Homewood (Samford University)
'''
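import pandas as pd

# Recent pandas versions reject sep='\n' in read_csv, so split the text into lines directly.
lines = [line for line in data.splitlines() if line.strip()]
df = pd.DataFrame(lines)
# Split each line once, at the first space; n and expand are passed as keywords.
df2 = df[0].str.split(' ', n=1, expand=True)
df2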
            0                                                  1
0     Alabama                                             [edit]
1      Auburn  (Auburn University, Edward Via College of Oste...
2  Birmingham  (University of Alabama at Birmingham, Birmingh...
3      Dothan  (Fortis College, Troy University Dothan Campus...
4    Florence                      (University of North Alabama)
5    Homewood                               (Samford University)
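
The split above still leaves state headers and towns in the same column. A minimal sketch of how the State and RegionName columns asked for in the question might be filled is shown below. It assumes that every state line ends with "[edit]" and that every town line has the form "Town (universities...)", which matches the sample but may not hold for the whole file. The sample text is shortened, and the Alaska entry is an illustrative addition that is not part of the sample above.

```python
import pandas as pd

# Shortened sample in the same layout as the data above.
# The Alaska lines are an illustrative addition, not from the original post.
data = """
Alabama [edit]
Auburn (Auburn University, Edward Via College of Osteopathic Medicine)[6]
Florence (University of North Alabama)
Alaska [edit]
Fairbanks (University of Alaska Fairbanks)
"""

rows = []
state = None  # most recently seen state header
for line in data.splitlines():
    line = line.strip()
    if not line:
        continue  # skip blank lines
    if line.endswith("[edit]"):
        # Assumption: a line ending in "[edit]" is a state header.
        state = line[: -len("[edit]")].strip()
    else:
        # Assumption: the town name is everything before the first " (".
        town = line.split(" (", 1)[0].strip()
        rows.append((state, town))

df = pd.DataFrame(rows, columns=["State", "RegionName"])
print(df)
```

To run this on the real file, the lines could be read with open('university_towns.txt') instead of the inline string. Town lines that do not follow the assumed pattern would need extra handling.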