How do I convert this text file into a DataFrame using pandas?

Time: 2020-05-04 18:28:27

Tags: python pandas

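I have a text file, university_towns.txt, whose entries are separated by newlines. I want two columns, State and RegionName (the town), but because the data is laid out vertically I tried:

 df = pd.read_csv('university_towns.txt', delimiter='\n', index_col=False, names=["State", "RegionName"])

What I get is all of the data in the State column. Instead, I want the code to distinguish states from towns and fill each column accordingly.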

1 answer:

Answer 0: (score: 0)

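I took the data from Wikipedia and tried it out. If all you need is to split the leading name (state or town) from the list of universities that follows it, I think it can be done like this: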

 data = '''
 Alabama [edit]
 Auburn (Auburn University, Edward Via College of Osteopathic Medicine)[6]
 Birmingham (University of Alabama at Birmingham, Birmingham School of Law, Birmingham Southern College, Cumberland School of Law, Miles Law School)[7]
 Dothan (Fortis College, Troy University Dothan Campus, Alabama College of Osteopathic Medicine)
 Florence (University of North Alabama)
 Homewood (Samford University)
 '''

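 import pandas as pd

 # One row per line of text, stored in column 0.
 # (Recent pandas versions raise an error for sep='\n' in read_csv,
 # so the string is split into lines directly.)
 df = pd.DataFrame([line.strip() for line in data.strip().splitlines()])
 # Split each line at the first space only; in pandas 2.x, n must be a keyword argument.
 df2 = df[0].str.split(' ', n=1, expand=True)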

 df2
             0                                                  1
 0     Alabama                                             [edit]
 1      Auburn  (Auburn University, Edward Via College of Oste...
 2  Birmingham  (University of Alabama at Birmingham, Birmingh...
 3      Dothan  (Fortis College, Troy University Dothan Campus...
 4    Florence                      (University of North Alabama)
 5    Homewood                               (Samford University)
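
The split above does not yet produce the State and RegionName columns the question asks for. One possible way to get there is sketched below. It rests on two assumptions about the file: state lines are the ones ending in "[edit]", and a town name is everything before the first " (". The function name `split_states_and_towns` and the `sample_lines` list are made up for illustration, and the Alaska rows are added by hand only to show a second state.

```python
import pandas as pd


def split_states_and_towns(lines):
    """Turn vertically stacked state/town lines into State and RegionName columns."""
    s = pd.Series([line.strip() for line in lines if line.strip()])

    # Assumption: a line is a state header if it ends with "[edit]".
    is_state = s.str.endswith("[edit]")

    # Keep the state name on header rows, then carry it down to the town rows.
    state = (
        s.where(is_state)
        .str.replace("[edit]", "", regex=False)
        .str.strip()
        .ffill()
    )

    # Assumption: the town name is the text before the first " (".
    region = s.str.partition(" (")[0].str.strip()

    out = pd.DataFrame({"State": state, "RegionName": region})
    # Drop the header rows themselves; only town rows remain.
    return out[~is_state].reset_index(drop=True)


# Hand-made sample in the same layout as the data above.
sample_lines = [
    "Alabama [edit]",
    "Auburn (Auburn University, Edward Via College of Osteopathic Medicine)[6]",
    "Florence (University of North Alabama)",
    "Alaska [edit]",
    "Fairbanks (University of Alaska Fairbanks)",
]

result = split_states_and_towns(sample_lines)
print(result)
```

For the real file, the lines could be read with something like `open('university_towns.txt').read().splitlines()` and passed to the same function. If some lines in the file do not follow the two assumptions, the rules would need adjusting.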