Python: multiple text files with different delimiters

Time: 2017-05-23 19:30:53

Tags: python pandas

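I have multiple files that use different delimiters (| and ,). I need to use different separators in the read_csv function. Is this possible, or do I need to convert the delimiters to a single one (say, a comma) and then use sep=','?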

import glob
import pandas as pd

path = r'C:\Users\hadadir\Desktop\temp'  # use your path
all_Files = glob.glob(path + "/*.txt")
frame = pd.DataFrame()
df = pd.concat((pd.read_csv(f, sep=",|", header=None, nrows=2) for f in all_Files))
df

    0   1   2   3   4   5   6   7
0   3130A0|QE|39104|2000|20140630|0|17306|2000  NaN NaN NaN NaN NaN NaN NaN
1   3130A0|QY|39104|0|20140630|-1000|17306|1000 NaN NaN NaN NaN NaN NaN NaN
0   "3135G0"    "XC"    "39104" 1000    20130630    1000    "17306" 1000
1   "3136FP"    "DY"    "39104" 2000    20130630    0   "17306" 2000

Conversion:

import glob
path =r'C:\Users\hadadir\Desktop\temp' # use your path
all_Files = glob.glob(path + "/*.txt")
frame = pd.DataFrame()

Result:

C:\Users\hadadir\Desktop\temp\HOLDINGQ2.TXT
C:\Users\hadadir\Desktop\temp\HOLDING_20131224.txt

1 answer:

Answer 0 (score: 0)

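pd.read_csv generally treats a separator longer than one character as a regular expression (see the documentation), which also means the Python parsing engine is used instead of the C engine. So you can use character-class (set) syntax, such as sep="[|,]", to match either delimiter; inside the brackets the | needs no backslash. By contrast, the sep=",|" in the question is a regex alternation between a comma and an empty pattern, so the pipe is never matched as a delimiter.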
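
As a minimal sketch of the character-class approach, the snippet below reads two in-memory stand-ins for the files and concatenates them. The sample rows are simplified (unquoted) versions of the data shown in the question, and the names pipe_file and comma_file are hypothetical, introduced only for illustration. Passing engine="python" explicitly is an assumption about what you want: it avoids the warning pandas emits when it has to fall back from the C engine for a regex separator.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-ins for the two kinds of files in the question.
pipe_file = io.StringIO(
    "3130A0|QE|39104|2000|20140630|0|17306|2000\n"
    "3130A0|QY|39104|0|20140630|-1000|17306|1000\n"
)
comma_file = io.StringIO(
    "3135G0,XC,39104,1000,20130630,1000,17306,1000\n"
    "3136FP,DY,39104,2000,20130630,0,17306,2000\n"
)

# r"[|,]" is a regex character class: split on either "|" or ",".
# Regex separators require the Python parsing engine.
frames = [
    pd.read_csv(f, sep=r"[|,]", header=None, engine="python", nrows=2)
    for f in (pipe_file, comma_file)
]

# ignore_index=True gives the combined frame a fresh 0..n-1 index.
df = pd.concat(frames, ignore_index=True)
print(df)
```

With real files, the same read_csv call should work inside the generator over glob.glob(...) from the question. One caveat from the pandas documentation: regex delimiters are prone to ignoring quoted data, so a delimiter character that appears inside a quoted field may still be treated as a separator.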