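I have a .TX0 file (a kind of CSV/text file) and converted it to a .txt file in Python using .readlines(), open(filename, 'w'), and similar calls. I now have this newly saved txt file, but when I try to convert it into a dataframe, it gives me only one column. The txt file is as follows: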
Empty DataFrame
Columns: [ '"Software Version:", 6.3.2.0646, Date:, 19/08/2015 09:26:04\n', '"Reprocess Number:", vma2: 261519, Unnamed: 7, \n', '"Sample Name:", , Data Acquisition Time:, 18/08/2015 17:23:23\n', '"Instrument Name:", natural gas (PE ASXL-TCD/FID), Channel:, B\n', '"Rack/Vial:", 0, 0.1, Operator:, joey.walker\n', '"Sample Amount:", 1.000000, Dilution Factor:, 1.000000\n', '"Cycle:", 1, Result File :, \\\\vma2\\TotalChrom\11170_he_tcd001.rst \n', '"Sequence File :", \\\\vma\C1_C2_binary.seq \n', '"===================================================================================================================================="\n', '""\n', '""\n'.1, '"condensate analysis (HP4890 Optic - FID)"\n', '"Peak", Component, Time, Area, Height, BL\n', '"#", Name, [min], [uV*sec], [uV], \n'.1, '------, ------, ------.1, ------.2, ------.3, ------\n', '1, Unnamed: 55, 0.810, 706.42, 304.38, *BB\n', '2, CH4, 0.900, 1113518.24, 495918.41, *BB\n'.1, '3, C2H6, 1.373, 901670.23, 295381.12, *BB\n'.2, '"", Unnamed: 73, Unnamed: 74, ------.4, ------.5, \n'.2, '"".1, Unnamed: 79, Unnamed: 80, 2015894.89, 791603.91, \n'.3, '"Missing Component Report"\n', '"Component", Expected Retention (Calibration File)\n', '------.1, ------\n'.1, '"All components were found"\n', '"Report stored in ASCII file :", C:\\Shared Folders\\TotalChrom\\11170_he_tcd001.TX0 \n']]
Index: []
Easier to read:
Empty DataFrame
Columns: ['"Software Version:", 6.3.2.0646, Date:, 19/08/2015 09:26:04\n',
'"Reprocess Number:", vma2: 261519, Unnamed: 7, \n',
'"Sample Name:", , Data Acquisition Time:, 18/08/2015 17:23:23\n',
'"Instrument Name:", natural gas (PE ASXL-TCD/FID), Channel:, B\n',
'"Rack/Vial:", 0, 0.1, Operator:, joey.walker\n',
'"Sample Amount:", 1.000000, Dilution Factor:, 1.000000\n',
'"Cycle:", 1, Result File :, \\vma2\TotalChrom\data\Joey\Binary_Mixtures\Std1\11170_he_tcd001.rst \n',
'"Sequence File :", \\vma2\TotalChrom\sequences\Joey\C1_C2_binary.seq \n',
'"===================================================================================================================================="\n',
'""\n',
'""\n'.1,
'"condensate analysis (HP4890 Optic - FID)"\n',
'"Peak", Component, Time, Area, Height, BL\n',
'"#", Name, [min], [uV*sec], [uV], \n'.1,
'------, ------, ------.1, ------.2, ------.3, ------\n',
'1, Unnamed: 55, 0.810, 706.42, 304.38, *BB\n',
'2, CH4, 0.900, 1113518.24, 495918.41, *BB\n'.1,
'3, C2H6, 1.373, 901670.23, 295381.12, *BB\n'.2,
'"", Unnamed: 73, Unnamed: 74, ------.4, ------.5, \n'.2,
'"".1, Unnamed: 79, Unnamed: 80, 2015894.89, 791603.91, \n'.3,
'"Missing Component Report"\n',
'"Component", Expected Retention (Calibration File)\n',
'------.1, ------\n'.1,
'"All components were found"\n',
'"Report stored in ASCII file :", C:\Shared Folders\TotalChrom\data\Joey\Binary_Mixtures\Std1\11170_he_tcd001.TX0 \n']]
Index: []
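As you can see, the content is comma-separated. Is there a way to convert this text into a dataframe with one column per comma-separated field?
Thanks.
J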
Answer 0 (score: 0)
Answer 1 (score: 0)
You can try the code below to convert the text file into a dataframe.
import pandas as pd
data = pd.read_csv('file.txt', sep=',')
Hopefully it is self-explanatory.
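Because the report mixes metadata lines with a peak table, a single `read_csv` call may not produce a clean table on its own. The following is a minimal sketch of one possible approach: locate the header row that starts with "Peak" and keep only the numbered peak rows. The `SAMPLE` text is a hypothetical excerpt reconstructed from the column labels in the question (the real file layout may differ), and `read_peak_table` is an illustrative helper name, not part of pandas.

```python
import csv
import io

import pandas as pd

# Hypothetical excerpt, reconstructed from the column labels in the question.
SAMPLE = '''"Software Version:",6.3.2.0646,Date:,19/08/2015 09:26:04
"Sample Name:",,Data Acquisition Time:,18/08/2015 17:23:23
""
"Peak",Component,Time,Area,Height,BL
"#",Name,[min],[uV*sec],[uV],
------,------,------,------,------,------
1,,0.810,706.42,304.38,*BB
2,CH4,0.900,1113518.24,495918.41,*BB
3,C2H6,1.373,901670.23,295381.12,*BB
"",,,------,------,
"",,,2015894.89,791603.91,
"Missing Component Report"
'''


def read_peak_table(text):
    """Return the peak table of a report as a DataFrame (illustrative helper)."""
    rows = list(csv.reader(io.StringIO(text), skipinitialspace=True))
    # Locate the header row whose first field is "Peak".
    start = next(i for i, row in enumerate(rows) if row and row[0].strip() == "Peak")
    header = [name.strip() for name in rows[start]]
    data = []
    for row in rows[start + 1:]:
        if row and row[0].strip().isdigit():
            # Numbered rows are peaks; keep as many fields as there are headers.
            data.append([field.strip() for field in row[:len(header)]])
        elif data:
            # The first non-numbered row after the peaks ends the table.
            break
    df = pd.DataFrame(data, columns=header)
    # Convert the numeric columns from strings to numbers.
    for col in ("Time", "Area", "Height"):
        df[col] = pd.to_numeric(df[col])
    return df


peaks = read_peak_table(SAMPLE)
print(peaks)
```

For a file on disk, the same helper could be called with the result of reading the file as text, for example `read_peak_table(open('file.txt').read())`, assuming the file follows the layout sketched in `SAMPLE`.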
Answer 2 (score: 0)
Here is a general answer to this kind of problem:
import re
import pandas as pd

# First, open the file and read it line by line.
with open('file.txt', 'r') as f:
    lines = f.readlines()

# Remove the trailing \n (and surrounding whitespace) from each line.
lines = [line.strip() for line in lines]

# Create a DataFrame (assuming you want to split your data into 2 columns).
df_result = pd.DataFrame(columns=('first_col', 'second_col'))
i = 0
first_col = ""
second_col = ""
for line in lines:
    # Use "if" and "replace" when a condition decides how a line is handled.
    if 'X' in line:
        first_col = line.replace('X', "")
    else:
        # Define what goes in each column; here the second column is the
        # line with everything from " (" onward removed.
        second_col = re.sub(r' \(.*', "", line)
        # This is how you append the next row of data.
        df_result.loc[i] = [first_col, second_col]
        i = i + 1
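Appending to a DataFrame one row at a time with `.loc` can get slow for large files. A minimal sketch of a common alternative is to collect the rows in a plain list and build the DataFrame once at the end. The input `lines` and the helper name `lines_to_frame` are hypothetical and only illustrate the same 'X'-marker convention used in the answer.

```python
import re

import pandas as pd

# Hypothetical input: an 'X'-marked line sets the first column
# for the lines that follow it.
lines = ["Xgroup_a", "alpha (note 1)", "beta (note 2)", "Xgroup_b", "gamma (note 3)"]


def lines_to_frame(lines):
    """Build a two-column DataFrame from marker and value lines."""
    rows = []
    first_col = ""
    for line in lines:
        line = line.strip()
        if "X" in line:
            # Marker line: remember its value for subsequent rows.
            first_col = line.replace("X", "")
        else:
            # Value line: drop everything from " (" onward.
            second_col = re.sub(r" \(.*", "", line)
            rows.append({"first_col": first_col, "second_col": second_col})
    # Build the DataFrame in one step instead of growing it row by row.
    return pd.DataFrame(rows, columns=["first_col", "second_col"])


df_result = lines_to_frame(lines)
print(df_result)
```

The result has the same shape as the row-by-row version; only the way the rows are accumulated differs.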
Answer 3 (score: 0)
I just found a simple solution that works for my code. You can try it in your code as well:
f = open('glove.6B.100d.txt', encoding='utf8')