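I would like to know how to convert a text file to CSV using Python. The code I tried does not work. I want to split each line into four columns: date, time, number, message. Thanks in advance.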
Answer 0 (score: 0)
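You can do this with regular expressions (regex). If you are not familiar with regex, try regexone.com, which is a good beginner's guide!
The regex below finds the parts of each line that match the specified pattern. The parts in parentheses (the capture groups) are then stored in the variable "match" and read back later with match.group(n).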
# First import pandas and the regex module
import pandas as pd
import re

# Define file paths (left blank in the original; set these to your own files)
InputFilePath = ""
OutputFilePath = ""

# Read the .txt file into a string (assumes the file is UTF-8 encoded)
with open(InputFilePath, encoding="utf-8") as data:
    string = data.read()

# Split separate lines into a list of strings
splitstring = string.splitlines()

# For each list item find the data needed (with regex or indexing)
# and assign to a dictionary
df = {}
for i in range(len(splitstring)):
    # The original pattern used \xc3\xa0, the UTF-8 byte pair for "à".
    # On Python 3 the text is already decoded, so match "à" directly.
    match = re.search(r'(.*) à (.*) - (.*): (.*)', splitstring[i])
    if match is None:
        # No "number: " part on this line; the empty group () keeps Number blank
        match = re.search(r'(.*) à (.*) - ()(.*)', splitstring[i])
    if match is None:
        # Line does not fit either pattern: write an empty row
        line = {
            'Date': "",
            'Time': "",
            'Number': "",
            'Text': ""}
    else:
        line = {
            'Date': match.group(1),
            'Time': match.group(2),
            'Number': match.group(3),
            'Text': match.group(4)}
    df[i] = line

# Convert dictionary to pandas dataframe (one row per input line)
dataframe = pd.DataFrame(df).T

# Finally send to csv
dataframe.to_csv(OutputFilePath)
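
To see what the capture groups actually pick up, here is a minimal sketch that runs the two patterns from the answer on a couple of sample lines. The sample lines and the helper name parse are hypothetical; the asker's real input was not shown, so the "date à time - number: message" layout is an assumption, as is the text being decoded as UTF-8 so that the separator is a literal "à".

```python
import re

# Hypothetical sample lines; the real input file was not shown in the question
samples = [
    "12/03/2018 à 14:05 - +33 6 12 34 56 78: Bonjour",
    "12/03/2018 à 14:06 - Messages are end-to-end encrypted",
]

# Same two patterns as in the answer: one with a "number: " part, one without
full = re.compile(r'(.*) à (.*) - (.*): (.*)')
fallback = re.compile(r'(.*) à (.*) - ()(.*)')


def parse(line):
    """Return (date, time, number, text), or four empty strings if nothing matches."""
    match = full.search(line) or fallback.search(line)
    return match.groups() if match else ("", "", "", "")


rows = [parse(s) for s in samples]
for row in rows:
    print(row)
```

Because (.*) is greedy, a message that itself contains ": " would push part of the message into the number column. Making the third group non-greedy, (.*?), is one possible tweak if that shows up in your data.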