Question

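Basically, the raw data has no headers, only values (but I have the list of headers). The delimiter is "|". What I am trying to do is convert the txt file into a csv file that contains my headers and the corresponding values.

For example:

The txt file looks like this: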

sadasd | dsdads | adsasd

value 1 | value 2 | value 3 | value 4 | value 5 | value 100 | value 101 | value 102 | value 103 | value 104 | value 105 value 200 | value 201 | value 202 | value 203 | value 204 | value 205

sdasd | dsa | dsdad

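and after conversion, the .csv file should look like this:

header 1, header 2, header 3, header 4, header 5

value 1, value 2, value 3, value 4, value 5,

value 100, value 101, value 102, value 103, value 104, value 105

value 200, value 201, value 202, value 203, value 204, value 205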

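I have just started learning Python, and my idea is:

Remove the first and last lines.
Use a dictionary of lists: each column is a list keyed by its header (I have the headers). Turn that into a data frame.
Convert to .csv.

So it would look like {'header 1': [value 1, value 100, value 200], 'header 2': [value 2, value 101, value 201]}, and then be converted to .csv.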
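
The column-dictionary idea can be sketched with the standard library alone, without a data frame. This is only an illustration: the helper name `columns_to_csv_text` and the sample values are made up, and it assumes every column list has the same length.

```python
import csv
import io


def columns_to_csv_text(columns):
    """Turn {header: [values...]} into CSV text, one row per list position.

    Hypothetical helper: assumes all value lists have equal length.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns.keys())            # header row, in insertion order
    writer.writerows(zip(*columns.values()))   # transpose columns into rows
    return buffer.getvalue()


columns = {
    "header 1": ["value 1", "value 100", "value 200"],
    "header 2": ["value 2", "value 101", "value 201"],
}
csv_text = columns_to_csv_text(columns)
print(csv_text)
```

To write a file instead of a string, the same writer calls can be pointed at a file opened with `open(path, "w", newline="")`.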

That is just my idea; maybe you have a simpler way, but it should use Python only.

Answer 1

Use the csv module.

For example:

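import csv

filename = "input.txt"        # placeholder: path to the source .txt file
outfilename = "output.csv"    # placeholder: path of the CSV file to create

with open(filename, "r") as infile:
    data = []
    for line in infile.readlines()[1:-1]:                # Skip the first and last lines.
        if line.strip():                                 # Ignore blank lines.
            data.extend(field.strip() for field in line.split("|"))  # Trim spaces around each value.
data = [data[i:i+5] for i in range(0, len(data), 5)]     # Split the list into sub-lists of 5 elements.
print(data)

header = ["header 1", "header 2", "header 3", "header 4", "header 5"]
with open(outfilename, "w", newline="") as outfile:      # newline="" avoids blank rows on Windows.
    writer = csv.writer(outfile, delimiter=",")
    writer.writerow(header)                              # Write the header.
    writer.writerows(data)                               # Write the content.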
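
The flatten-and-regroup step can be tried in isolation on in-memory lines before touching any files. The sketch below is only illustrative: `chunk_fields`, its parameters, and the sample lines are invented names, and it assumes the first and last lines are to be dropped and that each record has exactly five fields.

```python
def chunk_fields(lines, width=5, delimiter="|"):
    """Flatten delimiter-separated lines and regroup the fields into rows.

    Illustrative helper: drops the first and last lines, skips blank lines,
    and trims whitespace around each field.
    """
    fields = []
    for line in lines[1:-1]:
        if line.strip():
            fields.extend(part.strip() for part in line.split(delimiter))
    # Slice the flat list into consecutive groups of `width` fields.
    return [fields[i:i + width] for i in range(0, len(fields), width)]


sample = [
    "sadasd | dsdads | adsasd",
    "",
    "a1 | a2 | a3 | a4 | a5 | b1 | b2 | b3 | b4 | b5",
    "",
    "sdasd | dsa | dsdad",
]
rows = chunk_fields(sample)
print(rows)
```

If the number of fields is not a multiple of `width`, the last row comes out shorter than the header, so it may be worth checking the row lengths before writing the CSV.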

Answer 2

I stitched the following solution together from various Stack Overflow answers.

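import pandas as pd

mycolnames = ['col1', 'col2', 'col3', 'col4', 'col5']

# Use the sep argument to change your delimiter accordingly.
# header=None keeps the first data row from being consumed as column names;
# skiprows/skipfooter drop the first and last lines (skipfooter needs engine="python").
# This assumes each remaining line holds exactly one record of five fields.
df = pd.read_csv("foo.txt", sep="|", header=None,
                 skiprows=1, skipfooter=1, engine="python")

# Set your column names on the data frame
df.columns = mycolnames

# Write your desired columns to csv (index=False leaves out the row index)
df[mycolnames].to_csv("bar.csv", sep=",", index=False)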

Credits

@atomh33ls - How to read csv into record array in numpy?

@LangeHaare - Set column names in pandas data frame from_dict with orient = 'index'

Convert a .txt file (data feed) to a .csv file
