Pandas converts txt to Excel but splits one row into two

Time: 2019-06-20 13:49:26

Tags: python excel pandas tab-delimited

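I am using pandas to convert delimited txt files (pipe-separated in the sample below) to Excel. Most of the time it works well, but today I found a case where one row is split into two rows, and I do not know why.

I checked that row against other rows with a similar pattern, and none of the others are split. I am confused.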
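
One way to check a suspect row more mechanically is to count the fields on every physical line and compare against the header. The sketch below is only an illustration: `find_short_lines` is a hypothetical helper, the sample is a shortened version of the data in this question, and it assumes the file has no quoted fields that legitimately contain `|` or line breaks.

```python
import io

def find_short_lines(lines, sep="|"):
    """Return (line_number, field_count) for non-blank lines whose
    field count differs from the header's field count."""
    stripped = [ln.rstrip("\r\n") for ln in lines]
    non_blank = [(i, ln) for i, ln in enumerate(stripped, start=1) if ln.strip()]
    if not non_blank:
        return []
    # The first non-blank line is treated as the header.
    expected = non_blank[0][1].count(sep) + 1
    return [
        (i, ln.count(sep) + 1)
        for i, ln in non_blank[1:]
        if ln.count(sep) + 1 != expected
    ]

# Shortened sample (assumption: fewer columns than the real file).
sample = io.StringIO(
    "Row|Month|MEMBERADDRESS1|MEMBERADDRESS2|MEMBERCITY\n"
    "2577|201907|145-40 21 AVE|PVT|Sunnyvale\n"
    "2578|201907|2555 SAND AVENUE\n"
    "||Sunnyvale\n"
)
bad = find_short_lines(sample)
print(bad)
```

With a real file, the same helper could be called as `find_short_lines(open(textfile, newline=""))`. Any line it reports probably contains, or follows, a stray line break inside a field.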

import pandas as pd
from os import listdir
from os.path import join

mypath = "Path"
# Collect every .txt file in the folder (case-insensitive extension match).
textfiles = [join(mypath, f) for f in listdir(mypath) if f.lower().endswith('.txt')]
print(textfiles)

for textfile in textfiles:
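    # Read every column as a string so IDs and ZIP codes keep their formatting.
    df = pd.read_csv(textfile, sep='|', dtype=str)
    print(df.iloc[1])
    print(df.iloc[2])
    # Strip only the final extension so dots elsewhere in the path are kept.
    filename = textfile.rsplit('.', 1)[0]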
    # Create a pandas ExcelWriter that uses XlsxWriter as the engine.
    writer_object = pd.ExcelWriter(filename + '.xlsx',
                                   engine='xlsxwriter')
    # Write the dataframe to the worksheet without the index.
    # The default header row is written here and then overwritten
    # below with a user-defined format.
    df.to_excel(writer_object, sheet_name='Sheet1', index=False)
    # Get the XlsxWriter workbook object.
    workbook_object = writer_object.book
    # Get the XlsxWriter worksheet object.
    worksheet_object = writer_object.sheets['Sheet1']

    # Create a format object for the header cells with add_format().
    header_format_object = workbook_object.add_format({
                                'bold': False,
                                'border': 0})
    # Rewrite the column headers with the defined format.
    for col_number, value in enumerate(df.columns.values):
        worksheet_object.write(0, col_number, value,
                               header_format_object)

    # Close the pandas ExcelWriter to write the Excel file to disk.
    # (writer_object.save() was removed in pandas 2.0; close() also saves.)
    writer_object.close()

There is no error message. Source TXT:

Row|Month|CompanyName|Location|PRODUCT|Region|MEMBERID|CNN|CLIENTMEMBERID|MEMBERLASTNAME|MEMBERFIRSTNAME|DOB|GENDER|MEMBERADDRESS1|MEMBERADDRESS2|MEMBERCITY|MEMBERSTATE|MEMBERZIP|MEMBERPHONE

2574|201907|Apple|Palo Alto|OOO|California|7156||62980|Tim|Cook|06/01/2019|Male|4433 Mountain view||Sunnyvale|CA|95500|(999) 999-9999

2575|201907|Apple|Palo Alto|OOO|California|7158||63069|Tim|Cook|06/01/2019|Male|322 Sand AVENUE|1ST FL|Sunnyvale|CA|95500|(999) 999-9999

2576|201907|Apple|Palo Alto|OOO|California|7159||63128|Tim|Cook|06/01/2019|Male|187 Mountain view||Sunnyvale|CA|95500|(999) 999-9999

2577|201907|Apple|Palo Alto|OOO|California|7161||63145|Tim|Cook|06/01/2019|Male|145-40 21 AVE|PVT|Sunnyvale|CA|95500|(999) 999-9999

2578|201907|Apple|Palo Alto|OOO|California|7162||63222|Tim|Cook|06/01/2019|Male|2555 SAND AVENUE
||Sunnyvale|CA|95500|(999) 999-9999

2579|201907|Apple|Palo Alto|OOO|California|7163||63230|Tim|Cook|06/01/2019|Male|235 SAnd COURT||Sunnyvale|CA|95500|(999) 999-9999

2580|201907|Apple|Palo Alto|OOO|California|7164||63381|Tim|Cook|06/01/2019|Male|223 78TH STREET|3E|Sunnyvale|CA|95500|(999) 999-9999

2581|201907|Apple|Palo Alto|OOO|California|7165||63399|Tim|Cook|06/01/2019|Male|8739 26TH AVENUE|1|Sunnyvale|CA|95500|(999) 999-9999


Result: row 2578 is split after "AVENUE", as shown below. Run the script and open the Excel file to see it. Split Row 2578
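
In the sample data above, record 2578 already continues on a second physical line after "AVENUE", which suggests a line break embedded in the MEMBERADDRESS1 field rather than a problem in the Excel step. A minimal sketch of one possible workaround follows, assuming no field legitimately contains `|` or a line break: join physical lines until each record has as many separators as the header. `merge_broken_lines` is a hypothetical helper, and the sample uses fewer columns than the real file.

```python
def merge_broken_lines(text, sep="|"):
    """Join physical lines until each record holds at least as many
    separators as the header line. Blank lines are skipped."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return ""
    expected = lines[0].count(sep)  # separators in the header line
    records, buffer = [], ""
    for ln in lines:
        buffer += ln
        if buffer.count(sep) >= expected:
            records.append(buffer)
            buffer = ""
    if buffer:
        # Keep a trailing incomplete record instead of silently dropping it.
        records.append(buffer)
    return "\n".join(records)

# Shortened sample (assumption: fewer columns than the real file).
raw = (
    "Row|Month|MEMBERADDRESS1|MEMBERADDRESS2|MEMBERCITY\n"
    "2577|201907|145-40 21 AVE|PVT|Sunnyvale\n"
    "2578|201907|2555 SAND AVENUE\n"
    "||Sunnyvale\n"
    "2579|201907|235 SAnd COURT||Sunnyvale\n"
)
fixed = merge_broken_lines(raw)
print(fixed)
```

The repaired text could then be parsed with `pd.read_csv(io.StringIO(fixed), sep='|', dtype=str)` in place of reading the file directly. If the stray break comes from the export that produces the txt file, fixing it there would be more robust than patching it afterwards.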

0 answers:

No answers