How to remove all newline characters from a list in Python

Time: 2018-03-18 07:13:21

Tags: python file split

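I'm writing a program that reads a PDF into a text file, but every time I run my code, newline characters show up in the list that gets written to the text file. I've tried several approaches, including strip(), split(), and replace(), but the characters won't go away. Any help would be appreciated. The snippet is posted below: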

import PyPDF2 as pdf

# creating an object 
file = open(PDF_FILENAME_DIRECTORY, "rb")

# creating a pdf reader object
fileReader = pdf.PdfFileReader(file)

# collect the extracted text of each page
textData = []

for pages in fileReader.pages:
    theText = pages.extractText()

    # for char in theText:
    #   theText.replace(char, "\n")

    textData.append(theText)

final_list = []

for i in textData:
    final_list.append(i.strip('\n'))

# [s.strip('\n') for s in theText]
# [s.replace('\n', '') for s in theText]


# text_data = []

# for elem in textData:
#         text_data.extend(elem.strip().split('n'))  

# for line in textData:
#     textData.append(line.strip().split('\n'))
#--------------------------------------------------------------------

import os.path

save_path = "FILENAME_SAVEPATH_DIRECTORY"

name_of_file = input("What is the name of the file: ")

completeName = os.path.join(save_path, name_of_file + ".txt")   

file1 = open(completeName, "w")

file1.write(str(final_list))

file1.close()

Sample output of code as a list in a text file. I want to take out the '\n' characters.

1 answer:

Answer 0 (score: 0)

Your problem is in this line:

file1.write(str(final_list))

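This calls the __str__ method of the list type, which uses repr to stringify each element of the list. repr shows a newline as the two visible characters \n instead of an actual line break, which is why the output looks the way it does.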
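A minimal sketch of that behaviour, using made-up sample data rather than the asker's PDF text (the names pages, as_list_text and as_joined_text are hypothetical): converting the whole list with str() produces the escaped form, while joining the strings keeps the real newline characters.

```python
# Hypothetical sample data standing in for the text extracted from each page.
pages = ["first line\nsecond line\n", "third line\n"]

# str() on a list applies repr() to every element, so each newline
# becomes a visible backslash followed by the letter n.
as_list_text = str(pages)

# Joining the strings keeps the actual newline characters.
as_joined_text = "".join(pages)

print(as_list_text)
print(as_joined_text)
```

The first print shows the brackets, quotes and \n sequences seen in the asker's output file; the second prints three ordinary lines.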

Do this instead:

for line in final_list:
    file1.write(line)
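If the goal is also to drop the line breaks inside each page's text, note that strip('\n') only removes newlines at the two ends of a string, and that replace() returns a new string rather than modifying the original. The following is only a sketch under those assumptions; the helper remove_newlines, the sample data and the temporary output path are hypothetical and not part of the original code.

```python
import os
import tempfile


def remove_newlines(pages):
    """Return a new list in which every newline is replaced by a space."""
    cleaned = []
    for text in pages:
        # replace() returns a new string, so its result must be kept.
        # strip() then removes the whitespace left at both ends.
        cleaned.append(text.replace("\r", "").replace("\n", " ").strip())
    return cleaned


# Hypothetical sample data standing in for the extracted page text.
sample = ["\nfirst line\nsecond line\n", "third line\n"]
cleaned = remove_newlines(sample)

# Write one cleaned page per line; the with block closes the file.
out_path = os.path.join(tempfile.mkdtemp(), "output.txt")
with open(out_path, "w") as out_file:
    for text in cleaned:
        out_file.write(text + "\n")
```

Replacing each newline with a space instead of an empty string keeps words on neighbouring lines from running together; use "" if that is not wanted.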