How do I split a text file and modify it in Python?

Asked: 2016-12-06 03:12:32

Tags: python python-3.x split text-files

I currently have a text file that looks like this:

101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;
102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;
103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;

I am currently printing the file with the following code:

with open ("base.txt",'r') as f:
   for line in f:
      words = line.split(';')
      for word in words:
         print (word)

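What I would like to know is how I can modify a line by its ID number (e.g. 101) while keeping its format, and how I can add or delete lines based on their ID number.

3 Answers:

Answer 0: (score: 1)

As I understand it, your question is how to modify a word in a line and then write the modified line back into the file.

Changing a word in the file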

def change_value(new_value, line_number, column):
    with open("base.txt", 'r+') as f:  # r+ lets us both read from and write to the file
        lines = f.read().split('\n')  # list of all the lines in the file
        words = lines[line_number].split(',')
        words[column] = new_value
        lines[line_number] = ','.join(words)  # rebuild the line, with words separated by ','
        f.seek(0)  # go back to the start of the file before writing
        f.write('\n'.join(lines))  # write the updated lines back into the file
        f.truncate()  # drop leftover bytes if the new contents are shorter than the old

To use this function to set line 3, word 2 to Not_Madagascar, call it like this:

change_value("Not_Madagascar", 2, 1)

Subtract 1 from the line/word number you want, because the first line/word has index 0.
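The function above addresses a line by its position, whereas the question asks about ID numbers. Here is a minimal sketch of the same idea keyed on the ID. It assumes the ID is the first comma-separated field, fields are separated by `, `, and every record ends in `;`. The name `change_value_by_id` and its parameters are illustrative and not part of the original answer:

```python
def change_value_by_id(path, record_id, column, new_value):
    """Replace one field of the record whose first field equals record_id.

    Returns True if a record was changed, False if the ID was not found.
    """
    with open(path, 'r') as f:
        lines = f.read().splitlines()

    changed = False
    for i, line in enumerate(lines):
        body = line.rstrip().rstrip(';')  # drop the trailing ';'
        fields = [field.strip() for field in body.split(',')]
        if fields[0] == str(record_id):
            fields[column] = str(new_value)
            lines[i] = ', '.join(fields) + ';'  # restore the ", " separators and ';'
            changed = True
            break

    if changed:
        # Re-open in 'w' mode so the file is truncated before it is rewritten
        with open(path, 'w') as f:
            f.write('\n'.join(lines) + '\n')
    return changed
```

For example, `change_value_by_id("base.txt", 103, 1, "Not_Madagascar")` would change the country name of record 103 and leave the other lines as they were.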

Adding a new line to the file

def add_line(words, line_number):
    with open("base.txt",'r+') as f:
        lines = f.readlines()
        lines.insert(line_number, ','.join(words) + '\n')
        f.seek(0)
        f.writelines(lines)

To use this function to add a line at the end containing the words this line is at the end, call it like this:

add_line(['this','line','is','at','the','end'], 4) #4 is the line number
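The question also asks about deleting lines by ID. A hedged sketch, assuming the ID is the first comma-separated field of each line; `delete_line_by_id` is a hypothetical helper name, not part of the original answer. It rewrites the file in 'w' mode because the result is shorter than the original, and writing shorter contents over the old ones without truncating would leave stale bytes at the end:

```python
def delete_line_by_id(path, record_id):
    """Remove every line whose first comma-separated field equals record_id.

    Returns the number of lines removed.
    """
    with open(path, 'r') as f:
        lines = f.readlines()

    # Keep only the lines whose ID field does not match
    kept = [line for line in lines
            if line.split(',')[0].strip() != str(record_id)]

    # 'w' mode truncates the file, so nothing from the old contents survives
    with open(path, 'w') as f:
        f.writelines(kept)
    return len(lines) - len(kept)
```

Calling `delete_line_by_id("base.txt", 101)` would remove the Liberia record and return 1.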

For details on opening files, see here

For more information on reading and modifying files, see here

Answer 1: (score: 1)

pandas is a powerful tool for this kind of task. It provides tools for working with CSV files easily, and you can manage your data in DataFrames.

import pandas as pd

# read the CSV file into DataFrame
df = pd.read_csv('file.csv', sep=',', header=None, index_col = 0)
print (df)

# eliminating the `;` character
df[7] = df[7].map(lambda x: str(x).rstrip(';'))
print (df)

# eliminating the #101 row of data
df.drop(101, axis=0, inplace=True)
print (df)
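
Changing a value by ID and writing the result back out can be sketched in the same style. This assumes pandas is installed; `io.StringIO` stands in for the file on disk, `skipinitialspace=True` strips the space after each comma, and 'New_Capital' is a placeholder value. Note that the CSV text produced this way uses plain commas, so the original `, ` spacing and trailing `;` are not reproduced:

```python
import io

import pandas as pd

raw = (
    "101, Liberia, Monrovia, 111000, 3200000, Africa, English, Liberia Dollar;\n"
    "102, Uganda, Kampala, 236000, 34000000, Africa, English and Swahili, Ugandan Shilling;\n"
    "103, Madagascar, Antananarivo, 587000, 21000000, Africa, Magalasy and Frances, Malagasy Ariary;\n"
)

# The first column (the ID) becomes the index; the remaining columns are labelled 1..7
df = pd.read_csv(io.StringIO(raw), header=None, index_col=0, skipinitialspace=True)
df[7] = df[7].str.rstrip(';')      # remove the trailing ';' from the last column

df.loc[102, 2] = 'New_Capital'     # modify column 2 of the row whose ID is 102
df = df.drop(101)                  # delete the row whose ID is 101

# With no path given, to_csv returns the CSV text; pass a filename to write a file instead
out = df.to_csv(header=False)
print(out)
```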

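Answer 2: (score: 0)

If you are trying to preserve the original ordering of the file while still being able to refer to individual lines for modification/addition/deletion, reading the file into an OrderedDict may help. The example below makes a lot of assumptions about the full format of the file, but it works for your test case: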

from collections import OrderedDict

content = OrderedDict()

with open('base.txt', 'r') as f:
    for line in f:
        if line.strip():
            print(line)
            words = line.split(',')  # Assuming that you meant ',' vs ';' to split the line into words
            content[int(words[0])] = ','.join(words[1:])

print(content[101])  # Prints " Liberia, Monrovia, etc"...

content.pop(101, None)  # Remove line w/ 101 as the "id"
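
The snippet above only changes the in-memory mapping. A sketch of a full round trip back to the file, under the same assumptions (integer ID as the first field, one record per line); `load_records` and `save_records` are hypothetical helper names. Since Python 3.7 a plain dict also preserves insertion order, but OrderedDict is kept here to match the answer:

```python
from collections import OrderedDict


def load_records(path):
    """Map each integer ID to the rest of its line (without the newline)."""
    content = OrderedDict()
    with open(path, 'r') as f:
        for line in f:
            if line.strip():  # skip blank lines
                record_id, _, rest = line.rstrip('\n').partition(',')
                content[int(record_id)] = rest
    return content


def save_records(path, content):
    """Write the records back, one per line, in their stored order."""
    with open(path, 'w') as f:
        for record_id, rest in content.items():
            f.write('{},{}\n'.format(record_id, rest))
```

A typical sequence would be `content = load_records('base.txt')`, then `content.pop(101, None)` to delete a record or an assignment such as `content[102] = content[102].replace('Kampala', 'New_Capital')` to edit one, followed by `save_records('base.txt', content)`. Because each value keeps the original text after the ID, untouched lines are written back unchanged.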