Question

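I need to write a program that reads a text file and copies its contents to another file. Then I have to add a column at the end of the text file and fill that column with an int calculated by the function calc_bill. I can copy the contents of the original file to the new file, but I can't seem to get my program to read the integers that calc_bill needs in order to run. Any help would be greatly appreciated.

Here are the first 3 lines of the text file I am reading: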

CustomerID  Title   FirstName   MiddleName  LastName    Customer Type   
1   Mr. Orlando N.  Gee Residential     297780  302555
2   Mr. Keith   NULL    Harris  Residential     274964  278126

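The program copies the file to the new file exactly as it is. What does not work is writing bill_amount (calc_bill) / billVal (main) to the new file in a new column. Here is the expected output for the new file: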

CustomerID  Title   FirstName   MiddleName  LastName    Customer Type   Company Name    Start Reading   End Reading  BillVal
1   Mr. Orlando N.  Gee Residential     297780  302555       some number
2   Mr. Keith   NULL    Harris  Residential     274964  278126    some number

Here is my code:

def main():
    file_in = open("water_supplies.txt", "r")
    file_in.readline()
    file_out = input("Please enter a file name for the output:")
    output_file = open(file_out, 'w')
    lines = file_in.readlines()
    for line in lines:
        lines = [line.split('\t')]
        #output_file.write(str(lines)+ "\n")
        billVal = 0
        c_type = line[5]
        start = int(line[7])
        end = int(line[8])
        billVal = calc_bill(c_type, start, end)
        output_file.write(str(lines)+ "\t" + str(billVal) + "\n")


def calc_bill(customer_type, start_reading, end_reading):
    price_per_gallon = 0

    if customer_type == "Residential":
        price_per_gallon = .012

    elif customer_type == "Commercial":
        price_per_gallon = .011

    elif customer_type == "Industrial":
        price_per_gallon = .01

    if start_reading >= end_reading:
        print("Error: please try again")

    else:
        reading = end_reading - start_reading

    bill_amount = reading * price_per_gallon
    return bill_amount
main()

Answer 1

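A few things. The inconsistent spacing in your column names makes counting the actual columns a bit confusing, but I believe there are 9 column names there. However, each of your data rows has only 8 elements, so it looks like you have an extra column name (probably "CompanyName"). So get rid of it, or fix the data.

Then your "start" and "end" variables point to indices 7 and 8, respectively. However, since there are only 8 elements in a row, I think the indices should be 6 and 7.

Another problem could be that inside your for loop over "lines", you set "lines" to the elements of the current line. I would suggest renaming the second "lines" variable inside the for loop to something else, like "elements".

Beyond that, I would just caution you about consistency. Some of your column names are camel-case, others contain spaces. Some of your variables are separated by underscores, while others are camel-case.

Hope this helps. Let me know if you have any other questions.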
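
One way to settle the column count is to let Python count the fields instead of counting by eye. The sketch below assumes the file is tab-separated; the two sample rows and the helper name count_fields are made up for illustration and are not taken from the real file. Note that split('\t') keeps an empty field between two consecutive tabs, while split() with no argument collapses any run of whitespace, so the two can disagree about how many columns a row has.

```python
def count_fields(line, sep="\t"):
    """Return the number of sep-separated fields in one line."""
    return len(line.rstrip("\n").split(sep))


# Hypothetical rows: the second one has an empty field (two tabs in a row),
# as would happen if a "Company Name" column existed but was left blank.
row_without_gap = "1\tMr.\tOrlando\tN.\tGee\tResidential\t297780\t302555\n"
row_with_gap = "1\tMr.\tOrlando\tN.\tGee\tResidential\t\t297780\t302555\n"

print(count_fields(row_without_gap))  # 8
print(count_fields(row_with_gap))     # 9: the empty field is counted
print(len(row_with_gap.split()))      # 8: split() drops the empty field
```

If the real rows turn out to have 9 tab-separated fields, indices 7 and 8 would be the readings; with 8 fields they would be 6 and 7.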

Answer 2

There are two errors in how you handle your variables, both in the same line:

    lines = [line.split()]

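You assign the result to the lines variable, which held the entire file contents, so that name no longer refers to the rest of your input data.
You wrap the result of split in an extra list.

Try this line instead: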

    line = line.split()

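Once I made a few assumptions about where your tabs are, I got reasonable output.

Also, consider not overwriting a variable with data of different semantics; it confuses the usage. For example:

    for record in lines:
        line = record.split()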
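
To make the naming point concrete, here is a minimal sketch that keeps the raw line and the split fields in separate variables. The sample rows are typed in by hand with single tabs (an assumption about the real file), and parse_record is a hypothetical helper name.

```python
def parse_record(record):
    """Split one raw line into its whitespace-separated fields."""
    return record.split()


# Hand-written sample rows, assuming one tab between fields.
records = [
    "1\tMr.\tOrlando\tN.\tGee\tResidential\t297780\t302555\n",
    "2\tMr.\tKeith\tNULL\tHarris\tResidential\t274964\t278126\n",
]

for record in records:
    fields = parse_record(record)
    # record is still the raw string, so record[5] is a single character;
    # fields[5] is the sixth column.
    customer_type = fields[5]
    start, end = int(fields[6]), int(fields[7])
    print(customer_type, start, end)
```

Indexing the raw string instead of the split list (as in c_type = line[5]) picks out one character, which is why int() and the customer type comparison never see the expected values.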

Answer 3

Several issues were mentioned above, but here are just a few small changes to your main() method.

    def main():
        file_in = open("water_supplies.txt", "r")
        # skip the headers in the input file, and save for output
        headers = file_in.readline()
        # changed to raw_input to not require quotes
        file_out = raw_input("Please enter a file name for the output: ")
        output_file = open(file_out, 'w')
        # write the headers back into output file
        output_file.write(headers)
        lines = file_in.readlines()
        for line in lines:
            # renamed variable here to split
            split = line.split('\t')
            bill_val = 0
            c_type = split[5]
            start = int(split[6])
            end = int(split[7])
            bill_val = calc_bill(c_type, start, end)
            # line is already a string, don't need to cast it
            # added rstrip() to remove trailing newline
            output_file.write(line.rstrip() + "\t" + str(bill_val) + "\n")

Note that the line variable in the loop contains a trailing newline, so you need to strip it if you want to write it to the output file as-is. Your start and end indices were also off by 1, so I changed them to split[6] and split[7].

It is best not to require the user to put quotes around the file name, so keep that in mind. A simple way is to use raw_input instead of input.

Example input file (from OP):

CustomerID      Title   FirstName       MiddleName      LastName        Customer Type
1       Mr.     Orlando N.      Gee     Residential     297780  302555
2       Mr.     Keith   NULL    Harris  Residential     274964  278126

Running the script, with test.out as the output file:

$ python test.py
Please enter a file name for the output:test.out
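
For readers on Python 3, where raw_input no longer exists and input already returns a plain string, the same idea can be sketched as below. This is only one possible arrangement under stated assumptions: fields are split on any whitespace, the customer type is the sixth field, and the two readings are the last two fields. The names add_bill_column and copy_with_bills are hypothetical. The sketch also raises an error when the start reading is not below the end reading, because the calc_bill in the question prints a message and then fails on the undefined reading variable in that case.

```python
def calc_bill(customer_type, start_reading, end_reading):
    """Return the bill for one customer, using the rates from the question."""
    rates = {"Residential": 0.012, "Commercial": 0.011, "Industrial": 0.01}
    if customer_type not in rates:
        raise ValueError("unknown customer type: %r" % customer_type)
    if start_reading >= end_reading:
        raise ValueError("end reading must be greater than start reading")
    return (end_reading - start_reading) * rates[customer_type]


def add_bill_column(lines):
    """Yield each input line with a BillVal column appended.

    The first line is treated as the header. Assumes whitespace-separated
    fields with the customer type at index 5 and the readings last.
    """
    lines = iter(lines)
    header = next(lines)
    yield header.rstrip() + "\tBillVal\n"
    for record in lines:
        fields = record.split()
        if not fields:
            continue  # skip blank lines
        start, end = int(fields[-2]), int(fields[-1])
        bill = calc_bill(fields[5], start, end)
        yield "%s\t%.2f\n" % (record.rstrip(), bill)


def copy_with_bills(in_path, out_path):
    """Copy in_path to out_path, appending the bill column to every row."""
    with open(in_path) as file_in, open(out_path, "w") as file_out:
        file_out.writelines(add_bill_column(file_in))


# In-memory demonstration with a hand-written header and one row.
sample = [
    "CustomerID\tTitle\tFirstName\tMiddleName\tLastName\tCustomer Type\n",
    "1\tMr.\tOrlando\tN.\tGee\tResidential\t297780\t302555\n",
]
for out_line in add_bill_column(sample):
    print(out_line, end="")
```

With real files, the call would look like copy_with_bills("water_supplies.txt", "test.out"). The with statement closes both files even if a row fails to parse, which the original open calls do not guarantee.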

Python file reading & writing

3 answers: