Question

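I am reading the file Qdata.txt and calculating the average of the middle column (the second column, if the date is counted as the first column).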

import sys

td = open("Qdata.txt", "r")  # opening the file to variable ("file handle") td

sum = 0
n = 0
firstround = True

for line in td:
    if (firstround):
        firstround = False  # nothing else is done for the first line (header)
    else:
        fields = line.split()  # This creates a list containing the strings on 
        # the line, by default separated by spaces or tabs.
        # Now fields[0] contains the date, fields[1] the 
        # 1st data value and fields[2] the 2nd one.
        try:
            sum = sum + float(fields[1])  # increasing the cumulative value
            field1 = (fields[1])

            print(field1)

            n = 5
        # Handling possible errors.
        except IndexError:  # IndexError occurs e.g. in the case of empty lines
            # (when fields[2], for example, doesn't exist)
            continue
        except ValueError:  # ValueError occurs e.g. if there are letters instead of 
            # numbers (when conversion to float causes an error)
            print("Incorrect values in the file.")
            sys.exit()
print("Average over the whole period was ", sum / n)
print("Total number of values was ", n)

This is Qdata.txt:

Date   3700300   6701500
20000101 21.00   223.00  
20000102 20.00   218.00  
20000103 18.00   218.00  
20000104 17.00   213.00  
20000105 17.00   210.00  
20000106 18.00   210.00  
20000107 21.00   210.00  
20000108 23.00   208.00  
20000109 27.00   201.00  
20000110 28.00   199.00  
20000111 26.00   196.00  
20000112 24.00   196.00  
20000113 23.00   194.00  
20000114 21.00   192.00  
20000115 19.00   185.00  
20000116 17.00   183.00  
20000117 12.00   179.00  
20000118 11.00   173.00  
20000119 10.00   171.00  
20000120 9.80   167.00  
20000121 9.00   165.00  
20000122 8.40   163.00  
20000123 7.50   157.00  
20000124 7.10   156.00  
20000125 6.70   150.00  
20000126 6.40   148.00  
20000127 6.00   148.00  
20000128 5.90   147.00  
20000129 5.50   145.00  
20000130 5.40   143.00  
20000131 5.30   140.00  
20000201 5.30   140.00

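So now I want to sum and then average the numbers in field1. If I try to slice it with field1[-5:], it doesn't work. How can I get the last 5 numbers of the middle column so that I can sum and average them? Do I need to build a list?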

Answer 1

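I've always been a fan of list comprehensions. This one is a bit more complicated because you may need to skip some values, but it's still my preferred approach.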

def middle_item(line):
    try:
        return float(line.split()[1]), True
    except IndexError:
        return None, False
    except ValueError:
        raise ValueError('Incorrect values in the file.')

with open('Qdata.txt', 'r') as td:
    next(td, None)  # skip the header line
    column = [value
              for line in td
              for value, is_valid in [middle_item(line)]
              if is_valid]
    n = len(column)
    print("Average over the whole period was ", sum(column) / n)
    print("Total number of values was ", n)
    print('Sum of last five:', sum(column[-5:]))

Answer 2

I agree with Brett Beatty's answer using a list comprehension, but if you want to know how to improve your original code, you can do the following.

1) Rename the variable 'sum' to something else, such as "my_sum", because sum() is a Python built-in function.

2) Create a list before the loop (field1 = []) and append to it at each step of the loop. When the loop ends, the list contains all the entries of that column.

3) You can use the built-in function sum(field1[-5:]) / 5 to compute the average of the last five entries of the column.

As follows:

import sys

td = open("Qdata.txt", "r")  # opening the file to variable ("file handle") td

my_sum = 0
n = 0
firstround = True
field1 = [] # make an empty list
for line in td:
    if (firstround):
        firstround = False  # nothing else is done for the first line (header)
    else:
        fields = line.split()  # This creates a list containing the strings on 
        # the line, by default separated by spaces or tabs.
        # Now fields[0] contains the date, fields[1] the 
        # 1st data value and fields[2] the 2nd one.
        try:
            my_sum = my_sum + float(fields[1])  # increasing the cumulative value
            #field1 = (fields[1])
            field1.append(float(fields[1])) # add elements to the end of the list 
            n += 1  # count one more valid value
        # Handling possible errors.
        except IndexError:  # IndexError occurs e.g. in the case of empty lines
            # (when fields[2], for example, doesn't exist)
            continue
        except ValueError:  # ValueError occurs e.g. if there are letters instead of 
            # numbers (when conversion to float causes an error)
            print("Incorrect values in the file.")
            sys.exit()
print("Average over the whole period was ", my_sum / n)
print("Total number of values was ", n)
print("average of last 5 elements of field1: " , sum(field1[-5:])/5)
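
The reason field1[-5:] did not work in the question is that field1 was rebound to a single string on every pass through the loop, so the slice returned the last characters of one value rather than the last five values. A minimal sketch of the difference, assuming a hypothetical list raw_values that holds the final six middle-column strings from Qdata.txt:

```python
# Hypothetical sample: the last six middle-column strings from Qdata.txt.
raw_values = ["6.00", "5.90", "5.50", "5.40", "5.30", "5.30"]

# What the question's code effectively did: keep only the latest string.
field1 = raw_values[-1]
last_chars = field1[-5:]  # slices characters of one string, not five values

# What this answer does: collect floats in a list, then slice the list.
values = [float(v) for v in raw_values]
last_five = values[-5:]
average_last_five = sum(last_five) / len(last_five)

print(repr(last_chars))
print(last_five)
print(round(average_last_five, 2))
```

Dividing by len(last_five) instead of a hard-coded 5 also keeps the average correct when the file has fewer than five data rows.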

Answer 3

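First of all, please don't use function names as variable names. In the following example I renamed sum to sum1. I also append the values to a list and convert the list to a numpy array, which makes the calculations much easier.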

import sys
import numpy as np

td = open("Qdata.txt", "r")  # opening the file to variable ("file handle") td

sum1 = 0
n = 0
firstround = True
field1 = []

for line in td:
    if (firstround):
        firstround = False  # nothing else is done for the first line (header)
    else:
        fields = line.split()  # This creates a list containing the strings on 
        # the line, by default separated by spaces or tabs.
        # Now fields[0] contains the date, fields[1] the 
        # 1st data value and fields[2] the 2nd one.
        try:
            sum1 = sum1 + float(fields[1])  # increasing the cumulative value
            field1.append(float(fields[1]))

            print(fields[1])

            n += 1  # count one more valid value
        # Handling possible errors.
        except IndexError:  # IndexError occurs e.g. in the case of empty lines
            # (when fields[2], for example, doesn't exist)
            continue
        except ValueError:  # ValueError occurs e.g. if there are letters instead of 
            # numbers (when conversion to float causes an error)
            print("Incorrect values in the file.")
            sys.exit()


# transform list into numpy array
field1 = np.array(field1)

print("Average over the whole period was ", field1.mean())
print("Total number of values was ", len(field1))
print(field1)

print("Average over the last five periods was ", field1[-5:].mean())

If you import txt files in Python more often, you may want to take a look at the pandas package.
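
As a rough sketch of what that could look like with pandas (not part of the original answer): the excerpt below is a hypothetical six-row sample embedded in a string so the example runs on its own, and the names sample, df and middle are illustrative. For the real file you would pass "Qdata.txt" to read_csv instead of the StringIO object.

```python
import io

import pandas as pd

# Hypothetical excerpt of Qdata.txt, embedded so the sketch is self-contained.
sample = """Date 3700300 6701500
20000127 6.00 148.00
20000128 5.90 147.00
20000129 5.50 145.00
20000130 5.40 143.00
20000131 5.30 140.00
20000201 5.30 140.00
"""

# sep=r"\s+" splits on runs of whitespace; the first line becomes the column names.
df = pd.read_csv(io.StringIO(sample), sep=r"\s+")

# Select the middle column by position (index 1), as fields[1] does above.
middle = df.iloc[:, 1]

overall_mean = middle.mean()
last_five_mean = middle.tail(5).mean()

print("Average over the whole period was", overall_mean)
print("Total number of values was", len(middle))
print("Average over the last five periods was", last_five_mean)
```

Selecting the column by position avoids depending on the numeric header names, which pandas reads as strings.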

Python - Getting the average of the last 5 float values
