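Trying to sum a specific column from a split list

Time: 2012-04-21 19:15:32

Tags: python

I have an input file that I need to split. The file can have any number of lines, but each line has 4 fields. First is the region code, next is the number of fiction books sold in that region, then the number of non-fiction books sold in that region, and finally the tax rate for the region (e.g. TX 493 515 0.055). I have figured out everything I need to do except summing up all the fiction books, the non-fiction books, and the total sales. Say there are only three lines in total and the fiction books sold per region are 493, 500, and 289, each on its own line. This is what I wrote, and I would like to know what I'm doing wrong: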

while (myFile != ""):

    myFile = myFile.split()
    sumFiction = 0
    for i in range(myFile):
        sumFiction = sumFiction + eval(myFile[1])

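If I split the line (CO 493 515 0.055), isn't CO myFile[0], 493 myFile[1], and so on? Any help would be appreciated.

Edit: Sorry, I should have been more specific. I'm reading from a file; say the file has 3 lines (but my code needs to handle any number of lines):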

TX 415 555 0.55
MN 330 999 0.78
HA 401 674 0.99

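First is the region code, then the number of fiction books sold, then the number of non-fiction books sold, then the tax rate for the region. I need to work out the total for the books sold in each region, which I have done. The only thing I can't figure out is how to sum the fiction books sold across all three lines (e.g. 415, 330, 401). Here is the code so far: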

def ComputeSales(fictionBooks,nonFictionBooks,areaTax):
    total = (fictionBooks * 14.95) + (nonFictionBooks * 9.95)
    tax = total * areaTax
    totalSales = total + tax

    return total,tax,totalSales

def main():
    #inFile = input("Please enter name of book data file:  ")
    #inFile = open(inFile,"r")
    inFile = open("pa7.books","r")
    myFile = inFile.readline()

    print()
    print("{0:14}{1:10}".format("","Units Sold"))
    print("{0:10}{1:11}{2:17}{3:12}{4:8}{5:11}".format(
                "Region","Fiction","Non-Fiction","Total","Tax","Total Sales"))
    print("---------------------------------------------------------------------")

    while (myFile != ""):
        myFile = myFile.split()
        sumFiction = 0
        #for i in range(myFile):
            #sumFiction = sumFiction + eval(myFile[1])

        total,tax,totalSales = ComputeSales(eval(myFile[1]),eval(myFile[2]),eval(myFile[3]))

        print("{0:2}{1:10}{2:13}{3:4}{4:14.2f}{5:10.2f}{6:16.2f}".format(
                   "",myFile[0],myFile[1],myFile[2],total,tax,totalSales))

        myFile = inFile.readline()

    print("---------------------------------------------------------------------")
    #print("{0:11}{1:13}{2:34}{3:2}{4:8}".format(
    #             "Total","15035","3155","$","272843.41"))
    print(sumFiction)

main()

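2 Answers:

Answer 0: (score: 2)

Edit: OK, my previous answer was based on the assumption that myFile was actually the file object, not a single line of the file.

Your main problem seems to be that you are trying to run one loop inside another, which doesn't really make sense here: you only need one loop, over the lines of the file, adding up the totals from each line.

Here is an edited version of your main function. I also:

  • Switched to a for loop over the file, since it's more natural.
  • Used float instead of eval, as suggested in the comments, so that a malicious or malformed data file will just crash your program instead of running arbitrary code.
  • Switched to using a with statement to open the file: it guarantees the file will be closed even if your program crashes partway through, which is a good habit to get into, although it doesn't make much difference here.
  • Switched the variable names from camelCase to the standard Python snake_case style. (Also, ComputeSales would normally be compute_sales; CamelCase names are usually used only for class names.)
  • Changed the filename into a parameter, so that you can support a command-line argument with e.g. main(sys.argv[1] if len(sys.argv) > 1 else "pa7.books").

Here it is: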

def main(filename="pa7.books"):
    sum_fiction = 0
    sum_nonfiction = 0
    sum_total = 0

    with open(filename) as in_file:
        for line in in_file:
            if not line.strip():
                continue  # skip any blank lines

            fields = line.split()
            region = fields[0]
            fiction, nonfiction, area_tax = [float(x) for x in fields[1:]]

            total, tax, total_sales = ComputeSales(fiction, nonfiction, area_tax)

            sum_fiction += fiction
            sum_nonfiction += nonfiction
            sum_total += total_sales

            # Print one row per region, so this belongs inside the loop.
            print("{0:2}{1:10}{2:13}{3:4}{4:14.2f}{5:10.2f}{6:16.2f}".format(
                   "", region, fiction, nonfiction, total, tax, total_sales))

    print("---------------------------------------------------------------------")
    print("{0:11}{1:13}{2:34}{3:2}{4:8}".format(
           "Total", sum_fiction, sum_nonfiction, "$", sum_total))

If you don't understand any of the changes I've suggested, feel free to ask!
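For readers who want to try the single-loop approach without the data file, here is a minimal self-contained sketch. It reuses the arithmetic from the question's ComputeSales (prices 14.95 and 9.95) and the three sample rows from the question. The names `compute_sales`, `summarize`, and `SAMPLE` are illustrative assumptions, not part of the original answer, and the lines are passed in as a list rather than read from pa7.books.

```python
def compute_sales(fiction, nonfiction, area_tax):
    """Same arithmetic as the question's ComputeSales."""
    total = fiction * 14.95 + nonfiction * 9.95
    tax = total * area_tax
    return total, tax, total + tax


def summarize(lines):
    """Accumulate running totals in a single pass over the lines."""
    sum_fiction = 0
    sum_nonfiction = 0
    sum_total = 0.0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        region, fiction, nonfiction, area_tax = line.split()
        fiction, nonfiction = int(fiction), int(nonfiction)
        _, _, total_sales = compute_sales(fiction, nonfiction, float(area_tax))
        # The accumulators live outside the loop, so they survive each iteration.
        sum_fiction += fiction
        sum_nonfiction += nonfiction
        sum_total += total_sales
    return sum_fiction, sum_nonfiction, sum_total


# Sample rows taken from the question (illustrative input only).
SAMPLE = ["TX 415 555 0.55", "MN 330 999 0.78", "HA 401 674 0.99"]
fic, nonfic, grand_total = summarize(SAMPLE)
print(fic, nonfic)  # 1146 2228
```

The key point is that the accumulators are initialised once, before the loop, rather than being reset to 0 on every line as in the question's code.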

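Answer 1: (score: 1)

Ugh. This is ugly. Beautiful is Better Than Ugly. I'm no longer a Python programmer, so there may be better tools for this. But let's tackle the problem at a conceptual level.

This is standard imperative programming, and it complicates the problem. It makes it easy to get lost in implementation noise, such as the problem you ran into. It keeps you from seeing the forest for the trees. Let's try another approach.

Let's focus on what we need to do and let the implementation emerge from that. First, we know we need to read from a file.

Reading from a file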

Scenario: Calculate totals within region database
Feature: Read from database

As a user, in order to be able to view the total sales of my books and differentiate them by fiction and nonfiction, I want to be able to read data from a file.

Given: I have a file that has region data, for example data.text
When: I load data from it
Then: I should have associated region data available in my program.

Here is the Python implementation as a test case:

import unittest

class RegionTests(unittest.TestCase):
    def testLoadARegionDatabase(self):
        """Given a region file,when I load it, then it should be stored in memory"""
        # Given region database
        regionDatabase = []
        # When I load it
        with open('./regions.txt','r') as f:
            regionDatabase = f.readlines()
        # Then contents should be available
        self.assertTrue(len(regionDatabase) > 0)
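The test above only passes if ./regions.txt already exists next to it. As a hedged variant, the sketch below writes its own temporary file first, so it can run anywhere. The sample rows come from the question; the class name `RegionFileTests` and the use of `tempfile` are assumptions for illustration and not part of the original answer.

```python
import os
import tempfile
import unittest


class RegionFileTests(unittest.TestCase):
    def setUp(self):
        # Create a throwaway region file containing the question's sample rows.
        fd, self.path = tempfile.mkstemp(suffix=".txt")
        with os.fdopen(fd, "w") as f:
            f.write("TX 415 555 0.55\nMN 330 999 0.78\nHA 401 674 0.99\n")

    def tearDown(self):
        os.remove(self.path)

    def testLoadARegionDatabase(self):
        """Given a region file, when I load it, then its lines are in memory."""
        with open(self.path) as f:
            regionDatabase = f.readlines()
        self.assertEqual(3, len(regionDatabase))


# Run the test case programmatically so the snippet works as a plain script.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(RegionFileTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```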

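Getting region data from the file

Conceptually, we know that each line in that file has meaning. Fundamentally, each line is a Region. We store a code, fiction sales, non-fiction sales, and a tax rate in the file. The concept of a region should have an explicit, first-class representation in our system, because Explicit is Better Than Implicit.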

Feature: Create a Region

As a user, in order to be able to know a region's information--including nonfiction sales, fiction sales, and tax rate-- I want to be able to create a Region.

Given: I have data for fiction sales, non-fiction sales, and tax rate
When:  I create a Region
Then:  Its sales, non-fiction sales, and tax-rate should be set accordingly

Here is the Python implementation as a test case:

def testCreateRegionFromData(self):
        """Given a set of data, when I create a region, then its non-fiction sales, fiction sales,and tax rate should be set"""
        # Given a set of data
        texas = { "regionCode": "TX", "fiction" : 415, "nonfiction" : 555, "taxRate" : 0.55 }
        # When I create a region
        region = Region(texas["regionCode"], texas["fiction"], texas["nonfiction"], texas["taxRate"])
        # Then its attributes should be set
        self.assertEqual("TX", region.code)
        self.assertEqual(415, region.fiction)
        self.assertEqual(555, region.nonfiction)
        self.assertEqual(0.55, region.taxRate)

This fails. Let's make it pass.

class Region:
    def __init__(self, code, fiction, nonfiction,rate):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate

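Analyzing totals

Now we know our system can represent regions. We want something that can analyze a bunch of regions and give us summary statistics about sales. We'll call it an Analyst.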

Feature: Calculate Total Sales

As a user, in order to be able to know what is going on, I want to be able to ask an Analyst what the total sales are for my region

Given: I have a set of regions
When : I ask my Analyst what the total sales are
Then : The analyst should return me the correct answers

Here is the Python implementation as a test case.

def testAnalyzeRegionsForTotalNonFictionSales(self):
    """Given a set of Region, When I ask an Analyst for total non-fiction sales, then I should get the sum of non-fiction sales"""
    # Given a set of regions
    regions = [ Region("TX", 415, 555, 0.55), Region("MN", 330, 999, 0.78), Region("HA", 401, 674, 0.99) ]
    # When I ask my analyst for the total non-fiction sales
    analyst = Analyst(regions)
    result = analyst.calculateTotalNonFictionSales()
    self.assertEqual(2228, result)

This fails. Let's make it pass.

class Analyst:
    def __init__(self,regions):
        self.regions = regions

    def calculateTotalNonFictionSales(self):
        return sum([reg.nonfiction for reg in self.regions])

You should be able to extrapolate fiction sales from here.
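Following the same pattern, a fiction counterpart might look like the sketch below. The method name `calculateTotalFictionSales` matches the call made in the final program later in this answer; its body is an assumption that simply mirrors the non-fiction version. `Region` is repeated here only so the snippet runs on its own.

```python
class Region:
    def __init__(self, code, fiction, nonfiction, rate):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate


class Analyst:
    def __init__(self, regions):
        self.regions = regions

    def calculateTotalNonFictionSales(self):
        return sum(reg.nonfiction for reg in self.regions)

    def calculateTotalFictionSales(self):
        # Assumed body: the same sum, over the fiction attribute instead.
        return sum(reg.fiction for reg in self.regions)


# The three sample regions used in the test case above.
regions = [
    Region("TX", 415, 555, 0.55),
    Region("MN", 330, 999, 0.78),
    Region("HA", 401, 674, 0.99),
]
analyst = Analyst(regions)
total_fiction = analyst.calculateTotalFictionSales()
print(total_fiction)  # 1146
```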

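Decisions, decisions

There is an interesting design decision when it comes to total sales.

  • Should we have the Analyst read a Region's fiction and non-fiction attributes directly and add them up?

We could do it like this: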

def calculateTotalSales(self):
    return sum([reg.fiction + reg.nonfiction for reg in self.regions])

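But what happens if we add "historical drama" (both fiction and non-fiction) or some other attribute? Then every time we change Region, we would also have to change Analyst to account for Region's new structure.

No. That is a poor design decision. A Region already knows everything it needs to know about its total sales. A Region should be able to report its own total.

Make good choices!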

Feature: Report Total Sales
Given: I have a region with fiction and non-fiction sales
When : I ask the region for its total sales
Then: The region should tell me its total sales

Here is the Python implementation as a test case:

def testGetTotalSalesForRegion(self):
        """Given a region with fiction and nonfiction sales, when I ask for its total sales, then I should get the result"""
        # Given a set of data
        texas = { "regionCode": "TX", "fiction" : 415, "nonfiction" : 555, "taxRate" : 0.55 }
        region = Region("TX", 415, 555, 0.55)
        # When I ask the region for its total sales
        result = region.totalSales()
        # Then I should get the sum of the sales
        self.assertEqual(970, result)

The Analyst should follow Tell, Don't Ask:

def calculateTotalSales(self):
        return sum([reg.totalSales() for reg in self.regions])
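The Region.totalSales() method that this relies on is never shown in the answer. A minimal sketch follows, assuming it simply adds the two unit counts, which is what the test's expected value implies (970 = 415 + 555). The classes are repeated only so the snippet runs on its own.

```python
class Region:
    def __init__(self, code, fiction, nonfiction, rate):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate

    def totalSales(self):
        # Assumed definition: total units sold, fiction plus non-fiction.
        return self.fiction + self.nonfiction


class Analyst:
    def __init__(self, regions):
        self.regions = regions

    def calculateTotalSales(self):
        # The Analyst asks each Region for its total instead of reading its fields.
        return sum(reg.totalSales() for reg in self.regions)


texas = Region("TX", 415, 555, 0.55)
analyst = Analyst([texas, Region("MN", 330, 999, 0.78)])
print(texas.totalSales())             # 970
print(analyst.calculateTotalSales())  # 2299
```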

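Now you have everything you need to write this application. On top of that, if you make changes later, you have an automated regression suite. It can tell you exactly what you broke, and the tests explicitly specify what the application is and what it can do.

Result

Here is the resulting program: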

from region import Region
from analyst import Analyst

def main():
    text = readFromRegionFile()
    regions = createRegionsFromText(text)
    analyst = Analyst(regions)
    printResults(analyst)

def readFromRegionFile():
    regionDatabase = []
    with open('./regions.txt','r') as f:
        regionDatabase = f.readlines()
    return regionDatabase

def createRegionsFromText(text):
    regions = []
    for line in text:
        data = line.split()
        # Convert the text fields to numbers so the Analyst can sum them.
        regions.append(Region(data[0], int(data[1]), int(data[2]), float(data[3])))
    return regions

def printResults(analyst):
    totSales = analyst.calculateTotalSales()
    totFic = analyst.calculateTotalFictionSales()
    totNon = analyst.calculateTotalNonFictionSales()
    for r in analyst.regions:
        print("{0:2}{1:10}{2:13}{3:4}{4:14.2f}{5:10.2f}".format(
           "", r.code, r.fiction, r.nonfiction, r.totalSales(), r.taxRate))

    print("---------------------------------------------------------------------")
    print("{0:11}{1:13}{2:34}{3:2}{4:8}".format(
           "Total", totFic, totNon, "$", totSales))

if __name__ == "__main__":
    main()

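Compare this with what you wrote. Which one is easier to understand? Which is more concise? What would you need to change in each of the two if:

  • You added music sales for each region?
  • You moved from a text file to a MySQL database or a web service call?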
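As one way to think through the first question: under the Tell, Don't Ask design, adding a music column would touch Region and the line parser, while Analyst.calculateTotalSales would stay exactly as it is. The sketch below is hypothetical throughout: the `music` attribute, the fifth column in the input line, and the `parseRegion` helper are assumptions for illustration only.

```python
class Region:
    def __init__(self, code, fiction, nonfiction, rate, music=0):
        self.code = code
        self.fiction = fiction
        self.nonfiction = nonfiction
        self.taxRate = rate
        self.music = music  # hypothetical new attribute

    def totalSales(self):
        # Only Region needs to know that music now counts toward the total.
        return self.fiction + self.nonfiction + self.music


class Analyst:
    def __init__(self, regions):
        self.regions = regions

    def calculateTotalSales(self):
        # Unchanged from the version without music sales.
        return sum(reg.totalSales() for reg in self.regions)


def parseRegion(line):
    # Hypothetical layout: code, fiction, non-fiction, tax rate, music.
    code, fiction, nonfiction, rate, music = line.split()
    return Region(code, int(fiction), int(nonfiction), float(rate), int(music))


regions = [parseRegion("TX 415 555 0.55 100"), parseRegion("MN 330 999 0.78 50")]
total = Analyst(regions).calculateTotalSales()
print(total)  # 2449
```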

Let your concepts show through. Be clear, concise, and expressive in your code.