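Python - Search a .txt file for an ID, then return a value from the lines below it

Time: 2015-05-23 08:35:22

Tags: python

In Python, I am trying (very badly) to read a .txt file, find the last string that refers to a particular customer, and read a few lines further down to get their current points balance.

A snapshot of the .txt file: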

Customer ID:123
Total sale amount:2345.45

Points from sale:23
Points until next bonus: 77

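I can search for (and find) a specific customer ID, but I can't work out how to search only for the last occurrence of that ID, or how to return the "Points until next bonus" value. I apologise if this is a basic question, but any help would be greatly appreciated!

My code so far...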

def reward_points():

#current points total
rewards = open('sales.txt', 'r')

line = rewards.readlines()
search = (str('Customer ID:') + str(Cust_ID))
print(search) #Customer ID:123

while line != ' ':
    if line.startswith(search):
        find('Points until next bonus:')
        current_point_total = line[50:52]
        cust_record = rewards.readlines()
        print(current_point_total)


rewards.close()

reward_points()

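5 Answers:

Answer 0 (score: 2)

I think you would be better off parsing the file into structured data rather than trying to seek around in the file, since this is not a particularly convenient file format for that.

Here is the suggested approach:

- Iterate over the file using readline
- Split each line into a field and a label by matching on ':'
- Put the fields and labels that represent a customer into a dictionary
- Put the dictionary that represents the customer into another dictionary

You then have an in-memory database that you can dereference with dict lookups,

e.g. customers['1234']['Points until next bonus']

Here is simplified example code for this approach: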

#!/usr/bin/env python
import re

# dictionary with all the customers in 
customers = dict()

with open("sales.txt") as f:
    #one line at a time
    for line in f:
        #pattern match on 'key : value'
        field_match = re.match('^(.*):(.*)$',line)

        if field_match :
            # store the fields in variables
            (key,value) = field_match.groups()
            # Customer ID means a new record
            if key == "Customer ID" :
                # set a key for the 'customers database'
                current_id = value
                # if we have never seen this id before it's the first, make a record
                if customers.get(current_id) is None:
                    customers[current_id] = []
                # make the record an ordered list of dicts for each block
                customers[current_id].append(dict())
            # not a new record, so store the key and value in the dictionary at the end of the list
            customers[current_id][-1][key] = value

# now customers is a "database" indexed on customer id
#  where the values are a list of dicts of each data block
#
# -1 indexes the last of the list
# so the last customer's record for "123" is 

print(customers["123"][-1]["Points until next bonus"])

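Update

I didn't realise that you had multiple blocks per customer and cared about their order, so I have reworked the example code to keep, for each customer ID, an ordered list of the data blocks parsed for it.

Answer 1 (score: 1)

This is a good use case for itertools.groupby(), which fits this pattern well.

Example: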

from itertools import groupby


def search(d):
    """Key function used to group our dataset"""

    return d[0] == "Customer ID"


def read_customer_records(filename):
    """Read customer records and return a nicer data structure"""

    data = {}

    with open(filename, "r") as f:
        # clean and remove blank lines
        # (the built-in filter/map replace Python 2's ifilter/imap)
        lines = filter(None, map(str.strip, f))

        # split each line on the ':' token
        lines = (line.split(":", 1) for line in lines)

        # iterate through each customer and their records
        for newcustomer, records in groupby(lines, search):
            if newcustomer:
                # we've found a new customer
                # create a new dict against their customer id
                customer_id = list(records)[0][1]
                data[customer_id] = {}
            else:
                # we've found customer records
                # add each key/value pair (split on ':')
                # to the customer record from above
                for k, v in records:
                    data[customer_id][k] = v

    return data

Output:

>>> read_customer_records("foo.txt")
{'123': {'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}, '124': {'Total sale amount': '245.45', 'Points until next bonus': ' 79', 'Points from sale': '27'}}

You can then look up a customer directly; for example:

>>> data = read_customer_records("foo.txt")
>>> data["123"]
{'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}
>>> data["123"]["Points until next bonus"]
' 77'

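Basically, what we are doing here is "grouping" the dataset based on the Customer ID: lines. We then build a data structure (a dict) that we can easily do O(1) lookups against.

Note: This works as long as the "customer records" in your "dataset" are separated by a Customer ID line, no matter how many "records" a customer has. This implementation also tries to cope with "messy" data as far as possible by cleaning up the input a little.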
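
To see the grouping concretely, here is a minimal Python 3 sketch that runs on a small in-memory list instead of a file. The sample lines and the helper name is_customer_line are made up for illustration; only itertools.groupby itself comes from the answer above.

```python
from itertools import groupby

# Hypothetical sample data, already stripped and split on the first ':'
lines = [
    ["Customer ID", "123"],
    ["Points from sale", "23"],
    ["Points until next bonus", " 77"],
    ["Customer ID", "124"],
    ["Points from sale", "27"],
]


def is_customer_line(pair):
    """Key function: True for a 'Customer ID' line, False otherwise."""
    return pair[0] == "Customer ID"


# groupby starts a new group every time the key changes, so the groups
# alternate between a header line (True) and the lines below it (False).
groups = [(key, list(group)) for key, group in groupby(lines, is_customer_line)]

for key, group in groups:
    print(key, group)
```

Each True group holds a single header line and each False group holds the lines that belong to that header, which is why the answer's loop can remember customer_id from the True branch and use it in the False branch.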

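Answer 2 (score: 0)

I would approach this more generally. If I'm not mistaken, you have a file of records in a specific format, where each record begins and ends with **. Why not do something like this?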

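# Assumes file_content holds the whole file as one string and
# search is the header line to match, e.g. "Customer ID:123".
customer_dictionary = {}
for record in file_content.split("**"):
    lines = record.strip().split("\n")
    if lines[0] == search:
        customer_id = lines[0].split(":", 1)[1].strip()
        # a later record overwrites an earlier one for the same ID
        customer_dictionary[customer_id] = record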

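This produces a mapping from customer_id to the latest record. You can then parse that record to get the information you need.

Edit: Since each record is 9 lines in total, you can get a list of all the lines in the file and build a list of records, where each record is represented by a list of 9 lines. You can use the answer posted here: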

Convert List to a list of tuples python
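
As a rough sketch of that idea (this is not the code from the linked answer), the lines can be sliced into fixed-size groups. The function name chunk_records is made up here, and the default record length of 9 is taken from the edit above; adjust it if your file differs.

```python
def chunk_records(lines, size=9):
    """Split a flat list of lines into consecutive lists of `size` lines.

    The last chunk is shorter if len(lines) is not a multiple of size.
    """
    return [lines[i:i + size] for i in range(0, len(lines), size)]


# Usage with made-up data: 18 lines become 2 records of 9 lines each
sample = ["line %d" % n for n in range(18)]
records = chunk_records(sample)
print(len(records))
```

This only works if every record really has the same number of lines, blank lines included; a single missing line shifts every record after it.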

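Answer 3 (score: 0)

All you need to do is find the line that starts with Customer ID:123. When you find it, loop over the file object in an inner loop until you reach the Points until line, then extract the points. points will end up holding the value from the last occurrence of the customer with that ID.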

with open("test.txt") as f:
    points = ""
    for line in f:
        if line.rstrip() == "Customer ID:123":
            for line in f:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break

print(points)
77
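
The same loop can be wrapped in a function that takes the customer ID as a parameter. This is only a sketch: the function name last_points_for, the sample data, and the use of io.StringIO to stand in for the file are assumptions for illustration. The inner loop advances the same iterator as the outer loop, so each line is read only once.

```python
import io


def last_points_for(lines, customer_id):
    """Return the 'Points until next bonus' value from the last block
    for customer_id, or None if the customer never appears."""
    header = "Customer ID:%s" % customer_id
    points = None
    it = iter(lines)
    for line in it:
        if line.rstrip() == header:
            # keep reading from the same iterator until the points line
            for line in it:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break
    return points


# Made-up sample with two blocks for customer 123
sample = io.StringIO(
    "Customer ID:123\n"
    "Points until next bonus: 77\n"
    "Customer ID:124\n"
    "Points until next bonus: 79\n"
    "Customer ID:123\n"
    "Points until next bonus: 50\n"
)
result = last_points_for(sample, "123")
print(result)
```

With a real file you would pass the open file object instead, for example inside with open("test.txt") as f. The value comes back as a string, so convert it with int() if you need to do arithmetic on it.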

Answer 4 (score: 0)

def get_points_until_next_bonus(filename, customerID):
    #get the last "Customer ID":
    with open(filename, 'r') as f:  # close the file once it has been read
        last_id = f.read().split('Customer ID:' + str(customerID))[-1]
    #get the first line with Points until next bonus: 77
    return last_id.split('Points until next bonus: ')[1].split('\n')[0]
    #there you go...