Question

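In Python, I am trying (very badly) to read a .txt file, find the last block that refers to a particular customer, and read a few lines below it to get the current points balance.

A snapshot of the .txt file: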

Customer ID:123
Total sale amount:2345.45

Points from sale:23
Points until next bonus: 77

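I can search for (and find) a specific customer ID, but I can't figure out how to search for only the last occurrence of that ID, or how to return the "Points until next bonus" value... I apologize if this is a basic question, but any help would be greatly appreciated!

My code so far...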

def reward_points():

    #current points total
    rewards = open('sales.txt', 'r')

    line = rewards.readlines()
    search = (str('Customer ID:') + str(Cust_ID))
    print(search) #Customer ID:123

    while line != ' ':
        if line.startswith(search):
            find('Points until next bonus:')
            current_point_total = line[50:52]
            cust_record = rewards.readlines()
            print(current_point_total)


    rewards.close()

reward_points()

Answer 1

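I think you would be better off parsing the file into structured data rather than trying to seek around in it; this is not a particularly convenient file format.

Here is a suggested approach:

Iterate over the file line by line (for example with readline)

Split each line into a label and a field by matching on ':'

Put the labels and fields that represent a customer into a dictionary

Put the dictionary that represents a customer into another dictionary

You then have an in-memory database that you can dereference with dict lookups,

for example customers['1234']['Points until next bonus']

Here is simplified sample code for this approach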

#!/usr/bin/env python
import re

# dictionary holding all the customers
customers = {}
current_id = None

with open("sales.txt") as f:
    # one line at a time
    for line in f:
        # pattern match on 'key:value', splitting at the first colon
        field_match = re.match(r'^([^:]*):(.*)$', line)
        if not field_match:
            continue
        # store the fields in variables, dropping surrounding whitespace
        key, value = (part.strip() for part in field_match.groups())
        # Customer ID means a new record
        if key == "Customer ID":
            # set a key for the 'customers database'
            current_id = value
            # each customer maps to an ordered list of dicts, one per block;
            # setdefault creates the list the first time we see this id
            customers.setdefault(current_id, []).append({})
        elif current_id is None:
            # a field seen before any Customer ID line has no record to go in
            continue
        # store the key and value in the dictionary at the end of the list
        customers[current_id][-1][key] = value

# now customers is a "database" indexed on customer id
#  where the values are a list of dicts of each data block
#
# -1 indexes the last of the list
# so the last record for customer "123" is

print(customers["123"][-1]["Points until next bonus"])

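Update

I hadn't realized that you have multiple blocks per customer and care about their order, so I reworked the sample code to keep, for each customer ID, an ordered list of the data blocks parsed for it.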
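To make the shape of that structure concrete, here is a small sketch of what `customers` might look like after parsing a file with two blocks for the same customer. The values are made up for illustration and are not taken from the asker's file; only the layout (a dict of lists of dicts) follows the sample code above.

```python
# Hypothetical contents of `customers` after parsing; the numbers are invented.
customers = {
    "123": [
        {"Customer ID": "123", "Points from sale": "23", "Points until next bonus": "77"},
        {"Customer ID": "123", "Points from sale": "23", "Points until next bonus": "54"},
    ],
}

# The most recent block is the last element of the list.
latest = customers["123"][-1]["Points until next bonus"]

# The full history is available too, in file order.
history = [block["Points until next bonus"] for block in customers["123"]]

print(latest)
print(history)
```

Keeping a list per customer costs a little memory, but it means the "last occurrence" is simply index -1 and earlier blocks are not lost.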

Answer 2

This is a good use case for itertools.groupby(); the problem fits that pattern well:

Example:

from itertools import groupby


def search(d):
    """Key function used to group our dataset"""
    return d[0] == "Customer ID"


def read_customer_records(filename):
    """Read customer records and return a nicer data structure"""
    data = {}
    with open(filename, "r") as f:
        # clean and remove blank lines
        # (Python 3: the built-in filter/map are lazy, like ifilter/imap in Python 2)
        lines = filter(None, map(str.strip, f))
        # split each line on the ':' token
        lines = (line.split(":", 1) for line in lines)
        # iterate through each customer and their records
        for newcustomer, records in groupby(lines, search):
            if newcustomer:
                # we've found a new customer
                # create a new dict against their customer id
                customer_id = list(records)[0][1]
                data[customer_id] = {}
            else:
                # we've found customer records
                # add each key/value pair (split on ':')
                # to the customer record from above
                for k, v in records:
                    data[customer_id][k] = v
    return data

Output:

>>> read_customer_records("foo.txt")
{'123': {'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}, '124': {'Total sale amount': '245.45', 'Points until next bonus': ' 79', 'Points from sale': '27'}}

You can then look up a customer directly; for example:

>>> data = read_customer_records("foo.txt")
>>> data["123"]
{'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}
>>> data["123"]["Points until next bonus"]
' 77'

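Basically, what we are doing here is "grouping" the dataset based on the Customer ID: lines. We then build a data structure (a dict) on which we can easily do O(1) lookups.

Note: this should work as long as the "customer records" in your "dataset" are separated by Customer ID lines, no matter how many "records" a customer has. If the same customer ID appears more than once, the later block replaces the earlier one, so a lookup returns the most recent values. This implementation also tries to cope with "messy" data as far as it can by cleaning up the input a little.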
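The values in the lookup above are still strings (note the leading space in ' 77'). If numbers are needed, one possible follow-up step is sketched below. The helper name `to_number` is hypothetical and not part of the answer; the sample record is copied from the output shown above.

```python
def to_number(text):
    """Convert a field value such as ' 77' or '2345.45' to int or float."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        # not a whole number, so fall back to float
        return float(text)


# one customer's record, as returned by the lookup above
record = {
    "Total sale amount": "2345.45",
    "Points until next bonus": " 77",
    "Points from sale": "23",
}

numeric = {key: to_number(value) for key, value in record.items()}
print(numeric["Points until next bonus"])
```

This assumes every value in the record is numeric; a field holding free text would raise ValueError and would need to be skipped or handled separately.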

Answer 3

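I would approach this more generally. If I'm not mistaken, you have a file of records in a specific format, where each record starts and ends with **. Why not do it like this?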

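# assumes file_content holds the whole file as one string and
# search is the line to match, e.g. "Customer ID:123"
customer_dictionary = {}
for record in file_content.split("**"):
    lines = record.strip().split("\n")
    if lines[0].strip() == search:
        customer_id = lines[0].split(":", 1)[1].strip()
        # later records overwrite earlier ones, so the last one is kept
        customer_dictionary[customer_id] = record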

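This produces a mapping from customer_id to the latest record. You can then parse that record to get the information you need.

Edit: since every record is always 9 lines long, you can get a list of all the lines in the file and build a list of records, with each record represented by a list of 9 lines. You can use the answer posted here: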

Convert List to a list of tuples python
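As a rough sketch of the fixed-size idea from the edit above: the code below slices a flat list of lines into equal-sized records and then picks the last record for one customer. The function names are hypothetical, and the sample uses invented 3-line records only to keep the example short; for the real file you would pass size=9, assuming every record really is exactly 9 lines.

```python
def chunk_records(lines, size=9):
    """Split a flat list of lines into consecutive records of `size` lines."""
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def last_record_for(records, customer_id):
    """Return the last record containing the given Customer ID line, or None."""
    wanted = "Customer ID:" + str(customer_id)
    found = None
    for record in records:
        if any(line.strip() == wanted for line in record):
            found = record  # keep overwriting so the last match wins
    return found


# invented 3-line records, just to show the shape of the result
sample = [
    "Customer ID:123", "Points from sale:23", "Points until next bonus: 77",
    "Customer ID:124", "Points from sale:27", "Points until next bonus: 79",
    "Customer ID:123", "Points from sale:5", "Points until next bonus: 72",
]
records = chunk_records(sample, size=3)
latest = last_record_for(records, 123)
print(latest)
```

This only works while the record length never varies; a single missing or extra line shifts every record after it, which is why the label-based parsing in the other answers is more forgiving.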

Answer 4

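All you need to do is find the line that reads Customer ID:123; once you find it, loop over the file object in an inner loop until you reach the Points until line, then extract the points. Because every later match overwrites the value, points ends up holding the value from the last occurrence of the customer with that id.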

with open("test.txt") as f:
    points = ""
    for line in f:
        if line.rstrip() == "Customer ID:123":
            for line in f:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break

print(points)
77
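The same loop can be wrapped in a function so the customer ID is not hard-coded. This is only a sketch: the name `find_last_points` is hypothetical, and it accepts any iterable of lines (an open file, or an io.StringIO as in the made-up sample here) so that it can be tried without a file on disk.

```python
import io


def find_last_points(lines, customer_id):
    """Return the points value from the last block for customer_id, or None."""
    target = "Customer ID:" + str(customer_id)
    points = None
    lines = iter(lines)  # share one iterator between the outer and inner loop
    for line in lines:
        if line.rstrip() == target:
            # continue reading from where the outer loop stopped
            for line in lines:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break
    return points


# invented sample data with two blocks for customer 123
sample_text = (
    "Customer ID:123\nTotal sale amount:2345.45\n\n"
    "Points from sale:23\nPoints until next bonus: 77\n\n"
    "Customer ID:124\nPoints until next bonus: 79\n\n"
    "Customer ID:123\nPoints until next bonus: 54\n"
)
result = find_last_points(io.StringIO(sample_text), 123)
print(result)
```

Returning None when the customer is not found lets the caller tell "no such customer" apart from an empty string.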

Answer 5

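def get_points_until_next_bonus(filename, customerID):
    # read the whole file, closing it when done
    with open(filename, 'r') as f:
        content = f.read()
    # keep only the text after the last "Customer ID:<id>" line
    # (the trailing newline stops ID 123 from also matching 1234)
    last_id = content.split('Customer ID:' + str(customerID) + '\n')[-1]
    # take the rest of the first "Points until next bonus:" line, e.g. "77"
    return last_id.split('Points until next bonus:')[1].split('\n')[0].strip()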

Python - Search a .txt file for an ID, then return a variable from the lines below
