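In Python, I am trying (very badly) to read a .txt file, find the last string that refers to a specific customer, and then read a few lines below it to get the current points balance.
A snapshot of the .txt file: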
Customer ID:123
Total sale amount:2345.45
Points from sale:23
Points until next bonus: 77
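I can search for (and find) a specific customer ID, but I can't figure out how to search only for the last occurrence of this ID, or how to return the "Points until next bonus" value... I apologise if this is a basic question, but any help would be greatly appreciated!
My code so far...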
def reward_points():
    #current points total
    rewards = open('sales.txt', 'r')
    line = rewards.readlines()
    search = (str('Customer ID:') + str(Cust_ID))
    print(search) #Customer ID:123
    while line != ' ':
        if line.startswith(search):
            find('Points until next bonus:')
            current_point_total = line[50:52]
            cust_record = rewards.readlines()
            print(current_point_total)
    rewards.close()
reward_points()
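Answer 0 (score: 2)
I think you would be better off parsing the file into structured data rather than trying to seek around in it, since this is not a particularly convenient file format.
Here is a suggested approach:
Read the file line by line using readline
Split each line into a label and a field by matching on ':'
Put the labels and fields that represent a customer into a dictionary
Put the dictionary that represents the customer into another dictionary
You then have an in-memory database that you can dereference with dict lookups,
e.g. customers['1234']['Points until next bonus']
Here is simplified example code for this approach: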
#!/usr/bin/env python
import re
# dictionary with all the customers in
customers = dict()
with open("sales.txt") as f:
    # one line at a time
    for line in f:
        # pattern match on 'key : value'
        field_match = re.match('^(.*):(.*)$', line)
        if field_match:
            # store the fields in variables
            (key, value) = field_match.groups()
            # Customer ID means a new record
            if key == "Customer ID":
                # set a key for the 'customers database'
                current_id = value
                # if we have never seen this id before it's the first, make a record
                if customers.get(current_id) is None:
                    customers[current_id] = []
                # make the record an ordered list of dicts for each block
                customers[current_id].append(dict())
            # store the key and value in the dictionary at the end of the list
            customers[current_id][-1][key] = value
# now customers is a "database" indexed on customer id
# where the values are a list of dicts of each data block
#
# -1 indexes the last of the list
# so the last record for customer "123" is
print(customers["123"][-1]["Points until next bonus"])
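Update
I didn't realise that you had multiple blocks per customer and were interested in their order, so I reworked the example code to keep an ordered list of every data block parsed for each customer ID.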
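As a rough sketch of the same idea without the regular expression, the block below groups lines into one dict per block and one list of blocks per customer. The names parse_blocks and last_points are hypothetical, the sample lines are made up for illustration, and stripping whitespace from keys and values is an assumption that the code above does not make.

```python
def parse_blocks(lines):
    """Group 'key:value' lines into per-customer lists of dicts (hypothetical helper)."""
    customers = {}
    current = None  # dict for the block being filled; None before the first ID line
    for line in lines:
        key, sep, value = line.partition(":")
        if not sep:
            continue  # no ':' on this line, so it is not a field
        key, value = key.strip(), value.strip()
        if key == "Customer ID":
            # each Customer ID line starts a new block for that customer
            current = {}
            customers.setdefault(value, []).append(current)
        elif current is not None:
            current[key] = value
    return customers


def last_points(customers, customer_id):
    """Return the points balance from the customer's most recent block, as an int."""
    return int(customers[customer_id][-1]["Points until next bonus"])


# Made-up sample data: customer 123 appears twice
sample = [
    "Customer ID:123",
    "Points until next bonus: 77",
    "Customer ID:124",
    "Points until next bonus: 79",
    "Customer ID:123",
    "Points until next bonus: 40",
]
customers = parse_blocks(sample)
print(last_points(customers, "123"))  # 40
```

Because blocks are appended in file order, index -1 is always the most recent block for that customer.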
Answer 1 (score: 1)
This is a good use case for itertools.groupby(), which fits this pattern well:
Example:
from itertools import groupby
def search(d):
    """Key function used to group our dataset"""
    return d[0] == "Customer ID"
def read_customer_records(filename):
    """Read customer records and return a nicer data structure"""
    data = {}
    with open(filename, "r") as f:
        # clean and remove blank lines
        lines = filter(None, map(str.strip, f))
        # split each line on the ':' token
        lines = (line.split(":", 1) for line in lines)
        # iterate through each customer and their records
        for newcustomer, records in groupby(lines, search):
            if newcustomer:
                # we've found a new customer
                # create a new dict against their customer id
                customer_id = list(records)[0][1]
                data[customer_id] = {}
            else:
                # we've found customer records
                # add each key/value pair (split from ':')
                # to the customer record from above
                for k, v in records:
                    data[customer_id][k] = v
    return data
Output:
>>> read_customer_records("foo.txt")
{'123': {'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}, '124': {'Total sale amount': '245.45', 'Points until next bonus': ' 79', 'Points from sale': '27'}}
You can then look up customers directly; for example:
>>> data = read_customer_records("foo.txt")
>>> data["123"]
{'Total sale amount': '2345.45', 'Points until next bonus': ' 77', 'Points from sale': '23'}
>>> data["123"]["Points until next bonus"]
' 77'
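Basically, what we are doing here is "grouping" the dataset based on the Customer ID: line. We then create a data structure (a dict) that we can easily do O(1) lookups against.
Note: This will work as long as the "customer records" in your "dataset" are separated by Customer ID, no matter how many "records" a customer has. This implementation also tries to cope with "messy" data as far as possible by cleaning up the input a little.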
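To make the grouping step more concrete, here is a small sketch of how groupby splits a sequence into runs whenever the key function's result changes. The pairs below are made-up sample data and the name is_customer_id is hypothetical; it plays the same role as the key function in the answer.

```python
from itertools import groupby

# Made-up sample lines, already split into (key, value) pairs
pairs = [
    ("Customer ID", "123"),
    ("Total sale amount", "2345.45"),
    ("Points from sale", "23"),
    ("Customer ID", "124"),
    ("Points from sale", "27"),
]


def is_customer_id(pair):
    """Key function: True for 'Customer ID' pairs, False for everything else."""
    return pair[0] == "Customer ID"


# groupby starts a new group each time the key function's result changes
runs = [
    (flag, [key for key, _ in group])
    for flag, group in groupby(pairs, is_customer_id)
]
for flag, keys in runs:
    print(flag, keys)
```

The groups alternate between a True run (the ID line) and a False run (that customer's fields). One thing to keep in mind: two consecutive Customer ID lines with no fields between them would fall into a single True group.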
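Answer 2 (score: 0)
I would approach this more generally. If I'm not mistaken, you have a file of records in a specific format, where each record starts and ends with **. Why not do this?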
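# file_content holds the whole file as a string; search is e.g. "Customer ID:123"
customer_dictionary = {}
for record in file_content.split("**"):
    lines = record.strip().split("\n")
    if lines[0] == search:
        customer_id = lines[0].split(":", 1)[1]
        # later records overwrite earlier ones, so the latest record wins
        customer_dictionary[customer_id] = record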
This produces a mapping from customer_id to that customer's latest record. You can then parse the record to get the information you need.
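A minimal sketch of that last parsing step might look like the following, assuming each record is a multi-line string of key:value lines as in the question. The name parse_record and the sample record are hypothetical.

```python
def parse_record(record):
    """Turn one multi-line record string into a dict of field name -> value."""
    fields = {}
    for line in record.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a ':' separator
            fields[key.strip()] = value.strip()
    return fields


# Made-up record in the format shown in the question
record = (
    "Customer ID:123\n"
    "Total sale amount:2345.45\n"
    "Points from sale:23\n"
    "Points until next bonus: 77\n"
)
fields = parse_record(record)
print(fields["Points until next bonus"])  # 77
```

All values stay strings here; convert with int() or float() where a number is needed.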
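Edit: Since each record has 9 lines in total, you can get a list of all the lines in the file and build a list of records, where each record is represented by a list of 9 lines. You can use the answer posted here: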
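Answer 3 (score: 0)
All you need to do is find the line that starts with Customer ID:123. When you find it, keep looping over the file object in an inner loop until you reach the Points until line, then extract the points. points will end up holding the value from the last occurrence of the customer with that id.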
with open("test.txt") as f:
    points = ""
    for line in f:
        if line.rstrip() == "Customer ID:123":
            for line in f:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break
    print(points)
77
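The nested loops work because both loops pull from the same iterator, so the inner loop continues from the line after the ID line. Here is a sketch of the same technique wrapped in a function; the name points_for and the sample data are hypothetical, and io.StringIO stands in for the open file.

```python
import io


def points_for(lines, customer_id):
    """Return the 'Points until next bonus' value from the customer's last block."""
    target = "Customer ID:" + str(customer_id)
    points = None
    it = iter(lines)  # one shared iterator for both loops
    for line in it:
        if line.rstrip() == target:
            # continue from the line after the ID; stop at the points line
            for line in it:
                if line.startswith("Points until"):
                    points = line.rsplit(None, 1)[1]
                    break
    return points


# Made-up data: customer 123 appears twice, so the second block should win
data = io.StringIO(
    "Customer ID:123\nPoints until next bonus: 77\n"
    "Customer ID:124\nPoints until next bonus: 79\n"
    "Customer ID:123\nPoints until next bonus: 40\n"
)
print(points_for(data, 123))  # 40
```

Note that if a matching block had no Points until line, the inner loop would run on into the next customer's block and pick up that customer's value instead.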
Answer 4 (score: 0)
def get_points_until_next_bonus(filename, customerID):
    # get the text after the last "Customer ID:<id>" line
    # (the trailing '\n' keeps ID 123 from also matching ID 1234)
    with open(filename, 'r') as f:
        last_id = f.read().split('Customer ID:' + str(customerID) + '\n')[-1]
    # get the first line with Points until next bonus: 77
    return last_id.split('Points until next bonus: ')[1].split('\n')[0]
#there you go...
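A variation on the same idea, sketched with str.rfind so that the text is not split into pieces and a missing customer returns None instead of raising an error. The name points_after_last_id and the sample text are hypothetical, and matching the ID together with the newline that follows it is an assumption about the file layout.

```python
def points_after_last_id(text, customer_id):
    """Find the last block for customer_id in text and return its points value."""
    marker = "Customer ID:" + str(customer_id) + "\n"
    start = text.rfind(marker)  # index of the last occurrence, or -1
    if start == -1:
        return None  # customer not present
    label = "Points until next bonus:"
    pos = text.find(label, start)
    if pos == -1:
        return None  # no points line after the last ID line
    end = text.find("\n", pos)
    if end == -1:
        end = len(text)  # points line is the last line, with no trailing newline
    return text[pos + len(label):end].strip()


# Made-up sample text; ID 1234 is there to show it is not confused with 123
text = (
    "Customer ID:123\nPoints until next bonus: 77\n"
    "Customer ID:1234\nPoints until next bonus: 5\n"
)
print(points_after_last_id(text, 123))  # 77
```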