I'm new to Python and need some help with a string I have:
string='Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'
and I need to convert it into a table that looks more like this:
Category Dish Price
Starters Salad with Greens 14.00
Starters Salad Goat Cheese 12.75
Mains Pizza 12.75
Mains Pasta 12.75
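What is the best way to achieve this?
I tried applying string.rsplit(" ", 2), but couldn't figure out how to do that for every line, and I don't know how to repeat the heading in a separate column. Any help would be appreciated.
Thanks in advance!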
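Answer 0 (score: 2)
I think you first have to decide how to tell categories and items apart. I assume an item always has a price. This code checks whether the line contains a dot, but you should probably use a regexp instead.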
s = 'Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75'
items = s.split('\n')
# ['Starters', 'Salad with Greens 14.00', 'Salad Goat Cheese 12.75', 'Mains', 'Pizza 12.75', 'Pasta 12.75']
category = ''
menu = {}
for item in items:
    print(item)
    if '.' in item:
        # a line with a dot is treated as a dish with a price
        menu[category].append(item)
    else:
        # anything else starts a new category
        category = item
        menu[category] = []
print(menu)
# {'Starters': ['Salad with Greens 14.00', 'Salad Goat Cheese 12.75'], 'Mains': ['Pizza 12.75', 'Pasta 12.75']}
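UPD: you can replace
    if '.' in item:
with
    if re.match(r".*\d\.\d\d$", item):
(after import re). This matches only lines that end in a price such as 1.11, which is useful if a category name contains an abbreviation with a dot.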
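To get from the dict above to the rows the question asks for, the price check can be combined with a split on the last space. This is only a minimal sketch: parse_menu and PRICE_AT_END are names made up for illustration, and it assumes every dish line ends in a price with two decimals and that the dish name and price are separated by a single space.

```python
import re

# Assumption: a line is a dish only if it ends in a price such as 14.00.
PRICE_AT_END = re.compile(r"\d+\.\d\d$")


def parse_menu(text):
    """Return a list of (category, dish, price) tuples (hypothetical helper)."""
    rows = []
    category = ""
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. the one left by a trailing newline
        if PRICE_AT_END.search(line):
            # split once from the right so spaces in the dish name survive
            dish, price = line.rsplit(" ", 1)
            rows.append((category, dish, price))
        else:
            # no price at the end, so treat the line as a category heading
            category = line
    return rows


s = 'Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'
rows = parse_menu(s)
for row in rows:
    print(" ".join(row))
```

Because the check is anchored at the end of the line, a heading such as 'St. John Specials' would still be treated as a category.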
Answer 1 (score: 1)
Not that I would use this in production, but for the sake of the academic challenge:
import re
string = """Starters
Salad with Greens 14.00
Salad Goat Cheese 12.75
Mains
Pizza 12.75
Pasta 12.75"""
rx = re.compile(r'^(Starters|Mains)', re.MULTILINE)
result = "\n".join(["{}\t{}".format(category, line)
                    for parts in [[part.strip() for part in rx.split(string) if part]]
                    for category, dish in zip(parts[0::2], parts[1::2])
                    for line in dish.split("\n")])
print(result)
This yields
Starters Salad with Greens 14.00
Starters Salad Goat Cheese 12.75
Mains Pizza 12.75
Mains Pasta 12.75
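The comprehension above is dense, so here is a small sketch of the step it relies on: when the pattern contains a capturing group, re.split keeps the matched headings in the result, so headings and their blocks alternate in the list. The shortened text and the names parts, cleaned and pairs are only for illustration.

```python
import re

text = "Starters\nSalad 14.00\nMains\nPizza 12.75"
rx = re.compile(r'^(Starters|Mains)', re.MULTILINE)

# The capturing group makes split() return the headings too.
parts = rx.split(text)
print(parts)
# ['', 'Starters', '\nSalad 14.00\n', 'Mains', '\nPizza 12.75']

# Drop the empty leading piece and strip the newlines around each block.
cleaned = [part.strip() for part in parts if part]

# Even positions hold headings, odd positions hold the block under each heading.
pairs = list(zip(cleaned[0::2], cleaned[1::2]))
print(pairs)
# [('Starters', 'Salad 14.00'), ('Mains', 'Pizza 12.75')]
```

Note that this pairing only lines up if every heading is followed by at least one dish line.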
Answer 2 (score: 0)
Try this. Note: it assumes 'Starters' is listed before 'Mains'.
category = 'Starters'
for item in string.split('\n'):
    if item == 'Mains':
        category = 'Mains'
    if not item or item in ('Starters', 'Mains'):
        continue  # skip headings and the empty string left by the trailing newline
    price = item.split(' ')[-1]
    dish = ' '.join(item.split(' ')[:-1])
    print('{} {} {}'.format(category, dish, price))
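If the goal is output that actually lines up as a table with the Category / Dish / Price header from the question, the same loop can collect rows first and pad each column afterwards. A rough sketch, under the same assumption that the only headings are 'Starters' and 'Mains'; the names categories, rows, widths and table are illustrative.

```python
string = 'Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'
categories = ('Starters', 'Mains')  # assumption: the headings are known up front

rows = [('Category', 'Dish', 'Price')]  # header row from the question
category = None
for item in string.splitlines():
    if item in categories:
        category = item
        continue
    if not item:
        continue  # ignore blank lines
    # one split from the right separates the price from the dish name
    dish, price = item.rsplit(' ', 1)
    rows.append((category, dish, price))

# Pad every column to the width of its longest entry.
widths = [max(len(row[i]) for row in rows) for i in range(3)]
table = '\n'.join(
    '  '.join([cat.ljust(widths[0]), dish.ljust(widths[1]), price.rjust(widths[2])])
    for cat, dish, price in rows
)
print(table)
```

Collecting rows before printing is what makes the padding possible, since the column widths are only known once every line has been read.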
Answer 3 (score: 0)
You can use a class-based solution in Python 3, with operator overloading to make the data easier to access:
import re
import itertools
class MealPlan:
    def __init__(self, string, headers):
        self.headers = headers
        # alternate runs of heading lines and dish lines
        self.grouped_data = [d for c, d in [(a, list(b)) for a, b in itertools.groupby(string.split('\n'), key=lambda x:x in ['Starters', 'Mains'])]]
        # pair each heading with the block of dish lines that follows it
        self.final_grouped_data = list(map(lambda x:[x[0][0], x[-1]], [self.grouped_data[i:i+2] for i in range(0, len(self.grouped_data), 2)]))
        # split each dish line into name and price at the space before a digit
        self.final_data = [[[a, *list(filter(None, re.split(r'\s(?=\d)', i)))] for i in b] for a, b in self.final_grouped_data]
        # drop rows that came from empty lines
        self.final_data = [list(filter(lambda x:len(x) > 1, i)) for i in self.final_data]
    def __getattr__(self, column):
        # generator: the KeyError is raised only once the result is iterated
        if column not in self.headers:
            raise KeyError("'{}' not found".format(column))
        transposed = [dict(zip(self.headers, i)) for i in itertools.chain.from_iterable(self.final_data)]
        yield from map(lambda x:x[column], transposed)
    def __getitem__(self, row):
        new_grouped_data = {a:dict(zip(self.headers[1:], zip(*[i[1:] for i in list(b)]))) for a, b in itertools.groupby(list(itertools.chain(*self.final_data)), key=lambda x:x[0])}
        return new_grouped_data[row]
    def __repr__(self):
        return ' '.join(self.headers)+'\n'+'\n'.join('\n'.join(' '.join(c) for c in i) for i in self.final_data)

string='Starters\nSalad with Greens 14.00\nSalad Goat Cheese 12.75\nMains\nPizza 12.75\nPasta 12.75\n'
meal = MealPlan(string, ['Category', 'Dish', 'Price'])
print(meal)
print([i for i in meal.Category])
print(meal['Starters'])
Output:
Category Dish Price
Starters Salad with Greens 14.00
Starters Salad Goat Cheese 12.75
Mains Pizza 12.75
Mains Pasta 12.75
['Starters', 'Starters', 'Mains', 'Mains']
{'Dish': ('Salad with Greens', 'Salad Goat Cheese'), 'Price': ('14.00', '12.75')}