Question

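I have a list like the sample data below. Every entry in the list follows the pattern "source/number_something/". I want to create a new list, like the output below, whose entries are just the "something" part. I thought I could use a for loop and split each string on _, but some of the text that follows also contains _. This seems like something a regular expression could do, but my knowledge of regular expressions is limited. Any tips would be appreciated.

Sample data: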

['source/108_cash_total/',
 'source/108_customer/',
 'source/108_daily_units_total/',
 'source/108_discounts/',
 'source/108_employee/',
 'source/56_cash_total/',
 'source/56_customer/',
 'source/56_daily_units_total/',
 'source/56_discounts/',
 'source/56_employee/']

Output:

['cash_total',
 'customer',
 'daily_units_total',
 'discounts',
 'employee',
 'cash_total',
 'customer',
 'daily_units_total',
 'discounts',
 'employee']

Answer 1

You can use a regular expression:

\d+_([^/]+)

See a demo on regex101.com.

In Python:

import re

lst = ['source/108_cash_total/',
       'source/108_customer/',
       'source/108_daily_units_total/',
       'source/108_discounts/',
       'source/108_employee/',
       'source/56_cash_total/',
       'source/56_customer/',
       'source/56_daily_units_total/',
       'source/56_discounts/',
       'source/56_employee/']

rx = re.compile(r'\d+_([^/]+)')

output = [match.group(1) 
          for item in lst 
          for match in [rx.search(item)] 
          if match]
print(output)

which yields

['cash_total', 'customer', 'daily_units_total', 
 'discounts', 'employee', 'cash_total', 'customer',
 'daily_units_total', 'discounts', 'employee']
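As a possible variation, assuming Python 3.8 or later, an assignment expression can stand in for the single-element-list trick used in the comprehension above. The shortened list and the non-matching entry 'no_number_here/' are made up here purely to show that entries without a match are skipped:

```python
import re

# Hypothetical shortened input; the last entry has no "digits_" part.
lst = ['source/108_cash_total/', 'source/56_customer/', 'no_number_here/']

# Same pattern as above: digits, an underscore, then everything up to the next "/".
rx = re.compile(r'\d+_([^/]+)')

# The walrus operator binds the match object and filters out None in one step.
output = [m.group(1) for item in lst if (m := rx.search(item))]
print(output)
```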

Answer 2

You can do this easily without a regular expression, using just an offset and split() with the maxsplit argument set:

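# Assumes lst is the list from the question.
offset = len("source/")  # length of the fixed prefix
result = []
for item in lst:
    # maxsplit=1 splits only on the first underscore
    _num, data = item[offset:].split("_", 1)
    result.append(data[:-1])  # drop the trailing "/"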

Of course, it isn't very flexible, but as long as your data follows the pattern, that doesn't matter.
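If the prefix length might vary, one way to loosen the fixed offset is sketched below. The helper name strip_number_prefix and the second sample path are hypothetical, and the sketch assumes the part of interest is always the last path component:

```python
def strip_number_prefix(item):
    """Return the text after the first "_" in the last path component."""
    # Drop any trailing "/" and keep only the last path component.
    name = item.rstrip('/').rsplit('/', 1)[-1]
    # partition splits on the first "_" only, so later underscores survive.
    _number, sep, rest = name.partition('_')
    # If there is no "_" at all, fall back to the whole component.
    return rest if sep else name

# Hypothetical input: the second entry uses a different, longer prefix.
lst = ['source/108_daily_units_total/', 'other/dir/56_customer/']
result = [strip_number_prefix(item) for item in lst]
print(result)
```

Unlike the regex approach, this does not check that the part before the underscore is actually a number.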

Answer 3

Probably not as clean as the regular expression approach.

Using a list comprehension and the split function:

lst = ['source/108_cash_total/',
 'source/108_customer/',
 'source/108_daily_units_total/',
 'source/108_discounts/',
 'source/108_employee/',
 'source/56_cash_total/',
 'source/56_customer/',
 'source/56_daily_units_total/',
 'source/56_discounts/',
 'source/56_employee/']

res = [ '_'.join(i.split('_')[1:]).split('/')[:-1][0]  for i in lst]

print(res)

# output ['cash_total', 'customer', 'daily_units_total', 'discounts', 'employee', 'cash_total', 'customer', 'daily_units_total', 'discounts', 'employee']
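The join and slice steps above could probably be collapsed by passing maxsplit to split. This is only a sketch, and it assumes the part before the number (here "source/") never contains an underscore itself:

```python
# Shortened copy of the sample data from the question.
lst = ['source/108_cash_total/',
       'source/108_customer/',
       'source/56_daily_units_total/']

# Split once on the first "_", keep the right-hand side, drop the trailing "/".
res = [i.split('_', 1)[1].rstrip('/') for i in lst]
print(res)
```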
