How do I remove a prefix from every element of a Python list?

Time: 2019-08-28 09:50:38

Tags: python regex list

I have a Python list that contains the following items:

[ 1.1 ] 1. a electronic bill presentment system.
[ 1.2 ] a network.
[ 1.3 ] a plurality of first stations, each associated with a respective one of a plurality of users and operable to transmit first requests for bills of its associated user via the network.
[ 1.4 ] a central network station configured to receive the transmitted first requests for bills and to transmit, responsive to each of the received first requests, bill availability information for the associated user via the network, wherein each of the plurality of first stations is configured to receive the transmitted bill availability information for its associated user and is operable to transmit second requests for bills of its associated user via the network.
[ 1.5 ] a plurality of second network stations, each associated with a respective one of a plurality of billers, configured to receive the transmitted second requests for bills and to transmit, responsive thereto, the requested bills of the associated user via the network.
[ 1.6 ] wherein the bill availability information for the associated user identifies those of the plurality of billers having a bill available for that user without identifying an amount of the bill of each of the identified billers for the associated user.
[ 2.1 ] 2. a method for presenting electronic bills.
[ 2.2 ] storing, at a plurality of different locations, electronic bills of a plurality of different billers for a plurality of different users.
[ 2.3 ] storing identifiers of the stored electronic bills at a location different than the plurality of different locations.
[ 2.4 ] transmitting a first request for the stored electronic bills for a first of the plurality of users.
[ 2.5 ] transmitting one or more of the stored identifiers of the stored electronic bills for the first user responsive to the transmitted first request, each of the transmitted one or more identifiers being associated with a respective one of the stored electronic bills of a different one of the plurality of billers.
[ 2.6 ] transmitting a second request for at least one of the stored electronic bills identified by the transmitted one or more identifiers.
[ 2.7 ] transmitting the at least one identified stored electronic bill responsive to the transmitted second request.
[ 2.8 ] wherein the transmitted one or more identifiers identifies the stored electronic bills without identifying an amount of the identified stored electronic bills.

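What I need to do is remove the prefix from every item, but I can't figure out how.

For example:

I need to remove [ 2.8 ](space), [ 2.7 ](space), and so on. Each line printed above is one item of the list. So from [ 1.1 ] 1. a electronic bill presentment system. I need to remove [ 1.1 ] 1.

The code I am using for the removal is below. My logic is to split each item on spaces and then drop the non-alphabetic values.

But it does not work properly.

Please help.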

TextDictionaryValuesList = list(TextDictionary.values()) 
# You can make a test list using above given items of mylist

def remove_non_alpha(splitlist):
    for j in range(0, len(splitlist)):
        if(splitlist[j].isalpha()):
            splitlist[j] = splitlist[j]
        else:
            splitlist[j] = ""       

    return splitlist


for i in range(0, len(TextDictionaryValuesList)):

    print(TextDictionaryValuesList[i])

    splitlist = TextDictionaryValuesList[i].split(" ")
    splitlist = remove_non_alpha(splitlist)
    TextDictionaryValuesList[i] = splitlist

print(TextDictionaryValuesList)

4 Answers:

Answer 0 (score: 3)

Here is one way to do it using re.sub:

import re
l = ['[ 1.1 ] 1. a electronic bill presentment system.','[ 1.2 ] a network.']

[re.sub(r'\[\s*\d+\.*\d*\s*\]\s+(?:\d+\.\s*)?', '', s) for s in l]
# ['a electronic bill presentment system.', 'a network.']

See the demo.

Testing with a larger list of strings:

l = ['[ 1.1 ] 1. a electronic bill presentment system.',\
'[ 1.2 ] a network.',\
'[ 1.3 ] a plurality of first stations, each associated with a respective one of a plurality of users and operable to transmit first requests for bills of its associated user via the network.',\
'[ 1.5 ] a plurality of second network stations, each associated with a respective one of a plurality of billers, configured to receive the transmitted second requests for bills and to transmit, responsive thereto, the requested bills of the associated user via the network.',\
'[ 1.6 ] wherein the bill availability information for the associated user identifies those of the plurality of billers having a bill available for that user without identifying an amount of the bill of each of the identified billers for the associated user.',\
'[ 2.1 ] 2. a method for presenting electronic bills.']

[re.sub(r'\[\s*\d+\.*\d*\s*\]\s+(?:\d+\.\s*)?', '', s) for s in l]

['a electronic bill presentment system.',
 'a network.',
 'a plurality of first stations, each associated with a respective one of a plurality of users and operable to transmit first requests for bills of its associated user via the network.',
 'a plurality of second network stations, each associated with a respective one of a plurality of billers, configured to receive the transmitted second requests for bills and to transmit, responsive thereto, the requested bills of the associated user via the network.',
 'wherein the bill availability information for the associated user identifies those of the plurality of billers having a bill available for that user without identifying an amount of the bill of each of the identified billers for the associated user.',
 'a method for presenting electronic bills.']
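The one-line pattern is dense, so here is a sketch of the same expression spelled out with re.VERBOSE, one piece per line. The names PREFIX and strip_prefix are made up for illustration, and the leading ^ anchor and count=1 are additions that assume the prefix always sits at the very start of an item.

```python
import re

# Same pattern as in the answer, split into commented pieces.
# The ^ anchor is an added assumption: the prefix starts the string.
PREFIX = re.compile(r"""
    ^\[\s*\d+\.*\d*\s*\]   # bracketed label, e.g. [ 1.1 ]
    \s+                    # whitespace after the closing bracket
    (?:\d+\.\s*)?          # optional claim number, e.g. 1. followed by spaces
""", re.VERBOSE)

def strip_prefix(item):
    # Remove at most one prefix; items without a prefix come back unchanged.
    return PREFIX.sub("", item, count=1)

items = ['[ 1.1 ] 1. a electronic bill presentment system.',
         '[ 1.2 ] a network.']
cleaned = [strip_prefix(s) for s in items]
print(cleaned)
```

Compiling the pattern once also avoids re-parsing it for every item when the list is long.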

Answer 1 (score: 2)

You should use a regular expression to replace the pattern with an empty string:

>>> re.sub(r'\[\s?\d\.\d\s?]\s?(\d(\.\s)?)?', '', '[ 1.1 ] 1. a electronic bill presentment system.')
'a electronic bill presentment system.'
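Note that \d matches exactly one digit, so the pattern above would miss a label with multi-digit numbers. A possible variant, assuming such labels can occur (the item '[ 10.12 ] 10. a device.' below is hypothetical and not part of the question's data), uses \d+ instead. The names PREFIX and strip_prefix are illustrative, and this sketch also requires the dot after the claim number, which is a slightly stricter choice than the original.

```python
import re

# Variant of the pattern with \d+ so that multi-digit numbers match.
# Anchored at the start of the string (an assumption about the data).
PREFIX = re.compile(r'^\[\s?\d+\.\d+\s?]\s?(?:\d+\.\s)?')

def strip_prefix(text):
    # Replace the leading label (and optional claim number) with nothing.
    return PREFIX.sub('', text)

print(strip_prefix('[ 1.1 ] 1. a electronic bill presentment system.'))
print(strip_prefix('[ 10.12 ] 10. a device.'))  # hypothetical item
```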

Answer 2 (score: 0)

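import re

data = ["[ 1.1 ] 1. a electronic bill presentment system.", "[ 1.2 ] a network."]

# Keep everything from the first ASCII letter onward.
# The character class is [a-zA-Z]; a comma inside it would also match a literal comma.
result = [re.search(r'[a-zA-Z].*', i).group(0) for i in data]
print(result)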

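Answer 3 (score: 0)

If you want to avoid regular expressions, you can keep it simple, as long as there are no fancy prefixes such as [1.2.b] or anything else containing letters.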

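def chop_non_alpha(txt):
    # Return the text starting at its first alphabetic character.
    for i in range(len(txt)):
        if txt[i].isalpha():
            return txt[i:]
    return ""  # no letters found at all

# Sample items taken from the question
lines = ['[ 1.1 ] 1. a electronic bill presentment system.', '[ 1.2 ] a network.']
for line in lines:
    print(chop_non_alpha(line))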
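If a label might contain letters, such as [1.2.b], scanning for the first letter would cut in the wrong place. One regex-free alternative is to split at the closing bracket with str.partition. This is only a sketch: strip_bracket_prefix is a made-up name, and it assumes every prefix is a single bracketed label optionally followed by a number and a dot.

```python
def strip_bracket_prefix(txt):
    # Split once at the first "]"; without a leading "[...]" leave the text alone.
    head, sep, tail = txt.partition("]")
    if not sep or not head.lstrip().startswith("["):
        return txt
    tail = tail.lstrip()
    # Drop an optional claim number such as "1." right after the label.
    first, _, rest = tail.partition(" ")
    if first.endswith(".") and first[:-1].isdigit():
        return rest.lstrip()
    return tail

items = ['[ 1.1 ] 1. a electronic bill presentment system.',
         '[ 1.2 ] a network.']
for item in items:
    print(strip_bracket_prefix(item))
```

Because the split happens at the bracket rather than at the first letter, the contents of the label no longer matter.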