Python: Removing part of a string from a list of strings

Time: 2016-10-10 21:51:28

Tags: python xlrd

I am using xlrd to extract columns from an Excel worksheet to build lists.

import xlrd

# Open the workbook and take its first worksheet.
sheet = xlrd.open_workbook("HEENT.xlsx").sheet_by_index(0)
med_name = []
for row in sheet.col(2):
    med_name.append(row)
med_school = []
for row in sheet.col(3):
    med_school.append(row)
print(med_school)

Here is a snippet of the list med_school:

[text:'University of San Francisco', 
text:'Harvard University', 
text:'Class of 2016, University of Maryland School of Medicine', 
text:'Class of 2015, Johns Hopkins University School of Medicine', 
text:'Class of 2014, Raymond and Ruth Perelman School of Medicine at the University of Pennsylvania']

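I want to remove "text:'Class of 2014, " from every string in the list. I tried a list comprehension, but I got an AttributeError: 'Cell' object has no attribute 'strip'. Does anyone know how to build a list of medical school names that contains only the school name, without the class year and the word "text"?

3 answers:

Answer 0: (score: 4)

xlrd does not return strings; it returns instances of a class named Cell. Each instance has a value attribute, which holds the string you see.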
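To make this concrete without needing a workbook on hand, here is a minimal sketch that uses a stand-in class. FakeCell is hypothetical and only mimics the one attribute discussed here (value); it is not xlrd's actual Cell class.

```python
class FakeCell:
    """Hypothetical stand-in for an xlrd cell: it only carries a `value` attribute."""

    def __init__(self, value):
        self.value = value


cell = FakeCell("  Class of 2016, University of Maryland School of Medicine  ")

# Calling a string method on the cell object itself fails,
# which is the kind of AttributeError reported in the question.
try:
    cell.strip()
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

# The string lives in the `value` attribute, so string methods work there.
stripped = cell.value.strip()

print(error_message)
print(stripped)
```

The same pattern should apply to the cells returned by sheet.col(...): call string methods on the value attribute, not on the cell.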

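To simply modify them in place:

for cell in med_school:
    # Keep everything after the first 15 characters.
    cell.value = cell.value[15:]

This removes the first 15 characters ("Class of 2014, "). Alternatively, you can use other approaches, such as splitting the string on ", " or using a regular expression.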
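As a rough sketch of the string-split alternative: the helper name drop_class_prefix and the sample strings below are made up for illustration, and the strings stand in for the value attribute of each cell.

```python
def drop_class_prefix(text):
    """Return the part after the first ", ", or the text unchanged if there is none."""
    # split(", ", 1) splits at most once; [-1] is the tail after the delimiter,
    # or the whole string when the delimiter is absent.
    return text.split(", ", 1)[-1]


samples = [
    "Harvard University",
    "Class of 2016, University of Maryland School of Medicine",
    "Class of 2015, Johns Hopkins University School of Medicine",
]

names = [drop_class_prefix(s) for s in samples]
print(names)
```

One caveat with this approach: a school name that itself contains ", " but has no class prefix would lose everything before its first comma, so it is only safe when the prefix is the only place that delimiter can appear.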

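The key point is that you should not work on the items of the med_school list directly, but on their .value attribute. Alternatively, extract the values into a separate list and process them there.

For example, to get all the text values with the prefix removed:

values = [cell.value[15:] for cell in med_school]

Or use a regular expression substitution, so that only the values that actually contain the offending prefix are changed:

import re
values = [re.sub(r"^Class of \d{4}, ", "", cell.value) for cell in med_school]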
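The difference between the two approaches shows up on entries that have no class prefix. The following sketch uses plain strings in place of cell.value, and the names CLASS_PREFIX, sliced, and cleaned are chosen here for illustration only.

```python
import re

# Matches "Class of " followed by a four-digit year, a comma, and a space,
# but only at the start of the string.
CLASS_PREFIX = re.compile(r"^Class of \d{4}, ")

values = [
    "Harvard University",
    "Class of 2015, Johns Hopkins University School of Medicine",
]

# Fixed-width slicing cuts 15 characters from every entry, prefix or not.
sliced = [v[15:] for v in values]

# The substitution only touches entries that actually start with the prefix.
cleaned = [CLASS_PREFIX.sub("", v) for v in values]

print(sliced)
print(cleaned)
```

Slicing mangles the entry without a prefix, while the substitution leaves it intact, which is why the regular expression is the safer choice for mixed data like the list in the question.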

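Answer 1: (score: 1)

Cut the head off each string at the given delimiter. The code first checks that the delimiter (comma plus space) is present, so we know there is something to cut.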

med_school = ["text:'Class of 2016, University of Maryland School of Medicine'",  
              "text:'Class of 2015, Johns Hopkins University School of Medicine'", 
              "text:'Class of 2014, Raymond and Ruth Perelman School of Medicine at the University of Pennsylvania'",
              "text:'Class of 1989, Rush Medical School / Knox College'",
              "text:'Bernie\'s Back-Alley School of Black-Market Techniques'"
             ]

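school_name = []
for first in med_school:
    # The sample items above are plain strings, so use them directly.
    name = first
    if ", " in name:
        # Drop everything up to and including the first ", ".
        cut  = name.index(", ")
        name = name[cut+2:]
    else:
        # No class prefix: strip the leading "text:'" and the closing quote.
        name = name[6:-1]
    school_name.append(name)

print(school_name)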

Output (with extra line breaks for readability):

["University of Maryland School of Medicine'",
 "Johns Hopkins University School of Medicine'",
 "Raymond and Ruth Perelman School of Medicine at the University of Pennsylvania'",
 "Rush Medical School / Knox College'", 
 "Bernie's Back-Alley School of Black-Market Techniques"]

You can also wrap the loop into a list comprehension:

school_name = [name[name.index(", ")+2:]
                       if ", " in name
                       else name[6:-1]
                   for name in med_school]
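Note that the cut-at-delimiter approach leaves the closing quote on entries that had a class prefix. As an alternative sketch (not the method from this answer), a single regular expression can strip the "text:'" wrapper and the optional class prefix together. The names PATTERN and school_name_from_repr are hypothetical, and the input is assumed to be strings shaped like the samples above.

```python
import re

# "text:" wrapper with quotes, an optional "Class of YYYY, " prefix, then the name.
PATTERN = re.compile(r"^text:\s*'(?:Class of \d{4}, )?(?P<name>.*)'$")


def school_name_from_repr(text):
    """Return the bare school name, or the text unchanged if it does not match."""
    match = PATTERN.match(text)
    return match.group("name") if match else text


samples = [
    "text:'Class of 2016, University of Maryland School of Medicine'",
    "text:'Class of 1989, Rush Medical School / Knox College'",
    "text:'Bernie's Back-Alley School of Black-Market Techniques'",
]

clean_names = [school_name_from_repr(s) for s in samples]
print(clean_names)
```

Because the name group is greedy and the pattern is anchored at the final quote, apostrophes inside a name are kept.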

Answer 2: (score: 1)

Change the loop for row in sheet.col(2) so that it reads each cell's value; that gets rid of the type prefix and gives you the actual values. Do this:

results = []
for row in sheet.col(2):
    results.append(row.value)