Question

I am new to Python and am trying to filter a string that looks like this:

"{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"

and so on, for 100 such sets of three words.

I want to extract the second word from each set delimited by "{}", so for this example I need the output:

"Plant,Animal,Plant"

and so on.

How can I do this efficiently?

Right now I am calling string.split(",")[1] on each "{}" group separately.

Thanks.

Answer 1

This does the trick:

str_ = "{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"
res = [x.split(',')[1] for x in str_[1:-1].split('}{')]

which produces

['Plant', 'Animal', 'Plant']

With str_[1:-1] we remove the leading "{" and the trailing "}", then split what remains on every occurrence of "}{", which yields:

["Red,Plant,Eel", "Blue,Animal,Maple", ...]

Finally, we split each of those strings on "," to get

[["Red", "Plant", "Eel"], ...]

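From these, we keep only the second element of each sub-list, which is index [1] because Python indexes from 0.

Note that for your specific purpose, slicing the original string with str_[1:-1] is not mandatory (it works without it too), because the outer braces only end up attached to the first word of the first group and the last word of the last group. It would make a difference if you wanted the first item instead of the second, and the same holds if you wanted the third.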
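To make the point about slicing concrete, here is a small sketch that compares the two variants. The variable names (groups, unsliced, and so on) are illustrative only and are not part of the answer above.

```python
str_ = "{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"

# Strip the outer braces, then split on the "}{" boundaries between groups.
groups = str_[1:-1].split('}{')
print(groups)  # ['Red,Plant,Eel', 'Blue,Animal,Maple', 'Yellow,Plant,Crab']

# Without the slice, the outer braces stay attached to the first and last group.
unsliced = str_.split('}{')
print(unsliced)  # ['{Red,Plant,Eel', 'Blue,Animal,Maple', 'Yellow,Plant,Crab}']

# The second field is the same either way...
second_sliced = [g.split(',')[1] for g in groups]
second_unsliced = [g.split(',')[1] for g in unsliced]
print(second_sliced == second_unsliced)  # True

# ...but the first and third fields pick up a stray brace.
first_unsliced = [g.split(',')[0] for g in unsliced]
third_unsliced = [g.split(',')[2] for g in unsliced]
print(first_unsliced)  # ['{Red', 'Blue', 'Yellow']
print(third_unsliced)  # ['Eel', 'Maple', 'Crab}']
```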

If you want to join the output strings to match the desired result, just pass the resulting list to .join, like this:

out = ','.join(res)

which then gives you

"Plant,Animal,Plant"

Answer 2

Try this:

[i.split(',')[1] for i in str_[1:].split('}')[:len(str_.split('}'))-1]]
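The one-liner above is dense, so here is a step-by-step sketch of what it appears to do. The intermediate names pieces, kept, and result are introduced only for illustration.

```python
str_ = "{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"

# Drop the leading "{" and split on "}"; the final "}" leaves an empty last piece.
pieces = str_[1:].split('}')
print(pieces)  # ['Red,Plant,Eel', '{Blue,Animal,Maple', '{Yellow,Plant,Crab', '']

# len(str_.split('}')) - 1 is the number of groups, so this slice drops the empty piece.
kept = pieces[:len(str_.split('}')) - 1]

# The leftover "{" sits on the first field, so the second field is still clean.
result = [p.split(',')[1] for p in kept]
print(result)  # ['Plant', 'Animal', 'Plant']
```

For this input the slice is equivalent to pieces[:-1], which may be easier to read.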

Answer 3

Another solution is to use a regular expression. It is slightly more complex, but it is a technique worth discussing:

import re
input_string = "{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"
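# Raw string so the backslashes reach the regex engine unchanged;
# the parentheses capture the second word of each {a,b,c} group.
regex_string = r"\{\w+,(\w+),\w+\}"

result_list = re.findall(regex_string, input_string)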

The value of result_list is then:

['Plant', 'Animal', 'Plant']

Here is a link on regular expressions in Python, and an online regex editor.
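The pattern above only matches groups of exactly three \w+ words. As a hedged variation, assuming the real data might contain spaces, hyphens, or extra fields (none of which appear in the question), a looser pattern could capture whatever sits between the first and second comma. The names strict, loose, and messy are hypothetical.

```python
import re

input_string = "{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"

# Strict pattern from the answer: exactly three word-character fields.
strict = re.compile(r"\{\w+,(\w+),\w+\}")

# Looser sketch: after "{", skip the first field, then capture the second
# field as any run of characters that are not a comma or a brace.
loose = re.compile(r"\{[^,{}]*,([^,{}]*)")

print(strict.findall(input_string))  # ['Plant', 'Animal', 'Plant']
print(loose.findall(input_string))   # ['Plant', 'Animal', 'Plant']

# Hypothetical input with a space, a hyphen, and a fourth field.
messy = "{Dark Red,House-Plant,Eel,Extra}"
print(strict.findall(messy))  # []
print(loose.findall(messy))   # ['House-Plant']
```

Which pattern is preferable depends on whether malformed groups should be skipped (strict) or tolerated (loose).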

Answer 4

#!/bin/python3

string = "{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"
a = string.replace('{','').replace('}',',').split(',')[1::3]
print(a)

The result is ['Plant', 'Animal', 'Plant']
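The answer gives no explanation, so here is a sketch of why the [1::3] slice works and where it stops working. It relies on every group having exactly three fields; the uneven input below is a made-up example, not data from the question.

```python
string = "{Red,Plant,Eel}{Blue,Animal,Maple}{Yellow,Plant,Crab}"

# Removing "{" and turning "}" into "," flattens everything into one list.
flat = string.replace('{', '').replace('}', ',').split(',')
print(flat)
# ['Red', 'Plant', 'Eel', 'Blue', 'Animal', 'Maple', 'Yellow', 'Plant', 'Crab', '']

# Start at index 1 and take every third item: indices 1, 4, 7.
print(flat[1::3])  # ['Plant', 'Animal', 'Plant']

# Hypothetical input where the first group has only two fields:
uneven = "{Red,Plant}{Blue,Animal,Maple}"
flat_uneven = uneven.replace('{', '').replace('}', ',').split(',')
print(flat_uneven[1::3])  # ['Plant', 'Maple'], no longer the second words
```

If group sizes can vary, splitting per group as in Answer 1 is the safer choice.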
