I have a list:
my_list = ['A70-11370; reprint; rolled; 2000; 26.5 x 38.5',
'A70-713; reprint; rolled; 1980; 26.5 x 38.5',
'b70-7814; reprint; Style A; rolled; 1939; 22.5 x 34.5',
'A70-7600; reprint; rolled; 1986; 26.5 x 38.5',
'A70-6912; reprint; style C; rolled; 1977; 26.5 x 38.5',
'A70-8692; reprint; regular; rolled; 1995; 26.5 x 38.5',
'A70-2978; reprint; rolled; 1991; 26.5 x 38.5',
'A70-4902; reprint; Style A; rolled; 1999; 26.5 x 38.5',
'A70-6300; reprint; regular; rolled; 1983; 26.5 x 38.5',
'MPW-6725; reprint; rolled; 1966; 26.5 x 38']
I want to extract the strings that contain 'x' (for example, 26.5 x 38.5). I tried:
string = [i if 'x' in i else np.nan for i in str(my_string).split(';')]
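to put nan wherever the condition is not met, but that does not give me what I want. Is there a way to get the strings I want, both with and without the nan placeholders?
Answer 0: (score: 3)
You need a nested list comprehension to reach each substring within the list.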
[x for s in my_list for x in s.split('; ') if 'x' in x]
Result:
['26.5 x 38.5', '26.5 x 38.5', '22.5 x 34.5', '26.5 x 38.5', '26.5 x 38.5', '26.5 x 38.5', '26.5 x 38.5', '26.5 x 38.5', '26.5 x 38.5', '26.5 x 38']
Using re is better suited to this, though, since checking only if 'x' in x may return unwanted results:
import re
# Raw string; the decimal part is optional so that '26.5 x 38' also matches.
p = re.compile(r"\d+(?:\.\d+)? x \d+(?:\.\d+)?")
[m.group(0) for m in map(p.search, my_list) if m]
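To illustrate the point about unwanted results, here is a minimal sketch comparing the two approaches. The record below is hypothetical (it is not in the question's data) and was made up so that one non-size field happens to contain the letter x. The pattern also accepts whole-number sizes, which is an assumption on my part.

```python
import re

# Hypothetical record: the 'deluxe' field contains an 'x' but is not a size.
sample = ['A70-0001; reprint; deluxe; rolled; 1990; 26.5 x 38.5']

# Plain substring test: picks up every field that contains the letter 'x'.
by_substring = [x for s in sample for x in s.split('; ') if 'x' in x]

# Regex test: only "<number> x <number>" is accepted.
size = re.compile(r"\d+(?:\.\d+)? x \d+(?:\.\d+)?")
by_regex = [m.group(0) for m in map(size.search, sample) if m]

print(by_substring)  # ['deluxe', '26.5 x 38.5']
print(by_regex)      # ['26.5 x 38.5']
```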
Answer 1: (score: 2)
outputs = [subitem for item in my_list for subitem in item.split(';') if 'x' in subitem]
print(outputs)
Output:
[' 26.5 x 38.5', ' 26.5 x 38.5', ' 22.5 x 34.5', ' 26.5 x 38.5', ' 26.5 x 38.5', ' 26.5 x 38.5', ' 26.5 x 38.5', ' 26.5 x 38.5', ' 26.5 x 38.5', ' 26.5 x 38']
Answer 2: (score: 1)
A list comprehension can get ugly here; I would suggest two separate for loops instead, for readability.
my_list = ['A70-11370; reprint; rolled; 2000; 26.5 x 38.5',
'A70-713; reprint; rolled; 1980; 26.5 x 38.5',
'b70-7814; reprint; Style A; rolled; 1939; 22.5 x 34.5',
'A70-7600; reprint; rolled; 1986; 26.5 x 38.5',
'A70-6912; reprint; style C; rolled; 1977; 26.5 x 38.5',
'A70-8692; reprint; regular; rolled; 1995; 26.5 x 38.5',
'A70-2978; reprint; rolled; 1991; 26.5 x 38.5',
'A70-4902; reprint; Style A; rolled; 1999; 26.5 x 38.5',
'A70-6300; reprint; regular; rolled; 1983; 26.5 x 38.5',
'MPW-6725; reprint; rolled; 1966; 26.5 x 38']
multiplications = []
for item in my_list:
    for subitem in item.split(';'):
        if 'x' in subitem:
            multiplications.append(subitem.strip())
print('\n'.join(multiplications))
This outputs:
26.5 x 38.5
26.5 x 38.5
22.5 x 34.5
26.5 x 38.5
26.5 x 38.5
26.5 x 38.5
26.5 x 38.5
26.5 x 38.5
26.5 x 38.5
26.5 x 38
Answer 3: (score: 1)
Like this:
string = [i for my_string in my_list for i in str(my_string).split(';') if 'x' in i ]
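For the placeholder variant asked about in the question, one possible sketch is to keep one entry per input string and fall back to nan when no part contains 'x'. It uses math.nan rather than np.nan to avoid the NumPy dependency, and the shortened list with a second, size-less record is made up for illustration. The helper name first_size is hypothetical.

```python
import math

# Shortened, partly hypothetical data: the second record has no size field.
records = ['A70-713; reprint; rolled; 1980; 26.5 x 38.5',
           'A70-0002; reprint; rolled; 1981']

def first_size(record):
    """Return the first ';'-separated part containing 'x', or nan if none."""
    parts = (part.strip() for part in record.split(';'))
    return next((part for part in parts if 'x' in part), math.nan)

# One entry per record, with nan where nothing matched.
with_placeholders = [first_size(r) for r in records]
# Same data with the nan placeholders dropped.
without_placeholders = [s for s in with_placeholders if isinstance(s, str)]

print(with_placeholders)     # ['26.5 x 38.5', nan]
print(without_placeholders)  # ['26.5 x 38.5']
```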
Answer 4: (score: 0)
Yes, if you only want to extract the strings that contain 'x', you can do:
sep = ';'.join(my_list).split(';')  # join with ';' so neighbouring items are not fused
with_x = filter(lambda str_: 'x' in str_, sep)
for i in with_x:
    print(i)
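Answer 5: (score: 0)
Here is a regex-based solution. It is more robust than the other solutions given, because it works even when the desired string is not preceded by a ;.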
import re
# Decimal parts are optional so that sizes such as '26.5 x 38' also match.
reg = re.compile(r'\b\d+(?:\.\d+)? x \d+(?:\.\d+)?\b')
new_list = []
for elem in my_list:
    result = reg.search(elem)
    if result:
        new_list.append(result.group(0))
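As a quick check of the robustness claim, the sketch below runs a similar pattern over text that has no ; separators at all. The sample strings are invented, and the pattern is a variant that also accepts whole numbers, so treat it as an assumption rather than a definitive expression.

```python
import re

# Variant pattern (an assumption): the decimal part of each number is optional.
size = re.compile(r'\b\d+(?:\.\d+)? x \d+(?:\.\d+)?\b')

# Invented strings with no ';' delimiters.
texts = ['rolled poster 26.5 x 38.5 from 1980',
         'two sizes: 22.5 x 34.5 and 26.5 x 38',
         'no size here']

# findall returns every match per string, so a string may yield zero or more sizes.
found = [size.findall(t) for t in texts]
print(found)
# [['26.5 x 38.5'], ['22.5 x 34.5', '26.5 x 38'], []]
```

Using findall instead of search is a design choice here: it keeps all sizes when a single string mentions more than one.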