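How do I write a regular expression that matches only the name part of each string, without the .csv extension? This is the output I need: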
Required Output:
['ap_2010', 'class_size', 'demographics', 'graduation','hs_directory', 'sat_results']
Input:
data_files = [
"ap_2010.csv",
"class_size.csv",
"demographics.csv",
"graduation.csv",
"hs_directory.csv",
"sat_results.csv"]
I tried the following, but it returned an empty list.
for i in data_files:
    regex = re.findall(r'/w+/_[/d{4}][/w*]?', i)
Answer 0 (score: 4)
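If you really want to use a regular expression, you can use re.sub to remove the extension if it is present and leave the string unchanged if it is not: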
[re.sub(r'\.csv$', '', i) for i in data_files]
['ap_2010',
'class_size',
'demographics',
'graduation',
'hs_directory',
'sat_results']
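As for why the attempt in the question returned nothing: its pattern uses forward slashes (`/w`, `/d`) where regex escapes need backslashes (`\w`, `\d`), so it searches for literal `/` characters that never occur in the file names. A minimal sketch comparing the two, assuming `import re` (which the snippets above leave out); the variable names and the extra input `'notes.txt'` are illustrative only:

```python
import re

data_files = ["ap_2010.csv", "class_size.csv", "demographics.csv",
              "graduation.csv", "hs_directory.csv", "sat_results.csv"]

# The pattern from the question, unchanged: "/w" and "/d" match literal slashes,
# so findall finds nothing in any of the names.
broken = [re.findall(r'/w+/_[/d{4}][/w*]?', f) for f in data_files]

# re.sub with a pattern anchored by "$" removes only a trailing ".csv".
names = [re.sub(r'\.csv$', '', f) for f in data_files]

# A string that does not end in ".csv" comes back unchanged.
unchanged = re.sub(r'\.csv$', '', 'notes.txt')

print(broken)
print(names)
print(unchanged)
```

Escaping the dot (`\.`) matters here: an unescaped `.` would match any character, so a name such as `'xcsv'` would also be trimmed.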
In general, though, it is better to use the os module for anything to do with file names:
[os.path.splitext(i)[0] for i in data_files]
['ap_2010',
'class_size',
'demographics',
'graduation',
'hs_directory',
'sat_results']
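A few edge cases may be worth knowing before relying on this. The sketch below assumes `import os` (not shown above) and uses made-up file names; the `pathlib` line is an alternative that is not part of the original answer:

```python
import os
from pathlib import Path

# splitext splits at the last dot only, so inner dots stay in the root.
root, ext = os.path.splitext("archive.tar.gz")

# Unlike the ".csv"-specific regex, splitext drops any extension,
# and leaves names without a dot untouched.
stems = [os.path.splitext(f)[0] for f in ["ap_2010.csv", "notes.txt", "README"]]

# pathlib exposes the same idea as the .stem attribute; it also drops
# the directory part of the path.
stem = Path("data/ap_2010.csv").stem

print(root, ext)
print(stems)
print(stem)
```

If only `.csv` should be removed and other extensions kept, check the second element of the `splitext` result before discarding it.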
Answer 1 (score: 2)
Answer 2 (score: 1)
Split the string on '.' and take the last element of the result (index [-1]). If that element is 'csv', the file is a CSV file.
for i in data_files:
    if i.split('.')[-1].lower() == 'csv':
        # It is a CSV file
        pass
    else:
        # Not a CSV
        pass
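The check above only classifies each name. To get the names the question asks for, the same idea can be turned into a small helper. This is a sketch, not the answerer's code: the function name `strip_csv` is hypothetical, and it uses `str.rpartition` so that only the final dot is considered:

```python
def strip_csv(name):
    """Return name without a final '.csv' (case-insensitive); otherwise unchanged."""
    # rpartition splits at the LAST dot; if there is no dot, `dot` is empty.
    base, dot, ext = name.rpartition('.')
    if dot and ext.lower() == 'csv':
        return base
    return name


data_files = ["ap_2010.csv", "class_size.csv", "demographics.csv",
              "graduation.csv", "hs_directory.csv", "sat_results.csv"]
names = [strip_csv(f) for f in data_files]
print(names)
```

Names without a dot, or with a different extension, pass through unchanged, which matches the "leave it alone if it is not a CSV" behaviour of the `re.sub` answer.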
Answer 3 (score: 1)
# Input
data_files = [ 'ap_2010.csv', 'class_size.csv', 'demographics.csv', 'graduation.csv', 'hs_directory.csv', 'sat_results.csv' ]
import re
pattern = r'(?P<filename>[a-z0-9A-Z_]+)\.csv'
prog = re.compile(pattern)
# `map` function yields:
# - a `list` in Python 2.x
# - a lazy `map` iterator in Python 3.x (wrap it in `list()` to see the values)
result = map(lambda data_file: prog.search(data_file).group('filename'), data_files)
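One caveat with this approach: `re.search` returns `None` for a name that does not match, and calling `.group` on `None` raises `AttributeError`. A more defensive variant might look like the sketch below; the function name `csv_names` and the mixed input list are assumptions for illustration, and `fullmatch` is used so the whole name has to fit the pattern:

```python
import re

# Same named group as above, written as a raw string.
prog = re.compile(r'(?P<filename>[A-Za-z0-9_]+)\.csv')


def csv_names(files):
    """Collect the 'filename' group for every name that fully matches."""
    names = []
    for f in files:
        m = prog.fullmatch(f)
        if m is not None:  # skip names that are not <word>.csv
            names.append(m.group('filename'))
    return names


print(csv_names(["ap_2010.csv", "notes.txt", "sat_results.csv"]))
```

Whether non-matching names should be skipped, kept as-is, or reported as errors depends on the data; skipping is just one choice.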
Answer 4 (score: 1)
l = [
"ap_2010.csv",
"class_size.csv",
"demographics.csv",
"graduation.csv",
"hs_directory.csv",
"sat_results.csv"]
# rsplit cuts at the last dot only; rstrip would remove a set of characters, not a suffix
print([i.rsplit('.', 1)[0] for i in l])
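A word of caution about `str.rstrip` for this task: its argument is a set of characters to strip, not a suffix, so it keeps removing any trailing `.`, `c`, `s` or `v`. The short sketch below shows the difference on one of the names from the question; the variable names are illustrative:

```python
name = "demographics.csv"

# rstrip(".csv") strips trailing characters from the set {'.', 'c', 's', 'v'},
# so it keeps going past the extension and removes the final "cs" as well.
stripped = name.rstrip(".csv")

# Cutting at the last dot removes exactly one extension.
split_once = name.rsplit(".", 1)[0]

# An explicit suffix check removes ".csv" and nothing else.
suffix = ".csv"
sliced = name[:-len(suffix)] if name.endswith(suffix) else name

print(stripped)
print(split_once)
print(sliced)
```

On Python 3.9 and later, `name.removesuffix(".csv")` does the same as the last variant in a single call.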