Extracting a list of words from file names

Time: 2019-06-22 16:41:15

Tags: python regex list filenames

I need to get a list of the words that the file names contain. Here is the file:

private fun createInsecureTrustManager(): X509TrustManager = object : X509TrustManager {
  override fun checkClientTrusted(chain: Array<X509Certificate>, authType: String) {}

  override fun checkServerTrusted(chain: Array<X509Certificate>, authType: String) {
    val s = "Do you want to accept {${chain.first().subjectX500Principal.name}}?"
    val response = JOptionPane.showConfirmDialog(null, s, "Confirm",
        JOptionPane.YES_NO_OPTION, JOptionPane.QUESTION_MESSAGE)
    if (response == JOptionPane.NO_OPTION) {
      throw CertificateException("Not accepting ${chain.first().subjectX500Principal.name}}")
    }
  }

  override fun getAcceptedIssuers(): Array<X509Certificate> = arrayOf()
}

I need the part that sits after task- and before the next _ (task-<>_), so my list should look like this:

sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii
sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii

How can I do this in Python 3?

4 answers:

Answer 0: (score: 2)

Here is a Python solution using a regex.

>>> import re
>>> test_str = 'sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii'
>>> re.search('task-(.*?)_', test_str).group(1)
'FmriPictures'

I think you can do the same for each string.
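Applying the same search to every file name could look roughly like the sketch below. The names `filenames`, `TASK_RE` and `task_name` are made up for illustration, and the list is simply the four file names from the question; a name without a `task-..._` part yields `None` here, which is one possible choice rather than a requirement.

```python
import re

# The four file names from the question.
filenames = [
    "sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
]

# Compile once; the lazy group captures everything between "task-" and the next "_".
TASK_RE = re.compile(r"task-(.*?)_")


def task_name(filename):
    """Return the text after 'task-' up to the next '_', or None if absent."""
    match = TASK_RE.search(filename)
    return match.group(1) if match else None


tasks = [task_name(name) for name in filenames]
print(tasks)
```

Returning `None` instead of calling `.group(1)` directly avoids an `AttributeError` when a file name does not follow the pattern.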

Answer 1: (score: 0)

l=["sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
"sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii"]

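k = []
for i in l:
    # Splitting on '-' leaves 'FmriPictures_space' at index 2;
    # removing '_space' keeps only the task name.
    k.append(i.split('-')[2].replace("_space", ""))
print(k)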

That's just one way to do it.

Answer 2: (score: 0)

You can loop over the list and use a regex to pull the name out of each string, as in the following example:

import re

a = ['sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
 'sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
 'sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii',
 'sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii']

out = []
for elm in a:
    condition = re.search(r'_task-(.*?)_', elm)
    if condition:
        out.append(condition.group(1))

print(out)

Output:

['FmriPictures', 'FmriVernike', 'FmriWgWords', 'RestingState']

Answer 3: (score: -1)

I would simply replace

sub-Dzh_task-

_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii

with nothing (an empty string). Just blank those parts out of each line and you are left with the names.
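As a rough sketch of this replace-based idea in Python: the constants `PREFIX` and `SUFFIX` and the helper `strip_fixed_parts` are hypothetical names, and the approach assumes every file name shares exactly the same leading and trailing text, which holds for the four names in the question but may not hold in general.

```python
# Fixed leading and trailing text shared by all file names in the question.
PREFIX = "sub-Dzh_task-"
SUFFIX = "_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii"

filenames = [
    "sub-Dzh_task-FmriPictures_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriVernike_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-FmriWgWords_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
    "sub-Dzh_task-RestingState_space-MNI152NLin2009cAsym_desc-preproc_bold_mask-Language_sub01_component_ica_s1_.nii",
]


def strip_fixed_parts(filename):
    """Replace the known prefix and suffix with empty strings."""
    return filename.replace(PREFIX, "").replace(SUFFIX, "")


names = [strip_fixed_parts(name) for name in filenames]
print(names)
```

If the subject ID or any other field differs between files, the fixed strings no longer match and the name comes back unchanged, so the regex-based answers are the more robust choice in that case.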