Question

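I'm trying to use Python to rename a set of files in a directory. The files are currently labelled with a pool number, an AR number and an S number (e.g. Pool1_AR001_S13__fw_paired.fastq.gz). Each file refers to a specific plant sequence name. I would like to rename these files by removing the 'Pool_AR_S' part and replacing it with the sequence name, e.g. 'Lbienne_dor5_GS1', while leaving the suffix (e.g. fw_paired.fastq.gz, rv_unpaired.fastq.gz) in place. I'm trying to read the file into a dictionary, but I'm stuck on what to do next. I have a .txt file that contains the necessary information in the following format: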

Pool1_AR010_S17 - Lbienne_lla10_GS2
Pool1_AR011_S18 - Lbienne_lla10_GS3
Pool1_AR020_S19 - Lcampanulatum_borau4_T_GS1

The code I have so far is:

from optparse import OptionParser
import csv
import os

parser = OptionParser()
parser.add_option("-w", "--wanted", dest="w")
parser.add_option("-t","--trimmed", dest="t")
parser.add_option("-d", "--directory", dest="working_dir", default="./")
(options, args) = parser.parse_args()

wanted_file = options.w
trimmomatic_output = options.t

#Read the wanted file and create a dictionary of index vs species identity

with open(wanted_file, 'rb') as species_sequence:
    species_list = list(csv.DictReader(species_sequence, delimiter='-'))
    print species_list


#Rename the Trimmomatic Output files according to the dictionary


for trimmed_sequence in os.listdir(trimmomatic_output):
os.rename(os.path.join(trimmomatic_output, trimmed_sequence),
          os.path.join(trimmomatic_output, trimmed_sequence.replace(species_list[0], species_list[1]))

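Could you please help me with the remaining part? I'm very new to Python and to Stack Overflow, so I apologize if this has been asked before or if I've asked it in the wrong place.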

Answer 1

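The first job is to get rid of all those modules. They may be nice, but for a job like yours they are unlikely to make things any easier.

Create a .py file in the directory where those .gz files are located.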

import os

files = os.listdir()  # list of the file names in the current directory
# txt_file is the path of your .txt file containing the conversions.
# parse_txt() is not shown here; it should parse that file and return a dictionary.
dic = parse_txt(txt_file)
for f in files:
    if '__' not in f:
        continue  # skip files without the separator, such as this script itself
    # Assumes prefix and suffix are divided by a double underscore,
    # e.g. "Pool1_AR001_S13__fw_paired.fastq.gz"
    pre, suf = f.split('__', 1)
    if pre in dic:
        os.rename(f, dic[pre] + '__' + suf)

Let me know if you need help with the parse_txt() function.
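
Since the body of parse_txt() is left out above, here is one possible sketch of it for Python 3. It is not the answerer's own implementation; it only assumes that every line of the .txt file looks like the sample in the question ("old prefix - new name") and that the old prefix itself contains no hyphen.

```python
def parse_txt(txt_file):
    """Return a dict {old_prefix: new_name} read from lines such as
    'Pool1_AR010_S17 - Lbienne_lla10_GS2' (format assumed from the question)."""
    mapping = {}
    with open(txt_file) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # partition() splits on the first '-' only, so a hyphen
            # inside the new name would be kept intact
            old, sep, new = line.partition('-')
            if not sep:
                continue  # no separator found, ignore the line
            mapping[old.strip()] = new.strip()
    return mapping
```

With the sample file from the question, parse_txt(txt_file)['Pool1_AR010_S17'] would give 'Lbienne_lla10_GS2'.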

Answer 2

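Here is a solution that I tested with Python 2. It is fine if you use your own logic instead of the get_mappings function. See the comments in the code for an explanation.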



    import os

    def get_mappings():
        mappings_dict = {}
        with open('wanted_file.txt', 'r') as f:
            for line in f:
                # a line such as "Pool1_AR010_S17 - Lbienne_lla10_GS2"
                # becomes the list ['Pool1_AR010_S17 ', ' Lbienne_lla10_GS2\n']
                # note that there may be whitespace before/after the names as shown above
                text = line.split('-')
                if len(text) < 2:
                    continue  # skip blank or malformed lines
                # strip() removes the surrounding whitespace from the names
                mappings_dict[text[0].strip()] = text[1].strip()

        return mappings_dict

    # PROGRAM EXECUTION STARTS HERE
    # assuming all files are in the current directory;
    # if not, replace the dot (.) with the path of the directory that holds the files
    files = os.listdir('.')
    wanted_names_dict = get_mappings()
    for filename in files:
        try:
            # prefix='Pool1_AR010_S17', suffix='fw_paired.fastq.gz'
            prefix, suffix = filename.split('__')
            new_filename = wanted_names_dict[prefix] + '__' + suffix
            os.rename(filename, new_filename)
            print('renamed %s to %s' % (filename, new_filename))
        except (ValueError, KeyError):
            # ValueError: the name does not split into exactly two parts on '__'
            # KeyError: the prefix is not listed in the mapping file
            print('No new name defined for file: ' + filename)
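
Because os.rename() cannot be undone easily, it may help to look at the planned renames before applying them. The following Python 3 sketch separates planning from renaming; the names plan_renames, apply_renames and dry_run are made up for this illustration, and the directory argument plays the role of the asker's -t option.

```python
import os

def plan_renames(filenames, mapping, sep='__'):
    """Return (old_name, new_name) pairs for files whose prefix is in mapping."""
    plan = []
    for name in filenames:
        # partition() splits on the first separator only
        prefix, found, suffix = name.partition(sep)
        if found and prefix in mapping:
            plan.append((name, mapping[prefix] + sep + suffix))
    return plan

def apply_renames(directory, plan, dry_run=True):
    """Print each planned rename; touch the disk only when dry_run is False."""
    for old, new in plan:
        print(old, '->', new)
        if not dry_run:
            os.rename(os.path.join(directory, old),
                      os.path.join(directory, new))
```

A possible use is plan = plan_renames(os.listdir(directory), wanted_names_dict), followed by apply_renames(directory, plan) to inspect the output, and then apply_renames(directory, plan, dry_run=False) once the list looks right. Joining each name with the directory also lets the script run from a different working directory.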

Replacing part of the file names in a directory using Python
