How do I find the files in a directory that contain only unique entries from a list or from another file?

Time: 2015-11-23 02:40:06

Tags: python python-3.x

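I am trying to write a Python 3 script that parses a directory of files, each containing bacterial species names and the protein sequence that belongs to each species. These are FASTA files: the identifier of each sequence (also called the header) starts with ">". An example is shown below.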

File_1
>Mycoplasma_hypo
MAKEIILGIDLGTTNSVVAIIENQKPVVLENPNGKTTTPSVVAFKNNEEIVGDAAKRQ
LETNPEAIASIKRLMGTDKTVRANNNNERDYKPEEISAKILAYLKEYAEKKIGHKVTK
>Mycoplasma_galli
MSNNNGLIIGIDLGTTNSCVSVMEGAQKVVIENPEGKRTTPSVVSYKNGEIIVGDAAKRQ
MLTNPNTIVSIKRLMGTSKKVKINDKGVEKELTPEEVSASILSYLKDYAEKKTGQKISR
>Mycoplasma_agal
MAKEVIIGIDLGTTNSVVSIVDNGSPVVLENLNGKRTTPSVVSFKDGEIIVGDNAKNQ
IETNPDTVASIKRLMGTSKTVHVNNNNNKDYKPEEISAMILEHLKKYAEEKIGHKIEK

File_2
>Mycoplasma_hypo
MAKEIILGIDLGTTNSVVAIIENQKPVVLENPNGKTTTPSVVAFKNNEEIVGDAAKRQ
LETNPEAIASIKRLMGTDKTVRANNNNERDYKPEEISAKILAYLKEYAEKKIGHKVTK
>Mycoplasma_galli
MSNNNGLIIGIDLGTTNSCVSVMEGAQKVVIENPEGKRTTPSVVSYKNGEIIVGDAAKRQ
MLTNPNTIVSIKRLMGTSKKVKINDKGVEKELTPEEVSASILSYLKDYAEKKTGQKISR
>Mycoplasma_galli
MSNNNGLIIGIDLGTTNSCVSVMEGAQKVVISVVSYKNLKDYAEKKHHGEIIVGDAAKRQ
MLTNPNTIVSIKRLMGTSKKVKI-NDKGVEKELTPEEVSASILSYLKDYAEKKTGQKISR
>Mycoplasma_gen
MAKENNVIIGIDLGTTNSVRTTPSVVSFKDGEIIVGDNAKNQVSIVDNGSPVVLENLNGK
IETNPDTVASIKRLMGTSKTVHVNNNNNNKDYKPEEISAMILEHLKKYAEEKIGHKIEK

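As you can see, File_2 contains a duplicate entry (>Mycoplasma_galli). I want to skip this file and create a directory of all the other files, that is, the files in which each bacterial species appears only once and is taken from a given list or from another file containing these species names. An example of such a lookup file could be: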

LookUp_File

>Mycoplasma_galli
>Mycoplasma_hypo
>Mycoplasma_gen
>Mycoplasma_agal
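
The requirement described above could be sketched roughly as follows. This is only a minimal illustration under several assumptions: a file is kept when none of its headers repeats and every header also appears in the lookup file, the lookup file lives outside the directory being scanned, and the function names (`read_headers`, `has_unique_allowed_species`, `copy_unique_files`) are hypothetical and were not part of the original post.

```python
import shutil
from collections import Counter
from pathlib import Path


def read_headers(path):
    """Return the header names of a FASTA-style file, without '>' or surrounding whitespace."""
    with open(path) as handle:
        return [line[1:].strip() for line in handle if line.startswith(">")]


def has_unique_allowed_species(fasta_path, allowed):
    """True if the file has at least one header, no repeated header, and only allowed species."""
    headers = read_headers(fasta_path)
    counts = Counter(headers)
    no_duplicates = all(n == 1 for n in counts.values())
    return bool(headers) and no_duplicates and set(headers) <= allowed


def copy_unique_files(source_dir, lookup_path, target_dir):
    """Copy every qualifying file from source_dir into target_dir and return the kept names."""
    # Assumption: the lookup file uses the same '>Name' layout as the FASTA headers.
    allowed = set(read_headers(lookup_path))
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    kept = []
    for fasta_path in sorted(Path(source_dir).iterdir()):
        if fasta_path.is_file() and has_unique_allowed_species(fasta_path, allowed):
            shutil.copy(fasta_path, target / fasta_path.name)
            kept.append(fasta_path.name)
    return kept
```

With the two example files and the lookup file shown above, a call such as `copy_unique_files("fasta_dir", "LookUp_File", "unique_files")` (directory names are placeholders) would copy File_1 and skip File_2, because File_2 lists >Mycoplasma_galli twice. If the species are given as a Python list instead of a file, the `allowed` set could be built directly from that list.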

1 answer:

Answer 0: (score: 0)

    AVMutableVideoComposition *videoComp = [AVMutableVideoComposition videoComposition];

    CGSize videoSize = CGSizeApplyAffineTransform(a_compositionVideoTrack.naturalSize, a_compositionVideoTrack.preferredTransform);

    CATextLayer *titleLayer = [CATextLayer layer];
    titleLayer.string = @"lippieapp.com";
    titleLayer.font = (__bridge CFTypeRef)(@"Helvetica-Bold");
    titleLayer.fontSize = 32.0;
    //titleLayer.alignmentMode = kCAAlignmentCenter;
    titleLayer.frame = CGRectMake(30, 0, 250, 60); //You may need to adjust this for proper display

    CALayer *parentLayer = [CALayer layer];
    CALayer *videoLayer = [CALayer layer];
    parentLayer.frame = CGRectMake(0, 0, videoSize.width, videoSize.height);
    videoLayer.frame = CGRectMake(0, 0, videoSize.width, videoSize.height);
    [parentLayer addSublayer:videoLayer];