Splitting a large text file into smaller text files based on a header in Python

Time: 2014-03-20 05:01:49

Tags: python

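I have a file containing a number of Medline abstracts, and I want to put each abstract into its own txt file.

I would like a Python script that reads the large file and finds each MEDLINE:######## header. All the text after a header should be saved to a separate text file, up to the next occurrence of MEDLINE: followed by 8 digits (e.g. MEDLINE:95369245).

Here are 5 sample abstracts from the file: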

MEDLINE:95369245

IL-2 gene expression and NF-kappa B activation through CD28 requires reactive oxygen production by 5-lipoxygenase. 
Activation of the CD28 surface receptor provides a major costimulatory signal for T cell activation resulting in enhanced production of interleukin-2 (IL-2) and cell proliferation. In primary T lymphocytes we show that CD28 ligation leads to the rapid intracellular formation of reactive oxygen intermediates (ROIs) which are required for CD28-mediated activation of the NF-kappa B/CD28-responsive complex and IL-2 expression. Delineation of the CD28 signaling cascade was found to involve protein tyrosine kinase activity, followed by the activation of phospholipase A2 and 5-lipoxygenase. Our data suggest that lipoxygenase metabolites activate ROI formation which then induce IL-2 expression via NF-kappa B activation. These findings should be useful for therapeutic strategies and the development of immunosuppressants targeting the CD28 costimulatory pathway. 

MEDLINE:95333264

The peri-kappa B site mediates human immunodeficiency virus type 2 enhancer activation in monocytes but not in T cells. 
Human immunodeficiency virus type 2 (HIV-2), like HIV-1, causes AIDS and is associated with AIDS cases primarily in West Africa. HIV-1 and HIV-2 display significant differences in nucleic acid sequence and in the natural history of clinical disease. Consistent with these differences, we have previously demonstrated that the enhancer/promoter region of HIV-2 functions quite differently from that of HIV-1. Whereas activation of the HIV-1 enhancer following T-cell stimulation is mediated largely through binding of the transcription factor NF-kappa B to two adjacent kappa B sites in the HIV-1 long terminal repeat, activation of the HIV-2 enhancer in monocytes and T cells is dependent on four cis-acting elements: a single kappa B site, two purine-rich binding sites, PuB1 and PuB2, and a pets site. We have now identified a novel cis-acting element within the HIV-2 enhancer, immediately upstream of the kappa B site, designated peri-kappa B. This site is conserved among isolates of HIV-2 and the closely related simian immunodeficiency virus, and transfection assays show this site to mediate HIV-2 enhancer activation following stimulation of monocytic but not T-cell lines. This is the first description of an HIV-2 enhancer element which displays such monocyte specificity, and no comparable enhancer element has been clearly defined for HIV-1. While a nuclear factor(s) from both peripheral blood monocytes and T cells binds the peri-kappa B site, electrophoretic mobility shift assays suggest that either a different protein binds to this site in monocytes versus T cells or that the protein recognizing this enhancer element undergoes differential modification in monocytes and T cells, thus supporting the transfection data. Further, while specific constitutive binding to the peri-kappa B site is seen in monocytes, stimulation with phorbol esters induces additional, specific binding. 
Understanding the monocyte-specific function of the peri-kappa B factor may ultimately provide insight into the different role monocytes and T cells play in HIV pathogenesis.

MEDLINE:95343554

E1A gene expression induces susceptibility to killing by NK cells following immortalization but not adenovirus infection of human cells. 
Adenovirus (Ad) infection and E1A transfection were used to model changes in susceptibility to NK cell killing caused by transient vs stable E1A expression in human cells. Only stably transfected target cells exhibited cytolytic susceptibility, despite expression of equivalent levels of E1A proteins in Ad-infected targets. The inability of E1A gene products to induce cytolytic susceptibility during infection was not explained by an inhibitory effect of viral infection on otherwise susceptible target cells or by viral gene effects on class I MHC antigen expression on target cells. This differential effect of E1A expression on the cytolytic phenotypes of infected and stably transfected human cells suggests that human NK cells provide an effective immunologic barrier against the in vivo survival and neoplastic progression of E1A-immortalized cells that may emerge from the reservoir of persistently infected cells in the human host. 

MEDLINE:95347379

Distinct signaling properties identify functionally different CD4 epitopes. 
The CD4 coreceptor interacts with non-polymorphic regions of major histocompatibility complex class II molecules on antigen-presenting cells and contributes to T cell activation. We have investigated the effect of CD4 triggering on T cell activating signals in a lymphoma model using monoclonal antibodies (mAb) which recognize different CD4 epitopes. We demonstrate that CD4 triggering delivers signals capable of activating the NF-AT transcription factor which is required for interleukin-2 gene expression. Whereas different anti-CD4 mAb or HIV-1 gp120 could all trigger activation of the protein tyrosine kinases p56lck and p59fyn and phosphorylation of the Shc adaptor protein, which mediates signals to Ras, they differed significantly in their ability to activate NF-AT. Lack of full activation of NF-AT could be correlated to a dramatically reduced capacity to induce calcium flux and could be complemented with a calcium ionophore. The results identify functionally distinct epitopes on the CD4 coreceptor involved in activation of the Ras/protein kinase C and calcium pathways. 

MEDLINE:95280913

Ligand-dependent repression of the erythroid transcription factor GATA-1 by the estrogen receptor. 
High-dose estrogen administration induces anemia in mammals. In chickens, estrogens stimulate outgrowth of bone marrow-derived erythroid progenitor cells and delay their maturation. This delay is associated with down-regulation of many erythroid cell-specific genes, including alpha- and beta-globin, band 3, band 4.1, and the erythroid cell-specific histone H5. We show here that estrogens also reduce the number of erythroid progenitor cells in primary human bone marrow cultures. To address potential mechanisms by which estrogens suppress erythropoiesis, we have examined their effects on GATA-1, an erythroid transcription factor that participates in the regulation of the majority of erythroid cell-specific genes and is necessary for full maturation of erythrocytes. We demonstrate that the transcriptional activity of GATA-1 is strongly repressed by the estrogen receptor (ER) in a ligand-dependent manner and that this repression is reversible in the presence of 4-hydroxytamoxifen. ER-mediated repression of GATA-1 activity occurs on an artificial promoter containing a single GATA-binding site, as well as in the context of an intact promoter which is normally regulated by GATA-1. GATA-1 and ER bind to each other in vitro in the absence of DNA. In coimmunoprecipitation experiments using transfected COS cells, GATA-1 and ER associate in a ligand-dependent manner. Mapping experiments indicate that GATA-1 and the ER form at least two contacts, which involve the finger region and the N-terminal activation domain of GATA-1. We speculate that estrogens exert effects on erythropoiesis by modulating GATA-1 activity through protein-protein interaction with the ER. (ABSTRACT TRUNCATED AT 250 WORDS) 

1 answer:

Answer 0 (score: 0)

Try this.

Note: the main.txt file should be in this format: http://textuploader.com/1yrd

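import re


def file_match(text):
    # Header line of the requested record, e.g. "MEDLINE:95333264".
    patt = re.compile(r'MEDLINE:%s\b' % re.escape(text))
    # Header line of any record; marks where the requested record ends.
    any_header = re.compile(r'MEDLINE:\d{8}')
    found = False
    with open('main.txt', 'r') as f:
        for line in f:
            if not found:
                if not patt.match(line):
                    continue
                # First line of the requested record: start the output file.
                found = True
                file_output = open('MEDLINE_%s.txt' % text, 'w')
            elif any_header.match(line):
                # The next record begins here, so stop copying.
                break
            file_output.write(line)
            print(line, end='')
    if found:
        file_output.close()


file_match('95333264')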
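
The function above extracts one record per call, whereas the question asks for every abstract to end up in its own file. A minimal sketch of a single-pass split follows. It assumes each record starts with a line consisting solely of MEDLINE: followed by 8 digits, as in the sample from the question. The name split_medline and its out_dir parameter are hypothetical and not part of the original answer.

```python
import os
import re

# Assumption: a header is a line containing only "MEDLINE:" plus 8 digits.
HEADER = re.compile(r'^MEDLINE:(\d{8})\s*$')


def split_medline(source_path, out_dir='.'):
    """Write each record in source_path to out_dir/MEDLINE_<id>.txt.

    Returns the record IDs in the order they were found.
    """
    ids = []
    out = None
    try:
        with open(source_path, 'r') as src:
            for line in src:
                match = HEADER.match(line)
                if match:
                    # A new header closes the previous record's file.
                    if out is not None:
                        out.close()
                    record_id = match.group(1)
                    ids.append(record_id)
                    name = 'MEDLINE_%s.txt' % record_id
                    out = open(os.path.join(out_dir, name), 'w')
                if out is not None:
                    # Lines before the first header are skipped.
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return ids
```

Calling split_medline('main.txt') would then write MEDLINE_95369245.txt, MEDLINE_95333264.txt and so on into the current directory. The file is read line by line, so it does not need to fit in memory. If the same ID appears twice, the later record overwrites the earlier file.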