Batch file for extracting data from many .xml files

Date: 2017-05-27 22:44:28

Tags: xml batch-file

I need help optimizing a batch file that extracts several XML tags from more than a thousand XML files into a .txt or .csv file.

The .xml files all share the same format. They are clinical studies and look like this:

<?xml version="1.0" encoding="UTF-8"?>
<clinical_study rank="373">
  <!-- This xml conforms to an XML Schema at:
    https://clinicaltrials.gov/ct2/html/images/info/public.xsd -->
  <required_header>
    <download_date>ClinicalTrials.gov processed this data on May 25, 2017</download_date>
    <link_text>Link to the current ClinicalTrials.gov record.</link_text>
    <url>https://clinicaltrials.gov/show/NCT00146471</url>
  </required_header>
  <id_info>
    <org_study_id>Kep-F10.3.01</org_study_id>
    <nct_id>NCT00146471</nct_id>
  </id_info>
  <brief_title>Efficacy and Safety of Levetiracetam in the Inpatient Treatment of Alcohol Withdrawal Syndrome</brief_title>
  <official_title>Efficacy and Safety of Levetiracetam in the Inpatient Treatment of Alcohol Withdrawal Syndrome [Sicherheit Und Wirksamkeit Von Levetiracetam (Keppra) für Die Behandlung Des stationären Alkoholentzugsyndroms]</official_title>
  <sponsors>
    <lead_sponsor>
      <agency>Charite University, Berlin, Germany</agency>
      <agency_class>Other</agency_class>
    </lead_sponsor>
  </sponsors>
  <source>Charite University, Berlin, Germany</source>
  <oversight_info>
    <has_dmc>Yes</has_dmc>
  </oversight_info>
  <brief_summary>
    <textblock>
      The purpose of this study is to evaluate the efficacy and safety of levetiracetam for
      treating alcohol withdrawal syndrome (AWS) in inpatients (vs. placebo). The primary come-out
      parameter is the reduction of the total needed amount of diazepam for add-on treatment of
      acute alcohol withdrawal symptoms. The secondary come-out parameter are - safety criteria
      (AE) - reduction of alcohol withdrawal score over the days.
    </textblock>
  </brief_summary>
  <overall_status>Completed</overall_status>
  <start_date>January 2006</start_date>
  <completion_date type="Actual">September 2007</completion_date>
  <primary_completion_date type="Actual">July 2007</primary_completion_date>
  <phase>Phase 3</phase>
  <study_type>Interventional</study_type>
  <has_expanded_access>No</has_expanded_access>
  <study_design_info>
    <allocation>Randomized</allocation>
    <intervention_model>Parallel Assignment</intervention_model>
    <primary_purpose>Treatment</primary_purpose>
    <masking>Double Blind (Participant, Care Provider, Investigator)</masking>
  </study_design_info>
  <primary_outcome>
    <measure>To evaluate the efficacy and safety of levetiracetam for treating alcohol withdrawal syndrome in inpatients. The primary come-out parameter is the reduction of the amount of diazepam for add-on treatment of acute alcohol withdrawal</measure>
    <time_frame>during trial</time_frame>
  </primary_outcome>
  <secondary_outcome>
    <measure>Secondary come-out parameters are - safety criteria (AE) - reduction of alcohol withdrawal score over the days</measure>
    <time_frame>during trial</time_frame>
  </secondary_outcome>
  <number_of_arms>2</number_of_arms>
  <enrollment type="Actual">120</enrollment>
  <condition>Alcohol Withdrawal Syndrome</condition>
  <arm_group>
    <arm_group_label>2</arm_group_label>
    <arm_group_type>Active Comparator</arm_group_type>
  </arm_group>
  <arm_group>
    <arm_group_label>1: Diazepam plus Placebo</arm_group_label>
    <arm_group_type>Placebo Comparator</arm_group_type>
  </arm_group>
  <intervention>
    <intervention_type>Drug</intervention_type>
    <intervention_name>Levetiracetam</intervention_name>
    <description>1500-2000 mg daily add-on or Placebo Diazepam as needed</description>
    <arm_group_label>2</arm_group_label>
    <other_name>KEPPRA</other_name>
  </intervention>
  <intervention>
    <intervention_type>Drug</intervention_type>
    <intervention_name>Placebo</intervention_name>
    <description>1500-2000 mg daily add-on or Placebo Diazepam as needed</description>
    <arm_group_label>1: Diazepam plus Placebo</arm_group_label>
  </intervention>
  <eligibility>
    <criteria>
      <textblock>
        Inclusion Criteria:

          -  Ages eligible for study: 18-65 years.

          -  Meets criteria for alcohol dependence according to DSM-IV/ICD-10

          -  Known withdrawal symptoms in the past in case of discontinuation of alcohol
             consumption

          -  Hospital admission for alcohol detoxification

          -  Able to provide a written informed consent.

          -  Able to follow verbal and written instructions (incl. a sufficient knowledge of
             German language).

          -  Must be medically acceptable for study treatment. No past or present physical
             disorder that is likely to deteriorate during participation. No ECG abnormality which
             would likely worsen during participation and no clinical laboratory abnormality that
             would also suggest deterioration during treatment.

          -  Have a negative urine drug screen for benzodiazepines or heroine or methadone

        Exclusion Criteria:

          -  Current diagnosis of any other substance dependence syndrome other than alcohol
             dependence (excluding nicotine and caffeine dependence).

          -  History of idiopathic epilepsy.

          -  Patient with any current clinically significant psychiatric disorder (acute
             suiciality) or developmental disorder (including organic mental disorder), like
             psychotic disorders.

          -  Patients with the following complications of alcoholism (lifetime): acute delirium
             tremens, hallucinatory alcoholic state, Korsakoff`s syndrome, Wernicke
             encephalopathy, decomposed liver cirrhosis (Child B, C), suspected cirrhosis with the
             following clinical symptoms detected at clinical exam: signs of portal hypertension
             and signs of hepato-cellular failure, thrombocytopenia.

          -  Subjects with known sensitivity of previous adverse reaction to levetiracetam

          -  Contra-indication (hypersensitivity to levetiracetam or pyrrolidone derivatives) or
             known non-response to levetiracetam.

          -  History of severe GI disease which might render absorption of the medication
             difficult or produce medical instability of the patient which would include active
             peptic ulcer disease, ulcerative colitis, regional colitis, or evidence by history or
             physical exam of GI bleeding.

          -  Patients with any clinically significant acute or chronic progressive neurological,
             gastrointestinal, cardiovascular, hepatic, renal, haematological, endocrine,
             dermatological or respiratory disease, such as diabetes, severe infection, acute
             alcoholic hepatitis, or any other medical condition with significant worsening of the
             clinical situation of the patient that might interfere with the evaluation of study
             medication.

          -  Female patients pregnant, breast-feeding or of child bearing age and not protected by
             effective contraceptive such as implants, injectables, combined oral contraceptives,
             some IUDS, sexual abstinence, sterilization or vasectomized partner.

          -  Actually continuous use of pharmacological agents that are known to lower the seizure
             threshold or augment or decrease the alcohol withdrawal syndrome.

          -  Subjects with known sensitivity of previous adverse reaction to diazepam or clonidine

          -  Contra-indication or known non-response to diazepam or clonidine
      </textblock>
    </criteria>
    <gender>All</gender>
    <minimum_age>18 Years</minimum_age>
    <maximum_age>65 Years</maximum_age>
    <healthy_volunteers>No</healthy_volunteers>
  </eligibility>
  <overall_official>
    <last_name>Martin Schaefer, MD</last_name>
    <role>Principal Investigator</role>
    <affiliation>Charité Campus Mitte, Klinik für Psychiatrie und Psychotherapie</affiliation>
  </overall_official>
  <location>
    <facility>
      <name>MLU Halle-Wittenberg</name>
      <address>
        <city>Halle</city>
        <state>Sachen/Anhalt</state>
        <zip>06097</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Charité - Universitätsmedizin Berlin, Campus Charité Mitte, Klinik für Psychiatrie und Psychotherapie</name>
      <address>
        <city>Berlin</city>
        <zip>10117</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Psychiatrische Klinik der Charité im St.-Hedwig Krankenhaus</name>
      <address>
        <city>Berlin</city>
        <zip>10559</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Klinik für Psychiatrie und Suchtmedizin, Kliniken Essen Mitte</name>
      <address>
        <city>Essen</city>
        <zip>45136</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location>
    <facility>
      <name>Zentrum für Seelische Gesundheit</name>
      <address>
        <city>Rhede</city>
        <zip>46414</zip>
        <country>Germany</country>
      </address>
    </facility>
  </location>
  <location_countries>
    <country>Germany</country>
  </location_countries>
  <reference>
    <citation>Krebs M, Leopold K, Richter C, Kienast T, Hinzpeter A, Heinz A, Schaefer M. Levetiracetam for the treatment of alcohol withdrawal syndrome: an open-label pilot trial. J Clin Psychopharmacol. 2006 Jun;26(3):347-9.</citation>
    <PMID>16702910</PMID>
  </reference>
  <verification_date>September 2008</verification_date>
  <lastchanged_date>December 29, 2009</lastchanged_date>
  <firstreceived_date>September 6, 2005</firstreceived_date>
  <responsible_party>
    <name_title>Martin Schaefer, MD</name_title>
    <organization>Charite University, Berlin, Germany</organization>
  </responsible_party>
  <keyword>alcohol withdrawal</keyword>
  <keyword>detoxification</keyword>
  <keyword>Inpatients</keyword>
  <keyword>alcohol dependence according to DSM-IV/ICD-10</keyword>
  <keyword>withdrawal symptoms</keyword>
  <condition_browse>
    <!-- CAUTION:  The following MeSH terms are assigned with an imperfect algorithm  -->
    <mesh_term>Syndrome</mesh_term>
    <mesh_term>Substance Withdrawal Syndrome</mesh_term>
  </condition_browse>
  <intervention_browse>
    <!-- CAUTION:  The following MeSH terms are assigned with an imperfect algorithm  -->
    <mesh_term>Ethanol</mesh_term>
    <mesh_term>Diazepam</mesh_term>
    <mesh_term>Etiracetam</mesh_term>
    <mesh_term>Piracetam</mesh_term>
  </intervention_browse>
  <!-- Results have not yet been posted for this study                                -->
</clinical_study>

So they all use the same tags, and I need some of them, such as:

  • overall_official
  • lead_sponsor
  • official_title
  • results_reference
  • overall_status

So far I have tried the following code:

@echo off
setlocal enabledelayedexpansion
for %%a in (*.xml) do (
call :XMLExtract "%%a" "<results_reference>" location
echo.!location!,%%~na
)
exit /b

:XMLExtract file keystart location
@echo off & setlocal
for /f "tokens=3 delims=<>" %%a in ('Findstr /i /c:%2 "%~1"') do (
   set "loc=%%a" & goto :endloop
)
:endLoop
ENDLOCAL & IF "%~3" NEQ "" (SET %~3=%loc%) ELSE echo.%loc%
exit /b

I run the batch file from the command line as: bat >> output.txt (or output.csv). It works perfectly for overall_status, but every other tag has problems, for example:

  • overall_official: stops after 10 people
  • Other tags: the file name is listed (as always), but with no information after it.

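I would really appreciate any help on how to fix this, or any other efficient way to solve this task. I have only a small, basic understanding of programming, but I am confident I can work through any simple solution myself. The most helpful answer would be a way to adapt the batch code to this case. If any information is missing, I apologize and will provide it.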

2 answers:

Answer 0: (score: 0)

@ECHO Off
SETLOCAL
SET "sourcedir=U:\sourcedir"
SET "destdir=U:\destdir"
:: SET "tags=overall_official lead_sponsor official_title results_reference overall_status"
SET "tags=%*"

FOR /f "tokens=1delims=" %%a IN (
 'dir /b /a-d "%sourcedir%\*.xml" '
 ) DO (
 REM Clear detected-tags flags for each file "%%a"
 FOR %%t IN (%tags% malformed) DO SET "%%t="
 REM remove "rem" from following line to delete any existing result file
 REM del "%destdir%\%%~na.txt" >nul 2>nul
 REM Read each line to %%L - usebackq to allow "quoted filenames"
 FOR /f "usebackqdelims=" %%L IN ("%sourcedir%\%%a") DO (
  REM remove leading spaces from %%L into %%P
  FOR /f "tokens=*" %%P IN ("%%L") DO (
   REM tokenise on "<>"
   FOR /f "tokens=1-3*delims=<>" %%w IN ("%%P") DO (
    IF "%%z" neq "" SET "malformed=%%z"
    FOR %%t IN (%tags%) DO IF "%%w"=="%%t" (SET "%%t=Y") else IF "%%w"=="/%%t" (SET "%%t=") 
    SET "report="
    FOR %%t IN (%tags%) DO IF DEFINED %%t SET "report=Y"
    REM (1 of 2) un-rem this to deposit in individual filenames
    REM (
    IF DEFINED report (
     REM we may have 1,2 or 3 tokens
     REM if 3, output token 2
     REM if 2, output token 1 if token 2 starts "/", token 2 otherwise
     REM if only 1, output entire line unless it is a target token
     IF "%%y" equ "" (
      IF "%%x" equ "" (
       REM only one token
       FOR %%t IN (%tags%) DO IF "%%w"=="%%t" (SET "report=") else IF "%%w"=="/%%t" (SET "report=") 
       IF DEFINED report ECHO %%L
      ) ELSE (
       REM two tokens
       ECHO %%x|FINDSTR /b "/">NUL 2>NUL
       IF ERRORLEVEL 1 (ECHO %%x) ELSE (ECHO %%w)
      )
     ) ELSE (ECHO %%x)
    )
    REM (2 of 2) un-rem this to deposit in individual filenames
    REM )>>"%destdir%\%%~na.txt"
    FOR %%t IN (%tags%) DO IF "%%y"=="/%%t" (SET "%%t=") 
    FOR %%t IN (%tags%) DO IF "%%x"=="/%%t" (SET "%%t=") 
   )
  REM pause
  )
 )
)

GOTO :EOF

You will need to change the settings of sourcedir and destdir to suit your circumstances.

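This may give you some ideas. You have not provided a sample of the output you want, so you may wish to prefix each output line with the source filename, taken from %%~na, in the appropriate echo statements.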

Expected syntax to run it:

thisbatchname tag tag tag

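My approach is to have %%a contain the name of the file being processed, %%L the raw line data from the file, and %%P the raw line data with the leading spaces removed.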

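Tokenising %%P with < and > as delimiters yields %%w to %%z, since each line contains 1 to 3 possible elements, each either a tag or data. If there is a fourth, there is an error (the flag malformed is set for the file, although I do nothing with it; it will contain the text where the problem lies [it could just as well be set to %%P to hold the entire line]).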

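So, using the required tags as variable names, simply set those variables to nothing or to something, and use if defined to interpret their state, that is, their run-time state as the data changes line by line.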
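For readers who find the batch syntax hard to follow, the same idea can be sketched in Python. This is only an illustration of the approach described above, not a translation of the batch file: the function names split_tokens and collect are made up for this sketch, and a set stands in for the per-tag flag variables.

```python
import re


def split_tokens(line):
    """Split a line on runs of < and >, dropping empty pieces (roughly what delims=<> does)."""
    return [t for t in re.split(r"[<>]+", line.strip()) if t]


def collect(lines, tags):
    """Return the text seen while at least one wanted tag is open."""
    open_tags = set()  # plays the role of the per-tag flag variables
    found = []
    for line in lines:
        tokens = split_tokens(line)
        if not tokens:
            continue
        first = tokens[0]
        if first in tags:
            open_tags.add(first)  # opening tag seen: raise the flag
        if open_tags:
            if len(tokens) >= 3:
                # <tag>data</tag> on one line: the data is the second token
                found.append(tokens[1])
            elif len(tokens) == 2:
                # "data</tag>" or "<tag>data"
                found.append(tokens[0] if tokens[1].startswith("/") else tokens[1])
            elif first.lstrip("/") not in tags:
                # a lone token that is not one of the wanted tags: plain data line
                found.append(line.strip())
        for token in tokens:
            if token.startswith("/"):
                open_tags.discard(token[1:])  # closing tag seen: lower the flag
    return found
```

Like the batch version, this sketch compares the whole first token with the tag name, so an opening tag that carries attributes (for example completion_date type="Actual") would not match a bare tag name.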

Note that since the entire operational part of the code is one giant code block, rem rather than :: must be used for comments.

Also note that

(
 commands
)>file

will redirect the output of commands to file, using whichever redirector is specified (> or >>, as required).

Answer 1: (score: 0)

Try xpath.bat:

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml" "//reference/citation"') do set "reference_citation=%%#"
echo %reference_citation%

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml" "//official_title"') do set "official_title=%%#"
echo %official_title%

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml" "//lead_sponsor/agency"') do set "lead_sponsor=%%#"
echo %lead_sponsor%

for /f "tokens=* delims=" %%# in ('xpath.bat "study.xml"  "//overall_official"') do set  "overall_official=%%#"
echo %overall_official%
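If a tool other than batch is acceptable, the same path-based lookups can also be sketched with Python's standard xml.etree.ElementTree module. This is a rough sketch under assumptions, not a tested solution for the full data set: the names FIELDS, extract_row and write_csv are invented for illustration, the choice of last_name for overall_official is a guess at what is wanted, and the citation column reads reference/citation because the sample file shown contains reference but no results_reference element.

```python
import csv
import xml.etree.ElementTree as ET
from pathlib import Path

# CSV column name -> path relative to the <clinical_study> root (assumed layout)
FIELDS = {
    "official_title": "official_title",
    "overall_status": "overall_status",
    "lead_sponsor": "sponsors/lead_sponsor/agency",
    "overall_official": "overall_official/last_name",
    "citation": "reference/citation",
}


def extract_row(xml_path):
    """Read one study file and return a dict with one entry per CSV column."""
    root = ET.parse(str(xml_path)).getroot()
    row = {"file": Path(xml_path).name}
    for column, path in FIELDS.items():
        # Collapse internal whitespace; join repeated elements with "; "
        texts = [" ".join((el.text or "").split()) for el in root.findall(path)]
        row[column] = "; ".join(t for t in texts if t)
    return row


def write_csv(source_dir, out_path):
    """Write one CSV row per *.xml file found in source_dir."""
    with open(out_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", *FIELDS])
        writer.writeheader()
        for xml_path in sorted(Path(source_dir).glob("*.xml")):
            writer.writerow(extract_row(xml_path))
```

Calling write_csv(r"U:\sourcedir", "output.csv") would then produce one row per file, with the source filename in the first column. A tag that is missing from a file simply yields an empty cell.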