Replace non-embedded double quotes in a specific tag of an XML file using a Batch script

Asked: 2016-12-13 07:17:39

Tags: xml string perl batch-file vbscript

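My CSV file contains the data below. I want to replace the single, non-embedded " characters (they appear as \" in the sample) with a blank, but only inside the Comment tag. This tag can occur multiple times in a single record/row. I do not want to affect other tags or their " characters. The file is about 30 MB.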

ABCD ,
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<customerDetailsExtension xmlns=\"http://asdfg.net\">
<Comments>
<Comment><Date>2001-12-04</Date><AssociateID>12345</AssociateID>
<AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 34,28,37 height 5'4\". ABC</Comment>
<Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment>
<Date>2001-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32,24.5,34 height 5'3\". ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment><Date>2016-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32.5,26,36.5 height 5'5\"  ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
</Comments>
<EventDate>2017-06-10</EventDate>
</customerDetailsExtension>"

I do not know batch scripting. I tried the following, but it did not work.

@echo off

  for /f "delims=, tokens=2" %%A in (
    'findstr /r "<Comment>.*</Comment>" "D:\data.csv"'
  ) do (
    set code=%%A
    set code=!code:"=!
    echo(!code!
)

2 Answers:

Answer 0: (score: 0)

This should work for you:

@echo off
setlocal EnableExtensions EnableDelayedExpansion 

>D:\data_new.csv (
  for /f "tokens=*" %%A in (D:\data.csv) do (
    set "code=%%A" & if /I "!code:~0,9!" EQU "<Comment>" set "code=!code:"=!"
    echo(!code!
  )
)  
rem remove the rem in next line to overwrite original file
rem copy /Y D:\data_new.csv D:\data.csv
exit/B

set "code=%%A" & if /I "!code:~0,9!" EQU "<Comment>" set "code=!code:\"=\!"

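Use the line above instead to avoid replacing the other quotes: it replaces only the escaped sequence \" (with \), so plain " characters are left alone.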

Answer 1: (score: 0)

findstr is the wrong tool for the job of parsing XML or CSV.

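You have two complicated formats here, and realistically, if you want a solution that will not be brittle, you probably need to parse the CSV with a CSV parser and the XML with an XML parser.

However, the fact that you are trying to strip escaped quotes inside a comment suggests that you are doing something else dirty that breaks because of quote parsing. I would first suggest going back and reviewing what you are doing there, because this may be an XY problem.

Failing that, though, I would probably do it like this: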

#!/usr/bin/env perl
use strict;
use warnings;

use Text::ParseWords;
use XML::Twig;
use Data::Dumper;

sub fix_comment {
    my ( $twig, $comment ) = @_;
    my $text = $comment->text;
    $text =~ s/\"//g;
    $comment->set_text($text);
}

#extract quoted-comma separate things.
foreach my $entry ( quotewords( ",", 0, do { local $/; <DATA> } ) ) {
    if ( $entry =~ m/^\s*<\?xml/ms ) {
        $entry =~ s/^\s+//ms;

        #eval so we can fail gracefully if this doesn't work.
        my $twig = XML::Twig->new(
            pretty_print  => 'indented',
            twig_handlers => { 'Comment/Comment' => \&fix_comment }
        );
        eval { $twig->parse($entry) };
        if ($@) { warn $@ }
        else    { $entry = $twig->sprint; }
    }
    print $entry;
}

__DATA__
DATA ,
" test ",
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<customerDetailsExtension xmlns=\"http://asdfg.net\">
<Comments>
<Comment><Date>2001-12-04</Date><AssociateID>12345</AssociateID>
<AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 34,28,37 height 5'4\". ABC</Comment>
<Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment>
<Date>2001-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32,24.5,34 height 5'3\". ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment><Date>2016-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32.5,26,36.5 height 5'5\" ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
</Comments>
<EventDate>2017-06-10</EventDate>
</customerDetailsExtension>",

This is not perfect, because I am not entirely sure I have captured the line feeds correctly; a dedicated CSV parser may be a more appropriate solution to this problem. It is hard to say.
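If XML::Twig is not available, the same goal can be roughly sketched with core Perl only. Note that this swaps the parser-based approach for a plain regex, so it is more brittle: it assumes that the innermost Comment elements contain only text (no `<` character) and that the quotes to drop always appear as \". The helper names strip_comment_quotes and drop_escaped_quotes are hypothetical, chosen for illustration. The rest of the record, including the xmlns=\"...\" attribute, is left untouched.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: delete every backslash-escaped double quote (\").
sub drop_escaped_quotes {
    my ($s) = @_;
    $s =~ s/\\"//g;
    return $s;
}

# Hypothetical helper: clean only the innermost <Comment> elements, i.e.
# those whose content has no child tags. An outer <Comment> that wraps
# <Date>, <AssociateID>, ... does not match the pattern and is skipped.
sub strip_comment_quotes {
    my ($text) = @_;
    $text =~ s{<Comment>([^<]*)</Comment>}
              { '<Comment>' . drop_escaped_quotes($1) . '</Comment>' }ge;
    return $text;
}

# Small demonstration on a fragment shaped like the sample record.
my $sample = q{<customerDetailsExtension xmlns=\"http://asdfg.net\">
<Comment><Date>2001-12-04</Date>
<Comment>measurements: 34,28,37 height 5'4\". ABC</Comment>
</Comment>};

print strip_comment_quotes($sample), "\n";
```

For a real file, the whole content would have to be read into one string first (for example with local $/), so that a comment spanning several lines still matches.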