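My CSV file contains the data below. I want to replace the embedded " characters, but only those inside the inner Comment tag, with a blank character. This tag can occur multiple times in a single record/row. I don't want to affect any other tags or any other " characters. The file size is about 30MB.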
ABCD ,
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<customerDetailsExtension xmlns=\"http://asdfg.net\">
<Comments>
<Comment><Date>2001-12-04</Date><AssociateID>12345</AssociateID>
<AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 34,28,37 height 5'4\". ABC</Comment>
<Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment>
<Date>2001-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32,24.5,34 height 5'3\". ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment><Date>2016-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32.5,26,36.5 height 5'5\" ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
</Comments>
<EventDate>2017-06-10</EventDate>
</customerDetailsExtension>"
I don't know batch scripting. I tried the following, but it didn't work.
@echo off
for /f "delims=, tokens=2" %%A in (
'findstr /r "<Comment>.*</Comment>" "D:\data.csv"'
) do (
set code=%%A
set code=!code:"=!
echo(!code!
)
Answer 0 (score: 0)
This should work for you:
@echo off
setlocal EnableExtensions EnableDelayedExpansion
rem Write the processed lines to a new file and leave the original untouched
>D:\data_new.csv (
for /f "tokens=*" %%A in (D:\data.csv) do (
rem Strip every double quote from lines that begin with the Comment tag
set "code=%%A" & if /I "!code:~0,9!" EQU "<Comment>" set "code=!code:"=!"
echo(!code!
)
)
rem remove the rem in next line to overwrite original file
rem copy /Y D:\data_new.csv D:\data.csv
exit /B
Or use
set "code=%%A" & if /I "!code:~0,9!" EQU "<Comment>" set "code=!code:\"=\!"
to avoid replacing the other quotation marks.
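For comparison, the same line-by-line idea can be sketched in Python. Python is a swap-in here and is not used anywhere in this answer; the names `INNER_COMMENT`, `strip_comment_quotes`, and `convert` are hypothetical. The sketch assumes that each inner `<Comment>…</Comment>` pair sits on a single line, as in the sample data, and that the file is UTF-8. Unlike the batch version, it touches only the text between the inner tags rather than the whole line.

```python
import re

# Matches an inner <Comment>...</Comment> pair on one line, i.e. one whose
# body contains no further tags (the outer <Comment> wraps other elements).
INNER_COMMENT = re.compile(r"<Comment>([^<]*)</Comment>")


def strip_comment_quotes(line: str) -> str:
    """Remove backslash-escaped double quotes inside single-line inner comments."""
    return INNER_COMMENT.sub(
        lambda m: "<Comment>" + m.group(1).replace('\\"', "") + "</Comment>",
        line,
    )


def convert(src_path: str, dst_path: str) -> None:
    """Stream src_path to dst_path line by line, cleaning inner comments."""
    # Assumption: UTF-8 input. newline="" keeps the original line endings.
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(strip_comment_quotes(line))
```

Because the file is streamed one line at a time, a 30MB input should not be a memory concern. A comment whose text spans several lines would be left unchanged by this sketch.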
Answer 1 (score: 0)
findstr is the wrong tool for the job of parsing XML or CSV.
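You have two complicated formats here, and really, if you want a solution that won't be brittle, you probably need to parse the CSV with a CSV parser and the XML with an XML parser.
However, the fact that you are trying to remove escaped quotes from the comments suggests that you are doing something else dirty that breaks because of quote parsing. I would first suggest reviewing what you are doing there, because this may be an XY problem.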
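Failing that, though, I would probably do something like the script below. It is not perfect, because I am not entirely sure I have captured the line feeds correctly.
#!/usr/bin/env perl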
use strict;
use warnings;
use Text::ParseWords;
use XML::Twig;
use Data::Dumper;

sub fix_comment {
    my ( $twig, $comment ) = @_;
    my $text = $comment->text;
    $text =~ s/\"//g;
    $comment->set_text($text);
}

#extract quoted-comma separate things.
foreach my $entry (
    quotewords(
        ",", 0,
        do { local $/; <DATA> }
    )
  )
{
    if ( $entry =~ m/^\s*<\?xml/ms ) {
        $entry =~ s/^\s+//ms;

        #eval so we can fail gracefully if this doesn't work.
        my $twig = XML::Twig->new(
            pretty_print  => 'indented',
            twig_handlers => { 'Comment/Comment' => \&fix_comment }
        );
        eval { $twig->parse($entry) };
        if ($@) { warn $@ }
        else {
            $entry = $twig->sprint;
        }
    }
    print $entry;
}
__DATA__
DATA , " test ",
"<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"yes\"?>
<customerDetailsExtension xmlns=\"http://asdfg.net\">
<Comments>
<Comment><Date>2001-12-04</Date><AssociateID>12345</AssociateID>
<AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 34,28,37 height 5'4\". ABC</Comment>
<Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment>
<Date>2001-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32,24.5,34 height 5'3\". ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
<Comment><Date>2016-12-04</Date><AssociateID>12345</AssociateID><AssociateFirstName>ABC</AssociateFirstName>
<Comment>measurements: 32.5,26,36.5 height 5'5\" ABC</Comment><Priority>false</Priority><IsRead>false</IsRead>
</Comment>
</Comments>
<EventDate>2017-06-10</EventDate>
</customerDetailsExtension>",
A dedicated CSV parser may be a more suitable solution to this problem. It's hard to say.
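As a rough illustration of "CSV-parse the CSV and XML-parse the XML", here is a minimal Python sketch built on the standard `csv` and `xml.etree.ElementTree` modules. It is not the Perl approach above, and it rests on several assumptions: quotes inside fields are backslash-escaped as in the sample, any field that starts with `<?xml` holds a well-formed document, and a `Comment` element without child elements is the one to clean. The names `CSV_OPTS`, `local_name`, `clean_xml_field`, and `convert` are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Dialect assumption: fields are double-quoted and embedded quotes are
# escaped with a backslash, as in the sample data.
CSV_OPTS = dict(delimiter=",", quotechar='"', escapechar="\\", doublequote=False)


def local_name(tag: str) -> str:
    """Return an element tag without its '{namespace}' prefix."""
    return tag.rsplit("}", 1)[-1]


def clean_xml_field(xml_text: str) -> str:
    """Strip double quotes from the text of leaf Comment elements only."""
    root = ET.fromstring(xml_text.strip())
    if root.tag.startswith("{"):
        # Keep the default namespace unprefixed when serializing again.
        ET.register_namespace("", root.tag[1:].split("}", 1)[0])
    for element in root.iter():
        # A leaf Comment has no child elements; the outer Comment does.
        if local_name(element.tag) == "Comment" and len(element) == 0 and element.text:
            element.text = element.text.replace('"', "")
    # Note: the XML declaration is not re-emitted by this call.
    return ET.tostring(root, encoding="unicode")


def convert(src, dst) -> None:
    """Copy CSV rows from src to dst, cleaning every field that holds XML."""
    writer = csv.writer(dst, lineterminator="\n", **CSV_OPTS)
    for row in csv.reader(src, **CSV_OPTS):
        writer.writerow(
            [clean_xml_field(f) if f.lstrip().startswith("<?xml") else f for f in row]
        )
```

`convert` accepts any pair of text file objects, for example `open("data.csv", newline="")` and `open("data_new.csv", "w", newline="")`, or `io.StringIO` objects when experimenting. The output is not byte-for-byte identical to the input apart from the removed quotes: the XML declaration is dropped and the writer decides on its own which fields to quote.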