Question

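I have a Perl script on Linux that consumes an XML file, and occasionally some of the node values contain a CRLF (hex 0D0A, a DOS newline).

The system that generates the XML file writes it all on a single line, and it looks as though it occasionally decides the line is too long and writes a CRLF into one of the data elements. Unfortunately, there is nothing I can do about the providing system.

I just need to remove them from the string before processing it.

I have tried all sorts of regex substitutions using Perl character classes, hex values, and so on, and nothing seems to work.

I have even run the input file through dos2unix before processing, and I still can't get rid of the erroneous characters.

Does anyone have any ideas?

Many thanks,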

Answer 1

Typical: after about two hours of fighting with this, I solved it within five minutes of asking the question.

$output =~ s/[\x0A\x0D]//g;

Finally got it.
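As a quick check of that substitution, here is a minimal sketch. The sample string and the helper name strip_crlf are made up for illustration; only the regex comes from the answer. Note that the character class deletes every CR and every LF, including a lone LF, not just CRLF pairs.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the substitution from the answer.
sub strip_crlf {
    my ($s) = @_;
    $s =~ s/[\x0A\x0D]//g;    # delete every LF (0A) and CR (0D)
    return $s;
}

# Made-up node value with a CRLF embedded mid-string.
my $output  = "<name>Some long\x0D\x0A value</name>";
my $cleaned = strip_crlf($output);

print "$cleaned\n";    # prints: <name>Some long value</name>
```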

Answer 2

$output =~ tr/\x{d}\x{a}//d;

Both are whitespace characters, so if the terminators are always at the end, you can right-trim with

$output =~ s/\s+\z//;
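To make the difference between the two statements concrete, here is a small sketch on a made-up string (the variable names are illustrative, not from the answer). tr with the d modifier deletes CR and LF wherever they occur, while the right-trim only touches the end of the string and also strips trailing spaces and tabs.

```perl
use strict;
use warnings;

# Made-up sample: one CRLF inside the value, trailing spaces, one CRLF at the end.
my $raw = "first\x0D\x0Asecond  \x0D\x0A";

# tr///d deletes CR and LF anywhere in the copy.
(my $all_gone = $raw) =~ tr/\x{d}\x{a}//d;

# Without /d and with an empty replacement list, tr only counts matches.
my $removed = ($raw =~ tr/\x{d}\x{a}//);

# The right-trim leaves the embedded CRLF alone.
(my $trimmed = $raw) =~ s/\s+\z//;

print "tr would remove $removed characters\n";    # prints: tr would remove 4 characters
```

For the situation in the question, where the CRLF sits inside a node value, the tr form seems to be the one that applies.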

Answer 3

Some options:
  1. Replace all occurrences of CR/LF with LF: $output =~ s/\r\n/\n/g; # instead of \r\n you might want to use \015\012
  2. Remove all trailing whitespace: $output =~ s/\s+$//g;
  3. Slurp and split:

#!/usr/bin/perl

use strict;
use warnings;

   sub main{
      createfile();
      outputfile();
   }

   main();

   # Write a sample file mixing Unix (\n) and DOS (\r\n) line endings.
   # Assumes the script name ends in .pl; otherwise $file would be the script itself.
   sub createfile{
      (my $file = $0) =~ s/\.pl$/.txt/;

      open my $fh, ">", $file or die "Cannot write $file: $!";
         binmode $fh;                             # keep \r\n byte-for-byte
         print $fh "1\n2\r\n3\n4\r\n5";
      close $fh;
   }

   sub outputfile{
      (my $filei = $0) =~ s/\.pl$/.txt/;
      (my $fileo = $0) =~ s/\.pl$/out.txt/;

      open my $fin, "<", $filei or die "Cannot read $filei: $!";
         binmode $fin;
         local $/;                                # undef record separator: slurp the file
         my $text = <$fin>;                       # store the text
         my @text = split(/\r?\n/, $text);        # split on dos or unix newlines
      close $fin;

      local $" = ", ";                            # change the list separator used in "@text"
      open my $fout, ">", $fileo or die "Cannot write $fileo: $!";
         print $fout "@text";                     # writes: 1, 2, 3, 4, 5
      close $fout;
   }
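Options 1 and 2 can be tried on an in-memory string without any files. This is only a sketch with a made-up sample; it also checks that the octal spelling of CRLF is \015\012 (CR is octal 15, LF is octal 12), which holds on ASCII-based platforms such as Linux.

```perl
use strict;
use warnings;

# Made-up sample with DOS line endings.
my $sample = "line one\r\nline two\r\n";

# "\r\n" and "\015\012" are the same two bytes on ASCII platforms.
my $same = ("\r\n" eq "\015\012") ? 1 : 0;

# Option 1: normalize CRLF to LF (line breaks are kept, just converted).
(my $unix = $sample) =~ s/\015\012/\n/g;

# Option 2: strip trailing whitespace only; the embedded CRLF survives.
(my $trimmed = $sample) =~ s/\s+$//;

print "octal form matches: $same\n";    # prints: octal form matches: 1
```

Neither option removes a CRLF from the middle of a value, so for the case described in the question they would need to be combined with an outright deletion of the characters.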

Removing CRLF (0D 0A) from a string in Perl
