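Hello, I have a script that converts an xlsx file to a tab-delimited file, removes any commas and duplicate rows, and then separates the fields with commas. (I do this to make sure users have not put any commas in a column.) Then I do some processing and convert the result back to an xlsx file. This has always worked fine. However, rather than opening and closing files all the time, I thought I would push the data into an array and convert it to xlsx at the end. Unfortunately, when I try to convert back to an xlsx file, it creates line breaks at the spaces between names. If I output to a csv file, then open that file and convert it to xlsx, it works fine.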
#!/usr/bin/perl
use strict;
use warnings;
use Spreadsheet::BasicRead;
use Excel::Writer::XLSX;
local $" = "'\n'";
open( STDERR, ">&STDOUT" );
# convert to csv
my $xlsx_WSD = ( "C:\\Temp\\testing_file.xlsx"),, 1;
my @csvtemp;
if ( -e $xlsx_WSD ) {
    my $ss = new Spreadsheet::BasicRead($xlsx_WSD) or die;
    my $col = '';
    my $row = 0;
    while ( my $data = $ss->getNextRow() ) {
        $row++;
        $col = join( "\t", @$data );
        push @csvtemp, $col . "\n" if ( $col ne "" );
    }
}
else {
    print " C:\\Temp\\testing_file.xlsx file EXISTS ...!!\n";
    print " please investigate and use the restore option if required !..\n";
    exit;
}
;
my @arraynew;
my %seen;
our $Header_row = shift (@csvtemp);
foreach (@csvtemp){
    chomp;
    $_ =~ s/,//g;
    $_ =~ s/\t/,/g;
    # print $_ . "\n" if !$seen{$_}++ ;
    push @arraynew, $_ . "\n" if !$seen{$_}++ ; # remove any dupes
}
# convert back to xlsx
my $workbook = Excel::Writer::XLSX->new("C:\\Temp\\testing_filet.xlsx");
my $worksheet = $workbook->add_worksheet();
my ( $x, $y ) = ( 0, 0 );
while (<@arraynew>) {
    my @list = split /,/;
    foreach my $c (@list) {
        $worksheet->write( $x, $y++, $c );
    }
    $x++;
    $y = 0;
}
__DATA__
Animal keeper M/F Years START DATE FRH FSM
GIRAFFE JAMES LE M 5 10/12/2007 Y
HIPPO JACKIE LEAN F 6 11/12/2007 Y
ZEBRA JAMES LEHERN M 7 12/12/2007 Y
GIRAFFE AMIE CAHORT M 5 13/12/2012 Y
GIRAFFE MICKY JAMES M 5 14/06/2007 Y
MEERKAT JOHN JONES M 9 15/12/2007 v v
LEOPPARD JIM LEE M 8 16/12/2002
Unexpected result:
GIRAFFE JAMES
LE M 5 10/12/2007 Y
"
HIPPO" JACKIE
LEAN F 6 11/12/2007 Y
"
ZEBRA" JAMES
LEHERN M 7 12/12/2007 Y
"
GIRAFFE" AMIE
CAHORT M 5 13/12/2012 Y
"
GIRAFFE" MICKY
JAMES M 5 14/06/2007 Y
"
MEERKAT" JOHN
JONES M 9 15/12/2007 v v
"
LEOPPARD" JIM
LEE M 8 16/12/2002
Answer 0 (score: 1)
Since you are running this on Windows, have you considered using Win32::OLE?
use strict;
use Win32::OLE;
# Attach to a running Excel instance, or start one that quits when done.
my $app = Win32::OLE->GetActiveObject('Excel.Application')
    || Win32::OLE->new('Excel.Application', 'Quit');
my $wb = $app->Workbooks->Open("C:/Temp/testing_file.xlsx");
my $ws = $wb->ActiveSheet;
my $max_row = $ws->UsedRange->Rows->Count;
my $max_col = $ws->UsedRange->Columns->Count;
my ($row, %already) = (1);
while ($row <= $max_row) {
    my ($col, @output) = (1);
    while ($col <= $max_col) {
        my $val = $ws->Cells($row, $col)->{Text};
        if ($val =~ /[,\t]/) {
            $val =~ tr/,//d;     # delete commas
            $val =~ tr/\t/,/;    # turn tabs into commas
            $ws->Cells($row, $col)->{Value} = $val;
        }
        $output[$col - 1] = $val;
        $col++;
    }
    # Delete the row if an identical one has already been seen.
    if ($already{join "|", @output}++) {
        $ws->Rows($row)->EntireRow->Delete;
        $max_row--;
    } else {
        $row++;
    }
}
$wb->SaveAs("C:\\temp\\testing_filet.xlsx");
Answer 1 (score: 0)
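This is a problem with end-of-line characters.
There are three conventions for marking the end of a line: \n on Unix, \r\n on Windows, and \r on classic Mac OS. It looks like your script assumes the Mac convention, while the input and output use the Windows convention.
As a result, after the input is read, every line except the first has a leading \n. As long as that is still the case when the output lines are written with \r, you end up with an output file whose lines are perfectly separated by \r\n. Obviously, it is better to make your script aware of the line-ending convention the input uses, and to make sure it uses the same method for splitting lines and for assembling the output.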
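A minimal sketch of that last suggestion, assuming the data is available as a single string. The helper names `split_lines` and `join_lines` are hypothetical and do not come from any module mentioned above. The idea is to accept any of the three conventions when splitting and to write back with one explicitly chosen terminator.

```perl
use strict;
use warnings;

# Split text into lines regardless of whether it uses \r\n (Windows),
# \n (Unix) or a bare \r (classic Mac OS) as the line terminator.
sub split_lines {
    my ($text) = @_;
    # \r\n is listed first so a Windows line ending is matched as one unit.
    my @lines = split /\r\n|\r|\n/, $text;
    return grep { length } @lines;    # drop empty lines
}

# Join lines back together using one explicitly chosen terminator.
sub join_lines {
    my ($eol, @lines) = @_;
    return join '', map { $_ . $eol } @lines;
}

# Hypothetical sample input using Windows line endings.
my $input = "GIRAFFE,JAMES LE,M\r\nHIPPO,JACKIE LEAN,F\r\n";
my @lines = split_lines($input);

# Iterate with foreach so each array element is handled as one whole line.
foreach my $line (@lines) {
    my @fields = split /,/, $line;
    print join(' | ', @fields), "\n";
}
```

The loop uses a plain `foreach` over the array. In Perl, `<@arraynew>` does not read from the array: when the angle brackets contain anything other than a filehandle or a simple scalar, the expression is treated as a glob pattern, and glob splits its pattern on whitespace. That may also be worth checking in the original script.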