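I have a batch of Excel files containing rows such as
1/13/04 21
I am trying to convert them to .csv, but that row comes out as
36537,21
This turns out to be a side effect of how Excel stores dates. Excel is supposed to store a date as the number of days since January 1, 1900. By that rule this integer is wrong: it corresponds to January 12, 2000, not January 13, 2004 (the date that 1/13/04 represents).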
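The arithmetic in the paragraph above can be checked with core Perl. This is a minimal sketch, not part of the original script: serial_to_date is a hypothetical helper, and it assumes the 1900 date system, in which serial 25569 falls on 1970-01-01.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Assumption: 1900 date system, where serial 25569 is 1970-01-01.
# Subtracting 25569 gives days since the Unix epoch. This is only meant
# for serials of 61 and above, i.e. after Excel's fictitious 1900-02-29.
sub serial_to_date {
    my ($serial) = @_;
    my $epoch_seconds = ($serial - 25569) * 86400;
    return strftime('%Y-%m-%d', gmtime($epoch_seconds));
}

print serial_to_date(36537), "\n";    # 2000-01-12
```

So under the 1900 rule the value in the .csv lands about four years before the date shown in Excel.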
Here is a rough sketch of the code:
my $xlsparser = Spreadsheet::ParseExcel->new();
my $xlsbook = $xlsparser->Parse('xls_test.xls');
my $xls = $xlsbook->{Worksheet}[0];
my $csv = '';
# then a loop over rows and columns with...
my $cell = $xls->get_cell( $row, $col );
$cellcon = $cell->unformatted();
$csv .= $cellcon;
In case my description is unclear or you cannot reproduce the problem, here is a minimal data set and script that reproduce it for me:
https://dl.dropboxusercontent.com/u/58760/softwareGrr/xls_example.pl https://dl.dropboxusercontent.com/u/58760/softwareGrr/junk.xls
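Answer 0 (score: 1)
If you want to convert an Excel date serial value such as 36537,21 into a date/time in Perl, you can write your own conversion functions. The functions are below: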
sub date2excelvalue {
    my ($day1, $month, $year, $hour, $min, $sec) = @_;
    # Cumulative days before each month in a non-leap year.
    my @cumul_d_in_m = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
    my $doy = $cumul_d_in_m[$month - 1] + $day1;
    # Add the full years before $year, plus the leap day of $year itself.
    for my $y (1900 .. $year) {
        # 1900 counts as a leap year here because Excel treats it as one.
        my $is_leap = (($y % 4 == 0) && ($y % 100 != 0)) || ($y % 400 == 0) || ($y == 1900);
        if ($y == $year) {
            # In January or February the leap day has not happened yet.
            $doy++ if $is_leap && $month > 2;
            last;
        }
        $doy += 365 + ($is_leap ? 1 : 0);    # full year
    }
    # Time of day as a fraction of 86400 seconds.
    my $excel_decimaltimepart = 0;
    my $total_seconds_from_time = $hour * 60 * 60 + $min * 60 + $sec;
    if ($total_seconds_from_time == 86400) {
        $doy++;    # 24:00:00 is simply the next day
    } else {
        # Fixed-point formatting avoids exponent notation for small fractions.
        $excel_decimaltimepart = sprintf('%.10f', $total_seconds_from_time / 86400);
        $excel_decimaltimepart =~ s/^0\.//;
    }
    return "$doy.$excel_decimaltimepart";
}
sub excelvalue2date {
    my ($excelvalueintegerpart, $excelvaluedecimalpart) = @_;
    # Cumulative days before each month, for normal and leap years.
    my @cumul_d_in_m      = (0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334, 365);
    my @cumul_d_in_m_leap = (0, 31, 60, 91, 121, 152, 182, 213, 244, 274, 305, 335, 366);
    my ($day, $month, $year);
    my $day1 = 0;    # days in all full years before $y
    for my $y (1900 .. 3000) {
        # 1900 counts as a leap year here because Excel treats it as one.
        my $leap_year = ((($y % 4 == 0) && ($y % 100 != 0)) || ($y % 400 == 0) || ($y == 1900)) ? 1 : 0;
        my @cumul_d_in_m_selected = $leap_year ? @cumul_d_in_m_leap : @cumul_d_in_m;
        if ($day1 + 365 + $leap_year >= $excelvalueintegerpart) {
            # The serial falls in year $y; ">=" keeps December 31 in this year.
            $year = $y;
            my $days_in_year = $excelvalueintegerpart - $day1;
            for my $i (0 .. 11) {
                if (   $days_in_year > $cumul_d_in_m_selected[$i]
                    && $days_in_year <= $cumul_d_in_m_selected[$i + 1]) {
                    $month = $i + 1;
                    $day   = $days_in_year - $cumul_d_in_m_selected[$i];
                    last;
                }
            }
            last;
        }
        $day1 += 365 + $leap_year;    # skip a full year
    }
    # Time of day: fraction of 86400 seconds, rounded to the nearest second.
    my $sec  = int("0.$excelvaluedecimalpart" * 86400 + 0.5);
    my $hour = int($sec / (60 * 60));
    $sec -= $hour * 60 * 60;
    my $min = int($sec / 60);
    $sec -= $min * 60;
    return ($day, $month, $year, $hour, $min, $sec);
}
my $excelvariable = date2excelvalue(1, 3, 2018, 14, 14, 30);
print "Excel variable: $excelvariable\n";
# Split the serial into its integer part and the digits after the decimal point.
my ($integerpart, $decimalwithoutzero) = $excelvariable =~ /^(\d+)\.(\d+)$/;
my ($day1, $month, $year, $hour, $min, $sec) = excelvalue2date($integerpart, $decimalwithoutzero);
print "Excel Date from value: $day1, $month, $year, $hour, $min, $sec\n";
Enjoy!
Answer 1 (score: 0)
The problematic line is
$cellcon = $cell->unformatted();
Unless someone can offer a better explanation, I will treat this as a bug. The line I replaced it with is
$cellcon = $cell->Value;
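One possible explanation, offered as an assumption that nothing in this thread confirms: Excel workbooks can use a 1904-based date system instead of the 1900-based one, and serials in the two systems differ by 1462 days. The sketch below (date_to_serial_1900 is a hypothetical helper, core Perl only) computes the 1900-system serial for 13 January 2004 and compares it with the value that unformatted() returned.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# Serial of a calendar date in the 1900 date system, using the fact that
# serial 25569 is 1970-01-01. Intended for dates after February 1900.
sub date_to_serial_1900 {
    my ($year, $month, $day) = @_;
    my $epoch_seconds = timegm(0, 0, 0, $day, $month - 1, $year);
    return $epoch_seconds / 86400 + 25569;
}

my $expected = date_to_serial_1900(2004, 1, 13);    # what 1/13/04 should be
my $observed = 36537;                               # what ended up in the .csv
printf "expected %d, observed %d, difference %d days\n",
    $expected, $observed, $expected - $observed;
# expected 37999, observed 36537, difference 1462 days
```

The difference is exactly the offset between the two date systems, so the raw number may be correct for a workbook saved with the 1904 setting, with Value applying the workbook's date system when it formats the cell. If that is what is happening here, adding 1462 to the unformatted serial would be an alternative to switching to Value; whether junk.xls actually has the 1904 flag set would need to be checked.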