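I am trying to open a Word docx file from Perl and save it as HTML. I am on Win7 64-bit with Office 15 (Office 365 subscription). I have read some examples and tried both Strawberry and ActiveState Perl, but I always get errors: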
perl -MWin32::OLE -e "$wd = Win32::OLE->GetObject('1994.62_01_fnd_en.docx'); print Win32::OLE->LastError"
Win32::OLE(0.1712) error 0x80004005: "Unspecified error"
perl -e "use Win32::OLE::Const('.*Word.*')"
No type library matching ".*Word.*" found at -e line 1.
Win32::OLE(0.1712): GetOleTypeLibObject() Not a Win32::OLE::TypeLib object at C:/Perl64/lib/Win32/OLE/Const.pm line 49.
Answer 0 (score: 1)
Although GetObject did not work for me, Win32::OLE->new('Word.Application') works fine, and I was able to get the job done with a script like this:
use Win32::OLE; # http://search.cpan.org/~jdb/Win32-OLE-0.1712/lib/Win32/OLE.pm
use Win32::OLE::Variant; # http://search.cpan.org/~jdb/Win32-OLE-0.1712/lib/Win32/OLE/Variant.pm
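# Call the constructor by its full class name: the bare word Variant is also an exported
# function, so Variant->new(...) is ambiguous. 1 and 0 give explicit VT_BOOL true/false values.
use constant true => Win32::OLE::Variant->new(VT_BOOL, 1);
use constant false => Win32::OLE::Variant->new(VT_BOOL, 0);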
use Cwd;
# use Path::Abstract qw(path); # http://search.cpan.org/~rokr/Path-Abstract-0.096/lib/Path/Abstract.pm#$path->extension
use constant MAX => 1024000; # max file size to open
# https://msdn.microsoft.com/en-us/library/office/ff839952.aspx
use constant wdFormatUnicodeText => 7;
use constant wdFormatFilteredHTML => 10;
# use Win32::OLE::Const '.*Microsoft Word'; # http://search.cpan.org/~jdb/Win32-OLE-0.1712/lib/Win32/OLE/Const.pm
# No type library matching ".*Word" found at -e line 1.
# Win32::OLE(0.1712): GetOleTypeLibObject() Not a Win32::OLE::TypeLib object at C:/Perl64/lib/Win32/OLE/Const.pm line 49.
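# Start Word. The second argument asks Win32::OLE to call Quit when $w is destroyed,
# so WINWORD.EXE is not left running in the background.
my $w = Win32::OLE->new('Word.Application', 'Quit')
    or die 'Cannot start Word: ' . Win32::OLE->LastError;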
# https://msdn.microsoft.com/en-us/library/aa171814(v=office.11).aspx
$w->ChangeFileOpenDirectory(cwd);
for my $doc (<doc/*>) {
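next if -s $doc > MAX; # skip files larger than MAX bytes
# The /g substitution rewrites every standalone "doc" or "docx" in the path, including the
# folder name, so doc/foo.docx becomes html/foo.html and txt/foo.txt.
my $html = $doc; $html =~ s{\bdocx?\b}{html}g;
my $txt = $doc; $txt =~ s{\bdocx?\b}{txt}g;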
# https://msdn.microsoft.com/EN-US/library/office/ff835182.aspx
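my $d = $w->Documents->Open($doc, {ConfirmConversions => false, ReadOnly => true, OpenAndRepair => false, AddToRecentFiles => false, Visible => false});
# Open returns undef on failure; report the OLE error and move on to the next file.
unless ($d) { warn "Cannot open $doc: " . Win32::OLE->LastError . "\n"; next; }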
# https://msdn.microsoft.com/en-us/library/office/ff836084.aspx
$d->SaveAs2({FileName => $html, FileFormat => wdFormatFilteredHTML});
$d->SaveAs2({FileName => $txt, FileFormat => wdFormatUnicodeText});
# https://msdn.microsoft.com/EN-US/library/office/ff196343.aspx
$d->Close;
last; # stops after the first file (handy while testing); remove this line to convert everything in doc/
}
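One note on the output names: if you would rather keep each converted file next to its source and change only the extension, a minimal sketch with the core File::Basename module could look like the following. The helper name swap_extension and the sample paths are made up for illustration; this is not part of the script above, and it only builds the file names without touching Word.

```perl
use strict;
use warnings;
use File::Basename qw(fileparse);

# Hypothetical helper: replace a trailing .doc/.docx (any case) with a new
# extension and leave the directory part of the path untouched.
# Returns undef (in scalar context) when the path is not a Word file.
sub swap_extension {
    my ($path, $new_ext) = @_;
    # fileparse anchors the suffix pattern at the end of the file name.
    my ($name, $dir, $suffix) = fileparse($path, qr/\.docx?/i);
    return unless length $suffix;
    return $dir . $name . '.' . $new_ext;
}

# Sample paths are assumptions, used only to show the behaviour.
for my $doc ('doc/report.docx', 'doc/old.DOC', 'doc/notes.txt') {
    my $html = swap_extension($doc, 'html');
    print defined $html ? "$doc -> $html\n" : "$doc skipped\n";
}
```

Inside the loop of the script you could then write something like my $html = swap_extension($doc, 'html') or next; which also skips anything in doc/ that is not a Word file.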