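I'm on Mac OS and I'm looking for the most elegant solution to the following problem. Since this is purely text processing, I figure Perl would be the best choice?
The file is structured like this: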
Some top text
Some more top text
Some top styles text
<h1>Topic 1 text</h1>
Some text that is applicable to topic 1 with formatting...
<h1>Topic 2 title</h1>
Some text applicable to topic 2...
I want to write one file per topic, each containing the top text and styles. So the input is data.html and the output is topic1.html, topic2.html, ...
Answer 0 (score: 2)
Assuming your file really is that simple and contains no other h1 tags, this should probably work:
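use strict;
use warnings;
use open qw(:std :encoding(UTF-8));

# Slurp the whole input file into one string.
open my $input, '<', 'data.html' or die "Cannot open data.html: $!";
my $content = do { local $/; <$input> };
close $input;

# Splitting on both <h1> and </h1> yields:
# top text, title 1, body 1, title 2, body 2, ...
my @parts = split /<\/?h1>/, $content;
my $top_text_and_styles = shift @parts;

my $count = 0;
while (my ($topic, $body) = splice @parts, 0, 2) {
    $count += 1;
    # split discarded the tags, so put them back around the title.
    my $topic_content = join '', $top_text_and_styles, "<h1>$topic</h1>", $body // '';
    my $output_name = "topic${count}.html";
    open my $output, '>', $output_name or die "Cannot open $output_name: $!";
    print {$output} $topic_content;
    close $output or die "Cannot close $output_name: $!";
}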
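
If the headings might carry attributes (for example <h1 class="...">), one possible variant is to split on a lookahead instead, so the tags stay in the text and nothing has to be put back. This is only a minimal sketch, assuming there is always some top text before the first <h1>; split_topics is a hypothetical helper name and the sample string is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the answer above): return one document
# per <h1> section, each prefixed with the shared top text.
# Assumption: the input has some top text before the first <h1>.
sub split_topics {
    my ($content) = @_;
    # The lookahead matches the empty string right before each "<h1",
    # so split cuts there without consuming the tag. /i also accepts <H1>.
    my ($top, @sections) = split /(?=<h1\b)/i, $content;
    return map { $top . $_ } @sections;
}

# Made-up sample input mirroring the structure in the question.
my $sample = join "\n",
    'Some top text',
    '<h1>Topic 1 text</h1>',
    'Body 1',
    '<h1 class="x">Topic 2 title</h1>',
    'Body 2',
    '';

my @docs = split_topics($sample);
print scalar(@docs), " topics\n";
print $docs[1];
```

Each element of the returned list could then be written to topic1.html, topic2.html, ... with the same open/print/close steps as in the loop above. For anything less regular than this (nested markup, h1 inside comments or scripts), a real HTML parser would be the safer choice.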