Using RegEx, how can I remove the table of contents from this text?

Asked: 2014-03-20 17:09:23

Tags: regex parsing language-agnostic epub

I have the following text:

<?xml version='1.0' encoding='utf-8'?>
<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.1//EN' 'http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd'>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="generator" content="HTML Tidy for Linux (vers 7 December 2008), see www.w3.org"/>
<meta name="generator" content="HTML-Kit Tools HTML Tidy plugin"/>
<title>ADVENTURES OF TOM SAWYER, By Twain, Complete</title>




<link href="@public@vhost@g@gutenberg@html@files@74@74-h@images@bookcover.jpg" rel="coverpage"/><link href="0.css" type="text/css" rel="stylesheet"/>
<link href="1.css" type="text/css" rel="stylesheet"/>
<link href="pgepub.css" type="text/css" rel="stylesheet"/>
<meta content="EpubMaker 0.3.20a7 by Marcello Perathoner &lt;webmaster@gutenberg.org&gt;" name="generator"/>
</head>
<body>
<p><br/>
<br/></p>
<div xml:space="preserve" class="pgmonospaced pgheader">The Project Gutenberg EBook of The Adventures of Tom Sawyer, Complete<br/>by Mark Twain (Samuel Clemens)<br/><br/>This eBook is for the use of anyone anywhere at no cost and with<br/>almost no restrictions whatsoever.  You may copy it, give it away or<br/>re-use it under the terms of the Project Gutenberg License included<br/>with this eBook or online at www.gutenberg.net<br/><br/><br/>Title: The Adventures of Tom Sawyer, Complete<br/><br/>Author: Mark Twain (Samuel Clemens)<br/><br/>Release Date: August 20, 2006 [EBook #74]<br/>Last updated: October 20, 2012<br/><br/>Language: English<br/><br/><br/>*** START OF THIS PROJECT GUTENBERG EBOOK TOM SAWYER ***<br/><br/><br/><br/><br/>Produced by David Widger<br/><br/><br/></div>
<p><br/></p>
<div class="mynote c1"><a>LINK TO THE ORIGINAL HTML FILE: This eBook has been reformatted for better appearance in Mobile Viewers such as Kindles and others. The original format, which the editor believes has a more attractive appearance for laptops and other computers, may be viewed by clicking on this box.</a></div>
<p><br/></p>
<div class="fig c2"><br/></div>
<p><br/>
<br/>
<br/></p>
<div class="fig c3">spine.jpg (33K)<br/></div>
<p><br/>
<br/>
<br/></p>
<h1 id="pgepubid00000">THE ADVENTURES OF TOM SAWYER</h1>
<p><br/>
<br/></p>
<h2>BY MARK TWAIN</h2>
<h3 id="pgepubid00001">(Samuel Langhorne Clemens)</h3>
<p><br/>
<br/>
<a id="frontispiece"/><br/></p>
<div class="fig c2">frontispiece.jpg (259K)<br/></div>
<p><br/>
<br/>
<br/></p>
<div class="fig c2">titlepage.jpg (72K)<br/></div>
<p><br/>
<br/>
<br/></p>
<div class="fig c2">dedication.jpg (10K)<br/></div>
<p><br/>
<br/>
<br/>
<br/></p>
<h2 id="pgepubid00002">CONTENTS</h2>
<p><a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-0.htm.html#c1" class="pginternal">CHAPTER I.</a><br/>
Y-o-u-u Tom-Aunt Polly Decides Upon her Duty<br/>
—Tom Practices Music—The Challenge—A Private Entrance<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#c2" class="pginternal">CHAPTER II.</a><br/>
Strong Temptations—Strategic Movements<br/>
—The Innocents Beguiled<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#c3" class="pginternal">CHAPTER III.</a><br/>
Tom as a General—Triumph and Reward<br/>
—Dismal Felicity—Commission and Omission<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#c4" class="pginternal">CHAPTER IV.</a><br/>
Mental Acrobatics—Attending Sunday—School<br/>
—The Superintendent—"Showing off"—Tom Lionized<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#c5" class="pginternal">CHAPTER V.</a><br/>
A Useful Minister—In Church—The Climax<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#c6" class="pginternal">CHAPTER VI.</a><br/>
Self-Examination—Dentistry—The Midnight Charm<br/>
—Witches and Devils—Cautious Approaches—Happy Hours<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#c7" class="pginternal">CHAPTER VII.</a><br/>
A Treaty Entered Into—Early Lessons—A Mistake Made<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#c8" class="pginternal">CHAPTER VIII.</a><br/>
Tom Decides on his Course—Old Scenes Re-enacted<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#c9" class="pginternal">CHAPTER IX.</a><br/>
A Solemn Situation—Grave Subjects Introduced<br/>
—Injun Joe Explains<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#c10" class="pginternal">CHAPTER X.</a><br/>
The Solemn Oath—Terror Brings Repentance<br/>
—Mental Punishment<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#c11" class="pginternal">CHAPTER XI.</a><br/>
Muff Potter Comes Himself—Tom's Conscience at Work<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#c12" class="pginternal">CHAPTER XII.</a><br/>
Tom Shows his Generosity—Aunt Polly Weakens<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#c13" class="pginternal">CHAPTER XIII.</a><br/>
The Young Pirates—Going to the Rendezvous<br/>
—The Camp—Fire Talk<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#c14" class="pginternal">CHAPTER XIV.</a><br/>
Camp-Life—A Sensation—Tom Steals Away from Camp<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#c15" class="pginternal">CHAPTER XV.</a><br/>
Tom Reconnoiters—Learns the Situation—Reports at Camp<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#c16" class="pginternal">CHAPTER XVI.</a><br/>
A Day's Amusements—Tom Reveals a Secret—The Pirates<br/>
take a Lesson —A Night Surprise—An Indian War<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#c17" class="pginternal">CHAPTER XVII.</a><br/>
Memories of the Lost Heroes—The Point in Tom's Secret<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#c18" class="pginternal">CHAPTER XVIII.</a><br/>
Tom's Feelings Investigated—Wonderful Dream<br/>
—Becky Thatcher Overshadowed<br/>
—Tom Becomes Jealous—Black Revenge<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-6.htm.html#c19" class="pginternal">CHAPTER XIX.</a><br/>
Tom Tells the Truth<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-6.htm.html#c20" class="pginternal">CHAPTER XX.</a><br/>
Becky in a Dilemma<br/>
—Tom's Nobility Asserts Itself<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-6.htm.html#c21" class="pginternal">CHAPTER XXI.</a><br/>
Youthful Eloquence—Compositions by the<br/>
Young Ladies—A Lengthy Vision<br/>
—The Boy's Vengeance Satisfied<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-6.htm.html#c22" class="pginternal">CHAPTER XXII.</a><br/>
Tom's Confidence Betrayed<br/>
—Expects Signal Punishment<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-6.htm.html#c23" class="pginternal">CHAPTER XXIII.</a> Old Muff's Friends—Muff Potter in Court<br/>
—Muff Potter Saved<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-7.htm.html#c24" class="pginternal">CHAPTER XXIV.</a> Tom as the Village Hero—Days of Splendor<br/>
and Nights of Horror—Pursuit of Injun Joe<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-7.htm.html#c25" class="pginternal">CHAPTER XXV.</a> About Kings and Diamonds—Search for the Treasure<br/>
—Dead People and Ghosts<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-7.htm.html#c26" class="pginternal">CHAPTER XXVI.</a> The Haunted House—Sleepy Ghosts<br/>
—A Box of Gold—Bitter Luck<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-7.htm.html#c27" class="pginternal">CHAPTER XXVII.</a> Doubts to be Settled—The Young Detectives<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-8.htm.html#c28" class="pginternal">CHAPTER XXVIII.</a><br/>
An Attempt at No. Two—Huck Mounts Guard<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-8.htm.html#c29" class="pginternal">CHAPTER XXIX.</a><br/>
The Pic-nic—Huck on Injun Joe's Track<br/>
—The "Revenge" Job—Aid for the Widow<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-8.htm.html#c30" class="pginternal">CHAPTER XXX.</a><br/>
The Welchman Reports—Huck Under Fire—The Story Circulated<br/>
—A New Sensation—Hope Giving Way to Despair<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#c31" class="pginternal">CHAPTER XXXI.</a><br/>
An Exploring Expedition—Trouble Commences<br/>
—Lost in the Cave—Total Darkness—Found but not Saved<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#c32" class="pginternal">CHAPTER XXXII.</a><br/>
Tom tells the Story of their Escape<br/>
—Tom's Enemy in Safe Quarters<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#c33" class="pginternal">CHAPTER XXXIII.</a><br/>
The Fate of Injun Joe—Huck and Tom Compare Notes<br/>
—An Expedition to the Cave—Protection Against Ghosts<br/>
—"An Awful Snug Place"—A Reception at the Widow Douglas's<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#c34" class="pginternal">CHAPTER XXXIV.</a><br/>
Springing a Secret—Mr. Jones' Surprise a Failure<br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#c35" class="pginternal">CHAPTER XXXV.</a><br/>
A New Order of Things—Poor Huck—New Adventures Planned<br/>
<br/>
<br/>
<br/>
<br/>
<br/></p>
<h2 id="pgepubid00003">ILLUSTRATIONS</h2>
<p><a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-0.htm.html#frontispiece" class="pginternal">Tom Sawyer</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-0.htm.html#img017" class="pginternal">Tom at Home</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-0.htm.html#img018" class="pginternal">Aunt Polly Beguiled</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-0.htm.html#img019" class="pginternal">A Good Opportunity</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-0.htm.html#img023" class="pginternal">Who's Afraid</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-0.htm.html#img025" class="pginternal">Late Home</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img026" class="pginternal">Jim</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img028" class="pginternal">'Tendin' to Business</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img030" class="pginternal">Ain't that Work?</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img031" class="pginternal">Cat and Toys</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img032" class="pginternal">Amusement</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img033" class="pginternal">Becky Thatcher</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img034" class="pginternal">Paying Off</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img035" class="pginternal">After the Battle</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img036" class="pginternal">"Showing Off"</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img038" class="pginternal">Not Amiss</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img039a" class="pginternal">Mary</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img039b" class="pginternal">Tom Contemplating</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img040" class="pginternal">Dampened Ardor</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img041" class="pginternal">Youth</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img042" class="pginternal">Boyhood</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img044" class="pginternal">Using the "Barlow"</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img045" class="pginternal">The Church</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img047" class="pginternal">Necessities</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-1.htm.html#img051" class="pginternal">Tom as a Sunday-School Hero</a>    <br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img052" class="pginternal">The Prize</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img053" class="pginternal">At Church</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img054" class="pginternal">The Model Boy</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img055" class="pginternal">The Church Choir</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img057" class="pginternal">A Side Show</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img058" class="pginternal">Result of Playing in Church</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img059" class="pginternal">The Pinch-Bug</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img060" class="pginternal">Sid</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img063" class="pginternal">Dentistry</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img064" class="pginternal">Huckleberry Finn</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img067" class="pginternal">Mother Hopkins</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img069" class="pginternal">Result of Tom's Truthfulness</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img070" class="pginternal">Tom as an Artist</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img071" class="pginternal">Interrupted Courtship</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img072" class="pginternal">The Master</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-2.htm.html#img077" class="pginternal">Vain Pleading</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img078" class="pginternal">Tail Piece</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img079" class="pginternal">The Grave in the Woods</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img081" class="pginternal">Tom Meditates</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img083" class="pginternal">Robin Hood and his Foe</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img084" class="pginternal">Death of Robin Hood</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img085" class="pginternal">Midnight</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img086" class="pginternal">Tom's Mode of Egress</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img088" class="pginternal">Tom's Effort at Prayer</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img091" class="pginternal">Muff Potter Outwitted</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img092" class="pginternal">The Graveyard</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img093" class="pginternal">Forewarnings</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img098" class="pginternal">Disturbing Muff's Sleep</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img100" class="pginternal">Tom's Talk with his Aunt</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img101" class="pginternal">Muff Potter</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img102" class="pginternal">A Suspicious Incident</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-3.htm.html#img103" class="pginternal">Injun Joe's two Victims</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img106" class="pginternal">In the Coils</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img107" class="pginternal">Peter</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img108" class="pginternal">Aunt Polly seeks Information</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img110" class="pginternal">A General Good Time</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img112" class="pginternal">Demoralized</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img113" class="pginternal">Joe Harper</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img117" class="pginternal">On Board Their First Prize</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img118" class="pginternal">The Pirates Ashore</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img127" class="pginternal">Wild Life</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img123" class="pginternal">The Pirate's Bath</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img124" class="pginternal">The Pleasant Stroll</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img125" class="pginternal">The Search for the Drowned</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img127" class="pginternal">The Mysterious Writing</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img128" class="pginternal">River View</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-4.htm.html#img130" class="pginternal">What Tom Saw</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img133" class="pginternal">Tom Swims the River</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img134" class="pginternal">Taking Lessons</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img135" class="pginternal">The Pirates' Egg Market</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img139" class="pginternal">Tom Looking for Joe's Knife</a>    <br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img141" class="pginternal">The Thunder Storm</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img143" class="pginternal">Terrible Slaughter</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img144" class="pginternal">The Mourner</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img147" class="pginternal">Tom's Proudest Moment</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img148" class="pginternal">Amy Lawrence</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img150" class="pginternal">Tom tries to Remember</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-5.htm.html#img152" class="pginternal">The Hero</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-7.htm.html#img190" class="pginternal">Tom Dreams</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-7.htm.html#img191" class="pginternal">The Treasure</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img238" class="pginternal">Attacked by Natives</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img240" class="pginternal">Despair</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img242" class="pginternal">The Wedding Cake</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img245" class="pginternal">A New Terror</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img247" class="pginternal">Daylight</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img248" class="pginternal">"Turn Out" to Receive Tom and Becky</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img249" class="pginternal">The Escape from the Cave</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img251" class="pginternal">Fate of the Ragged Man</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img252" class="pginternal">The Treasures Found</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img253" class="pginternal">Caught at Last</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img254" class="pginternal">Drop after Drop</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img255" class="pginternal">Having a Good Time</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img257" class="pginternal">A Business Trip</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img261" class="pginternal">"Got it at Last!"</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-9.htm.html#img263" class="pginternal">Tail Piece</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#img264" class="pginternal">Widow Douglas</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#img266" class="pginternal">Tom Backs his Statement</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#img267" class="pginternal">Tail Piece</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#img268" class="pginternal">Huck Transformed</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#img271" class="pginternal">Comfortable Once More</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#img273" class="pginternal">High up in Society</a><br/>
<br/>
<a href="@public@vhost@g@gutenberg@html@files@74@74-h@74-h-10.htm.html#img274" class="pginternal">Contentment</a><br/>
<br/>
<br/>
<br/>
<br/>
<br/></p>
<h2 id="pgepubid00004">PREFACE</h2>
<p><br/></p>
<p>Most of the adventures recorded in this book really occurred; one or two were experiences of my own, the rest those of boys who were schoolmates of mine. Huck Finn is drawn from life; Tom Sawyer also, but not from an individual—he is a combination of the characteristics of three boys whom I knew, and therefore belongs to the composite order of architecture.</p>
<p>The odd superstitions touched upon were all prevalent among children and slaves in the West at the period of this story—that is to say, thirty or forty years ago.</p>
<p>Although my book is intended mainly for the entertainment of boys and girls, I hope it will not be shunned by men and women on that account, for part of my plan has been to try to pleasantly remind adults of what they once were themselves, and of how they felt and thought and talked, and what queer enterprises they sometimes engaged in.</p>
<p>THE AUTHOR.</p>
<p>HARTFORD, 1876.</p>
<p><br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<br/>
<a id="c1"/></p>
<h2 id="pgepubid00005">CHAPTER I</h2>
<p><br/>
<br/>
<a id="img017"/><br/>
<br/></p>
<div class="fig c2">01-017.jpg (182K)<br/></div>
<p><br/>
<br/>
<br/></p>
<p>"TOM!"</p>
<p>No answer.</p>
<p>"TOM!"</p>
<p>No answer.</p>
<p>"What's gone with that boy,  I wonder? You TOM!"</p>
<p>No answer.</p>
<p>The old lady pulled her spectacles down and looked over them about the room; then she put them up and looked out under them. She seldom or never looked <i>through</i> them for so small a thing as a boy; they were her state pair, the pride of her heart, and were built for "style," not service—she could have seen through a pair of stove-lids just as well. She looked perplexed for a moment, and then said, not fiercely, but still loud enough for the furniture to hear:</p>
<p>"Well, I lay if I get hold of you I'll—"</p>
<p>She did not finish, for by this time she was bending down and punching under the bed with the broom, and so she needed breath to punctuate the punches with. She resurrected nothing but the cat.</p>
<p>"I never did see the beat of that boy!"</p>
<p>She went to the open door and stood in it and looked out among the tomato vines and "jimpson" weeds that constituted the garden. No Tom. So she lifted up her voice at an angle calculated for distance and shouted:</p>
<p>"Y-o-u-u TOM!"</p>
<p>There was a slight noise behind her and she turned just in time to seize a small boy by the slack of his roundabout and arrest his flight.</p>
<p>"There! I might 'a' thought of that closet. What you been doing in there?"</p>
<p>"Nothing."</p>
<p>"Nothing! Look at your hands. And look at your mouth. What <i>is</i> that truck?"</p>
...

It comes from an HTML file inside the Tom Sawyer ePub.

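My end goal is to have the text start at "Chapter 1". I want to get there by stripping out all the unnecessary parts. For this question, I'm only concerned with the "CONTENTS" section, which I want to remove.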

I tried the following RegEx, but it doesn't work:

  

Pattern: [\S\s]*contents[\S\s]*<h\d

Replace with: (empty)

But this runs well past the CONTENTS section and also removes the PREFACE, which I don't want.
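The overshoot can be reproduced on a small scale. The following is a minimal sketch, assuming Python's `re` module, case-insensitive matching (the tool and flags actually used are not stated), and a short hypothetical `sample` standing in for the ePub HTML. Because `[\S\s]*` is greedy, the match extends to the last `<h\d` in the input rather than the first one after CONTENTS.

```python
import re

# Hypothetical miniature of the ePub HTML; not the real file.
sample = (
    "<h1>TITLE</h1>\n"
    "<h2>CONTENTS</h2>\n"
    "<p>CHAPTER I.</p>\n"
    "<h2>ILLUSTRATIONS</h2>\n"
    "<p>Tom at Home</p>\n"
    "<h2>PREFACE</h2>\n"
    "<p>Preface text</p>\n"
    "<h2>CHAPTER I</h2>\n"
)

# The pattern from the question. IGNORECASE is an assumption so that
# the lowercase "contents" can match the uppercase heading.
greedy = re.compile(r"[\S\s]*contents[\S\s]*<h\d", re.IGNORECASE)

# Both [\S\s]* parts are greedy: the match starts at the very beginning
# of the input and ends at the LAST "<h" followed by a digit.
remaining = greedy.sub("", sample, count=1)

print(repr(remaining))
```

Everything before the final heading is consumed, including the PREFACE block, which matches the behavior described above.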

What is the best way to remove the table of contents?

1 answer:

Answer 0: (score: 1)

^(?:.*CONTENTS)((?:(?!<h\d).*\r)+)

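Starting from the top of the document, look at each line to see whether it contains CONTENTS (you can extend this part to be more precise if you like).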

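If CONTENTS is found, then starting from that line, match each entire line (including its trailing newline) as long as the line does not begin with <h\d.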

Your result should be a capture that starts at the CONTENTS line and runs up to, but not including, the ILLUSTRATIONS line.
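As a rough illustration of how this pattern might be applied, here is a minimal Python sketch. Several things in it are assumptions rather than part of the answer: the use of Python's `re` module, the `re.MULTILINE` flag (so that `^` matches at the start of every line), the hypothetical helper name `strip_contents`, the shortened `sample` input, and the line terminator `\r?\n` in place of the bare `\r` (which only fits input whose lines end in a lone carriage return).

```python
import re

# The answer's pattern, with the line terminator adapted to \r?\n
# (an assumption; the original uses a bare \r).
TOC_PATTERN = re.compile(
    r"^(?:.*CONTENTS)((?:(?!<h\d).*\r?\n)+)",
    re.MULTILINE,  # let ^ match at the start of each line
)


def strip_contents(html: str) -> str:
    """Remove the CONTENTS heading line and every following line
    up to (but not including) the next line starting with <h + digit."""
    return TOC_PATTERN.sub("", html, count=1)


# Hypothetical, shortened stand-in for the ePub HTML.
sample = (
    '<h1 id="pgepubid00000">THE ADVENTURES OF TOM SAWYER</h1>\n'
    '<h2 id="pgepubid00002">CONTENTS</h2>\n'
    '<p><a href="#c1" class="pginternal">CHAPTER I.</a><br/>\n'
    '<br/></p>\n'
    '<h2 id="pgepubid00003">ILLUSTRATIONS</h2>\n'
    '<p><a href="#img017" class="pginternal">Tom at Home</a><br/></p>\n'
    '<h2 id="pgepubid00004">PREFACE</h2>\n'
)

result = strip_contents(sample)
print(result)
```

Here the whole match is replaced with an empty string, so the CONTENTS heading goes away together with its list, while ILLUSTRATIONS and PREFACE are left in place. The negative lookahead `(?!<h\d)` is what stops the repetition at the next heading, instead of relying on a greedy wildcard.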

Wouldn't Calibre handle this for you automatically?