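I am working on a project that requires me to extract full sentences from text files that were parsed from PDFs. The raw text files are quite messy, in the sense that the tables and the paragraphs from the PDF all end up mixed together in the output.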
Here is a snapshot of one of the text files:
Issue 15-24 | Thursday 18 June 2015 PRICES Sulphur prices YL 4 Contract
Spot Saupe fob Vancouver Q2-2015 135-145 135-145 fob Middle East* Q2-
2015 140-165 145-151 fob Qatar QSP Jun 2015 141 fob UAE OSP Jun 2015
145 fob Iran 139-145 fob Black Sea (lump-gran) Q2-2015 110-130 120-130
fob US Gulf Q2-2015 135-150 135-140 cfr Brazil Q2-2015 150-165 155-160
cfr Med (under 10 k) 128-148 fob Med (under 10 k) 110-120 cfr N Africa
(lump-gran) Q2-2015 135-155 140-155 cfr India 163-168 cfr China Q2-2015
143-163 143-163 ex-w Nantong (CNY/t) 1250-1260
“excluding Iran cfr Tampa/C Fla (l.t.) Q2-2015 132 cfr Benelux (loc
refs) Q2-2015 155-172 cpt NW Europe Q2-2015 193-214
cpt = ‘carriage paid to’ for sulphur delivered by Roadtankcar FM
Argus FMB Sulphur pated after the Chinese New Year in February, prices
eroded slightly but did not enter a free-fall. Some argue that it was
down to a structural market tightness, which is expected to provide
support to current sulphur prices and to potentially prevent prices
from falling sharply even if Chinese buyers decided to exit the market
in the next few weeks.
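What I need is a tool that can extract all complete sentences while ignoring the tables and the incomplete sentences. I am wondering whether there is an existing solution to this problem.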
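To make the requirement concrete, here is a rough sketch of the kind of filtering I have in mind. It is only a heuristic built on Python's standard `re` module, not a real solution: the function name `extract_sentences` and the two thresholds are placeholders I made up, and the rules (starts with a capital letter, ends with sentence punctuation, has enough words, is not mostly numbers) are assumptions about what counts as a "complete sentence".

```python
import re

# Hypothetical thresholds picked by eye; they would need tuning on real data.
MIN_WORDS = 6            # shorter fragments are treated as incomplete
MAX_NUMERIC_RATIO = 0.2  # table rows tend to be dominated by numbers


def extract_sentences(raw_text):
    """Return candidate full sentences from noisy PDF-extracted text."""
    # Undo hard line wraps so sentences spanning several lines are rejoined.
    text = " ".join(raw_text.split())
    # Split after ., ! or ? when followed by whitespace and a capital letter.
    candidates = re.split(r"(?<=[.!?])\s+(?=[A-Z])", text)

    sentences = []
    for cand in candidates:
        words = cand.split()
        if len(words) < MIN_WORDS:
            continue
        # Require a capitalised start and terminal punctuation.
        if not cand[0].isupper() or cand[-1] not in ".!?":
            continue
        # Reject chunks where too many tokens contain digits (likely table data).
        numeric = sum(1 for w in words if any(ch.isdigit() for ch in w))
        if numeric / len(words) > MAX_NUMERIC_RATIO:
            continue
        sentences.append(cand)
    return sentences
```

One obvious weakness: when a sentence runs straight on from table text with no punctuation in between, the whole chunk is judged as one candidate, so the sentence may be thrown away together with the table (or the table kept together with the sentence). That is why I am hoping there is a more robust tool.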
Any help would be greatly appreciated!