Question

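I've been trying to use the IBM Watson Document Conversion service with the demo PDF, but it isn't splitting the document into pieces at all. All it does is create a single answer unit, which is really long: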

"text": "Watson is an artificially intelligent computer system capable of answering questions posed in natural language,[2] developed in IBM's DeepQA project by a research team led by principal investigator David Ferrucci. Watson was named after IBM's first CEO and industrialist Thomas J. Watson.[3][4] The computer system was specifically developed to answer questions on the quiz show Jeopardy![5] In 2011, Watson competed on Jeopardy! against former winners Brad Rutter and Ken Jennings.[3][6] Watson received the first place prize of $1 million.[7] Watson had access to 200 million pages of structured and unstructured content consuming four terabytes of disk storage[8] including the full text of Wikipedia,[9] but was not connected to the Internet during the game.[10][11] For each clue, Watson's three most probable responses were displayed on the television screen. Watson consistently outperformed its human opponents on the game's signaling device, but had trouble responding to a few categories, notably those having short clues containing only a few words. In February 2013, IBM announced that Watson software system's first commercial application would be for utilization management decisions in lung cancer treatment at Memorial Sloan- Kettering Cancer Center in conjunction with health insurance company WellPoint.[12] IBM Watson's former business chief Manoj Saxena says that 90% of nurses in the field who use Watson now follow its guidance.[13]"

Thanks in advance!

Answer 1

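Unfortunately, that demo PDF isn't the best document to use: currently, answer units are split based on heading tags (h1 - h6), and the PDF doesn't contain any headings. =(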
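To illustrate why a document without headings ends up as one long answer unit, here is a minimal sketch of heading-based splitting. It is not the service's actual implementation; the class name HeadingSplitter and the function split_into_units are hypothetical and exist only for this example.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingSplitter(HTMLParser):
    """Hypothetical helper: starts a new unit at every h1-h6 tag."""

    def __init__(self):
        super().__init__()
        # The first unit holds any content that appears before the first heading.
        self.units = [{"title": "", "text": ""}]
        self.in_heading = False

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self.units.append({"title": "", "text": ""})
            self.in_heading = True

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS:
            self.in_heading = False

    def handle_data(self, data):
        # Heading text becomes the unit title; everything else is body text.
        key = "title" if self.in_heading else "text"
        self.units[-1][key] += data


def split_into_units(html):
    parser = HeadingSplitter()
    parser.feed(html)
    parser.close()
    # Drop units that ended up completely empty.
    return [u for u in parser.units if u["title"].strip() or u["text"].strip()]


no_headings = "<p>First paragraph.</p><p>Second paragraph.</p>"
with_headings = "<h1>Intro</h1><p>A</p><h2>Details</h2><p>B</p>"

print(len(split_into_units(no_headings)))    # one unit: nothing to split on
print(len(split_into_units(with_headings)))  # one unit per heading
```

With no heading tags in the input, every paragraph lands in the same unit, which matches the behavior described in the question.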

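If you set conversion_target to NORMALIZED_HTML, you'll be able to see the converted PDF as it looks before it is turned into answer units. It will contain paragraphs but no headings.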
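As a rough sketch, the configuration for that request can be built as a small JSON document. The helper build_config is hypothetical, and the set of allowed targets is an assumption based on what the service documented at the time; only conversion_target and NORMALIZED_HTML come from the answer above.

```python
import json

# Assumed list of conversion targets; check the service documentation.
VALID_TARGETS = {"ANSWER_UNITS", "NORMALIZED_HTML", "NORMALIZED_TEXT"}


def build_config(conversion_target):
    """Hypothetical helper: return the JSON config string for a conversion."""
    if conversion_target not in VALID_TARGETS:
        raise ValueError("unknown conversion target: " + conversion_target)
    return json.dumps({"conversion_target": conversion_target})


print(build_config("NORMALIZED_HTML"))
```

How the config is attached to the conversion request depends on the client you use, so that part is left out here.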

In the future, we'd also like to allow answer units to be split by paragraph, but that hasn't been released yet.

Update: We've replaced the PDF on the demo site with one that is a better example.

Watson Document Conversion service not working properly?
