Question

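I want to build a system where we can search for something from the Raspberry Pi's terminal and the Pi gives voice output.

I have solved the text-to-speech part using pico TTS. What I want to do now is go to the Wikipedia page for the search term and store the first paragraph of that page in a text file.

For example, searching for Tiger on Simple English Wikipedia should result in a text file containing:

The tiger (Panthera tigris) is a carnivorous mammal. It is the largest living member of the cat family, the Felidae. It lives in Asia, mainly India, Bhutan, China and Siberia.

I tried using this, but it does not seem to work.

Error message: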

$ pip install wikipedia
...
Command /usr/bin/python -c "import setuptools, tokenize;__file__='/tmp/pip-build-qdTIZY/wikipedia/setup.py';exec(compile(getattr(tokenize, 'open', open)(__file__).read().replace('\r\n', '\n'), __file__, 'exec'))" install --record /tmp/pip-9CPD6D-record/install-record.txt --single-version-externally-managed --compile
failed with error code 1 in /tmp/pip-build-qdTIZY/wikipedia
Storing debug log for failure in /home/pi/.pip/pip.log

Answer 1

This seems to work:

title=Tiger
n_sentences=2
# The URL is quoted so the shell does not treat & as the background operator.
curl -s "http://simple.wikipedia.org/w/api.php?action=query&prop=extracts&titles=$title&exsentences=$n_sentences&explaintext=&format=json" |
  sed 's/.*"extract":"\|"}}}}$//g'

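It correctly produces:

The tiger (Panthera tigris) is a carnivorous mammal. It is the largest living member of the cat family, the Felidae.

Also tested with title=Albert_Einstein:

Albert Einstein (14 March 1879 – 18 April 1955) was a German-born theoretical physicist who developed the general theory of relativity, one of the two pillars of modern physics (alongside quantum mechanics).\nHe received the Nobel Prize in Physics in 1921, but not for relativity.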
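
The \n in that output is a JSON escape rather than a real line break, because the sed command copies the string exactly as it appears in the response. A minimal sketch for cleaning it up before saving or speaking the text, assuming only the \n and \" escapes need handling (unescape_json_text is a hypothetical helper name, and other escapes such as \uXXXX are left untouched):

```shell
# unescape_json_text (hypothetical helper): read text on stdin and
# replace literal \n with a real newline and \" with a plain quote.
# Assumption: no other JSON escapes (e.g. \uXXXX or \\) need decoding.
unescape_json_text() {
  sed 's/\\n/\
/g; s/\\"/"/g'
}

# Example with a made-up string containing both escapes.
printf '%s\n' 'First sentence.\nHe said \"hello\".' | unescape_json_text
```

Appending `| unescape_json_text` to the curl and sed pipeline would apply the same cleanup to the real extract.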

(Note that title="Albert Einstein", title=albert_einstein, and title=albert%20einstein do not work, so you will eventually need another command to find the best-matching real simple.wikipedia article title.)
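
One possible way to find the real title is the MediaWiki API's action=opensearch module, which returns the best-matching page titles for a search string. The sketch below is an assumption about how this could be wired up, not something tested in this answer: search_url and first_title are hypothetical helper names, only spaces are percent-encoded, and the parsing assumes the usual opensearch response shape ["query",["Title", ...], ...].

```shell
# search_url (hypothetical helper): build an opensearch URL for a search term.
# Assumption: spaces are the only characters that need percent-encoding.
search_url() {
  query=$(printf '%s' "$1" | sed 's/ /%20/g')
  printf 'http://simple.wikipedia.org/w/api.php?action=opensearch&search=%s&limit=1&namespace=0&format=json\n' "$query"
}

# first_title (hypothetical helper): read an opensearch JSON response on stdin
# and print the first suggested title with spaces turned into underscores.
# Prints nothing if the response contains no suggestions.
first_title() {
  sed -n 's/^\[[^[]*\["\([^"]*\)".*/\1/p' | tr ' ' '_'
}

# Offline check with a hand-written response in the opensearch shape.
sample='["albert einstein",["Albert Einstein"],[""],["https://simple.wikipedia.org/wiki/Albert_Einstein"]]'
printf '%s\n' "$sample" | first_title
```

With network access, `title=$(curl -sL "$(search_url 'albert einstein')" | first_title)` could then feed the extract request; `-L` lets curl follow a redirect if the server issues one.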

The curl command makes an HTTP request to simple.wikipedia.org. To see the raw response, try this:

curl "http://simple.wikipedia.org/w/api.php?action=query&prop=extracts&titles=Tiger&exsentences=2&explaintext=&format=json"

The sed command then extracts the desired part of the response.
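
To see what the sed step does without touching the network, here is a small sketch that runs it on a hand-written sample shaped like the API response and writes the result to a text file, as the question asks. The page id and sentence are made up, extract_text and out_file are hypothetical names, and the two substitutions are split apart to avoid the \| alternation, which is a GNU sed extension.

```shell
# extract_text (hypothetical helper): read an API JSON response on stdin
# and print only the text of the "extract" field.
extract_text() {
  sed 's/.*"extract":"//; s/"}}}}$//'
}

# Hand-written sample in the shape of the API response
# (the page id and the sentence are illustrative, not real output).
sample='{"batchcomplete":"","query":{"pages":{"123":{"pageid":123,"ns":0,"title":"Tiger","extract":"The tiger is a big cat."}}}}'

# Store the extracted paragraph in a text file, then show it.
out_file="${TMPDIR:-/tmp}/tiger.txt"
printf '%s\n' "$sample" | extract_text > "$out_file"
cat "$out_file"
```

With the real request, piping the curl output through extract_text and redirecting it to a file in the same way gives the text file to hand to the text-to-speech step.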

Updated to accommodate curl and sed on the Raspberry Pi: changed https to http, and rewrote the sed command without -e.

REF：

MediaWiki API?
