How do I install the Python package pyrouge on Microsoft Windows?

Time: 2017-10-31 22:29:36

Tags: python windows nlp summarization

I would like to use the Python package pyrouge on Microsoft Windows. The package does not provide any instructions for installing it on Microsoft Windows. How can I do so?

1 answer:

Answer 0 (score: 5)

The following instructions were tested on Windows 7 SP1 x64 Ultimate with Python 3.5 x64 (Anaconda).

1) In cmd.exe, run:

pip install pyrouge
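To confirm that step 1 succeeded, a small Python sketch can ask the interpreter whether it can locate the package. This is not part of the original instructions, and the helper name is_installed is made up for illustration.

```python
import importlib.util


def is_installed(package_name):
    """Return True if the current interpreter can locate the package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # Run this with the same interpreter (e.g. the same Anaconda env) that pip used.
    print("pyrouge installed:", is_installed("pyrouge"))
```

If this prints False, pip most likely installed into a different Python environment than the one running the check.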

2) Download ROUGE-1.5.5. You can get it from https://github.com/andersjo/pyrouge/tree/master/tools/ROUGE-1.5.5

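3) pyrouge ships with a Python script named pyrouge_set_rouge_path (for some reason it has no file extension). You need to run it to tell pyrouge which directory contains ROUGE-1.5.5. First locate pyrouge_set_rouge_path; it is usually in the Python Scripts directory.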

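In cmd.exe, run the following command, replacing the paths to pyrouge_set_rouge_path and to the ROUGE-1.5.5 directory with the correct ones for your machine: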

python C:\Anaconda\envs\py35\Scripts\pyrouge_set_rouge_path  C:\pyrouge-master\tools\ROUGE-1.5.5
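If you are unsure where the Scripts directory is, the sketch below builds the same command from the running interpreter's configuration. It assumes pyrouge was installed into the interpreter that runs the sketch; the helper name build_set_path_command is hypothetical, and the ROUGE path is the example path from the command above.

```python
import os
import sysconfig


def build_set_path_command(rouge_dir, scripts_dir=None, python_exe="python"):
    """Build the argument list for running pyrouge_set_rouge_path.

    scripts_dir defaults to the Scripts directory of the running interpreter.
    """
    if scripts_dir is None:
        scripts_dir = sysconfig.get_path("scripts")
    script = os.path.join(scripts_dir, "pyrouge_set_rouge_path")
    return [python_exe, script, rouge_dir]


if __name__ == "__main__":
    cmd = build_set_path_command(r"C:\pyrouge-master\tools\ROUGE-1.5.5")
    print("script found:", os.path.isfile(cmd[1]))
    print("command to run:", " ".join(cmd))
```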

4) pyrouge should now be able to initialize a Rouge155 object. The following Python script should run without errors:

from pyrouge import Rouge155
r = Rouge155()

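5) If you do not have perl.exe, you need to install it, because pyrouge is only a wrapper around the original ROUGE script, which is written in Perl. You can install Strawberry Perl from http://strawberryperl.com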

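Make sure the perl.exe binary is on the Path system environment variable, for example by running where perl in cmd.exe (or which perl in a Unix-like shell).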


If perl is not found, add the directory containing perl.exe to the Path system environment variable.
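The same Path check can be done from Python, which is useful because it reflects the environment that pyrouge itself will see. This is a minimal sketch; the helper name find_on_path is made up for illustration.

```python
import shutil


def find_on_path(executable):
    """Return the full path of an executable found on PATH, or None if absent."""
    return shutil.which(executable)


if __name__ == "__main__":
    perl_path = find_on_path("perl")
    if perl_path is None:
        print("perl was not found on PATH; install it or update the Path variable.")
    else:
        print("perl found at:", perl_path)
```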

Finally, to avoid the error shown in the original answer's screenshot (not reproduced here), one approach is to copy C:\Strawberry\c\bin\*.dll into C:\Strawberry\perl\bin\
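The DLL copy can also be scripted. The sketch below assumes the default Strawberry Perl locations named above, and the helper name copy_dlls is hypothetical. Writing under C:\Strawberry may require an elevated prompt.

```python
import glob
import os
import shutil


def copy_dlls(src_dir, dst_dir):
    """Copy every *.dll file from src_dir into dst_dir; return the copied names."""
    copied = []
    for path in sorted(glob.glob(os.path.join(src_dir, "*.dll"))):
        shutil.copy2(path, dst_dir)
        copied.append(os.path.basename(path))
    return copied


if __name__ == "__main__":
    names = copy_dlls(r"C:\Strawberry\c\bin", r"C:\Strawberry\perl\bin")
    print("copied", len(names), "DLL files")
```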

6) To prevent the following error message when running pyrouge:

Cannot open exception db file for reading: C:\Anaconda\pyrouge-master\tools\ROUGE-1.5.5\data/WordNet-2.0.exc.db

delete \RELEASE-1.5.5\data\WordNet-2.0.exc.db, then run the following from cmd.exe:

cd RELEASE-1.5.5\data\
perl WordNet-2.0-Exceptions/buildExeptionDB.pl ./WordNet-2.0-Exceptions ./smart_common_words.txt ./WordNet-2.0.exc.db
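The delete-and-rebuild step can be wrapped in a short Python sketch. It reuses the exact perl command from above (including the script's own spelling, buildExeptionDB.pl); the helper name rebuild_exception_db and its dry_run flag are made up for illustration, and the data directory path is an assumption that you must adjust.

```python
import os
import subprocess


def rebuild_exception_db(data_dir, dry_run=True):
    """Delete the stale WordNet exception DB and rebuild it with the perl script.

    With dry_run=True nothing is deleted or executed; the command is only returned.
    """
    db_path = os.path.join(data_dir, "WordNet-2.0.exc.db")
    command = [
        "perl",
        "WordNet-2.0-Exceptions/buildExeptionDB.pl",
        "./WordNet-2.0-Exceptions",
        "./smart_common_words.txt",
        "./WordNet-2.0.exc.db",
    ]
    if not dry_run:
        if os.path.exists(db_path):
            os.remove(db_path)
        # Run from inside the data directory, as the relative paths require.
        subprocess.run(command, cwd=data_dir, check=True)
    return command


if __name__ == "__main__":
    # Assumed location; replace with your own ROUGE data directory.
    print(" ".join(rebuild_exception_db(r"RELEASE-1.5.5\data")))
```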

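7) Open C:\Anaconda\envs\py35\Lib\site-packages\pyrouge\Rouge155.py (or wherever pyrouge is installed), go to the function def evaluate(self, system_id=1, rouge_args=None) (line 319 at the time of writing), and add command.insert(0, 'perl ') just before self.log.info("Running ROUGE with command {}".format(" ".join(command))). (If you skip this, you will get OSError: [WinError 193] %1 is not a valid Win32 application, which is the same error message you get when some of the steps above have not been completed.)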
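The reason for this patch is that Windows cannot launch a .pl file directly as a process, so the interpreter has to be the first element of the argument list. The sketch below illustrates the idea on a plain list. It is not pyrouge's code, the helper name prepend_interpreter is hypothetical, and it uses 'perl' without the trailing space that appears in the patch above.

```python
def prepend_interpreter(command, interpreter="perl"):
    """Return a new argument list with the interpreter placed first."""
    return [interpreter] + list(command)


if __name__ == "__main__":
    # Shortened stand-in for the argument list that pyrouge builds.
    rouge_command = [r"C:\pyrouge-master\tools\ROUGE-1.5.5\ROUGE-1.5.5.pl", "-a", "-m"]
    print(prepend_interpreter(rouge_command))
```

Unlike command.insert(0, ...), this variant leaves the original list untouched, which makes it easier to log both forms.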

8) At this point pyrouge should work. Do not try to run python -m pyrouge.test; it is buggy. Instead, you can test it as follows:

some_folder:

│   rouge.py
│
├───model_summaries
│       text.A.001.txt
│
└───system_summaries
        text.001.txt

rouge.py contains:

from pyrouge import Rouge155
r = Rouge155()
r.system_dir = 'system_summaries'
r.model_dir = 'model_summaries'
r.system_filename_pattern = 'text.(\d+).txt'
r.model_filename_pattern = 'text.[A-Z].#ID#.txt'
output = r.convert_and_evaluate()
print(output)
output_dict = r.output_to_dict(output)

text.A.001.txt contains:

preprocess my summaries, then run ROUGE

text.001.txt contains:

I only want to preprocess my summaries and then run ROUGE on my own

Output when running rouge.py:

2017-10-31 21:55:37,239 [MainThread ] [INFO ] Writing summaries.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Processing summaries. Saving system files to C:\Users\Francky\AppData\Local\Temp\tmpmh72hoxa\system and model files to C:\Users\Francky\AppData\Local\Temp\tmpmh72hoxa\model.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Processing files in system_summaries.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Processing text.001.txt.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Saved processed files to C:\Users\Francky\AppData\Local\Temp\tmpmh72hoxa\system.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Processing files in model_summaries.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Processing text.A.001.txt.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Saved processed files to C:\Users\Francky\AppData\Local\Temp\tmpmh72hoxa\model.
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Written ROUGE configuration to C:\Users\Francky\AppData\Local\Temp\tmpgx71qygq\rouge_conf.xml
2017-10-31 21:55:37,249 [MainThread ] [INFO ] Running ROUGE with command perl C:\Anaconda\pyrouge-master\tools\ROUGE-1.5.5\ROUGE-1.5.5.pl -e C:\Anaconda\pyrouge-master\tools\ROUGE-1.5.5\data -c 95 -2 -1 -U -r 1000 -n 4 -w 1.2 -a -m C:\Users\Francky\AppData\Local\Temp\tmpgx71qygq\rouge_conf.xml
command: ['C:\\Anaconda\\pyrouge-master\\tools\\ROUGE-1.5.5\\ROUGE-1.5.5.pl', '-e', 'C:\\Anaconda\\pyrouge-master\\tools\\ROUGE-1.5.5\\data', '-c', '95', '-2', '-1', '-U', '-r', '1000', '-n', '4', '-w', '1.2', '-a', '-m', 'C:\\Users\\Francky\\AppData\\Local\\Temp\\tmpgx71qygq\\rouge_conf.xml']
---------------------------------------------
1 ROUGE-1 Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-1 Average_P: 0.42857 (95%-conf.int. 0.42857 - 0.42857)
1 ROUGE-1 Average_F: 0.60000 (95%-conf.int. 0.60000 - 0.60000)
---------------------------------------------
1 ROUGE-2 Average_R: 0.80000 (95%-conf.int. 0.80000 - 0.80000)
1 ROUGE-2 Average_P: 0.30769 (95%-conf.int. 0.30769 - 0.30769)
1 ROUGE-2 Average_F: 0.44444 (95%-conf.int. 0.44444 - 0.44444)
---------------------------------------------
1 ROUGE-3 Average_R: 0.50000 (95%-conf.int. 0.50000 - 0.50000)
1 ROUGE-3 Average_P: 0.16667 (95%-conf.int. 0.16667 - 0.16667)
1 ROUGE-3 Average_F: 0.25000 (95%-conf.int. 0.25000 - 0.25000)
---------------------------------------------
1 ROUGE-4 Average_R: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-4 Average_P: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
1 ROUGE-4 Average_F: 0.00000 (95%-conf.int. 0.00000 - 0.00000)
---------------------------------------------
1 ROUGE-L Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-L Average_P: 0.42857 (95%-conf.int. 0.42857 - 0.42857)
1 ROUGE-L Average_F: 0.60000 (95%-conf.int. 0.60000 - 0.60000)
---------------------------------------------
1 ROUGE-W-1.2 Average_R: 0.69883 (95%-conf.int. 0.69883 - 0.69883)
1 ROUGE-W-1.2 Average_P: 0.42857 (95%-conf.int. 0.42857 - 0.42857)
1 ROUGE-W-1.2 Average_F: 0.53131 (95%-conf.int. 0.53131 - 0.53131)
---------------------------------------------
1 ROUGE-S* Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-S* Average_P: 0.16484 (95%-conf.int. 0.16484 - 0.16484)
1 ROUGE-S* Average_F: 0.28303 (95%-conf.int. 0.28303 - 0.28303)
---------------------------------------------
1 ROUGE-SU* Average_R: 1.00000 (95%-conf.int. 1.00000 - 1.00000)
1 ROUGE-SU* Average_P: 0.19231 (95%-conf.int. 0.19231 - 0.19231)
1 ROUGE-SU* Average_F: 0.32258 (95%-conf.int. 0.32258 - 0.32258)

If you skip step 3, running the following produces an error message:

from pyrouge import Rouge155; r = Rouge155()
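As a sanity check on the installation, the ROUGE-1 to ROUGE-3 figures in the output above can be reproduced by hand from the two example summaries. The sketch below is a simplified re-implementation of clipped n-gram overlap, not the ROUGE-1.5.5 script itself; the lowercase alphanumeric tokenization and the function names are assumptions made for illustration.

```python
import re
from collections import Counter


def ngrams(text, n):
    """Count the n-grams of lowercase alphanumeric tokens in text."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(system, reference, n):
    """Return (recall, precision, f_score) based on clipped n-gram overlap."""
    sys_counts = ngrams(system, n)
    ref_counts = ngrams(reference, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((sys_counts & ref_counts).values())
    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(sys_counts.values()), 1)
    f_score = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return recall, precision, f_score


if __name__ == "__main__":
    reference = "preprocess my summaries, then run ROUGE"
    system = "I only want to preprocess my summaries and then run ROUGE on my own"
    for n in (1, 2, 3):
        r, p, f = rouge_n(system, reference, n)
        print("ROUGE-%d R=%.5f P=%.5f F=%.5f" % (n, r, p, f))
```

For ROUGE-1, all 6 reference tokens appear among the 14 system tokens, giving recall 6/6 = 1.0, precision 6/14 ≈ 0.42857 and F = 0.6, which matches the logged values. ROUGE-2 matches 4 of 5 reference bigrams against 13 system bigrams (0.8 and ≈ 0.30769).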