Question

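In the paper on fastText for supervised classification, the authors specify different numbers of hidden units by changing some parameter (it is referred to as h on pages 3 and 4; in Table 1 you see "It has 10 hidden units and we evaluate it with and without bigrams."). But after reading the documentation, there does not appear to be a "hidden unit" parameter to change. Is there any way to specify the number of hidden units? Or is this the same as specifying the -dim option?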

Answer 1

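k is the number of classes.

From section 2.1 of https://arxiv.org/pdf/1607.01759v3.pdf:

More precisely, the computational complexity is O(kh), where k is the number of classes and h the dimension of the text representation.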
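To make the O(kh) figure concrete, here is a minimal pure-Python sketch of the architecture the paper describes (word representations averaged into a text representation, then fed to a linear classifier). It is not fastText's implementation: the function names, the random weights, and the sizes (vocab_size, h, k) are assumptions chosen for illustration. Since the text representation is an average of word vectors, h has the same length as a word vector, which is the quantity the -dim option is documented as setting.

```python
import random

def text_representation(token_ids, embeddings):
    """Average the embedding rows of the given tokens into one vector of length h."""
    h = len(embeddings[0])
    hidden = [0.0] * h
    for t in token_ids:
        for j in range(h):
            hidden[j] += embeddings[t][j]
    return [value / len(token_ids) for value in hidden]

def class_scores(hidden, output_weights):
    """Dense output layer: one dot product of length h per class, i.e. k * h multiplications."""
    return [sum(w * x for w, x in zip(row, hidden)) for row in output_weights]

random.seed(0)
vocab_size, h, k = 50, 10, 4  # illustrative sizes, not fastText defaults

# Hypothetical randomly initialised parameters standing in for trained ones.
embeddings = [[random.uniform(-1, 1) for _ in range(h)] for _ in range(vocab_size)]
output_weights = [[random.uniform(-1, 1) for _ in range(h)] for _ in range(k)]

hidden = text_representation([3, 17, 42], embeddings)
scores = class_scores(hidden, output_weights)
print(len(hidden), len(scores))
```

The hidden vector has h entries and the score vector has k entries, so the dense output layer is a k-by-h matrix, which is where the O(kh) cost comes from.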

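When predicting classes in text classification, from the docs:

The argument k is optional, and equal to 1 by default. In order to obtain the k most likely labels for a piece of text, use: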

$ ./fasttext predict model.bin test.txt k

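When training the model, the number of classes is specified implicitly in the training data, through the __label__* tags used for supervised training.

From the example tutorial: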

$ wget https://s3-us-west-1.amazonaws.com/fasttext-vectors/cooking.stackexchange.tar.gz && tar xvzf cooking.stackexchange.tar.gz
--2017-05-23 09:03:26--  https://s3-us-west-1.amazonaws.com/fasttext-vectors/cooking.stackexchange.tar.gz
Resolving s3-us-west-1.amazonaws.com... 54.231.236.45
Connecting to s3-us-west-1.amazonaws.com|54.231.236.45|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 457609 (447K) [application/x-gzip]
Saving to: ‘cooking.stackexchange.tar.gz.1’

cooking.stackexchange.tar.gz.1      100%[================================================================>] 446.88K   385KB/s    in 1.2s    

2017-05-23 09:03:28 (385 KB/s) - ‘cooking.stackexchange.tar.gz.1’ saved [457609/457609]

x cooking.stackexchange.id
x cooking.stackexchange.txt
x readme.txt


$ cat readme.txt 
The data in this archive is derived from the user-contributed content on the
Cooking Stack Exchange website (https://cooking.stackexchange.com/), used under
CC-BY-SA 3.0 (http://creativecommons.org/licenses/by-sa/3.0/).

The original data dump can be downloaded from:
https://archive.org/download/stackexchange/cooking.stackexchange.com.7z
and details about the dump obtained from:
https://archive.org/details/stackexchange

We distribute two files, under CC-BY-SA 3.0:

 - cooking.stackexchange.txt, which contains all question titles and
   their associated tags (one question per line, tags are prefixed by
   the string "__label__") ;

 - cooking.stackexchange.id, which contains the corresponding row IDs,
   from the original data dump.
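Because the set of classes is implied by the training file rather than passed as an option, one way to see what k would be is to count the distinct __label__ tokens. The sketch below assumes the format described in the readme (one question per line, tags prefixed by "__label__"); the helper name collect_labels and the sample lines are made up for illustration and are not taken from the actual dataset.

```python
def collect_labels(lines, prefix="__label__"):
    """Return the set of distinct label tokens found in fastText-style training lines."""
    labels = set()
    for line in lines:
        for token in line.split():
            if token.startswith(prefix):
                labels.add(token)
    return labels

# Hypothetical lines in the same format as cooking.stackexchange.txt.
sample = [
    "__label__baking __label__bread Why did my loaf collapse?",
    "__label__bread How long should dough rest?",
    "__label__sauce __label__cheese How do I thicken a cheese sauce?",
]

labels = collect_labels(sample)
print(len(labels), sorted(labels))
```

To run it on the real data, the same function could be given an open file handle, for example `collect_labels(open("cooking.stackexchange.txt", encoding="utf-8"))`.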
