I tried to train the OpenNMT example on a Mac using only the CPU. The steps were as follows:
Environment: Python 3.5, PyTorch 0.1.10.1
Step 1
To preprocess the data and shrink src and tgt down to just the first 100 sentences, I inserted the following lines after line 133 of preprocess.py:
shrink = True
if shrink:
    src = src[0:100]
    tgt = tgt[0:100]
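For reference, the truncation above can be sketched on plain parallel lists. This is a minimal standalone sketch, assuming src and tgt are the parallel sentence lists built earlier in preprocess.py; the helper shrink_parallel is hypothetical, not part of OpenNMT:

```python
def shrink_parallel(src, tgt, n=100):
    """Keep only the first n sentence pairs.

    Slicing is safe even if the corpus holds fewer than n sentences,
    and truncating both lists together keeps src/tgt aligned.
    """
    return src[:n], tgt[:n]

# Stand-ins for the real tokenized corpora (hypothetical data).
src = ["src sentence %d" % i for i in range(250)]
tgt = ["tgt sentence %d" % i for i in range(250)]

src, tgt = shrink_parallel(src, tgt)
print(len(src), len(tgt))  # 100 100
```

The key point is to truncate src and tgt by the same amount, otherwise the sentence pairs fall out of alignment.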
Then I ran:
python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/demo
Step 2
Then I started training with: python train.py -data data/demo.train.pt -save_model demo_model
It ran for a while, then stalled briefly before this error appeared:
(dlnd-tf-lab) ->python train.py -data data/demo.train.pt -save_model demo_model
Namespace(batch_size=64, brnn=False, brnn_merge='concat', curriculum=False, data='data/demo.train.pt', dropout=0.3, epochs=13, extra_shuffle=False, gpus=[], input_feed=1, layers=2, learning_rate=1.0, learning_rate_decay=0.5, log_interval=50, max_generator_batches=32, max_grad_norm=5, optim='sgd', param_init=0.1, pre_word_vecs_dec=None, pre_word_vecs_enc=None, rnn_size=500, save_model='demo_model', start_decay_at=8, start_epoch=1, train_from='', train_from_state_dict='', word_vec_size=500)
Loading data from 'data/demo.train.pt'
* vocabulary size. source = 24999; target = 35820
* number of training sentences. 100
* maximum batch size. 64
Building model...
* number of parameters: 58121320
NMTModel (
(encoder): Encoder (
(word_lut): Embedding(24999, 500, padding_idx=0)
(rnn): LSTM(500, 500, num_layers=2, dropout=0.3)
)
(decoder): Decoder (
(word_lut): Embedding(35820, 500, padding_idx=0)
(rnn): StackedLSTM (
(dropout): Dropout (p = 0.3)
(layers): ModuleList (
(0): LSTMCell(1000, 500)
(1): LSTMCell(500, 500)
)
)
(attn): GlobalAttention (
(linear_in): Linear (500 -> 500)
(sm): Softmax ()
(linear_out): Linear (1000 -> 500)
(tanh): Tanh ()
)
(dropout): Dropout (p = 0.3)
)
(generator): Sequential (
(0): Linear (500 -> 35820)
(1): LogSoftmax ()
)
)
Train perplexity: 29508.9
Train accuracy: 0.0216306
Validation perplexity: 4.50917e+08
Validation accuracy: 3.57853
Train perplexity: 1.07012e+07
Train accuracy: 0.06198
Validation perplexity: 103639
Validation accuracy: 0.944334
Train perplexity: 458795
Train accuracy: 0.031198
Validation perplexity: 43578.2
Validation accuracy: 3.42942
Train perplexity: 144931
Train accuracy: 0.0432612
Validation perplexity: 78366.8
Validation accuracy: 2.33598
Decaying learning rate to 0.5
Train perplexity: 58696.8
Train accuracy: 0.0278702
Validation perplexity: 14045.8
Validation accuracy: 3.67793
Decaying learning rate to 0.25
Train perplexity: 10045.1
Train accuracy: 0.0457571
Validation perplexity: 26435.6
Validation accuracy: 4.87078
Decaying learning rate to 0.125
Train perplexity: 10301.5
Train accuracy: 0.0490849
Validation perplexity: 24243.5
Validation accuracy: 3.62823
Decaying learning rate to 0.0625
Train perplexity: 7927.77
Train accuracy: 0.062812
Validation perplexity: 7180.49
Validation accuracy: 5.31809
Decaying learning rate to 0.03125
Train perplexity: 4573.5
Train accuracy: 0.047421
Validation perplexity: 6545.51
Validation accuracy: 5.6163
Decaying learning rate to 0.015625
Train perplexity: 3995.7
Train accuracy: 0.0549085
Validation perplexity: 6316.25
Validation accuracy: 5.4175
Decaying learning rate to 0.0078125
Train perplexity: 3715.81
Train accuracy: 0.0540765
Validation perplexity: 6197.91
Validation accuracy: 5.86481
Decaying learning rate to 0.00390625
Train perplexity: 3672.46
Train accuracy: 0.0540765
Validation perplexity: 6144.18
Validation accuracy: 6.01392
Decaying learning rate to 0.00195312
Train perplexity: 3689.7
Train accuracy: 0.0528286
Validation perplexity: 6113.55
Validation accuracy: 6.31213
Decaying learning rate to 0.000976562
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0x118b19b70>
Traceback (most recent call last):
File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable
Could you tell me how to fix this? Thanks!