Training OpenNMT with PyTorch and `TypeError: 'NoneType' object is not callable`

Time: 2017-03-30 07:15:37

Tags: pytorch

I tried to train the OpenNMT example on a Mac using the CPU, with the following steps:

Environment: Python 3.5, PyTorch 0.1.10.1

Step 1

Preprocess the data and shrink src and tgt to only the first 100 sentences by inserting the following lines after line 133 of preprocess.py:

shrink = True
if shrink:
    src = src[0:100]
    tgt = tgt[0:100]
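The truncation in Step 1 can be sketched on its own, outside preprocess.py. This is only an illustration: the helper name `shrink_parallel` and the toy sentence lists are hypothetical and not part of OpenNMT. It assumes src and tgt are parallel lists, so both must be cut at the same index to keep sentence pairs aligned.

```python
def shrink_parallel(src, tgt, limit=100):
    """Return the first `limit` aligned sentence pairs.

    Hypothetical helper for illustration; not an OpenNMT function.
    """
    if len(src) != len(tgt):
        # A parallel corpus needs one target sentence per source sentence.
        raise ValueError("src and tgt must have the same length")
    # Slicing both lists with the same bound keeps pair i aligned with pair i.
    return src[:limit], tgt[:limit]


# Toy data standing in for the tokenized sentences (an assumption).
src = [["source", str(i)] for i in range(250)]
tgt = [["target", str(i)] for i in range(250)]

small_src, small_tgt = shrink_parallel(src, tgt, limit=100)
print(len(small_src), len(small_tgt))
```

Slicing a list that has fewer than `limit` entries simply returns the whole list, so the same call is safe on a validation set smaller than 100 sentences.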

Then I ran

python preprocess.py -train_src data/src-train.txt -train_tgt data/tgt-train.txt -valid_src data/src-val.txt -valid_tgt data/tgt-val.txt -save_data data/demo

Step 2

Then I trained with

python train.py -data data/demo.train.pt -save_model demo_model

It ran for a while before the following error appeared:

(dlnd-tf-lab)  ->python train.py -data data/demo.train.pt -save_model demo_model
Namespace(batch_size=64, brnn=False, brnn_merge='concat', curriculum=False, data='data/demo.train.pt', dropout=0.3, epochs=13, extra_shuffle=False, gpus=[], input_feed=1, layers=2, learning_rate=1.0, learning_rate_decay=0.5, log_interval=50, max_generator_batches=32, max_grad_norm=5, optim='sgd', param_init=0.1, pre_word_vecs_dec=None, pre_word_vecs_enc=None, rnn_size=500, save_model='demo_model', start_decay_at=8, start_epoch=1, train_from='', train_from_state_dict='', word_vec_size=500)
Loading data from 'data/demo.train.pt'
 * vocabulary size. source = 24999; target = 35820
 * number of training sentences. 100
 * maximum batch size. 64
Building model...
* number of parameters: 58121320
NMTModel (
  (encoder): Encoder (
    (word_lut): Embedding(24999, 500, padding_idx=0)
    (rnn): LSTM(500, 500, num_layers=2, dropout=0.3)
  )
  (decoder): Decoder (
    (word_lut): Embedding(35820, 500, padding_idx=0)
    (rnn): StackedLSTM (
      (dropout): Dropout (p = 0.3)
      (layers): ModuleList (
        (0): LSTMCell(1000, 500)
        (1): LSTMCell(500, 500)
      )
    )
    (attn): GlobalAttention (
      (linear_in): Linear (500 -> 500)
      (sm): Softmax ()
      (linear_out): Linear (1000 -> 500)
      (tanh): Tanh ()
    )
    (dropout): Dropout (p = 0.3)
  )
  (generator): Sequential (
    (0): Linear (500 -> 35820)
    (1): LogSoftmax ()
  )
)

Train perplexity: 29508.9
Train accuracy: 0.0216306
Validation perplexity: 4.50917e+08
Validation accuracy: 3.57853

Train perplexity: 1.07012e+07
Train accuracy: 0.06198
Validation perplexity: 103639
Validation accuracy: 0.944334

Train perplexity: 458795
Train accuracy: 0.031198
Validation perplexity: 43578.2
Validation accuracy: 3.42942

Train perplexity: 144931
Train accuracy: 0.0432612
Validation perplexity: 78366.8
Validation accuracy: 2.33598
Decaying learning rate to 0.5

Train perplexity: 58696.8
Train accuracy: 0.0278702
Validation perplexity: 14045.8
Validation accuracy: 3.67793
Decaying learning rate to 0.25

Train perplexity: 10045.1
Train accuracy: 0.0457571
Validation perplexity: 26435.6
Validation accuracy: 4.87078
Decaying learning rate to 0.125

Train perplexity: 10301.5
Train accuracy: 0.0490849
Validation perplexity: 24243.5
Validation accuracy: 3.62823
Decaying learning rate to 0.0625

Train perplexity: 7927.77
Train accuracy: 0.062812
Validation perplexity: 7180.49
Validation accuracy: 5.31809
Decaying learning rate to 0.03125

Train perplexity: 4573.5
Train accuracy: 0.047421
Validation perplexity: 6545.51
Validation accuracy: 5.6163
Decaying learning rate to 0.015625

Train perplexity: 3995.7
Train accuracy: 0.0549085
Validation perplexity: 6316.25
Validation accuracy: 5.4175
Decaying learning rate to 0.0078125

Train perplexity: 3715.81
Train accuracy: 0.0540765
Validation perplexity: 6197.91
Validation accuracy: 5.86481
Decaying learning rate to 0.00390625

Train perplexity: 3672.46
Train accuracy: 0.0540765
Validation perplexity: 6144.18
Validation accuracy: 6.01392
Decaying learning rate to 0.00195312

Train perplexity: 3689.7
Train accuracy: 0.0528286
Validation perplexity: 6113.55
Validation accuracy: 6.31213
Decaying learning rate to 0.000976562
Exception ignored in: <function WeakValueDictionary.__init__.<locals>.remove at 0x118b19b70>
Traceback (most recent call last):
  File "/Users/Natsume/miniconda2/envs/dlnd-tf-lab/lib/python3.5/weakref.py", line 117, in remove
TypeError: 'NoneType' object is not callable

Can you tell me how to solve this problem? Thanks!

1 answer:

Answer 0: (score: 0)

Two solutions were posted here.

After trying these, the error seems to have disappeared.
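For context on the message itself: the log in the question shows 13 epoch reports, which matches epochs=13 in the printed Namespace, so the "Exception ignored in" line appears to come after training has finished rather than interrupting it. The function named in the traceback, `WeakValueDictionary.__init__.<locals>.remove`, is a weakref callback. The sketch below does not reproduce the OpenNMT error and is not one of the linked solutions; it only illustrates, with hypothetical names (`Resource`, `failing_callback`, `run_demo`), that Python reports an exception raised inside a weakref callback on stderr and then carries on.

```python
import weakref


class Resource:
    """Placeholder object; any weak-referenceable instance would do."""


def failing_callback(ref):
    # Stand-in for a cleanup callback that fails; the message text is copied
    # from the question, the callback itself is hypothetical.
    raise TypeError("'NoneType' object is not callable")


def run_demo():
    obj = Resource()
    # Keep the weak reference alive so its callback can fire.
    watcher = weakref.ref(obj, failing_callback)
    # In CPython the object is freed here and the callback runs immediately.
    # Its exception is reported on stderr but not propagated to the caller.
    del obj
    # Execution reaches this line; the weak reference is now dead.
    return watcher() is None


print(run_demo())
```

If the model checkpoints were written before the message appeared, the training results themselves are likely unaffected, though that is worth confirming by loading a saved model.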