Extracting features from text using ELMo

Time: 2019-03-18 04:02:00

Tags: python-3.x tensorflow nlp word-embedding elmo

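I am trying to extract features with ELMo (Embeddings from Language Models). I have two tweet datasets, train and test. I ran the code below, but it raises an error. I searched SO, and it may be a compatibility problem between TensorFlow and CUDA. I have tried various versions, but the problem is still not solved. Pointers to the exact version numbers to use, and to any changes the code needs, would be very helpful.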

import tensorflow_hub as hub
import tensorflow as tf

elmo = hub.Module("https://tfhub.dev/google/elmo/2", trainable=True)

Extract ELMo vectors for the cleaned tweets in the train and test datasets.

def elmo_vectors(x):
  embeddings = elmo(x.tolist(), signature="default", as_dict=True)["elmo"]

  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.tables_initializer())
    # return average of ELMo features
    return sess.run(tf.reduce_mean(embeddings,1))
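To make the averaging step concrete, here is a minimal NumPy sketch of what reducing over axis 1 does to a (sentences, tokens, features) array. The tiny array is made-up stand-in data, not real ELMo output, and the sizes are chosen only for readability.

```python
import numpy as np

# Made-up stand-in for the "elmo" output: 2 sentences, 3 tokens each,
# 4 features per token (the real module returns far wider vectors).
token_embeddings = np.arange(24, dtype=np.float32).reshape(2, 3, 4)

# Average over the token axis (axis 1), the same reduction that
# tf.reduce_mean(embeddings, 1) performs, giving one vector per sentence.
sentence_vectors = token_embeddings.mean(axis=1)

print(sentence_vectors.shape)  # (2, 4)
print(sentence_vectors[0])     # [4. 5. 6. 7.]
```

The token axis disappears, so each tweet ends up with a single fixed-length feature vector regardless of how many tokens it has.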

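If I use the function above to extract embeddings for all the tweets in one go, I may run out of computational resources (memory). As a workaround, I split the train and test sets into batches of 100 samples each and pass these batches sequentially to elmo_vectors().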

I save these batches in a list:

list_train = [train[i:i+100] for i in range(0,train.shape[0],100)]
list_test = [test[i:i+100] for i in range(0,test.shape[0],100)]


# Extract ELMo embeddings
elmo_train = [elmo_vectors(x['clean_tweet']) for x in list_train]
elmo_test = [elmo_vectors(x['clean_tweet']) for x in list_test]
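The batching pattern above can be tried without TensorFlow or a GPU. The sketch below assumes a hypothetical stand-in, fake_vectors(), that returns one 1024-dimensional vector per text in place of elmo_vectors(); the tweets list and the batches() helper are also made up for illustration. It shows how the per-batch results could be joined back into a single feature matrix.

```python
import numpy as np

def fake_vectors(texts, dim=1024):
    """Hypothetical stand-in for elmo_vectors(): one dim-sized vector per text."""
    return np.zeros((len(texts), dim), dtype=np.float32)

def batches(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Made-up data: 250 placeholder tweets.
tweets = ["tweet %d" % i for i in range(250)]

# One array per batch, then stack them along the sample axis.
per_batch = [fake_vectors(batch) for batch in batches(tweets)]
features = np.concatenate(per_batch, axis=0)

print([b.shape[0] for b in per_batch])  # [100, 100, 50]
print(features.shape)                   # (250, 1024)
```

The last batch is simply shorter when the dataset size is not a multiple of 100, and concatenating along axis 0 restores the original row order.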

It produces the following error:

UnknownError                              Traceback (most recent call last)
/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run_fn(feed_dict, fetch_list, target_list, options, run_metadata)

/usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _call_tf_sessionrun(self, options, feed_dict, fetch_list, target_list, run_metadata)

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[{{node module_2_apply_default_1/bilm/CNN_1/Conv2D_6}}]]

 During handling of the above exception, another exception occurred:

   UnknownError                              Traceback (most recent call last)
   <ipython-input-84-5d4975a95f4d> in <module>()
   ----> 1 elmo_train = [elmo_vectors(x['clean_tweet']) for x in list_train]
    2 elmo_test = [elmo_vectors(x['clean_tweet']) for x in list_test]

    <ipython-input-84-5d4975a95f4d> in <listcomp>(.0)
    ----> 1 elmo_train = [elmo_vectors(x['clean_tweet']) for x in list_train]
    2 elmo_test = [elmo_vectors(x['clean_tweet']) for x in list_test]

    <ipython-input-82-c22e4c1ff381> in elmo_vectors(x)
     6     sess.run(tf.tables_initializer())
     7     # return average of ELMo features
     ----> 8     return sess.run(tf.reduce_mean(embeddings,1))

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in run(self, fetches, feed_dict, options, run_metadata)

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _run(self, handle, fetches, feed_dict, options, run_metadata)

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)

    /usr/local/lib/python3.6/dist-packages/tensorflow/python/client/session.py in _do_call(self, fn, *args)

    UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
     [[node module_2_apply_default_1/bilm/CNN_1/Conv2D_6 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py:517) ]]

 Caused by op 'module_2_apply_default_1/bilm/CNN_1/Conv2D_6', defined at:
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/usr/local/lib/python3.6/dist-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/usr/local/lib/python3.6/dist-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 450, in _handle_events
    self._handle_recv()
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 480, in _handle_recv
    self._run_callback(callback, msg)
  File "/usr/local/lib/python3.6/dist-packages/zmq/eventloop/zmqstream.py", line 432, in _run_callback
    callback(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/usr/local/lib/python3.6/dist-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
    if self.run_code(code, result):
  File "/usr/local/lib/python3.6/dist-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    elmo_train = [elmo_vectors(x['clean_tweet']) for x in list_train]
  File "", line 1, in
    elmo_train = [elmo_vectors(x['clean_tweet']) for x in list_train]
  File "", line 2, in elmo_vectors
    embeddings = elmo(x.tolist(), signature="default", as_dict=True)["elmo"]
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_hub/module.py", line 250, in __call__
    name=name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py", line 517, in create_apply_graph
    import_scope=relative_scope_name)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1435, in import_meta_graph
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py", line 1457, in _import_meta_graph_with_return_elements
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/meta_graph.py", line 806, in import_scoped_meta_graph_with_return_elements
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/deprecation.py", line 507, in new_func
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 442, in import_graph_def
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/importer.py", line 235, in _ProcessNewOps
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3433, in _add_new_tf_operations
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3433, in
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 3325, in _create_op_from_tf_operation
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/framework/ops.py", line 1801, in __init__

  UnknownError (see above for traceback): Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above.
   [[node module_2_apply_default_1/bilm/CNN_1/Conv2D_6 (defined at /usr/local/lib/python3.6/dist-packages/tensorflow_hub/native_module.py:517) ]]

0 answers:

No answers