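Importing MongoDB data into Azure ML Studio from a Python script

Time: 2019-02-03 12:28:01

Tags: python mongodb azure pymongo azure-machine-learning-studio

I am running the code below in an Execute Python Script module in Azure ML (Python 2.7.11). It uses pyMongo to fetch results from MongoDB and tries to return them as a DataFrame.

I get an error like the following: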

"C:\pyhome\lib\site-packages\pymongo\topology.py", line 97, in select_servers
        self._error_message(selector))
    ServerSelectionTimeoutError: ... ('The write operation timed out',)

If you know what causes this error and what I need to change, please let me know.

My source code:

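import pymongo as m
import pandas as pd

def azureml_main(dataframe1 = None, dataframe2 = None):
    # Connection string; credentials, host and port are masked
    uri = "mongodb://xxxxx:yyyyyyyyyyyyyyy@zzz.mongodb.net:xxxxx/?ssl=true&replicaSet=globaldb"
    # connect=False defers the actual connection until the first operation
    client = m.MongoClient(uri, connect=False)
    db = client['dbName']
    coll = db['colectionName']
    # Fetch every document and load the result into a DataFrame
    cursor = coll.find()
    df = pd.DataFrame(list(cursor))
    # The module expects a sequence of DataFrames, hence the trailing comma
    return df,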

Error details:

Error 0085: The following error occurred during script evaluation, please view the output log for more information:
---------- Start of error message from Python interpreter ----------
Caught exception while executing function: Traceback (most recent call last):
  File "C:\server\invokepy.py", line 199, in batch
    odfs = mod.azureml_main(*idfs)
  File "C:\temp\55a174d8dc584942908423ebc0bac110.py", line 32, in azureml_main
    result =  pd.DataFrame(list(cursor))
  File "C:\pyhome\lib\site-packages\pymongo\cursor.py", line 977, in next
    if len(self.__data) or self._refresh():
  File "C:\pyhome\lib\site-packages\pymongo\cursor.py", line 902, in _refresh
    self.__read_preference))
  File "C:\pyhome\lib\site-packages\pymongo\cursor.py", line 813, in __send_message
    **kwargs)
  File "C:\pyhome\lib\site-packages\pymongo\mongo_client.py", line 728, in _send_message_with_response
    server = topology.select_server(selector)
  File "C:\pyhome\lib\site-packages\pymongo\topology.py", line 121, in select_server
    address))
  File "C:\pyhome\lib\site-packages\pymongo\topology.py", line 97, in select_servers
    self._error_message(selector))
ServerSelectionTimeoutError: xxxxx-xxx.mongodb.net:xxxxx: ('The write operation timed out',)
Process returned with non-zero exit code 1

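1 Answer:

Answer 0 (score: 0)

As far as I know, this problem is caused by a limitation of the Execute Python Script module. See the Limitations section of its documentation, quoted below.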

  

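Limitations

The Execute Python Script module currently has the following limitations:

  1. Sandboxed execution. The Python runtime is currently sandboxed and, as a result, does not allow persistent access to the network. All files saved locally are isolated and deleted once the module finishes. The Python code cannot access most directories on the machine it runs on, the exception being the current directory and its subdirectories.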
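To check whether the sandbox, rather than the credentials or the connection string, is what blocks the connection, a plain TCP probe can be run from inside the module. This is only a minimal sketch: `can_reach` is a hypothetical helper, not part of pymongo or Azure ML, and it uses nothing beyond the standard `socket` module.

```python
import socket


def can_reach(host, port, timeout=5.0):
    """Return True if a TCP connection to (host, port) opens within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout)
    except (socket.error, socket.timeout):
        # Refused, unresolved, blocked or timed out: treat all as unreachable
        return False
    sock.close()
    return True
```

Calling `can_reach` with the host and port from the MongoDB URI inside `azureml_main` and printing the result to the output log would show whether outbound connections are possible at all. If it returns False there but True on a local machine, the sandbox is the likely cause.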

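Because of this sandbox restriction, you cannot import data online from Azure Cosmos DB directly with the pymongo driver inside the Execute Python Script module. However, you can use the Import Data module with the connection and parameter information for Azure Cosmos DB, and connect its output to an input of Execute Python Script to get the data.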
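With that wiring, the script no longer opens a connection itself. A minimal sketch of what it could look like is below, assuming the Import Data output is connected to the first input port so that it arrives as `dataframe1`. The cleanup step is only an illustrative placeholder and not something the module requires.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    # dataframe1 is assumed to be the output of the Import Data module
    if dataframe1 is None:
        # Nothing connected to the first port: return an empty frame
        return pd.DataFrame(),

    df = dataframe1.copy()

    # Illustrative cleanup (an assumption): drop columns that are entirely empty
    df = df.dropna(axis=1, how="all")

    # The module expects a sequence of DataFrames, hence the trailing comma
    return df,
```

The function keeps the same `azureml_main` signature and tuple return value as the script in the question, so only the body needs to change.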

For more information about importing data online, see the official document "Import your training data into Azure Machine Learning Studio from various data sources".