I am using Azure Data Factory to copy data from a MySQL server as the source. The amount of data is large. When I set up the pipeline and executed it, I got:
MySQL: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
I think this can be solved by this answer. How can I add that configuration to my Data Factory pipeline with MySQL as the source?
Update: I am copying data from on-premises MySQL to SQL Data Warehouse using a plain script. The MySQL query is simple: select * from mytable;
Full error:
Copy activity encountered a user error at Source side: GatewayNodeName=MYGATEWAY, ErrorCode=UserErrorFailedMashupOperation, 'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException, Message='Type=Microsoft.Data.Mashup.MashupValueException, Message=MySQL: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding., Source=Microsoft.MashupEngine,', Source='
Answer (score: 0)
Well, if this issue is related to the default timeout configuration, you can add the following to the "activities" section of your pipeline definition to set the timeout to 1 hour:
"Policy": {
"concurrency": 1,
"timeout": "01:00:00"
}
---------- Update ----------
The full JSON for the pipeline configuration is as follows:
{
    "name": "ADFTutorialPipelineOnPrem",
    "properties": {
        "description": "This pipeline has one Copy activity that copies data from an on-prem SQL to Azure blob",
        "activities": [
            {
                "name": "CopyFromSQLtoBlob",
                "description": "Copy data from on-prem SQL server to blob",
                "type": "Copy",
                "inputs": [
                    {
                        "name": "EmpOnPremSQLTable"
                    }
                ],
                "outputs": [
                    {
                        "name": "OutputBlobTable"
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "SqlSource",
                        "sqlReaderQuery": "select * from emp"
                    },
                    "sink": {
                        "type": "BlobSink"
                    }
                },
                "Policy": {
                    "concurrency": 1,
                    "executionPriorityOrder": "NewestFirst",
                    "style": "StartOfInterval",
                    "retry": 0,
                    "timeout": "01:00:00"
                }
            }
        ],
        "start": "2016-07-05T00:00:00Z",
        "end": "2016-07-06T00:00:00Z",
        "isPaused": false
    }
}
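Since your source is MySQL rather than SQL Server, the copy activity's source block would use the relational source type instead of SqlSource, with the timeout policy attached in the same way. Below is a minimal, untested sketch under that assumption; it reuses the MySqlDataSet input dataset defined further down and the OutputBlobTable output from the example above, the pipeline and activity names, query, and dates are placeholders, and if your destination is SQL Data Warehouse the sink type would differ from BlobSink:
{
    "name": "CopyFromMySQLPipeline",
    "properties": {
        "description": "Copy data from on-prem MySQL with a 1 hour activity timeout",
        "activities": [
            {
                "name": "CopyFromMySQLtoBlob",
                "type": "Copy",
                "inputs": [
                    {
                        "name": "MySqlDataSet"
                    }
                ],
                "outputs": [
                    {
                        "name": "OutputBlobTable"
                    }
                ],
                "typeProperties": {
                    "source": {
                        "type": "RelationalSource",
                        "query": "select * from mytable"
                    },
                    "sink": {
                        "type": "BlobSink"
                    }
                },
                "policy": {
                    "concurrency": 1,
                    "retry": 0,
                    "timeout": "01:00:00"
                }
            }
        ],
        "start": "2016-07-05T00:00:00Z",
        "end": "2016-07-06T00:00:00Z",
        "isPaused": false
    }
}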
The following example assumes you have created a table "MyTable" in MySQL and that it contains a column called "timestampcolumn" for time-series data. Setting "external": "true" informs the Data Factory service that the table is external to the data factory and is not produced by an activity in the data factory:
{
    "name": "MySqlDataSet",
    "properties": {
        "published": false,
        "type": "RelationalTable",
        "linkedServiceName": "OnPremMySqlLinkedService",
        "typeProperties": {},
        "availability": {
            "frequency": "Hour",
            "interval": 1
        },
        "external": true,
        "policy": {
            "externalData": {
                "retryInterval": "00:01:00",
                "retryTimeout": "01:00:00",
                "maximumRetry": 3
            }
        }
    }
}
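The dataset above references a linked service named OnPremMySqlLinkedService, which points at the on-premises MySQL server through the Data Management Gateway (MYGATEWAY in your error message). A rough sketch of what that linked service definition looks like follows; the type and property names are from the ADF v1 MySQL connector documentation as I recall it, so verify them against the docs, and all values are placeholders:
{
    "name": "OnPremMySqlLinkedService",
    "properties": {
        "type": "OnPremisesMySql",
        "typeProperties": {
            "server": "<server name>",
            "database": "<database name>",
            "schema": "<schema name>",
            "authenticationType": "Basic",
            "username": "<username>",
            "password": "<password>",
            "gatewayName": "MYGATEWAY"
        }
    }
}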
For more details on how to create pipelines for Azure Data Factory, refer to this document.
For the full tutorial on moving data from on-premises MySQL with Azure Data Factory, see this link.