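We have a service in our Azure Service Fabric cluster that seems to freeze and lose its connection to the database roughly every 48 hours. Until I can get a developer to look into it, my workaround is to delete the service through Service Fabric Explorer and then immediately recreate it. This fixes the problem temporarily, until the service freezes again.
My question is: is there any way I can automate this process? It will be at least a month or two before I can get a developer to look into this, so I would like to run the process automatically once a day.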
Answer 0: (score: 1)
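Unfortunately, Service Fabric has no scheduling mechanism for this kind of operation.
The solution is to run a script that connects to the cluster and restarts the code package, either through PowerShell or through the API.
For PowerShell you could use an Azure Automation runbook; alternatively, you could use an Azure Function to call the API on a schedule.
I think PowerShell is easier, but both should work.
As the name suggests, Restart-ServiceFabricDeployedCodePackage forces the process to shut down and start again, together with all replicas hosted in it. There is no need to delete and recreate the service, which risks losing configuration that was applied to it.
The documentation shows which parameter combinations can be used together. Some parameters become required when they are used with others, and the documentation groups the matching ones. The result will look something like this: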
Restart-ServiceFabricDeployedCodePackage -ApplicationName "fabric:/appname" -ServiceName "fabric:/appname/servicename" -PartitionId "b098c9f0-009a-458d-8b2d-8089fedcd014"
or, similarly, for a specific replica:
Restart-ServiceFabricDeployedCodePackage -ApplicationName "fabric:/repairs" -ServiceName "fabric:/repairs/web" -PartitionId "b098c9f0-009a-458d-8b2d-8089fedcd014" -ReplicaOrInstanceId 131896982398426643
Restart-ServiceFabricPartition is also useful and has the same effect:
Restart-ServiceFabricPartition -RestartPartitionMode AllReplicasOrInstances -ServiceName "fabric:/appname/service" -PartitionId "b098c9f0-009a-458d-8b2d-8089fedcd014"
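Restart-ServiceFabricPartition is being obsoleted in favor of Start-ServiceFabricPartitionRestart, which is recommended for restarting services where reliability is required, such as stateful services, because it avoids bringing down all replicas at the same time.
I haven't used Start-ServiceFabricPartitionRestart myself, but it is the one suggested for stateful services.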
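For illustration, a minimal sketch of how Start-ServiceFabricPartitionRestart might be used, with its companion cmdlet Get-ServiceFabricPartitionRestartProgress polling the result. It assumes an active cluster connection and that the cluster has the Fault Analysis Service enabled; the service name, the 5-second polling interval and the attempt limit are placeholders, not values from the answer. Treat it as untested against your cluster.

```powershell
# Sketch only: assumes Connect-ServiceFabricCluster has already been called.
$serviceName = "fabric:/appname/servicename"   # placeholder service name
$terminalStates = "Completed", "Faulted", "Cancelled", "ForceCancelled"

foreach ($partition in (Get-ServiceFabricPartition -ServiceName $serviceName)) {
    # Each restart operation is tracked by a caller-supplied GUID.
    $operationId = [guid]::NewGuid()

    # OnlyActiveSecondaries targets stateful services; a stateless service
    # would need AllReplicasOrInstances instead.
    Start-ServiceFabricPartitionRestart -OperationId $operationId `
        -RestartPartitionMode OnlyActiveSecondaries `
        -ServiceName $serviceName `
        -PartitionId $partition.PartitionId

    # Poll until the operation reaches a terminal state or we give up.
    $attempts = 0
    do {
        Start-Sleep -Seconds 5
        $progress = Get-ServiceFabricPartitionRestartProgress -OperationId $operationId
        $attempts++
    } while (($terminalStates -notcontains "$($progress.State)") -and ($attempts -lt 60))

    Write-Host "Partition $($partition.PartitionId) finished with state: $($progress.State)"
}
```

Restarting one partition at a time and waiting for a terminal state keeps the rest of the service available while each partition recovers.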
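The parameter combinations are a bit tricky, so I suggest trying different ones. In some cases the command succeeds but still shows errors; I'm not sure why.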
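As a rough sketch of the script such a scheduled job could run: connect to the cluster, look up the partition IDs instead of hard-coding them, and restart the code package for each partition. The endpoint, thumbprint and fabric names below are placeholders, and the certificate is assumed to be installed in the CurrentUser\My store of the machine running the script. That machine also needs the Service Fabric PowerShell module, which is installed with the Service Fabric SDK or runtime.

```powershell
# Placeholders: replace with your own cluster endpoint, certificate and names.
$endpoint        = "mycluster.westus.cloudapp.azure.com:19000"
$thumbprint      = "YOUR-CERT-THUMBPRINT"
$applicationName = "fabric:/appname"
$serviceName     = "fabric:/appname/servicename"

# Connect to a certificate-secured cluster.
Connect-ServiceFabricCluster -ConnectionEndpoint $endpoint `
    -X509Credential `
    -ServerCertThumbprint $thumbprint `
    -FindType FindByThumbprint -FindValue $thumbprint `
    -StoreLocation CurrentUser -StoreName My | Out-Null

# Discover the partitions at run time rather than hard-coding their IDs.
foreach ($partition in (Get-ServiceFabricPartition -ServiceName $serviceName)) {
    Write-Host "Restarting code package for partition $($partition.PartitionId)"
    Restart-ServiceFabricDeployedCodePackage -ApplicationName $applicationName `
        -ServiceName $serviceName `
        -PartitionId $partition.PartitionId

    # Arbitrary pause so partitions are not all restarted at once.
    Start-Sleep -Seconds 30
}
```

Which replica or instance is picked when only a partition ID is given is worth checking in the cmdlet documentation; if every node hosting the service must be restarted, a per-node loop is the safer choice.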
Answer 1: (score: 1)
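Here is the script I use to restart a stateless service. It uses the Restart-ServiceFabricDeployedCodePackage cmdlet mentioned above.
I named it Restart-ServiceFabricServiceCodePackages.ps1. You can call it with just -ServiceName fabric:/application/service, and adjust the wait time to match the service's normal startup time.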
Param (
    [Parameter(Mandatory=$true)]
    [uri] $ServiceName,
    [int] $WaitBetweenNodesSeconds = 30
)

# The script relies on an existing connection (Connect-ServiceFabricCluster).
try {
    Test-ServiceFabricClusterConnection | Out-Null
}
catch {
    throw "Active connection to Service Fabric cluster required"
}

$serviceDescription = Get-ServiceFabricServiceDescription -ServiceName $ServiceName -ErrorAction SilentlyContinue
if (!$serviceDescription) {
    throw "Invalid Service Fabric service name"
}
# Only stateless services are supported; restarting stateful replicas this way is untested.
if ($serviceDescription.ServiceKind -ne "Stateless") {
    throw "Unknown outcomes could occur for non-stateless services"
}

# Resolve the service manifest name, which identifies the deployed packages on each node.
$applicationName = $serviceDescription.ApplicationName
$serviceTypeName = $serviceDescription.ServiceTypeName
$service = Get-ServiceFabricService -ServiceName $ServiceName -ApplicationName $applicationName
$application = Get-ServiceFabricApplication -ApplicationName $applicationName
$serviceType = Get-ServiceFabricServiceType -ServiceTypeName $serviceTypeName -ApplicationTypeName $application.ApplicationTypeName -ApplicationTypeVersion $application.ApplicationTypeVersion
$serviceManifestName = $serviceType.ServiceManifestName

$nodes = Get-ServiceFabricNode -StatusFilter Up

# Keep only the nodes that actually host the service package, then restart node by node.
$nodes | Where-Object {
    $nodeName = $_.NodeName
    $hasService = $null
    $hasApplication = Get-ServiceFabricDeployedApplication -NodeName $nodeName -ApplicationName $applicationName
    if ($hasApplication) {
        $hasService = Get-ServiceFabricDeployedServicePackage -NodeName $nodeName -ApplicationName $applicationName -ServiceManifestName $serviceManifestName -ErrorAction SilentlyContinue
    }
    return $hasApplication -and $hasService
} | ForEach-Object {
    $nodeName = $_.NodeName
    $codePackages = Get-ServiceFabricDeployedCodePackage -NodeName $nodeName -ApplicationName $applicationName -ServiceManifestName $serviceManifestName
    $codePackages | ForEach-Object {
        $codePackageName = $_.CodePackageName
        $servicePackageActivationId = $_.ServicePackageActivationId
        $codePackageInstanceId = $_.EntryPoint.CodePackageInstanceId

        Write-Host "Restarting deployed package on $nodeName named $codePackageName (for service package id: $servicePackageActivationId and code package id: $codePackageInstanceId)"
        $success = Restart-ServiceFabricDeployedCodePackage -NodeName $nodeName -ApplicationName $applicationName -ServiceManifestName $serviceManifestName -CodePackageName $codePackageName -CodePackageInstanceId $codePackageInstanceId -ServicePackageActivationId $servicePackageActivationId -CommandCompletionMode Invalid
        if ($success) {
            Write-Host "Successfully restarted deployed package on $nodeName" -ForegroundColor Green
        }

        Write-Host "Waiting for $WaitBetweenNodesSeconds seconds for previous node to restart before continuing"
        Start-Sleep -Seconds $WaitBetweenNodesSeconds

        # Re-check service health up to 3 more times, sleeping before each new check.
        $retries = 0
        $service = Get-ServiceFabricService -ServiceName $ServiceName -ApplicationName $applicationName
        while ($retries -lt 3 -and ($service.HealthState -ne "Ok" -or $service.ServiceStatus -ne "Active")) {
            $retries = $retries + 1
            Write-Host "Waiting for an additional 15 seconds for previous node to restart before continuing because service state is not healthy"
            Start-Sleep -Seconds 15
            $service = Get-ServiceFabricService -ServiceName $ServiceName -ApplicationName $applicationName
        }
    }
}
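As a usage sketch, the script can be called right after connecting to the cluster. To run it once a day without Azure Automation, one option (my substitution, not something either answer proposes) is Windows Task Scheduler. The wrapper path C:\Scripts\Restart-Daily.ps1, the task name and the 03:00 start time are hypothetical; the wrapper is assumed to contain the two commands from the first half of the example, with the connection parameters your cluster needs.

```powershell
# Manual run: connect first (an unsecured local cluster is shown here;
# a secured cluster needs the certificate parameters), then call the script.
Connect-ServiceFabricCluster -ConnectionEndpoint "localhost:19000" | Out-Null
.\Restart-ServiceFabricServiceCodePackages.ps1 -ServiceName "fabric:/application/service" -WaitBetweenNodesSeconds 60

# Daily run: register a scheduled task that executes a hypothetical wrapper
# script holding the two commands above.
$action  = New-ScheduledTaskAction -Execute "powershell.exe" `
    -Argument '-NoProfile -ExecutionPolicy Bypass -File "C:\Scripts\Restart-Daily.ps1"'
$trigger = New-ScheduledTaskTrigger -Daily -At "03:00"
Register-ScheduledTask -TaskName "RestartFabricService" -Action $action -Trigger $trigger
```

The task must run under an account that can read the cluster certificate, since the script refuses to continue without an active connection.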