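Normally we can ping another service by its service name through the Azure Service Fabric DNS service. Around 1 AM last night this stopped working. No code or configuration was changed, and nothing was deployed. Now we cannot ping other services from inside a container: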
Some background: we run Service Fabric in Azure on a Windows cluster. All of our services run in Docker containers using Docker on Windows.
What we have tried: restarting the VMs, deleting and redeploying all applications, and restarting the Naming Service and the DNS service in the cluster.
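Has anyone seen something like this? I'm looking for hints about what might be wrong or how to debug this further. Again, nothing was deployed and no code or configuration was changed. It looks as if Service Fabric's internal DNS suddenly went down and will not come back up. Thanks!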
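One way to narrow this down is to test name resolution separately from ping, since a failed ping can mean either that the name did not resolve or that ICMP is not getting through. The following is only a minimal sketch, assuming Python is available inside the container image. The `resolve` and `diagnose` helpers are illustrative names, not part of Service Fabric.

```python
import socket


def resolve(name, resolver=socket.getaddrinfo):
    """Return the sorted unique IP addresses for name, or [] if lookup fails."""
    try:
        infos = resolver(name, None)
    except socket.gaierror:
        # The resolver could not map the name to an address.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address for both IPv4 and IPv6.
    return sorted({info[4][0] for info in infos})


def diagnose(name, resolver=socket.getaddrinfo):
    """Summarise whether the failure is at the DNS step or after it."""
    addresses = resolve(name, resolver)
    if not addresses:
        return f"{name}: name resolution failed"
    return f"{name}: resolved to {', '.join(addresses)}"
```

Calling `diagnose("myservice.myapp")` inside a container (the name here is a hypothetical placeholder for a real service DNS name) would indicate whether the lookup itself fails. If the name resolves but ping still fails, the problem is more likely in connectivity than in DNS.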
Update: output of Get-ServiceFabricNodeHealth on one of the nodes:
NodeName              : _B_41
AggregatedHealthState : Ok
HealthEvents          :
    SourceId          : System.FabricNode
    Property          : Certificate_client
    HealthState       : Ok
    SequenceNumber    : 132078454391466815
    SentAt            : 7/17/2019 1:57:19 PM
    ReceivedAt        : 7/17/2019 1:57:24 PM
    TTL               : Infinite
    Description       : Certificate expiration: thumbprint = adf7ae93a524d181106b0467a1f8e3375e1bf65f, expiration = 2020-06-20 01:17:33.000, remaining lifetime is
                        338:11:20:13.853, please refresh ahead of time to avoid catastrophic failure. Warning threshold Security/CertificateExpirySafetyMargin is configured at 30:0:00:00.000, if
                        needed, you can adjust it to fit your refresh process.
    RemoveWhenExpired : False
    IsExpired         : False
    Transitions       : Warning->Ok = 7/13/2019 11:22:17 AM, LastError = 1/1/0001 12:00:00 AM

    SourceId          : System.FabricNode
    Property          : Certificate_cluster
    HealthState       : Ok
    SequenceNumber    : 132078386480915827
    SentAt            : 7/17/2019 12:04:08 PM
    ReceivedAt        : 7/17/2019 12:04:23 PM
    TTL               : Infinite
    Description       : Certificate expiration: thumbprint = adf7ae93a524d181106b0467a1f8e3375e1bf65f, expiration = 2020-06-20 01:17:33.000, remaining lifetime is
                        338:13:13:24.908, please refresh ahead of time to avoid catastrophic failure. Warning threshold Security/CertificateExpirySafetyMargin is configured at 30:0:00:00.000, if
                        needed, you can adjust it to fit your refresh process.
    RemoveWhenExpired : False
    IsExpired         : False
    Transitions       : Warning->Ok = 7/13/2019 7:04:12 AM, LastError = 1/1/0001 12:00:00 AM

    SourceId          : System.FabricNode
    Property          : Certificate_server
    HealthState       : Ok
    SequenceNumber    : 132078441374480374
    SentAt            : 7/17/2019 1:35:37 PM
    ReceivedAt        : 7/17/2019 1:35:54 PM
    TTL               : Infinite
    Description       : Certificate expiration: thumbprint = adf7ae93a524d181106b0467a1f8e3375e1bf65f, expiration = 2020-06-20 01:17:33.000, remaining lifetime is
                        338:11:41:55.551, please refresh ahead of time to avoid catastrophic failure. Warning threshold Security/CertificateExpirySafetyMargin is configured at 30:0:00:00.000, if
                        needed, you can adjust it to fit your refresh process.
    RemoveWhenExpired : False
    IsExpired         : False
    Transitions       : Warning->Ok = 7/13/2019 4:35:41 AM, LastError = 1/1/0001 12:00:00 AM

    SourceId          : System.RA
    Property          : RAStoreProvider
    HealthState       : Ok
    SequenceNumber    : 132072866375071389
    SentAt            : 7/11/2019 2:43:57 AM
    ReceivedAt        : 7/13/2019 1:15:33 PM
    TTL               : Infinite
    Description       : Store provider type ESE created and opened successfully.
    RemoveWhenExpired : False
    IsExpired         : False
    Transitions       : Warning->Ok = 7/11/2019 2:44:27 AM, LastError = 1/1/0001 12:00:00 AM

    SourceId          : System.FM
    Property          : State
    HealthState       : Ok
    SequenceNumber    : 181
    SentAt            : 7/11/2019 2:44:15 AM
    ReceivedAt        : 7/13/2019 1:15:33 PM
    TTL               : Infinite
    Description       : Fabric node is up.
    RemoveWhenExpired : False
    IsExpired         : False
    Transitions       : Warning->Ok = 7/11/2019 2:44:44 AM, LastError = 1/1/0001 12:00:00 AM
Update 2: network interface information from inside a Docker container: