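Starting at about 2 am last night (roughly 8 hours after anyone had last touched anything related to the site), our Azure website began throwing this error:
Error: ErrorCode:SubStatus:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. For on-premises cache clusters, also verify the following conditions. Ensure that security permission has been granted for this client account, and check that the AppFabric Caching Service is allowed through the firewall on all cache hosts. Also the MaxBufferSize on the server must be greater than or equal to the serialized object size sent from the client.). Additional Information: The client was trying to communicate with the server: net.tcp://payboardprod.cache.windows.net:22233. (at Microsoft.ApplicationServer.Caching.DataCache.ThrowException(ErrStatus errStatus, Guid trackingId, Exception responseException, Byte[][] payload, EndpointID destination)
Basically, it looks as if our Azure cache server became unavailable. But nothing on our Azure management console indicates this; it shows the cache server in question as running fine. The Azure service availability dashboard (http://azure.microsoft.com/en-us/support/service-dashboard/) shows no sign of trouble either. The only indication of any kind of problem is that our Azure Cache Service started reporting zero requests at around 1 am.
Our beta site, which uses a different cache server but is otherwise configured identically, was unaffected throughout the episode.
We only have a BizSpark account, so we can't open a support ticket with MS.
We restored service by disabling the external cache, but that's obviously not optimal.
Any suggestions on how to troubleshoot this?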
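Answer 0 (score: 1)
Wrap your calling code in appropriate protection (try/catch) and then cope with the failure at the application layer. Commodity platforms offered in any cloud can (and do) have these kinds of issues from time to time. You also need logging in place, writing to somewhere like Azure Diagnostics (http://msdn.microsoft.com/en-us/library/gg433048.aspx), so that you can troubleshoot later.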
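As a rough illustration of that advice (a sketch, not code from the original poster): the helper below treats any cache read failure as a miss, logs it, and falls back to the source of truth. `CacheFallback`, `GetOrLoad`, and the three delegates are hypothetical names chosen for this example; they are not part of the Azure caching API, and in a real application the delegates would wrap whatever cache client and logger you actually use.

```csharp
using System;

public static class CacheFallback
{
    // Hypothetical helper: reads through a cache, but never lets a cache
    // failure take the caller down with it.
    public static T GetOrLoad<T>(
        Func<T> readFromCache,   // assumed to wrap the real cache read
        Func<T> loadFromSource,  // assumed to hit the database or other source of truth
        Action<Exception> log)   // assumed to forward to your logging of choice
        where T : class
    {
        try
        {
            var cached = readFromCache();
            if (cached != null)
            {
                return cached;
            }
        }
        catch (Exception ex)
        {
            // Cache outage or similar: record it and fall through to the source.
            log(ex);
        }

        // Exceptions from the source itself are deliberately not swallowed.
        return loadFromSource();
    }
}

public static class Program
{
    public static void Main()
    {
        // Simulate an unavailable cache server.
        var value = CacheFallback.GetOrLoad<string>(
            () => { throw new InvalidOperationException("cache unavailable"); },
            () => "loaded from database",
            ex => Console.Error.WriteLine("Cache read failed: " + ex.Message));

        Console.WriteLine(value); // prints "loaded from database"
    }
}
```

The point of the design is that a cache outage degrades performance rather than availability: the site gets slower because every request hits the source, but it stays up, and the log tells you when it started.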
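Answer 1 (score: 0)
I was never able to figure out what the problem was, and ended up following Simon W's advice about wrapping everything in try/catches up the wazoo. But because it wasn't 100% intuitive, and it took me several tries to get the cache retrieval code right, I thought I'd post it here for anyone who's interested.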
public TValue Get(string key, Func<TValue> missingFunc)
{
    // We need to ensure that two processes don't try to calculate the same value at the same time. That just wastes resources.
    // So we pull out a value from the _cacheLocks dictionary, and lock on that before trying to retrieve the object.
    // This does add a bit more locking, and hence the chance for one process to lock up everything else.
    // We may need to add some timeouts here at some point in time. It also doesn't prevent two processes on different
    // machines from trying the same bit o' nonsense. Oh well. It's probably still a worthwhile optimization.
    key = _keyPrefix + "." + key;
    var value = default(TValue);
    object cacheLock;
    lock (_cacheLocks)
    {
        if (!_cacheLocks.TryGetValue(key, out cacheLock))
        {
            cacheLock = new object();
            _cacheLocks[key] = cacheLock;
        }
    }
    lock (cacheLock)
    {
        // Try to get the value from the cache.
        try
        {
            value = _cache.Get(key) as TValue;
        }
        catch (SerializationException ex)
        {
            // This can happen when the app restarts, and we discover that the dynamic entity names have changed, and the desired type
            // is no longer around, e.g., "Organization_6BA9E1E1184D9B7BDCC50D94471D7A730423456A15BBAFB6A2C6AC0FF94C0D41"
            // If that's the error, we should probably warn about it, but no point in logging it as an error, since it's more-or-less expected.
            _logger.Warn("Error retrieving item '" + key + "' from Azure cache; falling back to missingFunc(). Error = " + ex);
        }
        catch (Exception ex)
        {
            _logger.Error("Error retrieving item '" + key + "' from Azure cache; falling back to missingFunc(). Error = " + ex);
        }
        // If we didn't get anything interesting, then call the function that should be able to retrieve it for us.
        if (value == default(TValue))
        {
            // If that function throws an exception, don't swallow it.
            value = missingFunc();
            // If we try to put it into the cache, and *that* throws an exception,
            // log it, and then swallow it.
            try
            {
                _cache.Put(key, value);
            }
            catch (Exception ex)
            {
                _logger.Error("Error putting item '" + key + "' into Azure cache. Error = " + ex);
            }
        }
    }
    return value;
}
You can use it like this:
var user = UserCache.Get(email, () =>
    _db.Users
        .FirstOrDefault(u => u.Email == email)
        .ShallowClone());