我的服务器应用程序有一个奇怪的问题。我的系统很简单:我有1个以上的设备和一个通过网络进行通信的服务器应用程序。协议具有可变长度的二进制数据包,但固定标头(包含有关当前数据包大小的信息)。包的示例:
char pct[maxSize] = {}
pct[0] = 0x5a //preambule
pct[1] = 0xa5 //preambule
pct[2] = 0x07 //packet size
pct[3] = 0x0A //command
... [payload]
该协议建立在命令答案的原则之上。
我使用boost :: asio进行通信 - io_service使用线程拉(4个线程)+异步读/写操作(下面的代码示例)并创建一个“查询周期” - 每个200ms由计时器:
这对boost 1.53(调试和发布)非常有效。但随后我转向增强1.54(特别是在发布模式下)魔术开始。我的服务器成功启动,连接到设备并启动“查询周期”。大约30-60秒一切都运行良好(我收到数据,数据是正确的),但后来我开始在最后一个读取句柄上收到asio :: error(总是在一个地方)。错误类型:EOF。收到错误后,我必须断开与设备的连接。
谷歌搜索的一些时间给我提供有关EOF的信息,表明另一方(我的情况下的设备)启动了断开连接程序。但是,根据设备的逻辑,它不可能是真的。 可能有人解释发生了什么?可能是我需要设置一些套接字选项或定义?我看到两个可能的原因:
我的环境:
主“查询周期”的示例代码。改编自官方boost chat example所有简化的代码以减少空间:)
所以,代码示例。套接字包装器:
namespace b = boost;
namespace ba = boost::asio;
typedef b::function<void(const ProtoAnswer answ)> DataReceiverType;
class SocketWorker
{
private:
typedef ba::ip::tcp::socket socketType;
typedef std::unique_ptr<socketType> socketPtrType;
socketPtrType devSocket;
ProtoCmd sendCmd;
ProtoAnswer rcvAnsw;
//[other definitions]
public:
//---------------------------------------------------------------------------
ERes SocketWorker::Connect(/*[connect settings]*/)
{
ERes res(LGS_RESULT_ERROR, "Connect to device - Unknow Error");
using namespace boost::asio::ip;
boost::system::error_code sock_error;
//try to connect
devSocket->connect(tcp::endpoint(address::from_string(/*[connect settings ip]*/), /*[connect settings port]*/), sock_error);
if(sock_error.value() > 0) {
//[work with error]
devSocket->close();
}
else {
//[res code ok]
}
return res;
}
//---------------------------------------------------------------------------
ERes SocketWorker::Disconnect()
{
if (devSocket->is_open())
{
boost::system::error_code ec;
devSocket->shutdown(bi::tcp::socket::shutdown_send, ec);
devSocket->close();
}
return ERes(LGS_RESULT_OK, "OK");
}
//---------------------------------------------------------------------------
//query any cmd
void SocketWorker::QueryCommand(const ProtoCmd cmd, DataReceiverType dataClb)
{
sendCmd = std::move(cmd); //store command
if (sendCmd .CommandLength() > 0)
{
ba::async_write(*devSocket.get(), ba::buffer(sendCmd.Data(), sendCmd.Length()),
b::bind(&SocketWorker::HandleSocketWrite,
this, ba::placeholders::error, dataClb));
}
else
{
cerr << "Send command error: nothing to send" << endl;
}
}
//---------------------------------------------------------------------------
// boost socket handlers
void SocketWorker::HandleSocketWrite(const b::system::error_code& error,
DataReceiverType dataClb)
{
if (error)
{
cerr << "Send cmd error: " << error.message() << endl;
//[send error to other place]
return;
}
//start reading header of answer (lw_service_proto::headerSize == 3 bytes)
ba::async_read(*devSocket.get(),
ba::buffer(rcvAnsw.Data(), lw_service_proto::headerSize),
b::bind(&SocketWorker::HandleSockReadHeader,
this, ba::placeholders::error, dataClb));
}
//---------------------------------------------------------------------------
//handler for read header
void SocketWorker::HandleSockReadHeader(const b::system::error_code& error, DataReceiverType dataClb)
{
if (error)
{
//[error working]
return;
}
//decode header (check preambule and get full packet size) and read answer payload
if (rcvAnsw.DecodeHeaderAndGetCmdSize())
{
ba::async_read(*devSocket.get(),
ba::buffer(rcvAnsw.Answer(), rcvAnsw.AnswerLength()),
b::bind(&SocketWorker::HandleSockReadBody,
this, ba::placeholders::error, dataClb));
}
}
//---------------------------------------------------------------------------
//handler for andwer payload
void SocketWorker::HandleSockReadBody(const b::system::error_code& error, DataReceiverType dataClb)
{
//if no error - send anwser to 'master'
if (!error){
if (dataClb != nullptr)
dataClb(rcvAnsw);
}
else{
//[error process]
//here i got EOF in release mode
}
}
};
设备工作人员
class DeviceWorker
{
private:
const static int LW_QUERY_TIME = 200;
LWDeviceSocketWorker sockWorker;
ba::io_service& timerIOService;
typedef std::shared_ptr<ba::deadline_timer> TimerPtr;
TimerPtr queryTimer;
bool queryCycleWorking;
//[other definitions]
public:
ERes DeviceWorker::Connect()
{
ERes intRes = sockWorker.Connect(/*[connect settings here]*/);
if(intRes != LGS_RESULT_OK) {
//[set result to error]
}
else {
//[set result to success]
//start "query cycle"
StartNewCycleQuery();
}
return intRes;
}
//---------------------------------------------------------------------------
ERes DeviceWorker::Disconnect()
{
return sockWorker.Disconnect();
}
//---------------------------------------------------------------------------
void DeviceWorker::StartNewCycleQuery()
{
queryCycleWorking = true;
//start timer
queryTimer = make_shared<ba::deadline_timer>(timerIOService, bt::milliseconds(LW_QUERY_TIME));
queryTimer->async_wait(boost::bind(&DeviceWorker::HandleQueryTimer,
this, boost::asio::placeholders::error));
}
//---------------------------------------------------------------------------
void DeviceWorker::StopCycleQuery()
{
//kill timer
if (queryTimer)
queryTimer->cancel();
queryCycleWorking = false;
}
//---------------------------------------------------------------------------
//timer handler
void DeviceWorker::HandleQueryTimer(const b::system::error_code& error)
{
if (!error)
{
ProtoCmd cmd;
//query for first value
cmd.EncodeCommandCore(lw_service_proto::cmdGetAlarm, 1);
sockWorker.QueryCommand(cmd, boost::bind(&DeviceWorker::ReceiveAlarmCycle,
this, _1));
}
}
//---------------------------------------------------------------------------
//receive first value
void DeviceWorker::ReceiveAlarmCycle(ProtoAnswer adata)
{
//check and fix last bytes (remove \r\n from some commands)
adata.CheckAndFixFooter();
//[working with answer]
if (queryCycleWorking)
{
//query for second value
ProtoCmd cmd;
cmd.EncodeCommandCore(lw_service_proto::cmdGetEnergyLevel, 1);
sockWorker.QueryCommand(cmd, b::bind(&DeviceWorker::ReceiveEnergyCycle,
this, _1));
}
}
//---------------------------------------------------------------------------
//receive second value
void DeviceWorker::ReceiveEnergyCycle(ProtoAnswer edata)
{
//check and fix last bytes (remove \r\n from some commands)
edata.CheckAndFixFooter();
//[working with second value]
//start new "query cycle"
if (queryCycleWorking)
StartNewCycleQuery();
}
};
欢迎任何想法:)
修改 经过几次测试后,我看到了anower图片:
我还花了一些时间使用调试器和提升源并得出一些结论:
此时我被迫开启了增强1.53 ......
答案 0 :(得分:0)
如果您的应用程序在调用io_service::run()
的单个线程中正常工作但是有四个线程失败,那么您很可能遇到竞争条件。这类问题难以诊断。一般来说,您应该确保devSocket
最多只有一个未完成的async_read()
和async_write()
操作。您当前SocketWorker::QueryCommand()
的实施无条件调用async_write()
,这可能违反了排序假设documented
此操作是根据对零的或多个调用实现的 stream的
async_write_some
函数,被称为组合 操作。程序必须确保流不执行任何其他操作 写操作(例如async_write
,流的async_write_some
函数或执行写入的任何其他组合操作直到 此操作完成。
此问题的classic solution是维护传出消息的队列。如果先前的写入未完成,则将下一个传出消息附加到队列。上一次写入完成后,为队列中的下一条消息启动async_write()
。当使用多个线程调用io_service::run()
时,您可能需要使用链作为链接的答案。
答案 1 :(得分:0)