Why do I receive a Netlink ERRORMSG when using nl_recvmsgs?

Time: 2016-03-20 12:46:26

Tags: c linux linux-kernel netlink

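I am trying to use nl_recvmsgs as a blocking function to receive Netlink messages from a kernel module. In my example, the client sends a message to the kernel and then calls nl_recvmsgs_report() (which is equivalent to nl_recvmsgs). The kernel module then sends a reply message, and the client receives it successfully.

Now I want the client to listen for further messages, so it calls nl_recvmsgs_report() again. The kernel does not send a second message, yet somehow the client receives an ERRORMSG. This causes a SEGFAULT in the client, because it tries to parse the ERRORMSG as if it were a regular Generic Netlink message.

If I check whether the message type is 2 and skip the message parsing, the third call to nl_recvmsgs_report() works perfectly fine.

Does anyone know why the client receives this ERRORMSG?

See my GitHub branch. Just run make, sudo insmod nlk.ko, ./nlclient. I have copied only the relevant parts here.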

Client code

nlclient.c main(), send and receive part:

  // setup netlink socket
  sk = nl_socket_alloc();
  nl_socket_disable_seq_check(sk);  // disable sequence number check
  genl_connect(sk);

  int id = genl_ctrl_resolve(sk, DEMO_FAMILY_NAME);

  struct nl_msg * msg;


  // create a message
  msg = nlmsg_alloc();
  genlmsg_put(msg, NL_AUTO_PORT, NL_AUTO_SEQ, id, 0,    // hdrlen
                        0,  // flags
                        DEMO_CMD,   // numeric command identifier
                        DEMO_VERSION    // interface version
                       );

  nla_put_string(msg, DEMO_ATTR1_STRING, "hola");
  nla_put_u16(msg, DEMO_ATTR2_UINT16, 0xf1);

  // send it
  nl_send_auto(sk, msg);

  // handle reply
  struct nl_cb * cb = NULL;
  cb = nl_cb_alloc(NL_CB_CUSTOM);

  //nl_cb_set_all(cb, NL_CB_DEBUG, NULL, NULL);
  nl_cb_set_all(cb, NL_CB_CUSTOM, cb_handler, &cbarg);
  nl_cb_err(cb, NL_CB_DEBUG, NULL, NULL);

  int nrecv = nl_recvmsgs_report(sk, cb);

  printf("cbarg %d nrecv %d\n", cbarg, nrecv);

  printf("First test if it blocks here for incoming messages:\n");
  nrecv = nl_recvmsgs_report(sk, cb);

  printf("cbarg %d nrecv %d\n", cbarg, nrecv);

  printf("Second test if it blocks here for incoming messages:\n");
  nrecv = nl_recvmsgs_report(sk, cb);

  printf("cbarg %d nrecv %d\n", cbarg, nrecv);

nlclient.c cb_handler(), parsing the header and the message:

  struct nlmsghdr * hdr = nlmsg_hdr(msg);

  struct genlmsghdr * gnlh = nlmsg_data(hdr);

  nl_msg_dump(msg, stderr);

  if (hdr->nlmsg_type == 2) {
    printf("hdr->nlmsg_type is ERROR. Skipping message parsing!\n");    
  } else {

    int valid =
      genlmsg_validate(hdr, 0, DEMO_ATTR_MAX, demo_gnl_policy);
    printf("valid %d %s\n", valid, valid ? "ERROR" : "OK");

    // one way
    struct nlattr * attrs[DEMO_ATTR_MAX + 1];

    if (genlmsg_parse(hdr, 0, attrs, DEMO_ATTR_MAX, demo_gnl_policy) < 0)
      {
        printf("genlsmg_parse ERROR\n");
      }

    else
      {
        printf("genlsmg_parse OK\n");

        printf("attr1 %s\n", nla_get_string(attrs[DEMO_ATTR1_STRING]));
        printf("attr2 %x\n", nla_get_u16(attrs[DEMO_ATTR2_UINT16]));
        struct attr_custom * cp = (struct attr_custom *) nla_data(attrs[DEMO_ATTR3_CUSTOM]);
        printf("attr3 %d %ld %f %lf\n", cp->a, cp->b, cp->c,cp->d);

      }
    }
  // another way
  printf("gnlh->cmd %d\n", gnlh->cmd);  //--- DEMO_CMD_ECHO

  int remaining = genlmsg_attrlen(gnlh, 0);
  struct nlattr * attr = genlmsg_attrdata(gnlh, 0);

  while (nla_ok(attr, remaining))
    {
      printf("remaining %d\n", remaining);
      printf("attr @ %p\n", attr); // nla_get_string(attr)
      attr = nla_next(attr, &remaining);
    }

Kernel code

nlkernel.c demo_cmd(), send-to-client part:

/* send message back */
    /* allocate some memory, since the size is not yet known use NLMSG_GOODSIZE */
    skb = genlmsg_new(NLMSG_GOODSIZE, GFP_KERNEL);
    if (skb == NULL) {
        goto out;
    }

    /* create the message */
    msg_head =
        genlmsg_put(skb, 0, info->snd_seq + 1, &demo_gnl_family, 0,
            DEMO_CMD);

    if (msg_head == NULL) {
        rc = -ENOMEM;
        goto out;
    }

    rc |= nla_put_string(skb, DEMO_ATTR1_STRING,"world");
    rc |= nla_put_u16(skb, DEMO_ATTR2_UINT16, 0x1f);
    cp.a = 1;
    cp.b = 2;
    cp.c = 3.0;
    cp.d = 4.0;
    rc |= nla_put(skb, DEMO_ATTR3_CUSTOM, sizeof(struct attr_custom), &cp);

    if (rc != 0) {
        goto out;
    }

    /* finalize the message */
    genlmsg_end(skb, msg_head);

    /* send the message back */
    rc = genlmsg_unicast(&init_net, skb, info->snd_portid);

    if (rc != 0) {
        goto out;
    }

    return 0;

Output

nlclient console output

./nlclient 
--------------------------   BEGIN NETLINK MESSAGE ---------------------------
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 76
    .type = 27 <0x1b>
    .flags = 0 <>
    .seq = 1458476257
    .port = 0
  [GENERIC NETLINK HEADER] 4 octets
    .cmd = 1
    .version = 1
    .unused = 0
  [PAYLOAD] 56 octets
    0a 00 01 00 77 6f 72 6c 64 00 00 00 06 00 02 00 ....world.......
    1f 00 00 00 24 00 03 00 01 00 00 00 ff ff ff ff ....$...........
    02 00 00 00 00 00 00 00 00 00 40 40 04 88 ff ff ..........@@....
    00 00 00 00 00 00 10 40                         .......@
---------------------------  END NETLINK MESSAGE   ---------------------------
valid 0 OK
genlsmg_parse OK
attr1 world
attr2 1f
attr3 1 2 3.000000 4.000000
gnlh->cmd 1
remaining 56
attr @ 0x10df344
remaining 44
attr @ 0x10df350
remaining 36
attr @ 0x10df358
cbarg 123 nrecv 1
First test if it blocks here for incoming messages:
--------------------------   BEGIN NETLINK MESSAGE ---------------------------
  [NETLINK HEADER] 16 octets
    .nlmsg_len = 36
    .type = 2 <ERROR>
    .flags = 0 <>
    .seq = 1458476256
    .port = -1061151077
  [ERRORMSG] 20 octets
    .error = 0 "Success"
  [ORIGINAL MESSAGE] 16 octets
    .nlmsg_len = 16
    .type = 27 <0x1b>
    .flags = 5 <REQUEST,ACK>
    .seq = 1458476256
    .port = -1061151077
---------------------------  END NETLINK MESSAGE   ---------------------------
hdr->nlmsg_type is ERROR. Skipping message parsing!
gnlh->cmd 0
cbarg 123 nrecv 1
Second test if it blocks here for incoming messages:

Kernel syslog

kernel: [ 4694.318428] got demo_cmd
kernel: [ 4694.318430] attr1: hola
kernel: [ 4694.318431] attr2: f1

1 answer:

Answer 0 (score: 0)

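Sorry it took so long.

The output is a bit misleading. That is not an error message; it is an automatic ACK. Netlink defines ACKs to be "error" messages with error code 0.

(Zero is the typical C idiom for success.)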
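To make the distinction concrete, here is a minimal sketch that tells an ACK apart from a real error using only the definitions in `<linux/netlink.h>`. The names `classify_nlmsg`, `fake_errmsg` and `make_error_msg` are made up for illustration and are not part of libnl or of the question's code; the fake message just mimics the second dump above (length 36, type 2, error 0, original request of type 27 with flags 5).

```c
#include <linux/netlink.h>
#include <string.h>

/* Hypothetical helper (not a libnl function): classify a raw Netlink header.
 * Returns  1 for an ACK (NLMSG_ERROR carrying error code 0),
 *         -1 for a real error or a truncated error message,
 *          0 for any other message type. */
static int classify_nlmsg(const struct nlmsghdr *hdr)
{
    if (hdr->nlmsg_type != NLMSG_ERROR)
        return 0;

    /* An error message must be large enough to hold struct nlmsgerr. */
    if (hdr->nlmsg_len < NLMSG_LENGTH(sizeof(struct nlmsgerr)))
        return -1;

    const struct nlmsgerr *err = NLMSG_DATA(hdr);
    return err->error == 0 ? 1 : -1;
}

/* Illustration only: header followed directly by the error payload,
 * which is the layout NLMSG_DATA() expects. */
struct fake_errmsg {
    struct nlmsghdr hdr;
    struct nlmsgerr err;
};

/* Build a message shaped like the second dump in the question. */
static struct fake_errmsg make_error_msg(int error)
{
    struct fake_errmsg m;

    memset(&m, 0, sizeof(m));
    m.hdr.nlmsg_len = NLMSG_LENGTH(sizeof(struct nlmsgerr)); /* 36 */
    m.hdr.nlmsg_type = NLMSG_ERROR;                          /* 2 */
    m.err.error = error;                 /* 0 means ACK, negative errno otherwise */
    m.err.msg.nlmsg_len = NLMSG_HDRLEN;  /* 16, header of the original request */
    m.err.msg.nlmsg_type = 27;           /* family id seen in the dump */
    m.err.msg.nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;
    return m;
}
```

A check like this could sit at the top of cb_handler() instead of the bare comparison against 2, so that the Generic Netlink parsing is only reached for messages that are not control messages.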

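Since you are crafting your own response, you probably do not need the ACK anyway. You can stop the client from requesting ACKs by adding a call to nl_socket_disable_auto_ack().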
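You can see the request for the ACK in the dump itself: the [ORIGINAL MESSAGE] part shows flags = 5 <REQUEST,ACK>. As a small sketch (the helper name `requests_ack` is made up for illustration), this is how that value decodes with the flag constants from `<linux/netlink.h>`. My understanding is that libnl adds NLM_F_ACK on send while auto-ACK is enabled, which is why the flag appears even though the client passed 0 as flags to genlmsg_put().

```c
#include <linux/netlink.h>

/* Hypothetical helper (not a libnl function): does a request with these
 * header flags ask the kernel to send back an ACK? */
static int requests_ack(unsigned int nlmsg_flags)
{
    return (nlmsg_flags & NLM_F_ACK) != 0;
}
```

With the value from the dump, requests_ack(5) is true, because 5 is NLM_F_REQUEST (0x01) combined with NLM_F_ACK (0x04). Once auto-ACK is disabled, the flags of the outgoing request should no longer contain NLM_F_ACK, and the kernel has no reason to send the extra message.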

I would put it right next to the call that disables the sequence check, since the two are somewhat similar:

sk = nl_socket_alloc();
nl_socket_disable_seq_check(sk);
nl_socket_disable_auto_ack(sk);
genl_connect(sk);