Scanset in sscanf with string delimiters

Date: 2014-02-20 14:43:49

Tags: c parsing format scanf

I have an ASCII string of the following form:

00001\x02This is a string\x030000100001\x021.0\x03\x021.0\x03\x021.0\x03\x021.0\x03001

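In descriptive terms, it is a string made up of a 5-digit zero-padded integer, a string enclosed by the STX and ETX ASCII characters, two more 5-digit zero-padded integers, 4 floating-point values each enclosed by STX and ETX, and finally a 3-digit zero-padded integer. I am trying to parse the string with sscanf, and it is not behaving as it should. I am using the following format string: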

"%5hu\x02%[0-9a-zA-Z _-]s\x03%5hu%05hu\x02%lf\x03\x02%lf\x03\x02%lf\x03\x02%lf\x03%3hhu"

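I have tried changing the scanset to different values, including [^\x03]. I have also tried adding and removing length specifiers. I am fairly sure the scanset is what causes the problem. I wonder whether it has trouble with the STX and ETX character literals. Does anyone know why this is not working? Or is there a better alternative using plain C89? Thanks.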

For those who want complete code to test with:

unsigned short one;
char two[32];
unsigned short three, four;
double five, six, seven, eight;
unsigned char nine;
char temp[] = "00001\x02This is a string\x03""0000100002\x02""1.1\x03\x02""2.2\x03\x02""3.3\x03\x02""4.5\x03""003";
sscanf(temp, "%5hu\x02%[-0-9a-zA-Z _]s\x03%5hu%05hu\x02%lf\x03\x02%lf\x03\x02%lf\x03\x02%lf\x03%3hhu",
       &one, two, &three, &four, &five, &six, &seven, &eight, &nine);

4 answers:

Answer 0 (score: 1)

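I can't immediately see why it isn't working, but I would cut it down to a single short string and format, then make them longer step by step until the call fails.

Personally, I would parse it with strtok (on the STX and ETX boundaries) and then use scanf to read the individual floats and integers.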
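The strtok idea can be sketched roughly as follows. This is only an illustration under assumptions: `struct record`, its field names, and `parse_record` are hypothetical and do not come from the question, and the last field is read into an `unsigned short` so that only C89 conversions are used.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record type; the field names are illustrative only. */
struct record {
    unsigned short id;
    char text[32];
    unsigned short a, b;
    double v[4];
    unsigned short tail;
};

/* Splits buf in place on STX/ETX, then converts each piece.
   Returns 1 on success, 0 on any failure. buf is modified by strtok. */
int parse_record(char *buf, struct record *r)
{
    static const char delims[] = "\x02\x03";
    char *tok;
    int i;

    tok = strtok(buf, delims);                  /* leading 5-digit integer */
    if (tok == NULL || sscanf(tok, "%5hu", &r->id) != 1)
        return 0;

    tok = strtok(NULL, delims);                 /* enclosed text */
    if (tok == NULL || strlen(tok) >= sizeof r->text)
        return 0;
    strcpy(r->text, tok);

    tok = strtok(NULL, delims);                 /* two 5-digit integers */
    if (tok == NULL || sscanf(tok, "%5hu%5hu", &r->a, &r->b) != 2)
        return 0;

    for (i = 0; i < 4; i++) {                   /* four enclosed doubles */
        tok = strtok(NULL, delims);
        if (tok == NULL || sscanf(tok, "%lf", &r->v[i]) != 1)
            return 0;
    }

    tok = strtok(NULL, delims);                 /* trailing 3-digit integer */
    if (tok == NULL || sscanf(tok, "%3hu", &r->tail) != 1)
        return 0;
    return 1;
}
```

One caveat with this approach: strtok treats a run of delimiters as a single separator, so an empty enclosed string (STX immediately followed by ETX) would shift every later field by one. It also writes into the buffer, so the input must be a modifiable array rather than a string literal.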

Answer 1 (score: 1)

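Note: octal escapes are limited to at most 3 digits after the backslash, but hexadecimal escapes are not limited to two or three hex digits, so \x0300000100001 is all a single character.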

GCC warns me:

ssss.c:6:1: error: hex escape sequence out of range [-Werror]
 "00001\x02This is a string\x030000100001\x021.0\x03\x021.0\x03\x021.0\x03\x021.0\x03001";
 ^
ssss.c:6:1: error: hex escape sequence out of range [-Werror]

(Judging by the edit to the question, you have already realized this.)
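To make the escape-length rule concrete, here is a small sketch; the names `split_hex`, `octal`, and `escapes_agree` are made up for illustration. It compares a hex escape that is cut off by ending the string literal with the equivalent three-digit octal escape.

```c
#include <string.h>

/* Escape sequences are converted before adjacent literals are joined, so
   the hex escape ends at the closing quote: ETX followed by the digit '0'. */
static const char split_hex[] = "\x03" "0";

/* An octal escape takes at most three digits, so no split is needed. */
static const char octal[] = "\0030";

/* Returns 1 when both spellings produce the same two characters
   (plus the terminating null, hence a size of 3). */
int escapes_agree(void)
{
    return sizeof split_hex == 3
        && sizeof octal == 3
        && memcmp(split_hex, octal, sizeof octal) == 0;
}
```

Writing `"\x030"` as one literal would instead ask for a single character with value 0x30, which is not what the data format needs.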

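Also, note that a scanset stands alone; it is not a qualifier of the s conversion specifier. Your format string looks for a literal s in the data after the characters matched by the scanset, and since the scanset consumes any s before the literal gets a chance to match, one will never be found. This is your actual problem; remove the s after the scanset %[0-9a-zA-Z _-].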
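A reduced case may make the effect easier to see. The two helper functions below are hypothetical and exist only for this demonstration; they differ in nothing but the stray s after the scanset.

```c
#include <stdio.h>

/* The 's' after the scanset is an ordinary character that must appear in
   the input right after the letters the scanset has consumed.
   word must have room for at least 16 bytes. */
int with_stray_s(const char *input, char *word, int *n)
{
    return sscanf(input, "%15[a-z]s:%d", word, n);
}

/* Same format without the stray 's'. */
int without_s(const char *input, char *word, int *n)
{
    return sscanf(input, "%15[a-z]:%d", word, n);
}
```

With the input `"abc:12"`, the first function stops after one conversion because the literal s is compared against `:`, while the second performs both conversions. Feeding `"abcs:12"` to the first function does not help either, since the scanset has already taken the s.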

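This code works. Note the careful splitting of the data string so that the hex escapes terminate where you want them to. The split in the format string just simplifies the presentation. C concatenates adjacent string literals, which is very useful here.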

#include <stdio.h>

int main(void)
{
    char const data[] =
        "00001\x02This is a string\x03" "0000100001\x02" "1.0\x03\x02"
        "1.0\x03\x02" "1.0\x03\x02" "1.0\x03" "001";
    char const format[] =
        "%5hu\x02%[0-9a-zA-Z _-]\x03%5hu%05hu\x02%lf\x03\x02%lf\x03\x02"
        "%lf\x03\x02%lf\x03%3hhu";

    unsigned short i1;
    char s2[20];
    unsigned short i3;
    unsigned short i4;
    double d5;
    double d6;
    double d7;
    double d8;
    unsigned char i9;
    int rc;

    if ((rc = sscanf(data, format, &i1, s2, &i3, &i4, &d5, &d6, &d7, &d8, &i9)) != 9)
        printf("sscanf failed - %d conversions\n", rc);
    else
        printf("i1 = %d; s2 = [%s]; i3 = %d; i4 = %d; d5 = %f;\n"
               "d6 = %f; d7 = %f; d8 = %f; i9 = %d\n",
               i1, s2, i3, i4, d5, d6, d7, d8, i9);
    return 0;
}

Sample output:

i1 = 1; s2 = [This is a string]; i3 = 1; i4 = 1; d5 = 1.000000;
d6 = 1.000000; d7 = 1.000000; d8 = 1.000000; i9 = 1

I added the following code before the return 0; in main(). There is no s after the scanset; when I left the s in, sscanf() returned 2, not 9.

    unsigned short one;
    char two[32];
    unsigned short three, four;
    double five, six, seven, eight;
    unsigned char nine;
    char temp[] = "00001\x02This is a string\x03""0000100002\x02""1.1\x03\x02""2.2\x03\x02""3.3\x03\x02""4.5\x03""003";
    if ((rc = sscanf(temp, "%5hu\x02%[-0-9a-zA-Z _]\x03%5hu%05hu\x02%lf\x03\x02%lf\x03\x02%lf\x03\x02%lf\x03%3hhu",
           &one, two, &three, &four, &five, &six, &seven, &eight, &nine)) != 9)
        printf("sscanf failed - %d conversions\n", rc);
    else
        printf("i1 = %d; s2 = [%s]; i3 = %d; i4 = %d; d5 = %f;\n"
               "d6 = %f; d7 = %f; d8 = %f; i9 = %d\n",
               one, two, three, four, five, six, seven, eight, nine);

Tested with GCC 4.8.2 on Mac OS X 10.9.1 Mavericks.

Answer 2 (score: 1)

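The format string is too easy to get confused by. Consider breaking it into pieces (easier to understand and maintain) and checking the result: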

#define Int5 "%5hu"
// Note:  no 0 ^

#define STX  "\x02"
#define ETX  "\x03"
// Could use hexadecimal constants here as the string is broken up.

#define EncStr STX "%31[0-9a-zA-Z _-]" ETX
// Note:                        no s ^   (per @Jonathan's comment, s is not part of %[])
// String limit      ^

#define FP    STX "%lf" ETX
#define Int3  "%3hhu"

if (9 == sscanf(temp, Int5 EncStr Int5 Int5 FP FP FP FP Int3, 
    &one, two, &three, &four, &five, &six, &seven, &eight, &nine)) Success();

Note: temp needs to be split up so that the hex escapes end where intended, or it can use octal escapes instead:

char temp[]  = "00001\x02This is a string\x03" "0000100001\x02" "1.0\x03\x02" 
     "1.0\x03\x02" "1.0\x03\x02" "1.0\x03" "001";
char temp[]  = "00001\002This is a string\0030000100001\0021.0\003\0021.0\003\0021.0\003\0021.0\003001";
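The `%31[...]` width in `EncStr` is what keeps the scanset from writing past the 32-byte `two` buffer. As a small sketch of that behaviour, using a hypothetical helper `read_word7` and a deliberately tiny buffer:

```c
#include <stdio.h>

/* Reads at most 7 lowercase letters from the start of input into out,
   which must hold at least 8 bytes (7 characters plus the terminator). */
int read_word7(const char *input, char *out)
{
    return sscanf(input, "%7[a-z]", out);
}
```

Given a longer run of letters, the conversion still succeeds but stores only the first 7 characters; the remainder stays in the input for whatever directive comes next, so a following literal such as ETX would then fail to match and the return value would reveal the problem.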

Answer 3 (score: 1)

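So, I have determined that the scanset is broken in this version of the C runtime library. In the following code, one call works and the other does not: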

unsigned short one;
char two[32];
unsigned short three, four;
double five, six, seven, eight;
unsigned char nine;
int rc1, rc2;

char temp[] = "00001\x02string\x03""0000100002\x02""1.1\x03\x02""2.2\x03\x02""3.3\x03\x02""4.5\x03""003";

char format1[] = "%5hu\x02%[a-z]\x03%5hu%5hu\x02%lf\x03\x02%lf\x03\x02%lf\x03\x02%lf\x03%3hhu";
char format2[] = "%5hu\x02%s\x03%5hu%5hu\x02%lf\x03\x02%lf\x03\x02%lf\x03\x02%lf\x03%3hhu";

rc1 = sscanf(temp, format1, &one, two, &three, &four, &five, &six, &seven, &eight, &nine);   
rc2 = sscanf(temp, format2, &one, two, &three, &four, &five, &six, &seven, &eight, &nine);

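In the code above, rc1 reports 1 item successfully scanned and rc2 reports 9 items successfully scanned. My conclusion is therefore that the scanset does not work correctly with this hardware/software combination. Does anyone have any other ideas or conclusions? Thanks for the help. I have not accepted any answer as the solution, but I did upvote the useful ones.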
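If the runtime's scanset support really cannot be relied on, one plain C89 option is to avoid it and locate the STX/ETX pair by hand. The sketch below is an assumption about how that might look, not something taken from the answers; `copy_field` and the `STX`/`ETX` macros are names invented for the example.

```c
#include <stddef.h>
#include <string.h>

#define STX '\x02'
#define ETX '\x03'

/* Copies the text between the first STX at or after p and the next ETX
   into out (capacity cap, including the terminator). Returns a pointer
   just past that ETX, or NULL if a delimiter is missing or the field
   does not fit. The input string is not modified. */
const char *copy_field(const char *p, char *out, size_t cap)
{
    const char *start = strchr(p, STX);
    const char *end;
    size_t len;

    if (start == NULL)
        return NULL;
    start++;                          /* skip the STX itself */
    end = strchr(start, ETX);
    if (end == NULL)
        return NULL;
    len = (size_t)(end - start);
    if (len >= cap)                   /* leave room for the terminator */
        return NULL;
    memcpy(out, start, len);
    out[len] = '\0';
    return end + 1;
}
```

The returned pointer can be passed to the next call, or to sscanf with a simple format such as `"%5hu%5hu"` for the fixed-width integers. Unlike a strtok-based split, this handles an empty enclosed field and leaves the input untouched. Note also that the `hh` length modifier used in the formats above was added in C99, so a strictly C89 library is not required to understand `%3hhu`.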