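Adding UNUSED elements to a C/C++ struct both speeds up and slows down code execution

Time: 2015-05-04 19:14:35

Tags: c++ optimization arduino avr

I wrote the following struct for use in an Arduino software PWM library I'm making, to PWM multiple pins at once on an Uno, or up to 70 pins on a Mega.

As written, the ISR portion of the code (eRCaGuy_SoftwarePWMupdate()), which processes an array of this struct, takes 133us to run. Very strangely, however, if I uncomment the line "byte flags1;" (in the struct), the ISR now takes 158us to run, even though flags1 is not used anywhere yet. Then, if I also uncomment "byte flags2;" so that BOTH flags are uncommented, the run time drops back to where it was before (133us).

Why is this happening?! And how do I fix it? (That is, I want consistently fast code for this particular function, not inexplicably fickle code.) Adding one byte slows the code down considerably, but adding two changes nothing at all.

I'm trying to optimize the code (and I also need to add another feature that requires a byte for flags), but I don't understand why adding one unused byte slows the code down by 25us while adding two unused bytes doesn't change the run time at all.

I need to understand this to make sure my optimizations are consistent.

In the .h file (my original struct, using a C-style typedef'ed struct):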

typedef struct softPWMpin //global struct
{
  //VOLATILE VARIABLES (WILL BE ACCESSED IN AND OUTSIDE OF ISRs)
  //for pin write access:
  volatile byte pinBitMask;
  volatile byte* volatile p_PORT_out; //pointer to port output register; NB: the 1st "volatile" says the port itself (1 byte) is volatile, the 2nd "volatile" says the *pointer* itself (2 bytes, pointing to the port) is volatile.
  //for PWM output:
  volatile unsigned long resolution;
  volatile unsigned long PWMvalue; //Note: duty cycle = PWMvalue/(resolution - 1) = PWMvalue/topValue;
                          //ex: if resolution is 256, topValue is 255
                          //if PWMvalue = 255, duty_cycle = PWMvalue/topValue = 255/255 = 1 = 100%
                          //if PWMvalue = 50, duty_cycle = PWMvalue/topValue = 50/255 = 0.196 = 19.6%
  //byte flags1;
  //byte flags2;

  //NON-VOLATILE VARIABLES (WILL ONLY BE ACCESSED INSIDE AN ISR, OR OUTSIDE AN ISR, BUT NOT BOTH)
  unsigned long counter; //incremented each time update() is called; goes back to zero after reaching topValue; does NOT need to be volatile, since only the update function updates this (it is read-to or written from nowhere else)
} softPWMpin_t; 

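In the .h file (new, using a C++-style struct, as suggested in the comments, to see whether it makes any difference; it appears to make no difference at all, in either run time or compiled size):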

struct softPWMpin //global struct
{
  //VOLATILE VARIABLES (WILL BE ACCESSED IN AND OUTSIDE OF ISRs)
  //for pin write access:
  volatile byte pinBitMask;
  volatile byte* volatile p_PORT_out; //pointer to port output register; NB: the 1st "volatile" says the port itself (1 byte) is volatile, the 2nd "volatile" says the *pointer* itself (2 bytes, pointing to the port) is volatile.
  //for PWM output:
  volatile unsigned long resolution;
  volatile unsigned long PWMvalue; //Note: duty cycle = PWMvalue/(resolution - 1) = PWMvalue/topValue;
                          //ex: if resolution is 256, topValue is 255
                          //if PWMvalue = 255, duty_cycle = PWMvalue/topValue = 255/255 = 1 = 100%
                          //if PWMvalue = 50, duty_cycle = PWMvalue/topValue = 50/255 = 0.196 = 19.6%
  //byte flags1;
  //byte flags2;

  //NON-VOLATILE VARIABLES (WILL ONLY BE ACCESSED INSIDE AN ISR, OR OUTSIDE AN ISR, BUT NOT BOTH)
  unsigned long counter; //incremented each time update() is called; goes back to zero after reaching topValue; does NOT need to be volatile, since only the update function updates this (it is read-to or written from nowhere else)
}; 

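In the .cpp file (this is where I create the array of structs, and where the update function lives; it is called at a fixed rate from an ISR driven by a timer interrupt):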

//static softPWMpin_t PWMpins[MAX_NUMBER_SOFTWARE_PWM_PINS]; //C-style, old, MAX_NUMBER_SOFTWARE_PWM_PINS = 20; static to give it file scope only
static softPWMpin PWMpins[MAX_NUMBER_SOFTWARE_PWM_PINS]; //C++-style, old, MAX_NUMBER_SOFTWARE_PWM_PINS = 20; static to give it file scope only

//This function must be placed within an ISR, to be called at a fixed interval
void eRCaGuy_SoftwarePWMupdate()
{
  //Forced nonatomic block (ie: interrupts *enabled*)
  byte SREG_old = SREG; //[1 clock cycle]
  interrupts(); //[1 clock cycle] turn interrupts ON to allow *nested interrupts* (ex: handling of time-sensitive timing, such as reading incoming PWM signals or counting Timer2 overflows)
  {    
    //first, increment all counters of attached pins (ie: where the value != PIN_NOT_ATTACHED)
    //pinMapArray
    for (byte pin=0; pin<NUM_DIGITAL_PINS; pin++)
    {
      byte i = pinMapArray[pin]; //[2 clock cycles: 0.125us]; No need to turn off interrupts to read this volatile variable here since reading pinMapArray[pin] is an atomic operation (since it's a single byte)
      if (i != PIN_NOT_ATTACHED) //if the pin IS attached, increment counter and decide what to do with pin...
      {
        //Read volatile variables ONE time, all at once, to optimize code (volatile variables take more time to read [I know] since their values can't be recalled from registers [I believe]).
        noInterrupts(); //[1 clock cycle] turn off interrupts to read non-atomic volatile variables that could be updated simultaneously right now in another ISR, since nested interrupts are enabled here
        unsigned long resolution = PWMpins[i].resolution;
        unsigned long PWMvalue = PWMpins[i].PWMvalue;
        volatile byte* p_PORT_out = PWMpins[i].p_PORT_out; //[0.44us raw: 5 clock cycles, 0.3125us]
        interrupts(); //[1 clock cycle]

        //handle edge cases FIRST (PWMvalue==0 and PWMvalue==topValue), since if an edge case exists we should NOT do the main case handling below
        if (PWMvalue==0) //the PWM command is 0% duty cycle
        {
          fastDigitalWrite(p_PORT_out,PWMpins[i].pinBitMask,LOW); //write LOW [1.19us raw: 17 clock cycles, 1.0625us]
        }
        else if (PWMvalue==resolution-1) //the PWM command is 100% duty cycle
        {
          fastDigitalWrite(p_PORT_out,PWMpins[i].pinBitMask,HIGH); //write HIGH [0.88us raw; 12 clock cycles, 0.75us]
        }
        //THEN handle main cases (PWMvalue is > 0 and < topValue)
        else //(0% < PWM command < 100%)
        {
          PWMpins[i].counter++; //not volatile
          if (PWMpins[i].counter >= resolution)
          {
            PWMpins[i].counter = 0; //reset
            fastDigitalWrite(p_PORT_out,PWMpins[i].pinBitMask,HIGH);
          }
          else if (PWMpins[i].counter>=PWMvalue)
          {
            fastDigitalWrite(p_PORT_out,PWMpins[i].pinBitMask,LOW); //write LOW [1.18us raw: 17 clock cycles, 1.0625us]
          }
        }
      }
    }
  }
  SREG = SREG_old; //restore interrupt enable status
}
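To experiment with the counter logic above without a board attached, the per-pin branch structure can be mirrored in a small host-side simulation. This is only a sketch: SimPin, simUpdate and simCountHigh are hypothetical names that are not part of the library, and the port write is replaced by a plain bool.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side stand-in for one entry of PWMpins[]. The port
// pointer and bit mask are replaced by a bool so no hardware is needed.
struct SimPin {
  uint32_t resolution; // same meaning as softPWMpin::resolution
  uint32_t PWMvalue;   // same meaning as softPWMpin::PWMvalue
  uint32_t counter;    // same meaning as softPWMpin::counter
  bool     level;      // simulated pin state: true = HIGH, false = LOW
};

// One call corresponds to one pass of the update function over one pin.
inline void simUpdate(SimPin &p) {
  assert(p.resolution > 0);
  if (p.PWMvalue == 0) {                        // 0% duty cycle
    p.level = false;
  } else if (p.PWMvalue == p.resolution - 1) {  // 100% duty cycle
    p.level = true;
  } else {                                      // main case
    p.counter++;
    if (p.counter >= p.resolution) {
      p.counter = 0;                            // reset, write HIGH
      p.level = true;
    } else if (p.counter >= p.PWMvalue) {
      p.level = false;                          // write LOW
    }
  }
}

// Runs `warmup` updates to reach steady state, then counts how many of the
// next `ticks` updates leave the simulated pin HIGH.
inline uint32_t simCountHigh(SimPin p, uint32_t warmup, uint32_t ticks) {
  for (uint32_t t = 0; t < warmup; t++) simUpdate(p);
  uint32_t high = 0;
  for (uint32_t t = 0; t < ticks; t++) {
    simUpdate(p);
    if (p.level) high++;
  }
  return high;
}
```

For example, simCountHigh(SimPin{256, 50, 0, false}, 256, 256) counts the HIGH ticks over one full counter period, which can then be compared against the intended duty cycle.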

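Update (May 4, 2015, 8:58 PM):

I tried changing the alignment via the aligned attribute. My compiler is gcc.

Here is how I modified the struct in the .h file to add the attribute (it's on the last line). Note that I also reordered the struct members from largest to smallest: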

struct softPWMpin //C++ style
{
  volatile unsigned long resolution;
  volatile unsigned long PWMvalue; //Note: duty cycle = PWMvalue/(resolution - 1) = PWMvalue/topValue;
                          //ex: if resolution is 256, topValue is 255
                          //if PWMvalue = 255, duty_cycle = PWMvalue/topValue = 255/255 = 1 = 100%
                          //if PWMvalue = 50, duty_cycle = PWMvalue/topValue = 50/255 = 0.196 = 19.6%
  unsigned long counter; //incremented each time update() is called; goes back to zero after reaching topValue; does NOT need to be volatile, since only the update function updates this (it is read-to or written from nowhere else)
  volatile byte* volatile p_PORT_out; //pointer to port output register; NB: the 1st "volatile" says the port itself (1 byte) is volatile, the 2nd "volatile" says the *pointer* itself (2 bytes, pointing to the port) is volatile.
  volatile byte pinBitMask;

  // byte flags1;
  // byte flags2;
} __attribute__ ((aligned));

Source: https://gcc.gnu.org/onlinedocs/gcc-3.1/gcc/Type-Attributes.html

Here is what I have tried so far:

__attribute__ ((aligned));  
__attribute__ ((aligned(1)));  
__attribute__ ((aligned(2)));  
__attribute__ ((aligned(4)));  
__attribute__ ((aligned(8)));  

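None of them seems to fix the problem I see when I add one flag byte. With the flag bytes left commented out, alignments of 2 through 8 make the run time longer than 133us, while an alignment of 1 makes no difference (the run time stays at 133us), implying that this is what already happens with no attribute at all. In addition, even when I use alignment options of 2, 4, or 8, the sizeof(PWMvalue) function still returns the exact number of bytes in the struct, with no extra padding.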
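One way to double-check what the attribute does to a type is to compare sizeof and alignof with and without it. The sketch below is for a desktop GCC or Clang, not avr-gcc, so what the AVR toolchain reports may differ and is not verified here. PinPacked and PinAligned4 are hypothetical names; uint16_t stands in for the 2-byte AVR pointer, and packed is an assumption used only to get a 15-byte layout on a desktop compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side mirror of the reordered struct (no attribute
// beyond `packed`, which is assumed here to reproduce the 15-byte size).
struct __attribute__((packed)) PinPacked {
  uint32_t resolution;
  uint32_t PWMvalue;
  uint32_t counter;
  uint16_t p_PORT_out;  // stand-in for a 2-byte AVR data pointer
  uint8_t  pinBitMask;
};

// Same members, with a 4-byte alignment request added to the type.
struct __attribute__((packed, aligned(4))) PinAligned4 {
  uint32_t resolution;
  uint32_t PWMvalue;
  uint32_t counter;
  uint16_t p_PORT_out;
  uint8_t  pinBitMask;
};

// Number of trailing padding bytes the alignment request adds to the type.
inline size_t paddingFromAligned4() {
  assert(sizeof(PinAligned4) >= sizeof(PinPacked));
  return sizeof(PinAligned4) - sizeof(PinPacked);
}
```

Applying sizeof to the struct type itself, rather than to one of its members, is what shows whether any tail padding was added.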

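...still don't know what's going on...

Update, 11:02 PM:

(See the comments below.) The optimization level definitely has an effect. For example, when I changed the compiler optimization level from -Os to -O2, the base case stayed at 133us (as before), uncommenting flags1 gave me 120us (vs. 158us), and uncommenting both flags1 and flags2 gave me 132us (vs. 133us). This still doesn't answer my question, but at least I now know that optimization levels exist and how to change them.

Summary of the above paragraph:

Processing time (of the eRCaGuy_SoftwarePWMupdate() function)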
Optimization   No flags     w/flags1     w/flags1+flags2
Os             133us        158us        133us
O2             132us        120us        132us

Memory Use (bytes: flash/global vars SRAM/sizeof(softPWMpin)/sizeof(PWMpins))
Optimization   No flags          w/flags1          w/flags1+flags2
Os             4020/591/15/300   3950/611/16/320   4020/631/17/340
O2             4154/591/15/300   4064/611/16/320   4154/631/17/340
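The sizeof columns in the memory table follow directly from the member sizes on AVR (1-byte byte, 2-byte pointer, 4-byte unsigned long) with no padding between members. The arithmetic can be sketched as below; softPwmPinSize and pwmPinsArraySize are hypothetical helper names, and the pin count of 20 is taken from the MAX_NUMBER_SOFTWARE_PWM_PINS comment in the .cpp file.

```cpp
#include <cassert>
#include <cstddef>

// Taken from the comment on the PWMpins[] declaration.
const size_t kNumPins = 20;

// Size of one softPWMpin on AVR, assuming no padding between members.
inline size_t softPwmPinSize(size_t flagBytes) {
  assert(flagBytes <= 2);            // only flags1 and flags2 appear in the question
  const size_t pinBitMask  = 1;      // byte
  const size_t portPointer = 2;      // AVR data pointer
  const size_t threeLongs  = 3 * 4;  // resolution, PWMvalue, counter
  return pinBitMask + portPointer + threeLongs + flagBytes;
}

// Size of the whole PWMpins[] array.
inline size_t pwmPinsArraySize(size_t flagBytes) {
  return kNumPins * softPwmPinSize(flagBytes);
}
```

The 20-byte steps in the global-variable SRAM column (591, 611, 631) are consistent with one extra byte per array element.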

Update (May 5, 2015, 4:05 PM):

  • Just updated the tables above with more detailed information.
  • Added resources below.

Resources:

Sources for gcc compiler optimization levels:
  - https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
  - https://gcc.gnu.org/onlinedocs/gnat_ugn/Optimization-Levels.html
  - http://www.rapidtables.com/code/linux/gcc/gcc-o.htm

How to change compiler settings in the Arduino IDE:
  - http://www.instructables.com/id/Arduino-IDE-16x-compiler-optimisations-faster-code/

Structure packing information:
  - http://www.catb.org/esr/structure-packing/

Data alignment:
  - http://www.songho.ca/misc/alignment/dataalign.html

Writing efficient C code for 8-bit Atmel AVR microcontrollers:
  - AVR035: Efficient C Coding for AVR - doc1497 - http://www.atmel.com/images/doc1497.pdf
  - AVR4027: Tips and Tricks to Optimize Your C Code for 8-bit AVR Microcontrollers - doc8453 - http://www.atmel.com/images/doc8453.pdf

Other information that may help you answer my question:

FOR NO FLAGS (flags1 and flags2 commented out) and Os optimization
Build preferences (from the buildprefs.txt file, in the directory where Arduino emits the compiled code):
For me: "C:\Users\Gabriel\AppData\Local\Temp\build8427371380606368699.tmp"

build.arch = AVR
build.board = AVR_UNO
build.core = arduino
build.core.path = C:\Program Files (x86)\Arduino\hardware\arduino\avr\cores\arduino
build.extra_flags = 
build.f_cpu = 16000000L
build.mcu = atmega328p
build.path = C:\Users\Gabriel\AppData\Local\Temp\build8427371380606368699.tmp
build.project_name = software_PWM_fade13_speed_test2.cpp
build.system.path = C:\Program Files (x86)\Arduino\hardware\arduino\avr\system
build.usb_flags = -DUSB_VID={build.vid} -DUSB_PID={build.pid} '-DUSB_MANUFACTURER={build.usb_manufacturer}' '-DUSB_PRODUCT={build.usb_product}'
build.usb_manufacturer = 
build.variant = standard
build.variant.path = C:\Program Files (x86)\Arduino\hardware\arduino\avr\variants\standard
build.verbose = true
build.warn_data_percentage = 75
compiler.S.extra_flags = 
compiler.S.flags = -c -g -x assembler-with-cpp
compiler.ar.cmd = avr-ar
compiler.ar.extra_flags = 
compiler.ar.flags = rcs
compiler.c.cmd = avr-gcc
compiler.c.elf.cmd = avr-gcc
compiler.c.elf.extra_flags = 
compiler.c.elf.flags = -w -Os -Wl,--gc-sections
compiler.c.extra_flags = 
compiler.c.flags = -c -g -Os -w -ffunction-sections -fdata-sections -MMD
compiler.cpp.cmd = avr-g++
compiler.cpp.extra_flags = 
compiler.cpp.flags = -c -g -Os -w -fno-exceptions -ffunction-sections -fdata-sections -fno-threadsafe-statics -MMD
compiler.elf2hex.cmd = avr-objcopy
compiler.elf2hex.extra_flags = 
compiler.elf2hex.flags = -O ihex -R .eeprom
compiler.ldflags = 
compiler.objcopy.cmd = avr-objcopy
compiler.objcopy.eep.extra_flags = 
compiler.objcopy.eep.flags = -O ihex -j .eeprom --set-section-flags=.eeprom=alloc,load --no-change-warnings --change-section-lma .eeprom=0
compiler.path = {runtime.ide.path}/hardware/tools/avr/bin/
compiler.size.cmd = avr-size

Some of the assembly (Os, no flags):

00000328 <_Z25eRCaGuy_SoftwarePWMupdatev>:
 328:   8f 92           push    r8
 32a:   9f 92           push    r9
 32c:   af 92           push    r10
 32e:   bf 92           push    r11
 330:   cf 92           push    r12
 332:   df 92           push    r13
 334:   ef 92           push    r14
 336:   ff 92           push    r15
 338:   0f 93           push    r16
 33a:   1f 93           push    r17
 33c:   cf 93           push    r28
 33e:   df 93           push    r29
 340:   0f b7           in  r16, 0x3f   ; 63
 342:   78 94           sei
 344:   20 e0           ldi r18, 0x00   ; 0
 346:   30 e0           ldi r19, 0x00   ; 0
 348:   1f e0           ldi r17, 0x0F   ; 15
 34a:   f9 01           movw    r30, r18
 34c:   e8 5a           subi    r30, 0xA8   ; 168
 34e:   fe 4f           sbci    r31, 0xFE   ; 254
 350:   80 81           ld  r24, Z
 352:   8f 3f           cpi r24, 0xFF   ; 255
 354:   09 f4           brne    .+2         ; 0x358 <_Z25eRCaGuy_SoftwarePWMupdatev+0x30>
 356:   67 c0           rjmp    .+206       ; 0x426 <_Z25eRCaGuy_SoftwarePWMupdatev+0xfe>
 358:   f8 94           cli
 35a:   90 e0           ldi r25, 0x00   ; 0
 35c:   18 9f           mul r17, r24
 35e:   f0 01           movw    r30, r0
 360:   19 9f           mul r17, r25
 362:   f0 0d           add r31, r0
 364:   11 24           eor r1, r1
 366:   e4 59           subi    r30, 0x94   ; 148
 368:   fe 4f           sbci    r31, 0xFE   ; 254
 36a:   c0 80           ld  r12, Z
 36c:   d1 80           ldd r13, Z+1    ; 0x01
 36e:   e2 80           ldd r14, Z+2    ; 0x02
 370:   f3 80           ldd r15, Z+3    ; 0x03
 372:   44 81           ldd r20, Z+4    ; 0x04
 374:   55 81           ldd r21, Z+5    ; 0x05
 376:   66 81           ldd r22, Z+6    ; 0x06
 378:   77 81           ldd r23, Z+7    ; 0x07
 37a:   04 84           ldd r0, Z+12    ; 0x0c
 37c:   f5 85           ldd r31, Z+13   ; 0x0d
 37e:   e0 2d           mov r30, r0
 380:   78 94           sei
 382:   41 15           cp  r20, r1
 384:   51 05           cpc r21, r1
 386:   61 05           cpc r22, r1
 388:   71 05           cpc r23, r1
 38a:   51 f4           brne    .+20        ; 0x3a0 <_Z25eRCaGuy_SoftwarePWMupdatev+0x78>
 38c:   18 9f           mul r17, r24
 38e:   d0 01           movw    r26, r0
 390:   19 9f           mul r17, r25
 392:   b0 0d           add r27, r0
 394:   11 24           eor r1, r1
 396:   a4 59           subi    r26, 0x94   ; 148
 398:   be 4f           sbci    r27, 0xFE   ; 254
 39a:   1e 96           adiw    r26, 0x0e   ; 14
 39c:   4c 91           ld  r20, X
 39e:   3b c0           rjmp    .+118       ; 0x416 <_Z25eRCaGuy_SoftwarePWMupdatev+0xee>
 3a0:   46 01           movw    r8, r12
 3a2:   57 01           movw    r10, r14
 3a4:   a1 e0           ldi r26, 0x01   ; 1
 3a6:   8a 1a           sub r8, r26
 3a8:   91 08           sbc r9, r1
 3aa:   a1 08           sbc r10, r1
 3ac:   b1 08           sbc r11, r1
 3ae:   48 15           cp  r20, r8
 3b0:   59 05           cpc r21, r9
 3b2:   6a 05           cpc r22, r10
 3b4:   7b 05           cpc r23, r11
 3b6:   51 f4           brne    .+20        ; 0x3cc <_Z25eRCaGuy_SoftwarePWMupdatev+0xa4>
 3b8:   18 9f           mul r17, r24
 3ba:   d0 01           movw    r26, r0
 3bc:   19 9f           mul r17, r25
 3be:   b0 0d           add r27, r0
 3c0:   11 24           eor r1, r1
 3c2:   a4 59           subi    r26, 0x94   ; 148
 3c4:   be 4f           sbci    r27, 0xFE   ; 254
 3c6:   1e 96           adiw    r26, 0x0e   ; 14
 3c8:   9c 91           ld  r25, X
 3ca:   1c c0           rjmp    .+56        ; 0x404 <_Z25eRCaGuy_SoftwarePWMupdatev+0xdc>
 3cc:   18 9f           mul r17, r24
 3ce:   e0 01           movw    r28, r0
 3d0:   19 9f           mul r17, r25
 3d2:   d0 0d           add r29, r0
 3d4:   11 24           eor r1, r1
 3d6:   c4 59           subi    r28, 0x94   ; 148
 3d8:   de 4f           sbci    r29, 0xFE   ; 254
 3da:   88 85           ldd r24, Y+8    ; 0x08
 3dc:   99 85           ldd r25, Y+9    ; 0x09
 3de:   aa 85           ldd r26, Y+10   ; 0x0a
 3e0:   bb 85           ldd r27, Y+11   ; 0x0b
 3e2:   01 96           adiw    r24, 0x01   ; 1
 3e4:   a1 1d           adc r26, r1
 3e6:   b1 1d           adc r27, r1
 3e8:   88 87           std Y+8, r24    ; 0x08
 3ea:   99 87           std Y+9, r25    ; 0x09
 3ec:   aa 87           std Y+10, r26   ; 0x0a
 3ee:   bb 87           std Y+11, r27   ; 0x0b
 3f0:   8c 15           cp  r24, r12
 3f2:   9d 05           cpc r25, r13
 3f4:   ae 05           cpc r26, r14
 3f6:   bf 05           cpc r27, r15
 3f8:   40 f0           brcs    .+16        ; 0x40a <_Z25eRCaGuy_SoftwarePWMupdatev+0xe2>
 3fa:   18 86           std Y+8, r1 ; 0x08
 3fc:   19 86           std Y+9, r1 ; 0x09
 3fe:   1a 86           std Y+10, r1    ; 0x0a
 400:   1b 86           std Y+11, r1    ; 0x0b
 402:   9e 85           ldd r25, Y+14   ; 0x0e
 404:   80 81           ld  r24, Z
 406:   89 2b           or  r24, r25
 408:   0d c0           rjmp    .+26        ; 0x424 <_Z25eRCaGuy_SoftwarePWMupdatev+0xfc>
 40a:   84 17           cp  r24, r20
 40c:   95 07           cpc r25, r21
 40e:   a6 07           cpc r26, r22
 410:   b7 07           cpc r27, r23
 412:   48 f0           brcs    .+18        ; 0x426 <_Z25eRCaGuy_SoftwarePWMupdatev+0xfe>
 414:   4e 85           ldd r20, Y+14   ; 0x0e
 416:   80 81           ld  r24, Z
 418:   90 e0           ldi r25, 0x00   ; 0
 41a:   50 e0           ldi r21, 0x00   ; 0
 41c:   40 95           com r20
 41e:   50 95           com r21
 420:   84 23           and r24, r20
 422:   95 23           and r25, r21
 424:   80 83           st  Z, r24
 426:   2f 5f           subi    r18, 0xFF   ; 255
 428:   3f 4f           sbci    r19, 0xFF   ; 255
 42a:   24 31           cpi r18, 0x14   ; 20
 42c:   31 05           cpc r19, r1
 42e:   09 f0           breq    .+2         ; 0x432 <_Z25eRCaGuy_SoftwarePWMupdatev+0x10a>
 430:   8c cf           rjmp    .-232       ; 0x34a <_Z25eRCaGuy_SoftwarePWMupdatev+0x22>
 432:   0f bf           out 0x3f, r16   ; 63
 434:   df 91           pop r29
 436:   cf 91           pop r28
 438:   1f 91           pop r17
 43a:   0f 91           pop r16
 43c:   ff 90           pop r15
 43e:   ef 90           pop r14
 440:   df 90           pop r13
 442:   cf 90           pop r12
 444:   bf 90           pop r11
 446:   af 90           pop r10
 448:   9f 90           pop r9
 44a:   8f 90           pop r8
 44c:   08 95           ret

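1 Answer:

Answer 0 (score: 5)

This is almost certainly an alignment issue. Judging by the size of your struct, your compiler seems to be packing it automatically.

The LDR instruction loads a 4-byte value into a register and operates on 4-byte boundaries. If it needs to load from a memory address that is not on a 4-byte boundary, it actually performs two loads and combines them to get the value at that address.

For example, if you want to load a 4-byte value at 0x02, the processor will perform two loads, because 0x02 does not fall on a 4-byte boundary.

Suppose we have the following memory starting at address 0x00, and we want to load the 4-byte value at 0x02 into register r0: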

Address |0x00|0x01|0x02|0x03|0x04|0x05|0x06|0x07|0x08|
Value   | 12 | 34 | 56 | 78 | 90 | AB | CD | EF | 12 |
------------------------------------------------------
r0: 00 00 00 00

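It will first load the 4 bytes at 0x00, because that is the 4-byte segment containing 0x02, and store the 2 bytes at 0x02 and 0x03 in the register:

Address |0x00|0x01|0x02|0x03|0x04|0x05|0x06|0x07|
Value   | 12 | 34 | 56 | 78 | 90 | AB | CD | EF |
Load 1            | **   ** |
------------------------------------------------------
r0: 56 78 00 00

It will then load the 4 bytes at 0x04, which is the next 4-byte segment, and store the 2 bytes at 0x04 and 0x05 in the register:

Address |0x00|0x01|0x02|0x03|0x04|0x05|0x06|0x07|
Value   | 12 | 34 | 56 | 78 | 90 | AB | CD | EF |
Load 2                      | **   ** |
------------------------------------------------------
r0: 56 78 90 AB

As you can see, each time you want to access the value at 0x02, the processor actually has to split your instruction into two operations. If you instead wanted to access the value at 0x04, however, the processor could do it in a single operation:

Address |0x00|0x01|0x02|0x03|0x04|0x05|0x06|0x07|
Value   | 12 | 34 | 56 | 78 | 90 | AB | CD | EF |
Load 1                      | **   **   **   ** |
------------------------------------------------------
r0: 90 AB CD EF

In your example, with both flags1 and flags2 commented out, your struct's size is 15. This means that every second struct in your array will be at an odd address, so neither its pointer nor its long members will be aligned correctly.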

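By introducing one of the flags variables, your struct's size increases to 16, which is a multiple of 4. This ensures that all of your structs begin on a 4-byte boundary, so you likely won't run into alignment issues.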
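The effect of the element size on where each array element starts can be checked with a little address arithmetic. The sketch below is illustrative only: countAlignedStarts is a hypothetical helper, and the base address of the array is an input you have to supply, since the text here does not state where PWMpins[] is placed.

```cpp
#include <cassert>
#include <cstddef>

// Counts how many of the first `count` array elements start at an address
// that is a multiple of `boundary`, for an array whose first element sits
// at `base` and whose elements are `stride` bytes apart.
inline size_t countAlignedStarts(size_t base, size_t stride,
                                 size_t count, size_t boundary) {
  assert(boundary > 0);
  size_t aligned = 0;
  for (size_t i = 0; i < count; i++) {
    if ((base + i * stride) % boundary == 0) aligned++;
  }
  return aligned;
}
```

Assuming a base address that is itself a multiple of 4, a 16-byte stride keeps every element on a 4-byte boundary, while a 15-byte or 17-byte stride does so only for every fourth element.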

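There is probably a compiler flag that can help you with this, but in general it's good to be aware of how your structs are laid out. Alignment is a tricky issue, and only compilers that conform to the current standards have well-defined behavior.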
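The two-load walkthrough earlier in this answer can be mimicked with a toy model in which memory can only be read in 4-byte segments that start at a multiple of 4. This is a sketch of the idea as described in the walkthrough, not a model of any particular CPU; LoadResult and loadWord are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct LoadResult {
  uint8_t bytes[4];     // the 4 bytes delivered to the register, in memory order
  int     alignedLoads; // how many 4-byte-aligned segment reads were needed
};

// Reads 4 bytes starting at `addr`, one aligned segment at a time, and
// combines the pieces, counting how many segments had to be touched.
inline LoadResult loadWord(const uint8_t *mem, size_t memSize, size_t addr) {
  assert(addr + 4 <= memSize);
  LoadResult r = {{0, 0, 0, 0}, 0};
  size_t filled = 0;
  while (filled < 4) {
    size_t cur = addr + filled;
    size_t segEnd = (cur / 4 + 1) * 4;  // first address past the current segment
    r.alignedLoads++;
    // Copy only the bytes of this segment that belong to the requested word.
    while (filled < 4 && addr + filled < segEnd) {
      r.bytes[filled] = mem[addr + filled];
      filled++;
    }
  }
  return r;
}
```

With the memory contents from the tables, a read at 0x02 touches two segments and a read at 0x04 touches one, matching the diagrams.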