Friday, 2 December 2016

c++ - Is it undefined behaviour to allocate overlarge stack structures?



This is a C specification question.



We all know this is legal C and should work fine on any platform:



/* Stupid way to count the length of a number */

int count_len(int val) {
char buf[256];
return sprintf(buf, "%d", val);
}


But this is almost guaranteed to crash:



/* Stupid way to count the length of a number */
int count_len(int val) {

char buf[256000000];
return sprintf(buf, "%d", val);
}


The difference is that the latter program blows the stack and will probably crash. But, purely semantically, it really isn't any different than the previous program.



According to the C spec, is the latter program actually undefined behavior? If so, what distinguishes it from the former? If not, what in the C spec says it's OK for a conforming implementation to crash?



(If this differs between C89/C99/C11/C++*, this would be interesting too).



Answer



Language standards for C(89,99,11) begin with a scope section with this wording (also found in some C++, C#, Fortran and Pascal standards):



This International Standard does not specify




  • the size or complexity of a program and its data that will exceed the capacity of any speciļ¬c data-processing system or the capacity of a particular processor;

  • all minimal requirements of a data-processing system that is capable of supporting a conforming implementation.




The gcc compiler does offer an option to check for at runtime



21.1 Checking




For most operating systems, gcc does not perform checking by default. This means that if the main environment task or some other task exceeds the available stack space, then unpredictable behavior will occur. Most native systems offer some level of protection by adding a guard page at the end of each task stack. This mechanism is usually not enough for dealing properly with situations because a large local variable could “jump” above the guard page. Furthermore, when the guard page is hit, there may not be any space left on the stack for executing the exception propagation code. Enabling stack checking avoids such situations.
To activate stack checking, compile all units with the gcc option -fstack-check. For example:




gcc -c -fstack-check package1.adb




Units compiled with this option will generate extra instructions to check that any use of the stack (for procedure calls or for declaring local variables in declare blocks) does not exceed the available stack space. If the space is exceeded, then a Storage_Error exception is raised.




There was an attempt during the standardization process for C99 to make a stronger statement within the standard that while size and complexity are beyond the scope of the standard the implementer has a responsibility to document the limits.



The rationale was





The definition of conformance has always been a problem with the C
Standard, being described by one author as "not even rubber teeth, more
like rubber gums". Though there are improvements in C9X compared with C89,
many of the issues still remain.



This paper proposes changes which, while not perfect, hopefully improve the
situation.





The following wording was suggested for inclusion to section 5.2.4.1




  • translation or execution might fail if the size or complexity of a program or its data exceeds the capacity of the implementation.

  • The implementation shall document a way to determine if the size or complexity of a correct program exceeds or might exceed the capacity of the implementation.


    5.2.4.1. An implementation is always free to state that a given program is
    too large or too complex to be translated or executed. However, to stop
    this being a way to claim conformance while providing no useful facilities

    whatsoever, the implementer must show provide a way to determine whether a
    program is likely to exceed the limits. The method need not be perfect, so
    long as it errs on the side of caution.
    One way to do this would be to have a formula which converted values such
    as the number of variables into, say, the amount of memory the compiler
    would need. Similarly, if there is a limit on stack space, the formula need
    only show how to determine the stack requirements for each function call
    (assuming this is the only place the stack is allocated) and need not work
    through every possible execution path (which would be impossible in the
    face of recursion). The compiler could even have a mode which output a

    value for each function in the program.



The proposed wording did not make it into the C99 standard, and therefore this area remained outside the scope of the standard. Section 5.2.4.1 of C99 does list these limits

The implementation shall be able to translate and execute at least one program that contains at least one instance of every one of the following limits:





  • 127 nesting levels of blocks

  • 63 nesting levels of conditional inclusion


  • 12 pointer, array, and function declarators (in any combinations) modifying an arithmetic, structure, union, or incomplete type in a declaration

  • 63 nesting levels of parenthesized declarators within a full declarator

  • 63 nesting levels of parenthesized expressions within a full expression

  • 63 significant initial characters in an internal identifier or a macro name (each universal character name or extended source character is considered a single character)

  • 31 significant initial characters in an external identifier (each universal character name specifying a short identifier of 0000FFFF or less is considered 6 characters, each universal character name specifying a short identifier of 00010000 or more is considered 10 characters, and each extended source character is considered the same number of characters as the corresponding universal character name, if any)

  • 4095 external identifiers in one translation unit

  • 511 identifiers with block scope declared in one block

  • 4095 macro identifiers simultaneously defined in one preprocessing translation unit

  • 127 parameters in one function definition

  • 127 arguments in one function call


  • 127 parameters in one macro definition

  • 127 arguments in one macro invocation

  • 4095 characters in a logical source line

  • 4095 characters in a character string literal or wide string literal (after concatenation)

  • 65535 bytes in an object (in a hosted environment only)

  • 15 nesting levels for #included files

  • 1023 case labels for a switch statement (excluding those for any nested switch statements)

  • 1023 members in a single structure or union

  • 1023 enumeration constants in a single enumeration

  • 63 levels of nested structure or union definitions in a single struct-declaration-list



No comments:

Post a Comment

c++ - Does curly brackets matter for empty constructor?

Those brackets declare an empty, inline constructor. In that case, with them, the constructor does exist, it merely does nothing more than t...