From: Thomas Walker Lynch Date: Mon, 13 Oct 2025 03:28:17 +0000 (+0000) Subject: removing pencils on directory names X-Git-Url: https://git.reasoningtechnology.com/usr/lib/python2.7/encodings/koi8_r.py?a=commitdiff_plain;h=3d1bb41279613594453ea317d8dd293943463c42;p=RT-gcc removing pencils on directory names --- diff --git a/developer/document/.gitignore b/developer/document/.gitignore new file mode 100644 index 0000000..8e44d75 --- /dev/null +++ b/developer/document/.gitignore @@ -0,0 +1,3 @@ + +*.html +*.pdf diff --git a/developer/document/user_manual.org b/developer/document/user_manual.org new file mode 100644 index 0000000..69f9450 --- /dev/null +++ b/developer/document/user_manual.org @@ -0,0 +1,204 @@ +#+TITLE: RT Extensions to the C Preprocessor +#+AUTHOR: Thomas Walker Lynch +#+OPTIONS: toc:nil +#+OPTIONS: ^:nil + +* Overview + +The RT extensions modernize the C preprocessor (CPP) to support structural programming primitives, sets, associative maps, and token/argument list manipulation. + +These extensions are intended to support C include files that are type templated, and can be included multiple times with different type template bindings. However, these functions add flexibility to CPP that will be appreciated in many contexts. + +The RT extensions do not change existing CPP behavior. + +The RT extensions also do not facilitate recursive programming. During a given evaluation, expanded macros are colored, and thereafter their names are taken literally. On a conciliatory note, among the new built-ins is a map function. + +* Review of Conventional CPP + +CPP has two explicit list types: + +- `token_list`: A stream of tokens as produced by the lexer. +- `argument_list`: A comma-separated list of `token_list` elements. + +Although CPP has these two types of lists, there are few operators in the language for supporting this structure. Hence, the RT extensions contain some. + +CPP's `#define` is limited in that the name and body are not evaluated, and that it an entire definition must be on line line. The RT extensions provide two new directives to address these limitations. + +* `#macro` + +The `#macro` directive defines a function-style macro with named parameters and a token body enclosed in balanced parentheses. The parameter list is required, but may be empty. + +#+BEGIN_SRC c +#macro NAME(arg1 ,arg2 ,...) ( token sequence ) +#+END_SRC + +** BNF for `#macro` + +#+BEGIN_SRC bnf +directive ::= "#macro" name params body ; + +name ::= identifier ; + +params ::= "(" param_list? ")" ; +param_list ::= identifier ("," identifier)* ; + +body ::= paren_clause ; +paren_clause ::= "(" literal? ")" ; + +literal ::= ; sequence parsed into tokens without expansion + +; whitespace, including newlines, is ignored +#+END_SRC + +Unlike `#define`, which requires the entire macro to reside on a single line, `#macro` allows its body clause to span multiple lines with no end-of-line escapes being required. + +As with `#define`, the `#macro` name is not evaluated, and the body is stored as a literal token sequence that will only be expanded when the macro is called. + +The body-delimiting parentheses need to balance only at the time the definition is scanned by the lexer. If the programmer desires to include unbalanced parentheses within the body, then the programmer should first create a macro such as `#define OPEN (` or `#define CLOSE )`, and then use those in the body clause. + + +* `#assign` + +The `#define` directive has both a declarative and function form. 
The `#assign` variation of `#define` does not have a function form. + +With `#define`, neither the name nor the body are expanded at time of definition. With `#assign`, the name and/or the body can optionally be expanded by using brackets for the respective clause, rather than parentheses. + +#+BEGIN_SRC c +#assign (literal NAME) (literal body) +#assign (literal NAME) [expanded body] +#assign [expanded NAME] (literal body) +#assign [expanded NAME] [expanded body] +#+END_SRC + +As with `#macro`, the clauses in `#assign` can be multi-line. The clauses are contained within the specified delimiters, which must balance at the time of definition, when the lexer scans them. This happens before any expansion. + +** BNF for `#assign` + +#+BEGIN_SRC bnf +cmd ::= "#assign" name body ; + +name ::= clause ; +body ::= clause ; + +clause ::= "(" literal? ")" | "[" expr? "]" ; + +literal ::= ; sequence parsed into tokens +expr ::= ; sequence parsed into tokens with recursive expansion of each token + +; white space, including new lines, is ignored. +#+END_SRC + +The name clause must resolve to a valid name identifier. + +There is a corresponding built-in macro form `_ASSIGN name body` which expands to nothing, but has the side effect of creating a definition. + +* New Built-in Macros + +Each new built-in macro has a `_` prefix. + +** `_ASSIGN` + +The same syntax and evaluation semantics as for `#assign` apply. + +#+BEGIN_SRC c +_ASSIGN (literal NAME) (literal body) +_ASSIGN (literal NAME) [expanded body] +_ASSIGN [expanded NAME] (literal body) +_ASSIGN [expanded NAME] [expanded body] +#+END_SRC + +The `_ASSIGN` macro has the same behavior as the `#assign` directive and, unlike other macros, is followed by, not one, but two delineated clauses. Each clause may be parenthesized (for a literal) or bracketed (for an expanded). + +The `_ASSIGN` macro expands to nothing. Its only effect is to create or update a macro binding. + +When a macro is created with `_ASSIGN`, it is colored. Hence, the newly created macro cannot be expanded within the same macro where it was defined. + + +* List Transformations + +#+BEGIN_SRC c +_TO_ARG_LIST( ) // Converts tokens into an argument list +_TO_TOKEN_LIST( ) // Flattens argument list into token stream +#+END_SRC + +These form the basis for transitioning between structure-aware and structure-neutral representations. + +* Token List Operators + +#+BEGIN_SRC c +_FIRST( ) // First token +_REST( ) // All tokens after the first +_MAP(f ,) // f(token1) f(token2) ... +_AL_MAP(f ,) // f(arg1) f(arg2) ... +#+END_SRC + +These enable iteration-like behavior over static macro arguments. + +* Logic Primitives + +#+BEGIN_SRC c +_IF(p ,a ,b) // If p is present, expand to a; else b +_NOT(p) // Expand to TRUE if p is empty, else empty +#+END_SRC + +Logical presence is defined by *token existence*, not token content. Empty argument lists are `false`. All others are `true`. + +#+BEGIN_SRC c +_AND( , ,... ) // Returns nothing if any arg is empty +_OR( , ,... ) // Returns first non-empty argument +#+END_SRC + +Missing trailing arguments are *not* detected by `_AND`. Appending a `TRUE` token is recommended to guard edge cases. + +* Token-Level Checks + +#+BEGIN_SRC c +_IS_IDENTIFIER( ) // If arg is an identifier, expands to it; else nothing +_IS_NAME( ) // If arg is an identifier, expands to it; else nothing +#+END_SRC + +* Pasting and Construction + +#+BEGIN_SRC c +_PASTE( , ,... 
) // Combines all tokens into a single token +#+END_SRC + +Pasting is only valid if the result is a valid preprocessing token (e.g., identifier). Invalid pastes will trigger compiler diagnostics. + +* Usage Example + +(To be added) + +* Sets + +Sets are macros whose membership is defined by naming convention. All set operations are evaluated through presence of a named binding. + +#+BEGIN_SRC c +#define _SET_ADD(set ,x) ASSIGN( _PASTE(__SET_ ,set ,_ ,x), ) +#define _SET_IN(set ,x) _NOT(_PASTE(__SET_ ,set ,_ ,x)) +#+END_SRC + +A macro such as `__SET_myset__foo` existing in the preprocessor environment marks `foo` as a member of `myset`. + +* Associative Sets (ASet) + +Associative sets allow storing values indexed by a key: + +#+BEGIN_SRC c +#assign _ASET_ADD(set ,x ,y) ( + ASSIGN(_PASTE(__ASET_ ,set ,_ ,x)) + ASSIGN(_PASTE(__ASET_ITEM_ ,set ,_ ,x)) +) + +#assign _ASET_GET(set ,x) ( + _IF( + _PASTE(__ASET_ ,set ,_ ,x) + ,_PASTE(__ASET_ITEM_ ,set ,_ ,x) + , + ) +) +#+END_SRC + +If `x` is in the set, `_ASET_GET` returns the associated value; else nothing. + +* (End of manual — more modules to be added as completed.) diff --git "a/developer/document\360\237\226\211/.gitignore" "b/developer/document\360\237\226\211/.gitignore" deleted file mode 100644 index 8e44d75..0000000 --- "a/developer/document\360\237\226\211/.gitignore" +++ /dev/null @@ -1,3 +0,0 @@ - -*.html -*.pdf diff --git "a/developer/document\360\237\226\211/user_manual.org" "b/developer/document\360\237\226\211/user_manual.org" deleted file mode 100644 index 69f9450..0000000 --- "a/developer/document\360\237\226\211/user_manual.org" +++ /dev/null @@ -1,204 +0,0 @@ -#+TITLE: RT Extensions to the C Preprocessor -#+AUTHOR: Thomas Walker Lynch -#+OPTIONS: toc:nil -#+OPTIONS: ^:nil - -* Overview - -The RT extensions modernize the C preprocessor (CPP) to support structural programming primitives, sets, associative maps, and token/argument list manipulation. - -These extensions are intended to support C include files that are type templated, and can be included multiple times with different type template bindings. However, these functions add flexibility to CPP that will be appreciated in many contexts. - -The RT extensions do not change existing CPP behavior. - -The RT extensions also do not facilitate recursive programming. During a given evaluation, expanded macros are colored, and thereafter their names are taken literally. On a conciliatory note, among the new built-ins is a map function. - -* Review of Conventional CPP - -CPP has two explicit list types: - -- `token_list`: A stream of tokens as produced by the lexer. -- `argument_list`: A comma-separated list of `token_list` elements. - -Although CPP has these two types of lists, there are few operators in the language for supporting this structure. Hence, the RT extensions contain some. - -CPP's `#define` is limited in that the name and body are not evaluated, and that it an entire definition must be on line line. The RT extensions provide two new directives to address these limitations. - -* `#macro` - -The `#macro` directive defines a function-style macro with named parameters and a token body enclosed in balanced parentheses. The parameter list is required, but may be empty. - -#+BEGIN_SRC c -#macro NAME(arg1 ,arg2 ,...) ( token sequence ) -#+END_SRC - -** BNF for `#macro` - -#+BEGIN_SRC bnf -directive ::= "#macro" name params body ; - -name ::= identifier ; - -params ::= "(" param_list? 
")" ; -param_list ::= identifier ("," identifier)* ; - -body ::= paren_clause ; -paren_clause ::= "(" literal? ")" ; - -literal ::= ; sequence parsed into tokens without expansion - -; whitespace, including newlines, is ignored -#+END_SRC - -Unlike `#define`, which requires the entire macro to reside on a single line, `#macro` allows its body clause to span multiple lines with no end-of-line escapes being required. - -As with `#define`, the `#macro` name is not evaluated, and the body is stored as a literal token sequence that will only be expanded when the macro is called. - -The body-delimiting parentheses need to balance only at the time the definition is scanned by the lexer. If the programmer desires to include unbalanced parentheses within the body, then the programmer should first create a macro such as `#define OPEN (` or `#define CLOSE )`, and then use those in the body clause. - - -* `#assign` - -The `#define` directive has both a declarative and function form. The `#assign` variation of `#define` does not have a function form. - -With `#define`, neither the name nor the body are expanded at time of definition. With `#assign`, the name and/or the body can optionally be expanded by using brackets for the respective clause, rather than parentheses. - -#+BEGIN_SRC c -#assign (literal NAME) (literal body) -#assign (literal NAME) [expanded body] -#assign [expanded NAME] (literal body) -#assign [expanded NAME] [expanded body] -#+END_SRC - -As with `#macro`, the clauses in `#assign` can be multi-line. The clauses are contained within the specified delimiters, which must balance at the time of definition, when the lexer scans them. This happens before any expansion. - -** BNF for `#assign` - -#+BEGIN_SRC bnf -cmd ::= "#assign" name body ; - -name ::= clause ; -body ::= clause ; - -clause ::= "(" literal? ")" | "[" expr? "]" ; - -literal ::= ; sequence parsed into tokens -expr ::= ; sequence parsed into tokens with recursive expansion of each token - -; white space, including new lines, is ignored. -#+END_SRC - -The name clause must resolve to a valid name identifier. - -There is a corresponding built-in macro form `_ASSIGN name body` which expands to nothing, but has the side effect of creating a definition. - -* New Built-in Macros - -Each new built-in macro has a `_` prefix. - -** `_ASSIGN` - -The same syntax and evaluation semantics as for `#assign` apply. - -#+BEGIN_SRC c -_ASSIGN (literal NAME) (literal body) -_ASSIGN (literal NAME) [expanded body] -_ASSIGN [expanded NAME] (literal body) -_ASSIGN [expanded NAME] [expanded body] -#+END_SRC - -The `_ASSIGN` macro has the same behavior as the `#assign` directive and, unlike other macros, is followed by, not one, but two delineated clauses. Each clause may be parenthesized (for a literal) or bracketed (for an expanded). - -The `_ASSIGN` macro expands to nothing. Its only effect is to create or update a macro binding. - -When a macro is created with `_ASSIGN`, it is colored. Hence, the newly created macro cannot be expanded within the same macro where it was defined. - - -* List Transformations - -#+BEGIN_SRC c -_TO_ARG_LIST( ) // Converts tokens into an argument list -_TO_TOKEN_LIST( ) // Flattens argument list into token stream -#+END_SRC - -These form the basis for transitioning between structure-aware and structure-neutral representations. - -* Token List Operators - -#+BEGIN_SRC c -_FIRST( ) // First token -_REST( ) // All tokens after the first -_MAP(f ,) // f(token1) f(token2) ... -_AL_MAP(f ,) // f(arg1) f(arg2) ... 
-#+END_SRC - -These enable iteration-like behavior over static macro arguments. - -* Logic Primitives - -#+BEGIN_SRC c -_IF(p ,a ,b) // If p is present, expand to a; else b -_NOT(p) // Expand to TRUE if p is empty, else empty -#+END_SRC - -Logical presence is defined by *token existence*, not token content. Empty argument lists are `false`. All others are `true`. - -#+BEGIN_SRC c -_AND( , ,... ) // Returns nothing if any arg is empty -_OR( , ,... ) // Returns first non-empty argument -#+END_SRC - -Missing trailing arguments are *not* detected by `_AND`. Appending a `TRUE` token is recommended to guard edge cases. - -* Token-Level Checks - -#+BEGIN_SRC c -_IS_IDENTIFIER( ) // If arg is an identifier, expands to it; else nothing -_IS_NAME( ) // If arg is an identifier, expands to it; else nothing -#+END_SRC - -* Pasting and Construction - -#+BEGIN_SRC c -_PASTE( , ,... ) // Combines all tokens into a single token -#+END_SRC - -Pasting is only valid if the result is a valid preprocessing token (e.g., identifier). Invalid pastes will trigger compiler diagnostics. - -* Usage Example - -(To be added) - -* Sets - -Sets are macros whose membership is defined by naming convention. All set operations are evaluated through presence of a named binding. - -#+BEGIN_SRC c -#define _SET_ADD(set ,x) ASSIGN( _PASTE(__SET_ ,set ,_ ,x), ) -#define _SET_IN(set ,x) _NOT(_PASTE(__SET_ ,set ,_ ,x)) -#+END_SRC - -A macro such as `__SET_myset__foo` existing in the preprocessor environment marks `foo` as a member of `myset`. - -* Associative Sets (ASet) - -Associative sets allow storing values indexed by a key: - -#+BEGIN_SRC c -#assign _ASET_ADD(set ,x ,y) ( - ASSIGN(_PASTE(__ASET_ ,set ,_ ,x)) - ASSIGN(_PASTE(__ASET_ITEM_ ,set ,_ ,x)) -) - -#assign _ASET_GET(set ,x) ( - _IF( - _PASTE(__ASET_ ,set ,_ ,x) - ,_PASTE(__ASET_ITEM_ ,set ,_ ,x) - , - ) -) -#+END_SRC - -If `x` is in the set, `_ASET_GET` returns the associated value; else nothing. - -* (End of manual — more modules to be added as completed.) diff --git a/developer/script_Deb-12.10_gcc-12.4.1/README.org b/developer/script_Deb-12.10_gcc-12.4.1/README.org new file mode 100755 index 0000000..6a6bacd --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/README.org @@ -0,0 +1,53 @@ +* General Notes + +GNU `cpp` is integrated with the GNU gcc repo. + +There is a lot more to GCC than one might imagine. It was developed as though an integral part of Unix. Hence, the standalone build has top-level directories that resemble the top level of a Unix system. + +The scripts here will download source and build a standalone GCC 12 along with version-compatible tools. + + +* RT extensions + +If you want the RT extensions, the RT extension sources must be copied from library/ after the gcc sources are expanded. Use `RT_extensions_install.sh`. When editing the sources, generally the library/ versions are treated as authoritive. + +* environment + +Don't forget `source env_toolsmith` in your shell. + +* Setup + +- `setup_project.sh` :: Makes the directory structure for the build, creates a `tmp/` directory under the project. If it does not already exist, creates a `.gitignore` file with all the created directories listed. + +* Top level .gitignore + +- There is no `.gitignore` at the top level when the project is cloned. The `.gitignore` created by `setup_project.sh` ignores itself, so it will not be checked in. +- No script deletes the top level `.gitignore`, including the clean scripts, so the toolsmith can make edits there that persist locally. 
+ +* Clean + +- `clean_build.sh` :: For saving space after the build is done. The build scripts are idempotent, so in an ideal world this need not be run to do a rebuild. + +- `clean_dist.sh` :: With one exception, this will delete everything that was synthesized. The one exception is that `.gitignore` is moved to the `tmp/` directory to preserve any changes a user might have made, and the contents of the `tmp/` directory are not removed. + +- `clean_tmp.sh` :: Wipes clean all contents of the temporary directory. + +* Download + +- `download_upstream_sources.sh` :: Goes to the Internet, fetches all the sources that have not already been fetched. Then expands the sources into the proper sub-directory under `source/1`. + +* Build + +See the script build_all.sh for a description of the complete build process. + +When editing the RT Extensions, the work flow is typically: + +while cwd in the script directory: +1. edit the library/ +2. `./RT_extension_install.h` +3. `./rebuild_gcc.sh` +4. run some experiments + +It would of course be better to have a test suite. + +I typically leave an emacs shell open in the $ROOT/source/gcc-12.2.0/libcpp/ directory for purposes of exploring the other source code files. diff --git a/developer/script_Deb-12.10_gcc-12.4.1/build_all.sh b/developer/script_Deb-12.10_gcc-12.4.1/build_all.sh new file mode 100755 index 0000000..767e6af --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/build_all.sh @@ -0,0 +1,26 @@ +#!/bin/bash +set -euo pipefail + +cd "$(dirname "$0")" + +source "$SCRIPT_DIR/environment.sh" + +./project_setup.sh +./project_download.sh +./project_extract.sh +./project_requisites.sh + +./mv_libs_to_gcc.sh +./build_gcc.sh + +echo "Toolchain build complete" +"$TOOLCHAIN/bin/gcc" --version + +# test + +./RT_extentions_libcpp_save.sh +./RT_extentions_install.sh +./rebuild_gcc.sh + +echo "Toolchain built with RT_extensions installed" +"$TOOLCHAIN/bin/gcc" --version diff --git a/developer/script_Deb-12.10_gcc-12.4.1/build_gcc.sh b/developer/script_Deb-12.10_gcc-12.4.1/build_gcc.sh new file mode 100755 index 0000000..db7a9c4 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/build_gcc.sh @@ -0,0 +1,33 @@ +#!/bin/bash +# build_gcc.sh – Build GCC 12.2.0 using system libraries and headers + +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🔧 Starting GCC build..." + +mkdir -p "$GCC_BUILD" +pushd "$GCC_BUILD" + +echo "gcc: $(command -v gcc)" +echo "toolchain: $TOOLCHAIN" + +"$GCC_SRC/configure" \ + --with-pkgversion="RT_gcc standalone by Reasoning Technology" \ + --with-bugurl="https://github.com/Thomas-Walker-Lynch/RT_gcc/issues" \ + --with-documentation-root-url="https://gcc.gnu.org/onlinedocs/" \ + --with-changes-root-url="https://github.com/Thomas-Walker-Lynch/RT_gcc/releases/" \ + --host="$HOST" \ + --prefix="$TOOLCHAIN" \ + --enable-languages=c,c++ \ + --enable-threads=posix \ + --disable-multilib + +$MAKE -j"$MAKE_JOBS" +$MAKE install + +popd + +echo "✅ GCC installed to $TOOLCHAIN/bin" +"$TOOLCHAIN/bin/gcc" --version diff --git a/developer/script_Deb-12.10_gcc-12.4.1/clean_build.sh b/developer/script_Deb-12.10_gcc-12.4.1/clean_build.sh new file mode 100755 index 0000000..7c9bca3 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/clean_build.sh @@ -0,0 +1,15 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🧹 Cleaning build directories..." 
+ +for dir in "${BUILD_DIR_LIST[@]}"; do + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + rm -rf "$dir" + fi +done + +echo "✅ Build directories cleaned." diff --git a/developer/script_Deb-12.10_gcc-12.4.1/clean_dist.sh b/developer/script_Deb-12.10_gcc-12.4.1/clean_dist.sh new file mode 100755 index 0000000..3b319ec --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/clean_dist.sh @@ -0,0 +1,35 @@ +#!/bin/bash +set -euo pipefail + +echo "removing: build, source, upstream, and project directories" + +source "$(dirname "$0")/environment.sh" + +# Remove build +# + "./clean_build.sh" + ! ! rmdir "$BUILD_DIR" >& /dev/null && echo "rmdir $BUILD_DIR" + +# Remove source +# note that repos are removed with clean_upstream +# + "./clean_source.sh" + "./clean_upstream.sh" + + ! ! rmdir "$SRC" >& /dev/null && echo "rmdir $SRC" + ! ! rmdir "$UPSTREAM" >& /dev/null && echo "rmdir $UPSTREAM" + +# Remove binaries from toolchain (if they were copied to release, those copies remain). +# + "./clean_toolchain.sh" + +# Remove project directories +# + for dir in "${PROJECT_SUBDIR_LIST[@]}" "${PROJECT_DIR_LIST[@]}"; do + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + ! rm -rf "$dir" && echo "could not remove $dir" + fi + done + +echo "✅ clean_dist.sh" diff --git a/developer/script_Deb-12.10_gcc-12.4.1/clean_source.sh b/developer/script_Deb-12.10_gcc-12.4.1/clean_source.sh new file mode 100755 index 0000000..2f5beb0 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/clean_source.sh @@ -0,0 +1,29 @@ +#!/bin/bash +# removes project tarball expansions from source/ +# git repos are part of `upstream` so are not removed + +set -euo pipefail + + +source "$(dirname "$0")/environment.sh" + +i=0 +while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do + tarball="${UPSTREAM_TARBALL_LIST[$i]}" + # skip url + i=$((i + 1)) + # skip explicit dest dir + i=$((i + 1)) + + base_name="${tarball%.tar.*}" + dir="$SRC/$base_name" + + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + rm -rf "$dir" + fi + + i=$((i + 1)) +done + +echo "✅ clean_source.sh" diff --git a/developer/script_Deb-12.10_gcc-12.4.1/clean_toolchain.sh b/developer/script_Deb-12.10_gcc-12.4.1/clean_toolchain.sh new file mode 100755 index 0000000..435c1ab --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/clean_toolchain.sh @@ -0,0 +1,17 @@ +#!/bin/bash +# clean_toolchain.sh – Remove installed GCC toolchain artifacts + +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🧹 Cleaning installed toolchain at: $TOOLCHAIN" + +if [[ -d "$TOOLCHAIN" ]]; then + echo "rm -rf $TOOLCHAIN" + rm -rf "$TOOLCHAIN" +else + echo "⚠️ Toolchain directory not found: $TOOLCHAIN (nothing to remove)" +fi + +echo "✅ Installed toolchain cleaned." 
diff --git a/developer/script_Deb-12.10_gcc-12.4.1/clean_upstream.sh b/developer/script_Deb-12.10_gcc-12.4.1/clean_upstream.sh new file mode 100755 index 0000000..50e8d98 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/clean_upstream.sh @@ -0,0 +1,38 @@ +#!/bin/bash +# run this to force repeat of the downloads +# removes project tarballs from upstream/ +# removes project repos from source/ +# does not remove non-project files + +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +# Remove tarballs +i=0 +while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do + tarball="${UPSTREAM_TARBALL_LIST[$i]}" + path="$UPSTREAM/$tarball" + + if [[ -f "$path" ]]; then + echo "rm $path" + rm "$path" + fi + + i=$((i + 3)) +done + +# Remove Git repositories +i=0 +while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do + dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" + + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + rm -rf "$dir" + fi + + i=$((i + 3)) +done + +echo "✅ clean_upstream.sh" diff --git a/developer/script_Deb-12.10_gcc-12.4.1/deprecated/stuff.cc b/developer/script_Deb-12.10_gcc-12.4.1/deprecated/stuff.cc new file mode 100644 index 0000000..84429e4 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/deprecated/stuff.cc @@ -0,0 +1,368 @@ + +/* + Parse a macro-style parameter list for `#assign` + + This expects the next token to be an opening parenthesis `(`. + + It returns: + - `params_out`: pointer to committed parameter array + - `param_count_out`: number of parameters parsed + - `is_variadic_out`: true if a variadic param was encountered + + On success, returns true and fills the out parameters. + On failure, returns false and issues an error diagnostic. +*/ +bool +make_parameter_list( + cpp_reader *pfile, + cpp_hashnode ***params_out, + unsigned int *param_count_out, + bool *is_variadic_out +){ + cpp_token first; + cpp_token *saved_cur_token = pfile->cur_token; + pfile->cur_token = &first; + cpp_token *token = _cpp_lex_direct(pfile); + pfile->cur_token = saved_cur_token; + + if (token->type != CPP_OPEN_PAREN) { + cpp_error_with_line( + pfile, + CPP_DL_ERROR, + token->src_loc, + 0, + "expected '(' to open parameter list, but found: %s", + cpp_token_as_text(token) + ); + return false; + } + + unsigned int nparms = 0; + bool variadic = false; + + if (!parse_params(pfile, &nparms, &variadic)) { + cpp_error_with_line( + pfile, + CPP_DL_ERROR, + token->src_loc, + 0, + "malformed parameter list" + ); + return false; + } + + cpp_hashnode **params = (cpp_hashnode **) + _cpp_commit_buff(pfile, sizeof(cpp_hashnode *) * nparms); + + *params_out = params; + *param_count_out = nparms; + *is_variadic_out = variadic; + + return true; +} + + /* Parse the parameter list + */ + cpp_hashnode **params; + unsigned int param_count; + bool is_variadic; + + if (!make_parameter_list(pfile, ¶ms, ¶m_count, &is_variadic)) { + return false; + } + + + + + +/*================================================================================ +from directive.cc + +*/ + + D(macro ,T_MACRO ,EXTENSION ,IN_I) \ + +/*-------------------------------------------------------------------------------- + directive `#macro` + + cmd ::= "#macro" name params body ; + + name ::= identifier ; + + params ::= "(" param_list? ")" ; + param_list ::= identifier ("," identifier)* ; + + body ::= clause ; + + clause ::= "(" literal? ")" | "[" expr? "]" ; + + literal ::= ; sequence parsed into tokens + expr ::= ; sequence parsed into tokens with recursive expansion of each token + + ; white space, including new lines, is ignored. 
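
  For example, a definition that conforms to the grammar above (an illustrative
  sketch only, not taken from the sources or test suite):

    #macro MAX_OF(a ,b) (
      ((a) > (b) ? (a) : (b))
    )

  Once defined, it is called like any function-style macro, so MAX_OF(x ,y)
  expands to ((x) > (y) ? (x) : (y)).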
+ + +*/ +extern bool _cpp_create_macro (cpp_reader *pfile, cpp_hashnode *node); + +static void +do_macro (cpp_reader *pfile) +{ + cpp_hashnode *node = lex_macro_node(pfile, true); + + if(node) + { + /* If we have been requested to expand comments into macros, + then re-enable saving of comments. */ + pfile->state.save_comments = + ! CPP_OPTION (pfile, discard_comments_in_macro_exp); + + if(pfile->cb.before_define) + pfile->cb.before_define (pfile); + + if( _cpp_create_macro(pfile, node) ) + if (pfile->cb.define) + pfile->cb.define (pfile, pfile->directive_line, node); + + node->flags &= ~NODE_USED; + } +} + + + + + + +/*================================================================================ +from macro.cc + +*/ + + + + + +/*-------------------------------------------------------------------------------- + Given a pfile, returns a macro definition. + + #macro name (parameter [,parameter] ...) (body_expr) + #macro name () (body_expr) + + Upon entry, the name was already been parsed in directives.cc::do_macro, so the next token will be the opening paren of the parameter list. + + Thi code is similar to `_cpp_create_definition` though uses paren blancing around the body, instead of requiring the macro body be on a single line. + + The cpp_macro struct is defined in cpplib.h: `struct GTY(()) cpp_macro {` it has a flexible array field in a union as a last member: cpp_token tokens[1]; + + This code was derived from create_iso_definition(). The break out portions shared + with create_macro_definition code should be shared with the main code, so that there + is only one place for edits. + +*/ +static cpp_macro *create_iso_RT_macro (cpp_reader *pfile){ + + const char *paste_op_error_msg = + N_("'##' cannot appear at either end of a macro expansion"); + unsigned int num_extra_tokens = 0; + unsigned nparms = 0; + cpp_hashnode **params = NULL; + bool varadic = false; + bool ok = false; + cpp_macro *macro = NULL; + + /* + After these six lines of code, the next token, hopefully being '(', will be in the variable 'token'. + + _cpp_lex_direct() is going to clobber pfile->cur_token with the token pointer, so + it is saved then restored. + */ + cpp_token first; + cpp_token *saved_cur_token = pfile->cur_token; + pfile->cur_token = &first; + cpp_token *token = _cpp_lex_direct (pfile); + pfile->cur_token = saved_cur_token; + + // parameter list parsing + // + if(token->type != CPP_OPEN_PAREN){ + cpp_error_with_line( + pfile + ,CPP_DL_ERROR + ,token->src_loc + ,0 + ,"expected '(' to open arguments list, but found: %s" + ,cpp_token_as_text(token) + ); + goto out; + } + + /* + - returns parameter list for a function macro, or NULL + - returns via &arg count of parameters + - returns via &arg the varadic flag + + after parse_parms runs, the next token returned by pfile will be subsequent to the parameter list, e.g.: + 7 | #macro Q(f ,...) printf(f ,__VA_ARGS__) + | ^~~~~~ + + */ + if( !parse_params(pfile, &nparms, &varadic) ) goto out; + + // finalizes the reserved room, otherwise it will be reused on the next reserve room call. + params = (cpp_hashnode **)_cpp_commit_buff( pfile, sizeof (cpp_hashnode *) * nparms ); + token = NULL; + + // instantiate a temporary macro struct, and initialize it + // A macro struct instance is variable size, due to a trailing token list, so the memory + // reservations size will be adjusted when this is committed. 
+ // + macro = _cpp_new_macro( + pfile + ,cmk_macro + ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) + ); + macro->variadic = varadic; + macro->paramc = nparms; + macro->parm.params = params; + macro->fun_like = true; + + // parse macro body + // A `#macro` body is delineated by parentheses + // + if( + !collect_body_tokens( + pfile + ,macro + ,&num_extra_tokens + ,paste_op_error_msg + ,true // parenthesis delineated + ) + ) goto out; + + // ok time to commit the macro + // + ok = true; + macro = (cpp_macro *)_cpp_commit_buff( + pfile + ,sizeof (cpp_macro) - sizeof (cpp_token) + sizeof (cpp_token) * macro->count + ); + + // some end cases we must clean up + // + /* + It might be that the first token of the macro body was preceded by white space,so + the white space flag is set. However, upon expansion, there might not be a white + space before said token, so the following code clears the flag. + */ + if (macro->count) + macro->exp.tokens[0].flags &= ~PREV_WHITE; + + /* + Identifies consecutive ## tokens (a.k.a. CPP_PASTE) that were invalid or ambiguous, + + Removes them from the main macro body, + + Stashes them at the end of the tokens[] array in the same memory, + + Sets macro->extra_tokens = 1 to signal their presence. + */ + if (num_extra_tokens) + { + /* Place second and subsequent ## or %:%: tokens in sequences of + consecutive such tokens at the end of the list to preserve + information about where they appear, how they are spelt and + whether they are preceded by whitespace without otherwise + interfering with macro expansion. Remember, this is + extremely rare, so efficiency is not a priority. */ + cpp_token *temp = (cpp_token *)_cpp_reserve_room + (pfile, 0, num_extra_tokens * sizeof (cpp_token)); + unsigned extra_ix = 0, norm_ix = 0; + cpp_token *exp = macro->exp.tokens; + for (unsigned ix = 0; ix != macro->count; ix++) + if (exp[ix].type == CPP_PASTE) + temp[extra_ix++] = exp[ix]; + else + exp[norm_ix++] = exp[ix]; + memcpy (&exp[norm_ix], temp, num_extra_tokens * sizeof (cpp_token)); + + /* Record there are extra tokens. */ + macro->extra_tokens = 1; + } + + out: + + /* + - This resets a flag in the parser’s state machine, pfile. + - The field `va_args_ok` tracks whether the current macro body is allowed to reference `__VA_ARGS__` (or more precisely, `__VA_OPT__`). + - It's set **while parsing a macro body** that might use variadic logic — particularly in `vaopt_state` tracking. + + Resetting it here ensures that future macros aren't accidentally parsed under the assumption that variadic substitution is valid. + */ + pfile->state.va_args_ok = 0; + + /* + Earlier we did: + if (!parse_params(pfile, &nparms, &variadic)) goto out; + This cleans up temporary memory used by parse_params. + */ + _cpp_unsave_parameters (pfile, nparms); + + return ok ? macro : NULL; +} + +/* + called from directives.cc:: do_macro +*/ +bool +_cpp_create_macro(cpp_reader *pfile, cpp_hashnode *node){ + cpp_macro *macro; + + macro = create_iso_RT_macro (pfile); + + if (!macro) + return false; + + if (cpp_macro_p (node)) + { + if (CPP_OPTION (pfile, warn_unused_macros)) + _cpp_warn_if_unused_macro (pfile, node, NULL); + + if (warn_of_redefinition (pfile, node, macro)) + { + const enum cpp_warning_reason reason + = (cpp_builtin_macro_p (node) && !(node->flags & NODE_WARN)) + ? 
CPP_W_BUILTIN_MACRO_REDEFINED : CPP_W_NONE; + + bool warned = + cpp_pedwarning_with_line (pfile, reason, + pfile->directive_line, 0, + "\"%s\" redefined", NODE_NAME (node)); + + if (warned && cpp_user_macro_p (node)) + cpp_error_with_line (pfile, CPP_DL_NOTE, + node->value.macro->line, 0, + "this is the location of the previous definition"); + } + _cpp_free_definition (node); + } + + /* Enter definition in hash table. */ + node->type = NT_USER_MACRO; + node->value.macro = macro; + if (! ustrncmp (NODE_NAME (node), DSC ("__STDC_")) + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_FORMAT_MACROS") + /* __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are mentioned + in the C standard, as something that one must use in C++. + However DR#593 and C++11 indicate that they play no role in C++. + We special-case them anyway. */ + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_LIMIT_MACROS") + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_CONSTANT_MACROS")) + node->flags |= NODE_WARN; + + /* If user defines one of the conditional macros, remove the + conditional flag */ + node->flags &= ~NODE_CONDITIONAL; + + return true; +} + diff --git a/developer/script_Deb-12.10_gcc-12.4.1/document/bump_buffer.org b/developer/script_Deb-12.10_gcc-12.4.1/document/bump_buffer.org new file mode 100644 index 0000000..7eddbac --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/document/bump_buffer.org @@ -0,0 +1,101 @@ +#+TITLE: CPP bump buffer +#+AUTHOR: Thomas +#+DATE: 2025-06-08 +#+OPTIONS: toc:nil ^:nil + +* References +https://www.chiark.greenend.org.uk/doc/cpp-4.3-doc/cppinternals.html + +#+begin_quote +**References** + +- *libcpp/lex.cc* and *libcpp/macro.cc* (GCC ≥ 12) +- Ian Lance Taylor, “*Inside libcpp*”, GNU Cauldron 2018 slides +#+end_quote + +* The arena +/libcpp/ keeps a bump-pointer arena inside =cpp_reader=. Two routines are used to directly interface with it: =_cpp_reserve_room()=, and =_cpp_commit_buff()=. + + + +| Helper | Exact responsibility | +|--------------------------------+--------------------------------------------------------| +| =_cpp_reserve_room()= | Provide tentative space without moving the cursor | +| =_cpp_commit_buff()= | Finalise the previous reservation and move the cursor | + +* _cpp_reserve_room + +Defined in `internal.h`: + +#+begin_src c +static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have, size_t extra){ + if (BUFF_ROOM (pfile->a_buff) < (have + extra)) + _cpp_extend_buff (pfile, &pfile->a_buff, extra); + return BUFF_FRONT (pfile->a_buff); +} +#+end_src + +Example invocation found in `macro.cc::create_iso_definition`: + +#+begin_src c + macro = _cpp_new_macro( + pfile + ,cmk_macro + ,_cpp_reserve_room(pfile, 0, sizeof (cpp_macro)) + ); +#+end_src + +=_cpp_reserve_room()= returns a pointer to a buffer of the specified size, `have` + `extra`. The `have` argument is interesting, as it can be changed from call to call, hence it is more for the programmer than for the function. + +The buffer returned is not generally a unique buffer. It might be the same one that was returned on the previous invocation of =_cpp_reserve_room()=, or it will be a new buffer if `_cpp_extend_buff` was called during allocation. In this latter case, a new larger buffer is made, and the old buffer is copied into it. The new buffer is then returned. + +Hence, each call to =_cpp_reserve_room()= requires that all pointers into the buffer that existed before the call be considered stale. 
After the call, the only valid pointer to the buffer is the one that =_cpp_reserve_room()= returns. + + +* `_cpp_commit_buff` + +Defined in `lex.cc`: + +#+begin_src c +void *_cpp_commit_buff (cpp_reader *pfile, size_t size){ + void *ptr = BUFF_FRONT (pfile->a_buff); + + if (pfile->hash_table->alloc_subobject) + { + void *copy = pfile->hash_table->alloc_subobject (size); + memcpy (copy, ptr, size); + ptr = copy; + } + else + BUFF_FRONT (pfile->a_buff) += size; + + return ptr; +} +#+end_src + +A call to =_cpp_commit_buff()= finalizes the accumulation of reserved room, and sets the buffer aside as a dedicated allocation. Once =_cpp_commit_buff()= is called, a subsequent call to =_cpp_reserve_room()= will start fresh with a new buffer. + +When `! pfile->hash_table->alloc_subobject`, the buffer will be committed with `BUFF_FRONT (pfile->a_buff) += size;` + +However, when `pfile->hash_table->alloc_subobject` is true, the buffer will be copied when it is committed. Hence, it is important to consider all pointers into the buffer created during the one or more calls to =_cpp_reserve_room()= to be stale after a call to =_cpp_commit_buff()=, and to subsequently only use the pointer that =_cpp_commit_buff()= returns. + +According to `GPT o3`, which hasn't gotten much right thus far,"`alloc_subobject` is non-null only while installing a macro into the hash table (create_definition etc.)." + +* Deallocating a `_cpp_reserve_room` buffer yet to be committed + +Each subsequent call to `_cpp_reserve_room`, even if it is unrelated to the prior one, either uses the buffer it finds at `pfile->a_buff`, or if it is not large enough, replaces it with a larger copy. Consequently, there is no concept of 'deallocating a `_cpp_reserve_room` buffer'. + +However, according to `o3`, if one desired: + +#+begin_src c + void *mark = _cpp_get_buff (pfile); /* snapshot cursor */ + ... + _cpp_release_buff (pfile, mark); /* rewind to mark */ +#+end_src + + +* Deallocating a committed `_cpp_reserve_room` buffer + +There is no facility for this. Detritus collects until CPP moves on to the next translation unit and then everything is released. + + diff --git a/developer/script_Deb-12.10_gcc-12.4.1/document/custom_directives_macros.org b/developer/script_Deb-12.10_gcc-12.4.1/document/custom_directives_macros.org new file mode 100644 index 0000000..a38289d --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/document/custom_directives_macros.org @@ -0,0 +1,73 @@ +#+TITLE: Adding Custom Directives and Built-in Macros in libcpp +#+AUTHOR: Thomas +#+DATE: 2025-06-10 +#+OPTIONS: toc:t + +* Custom Preprocessor Directives + +To add a new `#directive` (e.g. `#assign`, `#macro`), changes must be made in several files within libcpp. + +** `directives.cc` + +- Midway through the file, there is a static table mapping directive names to their handlers. +- Add an entry here for your new directive: + #+begin_src c + { "assign", RT_ASSIGN, false }, + { "macro", RT_MACRO, false }, + #+end_src + +- Define your handler in the switch for `handle_directive`. Place the handler logic (e.g., `handle_assign_directive`) in the section titled: + #+begin_quote + /* RT Extensions */ + #+end_quote + +** Tip: + Group your RT extensions near the end of the file for easy maintenance. + +--- + +* Custom Built-in Macros + +To define new built-in macros (e.g. `_CAT`, `_MAP`), several files need coordinated updates. 
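
As a consolidated preview of the edits detailed in the subsections below, the pieces
for a hypothetical `_CAT` built-in fit together roughly as follows; `expand_cat_macro`
is only an illustrative placeholder and the exact dispatch context may differ:

#+begin_src c
  /* include/cpplib.h: add a kind to enum cpp_builtin_type */
  BT_CAT,

  /* init.cc: register the name in builtin_array[]; the final `true` asks for a
     warning if user code redefines it */
  B("_CAT", BT_CAT, true),

  /* macro.cc: dispatch on the new kind inside the built-in macro handler */
  case BT_CAT:
    return expand_cat_macro (pfile, /* ... */);
#+end_src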
+ +** `include/cpplib.h` + +- Add a new entry to the `enum cpp_builtin_type`: + #+begin_src c + BT_CAT, + #+end_src + +- This enum distinguishes the macro’s kind in logic dispatch. + +** `init.cc` + +- Extend the `builtin_array[]` definition: + #+begin_src c + B("_CAT", BT_CAT, true), + #+end_src + +- The third parameter (`true`) indicates that a redefinition warning should be issued if the macro is redefined in user code. +- *Important:* The final entries in the table (`__DATE__`, `__TIME__`, etc.) are position-sensitive. Insert new macros near the top or middle, not the end. + +** `macros.cc` + +- This is where the actual macro expansion logic resides. +- Your `handle_builtin_macro` function will receive a `cpp_builtin_type` enum and dispatch accordingly. +- Add your implementation (e.g., for `_CAT`) in the RT Extensions section near the end of `macros.cc`. + + For example: + #+begin_src c + case BT_CAT: + return expand_cat_macro(pfile, ...); + #+end_src + +- If your macro uses token splicing, argument unpacking, or nesting, consider isolating each macro as a separate function for clarity and testability. + +--- + +* Notes + +- All RT Extension code is grouped under labeled sections (e.g., `/* RT Extensions */`) for both directives and built-in macros. This helps maintain a clean separation from upstream GCC code. +- If your extensions modify memory allocation, token arena behavior, or call back into the lexer, be sure to test under both expanded and unexpanded modes. + +Let me know if you'd like this converted into a template or included in your developer README. diff --git a/developer/script_Deb-12.10_gcc-12.4.1/document/fetching_a_token.org b/developer/script_Deb-12.10_gcc-12.4.1/document/fetching_a_token.org new file mode 100644 index 0000000..30f4988 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/document/fetching_a_token.org @@ -0,0 +1,92 @@ +#+TITLE: Token Fetching Routines in libcpp +#+AUTHOR: Thomas +#+DATE: 2025-06-10 +#+OPTIONS: toc:t + +* Getting a Token + +There are many routines for getting a token in libcpp, each tuned for different phases of preprocessing. This guide summarizes the key interfaces, grouped by their expansion behavior. + +Note that the term *expansion* is overloaded in libcpp: + +1. One meaning refers to **macro expansion** — replacing a macro call with its definition (possibly recursively). +2. Another refers to the **macro definition buffer** itself: macros store an array of tokens as their "expansion". Adding a token to this array is said to "add an expansion token", which is unrelated to expanding a macro. + +--- + +* No Macro Expansion + +** `lex.cc::_cpp_lex_token(pfile)` + +- Returns a pointer to the next token without macro expansion. +- It does handle important preprocessor logic: + - Skipping gated code (`#if 0`) + - Handling `#line` or `#pragma` + - Deferred pragma state +- This is the general-purpose low-level token fetch used during parsing — but it still obeys the logical flow of the file. + +** `macro.cc::lex_expansion_token` + +- Lexes a token and places it into the macro's `expansion` array. +- The token is *not* processed in the usual way — it does *not* honor `#if` skipping, macro argument parsing, or directive detection. +- Used by `create_iso_definition()` when parsing macro bodies. + +** `lex.cc::_cpp_lex_direct(pfile)` + +- Very low-level lexer function. +- Called *only* by `_cpp_lex_token`. +- Assumes `pfile->cur_token` is pre-set; writes directly into that memory. +- Bypasses all macro, directive, or skipping logic. 
+- Should not be used outside of tightly controlled contexts — *not* a public interface. + +--- + +* With Macro Expansion + +** `macro.cc::cpp_get_token_1(pfile, &src_loc)` + +- Returns the next token with full macro expansion. +- Also returns `location_t` via an out parameter. +- This is the recommended entry point for frontends or extensions that need both token and position. +- Internally calls `_cpp_lex_token` and handles expansion. + +** `macro.cc::cpp_get_token(pfile)` + +- Thin wrapper around `cpp_get_token_1`, but discards the location. +- Used in internal code where location is unnecessary. + +** `macro.cc::cpp_get_token_no_padding(pfile)` + +- Calls `cpp_get_token`, but skips tokens of type `CPP_PADDING`. +- Useful when padding tokens are irrelevant (e.g. while collecting macro arguments). + +--- + +* RT Extension + +** `rt_extensions.cc::get_token_noexpand(pfile)` + +- Returns one unexpanded token by value. +- Does *not* use the regular token stream; instead, it temporarily sets `pfile->cur_token` to a local buffer, calls `_cpp_lex_direct`, and restores the original pointer. +- Unlike `lex_expansion_token`, it does *not* store the result in a macro; it simply returns the token. +- This is appropriate for previewing the next token *without affecting* the token stream or macro state. + +However, for parsing structured clauses such as macro bodies — especially those that may include `#if`/`#endif` constructs — it is better to use `_cpp_lex_token`, to match the conditional skipping behavior of `cpp_get_token_1`. + +--- + +* Summary Table + +| Function | Expansion | Skips `#if 0` | Stores Token | Returns Token | Returns Location | +|-----------------------------+-----------+---------------+---------------+----------------+-------------------| +| `lex_expansion_token` | ❌ | ❌ | ✔ (into macro) | ❌ | ✔ | +| `_cpp_lex_direct` | ❌ | ❌ | ✔ (in-place) | ❌ | ✔ | +| `_cpp_lex_token` | ❌ | ✔ | ✔ | ✔ | ✔ | +| `cpp_get_token` | ✔ | ✔ | ✔ | ✔ | ❌ | +| `cpp_get_token_1` | ✔ | ✔ | ✔ | ✔ | ✔ | +| `cpp_get_token_no_padding` | ✔ | ✔ | ✔ | ✔ (no padding) | ❌ | +| `get_token_noexpand` | ❌ | ❌ | ❌ | ✔ (by value) | ✔ | + +--- + +Let me know if you'd like to cross-link these to their actual definitions or integrate them into a larger RT CPP developer manual. diff --git a/developer/script_Deb-12.10_gcc-12.4.1/document/todo.org b/developer/script_Deb-12.10_gcc-12.4.1/document/todo.org new file mode 100644 index 0000000..40142c6 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/document/todo.org @@ -0,0 +1,23 @@ +2025-05-00 + + - Add the call back and warn logic for #assign in the macro.cc::name_clause_is_name function. + + - The name is currently () or [], probably should allow a single name ID or [] instead. + or perhaps in general, allow for evaluated, not evaluated, or single identifier options. + + - When this matures, should replace the capture/install with diff and patch. + + +2025-05-17 in maco.cc, seems the end cases in `parse_clause_literal()` should be included in `parse_clause_expand()`. + +2025-05-18 It would have been better perhaps to send in a pointer to a token allocation, + instead of to a src_loc, and and terminal type. + + +2025-06-08 + + - I wonder if _ASSIGN can be contrived, perhaps along with a count, to create an + infinite number of uniquely named macros. This might be worth exploring. 
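
 A rough sketch of the idea above, using only the _ASSIGN and _PASTE built-ins as
 described in the user manual. GENSYM_COUNT is a hypothetical hand-maintained
 counter, not an existing built-in, and whether coloring permits re-reading the
 counter in a later evaluation is exactly the thing to verify:

   _ASSIGN (GENSYM_COUNT) (0)
   _ASSIGN [_PASTE(TMP_ ,GENSYM_COUNT)] (first unique body)
   _ASSIGN (GENSYM_COUNT) (1)
   _ASSIGN [_PASTE(TMP_ ,GENSYM_COUNT)] (second unique body)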
+ + + diff --git a/developer/script_Deb-12.10_gcc-12.4.1/environment.sh b/developer/script_Deb-12.10_gcc-12.4.1/environment.sh new file mode 100755 index 0000000..4bcfd3d --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/environment.sh @@ -0,0 +1,159 @@ +# === environment.sh === +# Source this file in each build script to ensure consistent paths and settings + +#!/bin/sh + +: "${REPO_HOME:?REPO_HOME is not set}" +: "${DEVELOPER:?DEVELOPER is not set}" +: "${SCRIPT_DIR:?SCRIPT_DIR is not set}" + +[ -d "$REPO_HOME" ] || { echo "Directory not found: REPO_HOME ($REPO_HOME)" >&2; exit 1; } +[ -d "$DEVELOPER" ] || { echo "Directory not found: DEVELOPER ($DEVELOPER)" >&2; exit 1; } +[ -d "$SCRIPT_DIR" ] || { echo "Directory not found: SCRIPT_DIR ($SCRIPT_DIR)" >&2; exit 1; } + +echo "REPO_HOME: $REPO_HOME" +echo "DEVELOPER: $DEVELOPER" +echo "SCRIPT_DIR: $SCRIPT_DIR" + +#-------------------------------------------------------------------------------- +# project structure + + # temporary directory + export TMPDIR="$REPO_HOME/tmp" + + # Project directories + export SYSROOT="$DEVELOPER/sysroot" + export TOOLCHAIN="$DEVELOPER/toolchain" + export BUILD_DIR="$DEVELOPER/build" + export LOGDIR="$DEVELOPER/log" + export UPSTREAM="$DEVELOPER/upstream" + export SRC=$DEVELOPER/source + + # lists of project directories to synthesize + PROJECT_DIR_LIST=( + "$LOGDIR" + "$SYSROOT" "$TOOLCHAIN" "$BUILD_DIR" + "$UPSTREAM" "$SRC" + ) + # list these in the order for which they can be deleted + PROJECT_SUBDIR_LIST=( + "$SYSROOT/usr/lib" + "$SYSROOT/lib" + "$SYSROOT/usr/include" + ) + +#-------------------------------------------------------------------------------- +# Tool and library versions (optimized build with Graphite and LTO compression) + + export GCC_VER=12.2.0 # GCC version to build + export GMP_VER=6.2.1 # Sufficient for GCC 12.2 + export MPFR_VER=4.1.0 # Stable version compatible with GCC 12.2 + export MPC_VER=1.2.1 # Recommended for GCC 12.2 + export ISL_VER=0.24 # GCC 12.x infra uses this; don't use 0.26+ unless patched + export ZSTD_VER=1.5.5 # zstd compression for LTO bytecode + +#-------------------------------------------------------------------------------- +# tools + + # Compiler path prefixes + export CC_FOR_BUILD="$(command -v gcc)" + export CXX_FOR_BUILD="$(command -v g++)" + export MAKE="$(command -v make)" + + # Verify that compilers were found + : "${CC_FOR_BUILD:?gcc not found in PATH}" + : "${CXX_FOR_BUILD:?g++ not found in PATH}" + : "${MAKE:?make not found in PATH}" + + [ -x "$CC_FOR_BUILD" ] || { echo "❌ $CC_FOR_BUILD is not executable"; exit 1; } + [ -x "$CXX_FOR_BUILD" ] || { echo "❌ $CXX_FOR_BUILD is not executable"; exit 1; } + [ -x "$MAKE" ] || { echo "❌ $MAKE is not executable"; exit 1; } + + # Machine target + export HOST="$("$CC_FOR_BUILD" -dumpmachine)" + + # Determine parallelism + if command -v getconf >/dev/null 2>&1; then + export MAKE_JOBS=$(getconf _NPROCESSORS_ONLN) + else + echo "⚠️ getconf not found; defaulting MAKE_JOBS=1" + export MAKE_JOBS=1 + fi + + +#-------------------------------------------------------------------------------- +# upstream -> local stuff + + # see top of this file for the _VER variables + + # Tarball Download Info (Name, URL, Destination Directory) + export UPSTREAM_TARBALL_LIST=( + "gmp-${GMP_VER}.tar.xz" + "https://ftp.gnu.org/gnu/gmp/gmp-${GMP_VER}.tar.xz" + "$UPSTREAM/gmp-$GMP_VER" + + "mpfr-${MPFR_VER}.tar.xz" + "https://www.mpfr.org/mpfr-${MPFR_VER}/mpfr-${MPFR_VER}.tar.xz" + "$UPSTREAM/mpfr-$MPFR_VER" + + "mpc-${MPC_VER}.tar.gz" + 
"https://ftp.gnu.org/gnu/mpc/mpc-${MPC_VER}.tar.gz" + "$UPSTREAM/mpc-$MPC_VER" + + "isl-${ISL_VER}.tar.bz2" + "https://libisl.sourceforge.io/isl-${ISL_VER}.tar.bz2" + "$UPSTREAM/isl-$ISL_VER" + + "zstd-${ZSTD_VER}.tar.zst" + "https://github.com/facebook/zstd/releases/download/v${ZSTD_VER}/zstd-${ZSTD_VER}.tar.zst" + "$UPSTREAM/zstd-$ZSTD_VER" + ) + + # Git Repo Info + # Each entry is triple: Repository URL, Branch, Destination Directory + export UPSTREAM_GIT_REPO_LIST=( + + "git://gcc.gnu.org/git/gcc.git" + "releases/gcc-12" + "$SRC/gcc-$GCC_VER" + + #no second repo entry + ) + +#-------------------------------------------------------------------------------- +# source + + # Source directories + export GCC_SRC="$SRC/gcc-$GCC_VER" + export GMP_SRC="$SRC/gmp-$GMP_VER" + export MPFR_SRC="$SRC/mpfr-$MPFR_VER" + export MPC_SRC="$SRC/mpc-$MPC_VER" + export ISL_SRC="$SRC/isl-$ISL_VER" + export ZSTD_SRC="$SRC/zstd-$ZSTD_VER" + + SOURCE_DIR_LIST=( + "$GCC_SRC" + "$GMP_SRC" + "$MPFR_SRC" + "$MPC_SRC" + "$ISL_SRC" + "$ZSTD_SRC" + ) + +#-------------------------------------------------------------------------------- +# RT extensions affected files + + RT_CPP_FILES=(init.cc directives.cc macro.cc include/cpplib.h) + + +#-------------------------------------------------------------------------------- +# build + + # Build directories + export GCC_BUILD="$BUILD_DIR/gcc" + BUILD_DIR_LIST=( + "$GCC_BUILD" + ) + + + diff --git a/developer/script_Deb-12.10_gcc-12.4.1/ext_capture.sh b/developer/script_Deb-12.10_gcc-12.4.1/ext_capture.sh new file mode 100755 index 0000000..7e3ad86 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/ext_capture.sh @@ -0,0 +1,100 @@ +#!/bin/bash +set -euo pipefail + +# provides RT_CPP_FILES +source "$(dirname "$0")/environment.sh" + + +echo "⚠️ You probably don't want to run this script. The files in \$DEVELOPER/script_Deb-12.10_gcc-12.4.1🖉/library are intended to be the authoritative copies." +echo "So you did the bad thing and edited the files directly in the GCC source tree? Then this script is for you. ;-)" +echo + +echo -n "Continue❓ [y/N]: " +read -r response +if [[ "$response" == "y" || "$response" == "Y" ]]; then + : +else + exit 1 +fi + + +if [[ -z "${DEVELOPER:-}" ]]; then + echo "❌ DEVELOPER environment variable is not set. Aborting." + exit 1 +fi +if [[ -z "${SCRIPT_DIR:-}" ]]; then + echo "❌ SCRIPT_DIR environment variable is not set. Aborting." + exit 1 +fi +SRCDIR="library/" +DESTDIR="$GCC_SRC/libcpp/" + + +SRCDIR="$DEVELOPER/source/gcc-12.2.0/libcpp" +DESTDIR="$DEVELOPER/script_Deb-12.10_gcc-12.4.1🖉/library" + +if [[ ! -d "$SRCDIR" ]]; then + echo "❌ Source directory '$SRCDIR' does not exist." + exit 1 +fi + +if [[ ! -d "$DESTDIR" ]]; then + echo "❌ Destination directory '$DESTDIR' does not exist." + exit 1 +fi + +echo "📋 Checking files in $SRCDIR to copy to $DESTDIR..." + +for file in "${RT_CPP_FILES[@]}"; do + SRC="$SRCDIR/$file" + DEST="$DESTDIR/$file" + + mkdir -p "$(dirname "$DEST")" + + if [[ ! -f "$SRC" ]]; then + echo "⚠️ Source file '$SRC' not found. Skipping." + continue + fi + + if [[ ! -f "$DEST" ]]; then + echo "📤 No destination file. Copying: $file" + cp -p "$SRC" "$DEST" + continue + fi + + if cmp -s "$SRC" "$DEST"; then + echo "✅ No changes: $file" + continue + fi + + if [[ "$SRC" -nt "$DEST" ]]; then + echo "📤 Source is newer and differs. Copying: $file" + cp -p "$SRC" "$DEST" + elif [[ "$DEST" -nt "$SRC" ]]; then + echo "⚠️ Destination file '$file' is newer than source and differs." 
+ echo "🔍 Showing diff:" + diff -u "$DEST" "$SRC" || true + echo -n "❓ Overwrite the authoritative '$file' with the older source version? [y/N]: " + read -r response + if [[ "$response" == "y" || "$response" == "Y" ]]; then + echo "📤 Overwriting with older source: $file" + cp -p "$SRC" "$DEST" + else + echo "❌ Skipping: $file" + fi + else + echo "⚠️ Files differ but timestamps are equal: $file" + echo "🔍 Showing diff:" + diff -u "$DEST" "$SRC" || true + echo -n "❓ Overwrite anyway? [y/N]: " + read -r response + if [[ "$response" == "y" || "$response" == "Y" ]]; then + cp -p "$SRC" "$DEST" + echo "📤 Overwritten." + else + echo "❌ Skipped." + fi + fi +done + +echo "✅ Capture complete." diff --git a/developer/script_Deb-12.10_gcc-12.4.1/ext_diff.sh b/developer/script_Deb-12.10_gcc-12.4.1/ext_diff.sh new file mode 100755 index 0000000..9d75cc9 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/ext_diff.sh @@ -0,0 +1,64 @@ +#!/bin/bash +set -euo pipefail + +# Provides RT_CPP_FILES and paths +source "$(dirname "$0")/environment.sh" + +# Check required env vars +if [[ -z "${DEVELOPER:-}" ]]; then + echo "❌ DEVELOPER environment variable is not set. Aborting." + exit 1 +fi +if [[ -z "${SCRIPT_DIR:-}" ]]; then + echo "❌ SCRIPT_DIR environment variable is not set. Aborting." + exit 1 +fi + +SRCDIR="library/" +DESTDIR="$GCC_SRC/libcpp/" + +if [[ ! -d "$SRCDIR" ]]; then + echo "❌ Source directory '$SRCDIR' does not exist." + exit 1 +fi + +if [[ ! -d "$DESTDIR" ]]; then + echo "❌ Destination directory '$DESTDIR' does not exist." + exit 1 +fi + +# Choose files to diff +FILES=() +if [[ "$#" -gt 0 ]]; then + FILES=("$@") +else + FILES=("${RT_CPP_FILES[@]}") +fi + +echo "🔍 Diffing library ↔ libcpp..." + +for file in "${FILES[@]}"; do + SRC="$SRCDIR/$file" + DEST="$DESTDIR/$file" + + echo "🔸 $file" + + if [[ ! -f "$SRC" ]]; then + echo " ⚠️ Missing in library/: $SRC" + continue + fi + + if [[ ! -f "$DEST" ]]; then + echo " ⚠️ Missing in libcpp/: $DEST" + continue + fi + + if cmp -s "$SRC" "$DEST"; then + echo " ✅ No differences." + else + echo " ❗ Differences found:" + diff -u "$DEST" "$SRC" || true + fi +done + +echo "✅ Diff check complete." diff --git a/developer/script_Deb-12.10_gcc-12.4.1/ext_install.sh b/developer/script_Deb-12.10_gcc-12.4.1/ext_install.sh new file mode 100755 index 0000000..a25a026 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/ext_install.sh @@ -0,0 +1,58 @@ +#!/bin/bash +# ext_isntall.sh – Install RT library files into GCC libcpp source tree. +# Usage: +# ./ext_isntall.sh → ext_isntalls all files in RT_CPP_FILES +# ./ext_isntall.sh init.cc → ext_isntalls only init.cc +# ./ext_isntall.sh init.cc macro.cc → ext_isntalls just those + +set -euo pipefail + +# provides: $DEVELOPER, $RT_CPP_FILES, $GCC_SRC +source "$(dirname "$0")/environment.sh" + +SRCDIR="library" +DESTDIR="$GCC_SRC/libcpp" + +# Validate environment +[[ -z "${DEVELOPER:-}" ]] && { echo "❌ DEVELOPER is not set. Aborting."; exit 1; } +[[ ! -d "$SRCDIR" ]] && { echo "❌ Source directory '$SRCDIR' missing."; exit 1; } +[[ ! -d "$DESTDIR" ]] && { echo "❌ Destination directory '$DESTDIR' missing."; exit 1; } + +# Determine list of files to ext_isntall +if [[ $# -eq 0 ]]; then + file_list=("${RT_CPP_FILES[@]}") +else + file_list=("$@") +fi + +echo "📋 Ext_Isntallring files to $DESTDIR..." + +for file in "${file_list[@]}"; do + src="$SRCDIR/$file" + dest="$DESTDIR/$file" + + if [[ ! -f "$src" ]]; then + echo "⚠️ Missing source file: $src" + continue + fi + + if [[ ! 
-f "$dest" || "$src" -nt "$dest" ]]; then + echo "📥 Copying (newer or missing): $file" + cp -p "$src" "$dest" + elif [[ "$dest" -nt "$src" ]]; then + echo "⚠️ Destination '$file' is newer than source." + diff -u "$dest" "$src" || true + echo -n "❓ Overwrite destination '$file'? [y/N]: " + read -r response + if [[ "$response" =~ ^[Yy]$ ]]; then + echo "📥 Overwriting: $file" + cp -p "$src" "$dest" + else + echo "⏭️ Skipped: $file" + fi + else + echo "✅ Up-to-date: $file" + fi +done + +echo "✅ Ext_Isntall complete." diff --git a/developer/script_Deb-12.10_gcc-12.4.1/ext_save.sh b/developer/script_Deb-12.10_gcc-12.4.1/ext_save.sh new file mode 100755 index 0000000..2869048 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/ext_save.sh @@ -0,0 +1,46 @@ +#!/bin/bash +set -euo pipefail + +# provides RT_CPP_FILES +source "$(dirname "$0")/environment.sh" + +# Save original versions of libcpp files to prevent accidental loss +# Appends _orig after the .cc extension (e.g., macro.cc → macro.cc_orig) +# Files remain in place but can be manually diffed or restored if needed + +if [[ -z "${DEVELOPER:-}" ]]; then + echo "❌ DEVELOPER environment variable is not set. Aborting." + exit 1 +fi +if [[ -z "${SCRIPT_DIR:-}" ]]; then + echo "❌ SCRIPT_DIR environment variable is not set. Aborting." + exit 1 +fi + +TARGETDIR="$GCC_SRC/libcpp/" + +if [[ ! -d "$TARGETDIR" ]]; then + echo "❌ Target directory '$TARGETDIR' does not exist." + exit 1 +fi + +echo "📦 Saving original copies of target files..." + +for file in "${RT_CPP_FILES[@]}"; do + SRC="$TARGETDIR/$file" + BACKUP="$SRC"_orig + + if [[ ! -f "$SRC" ]]; then + echo "⚠️ Source file '$SRC' not found. Skipping." + continue + fi + + if [[ -f "$BACKUP" ]]; then + echo "✅ Already saved: $file → $(basename "$BACKUP")" + else + cp -p "$SRC" "$BACKUP" + echo "📁 Saved: $file → $(basename "$BACKUP")" + fi +done + +echo "✅ All originals saved." diff --git a/developer/script_Deb-12.10_gcc-12.4.1/library/directives.cc b/developer/script_Deb-12.10_gcc-12.4.1/library/directives.cc new file mode 100644 index 0000000..39fbde6 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/library/directives.cc @@ -0,0 +1,2886 @@ +/* CPP Library. (Directive handling.) + Copyright (C) 1986-2022 Free Software Foundation, Inc. + Contributed by Per Bothner, 1994-95. + Based on CCCP program by Paul Rubin, June 1986 + Adapted to ANSI C, Richard Stallman, Jan 1987 + +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; see the file COPYING3. If not see +. */ + +#pragma GCC diagnostic ignored "-Wparentheses" + +#include "config.h" +#include "system.h" +#include "cpplib.h" +#include "internal.h" +#include "mkdeps.h" +#include "obstack.h" + +/* Stack of conditionals currently in progress + (including both successful and failing conditionals). */ +struct if_stack +{ + struct if_stack *next; + location_t line; /* Line where condition started. */ + const cpp_hashnode *mi_cmacro;/* macro name for #ifndef around entire file */ + bool skip_elses; /* Can future #else / #elif be skipped? 
*/ + bool was_skipping; /* If were skipping on entry. */ + int type; /* Most recent conditional for diagnostics. */ +}; + +/* Contains a registered pragma or pragma namespace. */ +typedef void (*pragma_cb) (cpp_reader *); +struct pragma_entry +{ + struct pragma_entry *next; + const cpp_hashnode *pragma; /* Name and length. */ + bool is_nspace; + bool is_internal; + bool is_deferred; + bool allow_expansion; + union { + pragma_cb handler; + struct pragma_entry *space; + unsigned int ident; + } u; +}; + +/* Values for the origin field of struct directive. KANDR directives + come from traditional (K&R) C. STDC89 directives come from the + 1989 C standard. STDC2X directives come from the C2X standard. EXTENSION + directives are extensions. */ +#define KANDR 0 +#define STDC89 1 +#define STDC2X 2 +#define EXTENSION 3 + +/* Values for the flags field of struct directive. COND indicates a + conditional; IF_COND an opening conditional. INCL means to treat + "..." and <...> as q-char and h-char sequences respectively. IN_I + means this directive should be handled even if -fpreprocessed is in + effect (these are the directives with callback hooks). + + EXPAND is set on directives that are always macro-expanded. + + ELIFDEF is set on directives that are only handled for standards with the + #elifdef / #elifndef feature. */ +#define COND (1 << 0) +#define IF_COND (1 << 1) +#define INCL (1 << 2) +#define IN_I (1 << 3) +#define EXPAND (1 << 4) +#define DEPRECATED (1 << 5) +#define ELIFDEF (1 << 6) + +/* Defines one #-directive, including how to handle it. */ +typedef void (*directive_handler) (cpp_reader *); +typedef struct directive directive; +struct directive +{ + directive_handler handler; /* Function to handle directive. */ + const uchar *name; /* Name of directive. */ + unsigned short length; /* Length of name. */ + unsigned char origin; /* Origin of directive. */ + unsigned char flags; /* Flags describing this directive. */ +}; + +/* Forward declarations. 
*/ + +static void skip_rest_of_line (cpp_reader *); +static void check_eol (cpp_reader *, bool); +static void start_directive (cpp_reader *); +static void prepare_directive_trad (cpp_reader *); +static void end_directive (cpp_reader *, int); +static void directive_diagnostics (cpp_reader *, const directive *, int); +static void run_directive (cpp_reader *, int, const char *, size_t); +static char *glue_header_name (cpp_reader *); +static const char *parse_include (cpp_reader *, int *, const cpp_token ***, + location_t *); +static void push_conditional (cpp_reader *, int, int, const cpp_hashnode *); +static unsigned int read_flag (cpp_reader *, unsigned int); +static bool strtolinenum (const uchar *, size_t, linenum_type *, bool *); +static void do_diagnostic (cpp_reader *, enum cpp_diagnostic_level code, + enum cpp_warning_reason reason, int); +static cpp_hashnode *lex_macro_node (cpp_reader *, bool); +static int undefine_macros (cpp_reader *, cpp_hashnode *, void *); +static void do_include_common (cpp_reader *, enum include_type); +static struct pragma_entry *lookup_pragma_entry (struct pragma_entry *, + const cpp_hashnode *); +static int count_registered_pragmas (struct pragma_entry *); +static char ** save_registered_pragmas (struct pragma_entry *, char **); +static char ** restore_registered_pragmas (cpp_reader *, struct pragma_entry *, + char **); +static void do_pragma_once (cpp_reader *); +static void do_pragma_poison (cpp_reader *); +static void do_pragma_system_header (cpp_reader *); +static void do_pragma_dependency (cpp_reader *); +static void do_pragma_warning_or_error (cpp_reader *, bool error); +static void do_pragma_warning (cpp_reader *); +static void do_pragma_error (cpp_reader *); +static void do_linemarker (cpp_reader *); +static const cpp_token *get_token_no_padding (cpp_reader *); +static const cpp_token *get__Pragma_string (cpp_reader *); +static void destringize_and_run (cpp_reader *, const cpp_string *, + location_t); +static bool parse_answer (cpp_reader *, int, location_t, cpp_macro **); +static cpp_hashnode *parse_assertion (cpp_reader *, int, cpp_macro **); +static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *); +static void handle_assertion (cpp_reader *, const char *, int); +static void do_pragma_push_macro (cpp_reader *); +static void do_pragma_pop_macro (cpp_reader *); +static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *); + +/* This is the table of directive handlers. All extensions other than + #warning, #include_next, and #import are deprecated. The name is + where the extension appears to have come from. 
*/ + +#define DIRECTIVE_TABLE \ + D(define ,T_DEFINE = 0 ,KANDR ,IN_I) \ + D(include ,T_INCLUDE ,KANDR ,INCL | EXPAND) \ + D(endif ,T_ENDIF ,KANDR ,COND) \ + D(ifdef ,T_IFDEF ,KANDR ,COND | IF_COND) \ + D(if ,T_IF ,KANDR ,COND | IF_COND | EXPAND) \ + D(else ,T_ELSE ,KANDR ,COND) \ + D(ifndef ,T_IFNDEF ,KANDR ,COND | IF_COND) \ + D(undef ,T_UNDEF ,KANDR ,IN_I) \ + D(line ,T_LINE ,KANDR ,EXPAND) \ + D(elif ,T_ELIF ,STDC89 ,COND | EXPAND) \ + D(elifdef ,T_ELIFDEF ,STDC2X ,COND | ELIFDEF) \ + D(elifndef ,T_ELIFNDEF ,STDC2X ,COND | ELIFDEF) \ + D(error ,T_ERROR ,STDC89 ,0) \ + D(pragma ,T_PRAGMA ,STDC89 ,IN_I) \ + D(warning ,T_WARNING ,EXTENSION ,0) \ + D(include_next ,T_INCLUDE_NEXT ,EXTENSION ,INCL | EXPAND) \ + D(ident ,T_IDENT ,EXTENSION ,IN_I) \ + D(import ,T_IMPORT ,EXTENSION ,INCL | EXPAND) /* ObjC */ \ + D(assert ,T_ASSERT ,EXTENSION ,DEPRECATED) /* SVR4 */ \ + D(unassert ,T_UNASSERT ,EXTENSION ,DEPRECATED) /* SVR4 */ \ + D(sccs ,T_SCCS ,EXTENSION ,IN_I) /* SVR4? */ \ + D(rt_macro ,T_MACRO ,EXTENSION ,IN_I) \ + D(assign ,T_ASSIGN ,EXTENSION ,IN_I) + + +/* #sccs is synonymous with #ident. */ +#define do_sccs do_ident + +/* Use the table to generate a series of prototypes, an enum for the + directive names, and an array of directive handlers. */ + +#define D(name, t, o, f) static void do_##name (cpp_reader *); +DIRECTIVE_TABLE +#undef D + +#define D(n, tag, o, f) tag, +enum +{ + DIRECTIVE_TABLE + N_DIRECTIVES +}; +#undef D + +#define D(name, t, origin, flags) \ +{ do_##name, (const uchar *) #name, \ + sizeof #name - 1, origin, flags }, +static const directive dtable[] = +{ +DIRECTIVE_TABLE +}; +#undef D + +/* A NULL-terminated array of directive names for use + when suggesting corrections for misspelled directives. */ +#define D(name, t, origin, flags) #name, +static const char * const directive_names[] = { +DIRECTIVE_TABLE + NULL +}; +#undef D + +#undef DIRECTIVE_TABLE + +/* Wrapper struct directive for linemarkers. + The origin is more or less true - the original K+R cpp + did use this notation in its preprocessed output. */ +static const directive linemarker_dir = +{ + do_linemarker, UC"#", 1, KANDR, IN_I +}; + +/* Skip any remaining tokens in a directive. */ +static void +skip_rest_of_line (cpp_reader *pfile) +{ + /* Discard all stacked contexts. */ + while (pfile->context->prev) + _cpp_pop_context (pfile); + + /* Sweep up all tokens remaining on the line. */ + if (! SEEN_EOL ()) + while (_cpp_lex_token (pfile)->type != CPP_EOF) + ; +} + +/* Helper function for check_oel. */ + +static void +check_eol_1 (cpp_reader *pfile, bool expand, enum cpp_warning_reason reason) +{ + if (! SEEN_EOL () && (expand + ? cpp_get_token (pfile) + : _cpp_lex_token (pfile))->type != CPP_EOF) + cpp_pedwarning (pfile, reason, "extra tokens at end of #%s directive", + pfile->directive->name); +} + +/* Variant of check_eol used for Wendif-labels warnings. */ + +static void +check_eol_endif_labels (cpp_reader *pfile) +{ + check_eol_1 (pfile, false, CPP_W_ENDIF_LABELS); +} + +/* Ensure there are no stray tokens at the end of a directive. If + EXPAND is true, tokens macro-expanding to nothing are allowed. */ + +static void +check_eol (cpp_reader *pfile, bool expand) +{ + check_eol_1 (pfile, expand, CPP_W_NONE); +} + +/* Ensure there are no stray tokens other than comments at the end of + a directive, and gather the comments. 
*/ +static const cpp_token ** +check_eol_return_comments (cpp_reader *pfile) +{ + size_t c; + size_t capacity = 8; + const cpp_token **buf; + + buf = XNEWVEC (const cpp_token *, capacity); + c = 0; + if (! SEEN_EOL ()) + { + while (1) + { + const cpp_token *tok; + + tok = _cpp_lex_token (pfile); + if (tok->type == CPP_EOF) + break; + if (tok->type != CPP_COMMENT) + cpp_error (pfile, CPP_DL_PEDWARN, + "extra tokens at end of #%s directive", + pfile->directive->name); + else + { + if (c + 1 >= capacity) + { + capacity *= 2; + buf = XRESIZEVEC (const cpp_token *, buf, capacity); + } + buf[c] = tok; + ++c; + } + } + } + buf[c] = NULL; + return buf; +} + +/* Called when entering a directive, _Pragma or command-line directive. */ +static void +start_directive (cpp_reader *pfile) +{ + /* Setup in-directive state. */ + pfile->state.in_directive = 1; + pfile->state.save_comments = 0; + pfile->directive_result.type = CPP_PADDING; + + /* Some handlers need the position of the # for diagnostics. */ + pfile->directive_line = pfile->line_table->highest_line; +} + +/* Called when leaving a directive, _Pragma or command-line directive. */ +static void +end_directive (cpp_reader *pfile, int skip_line) +{ + if (CPP_OPTION (pfile, traditional)) + { + /* Revert change of prepare_directive_trad. */ + if (!pfile->state.in_deferred_pragma) + pfile->state.prevent_expansion--; + + if (pfile->directive != &dtable[T_DEFINE]) + _cpp_remove_overlay (pfile); + } + else if (pfile->state.in_deferred_pragma) + ; + /* We don't skip for an assembler #. */ + else if (skip_line) + { + skip_rest_of_line (pfile); + if (!pfile->keep_tokens) + { + pfile->cur_run = &pfile->base_run; + pfile->cur_token = pfile->base_run.base; + } + } + + /* Restore state. */ + pfile->state.save_comments = ! CPP_OPTION (pfile, discard_comments); + pfile->state.in_directive = 0; + pfile->state.in_expression = 0; + pfile->state.angled_headers = 0; + pfile->directive = 0; +} + +/* Prepare to handle the directive in pfile->directive. */ +static void +prepare_directive_trad (cpp_reader *pfile) +{ + if (pfile->directive != &dtable[T_DEFINE]) + { + bool no_expand = (pfile->directive + && ! (pfile->directive->flags & EXPAND)); + bool was_skipping = pfile->state.skipping; + + pfile->state.in_expression = (pfile->directive == &dtable[T_IF] + || pfile->directive == &dtable[T_ELIF]); + if (pfile->state.in_expression) + pfile->state.skipping = false; + + if (no_expand) + pfile->state.prevent_expansion++; + _cpp_scan_out_logical_line (pfile, NULL, false); + if (no_expand) + pfile->state.prevent_expansion--; + + pfile->state.skipping = was_skipping; + _cpp_overlay_buffer (pfile, pfile->out.base, + pfile->out.cur - pfile->out.base); + } + + /* Stop ISO C from expanding anything. */ + pfile->state.prevent_expansion++; +} + +/* Output diagnostics for a directive DIR. INDENTED is nonzero if + the '#' was indented. */ +static void +directive_diagnostics (cpp_reader *pfile, const directive *dir, int indented) +{ + /* Issue -pedantic or deprecated warnings for extensions. We let + -pedantic take precedence if both are applicable. */ + if (! 
pfile->state.skipping) + { + if (dir->origin == EXTENSION + && !(dir == &dtable[T_IMPORT] && CPP_OPTION (pfile, objc)) + && CPP_PEDANTIC (pfile)) + cpp_error (pfile, CPP_DL_PEDWARN, "#%s is a GCC extension", dir->name); + else if (((dir->flags & DEPRECATED) != 0 + || (dir == &dtable[T_IMPORT] && !CPP_OPTION (pfile, objc))) + && CPP_OPTION (pfile, cpp_warn_deprecated)) + cpp_warning (pfile, CPP_W_DEPRECATED, + "#%s is a deprecated GCC extension", dir->name); + } + + /* Traditionally, a directive is ignored unless its # is in + column 1. Therefore in code intended to work with K+R + compilers, directives added by C89 must have their # + indented, and directives present in traditional C must not. + This is true even of directives in skipped conditional + blocks. #elif cannot be used at all. */ + if (CPP_WTRADITIONAL (pfile)) + { + if (dir == &dtable[T_ELIF]) + cpp_warning (pfile, CPP_W_TRADITIONAL, + "suggest not using #elif in traditional C"); + else if (indented && dir->origin == KANDR) + cpp_warning (pfile, CPP_W_TRADITIONAL, + "traditional C ignores #%s with the # indented", + dir->name); + else if (!indented && dir->origin != KANDR) + cpp_warning (pfile, CPP_W_TRADITIONAL, + "suggest hiding #%s from traditional C with an indented #", + dir->name); + } +} + +/* Check if we have a known directive. INDENTED is true if the + '#' of the directive was indented. This function is in this file + to save unnecessarily exporting dtable etc. to lex.cc. Returns + nonzero if the line of tokens has been handled, zero if we should + continue processing the line. */ +int +_cpp_handle_directive (cpp_reader *pfile, bool indented) +{ + const directive *dir = 0; + const cpp_token *dname; + bool was_parsing_args = pfile->state.parsing_args; + bool was_discarding_output = pfile->state.discarding_output; + int skip = 1; + + if (was_discarding_output) + pfile->state.prevent_expansion = 0; + + if (was_parsing_args) + { + if (CPP_OPTION (pfile, cpp_pedantic)) + cpp_error (pfile, CPP_DL_PEDWARN, + "embedding a directive within macro arguments is not portable"); + pfile->state.parsing_args = 0; + pfile->state.prevent_expansion = 0; + } + start_directive (pfile); + dname = _cpp_lex_token (pfile); + + if (dname->type == CPP_NAME) + { + if (dname->val.node.node->is_directive) + { + dir = &dtable[dname->val.node.node->directive_index]; + if ((dir->flags & ELIFDEF) + && !CPP_OPTION (pfile, elifdef) + /* For -std=gnu* modes elifdef is supported with + a pedwarn if pedantic. */ + && CPP_OPTION (pfile, std)) + dir = 0; + } + } + /* We do not recognize the # followed by a number extension in + assembler code. */ + else if (dname->type == CPP_NUMBER && CPP_OPTION (pfile, lang) != CLK_ASM) + { + dir = &linemarker_dir; + if (CPP_PEDANTIC (pfile) && ! CPP_OPTION (pfile, preprocessed) + && ! pfile->state.skipping) + cpp_error (pfile, CPP_DL_PEDWARN, + "style of line directive is a GCC extension"); + } + + if (dir) + { + /* If we have a directive that is not an opening conditional, + invalidate any control macro. */ + if (! (dir->flags & IF_COND)) + pfile->mi_valid = false; + + /* Kluge alert. In order to be sure that code like this + + #define HASH # + HASH define foo bar + + does not cause '#define foo bar' to get executed when + compiled with -save-temps, we recognize directives in + -fpreprocessed mode only if the # is in column 1. macro.cc + puts a space in front of any '#' at the start of a macro. 
+ + We exclude the -fdirectives-only case because macro expansion + has not been performed yet, and block comments can cause spaces + to precede the directive. */ + if (CPP_OPTION (pfile, preprocessed) + && !CPP_OPTION (pfile, directives_only) + && (indented || !(dir->flags & IN_I))) + { + skip = 0; + dir = 0; + } + else + { + /* In failed conditional groups, all non-conditional + directives are ignored. Before doing that, whether + skipping or not, we should lex angle-bracketed headers + correctly, and maybe output some diagnostics. */ + pfile->state.angled_headers = dir->flags & INCL; + pfile->state.directive_wants_padding = dir->flags & INCL; + if (! CPP_OPTION (pfile, preprocessed)) + directive_diagnostics (pfile, dir, indented); + if (pfile->state.skipping && !(dir->flags & COND)) + dir = 0; + } + } + else if (dname->type == CPP_EOF) + ; /* CPP_EOF is the "null directive". */ + else + { + /* An unknown directive. Don't complain about it in assembly + source: we don't know where the comments are, and # may + introduce assembler pseudo-ops. Don't complain about invalid + directives in skipped conditional groups (6.10 p4). */ + if (CPP_OPTION (pfile, lang) == CLK_ASM) + skip = 0; + else if (!pfile->state.skipping) + { + const char *unrecognized + = (const char *)cpp_token_as_text (pfile, dname); + const char *hint = NULL; + + /* Call back into gcc to get a spelling suggestion. Ideally + we'd just use best_match from gcc/spellcheck.h (and filter + out the uncommon directives), but that requires moving it + to a support library. */ + if (pfile->cb.get_suggestion) + hint = pfile->cb.get_suggestion (pfile, unrecognized, + directive_names); + + if (hint) + { + rich_location richloc (pfile->line_table, dname->src_loc); + source_range misspelled_token_range + = get_range_from_loc (pfile->line_table, dname->src_loc); + richloc.add_fixit_replace (misspelled_token_range, hint); + cpp_error_at (pfile, CPP_DL_ERROR, &richloc, + "invalid preprocessing directive #%s;" + " did you mean #%s?", + unrecognized, hint); + } + else + cpp_error (pfile, CPP_DL_ERROR, + "invalid preprocessing directive #%s", + unrecognized); + } + } + + pfile->directive = dir; + if (CPP_OPTION (pfile, traditional)) + prepare_directive_trad (pfile); + + if (dir) + pfile->directive->handler (pfile); + else if (skip == 0) + _cpp_backup_tokens (pfile, 1); + + end_directive (pfile, skip); + if (was_parsing_args && !pfile->state.in_deferred_pragma) + { + /* Restore state when within macro args. */ + pfile->state.parsing_args = 2; + pfile->state.prevent_expansion = 1; + } + if (was_discarding_output) + pfile->state.prevent_expansion = 1; + return skip; +} + +/* Directive handler wrapper used by the command line option + processor. BUF is \n terminated. */ +static void +run_directive (cpp_reader *pfile, int dir_no, const char *buf, size_t count) +{ + cpp_push_buffer (pfile, (const uchar *) buf, count, + /* from_stage3 */ true); + start_directive (pfile); + + /* This is a short-term fix to prevent a leading '#' being + interpreted as a directive. */ + _cpp_clean_line (pfile); + + pfile->directive = &dtable[dir_no]; + if (CPP_OPTION (pfile, traditional)) + prepare_directive_trad (pfile); + pfile->directive->handler (pfile); + end_directive (pfile, 1); + _cpp_pop_buffer (pfile); +} + +/* Checks for validity the macro name in #define, #undef, #ifdef and + #ifndef directives. IS_DEF_OR_UNDEF is true if this call is + processing a #define or #undefine directive, and false + otherwise. 
*/ +static cpp_hashnode * +lex_macro_node (cpp_reader *pfile, bool is_def_or_undef) +{ + const cpp_token *token = _cpp_lex_token (pfile); + + /* The token immediately after #define must be an identifier. That + identifier may not be "defined", per C99 6.10.8p4. + In C++, it may not be any of the "named operators" either, + per C++98 [lex.digraph], [lex.key]. + Finally, the identifier may not have been poisoned. (In that case + the lexer has issued the error message for us.) */ + + if (token->type == CPP_NAME) + { + cpp_hashnode *node = token->val.node.node; + + if (is_def_or_undef + && node == pfile->spec_nodes.n_defined) + cpp_error (pfile, CPP_DL_ERROR, + "\"%s\" cannot be used as a macro name", + NODE_NAME (node)); + else if (! (node->flags & NODE_POISONED)) + return node; + } + else if (token->flags & NAMED_OP) + cpp_error (pfile, CPP_DL_ERROR, + "\"%s\" cannot be used as a macro name as it is an operator in C++", + NODE_NAME (token->val.node.node)); + else if (token->type == CPP_EOF) + cpp_error (pfile, CPP_DL_ERROR, "no macro name given in #%s directive", + pfile->directive->name); + else + cpp_error (pfile, CPP_DL_ERROR, "macro names must be identifiers"); + + return NULL; +} + +/* Process a #define directive. Most work is done in macro.cc. */ +static void +do_define (cpp_reader *pfile) +{ + cpp_hashnode *node = lex_macro_node (pfile, true); + + if (node) + { + /* If we have been requested to expand comments into macros, + then re-enable saving of comments. */ + pfile->state.save_comments = + ! CPP_OPTION (pfile, discard_comments_in_macro_exp); + + if (pfile->cb.before_define) + pfile->cb.before_define (pfile); + + if (_cpp_create_definition (pfile, node)) + if (pfile->cb.define) + pfile->cb.define (pfile, pfile->directive_line, node); + + node->flags &= ~NODE_USED; + } +} + +/* Handle #undef. Mark the identifier NT_VOID in the hash table. */ +static void +do_undef (cpp_reader *pfile) +{ + cpp_hashnode *node = lex_macro_node (pfile, true); + + if (node) + { + if (pfile->cb.before_define) + pfile->cb.before_define (pfile); + + if (pfile->cb.undef) + pfile->cb.undef (pfile, pfile->directive_line, node); + + /* 6.10.3.5 paragraph 2: [#undef] is ignored if the specified + identifier is not currently defined as a macro name. */ + if (cpp_macro_p (node)) + { + if (node->flags & NODE_WARN) + cpp_error (pfile, CPP_DL_WARNING, + "undefining \"%s\"", NODE_NAME (node)); + else if (cpp_builtin_macro_p (node) + && CPP_OPTION (pfile, warn_builtin_macro_redefined)) + cpp_warning_with_line (pfile, CPP_W_BUILTIN_MACRO_REDEFINED, + pfile->directive_line, 0, + "undefining \"%s\"", NODE_NAME (node)); + + if (node->value.macro + && CPP_OPTION (pfile, warn_unused_macros)) + _cpp_warn_if_unused_macro (pfile, node, NULL); + + _cpp_free_definition (node); + } + } + + check_eol (pfile, false); +} + +/* Undefine a single macro/assertion/whatever. */ + +static int +undefine_macros (cpp_reader *pfile ATTRIBUTE_UNUSED, cpp_hashnode *h, + void *data_p ATTRIBUTE_UNUSED) +{ + /* Body of _cpp_free_definition inlined here for speed. + Macros and assertions no longer have anything to free. */ + h->type = NT_VOID; + h->value.answers = NULL; + h->flags &= ~(NODE_POISONED|NODE_DISABLED|NODE_USED); + return 1; +} + +/* Undefine all macros and assertions. */ + +void +cpp_undef_all (cpp_reader *pfile) +{ + cpp_forall_identifiers (pfile, undefine_macros, NULL); +} + + +/* Helper routine used by parse_include. Reinterpret the current line + as an h-char-sequence (< ... >); we are looking at the first token + after the <. 
Returns a malloced filename. */ +static char * +glue_header_name (cpp_reader *pfile) +{ + const cpp_token *token; + char *buffer; + size_t len, total_len = 0, capacity = 1024; + + /* To avoid lexed tokens overwriting our glued name, we can only + allocate from the string pool once we've lexed everything. */ + buffer = XNEWVEC (char, capacity); + for (;;) + { + token = get_token_no_padding (pfile); + + if (token->type == CPP_GREATER) + break; + if (token->type == CPP_EOF) + { + cpp_error (pfile, CPP_DL_ERROR, "missing terminating > character"); + break; + } + + len = cpp_token_len (token) + 2; /* Leading space, terminating \0. */ + if (total_len + len > capacity) + { + capacity = (capacity + len) * 2; + buffer = XRESIZEVEC (char, buffer, capacity); + } + + if (token->flags & PREV_WHITE) + buffer[total_len++] = ' '; + + total_len = (cpp_spell_token (pfile, token, (uchar *) &buffer[total_len], + true) + - (uchar *) buffer); + } + + buffer[total_len] = '\0'; + return buffer; +} + +/* Returns the file name of #include, #include_next, #import and + #pragma dependency. The string is malloced and the caller should + free it. Returns NULL on error. LOCATION is the source location + of the file name. */ + +static const char * +parse_include (cpp_reader *pfile, int *pangle_brackets, + const cpp_token ***buf, location_t *location) +{ + char *fname; + const cpp_token *header; + + /* Allow macro expansion. */ + header = get_token_no_padding (pfile); + *location = header->src_loc; + if ((header->type == CPP_STRING && header->val.str.text[0] != 'R') + || header->type == CPP_HEADER_NAME) + { + fname = XNEWVEC (char, header->val.str.len - 1); + memcpy (fname, header->val.str.text + 1, header->val.str.len - 2); + fname[header->val.str.len - 2] = '\0'; + *pangle_brackets = header->type == CPP_HEADER_NAME; + } + else if (header->type == CPP_LESS) + { + fname = glue_header_name (pfile); + *pangle_brackets = 1; + } + else + { + const unsigned char *dir; + + if (pfile->directive == &dtable[T_PRAGMA]) + dir = UC"pragma dependency"; + else + dir = pfile->directive->name; + cpp_error (pfile, CPP_DL_ERROR, "#%s expects \"FILENAME\" or ", + dir); + + return NULL; + } + + if (pfile->directive == &dtable[T_PRAGMA]) + { + /* This pragma allows extra tokens after the file name. */ + } + else if (buf == NULL || CPP_OPTION (pfile, discard_comments)) + check_eol (pfile, true); + else + { + /* If we are not discarding comments, then gather them while + doing the eol check. */ + *buf = check_eol_return_comments (pfile); + } + + return fname; +} + +/* Handle #include, #include_next and #import. */ +static void +do_include_common (cpp_reader *pfile, enum include_type type) +{ + const char *fname; + int angle_brackets; + const cpp_token **buf = NULL; + location_t location; + + /* Re-enable saving of comments if requested, so that the include + callback can dump comments which follow #include. */ + pfile->state.save_comments = ! CPP_OPTION (pfile, discard_comments); + + /* Tell the lexer this is an include directive -- we want it to + increment the line number even if this is the last line of a file. */ + pfile->state.in_directive = 2; + + fname = parse_include (pfile, &angle_brackets, &buf, &location); + if (!fname) + goto done; + + if (!*fname) + { + cpp_error_with_line (pfile, CPP_DL_ERROR, location, 0, + "empty filename in #%s", + pfile->directive->name); + goto done; + } + + /* Prevent #include recursion. 
*/ + if (pfile->line_table->depth >= CPP_OPTION (pfile, max_include_depth)) + cpp_error (pfile, + CPP_DL_ERROR, + "#include nested depth %u exceeds maximum of %u" + " (use -fmax-include-depth=DEPTH to increase the maximum)", + pfile->line_table->depth, + CPP_OPTION (pfile, max_include_depth)); + else + { + /* Get out of macro context, if we are. */ + skip_rest_of_line (pfile); + + if (pfile->cb.include) + pfile->cb.include (pfile, pfile->directive_line, + pfile->directive->name, fname, angle_brackets, + buf); + + _cpp_stack_include (pfile, fname, angle_brackets, type, location); + } + + done: + XDELETEVEC (fname); + if (buf) + XDELETEVEC (buf); +} + +static void +do_include (cpp_reader *pfile) +{ + do_include_common (pfile, IT_INCLUDE); +} + +static void +do_import (cpp_reader *pfile) +{ + do_include_common (pfile, IT_IMPORT); +} + +static void +do_include_next (cpp_reader *pfile) +{ + enum include_type type = IT_INCLUDE_NEXT; + + /* If this is the primary source file, warn and use the normal + search logic. */ + if (_cpp_in_main_source_file (pfile)) + { + cpp_error (pfile, CPP_DL_WARNING, + "#include_next in primary source file"); + type = IT_INCLUDE; + } + do_include_common (pfile, type); +} + +/* Subroutine of do_linemarker. Read possible flags after file name. + LAST is the last flag seen; 0 if this is the first flag. Return the + flag if it is valid, 0 at the end of the directive. Otherwise + complain. */ +static unsigned int +read_flag (cpp_reader *pfile, unsigned int last) +{ + const cpp_token *token = _cpp_lex_token (pfile); + + if (token->type == CPP_NUMBER && token->val.str.len == 1) + { + unsigned int flag = token->val.str.text[0] - '0'; + + if (flag > last && flag <= 4 + && (flag != 4 || last == 3) + && (flag != 2 || last == 0)) + return flag; + } + + if (token->type != CPP_EOF) + cpp_error (pfile, CPP_DL_ERROR, "invalid flag \"%s\" in line directive", + cpp_token_as_text (pfile, token)); + return 0; +} + +/* Subroutine of do_line and do_linemarker. Convert a number in STR, + of length LEN, to binary; store it in NUMP, and return false if the + number was well-formed, true if not. WRAPPED is set to true if the + number did not fit into 'linenum_type'. */ +static bool +strtolinenum (const uchar *str, size_t len, linenum_type *nump, bool *wrapped) +{ + linenum_type reg = 0; + + uchar c; + bool seen_digit_sep = false; + *wrapped = false; + while (len--) + { + c = *str++; + if (!seen_digit_sep && c == '\'' && len) + { + seen_digit_sep = true; + continue; + } + if (!ISDIGIT (c)) + return true; + seen_digit_sep = false; + if (reg > ((linenum_type) -1) / 10) + *wrapped = true; + reg *= 10; + if (reg > ((linenum_type) -1) - (c - '0')) + *wrapped = true; + reg += c - '0'; + } + *nump = reg; + return false; +} + +/* Interpret #line command. + Note that the filename string (if any) is a true string constant + (escapes are interpreted). */ +static void +do_line (cpp_reader *pfile) +{ + class line_maps *line_table = pfile->line_table; + const line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (line_table); + + /* skip_rest_of_line() may cause line table to be realloc()ed so note down + sysp right now. */ + + unsigned char map_sysp = ORDINARY_MAP_IN_SYSTEM_HEADER_P (map); + const cpp_token *token; + const char *new_file = ORDINARY_MAP_FILE_NAME (map); + linenum_type new_lineno; + + /* C99 raised the minimum limit on #line numbers. */ + linenum_type cap = CPP_OPTION (pfile, c99) ? 2147483647 : 32767; + bool wrapped; + + /* #line commands expand macros. 
*/ + token = cpp_get_token (pfile); + if (token->type != CPP_NUMBER + || strtolinenum (token->val.str.text, token->val.str.len, + &new_lineno, &wrapped)) + { + if (token->type == CPP_EOF) + cpp_error (pfile, CPP_DL_ERROR, "unexpected end of file after #line"); + else + cpp_error (pfile, CPP_DL_ERROR, + "\"%s\" after #line is not a positive integer", + cpp_token_as_text (pfile, token)); + return; + } + + if (CPP_PEDANTIC (pfile) && (new_lineno == 0 || new_lineno > cap || wrapped)) + cpp_error (pfile, CPP_DL_PEDWARN, "line number out of range"); + else if (wrapped) + cpp_error (pfile, CPP_DL_WARNING, "line number out of range"); + + token = cpp_get_token (pfile); + if (token->type == CPP_STRING) + { + cpp_string s = { 0, 0 }; + if (cpp_interpret_string_notranslate (pfile, &token->val.str, 1, + &s, CPP_STRING)) + new_file = (const char *)s.text; + check_eol (pfile, true); + } + else if (token->type != CPP_EOF) + { + cpp_error (pfile, CPP_DL_ERROR, "\"%s\" is not a valid filename", + cpp_token_as_text (pfile, token)); + return; + } + + skip_rest_of_line (pfile); + _cpp_do_file_change (pfile, LC_RENAME_VERBATIM, new_file, new_lineno, + map_sysp); + line_table->seen_line_directive = true; +} + +/* Interpret the # 44 "file" [flags] notation, which has slightly + different syntax and semantics from #line: Flags are allowed, + and we never complain about the line number being too big. */ +static void +do_linemarker (cpp_reader *pfile) +{ + class line_maps *line_table = pfile->line_table; + const line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (line_table); + const cpp_token *token; + const char *new_file = ORDINARY_MAP_FILE_NAME (map); + linenum_type new_lineno; + unsigned int new_sysp = ORDINARY_MAP_IN_SYSTEM_HEADER_P (map); + enum lc_reason reason = LC_RENAME_VERBATIM; + int flag; + bool wrapped; + + /* Back up so we can get the number again. Putting this in + _cpp_handle_directive risks two calls to _cpp_backup_tokens in + some circumstances, which can segfault. */ + _cpp_backup_tokens (pfile, 1); + + /* #line commands expand macros. */ + token = cpp_get_token (pfile); + if (token->type != CPP_NUMBER + || strtolinenum (token->val.str.text, token->val.str.len, + &new_lineno, &wrapped)) + { + /* Unlike #line, there does not seem to be a way to get an EOF + here. So, it should be safe to always spell the token. */ + cpp_error (pfile, CPP_DL_ERROR, + "\"%s\" after # is not a positive integer", + cpp_token_as_text (pfile, token)); + return; + } + + token = cpp_get_token (pfile); + if (token->type == CPP_STRING) + { + cpp_string s = { 0, 0 }; + if (cpp_interpret_string_notranslate (pfile, &token->val.str, + 1, &s, CPP_STRING)) + new_file = (const char *)s.text; + + new_sysp = 0; + flag = read_flag (pfile, 0); + if (flag == 1) + { + reason = LC_ENTER; + /* Fake an include for cpp_included (). */ + _cpp_fake_include (pfile, new_file); + flag = read_flag (pfile, flag); + } + else if (flag == 2) + { + reason = LC_LEAVE; + flag = read_flag (pfile, flag); + } + if (flag == 3) + { + new_sysp = 1; + flag = read_flag (pfile, flag); + if (flag == 4) + new_sysp = 2; + } + pfile->buffer->sysp = new_sysp; + + check_eol (pfile, false); + } + else if (token->type != CPP_EOF) + { + cpp_error (pfile, CPP_DL_ERROR, "\"%s\" is not a valid filename", + cpp_token_as_text (pfile, token)); + return; + } + + skip_rest_of_line (pfile); + + if (reason == LC_LEAVE) + { + /* Reread map since cpp_get_token can invalidate it with a + reallocation. 
*/ + map = LINEMAPS_LAST_ORDINARY_MAP (line_table); + const line_map_ordinary *from + = linemap_included_from_linemap (line_table, map); + + if (!from) + /* Not nested. */; + else if (!new_file[0]) + /* Leaving to "" means fill in the popped-to name. */ + new_file = ORDINARY_MAP_FILE_NAME (from); + else if (filename_cmp (ORDINARY_MAP_FILE_NAME (from), new_file) != 0) + /* It's the wrong name, Grommit! */ + from = NULL; + + if (!from) + { + cpp_warning (pfile, CPP_W_NONE, + "file \"%s\" linemarker ignored due to " + "incorrect nesting", new_file); + return; + } + } + + /* Compensate for the increment in linemap_add that occurs in + _cpp_do_file_change. We're currently at the start of the line + *following* the #line directive. A separate location_t for this + location makes no sense (until we do the LC_LEAVE), and + complicates LAST_SOURCE_LINE_LOCATION. */ + pfile->line_table->highest_location--; + + _cpp_do_file_change (pfile, reason, new_file, new_lineno, new_sysp); + line_table->seen_line_directive = true; +} + +/* Arrange the file_change callback. Changing to TO_FILE:TO_LINE for + REASON. SYSP is 1 for a system header, 2 for a system header that + needs to be extern "C" protected, and zero otherwise. */ +void +_cpp_do_file_change (cpp_reader *pfile, enum lc_reason reason, + const char *to_file, linenum_type to_line, + unsigned int sysp) +{ + linemap_assert (reason != LC_ENTER_MACRO); + + const line_map_ordinary *ord_map = NULL; + if (!to_line && reason == LC_RENAME_VERBATIM) + { + /* A linemarker moving to line zero. If we're on the second + line of the current map, and it also starts at zero, just + rewind -- we're probably reading the builtins of a + preprocessed source. */ + line_map_ordinary *last = LINEMAPS_LAST_ORDINARY_MAP (pfile->line_table); + if (!ORDINARY_MAP_STARTING_LINE_NUMBER (last) + && 0 == filename_cmp (to_file, ORDINARY_MAP_FILE_NAME (last)) + && SOURCE_LINE (last, pfile->line_table->highest_line) == 2) + { + ord_map = last; + pfile->line_table->highest_location + = pfile->line_table->highest_line = MAP_START_LOCATION (last); + } + } + + if (!ord_map) + if (const line_map *map = linemap_add (pfile->line_table, reason, sysp, + to_file, to_line)) + { + ord_map = linemap_check_ordinary (map); + linemap_line_start (pfile->line_table, + ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map), + 127); + } + + if (pfile->cb.file_change) + pfile->cb.file_change (pfile, ord_map); +} + +/* Report a warning or error detected by the program we are + processing. Use the directive's tokens in the error message. */ +static void +do_diagnostic (cpp_reader *pfile, enum cpp_diagnostic_level code, + enum cpp_warning_reason reason, int print_dir) +{ + const unsigned char *dir_name; + unsigned char *line; + location_t src_loc = pfile->cur_token[-1].src_loc; + + if (print_dir) + dir_name = pfile->directive->name; + else + dir_name = NULL; + pfile->state.prevent_expansion++; + line = cpp_output_line_to_string (pfile, dir_name); + pfile->state.prevent_expansion--; + + if (code == CPP_DL_WARNING_SYSHDR && reason) + cpp_warning_with_line_syshdr (pfile, reason, src_loc, 0, "%s", line); + else if (code == CPP_DL_WARNING && reason) + cpp_warning_with_line (pfile, reason, src_loc, 0, "%s", line); + else + cpp_error_with_line (pfile, code, src_loc, 0, "%s", line); + free (line); +} + +static void +do_error (cpp_reader *pfile) +{ + do_diagnostic (pfile, CPP_DL_ERROR, CPP_W_NONE, 1); +} + +static void +do_warning (cpp_reader *pfile) +{ + /* We want #warning diagnostics to be emitted in system headers too. 
*/ + do_diagnostic (pfile, CPP_DL_WARNING_SYSHDR, CPP_W_WARNING_DIRECTIVE, 1); +} + +/* Report program identification. */ +static void +do_ident (cpp_reader *pfile) +{ + const cpp_token *str = cpp_get_token (pfile); + + if (str->type != CPP_STRING) + cpp_error (pfile, CPP_DL_ERROR, "invalid #%s directive", + pfile->directive->name); + else if (pfile->cb.ident) + pfile->cb.ident (pfile, pfile->directive_line, &str->val.str); + + check_eol (pfile, false); +} + +/* Lookup a PRAGMA name in a singly-linked CHAIN. Returns the + matching entry, or NULL if none is found. The returned entry could + be the start of a namespace chain, or a pragma. */ +static struct pragma_entry * +lookup_pragma_entry (struct pragma_entry *chain, const cpp_hashnode *pragma) +{ + while (chain && chain->pragma != pragma) + chain = chain->next; + + return chain; +} + +/* Create and insert a blank pragma entry at the beginning of a + singly-linked CHAIN. */ +static struct pragma_entry * +new_pragma_entry (cpp_reader *pfile, struct pragma_entry **chain) +{ + struct pragma_entry *new_entry; + + new_entry = (struct pragma_entry *) + _cpp_aligned_alloc (pfile, sizeof (struct pragma_entry)); + + memset (new_entry, 0, sizeof (struct pragma_entry)); + new_entry->next = *chain; + + *chain = new_entry; + return new_entry; +} + +/* Register a pragma NAME in namespace SPACE. If SPACE is null, it + goes in the global namespace. */ +static struct pragma_entry * +register_pragma_1 (cpp_reader *pfile, const char *space, const char *name, + bool allow_name_expansion) +{ + struct pragma_entry **chain = &pfile->pragmas; + struct pragma_entry *entry; + const cpp_hashnode *node; + + if (space) + { + node = cpp_lookup (pfile, UC space, strlen (space)); + entry = lookup_pragma_entry (*chain, node); + if (!entry) + { + entry = new_pragma_entry (pfile, chain); + entry->pragma = node; + entry->is_nspace = true; + entry->allow_expansion = allow_name_expansion; + } + else if (!entry->is_nspace) + goto clash; + else if (entry->allow_expansion != allow_name_expansion) + { + cpp_error (pfile, CPP_DL_ICE, + "registering pragmas in namespace \"%s\" with mismatched " + "name expansion", space); + return NULL; + } + chain = &entry->u.space; + } + else if (allow_name_expansion) + { + cpp_error (pfile, CPP_DL_ICE, + "registering pragma \"%s\" with name expansion " + "and no namespace", name); + return NULL; + } + + /* Check for duplicates. */ + node = cpp_lookup (pfile, UC name, strlen (name)); + entry = lookup_pragma_entry (*chain, node); + if (entry == NULL) + { + entry = new_pragma_entry (pfile, chain); + entry->pragma = node; + return entry; + } + + if (entry->is_nspace) + clash: + cpp_error (pfile, CPP_DL_ICE, + "registering \"%s\" as both a pragma and a pragma namespace", + NODE_NAME (node)); + else if (space) + cpp_error (pfile, CPP_DL_ICE, "#pragma %s %s is already registered", + space, name); + else + cpp_error (pfile, CPP_DL_ICE, "#pragma %s is already registered", name); + + return NULL; +} + +/* Register a cpplib internal pragma SPACE NAME with HANDLER. */ +static void +register_pragma_internal (cpp_reader *pfile, const char *space, + const char *name, pragma_cb handler) +{ + struct pragma_entry *entry; + + entry = register_pragma_1 (pfile, space, name, false); + entry->is_internal = true; + entry->u.handler = handler; +} + +/* Register a pragma NAME in namespace SPACE. If SPACE is null, it + goes in the global namespace. HANDLER is the handler it will call, + which must be non-NULL. 
If ALLOW_EXPANSION is set, allow macro + expansion while parsing pragma NAME. This function is exported + from libcpp. */ +void +cpp_register_pragma (cpp_reader *pfile, const char *space, const char *name, + pragma_cb handler, bool allow_expansion) +{ + struct pragma_entry *entry; + + if (!handler) + { + cpp_error (pfile, CPP_DL_ICE, "registering pragma with NULL handler"); + return; + } + + entry = register_pragma_1 (pfile, space, name, false); + if (entry) + { + entry->allow_expansion = allow_expansion; + entry->u.handler = handler; + } +} + +/* Similarly, but create mark the pragma for deferred processing. + When found, a CPP_PRAGMA token will be insertted into the stream + with IDENT in the token->u.pragma slot. */ +void +cpp_register_deferred_pragma (cpp_reader *pfile, const char *space, + const char *name, unsigned int ident, + bool allow_expansion, bool allow_name_expansion) +{ + struct pragma_entry *entry; + + entry = register_pragma_1 (pfile, space, name, allow_name_expansion); + if (entry) + { + entry->is_deferred = true; + entry->allow_expansion = allow_expansion; + entry->u.ident = ident; + } +} + +/* Register the pragmas the preprocessor itself handles. */ +void +_cpp_init_internal_pragmas (cpp_reader *pfile) +{ + /* Pragmas in the global namespace. */ + register_pragma_internal (pfile, 0, "once", do_pragma_once); + register_pragma_internal (pfile, 0, "push_macro", do_pragma_push_macro); + register_pragma_internal (pfile, 0, "pop_macro", do_pragma_pop_macro); + + /* New GCC-specific pragmas should be put in the GCC namespace. */ + register_pragma_internal (pfile, "GCC", "poison", do_pragma_poison); + register_pragma_internal (pfile, "GCC", "system_header", + do_pragma_system_header); + register_pragma_internal (pfile, "GCC", "dependency", do_pragma_dependency); + register_pragma_internal (pfile, "GCC", "warning", do_pragma_warning); + register_pragma_internal (pfile, "GCC", "error", do_pragma_error); +} + +/* Return the number of registered pragmas in PE. */ + +static int +count_registered_pragmas (struct pragma_entry *pe) +{ + int ct = 0; + for (; pe != NULL; pe = pe->next) + { + if (pe->is_nspace) + ct += count_registered_pragmas (pe->u.space); + ct++; + } + return ct; +} + +/* Save into SD the names of the registered pragmas referenced by PE, + and return a pointer to the next free space in SD. */ + +static char ** +save_registered_pragmas (struct pragma_entry *pe, char **sd) +{ + for (; pe != NULL; pe = pe->next) + { + if (pe->is_nspace) + sd = save_registered_pragmas (pe->u.space, sd); + *sd++ = (char *) xmemdup (HT_STR (&pe->pragma->ident), + HT_LEN (&pe->pragma->ident), + HT_LEN (&pe->pragma->ident) + 1); + } + return sd; +} + +/* Return a newly-allocated array which saves the names of the + registered pragmas. */ + +char ** +_cpp_save_pragma_names (cpp_reader *pfile) +{ + int ct = count_registered_pragmas (pfile->pragmas); + char **result = XNEWVEC (char *, ct); + (void) save_registered_pragmas (pfile->pragmas, result); + return result; +} + +/* Restore from SD the names of the registered pragmas referenced by PE, + and return a pointer to the next unused name in SD. */ + +static char ** +restore_registered_pragmas (cpp_reader *pfile, struct pragma_entry *pe, + char **sd) +{ + for (; pe != NULL; pe = pe->next) + { + if (pe->is_nspace) + sd = restore_registered_pragmas (pfile, pe->u.space, sd); + pe->pragma = cpp_lookup (pfile, UC *sd, strlen (*sd)); + free (*sd); + sd++; + } + return sd; +} + +/* Restore the names of the registered pragmas from SAVED. 
*/ + +void +_cpp_restore_pragma_names (cpp_reader *pfile, char **saved) +{ + (void) restore_registered_pragmas (pfile, pfile->pragmas, saved); + free (saved); +} + +/* Pragmata handling. We handle some, and pass the rest on to the + front end. C99 defines three pragmas and says that no macro + expansion is to be performed on them; whether or not macro + expansion happens for other pragmas is implementation defined. + This implementation allows for a mix of both, since GCC did not + traditionally macro expand its (few) pragmas, whereas OpenMP + specifies that macro expansion should happen. */ +static void +do_pragma (cpp_reader *pfile) +{ + const struct pragma_entry *p = NULL; + const cpp_token *token, *pragma_token; + location_t pragma_token_virt_loc = 0; + cpp_token ns_token; + unsigned int count = 1; + + pfile->state.prevent_expansion++; + + pragma_token = token = cpp_get_token_with_location (pfile, + &pragma_token_virt_loc); + ns_token = *token; + if (token->type == CPP_NAME) + { + p = lookup_pragma_entry (pfile->pragmas, token->val.node.node); + if (p && p->is_nspace) + { + bool allow_name_expansion = p->allow_expansion; + if (allow_name_expansion) + pfile->state.prevent_expansion--; + + token = cpp_get_token (pfile); + if (token->type == CPP_NAME) + p = lookup_pragma_entry (p->u.space, token->val.node.node); + else + p = NULL; + if (allow_name_expansion) + pfile->state.prevent_expansion++; + count = 2; + } + } + + if (p) + { + if (p->is_deferred) + { + pfile->directive_result.src_loc = pragma_token_virt_loc; + pfile->directive_result.type = CPP_PRAGMA; + pfile->directive_result.flags = pragma_token->flags; + pfile->directive_result.val.pragma = p->u.ident; + pfile->state.in_deferred_pragma = true; + pfile->state.pragma_allow_expansion = p->allow_expansion; + if (!p->allow_expansion) + pfile->state.prevent_expansion++; + } + else + { + /* Since the handler below doesn't get the line number, that + it might need for diagnostics, make sure it has the right + numbers in place. */ + if (pfile->cb.line_change) + (*pfile->cb.line_change) (pfile, pragma_token, false); + if (p->allow_expansion) + pfile->state.prevent_expansion--; + (*p->u.handler) (pfile); + if (p->allow_expansion) + pfile->state.prevent_expansion++; + } + } + else if (pfile->cb.def_pragma) + { + if (count == 1 || pfile->context->prev == NULL) + _cpp_backup_tokens (pfile, count); + else + { + /* Invalid name comes from macro expansion, _cpp_backup_tokens + won't allow backing 2 tokens. */ + /* ??? The token buffer is leaked. Perhaps if def_pragma hook + reads both tokens, we could perhaps free it, but if it doesn't, + we don't know the exact lifespan. */ + cpp_token *toks = XNEWVEC (cpp_token, 2); + toks[0] = ns_token; + toks[0].flags |= NO_EXPAND; + toks[1] = *token; + toks[1].flags |= NO_EXPAND; + _cpp_push_token_context (pfile, NULL, toks, 2); + } + pfile->cb.def_pragma (pfile, pfile->directive_line); + } + + pfile->state.prevent_expansion--; +} + +/* Handle #pragma once. */ +static void +do_pragma_once (cpp_reader *pfile) +{ + if (_cpp_in_main_source_file (pfile)) + cpp_error (pfile, CPP_DL_WARNING, "#pragma once in main file"); + + check_eol (pfile, false); + _cpp_mark_file_once_only (pfile, pfile->buffer->file); +} + +/* Handle #pragma push_macro(STRING). 
*/ +static void +do_pragma_push_macro (cpp_reader *pfile) +{ + cpp_hashnode *node; + size_t defnlen; + const uchar *defn = NULL; + char *macroname, *dest; + const char *limit, *src; + const cpp_token *txt; + struct def_pragma_macro *c; + + txt = get__Pragma_string (pfile); + if (!txt) + { + location_t src_loc = pfile->cur_token[-1].src_loc; + cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, + "invalid #pragma push_macro directive"); + check_eol (pfile, false); + skip_rest_of_line (pfile); + return; + } + dest = macroname = (char *) alloca (txt->val.str.len + 2); + src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); + limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); + while (src < limit) + { + /* We know there is a character following the backslash. */ + if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) + src++; + *dest++ = *src++; + } + *dest = 0; + check_eol (pfile, false); + skip_rest_of_line (pfile); + c = XNEW (struct def_pragma_macro); + memset (c, 0, sizeof (struct def_pragma_macro)); + c->name = XNEWVAR (char, strlen (macroname) + 1); + strcpy (c->name, macroname); + c->next = pfile->pushed_macros; + node = _cpp_lex_identifier (pfile, c->name); + if (node->type == NT_VOID) + c->is_undef = 1; + else if (node->type == NT_BUILTIN_MACRO) + c->is_builtin = 1; + else + { + defn = cpp_macro_definition (pfile, node); + defnlen = ustrlen (defn); + c->definition = XNEWVEC (uchar, defnlen + 2); + c->definition[defnlen] = '\n'; + c->definition[defnlen + 1] = 0; + c->line = node->value.macro->line; + c->syshdr = node->value.macro->syshdr; + c->used = node->value.macro->used; + memcpy (c->definition, defn, defnlen); + } + + pfile->pushed_macros = c; +} + +/* Handle #pragma pop_macro(STRING). */ +static void +do_pragma_pop_macro (cpp_reader *pfile) +{ + char *macroname, *dest; + const char *limit, *src; + const cpp_token *txt; + struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros; + txt = get__Pragma_string (pfile); + if (!txt) + { + location_t src_loc = pfile->cur_token[-1].src_loc; + cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, + "invalid #pragma pop_macro directive"); + check_eol (pfile, false); + skip_rest_of_line (pfile); + return; + } + dest = macroname = (char *) alloca (txt->val.str.len + 2); + src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); + limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); + while (src < limit) + { + /* We know there is a character following the backslash. */ + if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) + src++; + *dest++ = *src++; + } + *dest = 0; + check_eol (pfile, false); + skip_rest_of_line (pfile); + + while (c != NULL) + { + if (!strcmp (c->name, macroname)) + { + if (!l) + pfile->pushed_macros = c->next; + else + l->next = c->next; + cpp_pop_definition (pfile, c); + free (c->definition); + free (c->name); + free (c); + break; + } + l = c; + c = c->next; + } +} + +/* Handle #pragma GCC poison, to poison one or more identifiers so + that the lexer produces a hard error for each subsequent usage. 
*/ +static void +do_pragma_poison (cpp_reader *pfile) +{ + const cpp_token *tok; + cpp_hashnode *hp; + + pfile->state.poisoned_ok = 1; + for (;;) + { + tok = _cpp_lex_token (pfile); + if (tok->type == CPP_EOF) + break; + if (tok->type != CPP_NAME) + { + cpp_error (pfile, CPP_DL_ERROR, + "invalid #pragma GCC poison directive"); + break; + } + + hp = tok->val.node.node; + if (hp->flags & NODE_POISONED) + continue; + + if (cpp_macro_p (hp)) + cpp_error (pfile, CPP_DL_WARNING, "poisoning existing macro \"%s\"", + NODE_NAME (hp)); + _cpp_free_definition (hp); + hp->flags |= NODE_POISONED | NODE_DIAGNOSTIC; + } + pfile->state.poisoned_ok = 0; +} + +/* Mark the current header as a system header. This will suppress + some categories of warnings (notably those from -pedantic). It is + intended for use in system libraries that cannot be implemented in + conforming C, but cannot be certain that their headers appear in a + system include directory. To prevent abuse, it is rejected in the + primary source file. */ +static void +do_pragma_system_header (cpp_reader *pfile) +{ + if (_cpp_in_main_source_file (pfile)) + cpp_error (pfile, CPP_DL_WARNING, + "#pragma system_header ignored outside include file"); + else + { + check_eol (pfile, false); + skip_rest_of_line (pfile); + cpp_make_system_header (pfile, 1, 0); + } +} + +/* Check the modified date of the current include file against a specified + file. Issue a diagnostic, if the specified file is newer. We use this to + determine if a fixed header should be refixed. */ +static void +do_pragma_dependency (cpp_reader *pfile) +{ + const char *fname; + int angle_brackets, ordering; + location_t location; + + fname = parse_include (pfile, &angle_brackets, NULL, &location); + if (!fname) + return; + + ordering = _cpp_compare_file_date (pfile, fname, angle_brackets); + if (ordering < 0) + cpp_error (pfile, CPP_DL_WARNING, "cannot find source file %s", fname); + else if (ordering > 0) + { + cpp_error (pfile, CPP_DL_WARNING, + "current file is older than %s", fname); + if (cpp_get_token (pfile)->type != CPP_EOF) + { + _cpp_backup_tokens (pfile, 1); + do_diagnostic (pfile, CPP_DL_WARNING, CPP_W_NONE, 0); + } + } + + free ((void *) fname); +} + +/* Issue a diagnostic with the message taken from the pragma. If + ERROR is true, the diagnostic is a warning, otherwise, it is an + error. */ +static void +do_pragma_warning_or_error (cpp_reader *pfile, bool error) +{ + const cpp_token *tok = _cpp_lex_token (pfile); + cpp_string str; + if (tok->type != CPP_STRING + || !cpp_interpret_string_notranslate (pfile, &tok->val.str, 1, &str, + CPP_STRING) + || str.len == 0) + { + cpp_error (pfile, CPP_DL_ERROR, "invalid \"#pragma GCC %s\" directive", + error ? "error" : "warning"); + return; + } + cpp_error (pfile, error ? CPP_DL_ERROR : CPP_DL_WARNING, + "%s", str.text); + free ((void *)str.text); +} + +/* Issue a warning diagnostic. */ +static void +do_pragma_warning (cpp_reader *pfile) +{ + do_pragma_warning_or_error (pfile, false); +} + +/* Issue an error diagnostic. */ +static void +do_pragma_error (cpp_reader *pfile) +{ + do_pragma_warning_or_error (pfile, true); +} + +/* Get a token but skip padding. */ +static const cpp_token * +get_token_no_padding (cpp_reader *pfile) +{ + for (;;) + { + const cpp_token *result = cpp_get_token (pfile); + if (result->type != CPP_PADDING) + return result; + } +} + +/* Check syntax is "(string-literal)". Returns the string on success, + or NULL on failure. 
*/ +static const cpp_token * +get__Pragma_string (cpp_reader *pfile) +{ + const cpp_token *string; + const cpp_token *paren; + + paren = get_token_no_padding (pfile); + if (paren->type == CPP_EOF) + _cpp_backup_tokens (pfile, 1); + if (paren->type != CPP_OPEN_PAREN) + return NULL; + + string = get_token_no_padding (pfile); + if (string->type == CPP_EOF) + _cpp_backup_tokens (pfile, 1); + if (string->type != CPP_STRING && string->type != CPP_WSTRING + && string->type != CPP_STRING32 && string->type != CPP_STRING16 + && string->type != CPP_UTF8STRING) + return NULL; + + paren = get_token_no_padding (pfile); + if (paren->type == CPP_EOF) + _cpp_backup_tokens (pfile, 1); + if (paren->type != CPP_CLOSE_PAREN) + return NULL; + + return string; +} + +/* Destringize IN into a temporary buffer, by removing the first \ of + \" and \\ sequences, and process the result as a #pragma directive. */ +static void +destringize_and_run (cpp_reader *pfile, const cpp_string *in, + location_t expansion_loc) +{ + const unsigned char *src, *limit; + char *dest, *result; + cpp_context *saved_context; + cpp_token *saved_cur_token; + tokenrun *saved_cur_run; + cpp_token *toks; + int count; + const struct directive *save_directive; + + dest = result = (char *) alloca (in->len - 1); + src = in->text + 1 + (in->text[0] == 'L'); + limit = in->text + in->len - 1; + while (src < limit) + { + /* We know there is a character following the backslash. */ + if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) + src++; + *dest++ = *src++; + } + *dest = '\n'; + + /* Ugh; an awful kludge. We are really not set up to be lexing + tokens when in the middle of a macro expansion. Use a new + context to force cpp_get_token to lex, and so skip_rest_of_line + doesn't go beyond the end of the text. Also, remember the + current lexing position so we can return to it later. + + Something like line-at-a-time lexing should remove the need for + this. */ + saved_context = pfile->context; + saved_cur_token = pfile->cur_token; + saved_cur_run = pfile->cur_run; + + pfile->context = XCNEW (cpp_context); + + /* Inline run_directive, since we need to delay the _cpp_pop_buffer + until we've read all of the tokens that we want. */ + cpp_push_buffer (pfile, (const uchar *) result, dest - result, + /* from_stage3 */ true); + /* ??? Antique Disgusting Hack. What does this do? */ + if (pfile->buffer->prev) + pfile->buffer->file = pfile->buffer->prev->file; + + start_directive (pfile); + _cpp_clean_line (pfile); + save_directive = pfile->directive; + pfile->directive = &dtable[T_PRAGMA]; + do_pragma (pfile); + if (pfile->directive_result.type == CPP_PRAGMA) + pfile->directive_result.flags |= PRAGMA_OP; + end_directive (pfile, 1); + pfile->directive = save_directive; + + /* We always insert at least one token, the directive result. It'll + either be a CPP_PADDING or a CPP_PRAGMA. In the later case, we + need to insert *all* of the tokens, including the CPP_PRAGMA_EOL. */ + + /* If we're not handling the pragma internally, read all of the tokens from + the string buffer now, while the string buffer is still installed. */ + /* ??? Note that the token buffer allocated here is leaked. It's not clear + to me what the true lifespan of the tokens are. It would appear that + the lifespan is the entire parse of the main input stream, in which case + this may not be wrong. 
*/ + if (pfile->directive_result.type == CPP_PRAGMA) + { + int maxcount; + + count = 1; + maxcount = 50; + toks = XNEWVEC (cpp_token, maxcount); + toks[0] = pfile->directive_result; + toks[0].src_loc = expansion_loc; + + do + { + if (count == maxcount) + { + maxcount = maxcount * 3 / 2; + toks = XRESIZEVEC (cpp_token, toks, maxcount); + } + toks[count] = *cpp_get_token (pfile); + /* _Pragma is a builtin, so we're not within a macro-map, and so + the token locations are set to bogus ordinary locations + near to, but after that of the "_Pragma". + Paper over this by setting them equal to the location of the + _Pragma itself (PR preprocessor/69126). */ + toks[count].src_loc = expansion_loc; + /* Macros have been already expanded by cpp_get_token + if the pragma allowed expansion. */ + toks[count++].flags |= NO_EXPAND; + } + while (toks[count-1].type != CPP_PRAGMA_EOL); + } + else + { + count = 1; + toks = &pfile->avoid_paste; + + /* If we handled the entire pragma internally, make sure we get the + line number correct for the next token. */ + if (pfile->cb.line_change) + pfile->cb.line_change (pfile, pfile->cur_token, false); + } + + /* Finish inlining run_directive. */ + pfile->buffer->file = NULL; + _cpp_pop_buffer (pfile); + + /* Reset the old macro state before ... */ + XDELETE (pfile->context); + pfile->context = saved_context; + pfile->cur_token = saved_cur_token; + pfile->cur_run = saved_cur_run; + + /* ... inserting the new tokens we collected. */ + _cpp_push_token_context (pfile, NULL, toks, count); +} + +/* Handle the _Pragma operator. Return 0 on error, 1 if ok. */ +int +_cpp_do__Pragma (cpp_reader *pfile, location_t expansion_loc) +{ + const cpp_token *string = get__Pragma_string (pfile); + pfile->directive_result.type = CPP_PADDING; + + if (string) + { + destringize_and_run (pfile, &string->val.str, expansion_loc); + return 1; + } + cpp_error (pfile, CPP_DL_ERROR, + "_Pragma takes a parenthesized string literal"); + return 0; +} + +/* Handle #ifdef. */ +static void +do_ifdef (cpp_reader *pfile) +{ + int skip = 1; + + if (! pfile->state.skipping) + { + cpp_hashnode *node = lex_macro_node (pfile, false); + + if (node) + { + skip = !_cpp_defined_macro_p (node); + if (!_cpp_maybe_notify_macro_use (pfile, node, pfile->directive_line)) + /* It wasn't a macro after all. */ + skip = true; + _cpp_mark_macro_used (node); + if (pfile->cb.used) + pfile->cb.used (pfile, pfile->directive_line, node); + check_eol (pfile, false); + } + } + + push_conditional (pfile, skip, T_IFDEF, 0); +} + +/* Handle #ifndef. */ +static void +do_ifndef (cpp_reader *pfile) +{ + int skip = 1; + cpp_hashnode *node = 0; + + if (! pfile->state.skipping) + { + node = lex_macro_node (pfile, false); + + if (node) + { + skip = _cpp_defined_macro_p (node); + if (!_cpp_maybe_notify_macro_use (pfile, node, pfile->directive_line)) + /* It wasn't a macro after all. */ + skip = false; + _cpp_mark_macro_used (node); + if (pfile->cb.used) + pfile->cb.used (pfile, pfile->directive_line, node); + check_eol (pfile, false); + } + } + + push_conditional (pfile, skip, T_IFNDEF, node); +} + +/* _cpp_parse_expr puts a macro in a "#if !defined ()" expression in + pfile->mi_ind_cmacro so we can handle multiple-include + optimizations. If macro expansion occurs in the expression, we + cannot treat it as a controlling conditional, since the expansion + could change in the future. That is handled by cpp_get_token. */ +static void +do_if (cpp_reader *pfile) +{ + int skip = 1; + + if (! 
pfile->state.skipping) + skip = _cpp_parse_expr (pfile, true) == false; + + push_conditional (pfile, skip, T_IF, pfile->mi_ind_cmacro); +} + +/* Flip skipping state if appropriate and continue without changing + if_stack; this is so that the error message for missing #endif's + etc. will point to the original #if. */ +static void +do_else (cpp_reader *pfile) +{ + cpp_buffer *buffer = pfile->buffer; + struct if_stack *ifs = buffer->if_stack; + + if (ifs == NULL) + cpp_error (pfile, CPP_DL_ERROR, "#else without #if"); + else + { + if (ifs->type == T_ELSE) + { + cpp_error (pfile, CPP_DL_ERROR, "#else after #else"); + cpp_error_with_line (pfile, CPP_DL_ERROR, ifs->line, 0, + "the conditional began here"); + } + ifs->type = T_ELSE; + + /* Skip any future (erroneous) #elses or #elifs. */ + pfile->state.skipping = ifs->skip_elses; + ifs->skip_elses = true; + + /* Invalidate any controlling macro. */ + ifs->mi_cmacro = 0; + + /* Only check EOL if was not originally skipping. */ + if (!ifs->was_skipping && CPP_OPTION (pfile, warn_endif_labels)) + check_eol_endif_labels (pfile); + } +} + +/* Handle a #elif, #elifdef or #elifndef directive by not changing if_stack + either. See the comment above do_else. */ +static void +do_elif (cpp_reader *pfile) +{ + cpp_buffer *buffer = pfile->buffer; + struct if_stack *ifs = buffer->if_stack; + + if (ifs == NULL) + cpp_error (pfile, CPP_DL_ERROR, "#%s without #if", pfile->directive->name); + else + { + if (ifs->type == T_ELSE) + { + cpp_error (pfile, CPP_DL_ERROR, "#%s after #else", + pfile->directive->name); + cpp_error_with_line (pfile, CPP_DL_ERROR, ifs->line, 0, + "the conditional began here"); + } + ifs->type = T_ELIF; + + /* See DR#412: "Only the first group whose control condition + evaluates to true (nonzero) is processed; any following groups + are skipped and their controlling directives are processed as + if they were in a group that is skipped." */ + if (ifs->skip_elses) + { + /* In older GNU standards, #elifdef/#elifndef is supported + as an extension, but pedwarn if -pedantic if the presence + of the directive would be rejected. */ + if (pfile->directive != &dtable[T_ELIF] + && ! CPP_OPTION (pfile, elifdef) + && CPP_PEDANTIC (pfile) + && !pfile->state.skipping) + { + if (CPP_OPTION (pfile, cplusplus)) + cpp_error (pfile, CPP_DL_PEDWARN, + "#%s before C++23 is a GCC extension", + pfile->directive->name); + else + cpp_error (pfile, CPP_DL_PEDWARN, + "#%s before C2X is a GCC extension", + pfile->directive->name); + } + pfile->state.skipping = 1; + } + else + { + if (pfile->directive == &dtable[T_ELIF]) + pfile->state.skipping = !_cpp_parse_expr (pfile, false); + else + { + cpp_hashnode *node = lex_macro_node (pfile, false); + + if (node) + { + bool macro_defined = _cpp_defined_macro_p (node); + if (!_cpp_maybe_notify_macro_use (pfile, node, + pfile->directive_line)) + /* It wasn't a macro after all. */ + macro_defined = false; + bool skip = (pfile->directive == &dtable[T_ELIFDEF] + ? !macro_defined + : macro_defined); + if (pfile->cb.used) + pfile->cb.used (pfile, pfile->directive_line, node); + check_eol (pfile, false); + /* In older GNU standards, #elifdef/#elifndef is supported + as an extension, but pedwarn if -pedantic if the presence + of the directive would change behavior. */ + if (! 
CPP_OPTION (pfile, elifdef) + && CPP_PEDANTIC (pfile) + && pfile->state.skipping != skip) + { + if (CPP_OPTION (pfile, cplusplus)) + cpp_error (pfile, CPP_DL_PEDWARN, + "#%s before C++23 is a GCC extension", + pfile->directive->name); + else + cpp_error (pfile, CPP_DL_PEDWARN, + "#%s before C2X is a GCC extension", + pfile->directive->name); + } + pfile->state.skipping = skip; + } + } + ifs->skip_elses = !pfile->state.skipping; + } + + /* Invalidate any controlling macro. */ + ifs->mi_cmacro = 0; + } +} + +/* Handle a #elifdef directive. */ +static void +do_elifdef (cpp_reader *pfile) +{ + do_elif (pfile); +} + +/* Handle a #elifndef directive. */ +static void +do_elifndef (cpp_reader *pfile) +{ + do_elif (pfile); +} + +/* #endif pops the if stack and resets pfile->state.skipping. */ +static void +do_endif (cpp_reader *pfile) +{ + cpp_buffer *buffer = pfile->buffer; + struct if_stack *ifs = buffer->if_stack; + + if (ifs == NULL) + cpp_error (pfile, CPP_DL_ERROR, "#endif without #if"); + else + { + /* Only check EOL if was not originally skipping. */ + if (!ifs->was_skipping && CPP_OPTION (pfile, warn_endif_labels)) + check_eol_endif_labels (pfile); + + /* If potential control macro, we go back outside again. */ + if (ifs->next == 0 && ifs->mi_cmacro) + { + pfile->mi_valid = true; + pfile->mi_cmacro = ifs->mi_cmacro; + } + + buffer->if_stack = ifs->next; + pfile->state.skipping = ifs->was_skipping; + obstack_free (&pfile->buffer_ob, ifs); + } +} + +/* Push an if_stack entry for a preprocessor conditional, and set + pfile->state.skipping to SKIP. If TYPE indicates the conditional + is #if or #ifndef, CMACRO is a potentially controlling macro, and + we need to check here that we are at the top of the file. */ +static void +push_conditional (cpp_reader *pfile, int skip, int type, + const cpp_hashnode *cmacro) +{ + struct if_stack *ifs; + cpp_buffer *buffer = pfile->buffer; + + ifs = XOBNEW (&pfile->buffer_ob, struct if_stack); + ifs->line = pfile->directive_line; + ifs->next = buffer->if_stack; + ifs->skip_elses = pfile->state.skipping || !skip; + ifs->was_skipping = pfile->state.skipping; + ifs->type = type; + /* This condition is effectively a test for top-of-file. */ + if (pfile->mi_valid && pfile->mi_cmacro == 0) + ifs->mi_cmacro = cmacro; + else + ifs->mi_cmacro = 0; + + pfile->state.skipping = skip; + buffer->if_stack = ifs; +} + +/* Read the tokens of the answer into the macro pool, in a directive + of type TYPE. Only commit the memory if we intend it as permanent + storage, i.e. the #assert case. Returns 0 on success, and sets + ANSWERP to point to the answer. PRED_LOC is the location of the + predicate. */ +static bool +parse_answer (cpp_reader *pfile, int type, location_t pred_loc, + cpp_macro **answer_ptr) +{ + /* In a conditional, it is legal to not have an open paren. We + should save the following token in this case. */ + const cpp_token *paren = cpp_get_token (pfile); + + /* If not a paren, see if we're OK. */ + if (paren->type != CPP_OPEN_PAREN) + { + /* In a conditional no answer is a test for any answer. It + could be followed by any token. */ + if (type == T_IF) + { + _cpp_backup_tokens (pfile, 1); + return true; + } + + /* #unassert with no answer is valid - it removes all answers. 
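+	 For reference, the assertion forms whose answers are parsed here
+	 look like (examples only):
+
+	   #assert machine(vax)    predicate "machine", answer "vax"
+	   #if #machine(vax)       test for that particular answer
+	   #if #machine            test for any answer (no parenthesis)
+	   #unassert machine       drop every answer for "machine"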
*/ + if (type == T_UNASSERT && paren->type == CPP_EOF) + return true; + + cpp_error_with_line (pfile, CPP_DL_ERROR, pred_loc, 0, + "missing '(' after predicate"); + return false; + } + + cpp_macro *answer = _cpp_new_macro (pfile, cmk_assert, + _cpp_reserve_room (pfile, 0, + sizeof (cpp_macro))); + answer->parm.next = NULL; + unsigned count = 0; + for (;;) + { + const cpp_token *token = cpp_get_token (pfile); + + if (token->type == CPP_CLOSE_PAREN) + break; + + if (token->type == CPP_EOF) + { + cpp_error (pfile, CPP_DL_ERROR, "missing ')' to complete answer"); + return false; + } + + answer = (cpp_macro *)_cpp_reserve_room + (pfile, sizeof (cpp_macro) + count * sizeof (cpp_token), + sizeof (cpp_token)); + answer->exp.tokens[count++] = *token; + } + + if (!count) + { + cpp_error (pfile, CPP_DL_ERROR, "predicate's answer is empty"); + return false; + } + + /* Drop whitespace at start, for answer equivalence purposes. */ + answer->exp.tokens[0].flags &= ~PREV_WHITE; + + answer->count = count; + *answer_ptr = answer; + + return true; +} + +/* Parses an assertion directive of type TYPE, returning a pointer to + the hash node of the predicate, or 0 on error. The node is + guaranteed to be disjoint from the macro namespace, so can only + have type 'NT_VOID'. If an answer was supplied, it is placed in + *ANSWER_PTR, which is otherwise set to 0. */ +static cpp_hashnode * +parse_assertion (cpp_reader *pfile, int type, cpp_macro **answer_ptr) +{ + cpp_hashnode *result = 0; + + /* We don't expand predicates or answers. */ + pfile->state.prevent_expansion++; + + *answer_ptr = NULL; + + const cpp_token *predicate = cpp_get_token (pfile); + if (predicate->type == CPP_EOF) + cpp_error (pfile, CPP_DL_ERROR, "assertion without predicate"); + else if (predicate->type != CPP_NAME) + cpp_error_with_line (pfile, CPP_DL_ERROR, predicate->src_loc, 0, + "predicate must be an identifier"); + else if (parse_answer (pfile, type, predicate->src_loc, answer_ptr)) + { + unsigned int len = NODE_LEN (predicate->val.node.node); + unsigned char *sym = (unsigned char *) alloca (len + 1); + + /* Prefix '#' to get it out of macro namespace. */ + sym[0] = '#'; + memcpy (sym + 1, NODE_NAME (predicate->val.node.node), len); + result = cpp_lookup (pfile, sym, len + 1); + } + + pfile->state.prevent_expansion--; + + return result; +} + +/* Returns a pointer to the pointer to CANDIDATE in the answer chain, + or a pointer to NULL if the answer is not in the chain. */ +static cpp_macro ** +find_answer (cpp_hashnode *node, const cpp_macro *candidate) +{ + unsigned int i; + cpp_macro **result = NULL; + + for (result = &node->value.answers; *result; result = &(*result)->parm.next) + { + cpp_macro *answer = *result; + + if (answer->count == candidate->count) + { + for (i = 0; i < answer->count; i++) + if (!_cpp_equiv_tokens (&answer->exp.tokens[i], + &candidate->exp.tokens[i])) + break; + + if (i == answer->count) + break; + } + } + + return result; +} + +/* Test an assertion within a preprocessor conditional. Returns + nonzero on failure, zero on success. On success, the result of + the test is written into VALUE, otherwise the value 0. */ +int +_cpp_test_assertion (cpp_reader *pfile, unsigned int *value) +{ + cpp_macro *answer; + cpp_hashnode *node = parse_assertion (pfile, T_IF, &answer); + + /* For recovery, an erroneous assertion expression is handled as a + failing assertion. 
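+     So, assuming "#assert machine(vax)" was seen earlier (example
+     only), both "#if #machine" and "#if #machine(vax)" set *VALUE to 1,
+     while "#if #machine(ns32k)" leaves it 0.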
*/ + *value = 0; + + if (node) + { + if (node->value.answers) + *value = !answer || *find_answer (node, answer); + } + else if (pfile->cur_token[-1].type == CPP_EOF) + _cpp_backup_tokens (pfile, 1); + + /* We don't commit the memory for the answer - it's temporary only. */ + return node == 0; +} + +/* Handle #assert. */ +static void +do_assert (cpp_reader *pfile) +{ + cpp_macro *answer; + cpp_hashnode *node = parse_assertion (pfile, T_ASSERT, &answer); + + if (node) + { + /* Place the new answer in the answer list. First check there + is not a duplicate. */ + if (*find_answer (node, answer)) + { + cpp_error (pfile, CPP_DL_WARNING, "\"%s\" re-asserted", + NODE_NAME (node) + 1); + return; + } + + /* Commit or allocate storage for the answer. */ + answer = (cpp_macro *)_cpp_commit_buff + (pfile, sizeof (cpp_macro) - sizeof (cpp_token) + + sizeof (cpp_token) * answer->count); + + /* Chain into the list. */ + answer->parm.next = node->value.answers; + node->value.answers = answer; + + check_eol (pfile, false); + } +} + +/* Handle #unassert. */ +static void +do_unassert (cpp_reader *pfile) +{ + cpp_macro *answer; + cpp_hashnode *node = parse_assertion (pfile, T_UNASSERT, &answer); + + /* It isn't an error to #unassert something that isn't asserted. */ + if (node) + { + if (answer) + { + cpp_macro **p = find_answer (node, answer); + + /* Remove the assert from the list. */ + if (cpp_macro *temp = *p) + *p = temp->parm.next; + + check_eol (pfile, false); + } + else + _cpp_free_definition (node); + } + + /* We don't commit the memory for the answer - it's temporary only. */ +} + +/* These are for -D, -U, -A. */ + +/* Process the string STR as if it appeared as the body of a #define. + If STR is just an identifier, define it with value 1. + If STR has anything after the identifier, then it should + be identifier=definition. */ +void +cpp_define (cpp_reader *pfile, const char *str) +{ + char *buf; + const char *p; + size_t count; + + /* Copy the entire option so we can modify it. + Change the first "=" in the string to a space. If there is none, + tack " 1" on the end. */ + + count = strlen (str); + buf = (char *) alloca (count + 3); + memcpy (buf, str, count); + + p = strchr (str, '='); + if (p) + buf[p - str] = ' '; + else + { + buf[count++] = ' '; + buf[count++] = '1'; + } + buf[count] = '\n'; + + run_directive (pfile, T_DEFINE, buf, count); +} + +/* Like cpp_define, but does not warn about unused macro. */ +void +cpp_define_unused (cpp_reader *pfile, const char *str) +{ + unsigned char warn_unused_macros = CPP_OPTION (pfile, warn_unused_macros); + CPP_OPTION (pfile, warn_unused_macros) = 0; + cpp_define (pfile, str); + CPP_OPTION (pfile, warn_unused_macros) = warn_unused_macros; +} + +/* Use to build macros to be run through cpp_define() as + described above. + Example: cpp_define_formatted (pfile, "MACRO=%d", value); */ + +void +cpp_define_formatted (cpp_reader *pfile, const char *fmt, ...) +{ + char *ptr; + + va_list ap; + va_start (ap, fmt); + ptr = xvasprintf (fmt, ap); + va_end (ap); + + cpp_define (pfile, ptr); + free (ptr); +} + +/* Like cpp_define_formatted, but does not warn about unused macro. */ +void +cpp_define_formatted_unused (cpp_reader *pfile, const char *fmt, ...) +{ + char *ptr; + + va_list ap; + va_start (ap, fmt); + ptr = xvasprintf (fmt, ap); + va_end (ap); + + cpp_define_unused (pfile, ptr); + free (ptr); +} + +/* Slight variant of the above for use by initialize_builtins. 
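+   (As a reminder of the rewriting done by cpp_define above -- examples
+   only: the option string "FOO=42" is run as "#define FOO 42", and a
+   bare "BAR" is run as "#define BAR 1".)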
*/ +void +_cpp_define_builtin (cpp_reader *pfile, const char *str) +{ + size_t len = strlen (str); + char *buf = (char *) alloca (len + 1); + memcpy (buf, str, len); + buf[len] = '\n'; + run_directive (pfile, T_DEFINE, buf, len); +} + +/* Process MACRO as if it appeared as the body of an #undef. */ +void +cpp_undef (cpp_reader *pfile, const char *macro) +{ + size_t len = strlen (macro); + char *buf = (char *) alloca (len + 1); + memcpy (buf, macro, len); + buf[len] = '\n'; + run_directive (pfile, T_UNDEF, buf, len); +} + +/* Replace a previous definition DEF of the macro STR. If DEF is NULL, + or first element is zero, then the macro should be undefined. */ +static void +cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c) +{ + cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name); + if (node == NULL) + return; + + if (pfile->cb.before_define) + pfile->cb.before_define (pfile); + + if (cpp_macro_p (node)) + { + if (pfile->cb.undef) + pfile->cb.undef (pfile, pfile->directive_line, node); + if (CPP_OPTION (pfile, warn_unused_macros)) + _cpp_warn_if_unused_macro (pfile, node, NULL); + _cpp_free_definition (node); + } + + if (c->is_undef) + return; + if (c->is_builtin) + { + _cpp_restore_special_builtin (pfile, c); + return; + } + + { + size_t namelen; + const uchar *dn; + cpp_hashnode *h = NULL; + cpp_buffer *nbuf; + + namelen = ustrcspn (c->definition, "( \n"); + h = cpp_lookup (pfile, c->definition, namelen); + dn = c->definition + namelen; + + nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true); + if (nbuf != NULL) + { + _cpp_clean_line (pfile); + nbuf->sysp = 1; + if (!_cpp_create_definition (pfile, h)) + abort (); + _cpp_pop_buffer (pfile); + } + else + abort (); + h->value.macro->line = c->line; + h->value.macro->syshdr = c->syshdr; + h->value.macro->used = c->used; + } +} + +/* Process the string STR as if it appeared as the body of a #assert. */ +void +cpp_assert (cpp_reader *pfile, const char *str) +{ + handle_assertion (pfile, str, T_ASSERT); +} + +/* Process STR as if it appeared as the body of an #unassert. */ +void +cpp_unassert (cpp_reader *pfile, const char *str) +{ + handle_assertion (pfile, str, T_UNASSERT); +} + +/* Common code for cpp_assert (-A) and cpp_unassert (-A-). */ +static void +handle_assertion (cpp_reader *pfile, const char *str, int type) +{ + size_t count = strlen (str); + const char *p = strchr (str, '='); + + /* Copy the entire option so we can modify it. Change the first + "=" in the string to a '(', and tack a ')' on the end. */ + char *buf = (char *) alloca (count + 2); + + memcpy (buf, str, count); + if (p) + { + buf[p - str] = '('; + buf[count++] = ')'; + } + buf[count] = '\n'; + str = buf; + + run_directive (pfile, type, str, count); +} + +/* The options structure. */ +cpp_options * +cpp_get_options (cpp_reader *pfile) +{ + return &pfile->opts; +} + +/* The callbacks structure. */ +cpp_callbacks * +cpp_get_callbacks (cpp_reader *pfile) +{ + return &pfile->cb; +} + +/* Copy the given callbacks structure to our own. */ +void +cpp_set_callbacks (cpp_reader *pfile, cpp_callbacks *cb) +{ + pfile->cb = *cb; +} + +/* The narrow character set identifier. */ +const char * +cpp_get_narrow_charset_name (cpp_reader *pfile) +{ + return pfile->narrow_cset_desc.to; +} + +/* The wide character set identifier. */ +const char * +cpp_get_wide_charset_name (cpp_reader *pfile) +{ + return pfile->wide_cset_desc.to; +} + +/* The dependencies structure. (Creates one if it hasn't already been.) 
*/ +class mkdeps * +cpp_get_deps (cpp_reader *pfile) +{ + if (!pfile->deps && CPP_OPTION (pfile, deps.style) != DEPS_NONE) + pfile->deps = deps_init (); + return pfile->deps; +} + +/* Push a new buffer on the buffer stack. Returns the new buffer; it + doesn't fail. It does not generate a file change call back; that + is the responsibility of the caller. */ +cpp_buffer * +cpp_push_buffer (cpp_reader *pfile, const uchar *buffer, size_t len, + int from_stage3) +{ + cpp_buffer *new_buffer = XOBNEW (&pfile->buffer_ob, cpp_buffer); + + /* Clears, amongst other things, if_stack and mi_cmacro. */ + memset (new_buffer, 0, sizeof (cpp_buffer)); + + new_buffer->next_line = new_buffer->buf = buffer; + new_buffer->rlimit = buffer + len; + new_buffer->from_stage3 = from_stage3; + new_buffer->prev = pfile->buffer; + new_buffer->need_line = true; + + pfile->buffer = new_buffer; + + return new_buffer; +} + +/* Pops a single buffer, with a file change call-back if appropriate. + Then pushes the next -include file, if any remain. */ +void +_cpp_pop_buffer (cpp_reader *pfile) +{ + cpp_buffer *buffer = pfile->buffer; + struct _cpp_file *inc = buffer->file; + struct if_stack *ifs; + const unsigned char *to_free; + + /* Walk back up the conditional stack till we reach its level at + entry to this file, issuing error messages. */ + for (ifs = buffer->if_stack; ifs; ifs = ifs->next) + cpp_error_with_line (pfile, CPP_DL_ERROR, ifs->line, 0, + "unterminated #%s", dtable[ifs->type].name); + + /* In case of a missing #endif. */ + pfile->state.skipping = 0; + + /* _cpp_do_file_change expects pfile->buffer to be the new one. */ + pfile->buffer = buffer->prev; + + to_free = buffer->to_free; + free (buffer->notes); + + /* Free the buffer object now; we may want to push a new buffer + in _cpp_push_next_include_file. */ + obstack_free (&pfile->buffer_ob, buffer); + + if (inc) + { + _cpp_pop_file_buffer (pfile, inc, to_free); + + _cpp_do_file_change (pfile, LC_LEAVE, 0, 0, 0); + } + else if (to_free) + free ((void *)to_free); +} + +/* Enter all recognized directives in the hash table. */ +void +_cpp_init_directives (cpp_reader *pfile) +{ + for (int i = 0; i < N_DIRECTIVES; i++) + { + cpp_hashnode *node = cpp_lookup (pfile, dtable[i].name, dtable[i].length); + node->is_directive = 1; + node->directive_index = i; + } +} + +/* Extract header file from a bracket include. Parsing starts after '<'. + The string is malloced and must be freed by the caller. */ +char * +_cpp_bracket_include(cpp_reader *pfile) +{ + return glue_header_name (pfile); +} + + +//-------------------------------------------------------------------------------- +// RT extensions +//-------------------------------------------------------------------------------- + +/*-------------------------------------------------------------------------------- + directive `#assign` + + cmd ::= "#assign" name body ; + + name ::= clause ; + body ::= clause ; + + clause ::= "(" literal? ")" | "[" expr? "]" ; + + literal ::= ; sequence parsed into tokens + expr ::= ; sequence parsed into tokens with recursive expansion of each token + + ; white space, including new lines, is ignored. 
+ + This differs from `#define`: + -name clause must reduce to a valid #define name + -the assign is defined after the body clause has been parsed + +*/ + +extern bool _cpp_create_assign(cpp_reader *pfile); + + +static void do_assign(cpp_reader *pfile){ + + _cpp_create_assign(pfile); + +} + + +/*-------------------------------------------------------------------------------- + directive `#macro` + + directive ::= "#rt_macro" name params body ; + + name ::= identifier ; + + params ::= "(" param_list? ")" ; + param_list ::= identifier ("," identifier)* ; + + body ::= paren_clause ; + + paren_clause ::= "(" literal? ")" + + literal ::= ; sequence parsed into tokens without expansion + + + ; whitespace, including newlines, is ignored + + +*/ +extern bool _cpp_create_rt_macro (cpp_reader *pfile, cpp_hashnode *node); + +static void +do_rt_macro (cpp_reader *pfile) +{ + cpp_hashnode *node = lex_macro_node(pfile, true); + + if(node) + { + /* If we have been requested to expand comments into macros, + then re-enable saving of comments. */ + pfile->state.save_comments = + ! CPP_OPTION (pfile, discard_comments_in_macro_exp); + + if(pfile->cb.before_define) + pfile->cb.before_define (pfile); + + if( _cpp_create_rt_macro(pfile, node) ) + if (pfile->cb.define) + pfile->cb.define (pfile, pfile->directive_line, node); + + node->flags &= ~NODE_USED; + } +} + diff --git a/developer/script_Deb-12.10_gcc-12.4.1/library/include/#cpplib.h# b/developer/script_Deb-12.10_gcc-12.4.1/library/include/#cpplib.h# new file mode 100644 index 0000000..aea752f --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/library/include/#cpplib.h# @@ -0,0 +1,1585 @@ +/* Definitions for CPP library. + Copyright (C) 1995-2022 Free Software Foundation, Inc. + Written by Per Bothner, 1994-95. + +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; see the file COPYING3. If not see +. + + In other words, you are welcome to use, share and improve this program. + You are forbidden to forbid anyone else to use, share and improve + what you give them. Help stamp out software-hoarding! */ +#ifndef LIBCPP_CPPLIB_H +#define LIBCPP_CPPLIB_H + +#include +#include "symtab.h" +#include "line-map.h" + +typedef struct cpp_reader cpp_reader; +typedef struct cpp_buffer cpp_buffer; +typedef struct cpp_options cpp_options; +typedef struct cpp_token cpp_token; +typedef struct cpp_string cpp_string; +typedef struct cpp_hashnode cpp_hashnode; +typedef struct cpp_macro cpp_macro; +typedef struct cpp_callbacks cpp_callbacks; +typedef struct cpp_dir cpp_dir; + +struct _cpp_file; + +/* The first three groups, apart from '=', can appear in preprocessor + expressions (+= and -= are used to indicate unary + and - resp.). + This allows a lookup table to be implemented in _cpp_parse_expr. + + The first group, to CPP_LAST_EQ, can be immediately followed by an + '='. The lexer needs operators ending in '=', like ">>=", to be in + the same order as their counterparts without the '=', like ">>". 
+ + See the cpp_operator table optab in expr.cc if you change the order or + add or remove anything in the first group. */ + +#define TTYPE_TABLE \ + OP(EQ, "=") \ + OP(NOT, "!") \ + OP(GREATER, ">") /* compare */ \ + OP(LESS, "<") \ + OP(PLUS, "+") /* math */ \ + OP(MINUS, "-") \ + OP(MULT, "*") \ + OP(DIV, "/") \ + OP(MOD, "%") \ + OP(AND, "&") /* bit ops */ \ + OP(OR, "|") \ + OP(XOR, "^") \ + OP(RSHIFT, ">>") \ + OP(LSHIFT, "<<") \ + \ + OP(COMPL, "~") \ + OP(AND_AND, "&&") /* logical */ \ + OP(OR_OR, "||") \ + OP(QUERY, "?") \ + OP(COLON, ":") \ + OP(COMMA, ",") /* grouping */ \ + OP(OPEN_PAREN, "(") \ + OP(CLOSE_PAREN, ")") \ + TK(EOF, NONE) \ + OP(EQ_EQ, "==") /* compare */ \ + OP(NOT_EQ, "!=") \ + OP(GREATER_EQ, ">=") \ + OP(LESS_EQ, "<=") \ + OP(SPACESHIP, "<=>") \ + \ + /* These two are unary + / - in preprocessor expressions. */ \ + OP(PLUS_EQ, "+=") /* math */ \ + OP(MINUS_EQ, "-=") \ + \ + OP(MULT_EQ, "*=") \ + OP(DIV_EQ, "/=") \ + OP(MOD_EQ, "%=") \ + OP(AND_EQ, "&=") /* bit ops */ \ + OP(OR_EQ, "|=") \ + OP(XOR_EQ, "^=") \ + OP(RSHIFT_EQ, ">>=") \ + OP(LSHIFT_EQ, "<<=") \ + /* Digraphs together, beginning with CPP_FIRST_DIGRAPH. */ \ + OP(HASH, "#") /* digraphs */ \ + OP(PASTE, "##") \ + OP(OPEN_SQUARE, "[") \ + OP(CLOSE_SQUARE, "]") \ + OP(OPEN_BRACE, "{") \ + OP(CLOSE_BRACE, "}") \ + /* The remainder of the punctuation. Order is not significant. */ \ + OP(SEMICOLON, ";") /* structure */ \ + OP(ELLIPSIS, "...") \ + OP(PLUS_PLUS, "++") /* increment */ \ + OP(MINUS_MINUS, "--") \ + OP(DEREF, "->") /* accessors */ \ + OP(DOT, ".") \ + OP(SCOPE, "::") \ + OP(DEREF_STAR, "->*") \ + OP(DOT_STAR, ".*") \ + OP(ATSIGN, "@") /* used in Objective-C */ \ + \ + TK(NAME, IDENT) /* word */ \ + TK(AT_NAME, IDENT) /* @word - Objective-C */ \ + TK(NUMBER, LITERAL) /* 34_be+ta */ \ + \ + TK(CHAR, LITERAL) /* 'char' */ \ + TK(WCHAR, LITERAL) /* L'char' */ \ + TK(CHAR16, LITERAL) /* u'char' */ \ + TK(CHAR32, LITERAL) /* U'char' */ \ + TK(UTF8CHAR, LITERAL) /* u8'char' */ \ + TK(OTHER, LITERAL) /* stray punctuation */ \ + \ + TK(STRING, LITERAL) /* "string" */ \ + TK(WSTRING, LITERAL) /* L"string" */ \ + TK(STRING16, LITERAL) /* u"string" */ \ + TK(STRING32, LITERAL) /* U"string" */ \ + TK(UTF8STRING, LITERAL) /* u8"string" */ \ + TK(OBJC_STRING, LITERAL) /* @"string" - Objective-C */ \ + TK(HEADER_NAME, LITERAL) /* in #include */ \ + \ + TK(CHAR_USERDEF, LITERAL) /* 'char'_suffix - C++-0x */ \ + TK(WCHAR_USERDEF, LITERAL) /* L'char'_suffix - C++-0x */ \ + TK(CHAR16_USERDEF, LITERAL) /* u'char'_suffix - C++-0x */ \ + TK(CHAR32_USERDEF, LITERAL) /* U'char'_suffix - C++-0x */ \ + TK(UTF8CHAR_USERDEF, LITERAL) /* u8'char'_suffix - C++-0x */ \ + TK(STRING_USERDEF, LITERAL) /* "string"_suffix - C++-0x */ \ + TK(WSTRING_USERDEF, LITERAL) /* L"string"_suffix - C++-0x */ \ + TK(STRING16_USERDEF, LITERAL) /* u"string"_suffix - C++-0x */ \ + TK(STRING32_USERDEF, LITERAL) /* U"string"_suffix - C++-0x */ \ + TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++-0x */ \ + \ + TK(COMMENT, LITERAL) /* Only if output comments. */ \ + /* SPELL_LITERAL happens to DTRT. */ \ + TK(MACRO_ARG, NONE) /* Macro argument. */ \ + TK(PRAGMA, NONE) /* Only for deferred pragmas. */ \ + TK(PRAGMA_EOL, NONE) /* End-of-line for deferred pragmas. */ \ + TK(PADDING, NONE) /* Whitespace for -E. */ + +#define OP(e, s) CPP_ ## e, +#define TK(e, s) CPP_ ## e, +enum cpp_ttype +{ + TTYPE_TABLE + N_TTYPES, + + /* A token type for keywords, as opposed to ordinary identifiers. */ + CPP_KEYWORD, + + /* Positions in the table. 
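+     (Recall that with the OP/TK definitions just above, a table entry
+     such as OP(EQ, "=") expands here to the enumerator CPP_EQ; the
+     names below simply mark useful positions among those enumerators.)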
*/ + CPP_LAST_EQ = CPP_LSHIFT, + CPP_FIRST_DIGRAPH = CPP_HASH, + CPP_LAST_PUNCTUATOR= CPP_ATSIGN, + CPP_LAST_CPP_OP = CPP_LESS_EQ +}; +#undef OP +#undef TK + +/* C language kind, used when calling cpp_create_reader. */ +enum c_lang {CLK_GNUC89 = 0, CLK_GNUC99, CLK_GNUC11, CLK_GNUC17, CLK_GNUC2X, + CLK_STDC89, CLK_STDC94, CLK_STDC99, CLK_STDC11, CLK_STDC17, + CLK_STDC2X, + CLK_GNUCXX, CLK_CXX98, CLK_GNUCXX11, CLK_CXX11, + CLK_GNUCXX14, CLK_CXX14, CLK_GNUCXX17, CLK_CXX17, + CLK_GNUCXX20, CLK_CXX20, CLK_GNUCXX23, CLK_CXX23, + CLK_ASM}; + +/* Payload of a NUMBER, STRING, CHAR or COMMENT token. */ +struct GTY(()) cpp_string { + unsigned int len; + const unsigned char *text; +}; + +/* Flags for the cpp_token structure. */ +#define PREV_WHITE (1 << 0) /* If whitespace before this token. */ +#define DIGRAPH (1 << 1) /* If it was a digraph. */ +#define STRINGIFY_ARG (1 << 2) /* If macro argument to be stringified. */ +#define PASTE_LEFT (1 << 3) /* If on LHS of a ## operator. */ +#define NAMED_OP (1 << 4) /* C++ named operators. */ +#define PREV_FALLTHROUGH (1 << 5) /* On a token preceeded by FALLTHROUGH + comment. */ +#define BOL (1 << 6) /* Token at beginning of line. */ +#define PURE_ZERO (1 << 7) /* Single 0 digit, used by the C++ frontend, + set in c-lex.cc. */ +#define COLON_SCOPE PURE_ZERO /* Adjacent colons in C < 23. */ +#define SP_DIGRAPH (1 << 8) /* # or ## token was a digraph. */ +#define SP_PREV_WHITE (1 << 9) /* If whitespace before a ## + operator, or before this token + after a # operator. */ +#define NO_EXPAND (1 << 10) /* Do not macro-expand this token. */ +#define PRAGMA_OP (1 << 11) /* _Pragma token. */ + +/* Specify which field, if any, of the cpp_token union is used. */ + +enum cpp_token_fld_kind { + CPP_TOKEN_FLD_NODE, + CPP_TOKEN_FLD_SOURCE, + CPP_TOKEN_FLD_STR, + CPP_TOKEN_FLD_ARG_NO, + CPP_TOKEN_FLD_TOKEN_NO, + CPP_TOKEN_FLD_PRAGMA, + CPP_TOKEN_FLD_NONE +}; + +/* A macro argument in the cpp_token union. */ +struct GTY(()) cpp_macro_arg { + /* Argument number. */ + unsigned int arg_no; + /* The original spelling of the macro argument token. */ + cpp_hashnode * + GTY ((nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) + spelling; +}; + +/* An identifier in the cpp_token union. */ +struct GTY(()) cpp_identifier { + /* The canonical (UTF-8) spelling of the identifier. */ + cpp_hashnode * + GTY ((nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) + node; + /* The original spelling of the identifier. */ + cpp_hashnode * + GTY ((nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) + spelling; +}; + +/* A preprocessing token. This has been carefully packed and should + occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts. */ +struct GTY(()) cpp_token { + + /* Location of first char of token, together with range of full token. */ + location_t src_loc; + + ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT; /* token type */ + unsigned short flags; /* flags - see above */ + + union cpp_token_u + { + /* An identifier. */ + struct cpp_identifier GTY ((tag ("CPP_TOKEN_FLD_NODE"))) node; + + /* Inherit padding from this token. */ + cpp_token * GTY ((tag ("CPP_TOKEN_FLD_SOURCE"))) source; + + /* A string, or number. */ + struct cpp_string GTY ((tag ("CPP_TOKEN_FLD_STR"))) str; + + /* Argument no. 
(and original spelling) for a CPP_MACRO_ARG. */ + struct cpp_macro_arg GTY ((tag ("CPP_TOKEN_FLD_ARG_NO"))) macro_arg; + + /* Original token no. for a CPP_PASTE (from a sequence of + consecutive paste tokens in a macro expansion). */ + unsigned int GTY ((tag ("CPP_TOKEN_FLD_TOKEN_NO"))) token_no; + + /* Caller-supplied identifier for a CPP_PRAGMA. */ + unsigned int GTY ((tag ("CPP_TOKEN_FLD_PRAGMA"))) pragma; + } GTY ((desc ("cpp_token_val_index (&%1)"))) val; +}; + +/* Say which field is in use. */ +extern enum cpp_token_fld_kind cpp_token_val_index (const cpp_token *tok); + +/* A type wide enough to hold any multibyte source character. + cpplib's character constant interpreter requires an unsigned type. + Also, a typedef for the signed equivalent. + The width of this type is capped at 32 bits; there do exist targets + where wchar_t is 64 bits, but only in a non-default mode, and there + would be no meaningful interpretation for a wchar_t value greater + than 2^32 anyway -- the widest wide-character encoding around is + ISO 10646, which stops at 2^31. */ +#if CHAR_BIT * SIZEOF_INT >= 32 +# define CPPCHAR_SIGNED_T int +#elif CHAR_BIT * SIZEOF_LONG >= 32 +# define CPPCHAR_SIGNED_T long +#else +# error "Cannot find a least-32-bit signed integer type" +#endif +typedef unsigned CPPCHAR_SIGNED_T cppchar_t; +typedef CPPCHAR_SIGNED_T cppchar_signed_t; + +/* Style of header dependencies to generate. */ +enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM }; + +/* The possible normalization levels, from most restrictive to least. */ +enum cpp_normalize_level { + /* In NFKC. */ + normalized_KC = 0, + /* In NFC. */ + normalized_C, + /* In NFC, except for subsequences where being in NFC would make + the identifier invalid. */ + normalized_identifier_C, + /* Not normalized at all. */ + normalized_none +}; + +enum cpp_main_search +{ + CMS_none, /* A regular source file. */ + CMS_header, /* Is a directly-specified header file (eg PCH or + header-unit). */ + CMS_user, /* Search the user INCLUDE path. */ + CMS_system, /* Search the system INCLUDE path. */ +}; + +/* The possible bidirectional control characters checking levels. */ +enum cpp_bidirectional_level { + /* No checking. */ + bidirectional_none = 0, + /* Only detect unpaired uses of bidirectional control characters. */ + bidirectional_unpaired = 1, + /* Detect any use of bidirectional control characters. */ + bidirectional_any = 2, + /* Also warn about UCNs. */ + bidirectional_ucn = 4 +}; + +/* This structure is nested inside struct cpp_reader, and + carries all the options visible to the command line. */ +struct cpp_options +{ + /* The language we're preprocessing. */ + enum c_lang lang; + + /* Nonzero means use extra default include directories for C++. */ + unsigned char cplusplus; + + /* Nonzero means handle cplusplus style comments. */ + unsigned char cplusplus_comments; + + /* Nonzero means define __OBJC__, treat @ as a special token, use + the OBJC[PLUS]_INCLUDE_PATH environment variable, and allow + "#import". */ + unsigned char objc; + + /* Nonzero means don't copy comments into the output file. */ + unsigned char discard_comments; + + /* Nonzero means don't copy comments into the output file during + macro expansion. */ + unsigned char discard_comments_in_macro_exp; + + /* Nonzero means process the ISO trigraph sequences. */ + unsigned char trigraphs; + + /* Nonzero means process the ISO digraph sequences. */ + unsigned char digraphs; + + /* Nonzero means to allow hexadecimal floats and LL suffixes. 
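+     (That is, constants such as 0x1.fp3 or 10LL -- examples only.)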
*/ + unsigned char extended_numbers; + + /* Nonzero means process u/U prefix literals (UTF-16/32). */ + unsigned char uliterals; + + /* Nonzero means process u8 prefixed character literals (UTF-8). */ + unsigned char utf8_char_literals; + + /* Nonzero means process r/R raw strings. If this is set, uliterals + must be set as well. */ + unsigned char rliterals; + + /* Nonzero means print names of header files (-H). */ + unsigned char print_include_names; + + /* Nonzero means complain about deprecated features. */ + unsigned char cpp_warn_deprecated; + + /* Nonzero means warn if slash-star appears in a comment. */ + unsigned char warn_comments; + + /* Nonzero means to warn about __DATA__, __TIME__ and __TIMESTAMP__ usage. */ + unsigned char warn_date_time; + + /* Nonzero means warn if a user-supplied include directory does not + exist. */ + unsigned char warn_missing_include_dirs; + + /* Nonzero means warn if there are any trigraphs. */ + unsigned char warn_trigraphs; + + /* Nonzero means warn about multicharacter charconsts. */ + unsigned char warn_multichar; + + /* Nonzero means warn about various incompatibilities with + traditional C. */ + unsigned char cpp_warn_traditional; + + /* Nonzero means warn about long long numeric constants. */ + unsigned char cpp_warn_long_long; + + /* Nonzero means warn about text after an #endif (or #else). */ + unsigned char warn_endif_labels; + + /* Nonzero means warn about implicit sign changes owing to integer + promotions. */ + unsigned char warn_num_sign_change; + + /* Zero means don't warn about __VA_ARGS__ usage in c89 pedantic mode. + Presumably the usage is protected by the appropriate #ifdef. */ + unsigned char warn_variadic_macros; + + /* Nonzero means warn about builtin macros that are redefined or + explicitly undefined. */ + unsigned char warn_builtin_macro_redefined; + + /* Different -Wimplicit-fallthrough= levels. */ + unsigned char cpp_warn_implicit_fallthrough; + + /* Nonzero means we should look for header.gcc files that remap file + names. */ + unsigned char remap; + + /* Zero means dollar signs are punctuation. */ + unsigned char dollars_in_ident; + + /* Nonzero means UCNs are accepted in identifiers. */ + unsigned char extended_identifiers; + + /* True if we should warn about dollars in identifiers or numbers + for this translation unit. */ + unsigned char warn_dollars; + + /* Nonzero means warn if undefined identifiers are evaluated in an #if. */ + unsigned char warn_undef; + + /* Nonzero means warn if "defined" is encountered in a place other than + an #if. */ + unsigned char warn_expansion_to_defined; + + /* Nonzero means warn of unused macros from the main file. */ + unsigned char warn_unused_macros; + + /* Nonzero for the 1999 C Standard, including corrigenda and amendments. */ + unsigned char c99; + + /* Nonzero if we are conforming to a specific C or C++ standard. */ + unsigned char std; + + /* Nonzero means give all the error messages the ANSI standard requires. */ + unsigned char cpp_pedantic; + + /* Nonzero means we're looking at already preprocessed code, so don't + bother trying to do macro expansion and whatnot. */ + unsigned char preprocessed; + + /* Nonzero means we are going to emit debugging logs during + preprocessing. */ + unsigned char debug; + + /* Nonzero means we are tracking locations of tokens involved in + macro expansion. 1 Means we track the location in degraded mode + where we do not track locations of tokens resulting from the + expansion of arguments of function-like macro. 
2 Means we do + track all macro expansions. This last option is the one that + consumes the highest amount of memory. */ + unsigned char track_macro_expansion; + + /* Nonzero means handle C++ alternate operator names. */ + unsigned char operator_names; + + /* Nonzero means warn about use of C++ alternate operator names. */ + unsigned char warn_cxx_operator_names; + + /* True for traditional preprocessing. */ + unsigned char traditional; + + /* Nonzero for C++ 2011 Standard user-defined literals. */ + unsigned char user_literals; + + /* Nonzero means warn when a string or character literal is followed by a + ud-suffix which does not beging with an underscore. */ + unsigned char warn_literal_suffix; + + /* Nonzero means interpret imaginary, fixed-point, or other gnu extension + literal number suffixes as user-defined literal number suffixes. */ + unsigned char ext_numeric_literals; + + /* Nonzero means extended identifiers allow the characters specified + in C11. */ + unsigned char c11_identifiers; + + /* Nonzero for C++ 2014 Standard binary constants. */ + unsigned char binary_constants; + + /* Nonzero for C++ 2014 Standard digit separators. */ + unsigned char digit_separators; + + /* Nonzero for C2X decimal floating-point constants. */ + unsigned char dfp_constants; + + /* Nonzero for C++20 __VA_OPT__ feature. */ + unsigned char va_opt; + + /* Nonzero for the '::' token. */ + unsigned char scope; + + /* Nonzero for the '#elifdef' and '#elifndef' directives. */ + unsigned char elifdef; + + /* Nonzero means tokenize C++20 module directives. */ + unsigned char module_directives; + + /* Nonzero for C++23 size_t literals. */ + unsigned char size_t_literals; + + /* Holds the name of the target (execution) character set. */ + const char *narrow_charset; + + /* Holds the name of the target wide character set. */ + const char *wide_charset; + + /* Holds the name of the input character set. */ + const char *input_charset; + + /* The minimum permitted level of normalization before a warning + is generated. See enum cpp_normalize_level. */ + int warn_normalize; + + /* True to warn about precompiled header files we couldn't use. */ + bool warn_invalid_pch; + + /* True if dependencies should be restored from a precompiled header. */ + bool restore_pch_deps; + + /* True if warn about differences between C90 and C99. */ + signed char cpp_warn_c90_c99_compat; + + /* True if warn about differences between C11 and C2X. */ + signed char cpp_warn_c11_c2x_compat; + + /* True if warn about differences between C++98 and C++11. */ + bool cpp_warn_cxx11_compat; + + /* Nonzero if bidirectional control characters checking is on. See enum + cpp_bidirectional_level. */ + unsigned char cpp_warn_bidirectional; + + /* Dependency generation. */ + struct + { + /* Style of header dependencies to generate. */ + enum cpp_deps_style style; + + /* Assume missing files are generated files. */ + bool missing_files; + + /* Generate phony targets for each dependency apart from the first + one. */ + bool phony_targets; + + /* Generate dependency info for modules. */ + bool modules; + + /* If true, no dependency is generated on the main file. */ + bool ignore_main_file; + + /* If true, intend to use the preprocessor output (e.g., for compilation) + in addition to the dependency info. */ + bool need_preprocessor_output; + } deps; + + /* Target-specific features set by the front end or client. */ + + /* Precision for target CPP arithmetic, target characters, target + ints and target wide characters, respectively. 
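+     (For instance, a typical 64-bit GNU/Linux target would use 64, 8,
+     32 and 32 here -- illustrative values only.)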
*/ + size_t precision, char_precision, int_precision, wchar_precision; + + /* True means chars (wide chars) are unsigned. */ + bool unsigned_char, unsigned_wchar; + + /* True if the most significant byte in a word has the lowest + address in memory. */ + bool bytes_big_endian; + + /* Nonzero means __STDC__ should have the value 0 in system headers. */ + unsigned char stdc_0_in_system_headers; + + /* True disables tokenization outside of preprocessing directives. */ + bool directives_only; + + /* True enables canonicalization of system header file paths. */ + bool canonical_system_headers; + + /* The maximum depth of the nested #include. */ + unsigned int max_include_depth; + + cpp_main_search main_search : 8; +}; + +/* Diagnostic levels. To get a diagnostic without associating a + position in the translation unit with it, use cpp_error_with_line + with a line number of zero. */ + +enum cpp_diagnostic_level { + /* Warning, an error with -Werror. */ + CPP_DL_WARNING = 0, + /* Same as CPP_DL_WARNING, except it is not suppressed in system headers. */ + CPP_DL_WARNING_SYSHDR, + /* Warning, an error with -pedantic-errors or -Werror. */ + CPP_DL_PEDWARN, + /* An error. */ + CPP_DL_ERROR, + /* An internal consistency check failed. Prints "internal error: ", + otherwise the same as CPP_DL_ERROR. */ + CPP_DL_ICE, + /* An informative note following a warning. */ + CPP_DL_NOTE, + /* A fatal error. */ + CPP_DL_FATAL +}; + +/* Warning reason codes. Use a reason code of CPP_W_NONE for unclassified + warnings and diagnostics that are not warnings. */ + +enum cpp_warning_reason { + CPP_W_NONE = 0, + CPP_W_DEPRECATED, + CPP_W_COMMENTS, + CPP_W_MISSING_INCLUDE_DIRS, + CPP_W_TRIGRAPHS, + CPP_W_MULTICHAR, + CPP_W_TRADITIONAL, + CPP_W_LONG_LONG, + CPP_W_ENDIF_LABELS, + CPP_W_NUM_SIGN_CHANGE, + CPP_W_VARIADIC_MACROS, + CPP_W_BUILTIN_MACRO_REDEFINED, + CPP_W_DOLLARS, + CPP_W_UNDEF, + CPP_W_UNUSED_MACROS, + CPP_W_CXX_OPERATOR_NAMES, + CPP_W_NORMALIZE, + CPP_W_INVALID_PCH, + CPP_W_WARNING_DIRECTIVE, + CPP_W_LITERAL_SUFFIX, + CPP_W_SIZE_T_LITERALS, + CPP_W_DATE_TIME, + CPP_W_PEDANTIC, + CPP_W_C90_C99_COMPAT, + CPP_W_C11_C2X_COMPAT, + CPP_W_CXX11_COMPAT, + CPP_W_EXPANSION_TO_DEFINED, + CPP_W_BIDIRECTIONAL +}; + +/* Callback for header lookup for HEADER, which is the name of a + source file. It is used as a method of last resort to find headers + that are not otherwise found during the normal include processing. + The return value is the malloced name of a header to try and open, + if any, or NULL otherwise. This callback is called only if the + header is otherwise unfound. */ +typedef const char *(*missing_header_cb)(cpp_reader *, const char *header, cpp_dir **); + +/* Call backs to cpplib client. */ +struct cpp_callbacks +{ + /* Called when a new line of preprocessed output is started. */ + void (*line_change) (cpp_reader *, const cpp_token *, int); + + /* Called when switching to/from a new file. + The line_map is for the new file. It is NULL if there is no new file. + (In C this happens when done with + and also + when done with a main file.) This can be used for resource cleanup. 
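+     (Clients normally install these hooks through cpp_get_callbacks,
+     e.g.  cpp_get_callbacks (reader)->file_change = my_file_change;
+     where my_file_change is a hypothetical client function.)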
*/ + void (*file_change) (cpp_reader *, const line_map_ordinary *); + + void (*dir_change) (cpp_reader *, const char *); + void (*include) (cpp_reader *, location_t, const unsigned char *, + const char *, int, const cpp_token **); + void (*define) (cpp_reader *, location_t, cpp_hashnode *); + void (*undef) (cpp_reader *, location_t, cpp_hashnode *); + void (*ident) (cpp_reader *, location_t, const cpp_string *); + void (*def_pragma) (cpp_reader *, location_t); + int (*valid_pch) (cpp_reader *, const char *, int); + void (*read_pch) (cpp_reader *, const char *, int, const char *); + missing_header_cb missing_header; + + /* Context-sensitive macro support. Returns macro (if any) that should + be expanded. */ + cpp_hashnode * (*macro_to_expand) (cpp_reader *, const cpp_token *); + + /* Called to emit a diagnostic. This callback receives the + translated message. */ + bool (*diagnostic) (cpp_reader *, + enum cpp_diagnostic_level, + enum cpp_warning_reason, + rich_location *, + const char *, va_list *) + ATTRIBUTE_FPTR_PRINTF(5,0); + + /* Callbacks for when a macro is expanded, or tested (whether + defined or not at the time) in #ifdef, #ifndef or "defined". */ + void (*used_define) (cpp_reader *, location_t, cpp_hashnode *); + void (*used_undef) (cpp_reader *, location_t, cpp_hashnode *); + /* Called before #define and #undef or other macro definition + changes are processed. */ + void (*before_define) (cpp_reader *); + /* Called whenever a macro is expanded or tested. + Second argument is the location of the start of the current expansion. */ + void (*used) (cpp_reader *, location_t, cpp_hashnode *); + + /* Callback to identify whether an attribute exists. */ + int (*has_attribute) (cpp_reader *, bool); + + /* Callback to determine whether a built-in function is recognized. */ + int (*has_builtin) (cpp_reader *); + + /* Callback that can change a user lazy into normal macro. */ + void (*user_lazy_macro) (cpp_reader *, cpp_macro *, unsigned); + + /* Callback to handle deferred cpp_macros. */ + cpp_macro *(*user_deferred_macro) (cpp_reader *, location_t, cpp_hashnode *); + + /* Callback to parse SOURCE_DATE_EPOCH from environment. */ + time_t (*get_source_date_epoch) (cpp_reader *); + + /* Callback for providing suggestions for misspelled directives. */ + const char *(*get_suggestion) (cpp_reader *, const char *, const char *const *); + + /* Callback for when a comment is encountered, giving the location + of the opening slash, a pointer to the content (which is not + necessarily 0-terminated), and the length of the content. + The content contains the opening slash-star (or slash-slash), + and for C-style comments contains the closing star-slash. For + C++-style comments it does not include the terminating newline. */ + void (*comment) (cpp_reader *, location_t, const unsigned char *, + size_t); + + /* Callback for filename remapping in __FILE__ and __BASE_FILE__ macro + expansions. */ + const char *(*remap_filename) (const char*); + + /* Maybe translate a #include into something else. Return a + cpp_buffer containing the translation if translating. */ + char *(*translate_include) (cpp_reader *, line_maps *, location_t, + const char *path); +}; + +#ifdef VMS +#define INO_T_CPP ino_t ino[3] +#elif defined (_AIX) && SIZEOF_INO_T == 4 +#define INO_T_CPP ino64_t ino +#else +#define INO_T_CPP ino_t ino +#endif + +#if defined (_AIX) && SIZEOF_DEV_T == 4 +#define DEV_T_CPP dev64_t dev +#else +#define DEV_T_CPP dev_t dev +#endif + +/* Chain of directories to look for include files in. 
*/ +struct cpp_dir +{ + /* NULL-terminated singly-linked list. */ + struct cpp_dir *next; + + /* NAME of the directory, NUL-terminated. */ + char *name; + unsigned int len; + + /* One if a system header, two if a system header that has extern + "C" guards for C++. */ + unsigned char sysp; + + /* Is this a user-supplied directory? */ + bool user_supplied_p; + + /* The canonicalized NAME as determined by lrealpath. This field + is only used by hosts that lack reliable inode numbers. */ + char *canonical_name; + + /* Mapping of file names for this directory for MS-DOS and related + platforms. A NULL-terminated array of (from, to) pairs. */ + const char **name_map; + + /* Routine to construct pathname, given the search path name and the + HEADER we are trying to find, return a constructed pathname to + try and open. If this is NULL, the constructed pathname is as + constructed by append_file_to_dir. */ + char *(*construct) (const char *header, cpp_dir *dir); + + /* The C front end uses these to recognize duplicated + directories in the search path. */ + INO_T_CPP; + DEV_T_CPP; +}; + +/* The kind of the cpp_macro. */ +enum cpp_macro_kind { + cmk_macro, /* An ISO macro (token expansion). */ + cmk_assert, /* An assertion. */ + cmk_traditional /* A traditional macro (text expansion). */ +}; + +/* Each macro definition is recorded in a cpp_macro structure. + Variadic macros cannot occur with traditional cpp. */ +struct GTY(()) cpp_macro { + union cpp_parm_u + { + /* Parameters, if any. If parameter names use extended identifiers, + the original spelling of those identifiers, not the canonical + UTF-8 spelling, goes here. */ + cpp_hashnode ** GTY ((tag ("false"), + nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"), + length ("%1.paramc"))) params; + + /* If this is an assertion, the next one in the chain. */ + cpp_macro *GTY ((tag ("true"))) next; + } GTY ((desc ("%1.kind == cmk_assert"))) parm; + + /* Definition line number. */ + location_t line; + + /* Number of tokens in body, or bytes for traditional macros. */ + /* Do we really need 2^32-1 range here? */ + unsigned int count; + + /* Number of parameters. */ + unsigned short paramc; + + /* Non-zero if this is a user-lazy macro, value provided by user. */ + unsigned char lazy; + + /* The kind of this macro (ISO, trad or assert) */ + unsigned kind : 2; + + /* If a function-like macro. */ + unsigned int fun_like : 1; + + /* If a variadic macro. */ + unsigned int variadic : 1; + + /* If macro defined in system header. */ + unsigned int syshdr : 1; + + /* Nonzero if it has been expanded or had its existence tested. */ + unsigned int used : 1; + + /* Indicate whether the tokens include extra CPP_PASTE tokens at the + end to track invalid redefinitions with consecutive CPP_PASTE + tokens. */ + unsigned int extra_tokens : 1; + + /* Imported C++20 macro (from a header unit). */ + unsigned int imported_p : 1; + + /* 0 bits spare (32-bit). 32 on 64-bit target. */ + + union cpp_exp_u + { + /* Trailing array of replacement tokens (ISO), or assertion body value. */ + cpp_token GTY ((tag ("false"), length ("%1.count"))) tokens[1]; + + /* Pointer to replacement text (traditional). See comment at top + of cpptrad.c for how traditional function-like macros are + encoded. */ + const unsigned char *GTY ((tag ("true"))) text; + } GTY ((desc ("%1.kind == cmk_traditional"))) exp; +}; + +/* Poisoned identifiers are flagged NODE_POISONED. 
NODE_OPERATOR (C++ + only) indicates an identifier that behaves like an operator such as + "xor". NODE_DIAGNOSTIC is for speed in lex_token: it indicates a + diagnostic may be required for this node. Currently this only + applies to __VA_ARGS__, poisoned identifiers, and -Wc++-compat + warnings about NODE_OPERATOR. */ + +/* Hash node flags. */ +#define NODE_OPERATOR (1 << 0) /* C++ named operator. */ +#define NODE_POISONED (1 << 1) /* Poisoned identifier. */ +#define NODE_DIAGNOSTIC (1 << 2) /* Possible diagnostic when lexed. */ +#define NODE_WARN (1 << 3) /* Warn if redefined or undefined. */ +#define NODE_DISABLED (1 << 4) /* A disabled macro. */ +#define NODE_USED (1 << 5) /* Dumped with -dU. */ +#define NODE_CONDITIONAL (1 << 6) /* Conditional macro */ +#define NODE_WARN_OPERATOR (1 << 7) /* Warn about C++ named operator. */ +#define NODE_MODULE (1 << 8) /* C++-20 module-related name. */ + +/* Different flavors of hash node. */ +enum node_type +{ + NT_VOID = 0, /* Maybe an assert? */ + NT_MACRO_ARG, /* A macro arg. */ + NT_USER_MACRO, /* A user macro. */ + NT_BUILTIN_MACRO, /* A builtin macro. */ + NT_MACRO_MASK = NT_USER_MACRO /* Mask for either macro kind. */ +}; + +/* Different flavors of builtin macro. _Pragma is an operator, but we + handle it with the builtin code for efficiency reasons. */ +enum cpp_builtin_type +{ + BT_SPECLINE = 0, /* `__LINE__' */ + BT_DATE, /* `__DATE__' */ + BT_FILE, /* `__FILE__' */ + BT_FILE_NAME, /* `__FILE_NAME__' */ + BT_BASE_FILE, /* `__BASE_FILE__' */ + BT_INCLUDE_LEVEL, /* `__INCLUDE_LEVEL__' */ + BT_TIME, /* `__TIME__' */ + BT_STDC, /* `__STDC__' */ + BT_PRAGMA, /* `_Pragma' operator */ + BT_TIMESTAMP, /* `__TIMESTAMP__' */ + BT_COUNTER, /* `__COUNTER__' */ + BT_HAS_ATTRIBUTE, /* `__has_attribute(x)' */ + BT_HAS_STD_ATTRIBUTE, /* `__has_c_attribute(x)' */ + BT_HAS_BUILTIN, /* `__has_builtin(x)' */ + BT_HAS_INCLUDE, /* `__has_include(x)' */ + BT_HAS_INCLUDE_NEXT, /* `__has_include_next(x)' */ + + // RT Extension + BT_RT_ASSIGN, + BT_RT_TO_ARG_LIST, + BT_RT_TO_TOKEN_LIST, + BT_RT_FIRST, + BT_RT_REST, + BT_RT_MAP, + BT_RT_AL_MAP, + BT_RT_IF, + BT_RT_NOT, + BT_RT_AND, + BT_RT_OR, + BT_RT_IS_IDENTIFIER, + BT_RT_IS_NAME, + BT_RT_PASTE +}; + +#define CPP_HASHNODE(HNODE) ((cpp_hashnode *) (HNODE)) +#define HT_NODE(NODE) (&(NODE)->ident) +#define NODE_LEN(NODE) HT_LEN (HT_NODE (NODE)) +#define NODE_NAME(NODE) HT_STR (HT_NODE (NODE)) + +/* The common part of an identifier node shared amongst all 3 C front + ends. Also used to store CPP identifiers, which are a superset of + identifiers in the grammatical sense. */ + +union GTY(()) _cpp_hashnode_value { + /* Assert (maybe NULL) */ + cpp_macro * GTY((tag ("NT_VOID"))) answers; + /* Macro (maybe NULL) */ + cpp_macro * GTY((tag ("NT_USER_MACRO"))) macro; + /* Code for a builtin macro. */ + enum cpp_builtin_type GTY ((tag ("NT_BUILTIN_MACRO"))) builtin; + /* Macro argument index. */ + unsigned short GTY ((tag ("NT_MACRO_ARG"))) arg_index; +}; + +struct GTY(()) cpp_hashnode { + struct ht_identifier ident; + unsigned int is_directive : 1; + unsigned int directive_index : 7; /* If is_directive, + then index into directive table. + Otherwise, a NODE_OPERATOR. */ + unsigned int rid_code : 8; /* Rid code - for front ends. */ + unsigned int flags : 9; /* CPP flags. */ + ENUM_BITFIELD(node_type) type : 2; /* CPP node type. */ + + /* 5 bits spare. */ + + /* The deferred cookie is applicable to NT_USER_MACRO or NT_VOID. + The latter for when a macro had a prevailing undef. 
+ On a 64-bit system there would be 32-bits of padding to the value + field. So placing the deferred index here is not costly. */ + unsigned deferred; /* Deferred cookie */ + + union _cpp_hashnode_value GTY ((desc ("%1.type"))) value; +}; + +/* A class for iterating through the source locations within a + string token (before escapes are interpreted, and before + concatenation). */ + +class cpp_string_location_reader { + public: + cpp_string_location_reader (location_t src_loc, + line_maps *line_table); + + source_range get_next (); + + private: + location_t m_loc; + int m_offset_per_column; +}; + +/* A class for storing the source ranges of all of the characters within + a string literal, after escapes are interpreted, and after + concatenation. + + This is not GTY-marked, as instances are intended to be temporary. */ + +class cpp_substring_ranges +{ + public: + cpp_substring_ranges (); + ~cpp_substring_ranges (); + + int get_num_ranges () const { return m_num_ranges; } + source_range get_range (int idx) const + { + linemap_assert (idx < m_num_ranges); + return m_ranges[idx]; + } + + void add_range (source_range range); + void add_n_ranges (int num, cpp_string_location_reader &loc_reader); + + private: + source_range *m_ranges; + int m_num_ranges; + int m_alloc_ranges; +}; + +/* Call this first to get a handle to pass to other functions. + + If you want cpplib to manage its own hashtable, pass in a NULL + pointer. Otherwise you should pass in an initialized hash table + that cpplib will share; this technique is used by the C front + ends. */ +extern cpp_reader *cpp_create_reader (enum c_lang, struct ht *, + class line_maps *); + +/* Reset the cpp_reader's line_map. This is only used after reading a + PCH file. */ +extern void cpp_set_line_map (cpp_reader *, class line_maps *); + +/* Call this to change the selected language standard (e.g. because of + command line options). */ +extern void cpp_set_lang (cpp_reader *, enum c_lang); + +/* Set the include paths. */ +extern void cpp_set_include_chains (cpp_reader *, cpp_dir *, cpp_dir *, int); + +/* Call these to get pointers to the options, callback, and deps + structures for a given reader. These pointers are good until you + call cpp_finish on that reader. You can either edit the callbacks + through the pointer returned from cpp_get_callbacks, or set them + with cpp_set_callbacks. */ +extern cpp_options *cpp_get_options (cpp_reader *) ATTRIBUTE_PURE; +extern cpp_callbacks *cpp_get_callbacks (cpp_reader *) ATTRIBUTE_PURE; +extern void cpp_set_callbacks (cpp_reader *, cpp_callbacks *); +extern class mkdeps *cpp_get_deps (cpp_reader *) ATTRIBUTE_PURE; + +extern const char *cpp_probe_header_unit (cpp_reader *, const char *file, + bool angle_p, location_t); + +/* Call these to get name data about the various compile-time + charsets. */ +extern const char *cpp_get_narrow_charset_name (cpp_reader *) ATTRIBUTE_PURE; +extern const char *cpp_get_wide_charset_name (cpp_reader *) ATTRIBUTE_PURE; + +/* This function reads the file, but does not start preprocessing. It + returns the name of the original file; this is the same as the + input file, except for preprocessed input. This will generate at + least one file change callback, and possibly a line change callback + too. If there was an error opening the file, it returns NULL. */ +extern const char *cpp_read_main_file (cpp_reader *, const char *, + bool injecting = false); +extern location_t cpp_main_loc (const cpp_reader *); + +/* Adjust for the main file to be an include. 
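The BT_RT_* codes shown earlier extend the builtin enumeration contiguously from BT_RT_ASSIGN through BT_RT_PASTE, so a client can recognize the RT builtins on a hash node with a range check. A minimal sketch, assuming the enumerators stay in the order listed; cpp_hashnode_is_rt_builtin is a hypothetical helper:

// Hypothetical helper: true when NODE names one of the RT builtin macros.
// Relies on BT_RT_ASSIGN .. BT_RT_PASTE being declared contiguously above.
static bool
cpp_hashnode_is_rt_builtin (const cpp_hashnode *node)
{
  return node->type == NT_BUILTIN_MACRO
         && node->value.builtin >= BT_RT_ASSIGN
         && node->value.builtin <= BT_RT_PASTE;
}
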
*/ +extern void cpp_retrofit_as_include (cpp_reader *); + +/* Set up built-ins with special behavior. Use cpp_init_builtins() + instead unless your know what you are doing. */ +extern void cpp_init_special_builtins (cpp_reader *); + +/* Set up built-ins like __FILE__. */ +extern void cpp_init_builtins (cpp_reader *, int); + +/* This is called after options have been parsed, and partially + processed. */ +extern void cpp_post_options (cpp_reader *); + +/* Set up translation to the target character set. */ +extern void cpp_init_iconv (cpp_reader *); + +/* Call this to finish preprocessing. If you requested dependency + generation, pass an open stream to write the information to, + otherwise NULL. It is your responsibility to close the stream. */ +extern void cpp_finish (cpp_reader *, FILE *deps_stream); + +/* Call this to release the handle at the end of preprocessing. Any + use of the handle after this function returns is invalid. */ +extern void cpp_destroy (cpp_reader *); + +extern unsigned int cpp_token_len (const cpp_token *); +extern unsigned char *cpp_token_as_text (cpp_reader *, const cpp_token *); +extern unsigned char *cpp_spell_token (cpp_reader *, const cpp_token *, + unsigned char *, bool); +extern void cpp_register_pragma (cpp_reader *, const char *, const char *, + void (*) (cpp_reader *), bool); +extern void cpp_register_deferred_pragma (cpp_reader *, const char *, + const char *, unsigned, bool, bool); +extern int cpp_avoid_paste (cpp_reader *, const cpp_token *, + const cpp_token *); +extern const cpp_token *cpp_get_token (cpp_reader *); +extern const cpp_token *cpp_get_token_with_location (cpp_reader *, + location_t *); +inline bool cpp_user_macro_p (const cpp_hashnode *node) +{ + return node->type == NT_USER_MACRO; +} +inline bool cpp_builtin_macro_p (const cpp_hashnode *node) +{ + return node->type == NT_BUILTIN_MACRO; +} +inline bool cpp_macro_p (const cpp_hashnode *node) +{ + return node->type & NT_MACRO_MASK; +} +inline cpp_macro *cpp_set_deferred_macro (cpp_hashnode *node, + cpp_macro *forced = NULL) +{ + cpp_macro *old = node->value.macro; + + node->value.macro = forced; + node->type = NT_USER_MACRO; + node->flags &= ~NODE_USED; + + return old; +} +cpp_macro *cpp_get_deferred_macro (cpp_reader *, cpp_hashnode *, location_t); + +/* Returns true if NODE is a function-like user macro. */ +inline bool cpp_fun_like_macro_p (cpp_hashnode *node) +{ + return cpp_user_macro_p (node) && node->value.macro->fun_like; +} + +extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *); +extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *, + const cpp_macro *); +inline location_t cpp_macro_definition_location (cpp_hashnode *node) +{ + const cpp_macro *macro = node->value.macro; + return macro ? macro->line : 0; +} +/* Return an idempotent time stamp (possibly from SOURCE_DATE_EPOCH). */ +enum class CPP_time_kind +{ + FIXED = -1, /* Fixed time via source epoch. */ + DYNAMIC = -2, /* Dynamic via time(2). */ + UNKNOWN = -3 /* Wibbly wobbly, timey wimey. */ +}; +extern CPP_time_kind cpp_get_date (cpp_reader *, time_t *); + +extern void _cpp_backup_tokens (cpp_reader *, unsigned int); +extern const cpp_token *cpp_peek_token (cpp_reader *, int); + +/* Evaluate a CPP_*CHAR* token. */ +extern cppchar_t cpp_interpret_charconst (cpp_reader *, const cpp_token *, + unsigned int *, int *); +/* Evaluate a vector of CPP_*STRING* tokens. 
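Taken together, the reader functions above follow the usual create/read/lex/finish lifecycle. A minimal sketch of a client driver, assuming a line_maps table initialized elsewhere; CLK_GNUC17, CPP_EOF, and PREV_WHITE are defined later in this header, and option setup (cpp_get_options, cpp_post_options, cpp_init_builtins) is elided:

#include <stdio.h>

// Hypothetical driver: preprocess one file and print each token's spelling.
static void
dump_tokens (class line_maps *line_table, const char *path)
{
  cpp_reader *pfile = cpp_create_reader (CLK_GNUC17, NULL, line_table);
  if (!cpp_read_main_file (pfile, path))
    {
      cpp_destroy (pfile);
      return;
    }

  for (;;)
    {
      const cpp_token *tok = cpp_get_token (pfile);
      if (tok->type == CPP_EOF)
        break;
      if (tok->flags & PREV_WHITE)
        putchar (' ');
      printf ("%s", (const char *) cpp_token_as_text (pfile, tok));
    }

  cpp_finish (pfile, NULL);   // no dependency stream requested
  cpp_destroy (pfile);
}
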
*/ +extern bool cpp_interpret_string (cpp_reader *, + const cpp_string *, size_t, + cpp_string *, enum cpp_ttype); +extern const char *cpp_interpret_string_ranges (cpp_reader *pfile, + const cpp_string *from, + cpp_string_location_reader *, + size_t count, + cpp_substring_ranges *out, + enum cpp_ttype type); +extern bool cpp_interpret_string_notranslate (cpp_reader *, + const cpp_string *, size_t, + cpp_string *, enum cpp_ttype); + +/* Convert a host character constant to the execution character set. */ +extern cppchar_t cpp_host_to_exec_charset (cpp_reader *, cppchar_t); + +/* Used to register macros and assertions, perhaps from the command line. + The text is the same as the command line argument. */ +extern void cpp_define (cpp_reader *, const char *); +extern void cpp_define_unused (cpp_reader *, const char *); +extern void cpp_define_formatted (cpp_reader *pfile, + const char *fmt, ...) ATTRIBUTE_PRINTF_2; +extern void cpp_define_formatted_unused (cpp_reader *pfile, + const char *fmt, + ...) ATTRIBUTE_PRINTF_2; +extern void cpp_assert (cpp_reader *, const char *); +extern void cpp_undef (cpp_reader *, const char *); +extern void cpp_unassert (cpp_reader *, const char *); + +/* Mark a node as a lazily defined macro. */ +extern void cpp_define_lazily (cpp_reader *, cpp_hashnode *node, unsigned N); + +/* Undefine all macros and assertions. */ +extern void cpp_undef_all (cpp_reader *); + +extern cpp_buffer *cpp_push_buffer (cpp_reader *, const unsigned char *, + size_t, int); +extern int cpp_defined (cpp_reader *, const unsigned char *, int); + +/* A preprocessing number. Code assumes that any unused high bits of + the double integer are set to zero. */ + +/* This type has to be equal to unsigned HOST_WIDE_INT, see + gcc/c-family/c-lex.cc. */ +typedef uint64_t cpp_num_part; +typedef struct cpp_num cpp_num; +struct cpp_num +{ + cpp_num_part high; + cpp_num_part low; + bool unsignedp; /* True if value should be treated as unsigned. */ + bool overflow; /* True if the most recent calculation overflowed. */ +}; + +/* cpplib provides two interfaces for interpretation of preprocessing + numbers. + + cpp_classify_number categorizes numeric constants according to + their field (integer, floating point, or invalid), radix (decimal, + octal, hexadecimal), and type suffixes. */ + +#define CPP_N_CATEGORY 0x000F +#define CPP_N_INVALID 0x0000 +#define CPP_N_INTEGER 0x0001 +#define CPP_N_FLOATING 0x0002 + +#define CPP_N_WIDTH 0x00F0 +#define CPP_N_SMALL 0x0010 /* int, float, short _Fract/Accum */ +#define CPP_N_MEDIUM 0x0020 /* long, double, long _Fract/_Accum. */ +#define CPP_N_LARGE 0x0040 /* long long, long double, + long long _Fract/Accum. */ + +#define CPP_N_WIDTH_MD 0xF0000 /* machine defined. */ +#define CPP_N_MD_W 0x10000 +#define CPP_N_MD_Q 0x20000 + +#define CPP_N_RADIX 0x0F00 +#define CPP_N_DECIMAL 0x0100 +#define CPP_N_HEX 0x0200 +#define CPP_N_OCTAL 0x0400 +#define CPP_N_BINARY 0x0800 + +#define CPP_N_UNSIGNED 0x1000 /* Properties. */ +#define CPP_N_IMAGINARY 0x2000 +#define CPP_N_DFLOAT 0x4000 +#define CPP_N_DEFAULT 0x8000 + +#define CPP_N_FRACT 0x100000 /* Fract types. */ +#define CPP_N_ACCUM 0x200000 /* Accum types. */ +#define CPP_N_FLOATN 0x400000 /* _FloatN types. */ +#define CPP_N_FLOATNX 0x800000 /* _FloatNx types. */ + +#define CPP_N_USERDEF 0x1000000 /* C++11 user-defined literal. */ + +#define CPP_N_SIZE_T 0x2000000 /* C++23 size_t literal. */ + +#define CPP_N_WIDTH_FLOATN_NX 0xF0000000 /* _FloatN / _FloatNx value + of N, divided by 16. 
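The CPP_N_* masks above partition the word returned by cpp_classify_number (declared just below) into independent fields, so a client decodes a classification by masking rather than by comparing the whole value. A minimal sketch; number_kind is a hypothetical helper:

// Hypothetical helper: name the category and radix encoded in a
// classification word produced by cpp_classify_number.
static const char *
number_kind (unsigned int flags)
{
  if ((flags & CPP_N_CATEGORY) == CPP_N_INVALID)
    return "invalid";
  if ((flags & CPP_N_CATEGORY) == CPP_N_FLOATING)
    return "floating";
  switch (flags & CPP_N_RADIX)
    {
    case CPP_N_DECIMAL: return "decimal integer";
    case CPP_N_HEX:     return "hexadecimal integer";
    case CPP_N_OCTAL:   return "octal integer";
    case CPP_N_BINARY:  return "binary integer";
    default:            return "integer";
    }
}
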
*/ +#define CPP_FLOATN_SHIFT 24 +#define CPP_FLOATN_MAX 0xF0 + +/* Classify a CPP_NUMBER token. The return value is a combination of + the flags from the above sets. */ +extern unsigned cpp_classify_number (cpp_reader *, const cpp_token *, + const char **, location_t); + +/* Return the classification flags for a float suffix. */ +extern unsigned int cpp_interpret_float_suffix (cpp_reader *, const char *, + size_t); + +/* Return the classification flags for an int suffix. */ +extern unsigned int cpp_interpret_int_suffix (cpp_reader *, const char *, + size_t); + +/* Evaluate a token classified as category CPP_N_INTEGER. */ +extern cpp_num cpp_interpret_integer (cpp_reader *, const cpp_token *, + unsigned int); + +/* Sign extend a number, with PRECISION significant bits and all + others assumed clear, to fill out a cpp_num structure. */ +cpp_num cpp_num_sign_extend (cpp_num, size_t); + +/* Output a diagnostic of some kind. */ +extern bool cpp_error (cpp_reader *, enum cpp_diagnostic_level, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; +extern bool cpp_warning (cpp_reader *, enum cpp_warning_reason, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; +extern bool cpp_pedwarning (cpp_reader *, enum cpp_warning_reason, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; +extern bool cpp_warning_syshdr (cpp_reader *, enum cpp_warning_reason reason, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; + +/* As their counterparts above, but use RICHLOC. */ +extern bool cpp_warning_at (cpp_reader *, enum cpp_warning_reason, + rich_location *richloc, const char *msgid, ...) + ATTRIBUTE_PRINTF_4; +extern bool cpp_pedwarning_at (cpp_reader *, enum cpp_warning_reason, + rich_location *richloc, const char *msgid, ...) + ATTRIBUTE_PRINTF_4; + +/* Output a diagnostic with "MSGID: " preceding the + error string of errno. No location is printed. */ +extern bool cpp_errno (cpp_reader *, enum cpp_diagnostic_level, + const char *msgid); +/* Similarly, but with "FILENAME: " instead of "MSGID: ", where + the filename is not localized. */ +extern bool cpp_errno_filename (cpp_reader *, enum cpp_diagnostic_level, + const char *filename, location_t loc); + +/* Same as cpp_error, except additionally specifies a position as a + (translation unit) physical line and physical column. If the line is + zero, then no location is printed. */ +extern bool cpp_error_with_line (cpp_reader *, enum cpp_diagnostic_level, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; +extern bool cpp_warning_with_line (cpp_reader *, enum cpp_warning_reason, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; +extern bool cpp_pedwarning_with_line (cpp_reader *, enum cpp_warning_reason, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; +extern bool cpp_warning_with_line_syshdr (cpp_reader *, enum cpp_warning_reason, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; + +extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, + location_t src_loc, const char *msgid, ...) + ATTRIBUTE_PRINTF_4; + +extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, + rich_location *richloc, const char *msgid, ...) 
+ ATTRIBUTE_PRINTF_4; + +/* In lex.cc */ +extern int cpp_ideq (const cpp_token *, const char *); +extern void cpp_output_line (cpp_reader *, FILE *); +extern unsigned char *cpp_output_line_to_string (cpp_reader *, + const unsigned char *); +extern const unsigned char *cpp_alloc_token_string + (cpp_reader *, const unsigned char *, unsigned); +extern void cpp_output_token (const cpp_token *, FILE *); +extern const char *cpp_type2name (enum cpp_ttype, unsigned char flags); +/* Returns the value of an escape sequence, truncated to the correct + target precision. PSTR points to the input pointer, which is just + after the backslash. LIMIT is how much text we have. WIDE is true + if the escape sequence is part of a wide character constant or + string literal. Handles all relevant diagnostics. */ +extern cppchar_t cpp_parse_escape (cpp_reader *, const unsigned char ** pstr, + const unsigned char *limit, int wide); + +/* Structure used to hold a comment block at a given location in the + source code. */ + +typedef struct +{ + /* Text of the comment including the terminators. */ + char *comment; + + /* source location for the given comment. */ + location_t sloc; +} cpp_comment; + +/* Structure holding all comments for a given cpp_reader. */ + +typedef struct +{ + /* table of comment entries. */ + cpp_comment *entries; + + /* number of actual entries entered in the table. */ + int count; + + /* number of entries allocated currently. */ + int allocated; +} cpp_comment_table; + +/* Returns the table of comments encountered by the preprocessor. This + table is only populated when pfile->state.save_comments is true. */ +extern cpp_comment_table *cpp_get_comments (cpp_reader *); + +/* In hash.c */ + +/* Lookup an identifier in the hashtable. Puts the identifier in the + table if it is not already there. */ +extern cpp_hashnode *cpp_lookup (cpp_reader *, const unsigned char *, + unsigned int); + +typedef int (*cpp_cb) (cpp_reader *, cpp_hashnode *, void *); +extern void cpp_forall_identifiers (cpp_reader *, cpp_cb, void *); + +/* In macro.cc */ +extern void cpp_scan_nooutput (cpp_reader *); +extern int cpp_sys_macro_p (cpp_reader *); +extern unsigned char *cpp_quote_string (unsigned char *, const unsigned char *, + unsigned int); +extern bool cpp_compare_macros (const cpp_macro *macro1, + const cpp_macro *macro2); + +/* In files.cc */ +extern bool cpp_included (cpp_reader *, const char *); +extern bool cpp_included_before (cpp_reader *, const char *, location_t); +extern void cpp_make_system_header (cpp_reader *, int, int); +extern bool cpp_push_include (cpp_reader *, const char *); +extern bool cpp_push_default_include (cpp_reader *, const char *); +extern void cpp_change_file (cpp_reader *, enum lc_reason, const char *); +extern const char *cpp_get_path (struct _cpp_file *); +extern cpp_dir *cpp_get_dir (struct _cpp_file *); +extern cpp_buffer *cpp_get_buffer (cpp_reader *); +extern struct _cpp_file *cpp_get_file (cpp_buffer *); +extern cpp_buffer *cpp_get_prev (cpp_buffer *); +extern void cpp_clear_file_cache (cpp_reader *); + +/* cpp_get_converted_source returns the contents of the given file, as it exists + after cpplib has read it and converted it from the input charset to the + source charset. Return struct will be zero-filled if the data could not be + read for any reason. The data starts at the DATA pointer, but the TO_FREE + pointer is what should be passed to free(), as there may be an offset. 
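cpp_forall_identifiers above visits every hash node, which combines naturally with cpp_user_macro_p and cpp_macro_definition declared earlier to dump the macro table. A minimal sketch; print_macro is a hypothetical callback:

#include <stdio.h>

// Hypothetical cpp_cb callback: print "#define NAME ..." for user macros.
static int
print_macro (cpp_reader *pfile, cpp_hashnode *node, void *)
{
  if (cpp_user_macro_p (node))
    fprintf (stderr, "#define %s\n",
             (const char *) cpp_macro_definition (pfile, node));
  return 1;   // nonzero: keep iterating (assumed convention)
}

// Usage, once a cpp_reader exists:
//   cpp_forall_identifiers (pfile, print_macro, NULL);
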
*/ +struct cpp_converted_source +{ + char *to_free; + char *data; + size_t len; +}; +cpp_converted_source cpp_get_converted_source (const char *fname, + const char *input_charset); + +/* In pch.cc */ +struct save_macro_data; +extern int cpp_save_state (cpp_reader *, FILE *); +extern int cpp_write_pch_deps (cpp_reader *, FILE *); +extern int cpp_write_pch_state (cpp_reader *, FILE *); +extern int cpp_valid_state (cpp_reader *, const char *, int); +extern void cpp_prepare_state (cpp_reader *, struct save_macro_data **); +extern int cpp_read_state (cpp_reader *, const char *, FILE *, + struct save_macro_data *); + +/* In lex.cc */ +extern void cpp_force_token_locations (cpp_reader *, location_t); +extern void cpp_stop_forcing_token_locations (cpp_reader *); +enum CPP_DO_task +{ + CPP_DO_print, + CPP_DO_location, + CPP_DO_token +}; + +extern void cpp_directive_only_process (cpp_reader *pfile, + void *data, + void (*cb) (cpp_reader *, + CPP_DO_task, + void *data, ...)); + +/* In expr.cc */ +extern enum cpp_ttype cpp_userdef_string_remove_type + (enum cpp_ttype type); +extern enum cpp_ttype cpp_userdef_string_add_type + (enum cpp_ttype type); +extern enum cpp_ttype cpp_userdef_char_remove_type + (enum cpp_ttype type); +extern enum cpp_ttype cpp_userdef_char_add_type + (enum cpp_ttype type); +extern bool cpp_userdef_string_p + (enum cpp_ttype type); +extern bool cpp_userdef_char_p + (enum cpp_ttype type); +extern const char * cpp_get_userdef_suffix + (const cpp_token *); + +/* In charset.cc */ + +/* The result of attempting to decode a run of UTF-8 bytes. */ + +struct cpp_decoded_char +{ + const char *m_start_byte; + const char *m_next_byte; + + bool m_valid_ch; + cppchar_t m_ch; +}; + +/* Information for mapping between code points and display columns. + + This is a tabstop value, along with a callback for getting the + widths of characters. Normally this callback is cpp_wcwidth, but we + support other schemes for escaping non-ASCII unicode as a series of + ASCII chars when printing the user's source code in diagnostic-show-locus.cc + + For example, consider: + - the Unicode character U+03C0 "GREEK SMALL LETTER PI" (UTF-8: 0xCF 0x80) + - the Unicode character U+1F642 "SLIGHTLY SMILING FACE" + (UTF-8: 0xF0 0x9F 0x99 0x82) + - the byte 0xBF (a stray trailing byte of a UTF-8 character) + Normally U+03C0 would occupy one display column, U+1F642 + would occupy two display columns, and the stray byte would be + printed verbatim as one display column. + + However when escaping them as unicode code points as "" + and "" they occupy 8 and 9 display columns respectively, + and when escaping them as bytes as "<80>" and "<9F><99><82>" + they occupy 8 and 16 display columns respectively. In both cases + the stray byte is escaped to as 4 display columns. */ + +struct cpp_char_column_policy +{ + cpp_char_column_policy (int tabstop, + int (*width_cb) (cppchar_t c)) + : m_tabstop (tabstop), + m_undecoded_byte_width (1), + m_width_cb (width_cb) + {} + + int m_tabstop; + /* Width in display columns of a stray byte that isn't decodable + as UTF-8. */ + int m_undecoded_byte_width; + int (*m_width_cb) (cppchar_t c); +}; + +/* A class to manage the state while converting a UTF-8 sequence to cppchar_t + and computing the display width one character at a time. 
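Combining the policy structure above with the width helpers declared just below gives the common measurement call. A minimal sketch, assuming the default cpp_wcwidth callback and a tab stop of 8; measure_display_width is a hypothetical helper:

#include <string.h>

// Hypothetical helper: number of display columns a UTF-8 string occupies,
// using cpp_char_column_policy with cpp_display_width and cpp_wcwidth
// declared just below.
static int
measure_display_width (const char *text)
{
  cpp_char_column_policy policy (8, cpp_wcwidth);   // tab stop of 8
  return cpp_display_width (text, (int) strlen (text), policy);
}
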
*/ +class cpp_display_width_computation { + public: + cpp_display_width_computation (const char *data, int data_length, + const cpp_char_column_policy &policy); + const char *next_byte () const { return m_next; } + int bytes_processed () const { return m_next - m_begin; } + int bytes_left () const { return m_bytes_left; } + bool done () const { return !bytes_left (); } + int display_cols_processed () const { return m_display_cols; } + + int process_next_codepoint (cpp_decoded_char *out); + int advance_display_cols (int n); + + private: + const char *const m_begin; + const char *m_next; + size_t m_bytes_left; + const cpp_char_column_policy &m_policy; + int m_display_cols; +}; + +/* Convenience functions that are simple use cases for class + cpp_display_width_computation. Tab characters will be expanded to spaces + as determined by POLICY.m_tabstop, and non-printable-ASCII characters + will be escaped as per POLICY. */ + +int cpp_byte_column_to_display_column (const char *data, int data_length, + int column, + const cpp_char_column_policy &policy); +inline int cpp_display_width (const char *data, int data_length, + const cpp_char_column_policy &policy) +{ + return cpp_byte_column_to_display_column (data, data_length, data_length, + policy); +} +int cpp_display_column_to_byte_column (const char *data, int data_length, + int display_col, + const cpp_char_column_policy &policy); +int cpp_wcwidth (cppchar_t c); + +bool cpp_input_conversion_is_trivial (const char *input_charset); +int cpp_check_utf8_bom (const char *data, size_t data_length); + +#endif /* ! LIBCPP_CPPLIB_H */ diff --git a/developer/script_Deb-12.10_gcc-12.4.1/library/include/cpplib.h b/developer/script_Deb-12.10_gcc-12.4.1/library/include/cpplib.h new file mode 100644 index 0000000..aea752f --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/library/include/cpplib.h @@ -0,0 +1,1585 @@ +/* Definitions for CPP library. + Copyright (C) 1995-2022 Free Software Foundation, Inc. + Written by Per Bothner, 1994-95. + +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; see the file COPYING3. If not see +. + + In other words, you are welcome to use, share and improve this program. + You are forbidden to forbid anyone else to use, share and improve + what you give them. Help stamp out software-hoarding! */ +#ifndef LIBCPP_CPPLIB_H +#define LIBCPP_CPPLIB_H + +#include +#include "symtab.h" +#include "line-map.h" + +typedef struct cpp_reader cpp_reader; +typedef struct cpp_buffer cpp_buffer; +typedef struct cpp_options cpp_options; +typedef struct cpp_token cpp_token; +typedef struct cpp_string cpp_string; +typedef struct cpp_hashnode cpp_hashnode; +typedef struct cpp_macro cpp_macro; +typedef struct cpp_callbacks cpp_callbacks; +typedef struct cpp_dir cpp_dir; + +struct _cpp_file; + +/* The first three groups, apart from '=', can appear in preprocessor + expressions (+= and -= are used to indicate unary + and - resp.). + This allows a lookup table to be implemented in _cpp_parse_expr. 
+ + The first group, to CPP_LAST_EQ, can be immediately followed by an + '='. The lexer needs operators ending in '=', like ">>=", to be in + the same order as their counterparts without the '=', like ">>". + + See the cpp_operator table optab in expr.cc if you change the order or + add or remove anything in the first group. */ + +#define TTYPE_TABLE \ + OP(EQ, "=") \ + OP(NOT, "!") \ + OP(GREATER, ">") /* compare */ \ + OP(LESS, "<") \ + OP(PLUS, "+") /* math */ \ + OP(MINUS, "-") \ + OP(MULT, "*") \ + OP(DIV, "/") \ + OP(MOD, "%") \ + OP(AND, "&") /* bit ops */ \ + OP(OR, "|") \ + OP(XOR, "^") \ + OP(RSHIFT, ">>") \ + OP(LSHIFT, "<<") \ + \ + OP(COMPL, "~") \ + OP(AND_AND, "&&") /* logical */ \ + OP(OR_OR, "||") \ + OP(QUERY, "?") \ + OP(COLON, ":") \ + OP(COMMA, ",") /* grouping */ \ + OP(OPEN_PAREN, "(") \ + OP(CLOSE_PAREN, ")") \ + TK(EOF, NONE) \ + OP(EQ_EQ, "==") /* compare */ \ + OP(NOT_EQ, "!=") \ + OP(GREATER_EQ, ">=") \ + OP(LESS_EQ, "<=") \ + OP(SPACESHIP, "<=>") \ + \ + /* These two are unary + / - in preprocessor expressions. */ \ + OP(PLUS_EQ, "+=") /* math */ \ + OP(MINUS_EQ, "-=") \ + \ + OP(MULT_EQ, "*=") \ + OP(DIV_EQ, "/=") \ + OP(MOD_EQ, "%=") \ + OP(AND_EQ, "&=") /* bit ops */ \ + OP(OR_EQ, "|=") \ + OP(XOR_EQ, "^=") \ + OP(RSHIFT_EQ, ">>=") \ + OP(LSHIFT_EQ, "<<=") \ + /* Digraphs together, beginning with CPP_FIRST_DIGRAPH. */ \ + OP(HASH, "#") /* digraphs */ \ + OP(PASTE, "##") \ + OP(OPEN_SQUARE, "[") \ + OP(CLOSE_SQUARE, "]") \ + OP(OPEN_BRACE, "{") \ + OP(CLOSE_BRACE, "}") \ + /* The remainder of the punctuation. Order is not significant. */ \ + OP(SEMICOLON, ";") /* structure */ \ + OP(ELLIPSIS, "...") \ + OP(PLUS_PLUS, "++") /* increment */ \ + OP(MINUS_MINUS, "--") \ + OP(DEREF, "->") /* accessors */ \ + OP(DOT, ".") \ + OP(SCOPE, "::") \ + OP(DEREF_STAR, "->*") \ + OP(DOT_STAR, ".*") \ + OP(ATSIGN, "@") /* used in Objective-C */ \ + \ + TK(NAME, IDENT) /* word */ \ + TK(AT_NAME, IDENT) /* @word - Objective-C */ \ + TK(NUMBER, LITERAL) /* 34_be+ta */ \ + \ + TK(CHAR, LITERAL) /* 'char' */ \ + TK(WCHAR, LITERAL) /* L'char' */ \ + TK(CHAR16, LITERAL) /* u'char' */ \ + TK(CHAR32, LITERAL) /* U'char' */ \ + TK(UTF8CHAR, LITERAL) /* u8'char' */ \ + TK(OTHER, LITERAL) /* stray punctuation */ \ + \ + TK(STRING, LITERAL) /* "string" */ \ + TK(WSTRING, LITERAL) /* L"string" */ \ + TK(STRING16, LITERAL) /* u"string" */ \ + TK(STRING32, LITERAL) /* U"string" */ \ + TK(UTF8STRING, LITERAL) /* u8"string" */ \ + TK(OBJC_STRING, LITERAL) /* @"string" - Objective-C */ \ + TK(HEADER_NAME, LITERAL) /* in #include */ \ + \ + TK(CHAR_USERDEF, LITERAL) /* 'char'_suffix - C++-0x */ \ + TK(WCHAR_USERDEF, LITERAL) /* L'char'_suffix - C++-0x */ \ + TK(CHAR16_USERDEF, LITERAL) /* u'char'_suffix - C++-0x */ \ + TK(CHAR32_USERDEF, LITERAL) /* U'char'_suffix - C++-0x */ \ + TK(UTF8CHAR_USERDEF, LITERAL) /* u8'char'_suffix - C++-0x */ \ + TK(STRING_USERDEF, LITERAL) /* "string"_suffix - C++-0x */ \ + TK(WSTRING_USERDEF, LITERAL) /* L"string"_suffix - C++-0x */ \ + TK(STRING16_USERDEF, LITERAL) /* u"string"_suffix - C++-0x */ \ + TK(STRING32_USERDEF, LITERAL) /* U"string"_suffix - C++-0x */ \ + TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++-0x */ \ + \ + TK(COMMENT, LITERAL) /* Only if output comments. */ \ + /* SPELL_LITERAL happens to DTRT. */ \ + TK(MACRO_ARG, NONE) /* Macro argument. */ \ + TK(PRAGMA, NONE) /* Only for deferred pragmas. */ \ + TK(PRAGMA_EOL, NONE) /* End-of-line for deferred pragmas. */ \ + TK(PADDING, NONE) /* Whitespace for -E. 
*/ + +#define OP(e, s) CPP_ ## e, +#define TK(e, s) CPP_ ## e, +enum cpp_ttype +{ + TTYPE_TABLE + N_TTYPES, + + /* A token type for keywords, as opposed to ordinary identifiers. */ + CPP_KEYWORD, + + /* Positions in the table. */ + CPP_LAST_EQ = CPP_LSHIFT, + CPP_FIRST_DIGRAPH = CPP_HASH, + CPP_LAST_PUNCTUATOR= CPP_ATSIGN, + CPP_LAST_CPP_OP = CPP_LESS_EQ +}; +#undef OP +#undef TK + +/* C language kind, used when calling cpp_create_reader. */ +enum c_lang {CLK_GNUC89 = 0, CLK_GNUC99, CLK_GNUC11, CLK_GNUC17, CLK_GNUC2X, + CLK_STDC89, CLK_STDC94, CLK_STDC99, CLK_STDC11, CLK_STDC17, + CLK_STDC2X, + CLK_GNUCXX, CLK_CXX98, CLK_GNUCXX11, CLK_CXX11, + CLK_GNUCXX14, CLK_CXX14, CLK_GNUCXX17, CLK_CXX17, + CLK_GNUCXX20, CLK_CXX20, CLK_GNUCXX23, CLK_CXX23, + CLK_ASM}; + +/* Payload of a NUMBER, STRING, CHAR or COMMENT token. */ +struct GTY(()) cpp_string { + unsigned int len; + const unsigned char *text; +}; + +/* Flags for the cpp_token structure. */ +#define PREV_WHITE (1 << 0) /* If whitespace before this token. */ +#define DIGRAPH (1 << 1) /* If it was a digraph. */ +#define STRINGIFY_ARG (1 << 2) /* If macro argument to be stringified. */ +#define PASTE_LEFT (1 << 3) /* If on LHS of a ## operator. */ +#define NAMED_OP (1 << 4) /* C++ named operators. */ +#define PREV_FALLTHROUGH (1 << 5) /* On a token preceeded by FALLTHROUGH + comment. */ +#define BOL (1 << 6) /* Token at beginning of line. */ +#define PURE_ZERO (1 << 7) /* Single 0 digit, used by the C++ frontend, + set in c-lex.cc. */ +#define COLON_SCOPE PURE_ZERO /* Adjacent colons in C < 23. */ +#define SP_DIGRAPH (1 << 8) /* # or ## token was a digraph. */ +#define SP_PREV_WHITE (1 << 9) /* If whitespace before a ## + operator, or before this token + after a # operator. */ +#define NO_EXPAND (1 << 10) /* Do not macro-expand this token. */ +#define PRAGMA_OP (1 << 11) /* _Pragma token. */ + +/* Specify which field, if any, of the cpp_token union is used. */ + +enum cpp_token_fld_kind { + CPP_TOKEN_FLD_NODE, + CPP_TOKEN_FLD_SOURCE, + CPP_TOKEN_FLD_STR, + CPP_TOKEN_FLD_ARG_NO, + CPP_TOKEN_FLD_TOKEN_NO, + CPP_TOKEN_FLD_PRAGMA, + CPP_TOKEN_FLD_NONE +}; + +/* A macro argument in the cpp_token union. */ +struct GTY(()) cpp_macro_arg { + /* Argument number. */ + unsigned int arg_no; + /* The original spelling of the macro argument token. */ + cpp_hashnode * + GTY ((nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) + spelling; +}; + +/* An identifier in the cpp_token union. */ +struct GTY(()) cpp_identifier { + /* The canonical (UTF-8) spelling of the identifier. */ + cpp_hashnode * + GTY ((nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) + node; + /* The original spelling of the identifier. */ + cpp_hashnode * + GTY ((nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) + spelling; +}; + +/* A preprocessing token. This has been carefully packed and should + occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts. */ +struct GTY(()) cpp_token { + + /* Location of first char of token, together with range of full token. */ + location_t src_loc; + + ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT; /* token type */ + unsigned short flags; /* flags - see above */ + + union cpp_token_u + { + /* An identifier. 
*/ + struct cpp_identifier GTY ((tag ("CPP_TOKEN_FLD_NODE"))) node; + + /* Inherit padding from this token. */ + cpp_token * GTY ((tag ("CPP_TOKEN_FLD_SOURCE"))) source; + + /* A string, or number. */ + struct cpp_string GTY ((tag ("CPP_TOKEN_FLD_STR"))) str; + + /* Argument no. (and original spelling) for a CPP_MACRO_ARG. */ + struct cpp_macro_arg GTY ((tag ("CPP_TOKEN_FLD_ARG_NO"))) macro_arg; + + /* Original token no. for a CPP_PASTE (from a sequence of + consecutive paste tokens in a macro expansion). */ + unsigned int GTY ((tag ("CPP_TOKEN_FLD_TOKEN_NO"))) token_no; + + /* Caller-supplied identifier for a CPP_PRAGMA. */ + unsigned int GTY ((tag ("CPP_TOKEN_FLD_PRAGMA"))) pragma; + } GTY ((desc ("cpp_token_val_index (&%1)"))) val; +}; + +/* Say which field is in use. */ +extern enum cpp_token_fld_kind cpp_token_val_index (const cpp_token *tok); + +/* A type wide enough to hold any multibyte source character. + cpplib's character constant interpreter requires an unsigned type. + Also, a typedef for the signed equivalent. + The width of this type is capped at 32 bits; there do exist targets + where wchar_t is 64 bits, but only in a non-default mode, and there + would be no meaningful interpretation for a wchar_t value greater + than 2^32 anyway -- the widest wide-character encoding around is + ISO 10646, which stops at 2^31. */ +#if CHAR_BIT * SIZEOF_INT >= 32 +# define CPPCHAR_SIGNED_T int +#elif CHAR_BIT * SIZEOF_LONG >= 32 +# define CPPCHAR_SIGNED_T long +#else +# error "Cannot find a least-32-bit signed integer type" +#endif +typedef unsigned CPPCHAR_SIGNED_T cppchar_t; +typedef CPPCHAR_SIGNED_T cppchar_signed_t; + +/* Style of header dependencies to generate. */ +enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM }; + +/* The possible normalization levels, from most restrictive to least. */ +enum cpp_normalize_level { + /* In NFKC. */ + normalized_KC = 0, + /* In NFC. */ + normalized_C, + /* In NFC, except for subsequences where being in NFC would make + the identifier invalid. */ + normalized_identifier_C, + /* Not normalized at all. */ + normalized_none +}; + +enum cpp_main_search +{ + CMS_none, /* A regular source file. */ + CMS_header, /* Is a directly-specified header file (eg PCH or + header-unit). */ + CMS_user, /* Search the user INCLUDE path. */ + CMS_system, /* Search the system INCLUDE path. */ +}; + +/* The possible bidirectional control characters checking levels. */ +enum cpp_bidirectional_level { + /* No checking. */ + bidirectional_none = 0, + /* Only detect unpaired uses of bidirectional control characters. */ + bidirectional_unpaired = 1, + /* Detect any use of bidirectional control characters. */ + bidirectional_any = 2, + /* Also warn about UCNs. */ + bidirectional_ucn = 4 +}; + +/* This structure is nested inside struct cpp_reader, and + carries all the options visible to the command line. */ +struct cpp_options +{ + /* The language we're preprocessing. */ + enum c_lang lang; + + /* Nonzero means use extra default include directories for C++. */ + unsigned char cplusplus; + + /* Nonzero means handle cplusplus style comments. */ + unsigned char cplusplus_comments; + + /* Nonzero means define __OBJC__, treat @ as a special token, use + the OBJC[PLUS]_INCLUDE_PATH environment variable, and allow + "#import". */ + unsigned char objc; + + /* Nonzero means don't copy comments into the output file. */ + unsigned char discard_comments; + + /* Nonzero means don't copy comments into the output file during + macro expansion. 
*/ + unsigned char discard_comments_in_macro_exp; + + /* Nonzero means process the ISO trigraph sequences. */ + unsigned char trigraphs; + + /* Nonzero means process the ISO digraph sequences. */ + unsigned char digraphs; + + /* Nonzero means to allow hexadecimal floats and LL suffixes. */ + unsigned char extended_numbers; + + /* Nonzero means process u/U prefix literals (UTF-16/32). */ + unsigned char uliterals; + + /* Nonzero means process u8 prefixed character literals (UTF-8). */ + unsigned char utf8_char_literals; + + /* Nonzero means process r/R raw strings. If this is set, uliterals + must be set as well. */ + unsigned char rliterals; + + /* Nonzero means print names of header files (-H). */ + unsigned char print_include_names; + + /* Nonzero means complain about deprecated features. */ + unsigned char cpp_warn_deprecated; + + /* Nonzero means warn if slash-star appears in a comment. */ + unsigned char warn_comments; + + /* Nonzero means to warn about __DATA__, __TIME__ and __TIMESTAMP__ usage. */ + unsigned char warn_date_time; + + /* Nonzero means warn if a user-supplied include directory does not + exist. */ + unsigned char warn_missing_include_dirs; + + /* Nonzero means warn if there are any trigraphs. */ + unsigned char warn_trigraphs; + + /* Nonzero means warn about multicharacter charconsts. */ + unsigned char warn_multichar; + + /* Nonzero means warn about various incompatibilities with + traditional C. */ + unsigned char cpp_warn_traditional; + + /* Nonzero means warn about long long numeric constants. */ + unsigned char cpp_warn_long_long; + + /* Nonzero means warn about text after an #endif (or #else). */ + unsigned char warn_endif_labels; + + /* Nonzero means warn about implicit sign changes owing to integer + promotions. */ + unsigned char warn_num_sign_change; + + /* Zero means don't warn about __VA_ARGS__ usage in c89 pedantic mode. + Presumably the usage is protected by the appropriate #ifdef. */ + unsigned char warn_variadic_macros; + + /* Nonzero means warn about builtin macros that are redefined or + explicitly undefined. */ + unsigned char warn_builtin_macro_redefined; + + /* Different -Wimplicit-fallthrough= levels. */ + unsigned char cpp_warn_implicit_fallthrough; + + /* Nonzero means we should look for header.gcc files that remap file + names. */ + unsigned char remap; + + /* Zero means dollar signs are punctuation. */ + unsigned char dollars_in_ident; + + /* Nonzero means UCNs are accepted in identifiers. */ + unsigned char extended_identifiers; + + /* True if we should warn about dollars in identifiers or numbers + for this translation unit. */ + unsigned char warn_dollars; + + /* Nonzero means warn if undefined identifiers are evaluated in an #if. */ + unsigned char warn_undef; + + /* Nonzero means warn if "defined" is encountered in a place other than + an #if. */ + unsigned char warn_expansion_to_defined; + + /* Nonzero means warn of unused macros from the main file. */ + unsigned char warn_unused_macros; + + /* Nonzero for the 1999 C Standard, including corrigenda and amendments. */ + unsigned char c99; + + /* Nonzero if we are conforming to a specific C or C++ standard. */ + unsigned char std; + + /* Nonzero means give all the error messages the ANSI standard requires. */ + unsigned char cpp_pedantic; + + /* Nonzero means we're looking at already preprocessed code, so don't + bother trying to do macro expansion and whatnot. */ + unsigned char preprocessed; + + /* Nonzero means we are going to emit debugging logs during + preprocessing. 
*/ + unsigned char debug; + + /* Nonzero means we are tracking locations of tokens involved in + macro expansion. 1 Means we track the location in degraded mode + where we do not track locations of tokens resulting from the + expansion of arguments of function-like macro. 2 Means we do + track all macro expansions. This last option is the one that + consumes the highest amount of memory. */ + unsigned char track_macro_expansion; + + /* Nonzero means handle C++ alternate operator names. */ + unsigned char operator_names; + + /* Nonzero means warn about use of C++ alternate operator names. */ + unsigned char warn_cxx_operator_names; + + /* True for traditional preprocessing. */ + unsigned char traditional; + + /* Nonzero for C++ 2011 Standard user-defined literals. */ + unsigned char user_literals; + + /* Nonzero means warn when a string or character literal is followed by a + ud-suffix which does not beging with an underscore. */ + unsigned char warn_literal_suffix; + + /* Nonzero means interpret imaginary, fixed-point, or other gnu extension + literal number suffixes as user-defined literal number suffixes. */ + unsigned char ext_numeric_literals; + + /* Nonzero means extended identifiers allow the characters specified + in C11. */ + unsigned char c11_identifiers; + + /* Nonzero for C++ 2014 Standard binary constants. */ + unsigned char binary_constants; + + /* Nonzero for C++ 2014 Standard digit separators. */ + unsigned char digit_separators; + + /* Nonzero for C2X decimal floating-point constants. */ + unsigned char dfp_constants; + + /* Nonzero for C++20 __VA_OPT__ feature. */ + unsigned char va_opt; + + /* Nonzero for the '::' token. */ + unsigned char scope; + + /* Nonzero for the '#elifdef' and '#elifndef' directives. */ + unsigned char elifdef; + + /* Nonzero means tokenize C++20 module directives. */ + unsigned char module_directives; + + /* Nonzero for C++23 size_t literals. */ + unsigned char size_t_literals; + + /* Holds the name of the target (execution) character set. */ + const char *narrow_charset; + + /* Holds the name of the target wide character set. */ + const char *wide_charset; + + /* Holds the name of the input character set. */ + const char *input_charset; + + /* The minimum permitted level of normalization before a warning + is generated. See enum cpp_normalize_level. */ + int warn_normalize; + + /* True to warn about precompiled header files we couldn't use. */ + bool warn_invalid_pch; + + /* True if dependencies should be restored from a precompiled header. */ + bool restore_pch_deps; + + /* True if warn about differences between C90 and C99. */ + signed char cpp_warn_c90_c99_compat; + + /* True if warn about differences between C11 and C2X. */ + signed char cpp_warn_c11_c2x_compat; + + /* True if warn about differences between C++98 and C++11. */ + bool cpp_warn_cxx11_compat; + + /* Nonzero if bidirectional control characters checking is on. See enum + cpp_bidirectional_level. */ + unsigned char cpp_warn_bidirectional; + + /* Dependency generation. */ + struct + { + /* Style of header dependencies to generate. */ + enum cpp_deps_style style; + + /* Assume missing files are generated files. */ + bool missing_files; + + /* Generate phony targets for each dependency apart from the first + one. */ + bool phony_targets; + + /* Generate dependency info for modules. */ + bool modules; + + /* If true, no dependency is generated on the main file. 
*/ + bool ignore_main_file; + + /* If true, intend to use the preprocessor output (e.g., for compilation) + in addition to the dependency info. */ + bool need_preprocessor_output; + } deps; + + /* Target-specific features set by the front end or client. */ + + /* Precision for target CPP arithmetic, target characters, target + ints and target wide characters, respectively. */ + size_t precision, char_precision, int_precision, wchar_precision; + + /* True means chars (wide chars) are unsigned. */ + bool unsigned_char, unsigned_wchar; + + /* True if the most significant byte in a word has the lowest + address in memory. */ + bool bytes_big_endian; + + /* Nonzero means __STDC__ should have the value 0 in system headers. */ + unsigned char stdc_0_in_system_headers; + + /* True disables tokenization outside of preprocessing directives. */ + bool directives_only; + + /* True enables canonicalization of system header file paths. */ + bool canonical_system_headers; + + /* The maximum depth of the nested #include. */ + unsigned int max_include_depth; + + cpp_main_search main_search : 8; +}; + +/* Diagnostic levels. To get a diagnostic without associating a + position in the translation unit with it, use cpp_error_with_line + with a line number of zero. */ + +enum cpp_diagnostic_level { + /* Warning, an error with -Werror. */ + CPP_DL_WARNING = 0, + /* Same as CPP_DL_WARNING, except it is not suppressed in system headers. */ + CPP_DL_WARNING_SYSHDR, + /* Warning, an error with -pedantic-errors or -Werror. */ + CPP_DL_PEDWARN, + /* An error. */ + CPP_DL_ERROR, + /* An internal consistency check failed. Prints "internal error: ", + otherwise the same as CPP_DL_ERROR. */ + CPP_DL_ICE, + /* An informative note following a warning. */ + CPP_DL_NOTE, + /* A fatal error. */ + CPP_DL_FATAL +}; + +/* Warning reason codes. Use a reason code of CPP_W_NONE for unclassified + warnings and diagnostics that are not warnings. */ + +enum cpp_warning_reason { + CPP_W_NONE = 0, + CPP_W_DEPRECATED, + CPP_W_COMMENTS, + CPP_W_MISSING_INCLUDE_DIRS, + CPP_W_TRIGRAPHS, + CPP_W_MULTICHAR, + CPP_W_TRADITIONAL, + CPP_W_LONG_LONG, + CPP_W_ENDIF_LABELS, + CPP_W_NUM_SIGN_CHANGE, + CPP_W_VARIADIC_MACROS, + CPP_W_BUILTIN_MACRO_REDEFINED, + CPP_W_DOLLARS, + CPP_W_UNDEF, + CPP_W_UNUSED_MACROS, + CPP_W_CXX_OPERATOR_NAMES, + CPP_W_NORMALIZE, + CPP_W_INVALID_PCH, + CPP_W_WARNING_DIRECTIVE, + CPP_W_LITERAL_SUFFIX, + CPP_W_SIZE_T_LITERALS, + CPP_W_DATE_TIME, + CPP_W_PEDANTIC, + CPP_W_C90_C99_COMPAT, + CPP_W_C11_C2X_COMPAT, + CPP_W_CXX11_COMPAT, + CPP_W_EXPANSION_TO_DEFINED, + CPP_W_BIDIRECTIONAL +}; + +/* Callback for header lookup for HEADER, which is the name of a + source file. It is used as a method of last resort to find headers + that are not otherwise found during the normal include processing. + The return value is the malloced name of a header to try and open, + if any, or NULL otherwise. This callback is called only if the + header is otherwise unfound. */ +typedef const char *(*missing_header_cb)(cpp_reader *, const char *header, cpp_dir **); + +/* Call backs to cpplib client. */ +struct cpp_callbacks +{ + /* Called when a new line of preprocessed output is started. */ + void (*line_change) (cpp_reader *, const cpp_token *, int); + + /* Called when switching to/from a new file. + The line_map is for the new file. It is NULL if there is no new file. + (In C this happens when done with + and also + when done with a main file.) This can be used for resource cleanup. 
*/ + void (*file_change) (cpp_reader *, const line_map_ordinary *); + + void (*dir_change) (cpp_reader *, const char *); + void (*include) (cpp_reader *, location_t, const unsigned char *, + const char *, int, const cpp_token **); + void (*define) (cpp_reader *, location_t, cpp_hashnode *); + void (*undef) (cpp_reader *, location_t, cpp_hashnode *); + void (*ident) (cpp_reader *, location_t, const cpp_string *); + void (*def_pragma) (cpp_reader *, location_t); + int (*valid_pch) (cpp_reader *, const char *, int); + void (*read_pch) (cpp_reader *, const char *, int, const char *); + missing_header_cb missing_header; + + /* Context-sensitive macro support. Returns macro (if any) that should + be expanded. */ + cpp_hashnode * (*macro_to_expand) (cpp_reader *, const cpp_token *); + + /* Called to emit a diagnostic. This callback receives the + translated message. */ + bool (*diagnostic) (cpp_reader *, + enum cpp_diagnostic_level, + enum cpp_warning_reason, + rich_location *, + const char *, va_list *) + ATTRIBUTE_FPTR_PRINTF(5,0); + + /* Callbacks for when a macro is expanded, or tested (whether + defined or not at the time) in #ifdef, #ifndef or "defined". */ + void (*used_define) (cpp_reader *, location_t, cpp_hashnode *); + void (*used_undef) (cpp_reader *, location_t, cpp_hashnode *); + /* Called before #define and #undef or other macro definition + changes are processed. */ + void (*before_define) (cpp_reader *); + /* Called whenever a macro is expanded or tested. + Second argument is the location of the start of the current expansion. */ + void (*used) (cpp_reader *, location_t, cpp_hashnode *); + + /* Callback to identify whether an attribute exists. */ + int (*has_attribute) (cpp_reader *, bool); + + /* Callback to determine whether a built-in function is recognized. */ + int (*has_builtin) (cpp_reader *); + + /* Callback that can change a user lazy into normal macro. */ + void (*user_lazy_macro) (cpp_reader *, cpp_macro *, unsigned); + + /* Callback to handle deferred cpp_macros. */ + cpp_macro *(*user_deferred_macro) (cpp_reader *, location_t, cpp_hashnode *); + + /* Callback to parse SOURCE_DATE_EPOCH from environment. */ + time_t (*get_source_date_epoch) (cpp_reader *); + + /* Callback for providing suggestions for misspelled directives. */ + const char *(*get_suggestion) (cpp_reader *, const char *, const char *const *); + + /* Callback for when a comment is encountered, giving the location + of the opening slash, a pointer to the content (which is not + necessarily 0-terminated), and the length of the content. + The content contains the opening slash-star (or slash-slash), + and for C-style comments contains the closing star-slash. For + C++-style comments it does not include the terminating newline. */ + void (*comment) (cpp_reader *, location_t, const unsigned char *, + size_t); + + /* Callback for filename remapping in __FILE__ and __BASE_FILE__ macro + expansions. */ + const char *(*remap_filename) (const char*); + + /* Maybe translate a #include into something else. Return a + cpp_buffer containing the translation if translating. */ + char *(*translate_include) (cpp_reader *, line_maps *, location_t, + const char *path); +}; + +#ifdef VMS +#define INO_T_CPP ino_t ino[3] +#elif defined (_AIX) && SIZEOF_INO_T == 4 +#define INO_T_CPP ino64_t ino +#else +#define INO_T_CPP ino_t ino +#endif + +#if defined (_AIX) && SIZEOF_DEV_T == 4 +#define DEV_T_CPP dev64_t dev +#else +#define DEV_T_CPP dev_t dev +#endif + +/* Chain of directories to look for include files in. 
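A client customizes behaviour by filling slots in the callbacks structure above, obtained through cpp_get_callbacks declared earlier. A minimal sketch that logs every macro definition as it is processed; note_define is a hypothetical callback, and NODE_NAME is the accessor defined elsewhere in this header:

#include <stdio.h>

// Hypothetical callback matching the `define' slot above: report each
// macro definition by name.
static void
note_define (cpp_reader *pfile, location_t loc, cpp_hashnode *node)
{
  (void) pfile;
  (void) loc;
  fprintf (stderr, "defined: %s\n", (const char *) NODE_NAME (node));
}

// Installation, once a cpp_reader exists:
//   cpp_get_callbacks (pfile)->define = note_define;
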
*/ +struct cpp_dir +{ + /* NULL-terminated singly-linked list. */ + struct cpp_dir *next; + + /* NAME of the directory, NUL-terminated. */ + char *name; + unsigned int len; + + /* One if a system header, two if a system header that has extern + "C" guards for C++. */ + unsigned char sysp; + + /* Is this a user-supplied directory? */ + bool user_supplied_p; + + /* The canonicalized NAME as determined by lrealpath. This field + is only used by hosts that lack reliable inode numbers. */ + char *canonical_name; + + /* Mapping of file names for this directory for MS-DOS and related + platforms. A NULL-terminated array of (from, to) pairs. */ + const char **name_map; + + /* Routine to construct pathname, given the search path name and the + HEADER we are trying to find, return a constructed pathname to + try and open. If this is NULL, the constructed pathname is as + constructed by append_file_to_dir. */ + char *(*construct) (const char *header, cpp_dir *dir); + + /* The C front end uses these to recognize duplicated + directories in the search path. */ + INO_T_CPP; + DEV_T_CPP; +}; + +/* The kind of the cpp_macro. */ +enum cpp_macro_kind { + cmk_macro, /* An ISO macro (token expansion). */ + cmk_assert, /* An assertion. */ + cmk_traditional /* A traditional macro (text expansion). */ +}; + +/* Each macro definition is recorded in a cpp_macro structure. + Variadic macros cannot occur with traditional cpp. */ +struct GTY(()) cpp_macro { + union cpp_parm_u + { + /* Parameters, if any. If parameter names use extended identifiers, + the original spelling of those identifiers, not the canonical + UTF-8 spelling, goes here. */ + cpp_hashnode ** GTY ((tag ("false"), + nested_ptr (union tree_node, + "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", + "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"), + length ("%1.paramc"))) params; + + /* If this is an assertion, the next one in the chain. */ + cpp_macro *GTY ((tag ("true"))) next; + } GTY ((desc ("%1.kind == cmk_assert"))) parm; + + /* Definition line number. */ + location_t line; + + /* Number of tokens in body, or bytes for traditional macros. */ + /* Do we really need 2^32-1 range here? */ + unsigned int count; + + /* Number of parameters. */ + unsigned short paramc; + + /* Non-zero if this is a user-lazy macro, value provided by user. */ + unsigned char lazy; + + /* The kind of this macro (ISO, trad or assert) */ + unsigned kind : 2; + + /* If a function-like macro. */ + unsigned int fun_like : 1; + + /* If a variadic macro. */ + unsigned int variadic : 1; + + /* If macro defined in system header. */ + unsigned int syshdr : 1; + + /* Nonzero if it has been expanded or had its existence tested. */ + unsigned int used : 1; + + /* Indicate whether the tokens include extra CPP_PASTE tokens at the + end to track invalid redefinitions with consecutive CPP_PASTE + tokens. */ + unsigned int extra_tokens : 1; + + /* Imported C++20 macro (from a header unit). */ + unsigned int imported_p : 1; + + /* 0 bits spare (32-bit). 32 on 64-bit target. */ + + union cpp_exp_u + { + /* Trailing array of replacement tokens (ISO), or assertion body value. */ + cpp_token GTY ((tag ("false"), length ("%1.count"))) tokens[1]; + + /* Pointer to replacement text (traditional). See comment at top + of cpptrad.c for how traditional function-like macros are + encoded. */ + const unsigned char *GTY ((tag ("true"))) text; + } GTY ((desc ("%1.kind == cmk_traditional"))) exp; +}; + +/* Poisoned identifiers are flagged NODE_POISONED. 
NODE_OPERATOR (C++ + only) indicates an identifier that behaves like an operator such as + "xor". NODE_DIAGNOSTIC is for speed in lex_token: it indicates a + diagnostic may be required for this node. Currently this only + applies to __VA_ARGS__, poisoned identifiers, and -Wc++-compat + warnings about NODE_OPERATOR. */ + +/* Hash node flags. */ +#define NODE_OPERATOR (1 << 0) /* C++ named operator. */ +#define NODE_POISONED (1 << 1) /* Poisoned identifier. */ +#define NODE_DIAGNOSTIC (1 << 2) /* Possible diagnostic when lexed. */ +#define NODE_WARN (1 << 3) /* Warn if redefined or undefined. */ +#define NODE_DISABLED (1 << 4) /* A disabled macro. */ +#define NODE_USED (1 << 5) /* Dumped with -dU. */ +#define NODE_CONDITIONAL (1 << 6) /* Conditional macro */ +#define NODE_WARN_OPERATOR (1 << 7) /* Warn about C++ named operator. */ +#define NODE_MODULE (1 << 8) /* C++-20 module-related name. */ + +/* Different flavors of hash node. */ +enum node_type +{ + NT_VOID = 0, /* Maybe an assert? */ + NT_MACRO_ARG, /* A macro arg. */ + NT_USER_MACRO, /* A user macro. */ + NT_BUILTIN_MACRO, /* A builtin macro. */ + NT_MACRO_MASK = NT_USER_MACRO /* Mask for either macro kind. */ +}; + +/* Different flavors of builtin macro. _Pragma is an operator, but we + handle it with the builtin code for efficiency reasons. */ +enum cpp_builtin_type +{ + BT_SPECLINE = 0, /* `__LINE__' */ + BT_DATE, /* `__DATE__' */ + BT_FILE, /* `__FILE__' */ + BT_FILE_NAME, /* `__FILE_NAME__' */ + BT_BASE_FILE, /* `__BASE_FILE__' */ + BT_INCLUDE_LEVEL, /* `__INCLUDE_LEVEL__' */ + BT_TIME, /* `__TIME__' */ + BT_STDC, /* `__STDC__' */ + BT_PRAGMA, /* `_Pragma' operator */ + BT_TIMESTAMP, /* `__TIMESTAMP__' */ + BT_COUNTER, /* `__COUNTER__' */ + BT_HAS_ATTRIBUTE, /* `__has_attribute(x)' */ + BT_HAS_STD_ATTRIBUTE, /* `__has_c_attribute(x)' */ + BT_HAS_BUILTIN, /* `__has_builtin(x)' */ + BT_HAS_INCLUDE, /* `__has_include(x)' */ + BT_HAS_INCLUDE_NEXT, /* `__has_include_next(x)' */ + + // RT Extension + BT_RT_ASSIGN, + BT_RT_TO_ARG_LIST, + BT_RT_TO_TOKEN_LIST, + BT_RT_FIRST, + BT_RT_REST, + BT_RT_MAP, + BT_RT_AL_MAP, + BT_RT_IF, + BT_RT_NOT, + BT_RT_AND, + BT_RT_OR, + BT_RT_IS_IDENTIFIER, + BT_RT_IS_NAME, + BT_RT_PASTE +}; + +#define CPP_HASHNODE(HNODE) ((cpp_hashnode *) (HNODE)) +#define HT_NODE(NODE) (&(NODE)->ident) +#define NODE_LEN(NODE) HT_LEN (HT_NODE (NODE)) +#define NODE_NAME(NODE) HT_STR (HT_NODE (NODE)) + +/* The common part of an identifier node shared amongst all 3 C front + ends. Also used to store CPP identifiers, which are a superset of + identifiers in the grammatical sense. */ + +union GTY(()) _cpp_hashnode_value { + /* Assert (maybe NULL) */ + cpp_macro * GTY((tag ("NT_VOID"))) answers; + /* Macro (maybe NULL) */ + cpp_macro * GTY((tag ("NT_USER_MACRO"))) macro; + /* Code for a builtin macro. */ + enum cpp_builtin_type GTY ((tag ("NT_BUILTIN_MACRO"))) builtin; + /* Macro argument index. */ + unsigned short GTY ((tag ("NT_MACRO_ARG"))) arg_index; +}; + +struct GTY(()) cpp_hashnode { + struct ht_identifier ident; + unsigned int is_directive : 1; + unsigned int directive_index : 7; /* If is_directive, + then index into directive table. + Otherwise, a NODE_OPERATOR. */ + unsigned int rid_code : 8; /* Rid code - for front ends. */ + unsigned int flags : 9; /* CPP flags. */ + ENUM_BITFIELD(node_type) type : 2; /* CPP node type. */ + + /* 5 bits spare. */ + + /* The deferred cookie is applicable to NT_USER_MACRO or NT_VOID. + The latter for when a macro had a prevailing undef. 
+ On a 64-bit system there would be 32-bits of padding to the value + field. So placing the deferred index here is not costly. */ + unsigned deferred; /* Deferred cookie */ + + union _cpp_hashnode_value GTY ((desc ("%1.type"))) value; +}; + +/* A class for iterating through the source locations within a + string token (before escapes are interpreted, and before + concatenation). */ + +class cpp_string_location_reader { + public: + cpp_string_location_reader (location_t src_loc, + line_maps *line_table); + + source_range get_next (); + + private: + location_t m_loc; + int m_offset_per_column; +}; + +/* A class for storing the source ranges of all of the characters within + a string literal, after escapes are interpreted, and after + concatenation. + + This is not GTY-marked, as instances are intended to be temporary. */ + +class cpp_substring_ranges +{ + public: + cpp_substring_ranges (); + ~cpp_substring_ranges (); + + int get_num_ranges () const { return m_num_ranges; } + source_range get_range (int idx) const + { + linemap_assert (idx < m_num_ranges); + return m_ranges[idx]; + } + + void add_range (source_range range); + void add_n_ranges (int num, cpp_string_location_reader &loc_reader); + + private: + source_range *m_ranges; + int m_num_ranges; + int m_alloc_ranges; +}; + +/* Call this first to get a handle to pass to other functions. + + If you want cpplib to manage its own hashtable, pass in a NULL + pointer. Otherwise you should pass in an initialized hash table + that cpplib will share; this technique is used by the C front + ends. */ +extern cpp_reader *cpp_create_reader (enum c_lang, struct ht *, + class line_maps *); + +/* Reset the cpp_reader's line_map. This is only used after reading a + PCH file. */ +extern void cpp_set_line_map (cpp_reader *, class line_maps *); + +/* Call this to change the selected language standard (e.g. because of + command line options). */ +extern void cpp_set_lang (cpp_reader *, enum c_lang); + +/* Set the include paths. */ +extern void cpp_set_include_chains (cpp_reader *, cpp_dir *, cpp_dir *, int); + +/* Call these to get pointers to the options, callback, and deps + structures for a given reader. These pointers are good until you + call cpp_finish on that reader. You can either edit the callbacks + through the pointer returned from cpp_get_callbacks, or set them + with cpp_set_callbacks. */ +extern cpp_options *cpp_get_options (cpp_reader *) ATTRIBUTE_PURE; +extern cpp_callbacks *cpp_get_callbacks (cpp_reader *) ATTRIBUTE_PURE; +extern void cpp_set_callbacks (cpp_reader *, cpp_callbacks *); +extern class mkdeps *cpp_get_deps (cpp_reader *) ATTRIBUTE_PURE; + +extern const char *cpp_probe_header_unit (cpp_reader *, const char *file, + bool angle_p, location_t); + +/* Call these to get name data about the various compile-time + charsets. */ +extern const char *cpp_get_narrow_charset_name (cpp_reader *) ATTRIBUTE_PURE; +extern const char *cpp_get_wide_charset_name (cpp_reader *) ATTRIBUTE_PURE; + +/* This function reads the file, but does not start preprocessing. It + returns the name of the original file; this is the same as the + input file, except for preprocessed input. This will generate at + least one file change callback, and possibly a line change callback + too. If there was an error opening the file, it returns NULL. */ +extern const char *cpp_read_main_file (cpp_reader *, const char *, + bool injecting = false); +extern location_t cpp_main_loc (const cpp_reader *); + +/* Adjust for the main file to be an include. 
*/ +extern void cpp_retrofit_as_include (cpp_reader *); + +/* Set up built-ins with special behavior. Use cpp_init_builtins() + instead unless your know what you are doing. */ +extern void cpp_init_special_builtins (cpp_reader *); + +/* Set up built-ins like __FILE__. */ +extern void cpp_init_builtins (cpp_reader *, int); + +/* This is called after options have been parsed, and partially + processed. */ +extern void cpp_post_options (cpp_reader *); + +/* Set up translation to the target character set. */ +extern void cpp_init_iconv (cpp_reader *); + +/* Call this to finish preprocessing. If you requested dependency + generation, pass an open stream to write the information to, + otherwise NULL. It is your responsibility to close the stream. */ +extern void cpp_finish (cpp_reader *, FILE *deps_stream); + +/* Call this to release the handle at the end of preprocessing. Any + use of the handle after this function returns is invalid. */ +extern void cpp_destroy (cpp_reader *); + +extern unsigned int cpp_token_len (const cpp_token *); +extern unsigned char *cpp_token_as_text (cpp_reader *, const cpp_token *); +extern unsigned char *cpp_spell_token (cpp_reader *, const cpp_token *, + unsigned char *, bool); +extern void cpp_register_pragma (cpp_reader *, const char *, const char *, + void (*) (cpp_reader *), bool); +extern void cpp_register_deferred_pragma (cpp_reader *, const char *, + const char *, unsigned, bool, bool); +extern int cpp_avoid_paste (cpp_reader *, const cpp_token *, + const cpp_token *); +extern const cpp_token *cpp_get_token (cpp_reader *); +extern const cpp_token *cpp_get_token_with_location (cpp_reader *, + location_t *); +inline bool cpp_user_macro_p (const cpp_hashnode *node) +{ + return node->type == NT_USER_MACRO; +} +inline bool cpp_builtin_macro_p (const cpp_hashnode *node) +{ + return node->type == NT_BUILTIN_MACRO; +} +inline bool cpp_macro_p (const cpp_hashnode *node) +{ + return node->type & NT_MACRO_MASK; +} +inline cpp_macro *cpp_set_deferred_macro (cpp_hashnode *node, + cpp_macro *forced = NULL) +{ + cpp_macro *old = node->value.macro; + + node->value.macro = forced; + node->type = NT_USER_MACRO; + node->flags &= ~NODE_USED; + + return old; +} +cpp_macro *cpp_get_deferred_macro (cpp_reader *, cpp_hashnode *, location_t); + +/* Returns true if NODE is a function-like user macro. */ +inline bool cpp_fun_like_macro_p (cpp_hashnode *node) +{ + return cpp_user_macro_p (node) && node->value.macro->fun_like; +} + +extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *); +extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *, + const cpp_macro *); +inline location_t cpp_macro_definition_location (cpp_hashnode *node) +{ + const cpp_macro *macro = node->value.macro; + return macro ? macro->line : 0; +} +/* Return an idempotent time stamp (possibly from SOURCE_DATE_EPOCH). */ +enum class CPP_time_kind +{ + FIXED = -1, /* Fixed time via source epoch. */ + DYNAMIC = -2, /* Dynamic via time(2). */ + UNKNOWN = -3 /* Wibbly wobbly, timey wimey. */ +}; +extern CPP_time_kind cpp_get_date (cpp_reader *, time_t *); + +extern void _cpp_backup_tokens (cpp_reader *, unsigned int); +extern const cpp_token *cpp_peek_token (cpp_reader *, int); + +/* Evaluate a CPP_*CHAR* token. */ +extern cppchar_t cpp_interpret_charconst (cpp_reader *, const cpp_token *, + unsigned int *, int *); +/* Evaluate a vector of CPP_*STRING* tokens. 
*/ +extern bool cpp_interpret_string (cpp_reader *, + const cpp_string *, size_t, + cpp_string *, enum cpp_ttype); +extern const char *cpp_interpret_string_ranges (cpp_reader *pfile, + const cpp_string *from, + cpp_string_location_reader *, + size_t count, + cpp_substring_ranges *out, + enum cpp_ttype type); +extern bool cpp_interpret_string_notranslate (cpp_reader *, + const cpp_string *, size_t, + cpp_string *, enum cpp_ttype); + +/* Convert a host character constant to the execution character set. */ +extern cppchar_t cpp_host_to_exec_charset (cpp_reader *, cppchar_t); + +/* Used to register macros and assertions, perhaps from the command line. + The text is the same as the command line argument. */ +extern void cpp_define (cpp_reader *, const char *); +extern void cpp_define_unused (cpp_reader *, const char *); +extern void cpp_define_formatted (cpp_reader *pfile, + const char *fmt, ...) ATTRIBUTE_PRINTF_2; +extern void cpp_define_formatted_unused (cpp_reader *pfile, + const char *fmt, + ...) ATTRIBUTE_PRINTF_2; +extern void cpp_assert (cpp_reader *, const char *); +extern void cpp_undef (cpp_reader *, const char *); +extern void cpp_unassert (cpp_reader *, const char *); + +/* Mark a node as a lazily defined macro. */ +extern void cpp_define_lazily (cpp_reader *, cpp_hashnode *node, unsigned N); + +/* Undefine all macros and assertions. */ +extern void cpp_undef_all (cpp_reader *); + +extern cpp_buffer *cpp_push_buffer (cpp_reader *, const unsigned char *, + size_t, int); +extern int cpp_defined (cpp_reader *, const unsigned char *, int); + +/* A preprocessing number. Code assumes that any unused high bits of + the double integer are set to zero. */ + +/* This type has to be equal to unsigned HOST_WIDE_INT, see + gcc/c-family/c-lex.cc. */ +typedef uint64_t cpp_num_part; +typedef struct cpp_num cpp_num; +struct cpp_num +{ + cpp_num_part high; + cpp_num_part low; + bool unsignedp; /* True if value should be treated as unsigned. */ + bool overflow; /* True if the most recent calculation overflowed. */ +}; + +/* cpplib provides two interfaces for interpretation of preprocessing + numbers. + + cpp_classify_number categorizes numeric constants according to + their field (integer, floating point, or invalid), radix (decimal, + octal, hexadecimal), and type suffixes. */ + +#define CPP_N_CATEGORY 0x000F +#define CPP_N_INVALID 0x0000 +#define CPP_N_INTEGER 0x0001 +#define CPP_N_FLOATING 0x0002 + +#define CPP_N_WIDTH 0x00F0 +#define CPP_N_SMALL 0x0010 /* int, float, short _Fract/Accum */ +#define CPP_N_MEDIUM 0x0020 /* long, double, long _Fract/_Accum. */ +#define CPP_N_LARGE 0x0040 /* long long, long double, + long long _Fract/Accum. */ + +#define CPP_N_WIDTH_MD 0xF0000 /* machine defined. */ +#define CPP_N_MD_W 0x10000 +#define CPP_N_MD_Q 0x20000 + +#define CPP_N_RADIX 0x0F00 +#define CPP_N_DECIMAL 0x0100 +#define CPP_N_HEX 0x0200 +#define CPP_N_OCTAL 0x0400 +#define CPP_N_BINARY 0x0800 + +#define CPP_N_UNSIGNED 0x1000 /* Properties. */ +#define CPP_N_IMAGINARY 0x2000 +#define CPP_N_DFLOAT 0x4000 +#define CPP_N_DEFAULT 0x8000 + +#define CPP_N_FRACT 0x100000 /* Fract types. */ +#define CPP_N_ACCUM 0x200000 /* Accum types. */ +#define CPP_N_FLOATN 0x400000 /* _FloatN types. */ +#define CPP_N_FLOATNX 0x800000 /* _FloatNx types. */ + +#define CPP_N_USERDEF 0x1000000 /* C++11 user-defined literal. */ + +#define CPP_N_SIZE_T 0x2000000 /* C++23 size_t literal. */ + +#define CPP_N_WIDTH_FLOATN_NX 0xF0000000 /* _FloatN / _FloatNx value + of N, divided by 16. 
*/ +#define CPP_FLOATN_SHIFT 24 +#define CPP_FLOATN_MAX 0xF0 + +/* Classify a CPP_NUMBER token. The return value is a combination of + the flags from the above sets. */ +extern unsigned cpp_classify_number (cpp_reader *, const cpp_token *, + const char **, location_t); + +/* Return the classification flags for a float suffix. */ +extern unsigned int cpp_interpret_float_suffix (cpp_reader *, const char *, + size_t); + +/* Return the classification flags for an int suffix. */ +extern unsigned int cpp_interpret_int_suffix (cpp_reader *, const char *, + size_t); + +/* Evaluate a token classified as category CPP_N_INTEGER. */ +extern cpp_num cpp_interpret_integer (cpp_reader *, const cpp_token *, + unsigned int); + +/* Sign extend a number, with PRECISION significant bits and all + others assumed clear, to fill out a cpp_num structure. */ +cpp_num cpp_num_sign_extend (cpp_num, size_t); + +/* Output a diagnostic of some kind. */ +extern bool cpp_error (cpp_reader *, enum cpp_diagnostic_level, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; +extern bool cpp_warning (cpp_reader *, enum cpp_warning_reason, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; +extern bool cpp_pedwarning (cpp_reader *, enum cpp_warning_reason, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; +extern bool cpp_warning_syshdr (cpp_reader *, enum cpp_warning_reason reason, + const char *msgid, ...) + ATTRIBUTE_PRINTF_3; + +/* As their counterparts above, but use RICHLOC. */ +extern bool cpp_warning_at (cpp_reader *, enum cpp_warning_reason, + rich_location *richloc, const char *msgid, ...) + ATTRIBUTE_PRINTF_4; +extern bool cpp_pedwarning_at (cpp_reader *, enum cpp_warning_reason, + rich_location *richloc, const char *msgid, ...) + ATTRIBUTE_PRINTF_4; + +/* Output a diagnostic with "MSGID: " preceding the + error string of errno. No location is printed. */ +extern bool cpp_errno (cpp_reader *, enum cpp_diagnostic_level, + const char *msgid); +/* Similarly, but with "FILENAME: " instead of "MSGID: ", where + the filename is not localized. */ +extern bool cpp_errno_filename (cpp_reader *, enum cpp_diagnostic_level, + const char *filename, location_t loc); + +/* Same as cpp_error, except additionally specifies a position as a + (translation unit) physical line and physical column. If the line is + zero, then no location is printed. */ +extern bool cpp_error_with_line (cpp_reader *, enum cpp_diagnostic_level, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; +extern bool cpp_warning_with_line (cpp_reader *, enum cpp_warning_reason, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; +extern bool cpp_pedwarning_with_line (cpp_reader *, enum cpp_warning_reason, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; +extern bool cpp_warning_with_line_syshdr (cpp_reader *, enum cpp_warning_reason, + location_t, unsigned, + const char *msgid, ...) + ATTRIBUTE_PRINTF_5; + +extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, + location_t src_loc, const char *msgid, ...) + ATTRIBUTE_PRINTF_4; + +extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, + rich_location *richloc, const char *msgid, ...) 
+ ATTRIBUTE_PRINTF_4; + +/* In lex.cc */ +extern int cpp_ideq (const cpp_token *, const char *); +extern void cpp_output_line (cpp_reader *, FILE *); +extern unsigned char *cpp_output_line_to_string (cpp_reader *, + const unsigned char *); +extern const unsigned char *cpp_alloc_token_string + (cpp_reader *, const unsigned char *, unsigned); +extern void cpp_output_token (const cpp_token *, FILE *); +extern const char *cpp_type2name (enum cpp_ttype, unsigned char flags); +/* Returns the value of an escape sequence, truncated to the correct + target precision. PSTR points to the input pointer, which is just + after the backslash. LIMIT is how much text we have. WIDE is true + if the escape sequence is part of a wide character constant or + string literal. Handles all relevant diagnostics. */ +extern cppchar_t cpp_parse_escape (cpp_reader *, const unsigned char ** pstr, + const unsigned char *limit, int wide); + +/* Structure used to hold a comment block at a given location in the + source code. */ + +typedef struct +{ + /* Text of the comment including the terminators. */ + char *comment; + + /* source location for the given comment. */ + location_t sloc; +} cpp_comment; + +/* Structure holding all comments for a given cpp_reader. */ + +typedef struct +{ + /* table of comment entries. */ + cpp_comment *entries; + + /* number of actual entries entered in the table. */ + int count; + + /* number of entries allocated currently. */ + int allocated; +} cpp_comment_table; + +/* Returns the table of comments encountered by the preprocessor. This + table is only populated when pfile->state.save_comments is true. */ +extern cpp_comment_table *cpp_get_comments (cpp_reader *); + +/* In hash.c */ + +/* Lookup an identifier in the hashtable. Puts the identifier in the + table if it is not already there. */ +extern cpp_hashnode *cpp_lookup (cpp_reader *, const unsigned char *, + unsigned int); + +typedef int (*cpp_cb) (cpp_reader *, cpp_hashnode *, void *); +extern void cpp_forall_identifiers (cpp_reader *, cpp_cb, void *); + +/* In macro.cc */ +extern void cpp_scan_nooutput (cpp_reader *); +extern int cpp_sys_macro_p (cpp_reader *); +extern unsigned char *cpp_quote_string (unsigned char *, const unsigned char *, + unsigned int); +extern bool cpp_compare_macros (const cpp_macro *macro1, + const cpp_macro *macro2); + +/* In files.cc */ +extern bool cpp_included (cpp_reader *, const char *); +extern bool cpp_included_before (cpp_reader *, const char *, location_t); +extern void cpp_make_system_header (cpp_reader *, int, int); +extern bool cpp_push_include (cpp_reader *, const char *); +extern bool cpp_push_default_include (cpp_reader *, const char *); +extern void cpp_change_file (cpp_reader *, enum lc_reason, const char *); +extern const char *cpp_get_path (struct _cpp_file *); +extern cpp_dir *cpp_get_dir (struct _cpp_file *); +extern cpp_buffer *cpp_get_buffer (cpp_reader *); +extern struct _cpp_file *cpp_get_file (cpp_buffer *); +extern cpp_buffer *cpp_get_prev (cpp_buffer *); +extern void cpp_clear_file_cache (cpp_reader *); + +/* cpp_get_converted_source returns the contents of the given file, as it exists + after cpplib has read it and converted it from the input charset to the + source charset. Return struct will be zero-filled if the data could not be + read for any reason. The data starts at the DATA pointer, but the TO_FREE + pointer is what should be passed to free(), as there may be an offset. 
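+
+   A minimal usage sketch (the file name and charset below are only
+   placeholders):
+
+     cpp_converted_source cs
+       = cpp_get_converted_source ("prog.c", "UTF-8");
+     if (cs.data)
+       {
+         fwrite (cs.data, 1, cs.len, stdout);   // consume converted text
+         free (cs.to_free);                     // free TO_FREE, not DATA
+       }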
*/ +struct cpp_converted_source +{ + char *to_free; + char *data; + size_t len; +}; +cpp_converted_source cpp_get_converted_source (const char *fname, + const char *input_charset); + +/* In pch.cc */ +struct save_macro_data; +extern int cpp_save_state (cpp_reader *, FILE *); +extern int cpp_write_pch_deps (cpp_reader *, FILE *); +extern int cpp_write_pch_state (cpp_reader *, FILE *); +extern int cpp_valid_state (cpp_reader *, const char *, int); +extern void cpp_prepare_state (cpp_reader *, struct save_macro_data **); +extern int cpp_read_state (cpp_reader *, const char *, FILE *, + struct save_macro_data *); + +/* In lex.cc */ +extern void cpp_force_token_locations (cpp_reader *, location_t); +extern void cpp_stop_forcing_token_locations (cpp_reader *); +enum CPP_DO_task +{ + CPP_DO_print, + CPP_DO_location, + CPP_DO_token +}; + +extern void cpp_directive_only_process (cpp_reader *pfile, + void *data, + void (*cb) (cpp_reader *, + CPP_DO_task, + void *data, ...)); + +/* In expr.cc */ +extern enum cpp_ttype cpp_userdef_string_remove_type + (enum cpp_ttype type); +extern enum cpp_ttype cpp_userdef_string_add_type + (enum cpp_ttype type); +extern enum cpp_ttype cpp_userdef_char_remove_type + (enum cpp_ttype type); +extern enum cpp_ttype cpp_userdef_char_add_type + (enum cpp_ttype type); +extern bool cpp_userdef_string_p + (enum cpp_ttype type); +extern bool cpp_userdef_char_p + (enum cpp_ttype type); +extern const char * cpp_get_userdef_suffix + (const cpp_token *); + +/* In charset.cc */ + +/* The result of attempting to decode a run of UTF-8 bytes. */ + +struct cpp_decoded_char +{ + const char *m_start_byte; + const char *m_next_byte; + + bool m_valid_ch; + cppchar_t m_ch; +}; + +/* Information for mapping between code points and display columns. + + This is a tabstop value, along with a callback for getting the + widths of characters. Normally this callback is cpp_wcwidth, but we + support other schemes for escaping non-ASCII unicode as a series of + ASCII chars when printing the user's source code in diagnostic-show-locus.cc + + For example, consider: + - the Unicode character U+03C0 "GREEK SMALL LETTER PI" (UTF-8: 0xCF 0x80) + - the Unicode character U+1F642 "SLIGHTLY SMILING FACE" + (UTF-8: 0xF0 0x9F 0x99 0x82) + - the byte 0xBF (a stray trailing byte of a UTF-8 character) + Normally U+03C0 would occupy one display column, U+1F642 + would occupy two display columns, and the stray byte would be + printed verbatim as one display column. + + However when escaping them as unicode code points as "" + and "" they occupy 8 and 9 display columns respectively, + and when escaping them as bytes as "<80>" and "<9F><99><82>" + they occupy 8 and 16 display columns respectively. In both cases + the stray byte is escaped to as 4 display columns. */ + +struct cpp_char_column_policy +{ + cpp_char_column_policy (int tabstop, + int (*width_cb) (cppchar_t c)) + : m_tabstop (tabstop), + m_undecoded_byte_width (1), + m_width_cb (width_cb) + {} + + int m_tabstop; + /* Width in display columns of a stray byte that isn't decodable + as UTF-8. */ + int m_undecoded_byte_width; + int (*m_width_cb) (cppchar_t c); +}; + +/* A class to manage the state while converting a UTF-8 sequence to cppchar_t + and computing the display width one character at a time. 
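+
+   A sketch of the intended usage (BUF and LEN stand for some UTF-8
+   text and its length; a tab stop of 8 with cpp_wcwidth is just one
+   reasonable policy):
+
+     cpp_char_column_policy policy (8, cpp_wcwidth);
+     cpp_display_width_computation dw (BUF, LEN, policy);
+     while (!dw.done ())
+       {
+         cpp_decoded_char ch;
+         dw.process_next_codepoint (&ch);      // decode one codepoint
+       }
+     int cols = dw.display_cols_processed ();  // total display columns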
*/ +class cpp_display_width_computation { + public: + cpp_display_width_computation (const char *data, int data_length, + const cpp_char_column_policy &policy); + const char *next_byte () const { return m_next; } + int bytes_processed () const { return m_next - m_begin; } + int bytes_left () const { return m_bytes_left; } + bool done () const { return !bytes_left (); } + int display_cols_processed () const { return m_display_cols; } + + int process_next_codepoint (cpp_decoded_char *out); + int advance_display_cols (int n); + + private: + const char *const m_begin; + const char *m_next; + size_t m_bytes_left; + const cpp_char_column_policy &m_policy; + int m_display_cols; +}; + +/* Convenience functions that are simple use cases for class + cpp_display_width_computation. Tab characters will be expanded to spaces + as determined by POLICY.m_tabstop, and non-printable-ASCII characters + will be escaped as per POLICY. */ + +int cpp_byte_column_to_display_column (const char *data, int data_length, + int column, + const cpp_char_column_policy &policy); +inline int cpp_display_width (const char *data, int data_length, + const cpp_char_column_policy &policy) +{ + return cpp_byte_column_to_display_column (data, data_length, data_length, + policy); +} +int cpp_display_column_to_byte_column (const char *data, int data_length, + int display_col, + const cpp_char_column_policy &policy); +int cpp_wcwidth (cppchar_t c); + +bool cpp_input_conversion_is_trivial (const char *input_charset); +int cpp_check_utf8_bom (const char *data, size_t data_length); + +#endif /* ! LIBCPP_CPPLIB_H */ diff --git a/developer/script_Deb-12.10_gcc-12.4.1/library/init.cc b/developer/script_Deb-12.10_gcc-12.4.1/library/init.cc new file mode 100644 index 0000000..36cdc6a --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/library/init.cc @@ -0,0 +1,935 @@ +/* CPP Library. + Copyright (C) 1986-2022 Free Software Foundation, Inc. + Contributed by Per Bothner, 1994-95. + Based on CCCP program by Paul Rubin, June 1986 + Adapted to ANSI C, Richard Stallman, Jan 1987 + +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; see the file COPYING3. If not see +. */ + +#include "config.h" +#include "system.h" +#include "cpplib.h" +#include "internal.h" +#include "mkdeps.h" +#include "localedir.h" +#include "filenames.h" + +#ifndef ENABLE_CANONICAL_SYSTEM_HEADERS +#ifdef HAVE_DOS_BASED_FILE_SYSTEM +#define ENABLE_CANONICAL_SYSTEM_HEADERS 1 +#else +#define ENABLE_CANONICAL_SYSTEM_HEADERS 0 +#endif +#endif + +static void init_library (void); +static void mark_named_operators (cpp_reader *, int); +static bool read_original_filename (cpp_reader *); +static void read_original_directory (cpp_reader *); +static void post_options (cpp_reader *); + +/* If we have designated initializers (GCC >2.7) these tables can be + initialized, constant data. Otherwise, they have to be filled in at + runtime. */ +#if HAVE_DESIGNATED_INITIALIZERS + +#define init_trigraph_map() /* Nothing. 
*/ +#define TRIGRAPH_MAP \ +__extension__ const uchar _cpp_trigraph_map[UCHAR_MAX + 1] = { + +#define END }; +#define s(p, v) [p] = v, + +#else + +#define TRIGRAPH_MAP uchar _cpp_trigraph_map[UCHAR_MAX + 1] = { 0 }; \ + static void init_trigraph_map (void) { \ + unsigned char *x = _cpp_trigraph_map; + +#define END } +#define s(p, v) x[p] = v; + +#endif + +TRIGRAPH_MAP + s('=', '#') s(')', ']') s('!', '|') + s('(', '[') s('\'', '^') s('>', '}') + s('/', '\\') s('<', '{') s('-', '~') +END + +#undef s +#undef END +#undef TRIGRAPH_MAP + +/* A set of booleans indicating what CPP features each source language + requires. */ +struct lang_flags +{ + char c99; + char cplusplus; + char extended_numbers; + char extended_identifiers; + char c11_identifiers; + char std; + char digraphs; + char uliterals; + char rliterals; + char user_literals; + char binary_constants; + char digit_separators; + char trigraphs; + char utf8_char_literals; + char va_opt; + char scope; + char dfp_constants; + char size_t_literals; + char elifdef; +}; + +static const struct lang_flags lang_defaults[] = +{ /* c99 c++ xnum xid c11 std digr ulit rlit udlit bincst digsep trig u8chlit vaopt scope dfp szlit elifdef */ + /* GNUC89 */ { 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, + /* GNUC99 */ { 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, + /* GNUC11 */ { 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, + /* GNUC17 */ { 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, + /* GNUC2X */ { 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1 }, + /* STDC89 */ { 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, + /* STDC94 */ { 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, + /* STDC99 */ { 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, + /* STDC11 */ { 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, + /* STDC17 */ { 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, + /* STDC2X */ { 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1 }, + /* GNUCXX */ { 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, + /* CXX98 */ { 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0 }, + /* GNUCXX11 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, + /* CXX11 */ { 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0 }, + /* GNUCXX14 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0 }, + /* CXX14 */ { 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0 }, + /* GNUCXX17 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0 }, + /* CXX17 */ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0 }, + /* GNUCXX20 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0 }, + /* CXX20 */ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0 }, + /* GNUCXX23 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1 }, + /* CXX23 */ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1 }, + /* ASM */ { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } +}; + +/* Sets internal flags correctly for a given language. 
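+
+   For example, a front end that settles on C++20 after option parsing
+   might simply call (PFILE being an already-created reader):
+
+     cpp_set_lang (pfile, CLK_CXX20);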
*/ +void +cpp_set_lang (cpp_reader *pfile, enum c_lang lang) +{ + const struct lang_flags *l = &lang_defaults[(int) lang]; + + CPP_OPTION (pfile, lang) = lang; + + CPP_OPTION (pfile, c99) = l->c99; + CPP_OPTION (pfile, cplusplus) = l->cplusplus; + CPP_OPTION (pfile, extended_numbers) = l->extended_numbers; + CPP_OPTION (pfile, extended_identifiers) = l->extended_identifiers; + CPP_OPTION (pfile, c11_identifiers) = l->c11_identifiers; + CPP_OPTION (pfile, std) = l->std; + CPP_OPTION (pfile, digraphs) = l->digraphs; + CPP_OPTION (pfile, uliterals) = l->uliterals; + CPP_OPTION (pfile, rliterals) = l->rliterals; + CPP_OPTION (pfile, user_literals) = l->user_literals; + CPP_OPTION (pfile, binary_constants) = l->binary_constants; + CPP_OPTION (pfile, digit_separators) = l->digit_separators; + CPP_OPTION (pfile, trigraphs) = l->trigraphs; + CPP_OPTION (pfile, utf8_char_literals) = l->utf8_char_literals; + CPP_OPTION (pfile, va_opt) = l->va_opt; + CPP_OPTION (pfile, scope) = l->scope; + CPP_OPTION (pfile, dfp_constants) = l->dfp_constants; + CPP_OPTION (pfile, size_t_literals) = l->size_t_literals; + CPP_OPTION (pfile, elifdef) = l->elifdef; +} + +/* Initialize library global state. */ +static void +init_library (void) +{ + static int initialized = 0; + + if (! initialized) + { + initialized = 1; + + _cpp_init_lexer (); + + /* Set up the trigraph map. This doesn't need to do anything if + we were compiled with a compiler that supports C99 designated + initializers. */ + init_trigraph_map (); + +#ifdef ENABLE_NLS + (void) bindtextdomain (PACKAGE, LOCALEDIR); +#endif + } +} + +/* Initialize a cpp_reader structure. */ +cpp_reader * +cpp_create_reader (enum c_lang lang, cpp_hash_table *table, + class line_maps *line_table) +{ + cpp_reader *pfile; + + /* Initialize this instance of the library if it hasn't been already. */ + init_library (); + + pfile = XCNEW (cpp_reader); + memset (&pfile->base_context, 0, sizeof (pfile->base_context)); + + cpp_set_lang (pfile, lang); + CPP_OPTION (pfile, warn_multichar) = 1; + CPP_OPTION (pfile, discard_comments) = 1; + CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1; + CPP_OPTION (pfile, max_include_depth) = 200; + CPP_OPTION (pfile, operator_names) = 1; + CPP_OPTION (pfile, warn_trigraphs) = 2; + CPP_OPTION (pfile, warn_endif_labels) = 1; + CPP_OPTION (pfile, cpp_warn_c90_c99_compat) = -1; + CPP_OPTION (pfile, cpp_warn_c11_c2x_compat) = -1; + CPP_OPTION (pfile, cpp_warn_cxx11_compat) = 0; + CPP_OPTION (pfile, cpp_warn_deprecated) = 1; + CPP_OPTION (pfile, cpp_warn_long_long) = 0; + CPP_OPTION (pfile, dollars_in_ident) = 1; + CPP_OPTION (pfile, warn_dollars) = 1; + CPP_OPTION (pfile, warn_variadic_macros) = 1; + CPP_OPTION (pfile, warn_builtin_macro_redefined) = 1; + CPP_OPTION (pfile, cpp_warn_implicit_fallthrough) = 0; + /* By default, track locations of tokens resulting from macro + expansion. The '2' means, track the locations with the highest + accuracy. Read the comments for struct + cpp_options::track_macro_expansion to learn about the other + values. */ + CPP_OPTION (pfile, track_macro_expansion) = 2; + CPP_OPTION (pfile, warn_normalize) = normalized_C; + CPP_OPTION (pfile, warn_literal_suffix) = 1; + CPP_OPTION (pfile, canonical_system_headers) + = ENABLE_CANONICAL_SYSTEM_HEADERS; + CPP_OPTION (pfile, ext_numeric_literals) = 1; + CPP_OPTION (pfile, warn_date_time) = 0; + CPP_OPTION (pfile, cpp_warn_bidirectional) = bidirectional_unpaired; + + /* Default CPP arithmetic to something sensible for the host for the + benefit of dumb users like fix-header. 
*/ + CPP_OPTION (pfile, precision) = CHAR_BIT * sizeof (long); + CPP_OPTION (pfile, char_precision) = CHAR_BIT; + CPP_OPTION (pfile, wchar_precision) = CHAR_BIT * sizeof (int); + CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int); + CPP_OPTION (pfile, unsigned_char) = 0; + CPP_OPTION (pfile, unsigned_wchar) = 1; + CPP_OPTION (pfile, bytes_big_endian) = 1; /* does not matter */ + + /* Default to no charset conversion. */ + CPP_OPTION (pfile, narrow_charset) = _cpp_default_encoding (); + CPP_OPTION (pfile, wide_charset) = 0; + + /* Default the input character set to UTF-8. */ + CPP_OPTION (pfile, input_charset) = _cpp_default_encoding (); + + /* A fake empty "directory" used as the starting point for files + looked up without a search path. Name cannot be '/' because we + don't want to prepend anything at all to filenames using it. All + other entries are correct zero-initialized. */ + pfile->no_search_path.name = (char *) ""; + + /* Initialize the line map. */ + pfile->line_table = line_table; + + /* Initialize lexer state. */ + pfile->state.save_comments = ! CPP_OPTION (pfile, discard_comments); + + /* Set up static tokens. */ + pfile->avoid_paste.type = CPP_PADDING; + pfile->avoid_paste.val.source = NULL; + pfile->avoid_paste.src_loc = 0; + pfile->endarg.type = CPP_EOF; + pfile->endarg.flags = 0; + pfile->endarg.src_loc = 0; + + /* Create a token buffer for the lexer. */ + _cpp_init_tokenrun (&pfile->base_run, 250); + pfile->cur_run = &pfile->base_run; + pfile->cur_token = pfile->base_run.base; + + /* Initialize the base context. */ + pfile->context = &pfile->base_context; + pfile->base_context.c.macro = 0; + pfile->base_context.prev = pfile->base_context.next = 0; + + /* Aligned and unaligned storage. */ + pfile->a_buff = _cpp_get_buff (pfile, 0); + pfile->u_buff = _cpp_get_buff (pfile, 0); + + /* Initialize table for push_macro/pop_macro. */ + pfile->pushed_macros = 0; + + /* Do not force token locations by default. */ + pfile->forced_token_location = 0; + + /* Note the timestamp is unset. */ + pfile->time_stamp = time_t (-1); + pfile->time_stamp_kind = 0; + + /* The expression parser stack. */ + _cpp_expand_op_stack (pfile); + + /* Initialize the buffer obstack. */ + obstack_specify_allocation (&pfile->buffer_ob, 0, 0, xmalloc, free); + + _cpp_init_files (pfile); + + _cpp_init_hashtable (pfile, table); + + return pfile; +} + +/* Set the line_table entry in PFILE. This is called after reading a + PCH file, as the old line_table will be incorrect. */ +void +cpp_set_line_map (cpp_reader *pfile, class line_maps *line_table) +{ + pfile->line_table = line_table; +} + +/* Free resources used by PFILE. Accessing PFILE after this function + returns leads to undefined behavior. Returns the error count. 
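+
+   Typical teardown pairs this with cpp_finish, roughly (DEPS_STREAM
+   may be NULL; closing it remains the caller's job):
+
+     cpp_finish (pfile, deps_stream);
+     if (deps_stream)
+       fclose (deps_stream);   // cpplib never closes the stream
+     cpp_destroy (pfile);      // PFILE must not be used after this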
*/ +void +cpp_destroy (cpp_reader *pfile) +{ + cpp_context *context, *contextn; + struct def_pragma_macro *pmacro; + tokenrun *run, *runn; + int i; + + free (pfile->op_stack); + + while (CPP_BUFFER (pfile) != NULL) + _cpp_pop_buffer (pfile); + + free (pfile->out.base); + + if (pfile->macro_buffer) + { + free (pfile->macro_buffer); + pfile->macro_buffer = NULL; + pfile->macro_buffer_len = 0; + } + + if (pfile->deps) + deps_free (pfile->deps); + obstack_free (&pfile->buffer_ob, 0); + + _cpp_destroy_hashtable (pfile); + _cpp_cleanup_files (pfile); + _cpp_destroy_iconv (pfile); + + _cpp_free_buff (pfile->a_buff); + _cpp_free_buff (pfile->u_buff); + _cpp_free_buff (pfile->free_buffs); + + for (run = &pfile->base_run; run; run = runn) + { + runn = run->next; + free (run->base); + if (run != &pfile->base_run) + free (run); + } + + for (context = pfile->base_context.next; context; context = contextn) + { + contextn = context->next; + free (context); + } + + if (pfile->comments.entries) + { + for (i = 0; i < pfile->comments.count; i++) + free (pfile->comments.entries[i].comment); + + free (pfile->comments.entries); + } + if (pfile->pushed_macros) + { + do + { + pmacro = pfile->pushed_macros; + pfile->pushed_macros = pmacro->next; + free (pmacro->name); + free (pmacro); + } + while (pfile->pushed_macros); + } + + free (pfile); +} + +/* This structure defines one built-in identifier. A node will be + entered in the hash table under the name NAME, with value VALUE. + + There are two tables of these. builtin_array holds all the + "builtin" macros: these are handled by builtin_macro() in + macro.cc. Builtin is somewhat of a misnomer -- the property of + interest is that these macros require special code to compute their + expansions. The value is a "cpp_builtin_type" enumerator. + + operator_array holds the C++ named operators. These are keywords + which act as aliases for punctuators. In C++, they cannot be + altered through #define, and #if recognizes them as operators. In + C, these are not entered into the hash table at all (but see + ). The value is a token-type enumerator. 
*/ +struct builtin_macro +{ + const uchar *const name; + const unsigned short len; + const unsigned short value; + const bool always_warn_if_redefined; +}; + +#define B(n, t, f) { DSC(n), t, f } +static const struct builtin_macro builtin_array[] = +{ + + B("_ASSIGN", BT_RT_ASSIGN, true), + B("_TO_ARG_LIST", BT_RT_TO_ARG_LIST, true), + B("_TO_TOKEN_LIST", BT_RT_TO_TOKEN_LIST, true), + B("_FIRST", BT_RT_FIRST, true), + B("_REST", BT_RT_REST, true), + B("_MAP", BT_RT_MAP, true), + B("_AL_MAP", BT_RT_AL_MAP, true), + B("_IF", BT_RT_IF, true), + B("_NOT", BT_RT_NOT, true), + B("_AND", BT_RT_AND, true), + B("_OR", BT_RT_OR, true), + B("_IS_IDENTIFIER", BT_RT_IS_IDENTIFIER, true), + B("_IS_NAME", BT_RT_IS_NAME, true), + B("_PASTE", BT_RT_PASTE, true), + + + B("__TIMESTAMP__", BT_TIMESTAMP, false), + B("__TIME__", BT_TIME, false), + B("__DATE__", BT_DATE, false), + B("__FILE__", BT_FILE, false), + B("__FILE_NAME__", BT_FILE_NAME, false), + B("__BASE_FILE__", BT_BASE_FILE, false), + B("__LINE__", BT_SPECLINE, true), + B("__INCLUDE_LEVEL__", BT_INCLUDE_LEVEL, true), + B("__COUNTER__", BT_COUNTER, true), + /* Make sure to update the list of built-in + function-like macros in traditional.cc: + fun_like_macro() when adding more following */ + B("__has_attribute", BT_HAS_ATTRIBUTE, true), + B("__has_c_attribute", BT_HAS_STD_ATTRIBUTE, true), + B("__has_cpp_attribute", BT_HAS_ATTRIBUTE, true), + B("__has_builtin", BT_HAS_BUILTIN, true), + B("__has_include", BT_HAS_INCLUDE, true), + B("__has_include_next",BT_HAS_INCLUDE_NEXT, true), + /* The following macros are excluded when -traditional-cpp is used. + Therefore, they must appear at the end of this array so that they can be + easily removed by slicing in cpp_init_special_builtins(). + (If you add new built-ins that should be excluded in traditional mode, + place them *before* __STDC__ and update cpp_init_special_builtins() accordingly.) + */ + B("_Pragma", BT_PRAGMA, true), + B("__STDC__", BT_STDC, true) +}; +#undef B + +struct builtin_operator +{ + const uchar *const name; + const unsigned short len; + const unsigned short value; +}; + +#define B(n, t) { DSC(n), t } +static const struct builtin_operator operator_array[] = +{ + B("and", CPP_AND_AND), + B("and_eq", CPP_AND_EQ), + B("bitand", CPP_AND), + B("bitor", CPP_OR), + B("compl", CPP_COMPL), + B("not", CPP_NOT), + B("not_eq", CPP_NOT_EQ), + B("or", CPP_OR_OR), + B("or_eq", CPP_OR_EQ), + B("xor", CPP_XOR), + B("xor_eq", CPP_XOR_EQ) +}; +#undef B + +/* Mark the C++ named operators in the hash table. */ +static void +mark_named_operators (cpp_reader *pfile, int flags) +{ + const struct builtin_operator *b; + + for (b = operator_array; + b < (operator_array + ARRAY_SIZE (operator_array)); + b++) + { + cpp_hashnode *hp = cpp_lookup (pfile, b->name, b->len); + hp->flags |= flags; + hp->is_directive = 0; + hp->directive_index = b->value; + } +} + +/* Helper function of cpp_type2name. Return the string associated with + named operator TYPE. */ +const char * +cpp_named_operator2name (enum cpp_ttype type) +{ + const struct builtin_operator *b; + + for (b = operator_array; + b < (operator_array + ARRAY_SIZE (operator_array)); + b++) + { + if (type == b->value) + return (const char *) b->name; + } + + return NULL; +} + +void +cpp_init_special_builtins (cpp_reader *pfile) +{ + const struct builtin_macro *b; + size_t n = ARRAY_SIZE (builtin_array); + + if (CPP_OPTION (pfile, traditional)) + n -= 2; + else if (! 
CPP_OPTION (pfile, stdc_0_in_system_headers) + || CPP_OPTION (pfile, std)) + n--; + + for (b = builtin_array; b < builtin_array + n; b++) + { + if ((b->value == BT_HAS_ATTRIBUTE + || b->value == BT_HAS_STD_ATTRIBUTE + || b->value == BT_HAS_BUILTIN) + && (CPP_OPTION (pfile, lang) == CLK_ASM + || pfile->cb.has_attribute == NULL)) + continue; + cpp_hashnode *hp = cpp_lookup (pfile, b->name, b->len); + hp->type = NT_BUILTIN_MACRO; + if (b->always_warn_if_redefined) + hp->flags |= NODE_WARN; + hp->value.builtin = (enum cpp_builtin_type) b->value; + } +} + +/* Restore macro C to builtin macro definition. */ + +void +_cpp_restore_special_builtin (cpp_reader *pfile, struct def_pragma_macro *c) +{ + size_t len = strlen (c->name); + + for (const struct builtin_macro *b = builtin_array; + b < builtin_array + ARRAY_SIZE (builtin_array); b++) + if (b->len == len && memcmp (c->name, b->name, len + 1) == 0) + { + cpp_hashnode *hp = cpp_lookup (pfile, b->name, b->len); + hp->type = NT_BUILTIN_MACRO; + if (b->always_warn_if_redefined) + hp->flags |= NODE_WARN; + hp->value.builtin = (enum cpp_builtin_type) b->value; + } +} + +/* Read the builtins table above and enter them, and language-specific + macros, into the hash table. HOSTED is true if this is a hosted + environment. */ +void +cpp_init_builtins (cpp_reader *pfile, int hosted) +{ + cpp_init_special_builtins (pfile); + + if (!CPP_OPTION (pfile, traditional) + && (! CPP_OPTION (pfile, stdc_0_in_system_headers) + || CPP_OPTION (pfile, std))) + _cpp_define_builtin (pfile, "__STDC__ 1"); + + if (CPP_OPTION (pfile, cplusplus)) + { + /* C++23 is not yet a standard. For now, use an invalid + * year/month, 202100L, which is larger than 202002L. */ + if (CPP_OPTION (pfile, lang) == CLK_CXX23 + || CPP_OPTION (pfile, lang) == CLK_GNUCXX23) + _cpp_define_builtin (pfile, "__cplusplus 202100L"); + else if (CPP_OPTION (pfile, lang) == CLK_CXX20 + || CPP_OPTION (pfile, lang) == CLK_GNUCXX20) + _cpp_define_builtin (pfile, "__cplusplus 202002L"); + else if (CPP_OPTION (pfile, lang) == CLK_CXX17 + || CPP_OPTION (pfile, lang) == CLK_GNUCXX17) + _cpp_define_builtin (pfile, "__cplusplus 201703L"); + else if (CPP_OPTION (pfile, lang) == CLK_CXX14 + || CPP_OPTION (pfile, lang) == CLK_GNUCXX14) + _cpp_define_builtin (pfile, "__cplusplus 201402L"); + else if (CPP_OPTION (pfile, lang) == CLK_CXX11 + || CPP_OPTION (pfile, lang) == CLK_GNUCXX11) + _cpp_define_builtin (pfile, "__cplusplus 201103L"); + else + _cpp_define_builtin (pfile, "__cplusplus 199711L"); + } + else if (CPP_OPTION (pfile, lang) == CLK_ASM) + _cpp_define_builtin (pfile, "__ASSEMBLER__ 1"); + else if (CPP_OPTION (pfile, lang) == CLK_STDC94) + _cpp_define_builtin (pfile, "__STDC_VERSION__ 199409L"); + else if (CPP_OPTION (pfile, lang) == CLK_STDC2X + || CPP_OPTION (pfile, lang) == CLK_GNUC2X) + _cpp_define_builtin (pfile, "__STDC_VERSION__ 202000L"); + else if (CPP_OPTION (pfile, lang) == CLK_STDC17 + || CPP_OPTION (pfile, lang) == CLK_GNUC17) + _cpp_define_builtin (pfile, "__STDC_VERSION__ 201710L"); + else if (CPP_OPTION (pfile, lang) == CLK_STDC11 + || CPP_OPTION (pfile, lang) == CLK_GNUC11) + _cpp_define_builtin (pfile, "__STDC_VERSION__ 201112L"); + else if (CPP_OPTION (pfile, c99)) + _cpp_define_builtin (pfile, "__STDC_VERSION__ 199901L"); + + if (CPP_OPTION (pfile, uliterals) + && !(CPP_OPTION (pfile, cplusplus) + && (CPP_OPTION (pfile, lang) == CLK_GNUCXX + || CPP_OPTION (pfile, lang) == CLK_CXX98))) + { + _cpp_define_builtin (pfile, "__STDC_UTF_16__ 1"); + _cpp_define_builtin (pfile, "__STDC_UTF_32__ 
1"); + } + + if (hosted) + _cpp_define_builtin (pfile, "__STDC_HOSTED__ 1"); + else + _cpp_define_builtin (pfile, "__STDC_HOSTED__ 0"); + + if (CPP_OPTION (pfile, objc)) + _cpp_define_builtin (pfile, "__OBJC__ 1"); +} + +/* Sanity-checks are dependent on command-line options, so it is + called as a subroutine of cpp_read_main_file. */ +#if CHECKING_P +static void sanity_checks (cpp_reader *); +static void sanity_checks (cpp_reader *pfile) +{ + cppchar_t test = 0; + size_t max_precision = 2 * CHAR_BIT * sizeof (cpp_num_part); + + /* Sanity checks for assumptions about CPP arithmetic and target + type precisions made by cpplib. */ + test--; + if (test < 1) + cpp_error (pfile, CPP_DL_ICE, "cppchar_t must be an unsigned type"); + + if (CPP_OPTION (pfile, precision) > max_precision) + cpp_error (pfile, CPP_DL_ICE, + "preprocessor arithmetic has maximum precision of %lu bits;" + " target requires %lu bits", + (unsigned long) max_precision, + (unsigned long) CPP_OPTION (pfile, precision)); + + if (CPP_OPTION (pfile, precision) < CPP_OPTION (pfile, int_precision)) + cpp_error (pfile, CPP_DL_ICE, + "CPP arithmetic must be at least as precise as a target int"); + + if (CPP_OPTION (pfile, char_precision) < 8) + cpp_error (pfile, CPP_DL_ICE, "target char is less than 8 bits wide"); + + if (CPP_OPTION (pfile, wchar_precision) < CPP_OPTION (pfile, char_precision)) + cpp_error (pfile, CPP_DL_ICE, + "target wchar_t is narrower than target char"); + + if (CPP_OPTION (pfile, int_precision) < CPP_OPTION (pfile, char_precision)) + cpp_error (pfile, CPP_DL_ICE, + "target int is narrower than target char"); + + /* This is assumed in eval_token() and could be fixed if necessary. */ + if (sizeof (cppchar_t) > sizeof (cpp_num_part)) + cpp_error (pfile, CPP_DL_ICE, + "CPP half-integer narrower than CPP character"); + + if (CPP_OPTION (pfile, wchar_precision) > BITS_PER_CPPCHAR_T) + cpp_error (pfile, CPP_DL_ICE, + "CPP on this host cannot handle wide character constants over" + " %lu bits, but the target requires %lu bits", + (unsigned long) BITS_PER_CPPCHAR_T, + (unsigned long) CPP_OPTION (pfile, wchar_precision)); +} +#else +# define sanity_checks(PFILE) +#endif + +/* This is called after options have been parsed, and partially + processed. */ +void +cpp_post_options (cpp_reader *pfile) +{ + int flags; + + sanity_checks (pfile); + + post_options (pfile); + + /* Mark named operators before handling command line macros. */ + flags = 0; + if (CPP_OPTION (pfile, cplusplus) && CPP_OPTION (pfile, operator_names)) + flags |= NODE_OPERATOR; + if (CPP_OPTION (pfile, warn_cxx_operator_names)) + flags |= NODE_DIAGNOSTIC | NODE_WARN_OPERATOR; + if (flags != 0) + mark_named_operators (pfile, flags); +} + +/* Setup for processing input from the file named FNAME, or stdin if + it is the empty string. Return the original filename on success + (e.g. foo.i->foo.c), or NULL on failure. INJECTING is true if + there may be injected headers before line 1 of the main file. */ +const char * +cpp_read_main_file (cpp_reader *pfile, const char *fname, bool injecting) +{ + if (mkdeps *deps = cpp_get_deps (pfile)) + /* Set the default target (if there is none already). */ + deps_add_default_target (deps, fname); + + pfile->main_file + = _cpp_find_file (pfile, fname, + CPP_OPTION (pfile, preprocessed) ? &pfile->no_search_path + : CPP_OPTION (pfile, main_search) == CMS_user + ? pfile->quote_include + : CPP_OPTION (pfile, main_search) == CMS_system + ? 
pfile->bracket_include : &pfile->no_search_path, + /*angle=*/0, _cpp_FFK_NORMAL, 0); + + if (_cpp_find_failed (pfile->main_file)) + return NULL; + + _cpp_stack_file (pfile, pfile->main_file, + injecting || CPP_OPTION (pfile, preprocessed) + ? IT_PRE_MAIN : IT_MAIN, 0); + + /* For foo.i, read the original filename foo.c now, for the benefit + of the front ends. */ + if (CPP_OPTION (pfile, preprocessed)) + if (!read_original_filename (pfile)) + { + /* We're on line 1 after all. */ + auto *last = linemap_check_ordinary + (LINEMAPS_LAST_MAP (pfile->line_table, false)); + last->to_line = 1; + /* Inform of as-if a file change. */ + _cpp_do_file_change (pfile, LC_RENAME_VERBATIM, LINEMAP_FILE (last), + LINEMAP_LINE (last), LINEMAP_SYSP (last)); + } + + auto *map = LINEMAPS_LAST_ORDINARY_MAP (pfile->line_table); + pfile->main_loc = MAP_START_LOCATION (map); + + return ORDINARY_MAP_FILE_NAME (map); +} + +location_t +cpp_main_loc (const cpp_reader *pfile) +{ + return pfile->main_loc; +} + +/* For preprocessed files, if the very first characters are + '#[01]', then handle a line directive so we know the + original file name. This will generate file_change callbacks, + which the front ends must handle appropriately given their state of + initialization. We peek directly into the character buffer, so + that we're not confused by otherwise-skipped white space & + comments. We can be very picky, because this should have been + machine-generated text (by us, no less). This way we do not + interfere with the module directive state machine. */ + +static bool +read_original_filename (cpp_reader *pfile) +{ + auto *buf = pfile->buffer->next_line; + + if (pfile->buffer->rlimit - buf > 4 + && buf[0] == '#' + && buf[1] == ' ' + // Also permit '1', as that's what used to be here + && (buf[2] == '0' || buf[2] == '1') + && buf[3] == ' ') + { + const cpp_token *token = _cpp_lex_direct (pfile); + gcc_checking_assert (token->type == CPP_HASH); + if (_cpp_handle_directive (pfile, token->flags & PREV_WHITE)) + { + read_original_directory (pfile); + + auto *penult = &linemap_check_ordinary + (LINEMAPS_LAST_MAP (pfile->line_table, false))[-1]; + if (penult[1].reason == LC_RENAME_VERBATIM) + { + /* Expunge any evidence of the original linemap. */ + pfile->line_table->highest_location + = pfile->line_table->highest_line + = penult[0].start_location; + + penult[1].start_location = penult[0].start_location; + penult[1].reason = penult[0].reason; + penult[0] = penult[1]; + pfile->line_table->info_ordinary.used--; + pfile->line_table->info_ordinary.cache = 0; + } + + return true; + } + } + + return false; +} + +/* For preprocessed files, if the tokens following the first filename + line is of the form # "/path/name//", handle the + directive so we know the original current directory. + + As with the first line peeking, we can do this without lexing by + being picky. 
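+
+   In other words, preprocessed input produced by us is expected to
+   start with something like (paths here are purely illustrative):
+
+     # 0 "/home/user/src/prog.c"
+     # 0 "/home/user/src//"
+
+   where the trailing "//" marks the second string as the original
+   working directory.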
*/ +static void +read_original_directory (cpp_reader *pfile) +{ + auto *buf = pfile->buffer->next_line; + + if (pfile->buffer->rlimit - buf > 4 + && buf[0] == '#' + && buf[1] == ' ' + // Also permit '1', as that's what used to be here + && (buf[2] == '0' || buf[2] == '1') + && buf[3] == ' ') + { + const cpp_token *hash = _cpp_lex_direct (pfile); + gcc_checking_assert (hash->type == CPP_HASH); + pfile->state.in_directive = 1; + const cpp_token *number = _cpp_lex_direct (pfile); + gcc_checking_assert (number->type == CPP_NUMBER); + const cpp_token *string = _cpp_lex_direct (pfile); + pfile->state.in_directive = 0; + + const unsigned char *text = nullptr; + size_t len = 0; + if (string->type == CPP_STRING) + { + /* The string value includes the quotes. */ + text = string->val.str.text; + len = string->val.str.len; + } + if (len < 5 + || !IS_DIR_SEPARATOR (text[len - 2]) + || !IS_DIR_SEPARATOR (text[len - 3])) + { + /* That didn't work out, back out. */ + _cpp_backup_tokens (pfile, 3); + return; + } + + if (pfile->cb.dir_change) + { + /* Smash the string directly, it's dead at this point */ + char *smashy = (char *)text; + smashy[len - 3] = 0; + + pfile->cb.dir_change (pfile, smashy + 1); + } + + /* We should be at EOL. */ + } +} + +/* This is called at the end of preprocessing. It pops the last + buffer and writes dependency output. + + Maybe it should also reset state, such that you could call + cpp_start_read with a new filename to restart processing. */ +void +cpp_finish (cpp_reader *pfile, FILE *deps_stream) +{ + /* Warn about unused macros before popping the final buffer. */ + if (CPP_OPTION (pfile, warn_unused_macros)) + cpp_forall_identifiers (pfile, _cpp_warn_if_unused_macro, NULL); + + /* lex.cc leaves the final buffer on the stack. This it so that + it returns an unending stream of CPP_EOFs to the client. If we + popped the buffer, we'd dereference a NULL buffer pointer and + segfault. It's nice to allow the client to do worry-free excess + cpp_get_token calls. */ + while (pfile->buffer) + _cpp_pop_buffer (pfile); + + if (deps_stream) + deps_write (pfile, deps_stream, 72); + + /* Report on headers that could use multiple include guards. */ + if (CPP_OPTION (pfile, print_include_names)) + _cpp_report_missing_guards (pfile); +} + +static void +post_options (cpp_reader *pfile) +{ + /* -Wtraditional is not useful in C++ mode. */ + if (CPP_OPTION (pfile, cplusplus)) + CPP_OPTION (pfile, cpp_warn_traditional) = 0; + + /* Permanently disable macro expansion if we are rescanning + preprocessed text. Read preprocesed source in ISO mode. */ + if (CPP_OPTION (pfile, preprocessed)) + { + if (!CPP_OPTION (pfile, directives_only)) + pfile->state.prevent_expansion = 1; + CPP_OPTION (pfile, traditional) = 0; + } + + if (CPP_OPTION (pfile, warn_trigraphs) == 2) + CPP_OPTION (pfile, warn_trigraphs) = !CPP_OPTION (pfile, trigraphs); + + if (CPP_OPTION (pfile, traditional)) + { + CPP_OPTION (pfile, trigraphs) = 0; + CPP_OPTION (pfile, warn_trigraphs) = 0; + } + + if (CPP_OPTION (pfile, module_directives)) + { + /* These unspellable tokens have a leading space. */ + const char *const inits[spec_nodes::M_HWM] + = {"export ", "module ", "import ", "__import"}; + + for (int ix = 0; ix != spec_nodes::M_HWM; ix++) + { + cpp_hashnode *node = cpp_lookup (pfile, UC (inits[ix]), + strlen (inits[ix])); + + /* Token we pass to the compiler. */ + pfile->spec_nodes.n_modules[ix][1] = node; + + if (ix != spec_nodes::M__IMPORT) + /* Token we recognize when lexing, drop the trailing ' '. 
*/ + node = cpp_lookup (pfile, NODE_NAME (node), NODE_LEN (node) - 1); + + node->flags |= NODE_MODULE; + pfile->spec_nodes.n_modules[ix][0] = node; + } + } +} diff --git a/developer/script_Deb-12.10_gcc-12.4.1/library/macro.cc b/developer/script_Deb-12.10_gcc-12.4.1/library/macro.cc new file mode 100644 index 0000000..56c3b98 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/library/macro.cc @@ -0,0 +1,5537 @@ +/* Part of CPP library. (Macro and #define handling.) + Copyright (C) 1986-2022 Free Software Foundation, Inc. + Written by Per Bothner, 1994. + Based on CCCP program by Paul Rubin, June 1986 + Adapted to ANSI C, Richard Stallman, Jan 1987 + +This program is free software; you can redistribute it and/or modify it +under the terms of the GNU General Public License as published by the +Free Software Foundation; either version 3, or (at your option) any +later version. + +This program is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with this program; see the file COPYING3. If not see +. + + In other words, you are welcome to use, share and improve this program. + You are forbidden to forbid anyone else to use, share and improve + what you give them. Help stamp out software-hoarding! */ + +#pragma GCC diagnostic ignored "-Wparentheses" + + +#include "config.h" +#include "system.h" +#include "cpplib.h" +#include "internal.h" + +// RT extension + static const uchar *evaluate_RT_ASSIGN(cpp_reader *pfile); + static const uchar *evaluate_RT_TO_ARG_LIST(cpp_reader *pfile); + static const uchar *evaluate_RT_TO_TOKEN_LIST(cpp_reader *pfile); + static const uchar *evaluate_RT_FIRST(cpp_reader *pfile); + static const uchar *evaluate_RT_REST(cpp_reader *pfile); + static const uchar *evaluate_RT_MAP(cpp_reader *pfile); + static const uchar *evaluate_RT_AL_MAP(cpp_reader *pfile); + static const uchar *evaluate_RT_IF(cpp_reader *pfile); + static const uchar *evaluate_RT_NOT(cpp_reader *pfile); + static const uchar *evaluate_RT_AND(cpp_reader *pfile); + static const uchar *evaluate_RT_OR(cpp_reader *pfile); + static const uchar *evaluate_RT_IS_IDENTIFIER(cpp_reader *pfile); + static const uchar *evaluate_RT_IS_NAME(cpp_reader *pfile); + static const uchar *evaluate_RT_PASTE(cpp_reader *pfile); + +typedef struct macro_arg macro_arg; +/* This structure represents the tokens of a macro argument. These + tokens can be macro themselves, in which case they can be either + expanded or unexpanded. When they are expanded, this data + structure keeps both the expanded and unexpanded forms. */ +struct macro_arg +{ + const cpp_token **first; /* First token in unexpanded argument. */ + const cpp_token **expanded; /* Macro-expanded argument. */ + const cpp_token *stringified; /* Stringified argument. */ + unsigned int count; /* # of tokens in argument. */ + unsigned int expanded_count; /* # of tokens in expanded argument. */ + location_t *virt_locs; /* Where virtual locations for + unexpanded tokens are stored. */ + location_t *expanded_virt_locs; /* Where virtual locations for + expanded tokens are + stored. */ +}; + +/* The kind of macro tokens which the instance of + macro_arg_token_iter is supposed to iterate over. */ +enum macro_arg_token_kind { + MACRO_ARG_TOKEN_NORMAL, + /* This is a macro argument token that got transformed into a string + literal, e.g. 
#foo. */ + MACRO_ARG_TOKEN_STRINGIFIED, + /* This is a token resulting from the expansion of a macro + argument that was itself a macro. */ + MACRO_ARG_TOKEN_EXPANDED +}; + +/* An iterator over tokens coming from a function-like macro + argument. */ +typedef struct macro_arg_token_iter macro_arg_token_iter; +struct macro_arg_token_iter +{ + /* Whether or not -ftrack-macro-expansion is used. */ + bool track_macro_exp_p; + /* The kind of token over which we are supposed to iterate. */ + enum macro_arg_token_kind kind; + /* A pointer to the current token pointed to by the iterator. */ + const cpp_token **token_ptr; + /* A pointer to the "full" location of the current token. If + -ftrack-macro-expansion is used this location tracks loci across + macro expansion. */ + const location_t *location_ptr; +#if CHECKING_P + /* The number of times the iterator went forward. This useful only + when checking is enabled. */ + size_t num_forwards; +#endif +}; + +/* Saved data about an identifier being used as a macro argument + name. */ +struct macro_arg_saved_data { + /* The canonical (UTF-8) spelling of this identifier. */ + cpp_hashnode *canonical_node; + /* The previous value & type of this identifier. */ + union _cpp_hashnode_value value; + node_type type; +}; + +static const char *vaopt_paste_error = + N_("'##' cannot appear at either end of __VA_OPT__"); + +static void expand_arg (cpp_reader *, macro_arg *); + +/* A class for tracking __VA_OPT__ state while iterating over a + sequence of tokens. This is used during both macro definition and + expansion. */ +class vaopt_state { + + public: + + enum update_type + { + ERROR, + DROP, + INCLUDE, + BEGIN, + END + }; + + /* Initialize the state tracker. ANY_ARGS is true if variable + arguments were provided to the macro invocation. */ + vaopt_state (cpp_reader *pfile, bool is_variadic, macro_arg *arg) + : m_pfile (pfile), + m_arg (arg), + m_variadic (is_variadic), + m_last_was_paste (false), + m_stringify (false), + m_state (0), + m_paste_location (0), + m_location (0), + m_update (ERROR) + { + } + + /* Given a token, update the state of this tracker and return a + boolean indicating whether the token should be be included in the + expansion. */ + update_type update (const cpp_token *token) + { + /* If the macro isn't variadic, just don't bother. 
*/ + if (!m_variadic) + return INCLUDE; + + if (token->type == CPP_NAME + && token->val.node.node == m_pfile->spec_nodes.n__VA_OPT__) + { + if (m_state > 0) + { + cpp_error_at (m_pfile, CPP_DL_ERROR, token->src_loc, + "__VA_OPT__ may not appear in a __VA_OPT__"); + return ERROR; + } + ++m_state; + m_location = token->src_loc; + m_stringify = (token->flags & STRINGIFY_ARG) != 0; + return BEGIN; + } + else if (m_state == 1) + { + if (token->type != CPP_OPEN_PAREN) + { + cpp_error_at (m_pfile, CPP_DL_ERROR, m_location, + "__VA_OPT__ must be followed by an " + "open parenthesis"); + return ERROR; + } + ++m_state; + if (m_update == ERROR) + { + if (m_arg == NULL) + m_update = INCLUDE; + else + { + m_update = DROP; + if (!m_arg->expanded) + expand_arg (m_pfile, m_arg); + for (unsigned idx = 0; idx < m_arg->expanded_count; ++idx) + if (m_arg->expanded[idx]->type != CPP_PADDING) + { + m_update = INCLUDE; + break; + } + } + } + return DROP; + } + else if (m_state >= 2) + { + if (m_state == 2 && token->type == CPP_PASTE) + { + cpp_error_at (m_pfile, CPP_DL_ERROR, token->src_loc, + vaopt_paste_error); + return ERROR; + } + /* Advance states before further considering this token, in + case we see a close paren immediately after the open + paren. */ + if (m_state == 2) + ++m_state; + + bool was_paste = m_last_was_paste; + m_last_was_paste = false; + if (token->type == CPP_PASTE) + { + m_last_was_paste = true; + m_paste_location = token->src_loc; + } + else if (token->type == CPP_OPEN_PAREN) + ++m_state; + else if (token->type == CPP_CLOSE_PAREN) + { + --m_state; + if (m_state == 2) + { + /* Saw the final paren. */ + m_state = 0; + + if (was_paste) + { + cpp_error_at (m_pfile, CPP_DL_ERROR, token->src_loc, + vaopt_paste_error); + return ERROR; + } + + return END; + } + } + return m_update; + } + + /* Nothing to do with __VA_OPT__. */ + return INCLUDE; + } + + /* Ensure that any __VA_OPT__ was completed. If ok, return true. + Otherwise, issue an error and return false. */ + bool completed () + { + if (m_variadic && m_state != 0) + cpp_error_at (m_pfile, CPP_DL_ERROR, m_location, + "unterminated __VA_OPT__"); + return m_state == 0; + } + + /* Return true for # __VA_OPT__. */ + bool stringify () const + { + return m_stringify; + } + + private: + + /* The cpp_reader. */ + cpp_reader *m_pfile; + + /* The __VA_ARGS__ argument. */ + macro_arg *m_arg; + + /* True if the macro is variadic. */ + bool m_variadic; + /* If true, the previous token was ##. This is used to detect when + a paste occurs at the end of the sequence. */ + bool m_last_was_paste; + /* True for #__VA_OPT__. */ + bool m_stringify; + + /* The state variable: + 0 means not parsing + 1 means __VA_OPT__ seen, looking for "(" + 2 means "(" seen (so the next token can't be "##") + >= 3 means looking for ")", the number encodes the paren depth. */ + int m_state; + + /* The location of the paste token. */ + location_t m_paste_location; + + /* Location of the __VA_OPT__ token. */ + location_t m_location; + + /* If __VA_ARGS__ substitutes to no preprocessing tokens, + INCLUDE, otherwise DROP. ERROR when unknown yet. */ + update_type m_update; +}; + +/* Macro expansion. 
*/ + +static cpp_macro *get_deferred_or_lazy_macro (cpp_reader *, cpp_hashnode *, + location_t); +static int enter_macro_context (cpp_reader *, cpp_hashnode *, + const cpp_token *, location_t); +static int builtin_macro (cpp_reader *, cpp_hashnode *, + location_t, location_t); +static void push_ptoken_context (cpp_reader *, cpp_hashnode *, _cpp_buff *, + const cpp_token **, unsigned int); +static void push_extended_tokens_context (cpp_reader *, cpp_hashnode *, + _cpp_buff *, location_t *, + const cpp_token **, unsigned int); +static _cpp_buff *collect_args (cpp_reader *, const cpp_hashnode *, + _cpp_buff **, unsigned *); +static cpp_context *next_context (cpp_reader *); +static const cpp_token *padding_token (cpp_reader *, const cpp_token *); +static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned int); +static const cpp_token *stringify_arg (cpp_reader *, const cpp_token **, + unsigned int); +static void paste_all_tokens (cpp_reader *, const cpp_token *); +static bool paste_tokens (cpp_reader *, location_t, + const cpp_token **, const cpp_token *); +static void alloc_expanded_arg_mem (cpp_reader *, macro_arg *, size_t); +static void ensure_expanded_arg_room (cpp_reader *, macro_arg *, size_t, size_t *); +static void delete_macro_args (_cpp_buff*, unsigned num_args); +static void set_arg_token (macro_arg *, const cpp_token *, + location_t, size_t, + enum macro_arg_token_kind, + bool); +static const location_t *get_arg_token_location (const macro_arg *, + enum macro_arg_token_kind); +static const cpp_token **arg_token_ptr_at (const macro_arg *, + size_t, + enum macro_arg_token_kind, + location_t **virt_location); + +static void macro_arg_token_iter_init (macro_arg_token_iter *, bool, + enum macro_arg_token_kind, + const macro_arg *, + const cpp_token **); +static const cpp_token *macro_arg_token_iter_get_token +(const macro_arg_token_iter *it); +static location_t macro_arg_token_iter_get_location +(const macro_arg_token_iter *); +static void macro_arg_token_iter_forward (macro_arg_token_iter *); +static _cpp_buff *tokens_buff_new (cpp_reader *, size_t, + location_t **); +static size_t tokens_buff_count (_cpp_buff *); +static const cpp_token **tokens_buff_last_token_ptr (_cpp_buff *); +static inline const cpp_token **tokens_buff_put_token_to (const cpp_token **, + location_t *, + const cpp_token *, + location_t, + location_t, + const line_map_macro *, + unsigned int); + +static const cpp_token **tokens_buff_add_token (_cpp_buff *, + location_t *, + const cpp_token *, + location_t, + location_t, + const line_map_macro *, + unsigned int); +static inline void tokens_buff_remove_last_token (_cpp_buff *); +static void replace_args (cpp_reader *, cpp_hashnode *, cpp_macro *, + macro_arg *, location_t); +static _cpp_buff *funlike_invocation_p (cpp_reader *, cpp_hashnode *, + _cpp_buff **, unsigned *); +static cpp_macro *create_iso_definition (cpp_reader *); + +/* #define directive parsing and handling. 
*/ + +static cpp_macro *lex_expansion_token (cpp_reader *, cpp_macro *); +static bool parse_params (cpp_reader *, unsigned *, bool *); +static void check_trad_stringification (cpp_reader *, const cpp_macro *, + const cpp_string *); +static bool reached_end_of_context (cpp_context *); +static void consume_next_token_from_context (cpp_reader *pfile, + const cpp_token **, + location_t *); +static const cpp_token* cpp_get_token_1 (cpp_reader *, location_t *); + +static cpp_hashnode* macro_of_context (cpp_context *context); + +/* Statistical counter tracking the number of macros that got + expanded. */ +unsigned num_expanded_macros_counter = 0; +/* Statistical counter tracking the total number tokens resulting + from macro expansion. */ +unsigned num_macro_tokens_counter = 0; + +/* Wrapper around cpp_get_token to skip CPP_PADDING tokens + and not consume CPP_EOF. */ +static const cpp_token * +cpp_get_token_no_padding (cpp_reader *pfile) +{ + for (;;) + { + const cpp_token *ret = cpp_peek_token (pfile, 0); + if (ret->type == CPP_EOF) + return ret; + ret = cpp_get_token (pfile); + if (ret->type != CPP_PADDING) + return ret; + } +} + +/* Handle meeting "__has_include" builtin macro. */ + +static int +builtin_has_include (cpp_reader *pfile, cpp_hashnode *op, bool has_next) +{ + int result = 0; + + if (!pfile->state.in_directive) + cpp_error (pfile, CPP_DL_ERROR, + "\"%s\" used outside of preprocessing directive", + NODE_NAME (op)); + + pfile->state.angled_headers = true; + const cpp_token *token = cpp_get_token_no_padding (pfile); + bool paren = token->type == CPP_OPEN_PAREN; + if (paren) + token = cpp_get_token_no_padding (pfile); + else + cpp_error (pfile, CPP_DL_ERROR, + "missing '(' before \"%s\" operand", NODE_NAME (op)); + pfile->state.angled_headers = false; + + bool bracket = token->type != CPP_STRING; + char *fname = NULL; + if (token->type == CPP_STRING || token->type == CPP_HEADER_NAME) + { + fname = XNEWVEC (char, token->val.str.len - 1); + memcpy (fname, token->val.str.text + 1, token->val.str.len - 2); + fname[token->val.str.len - 2] = '\0'; + } + else if (token->type == CPP_LESS) + fname = _cpp_bracket_include (pfile); + else + cpp_error (pfile, CPP_DL_ERROR, + "operator \"%s\" requires a header-name", NODE_NAME (op)); + + if (fname) + { + /* Do not do the lookup if we're skipping, that's unnecessary + IO. */ + if (!pfile->state.skip_eval + && _cpp_has_header (pfile, fname, bracket, + has_next ? IT_INCLUDE_NEXT : IT_INCLUDE)) + result = 1; + + XDELETEVEC (fname); + } + + if (paren + && cpp_get_token_no_padding (pfile)->type != CPP_CLOSE_PAREN) + cpp_error (pfile, CPP_DL_ERROR, + "missing ')' after \"%s\" operand", NODE_NAME (op)); + + return result; +} + +/* Emits a warning if NODE is a macro defined in the main file that + has not been used. */ +int +_cpp_warn_if_unused_macro (cpp_reader *pfile, cpp_hashnode *node, + void *v ATTRIBUTE_UNUSED) +{ + if (cpp_user_macro_p (node)) + { + cpp_macro *macro = node->value.macro; + + if (!macro->used + && MAIN_FILE_P (linemap_check_ordinary + (linemap_lookup (pfile->line_table, + macro->line)))) + cpp_warning_with_line (pfile, CPP_W_UNUSED_MACROS, macro->line, 0, + "macro \"%s\" is not used", NODE_NAME (node)); + } + + return 1; +} + +/* Allocates and returns a CPP_STRING token, containing TEXT of length + LEN, after null-terminating it. TEXT must be in permanent storage. 
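+   As an aside, the __has_include handling a little further up supports
+   guarded uses such as (the header name is illustrative only):
+
+     #ifdef __has_include
+     #  if __has_include (<threads.h>)
+     #    include <threads.h>
+     #  endif
+     #endif
+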
*/ +static const cpp_token * +new_string_token (cpp_reader *pfile, unsigned char *text, unsigned int len) +{ + cpp_token *token = _cpp_temp_token (pfile); + + text[len] = '\0'; + token->type = CPP_STRING; + token->val.str.len = len; + token->val.str.text = text; + token->flags = 0; + return token; +} + +static const char * const monthnames[] = +{ + "Jan", "Feb", "Mar", "Apr", "May", "Jun", + "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" +}; + +/* Helper function for builtin_macro. Returns the text generated by + a builtin macro. */ +const uchar * +_cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node, + location_t loc) +{ + const uchar *result = NULL; + linenum_type number = 1; + + switch (node->value.builtin) + { + default: + cpp_error (pfile, CPP_DL_ICE, "invalid built-in macro \"%s\"", + NODE_NAME (node)); + break; + + case BT_TIMESTAMP: + { + if (CPP_OPTION (pfile, warn_date_time)) + cpp_warning (pfile, CPP_W_DATE_TIME, "macro \"%s\" might prevent " + "reproducible builds", NODE_NAME (node)); + + cpp_buffer *pbuffer = cpp_get_buffer (pfile); + if (pbuffer->timestamp == NULL) + { + /* Initialize timestamp value of the assotiated file. */ + struct _cpp_file *file = cpp_get_file (pbuffer); + if (file) + { + /* Generate __TIMESTAMP__ string, that represents + the date and time of the last modification + of the current source file. The string constant + looks like "Sun Sep 16 01:03:52 1973". */ + struct tm *tb = NULL; + struct stat *st = _cpp_get_file_stat (file); + if (st) + tb = localtime (&st->st_mtime); + if (tb) + { + char *str = asctime (tb); + size_t len = strlen (str); + unsigned char *buf = _cpp_unaligned_alloc (pfile, len + 2); + buf[0] = '"'; + strcpy ((char *) buf + 1, str); + buf[len] = '"'; + pbuffer->timestamp = buf; + } + else + { + cpp_errno (pfile, CPP_DL_WARNING, + "could not determine file timestamp"); + pbuffer->timestamp = UC"\"??? ??? ?? ??:??:?? ????\""; + } + } + } + result = pbuffer->timestamp; + } + break; + case BT_FILE: + case BT_FILE_NAME: + case BT_BASE_FILE: + { + unsigned int len; + const char *name; + uchar *buf; + + if (node->value.builtin == BT_FILE + || node->value.builtin == BT_FILE_NAME) + { + name = linemap_get_expansion_filename (pfile->line_table, + pfile->line_table->highest_line); + if ((node->value.builtin == BT_FILE_NAME) && name) + name = lbasename (name); + } + else + { + name = _cpp_get_file_name (pfile->main_file); + if (!name) + abort (); + } + if (pfile->cb.remap_filename) + name = pfile->cb.remap_filename (name); + len = strlen (name); + buf = _cpp_unaligned_alloc (pfile, len * 2 + 3); + result = buf; + *buf = '"'; + buf = cpp_quote_string (buf + 1, (const unsigned char *) name, len); + *buf++ = '"'; + *buf = '\0'; + } + break; + + case BT_INCLUDE_LEVEL: + /* The line map depth counts the primary source as level 1, but + historically __INCLUDE_DEPTH__ has called the primary source + level 0. */ + number = pfile->line_table->depth - 1; + break; + + case BT_SPECLINE: + /* If __LINE__ is embedded in a macro, it must expand to the + line of the macro's invocation, not its definition. + Otherwise things like assert() will not work properly. + See WG14 N1911, WG21 N4220 sec 6.5, and PR 61861. */ + if (CPP_OPTION (pfile, traditional)) + loc = pfile->line_table->highest_line; + else + loc = linemap_resolve_location (pfile->line_table, loc, + LRK_MACRO_EXPANSION_POINT, NULL); + number = linemap_get_expansion_line (pfile->line_table, loc); + break; + + /* __STDC__ has the value 1 under normal circumstances. 
+ However, if (a) we are in a system header, (b) the option + stdc_0_in_system_headers is true (set by target config), and + (c) we are not in strictly conforming mode, then it has the + value 0. (b) and (c) are already checked in cpp_init_builtins. */ + case BT_STDC: + if (_cpp_in_system_header (pfile)) + number = 0; + else + number = 1; + break; + + case BT_DATE: + case BT_TIME: + if (CPP_OPTION (pfile, warn_date_time)) + cpp_warning (pfile, CPP_W_DATE_TIME, "macro \"%s\" might prevent " + "reproducible builds", NODE_NAME (node)); + if (pfile->date == NULL) + { + /* Allocate __DATE__ and __TIME__ strings from permanent + storage. We only do this once, and don't generate them + at init time, because time() and localtime() are very + slow on some systems. */ + time_t tt; + auto kind = cpp_get_date (pfile, &tt); + + if (kind == CPP_time_kind::UNKNOWN) + { + cpp_errno (pfile, CPP_DL_WARNING, + "could not determine date and time"); + + pfile->date = UC"\"??? ?? ????\""; + pfile->time = UC"\"??:??:??\""; + } + else + { + struct tm *tb = (kind == CPP_time_kind::FIXED + ? gmtime : localtime) (&tt); + + pfile->date = _cpp_unaligned_alloc (pfile, + sizeof ("\"Oct 11 1347\"")); + sprintf ((char *) pfile->date, "\"%s %2d %4d\"", + monthnames[tb->tm_mon], tb->tm_mday, + tb->tm_year + 1900); + + pfile->time = _cpp_unaligned_alloc (pfile, + sizeof ("\"12:34:56\"")); + sprintf ((char *) pfile->time, "\"%02d:%02d:%02d\"", + tb->tm_hour, tb->tm_min, tb->tm_sec); + } + } + + if (node->value.builtin == BT_DATE) + result = pfile->date; + else + result = pfile->time; + break; + + case BT_COUNTER: + if (CPP_OPTION (pfile, directives_only) && pfile->state.in_directive) + cpp_error (pfile, CPP_DL_ERROR, + "__COUNTER__ expanded inside directive with -fdirectives-only"); + number = pfile->counter++; + break; + + case BT_HAS_ATTRIBUTE: + number = pfile->cb.has_attribute (pfile, false); + break; + + case BT_HAS_STD_ATTRIBUTE: + number = pfile->cb.has_attribute (pfile, true); + break; + + case BT_HAS_BUILTIN: + number = pfile->cb.has_builtin (pfile); + break; + + case BT_HAS_INCLUDE: + case BT_HAS_INCLUDE_NEXT: + number = builtin_has_include (pfile, node, + node->value.builtin == BT_HAS_INCLUDE_NEXT); + break; + + case BT_RT_ASSIGN: + result = evaluate_RT_ASSIGN(pfile); + break; + + case BT_RT_TO_ARG_LIST: + result = evaluate_RT_TO_ARG_LIST(pfile); + break; + + case BT_RT_TO_TOKEN_LIST: + result = evaluate_RT_TO_TOKEN_LIST(pfile); + break; + + case BT_RT_FIRST: + result = evaluate_RT_FIRST(pfile); + break; + + case BT_RT_REST: + result = evaluate_RT_REST(pfile); + break; + + case BT_RT_MAP: + result = evaluate_RT_MAP(pfile); + break; + + case BT_RT_AL_MAP: + result = evaluate_RT_AL_MAP(pfile); + break; + + case BT_RT_IF: + result = evaluate_RT_IF(pfile); + break; + + case BT_RT_NOT: + result = evaluate_RT_NOT(pfile); + break; + + case BT_RT_AND: + result = evaluate_RT_AND(pfile); + break; + + case BT_RT_OR: + result = evaluate_RT_OR(pfile); + break; + + case BT_RT_IS_IDENTIFIER: + result = evaluate_RT_IS_IDENTIFIER(pfile); + break; + + case BT_RT_IS_NAME: + result = evaluate_RT_IS_NAME(pfile); + break; + + case BT_RT_PASTE: + result = evaluate_RT_PASTE(pfile); + break; + + } + + if (result == NULL) + { + /* 21 bytes holds all NUL-terminated unsigned 64-bit numbers. */ + result = _cpp_unaligned_alloc (pfile, 21); + sprintf ((char *) result, "%u", number); + } + + return result; +} + +/* Get an idempotent date. Either the cached value, the value from + source epoch, or failing that, the value from time(2). 
Use this + during compilation so that every time stamp is the same. */ +CPP_time_kind +cpp_get_date (cpp_reader *pfile, time_t *result) +{ + if (!pfile->time_stamp_kind) + { + int kind = 0; + if (pfile->cb.get_source_date_epoch) + { + /* Try reading the fixed epoch. */ + pfile->time_stamp = pfile->cb.get_source_date_epoch (pfile); + if (pfile->time_stamp != time_t (-1)) + kind = int (CPP_time_kind::FIXED); + } + + if (!kind) + { + /* Pedantically time_t (-1) is a legitimate value for + "number of seconds since the Epoch". It is a silly + time. */ + errno = 0; + pfile->time_stamp = time (nullptr); + /* Annoyingly a library could legally set errno and return a + valid time! Bad library! */ + if (pfile->time_stamp == time_t (-1) && errno) + kind = errno; + else + kind = int (CPP_time_kind::DYNAMIC); + } + + pfile->time_stamp_kind = kind; + } + + *result = pfile->time_stamp; + if (pfile->time_stamp_kind >= 0) + { + errno = pfile->time_stamp_kind; + return CPP_time_kind::UNKNOWN; + } + + return CPP_time_kind (pfile->time_stamp_kind); +} + +/* Convert builtin macros like __FILE__ to a token and push it on the + context stack. Also handles _Pragma, for which a new token may not + be created. Returns 1 if it generates a new token context, 0 to + return the token to the caller. LOC is the location of the expansion + point of the macro. */ +static int +builtin_macro (cpp_reader *pfile, cpp_hashnode *node, + location_t loc, location_t expand_loc) +{ + const uchar *buf; + size_t len; + char *nbuf; + + if (node->value.builtin == BT_PRAGMA) + { + /* Don't interpret _Pragma within directives. The standard is + not clear on this, but to me this makes most sense. + Similarly, don't interpret _Pragma inside expand_args, we might + need to stringize it later on. */ + if (pfile->state.in_directive || pfile->state.ignore__Pragma) + return 0; + + return _cpp_do__Pragma (pfile, loc); + } + + buf = _cpp_builtin_macro_text (pfile, node, expand_loc); + len = ustrlen (buf); + nbuf = (char *) alloca (len + 1); + memcpy (nbuf, buf, len); + nbuf[len]='\n'; + + cpp_push_buffer (pfile, (uchar *) nbuf, len, /* from_stage3 */ true); + _cpp_clean_line (pfile); + + /* Set pfile->cur_token as required by _cpp_lex_direct. */ + pfile->cur_token = _cpp_temp_token (pfile); + cpp_token *token = _cpp_lex_direct (pfile); + /* We should point to the expansion point of the builtin macro. */ + token->src_loc = loc; + if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED) + { + /* We are tracking tokens resulting from macro expansion. + Create a macro line map and generate a virtual location for + the token resulting from the expansion of the built-in + macro. */ + location_t *virt_locs = NULL; + _cpp_buff *token_buf = tokens_buff_new (pfile, 1, &virt_locs); + const line_map_macro * map = + linemap_enter_macro (pfile->line_table, node, loc, 1); + tokens_buff_add_token (token_buf, virt_locs, token, + pfile->line_table->builtin_location, + pfile->line_table->builtin_location, + map, /*macro_token_index=*/0); + push_extended_tokens_context (pfile, node, token_buf, virt_locs, + (const cpp_token **)token_buf->base, + 1); + } + else + _cpp_push_token_context (pfile, NULL, token, 1); + if (pfile->buffer->cur != pfile->buffer->rlimit) + cpp_error (pfile, CPP_DL_ICE, "invalid built-in macro \"%s\"", + NODE_NAME (node)); + _cpp_pop_buffer (pfile); + + return 1; +} + +/* Copies SRC, of length LEN, to DEST, adding backslashes before all + backslashes and double quotes. DEST must be of sufficient size. + Returns a pointer to the end of the string. 
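+   For instance, when the # operator stringizes an argument that is itself
+   a string literal, this routine supplies the inner escapes (STR is an
+   illustrative name):
+
+     #define STR(x) #x
+     STR ("a\n")     ->  "\"a\\n\""
+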
*/ +uchar * +cpp_quote_string (uchar *dest, const uchar *src, unsigned int len) +{ + while (len--) + { + uchar c = *src++; + + switch (c) + { + case '\n': + /* Naked LF can appear in raw string literals */ + c = 'n'; + /* FALLTHROUGH */ + + case '\\': + case '"': + *dest++ = '\\'; + /* FALLTHROUGH */ + + default: + *dest++ = c; + } + } + + return dest; +} + +/* Convert a token sequence FIRST to FIRST+COUNT-1 to a single string token + according to the rules of the ISO C #-operator. */ +static const cpp_token * +stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count) +{ + unsigned char *dest; + unsigned int i, escape_it, backslash_count = 0; + const cpp_token *source = NULL; + size_t len; + + if (BUFF_ROOM (pfile->u_buff) < 3) + _cpp_extend_buff (pfile, &pfile->u_buff, 3); + dest = BUFF_FRONT (pfile->u_buff); + *dest++ = '"'; + + /* Loop, reading in the argument's tokens. */ + for (i = 0; i < count; i++) + { + const cpp_token *token = first[i]; + + if (token->type == CPP_PADDING) + { + if (source == NULL + || (!(source->flags & PREV_WHITE) + && token->val.source == NULL)) + source = token->val.source; + continue; + } + + escape_it = (token->type == CPP_STRING || token->type == CPP_CHAR + || token->type == CPP_WSTRING || token->type == CPP_WCHAR + || token->type == CPP_STRING32 || token->type == CPP_CHAR32 + || token->type == CPP_STRING16 || token->type == CPP_CHAR16 + || token->type == CPP_UTF8STRING || token->type == CPP_UTF8CHAR + || cpp_userdef_string_p (token->type) + || cpp_userdef_char_p (token->type)); + + /* Room for each char being written in octal, initial space and + final quote and NUL. */ + len = cpp_token_len (token); + if (escape_it) + len *= 4; + len += 3; + + if ((size_t) (BUFF_LIMIT (pfile->u_buff) - dest) < len) + { + size_t len_so_far = dest - BUFF_FRONT (pfile->u_buff); + _cpp_extend_buff (pfile, &pfile->u_buff, len); + dest = BUFF_FRONT (pfile->u_buff) + len_so_far; + } + + /* Leading white space? */ + if (dest - 1 != BUFF_FRONT (pfile->u_buff)) + { + if (source == NULL) + source = token; + if (source->flags & PREV_WHITE) + *dest++ = ' '; + } + source = NULL; + + if (escape_it) + { + _cpp_buff *buff = _cpp_get_buff (pfile, len); + unsigned char *buf = BUFF_FRONT (buff); + len = cpp_spell_token (pfile, token, buf, true) - buf; + dest = cpp_quote_string (dest, buf, len); + _cpp_release_buff (pfile, buff); + } + else + dest = cpp_spell_token (pfile, token, dest, true); + + if (token->type == CPP_OTHER && token->val.str.text[0] == '\\') + backslash_count++; + else + backslash_count = 0; + } + + /* Ignore the final \ of invalid string literals. */ + if (backslash_count & 1) + { + cpp_error (pfile, CPP_DL_WARNING, + "invalid string literal, ignoring final '\\'"); + dest--; + } + + /* Commit the memory, including NUL, and return the token. */ + *dest++ = '"'; + len = dest - BUFF_FRONT (pfile->u_buff); + BUFF_FRONT (pfile->u_buff) = dest + 1; + return new_string_token (pfile, dest - len, len); +} + +/* Try to paste two tokens. On success, return nonzero. In any + case, PLHS is updated to point to the pasted token, which is + guaranteed to not have the PASTE_LEFT flag set. LOCATION is + the virtual location used for error reporting. 
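+   For example (CAT is an illustrative name):
+
+     #define CAT(a, b)  a ## b
+     CAT (foo, 1)   ->  foo1       a valid identifier results
+     CAT (x, +)     ->  error:  "x+" is not a single preprocessing token
+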
*/ +static bool +paste_tokens (cpp_reader *pfile, location_t location, + const cpp_token **plhs, const cpp_token *rhs) +{ + unsigned char *buf, *end, *lhsend; + cpp_token *lhs; + unsigned int len; + + len = cpp_token_len (*plhs) + cpp_token_len (rhs) + 2; + buf = (unsigned char *) alloca (len); + end = lhsend = cpp_spell_token (pfile, *plhs, buf, true); + + /* Avoid comment headers, since they are still processed in stage 3. + It is simpler to insert a space here, rather than modifying the + lexer to ignore comments in some circumstances. Simply returning + false doesn't work, since we want to clear the PASTE_LEFT flag. */ + if ((*plhs)->type == CPP_DIV && rhs->type != CPP_EQ) + *end++ = ' '; + /* In one obscure case we might see padding here. */ + if (rhs->type != CPP_PADDING) + end = cpp_spell_token (pfile, rhs, end, true); + *end = '\n'; + + cpp_push_buffer (pfile, buf, end - buf, /* from_stage3 */ true); + _cpp_clean_line (pfile); + + /* Set pfile->cur_token as required by _cpp_lex_direct. */ + pfile->cur_token = _cpp_temp_token (pfile); + lhs = _cpp_lex_direct (pfile); + if (pfile->buffer->cur != pfile->buffer->rlimit) + { + location_t saved_loc = lhs->src_loc; + + _cpp_pop_buffer (pfile); + + unsigned char *rhsstart = lhsend; + if ((*plhs)->type == CPP_DIV && rhs->type != CPP_EQ) + rhsstart++; + + /* We have to remove the PASTE_LEFT flag from the old lhs, but + we want to keep the new location. */ + *lhs = **plhs; + *plhs = lhs; + lhs->src_loc = saved_loc; + lhs->flags &= ~PASTE_LEFT; + + /* Mandatory error for all apart from assembler. */ + if (CPP_OPTION (pfile, lang) != CLK_ASM) + cpp_error_with_line (pfile, CPP_DL_ERROR, location, 0, + "pasting \"%.*s\" and \"%.*s\" does not give " + "a valid preprocessing token", + (int) (lhsend - buf), buf, + (int) (end - rhsstart), rhsstart); + return false; + } + + lhs->flags |= (*plhs)->flags & (PREV_WHITE | PREV_FALLTHROUGH); + *plhs = lhs; + _cpp_pop_buffer (pfile); + return true; +} + +/* Handles an arbitrarily long sequence of ## operators, with initial + operand LHS. This implementation is left-associative, + non-recursive, and finishes a paste before handling succeeding + ones. If a paste fails, we back up to the RHS of the failing ## + operator before pushing the context containing the result of prior + successful pastes, with the effect that the RHS appears in the + output stream after the pasted LHS normally. */ +static void +paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs) +{ + const cpp_token *rhs = NULL; + cpp_context *context = pfile->context; + location_t virt_loc = 0; + + /* We are expanding a macro and we must have been called on a token + that appears at the left hand side of a ## operator. */ + if (macro_of_context (pfile->context) == NULL + || (!(lhs->flags & PASTE_LEFT))) + abort (); + + if (context->tokens_kind == TOKENS_KIND_EXTENDED) + /* The caller must have called consume_next_token_from_context + right before calling us. That has incremented the pointer to + the current virtual location. So it now points to the location + of the token that comes right after *LHS. We want the + resulting pasted token to have the location of the current + *LHS, though. */ + virt_loc = context->c.mc->cur_virt_loc[-1]; + else + /* We are not tracking macro expansion. So the best virtual + location we can get here is the expansion point of the macro we + are currently expanding. */ + virt_loc = pfile->invocation_location; + + do + { + /* Take the token directly from the current context. 
We can do + this, because we are in the replacement list of either an + object-like macro, or a function-like macro with arguments + inserted. In either case, the constraints to #define + guarantee we have at least one more token. */ + if (context->tokens_kind == TOKENS_KIND_DIRECT) + rhs = FIRST (context).token++; + else if (context->tokens_kind == TOKENS_KIND_INDIRECT) + rhs = *FIRST (context).ptoken++; + else if (context->tokens_kind == TOKENS_KIND_EXTENDED) + { + /* So we are in presence of an extended token context, which + means that each token in this context has a virtual + location attached to it. So let's not forget to update + the pointer to the current virtual location of the + current token when we update the pointer to the current + token */ + + rhs = *FIRST (context).ptoken++; + /* context->c.mc must be non-null, as if we were not in a + macro context, context->tokens_kind could not be equal to + TOKENS_KIND_EXTENDED. */ + context->c.mc->cur_virt_loc++; + } + + if (rhs->type == CPP_PADDING) + { + if (rhs->flags & PASTE_LEFT) + abort (); + } + if (!paste_tokens (pfile, virt_loc, &lhs, rhs)) + { + _cpp_backup_tokens (pfile, 1); + break; + } + } + while (rhs->flags & PASTE_LEFT); + + /* Put the resulting token in its own context. */ + if (context->tokens_kind == TOKENS_KIND_EXTENDED) + { + location_t *virt_locs = NULL; + _cpp_buff *token_buf = tokens_buff_new (pfile, 1, &virt_locs); + tokens_buff_add_token (token_buf, virt_locs, lhs, + virt_loc, 0, NULL, 0); + push_extended_tokens_context (pfile, context->c.mc->macro_node, + token_buf, virt_locs, + (const cpp_token **)token_buf->base, 1); + } + else + _cpp_push_token_context (pfile, NULL, lhs, 1); +} + +/* Returns TRUE if the number of arguments ARGC supplied in an + invocation of the MACRO referenced by NODE is valid. An empty + invocation to a macro with no parameters should pass ARGC as zero. + + Note that MACRO cannot necessarily be deduced from NODE, in case + NODE was redefined whilst collecting arguments. */ +bool +_cpp_arguments_ok (cpp_reader *pfile, cpp_macro *macro, const cpp_hashnode *node, unsigned int argc) +{ + if (argc == macro->paramc) + return true; + + if (argc < macro->paramc) + { + /* In C++20 (here the va_opt flag is used), and also as a GNU + extension, variadic arguments are allowed to not appear in + the invocation at all. + e.g. #define debug(format, args...) something + debug("string"); + + This is exactly the same as if an empty variadic list had been + supplied - debug("string", ). */ + + if (argc + 1 == macro->paramc && macro->variadic) + { + if (CPP_PEDANTIC (pfile) && ! macro->syshdr + && ! CPP_OPTION (pfile, va_opt)) + { + if (CPP_OPTION (pfile, cplusplus)) + cpp_error (pfile, CPP_DL_PEDWARN, + "ISO C++11 requires at least one argument " + "for the \"...\" in a variadic macro"); + else + cpp_error (pfile, CPP_DL_PEDWARN, + "ISO C99 requires at least one argument " + "for the \"...\" in a variadic macro"); + } + return true; + } + + cpp_error (pfile, CPP_DL_ERROR, + "macro \"%s\" requires %u arguments, but only %u given", + NODE_NAME (node), macro->paramc, argc); + } + else + cpp_error (pfile, CPP_DL_ERROR, + "macro \"%s\" passed %u arguments, but takes just %u", + NODE_NAME (node), argc, macro->paramc); + + if (macro->line > RESERVED_LOCATION_COUNT) + cpp_error_at (pfile, CPP_DL_NOTE, macro->line, "macro \"%s\" defined here", + NODE_NAME (node)); + + return false; +} + +/* Reads and returns the arguments to a function-like macro + invocation. Assumes the opening parenthesis has been processed. 
+ If there is an error, emits an appropriate diagnostic and returns + NULL. Each argument is terminated by a CPP_EOF token, for the + future benefit of expand_arg(). If there are any deferred + #pragma directives among macro arguments, store pointers to the + CPP_PRAGMA ... CPP_PRAGMA_EOL tokens into *PRAGMA_BUFF buffer. + + What is returned is the buffer that contains the memory allocated + to hold the macro arguments. NODE is the name of the macro this + function is dealing with. If NUM_ARGS is non-NULL, *NUM_ARGS is + set to the actual number of macro arguments allocated in the + returned buffer. */ +static _cpp_buff * +collect_args (cpp_reader *pfile, const cpp_hashnode *node, + _cpp_buff **pragma_buff, unsigned *num_args) +{ + _cpp_buff *buff, *base_buff; + cpp_macro *macro; + macro_arg *args, *arg; + const cpp_token *token; + unsigned int argc; + location_t virt_loc; + bool track_macro_expansion_p = CPP_OPTION (pfile, track_macro_expansion); + unsigned num_args_alloced = 0; + + macro = node->value.macro; + if (macro->paramc) + argc = macro->paramc; + else + argc = 1; + +#define DEFAULT_NUM_TOKENS_PER_MACRO_ARG 50 +#define ARG_TOKENS_EXTENT 1000 + + buff = _cpp_get_buff (pfile, argc * (DEFAULT_NUM_TOKENS_PER_MACRO_ARG + * sizeof (cpp_token *) + + sizeof (macro_arg))); + base_buff = buff; + args = (macro_arg *) buff->base; + memset (args, 0, argc * sizeof (macro_arg)); + buff->cur = (unsigned char *) &args[argc]; + arg = args, argc = 0; + + /* Collect the tokens making up each argument. We don't yet know + how many arguments have been supplied, whether too many or too + few. Hence the slightly bizarre usage of "argc" and "arg". */ + do + { + unsigned int paren_depth = 0; + unsigned int ntokens = 0; + unsigned virt_locs_capacity = DEFAULT_NUM_TOKENS_PER_MACRO_ARG; + num_args_alloced++; + + argc++; + arg->first = (const cpp_token **) buff->cur; + if (track_macro_expansion_p) + { + virt_locs_capacity = DEFAULT_NUM_TOKENS_PER_MACRO_ARG; + arg->virt_locs = XNEWVEC (location_t, + virt_locs_capacity); + } + + for (;;) + { + /* Require space for 2 new tokens (including a CPP_EOF). */ + if ((unsigned char *) &arg->first[ntokens + 2] > buff->limit) + { + buff = _cpp_append_extend_buff (pfile, buff, + ARG_TOKENS_EXTENT + * sizeof (cpp_token *)); + arg->first = (const cpp_token **) buff->cur; + } + if (track_macro_expansion_p + && (ntokens + 2 > virt_locs_capacity)) + { + virt_locs_capacity += ARG_TOKENS_EXTENT; + arg->virt_locs = XRESIZEVEC (location_t, + arg->virt_locs, + virt_locs_capacity); + } + + token = cpp_get_token_1 (pfile, &virt_loc); + + if (token->type == CPP_PADDING) + { + /* Drop leading padding. */ + if (ntokens == 0) + continue; + } + else if (token->type == CPP_OPEN_PAREN) + paren_depth++; + else if (token->type == CPP_CLOSE_PAREN) + { + if (paren_depth-- == 0) + break; + } + else if (token->type == CPP_COMMA) + { + /* A comma does not terminate an argument within + parentheses or as part of a variable argument. */ + if (paren_depth == 0 + && ! (macro->variadic && argc == macro->paramc)) + break; + } + else if (token->type == CPP_EOF + || (token->type == CPP_HASH && token->flags & BOL)) + break; + else if (token->type == CPP_PRAGMA && !(token->flags & PRAGMA_OP)) + { + cpp_token *newtok = _cpp_temp_token (pfile); + + /* CPP_PRAGMA token lives in directive_result, which will + be overwritten on the next directive. 
*/ + *newtok = *token; + token = newtok; + do + { + if (*pragma_buff == NULL + || BUFF_ROOM (*pragma_buff) < sizeof (cpp_token *)) + { + _cpp_buff *next; + if (*pragma_buff == NULL) + *pragma_buff + = _cpp_get_buff (pfile, 32 * sizeof (cpp_token *)); + else + { + next = *pragma_buff; + *pragma_buff + = _cpp_get_buff (pfile, + (BUFF_FRONT (*pragma_buff) + - (*pragma_buff)->base) * 2); + (*pragma_buff)->next = next; + } + } + *(const cpp_token **) BUFF_FRONT (*pragma_buff) = token; + BUFF_FRONT (*pragma_buff) += sizeof (cpp_token *); + if (token->type == CPP_PRAGMA_EOL) + break; + token = cpp_get_token_1 (pfile, &virt_loc); + } + while (token->type != CPP_EOF); + + /* In deferred pragmas parsing_args and prevent_expansion + had been changed, reset it. */ + pfile->state.parsing_args = 2; + pfile->state.prevent_expansion = 1; + + if (token->type == CPP_EOF) + break; + else + continue; + } + set_arg_token (arg, token, virt_loc, + ntokens, MACRO_ARG_TOKEN_NORMAL, + CPP_OPTION (pfile, track_macro_expansion)); + ntokens++; + } + + /* Drop trailing padding. */ + while (ntokens > 0 && arg->first[ntokens - 1]->type == CPP_PADDING) + ntokens--; + + arg->count = ntokens; + /* Append an EOF to mark end-of-argument. */ + set_arg_token (arg, &pfile->endarg, token->src_loc, + ntokens, MACRO_ARG_TOKEN_NORMAL, + CPP_OPTION (pfile, track_macro_expansion)); + + /* Terminate the argument. Excess arguments loop back and + overwrite the final legitimate argument, before failing. */ + if (argc <= macro->paramc) + { + buff->cur = (unsigned char *) &arg->first[ntokens + 1]; + if (argc != macro->paramc) + arg++; + } + } + while (token->type != CPP_CLOSE_PAREN && token->type != CPP_EOF); + + if (token->type == CPP_EOF) + { + /* Unless the EOF is marking the end of an argument, it's a fake + one from the end of a file that _cpp_clean_line will not have + advanced past. */ + if (token == &pfile->endarg) + _cpp_backup_tokens (pfile, 1); + cpp_error (pfile, CPP_DL_ERROR, + "unterminated argument list invoking macro \"%s\"", + NODE_NAME (node)); + } + else + { + /* A single empty argument is counted as no argument. */ + if (argc == 1 && macro->paramc == 0 && args[0].count == 0) + argc = 0; + if (_cpp_arguments_ok (pfile, macro, node, argc)) + { + /* GCC has special semantics for , ## b where b is a varargs + parameter: we remove the comma if b was omitted entirely. + If b was merely an empty argument, the comma is retained. + If the macro takes just one (varargs) parameter, then we + retain the comma only if we are standards conforming. + + If FIRST is NULL replace_args () swallows the comma. */ + if (macro->variadic && (argc < macro->paramc + || (argc == 1 && args[0].count == 0 + && !CPP_OPTION (pfile, std)))) + args[macro->paramc - 1].first = NULL; + if (num_args) + *num_args = num_args_alloced; + return base_buff; + } + } + + /* An error occurred. */ + _cpp_release_buff (pfile, base_buff); + return NULL; +} + +/* Search for an opening parenthesis to the macro of NODE, in such a + way that, if none is found, we don't lose the information in any + intervening padding tokens. If we find the parenthesis, collect + the arguments and return the buffer containing them. PRAGMA_BUFF + argument is the same as in collect_args. If NUM_ARGS is non-NULL, + *NUM_ARGS is set to the number of arguments contained in the + returned buffer. 
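+   The search matters because a function-like macro name that is not
+   followed by an open parenthesis is left alone, e.g. (min is used here
+   only as an illustration):
+
+     #define min(a, b)  ((a) < (b) ? (a) : (b))
+     z = min (x, y);     followed by '(' : expanded as a macro call
+     ptr = min;          not followed by '(' : the name is left untouched
+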
*/ +static _cpp_buff * +funlike_invocation_p (cpp_reader *pfile, cpp_hashnode *node, + _cpp_buff **pragma_buff, unsigned *num_args) +{ + const cpp_token *token, *padding = NULL; + + for (;;) + { + token = cpp_get_token (pfile); + if (token->type != CPP_PADDING) + break; + gcc_assert ((token->flags & PREV_WHITE) == 0); + if (padding == NULL + || padding->val.source == NULL + || (!(padding->val.source->flags & PREV_WHITE) + && token->val.source == NULL)) + padding = token; + } + + if (token->type == CPP_OPEN_PAREN) + { + pfile->state.parsing_args = 2; + return collect_args (pfile, node, pragma_buff, num_args); + } + + /* Back up. A CPP_EOF is either an EOF from an argument we're + expanding, or a fake one from lex_direct. We want to backup the + former, but not the latter. We may have skipped padding, in + which case backing up more than one token when expanding macros + is in general too difficult. We re-insert it in its own + context. */ + if (token->type != CPP_EOF || token == &pfile->endarg) + { + _cpp_backup_tokens (pfile, 1); + if (padding) + _cpp_push_token_context (pfile, NULL, padding, 1); + } + + return NULL; +} + +/* Return the real number of tokens in the expansion of MACRO. */ +static inline unsigned int +macro_real_token_count (const cpp_macro *macro) +{ + if (__builtin_expect (!macro->extra_tokens, true)) + return macro->count; + + for (unsigned i = macro->count; i--;) + if (macro->exp.tokens[i].type != CPP_PASTE) + return i + 1; + + return 0; +} + +/* Push the context of a macro with hash entry NODE onto the context + stack. If we can successfully expand the macro, we push a context + containing its yet-to-be-rescanned replacement list and return one. + If there were additionally any unexpanded deferred #pragma + directives among macro arguments, push another context containing + the pragma tokens before the yet-to-be-rescanned replacement list + and return two. Otherwise, we don't push a context and return + zero. LOCATION is the location of the expansion point of the + macro. */ +static int +enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, + const cpp_token *result, location_t location) +{ + /* The presence of a macro invalidates a file's controlling macro. */ + pfile->mi_valid = false; + + pfile->state.angled_headers = false; + + /* From here to when we push the context for the macro later down + this function, we need to flag the fact that we are about to + expand a macro. This is useful when -ftrack-macro-expansion is + turned off. In that case, we need to record the location of the + expansion point of the top-most macro we are about to to expand, + into pfile->invocation_location. But we must not record any such + location once the process of expanding the macro starts; that is, + we must not do that recording between now and later down this + function where set this flag to FALSE. */ + pfile->about_to_expand_macro_p = true; + + if (cpp_user_macro_p (node)) + { + cpp_macro *macro = node->value.macro; + _cpp_buff *pragma_buff = NULL; + + if (macro->fun_like) + { + _cpp_buff *buff; + unsigned num_args = 0; + + pfile->state.prevent_expansion++; + pfile->keep_tokens++; + pfile->state.parsing_args = 1; + buff = funlike_invocation_p (pfile, node, &pragma_buff, + &num_args); + pfile->state.parsing_args = 0; + pfile->keep_tokens--; + pfile->state.prevent_expansion--; + + if (buff == NULL) + { + if (CPP_WTRADITIONAL (pfile) && ! 
node->value.macro->syshdr) + cpp_warning (pfile, CPP_W_TRADITIONAL, + "function-like macro \"%s\" must be used with arguments in traditional C", + NODE_NAME (node)); + + if (pragma_buff) + _cpp_release_buff (pfile, pragma_buff); + + pfile->about_to_expand_macro_p = false; + return 0; + } + + if (macro->paramc > 0) + replace_args (pfile, node, macro, + (macro_arg *) buff->base, + location); + /* Free the memory used by the arguments of this + function-like macro. This memory has been allocated by + funlike_invocation_p and by replace_args. */ + delete_macro_args (buff, num_args); + } + + /* Disable the macro within its expansion. */ + node->flags |= NODE_DISABLED; + + /* Laziness can only affect the expansion tokens of the macro, + not its fun-likeness or parameters. */ + _cpp_maybe_notify_macro_use (pfile, node, location); + if (pfile->cb.used) + pfile->cb.used (pfile, location, node); + + macro->used = 1; + + if (macro->paramc == 0) + { + unsigned tokens_count = macro_real_token_count (macro); + if (CPP_OPTION (pfile, track_macro_expansion)) + { + unsigned int i; + const cpp_token *src = macro->exp.tokens; + const line_map_macro *map; + location_t *virt_locs = NULL; + _cpp_buff *macro_tokens + = tokens_buff_new (pfile, tokens_count, &virt_locs); + + /* Create a macro map to record the locations of the + tokens that are involved in the expansion. LOCATION + is the location of the macro expansion point. */ + map = linemap_enter_macro (pfile->line_table, + node, location, tokens_count); + for (i = 0; i < tokens_count; ++i) + { + tokens_buff_add_token (macro_tokens, virt_locs, + src, src->src_loc, + src->src_loc, map, i); + ++src; + } + push_extended_tokens_context (pfile, node, + macro_tokens, + virt_locs, + (const cpp_token **) + macro_tokens->base, + tokens_count); + } + else + _cpp_push_token_context (pfile, node, macro->exp.tokens, + tokens_count); + num_macro_tokens_counter += tokens_count; + } + + if (pragma_buff) + { + if (!pfile->state.in_directive) + _cpp_push_token_context (pfile, NULL, + padding_token (pfile, result), 1); + do + { + unsigned tokens_count; + _cpp_buff *tail = pragma_buff->next; + pragma_buff->next = NULL; + tokens_count = ((const cpp_token **) BUFF_FRONT (pragma_buff) + - (const cpp_token **) pragma_buff->base); + push_ptoken_context (pfile, NULL, pragma_buff, + (const cpp_token **) pragma_buff->base, + tokens_count); + pragma_buff = tail; + if (!CPP_OPTION (pfile, track_macro_expansion)) + num_macro_tokens_counter += tokens_count; + + } + while (pragma_buff != NULL); + pfile->about_to_expand_macro_p = false; + return 2; + } + + pfile->about_to_expand_macro_p = false; + return 1; + } + + pfile->about_to_expand_macro_p = false; + /* Handle built-in macros and the _Pragma operator. */ + { + location_t expand_loc; + + if (/* The top-level macro invocation that triggered the expansion + we are looking at is with a function-like user macro ... */ + cpp_fun_like_macro_p (pfile->top_most_macro_node) + /* ... and we are tracking the macro expansion. */ + && CPP_OPTION (pfile, track_macro_expansion)) + /* Then the location of the end of the macro invocation is the + location of the expansion point of this macro. */ + expand_loc = location; + else + /* Otherwise, the location of the end of the macro invocation is + the location of the expansion point of that top-level macro + invocation. 
*/ + expand_loc = pfile->invocation_location; + + return builtin_macro (pfile, node, location, expand_loc); + } +} + +/* De-allocate the memory used by BUFF which is an array of instances + of macro_arg. NUM_ARGS is the number of instances of macro_arg + present in BUFF. */ +static void +delete_macro_args (_cpp_buff *buff, unsigned num_args) +{ + macro_arg *macro_args; + unsigned i; + + if (buff == NULL) + return; + + macro_args = (macro_arg *) buff->base; + + /* Walk instances of macro_arg to free their expanded tokens as well + as their macro_arg::virt_locs members. */ + for (i = 0; i < num_args; ++i) + { + if (macro_args[i].expanded) + { + free (macro_args[i].expanded); + macro_args[i].expanded = NULL; + } + if (macro_args[i].virt_locs) + { + free (macro_args[i].virt_locs); + macro_args[i].virt_locs = NULL; + } + if (macro_args[i].expanded_virt_locs) + { + free (macro_args[i].expanded_virt_locs); + macro_args[i].expanded_virt_locs = NULL; + } + } + _cpp_free_buff (buff); +} + +/* Set the INDEXth token of the macro argument ARG. TOKEN is the token + to set, LOCATION is its virtual location. "Virtual" location means + the location that encodes loci across macro expansion. Otherwise + it has to be TOKEN->SRC_LOC. KIND is the kind of tokens the + argument ARG is supposed to contain. Note that ARG must be + tailored so that it has enough room to contain INDEX + 1 numbers of + tokens, at least. */ +static void +set_arg_token (macro_arg *arg, const cpp_token *token, + location_t location, size_t index, + enum macro_arg_token_kind kind, + bool track_macro_exp_p) +{ + const cpp_token **token_ptr; + location_t *loc = NULL; + + token_ptr = + arg_token_ptr_at (arg, index, kind, + track_macro_exp_p ? &loc : NULL); + *token_ptr = token; + + if (loc != NULL) + { + /* We can't set the location of a stringified argument + token and we can't set any location if we aren't tracking + macro expansion locations. */ + gcc_checking_assert (kind != MACRO_ARG_TOKEN_STRINGIFIED + && track_macro_exp_p); + *loc = location; + } +} + +/* Get the pointer to the location of the argument token of the + function-like macro argument ARG. This function must be called + only when we -ftrack-macro-expansion is on. */ +static const location_t * +get_arg_token_location (const macro_arg *arg, + enum macro_arg_token_kind kind) +{ + const location_t *loc = NULL; + const cpp_token **token_ptr = + arg_token_ptr_at (arg, 0, kind, (location_t **) &loc); + + if (token_ptr == NULL) + return NULL; + + return loc; +} + +/* Return the pointer to the INDEXth token of the macro argument ARG. + KIND specifies the kind of token the macro argument ARG contains. + If VIRT_LOCATION is non NULL, *VIRT_LOCATION is set to the address + of the virtual location of the returned token if the + -ftrack-macro-expansion flag is on; otherwise, it's set to the + spelling location of the returned token. */ +static const cpp_token ** +arg_token_ptr_at (const macro_arg *arg, size_t index, + enum macro_arg_token_kind kind, + location_t **virt_location) +{ + const cpp_token **tokens_ptr = NULL; + + switch (kind) + { + case MACRO_ARG_TOKEN_NORMAL: + tokens_ptr = arg->first; + break; + case MACRO_ARG_TOKEN_STRINGIFIED: + tokens_ptr = (const cpp_token **) &arg->stringified; + break; + case MACRO_ARG_TOKEN_EXPANDED: + tokens_ptr = arg->expanded; + break; + } + + if (tokens_ptr == NULL) + /* This can happen for e.g, an empty token argument to a + funtion-like macro. 
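+   For instance (ID is an illustrative name):
+
+     #define ID(x) x
+     ID ()          one argument is passed, and it contains no tokens
+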
*/ + return tokens_ptr; + + if (virt_location) + { + if (kind == MACRO_ARG_TOKEN_NORMAL) + *virt_location = &arg->virt_locs[index]; + else if (kind == MACRO_ARG_TOKEN_EXPANDED) + *virt_location = &arg->expanded_virt_locs[index]; + else if (kind == MACRO_ARG_TOKEN_STRINGIFIED) + *virt_location = + (location_t *) &tokens_ptr[index]->src_loc; + } + return &tokens_ptr[index]; +} + +/* Initialize an iterator so that it iterates over the tokens of a + function-like macro argument. KIND is the kind of tokens we want + ITER to iterate over. TOKEN_PTR points the first token ITER will + iterate over. */ +static void +macro_arg_token_iter_init (macro_arg_token_iter *iter, + bool track_macro_exp_p, + enum macro_arg_token_kind kind, + const macro_arg *arg, + const cpp_token **token_ptr) +{ + iter->track_macro_exp_p = track_macro_exp_p; + iter->kind = kind; + iter->token_ptr = token_ptr; + /* Unconditionally initialize this so that the compiler doesn't warn + about iter->location_ptr being possibly uninitialized later after + this code has been inlined somewhere. */ + iter->location_ptr = NULL; + if (track_macro_exp_p) + iter->location_ptr = get_arg_token_location (arg, kind); +#if CHECKING_P + iter->num_forwards = 0; + if (track_macro_exp_p + && token_ptr != NULL + && iter->location_ptr == NULL) + abort (); +#endif +} + +/* Move the iterator one token forward. Note that if IT was + initialized on an argument that has a stringified token, moving it + forward doesn't make sense as a stringified token is essentially one + string. */ +static void +macro_arg_token_iter_forward (macro_arg_token_iter *it) +{ + switch (it->kind) + { + case MACRO_ARG_TOKEN_NORMAL: + case MACRO_ARG_TOKEN_EXPANDED: + it->token_ptr++; + if (it->track_macro_exp_p) + it->location_ptr++; + break; + case MACRO_ARG_TOKEN_STRINGIFIED: +#if CHECKING_P + if (it->num_forwards > 0) + abort (); +#endif + break; + } + +#if CHECKING_P + it->num_forwards++; +#endif +} + +/* Return the token pointed to by the iterator. */ +static const cpp_token * +macro_arg_token_iter_get_token (const macro_arg_token_iter *it) +{ +#if CHECKING_P + if (it->kind == MACRO_ARG_TOKEN_STRINGIFIED + && it->num_forwards > 0) + abort (); +#endif + if (it->token_ptr == NULL) + return NULL; + return *it->token_ptr; +} + +/* Return the location of the token pointed to by the iterator.*/ +static location_t +macro_arg_token_iter_get_location (const macro_arg_token_iter *it) +{ +#if CHECKING_P + if (it->kind == MACRO_ARG_TOKEN_STRINGIFIED + && it->num_forwards > 0) + abort (); +#endif + if (it->track_macro_exp_p) + return *it->location_ptr; + else + return (*it->token_ptr)->src_loc; +} + +/* Return the index of a token [resulting from macro expansion] inside + the total list of tokens resulting from a given macro + expansion. The index can be different depending on whether if we + want each tokens resulting from function-like macro arguments + expansion to have a different location or not. + + E.g, consider this function-like macro: + + #define M(x) x - 3 + + Then consider us "calling" it (and thus expanding it) like: + + M(1+4) + + It will be expanded into: + + 1+4-3 + + Let's consider the case of the token '4'. + + Its index can be 2 (it's the third token of the set of tokens + resulting from the expansion) or it can be 0 if we consider that + all tokens resulting from the expansion of the argument "1+2" have + the same index, which is 0. In this later case, the index of token + '-' would then be 1 and the index of token '3' would be 2. 
+ + The later case is useful to use less memory e.g, for the case of + the user using the option -ftrack-macro-expansion=1. + + ABSOLUTE_TOKEN_INDEX is the index of the macro argument token we + are interested in. CUR_REPLACEMENT_TOKEN is the token of the macro + parameter (inside the macro replacement list) that corresponds to + the macro argument for which ABSOLUTE_TOKEN_INDEX is a token index + of. + + If we refer to the example above, for the '4' argument token, + ABSOLUTE_TOKEN_INDEX would be set to 2, and CUR_REPLACEMENT_TOKEN + would be set to the token 'x', in the replacement list "x - 3" of + macro M. + + This is a subroutine of replace_args. */ +inline static unsigned +expanded_token_index (cpp_reader *pfile, cpp_macro *macro, + const cpp_token *cur_replacement_token, + unsigned absolute_token_index) +{ + if (CPP_OPTION (pfile, track_macro_expansion) > 1) + return absolute_token_index; + return cur_replacement_token - macro->exp.tokens; +} + +/* Copy whether PASTE_LEFT is set from SRC to *PASTE_FLAG. */ + +static void +copy_paste_flag (cpp_reader *pfile, const cpp_token **paste_flag, + const cpp_token *src) +{ + cpp_token *token = _cpp_temp_token (pfile); + token->type = (*paste_flag)->type; + token->val = (*paste_flag)->val; + if (src->flags & PASTE_LEFT) + token->flags = (*paste_flag)->flags | PASTE_LEFT; + else + token->flags = (*paste_flag)->flags & ~PASTE_LEFT; + *paste_flag = token; +} + +/* True IFF the last token emitted into BUFF (if any) is PTR. */ + +static bool +last_token_is (_cpp_buff *buff, const cpp_token **ptr) +{ + return (ptr && tokens_buff_last_token_ptr (buff) == ptr); +} + +/* Replace the parameters in a function-like macro of NODE with the + actual ARGS, and place the result in a newly pushed token context. + Expand each argument before replacing, unless it is operated upon + by the # or ## operators. EXPANSION_POINT_LOC is the location of + the expansion point of the macro. E.g, the location of the + function-like macro invocation. */ +static void +replace_args (cpp_reader *pfile, cpp_hashnode *node, cpp_macro *macro, + macro_arg *args, location_t expansion_point_loc) +{ + unsigned int i, total; + const cpp_token *src, *limit; + const cpp_token **first = NULL; + macro_arg *arg; + _cpp_buff *buff = NULL; + location_t *virt_locs = NULL; + unsigned int exp_count; + const line_map_macro *map = NULL; + int track_macro_exp; + + /* First, fully macro-expand arguments, calculating the number of + tokens in the final expansion as we go. The ordering of the if + statements below is subtle; we must handle stringification before + pasting. */ + + /* EXP_COUNT is the number of tokens in the macro replacement + list. TOTAL is the number of tokens /after/ macro parameters + have been replaced by their arguments. */ + exp_count = macro_real_token_count (macro); + total = exp_count; + limit = macro->exp.tokens + exp_count; + + for (src = macro->exp.tokens; src < limit; src++) + if (src->type == CPP_MACRO_ARG) + { + /* Leading and trailing padding tokens. */ + total += 2; + /* Account for leading and padding tokens in exp_count too. + This is going to be important later down this function, + when we want to handle the case of (track_macro_exp < + 2). */ + exp_count += 2; + + /* We have an argument. If it is not being stringified or + pasted it is macro-replaced before insertion. 
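+   A short illustration (the names are chosen only for this note):
+
+     #define N 3
+     #define STR(x)  #x
+     #define USE(x)  x
+     STR (N)   ->  "N"     the argument is not expanded before #
+     USE (N)   ->  3       the argument is fully expanded, then inserted
+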
*/ + arg = &args[src->val.macro_arg.arg_no - 1]; + + if (src->flags & STRINGIFY_ARG) + { + if (!arg->stringified) + arg->stringified = stringify_arg (pfile, arg->first, arg->count); + } + else if ((src->flags & PASTE_LEFT) + || (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT))) + total += arg->count - 1; + else + { + if (!arg->expanded) + expand_arg (pfile, arg); + total += arg->expanded_count - 1; + } + } + + /* When the compiler is called with the -ftrack-macro-expansion + flag, we need to keep track of the location of each token that + results from macro expansion. + + A token resulting from macro expansion is not a new token. It is + simply the same token as the token coming from the macro + definition. The new things that are allocated are the buffer + that holds the tokens resulting from macro expansion and a new + location that records many things like the locus of the expansion + point as well as the original locus inside the definition of the + macro. This location is called a virtual location. + + So the buffer BUFF holds a set of cpp_token*, and the buffer + VIRT_LOCS holds the virtual locations of the tokens held by BUFF. + + Both of these two buffers are going to be hung off of the macro + context, when the latter is pushed. The memory allocated to + store the tokens and their locations is going to be freed once + the context of macro expansion is popped. + + As far as tokens are concerned, the memory overhead of + -ftrack-macro-expansion is proportional to the number of + macros that get expanded multiplied by sizeof (location_t). + The good news is that extra memory gets freed when the macro + context is freed, i.e shortly after the macro got expanded. */ + + /* Is the -ftrack-macro-expansion flag in effect? */ + track_macro_exp = CPP_OPTION (pfile, track_macro_expansion); + + /* Now allocate memory space for tokens and locations resulting from + the macro expansion, copy the tokens and replace the arguments. + This memory must be freed when the context of the macro MACRO is + popped. */ + buff = tokens_buff_new (pfile, total, track_macro_exp ? &virt_locs : NULL); + + first = (const cpp_token **) buff->base; + + /* Create a macro map to record the locations of the tokens that are + involved in the expansion. Note that the expansion point is set + to the location of the closing parenthesis. Otherwise, the + subsequent map created for the first token that comes after the + macro map might have a wrong line number. That would lead to + tokens with wrong line numbers after the macro expansion. This + adds up to the memory overhead of the -ftrack-macro-expansion + flag; for every macro that is expanded, a "macro map" is + created. */ + if (track_macro_exp) + { + int num_macro_tokens = total; + if (track_macro_exp < 2) + /* Then the number of macro tokens won't take in account the + fact that function-like macro arguments can expand to + multiple tokens. This is to save memory at the expense of + accuracy. + + Suppose we have #define SQUARE(A) A * A + + And then we do SQUARE(2+3) + + Then the tokens 2, +, 3, will have the same location, + saying they come from the expansion of the argument A. 
*/ + num_macro_tokens = exp_count; + map = linemap_enter_macro (pfile->line_table, node, + expansion_point_loc, + num_macro_tokens); + } + i = 0; + vaopt_state vaopt_tracker (pfile, macro->variadic, &args[macro->paramc - 1]); + const cpp_token **vaopt_start = NULL; + for (src = macro->exp.tokens; src < limit; src++) + { + unsigned int arg_tokens_count; + macro_arg_token_iter from; + const cpp_token **paste_flag = NULL; + const cpp_token **tmp_token_ptr; + + /* __VA_OPT__ handling. */ + vaopt_state::update_type vostate = vaopt_tracker.update (src); + if (__builtin_expect (vostate != vaopt_state::INCLUDE, false)) + { + if (vostate == vaopt_state::BEGIN) + { + /* Padding on the left of __VA_OPT__ (unless RHS of ##). */ + if (src != macro->exp.tokens && !(src[-1].flags & PASTE_LEFT)) + { + const cpp_token *t = padding_token (pfile, src); + unsigned index = expanded_token_index (pfile, macro, src, i); + /* Allocate a virtual location for the padding token and + append the token and its location to BUFF and + VIRT_LOCS. */ + tokens_buff_add_token (buff, virt_locs, t, + t->src_loc, t->src_loc, + map, index); + } + vaopt_start = tokens_buff_last_token_ptr (buff); + } + else if (vostate == vaopt_state::END) + { + const cpp_token **start = vaopt_start; + vaopt_start = NULL; + + paste_flag = tokens_buff_last_token_ptr (buff); + + if (vaopt_tracker.stringify ()) + { + unsigned int count + = start ? paste_flag - start : tokens_buff_count (buff); + const cpp_token **first + = start ? start + 1 + : (const cpp_token **) (buff->base); + unsigned int i, j; + + /* Paste any tokens that need to be pasted before calling + stringify_arg, because stringify_arg uses pfile->u_buff + which paste_tokens can use as well. */ + for (i = 0, j = 0; i < count; i++, j++) + { + const cpp_token *token = first[i]; + + if (token->flags & PASTE_LEFT) + { + location_t virt_loc = pfile->invocation_location; + const cpp_token *rhs; + do + { + if (i == count) + abort (); + rhs = first[++i]; + if (!paste_tokens (pfile, virt_loc, &token, rhs)) + { + --i; + break; + } + } + while (rhs->flags & PASTE_LEFT); + } + + first[j] = token; + } + if (j != i) + { + while (i-- != j) + tokens_buff_remove_last_token (buff); + count = j; + } + + const cpp_token *t = stringify_arg (pfile, first, count); + while (count--) + tokens_buff_remove_last_token (buff); + if (src->flags & PASTE_LEFT) + copy_paste_flag (pfile, &t, src); + tokens_buff_add_token (buff, virt_locs, + t, t->src_loc, t->src_loc, + NULL, 0); + continue; + } + if (start && paste_flag == start && (*start)->flags & PASTE_LEFT) + /* If __VA_OPT__ expands to nothing (either because __VA_ARGS__ + is empty or because it is __VA_OPT__() ), drop PASTE_LEFT + flag from previous token. */ + copy_paste_flag (pfile, start, &pfile->avoid_paste); + if (src->flags & PASTE_LEFT) + { + /* Don't avoid paste after all. */ + while (paste_flag && paste_flag != start + && *paste_flag == &pfile->avoid_paste) + { + tokens_buff_remove_last_token (buff); + paste_flag = tokens_buff_last_token_ptr (buff); + } + + /* With a non-empty __VA_OPT__ on the LHS of ##, the last + token should be flagged PASTE_LEFT. */ + if (paste_flag && (*paste_flag)->type != CPP_PADDING) + copy_paste_flag (pfile, paste_flag, src); + } + else + { + /* Otherwise, avoid paste on RHS, __VA_OPT__(c)d or + __VA_OPT__(c)__VA_OPT__(d). 
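+   The stringification branch above covers uses such as (F is an
+   illustrative name):
+
+     #define F(...)  #__VA_OPT__(args given)
+     F ()        ->  ""
+     F (1, 2)    ->  "args given"
+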
*/ + const cpp_token *t = &pfile->avoid_paste; + tokens_buff_add_token (buff, virt_locs, + t, t->src_loc, t->src_loc, + NULL, 0); + } + } + continue; + } + + if (src->type != CPP_MACRO_ARG) + { + /* Allocate a virtual location for token SRC, and add that + token and its virtual location into the buffers BUFF and + VIRT_LOCS. */ + unsigned index = expanded_token_index (pfile, macro, src, i); + tokens_buff_add_token (buff, virt_locs, src, + src->src_loc, src->src_loc, + map, index); + i += 1; + continue; + } + + paste_flag = 0; + arg = &args[src->val.macro_arg.arg_no - 1]; + /* SRC is a macro parameter that we need to replace with its + corresponding argument. So at some point we'll need to + iterate over the tokens of the macro argument and copy them + into the "place" now holding the correspondig macro + parameter. We are going to use the iterator type + macro_argo_token_iter to handle that iterating. The 'if' + below is to initialize the iterator depending on the type of + tokens the macro argument has. It also does some adjustment + related to padding tokens and some pasting corner cases. */ + if (src->flags & STRINGIFY_ARG) + { + arg_tokens_count = 1; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_STRINGIFIED, + arg, &arg->stringified); + } + else if (src->flags & PASTE_LEFT) + { + arg_tokens_count = arg->count; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_NORMAL, + arg, arg->first); + } + else if (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)) + { + int num_toks; + arg_tokens_count = arg->count; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_NORMAL, + arg, arg->first); + + num_toks = tokens_buff_count (buff); + + if (num_toks != 0) + { + /* So the current parameter token is pasted to the previous + token in the replacement list. Let's look at what + we have as previous and current arguments. */ + + /* This is the previous argument's token ... */ + tmp_token_ptr = tokens_buff_last_token_ptr (buff); + + if ((*tmp_token_ptr)->type == CPP_COMMA + && macro->variadic + && src->val.macro_arg.arg_no == macro->paramc) + { + /* ... which is a comma; and the current parameter + is the last parameter of a variadic function-like + macro. If the argument to the current last + parameter is NULL, then swallow the comma, + otherwise drop the paste flag. */ + if (macro_arg_token_iter_get_token (&from) == NULL) + tokens_buff_remove_last_token (buff); + else + paste_flag = tmp_token_ptr; + } + /* Remove the paste flag if the RHS is a placemarker. */ + else if (arg_tokens_count == 0) + paste_flag = tmp_token_ptr; + } + } + else + { + arg_tokens_count = arg->expanded_count; + macro_arg_token_iter_init (&from, + CPP_OPTION (pfile, + track_macro_expansion), + MACRO_ARG_TOKEN_EXPANDED, + arg, arg->expanded); + + if (last_token_is (buff, vaopt_start)) + { + /* We're expanding an arg at the beginning of __VA_OPT__. + Skip padding. */ + while (arg_tokens_count) + { + const cpp_token *t = macro_arg_token_iter_get_token (&from); + if (t->type != CPP_PADDING) + break; + macro_arg_token_iter_forward (&from); + --arg_tokens_count; + } + } + } + + /* Padding on the left of an argument (unless RHS of ##). 
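+
+     (An aside, not from the upstream comment: the padding token records
+     whether whitespace separated the argument from the preceding token,
+     so that the textual output of, say,
+
+       #define NEG(x) -x
+       NEG(-1)
+
+     reads "- -1" rather than "--1", which would re-lex as the single
+     token "--" followed by "1".  When the parameter is the right-hand
+     side of ##, the tokens are meant to be pasted, so no padding is
+     wanted there.)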
*/ + if ((!pfile->state.in_directive || pfile->state.directive_wants_padding) + && src != macro->exp.tokens + && !(src[-1].flags & PASTE_LEFT) + && !last_token_is (buff, vaopt_start)) + { + const cpp_token *t = padding_token (pfile, src); + unsigned index = expanded_token_index (pfile, macro, src, i); + /* Allocate a virtual location for the padding token and + append the token and its location to BUFF and + VIRT_LOCS. */ + tokens_buff_add_token (buff, virt_locs, t, + t->src_loc, t->src_loc, + map, index); + } + + if (arg_tokens_count) + { + /* So now we've got the number of tokens that make up the + argument that is going to replace the current parameter + in the macro's replacement list. */ + unsigned int j; + for (j = 0; j < arg_tokens_count; ++j) + { + /* So if track_macro_exp is < 2, the user wants to + save extra memory while tracking macro expansion + locations. So in that case here is what we do: + + Suppose we have #define SQUARE(A) A * A + + And then we do SQUARE(2+3) + + Then the tokens 2, +, 3, will have the same location, + saying they come from the expansion of the argument + A. + + So that means we are going to ignore the COUNT tokens + resulting from the expansion of the current macro + argument. In other words all the ARG_TOKENS_COUNT tokens + resulting from the expansion of the macro argument will + have the index I. Normally, each of those tokens should + have index I+J. */ + unsigned token_index = i; + unsigned index; + if (track_macro_exp > 1) + token_index += j; + + index = expanded_token_index (pfile, macro, src, token_index); + const cpp_token *tok = macro_arg_token_iter_get_token (&from); + tokens_buff_add_token (buff, virt_locs, tok, + macro_arg_token_iter_get_location (&from), + src->src_loc, map, index); + macro_arg_token_iter_forward (&from); + } + + /* With a non-empty argument on the LHS of ##, the last + token should be flagged PASTE_LEFT. */ + if (src->flags & PASTE_LEFT) + paste_flag + = (const cpp_token **) tokens_buff_last_token_ptr (buff); + } + else if (CPP_PEDANTIC (pfile) && ! CPP_OPTION (pfile, c99) + && ! macro->syshdr && ! _cpp_in_system_header (pfile)) + { + if (CPP_OPTION (pfile, cplusplus)) + cpp_pedwarning (pfile, CPP_W_PEDANTIC, + "invoking macro %s argument %d: " + "empty macro arguments are undefined" + " in ISO C++98", + NODE_NAME (node), src->val.macro_arg.arg_no); + else if (CPP_OPTION (pfile, cpp_warn_c90_c99_compat)) + cpp_pedwarning (pfile, + CPP_OPTION (pfile, cpp_warn_c90_c99_compat) > 0 + ? CPP_W_C90_C99_COMPAT : CPP_W_PEDANTIC, + "invoking macro %s argument %d: " + "empty macro arguments are undefined" + " in ISO C90", + NODE_NAME (node), src->val.macro_arg.arg_no); + } + else if (CPP_OPTION (pfile, cpp_warn_c90_c99_compat) > 0 + && ! CPP_OPTION (pfile, cplusplus) + && ! macro->syshdr && ! _cpp_in_system_header (pfile)) + cpp_warning (pfile, CPP_W_C90_C99_COMPAT, + "invoking macro %s argument %d: " + "empty macro arguments are undefined" + " in ISO C90", + NODE_NAME (node), src->val.macro_arg.arg_no); + + /* Avoid paste on RHS (even case count == 0). */ + if (!pfile->state.in_directive && !(src->flags & PASTE_LEFT)) + { + const cpp_token *t = &pfile->avoid_paste; + tokens_buff_add_token (buff, virt_locs, + t, t->src_loc, t->src_loc, + NULL, 0); + } + + /* Add a new paste flag, or remove an unwanted one. 
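+
+     (Illustration, not from the upstream comment: for
+
+       #define CAT(a, b) a ## b
+       CAT(foo, bar)      expands to the single token foobar
+
+     the last token substituted for "a" must carry PASTE_LEFT so that it
+     is pasted with the first token substituted for "b"; conversely the
+     flag is dropped when the right-hand side turns out to be a
+     placemarker, e.g. CAT(foo,) which expands to just foo.)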
*/ + if (paste_flag) + copy_paste_flag (pfile, paste_flag, src); + + i += arg_tokens_count; + } + + if (track_macro_exp) + push_extended_tokens_context (pfile, node, buff, virt_locs, first, + tokens_buff_count (buff)); + else + push_ptoken_context (pfile, node, buff, first, + tokens_buff_count (buff)); + + num_macro_tokens_counter += tokens_buff_count (buff); +} + +/* Return a special padding token, with padding inherited from SOURCE. */ +static const cpp_token * +padding_token (cpp_reader *pfile, const cpp_token *source) +{ + cpp_token *result = _cpp_temp_token (pfile); + + result->type = CPP_PADDING; + + /* Data in GCed data structures cannot be made const so far, so we + need a cast here. */ + result->val.source = (cpp_token *) source; + result->flags = 0; + return result; +} + +/* Get a new uninitialized context. Create a new one if we cannot + re-use an old one. */ +static cpp_context * +next_context (cpp_reader *pfile) +{ + cpp_context *result = pfile->context->next; + + if (result == 0) + { + result = XNEW (cpp_context); + memset (result, 0, sizeof (cpp_context)); + result->prev = pfile->context; + result->next = 0; + pfile->context->next = result; + } + + pfile->context = result; + return result; +} + +/* Push a list of pointers to tokens. */ +static void +push_ptoken_context (cpp_reader *pfile, cpp_hashnode *macro, _cpp_buff *buff, + const cpp_token **first, unsigned int count) +{ + cpp_context *context = next_context (pfile); + + context->tokens_kind = TOKENS_KIND_INDIRECT; + context->c.macro = macro; + context->buff = buff; + FIRST (context).ptoken = first; + LAST (context).ptoken = first + count; +} + +/* Push a list of tokens. + + A NULL macro means that we should continue the current macro + expansion, in essence. That means that if we are currently in a + macro expansion context, we'll make the new pfile->context refer to + the current macro. */ +void +_cpp_push_token_context (cpp_reader *pfile, cpp_hashnode *macro, + const cpp_token *first, unsigned int count) +{ + cpp_context *context; + + if (macro == NULL) + macro = macro_of_context (pfile->context); + + context = next_context (pfile); + context->tokens_kind = TOKENS_KIND_DIRECT; + context->c.macro = macro; + context->buff = NULL; + FIRST (context).token = first; + LAST (context).token = first + count; +} + +/* Build a context containing a list of tokens as well as their + virtual locations and push it. TOKENS_BUFF is the buffer that + contains the tokens pointed to by FIRST. If TOKENS_BUFF is + non-NULL, it means that the context owns it, meaning that + _cpp_pop_context will free it as well as VIRT_LOCS_BUFF that + contains the virtual locations. + + A NULL macro means that we should continue the current macro + expansion, in essence. That means that if we are currently in a + macro expansion context, we'll make the new pfile->context refer to + the current macro. 
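+
+   For instance, expand_arg below pushes such a context with a NULL
+   macro while pre-expanding the tokens of a macro argument, so those
+   tokens remain attributed to the macro invocation that is currently
+   being expanded.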
*/ +static void +push_extended_tokens_context (cpp_reader *pfile, + cpp_hashnode *macro, + _cpp_buff *token_buff, + location_t *virt_locs, + const cpp_token **first, + unsigned int count) +{ + cpp_context *context; + macro_context *m; + + if (macro == NULL) + macro = macro_of_context (pfile->context); + + context = next_context (pfile); + context->tokens_kind = TOKENS_KIND_EXTENDED; + context->buff = token_buff; + + m = XNEW (macro_context); + m->macro_node = macro; + m->virt_locs = virt_locs; + m->cur_virt_loc = virt_locs; + context->c.mc = m; + FIRST (context).ptoken = first; + LAST (context).ptoken = first + count; +} + +/* Push a traditional macro's replacement text. */ +void +_cpp_push_text_context (cpp_reader *pfile, cpp_hashnode *macro, + const uchar *start, size_t len) +{ + cpp_context *context = next_context (pfile); + + context->tokens_kind = TOKENS_KIND_DIRECT; + context->c.macro = macro; + context->buff = NULL; + CUR (context) = start; + RLIMIT (context) = start + len; + macro->flags |= NODE_DISABLED; +} + +/* Creates a buffer that holds tokens a.k.a "token buffer", usually + for the purpose of storing them on a cpp_context. If VIRT_LOCS is + non-null (which means that -ftrack-macro-expansion is on), + *VIRT_LOCS is set to a newly allocated buffer that is supposed to + hold the virtual locations of the tokens resulting from macro + expansion. */ +static _cpp_buff* +tokens_buff_new (cpp_reader *pfile, size_t len, + location_t **virt_locs) +{ + size_t tokens_size = len * sizeof (cpp_token *); + size_t locs_size = len * sizeof (location_t); + + if (virt_locs != NULL) + *virt_locs = XNEWVEC (location_t, locs_size); + return _cpp_get_buff (pfile, tokens_size); +} + +/* Returns the number of tokens contained in a token buffer. The + buffer holds a set of cpp_token*. */ +static size_t +tokens_buff_count (_cpp_buff *buff) +{ + return (BUFF_FRONT (buff) - buff->base) / sizeof (cpp_token *); +} + +/* Return a pointer to the last token contained in the token buffer + BUFF. */ +static const cpp_token ** +tokens_buff_last_token_ptr (_cpp_buff *buff) +{ + if (BUFF_FRONT (buff) == buff->base) + return NULL; + return &((const cpp_token **) BUFF_FRONT (buff))[-1]; +} + +/* Remove the last token contained in the token buffer TOKENS_BUFF. + If VIRT_LOCS_BUFF is non-NULL, it should point at the buffer + containing the virtual locations of the tokens in TOKENS_BUFF; in + which case the function updates that buffer as well. */ +static inline void +tokens_buff_remove_last_token (_cpp_buff *tokens_buff) + +{ + if (BUFF_FRONT (tokens_buff) > tokens_buff->base) + BUFF_FRONT (tokens_buff) = + (unsigned char *) &((cpp_token **) BUFF_FRONT (tokens_buff))[-1]; +} + +/* Insert a token into the token buffer at the position pointed to by + DEST. Note that the buffer is not enlarged so the previous token + that was at *DEST is overwritten. VIRT_LOC_DEST, if non-null, + means -ftrack-macro-expansion is effect; it then points to where to + insert the virtual location of TOKEN. TOKEN is the token to + insert. VIRT_LOC is the virtual location of the token, i.e, the + location possibly encoding its locus across macro expansion. If + TOKEN is an argument of a function-like macro (inside a macro + replacement list), PARM_DEF_LOC is the spelling location of the + macro parameter that TOKEN is replacing, in the replacement list of + the macro. If TOKEN is not an argument of a function-like macro or + if it doesn't come from a macro expansion, then VIRT_LOC can just + be set to the same value as PARM_DEF_LOC. 
If MAP is non null, it + means TOKEN comes from a macro expansion and MAP is the macro map + associated to the macro. MACRO_TOKEN_INDEX points to the index of + the token in the macro map; it is not considered if MAP is NULL. + + Upon successful completion this function returns the a pointer to + the position of the token coming right after the insertion + point. */ +static inline const cpp_token ** +tokens_buff_put_token_to (const cpp_token **dest, + location_t *virt_loc_dest, + const cpp_token *token, + location_t virt_loc, + location_t parm_def_loc, + const line_map_macro *map, + unsigned int macro_token_index) +{ + location_t macro_loc = virt_loc; + const cpp_token **result; + + if (virt_loc_dest) + { + /* -ftrack-macro-expansion is on. */ + if (map) + macro_loc = linemap_add_macro_token (map, macro_token_index, + virt_loc, parm_def_loc); + *virt_loc_dest = macro_loc; + } + *dest = token; + result = &dest[1]; + + return result; +} + +/* Adds a token at the end of the tokens contained in BUFFER. Note + that this function doesn't enlarge BUFFER when the number of tokens + reaches BUFFER's size; it aborts in that situation. + + TOKEN is the token to append. VIRT_LOC is the virtual location of + the token, i.e, the location possibly encoding its locus across + macro expansion. If TOKEN is an argument of a function-like macro + (inside a macro replacement list), PARM_DEF_LOC is the location of + the macro parameter that TOKEN is replacing. If TOKEN doesn't come + from a macro expansion, then VIRT_LOC can just be set to the same + value as PARM_DEF_LOC. If MAP is non null, it means TOKEN comes + from a macro expansion and MAP is the macro map associated to the + macro. MACRO_TOKEN_INDEX points to the index of the token in the + macro map; It is not considered if MAP is NULL. If VIRT_LOCS is + non-null, it means -ftrack-macro-expansion is on; in which case + this function adds the virtual location DEF_LOC to the VIRT_LOCS + array, at the same index as the one of TOKEN in BUFFER. Upon + successful completion this function returns the a pointer to the + position of the token coming right after the insertion point. */ +static const cpp_token ** +tokens_buff_add_token (_cpp_buff *buffer, + location_t *virt_locs, + const cpp_token *token, + location_t virt_loc, + location_t parm_def_loc, + const line_map_macro *map, + unsigned int macro_token_index) +{ + const cpp_token **result; + location_t *virt_loc_dest = NULL; + unsigned token_index = + (BUFF_FRONT (buffer) - buffer->base) / sizeof (cpp_token *); + + /* Abort if we pass the end the buffer. */ + if (BUFF_FRONT (buffer) > BUFF_LIMIT (buffer)) + abort (); + + if (virt_locs != NULL) + virt_loc_dest = &virt_locs[token_index]; + + result = + tokens_buff_put_token_to ((const cpp_token **) BUFF_FRONT (buffer), + virt_loc_dest, token, virt_loc, parm_def_loc, + map, macro_token_index); + + BUFF_FRONT (buffer) = (unsigned char *) result; + return result; +} + +/* Allocate space for the function-like macro argument ARG to store + the tokens resulting from the macro-expansion of the tokens that + make up ARG itself. That space is allocated in ARG->expanded and + needs to be freed using free. 
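+
+   (Aside, for illustration: "expanded" here means the complete macro
+   expansion of the argument's own tokens.  Given
+
+     #define ONE 1
+     #define ID(x) x
+     ID(ONE)
+
+   the argument ONE is pre-expanded to 1 before being substituted for
+   the parameter x, except where the parameter is an operand of # or
+   ##.)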
*/ +static void +alloc_expanded_arg_mem (cpp_reader *pfile, macro_arg *arg, size_t capacity) +{ + gcc_checking_assert (arg->expanded == NULL + && arg->expanded_virt_locs == NULL); + + arg->expanded = XNEWVEC (const cpp_token *, capacity); + if (CPP_OPTION (pfile, track_macro_expansion)) + arg->expanded_virt_locs = XNEWVEC (location_t, capacity); + +} + +/* If necessary, enlarge ARG->expanded to so that it can contain SIZE + tokens. */ +static void +ensure_expanded_arg_room (cpp_reader *pfile, macro_arg *arg, + size_t size, size_t *expanded_capacity) +{ + if (size <= *expanded_capacity) + return; + + size *= 2; + + arg->expanded = + XRESIZEVEC (const cpp_token *, arg->expanded, size); + *expanded_capacity = size; + + if (CPP_OPTION (pfile, track_macro_expansion)) + { + if (arg->expanded_virt_locs == NULL) + arg->expanded_virt_locs = XNEWVEC (location_t, size); + else + arg->expanded_virt_locs = XRESIZEVEC (location_t, + arg->expanded_virt_locs, + size); + } +} + +/* + Expand an argument ARG before replacing parameters in a + function-like macro. This works by pushing a context with the + argument's tokens, and then expanding that into a temporary buffer + as if it were a normal part of the token stream. collect_args() + has terminated the argument's tokens with a CPP_EOF so that we know + when we have fully expanded the argument. + */ +static void +expand_arg (cpp_reader *pfile, macro_arg *arg) +{ + size_t capacity; + bool saved_warn_trad; + bool track_macro_exp_p = CPP_OPTION (pfile, track_macro_expansion); + bool saved_ignore__Pragma; + + if (arg->count == 0 + || arg->expanded != NULL) + return; + + /* Don't warn about funlike macros when pre-expanding. */ + saved_warn_trad = CPP_WTRADITIONAL (pfile); + CPP_WTRADITIONAL (pfile) = 0; + + /* Loop, reading in the tokens of the argument. */ + capacity = 256; + alloc_expanded_arg_mem (pfile, arg, capacity); + + if (track_macro_exp_p) + push_extended_tokens_context (pfile, NULL, NULL, + arg->virt_locs, + arg->first, + arg->count + 1); + else + push_ptoken_context (pfile, NULL, NULL, + arg->first, arg->count + 1); + + saved_ignore__Pragma = pfile->state.ignore__Pragma; + pfile->state.ignore__Pragma = 1; + + for (;;) + { + const cpp_token *token; + location_t location; + + ensure_expanded_arg_room (pfile, arg, arg->expanded_count + 1, + &capacity); + + token = cpp_get_token_1 (pfile, &location); + + if (token->type == CPP_EOF) + break; + + set_arg_token (arg, token, location, + arg->expanded_count, MACRO_ARG_TOKEN_EXPANDED, + CPP_OPTION (pfile, track_macro_expansion)); + arg->expanded_count++; + } + + _cpp_pop_context (pfile); + + CPP_WTRADITIONAL (pfile) = saved_warn_trad; + pfile->state.ignore__Pragma = saved_ignore__Pragma; +} + +/* Returns the macro associated to the current context if we are in + the context a macro expansion, NULL otherwise. */ +static cpp_hashnode* +macro_of_context (cpp_context *context) +{ + if (context == NULL) + return NULL; + + return (context->tokens_kind == TOKENS_KIND_EXTENDED) + ? context->c.mc->macro_node + : context->c.macro; +} + +/* Return TRUE iff we are expanding a macro or are about to start + expanding one. If we are effectively expanding a macro, the + function macro_of_context returns a pointer to the macro being + expanded. 
*/ +static bool +in_macro_expansion_p (cpp_reader *pfile) +{ + if (pfile == NULL) + return false; + + return (pfile->about_to_expand_macro_p + || macro_of_context (pfile->context)); +} + +/* Pop the current context off the stack, re-enabling the macro if the + context represented a macro's replacement list. Initially the + context structure was not freed so that we can re-use it later, but + now we do free it to reduce peak memory consumption. */ +void +_cpp_pop_context (cpp_reader *pfile) +{ + cpp_context *context = pfile->context; + + /* We should not be popping the base context. */ + gcc_assert (context != &pfile->base_context); + + if (context->c.macro) + { + cpp_hashnode *macro; + if (context->tokens_kind == TOKENS_KIND_EXTENDED) + { + macro_context *mc = context->c.mc; + macro = mc->macro_node; + /* If context->buff is set, it means the life time of tokens + is bound to the life time of this context; so we must + free the tokens; that means we must free the virtual + locations of these tokens too. */ + if (context->buff && mc->virt_locs) + { + free (mc->virt_locs); + mc->virt_locs = NULL; + } + free (mc); + context->c.mc = NULL; + } + else + macro = context->c.macro; + + /* Beware that MACRO can be NULL in cases like when we are + called from expand_arg. In those cases, a dummy context with + tokens is pushed just for the purpose of walking them using + cpp_get_token_1. In that case, no 'macro' field is set into + the dummy context. */ + if (macro != NULL + /* Several contiguous macro expansion contexts can be + associated to the same macro; that means it's the same + macro expansion that spans across all these (sub) + contexts. So we should re-enable an expansion-disabled + macro only when we are sure we are really out of that + macro expansion. */ + && macro_of_context (context->prev) != macro) + macro->flags &= ~NODE_DISABLED; + + if (macro == pfile->top_most_macro_node && context->prev == NULL) + /* We are popping the context of the top-most macro node. */ + pfile->top_most_macro_node = NULL; + } + + if (context->buff) + { + /* Decrease memory peak consumption by freeing the memory used + by the context. */ + _cpp_free_buff (context->buff); + } + + pfile->context = context->prev; + /* decrease peak memory consumption by feeing the context. */ + pfile->context->next = NULL; + free (context); +} + +/* Return TRUE if we reached the end of the set of tokens stored in + CONTEXT, FALSE otherwise. */ +static inline bool +reached_end_of_context (cpp_context *context) +{ + if (context->tokens_kind == TOKENS_KIND_DIRECT) + return FIRST (context).token == LAST (context).token; + else if (context->tokens_kind == TOKENS_KIND_INDIRECT + || context->tokens_kind == TOKENS_KIND_EXTENDED) + return FIRST (context).ptoken == LAST (context).ptoken; + else + abort (); +} + +/* Consume the next token contained in the current context of PFILE, + and return it in *TOKEN. It's "full location" is returned in + *LOCATION. If -ftrack-macro-location is in effeect, fFull location" + means the location encoding the locus of the token across macro + expansion; otherwise it's just is the "normal" location of the + token which (*TOKEN)->src_loc. 
*/ +static inline void +consume_next_token_from_context (cpp_reader *pfile, + const cpp_token ** token, + location_t *location) +{ + cpp_context *c = pfile->context; + + if ((c)->tokens_kind == TOKENS_KIND_DIRECT) + { + *token = FIRST (c).token; + *location = (*token)->src_loc; + FIRST (c).token++; + } + else if ((c)->tokens_kind == TOKENS_KIND_INDIRECT) + { + *token = *FIRST (c).ptoken; + *location = (*token)->src_loc; + FIRST (c).ptoken++; + } + else if ((c)->tokens_kind == TOKENS_KIND_EXTENDED) + { + macro_context *m = c->c.mc; + *token = *FIRST (c).ptoken; + if (m->virt_locs) + { + *location = *m->cur_virt_loc; + m->cur_virt_loc++; + } + else + *location = (*token)->src_loc; + FIRST (c).ptoken++; + } + else + abort (); +} + +/* In the traditional mode of the preprocessor, if we are currently in + a directive, the location of a token must be the location of the + start of the directive line. This function returns the proper + location if we are in the traditional mode, and just returns + LOCATION otherwise. */ + +static inline location_t +maybe_adjust_loc_for_trad_cpp (cpp_reader *pfile, location_t location) +{ + if (CPP_OPTION (pfile, traditional)) + { + if (pfile->state.in_directive) + return pfile->directive_line; + } + return location; +} + +/* Routine to get a token as well as its location. + + Macro expansions and directives are transparently handled, + including entering included files. Thus tokens are post-macro + expansion, and after any intervening directives. External callers + see CPP_EOF only at EOF. Internal callers also see it when meeting + a directive inside a macro call, when at the end of a directive and + state.in_directive is still 1, and at the end of argument + pre-expansion. + + LOC is an out parameter; *LOC is set to the location "as expected + by the user". Please read the comment of + cpp_get_token_with_location to learn more about the meaning of this + location. */ +static const cpp_token* +cpp_get_token_1 (cpp_reader *pfile, location_t *location) +{ + const cpp_token *result; + /* This token is a virtual token that either encodes a location + related to macro expansion or a spelling location. */ + location_t virt_loc = 0; + /* pfile->about_to_expand_macro_p can be overriden by indirect calls + to functions that push macro contexts. So let's save it so that + we can restore it when we are about to leave this routine. */ + bool saved_about_to_expand_macro = pfile->about_to_expand_macro_p; + + for (;;) + { + cpp_hashnode *node; + cpp_context *context = pfile->context; + + /* Context->prev == 0 <=> base context. 
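+         In other words, when no macro or token context has been pushed,
+         tokens are read straight from the lexer; otherwise they are
+         drained from the context on top of the stack.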
*/ + if (!context->prev) + { + result = _cpp_lex_token (pfile); + virt_loc = result->src_loc; + } + else if (!reached_end_of_context (context)) + { + consume_next_token_from_context (pfile, &result, + &virt_loc); + if (result->flags & PASTE_LEFT) + { + paste_all_tokens (pfile, result); + if (pfile->state.in_directive) + continue; + result = padding_token (pfile, result); + goto out; + } + } + else + { + if (pfile->context->c.macro) + ++num_expanded_macros_counter; + _cpp_pop_context (pfile); + if (pfile->state.in_directive) + continue; + result = &pfile->avoid_paste; + goto out; + } + + if (pfile->state.in_directive && result->type == CPP_COMMENT) + continue; + + if (result->type != CPP_NAME) + break; + + node = result->val.node.node; + + if (node->type == NT_VOID || (result->flags & NO_EXPAND)) + break; + + if (!(node->flags & NODE_USED) + && node->type == NT_USER_MACRO + && !node->value.macro + && !cpp_get_deferred_macro (pfile, node, result->src_loc)) + break; + + if (!(node->flags & NODE_DISABLED)) + { + int ret = 0; + /* If not in a macro context, and we're going to start an + expansion, record the location and the top level macro + about to be expanded. */ + if (!in_macro_expansion_p (pfile)) + { + pfile->invocation_location = result->src_loc; + pfile->top_most_macro_node = node; + } + if (pfile->state.prevent_expansion) + break; + + /* Conditional macros require that a predicate be evaluated + first. */ + if ((node->flags & NODE_CONDITIONAL) != 0) + { + if (pfile->cb.macro_to_expand) + { + bool whitespace_after; + const cpp_token *peek_tok = cpp_peek_token (pfile, 0); + + whitespace_after = (peek_tok->type == CPP_PADDING + || (peek_tok->flags & PREV_WHITE)); + node = pfile->cb.macro_to_expand (pfile, result); + if (node) + ret = enter_macro_context (pfile, node, result, virt_loc); + else if (whitespace_after) + { + /* If macro_to_expand hook returned NULL and it + ate some tokens, see if we don't need to add + a padding token in between this and the + next token. */ + peek_tok = cpp_peek_token (pfile, 0); + if (peek_tok->type != CPP_PADDING + && (peek_tok->flags & PREV_WHITE) == 0) + _cpp_push_token_context (pfile, NULL, + padding_token (pfile, + peek_tok), 1); + } + } + } + else + ret = enter_macro_context (pfile, node, result, virt_loc); + if (ret) + { + if (pfile->state.in_directive || ret == 2) + continue; + result = padding_token (pfile, result); + goto out; + } + } + else + { + /* Flag this token as always unexpandable. FIXME: move this + to collect_args()?. */ + cpp_token *t = _cpp_temp_token (pfile); + t->type = result->type; + t->flags = result->flags | NO_EXPAND; + t->val = result->val; + result = t; + } + + break; + } + + out: + if (location != NULL) + { + if (virt_loc == 0) + virt_loc = result->src_loc; + *location = virt_loc; + + if (!CPP_OPTION (pfile, track_macro_expansion) + && macro_of_context (pfile->context) != NULL) + /* We are in a macro expansion context, are not tracking + virtual location, but were asked to report the location + of the expansion point of the macro being expanded. */ + *location = pfile->invocation_location; + + *location = maybe_adjust_loc_for_trad_cpp (pfile, *location); + } + + pfile->about_to_expand_macro_p = saved_about_to_expand_macro; + + if (pfile->state.directive_file_token + && !pfile->state.parsing_args + && !(result->type == CPP_PADDING || result->type == CPP_COMMENT) + && !(15 & --pfile->state.directive_file_token)) + { + /* Do header-name frobbery. Concatenate < ... > as approprate. 
+ Do header search if needed, and finally drop the outer <> or + "". */ + pfile->state.angled_headers = false; + + /* Do angle-header reconstitution. Then do include searching. + We'll always end up with a ""-quoted header-name in that + case. If searching finds nothing, we emit a diagnostic and + an empty string. */ + size_t len = 0; + char *fname = NULL; + + cpp_token *tmp = _cpp_temp_token (pfile); + *tmp = *result; + + tmp->type = CPP_HEADER_NAME; + bool need_search = !pfile->state.directive_file_token; + pfile->state.directive_file_token = 0; + + bool angle = result->type != CPP_STRING; + if (result->type == CPP_HEADER_NAME + || (result->type == CPP_STRING && result->val.str.text[0] != 'R')) + { + len = result->val.str.len - 2; + fname = XNEWVEC (char, len + 1); + memcpy (fname, result->val.str.text + 1, len); + fname[len] = 0; + } + else if (result->type == CPP_LESS) + fname = _cpp_bracket_include (pfile); + + if (fname) + { + /* We have a header-name. Look it up. This will emit an + unfound diagnostic. Canonicalize the found name. */ + const char *found = fname; + + if (need_search) + { + found = _cpp_find_header_unit (pfile, fname, angle, tmp->src_loc); + if (!found) + found = ""; + len = strlen (found); + } + /* Force a leading './' if it's not absolute. */ + bool dotme = (found[0] == '.' ? !IS_DIR_SEPARATOR (found[1]) + : found[0] && !IS_ABSOLUTE_PATH (found)); + + if (BUFF_ROOM (pfile->u_buff) < len + 1 + dotme * 2) + _cpp_extend_buff (pfile, &pfile->u_buff, len + 1 + dotme * 2); + unsigned char *buf = BUFF_FRONT (pfile->u_buff); + size_t pos = 0; + + if (dotme) + { + buf[pos++] = '.'; + /* Apparently '/' is unconditional. */ + buf[pos++] = '/'; + } + memcpy (&buf[pos], found, len); + pos += len; + buf[pos] = 0; + + tmp->val.str.len = pos; + tmp->val.str.text = buf; + + tmp->type = CPP_HEADER_NAME; + XDELETEVEC (fname); + + result = tmp; + } + } + + return result; +} + +/* External routine to get a token. Also used nearly everywhere + internally, except for places where we know we can safely call + _cpp_lex_token directly, such as lexing a directive name. + + Macro expansions and directives are transparently handled, + including entering included files. Thus tokens are post-macro + expansion, and after any intervening directives. External callers + see CPP_EOF only at EOF. Internal callers also see it when meeting + a directive inside a macro call, when at the end of a directive and + state.in_directive is still 1, and at the end of argument + pre-expansion. */ +const cpp_token * +cpp_get_token (cpp_reader *pfile) +{ + return cpp_get_token_1 (pfile, NULL); +} + +/* Like cpp_get_token, but also returns a virtual token location + separate from the spelling location carried by the returned token. + + LOC is an out parameter; *LOC is set to the location "as expected + by the user". This matters when a token results from macro + expansion; in that case the token's spelling location indicates the + locus of the token in the definition of the macro but *LOC + virtually encodes all the other meaningful locuses associated to + the token. + + What? virtual location? Yes, virtual location. + + If the token results from macro expansion and if macro expansion + location tracking is enabled its virtual location encodes (at the + same time): + + - the spelling location of the token + + - the locus of the macro expansion point + + - the locus of the point where the token got instantiated as part + of the macro expansion process. 
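+
+   For instance (an illustration, not part of the original comment),
+   given
+
+     #define PLUS(A, B) A + B
+     PLUS(1, 2)
+
+   the virtual location of the expanded "+" token encodes both its
+   spelling locus inside the definition of PLUS and the locus of the
+   PLUS(1, 2) expansion point.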
+ + You have to use the linemap API to get the locus you are interested + in from a given virtual location. + + Note however that virtual locations are not necessarily ordered for + relations '<' and '>'. One must use the function + linemap_location_before_p instead of using the relational operator + '<'. + + If macro expansion tracking is off and if the token results from + macro expansion the virtual location is the expansion point of the + macro that got expanded. + + When the token doesn't result from macro expansion, the virtual + location is just the same thing as its spelling location. */ + +const cpp_token * +cpp_get_token_with_location (cpp_reader *pfile, location_t *loc) +{ + return cpp_get_token_1 (pfile, loc); +} + +/* Returns true if we're expanding an object-like macro that was + defined in a system header. Just checks the macro at the top of + the stack. Used for diagnostic suppression. + Also return true for builtin macros. */ +int +cpp_sys_macro_p (cpp_reader *pfile) +{ + cpp_hashnode *node = NULL; + + if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED) + node = pfile->context->c.mc->macro_node; + else + node = pfile->context->c.macro; + + if (!node) + return false; + if (cpp_builtin_macro_p (node)) + return true; + return node->value.macro && node->value.macro->syshdr; +} + +/* Read each token in, until end of the current file. Directives are + transparently processed. */ +void +cpp_scan_nooutput (cpp_reader *pfile) +{ + /* Request a CPP_EOF token at the end of this file, rather than + transparently continuing with the including file. */ + pfile->buffer->return_at_eof = true; + + pfile->state.discarding_output++; + pfile->state.prevent_expansion++; + + if (CPP_OPTION (pfile, traditional)) + while (_cpp_read_logical_line_trad (pfile)) + ; + else + while (cpp_get_token (pfile)->type != CPP_EOF) + ; + + pfile->state.discarding_output--; + pfile->state.prevent_expansion--; +} + +/* Step back one or more tokens obtained from the lexer. */ +void +_cpp_backup_tokens_direct (cpp_reader *pfile, unsigned int count) +{ + pfile->lookaheads += count; + while (count--) + { + pfile->cur_token--; + if (pfile->cur_token == pfile->cur_run->base + /* Possible with -fpreprocessed and no leading #line. */ + && pfile->cur_run->prev != NULL) + { + pfile->cur_run = pfile->cur_run->prev; + pfile->cur_token = pfile->cur_run->limit; + } + } +} + +/* Step back one (or more) tokens. Can only step back more than 1 if + they are from the lexer, and not from macro expansion. */ +void +_cpp_backup_tokens (cpp_reader *pfile, unsigned int count) +{ + if (pfile->context->prev == NULL) + _cpp_backup_tokens_direct (pfile, count); + else + { + if (count != 1) + abort (); + if (pfile->context->tokens_kind == TOKENS_KIND_DIRECT) + FIRST (pfile->context).token--; + else if (pfile->context->tokens_kind == TOKENS_KIND_INDIRECT) + FIRST (pfile->context).ptoken--; + else if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED) + { + FIRST (pfile->context).ptoken--; + if (pfile->context->c.macro) + { + macro_context *m = pfile->context->c.mc; + m->cur_virt_loc--; + gcc_checking_assert (m->cur_virt_loc >= m->virt_locs); + } + else + abort (); + } + else + abort (); + } +} + +/* #define directive parsing and handling. */ + +/* Returns true if a macro redefinition warning is required. */ +static bool +warn_of_redefinition (cpp_reader *pfile, cpp_hashnode *node, + const cpp_macro *macro2) +{ + /* Some redefinitions need to be warned about regardless. 
*/
+  if (node->flags & NODE_WARN)
+    return true;
+
+  /* Suppress warnings for builtins that lack the NODE_WARN flag,
+     unless Wbuiltin-macro-redefined.  */
+  if (cpp_builtin_macro_p (node))
+    return CPP_OPTION (pfile, warn_builtin_macro_redefined);
+
+  /* Redefinitions of conditional (context-sensitive) macros, on
+     the other hand, must be allowed silently.  */
+  if (node->flags & NODE_CONDITIONAL)
+    return false;
+
+  if (cpp_macro *macro1 = get_deferred_or_lazy_macro (pfile, node, macro2->line))
+    return cpp_compare_macros (macro1, macro2);
+  return false;
+}
+
+/* Return TRUE if MACRO1 and MACRO2 differ.  */
+
+bool
+cpp_compare_macros (const cpp_macro *macro1, const cpp_macro *macro2)
+{
+  /* Redefinition of a macro is allowed if and only if the old and new
+     definitions are the same.  (6.10.3 paragraph 2).  */
+
+  /* Don't check count here as it can be different in valid
+     traditional redefinitions with just whitespace differences.  */
+  if (macro1->paramc != macro2->paramc
+      || macro1->fun_like != macro2->fun_like
+      || macro1->variadic != macro2->variadic)
+    return true;
+
+  /* Check parameter spellings.  */
+  for (unsigned i = macro1->paramc; i--; )
+    if (macro1->parm.params[i] != macro2->parm.params[i])
+      return true;
+
+  /* Check the replacement text or tokens.  */
+  if (macro1->kind == cmk_traditional)
+    return _cpp_expansions_different_trad (macro1, macro2);
+
+  if (macro1->count != macro2->count)
+    return true;
+
+  for (unsigned i = macro1->count; i--; )
+    if (!_cpp_equiv_tokens (&macro1->exp.tokens[i], &macro2->exp.tokens[i]))
+      return true;
+
+  return false;
+}
+
+/* Free the definition of hashnode H.  */
+void
+_cpp_free_definition (cpp_hashnode *h)
+{
+  /* Macros and assertions no longer have anything to free.  */
+  h->type = NT_VOID;
+  h->value.answers = NULL;
+  h->flags &= ~(NODE_DISABLED | NODE_USED);
+}
+
+/* Save parameter NODE (spelling SPELLING) to the parameter list of
+   macro MACRO.  Returns true on success, false on failure.  */
+bool
+_cpp_save_parameter (cpp_reader *pfile, unsigned n, cpp_hashnode *node,
+                     cpp_hashnode *spelling)
+{
+  /* Constraint 6.10.3.6 - duplicate parameter names.  */
+  if (node->type == NT_MACRO_ARG)
+    {
+      cpp_error (pfile, CPP_DL_ERROR, "duplicate macro parameter \"%s\"",
+                 NODE_NAME (node));
+      return false;
+    }
+
+  unsigned len = (n + 1) * sizeof (struct macro_arg_saved_data);
+  if (len > pfile->macro_buffer_len)
+    {
+      pfile->macro_buffer
+        = XRESIZEVEC (unsigned char, pfile->macro_buffer, len);
+      pfile->macro_buffer_len = len;
+    }
+
+  macro_arg_saved_data *saved = (macro_arg_saved_data *)pfile->macro_buffer;
+  saved[n].canonical_node = node;
+  saved[n].value = node->value;
+  saved[n].type = node->type;
+
+  void *base = _cpp_reserve_room (pfile, n * sizeof (cpp_hashnode *),
+                                  sizeof (cpp_hashnode *));
+  ((cpp_hashnode **)base)[n] = spelling;
+
+  /* Morph into a macro arg.  */
+  node->type = NT_MACRO_ARG;
+  /* Index is 1 based.  */
+  node->value.arg_index = n + 1;
+
+  return true;
+}
+
+/* Restore the parameters to their previous state.  */
+void
+_cpp_unsave_parameters (cpp_reader *pfile, unsigned n)
+{
+  /* Clear the fast argument lookup indices.  */
+  while (n--)
+    {
+      struct macro_arg_saved_data *save =
+        &((struct macro_arg_saved_data *) pfile->macro_buffer)[n];
+
+      struct cpp_hashnode *node = save->canonical_node;
+      node->type = save->type;
+      node->value = save->value;
+    }
+}
+
+/* Check the syntax of the parameters in a MACRO definition.  Return
+   false on failure.  Set *N_PTR and *VARADIC_PTR as appropriate.
+ '(' ')' + '(' parm-list ',' last-parm ')' + '(' last-parm ')' + parm-list: name + | parm-list, name + last-parm: name + | name '...' + | '...' +*/ + +static bool +parse_params (cpp_reader *pfile, unsigned *n_ptr, bool *varadic_ptr) +{ + unsigned nparms = 0; + bool ok = false; + + for (bool prev_ident = false;;) + { + const cpp_token *token = _cpp_lex_token (pfile); + + switch (token->type) + { + case CPP_COMMENT: + /* Allow/ignore comments in parameter lists if we are + preserving comments in macro expansions. */ + if (!CPP_OPTION (pfile, discard_comments_in_macro_exp)) + break; + + /* FALLTHRU */ + default: + bad: + { + const char *const msgs[5] = + { + N_("expected parameter name, found \"%s\""), + N_("expected ',' or ')', found \"%s\""), + N_("expected parameter name before end of line"), + N_("expected ')' before end of line"), + N_("expected ')' after \"...\"") + }; + unsigned ix = prev_ident; + const unsigned char *as_text = NULL; + if (*varadic_ptr) + ix = 4; + else if (token->type == CPP_EOF) + ix += 2; + else + as_text = cpp_token_as_text (pfile, token); + cpp_error (pfile, CPP_DL_ERROR, msgs[ix], as_text); + } + goto out; + + case CPP_NAME: + if (prev_ident || *varadic_ptr) + goto bad; + prev_ident = true; + + if (!_cpp_save_parameter (pfile, nparms, token->val.node.node, + token->val.node.spelling)) + goto out; + nparms++; + break; + + case CPP_CLOSE_PAREN: + if (prev_ident || !nparms || *varadic_ptr) + { + ok = true; + goto out; + } + + /* FALLTHRU */ + case CPP_COMMA: + if (!prev_ident || *varadic_ptr) + goto bad; + prev_ident = false; + break; + + case CPP_ELLIPSIS: + if (*varadic_ptr) + goto bad; + *varadic_ptr = true; + if (!prev_ident) + { + /* An ISO bare ellipsis. */ + _cpp_save_parameter (pfile, nparms, + pfile->spec_nodes.n__VA_ARGS__, + pfile->spec_nodes.n__VA_ARGS__); + nparms++; + pfile->state.va_args_ok = 1; + if (! CPP_OPTION (pfile, c99) + && CPP_OPTION (pfile, cpp_pedantic) + && CPP_OPTION (pfile, warn_variadic_macros)) + cpp_pedwarning + (pfile, CPP_W_VARIADIC_MACROS, + CPP_OPTION (pfile, cplusplus) + ? N_("anonymous variadic macros were introduced in C++11") + : N_("anonymous variadic macros were introduced in C99")); + else if (CPP_OPTION (pfile, cpp_warn_c90_c99_compat) > 0 + && ! CPP_OPTION (pfile, cplusplus)) + cpp_error (pfile, CPP_DL_WARNING, + "anonymous variadic macros were introduced in C99"); + } + else if (CPP_OPTION (pfile, cpp_pedantic) + && CPP_OPTION (pfile, warn_variadic_macros)) + cpp_pedwarning (pfile, CPP_W_VARIADIC_MACROS, + CPP_OPTION (pfile, cplusplus) + ? N_("ISO C++ does not permit named variadic macros") + : N_("ISO C does not permit named variadic macros")); + break; + } + } + + out: + *n_ptr = nparms; + + return ok; +} + +/* + "Lex a token from the expansion of MACRO, but mark parameters as we + find them and warn of traditional stringification." -original comment. + + This routine, despite its name, does no expansion. It redirects the token + pointer inside the lexer so that it writes the next token raw from the + input file into the 'expansion array' of the macro. + + The expansion array in the macro is an expandable (sort of) array of tokens, + used for holding the body of the macro. I.e. 'expansion' here refers to + the token array getting longer, not to the expansion of macros. + + This routine requires that the macro passed in is on the bump pointer buffer (the buffer returned by _cpp_reserve_room). 
Given this requirement, the macro pointer passed in is strictly redundant; it
+  could instead be recovered from the buffer:
+
+    cpp_macro *macro = (cpp_macro *)BUFF_FRONT(pfile->a_buff);
+
+  The return value is the macro held in that buffer, with one more token on its
+  expansion array than it had when it was passed in.
+*/
+static cpp_macro *
+lex_expansion_token (cpp_reader *pfile, cpp_macro *macro)
+{
+  macro = (cpp_macro *)_cpp_reserve_room (
+    pfile
+    ,sizeof (cpp_macro)
+    - sizeof (cpp_token)
+    + macro->count * sizeof (cpp_token)
+    ,sizeof (cpp_token)
+  );
+
+  // Tells the lexer to lex the next token into &macro->exp.tokens[macro->count].
+  // The lexer may already have been set up to lex the next token into a
+  // different buffer, so just in case, its original value is saved and then
+  // restored.
+  cpp_token *saved_cur_token = pfile->cur_token;
+  pfile->cur_token = &macro->exp.tokens[macro->count];
+  cpp_token *token = _cpp_lex_direct (pfile);
+  pfile->cur_token = saved_cur_token;
+
+  /* Is this a parameter? */
+  if (token->type == CPP_NAME && token->val.node.node->type == NT_MACRO_ARG)
+    {
+      /* Morph into a parameter reference.  */
+      cpp_hashnode *spelling = token->val.node.spelling;
+      token->type = CPP_MACRO_ARG;
+      token->val.macro_arg.arg_no = token->val.node.node->value.arg_index;
+      token->val.macro_arg.spelling = spelling;
+    }
+  else if (CPP_WTRADITIONAL (pfile) && macro->paramc > 0
+           && (token->type == CPP_STRING || token->type == CPP_CHAR))
+    check_trad_stringification (pfile, macro, &token->val.str);
+
+  return macro;
+}
+
+static cpp_macro *
+create_iso_definition (cpp_reader *pfile)
+{
+  bool following_paste_op = false;
+  const char *paste_op_error_msg =
+    N_("'##' cannot appear at either end of a macro expansion");
+  unsigned int num_extra_tokens = 0;
+  unsigned nparms = 0;
+  cpp_hashnode **params = NULL;
+  bool varadic = false;
+  bool ok = false;
+  cpp_macro *macro = NULL;
+
+  /* Look at the first token, to see if this is a function-like
+     macro.  */
+  cpp_token first;
+  cpp_token *saved_cur_token = pfile->cur_token;
+  pfile->cur_token = &first;
+  cpp_token *token = _cpp_lex_direct (pfile);
+  pfile->cur_token = saved_cur_token;
+
+  if (token->flags & PREV_WHITE)
+    /* Preceded by space, must be part of expansion.  */;
+  else if (token->type == CPP_OPEN_PAREN)
+    {
+      /* An open-paren, get a parameter list.  */
+      if (!parse_params (pfile, &nparms, &varadic))
+        goto out;
+
+      params = (cpp_hashnode **)_cpp_commit_buff
+        (pfile, sizeof (cpp_hashnode *) * nparms);
+      token = NULL;
+    }
+  else if (token->type != CPP_EOF
+           && !(token->type == CPP_COMMENT
+                && ! CPP_OPTION (pfile, discard_comments_in_macro_exp)))
+    {
+      /* While ISO C99 requires whitespace before replacement text
+         in a macro definition, ISO C90 with TC1 allows characters
+         from the basic source character set there.  */
+      if (CPP_OPTION (pfile, c99))
+        cpp_error (pfile, CPP_DL_PEDWARN,
+                   CPP_OPTION (pfile, cplusplus)
+                   ? N_("ISO C++11 requires whitespace after the macro name")
+                   : N_("ISO C99 requires whitespace after the macro name"));
+      else
+        {
+          enum cpp_diagnostic_level warntype = CPP_DL_WARNING;
+          switch (token->type)
+            {
+            case CPP_ATSIGN:
+            case CPP_AT_NAME:
+            case CPP_OBJC_STRING:
+              /* '@' is not in basic character set.  */
+              warntype = CPP_DL_PEDWARN;
+              break;
+            case CPP_OTHER:
+              /* Basic character set sans letters, digits and _.
*/
+              if (strchr ("!\"#%&'()*+,-./:;<=>?[\\]^{|}~",
+                          token->val.str.text[0]) == NULL)
+                warntype = CPP_DL_PEDWARN;
+              break;
+            default:
+              /* All other tokens start with a character from basic
+                 character set.  */
+              break;
+            }
+          cpp_error (pfile, warntype,
+                     "missing whitespace after the macro name");
+        }
+    }
+
+  macro = _cpp_new_macro (pfile, cmk_macro,
+                          _cpp_reserve_room (pfile, 0, sizeof (cpp_macro)));
+
+  if (!token)
+    {
+      macro->variadic = varadic;
+      macro->paramc = nparms;
+      macro->parm.params = params;
+      macro->fun_like = true;
+    }
+  else
+    {
+      /* Preserve the token we peeked, there is already a single slot for it.  */
+      macro->exp.tokens[0] = *token;
+      token = &macro->exp.tokens[0];
+      macro->count = 1;
+    }
+
+  for (vaopt_state vaopt_tracker (pfile, macro->variadic, NULL);; token = NULL)
+    {
+      if (!token)
+        {
+          macro = lex_expansion_token (pfile, macro);
+          token = &macro->exp.tokens[macro->count++];
+        }
+
+      /* Check the stringifying # constraint 6.10.3.2.1 of
+         function-like macros when lexing the subsequent token.  */
+      if (macro->count > 1 && token[-1].type == CPP_HASH && macro->fun_like)
+        {
+          if (token->type == CPP_MACRO_ARG
+              || (macro->variadic
+                  && token->type == CPP_NAME
+                  && token->val.node.node == pfile->spec_nodes.n__VA_OPT__))
+            {
+              if (token->flags & PREV_WHITE)
+                token->flags |= SP_PREV_WHITE;
+              if (token[-1].flags & DIGRAPH)
+                token->flags |= SP_DIGRAPH;
+              token->flags &= ~PREV_WHITE;
+              token->flags |= STRINGIFY_ARG;
+              token->flags |= token[-1].flags & PREV_WHITE;
+              token[-1] = token[0];
+              macro->count--;
+            }
+          /* Let assembler get away with murder.  */
+          else if (CPP_OPTION (pfile, lang) != CLK_ASM)
+            {
+              cpp_error (pfile, CPP_DL_ERROR,
+                         "'#' is not followed by a macro parameter");
+              goto out;
+            }
+        }
+
+      if (token->type == CPP_EOF)
+        {
+          /* Paste operator constraint 6.10.3.3.1:
+             Token-paste ##, can appear in both object-like and
+             function-like macros, but not at the end.  */
+          if (following_paste_op)
+            {
+              cpp_error (pfile, CPP_DL_ERROR, paste_op_error_msg);
+              goto out;
+            }
+          if (!vaopt_tracker.completed ())
+            goto out;
+          break;
+        }
+
+      /* Paste operator constraint 6.10.3.3.1.  */
+      if (token->type == CPP_PASTE)
+        {
+          /* Token-paste ##, can appear in both object-like and
+             function-like macros, but not at the beginning.  */
+          if (macro->count == 1)
+            {
+              cpp_error (pfile, CPP_DL_ERROR, paste_op_error_msg);
+              goto out;
+            }
+
+          if (following_paste_op)
+            {
+              /* Consecutive paste operators.  This one will be moved
+                 to the end.  */
+              num_extra_tokens++;
+              token->val.token_no = macro->count - 1;
+            }
+          else
+            {
+              /* Drop the paste operator.  */
+              --macro->count;
+              token[-1].flags |= PASTE_LEFT;
+              if (token->flags & DIGRAPH)
+                token[-1].flags |= SP_DIGRAPH;
+              if (token->flags & PREV_WHITE)
+                token[-1].flags |= SP_PREV_WHITE;
+            }
+          following_paste_op = true;
+        }
+      else
+        following_paste_op = false;
+
+      if (vaopt_tracker.update (token) == vaopt_state::ERROR)
+        goto out;
+    }
+
+  /* We're committed to winning now.  */
+  ok = true;
+
+  /* Don't count the CPP_EOF.  */
+  macro->count--;
+
+  macro = (cpp_macro *)_cpp_commit_buff
+    (pfile, sizeof (cpp_macro) - sizeof (cpp_token)
+     + sizeof (cpp_token) * macro->count);
+
+  /* Clear whitespace on first token.
*/ + if (macro->count) + macro->exp.tokens[0].flags &= ~PREV_WHITE; + + if (num_extra_tokens) + { + /* Place second and subsequent ## or %:%: tokens in sequences of + consecutive such tokens at the end of the list to preserve + information about where they appear, how they are spelt and + whether they are preceded by whitespace without otherwise + interfering with macro expansion. Remember, this is + extremely rare, so efficiency is not a priority. */ + cpp_token *temp = (cpp_token *)_cpp_reserve_room + (pfile, 0, num_extra_tokens * sizeof (cpp_token)); + unsigned extra_ix = 0, norm_ix = 0; + cpp_token *exp = macro->exp.tokens; + for (unsigned ix = 0; ix != macro->count; ix++) + if (exp[ix].type == CPP_PASTE) + temp[extra_ix++] = exp[ix]; + else + exp[norm_ix++] = exp[ix]; + memcpy (&exp[norm_ix], temp, num_extra_tokens * sizeof (cpp_token)); + + /* Record there are extra tokens. */ + macro->extra_tokens = 1; + } + + out: + pfile->state.va_args_ok = 0; + _cpp_unsave_parameters (pfile, nparms); + + return ok ? macro : NULL; +} + +cpp_macro * +_cpp_new_macro (cpp_reader *pfile, cpp_macro_kind kind, void *placement) +{ + cpp_macro *macro = (cpp_macro *) placement; + + /* Zero init all the fields. This'll tell the compiler know all the + following inits are writing a virgin object. */ + memset (macro, 0, offsetof (cpp_macro, exp)); + + macro->line = pfile->directive_line; + macro->parm.params = 0; + macro->lazy = 0; + macro->paramc = 0; + macro->variadic = 0; + macro->used = !CPP_OPTION (pfile, warn_unused_macros); + macro->count = 0; + macro->fun_like = 0; + macro->imported_p = false; + macro->extra_tokens = 0; + /* To suppress some diagnostics. */ + macro->syshdr = pfile->buffer && pfile->buffer->sysp != 0; + + macro->kind = kind; + + return macro; +} + +/* Parse a macro and save its expansion. Returns nonzero on success. */ +bool +_cpp_create_definition (cpp_reader *pfile, cpp_hashnode *node) +{ + cpp_macro *macro; + + if (CPP_OPTION (pfile, traditional)) + macro = _cpp_create_trad_definition (pfile); + else + macro = create_iso_definition (pfile); + + if (!macro) + return false; + + if (cpp_macro_p (node)) + { + if (CPP_OPTION (pfile, warn_unused_macros)) + _cpp_warn_if_unused_macro (pfile, node, NULL); + + if (warn_of_redefinition (pfile, node, macro)) + { + const enum cpp_warning_reason reason + = (cpp_builtin_macro_p (node) && !(node->flags & NODE_WARN)) + ? CPP_W_BUILTIN_MACRO_REDEFINED : CPP_W_NONE; + + bool warned = + cpp_pedwarning_with_line (pfile, reason, + pfile->directive_line, 0, + "\"%s\" redefined", NODE_NAME (node)); + + if (warned && cpp_user_macro_p (node)) + cpp_error_with_line (pfile, CPP_DL_NOTE, + node->value.macro->line, 0, + "this is the location of the previous definition"); + } + _cpp_free_definition (node); + } + + /* Enter definition in hash table. */ + node->type = NT_USER_MACRO; + node->value.macro = macro; + if (! ustrncmp (NODE_NAME (node), DSC ("__STDC_")) + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_FORMAT_MACROS") + /* __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are mentioned + in the C standard, as something that one must use in C++. + However DR#593 and C++11 indicate that they play no role in C++. + We special-case them anyway. 
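+     (For illustration: a user-defined macro whose name begins with
+     __STDC_, say a hypothetical __STDC_FOO, gains NODE_WARN here, so
+     any later redefinition of it is always diagnosed, while the three
+     feature-test names checked in this condition are deliberately left
+     alone.)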
*/ + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_LIMIT_MACROS") + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_CONSTANT_MACROS")) + node->flags |= NODE_WARN; + + /* If user defines one of the conditional macros, remove the + conditional flag */ + node->flags &= ~NODE_CONDITIONAL; + + return true; +} + +extern void +cpp_define_lazily (cpp_reader *pfile, cpp_hashnode *node, unsigned num) +{ + cpp_macro *macro = node->value.macro; + + gcc_checking_assert (pfile->cb.user_lazy_macro && macro && num < UCHAR_MAX); + + macro->lazy = num + 1; +} + +/* NODE is a deferred macro, resolve it, returning the definition + (which may be NULL). */ +cpp_macro * +cpp_get_deferred_macro (cpp_reader *pfile, cpp_hashnode *node, + location_t loc) +{ + gcc_checking_assert (node->type == NT_USER_MACRO); + + node->value.macro = pfile->cb.user_deferred_macro (pfile, loc, node); + + if (!node->value.macro) + node->type = NT_VOID; + + return node->value.macro; +} + +static cpp_macro * +get_deferred_or_lazy_macro (cpp_reader *pfile, cpp_hashnode *node, + location_t loc) +{ + cpp_macro *macro = node->value.macro; + if (!macro) + { + macro = cpp_get_deferred_macro (pfile, node, loc); + gcc_checking_assert (!macro || !macro->lazy); + } + else if (macro->lazy) + { + pfile->cb.user_lazy_macro (pfile, macro, macro->lazy - 1); + macro->lazy = 0; + } + + return macro; +} + +/* Notify the use of NODE in a macro-aware context (i.e. expanding it, + or testing its existance). Also applies any lazy definition. + Return FALSE if the macro isn't really there. */ + +extern bool +_cpp_notify_macro_use (cpp_reader *pfile, cpp_hashnode *node, + location_t loc) +{ + node->flags |= NODE_USED; + switch (node->type) + { + case NT_USER_MACRO: + if (!get_deferred_or_lazy_macro (pfile, node, loc)) + return false; + /* FALLTHROUGH. */ + + case NT_BUILTIN_MACRO: + if (pfile->cb.used_define) + pfile->cb.used_define (pfile, loc, node); + break; + + case NT_VOID: + if (pfile->cb.used_undef) + pfile->cb.used_undef (pfile, loc, node); + break; + + default: + abort (); + } + + return true; +} + +/* Warn if a token in STRING matches one of a function-like MACRO's + parameters. */ +static void +check_trad_stringification (cpp_reader *pfile, const cpp_macro *macro, + const cpp_string *string) +{ + unsigned int i, len; + const uchar *p, *q, *limit; + + /* Loop over the string. */ + limit = string->text + string->len - 1; + for (p = string->text + 1; p < limit; p = q) + { + /* Find the start of an identifier. */ + while (p < limit && !is_idstart (*p)) + p++; + + /* Find the end of the identifier. */ + q = p; + while (q < limit && is_idchar (*q)) + q++; + + len = q - p; + + /* Loop over the function macro arguments to see if the + identifier inside the string matches one of them. */ + for (i = 0; i < macro->paramc; i++) + { + const cpp_hashnode *node = macro->parm.params[i]; + + if (NODE_LEN (node) == len + && !memcmp (p, NODE_NAME (node), len)) + { + cpp_warning (pfile, CPP_W_TRADITIONAL, + "macro argument \"%s\" would be stringified in traditional C", + NODE_NAME (node)); + break; + } + } + } +} + +/* Returns the name, arguments and expansion of a macro, in a format + suitable to be read back in again, and therefore also for DWARF 2 + debugging info. e.g. "PASTE(X, Y) X ## Y", or "MACNAME EXPANSION". + Caller is expected to generate the "#define" bit if needed. The + returned text is temporary, and automatically freed later. 
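+
+   A further illustration (not from the upstream comment): for
+
+     #define LOG(fmt, ...) printf(fmt, __VA_ARGS__)
+
+   this returns "LOG(fmt,...) printf(fmt, __VA_ARGS__)"; note that no
+   space is emitted after the comma in the parameter list, as the DWARF
+   macro format requires.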
*/
+const unsigned char *
+cpp_macro_definition (cpp_reader *pfile, cpp_hashnode *node)
+{
+  gcc_checking_assert (cpp_user_macro_p (node));
+
+  if (const cpp_macro *macro = get_deferred_or_lazy_macro (pfile, node, 0))
+    return cpp_macro_definition (pfile, node, macro);
+  return NULL;
+}
+
+const unsigned char *
+cpp_macro_definition (cpp_reader *pfile, cpp_hashnode *node,
+                      const cpp_macro *macro)
+{
+  unsigned int i, len;
+  unsigned char *buffer;
+
+  /* Calculate length.  */
+  len = NODE_LEN (node) * 10 + 2;          /* ' ' and NUL.  */
+  if (macro->fun_like)
+    {
+      len += 4;        /* "()" plus possible final ".." of named
+                          varargs (we have + 1 below).  */
+      for (i = 0; i < macro->paramc; i++)
+        len += NODE_LEN (macro->parm.params[i]) + 1; /* "," */
+    }
+
+  /* This should match below where we fill in the buffer.  */
+  if (CPP_OPTION (pfile, traditional))
+    len += _cpp_replacement_text_len (macro);
+  else
+    {
+      unsigned int count = macro_real_token_count (macro);
+      for (i = 0; i < count; i++)
+        {
+          const cpp_token *token = &macro->exp.tokens[i];
+
+          if (token->type == CPP_MACRO_ARG)
+            len += NODE_LEN (token->val.macro_arg.spelling);
+          else
+            len += cpp_token_len (token);
+
+          if (token->flags & STRINGIFY_ARG)
+            len++;                         /* "#" */
+          if (token->flags & PASTE_LEFT)
+            len += 3;                      /* " ##" */
+          if (token->flags & PREV_WHITE)
+            len++;                         /* " " */
+        }
+    }
+
+  if (len > pfile->macro_buffer_len)
+    {
+      pfile->macro_buffer = XRESIZEVEC (unsigned char,
+                                        pfile->macro_buffer, len);
+      pfile->macro_buffer_len = len;
+    }
+
+  /* Fill in the buffer.  Start with the macro name.  */
+  buffer = pfile->macro_buffer;
+  buffer = _cpp_spell_ident_ucns (buffer, node);
+
+  /* Parameter names.  */
+  if (macro->fun_like)
+    {
+      *buffer++ = '(';
+      for (i = 0; i < macro->paramc; i++)
+        {
+          cpp_hashnode *param = macro->parm.params[i];
+
+          if (param != pfile->spec_nodes.n__VA_ARGS__)
+            {
+              memcpy (buffer, NODE_NAME (param), NODE_LEN (param));
+              buffer += NODE_LEN (param);
+            }
+
+          if (i + 1 < macro->paramc)
+            /* Don't emit a space after the comma here; we're trying
+               to emit a Dwarf-friendly definition, and the Dwarf spec
+               forbids spaces in the argument list.  */
+            *buffer++ = ',';
+          else if (macro->variadic)
+            *buffer++ = '.', *buffer++ = '.', *buffer++ = '.';
+        }
+      *buffer++ = ')';
+    }
+
+  /* The Dwarf spec requires a space after the macro name, even if the
+     definition is the empty string.  */
+  *buffer++ = ' ';
+
+  if (CPP_OPTION (pfile, traditional))
+    buffer = _cpp_copy_replacement_text (macro, buffer);
+  else if (macro->count)
+    /* Expansion tokens.  */
+    {
+      unsigned int count = macro_real_token_count (macro);
+      for (i = 0; i < count; i++)
+        {
+          const cpp_token *token = &macro->exp.tokens[i];
+
+          if (token->flags & PREV_WHITE)
+            *buffer++ = ' ';
+          if (token->flags & STRINGIFY_ARG)
+            *buffer++ = '#';
+
+          if (token->type == CPP_MACRO_ARG)
+            {
+              memcpy (buffer,
+                      NODE_NAME (token->val.macro_arg.spelling),
+                      NODE_LEN (token->val.macro_arg.spelling));
+              buffer += NODE_LEN (token->val.macro_arg.spelling);
+            }
+          else
+            buffer = cpp_spell_token (pfile, token, buffer, true);
+
+          if (token->flags & PASTE_LEFT)
+            {
+              *buffer++ = ' ';
+              *buffer++ = '#';
+              *buffer++ = '#';
+              /* Next has PREV_WHITE; see _cpp_create_definition.
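+                 For example (illustration only), a definition written
+                 as "a ## b" is printed back as "a ## b": this branch
+                 emits " ##" and the PREV_WHITE flag on the following
+                 token supplies the space before "b".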
*/ + } + } + } + + *buffer = '\0'; + return pfile->macro_buffer; +} + + +/*-------------------------------------------------------------------------------- + RT extensions +--------------------------------------------------------------------------------*/ + +/*-------------------------------------------------------------------------------- + shared declarations +*/ + + typedef struct{ + unsigned int count; + cpp_token *token_array; // _cpp_reserve_room buffer set by clause parser + } token_list; + + typedef struct{ + unsigned int count; + token_list token_list[1]; + } argument_list; + + typedef enum clause_parse_delimiting { + CPD_EOF + ,CPD_BALANCED + } clause_parse_delimiting; + + typedef enum clause_parse_comma { + CPC_ERROR + ,CPC_TERMINATOR + ,CPD_IS_TOKEN + } clause_parse_comma; + + typedef enum clause_parse_expand { + CPE_NOEXPAND + ,CPE_EXPAND_OPTION + ,CPE_EXPAND + } clause_parse_expand; + + typedef enum clause_parse_status { + PCS_COMPLETE // Clause completely parsed + ,PCS_ERR_EXPECTED_OPEN_DELIM // Failed to find expected opening '(' + ,PCS_ERR_UNEXPECTED_COMMA // probably has too many arguments in list + ,PCS_ERR_UNEXPECTED_EOF // Hit real EOF before matching ')' + ,PCS_ERR_PASTE_AT_END // Trailing '##' paste operator + ,PCS_ERR_HASH_NOT_FOLLOWED_BY_ARG // '#' not followed by macro parameter + ,PCS_ERR_VAOPT_STATE_INVALID // __VA_OPT__ or variadic tracking error + ,PCS_ERR_EOF_FETCH_FAILED // Failed to fetch next line after EOF + ,PCS_ERR_UNKNOWN // Fallback error (should not occur) + ,PCS_ERR_STATUS_NOT_SET // function did not set the status + } clause_parse_status; + + +/*-------------------------------------------------------------------------------- + debug helpers +*/ + + // debug info for clause parsing + #define DebugParseClause 1 + + // debug info for the macro directive + #define DebugRTMacro 1 + + // debug info for the assign directive and built in macro + #define DebugAssign 1 + + // gates compilation of functions that were defined specifically to assist with debug + #define DebugHelpers 1 + + #if DebugHelpers + + static const char * + ttype_to_text(enum cpp_ttype ttype) + { + switch (ttype) + { + case CPP_EOF: return "EOF"; + case CPP_PADDING: return "PADDING"; + case CPP_COMMENT: return "COMMENT"; + // case CPP_HSPACE: return "HSPACE"; + // case CPP_VSPACE: return "VSPACE"; + case CPP_OTHER: return "OTHER"; + case CPP_OPEN_PAREN: return "OPEN_PAREN"; + case CPP_CLOSE_PAREN: return "CLOSE_PAREN"; + case CPP_OPEN_SQUARE: return "OPEN_SQUARE"; + case CPP_CLOSE_SQUARE: return "CLOSE_SQUARE"; + case CPP_OPEN_BRACE: return "OPEN_BRACE"; + case CPP_CLOSE_BRACE: return "CLOSE_BRACE"; + case CPP_COMMA: return "COMMA"; + case CPP_SEMICOLON: return "SEMICOLON"; + case CPP_ELLIPSIS: return "ELLIPSIS"; + case CPP_NAME: return "NAME"; + case CPP_NUMBER: return "NUMBER"; + case CPP_CHAR: return "CHAR"; + case CPP_STRING: return "STRING"; + case CPP_HEADER_NAME: return "HEADER_NAME"; + case CPP_PLUS: return "PLUS"; + case CPP_MINUS: return "MINUS"; + case CPP_MULT: return "MULT"; + case CPP_DIV: return "DIV"; + case CPP_MOD: return "MOD"; + case CPP_AND: return "AND"; + case CPP_OR: return "OR"; + case CPP_XOR: return "XOR"; + case CPP_NOT: return "NOT"; + case CPP_LSHIFT: return "LSHIFT"; + case CPP_RSHIFT: return "RSHIFT"; + case CPP_EQ: return "EQ"; + // case CPP_NE: return "NE"; + // case CPP_LE: return "LE"; + // case CPP_GE: return "GE"; + // case CPP_LT: return "LT"; + // case CPP_GT: return "GT"; + case CPP_ATSIGN: return "@"; + case CPP_PLUS_EQ: return "PLUS_EQ"; + case 
CPP_MINUS_EQ: return "MINUS_EQ"; + case CPP_MULT_EQ: return "MULT_EQ"; + case CPP_DIV_EQ: return "DIV_EQ"; + case CPP_MOD_EQ: return "MOD_EQ"; + case CPP_AND_EQ: return "AND_EQ"; + case CPP_OR_EQ: return "OR_EQ"; + case CPP_XOR_EQ: return "XOR_EQ"; + case CPP_LSHIFT_EQ: return "LSHIFT_EQ"; + case CPP_RSHIFT_EQ: return "RSHIFT_EQ"; + // case CPP_CONDITIONAL: return "CONDITIONAL"; + case CPP_COLON: return "COLON"; + case CPP_DEREF: return "DEREF"; + case CPP_DOT: return "DOT"; + case CPP_DEREF_STAR: return "DEREF_STAR"; + case CPP_DOT_STAR: return "DOT_STAR"; + // case CPP_INCREMENT: return "INCREMENT"; + // case CPP_DECREMENT: return "DECREMENT"; + default: return ""; + } + } + + static void + print_ttype(enum cpp_ttype ttype){ + fprintf(stderr, "%s (%d)", ttype_to_text(ttype), ttype); + } + + const char *cpp_token_as_text(const cpp_token *token){ + static char buffer[256]; + + switch (token->type) + { + case CPP_NAME: + snprintf(buffer, sizeof(buffer), "CPP_NAME: '%s'", + NODE_NAME(token->val.node.node)); + break; + + case CPP_NUMBER: + case CPP_STRING: + case CPP_CHAR: + case CPP_HEADER_NAME: + snprintf(buffer, sizeof(buffer), "'%.*s'", + token->val.str.len, + token->val.str.text); + break; + + case CPP_EOF: + return ""; + case CPP_OTHER: + return ""; + case CPP_OPEN_PAREN: + return "'('"; + case CPP_CLOSE_PAREN: + return "')'"; + case CPP_COMMA: + return "','"; + case CPP_SEMICOLON: + return "';'"; + case CPP_PLUS: + return "'+'"; + case CPP_MINUS: + return "'-'"; + case CPP_MULT: + return "'*'"; + case CPP_DIV: + return "'/'"; + case CPP_MOD: + return "'%'"; + case CPP_MACRO_ARG: + snprintf( + buffer + ,sizeof(buffer) + ,"CPP_MACRO_ARG: '%s'" + ,NODE_NAME(token->val.macro_arg.spelling) + ); + break; + case CPP_PADDING: return ""; + case CPP_COMMENT: return ""; + case CPP_HASH: return "'#'"; + case CPP_PASTE: return "'##'"; + case CPP_ELLIPSIS: return "'...'"; + case CPP_COLON: return "':'"; + case CPP_OPEN_SQUARE: return "'['"; + case CPP_CLOSE_SQUARE: return "']'"; + case CPP_OPEN_BRACE: return "'{'"; + case CPP_CLOSE_BRACE: return "'}'"; + case CPP_DOT: return "'.'"; + case CPP_DEREF: return "'->'"; + case CPP_SCOPE: return "'::'"; + case CPP_DOT_STAR: return "'.*'"; + case CPP_DEREF_STAR: return "'->*'"; + case CPP_PRAGMA: return "<_Pragma>"; + case CPP_KEYWORD: return ""; + + default: + snprintf(buffer, sizeof(buffer), "", token->type); + break; + } + + // Append token flags if any are set + if (token->flags & (PREV_WHITE | DIGRAPH | STRINGIFY_ARG | + PASTE_LEFT | NAMED_OP | BOL | PURE_ZERO | + SP_DIGRAPH | SP_PREV_WHITE | NO_EXPAND | PRAGMA_OP)) + { + size_t len = strlen(buffer); + snprintf(buffer + len, sizeof(buffer) - len, " [flags:"); + + if (token->flags & PREV_WHITE) + strncat(buffer, " PREV_WHITE", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & DIGRAPH) + strncat(buffer, " DIGRAPH", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & STRINGIFY_ARG) + strncat(buffer, " STRINGIFY", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & PASTE_LEFT) + strncat(buffer, " ##L", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & NAMED_OP) + strncat(buffer, " NAMED_OP", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & BOL) + strncat(buffer, " BOL", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & PURE_ZERO) + strncat(buffer, " ZERO", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & SP_DIGRAPH) + strncat(buffer, " ##DIGRAPH", sizeof(buffer) - strlen(buffer) - 1); + if (token->flags & SP_PREV_WHITE) + strncat(buffer, " 
SP_WHITE", sizeof(buffer) - strlen(buffer) - 1);
+    if (token->flags & NO_EXPAND)
+      strncat(buffer, " NO_EXPAND", sizeof(buffer) - strlen(buffer) - 1);
+    if (token->flags & PRAGMA_OP)
+      strncat(buffer, " _Pragma", sizeof(buffer) - strlen(buffer) - 1);
+
+    strncat(buffer, " ]", sizeof(buffer) - strlen(buffer) - 1);
+  }
+
+  return buffer;
+ }
+
+ void print_token_list(const cpp_token *tokens ,size_t count){
+   for (size_t i = 0; i < count; ++i)
+     fprintf( stderr ,"[%zu] %s\n" ,i , cpp_token_as_text(&tokens[i]) );
+ }
+
+ void print_clause_parse_status(enum clause_parse_status status){
+   const char *message = NULL;
+   switch (status)
+     {
+     case PCS_COMPLETE:
+       message = "parse_clause status is OK";
+       break;
+     case PCS_ERR_EXPECTED_OPEN_DELIM:
+       message = "expected opening delimiter such as '(' but did not find it.";
+       break;
+     case PCS_ERR_UNEXPECTED_COMMA:
+       message = "unexpected ',' in clause; possibly too many arguments in the list.";
+       break;
+     case PCS_ERR_UNEXPECTED_EOF:
+       message = "unexpected EOF before closing ')'.";
+       break;
+     case PCS_ERR_PASTE_AT_END:
+       message = "paste operator '##' appeared at the beginning or end of macro body.";
+       break;
+     case PCS_ERR_HASH_NOT_FOLLOWED_BY_ARG:
+       message = "'#' was not followed by a valid macro parameter.";
+       break;
+     case PCS_ERR_VAOPT_STATE_INVALID:
+       message = "invalid __VA_OPT__ tracking state.";
+       break;
+     case PCS_ERR_EOF_FETCH_FAILED:
+       message = "_cpp_get_fresh_line() failed to fetch next line.";
+       break;
+     case PCS_ERR_STATUS_NOT_SET:
+       message = "Internal Error, status was not set";
+       break;
+     case PCS_ERR_UNKNOWN:
+     default:
+       message = "unknown or unhandled error.";
+       break;
+     }
+   fprintf(stderr, "%s\n", message);
+ }
+
+ // a helper function for probing where the parser thinks it is in the source
+ void debug_peek_token (cpp_reader *pfile){
+   const cpp_token *tok = _cpp_lex_token(pfile);
+
+   cpp_error_with_line(
+     pfile,
+     CPP_DL_ERROR,
+     tok->src_loc,
+     0,
+     "DEBUG: next token is: `%s`",
+     (const char *) cpp_token_as_text(tok)
+   );
+
+   _cpp_backup_tokens (pfile, 1);
+
+ }
+
+#endif
+
+/*--------------------------------------------------------------------------------
+  Parse clauses
+
+  Clause parsers are intended to be grammatical, with CPP semantics added by functions that are given a token list as an argument. We will see if this paints us into a corner soon enough I guess.
+
+  A clause is a delimited token list. Delimiters include:
+  - balanced delimiters, currently parens or brackets
+  - end of line (CPP quirk: the lexer replaces newline with EOF)
+  - comma
+
+  A comma in a clause is optionally:
+  - an error
+  - an alternative terminator
+  - merely another token
+
+  Optionally, the lexer can be told to expand tokens before they arrive at the clause parsing routine, or to not expand them.
+
+*/
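+
+/*
+  Editorial usage sketch (not part of the tested patch): the token_list and
+  argument_list records declared above compose as a counted run of token
+  lists.  The walker below just dumps each argument with the debug helpers
+  above; the helper name `example_dump_argument_list` is hypothetical, and the
+  argument_list is assumed to have been built elsewhere.
+*/
+#if 0 // illustrative only
+ static void example_dump_argument_list (const argument_list *al)
+ {
+   for (unsigned int i = 0; i < al->count; ++i)
+     {
+       const token_list *tl = &al->token_list[i]; // trailing flexible-array idiom
+       fprintf(stderr ,"argument %u:\n" ,i);
+       print_token_list(tl->token_array ,tl->count);
+     }
+ }
+#endif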
+
+/*
+  Similar to `macro.cc::lex_expansion_token`, but returns the token rather than writing directly into a macro structure.
+
+  Would be used for standard behavior when parsing a function body, but we instead elected
+  to use _cpp_lex_token. Note the doc `fetching_a_token.org`.
+
+*/
+#if 0 // no longer used
+  cpp_token get_token_noexpand(cpp_reader *pfile){
+    cpp_token result;
+
+    // Tells the lexer to lex the next token into result.
+    // Perhaps the lexer was already set to lex the next token to a different buffer?
+    // So just in case, its original value is saved and then restored.
+    cpp_token *saved_cur_token = pfile->cur_token;
+    pfile->cur_token = &result;
+    _cpp_lex_direct(pfile);
+    pfile->cur_token = saved_cur_token;
+
+    return result;
+  }
+#endif
+
+/*
+  Given the pfile and mode of clause parsing, returns a token list and the terminating delimiter. The token_array for the token list is allocated on the bump buffer, `pfile->a`.
+
+  When parsing a balanced paren clause, the opening paren has already been parsed, perhaps by `clause_parse`. This function then completes the parse.
+
+  When parsing a line clause, this parses the clause.
+
+  Optionally stops parsing at a comma.
+*/
+static enum clause_parse_status clause_parse_1(
+  // inputs
+  cpp_reader *pfile
+  ,clause_parse_delimiting cpd
+  ,enum cpp_ttype opening // needed for counting in balanced delimiters mode
+  ,enum cpp_ttype closing // "
+  ,clause_parse_comma cpc
+  ,bool expand // whether tokens should be expanded when lexed
+
+  // outputs
+  ,token_list *tl // caller allocates the pointed to token_list
+  ,cpp_token *terminator // caller allocates the pointed to token, or sets terminator to null
+){
+  #if DebugParseClause
+  fprintf(stderr, ">> clause_parse_1\n");
+  fprintf(stderr, "   delimiting: %s\n", cpd == CPD_BALANCED ? "balanced" : "EOF");
+  fprintf(stderr, "   opening token: %s (%d)\n", ttype_to_text(opening), opening);
+  fprintf(stderr, "   closing token: %s (%d)\n", ttype_to_text(closing), closing);
+  fprintf(stderr, "   comma handling: %d\n", (int) cpc);
+  fprintf(stderr, "   expand: %s\n", expand ? "true" : "false");
+  #endif
+
+  int nesting_depth = 1;
+  cpp_token token;
+  tl->count = 0;
+  tl->token_array = NULL;
+  // ./include/line-map.h:typedef unsigned int location_t;
+  location_t src_loc;
+
+  for(;;){
+
+    /* get a token
+
+       For cpp_get_token_1 src_loc is an out parameter, = the location the user would expect.
+       cpp_get_token_1 is defined in this file (macro.cc).
+    */
+    if(expand){
+      token = *cpp_get_token_1(pfile, &src_loc);
+    }else{
+      token = *_cpp_lex_token(pfile);
+      src_loc = token.src_loc;
+    }
+
+    /* skip padding
+
+       This is necessary for the name expr, but does it impact other potential uses of parse_clause? Another flag needed for this perhaps?
+
+       Didn't use `macro.cc::cpp_get_token_no_padding` due to the two lex options above, also because that routine does not return location.
+    */
+    if(token.type == CPP_PADDING) continue;
+
+
+    /* Note that the lexer replaces newline with EOF when parsing a directive, but we
+       allow for multiple line clauses in directives.
+    */
+    if(cpd != CPD_EOF && token.type == CPP_EOF){
+      #if DebugParseClause
+      fprintf( stderr, "CPP_EOF during parse with parentheses matching \n");
+      #endif
+      if(!_cpp_get_fresh_line(pfile)){
+        if(terminator) *terminator = token;
+        return PCS_ERR_EOF_FETCH_FAILED;
+      }
+      continue;
+    }
+
+    /* if we shouldn't see a comma
+    */
+    if( cpc == CPC_ERROR && token.type == CPP_COMMA ){
+      if(terminator) *terminator = token;
+      return PCS_ERR_UNEXPECTED_COMMA;
+    }
+
+    /* parentheses matching overhead
+    */
+    if(cpd == CPD_BALANCED){
+      if (token.type == opening) {
+        nesting_depth++;
+      }
+      else if (token.type == closing) {
+        nesting_depth--;
+        if (nesting_depth < 0) {
+          cpp_error(pfile, CPP_DL_ERROR, "unmatched closing delimiter");
+          if(terminator) *terminator = token;
+          return PCS_ERR_UNEXPECTED_EOF;
+        }
+      }
+      #if DebugParseClause
+      if( token.type == opening || token.type == closing){
+        fprintf( stderr, "new nesting_depth: %d\n", nesting_depth);
+      }
+      #endif
+    }
+
+    /* Determine if clause has reached a terminator
+    */
+    bool terminated_by_matched_delimiter =
+      cpd == CPD_BALANCED
+      && nesting_depth == 0
+      && token.type == closing
+      ;
+
+    bool terminated_by_comma =
+      cpc == CPC_TERMINATOR
+      && token.type == CPP_COMMA
+      ;
+
+    bool terminated_by_EOL =
+      cpd == CPD_EOF
+      && token.type == CPP_EOF
+      ;
+
+    if(
+      terminated_by_matched_delimiter
+      || terminated_by_comma
+      || terminated_by_EOL
+    ){
+      if(terminator) *terminator = token;
+      return PCS_COMPLETE;
+    }
+
+    // store the token in the token list
+    tl->token_array = (cpp_token *)_cpp_reserve_room(
+      pfile
+      ,tl->count * sizeof(cpp_token)
+      ,sizeof(cpp_token)
+    );
+    tl->token_array[tl->count] = token;
+    tl->count++;
+    #if DebugParseClause
+    fprintf( stderr, "token: %s\n", cpp_token_as_text(&token) );
+    #endif
+
+  }// end for next token loop
+
+}
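+
+/*
+  Editorial usage sketch (not part of the tested patch): collecting the
+  comma-separated arguments of an already-opened '(' clause with
+  clause_parse_1.  Per the note above, the opening '(' is assumed to have
+  been consumed before the first call.  The helper name
+  `example_collect_arguments` is hypothetical; the enums, token_list, and
+  debug helpers are the ones declared earlier in this file.
+*/
+#if 0 // illustrative only
+ static void example_collect_arguments (cpp_reader *pfile)
+ {
+   token_list tl;
+   cpp_token terminator;
+
+   for(;;){
+     clause_parse_status status = clause_parse_1(
+       pfile
+       ,CPD_BALANCED     // stop on the matching ')'
+       ,CPP_OPEN_PAREN
+       ,CPP_CLOSE_PAREN
+       ,CPC_TERMINATOR   // a ',' also terminates one argument
+       ,false            // lex without expansion
+       ,&tl
+       ,&terminator
+     );
+     if( status != PCS_COMPLETE ){
+       print_clause_parse_status(status);
+       break;
+     }
+
+     print_token_list(tl.token_array ,tl.count); // one argument's tokens
+
+     if( terminator.type == CPP_CLOSE_PAREN ) break; // whole list consumed
+     // otherwise the terminator was ','; continue with the next argument
+   }
+ }
+#endif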
"true" : "false"); + fprintf(stderr, " src_loc_pt: %p\n", (void *)src_loc_pt); + #endif + + if( + continuing + || cpd == CPD_EOF + ){ + return clause_parse_1( + pfile + ,cpd + ,opening + ,closing + ,cpc + ,cpe + ,tl + ,terminator + ); + } + + const cpp_token *token = _cpp_lex_token(pfile); + #if DebugParseClause + fprintf(stderr, "opening token: %s" ,cpp_token_as_text(token)); + #endif + + if(cpe == CPE_EXPAND_OPTION){ + switch(token.type){ + CPP_OPEN_SQUARE: + return clause_parse_1( + pfile + ,CPD_BALANCED + ,CPP_OPEN_SQUARE + ,CPP_CLOSE_SQUARE + ,cpc + ,CPE_EXPAND + ,tl + ,terminator + ); + + CPP_OPEN_PAREN: + return clause_parse_1( + pfile + ,CPD_BALANCED + ,CPP_OPEN_PAREN + ,CPP_CLOSE_PAREN + ,cpc + ,CPE_NOEXPAND + ,tl + ,terminator + ); + + default: + return PCS_ERR_EXPECTED_OPEN_DELIM; + } + } + + // at this point we must be doing CPD_BALANCED with no square/paren option + + if(token.type != opening) return PCS_ERR_EXPECTED_OPEN_DELIM; + + return clause_parse_1( + pfile + ,cpd + ,opening + ,closing + ,cpc + ,cpe + ,tl + ,terminator + ); + +} + + +static cpp_hashnode * +name_clause_is_name(cpp_reader *pfile, const cpp_macro *macro) +{ + if (!macro || macro->count != 1) + { + cpp_error(pfile, CPP_DL_ERROR, + "expected exactly one token in assign name expression, got %u", + macro ? macro->count : 0); + return NULL; + } + + const cpp_token *tok = ¯o->exp.tokens[0]; + + if (tok->type != CPP_NAME) + { + cpp_error(pfile, CPP_DL_ERROR, + "expected identifier in assign name expression, got: %s", + cpp_token_as_text(tok)); + return NULL; + } + + return tok->val.node.node; +} + + +/*-------------------------------------------------------------------------------- + `#assign` directive RT extension + + called from directives.cc::do_assign() + +*/ + +bool _cpp_create_assign(cpp_reader *pfile){ + + clause_parse_status status; + location_t src_loc; + unsigned int num_extra_tokens = 0; + PCPSO_choice choice; + + /* Parse name clause into a temporary macro. + + This macro will not be committed, so it will be overwritten on the next _cpp_new_macro call. + */ + cpp_macro *name_macro = _cpp_new_macro( + pfile + ,cmk_macro + ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) + ); + name_macro->variadic = false; + name_macro->paramc = 0; + name_macro->parm.params = NULL; + name_macro->fun_like = false; + + status = parse_clause_paren_square_option( + pfile + ,name_macro + ,&choice // square brackets, or round parents? + ,false // not a commas list + ,&src_loc + ,NULL // don't need to know the terminator + ,&num_extra_tokens + ); + + + #if DebugAssign + fprintf(stderr,"name_macro->count: %d\n" ,name_macro->count); + fprintf(stderr,"assign directive name tokens:\n"); + print_token_list(name_macro->exp.tokens ,name_macro->count); + #endif + + /* The name clause must be either a noexpandly valid name, or it must expand into + a valid name, depending if the programmer used () or []. + If valid, keep the name node. + */ + cpp_hashnode *name_node = name_clause_is_name(pfile ,name_macro); + if(name_node){ + #if DebugAssign + fprintf( + stderr + ,"assign macro name: '%.*s'\n" + ,(int) NODE_LEN(name_node) + ,NODE_NAME(name_node) + ); + #endif + }else{ + #if DebugAssign + fprintf(stderr, "node is not a name\n"); + #endif + return false; + } + + /* Unpaint name_node + + There are three scenarios where name_node will already exist in the symbol table + before the name clause of `#assign` is evaluated: + + 1. 
A macro definition already exists for name_node, and the name clause + is not expanded (i.e., it was delineated with '()'). + + 2. A macro definition exists, and the name clause *is* expanded (i.e., it + was delineated with '[]'), but name_node was painted and thus skipped + during expansion. + + 3. A macro definition exists and was not painted initially, but the name + clause expands recursively to itself (e.g., `list -> list`), resulting + in name_node being painted *during* the name clause evaluation. + + After the name clause is parsed, the body clause might be expanded. If so, + name_node must not be painted — this ensures that it will expand at least once. This enables patterns like: + + #assign ()(list)(list second) + + ...to work even if 'list' was painted prior to entering #assign. + + If the macro recurs during evaluation of the body clause, it will be automatically painted by the expansion engine, as usual. + + Note also: upon exit from this routine, the newly created macro will *not* be painted. Its disabled flag will remain clear. + + Consequently, for a recursive macro, assign can be called repeatedly to get 'one more level' of evaluation upon each call. + */ + if (cpp_macro_p(name_node)) { + name_node->flags &= ~NODE_DISABLED; + } + + /* create a new macro and put the #assign body clause in it + */ + cpp_macro *body_macro = _cpp_new_macro( + pfile + ,cmk_macro + ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) + ); + body_macro->variadic = false; + body_macro->paramc = 0; + body_macro->parm.params = NULL; + body_macro->fun_like = false; + + status = parse_clause_paren_square_option( + pfile + ,body_macro + ,&choice // square brackets, or round parents? + ,false // not a commas list + ,&src_loc + ,NULL // don't need to know the terminator + ,&num_extra_tokens + ); + #if DebugAssign + fprintf(stderr,"assign directive body tokens:\n"); + print_token_list(body_macro->exp.tokens ,body_macro->count); + #endif + + + cpp_macro *assign_macro = (cpp_macro *)_cpp_commit_buff( + pfile + ,sizeof(cpp_macro) - sizeof(cpp_token) + sizeof(cpp_token) * body_macro->count + ); + assign_macro->count = body_macro->count; + memcpy( + assign_macro->exp.tokens + ,body_macro->exp.tokens + ,sizeof(cpp_token) * body_macro->count + ); + body_macro->variadic = false; + body_macro->paramc = 0; + body_macro->parm.params = NULL; + body_macro->fun_like = false; + + /* Install the assign macro under name_node. + + If name_node previously had a macro definition, discard it. + Then install the new macro, and clear any disabled flag. + + This ensures the assigned macro can be expanded immediately, + even if it appeared in its own body clause and was painted. + */ + name_node->flags &= ~NODE_USED; + + if (cpp_macro_p(name_node)) { + // There is no mechanism in libcpp to free the memory taken by a committed macro, but wec an cast it adrift. + name_node->value.macro = NULL; + } + name_node->type = NT_USER_MACRO; + name_node->value.macro = assign_macro; + name_node->flags &= ~NODE_DISABLED; + + /* all done + */ + #if DebugAssign + fprintf( + stderr + ,"macro '%.*s' assigned successfully.\n\n" + ,(int) NODE_LEN(name_node) + ,NODE_NAME(name_node) + ); + #endif + + return true; + +} + +/*-------------------------------------------------------------------------------- + `#macro` directive RT extension + + Given a pfile, returns a macro definition. + + #macro name (parameter [,parameter] ...) 
(body_expr) + #macro name () (body_expr) + + Upon entry, the name was already been parsed in directives.cc::do_macro, so the next token will be the opening paren of the parameter list. + + Thi code is similar to `_cpp_create_definition` though uses paren blancing around the body, instead of requiring the macro body be on a single line. + + The cpp_macro struct is defined in cpplib.h: `struct GTY(()) cpp_macro {` it has a flexible array field in a union as a last member: cpp_token tokens[1]; + + This code was derived from create_iso_definition(). The break out portions shared + with create_macro_definition code should be shared with the main code, so that there + is only one place for edits. + +*/ +static cpp_macro *create_rt_macro (cpp_reader *pfile){ + + #if DebugRTMacro + fprintf(stderr,"entering create_rt_macro\n"); + #endif + + unsigned int num_extra_tokens = 0; + unsigned paramc = 0; + cpp_hashnode **params = NULL; + bool varadic = false; + bool ok = false; + cpp_macro *macro = NULL; + clause_parse_status status; + + /* parse parameter list + + after parse_parms runs, the next token returned by pfile will be subsequent to the parameter list, e.g.: + 7 | #macro Q(f ,...) printf(f ,__VA_ARGS__) + | ^~~~~~ + */ + const cpp_token *token = _cpp_lex_token(pfile); + location_t src_loc = token->src_loc; + + if(token->type != CPP_OPEN_PAREN){ + cpp_error_with_line( + pfile + ,CPP_DL_ERROR + ,src_loc + ,0 + ,"expected '(' to open arguments list, but found: %s" + ,cpp_token_as_text(token) + ); + goto out; + } + + if( !parse_params(pfile, ¶mc, &varadic) ) goto out; + + // finalizes the reserved room, otherwise it will be reused on the next reserve room call. + params = (cpp_hashnode **)_cpp_commit_buff( pfile, sizeof (cpp_hashnode *) * paramc ); + token = NULL; + + /* parse body macro + + A macro struct instance is variable size, due to tokens added to the macro.exp.tokens + during parse, and possible reallocations. + + Function like macros will later need space to hold parameter values. + */ + macro = _cpp_new_macro( + pfile + ,cmk_macro + ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) + ); + // used by parse_clause_noexpand + macro->variadic = varadic; + macro->paramc = paramc; + macro->parm.params = params; + macro->fun_like = true; + + PCPSO_choice choice; + status = parse_clause_paren_square_option( + pfile + ,macro + ,&choice + ,false // not a commas list + ,&src_loc + ,NULL // don't need to know the terminator + ,&num_extra_tokens + ); + if( status != PCS_COMPLETE ){ + fprintf(stderr, "parse_paren_clause returned: "); + print_clause_parse_status(status); + goto out; + } + + #if DebugRTMacro + fprintf(stderr,"rt_macro directive body tokens:\n"); + print_token_list(macro->exp.tokens ,macro->count); + #endif + + // commit the macro, attach the parameter list + ok = true; + macro = (cpp_macro *)_cpp_commit_buff( + pfile + , + sizeof (cpp_macro) + - sizeof (cpp_token) + + sizeof (cpp_token) * macro->count + + sizeof(cpp_hashnode *) * paramc + ); + macro->variadic = varadic; + macro->paramc = paramc; + macro->parm.params = params; + macro->fun_like = true; + + /* some end cases we must clean up + */ + /* + It might be that the first token of the macro body was preceded by white space, so + the white space flag is set. However, upon expansion, there might not be a white + space before said token, so the following code clears the flag. + */ + if (macro->count) + macro->exp.tokens[0].flags &= ~PREV_WHITE; + + /* + Identifies consecutive ## tokens (a.k.a. 
CPP_PASTE) that were invalid or ambiguous, + + Removes them from the main macro body, + + Stashes them at the end of the tokens[] array in the same memory, + + Sets macro->extra_tokens = 1 to signal their presence. + */ + if (num_extra_tokens) + { + /* Place second and subsequent ## or %:%: tokens in sequences of + consecutive such tokens at the end of the list to preserve + information about where they appear, how they are spelt and + whether they are preceded by whitespace without otherwise + interfering with macro expansion. Remember, this is + extremely rare, so efficiency is not a priority. */ + cpp_token *temp = (cpp_token *)_cpp_reserve_room + (pfile, 0, num_extra_tokens * sizeof (cpp_token)); + unsigned extra_ix = 0, norm_ix = 0; + cpp_token *exp = macro->exp.tokens; + for (unsigned ix = 0; ix != macro->count; ix++) + if (exp[ix].type == CPP_PASTE) + temp[extra_ix++] = exp[ix]; + else + exp[norm_ix++] = exp[ix]; + memcpy (&exp[norm_ix], temp, num_extra_tokens * sizeof (cpp_token)); + + /* Record there are extra tokens. */ + macro->extra_tokens = 1; + } + + out: + + /* + - This resets a flag in the parser’s state machine, pfile. + - The field `va_args_ok` tracks whether the current macro body is allowed to reference `__VA_ARGS__` (or more precisely, `__VA_OPT__`). + - It's set **while parsing a macro body** that might use variadic logic — particularly in `vaopt_state` tracking. + + Resetting it here ensures that future macros aren't accidentally parsed under the assumption that variadic substitution is valid. + */ + pfile->state.va_args_ok = 0; + + /* + Earlier we did: + if (!parse_params(pfile, ¶mc, &variadic)) goto out; + This cleans up temporary memory used by parse_params. + */ + _cpp_unsave_parameters (pfile, paramc); + + return ok ? macro : NULL; + +} + +/* + called from directives.cc:: do_macro +*/ +bool +_cpp_create_rt_macro(cpp_reader *pfile, cpp_hashnode *node){ + + #if DebugRTMacro + fprintf(stderr,"entering _cpp_create_macro\n"); + #endif + + cpp_macro *macro; + macro = create_rt_macro (pfile); + + if (!macro) + return false; + + if (cpp_macro_p (node)) + { + if (CPP_OPTION (pfile, warn_unused_macros)) + _cpp_warn_if_unused_macro (pfile, node, NULL); + + if (warn_of_redefinition (pfile, node, macro)) + { + const enum cpp_warning_reason reason + = (cpp_builtin_macro_p (node) && !(node->flags & NODE_WARN)) + ? CPP_W_BUILTIN_MACRO_REDEFINED : CPP_W_NONE; + + bool warned = + cpp_pedwarning_with_line (pfile, reason, + pfile->directive_line, 0, + "\"%s\" redefined", NODE_NAME (node)); + + if (warned && cpp_user_macro_p (node)) + cpp_error_with_line (pfile, CPP_DL_NOTE, + node->value.macro->line, 0, + "this is the location of the previous definition"); + } + _cpp_free_definition (node); + } + + /* Enter definition in hash table. */ + node->type = NT_USER_MACRO; + node->value.macro = macro; + if (! ustrncmp (NODE_NAME (node), DSC ("__STDC_")) + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_FORMAT_MACROS") + /* __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are mentioned + in the C standard, as something that one must use in C++. + However DR#593 and C++11 indicate that they play no role in C++. + We special-case them anyway. 
*/ + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_LIMIT_MACROS") + && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_CONSTANT_MACROS")) + node->flags |= NODE_WARN; + + /* If user defines one of the conditional macros, remove the + conditional flag */ + node->flags &= ~NODE_CONDITIONAL; + + return true; +} + +/*-------------------------------------------------------------------------------- + RT builtin macro extensions + +*/ + + static const uchar *evaluate_RT_ASSIGN(cpp_reader *pfile){ + if( ! _cpp_create_assign(pfile) ){ + cpp_error( + pfile + ,CPP_DL_ERROR + ,"#assign macro failed" + ); + } + // return UC""; // returning null string gave an internal compiler error + return NULL; // expands as `1` dunno why, but the code is there to make it happen + // return UC" "; // another internal compiler error + } + + static const uchar *evaluate_RT_TO_ARG_LIST(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#to_arg_list macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_TO_TOKEN_LIST(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#to_token_list macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_FIRST(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#first macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_REST(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#rest macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_MAP(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#map macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_AL_MAP(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#al_map macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_IF(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#if macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_NOT(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#not macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_AND(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#and macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_OR(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#or macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_IS_IDENTIFIER(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#is_identifier macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_IS_NAME(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#is_name macro evaluated" + ); + return NULL; + } + + static const uchar *evaluate_RT_PASTE(cpp_reader *pfile){ + cpp_error( + pfile + ,CPP_DL_NOTE + ,"#paste macro evaluated" + ); + return NULL; + } + + +#if 0 + static const uchar *evaluate_RT_ASSIGN(cpp_reader *pfile){ + return UC"_ASSIGN"; + } + + static const uchar *evaluate_RT_TO_ARG_LIST(cpp_reader *pfile){ + return UC"_TO_ARG_LIST"; + } + + static const uchar *evaluate_RT_TO_TOKEN_LIST(cpp_reader *pfile){ + return UC"_TO_TOKEN_LIST"; + } + + static const uchar *evaluate_RT_FIRST(cpp_reader *pfile){ + return UC"_FIRST"; + } + + static const uchar *evaluate_RT_REST(cpp_reader *pfile){ + return UC"_REST"; + } + + static const uchar *evaluate_RT_MAP(cpp_reader *pfile){ + return UC"_MAP"; + } + + static const uchar *evaluate_RT_AL_MAP(cpp_reader *pfile){ + return UC"_AL_MAP"; + } + + static const uchar *evaluate_RT_IF(cpp_reader *pfile){ + return UC"_IF"; + } + + static const uchar 
*evaluate_RT_NOT(cpp_reader *pfile){ + return UC"_NOT"; + } + + static const uchar *evaluate_RT_AND(cpp_reader *pfile){ + return UC"_AND"; + } + + static const uchar *evaluate_RT_OR(cpp_reader *pfile){ + return UC"_OR"; + } + + static const uchar *evaluate_RT_IS_IDENTIFIER(cpp_reader *pfile){ + return UC"_IS_IDENTIFIER"; + } + + static const uchar *evaluate_RT_IS_NAME(cpp_reader *pfile){ + return UC"_IS_NAME"; + } + + static const uchar *evaluate_RT_PASTE(cpp_reader *pfile){ + return UC"_PASTE"; + } +#endif +#if 0 +/*───────────────────────── RT helper utilities ─────────────────────────*/ + +/* rt_read_paren_argument + Consume exactly one parenthesised argument list, collecting every token + between the outer ‘( … )’ into OUT_LIST. Returns true on success and + emits its own diagnostic on failure. */ +static bool rt_read_paren_argument(cpp_reader *pfile ,vec &out_list){ + const cpp_token *tok = cpp_get_token_no_padding(pfile); + if(tok->type != CPP_OPEN_PAREN){ + cpp_error(pfile ,CPP_DL_ERROR ,"missing '(' after built-in macro"); + return false; + } + unsigned depth = 1; + while(depth){ + tok = cpp_get_token_1(pfile ,nullptr); + if(tok->type == CPP_EOF){ + cpp_error(pfile ,CPP_DL_ERROR ,"unterminated argument list in built-in macro"); + return false; + } + if(tok->type == CPP_OPEN_PAREN) depth++; + else if(tok->type == CPP_CLOSE_PAREN){ + if(--depth == 0) break; + } + if(depth) out_list.safe_push(*tok); /* omit the final ‘)’ */ + } + return true; +} + +/* rt_tokens_as_text + Spell a vector of tokens back into a single space-separated byte string + allocated from the preprocessor’s permanent pool. */ +static const uchar *rt_tokens_as_text(cpp_reader *pfile ,const vec &src){ + size_t reserve = src.length()*20 + 1; /* generous bound */ + uchar *buf = _cpp_unaligned_alloc(pfile ,reserve); + uchar *dst = buf; + for(unsigned i = 0 ;i < src.length() ;++i){ + if(i) *dst++ = ' '; + dst = cpp_spell_token(pfile ,&src[i] ,dst ,true); + } + *dst = '\0'; + return buf; +} + +/*──────────────────── _FIRST(token_list) implementation ─────────────────*/ + +static const uchar *evaluate_RT_FIRST(cpp_reader *pfile){ + vec list; + if(!rt_read_paren_argument(pfile ,list)) return UC""; + unsigned idx = 0; + while(idx < list.length() && list[idx].type == CPP_PADDING) ++idx; + if(idx == list.length()) return UC""; /* empty list */ + vec one; + one.safe_push(list[idx]); + return rt_tokens_as_text(pfile ,one); +} + +/*──────────────────── _REST(token_list) implementation ──────────────────*/ + +static const uchar *evaluate_RT_REST(cpp_reader *pfile){ + vec list; + if(!rt_read_paren_argument(pfile ,list)) return UC""; + unsigned first = 0; + while(first < list.length() && list[first].type == CPP_PADDING) ++first; + if(first >= list.length() - 1) return UC""; /* one-token list */ + vec rest; + for(unsigned i = first + 1 ;i < list.length() ;++i) rest.safe_push(list[i]); + return rt_tokens_as_text(pfile ,rest); +} +#endif diff --git a/developer/script_Deb-12.10_gcc-12.4.1/mv_libs_to_gcc.sh b/developer/script_Deb-12.10_gcc-12.4.1/mv_libs_to_gcc.sh new file mode 100755 index 0000000..9e4b5e5 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/mv_libs_to_gcc.sh @@ -0,0 +1,39 @@ +#!/bin/bash +# mv_libs_to_gcc.sh +# Move prerequisite libraries into the GCC source tree, replacing stale copies. +# This script can be run multiple times for incremental moves when more sources become available. 
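+# The variables consumed below (GCC_SRC, GMP_SRC, MPFR_SRC, MPC_SRC, ISL_SRC,
+# ZSTD_SRC) are expected to be provided by environment.sh, which is not part
+# of this hunk.  A hypothetical sketch of the relevant definitions, with
+# illustrative paths and version numbers only:
+#
+#   GCC_SRC="$SRC/gcc-12.2.0"
+#   GMP_SRC="$SRC/gmp-6.2.1"
+#   MPFR_SRC="$SRC/mpfr-4.1.0"
+#   MPC_SRC="$SRC/mpc-1.2.1"
+#   ISL_SRC="$SRC/isl-0.24"
+#   ZSTD_SRC="$SRC/zstd-1.5.2"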
+ +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +LIB_LIST=( + "gmp" "$GMP_SRC" + "mpfr" "$MPFR_SRC" + "mpc" "$MPC_SRC" + "isl" "$ISL_SRC" + "zstd" "$ZSTD_SRC" +) + +i=0 +while [ $i -lt ${#LIB_LIST[@]} ]; do + lib="${LIB_LIST[$i]}" + src="${LIB_LIST[$((i + 1))]}" + dest="$GCC_SRC/$lib" + i=$((i + 2)) + + if [[ ! -d "$src" ]]; then + echo "Source not found, skipping: $src" + continue + fi + + if [[ -d "$dest" ]]; then + echo "Removing stale: $dest" + rm -rf "$dest" + fi + + echo "mv $src $dest" + mv "$src" "$dest" +done + +echo "completed mv_libs_to_gcc.sh" diff --git a/developer/script_Deb-12.10_gcc-12.4.1/project_download.sh b/developer/script_Deb-12.10_gcc-12.4.1/project_download.sh new file mode 100755 index 0000000..0aa20bc --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/project_download.sh @@ -0,0 +1,131 @@ +#!/bin/bash +# This script can be run multiple times to download what was missed on prior invocations +# If there is a corrupt tarball, delete it and run this again +# Sometimes a connection test will fails, then the downloads runs anyway + +set -uo pipefail # no `-e`, we want to continue on error + +source "$(dirname "$0")/environment.sh" + +check_internet_connection() { + echo "🌐 Checking internet connection..." + # Use a quick connection check without blocking the whole script + if ! curl -s --connect-timeout 5 https://google.com > /dev/null; then + echo "⚠️ No internet connection detected (proceeding with download anyway)" + else + echo "✅ Internet connection detected" + fi +} + +# check_server_reachability() { +# local url=$1 +# if ! curl -s --head "$url" | head -n 1 | grep -q "HTTP/1.1 200 OK"; then +# echo "⚠️ Cannot reach $url (proceeding with download anyway)" +# fi +# } + +check_server_reachability() { + local url=$1 + echo "checking is reachable: $url " + + # Attempt to get the HTTP response code without following redirects + http_code=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "$url") + + # If the HTTP code is between 200 and 299, consider it reachable + if [[ "$http_code" -ge 200 && "$http_code" -lt 300 ]]; then + echo "✅ Server reachable (HTTP $http_code): $url " + else + # If not 2xx, print the status code for transparency + echo "⚠️ Server HTTP $http_code not 2xx, will try anyway: $url" + fi +} + +check_file_exists() { + local file=$1 + [[ -f "$UPSTREAM/$file" ]] +} + +download_file() { + local file=$1 + local url=$2 + + echo "Downloading $file from $url..." + if (cd "$UPSTREAM" && curl -LO "$url"); then + if file "$UPSTREAM/$file" | grep -qi 'html'; then + echo "❌ Invalid download (HTML, not archive): $file" + rm -f "$UPSTREAM/$file" + return 1 + elif [[ -f "$UPSTREAM/$file" ]]; then + echo "✅ Successfully downloaded: $file" + return 0 + # Validate it's not an HTML error page + else + echo "❌ Did not appear after download: $file " + return 1 + fi + else + echo "❌ Failed to download: $file" + return 1 + fi +} + +download_tarballs() { + i=0 + while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do + tarball="${UPSTREAM_TARBALL_LIST[$i]}" + url="${UPSTREAM_TARBALL_LIST[$((i+1))]}" + i=$((i + 3)) + + if check_file_exists "$tarball"; then + echo "⚡ already exists, skipping download: $tarball " + continue + fi + + check_server_reachability "$url" + + if ! 
download_file "$tarball" "$url"; then + echo "⚠️ Skipping due to previous error: $tarball " + fi + + done +} + +download_git_repos() { + i=0 + while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do + repo="${UPSTREAM_GIT_REPO_LIST[$i]}" + branch="${UPSTREAM_GIT_REPO_LIST[$((i+1))]}" + dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" + + if [[ -d "$dir/.git" ]]; then + echo "⚡ Already exists, skipping git clone: $dir " + i=$((i + 3)) + continue + fi + + echo "Cloning $repo into $dir..." + if ! git clone --branch "$branch" "$repo" "$dir"; then + echo "❌ Failed to clone $repo → $dir" + fi + + i=$((i + 3)) + done +} + +# do the downloads + +check_internet_connection + +echo "Downloading tarballs:" +for ((i=0; i<${#UPSTREAM_TARBALL_LIST[@]}; i+=3)); do + echo " - ${UPSTREAM_TARBALL_LIST[i]}" +done +download_tarballs + +echo "Cloning Git repositories:" +for ((i=0; i<${#UPSTREAM_GIT_REPO_LIST[@]}; i+=3)); do + echo " - ${UPSTREAM_GIT_REPO_LIST[i]} (branch ${UPSTREAM_GIT_REPO_LIST[i+1]})" +done +download_git_repos + +echo "project_download.sh completed" diff --git a/developer/script_Deb-12.10_gcc-12.4.1/project_extract.sh b/developer/script_Deb-12.10_gcc-12.4.1/project_extract.sh new file mode 100755 index 0000000..272470d --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/project_extract.sh @@ -0,0 +1,55 @@ +#!/bin/bash +# extracts (unpacks) the source tarballs held in upstream/ into the source/ directory. +# Will not extract if target already exists +# Delete any malformed extractions before running again +# +# gcc is not installed as a tar file, rather it is git cloned directly into source/ as part of the downloading from upstream sources. Hence, there is nothing to extract. + +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +had_error=0 +i=0 + +while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do + tarball="${UPSTREAM_TARBALL_LIST[$i]}" + i=$((i + 3)) + + src_path="$UPSTREAM/$tarball" + + # Strip compression suffix to guess subdirectory name + base_name="${tarball%%.tar.*}" # safer across .tar.gz, .tar.zst, etc. + target_dir="$SRC/$base_name" + + if [[ -d "$target_dir" ]]; then + echo "⚡ Already exists, skipping: $target_dir" + continue + fi + + if [[ ! -f "$src_path" ]]; then + echo "❌ Missing tarball: $src_path" + had_error=1 + continue + fi + + echo "tar -xf $tarball" + if ! (cd "$SRC" && tar -xf "$src_path"); then + echo "❌ Extraction failed: $tarball" + had_error=1 + continue + fi + + if [[ -d "$target_dir" ]]; then + echo "Extracted to: $target_dir" + else + echo "❌ Target not found after extraction: $target_dir" + had_error=1 + fi +done + +if [[ $had_error -eq 0 ]]; then + echo "✅ All tarballs extracted successfully" +else + echo "❌ Some extractions failed or were incomplete" +fi diff --git a/developer/script_Deb-12.10_gcc-12.4.1/project_requisites.sh b/developer/script_Deb-12.10_gcc-12.4.1/project_requisites.sh new file mode 100755 index 0000000..1688e92 --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/project_requisites.sh @@ -0,0 +1,161 @@ +#!/bin/bash +# project_requisites.sh +# Checks that all required tools, libraries, and sources are available +# before proceeding with the GCC build. + +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "Checking requisites for native standalone GCC build." + +if ! 
command -v pkg-config >/dev/null; then + echo "❌ pkg-config command required for this script" + echo " Debian: sudo apt install pkg-config" + echo " Fedora: sudo dnf install pkg-config" + exit 1 +fi + +missing_requisite_list=() +failed_pkg_config_list=() +found_requisite_list=() + +# --- Required Script Tools (must be usable by this script itself) --- +script_tools=( + bash + awk + sed + grep +) + +echo "Checking for essential script dependencies." +for tool in "${script_tools[@]}"; do + location=$(command -v "$tool") + if [ $? -eq 0 ]; then + found_requisite_list+=("$location") + else + missing_requisite_list+=("tool: $tool") + fi +done + +# --- Build Tools --- +build_tools=( + gcc + g++ + make + tar + gzip + bzip2 + perl + patch + diff + python3 +) + +echo "Checking for required build tools." +for tool in "${build_tools[@]}"; do + location=$(command -v "$tool") + if [ $? -eq 0 ]; then + found_requisite_list+=("$location") + else + missing_requisite_list+=("tool: $tool") + fi +done + +# --- Libraries via pkg-config --- +required_pkgs=( + gmp + mpfr + mpc + isl + zstd +) + +echo "Checking for required development libraries (via pkg-config)." +for lib in "${required_pkgs[@]}"; do + if pkg-config --exists "$lib"; then + libdir=$(pkg-config --variable=libdir "$lib" 2>/dev/null) + soname="lib$lib.so" + + if [[ -f "$libdir/$soname" ]]; then + found_requisite_list+=("library: $lib @ $libdir/$soname") + else + found_requisite_list+=("library: $lib @ (not found in $libdir)") + fi + else + failed_pkg_config_list+=("library: $lib") + fi +done + +# --- Source Trees --- +echo "Checking for required source directories." +echo "These will be installed by project_download.sh and project_extract.sh" +for src in "${SOURCE_DIR_LIST[@]}"; do + if [[ -d "$src" && "$(ls -A "$src")" ]]; then + found_requisite_list+=("source: $src") + else + missing_requisite_list+=("source: $src") + fi +done + +# --- Optional Python Modules --- +optional_py_modules=( + re sys os json gzip pathlib shutil time tempfile +) + +echo "Checking optional Python3 modules." +for mod in "${optional_py_modules[@]}"; do + if python3 -c "import $mod" &>/dev/null; then + found_requisite_list+=("python: module $mod") + else + missing_requisite_list+=("python (optional): module $mod") + fi +done + +glibc_version=$(ldd --version 2>/dev/null | head -n1 | grep -oE '[0-9]+\.[0-9]+' | head -n1) +glibc_path=$(ldd /bin/ls | grep 'libc.so.6' | awk '{print $3}') +if [[ -n "$glibc_version" && -f "$glibc_path" ]]; then + found_requisite_list+=("library: glibc @ $glibc_path (version $glibc_version)") +else + missing_requisite_list+=("library: glibc") +fi + + +echo +echo "Summary:" +echo "--------" + +for item in "${found_requisite_list[@]}"; do + echo " found: $item" +done + +for item in "${missing_requisite_list[@]:-}"; do + echo "❌ missing required: $item" +done + +for item in "${failed_pkg_config_list[@]:-}"; do + echo "⚠️ pkg-config could not find: $item" +done + +echo + +if [[ ${#missing_requisite_list[@]} -eq 0 && ${#failed_pkg_config_list[@]} -eq 0 ]]; then + echo "✅ All required tools and sources are present." +else + echo "❌ Some requisites are missing or unresolved." 
+ if [[ ${#failed_pkg_config_list[@]} -gt 0 ]]; then + echo + echo "Note: The following libraries were not found by pkg-config:" + for item in "${failed_pkg_config_list[@]}"; do + echo " - $item" + done + echo + echo "The following are expected to be missing if you are building them from source:" + echo " - mpc" + echo " - isl" + echo " - zstd" + echo "If not, consider installing the appropriate development packages:" + echo " Debian: sudo apt install libmpc-dev libisl-dev libzstd-dev" + echo " Fedora: sudo dnf install libmpc-devel isl-devel libzstd-devel" + fi +fi diff --git a/developer/script_Deb-12.10_gcc-12.4.1/project_setup.sh b/developer/script_Deb-12.10_gcc-12.4.1/project_setup.sh new file mode 100755 index 0000000..953a99c --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/project_setup.sh @@ -0,0 +1,47 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +# Create top-level project directories +for dir in "${PROJECT_DIR_LIST[@]}"; do + echo "mkdir -p $dir" + mkdir -p "$dir" +done + +# Create subdirectories within SYSROOT +for subdir in "${PROJECT_SUBDIR_LIST[@]}"; do + echo "mkdir -p $subdir" + mkdir -p "$subdir" +done + +# Ensure TMPDIR exists and add .gitignore +if [[ ! -d "$TMPDIR" ]]; then + echo "mkdir -p $TMPDIR" + mkdir -p "$TMPDIR" + + echo "$TMPDIR/" > "$TMPDIR/.gitignore" +else + echo "⚠️ TMPDIR already exists" +fi + +# Create root-level .gitignore if missing +if [[ -f "$REPO_HOME/.gitignore" ]]; then + echo "⚠️ $REPO_HOME/.gitignore already exists" +else + echo "create $REPO_HOME/.gitignore" + { + echo "# Ignore synthesized top-level directories" + for dir in "${PROJECT_DIR_LIST[@]}"; do + rel_path="${dir#$REPO_HOME/}" + echo "/$rel_path" + done + echo "# Ignore synthesized files" + echo "/.gitignore" + } > "$REPO_HOME/.gitignore" +fi + +echo +echo "Created project structure:" +# tree -L 2 "$REPO_HOME" 2>/dev/null || find "$REPO_HOME" -maxdepth 2 + diff --git a/developer/script_Deb-12.10_gcc-12.4.1/rebuild_gcc.sh b/developer/script_Deb-12.10_gcc-12.4.1/rebuild_gcc.sh new file mode 100755 index 0000000..447442a --- /dev/null +++ b/developer/script_Deb-12.10_gcc-12.4.1/rebuild_gcc.sh @@ -0,0 +1,21 @@ +#!/bin/bash +# rebuild_gcc.sh – no structural changes, and build directory is still intact + +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🔧 Starting GCC rebuild..." + +pushd "$GCC_BUILD" + + echo "gcc: $(command -v gcc)" + echo "toolchain: $TOOLCHAIN" + + $MAKE -j"$MAKE_JOBS" + $MAKE install + +popd + +echo "✅ GCC re-installed to $TOOLCHAIN/bin" +"$TOOLCHAIN/bin/gcc" --version diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/README.org" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/README.org" deleted file mode 100755 index 6a6bacd..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/README.org" +++ /dev/null @@ -1,53 +0,0 @@ -* General Notes - -GNU `cpp` is integrated with the GNU gcc repo. - -There is a lot more to GCC than one might imagine. It was developed as though an integral part of Unix. Hence, the standalone build has top-level directories that resemble the top level of a Unix system. - -The scripts here will download source and build a standalone GCC 12 along with version-compatible tools. - - -* RT extensions - -If you want the RT extensions, the RT extension sources must be copied from library/ after the gcc sources are expanded. Use `RT_extensions_install.sh`. When editing the sources, generally the library/ versions are treated as authoritive. 
- -* environment - -Don't forget `source env_toolsmith` in your shell. - -* Setup - -- `setup_project.sh` :: Makes the directory structure for the build, creates a `tmp/` directory under the project. If it does not already exist, creates a `.gitignore` file with all the created directories listed. - -* Top level .gitignore - -- There is no `.gitignore` at the top level when the project is cloned. The `.gitignore` created by `setup_project.sh` ignores itself, so it will not be checked in. -- No script deletes the top level `.gitignore`, including the clean scripts, so the toolsmith can make edits there that persist locally. - -* Clean - -- `clean_build.sh` :: For saving space after the build is done. The build scripts are idempotent, so in an ideal world this need not be run to do a rebuild. - -- `clean_dist.sh` :: With one exception, this will delete everything that was synthesized. The one exception is that `.gitignore` is moved to the `tmp/` directory to preserve any changes a user might have made, and the contents of the `tmp/` directory are not removed. - -- `clean_tmp.sh` :: Wipes clean all contents of the temporary directory. - -* Download - -- `download_upstream_sources.sh` :: Goes to the Internet, fetches all the sources that have not already been fetched. Then expands the sources into the proper sub-directory under `source/1`. - -* Build - -See the script build_all.sh for a description of the complete build process. - -When editing the RT Extensions, the work flow is typically: - -while cwd in the script directory: -1. edit the library/ -2. `./RT_extension_install.h` -3. `./rebuild_gcc.sh` -4. run some experiments - -It would of course be better to have a test suite. - -I typically leave an emacs shell open in the $ROOT/source/gcc-12.2.0/libcpp/ directory for purposes of exploring the other source code files. diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/build_all.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/build_all.sh" deleted file mode 100755 index 767e6af..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/build_all.sh" +++ /dev/null @@ -1,26 +0,0 @@ -#!/bin/bash -set -euo pipefail - -cd "$(dirname "$0")" - -source "$SCRIPT_DIR/environment.sh" - -./project_setup.sh -./project_download.sh -./project_extract.sh -./project_requisites.sh - -./mv_libs_to_gcc.sh -./build_gcc.sh - -echo "Toolchain build complete" -"$TOOLCHAIN/bin/gcc" --version - -# test - -./RT_extentions_libcpp_save.sh -./RT_extentions_install.sh -./rebuild_gcc.sh - -echo "Toolchain built with RT_extensions installed" -"$TOOLCHAIN/bin/gcc" --version diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/build_gcc.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/build_gcc.sh" deleted file mode 100755 index db7a9c4..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/build_gcc.sh" +++ /dev/null @@ -1,33 +0,0 @@ -#!/bin/bash -# build_gcc.sh – Build GCC 12.2.0 using system libraries and headers - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🔧 Starting GCC build..." 
- -mkdir -p "$GCC_BUILD" -pushd "$GCC_BUILD" - -echo "gcc: $(command -v gcc)" -echo "toolchain: $TOOLCHAIN" - -"$GCC_SRC/configure" \ - --with-pkgversion="RT_gcc standalone by Reasoning Technology" \ - --with-bugurl="https://github.com/Thomas-Walker-Lynch/RT_gcc/issues" \ - --with-documentation-root-url="https://gcc.gnu.org/onlinedocs/" \ - --with-changes-root-url="https://github.com/Thomas-Walker-Lynch/RT_gcc/releases/" \ - --host="$HOST" \ - --prefix="$TOOLCHAIN" \ - --enable-languages=c,c++ \ - --enable-threads=posix \ - --disable-multilib - -$MAKE -j"$MAKE_JOBS" -$MAKE install - -popd - -echo "✅ GCC installed to $TOOLCHAIN/bin" -"$TOOLCHAIN/bin/gcc" --version diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_build.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_build.sh" deleted file mode 100755 index 7c9bca3..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_build.sh" +++ /dev/null @@ -1,15 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🧹 Cleaning build directories..." - -for dir in "${BUILD_DIR_LIST[@]}"; do - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - rm -rf "$dir" - fi -done - -echo "✅ Build directories cleaned." diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_dist.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_dist.sh" deleted file mode 100755 index 3b319ec..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_dist.sh" +++ /dev/null @@ -1,35 +0,0 @@ -#!/bin/bash -set -euo pipefail - -echo "removing: build, source, upstream, and project directories" - -source "$(dirname "$0")/environment.sh" - -# Remove build -# - "./clean_build.sh" - ! ! rmdir "$BUILD_DIR" >& /dev/null && echo "rmdir $BUILD_DIR" - -# Remove source -# note that repos are removed with clean_upstream -# - "./clean_source.sh" - "./clean_upstream.sh" - - ! ! rmdir "$SRC" >& /dev/null && echo "rmdir $SRC" - ! ! rmdir "$UPSTREAM" >& /dev/null && echo "rmdir $UPSTREAM" - -# Remove binaries from toolchain (if they were copied to release, those copies remain). -# - "./clean_toolchain.sh" - -# Remove project directories -# - for dir in "${PROJECT_SUBDIR_LIST[@]}" "${PROJECT_DIR_LIST[@]}"; do - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - ! 
rm -rf "$dir" && echo "could not remove $dir" - fi - done - -echo "✅ clean_dist.sh" diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_source.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_source.sh" deleted file mode 100755 index 2f5beb0..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_source.sh" +++ /dev/null @@ -1,29 +0,0 @@ -#!/bin/bash -# removes project tarball expansions from source/ -# git repos are part of `upstream` so are not removed - -set -euo pipefail - - -source "$(dirname "$0")/environment.sh" - -i=0 -while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - # skip url - i=$((i + 1)) - # skip explicit dest dir - i=$((i + 1)) - - base_name="${tarball%.tar.*}" - dir="$SRC/$base_name" - - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - rm -rf "$dir" - fi - - i=$((i + 1)) -done - -echo "✅ clean_source.sh" diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_toolchain.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_toolchain.sh" deleted file mode 100755 index 435c1ab..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_toolchain.sh" +++ /dev/null @@ -1,17 +0,0 @@ -#!/bin/bash -# clean_toolchain.sh – Remove installed GCC toolchain artifacts - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🧹 Cleaning installed toolchain at: $TOOLCHAIN" - -if [[ -d "$TOOLCHAIN" ]]; then - echo "rm -rf $TOOLCHAIN" - rm -rf "$TOOLCHAIN" -else - echo "⚠️ Toolchain directory not found: $TOOLCHAIN (nothing to remove)" -fi - -echo "✅ Installed toolchain cleaned." diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_upstream.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_upstream.sh" deleted file mode 100755 index 50e8d98..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/clean_upstream.sh" +++ /dev/null @@ -1,38 +0,0 @@ -#!/bin/bash -# run this to force repeat of the downloads -# removes project tarballs from upstream/ -# removes project repos from source/ -# does not remove non-project files - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -# Remove tarballs -i=0 -while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - path="$UPSTREAM/$tarball" - - if [[ -f "$path" ]]; then - echo "rm $path" - rm "$path" - fi - - i=$((i + 3)) -done - -# Remove Git repositories -i=0 -while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do - dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" - - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - rm -rf "$dir" - fi - - i=$((i + 3)) -done - -echo "✅ clean_upstream.sh" diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/deprecated/stuff.cc" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/deprecated/stuff.cc" deleted file mode 100644 index 84429e4..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/deprecated/stuff.cc" +++ /dev/null @@ -1,368 +0,0 @@ - -/* - Parse a macro-style parameter list for `#assign` - - This expects the next token to be an opening parenthesis `(`. - - It returns: - - `params_out`: pointer to committed parameter array - - `param_count_out`: number of parameters parsed - - `is_variadic_out`: true if a variadic param was encountered - - On success, returns true and fills the out parameters. - On failure, returns false and issues an error diagnostic. 
-*/ -bool -make_parameter_list( - cpp_reader *pfile, - cpp_hashnode ***params_out, - unsigned int *param_count_out, - bool *is_variadic_out -){ - cpp_token first; - cpp_token *saved_cur_token = pfile->cur_token; - pfile->cur_token = &first; - cpp_token *token = _cpp_lex_direct(pfile); - pfile->cur_token = saved_cur_token; - - if (token->type != CPP_OPEN_PAREN) { - cpp_error_with_line( - pfile, - CPP_DL_ERROR, - token->src_loc, - 0, - "expected '(' to open parameter list, but found: %s", - cpp_token_as_text(token) - ); - return false; - } - - unsigned int nparms = 0; - bool variadic = false; - - if (!parse_params(pfile, &nparms, &variadic)) { - cpp_error_with_line( - pfile, - CPP_DL_ERROR, - token->src_loc, - 0, - "malformed parameter list" - ); - return false; - } - - cpp_hashnode **params = (cpp_hashnode **) - _cpp_commit_buff(pfile, sizeof(cpp_hashnode *) * nparms); - - *params_out = params; - *param_count_out = nparms; - *is_variadic_out = variadic; - - return true; -} - - /* Parse the parameter list - */ - cpp_hashnode **params; - unsigned int param_count; - bool is_variadic; - - if (!make_parameter_list(pfile, ¶ms, ¶m_count, &is_variadic)) { - return false; - } - - - - - -/*================================================================================ -from directive.cc - -*/ - - D(macro ,T_MACRO ,EXTENSION ,IN_I) \ - -/*-------------------------------------------------------------------------------- - directive `#macro` - - cmd ::= "#macro" name params body ; - - name ::= identifier ; - - params ::= "(" param_list? ")" ; - param_list ::= identifier ("," identifier)* ; - - body ::= clause ; - - clause ::= "(" literal? ")" | "[" expr? "]" ; - - literal ::= ; sequence parsed into tokens - expr ::= ; sequence parsed into tokens with recursive expansion of each token - - ; white space, including new lines, is ignored. - - -*/ -extern bool _cpp_create_macro (cpp_reader *pfile, cpp_hashnode *node); - -static void -do_macro (cpp_reader *pfile) -{ - cpp_hashnode *node = lex_macro_node(pfile, true); - - if(node) - { - /* If we have been requested to expand comments into macros, - then re-enable saving of comments. */ - pfile->state.save_comments = - ! CPP_OPTION (pfile, discard_comments_in_macro_exp); - - if(pfile->cb.before_define) - pfile->cb.before_define (pfile); - - if( _cpp_create_macro(pfile, node) ) - if (pfile->cb.define) - pfile->cb.define (pfile, pfile->directive_line, node); - - node->flags &= ~NODE_USED; - } -} - - - - - - -/*================================================================================ -from macro.cc - -*/ - - - - - -/*-------------------------------------------------------------------------------- - Given a pfile, returns a macro definition. - - #macro name (parameter [,parameter] ...) (body_expr) - #macro name () (body_expr) - - Upon entry, the name was already been parsed in directives.cc::do_macro, so the next token will be the opening paren of the parameter list. - - Thi code is similar to `_cpp_create_definition` though uses paren blancing around the body, instead of requiring the macro body be on a single line. - - The cpp_macro struct is defined in cpplib.h: `struct GTY(()) cpp_macro {` it has a flexible array field in a union as a last member: cpp_token tokens[1]; - - This code was derived from create_iso_definition(). The break out portions shared - with create_macro_definition code should be shared with the main code, so that there - is only one place for edits. 
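   As a sketch of the surface syntax this routine accepts, a hypothetical
   multi-line use (SWAP, a, b, and tmp_ are made-up names; per the notes
   above, the body is parenthesis-delimited and its parentheses need only
   balance when the definition is scanned):

       #macro SWAP(a ,b) (
         do { typeof(a) tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)
       )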
- -*/ -static cpp_macro *create_iso_RT_macro (cpp_reader *pfile){ - - const char *paste_op_error_msg = - N_("'##' cannot appear at either end of a macro expansion"); - unsigned int num_extra_tokens = 0; - unsigned nparms = 0; - cpp_hashnode **params = NULL; - bool varadic = false; - bool ok = false; - cpp_macro *macro = NULL; - - /* - After these six lines of code, the next token, hopefully being '(', will be in the variable 'token'. - - _cpp_lex_direct() is going to clobber pfile->cur_token with the token pointer, so - it is saved then restored. - */ - cpp_token first; - cpp_token *saved_cur_token = pfile->cur_token; - pfile->cur_token = &first; - cpp_token *token = _cpp_lex_direct (pfile); - pfile->cur_token = saved_cur_token; - - // parameter list parsing - // - if(token->type != CPP_OPEN_PAREN){ - cpp_error_with_line( - pfile - ,CPP_DL_ERROR - ,token->src_loc - ,0 - ,"expected '(' to open arguments list, but found: %s" - ,cpp_token_as_text(token) - ); - goto out; - } - - /* - - returns parameter list for a function macro, or NULL - - returns via &arg count of parameters - - returns via &arg the varadic flag - - after parse_parms runs, the next token returned by pfile will be subsequent to the parameter list, e.g.: - 7 | #macro Q(f ,...) printf(f ,__VA_ARGS__) - | ^~~~~~ - - */ - if( !parse_params(pfile, &nparms, &varadic) ) goto out; - - // finalizes the reserved room, otherwise it will be reused on the next reserve room call. - params = (cpp_hashnode **)_cpp_commit_buff( pfile, sizeof (cpp_hashnode *) * nparms ); - token = NULL; - - // instantiate a temporary macro struct, and initialize it - // A macro struct instance is variable size, due to a trailing token list, so the memory - // reservations size will be adjusted when this is committed. - // - macro = _cpp_new_macro( - pfile - ,cmk_macro - ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) - ); - macro->variadic = varadic; - macro->paramc = nparms; - macro->parm.params = params; - macro->fun_like = true; - - // parse macro body - // A `#macro` body is delineated by parentheses - // - if( - !collect_body_tokens( - pfile - ,macro - ,&num_extra_tokens - ,paste_op_error_msg - ,true // parenthesis delineated - ) - ) goto out; - - // ok time to commit the macro - // - ok = true; - macro = (cpp_macro *)_cpp_commit_buff( - pfile - ,sizeof (cpp_macro) - sizeof (cpp_token) + sizeof (cpp_token) * macro->count - ); - - // some end cases we must clean up - // - /* - It might be that the first token of the macro body was preceded by white space,so - the white space flag is set. However, upon expansion, there might not be a white - space before said token, so the following code clears the flag. - */ - if (macro->count) - macro->exp.tokens[0].flags &= ~PREV_WHITE; - - /* - Identifies consecutive ## tokens (a.k.a. CPP_PASTE) that were invalid or ambiguous, - - Removes them from the main macro body, - - Stashes them at the end of the tokens[] array in the same memory, - - Sets macro->extra_tokens = 1 to signal their presence. - */ - if (num_extra_tokens) - { - /* Place second and subsequent ## or %:%: tokens in sequences of - consecutive such tokens at the end of the list to preserve - information about where they appear, how they are spelt and - whether they are preceded by whitespace without otherwise - interfering with macro expansion. Remember, this is - extremely rare, so efficiency is not a priority. 
*/ - cpp_token *temp = (cpp_token *)_cpp_reserve_room - (pfile, 0, num_extra_tokens * sizeof (cpp_token)); - unsigned extra_ix = 0, norm_ix = 0; - cpp_token *exp = macro->exp.tokens; - for (unsigned ix = 0; ix != macro->count; ix++) - if (exp[ix].type == CPP_PASTE) - temp[extra_ix++] = exp[ix]; - else - exp[norm_ix++] = exp[ix]; - memcpy (&exp[norm_ix], temp, num_extra_tokens * sizeof (cpp_token)); - - /* Record there are extra tokens. */ - macro->extra_tokens = 1; - } - - out: - - /* - - This resets a flag in the parser’s state machine, pfile. - - The field `va_args_ok` tracks whether the current macro body is allowed to reference `__VA_ARGS__` (or more precisely, `__VA_OPT__`). - - It's set **while parsing a macro body** that might use variadic logic — particularly in `vaopt_state` tracking. - - Resetting it here ensures that future macros aren't accidentally parsed under the assumption that variadic substitution is valid. - */ - pfile->state.va_args_ok = 0; - - /* - Earlier we did: - if (!parse_params(pfile, &nparms, &variadic)) goto out; - This cleans up temporary memory used by parse_params. - */ - _cpp_unsave_parameters (pfile, nparms); - - return ok ? macro : NULL; -} - -/* - called from directives.cc:: do_macro -*/ -bool -_cpp_create_macro(cpp_reader *pfile, cpp_hashnode *node){ - cpp_macro *macro; - - macro = create_iso_RT_macro (pfile); - - if (!macro) - return false; - - if (cpp_macro_p (node)) - { - if (CPP_OPTION (pfile, warn_unused_macros)) - _cpp_warn_if_unused_macro (pfile, node, NULL); - - if (warn_of_redefinition (pfile, node, macro)) - { - const enum cpp_warning_reason reason - = (cpp_builtin_macro_p (node) && !(node->flags & NODE_WARN)) - ? CPP_W_BUILTIN_MACRO_REDEFINED : CPP_W_NONE; - - bool warned = - cpp_pedwarning_with_line (pfile, reason, - pfile->directive_line, 0, - "\"%s\" redefined", NODE_NAME (node)); - - if (warned && cpp_user_macro_p (node)) - cpp_error_with_line (pfile, CPP_DL_NOTE, - node->value.macro->line, 0, - "this is the location of the previous definition"); - } - _cpp_free_definition (node); - } - - /* Enter definition in hash table. */ - node->type = NT_USER_MACRO; - node->value.macro = macro; - if (! ustrncmp (NODE_NAME (node), DSC ("__STDC_")) - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_FORMAT_MACROS") - /* __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are mentioned - in the C standard, as something that one must use in C++. - However DR#593 and C++11 indicate that they play no role in C++. - We special-case them anyway. 
*/ - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_LIMIT_MACROS") - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_CONSTANT_MACROS")) - node->flags |= NODE_WARN; - - /* If user defines one of the conditional macros, remove the - conditional flag */ - node->flags &= ~NODE_CONDITIONAL; - - return true; -} - diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/bump_buffer.org" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/bump_buffer.org" deleted file mode 100644 index 7eddbac..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/bump_buffer.org" +++ /dev/null @@ -1,101 +0,0 @@ -#+TITLE: CPP bump buffer -#+AUTHOR: Thomas -#+DATE: 2025-06-08 -#+OPTIONS: toc:nil ^:nil - -* References -https://www.chiark.greenend.org.uk/doc/cpp-4.3-doc/cppinternals.html - -#+begin_quote -**References** - -- *libcpp/lex.cc* and *libcpp/macro.cc* (GCC ≥ 12) -- Ian Lance Taylor, “*Inside libcpp*”, GNU Cauldron 2018 slides -#+end_quote - -* The arena -/libcpp/ keeps a bump-pointer arena inside =cpp_reader=. Two routines are used to directly interface with it: =_cpp_reserve_room()=, and =_cpp_commit_buff()=. - - - -| Helper | Exact responsibility | -|--------------------------------+--------------------------------------------------------| -| =_cpp_reserve_room()= | Provide tentative space without moving the cursor | -| =_cpp_commit_buff()= | Finalise the previous reservation and move the cursor | - -* _cpp_reserve_room - -Defined in `internal.h`: - -#+begin_src c -static inline void *_cpp_reserve_room (cpp_reader *pfile, size_t have, size_t extra){ - if (BUFF_ROOM (pfile->a_buff) < (have + extra)) - _cpp_extend_buff (pfile, &pfile->a_buff, extra); - return BUFF_FRONT (pfile->a_buff); -} -#+end_src - -Example invocation found in `macro.cc::create_iso_definition`: - -#+begin_src c - macro = _cpp_new_macro( - pfile - ,cmk_macro - ,_cpp_reserve_room(pfile, 0, sizeof (cpp_macro)) - ); -#+end_src - -=_cpp_reserve_room()= returns a pointer to a buffer of the specified size, `have` + `extra`. The `have` argument is interesting, as it can be changed from call to call, hence it is more for the programmer than for the function. - -The buffer returned is not generally a unique buffer. It might be the same one that was returned on the previous invocation of =_cpp_reserve_room()=, or it will be a new buffer if `_cpp_extend_buff` was called during allocation. In this latter case, a new larger buffer is made, and the old buffer is copied into it. The new buffer is then returned. - -Hence, each call to =_cpp_reserve_room()= requires that all pointers into the buffer that existed before the call be considered stale. After the call, the only valid pointer to the buffer is the one that =_cpp_reserve_room()= returns. - - -* `_cpp_commit_buff` - -Defined in `lex.cc`: - -#+begin_src c -void *_cpp_commit_buff (cpp_reader *pfile, size_t size){ - void *ptr = BUFF_FRONT (pfile->a_buff); - - if (pfile->hash_table->alloc_subobject) - { - void *copy = pfile->hash_table->alloc_subobject (size); - memcpy (copy, ptr, size); - ptr = copy; - } - else - BUFF_FRONT (pfile->a_buff) += size; - - return ptr; -} -#+end_src - -A call to =_cpp_commit_buff()= finalizes the accumulation of reserved room, and sets the buffer aside as a dedicated allocation. Once =_cpp_commit_buff()= is called, a subsequent call to =_cpp_reserve_room()= will start fresh with a new buffer. - -When `! 
pfile->hash_table->alloc_subobject`, the buffer will be committed with `BUFF_FRONT (pfile->a_buff) += size;` - -However, when `pfile->hash_table->alloc_subobject` is true, the buffer will be copied when it is committed. Hence, it is important to consider all pointers into the buffer created during the one or more calls to =_cpp_reserve_room()= to be stale after a call to =_cpp_commit_buff()=, and to subsequently only use the pointer that =_cpp_commit_buff()= returns. - -According to `GPT o3`, which hasn't gotten much right thus far,"`alloc_subobject` is non-null only while installing a macro into the hash table (create_definition etc.)." - -* Deallocating a `_cpp_reserve_room` buffer yet to be committed - -Each subsequent call to `_cpp_reserve_room`, even if it is unrelated to the prior one, either uses the buffer it finds at `pfile->a_buff`, or if it is not large enough, replaces it with a larger copy. Consequently, there is no concept of 'deallocating a `_cpp_reserve_room` buffer'. - -However, according to `o3`, if one desired: - -#+begin_src c - void *mark = _cpp_get_buff (pfile); /* snapshot cursor */ - ... - _cpp_release_buff (pfile, mark); /* rewind to mark */ -#+end_src - - -* Deallocating a committed `_cpp_reserve_room` buffer - -There is no facility for this. Detritus collects until CPP moves on to the next translation unit and then everything is released. - - diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/custom_directives_macros.org" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/custom_directives_macros.org" deleted file mode 100644 index a38289d..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/custom_directives_macros.org" +++ /dev/null @@ -1,73 +0,0 @@ -#+TITLE: Adding Custom Directives and Built-in Macros in libcpp -#+AUTHOR: Thomas -#+DATE: 2025-06-10 -#+OPTIONS: toc:t - -* Custom Preprocessor Directives - -To add a new `#directive` (e.g. `#assign`, `#macro`), changes must be made in several files within libcpp. - -** `directives.cc` - -- Midway through the file, there is a static table mapping directive names to their handlers. -- Add an entry here for your new directive: - #+begin_src c - { "assign", RT_ASSIGN, false }, - { "macro", RT_MACRO, false }, - #+end_src - -- Define your handler in the switch for `handle_directive`. Place the handler logic (e.g., `handle_assign_directive`) in the section titled: - #+begin_quote - /* RT Extensions */ - #+end_quote - -** Tip: - Group your RT extensions near the end of the file for easy maintenance. - ---- - -* Custom Built-in Macros - -To define new built-in macros (e.g. `_CAT`, `_MAP`), several files need coordinated updates. - -** `include/cpplib.h` - -- Add a new entry to the `enum cpp_builtin_type`: - #+begin_src c - BT_CAT, - #+end_src - -- This enum distinguishes the macro’s kind in logic dispatch. - -** `init.cc` - -- Extend the `builtin_array[]` definition: - #+begin_src c - B("_CAT", BT_CAT, true), - #+end_src - -- The third parameter (`true`) indicates that a redefinition warning should be issued if the macro is redefined in user code. -- *Important:* The final entries in the table (`__DATE__`, `__TIME__`, etc.) are position-sensitive. Insert new macros near the top or middle, not the end. - -** `macros.cc` - -- This is where the actual macro expansion logic resides. -- Your `handle_builtin_macro` function will receive a `cpp_builtin_type` enum and dispatch accordingly. 
-- Add your implementation (e.g., for `_CAT`) in the RT Extensions section near the end of `macros.cc`. - - For example: - #+begin_src c - case BT_CAT: - return expand_cat_macro(pfile, ...); - #+end_src - -- If your macro uses token splicing, argument unpacking, or nesting, consider isolating each macro as a separate function for clarity and testability. - ---- - -* Notes - -- All RT Extension code is grouped under labeled sections (e.g., `/* RT Extensions */`) for both directives and built-in macros. This helps maintain a clean separation from upstream GCC code. -- If your extensions modify memory allocation, token arena behavior, or call back into the lexer, be sure to test under both expanded and unexpanded modes. - -Let me know if you'd like this converted into a template or included in your developer README. diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/fetching_a_token.org" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/fetching_a_token.org" deleted file mode 100644 index 30f4988..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/fetching_a_token.org" +++ /dev/null @@ -1,92 +0,0 @@ -#+TITLE: Token Fetching Routines in libcpp -#+AUTHOR: Thomas -#+DATE: 2025-06-10 -#+OPTIONS: toc:t - -* Getting a Token - -There are many routines for getting a token in libcpp, each tuned for different phases of preprocessing. This guide summarizes the key interfaces, grouped by their expansion behavior. - -Note that the term *expansion* is overloaded in libcpp: - -1. One meaning refers to **macro expansion** — replacing a macro call with its definition (possibly recursively). -2. Another refers to the **macro definition buffer** itself: macros store an array of tokens as their "expansion". Adding a token to this array is said to "add an expansion token", which is unrelated to expanding a macro. - ---- - -* No Macro Expansion - -** `lex.cc::_cpp_lex_token(pfile)` - -- Returns a pointer to the next token without macro expansion. -- It does handle important preprocessor logic: - - Skipping gated code (`#if 0`) - - Handling `#line` or `#pragma` - - Deferred pragma state -- This is the general-purpose low-level token fetch used during parsing — but it still obeys the logical flow of the file. - -** `macro.cc::lex_expansion_token` - -- Lexes a token and places it into the macro's `expansion` array. -- The token is *not* processed in the usual way — it does *not* honor `#if` skipping, macro argument parsing, or directive detection. -- Used by `create_iso_definition()` when parsing macro bodies. - -** `lex.cc::_cpp_lex_direct(pfile)` - -- Very low-level lexer function. -- Called *only* by `_cpp_lex_token`. -- Assumes `pfile->cur_token` is pre-set; writes directly into that memory. -- Bypasses all macro, directive, or skipping logic. -- Should not be used outside of tightly controlled contexts — *not* a public interface. - ---- - -* With Macro Expansion - -** `macro.cc::cpp_get_token_1(pfile, &src_loc)` - -- Returns the next token with full macro expansion. -- Also returns `location_t` via an out parameter. -- This is the recommended entry point for frontends or extensions that need both token and position. -- Internally calls `_cpp_lex_token` and handles expansion. - -** `macro.cc::cpp_get_token(pfile)` - -- Thin wrapper around `cpp_get_token_1`, but discards the location. -- Used in internal code where location is unnecessary. 
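
As a sketch of how the expansion-aware calls above might be used, the following hypothetical helper drains the fully expanded token stream. It assumes it is compiled inside libcpp (for example in `macro.cc`, where `cpp_get_token_1` is visible) with the usual libcpp includes already in place; `dump_expanded_tokens` is a made-up name, not code from the tree.

#+begin_src c
/* Hypothetical helper: fetch tokens with full macro expansion until
   CPP_EOF, reporting each spelling together with its location.  */
static void
dump_expanded_tokens (cpp_reader *pfile)
{
  for (;;)
    {
      location_t loc;
      const cpp_token *token = cpp_get_token_1 (pfile, &loc);

      if (token->type == CPP_EOF)
        break;

      fprintf (stderr, "%u: %s\n", (unsigned) loc,
               (const char *) cpp_token_as_text (pfile, token));
    }
}
#+end_src

`cpp_get_token (pfile)` would drive the same loop without returning the location, and `cpp_get_token_no_padding` (below) additionally skips `CPP_PADDING` tokens.
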
- -** `macro.cc::cpp_get_token_no_padding(pfile)` - -- Calls `cpp_get_token`, but skips tokens of type `CPP_PADDING`. -- Useful when padding tokens are irrelevant (e.g. while collecting macro arguments). - ---- - -* RT Extension - -** `rt_extensions.cc::get_token_noexpand(pfile)` - -- Returns one unexpanded token by value. -- Does *not* use the regular token stream; instead, it temporarily sets `pfile->cur_token` to a local buffer, calls `_cpp_lex_direct`, and restores the original pointer. -- Unlike `lex_expansion_token`, it does *not* store the result in a macro; it simply returns the token. -- This is appropriate for previewing the next token *without affecting* the token stream or macro state. - -However, for parsing structured clauses such as macro bodies — especially those that may include `#if`/`#endif` constructs — it is better to use `_cpp_lex_token`, to match the conditional skipping behavior of `cpp_get_token_1`. - ---- - -* Summary Table - -| Function | Expansion | Skips `#if 0` | Stores Token | Returns Token | Returns Location | -|-----------------------------+-----------+---------------+---------------+----------------+-------------------| -| `lex_expansion_token` | ❌ | ❌ | ✔ (into macro) | ❌ | ✔ | -| `_cpp_lex_direct` | ❌ | ❌ | ✔ (in-place) | ❌ | ✔ | -| `_cpp_lex_token` | ❌ | ✔ | ✔ | ✔ | ✔ | -| `cpp_get_token` | ✔ | ✔ | ✔ | ✔ | ❌ | -| `cpp_get_token_1` | ✔ | ✔ | ✔ | ✔ | ✔ | -| `cpp_get_token_no_padding` | ✔ | ✔ | ✔ | ✔ (no padding) | ❌ | -| `get_token_noexpand` | ❌ | ❌ | ❌ | ✔ (by value) | ✔ | - ---- - -Let me know if you'd like to cross-link these to their actual definitions or integrate them into a larger RT CPP developer manual. diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/todo.org" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/todo.org" deleted file mode 100644 index 40142c6..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/document\360\237\226\211/todo.org" +++ /dev/null @@ -1,23 +0,0 @@ -2025-05-00 - - - Add the call back and warn logic for #assign in the macro.cc::name_clause_is_name function. - - - The name is currently () or [], probably should allow a single name ID or [] instead. - or perhaps in general, allow for evaluated, not evaluated, or single identifier options. - - - When this matures, should replace the capture/install with diff and patch. - - -2025-05-17 in maco.cc, seems the end cases in `parse_clause_literal()` should be included in `parse_clause_expand()`. - -2025-05-18 It would have been better perhaps to send in a pointer to a token allocation, - instead of to a src_loc, and and terminal type. - - -2025-06-08 - - - I wonder if _ASSIGN can be contrived, perhaps along with a count, to create an - infinite number of uniquely named macros. This might be worth exploring. 
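
A speculative sketch of that last idea, assuming `_ASSIGN` takes a name clause followed by a body clause, that a bracketed clause is expanded before the binding is created, and that `_CAT` pastes its arguments; GCC's `__COUNTER__` supplies the count, and `unique_` is a made-up prefix. None of this has been checked against the implementation.

#+begin_src c
/* If __COUNTER__ expands to 0 and then 1 here, these two lines would
   bind unique_0 and unique_1 as separate macros.  */
_ASSIGN [ _CAT(unique_ ,__COUNTER__) ] ( first body )
_ASSIGN [ _CAT(unique_ ,__COUNTER__) ] ( second body )
#+end_src
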
- - - diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/environment.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/environment.sh" deleted file mode 100755 index 4bcfd3d..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/environment.sh" +++ /dev/null @@ -1,159 +0,0 @@ -# === environment.sh === -# Source this file in each build script to ensure consistent paths and settings - -#!/bin/sh - -: "${REPO_HOME:?REPO_HOME is not set}" -: "${DEVELOPER:?DEVELOPER is not set}" -: "${SCRIPT_DIR:?SCRIPT_DIR is not set}" - -[ -d "$REPO_HOME" ] || { echo "Directory not found: REPO_HOME ($REPO_HOME)" >&2; exit 1; } -[ -d "$DEVELOPER" ] || { echo "Directory not found: DEVELOPER ($DEVELOPER)" >&2; exit 1; } -[ -d "$SCRIPT_DIR" ] || { echo "Directory not found: SCRIPT_DIR ($SCRIPT_DIR)" >&2; exit 1; } - -echo "REPO_HOME: $REPO_HOME" -echo "DEVELOPER: $DEVELOPER" -echo "SCRIPT_DIR: $SCRIPT_DIR" - -#-------------------------------------------------------------------------------- -# project structure - - # temporary directory - export TMPDIR="$REPO_HOME/tmp" - - # Project directories - export SYSROOT="$DEVELOPER/sysroot" - export TOOLCHAIN="$DEVELOPER/toolchain" - export BUILD_DIR="$DEVELOPER/build" - export LOGDIR="$DEVELOPER/log" - export UPSTREAM="$DEVELOPER/upstream" - export SRC=$DEVELOPER/source - - # lists of project directories to synthesize - PROJECT_DIR_LIST=( - "$LOGDIR" - "$SYSROOT" "$TOOLCHAIN" "$BUILD_DIR" - "$UPSTREAM" "$SRC" - ) - # list these in the order for which they can be deleted - PROJECT_SUBDIR_LIST=( - "$SYSROOT/usr/lib" - "$SYSROOT/lib" - "$SYSROOT/usr/include" - ) - -#-------------------------------------------------------------------------------- -# Tool and library versions (optimized build with Graphite and LTO compression) - - export GCC_VER=12.2.0 # GCC version to build - export GMP_VER=6.2.1 # Sufficient for GCC 12.2 - export MPFR_VER=4.1.0 # Stable version compatible with GCC 12.2 - export MPC_VER=1.2.1 # Recommended for GCC 12.2 - export ISL_VER=0.24 # GCC 12.x infra uses this; don't use 0.26+ unless patched - export ZSTD_VER=1.5.5 # zstd compression for LTO bytecode - -#-------------------------------------------------------------------------------- -# tools - - # Compiler path prefixes - export CC_FOR_BUILD="$(command -v gcc)" - export CXX_FOR_BUILD="$(command -v g++)" - export MAKE="$(command -v make)" - - # Verify that compilers were found - : "${CC_FOR_BUILD:?gcc not found in PATH}" - : "${CXX_FOR_BUILD:?g++ not found in PATH}" - : "${MAKE:?make not found in PATH}" - - [ -x "$CC_FOR_BUILD" ] || { echo "❌ $CC_FOR_BUILD is not executable"; exit 1; } - [ -x "$CXX_FOR_BUILD" ] || { echo "❌ $CXX_FOR_BUILD is not executable"; exit 1; } - [ -x "$MAKE" ] || { echo "❌ $MAKE is not executable"; exit 1; } - - # Machine target - export HOST="$("$CC_FOR_BUILD" -dumpmachine)" - - # Determine parallelism - if command -v getconf >/dev/null 2>&1; then - export MAKE_JOBS=$(getconf _NPROCESSORS_ONLN) - else - echo "⚠️ getconf not found; defaulting MAKE_JOBS=1" - export MAKE_JOBS=1 - fi - - -#-------------------------------------------------------------------------------- -# upstream -> local stuff - - # see top of this file for the _VER variables - - # Tarball Download Info (Name, URL, Destination Directory) - export UPSTREAM_TARBALL_LIST=( - "gmp-${GMP_VER}.tar.xz" - "https://ftp.gnu.org/gnu/gmp/gmp-${GMP_VER}.tar.xz" - "$UPSTREAM/gmp-$GMP_VER" - - "mpfr-${MPFR_VER}.tar.xz" - "https://www.mpfr.org/mpfr-${MPFR_VER}/mpfr-${MPFR_VER}.tar.xz" 
- "$UPSTREAM/mpfr-$MPFR_VER" - - "mpc-${MPC_VER}.tar.gz" - "https://ftp.gnu.org/gnu/mpc/mpc-${MPC_VER}.tar.gz" - "$UPSTREAM/mpc-$MPC_VER" - - "isl-${ISL_VER}.tar.bz2" - "https://libisl.sourceforge.io/isl-${ISL_VER}.tar.bz2" - "$UPSTREAM/isl-$ISL_VER" - - "zstd-${ZSTD_VER}.tar.zst" - "https://github.com/facebook/zstd/releases/download/v${ZSTD_VER}/zstd-${ZSTD_VER}.tar.zst" - "$UPSTREAM/zstd-$ZSTD_VER" - ) - - # Git Repo Info - # Each entry is triple: Repository URL, Branch, Destination Directory - export UPSTREAM_GIT_REPO_LIST=( - - "git://gcc.gnu.org/git/gcc.git" - "releases/gcc-12" - "$SRC/gcc-$GCC_VER" - - #no second repo entry - ) - -#-------------------------------------------------------------------------------- -# source - - # Source directories - export GCC_SRC="$SRC/gcc-$GCC_VER" - export GMP_SRC="$SRC/gmp-$GMP_VER" - export MPFR_SRC="$SRC/mpfr-$MPFR_VER" - export MPC_SRC="$SRC/mpc-$MPC_VER" - export ISL_SRC="$SRC/isl-$ISL_VER" - export ZSTD_SRC="$SRC/zstd-$ZSTD_VER" - - SOURCE_DIR_LIST=( - "$GCC_SRC" - "$GMP_SRC" - "$MPFR_SRC" - "$MPC_SRC" - "$ISL_SRC" - "$ZSTD_SRC" - ) - -#-------------------------------------------------------------------------------- -# RT extensions affected files - - RT_CPP_FILES=(init.cc directives.cc macro.cc include/cpplib.h) - - -#-------------------------------------------------------------------------------- -# build - - # Build directories - export GCC_BUILD="$BUILD_DIR/gcc" - BUILD_DIR_LIST=( - "$GCC_BUILD" - ) - - - diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_capture.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_capture.sh" deleted file mode 100755 index 7e3ad86..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_capture.sh" +++ /dev/null @@ -1,100 +0,0 @@ -#!/bin/bash -set -euo pipefail - -# provides RT_CPP_FILES -source "$(dirname "$0")/environment.sh" - - -echo "⚠️ You probably don't want to run this script. The files in \$DEVELOPER/script_Deb-12.10_gcc-12.4.1🖉/library are intended to be the authoritative copies." -echo "So you did the bad thing and edited the files directly in the GCC source tree? Then this script is for you. ;-)" -echo - -echo -n "Continue❓ [y/N]: " -read -r response -if [[ "$response" == "y" || "$response" == "Y" ]]; then - : -else - exit 1 -fi - - -if [[ -z "${DEVELOPER:-}" ]]; then - echo "❌ DEVELOPER environment variable is not set. Aborting." - exit 1 -fi -if [[ -z "${SCRIPT_DIR:-}" ]]; then - echo "❌ SCRIPT_DIR environment variable is not set. Aborting." - exit 1 -fi -SRCDIR="library/" -DESTDIR="$GCC_SRC/libcpp/" - - -SRCDIR="$DEVELOPER/source/gcc-12.2.0/libcpp" -DESTDIR="$DEVELOPER/script_Deb-12.10_gcc-12.4.1🖉/library" - -if [[ ! -d "$SRCDIR" ]]; then - echo "❌ Source directory '$SRCDIR' does not exist." - exit 1 -fi - -if [[ ! -d "$DESTDIR" ]]; then - echo "❌ Destination directory '$DESTDIR' does not exist." - exit 1 -fi - -echo "📋 Checking files in $SRCDIR to copy to $DESTDIR..." - -for file in "${RT_CPP_FILES[@]}"; do - SRC="$SRCDIR/$file" - DEST="$DESTDIR/$file" - - mkdir -p "$(dirname "$DEST")" - - if [[ ! -f "$SRC" ]]; then - echo "⚠️ Source file '$SRC' not found. Skipping." - continue - fi - - if [[ ! -f "$DEST" ]]; then - echo "📤 No destination file. Copying: $file" - cp -p "$SRC" "$DEST" - continue - fi - - if cmp -s "$SRC" "$DEST"; then - echo "✅ No changes: $file" - continue - fi - - if [[ "$SRC" -nt "$DEST" ]]; then - echo "📤 Source is newer and differs. 
Copying: $file" - cp -p "$SRC" "$DEST" - elif [[ "$DEST" -nt "$SRC" ]]; then - echo "⚠️ Destination file '$file' is newer than source and differs." - echo "🔍 Showing diff:" - diff -u "$DEST" "$SRC" || true - echo -n "❓ Overwrite the authoritative '$file' with the older source version? [y/N]: " - read -r response - if [[ "$response" == "y" || "$response" == "Y" ]]; then - echo "📤 Overwriting with older source: $file" - cp -p "$SRC" "$DEST" - else - echo "❌ Skipping: $file" - fi - else - echo "⚠️ Files differ but timestamps are equal: $file" - echo "🔍 Showing diff:" - diff -u "$DEST" "$SRC" || true - echo -n "❓ Overwrite anyway? [y/N]: " - read -r response - if [[ "$response" == "y" || "$response" == "Y" ]]; then - cp -p "$SRC" "$DEST" - echo "📤 Overwritten." - else - echo "❌ Skipped." - fi - fi -done - -echo "✅ Capture complete." diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_diff.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_diff.sh" deleted file mode 100755 index 9d75cc9..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_diff.sh" +++ /dev/null @@ -1,64 +0,0 @@ -#!/bin/bash -set -euo pipefail - -# Provides RT_CPP_FILES and paths -source "$(dirname "$0")/environment.sh" - -# Check required env vars -if [[ -z "${DEVELOPER:-}" ]]; then - echo "❌ DEVELOPER environment variable is not set. Aborting." - exit 1 -fi -if [[ -z "${SCRIPT_DIR:-}" ]]; then - echo "❌ SCRIPT_DIR environment variable is not set. Aborting." - exit 1 -fi - -SRCDIR="library/" -DESTDIR="$GCC_SRC/libcpp/" - -if [[ ! -d "$SRCDIR" ]]; then - echo "❌ Source directory '$SRCDIR' does not exist." - exit 1 -fi - -if [[ ! -d "$DESTDIR" ]]; then - echo "❌ Destination directory '$DESTDIR' does not exist." - exit 1 -fi - -# Choose files to diff -FILES=() -if [[ "$#" -gt 0 ]]; then - FILES=("$@") -else - FILES=("${RT_CPP_FILES[@]}") -fi - -echo "🔍 Diffing library ↔ libcpp..." - -for file in "${FILES[@]}"; do - SRC="$SRCDIR/$file" - DEST="$DESTDIR/$file" - - echo "🔸 $file" - - if [[ ! -f "$SRC" ]]; then - echo " ⚠️ Missing in library/: $SRC" - continue - fi - - if [[ ! -f "$DEST" ]]; then - echo " ⚠️ Missing in libcpp/: $DEST" - continue - fi - - if cmp -s "$SRC" "$DEST"; then - echo " ✅ No differences." - else - echo " ❗ Differences found:" - diff -u "$DEST" "$SRC" || true - fi -done - -echo "✅ Diff check complete." diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_install.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_install.sh" deleted file mode 100755 index a25a026..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_install.sh" +++ /dev/null @@ -1,58 +0,0 @@ -#!/bin/bash -# ext_isntall.sh – Install RT library files into GCC libcpp source tree. -# Usage: -# ./ext_isntall.sh → ext_isntalls all files in RT_CPP_FILES -# ./ext_isntall.sh init.cc → ext_isntalls only init.cc -# ./ext_isntall.sh init.cc macro.cc → ext_isntalls just those - -set -euo pipefail - -# provides: $DEVELOPER, $RT_CPP_FILES, $GCC_SRC -source "$(dirname "$0")/environment.sh" - -SRCDIR="library" -DESTDIR="$GCC_SRC/libcpp" - -# Validate environment -[[ -z "${DEVELOPER:-}" ]] && { echo "❌ DEVELOPER is not set. Aborting."; exit 1; } -[[ ! -d "$SRCDIR" ]] && { echo "❌ Source directory '$SRCDIR' missing."; exit 1; } -[[ ! 
-d "$DESTDIR" ]] && { echo "❌ Destination directory '$DESTDIR' missing."; exit 1; } - -# Determine list of files to ext_isntall -if [[ $# -eq 0 ]]; then - file_list=("${RT_CPP_FILES[@]}") -else - file_list=("$@") -fi - -echo "📋 Ext_Isntallring files to $DESTDIR..." - -for file in "${file_list[@]}"; do - src="$SRCDIR/$file" - dest="$DESTDIR/$file" - - if [[ ! -f "$src" ]]; then - echo "⚠️ Missing source file: $src" - continue - fi - - if [[ ! -f "$dest" || "$src" -nt "$dest" ]]; then - echo "📥 Copying (newer or missing): $file" - cp -p "$src" "$dest" - elif [[ "$dest" -nt "$src" ]]; then - echo "⚠️ Destination '$file' is newer than source." - diff -u "$dest" "$src" || true - echo -n "❓ Overwrite destination '$file'? [y/N]: " - read -r response - if [[ "$response" =~ ^[Yy]$ ]]; then - echo "📥 Overwriting: $file" - cp -p "$src" "$dest" - else - echo "⏭️ Skipped: $file" - fi - else - echo "✅ Up-to-date: $file" - fi -done - -echo "✅ Ext_Isntall complete." diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_save.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_save.sh" deleted file mode 100755 index 2869048..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/ext_save.sh" +++ /dev/null @@ -1,46 +0,0 @@ -#!/bin/bash -set -euo pipefail - -# provides RT_CPP_FILES -source "$(dirname "$0")/environment.sh" - -# Save original versions of libcpp files to prevent accidental loss -# Appends _orig after the .cc extension (e.g., macro.cc → macro.cc_orig) -# Files remain in place but can be manually diffed or restored if needed - -if [[ -z "${DEVELOPER:-}" ]]; then - echo "❌ DEVELOPER environment variable is not set. Aborting." - exit 1 -fi -if [[ -z "${SCRIPT_DIR:-}" ]]; then - echo "❌ SCRIPT_DIR environment variable is not set. Aborting." - exit 1 -fi - -TARGETDIR="$GCC_SRC/libcpp/" - -if [[ ! -d "$TARGETDIR" ]]; then - echo "❌ Target directory '$TARGETDIR' does not exist." - exit 1 -fi - -echo "📦 Saving original copies of target files..." - -for file in "${RT_CPP_FILES[@]}"; do - SRC="$TARGETDIR/$file" - BACKUP="$SRC"_orig - - if [[ ! -f "$SRC" ]]; then - echo "⚠️ Source file '$SRC' not found. Skipping." - continue - fi - - if [[ -f "$BACKUP" ]]; then - echo "✅ Already saved: $file → $(basename "$BACKUP")" - else - cp -p "$SRC" "$BACKUP" - echo "📁 Saved: $file → $(basename "$BACKUP")" - fi -done - -echo "✅ All originals saved." diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/directives.cc" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/directives.cc" deleted file mode 100644 index 39fbde6..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/directives.cc" +++ /dev/null @@ -1,2886 +0,0 @@ -/* CPP Library. (Directive handling.) - Copyright (C) 1986-2022 Free Software Foundation, Inc. - Contributed by Per Bothner, 1994-95. - Based on CCCP program by Paul Rubin, June 1986 - Adapted to ANSI C, Richard Stallman, Jan 1987 - -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 3, or (at your option) any -later version. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. 
- -You should have received a copy of the GNU General Public License -along with this program; see the file COPYING3. If not see -. */ - -#pragma GCC diagnostic ignored "-Wparentheses" - -#include "config.h" -#include "system.h" -#include "cpplib.h" -#include "internal.h" -#include "mkdeps.h" -#include "obstack.h" - -/* Stack of conditionals currently in progress - (including both successful and failing conditionals). */ -struct if_stack -{ - struct if_stack *next; - location_t line; /* Line where condition started. */ - const cpp_hashnode *mi_cmacro;/* macro name for #ifndef around entire file */ - bool skip_elses; /* Can future #else / #elif be skipped? */ - bool was_skipping; /* If were skipping on entry. */ - int type; /* Most recent conditional for diagnostics. */ -}; - -/* Contains a registered pragma or pragma namespace. */ -typedef void (*pragma_cb) (cpp_reader *); -struct pragma_entry -{ - struct pragma_entry *next; - const cpp_hashnode *pragma; /* Name and length. */ - bool is_nspace; - bool is_internal; - bool is_deferred; - bool allow_expansion; - union { - pragma_cb handler; - struct pragma_entry *space; - unsigned int ident; - } u; -}; - -/* Values for the origin field of struct directive. KANDR directives - come from traditional (K&R) C. STDC89 directives come from the - 1989 C standard. STDC2X directives come from the C2X standard. EXTENSION - directives are extensions. */ -#define KANDR 0 -#define STDC89 1 -#define STDC2X 2 -#define EXTENSION 3 - -/* Values for the flags field of struct directive. COND indicates a - conditional; IF_COND an opening conditional. INCL means to treat - "..." and <...> as q-char and h-char sequences respectively. IN_I - means this directive should be handled even if -fpreprocessed is in - effect (these are the directives with callback hooks). - - EXPAND is set on directives that are always macro-expanded. - - ELIFDEF is set on directives that are only handled for standards with the - #elifdef / #elifndef feature. */ -#define COND (1 << 0) -#define IF_COND (1 << 1) -#define INCL (1 << 2) -#define IN_I (1 << 3) -#define EXPAND (1 << 4) -#define DEPRECATED (1 << 5) -#define ELIFDEF (1 << 6) - -/* Defines one #-directive, including how to handle it. */ -typedef void (*directive_handler) (cpp_reader *); -typedef struct directive directive; -struct directive -{ - directive_handler handler; /* Function to handle directive. */ - const uchar *name; /* Name of directive. */ - unsigned short length; /* Length of name. */ - unsigned char origin; /* Origin of directive. */ - unsigned char flags; /* Flags describing this directive. */ -}; - -/* Forward declarations. 
*/ - -static void skip_rest_of_line (cpp_reader *); -static void check_eol (cpp_reader *, bool); -static void start_directive (cpp_reader *); -static void prepare_directive_trad (cpp_reader *); -static void end_directive (cpp_reader *, int); -static void directive_diagnostics (cpp_reader *, const directive *, int); -static void run_directive (cpp_reader *, int, const char *, size_t); -static char *glue_header_name (cpp_reader *); -static const char *parse_include (cpp_reader *, int *, const cpp_token ***, - location_t *); -static void push_conditional (cpp_reader *, int, int, const cpp_hashnode *); -static unsigned int read_flag (cpp_reader *, unsigned int); -static bool strtolinenum (const uchar *, size_t, linenum_type *, bool *); -static void do_diagnostic (cpp_reader *, enum cpp_diagnostic_level code, - enum cpp_warning_reason reason, int); -static cpp_hashnode *lex_macro_node (cpp_reader *, bool); -static int undefine_macros (cpp_reader *, cpp_hashnode *, void *); -static void do_include_common (cpp_reader *, enum include_type); -static struct pragma_entry *lookup_pragma_entry (struct pragma_entry *, - const cpp_hashnode *); -static int count_registered_pragmas (struct pragma_entry *); -static char ** save_registered_pragmas (struct pragma_entry *, char **); -static char ** restore_registered_pragmas (cpp_reader *, struct pragma_entry *, - char **); -static void do_pragma_once (cpp_reader *); -static void do_pragma_poison (cpp_reader *); -static void do_pragma_system_header (cpp_reader *); -static void do_pragma_dependency (cpp_reader *); -static void do_pragma_warning_or_error (cpp_reader *, bool error); -static void do_pragma_warning (cpp_reader *); -static void do_pragma_error (cpp_reader *); -static void do_linemarker (cpp_reader *); -static const cpp_token *get_token_no_padding (cpp_reader *); -static const cpp_token *get__Pragma_string (cpp_reader *); -static void destringize_and_run (cpp_reader *, const cpp_string *, - location_t); -static bool parse_answer (cpp_reader *, int, location_t, cpp_macro **); -static cpp_hashnode *parse_assertion (cpp_reader *, int, cpp_macro **); -static cpp_macro **find_answer (cpp_hashnode *, const cpp_macro *); -static void handle_assertion (cpp_reader *, const char *, int); -static void do_pragma_push_macro (cpp_reader *); -static void do_pragma_pop_macro (cpp_reader *); -static void cpp_pop_definition (cpp_reader *, struct def_pragma_macro *); - -/* This is the table of directive handlers. All extensions other than - #warning, #include_next, and #import are deprecated. The name is - where the extension appears to have come from. 
*/ - -#define DIRECTIVE_TABLE \ - D(define ,T_DEFINE = 0 ,KANDR ,IN_I) \ - D(include ,T_INCLUDE ,KANDR ,INCL | EXPAND) \ - D(endif ,T_ENDIF ,KANDR ,COND) \ - D(ifdef ,T_IFDEF ,KANDR ,COND | IF_COND) \ - D(if ,T_IF ,KANDR ,COND | IF_COND | EXPAND) \ - D(else ,T_ELSE ,KANDR ,COND) \ - D(ifndef ,T_IFNDEF ,KANDR ,COND | IF_COND) \ - D(undef ,T_UNDEF ,KANDR ,IN_I) \ - D(line ,T_LINE ,KANDR ,EXPAND) \ - D(elif ,T_ELIF ,STDC89 ,COND | EXPAND) \ - D(elifdef ,T_ELIFDEF ,STDC2X ,COND | ELIFDEF) \ - D(elifndef ,T_ELIFNDEF ,STDC2X ,COND | ELIFDEF) \ - D(error ,T_ERROR ,STDC89 ,0) \ - D(pragma ,T_PRAGMA ,STDC89 ,IN_I) \ - D(warning ,T_WARNING ,EXTENSION ,0) \ - D(include_next ,T_INCLUDE_NEXT ,EXTENSION ,INCL | EXPAND) \ - D(ident ,T_IDENT ,EXTENSION ,IN_I) \ - D(import ,T_IMPORT ,EXTENSION ,INCL | EXPAND) /* ObjC */ \ - D(assert ,T_ASSERT ,EXTENSION ,DEPRECATED) /* SVR4 */ \ - D(unassert ,T_UNASSERT ,EXTENSION ,DEPRECATED) /* SVR4 */ \ - D(sccs ,T_SCCS ,EXTENSION ,IN_I) /* SVR4? */ \ - D(rt_macro ,T_MACRO ,EXTENSION ,IN_I) \ - D(assign ,T_ASSIGN ,EXTENSION ,IN_I) - - -/* #sccs is synonymous with #ident. */ -#define do_sccs do_ident - -/* Use the table to generate a series of prototypes, an enum for the - directive names, and an array of directive handlers. */ - -#define D(name, t, o, f) static void do_##name (cpp_reader *); -DIRECTIVE_TABLE -#undef D - -#define D(n, tag, o, f) tag, -enum -{ - DIRECTIVE_TABLE - N_DIRECTIVES -}; -#undef D - -#define D(name, t, origin, flags) \ -{ do_##name, (const uchar *) #name, \ - sizeof #name - 1, origin, flags }, -static const directive dtable[] = -{ -DIRECTIVE_TABLE -}; -#undef D - -/* A NULL-terminated array of directive names for use - when suggesting corrections for misspelled directives. */ -#define D(name, t, origin, flags) #name, -static const char * const directive_names[] = { -DIRECTIVE_TABLE - NULL -}; -#undef D - -#undef DIRECTIVE_TABLE - -/* Wrapper struct directive for linemarkers. - The origin is more or less true - the original K+R cpp - did use this notation in its preprocessed output. */ -static const directive linemarker_dir = -{ - do_linemarker, UC"#", 1, KANDR, IN_I -}; - -/* Skip any remaining tokens in a directive. */ -static void -skip_rest_of_line (cpp_reader *pfile) -{ - /* Discard all stacked contexts. */ - while (pfile->context->prev) - _cpp_pop_context (pfile); - - /* Sweep up all tokens remaining on the line. */ - if (! SEEN_EOL ()) - while (_cpp_lex_token (pfile)->type != CPP_EOF) - ; -} - -/* Helper function for check_oel. */ - -static void -check_eol_1 (cpp_reader *pfile, bool expand, enum cpp_warning_reason reason) -{ - if (! SEEN_EOL () && (expand - ? cpp_get_token (pfile) - : _cpp_lex_token (pfile))->type != CPP_EOF) - cpp_pedwarning (pfile, reason, "extra tokens at end of #%s directive", - pfile->directive->name); -} - -/* Variant of check_eol used for Wendif-labels warnings. */ - -static void -check_eol_endif_labels (cpp_reader *pfile) -{ - check_eol_1 (pfile, false, CPP_W_ENDIF_LABELS); -} - -/* Ensure there are no stray tokens at the end of a directive. If - EXPAND is true, tokens macro-expanding to nothing are allowed. */ - -static void -check_eol (cpp_reader *pfile, bool expand) -{ - check_eol_1 (pfile, expand, CPP_W_NONE); -} - -/* Ensure there are no stray tokens other than comments at the end of - a directive, and gather the comments. 
*/ -static const cpp_token ** -check_eol_return_comments (cpp_reader *pfile) -{ - size_t c; - size_t capacity = 8; - const cpp_token **buf; - - buf = XNEWVEC (const cpp_token *, capacity); - c = 0; - if (! SEEN_EOL ()) - { - while (1) - { - const cpp_token *tok; - - tok = _cpp_lex_token (pfile); - if (tok->type == CPP_EOF) - break; - if (tok->type != CPP_COMMENT) - cpp_error (pfile, CPP_DL_PEDWARN, - "extra tokens at end of #%s directive", - pfile->directive->name); - else - { - if (c + 1 >= capacity) - { - capacity *= 2; - buf = XRESIZEVEC (const cpp_token *, buf, capacity); - } - buf[c] = tok; - ++c; - } - } - } - buf[c] = NULL; - return buf; -} - -/* Called when entering a directive, _Pragma or command-line directive. */ -static void -start_directive (cpp_reader *pfile) -{ - /* Setup in-directive state. */ - pfile->state.in_directive = 1; - pfile->state.save_comments = 0; - pfile->directive_result.type = CPP_PADDING; - - /* Some handlers need the position of the # for diagnostics. */ - pfile->directive_line = pfile->line_table->highest_line; -} - -/* Called when leaving a directive, _Pragma or command-line directive. */ -static void -end_directive (cpp_reader *pfile, int skip_line) -{ - if (CPP_OPTION (pfile, traditional)) - { - /* Revert change of prepare_directive_trad. */ - if (!pfile->state.in_deferred_pragma) - pfile->state.prevent_expansion--; - - if (pfile->directive != &dtable[T_DEFINE]) - _cpp_remove_overlay (pfile); - } - else if (pfile->state.in_deferred_pragma) - ; - /* We don't skip for an assembler #. */ - else if (skip_line) - { - skip_rest_of_line (pfile); - if (!pfile->keep_tokens) - { - pfile->cur_run = &pfile->base_run; - pfile->cur_token = pfile->base_run.base; - } - } - - /* Restore state. */ - pfile->state.save_comments = ! CPP_OPTION (pfile, discard_comments); - pfile->state.in_directive = 0; - pfile->state.in_expression = 0; - pfile->state.angled_headers = 0; - pfile->directive = 0; -} - -/* Prepare to handle the directive in pfile->directive. */ -static void -prepare_directive_trad (cpp_reader *pfile) -{ - if (pfile->directive != &dtable[T_DEFINE]) - { - bool no_expand = (pfile->directive - && ! (pfile->directive->flags & EXPAND)); - bool was_skipping = pfile->state.skipping; - - pfile->state.in_expression = (pfile->directive == &dtable[T_IF] - || pfile->directive == &dtable[T_ELIF]); - if (pfile->state.in_expression) - pfile->state.skipping = false; - - if (no_expand) - pfile->state.prevent_expansion++; - _cpp_scan_out_logical_line (pfile, NULL, false); - if (no_expand) - pfile->state.prevent_expansion--; - - pfile->state.skipping = was_skipping; - _cpp_overlay_buffer (pfile, pfile->out.base, - pfile->out.cur - pfile->out.base); - } - - /* Stop ISO C from expanding anything. */ - pfile->state.prevent_expansion++; -} - -/* Output diagnostics for a directive DIR. INDENTED is nonzero if - the '#' was indented. */ -static void -directive_diagnostics (cpp_reader *pfile, const directive *dir, int indented) -{ - /* Issue -pedantic or deprecated warnings for extensions. We let - -pedantic take precedence if both are applicable. */ - if (! 
pfile->state.skipping) - { - if (dir->origin == EXTENSION - && !(dir == &dtable[T_IMPORT] && CPP_OPTION (pfile, objc)) - && CPP_PEDANTIC (pfile)) - cpp_error (pfile, CPP_DL_PEDWARN, "#%s is a GCC extension", dir->name); - else if (((dir->flags & DEPRECATED) != 0 - || (dir == &dtable[T_IMPORT] && !CPP_OPTION (pfile, objc))) - && CPP_OPTION (pfile, cpp_warn_deprecated)) - cpp_warning (pfile, CPP_W_DEPRECATED, - "#%s is a deprecated GCC extension", dir->name); - } - - /* Traditionally, a directive is ignored unless its # is in - column 1. Therefore in code intended to work with K+R - compilers, directives added by C89 must have their # - indented, and directives present in traditional C must not. - This is true even of directives in skipped conditional - blocks. #elif cannot be used at all. */ - if (CPP_WTRADITIONAL (pfile)) - { - if (dir == &dtable[T_ELIF]) - cpp_warning (pfile, CPP_W_TRADITIONAL, - "suggest not using #elif in traditional C"); - else if (indented && dir->origin == KANDR) - cpp_warning (pfile, CPP_W_TRADITIONAL, - "traditional C ignores #%s with the # indented", - dir->name); - else if (!indented && dir->origin != KANDR) - cpp_warning (pfile, CPP_W_TRADITIONAL, - "suggest hiding #%s from traditional C with an indented #", - dir->name); - } -} - -/* Check if we have a known directive. INDENTED is true if the - '#' of the directive was indented. This function is in this file - to save unnecessarily exporting dtable etc. to lex.cc. Returns - nonzero if the line of tokens has been handled, zero if we should - continue processing the line. */ -int -_cpp_handle_directive (cpp_reader *pfile, bool indented) -{ - const directive *dir = 0; - const cpp_token *dname; - bool was_parsing_args = pfile->state.parsing_args; - bool was_discarding_output = pfile->state.discarding_output; - int skip = 1; - - if (was_discarding_output) - pfile->state.prevent_expansion = 0; - - if (was_parsing_args) - { - if (CPP_OPTION (pfile, cpp_pedantic)) - cpp_error (pfile, CPP_DL_PEDWARN, - "embedding a directive within macro arguments is not portable"); - pfile->state.parsing_args = 0; - pfile->state.prevent_expansion = 0; - } - start_directive (pfile); - dname = _cpp_lex_token (pfile); - - if (dname->type == CPP_NAME) - { - if (dname->val.node.node->is_directive) - { - dir = &dtable[dname->val.node.node->directive_index]; - if ((dir->flags & ELIFDEF) - && !CPP_OPTION (pfile, elifdef) - /* For -std=gnu* modes elifdef is supported with - a pedwarn if pedantic. */ - && CPP_OPTION (pfile, std)) - dir = 0; - } - } - /* We do not recognize the # followed by a number extension in - assembler code. */ - else if (dname->type == CPP_NUMBER && CPP_OPTION (pfile, lang) != CLK_ASM) - { - dir = &linemarker_dir; - if (CPP_PEDANTIC (pfile) && ! CPP_OPTION (pfile, preprocessed) - && ! pfile->state.skipping) - cpp_error (pfile, CPP_DL_PEDWARN, - "style of line directive is a GCC extension"); - } - - if (dir) - { - /* If we have a directive that is not an opening conditional, - invalidate any control macro. */ - if (! (dir->flags & IF_COND)) - pfile->mi_valid = false; - - /* Kluge alert. In order to be sure that code like this - - #define HASH # - HASH define foo bar - - does not cause '#define foo bar' to get executed when - compiled with -save-temps, we recognize directives in - -fpreprocessed mode only if the # is in column 1. macro.cc - puts a space in front of any '#' at the start of a macro. 
- - We exclude the -fdirectives-only case because macro expansion - has not been performed yet, and block comments can cause spaces - to precede the directive. */ - if (CPP_OPTION (pfile, preprocessed) - && !CPP_OPTION (pfile, directives_only) - && (indented || !(dir->flags & IN_I))) - { - skip = 0; - dir = 0; - } - else - { - /* In failed conditional groups, all non-conditional - directives are ignored. Before doing that, whether - skipping or not, we should lex angle-bracketed headers - correctly, and maybe output some diagnostics. */ - pfile->state.angled_headers = dir->flags & INCL; - pfile->state.directive_wants_padding = dir->flags & INCL; - if (! CPP_OPTION (pfile, preprocessed)) - directive_diagnostics (pfile, dir, indented); - if (pfile->state.skipping && !(dir->flags & COND)) - dir = 0; - } - } - else if (dname->type == CPP_EOF) - ; /* CPP_EOF is the "null directive". */ - else - { - /* An unknown directive. Don't complain about it in assembly - source: we don't know where the comments are, and # may - introduce assembler pseudo-ops. Don't complain about invalid - directives in skipped conditional groups (6.10 p4). */ - if (CPP_OPTION (pfile, lang) == CLK_ASM) - skip = 0; - else if (!pfile->state.skipping) - { - const char *unrecognized - = (const char *)cpp_token_as_text (pfile, dname); - const char *hint = NULL; - - /* Call back into gcc to get a spelling suggestion. Ideally - we'd just use best_match from gcc/spellcheck.h (and filter - out the uncommon directives), but that requires moving it - to a support library. */ - if (pfile->cb.get_suggestion) - hint = pfile->cb.get_suggestion (pfile, unrecognized, - directive_names); - - if (hint) - { - rich_location richloc (pfile->line_table, dname->src_loc); - source_range misspelled_token_range - = get_range_from_loc (pfile->line_table, dname->src_loc); - richloc.add_fixit_replace (misspelled_token_range, hint); - cpp_error_at (pfile, CPP_DL_ERROR, &richloc, - "invalid preprocessing directive #%s;" - " did you mean #%s?", - unrecognized, hint); - } - else - cpp_error (pfile, CPP_DL_ERROR, - "invalid preprocessing directive #%s", - unrecognized); - } - } - - pfile->directive = dir; - if (CPP_OPTION (pfile, traditional)) - prepare_directive_trad (pfile); - - if (dir) - pfile->directive->handler (pfile); - else if (skip == 0) - _cpp_backup_tokens (pfile, 1); - - end_directive (pfile, skip); - if (was_parsing_args && !pfile->state.in_deferred_pragma) - { - /* Restore state when within macro args. */ - pfile->state.parsing_args = 2; - pfile->state.prevent_expansion = 1; - } - if (was_discarding_output) - pfile->state.prevent_expansion = 1; - return skip; -} - -/* Directive handler wrapper used by the command line option - processor. BUF is \n terminated. */ -static void -run_directive (cpp_reader *pfile, int dir_no, const char *buf, size_t count) -{ - cpp_push_buffer (pfile, (const uchar *) buf, count, - /* from_stage3 */ true); - start_directive (pfile); - - /* This is a short-term fix to prevent a leading '#' being - interpreted as a directive. */ - _cpp_clean_line (pfile); - - pfile->directive = &dtable[dir_no]; - if (CPP_OPTION (pfile, traditional)) - prepare_directive_trad (pfile); - pfile->directive->handler (pfile); - end_directive (pfile, 1); - _cpp_pop_buffer (pfile); -} - -/* Checks for validity the macro name in #define, #undef, #ifdef and - #ifndef directives. IS_DEF_OR_UNDEF is true if this call is - processing a #define or #undefine directive, and false - otherwise. 
*/ -static cpp_hashnode * -lex_macro_node (cpp_reader *pfile, bool is_def_or_undef) -{ - const cpp_token *token = _cpp_lex_token (pfile); - - /* The token immediately after #define must be an identifier. That - identifier may not be "defined", per C99 6.10.8p4. - In C++, it may not be any of the "named operators" either, - per C++98 [lex.digraph], [lex.key]. - Finally, the identifier may not have been poisoned. (In that case - the lexer has issued the error message for us.) */ - - if (token->type == CPP_NAME) - { - cpp_hashnode *node = token->val.node.node; - - if (is_def_or_undef - && node == pfile->spec_nodes.n_defined) - cpp_error (pfile, CPP_DL_ERROR, - "\"%s\" cannot be used as a macro name", - NODE_NAME (node)); - else if (! (node->flags & NODE_POISONED)) - return node; - } - else if (token->flags & NAMED_OP) - cpp_error (pfile, CPP_DL_ERROR, - "\"%s\" cannot be used as a macro name as it is an operator in C++", - NODE_NAME (token->val.node.node)); - else if (token->type == CPP_EOF) - cpp_error (pfile, CPP_DL_ERROR, "no macro name given in #%s directive", - pfile->directive->name); - else - cpp_error (pfile, CPP_DL_ERROR, "macro names must be identifiers"); - - return NULL; -} - -/* Process a #define directive. Most work is done in macro.cc. */ -static void -do_define (cpp_reader *pfile) -{ - cpp_hashnode *node = lex_macro_node (pfile, true); - - if (node) - { - /* If we have been requested to expand comments into macros, - then re-enable saving of comments. */ - pfile->state.save_comments = - ! CPP_OPTION (pfile, discard_comments_in_macro_exp); - - if (pfile->cb.before_define) - pfile->cb.before_define (pfile); - - if (_cpp_create_definition (pfile, node)) - if (pfile->cb.define) - pfile->cb.define (pfile, pfile->directive_line, node); - - node->flags &= ~NODE_USED; - } -} - -/* Handle #undef. Mark the identifier NT_VOID in the hash table. */ -static void -do_undef (cpp_reader *pfile) -{ - cpp_hashnode *node = lex_macro_node (pfile, true); - - if (node) - { - if (pfile->cb.before_define) - pfile->cb.before_define (pfile); - - if (pfile->cb.undef) - pfile->cb.undef (pfile, pfile->directive_line, node); - - /* 6.10.3.5 paragraph 2: [#undef] is ignored if the specified - identifier is not currently defined as a macro name. */ - if (cpp_macro_p (node)) - { - if (node->flags & NODE_WARN) - cpp_error (pfile, CPP_DL_WARNING, - "undefining \"%s\"", NODE_NAME (node)); - else if (cpp_builtin_macro_p (node) - && CPP_OPTION (pfile, warn_builtin_macro_redefined)) - cpp_warning_with_line (pfile, CPP_W_BUILTIN_MACRO_REDEFINED, - pfile->directive_line, 0, - "undefining \"%s\"", NODE_NAME (node)); - - if (node->value.macro - && CPP_OPTION (pfile, warn_unused_macros)) - _cpp_warn_if_unused_macro (pfile, node, NULL); - - _cpp_free_definition (node); - } - } - - check_eol (pfile, false); -} - -/* Undefine a single macro/assertion/whatever. */ - -static int -undefine_macros (cpp_reader *pfile ATTRIBUTE_UNUSED, cpp_hashnode *h, - void *data_p ATTRIBUTE_UNUSED) -{ - /* Body of _cpp_free_definition inlined here for speed. - Macros and assertions no longer have anything to free. */ - h->type = NT_VOID; - h->value.answers = NULL; - h->flags &= ~(NODE_POISONED|NODE_DISABLED|NODE_USED); - return 1; -} - -/* Undefine all macros and assertions. */ - -void -cpp_undef_all (cpp_reader *pfile) -{ - cpp_forall_identifiers (pfile, undefine_macros, NULL); -} - - -/* Helper routine used by parse_include. Reinterpret the current line - as an h-char-sequence (< ... >); we are looking at the first token - after the <. 
Returns a malloced filename. */ -static char * -glue_header_name (cpp_reader *pfile) -{ - const cpp_token *token; - char *buffer; - size_t len, total_len = 0, capacity = 1024; - - /* To avoid lexed tokens overwriting our glued name, we can only - allocate from the string pool once we've lexed everything. */ - buffer = XNEWVEC (char, capacity); - for (;;) - { - token = get_token_no_padding (pfile); - - if (token->type == CPP_GREATER) - break; - if (token->type == CPP_EOF) - { - cpp_error (pfile, CPP_DL_ERROR, "missing terminating > character"); - break; - } - - len = cpp_token_len (token) + 2; /* Leading space, terminating \0. */ - if (total_len + len > capacity) - { - capacity = (capacity + len) * 2; - buffer = XRESIZEVEC (char, buffer, capacity); - } - - if (token->flags & PREV_WHITE) - buffer[total_len++] = ' '; - - total_len = (cpp_spell_token (pfile, token, (uchar *) &buffer[total_len], - true) - - (uchar *) buffer); - } - - buffer[total_len] = '\0'; - return buffer; -} - -/* Returns the file name of #include, #include_next, #import and - #pragma dependency. The string is malloced and the caller should - free it. Returns NULL on error. LOCATION is the source location - of the file name. */ - -static const char * -parse_include (cpp_reader *pfile, int *pangle_brackets, - const cpp_token ***buf, location_t *location) -{ - char *fname; - const cpp_token *header; - - /* Allow macro expansion. */ - header = get_token_no_padding (pfile); - *location = header->src_loc; - if ((header->type == CPP_STRING && header->val.str.text[0] != 'R') - || header->type == CPP_HEADER_NAME) - { - fname = XNEWVEC (char, header->val.str.len - 1); - memcpy (fname, header->val.str.text + 1, header->val.str.len - 2); - fname[header->val.str.len - 2] = '\0'; - *pangle_brackets = header->type == CPP_HEADER_NAME; - } - else if (header->type == CPP_LESS) - { - fname = glue_header_name (pfile); - *pangle_brackets = 1; - } - else - { - const unsigned char *dir; - - if (pfile->directive == &dtable[T_PRAGMA]) - dir = UC"pragma dependency"; - else - dir = pfile->directive->name; - cpp_error (pfile, CPP_DL_ERROR, "#%s expects \"FILENAME\" or ", - dir); - - return NULL; - } - - if (pfile->directive == &dtable[T_PRAGMA]) - { - /* This pragma allows extra tokens after the file name. */ - } - else if (buf == NULL || CPP_OPTION (pfile, discard_comments)) - check_eol (pfile, true); - else - { - /* If we are not discarding comments, then gather them while - doing the eol check. */ - *buf = check_eol_return_comments (pfile); - } - - return fname; -} - -/* Handle #include, #include_next and #import. */ -static void -do_include_common (cpp_reader *pfile, enum include_type type) -{ - const char *fname; - int angle_brackets; - const cpp_token **buf = NULL; - location_t location; - - /* Re-enable saving of comments if requested, so that the include - callback can dump comments which follow #include. */ - pfile->state.save_comments = ! CPP_OPTION (pfile, discard_comments); - - /* Tell the lexer this is an include directive -- we want it to - increment the line number even if this is the last line of a file. */ - pfile->state.in_directive = 2; - - fname = parse_include (pfile, &angle_brackets, &buf, &location); - if (!fname) - goto done; - - if (!*fname) - { - cpp_error_with_line (pfile, CPP_DL_ERROR, location, 0, - "empty filename in #%s", - pfile->directive->name); - goto done; - } - - /* Prevent #include recursion. 
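   A header that includes itself without a guard would otherwise nest until
   resources are exhausted; once the configured limit is exceeded the directive
   is diagnosed and not performed (the limit can be raised with
   -fmax-include-depth=DEPTH).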
*/ - if (pfile->line_table->depth >= CPP_OPTION (pfile, max_include_depth)) - cpp_error (pfile, - CPP_DL_ERROR, - "#include nested depth %u exceeds maximum of %u" - " (use -fmax-include-depth=DEPTH to increase the maximum)", - pfile->line_table->depth, - CPP_OPTION (pfile, max_include_depth)); - else - { - /* Get out of macro context, if we are. */ - skip_rest_of_line (pfile); - - if (pfile->cb.include) - pfile->cb.include (pfile, pfile->directive_line, - pfile->directive->name, fname, angle_brackets, - buf); - - _cpp_stack_include (pfile, fname, angle_brackets, type, location); - } - - done: - XDELETEVEC (fname); - if (buf) - XDELETEVEC (buf); -} - -static void -do_include (cpp_reader *pfile) -{ - do_include_common (pfile, IT_INCLUDE); -} - -static void -do_import (cpp_reader *pfile) -{ - do_include_common (pfile, IT_IMPORT); -} - -static void -do_include_next (cpp_reader *pfile) -{ - enum include_type type = IT_INCLUDE_NEXT; - - /* If this is the primary source file, warn and use the normal - search logic. */ - if (_cpp_in_main_source_file (pfile)) - { - cpp_error (pfile, CPP_DL_WARNING, - "#include_next in primary source file"); - type = IT_INCLUDE; - } - do_include_common (pfile, type); -} - -/* Subroutine of do_linemarker. Read possible flags after file name. - LAST is the last flag seen; 0 if this is the first flag. Return the - flag if it is valid, 0 at the end of the directive. Otherwise - complain. */ -static unsigned int -read_flag (cpp_reader *pfile, unsigned int last) -{ - const cpp_token *token = _cpp_lex_token (pfile); - - if (token->type == CPP_NUMBER && token->val.str.len == 1) - { - unsigned int flag = token->val.str.text[0] - '0'; - - if (flag > last && flag <= 4 - && (flag != 4 || last == 3) - && (flag != 2 || last == 0)) - return flag; - } - - if (token->type != CPP_EOF) - cpp_error (pfile, CPP_DL_ERROR, "invalid flag \"%s\" in line directive", - cpp_token_as_text (pfile, token)); - return 0; -} - -/* Subroutine of do_line and do_linemarker. Convert a number in STR, - of length LEN, to binary; store it in NUMP, and return false if the - number was well-formed, true if not. WRAPPED is set to true if the - number did not fit into 'linenum_type'. */ -static bool -strtolinenum (const uchar *str, size_t len, linenum_type *nump, bool *wrapped) -{ - linenum_type reg = 0; - - uchar c; - bool seen_digit_sep = false; - *wrapped = false; - while (len--) - { - c = *str++; - if (!seen_digit_sep && c == '\'' && len) - { - seen_digit_sep = true; - continue; - } - if (!ISDIGIT (c)) - return true; - seen_digit_sep = false; - if (reg > ((linenum_type) -1) / 10) - *wrapped = true; - reg *= 10; - if (reg > ((linenum_type) -1) - (c - '0')) - *wrapped = true; - reg += c - '0'; - } - *nump = reg; - return false; -} - -/* Interpret #line command. - Note that the filename string (if any) is a true string constant - (escapes are interpreted). */ -static void -do_line (cpp_reader *pfile) -{ - class line_maps *line_table = pfile->line_table; - const line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (line_table); - - /* skip_rest_of_line() may cause line table to be realloc()ed so note down - sysp right now. */ - - unsigned char map_sysp = ORDINARY_MAP_IN_SYSTEM_HEADER_P (map); - const cpp_token *token; - const char *new_file = ORDINARY_MAP_FILE_NAME (map); - linenum_type new_lineno; - - /* C99 raised the minimum limit on #line numbers. */ - linenum_type cap = CPP_OPTION (pfile, c99) ? 2147483647 : 32767; - bool wrapped; - - /* #line commands expand macros. 
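   For example (illustrative only):
     #define LINE_NO 42
     #line LINE_NO "renamed.c"
   is processed as `#line 42 "renamed.c"`.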
*/ - token = cpp_get_token (pfile); - if (token->type != CPP_NUMBER - || strtolinenum (token->val.str.text, token->val.str.len, - &new_lineno, &wrapped)) - { - if (token->type == CPP_EOF) - cpp_error (pfile, CPP_DL_ERROR, "unexpected end of file after #line"); - else - cpp_error (pfile, CPP_DL_ERROR, - "\"%s\" after #line is not a positive integer", - cpp_token_as_text (pfile, token)); - return; - } - - if (CPP_PEDANTIC (pfile) && (new_lineno == 0 || new_lineno > cap || wrapped)) - cpp_error (pfile, CPP_DL_PEDWARN, "line number out of range"); - else if (wrapped) - cpp_error (pfile, CPP_DL_WARNING, "line number out of range"); - - token = cpp_get_token (pfile); - if (token->type == CPP_STRING) - { - cpp_string s = { 0, 0 }; - if (cpp_interpret_string_notranslate (pfile, &token->val.str, 1, - &s, CPP_STRING)) - new_file = (const char *)s.text; - check_eol (pfile, true); - } - else if (token->type != CPP_EOF) - { - cpp_error (pfile, CPP_DL_ERROR, "\"%s\" is not a valid filename", - cpp_token_as_text (pfile, token)); - return; - } - - skip_rest_of_line (pfile); - _cpp_do_file_change (pfile, LC_RENAME_VERBATIM, new_file, new_lineno, - map_sysp); - line_table->seen_line_directive = true; -} - -/* Interpret the # 44 "file" [flags] notation, which has slightly - different syntax and semantics from #line: Flags are allowed, - and we never complain about the line number being too big. */ -static void -do_linemarker (cpp_reader *pfile) -{ - class line_maps *line_table = pfile->line_table; - const line_map_ordinary *map = LINEMAPS_LAST_ORDINARY_MAP (line_table); - const cpp_token *token; - const char *new_file = ORDINARY_MAP_FILE_NAME (map); - linenum_type new_lineno; - unsigned int new_sysp = ORDINARY_MAP_IN_SYSTEM_HEADER_P (map); - enum lc_reason reason = LC_RENAME_VERBATIM; - int flag; - bool wrapped; - - /* Back up so we can get the number again. Putting this in - _cpp_handle_directive risks two calls to _cpp_backup_tokens in - some circumstances, which can segfault. */ - _cpp_backup_tokens (pfile, 1); - - /* #line commands expand macros. */ - token = cpp_get_token (pfile); - if (token->type != CPP_NUMBER - || strtolinenum (token->val.str.text, token->val.str.len, - &new_lineno, &wrapped)) - { - /* Unlike #line, there does not seem to be a way to get an EOF - here. So, it should be safe to always spell the token. */ - cpp_error (pfile, CPP_DL_ERROR, - "\"%s\" after # is not a positive integer", - cpp_token_as_text (pfile, token)); - return; - } - - token = cpp_get_token (pfile); - if (token->type == CPP_STRING) - { - cpp_string s = { 0, 0 }; - if (cpp_interpret_string_notranslate (pfile, &token->val.str, - 1, &s, CPP_STRING)) - new_file = (const char *)s.text; - - new_sysp = 0; - flag = read_flag (pfile, 0); - if (flag == 1) - { - reason = LC_ENTER; - /* Fake an include for cpp_included (). */ - _cpp_fake_include (pfile, new_file); - flag = read_flag (pfile, flag); - } - else if (flag == 2) - { - reason = LC_LEAVE; - flag = read_flag (pfile, flag); - } - if (flag == 3) - { - new_sysp = 1; - flag = read_flag (pfile, flag); - if (flag == 4) - new_sysp = 2; - } - pfile->buffer->sysp = new_sysp; - - check_eol (pfile, false); - } - else if (token->type != CPP_EOF) - { - cpp_error (pfile, CPP_DL_ERROR, "\"%s\" is not a valid filename", - cpp_token_as_text (pfile, token)); - return; - } - - skip_rest_of_line (pfile); - - if (reason == LC_LEAVE) - { - /* Reread map since cpp_get_token can invalidate it with a - reallocation. 
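   The checks below pair a flag-2 marker with the map it returns to; e.g.
   (illustrative only) `# 1 "inner.h" 1` followed later by `# 7 "outer.c" 2`
   pops back to outer.c, whereas a marker naming some other file is ignored
   with the "incorrect nesting" warning.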
*/ - map = LINEMAPS_LAST_ORDINARY_MAP (line_table); - const line_map_ordinary *from - = linemap_included_from_linemap (line_table, map); - - if (!from) - /* Not nested. */; - else if (!new_file[0]) - /* Leaving to "" means fill in the popped-to name. */ - new_file = ORDINARY_MAP_FILE_NAME (from); - else if (filename_cmp (ORDINARY_MAP_FILE_NAME (from), new_file) != 0) - /* It's the wrong name, Grommit! */ - from = NULL; - - if (!from) - { - cpp_warning (pfile, CPP_W_NONE, - "file \"%s\" linemarker ignored due to " - "incorrect nesting", new_file); - return; - } - } - - /* Compensate for the increment in linemap_add that occurs in - _cpp_do_file_change. We're currently at the start of the line - *following* the #line directive. A separate location_t for this - location makes no sense (until we do the LC_LEAVE), and - complicates LAST_SOURCE_LINE_LOCATION. */ - pfile->line_table->highest_location--; - - _cpp_do_file_change (pfile, reason, new_file, new_lineno, new_sysp); - line_table->seen_line_directive = true; -} - -/* Arrange the file_change callback. Changing to TO_FILE:TO_LINE for - REASON. SYSP is 1 for a system header, 2 for a system header that - needs to be extern "C" protected, and zero otherwise. */ -void -_cpp_do_file_change (cpp_reader *pfile, enum lc_reason reason, - const char *to_file, linenum_type to_line, - unsigned int sysp) -{ - linemap_assert (reason != LC_ENTER_MACRO); - - const line_map_ordinary *ord_map = NULL; - if (!to_line && reason == LC_RENAME_VERBATIM) - { - /* A linemarker moving to line zero. If we're on the second - line of the current map, and it also starts at zero, just - rewind -- we're probably reading the builtins of a - preprocessed source. */ - line_map_ordinary *last = LINEMAPS_LAST_ORDINARY_MAP (pfile->line_table); - if (!ORDINARY_MAP_STARTING_LINE_NUMBER (last) - && 0 == filename_cmp (to_file, ORDINARY_MAP_FILE_NAME (last)) - && SOURCE_LINE (last, pfile->line_table->highest_line) == 2) - { - ord_map = last; - pfile->line_table->highest_location - = pfile->line_table->highest_line = MAP_START_LOCATION (last); - } - } - - if (!ord_map) - if (const line_map *map = linemap_add (pfile->line_table, reason, sysp, - to_file, to_line)) - { - ord_map = linemap_check_ordinary (map); - linemap_line_start (pfile->line_table, - ORDINARY_MAP_STARTING_LINE_NUMBER (ord_map), - 127); - } - - if (pfile->cb.file_change) - pfile->cb.file_change (pfile, ord_map); -} - -/* Report a warning or error detected by the program we are - processing. Use the directive's tokens in the error message. */ -static void -do_diagnostic (cpp_reader *pfile, enum cpp_diagnostic_level code, - enum cpp_warning_reason reason, int print_dir) -{ - const unsigned char *dir_name; - unsigned char *line; - location_t src_loc = pfile->cur_token[-1].src_loc; - - if (print_dir) - dir_name = pfile->directive->name; - else - dir_name = NULL; - pfile->state.prevent_expansion++; - line = cpp_output_line_to_string (pfile, dir_name); - pfile->state.prevent_expansion--; - - if (code == CPP_DL_WARNING_SYSHDR && reason) - cpp_warning_with_line_syshdr (pfile, reason, src_loc, 0, "%s", line); - else if (code == CPP_DL_WARNING && reason) - cpp_warning_with_line (pfile, reason, src_loc, 0, "%s", line); - else - cpp_error_with_line (pfile, code, src_loc, 0, "%s", line); - free (line); -} - -static void -do_error (cpp_reader *pfile) -{ - do_diagnostic (pfile, CPP_DL_ERROR, CPP_W_NONE, 1); -} - -static void -do_warning (cpp_reader *pfile) -{ - /* We want #warning diagnostics to be emitted in system headers too. 
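   For example (illustrative only), a `#warning "this header is deprecated"`
   inside a header reached through -isystem should still be printed even though
   most warnings are suppressed in system headers; CPP_DL_WARNING_SYSHDR
   requests exactly that.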
*/ - do_diagnostic (pfile, CPP_DL_WARNING_SYSHDR, CPP_W_WARNING_DIRECTIVE, 1); -} - -/* Report program identification. */ -static void -do_ident (cpp_reader *pfile) -{ - const cpp_token *str = cpp_get_token (pfile); - - if (str->type != CPP_STRING) - cpp_error (pfile, CPP_DL_ERROR, "invalid #%s directive", - pfile->directive->name); - else if (pfile->cb.ident) - pfile->cb.ident (pfile, pfile->directive_line, &str->val.str); - - check_eol (pfile, false); -} - -/* Lookup a PRAGMA name in a singly-linked CHAIN. Returns the - matching entry, or NULL if none is found. The returned entry could - be the start of a namespace chain, or a pragma. */ -static struct pragma_entry * -lookup_pragma_entry (struct pragma_entry *chain, const cpp_hashnode *pragma) -{ - while (chain && chain->pragma != pragma) - chain = chain->next; - - return chain; -} - -/* Create and insert a blank pragma entry at the beginning of a - singly-linked CHAIN. */ -static struct pragma_entry * -new_pragma_entry (cpp_reader *pfile, struct pragma_entry **chain) -{ - struct pragma_entry *new_entry; - - new_entry = (struct pragma_entry *) - _cpp_aligned_alloc (pfile, sizeof (struct pragma_entry)); - - memset (new_entry, 0, sizeof (struct pragma_entry)); - new_entry->next = *chain; - - *chain = new_entry; - return new_entry; -} - -/* Register a pragma NAME in namespace SPACE. If SPACE is null, it - goes in the global namespace. */ -static struct pragma_entry * -register_pragma_1 (cpp_reader *pfile, const char *space, const char *name, - bool allow_name_expansion) -{ - struct pragma_entry **chain = &pfile->pragmas; - struct pragma_entry *entry; - const cpp_hashnode *node; - - if (space) - { - node = cpp_lookup (pfile, UC space, strlen (space)); - entry = lookup_pragma_entry (*chain, node); - if (!entry) - { - entry = new_pragma_entry (pfile, chain); - entry->pragma = node; - entry->is_nspace = true; - entry->allow_expansion = allow_name_expansion; - } - else if (!entry->is_nspace) - goto clash; - else if (entry->allow_expansion != allow_name_expansion) - { - cpp_error (pfile, CPP_DL_ICE, - "registering pragmas in namespace \"%s\" with mismatched " - "name expansion", space); - return NULL; - } - chain = &entry->u.space; - } - else if (allow_name_expansion) - { - cpp_error (pfile, CPP_DL_ICE, - "registering pragma \"%s\" with name expansion " - "and no namespace", name); - return NULL; - } - - /* Check for duplicates. */ - node = cpp_lookup (pfile, UC name, strlen (name)); - entry = lookup_pragma_entry (*chain, node); - if (entry == NULL) - { - entry = new_pragma_entry (pfile, chain); - entry->pragma = node; - return entry; - } - - if (entry->is_nspace) - clash: - cpp_error (pfile, CPP_DL_ICE, - "registering \"%s\" as both a pragma and a pragma namespace", - NODE_NAME (node)); - else if (space) - cpp_error (pfile, CPP_DL_ICE, "#pragma %s %s is already registered", - space, name); - else - cpp_error (pfile, CPP_DL_ICE, "#pragma %s is already registered", name); - - return NULL; -} - -/* Register a cpplib internal pragma SPACE NAME with HANDLER. */ -static void -register_pragma_internal (cpp_reader *pfile, const char *space, - const char *name, pragma_cb handler) -{ - struct pragma_entry *entry; - - entry = register_pragma_1 (pfile, space, name, false); - entry->is_internal = true; - entry->u.handler = handler; -} - -/* Register a pragma NAME in namespace SPACE. If SPACE is null, it - goes in the global namespace. HANDLER is the handler it will call, - which must be non-NULL. 
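   A front end might, for instance, register a handler for `#pragma GCC
   my_pragma` with cpp_register_pragma (pfile, "GCC", "my_pragma",
   handle_my_pragma, false) -- the names here are illustrative only.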
If ALLOW_EXPANSION is set, allow macro - expansion while parsing pragma NAME. This function is exported - from libcpp. */ -void -cpp_register_pragma (cpp_reader *pfile, const char *space, const char *name, - pragma_cb handler, bool allow_expansion) -{ - struct pragma_entry *entry; - - if (!handler) - { - cpp_error (pfile, CPP_DL_ICE, "registering pragma with NULL handler"); - return; - } - - entry = register_pragma_1 (pfile, space, name, false); - if (entry) - { - entry->allow_expansion = allow_expansion; - entry->u.handler = handler; - } -} - -/* Similarly, but create mark the pragma for deferred processing. - When found, a CPP_PRAGMA token will be insertted into the stream - with IDENT in the token->u.pragma slot. */ -void -cpp_register_deferred_pragma (cpp_reader *pfile, const char *space, - const char *name, unsigned int ident, - bool allow_expansion, bool allow_name_expansion) -{ - struct pragma_entry *entry; - - entry = register_pragma_1 (pfile, space, name, allow_name_expansion); - if (entry) - { - entry->is_deferred = true; - entry->allow_expansion = allow_expansion; - entry->u.ident = ident; - } -} - -/* Register the pragmas the preprocessor itself handles. */ -void -_cpp_init_internal_pragmas (cpp_reader *pfile) -{ - /* Pragmas in the global namespace. */ - register_pragma_internal (pfile, 0, "once", do_pragma_once); - register_pragma_internal (pfile, 0, "push_macro", do_pragma_push_macro); - register_pragma_internal (pfile, 0, "pop_macro", do_pragma_pop_macro); - - /* New GCC-specific pragmas should be put in the GCC namespace. */ - register_pragma_internal (pfile, "GCC", "poison", do_pragma_poison); - register_pragma_internal (pfile, "GCC", "system_header", - do_pragma_system_header); - register_pragma_internal (pfile, "GCC", "dependency", do_pragma_dependency); - register_pragma_internal (pfile, "GCC", "warning", do_pragma_warning); - register_pragma_internal (pfile, "GCC", "error", do_pragma_error); -} - -/* Return the number of registered pragmas in PE. */ - -static int -count_registered_pragmas (struct pragma_entry *pe) -{ - int ct = 0; - for (; pe != NULL; pe = pe->next) - { - if (pe->is_nspace) - ct += count_registered_pragmas (pe->u.space); - ct++; - } - return ct; -} - -/* Save into SD the names of the registered pragmas referenced by PE, - and return a pointer to the next free space in SD. */ - -static char ** -save_registered_pragmas (struct pragma_entry *pe, char **sd) -{ - for (; pe != NULL; pe = pe->next) - { - if (pe->is_nspace) - sd = save_registered_pragmas (pe->u.space, sd); - *sd++ = (char *) xmemdup (HT_STR (&pe->pragma->ident), - HT_LEN (&pe->pragma->ident), - HT_LEN (&pe->pragma->ident) + 1); - } - return sd; -} - -/* Return a newly-allocated array which saves the names of the - registered pragmas. */ - -char ** -_cpp_save_pragma_names (cpp_reader *pfile) -{ - int ct = count_registered_pragmas (pfile->pragmas); - char **result = XNEWVEC (char *, ct); - (void) save_registered_pragmas (pfile->pragmas, result); - return result; -} - -/* Restore from SD the names of the registered pragmas referenced by PE, - and return a pointer to the next unused name in SD. */ - -static char ** -restore_registered_pragmas (cpp_reader *pfile, struct pragma_entry *pe, - char **sd) -{ - for (; pe != NULL; pe = pe->next) - { - if (pe->is_nspace) - sd = restore_registered_pragmas (pfile, pe->u.space, sd); - pe->pragma = cpp_lookup (pfile, UC *sd, strlen (*sd)); - free (*sd); - sd++; - } - return sd; -} - -/* Restore the names of the registered pragmas from SAVED. 
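   This is the counterpart of _cpp_save_pragma_names above: each saved spelling
   is looked up again (recreating the identifier node if necessary) and the
   saved copy is freed.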
*/ - -void -_cpp_restore_pragma_names (cpp_reader *pfile, char **saved) -{ - (void) restore_registered_pragmas (pfile, pfile->pragmas, saved); - free (saved); -} - -/* Pragmata handling. We handle some, and pass the rest on to the - front end. C99 defines three pragmas and says that no macro - expansion is to be performed on them; whether or not macro - expansion happens for other pragmas is implementation defined. - This implementation allows for a mix of both, since GCC did not - traditionally macro expand its (few) pragmas, whereas OpenMP - specifies that macro expansion should happen. */ -static void -do_pragma (cpp_reader *pfile) -{ - const struct pragma_entry *p = NULL; - const cpp_token *token, *pragma_token; - location_t pragma_token_virt_loc = 0; - cpp_token ns_token; - unsigned int count = 1; - - pfile->state.prevent_expansion++; - - pragma_token = token = cpp_get_token_with_location (pfile, - &pragma_token_virt_loc); - ns_token = *token; - if (token->type == CPP_NAME) - { - p = lookup_pragma_entry (pfile->pragmas, token->val.node.node); - if (p && p->is_nspace) - { - bool allow_name_expansion = p->allow_expansion; - if (allow_name_expansion) - pfile->state.prevent_expansion--; - - token = cpp_get_token (pfile); - if (token->type == CPP_NAME) - p = lookup_pragma_entry (p->u.space, token->val.node.node); - else - p = NULL; - if (allow_name_expansion) - pfile->state.prevent_expansion++; - count = 2; - } - } - - if (p) - { - if (p->is_deferred) - { - pfile->directive_result.src_loc = pragma_token_virt_loc; - pfile->directive_result.type = CPP_PRAGMA; - pfile->directive_result.flags = pragma_token->flags; - pfile->directive_result.val.pragma = p->u.ident; - pfile->state.in_deferred_pragma = true; - pfile->state.pragma_allow_expansion = p->allow_expansion; - if (!p->allow_expansion) - pfile->state.prevent_expansion++; - } - else - { - /* Since the handler below doesn't get the line number, that - it might need for diagnostics, make sure it has the right - numbers in place. */ - if (pfile->cb.line_change) - (*pfile->cb.line_change) (pfile, pragma_token, false); - if (p->allow_expansion) - pfile->state.prevent_expansion--; - (*p->u.handler) (pfile); - if (p->allow_expansion) - pfile->state.prevent_expansion++; - } - } - else if (pfile->cb.def_pragma) - { - if (count == 1 || pfile->context->prev == NULL) - _cpp_backup_tokens (pfile, count); - else - { - /* Invalid name comes from macro expansion, _cpp_backup_tokens - won't allow backing 2 tokens. */ - /* ??? The token buffer is leaked. Perhaps if def_pragma hook - reads both tokens, we could perhaps free it, but if it doesn't, - we don't know the exact lifespan. */ - cpp_token *toks = XNEWVEC (cpp_token, 2); - toks[0] = ns_token; - toks[0].flags |= NO_EXPAND; - toks[1] = *token; - toks[1].flags |= NO_EXPAND; - _cpp_push_token_context (pfile, NULL, toks, 2); - } - pfile->cb.def_pragma (pfile, pfile->directive_line); - } - - pfile->state.prevent_expansion--; -} - -/* Handle #pragma once. */ -static void -do_pragma_once (cpp_reader *pfile) -{ - if (_cpp_in_main_source_file (pfile)) - cpp_error (pfile, CPP_DL_WARNING, "#pragma once in main file"); - - check_eol (pfile, false); - _cpp_mark_file_once_only (pfile, pfile->buffer->file); -} - -/* Handle #pragma push_macro(STRING). 
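   For example (illustrative only):
     #define PI 3
     #pragma push_macro("PI")
     #undef PI
     #define PI 3.14159
     #pragma pop_macro("PI")
   after which PI expands to 3 again.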
*/ -static void -do_pragma_push_macro (cpp_reader *pfile) -{ - cpp_hashnode *node; - size_t defnlen; - const uchar *defn = NULL; - char *macroname, *dest; - const char *limit, *src; - const cpp_token *txt; - struct def_pragma_macro *c; - - txt = get__Pragma_string (pfile); - if (!txt) - { - location_t src_loc = pfile->cur_token[-1].src_loc; - cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, - "invalid #pragma push_macro directive"); - check_eol (pfile, false); - skip_rest_of_line (pfile); - return; - } - dest = macroname = (char *) alloca (txt->val.str.len + 2); - src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); - limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); - while (src < limit) - { - /* We know there is a character following the backslash. */ - if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) - src++; - *dest++ = *src++; - } - *dest = 0; - check_eol (pfile, false); - skip_rest_of_line (pfile); - c = XNEW (struct def_pragma_macro); - memset (c, 0, sizeof (struct def_pragma_macro)); - c->name = XNEWVAR (char, strlen (macroname) + 1); - strcpy (c->name, macroname); - c->next = pfile->pushed_macros; - node = _cpp_lex_identifier (pfile, c->name); - if (node->type == NT_VOID) - c->is_undef = 1; - else if (node->type == NT_BUILTIN_MACRO) - c->is_builtin = 1; - else - { - defn = cpp_macro_definition (pfile, node); - defnlen = ustrlen (defn); - c->definition = XNEWVEC (uchar, defnlen + 2); - c->definition[defnlen] = '\n'; - c->definition[defnlen + 1] = 0; - c->line = node->value.macro->line; - c->syshdr = node->value.macro->syshdr; - c->used = node->value.macro->used; - memcpy (c->definition, defn, defnlen); - } - - pfile->pushed_macros = c; -} - -/* Handle #pragma pop_macro(STRING). */ -static void -do_pragma_pop_macro (cpp_reader *pfile) -{ - char *macroname, *dest; - const char *limit, *src; - const cpp_token *txt; - struct def_pragma_macro *l = NULL, *c = pfile->pushed_macros; - txt = get__Pragma_string (pfile); - if (!txt) - { - location_t src_loc = pfile->cur_token[-1].src_loc; - cpp_error_with_line (pfile, CPP_DL_ERROR, src_loc, 0, - "invalid #pragma pop_macro directive"); - check_eol (pfile, false); - skip_rest_of_line (pfile); - return; - } - dest = macroname = (char *) alloca (txt->val.str.len + 2); - src = (const char *) (txt->val.str.text + 1 + (txt->val.str.text[0] == 'L')); - limit = (const char *) (txt->val.str.text + txt->val.str.len - 1); - while (src < limit) - { - /* We know there is a character following the backslash. */ - if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) - src++; - *dest++ = *src++; - } - *dest = 0; - check_eol (pfile, false); - skip_rest_of_line (pfile); - - while (c != NULL) - { - if (!strcmp (c->name, macroname)) - { - if (!l) - pfile->pushed_macros = c->next; - else - l->next = c->next; - cpp_pop_definition (pfile, c); - free (c->definition); - free (c->name); - free (c); - break; - } - l = c; - c = c->next; - } -} - -/* Handle #pragma GCC poison, to poison one or more identifiers so - that the lexer produces a hard error for each subsequent usage. 
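   For example (illustrative only), after
     #pragma GCC poison strcpy sprintf
   any later occurrence of either identifier is a hard error, and poisoning an
   identifier that is currently defined as a macro additionally draws the
   "poisoning existing macro" warning below.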
*/ -static void -do_pragma_poison (cpp_reader *pfile) -{ - const cpp_token *tok; - cpp_hashnode *hp; - - pfile->state.poisoned_ok = 1; - for (;;) - { - tok = _cpp_lex_token (pfile); - if (tok->type == CPP_EOF) - break; - if (tok->type != CPP_NAME) - { - cpp_error (pfile, CPP_DL_ERROR, - "invalid #pragma GCC poison directive"); - break; - } - - hp = tok->val.node.node; - if (hp->flags & NODE_POISONED) - continue; - - if (cpp_macro_p (hp)) - cpp_error (pfile, CPP_DL_WARNING, "poisoning existing macro \"%s\"", - NODE_NAME (hp)); - _cpp_free_definition (hp); - hp->flags |= NODE_POISONED | NODE_DIAGNOSTIC; - } - pfile->state.poisoned_ok = 0; -} - -/* Mark the current header as a system header. This will suppress - some categories of warnings (notably those from -pedantic). It is - intended for use in system libraries that cannot be implemented in - conforming C, but cannot be certain that their headers appear in a - system include directory. To prevent abuse, it is rejected in the - primary source file. */ -static void -do_pragma_system_header (cpp_reader *pfile) -{ - if (_cpp_in_main_source_file (pfile)) - cpp_error (pfile, CPP_DL_WARNING, - "#pragma system_header ignored outside include file"); - else - { - check_eol (pfile, false); - skip_rest_of_line (pfile); - cpp_make_system_header (pfile, 1, 0); - } -} - -/* Check the modified date of the current include file against a specified - file. Issue a diagnostic, if the specified file is newer. We use this to - determine if a fixed header should be refixed. */ -static void -do_pragma_dependency (cpp_reader *pfile) -{ - const char *fname; - int angle_brackets, ordering; - location_t location; - - fname = parse_include (pfile, &angle_brackets, NULL, &location); - if (!fname) - return; - - ordering = _cpp_compare_file_date (pfile, fname, angle_brackets); - if (ordering < 0) - cpp_error (pfile, CPP_DL_WARNING, "cannot find source file %s", fname); - else if (ordering > 0) - { - cpp_error (pfile, CPP_DL_WARNING, - "current file is older than %s", fname); - if (cpp_get_token (pfile)->type != CPP_EOF) - { - _cpp_backup_tokens (pfile, 1); - do_diagnostic (pfile, CPP_DL_WARNING, CPP_W_NONE, 0); - } - } - - free ((void *) fname); -} - -/* Issue a diagnostic with the message taken from the pragma. If - ERROR is true, the diagnostic is a warning, otherwise, it is an - error. */ -static void -do_pragma_warning_or_error (cpp_reader *pfile, bool error) -{ - const cpp_token *tok = _cpp_lex_token (pfile); - cpp_string str; - if (tok->type != CPP_STRING - || !cpp_interpret_string_notranslate (pfile, &tok->val.str, 1, &str, - CPP_STRING) - || str.len == 0) - { - cpp_error (pfile, CPP_DL_ERROR, "invalid \"#pragma GCC %s\" directive", - error ? "error" : "warning"); - return; - } - cpp_error (pfile, error ? CPP_DL_ERROR : CPP_DL_WARNING, - "%s", str.text); - free ((void *)str.text); -} - -/* Issue a warning diagnostic. */ -static void -do_pragma_warning (cpp_reader *pfile) -{ - do_pragma_warning_or_error (pfile, false); -} - -/* Issue an error diagnostic. */ -static void -do_pragma_error (cpp_reader *pfile) -{ - do_pragma_warning_or_error (pfile, true); -} - -/* Get a token but skip padding. */ -static const cpp_token * -get_token_no_padding (cpp_reader *pfile) -{ - for (;;) - { - const cpp_token *result = cpp_get_token (pfile); - if (result->type != CPP_PADDING) - return result; - } -} - -/* Check syntax is "(string-literal)". Returns the string on success, - or NULL on failure. 
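   For example (illustrative only), for `_Pragma ("GCC poison foo")` the token
   returned is the string literal "GCC poison foo"; a missing parenthesis or a
   non-string operand yields NULL and the caller reports the error.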
*/ -static const cpp_token * -get__Pragma_string (cpp_reader *pfile) -{ - const cpp_token *string; - const cpp_token *paren; - - paren = get_token_no_padding (pfile); - if (paren->type == CPP_EOF) - _cpp_backup_tokens (pfile, 1); - if (paren->type != CPP_OPEN_PAREN) - return NULL; - - string = get_token_no_padding (pfile); - if (string->type == CPP_EOF) - _cpp_backup_tokens (pfile, 1); - if (string->type != CPP_STRING && string->type != CPP_WSTRING - && string->type != CPP_STRING32 && string->type != CPP_STRING16 - && string->type != CPP_UTF8STRING) - return NULL; - - paren = get_token_no_padding (pfile); - if (paren->type == CPP_EOF) - _cpp_backup_tokens (pfile, 1); - if (paren->type != CPP_CLOSE_PAREN) - return NULL; - - return string; -} - -/* Destringize IN into a temporary buffer, by removing the first \ of - \" and \\ sequences, and process the result as a #pragma directive. */ -static void -destringize_and_run (cpp_reader *pfile, const cpp_string *in, - location_t expansion_loc) -{ - const unsigned char *src, *limit; - char *dest, *result; - cpp_context *saved_context; - cpp_token *saved_cur_token; - tokenrun *saved_cur_run; - cpp_token *toks; - int count; - const struct directive *save_directive; - - dest = result = (char *) alloca (in->len - 1); - src = in->text + 1 + (in->text[0] == 'L'); - limit = in->text + in->len - 1; - while (src < limit) - { - /* We know there is a character following the backslash. */ - if (*src == '\\' && (src[1] == '\\' || src[1] == '"')) - src++; - *dest++ = *src++; - } - *dest = '\n'; - - /* Ugh; an awful kludge. We are really not set up to be lexing - tokens when in the middle of a macro expansion. Use a new - context to force cpp_get_token to lex, and so skip_rest_of_line - doesn't go beyond the end of the text. Also, remember the - current lexing position so we can return to it later. - - Something like line-at-a-time lexing should remove the need for - this. */ - saved_context = pfile->context; - saved_cur_token = pfile->cur_token; - saved_cur_run = pfile->cur_run; - - pfile->context = XCNEW (cpp_context); - - /* Inline run_directive, since we need to delay the _cpp_pop_buffer - until we've read all of the tokens that we want. */ - cpp_push_buffer (pfile, (const uchar *) result, dest - result, - /* from_stage3 */ true); - /* ??? Antique Disgusting Hack. What does this do? */ - if (pfile->buffer->prev) - pfile->buffer->file = pfile->buffer->prev->file; - - start_directive (pfile); - _cpp_clean_line (pfile); - save_directive = pfile->directive; - pfile->directive = &dtable[T_PRAGMA]; - do_pragma (pfile); - if (pfile->directive_result.type == CPP_PRAGMA) - pfile->directive_result.flags |= PRAGMA_OP; - end_directive (pfile, 1); - pfile->directive = save_directive; - - /* We always insert at least one token, the directive result. It'll - either be a CPP_PADDING or a CPP_PRAGMA. In the later case, we - need to insert *all* of the tokens, including the CPP_PRAGMA_EOL. */ - - /* If we're not handling the pragma internally, read all of the tokens from - the string buffer now, while the string buffer is still installed. */ - /* ??? Note that the token buffer allocated here is leaked. It's not clear - to me what the true lifespan of the tokens are. It would appear that - the lifespan is the entire parse of the main input stream, in which case - this may not be wrong. 
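   For a deferred pragma, e.g. (illustrative only) `_Pragma ("omp parallel")`
   under -fopenmp, the loop below collects the CPP_PRAGMA token, the pragma's
   own tokens and the terminating CPP_PRAGMA_EOL, and pins all of their
   locations to the _Pragma expansion point.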
*/ - if (pfile->directive_result.type == CPP_PRAGMA) - { - int maxcount; - - count = 1; - maxcount = 50; - toks = XNEWVEC (cpp_token, maxcount); - toks[0] = pfile->directive_result; - toks[0].src_loc = expansion_loc; - - do - { - if (count == maxcount) - { - maxcount = maxcount * 3 / 2; - toks = XRESIZEVEC (cpp_token, toks, maxcount); - } - toks[count] = *cpp_get_token (pfile); - /* _Pragma is a builtin, so we're not within a macro-map, and so - the token locations are set to bogus ordinary locations - near to, but after that of the "_Pragma". - Paper over this by setting them equal to the location of the - _Pragma itself (PR preprocessor/69126). */ - toks[count].src_loc = expansion_loc; - /* Macros have been already expanded by cpp_get_token - if the pragma allowed expansion. */ - toks[count++].flags |= NO_EXPAND; - } - while (toks[count-1].type != CPP_PRAGMA_EOL); - } - else - { - count = 1; - toks = &pfile->avoid_paste; - - /* If we handled the entire pragma internally, make sure we get the - line number correct for the next token. */ - if (pfile->cb.line_change) - pfile->cb.line_change (pfile, pfile->cur_token, false); - } - - /* Finish inlining run_directive. */ - pfile->buffer->file = NULL; - _cpp_pop_buffer (pfile); - - /* Reset the old macro state before ... */ - XDELETE (pfile->context); - pfile->context = saved_context; - pfile->cur_token = saved_cur_token; - pfile->cur_run = saved_cur_run; - - /* ... inserting the new tokens we collected. */ - _cpp_push_token_context (pfile, NULL, toks, count); -} - -/* Handle the _Pragma operator. Return 0 on error, 1 if ok. */ -int -_cpp_do__Pragma (cpp_reader *pfile, location_t expansion_loc) -{ - const cpp_token *string = get__Pragma_string (pfile); - pfile->directive_result.type = CPP_PADDING; - - if (string) - { - destringize_and_run (pfile, &string->val.str, expansion_loc); - return 1; - } - cpp_error (pfile, CPP_DL_ERROR, - "_Pragma takes a parenthesized string literal"); - return 0; -} - -/* Handle #ifdef. */ -static void -do_ifdef (cpp_reader *pfile) -{ - int skip = 1; - - if (! pfile->state.skipping) - { - cpp_hashnode *node = lex_macro_node (pfile, false); - - if (node) - { - skip = !_cpp_defined_macro_p (node); - if (!_cpp_maybe_notify_macro_use (pfile, node, pfile->directive_line)) - /* It wasn't a macro after all. */ - skip = true; - _cpp_mark_macro_used (node); - if (pfile->cb.used) - pfile->cb.used (pfile, pfile->directive_line, node); - check_eol (pfile, false); - } - } - - push_conditional (pfile, skip, T_IFDEF, 0); -} - -/* Handle #ifndef. */ -static void -do_ifndef (cpp_reader *pfile) -{ - int skip = 1; - cpp_hashnode *node = 0; - - if (! pfile->state.skipping) - { - node = lex_macro_node (pfile, false); - - if (node) - { - skip = _cpp_defined_macro_p (node); - if (!_cpp_maybe_notify_macro_use (pfile, node, pfile->directive_line)) - /* It wasn't a macro after all. */ - skip = false; - _cpp_mark_macro_used (node); - if (pfile->cb.used) - pfile->cb.used (pfile, pfile->directive_line, node); - check_eol (pfile, false); - } - } - - push_conditional (pfile, skip, T_IFNDEF, node); -} - -/* _cpp_parse_expr puts a macro in a "#if !defined ()" expression in - pfile->mi_ind_cmacro so we can handle multiple-include - optimizations. If macro expansion occurs in the expression, we - cannot treat it as a controlling conditional, since the expansion - could change in the future. That is handled by cpp_get_token. */ -static void -do_if (cpp_reader *pfile) -{ - int skip = 1; - - if (! 
pfile->state.skipping) - skip = _cpp_parse_expr (pfile, true) == false; - - push_conditional (pfile, skip, T_IF, pfile->mi_ind_cmacro); -} - -/* Flip skipping state if appropriate and continue without changing - if_stack; this is so that the error message for missing #endif's - etc. will point to the original #if. */ -static void -do_else (cpp_reader *pfile) -{ - cpp_buffer *buffer = pfile->buffer; - struct if_stack *ifs = buffer->if_stack; - - if (ifs == NULL) - cpp_error (pfile, CPP_DL_ERROR, "#else without #if"); - else - { - if (ifs->type == T_ELSE) - { - cpp_error (pfile, CPP_DL_ERROR, "#else after #else"); - cpp_error_with_line (pfile, CPP_DL_ERROR, ifs->line, 0, - "the conditional began here"); - } - ifs->type = T_ELSE; - - /* Skip any future (erroneous) #elses or #elifs. */ - pfile->state.skipping = ifs->skip_elses; - ifs->skip_elses = true; - - /* Invalidate any controlling macro. */ - ifs->mi_cmacro = 0; - - /* Only check EOL if was not originally skipping. */ - if (!ifs->was_skipping && CPP_OPTION (pfile, warn_endif_labels)) - check_eol_endif_labels (pfile); - } -} - -/* Handle a #elif, #elifdef or #elifndef directive by not changing if_stack - either. See the comment above do_else. */ -static void -do_elif (cpp_reader *pfile) -{ - cpp_buffer *buffer = pfile->buffer; - struct if_stack *ifs = buffer->if_stack; - - if (ifs == NULL) - cpp_error (pfile, CPP_DL_ERROR, "#%s without #if", pfile->directive->name); - else - { - if (ifs->type == T_ELSE) - { - cpp_error (pfile, CPP_DL_ERROR, "#%s after #else", - pfile->directive->name); - cpp_error_with_line (pfile, CPP_DL_ERROR, ifs->line, 0, - "the conditional began here"); - } - ifs->type = T_ELIF; - - /* See DR#412: "Only the first group whose control condition - evaluates to true (nonzero) is processed; any following groups - are skipped and their controlling directives are processed as - if they were in a group that is skipped." */ - if (ifs->skip_elses) - { - /* In older GNU standards, #elifdef/#elifndef is supported - as an extension, but pedwarn if -pedantic if the presence - of the directive would be rejected. */ - if (pfile->directive != &dtable[T_ELIF] - && ! CPP_OPTION (pfile, elifdef) - && CPP_PEDANTIC (pfile) - && !pfile->state.skipping) - { - if (CPP_OPTION (pfile, cplusplus)) - cpp_error (pfile, CPP_DL_PEDWARN, - "#%s before C++23 is a GCC extension", - pfile->directive->name); - else - cpp_error (pfile, CPP_DL_PEDWARN, - "#%s before C2X is a GCC extension", - pfile->directive->name); - } - pfile->state.skipping = 1; - } - else - { - if (pfile->directive == &dtable[T_ELIF]) - pfile->state.skipping = !_cpp_parse_expr (pfile, false); - else - { - cpp_hashnode *node = lex_macro_node (pfile, false); - - if (node) - { - bool macro_defined = _cpp_defined_macro_p (node); - if (!_cpp_maybe_notify_macro_use (pfile, node, - pfile->directive_line)) - /* It wasn't a macro after all. */ - macro_defined = false; - bool skip = (pfile->directive == &dtable[T_ELIFDEF] - ? !macro_defined - : macro_defined); - if (pfile->cb.used) - pfile->cb.used (pfile, pfile->directive_line, node); - check_eol (pfile, false); - /* In older GNU standards, #elifdef/#elifndef is supported - as an extension, but pedwarn if -pedantic if the presence - of the directive would change behavior. */ - if (! 
CPP_OPTION (pfile, elifdef) - && CPP_PEDANTIC (pfile) - && pfile->state.skipping != skip) - { - if (CPP_OPTION (pfile, cplusplus)) - cpp_error (pfile, CPP_DL_PEDWARN, - "#%s before C++23 is a GCC extension", - pfile->directive->name); - else - cpp_error (pfile, CPP_DL_PEDWARN, - "#%s before C2X is a GCC extension", - pfile->directive->name); - } - pfile->state.skipping = skip; - } - } - ifs->skip_elses = !pfile->state.skipping; - } - - /* Invalidate any controlling macro. */ - ifs->mi_cmacro = 0; - } -} - -/* Handle a #elifdef directive. */ -static void -do_elifdef (cpp_reader *pfile) -{ - do_elif (pfile); -} - -/* Handle a #elifndef directive. */ -static void -do_elifndef (cpp_reader *pfile) -{ - do_elif (pfile); -} - -/* #endif pops the if stack and resets pfile->state.skipping. */ -static void -do_endif (cpp_reader *pfile) -{ - cpp_buffer *buffer = pfile->buffer; - struct if_stack *ifs = buffer->if_stack; - - if (ifs == NULL) - cpp_error (pfile, CPP_DL_ERROR, "#endif without #if"); - else - { - /* Only check EOL if was not originally skipping. */ - if (!ifs->was_skipping && CPP_OPTION (pfile, warn_endif_labels)) - check_eol_endif_labels (pfile); - - /* If potential control macro, we go back outside again. */ - if (ifs->next == 0 && ifs->mi_cmacro) - { - pfile->mi_valid = true; - pfile->mi_cmacro = ifs->mi_cmacro; - } - - buffer->if_stack = ifs->next; - pfile->state.skipping = ifs->was_skipping; - obstack_free (&pfile->buffer_ob, ifs); - } -} - -/* Push an if_stack entry for a preprocessor conditional, and set - pfile->state.skipping to SKIP. If TYPE indicates the conditional - is #if or #ifndef, CMACRO is a potentially controlling macro, and - we need to check here that we are at the top of the file. */ -static void -push_conditional (cpp_reader *pfile, int skip, int type, - const cpp_hashnode *cmacro) -{ - struct if_stack *ifs; - cpp_buffer *buffer = pfile->buffer; - - ifs = XOBNEW (&pfile->buffer_ob, struct if_stack); - ifs->line = pfile->directive_line; - ifs->next = buffer->if_stack; - ifs->skip_elses = pfile->state.skipping || !skip; - ifs->was_skipping = pfile->state.skipping; - ifs->type = type; - /* This condition is effectively a test for top-of-file. */ - if (pfile->mi_valid && pfile->mi_cmacro == 0) - ifs->mi_cmacro = cmacro; - else - ifs->mi_cmacro = 0; - - pfile->state.skipping = skip; - buffer->if_stack = ifs; -} - -/* Read the tokens of the answer into the macro pool, in a directive - of type TYPE. Only commit the memory if we intend it as permanent - storage, i.e. the #assert case. Returns 0 on success, and sets - ANSWERP to point to the answer. PRED_LOC is the location of the - predicate. */ -static bool -parse_answer (cpp_reader *pfile, int type, location_t pred_loc, - cpp_macro **answer_ptr) -{ - /* In a conditional, it is legal to not have an open paren. We - should save the following token in this case. */ - const cpp_token *paren = cpp_get_token (pfile); - - /* If not a paren, see if we're OK. */ - if (paren->type != CPP_OPEN_PAREN) - { - /* In a conditional no answer is a test for any answer. It - could be followed by any token. */ - if (type == T_IF) - { - _cpp_backup_tokens (pfile, 1); - return true; - } - - /* #unassert with no answer is valid - it removes all answers. 
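   For example (illustrative only), `#assert machine(vax)` followed by a bare
   `#unassert machine` drops every answer previously asserted for the
   predicate "machine".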
*/ - if (type == T_UNASSERT && paren->type == CPP_EOF) - return true; - - cpp_error_with_line (pfile, CPP_DL_ERROR, pred_loc, 0, - "missing '(' after predicate"); - return false; - } - - cpp_macro *answer = _cpp_new_macro (pfile, cmk_assert, - _cpp_reserve_room (pfile, 0, - sizeof (cpp_macro))); - answer->parm.next = NULL; - unsigned count = 0; - for (;;) - { - const cpp_token *token = cpp_get_token (pfile); - - if (token->type == CPP_CLOSE_PAREN) - break; - - if (token->type == CPP_EOF) - { - cpp_error (pfile, CPP_DL_ERROR, "missing ')' to complete answer"); - return false; - } - - answer = (cpp_macro *)_cpp_reserve_room - (pfile, sizeof (cpp_macro) + count * sizeof (cpp_token), - sizeof (cpp_token)); - answer->exp.tokens[count++] = *token; - } - - if (!count) - { - cpp_error (pfile, CPP_DL_ERROR, "predicate's answer is empty"); - return false; - } - - /* Drop whitespace at start, for answer equivalence purposes. */ - answer->exp.tokens[0].flags &= ~PREV_WHITE; - - answer->count = count; - *answer_ptr = answer; - - return true; -} - -/* Parses an assertion directive of type TYPE, returning a pointer to - the hash node of the predicate, or 0 on error. The node is - guaranteed to be disjoint from the macro namespace, so can only - have type 'NT_VOID'. If an answer was supplied, it is placed in - *ANSWER_PTR, which is otherwise set to 0. */ -static cpp_hashnode * -parse_assertion (cpp_reader *pfile, int type, cpp_macro **answer_ptr) -{ - cpp_hashnode *result = 0; - - /* We don't expand predicates or answers. */ - pfile->state.prevent_expansion++; - - *answer_ptr = NULL; - - const cpp_token *predicate = cpp_get_token (pfile); - if (predicate->type == CPP_EOF) - cpp_error (pfile, CPP_DL_ERROR, "assertion without predicate"); - else if (predicate->type != CPP_NAME) - cpp_error_with_line (pfile, CPP_DL_ERROR, predicate->src_loc, 0, - "predicate must be an identifier"); - else if (parse_answer (pfile, type, predicate->src_loc, answer_ptr)) - { - unsigned int len = NODE_LEN (predicate->val.node.node); - unsigned char *sym = (unsigned char *) alloca (len + 1); - - /* Prefix '#' to get it out of macro namespace. */ - sym[0] = '#'; - memcpy (sym + 1, NODE_NAME (predicate->val.node.node), len); - result = cpp_lookup (pfile, sym, len + 1); - } - - pfile->state.prevent_expansion--; - - return result; -} - -/* Returns a pointer to the pointer to CANDIDATE in the answer chain, - or a pointer to NULL if the answer is not in the chain. */ -static cpp_macro ** -find_answer (cpp_hashnode *node, const cpp_macro *candidate) -{ - unsigned int i; - cpp_macro **result = NULL; - - for (result = &node->value.answers; *result; result = &(*result)->parm.next) - { - cpp_macro *answer = *result; - - if (answer->count == candidate->count) - { - for (i = 0; i < answer->count; i++) - if (!_cpp_equiv_tokens (&answer->exp.tokens[i], - &candidate->exp.tokens[i])) - break; - - if (i == answer->count) - break; - } - } - - return result; -} - -/* Test an assertion within a preprocessor conditional. Returns - nonzero on failure, zero on success. On success, the result of - the test is written into VALUE, otherwise the value 0. */ -int -_cpp_test_assertion (cpp_reader *pfile, unsigned int *value) -{ - cpp_macro *answer; - cpp_hashnode *node = parse_assertion (pfile, T_IF, &answer); - - /* For recovery, an erroneous assertion expression is handled as a - failing assertion. 
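   For example (illustrative only), `#if #machine(vax)` is true only if the
   answer vax was previously asserted for predicate machine; a bare
   `#if #machine` tests whether any answer at all has been asserted.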
*/ - *value = 0; - - if (node) - { - if (node->value.answers) - *value = !answer || *find_answer (node, answer); - } - else if (pfile->cur_token[-1].type == CPP_EOF) - _cpp_backup_tokens (pfile, 1); - - /* We don't commit the memory for the answer - it's temporary only. */ - return node == 0; -} - -/* Handle #assert. */ -static void -do_assert (cpp_reader *pfile) -{ - cpp_macro *answer; - cpp_hashnode *node = parse_assertion (pfile, T_ASSERT, &answer); - - if (node) - { - /* Place the new answer in the answer list. First check there - is not a duplicate. */ - if (*find_answer (node, answer)) - { - cpp_error (pfile, CPP_DL_WARNING, "\"%s\" re-asserted", - NODE_NAME (node) + 1); - return; - } - - /* Commit or allocate storage for the answer. */ - answer = (cpp_macro *)_cpp_commit_buff - (pfile, sizeof (cpp_macro) - sizeof (cpp_token) - + sizeof (cpp_token) * answer->count); - - /* Chain into the list. */ - answer->parm.next = node->value.answers; - node->value.answers = answer; - - check_eol (pfile, false); - } -} - -/* Handle #unassert. */ -static void -do_unassert (cpp_reader *pfile) -{ - cpp_macro *answer; - cpp_hashnode *node = parse_assertion (pfile, T_UNASSERT, &answer); - - /* It isn't an error to #unassert something that isn't asserted. */ - if (node) - { - if (answer) - { - cpp_macro **p = find_answer (node, answer); - - /* Remove the assert from the list. */ - if (cpp_macro *temp = *p) - *p = temp->parm.next; - - check_eol (pfile, false); - } - else - _cpp_free_definition (node); - } - - /* We don't commit the memory for the answer - it's temporary only. */ -} - -/* These are for -D, -U, -A. */ - -/* Process the string STR as if it appeared as the body of a #define. - If STR is just an identifier, define it with value 1. - If STR has anything after the identifier, then it should - be identifier=definition. */ -void -cpp_define (cpp_reader *pfile, const char *str) -{ - char *buf; - const char *p; - size_t count; - - /* Copy the entire option so we can modify it. - Change the first "=" in the string to a space. If there is none, - tack " 1" on the end. */ - - count = strlen (str); - buf = (char *) alloca (count + 3); - memcpy (buf, str, count); - - p = strchr (str, '='); - if (p) - buf[p - str] = ' '; - else - { - buf[count++] = ' '; - buf[count++] = '1'; - } - buf[count] = '\n'; - - run_directive (pfile, T_DEFINE, buf, count); -} - -/* Like cpp_define, but does not warn about unused macro. */ -void -cpp_define_unused (cpp_reader *pfile, const char *str) -{ - unsigned char warn_unused_macros = CPP_OPTION (pfile, warn_unused_macros); - CPP_OPTION (pfile, warn_unused_macros) = 0; - cpp_define (pfile, str); - CPP_OPTION (pfile, warn_unused_macros) = warn_unused_macros; -} - -/* Use to build macros to be run through cpp_define() as - described above. - Example: cpp_define_formatted (pfile, "MACRO=%d", value); */ - -void -cpp_define_formatted (cpp_reader *pfile, const char *fmt, ...) -{ - char *ptr; - - va_list ap; - va_start (ap, fmt); - ptr = xvasprintf (fmt, ap); - va_end (ap); - - cpp_define (pfile, ptr); - free (ptr); -} - -/* Like cpp_define_formatted, but does not warn about unused macro. */ -void -cpp_define_formatted_unused (cpp_reader *pfile, const char *fmt, ...) -{ - char *ptr; - - va_list ap; - va_start (ap, fmt); - ptr = xvasprintf (fmt, ap); - va_end (ap); - - cpp_define_unused (pfile, ptr); - free (ptr); -} - -/* Slight variant of the above for use by initialize_builtins. 
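   Unlike cpp_define above, the string is used verbatim -- no '=' rewriting and
   no implicit " 1" -- so the caller writes the definition exactly as it would
   appear after `#define`, e.g. (illustrative name) "__SOME_MACRO__ 1" rather
   than "__SOME_MACRO__=1".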
*/ -void -_cpp_define_builtin (cpp_reader *pfile, const char *str) -{ - size_t len = strlen (str); - char *buf = (char *) alloca (len + 1); - memcpy (buf, str, len); - buf[len] = '\n'; - run_directive (pfile, T_DEFINE, buf, len); -} - -/* Process MACRO as if it appeared as the body of an #undef. */ -void -cpp_undef (cpp_reader *pfile, const char *macro) -{ - size_t len = strlen (macro); - char *buf = (char *) alloca (len + 1); - memcpy (buf, macro, len); - buf[len] = '\n'; - run_directive (pfile, T_UNDEF, buf, len); -} - -/* Replace a previous definition DEF of the macro STR. If DEF is NULL, - or first element is zero, then the macro should be undefined. */ -static void -cpp_pop_definition (cpp_reader *pfile, struct def_pragma_macro *c) -{ - cpp_hashnode *node = _cpp_lex_identifier (pfile, c->name); - if (node == NULL) - return; - - if (pfile->cb.before_define) - pfile->cb.before_define (pfile); - - if (cpp_macro_p (node)) - { - if (pfile->cb.undef) - pfile->cb.undef (pfile, pfile->directive_line, node); - if (CPP_OPTION (pfile, warn_unused_macros)) - _cpp_warn_if_unused_macro (pfile, node, NULL); - _cpp_free_definition (node); - } - - if (c->is_undef) - return; - if (c->is_builtin) - { - _cpp_restore_special_builtin (pfile, c); - return; - } - - { - size_t namelen; - const uchar *dn; - cpp_hashnode *h = NULL; - cpp_buffer *nbuf; - - namelen = ustrcspn (c->definition, "( \n"); - h = cpp_lookup (pfile, c->definition, namelen); - dn = c->definition + namelen; - - nbuf = cpp_push_buffer (pfile, dn, ustrchr (dn, '\n') - dn, true); - if (nbuf != NULL) - { - _cpp_clean_line (pfile); - nbuf->sysp = 1; - if (!_cpp_create_definition (pfile, h)) - abort (); - _cpp_pop_buffer (pfile); - } - else - abort (); - h->value.macro->line = c->line; - h->value.macro->syshdr = c->syshdr; - h->value.macro->used = c->used; - } -} - -/* Process the string STR as if it appeared as the body of a #assert. */ -void -cpp_assert (cpp_reader *pfile, const char *str) -{ - handle_assertion (pfile, str, T_ASSERT); -} - -/* Process STR as if it appeared as the body of an #unassert. */ -void -cpp_unassert (cpp_reader *pfile, const char *str) -{ - handle_assertion (pfile, str, T_UNASSERT); -} - -/* Common code for cpp_assert (-A) and cpp_unassert (-A-). */ -static void -handle_assertion (cpp_reader *pfile, const char *str, int type) -{ - size_t count = strlen (str); - const char *p = strchr (str, '='); - - /* Copy the entire option so we can modify it. Change the first - "=" in the string to a '(', and tack a ')' on the end. */ - char *buf = (char *) alloca (count + 2); - - memcpy (buf, str, count); - if (p) - { - buf[p - str] = '('; - buf[count++] = ')'; - } - buf[count] = '\n'; - str = buf; - - run_directive (pfile, type, str, count); -} - -/* The options structure. */ -cpp_options * -cpp_get_options (cpp_reader *pfile) -{ - return &pfile->opts; -} - -/* The callbacks structure. */ -cpp_callbacks * -cpp_get_callbacks (cpp_reader *pfile) -{ - return &pfile->cb; -} - -/* Copy the given callbacks structure to our own. */ -void -cpp_set_callbacks (cpp_reader *pfile, cpp_callbacks *cb) -{ - pfile->cb = *cb; -} - -/* The narrow character set identifier. */ -const char * -cpp_get_narrow_charset_name (cpp_reader *pfile) -{ - return pfile->narrow_cset_desc.to; -} - -/* The wide character set identifier. */ -const char * -cpp_get_wide_charset_name (cpp_reader *pfile) -{ - return pfile->wide_cset_desc.to; -} - -/* The dependencies structure. (Creates one if it hasn't already been.) 
*/ -class mkdeps * -cpp_get_deps (cpp_reader *pfile) -{ - if (!pfile->deps && CPP_OPTION (pfile, deps.style) != DEPS_NONE) - pfile->deps = deps_init (); - return pfile->deps; -} - -/* Push a new buffer on the buffer stack. Returns the new buffer; it - doesn't fail. It does not generate a file change call back; that - is the responsibility of the caller. */ -cpp_buffer * -cpp_push_buffer (cpp_reader *pfile, const uchar *buffer, size_t len, - int from_stage3) -{ - cpp_buffer *new_buffer = XOBNEW (&pfile->buffer_ob, cpp_buffer); - - /* Clears, amongst other things, if_stack and mi_cmacro. */ - memset (new_buffer, 0, sizeof (cpp_buffer)); - - new_buffer->next_line = new_buffer->buf = buffer; - new_buffer->rlimit = buffer + len; - new_buffer->from_stage3 = from_stage3; - new_buffer->prev = pfile->buffer; - new_buffer->need_line = true; - - pfile->buffer = new_buffer; - - return new_buffer; -} - -/* Pops a single buffer, with a file change call-back if appropriate. - Then pushes the next -include file, if any remain. */ -void -_cpp_pop_buffer (cpp_reader *pfile) -{ - cpp_buffer *buffer = pfile->buffer; - struct _cpp_file *inc = buffer->file; - struct if_stack *ifs; - const unsigned char *to_free; - - /* Walk back up the conditional stack till we reach its level at - entry to this file, issuing error messages. */ - for (ifs = buffer->if_stack; ifs; ifs = ifs->next) - cpp_error_with_line (pfile, CPP_DL_ERROR, ifs->line, 0, - "unterminated #%s", dtable[ifs->type].name); - - /* In case of a missing #endif. */ - pfile->state.skipping = 0; - - /* _cpp_do_file_change expects pfile->buffer to be the new one. */ - pfile->buffer = buffer->prev; - - to_free = buffer->to_free; - free (buffer->notes); - - /* Free the buffer object now; we may want to push a new buffer - in _cpp_push_next_include_file. */ - obstack_free (&pfile->buffer_ob, buffer); - - if (inc) - { - _cpp_pop_file_buffer (pfile, inc, to_free); - - _cpp_do_file_change (pfile, LC_LEAVE, 0, 0, 0); - } - else if (to_free) - free ((void *)to_free); -} - -/* Enter all recognized directives in the hash table. */ -void -_cpp_init_directives (cpp_reader *pfile) -{ - for (int i = 0; i < N_DIRECTIVES; i++) - { - cpp_hashnode *node = cpp_lookup (pfile, dtable[i].name, dtable[i].length); - node->is_directive = 1; - node->directive_index = i; - } -} - -/* Extract header file from a bracket include. Parsing starts after '<'. - The string is malloced and must be freed by the caller. */ -char * -_cpp_bracket_include(cpp_reader *pfile) -{ - return glue_header_name (pfile); -} - - -//-------------------------------------------------------------------------------- -// RT extensions -//-------------------------------------------------------------------------------- - -/*-------------------------------------------------------------------------------- - directive `#assign` - - cmd ::= "#assign" name body ; - - name ::= clause ; - body ::= clause ; - - clause ::= "(" literal? ")" | "[" expr? "]" ; - - literal ::= ; sequence parsed into tokens - expr ::= ; sequence parsed into tokens with recursive expansion of each token - - ; white space, including new lines, is ignored. 
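   Illustrative examples (names are placeholders, not from this patch):
     #assign (N) (x + 1)        both clauses taken literally
     #assign (N) [M]            body is the expansion of M
   per the clause grammar above: a parenthesized clause is kept literal, a
   bracketed clause is parsed with expansion.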
- - This differs from `#define`: - -the name clause must reduce to a valid #define name - -the definition is created only after the body clause has been parsed - -*/ - -extern bool _cpp_create_assign(cpp_reader *pfile); - - -static void do_assign(cpp_reader *pfile){ - - _cpp_create_assign(pfile); - -} - - -/*-------------------------------------------------------------------------------- - directive `#macro` - - directive ::= "#macro" name params body ; - - name ::= identifier ; - - params ::= "(" param_list? ")" ; - param_list ::= identifier ("," identifier)* ; - - body ::= paren_clause ; - - paren_clause ::= "(" literal? ")" ; - - literal ::= ; sequence parsed into tokens without expansion - - - ; whitespace, including newlines, is ignored - - -*/ -extern bool _cpp_create_rt_macro (cpp_reader *pfile, cpp_hashnode *node); - -static void -do_rt_macro (cpp_reader *pfile) -{ - cpp_hashnode *node = lex_macro_node(pfile, true); - - if(node) - { - /* If we have been requested to expand comments into macros, - then re-enable saving of comments. */ - pfile->state.save_comments = - ! CPP_OPTION (pfile, discard_comments_in_macro_exp); - - if(pfile->cb.before_define) - pfile->cb.before_define (pfile); - - if( _cpp_create_rt_macro(pfile, node) ) - if (pfile->cb.define) - pfile->cb.define (pfile, pfile->directive_line, node); - - node->flags &= ~NODE_USED; - } -} - diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/include/#cpplib.h#" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/include/#cpplib.h#" deleted file mode 100644 index aea752f..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/include/#cpplib.h#" +++ /dev/null @@ -1,1585 +0,0 @@ -/* Definitions for CPP library. - Copyright (C) 1995-2022 Free Software Foundation, Inc. - Written by Per Bothner, 1994-95. - -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 3, or (at your option) any -later version. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -You should have received a copy of the GNU General Public License -along with this program; see the file COPYING3. If not see -<http://www.gnu.org/licenses/>. - - In other words, you are welcome to use, share and improve this program. - You are forbidden to forbid anyone else to use, share and improve - what you give them. Help stamp out software-hoarding! */ -#ifndef LIBCPP_CPPLIB_H -#define LIBCPP_CPPLIB_H - -#include <sys/types.h> -#include "symtab.h" -#include "line-map.h" - -typedef struct cpp_reader cpp_reader; -typedef struct cpp_buffer cpp_buffer; -typedef struct cpp_options cpp_options; -typedef struct cpp_token cpp_token; -typedef struct cpp_string cpp_string; -typedef struct cpp_hashnode cpp_hashnode; -typedef struct cpp_macro cpp_macro; -typedef struct cpp_callbacks cpp_callbacks; -typedef struct cpp_dir cpp_dir; - -struct _cpp_file; - -/* The first three groups, apart from '=', can appear in preprocessor - expressions (+= and -= are used to indicate unary + and - resp.). - This allows a lookup table to be implemented in _cpp_parse_expr. - - The first group, to CPP_LAST_EQ, can be immediately followed by an - '='. The lexer needs operators ending in '=', like ">>=", to be in - the same order as their counterparts without the '=', like ">>".
- - See the cpp_operator table optab in expr.cc if you change the order or - add or remove anything in the first group. */ - -#define TTYPE_TABLE \ - OP(EQ, "=") \ - OP(NOT, "!") \ - OP(GREATER, ">") /* compare */ \ - OP(LESS, "<") \ - OP(PLUS, "+") /* math */ \ - OP(MINUS, "-") \ - OP(MULT, "*") \ - OP(DIV, "/") \ - OP(MOD, "%") \ - OP(AND, "&") /* bit ops */ \ - OP(OR, "|") \ - OP(XOR, "^") \ - OP(RSHIFT, ">>") \ - OP(LSHIFT, "<<") \ - \ - OP(COMPL, "~") \ - OP(AND_AND, "&&") /* logical */ \ - OP(OR_OR, "||") \ - OP(QUERY, "?") \ - OP(COLON, ":") \ - OP(COMMA, ",") /* grouping */ \ - OP(OPEN_PAREN, "(") \ - OP(CLOSE_PAREN, ")") \ - TK(EOF, NONE) \ - OP(EQ_EQ, "==") /* compare */ \ - OP(NOT_EQ, "!=") \ - OP(GREATER_EQ, ">=") \ - OP(LESS_EQ, "<=") \ - OP(SPACESHIP, "<=>") \ - \ - /* These two are unary + / - in preprocessor expressions. */ \ - OP(PLUS_EQ, "+=") /* math */ \ - OP(MINUS_EQ, "-=") \ - \ - OP(MULT_EQ, "*=") \ - OP(DIV_EQ, "/=") \ - OP(MOD_EQ, "%=") \ - OP(AND_EQ, "&=") /* bit ops */ \ - OP(OR_EQ, "|=") \ - OP(XOR_EQ, "^=") \ - OP(RSHIFT_EQ, ">>=") \ - OP(LSHIFT_EQ, "<<=") \ - /* Digraphs together, beginning with CPP_FIRST_DIGRAPH. */ \ - OP(HASH, "#") /* digraphs */ \ - OP(PASTE, "##") \ - OP(OPEN_SQUARE, "[") \ - OP(CLOSE_SQUARE, "]") \ - OP(OPEN_BRACE, "{") \ - OP(CLOSE_BRACE, "}") \ - /* The remainder of the punctuation. Order is not significant. */ \ - OP(SEMICOLON, ";") /* structure */ \ - OP(ELLIPSIS, "...") \ - OP(PLUS_PLUS, "++") /* increment */ \ - OP(MINUS_MINUS, "--") \ - OP(DEREF, "->") /* accessors */ \ - OP(DOT, ".") \ - OP(SCOPE, "::") \ - OP(DEREF_STAR, "->*") \ - OP(DOT_STAR, ".*") \ - OP(ATSIGN, "@") /* used in Objective-C */ \ - \ - TK(NAME, IDENT) /* word */ \ - TK(AT_NAME, IDENT) /* @word - Objective-C */ \ - TK(NUMBER, LITERAL) /* 34_be+ta */ \ - \ - TK(CHAR, LITERAL) /* 'char' */ \ - TK(WCHAR, LITERAL) /* L'char' */ \ - TK(CHAR16, LITERAL) /* u'char' */ \ - TK(CHAR32, LITERAL) /* U'char' */ \ - TK(UTF8CHAR, LITERAL) /* u8'char' */ \ - TK(OTHER, LITERAL) /* stray punctuation */ \ - \ - TK(STRING, LITERAL) /* "string" */ \ - TK(WSTRING, LITERAL) /* L"string" */ \ - TK(STRING16, LITERAL) /* u"string" */ \ - TK(STRING32, LITERAL) /* U"string" */ \ - TK(UTF8STRING, LITERAL) /* u8"string" */ \ - TK(OBJC_STRING, LITERAL) /* @"string" - Objective-C */ \ - TK(HEADER_NAME, LITERAL) /* in #include */ \ - \ - TK(CHAR_USERDEF, LITERAL) /* 'char'_suffix - C++-0x */ \ - TK(WCHAR_USERDEF, LITERAL) /* L'char'_suffix - C++-0x */ \ - TK(CHAR16_USERDEF, LITERAL) /* u'char'_suffix - C++-0x */ \ - TK(CHAR32_USERDEF, LITERAL) /* U'char'_suffix - C++-0x */ \ - TK(UTF8CHAR_USERDEF, LITERAL) /* u8'char'_suffix - C++-0x */ \ - TK(STRING_USERDEF, LITERAL) /* "string"_suffix - C++-0x */ \ - TK(WSTRING_USERDEF, LITERAL) /* L"string"_suffix - C++-0x */ \ - TK(STRING16_USERDEF, LITERAL) /* u"string"_suffix - C++-0x */ \ - TK(STRING32_USERDEF, LITERAL) /* U"string"_suffix - C++-0x */ \ - TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++-0x */ \ - \ - TK(COMMENT, LITERAL) /* Only if output comments. */ \ - /* SPELL_LITERAL happens to DTRT. */ \ - TK(MACRO_ARG, NONE) /* Macro argument. */ \ - TK(PRAGMA, NONE) /* Only for deferred pragmas. */ \ - TK(PRAGMA_EOL, NONE) /* End-of-line for deferred pragmas. */ \ - TK(PADDING, NONE) /* Whitespace for -E. */ - -#define OP(e, s) CPP_ ## e, -#define TK(e, s) CPP_ ## e, -enum cpp_ttype -{ - TTYPE_TABLE - N_TTYPES, - - /* A token type for keywords, as opposed to ordinary identifiers. */ - CPP_KEYWORD, - - /* Positions in the table. 
*/ - CPP_LAST_EQ = CPP_LSHIFT, - CPP_FIRST_DIGRAPH = CPP_HASH, - CPP_LAST_PUNCTUATOR= CPP_ATSIGN, - CPP_LAST_CPP_OP = CPP_LESS_EQ -}; -#undef OP -#undef TK - -/* C language kind, used when calling cpp_create_reader. */ -enum c_lang {CLK_GNUC89 = 0, CLK_GNUC99, CLK_GNUC11, CLK_GNUC17, CLK_GNUC2X, - CLK_STDC89, CLK_STDC94, CLK_STDC99, CLK_STDC11, CLK_STDC17, - CLK_STDC2X, - CLK_GNUCXX, CLK_CXX98, CLK_GNUCXX11, CLK_CXX11, - CLK_GNUCXX14, CLK_CXX14, CLK_GNUCXX17, CLK_CXX17, - CLK_GNUCXX20, CLK_CXX20, CLK_GNUCXX23, CLK_CXX23, - CLK_ASM}; - -/* Payload of a NUMBER, STRING, CHAR or COMMENT token. */ -struct GTY(()) cpp_string { - unsigned int len; - const unsigned char *text; -}; - -/* Flags for the cpp_token structure. */ -#define PREV_WHITE (1 << 0) /* If whitespace before this token. */ -#define DIGRAPH (1 << 1) /* If it was a digraph. */ -#define STRINGIFY_ARG (1 << 2) /* If macro argument to be stringified. */ -#define PASTE_LEFT (1 << 3) /* If on LHS of a ## operator. */ -#define NAMED_OP (1 << 4) /* C++ named operators. */ -#define PREV_FALLTHROUGH (1 << 5) /* On a token preceeded by FALLTHROUGH - comment. */ -#define BOL (1 << 6) /* Token at beginning of line. */ -#define PURE_ZERO (1 << 7) /* Single 0 digit, used by the C++ frontend, - set in c-lex.cc. */ -#define COLON_SCOPE PURE_ZERO /* Adjacent colons in C < 23. */ -#define SP_DIGRAPH (1 << 8) /* # or ## token was a digraph. */ -#define SP_PREV_WHITE (1 << 9) /* If whitespace before a ## - operator, or before this token - after a # operator. */ -#define NO_EXPAND (1 << 10) /* Do not macro-expand this token. */ -#define PRAGMA_OP (1 << 11) /* _Pragma token. */ - -/* Specify which field, if any, of the cpp_token union is used. */ - -enum cpp_token_fld_kind { - CPP_TOKEN_FLD_NODE, - CPP_TOKEN_FLD_SOURCE, - CPP_TOKEN_FLD_STR, - CPP_TOKEN_FLD_ARG_NO, - CPP_TOKEN_FLD_TOKEN_NO, - CPP_TOKEN_FLD_PRAGMA, - CPP_TOKEN_FLD_NONE -}; - -/* A macro argument in the cpp_token union. */ -struct GTY(()) cpp_macro_arg { - /* Argument number. */ - unsigned int arg_no; - /* The original spelling of the macro argument token. */ - cpp_hashnode * - GTY ((nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) - spelling; -}; - -/* An identifier in the cpp_token union. */ -struct GTY(()) cpp_identifier { - /* The canonical (UTF-8) spelling of the identifier. */ - cpp_hashnode * - GTY ((nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) - node; - /* The original spelling of the identifier. */ - cpp_hashnode * - GTY ((nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) - spelling; -}; - -/* A preprocessing token. This has been carefully packed and should - occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts. */ -struct GTY(()) cpp_token { - - /* Location of first char of token, together with range of full token. */ - location_t src_loc; - - ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT; /* token type */ - unsigned short flags; /* flags - see above */ - - union cpp_token_u - { - /* An identifier. */ - struct cpp_identifier GTY ((tag ("CPP_TOKEN_FLD_NODE"))) node; - - /* Inherit padding from this token. */ - cpp_token * GTY ((tag ("CPP_TOKEN_FLD_SOURCE"))) source; - - /* A string, or number. */ - struct cpp_string GTY ((tag ("CPP_TOKEN_FLD_STR"))) str; - - /* Argument no. 
(and original spelling) for a CPP_MACRO_ARG. */ - struct cpp_macro_arg GTY ((tag ("CPP_TOKEN_FLD_ARG_NO"))) macro_arg; - - /* Original token no. for a CPP_PASTE (from a sequence of - consecutive paste tokens in a macro expansion). */ - unsigned int GTY ((tag ("CPP_TOKEN_FLD_TOKEN_NO"))) token_no; - - /* Caller-supplied identifier for a CPP_PRAGMA. */ - unsigned int GTY ((tag ("CPP_TOKEN_FLD_PRAGMA"))) pragma; - } GTY ((desc ("cpp_token_val_index (&%1)"))) val; -}; - -/* Say which field is in use. */ -extern enum cpp_token_fld_kind cpp_token_val_index (const cpp_token *tok); - -/* A type wide enough to hold any multibyte source character. - cpplib's character constant interpreter requires an unsigned type. - Also, a typedef for the signed equivalent. - The width of this type is capped at 32 bits; there do exist targets - where wchar_t is 64 bits, but only in a non-default mode, and there - would be no meaningful interpretation for a wchar_t value greater - than 2^32 anyway -- the widest wide-character encoding around is - ISO 10646, which stops at 2^31. */ -#if CHAR_BIT * SIZEOF_INT >= 32 -# define CPPCHAR_SIGNED_T int -#elif CHAR_BIT * SIZEOF_LONG >= 32 -# define CPPCHAR_SIGNED_T long -#else -# error "Cannot find a least-32-bit signed integer type" -#endif -typedef unsigned CPPCHAR_SIGNED_T cppchar_t; -typedef CPPCHAR_SIGNED_T cppchar_signed_t; - -/* Style of header dependencies to generate. */ -enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM }; - -/* The possible normalization levels, from most restrictive to least. */ -enum cpp_normalize_level { - /* In NFKC. */ - normalized_KC = 0, - /* In NFC. */ - normalized_C, - /* In NFC, except for subsequences where being in NFC would make - the identifier invalid. */ - normalized_identifier_C, - /* Not normalized at all. */ - normalized_none -}; - -enum cpp_main_search -{ - CMS_none, /* A regular source file. */ - CMS_header, /* Is a directly-specified header file (eg PCH or - header-unit). */ - CMS_user, /* Search the user INCLUDE path. */ - CMS_system, /* Search the system INCLUDE path. */ -}; - -/* The possible bidirectional control characters checking levels. */ -enum cpp_bidirectional_level { - /* No checking. */ - bidirectional_none = 0, - /* Only detect unpaired uses of bidirectional control characters. */ - bidirectional_unpaired = 1, - /* Detect any use of bidirectional control characters. */ - bidirectional_any = 2, - /* Also warn about UCNs. */ - bidirectional_ucn = 4 -}; - -/* This structure is nested inside struct cpp_reader, and - carries all the options visible to the command line. */ -struct cpp_options -{ - /* The language we're preprocessing. */ - enum c_lang lang; - - /* Nonzero means use extra default include directories for C++. */ - unsigned char cplusplus; - - /* Nonzero means handle cplusplus style comments. */ - unsigned char cplusplus_comments; - - /* Nonzero means define __OBJC__, treat @ as a special token, use - the OBJC[PLUS]_INCLUDE_PATH environment variable, and allow - "#import". */ - unsigned char objc; - - /* Nonzero means don't copy comments into the output file. */ - unsigned char discard_comments; - - /* Nonzero means don't copy comments into the output file during - macro expansion. */ - unsigned char discard_comments_in_macro_exp; - - /* Nonzero means process the ISO trigraph sequences. */ - unsigned char trigraphs; - - /* Nonzero means process the ISO digraph sequences. */ - unsigned char digraphs; - - /* Nonzero means to allow hexadecimal floats and LL suffixes. 
*/ - unsigned char extended_numbers; - - /* Nonzero means process u/U prefix literals (UTF-16/32). */ - unsigned char uliterals; - - /* Nonzero means process u8 prefixed character literals (UTF-8). */ - unsigned char utf8_char_literals; - - /* Nonzero means process r/R raw strings. If this is set, uliterals - must be set as well. */ - unsigned char rliterals; - - /* Nonzero means print names of header files (-H). */ - unsigned char print_include_names; - - /* Nonzero means complain about deprecated features. */ - unsigned char cpp_warn_deprecated; - - /* Nonzero means warn if slash-star appears in a comment. */ - unsigned char warn_comments; - - /* Nonzero means to warn about __DATA__, __TIME__ and __TIMESTAMP__ usage. */ - unsigned char warn_date_time; - - /* Nonzero means warn if a user-supplied include directory does not - exist. */ - unsigned char warn_missing_include_dirs; - - /* Nonzero means warn if there are any trigraphs. */ - unsigned char warn_trigraphs; - - /* Nonzero means warn about multicharacter charconsts. */ - unsigned char warn_multichar; - - /* Nonzero means warn about various incompatibilities with - traditional C. */ - unsigned char cpp_warn_traditional; - - /* Nonzero means warn about long long numeric constants. */ - unsigned char cpp_warn_long_long; - - /* Nonzero means warn about text after an #endif (or #else). */ - unsigned char warn_endif_labels; - - /* Nonzero means warn about implicit sign changes owing to integer - promotions. */ - unsigned char warn_num_sign_change; - - /* Zero means don't warn about __VA_ARGS__ usage in c89 pedantic mode. - Presumably the usage is protected by the appropriate #ifdef. */ - unsigned char warn_variadic_macros; - - /* Nonzero means warn about builtin macros that are redefined or - explicitly undefined. */ - unsigned char warn_builtin_macro_redefined; - - /* Different -Wimplicit-fallthrough= levels. */ - unsigned char cpp_warn_implicit_fallthrough; - - /* Nonzero means we should look for header.gcc files that remap file - names. */ - unsigned char remap; - - /* Zero means dollar signs are punctuation. */ - unsigned char dollars_in_ident; - - /* Nonzero means UCNs are accepted in identifiers. */ - unsigned char extended_identifiers; - - /* True if we should warn about dollars in identifiers or numbers - for this translation unit. */ - unsigned char warn_dollars; - - /* Nonzero means warn if undefined identifiers are evaluated in an #if. */ - unsigned char warn_undef; - - /* Nonzero means warn if "defined" is encountered in a place other than - an #if. */ - unsigned char warn_expansion_to_defined; - - /* Nonzero means warn of unused macros from the main file. */ - unsigned char warn_unused_macros; - - /* Nonzero for the 1999 C Standard, including corrigenda and amendments. */ - unsigned char c99; - - /* Nonzero if we are conforming to a specific C or C++ standard. */ - unsigned char std; - - /* Nonzero means give all the error messages the ANSI standard requires. */ - unsigned char cpp_pedantic; - - /* Nonzero means we're looking at already preprocessed code, so don't - bother trying to do macro expansion and whatnot. */ - unsigned char preprocessed; - - /* Nonzero means we are going to emit debugging logs during - preprocessing. */ - unsigned char debug; - - /* Nonzero means we are tracking locations of tokens involved in - macro expansion. 1 Means we track the location in degraded mode - where we do not track locations of tokens resulting from the - expansion of arguments of function-like macro. 
2 Means we do - track all macro expansions. This last option is the one that - consumes the highest amount of memory. */ - unsigned char track_macro_expansion; - - /* Nonzero means handle C++ alternate operator names. */ - unsigned char operator_names; - - /* Nonzero means warn about use of C++ alternate operator names. */ - unsigned char warn_cxx_operator_names; - - /* True for traditional preprocessing. */ - unsigned char traditional; - - /* Nonzero for C++ 2011 Standard user-defined literals. */ - unsigned char user_literals; - - /* Nonzero means warn when a string or character literal is followed by a - ud-suffix which does not beging with an underscore. */ - unsigned char warn_literal_suffix; - - /* Nonzero means interpret imaginary, fixed-point, or other gnu extension - literal number suffixes as user-defined literal number suffixes. */ - unsigned char ext_numeric_literals; - - /* Nonzero means extended identifiers allow the characters specified - in C11. */ - unsigned char c11_identifiers; - - /* Nonzero for C++ 2014 Standard binary constants. */ - unsigned char binary_constants; - - /* Nonzero for C++ 2014 Standard digit separators. */ - unsigned char digit_separators; - - /* Nonzero for C2X decimal floating-point constants. */ - unsigned char dfp_constants; - - /* Nonzero for C++20 __VA_OPT__ feature. */ - unsigned char va_opt; - - /* Nonzero for the '::' token. */ - unsigned char scope; - - /* Nonzero for the '#elifdef' and '#elifndef' directives. */ - unsigned char elifdef; - - /* Nonzero means tokenize C++20 module directives. */ - unsigned char module_directives; - - /* Nonzero for C++23 size_t literals. */ - unsigned char size_t_literals; - - /* Holds the name of the target (execution) character set. */ - const char *narrow_charset; - - /* Holds the name of the target wide character set. */ - const char *wide_charset; - - /* Holds the name of the input character set. */ - const char *input_charset; - - /* The minimum permitted level of normalization before a warning - is generated. See enum cpp_normalize_level. */ - int warn_normalize; - - /* True to warn about precompiled header files we couldn't use. */ - bool warn_invalid_pch; - - /* True if dependencies should be restored from a precompiled header. */ - bool restore_pch_deps; - - /* True if warn about differences between C90 and C99. */ - signed char cpp_warn_c90_c99_compat; - - /* True if warn about differences between C11 and C2X. */ - signed char cpp_warn_c11_c2x_compat; - - /* True if warn about differences between C++98 and C++11. */ - bool cpp_warn_cxx11_compat; - - /* Nonzero if bidirectional control characters checking is on. See enum - cpp_bidirectional_level. */ - unsigned char cpp_warn_bidirectional; - - /* Dependency generation. */ - struct - { - /* Style of header dependencies to generate. */ - enum cpp_deps_style style; - - /* Assume missing files are generated files. */ - bool missing_files; - - /* Generate phony targets for each dependency apart from the first - one. */ - bool phony_targets; - - /* Generate dependency info for modules. */ - bool modules; - - /* If true, no dependency is generated on the main file. */ - bool ignore_main_file; - - /* If true, intend to use the preprocessor output (e.g., for compilation) - in addition to the dependency info. */ - bool need_preprocessor_output; - } deps; - - /* Target-specific features set by the front end or client. */ - - /* Precision for target CPP arithmetic, target characters, target - ints and target wide characters, respectively. 
*/ - size_t precision, char_precision, int_precision, wchar_precision; - - /* True means chars (wide chars) are unsigned. */ - bool unsigned_char, unsigned_wchar; - - /* True if the most significant byte in a word has the lowest - address in memory. */ - bool bytes_big_endian; - - /* Nonzero means __STDC__ should have the value 0 in system headers. */ - unsigned char stdc_0_in_system_headers; - - /* True disables tokenization outside of preprocessing directives. */ - bool directives_only; - - /* True enables canonicalization of system header file paths. */ - bool canonical_system_headers; - - /* The maximum depth of the nested #include. */ - unsigned int max_include_depth; - - cpp_main_search main_search : 8; -}; - -/* Diagnostic levels. To get a diagnostic without associating a - position in the translation unit with it, use cpp_error_with_line - with a line number of zero. */ - -enum cpp_diagnostic_level { - /* Warning, an error with -Werror. */ - CPP_DL_WARNING = 0, - /* Same as CPP_DL_WARNING, except it is not suppressed in system headers. */ - CPP_DL_WARNING_SYSHDR, - /* Warning, an error with -pedantic-errors or -Werror. */ - CPP_DL_PEDWARN, - /* An error. */ - CPP_DL_ERROR, - /* An internal consistency check failed. Prints "internal error: ", - otherwise the same as CPP_DL_ERROR. */ - CPP_DL_ICE, - /* An informative note following a warning. */ - CPP_DL_NOTE, - /* A fatal error. */ - CPP_DL_FATAL -}; - -/* Warning reason codes. Use a reason code of CPP_W_NONE for unclassified - warnings and diagnostics that are not warnings. */ - -enum cpp_warning_reason { - CPP_W_NONE = 0, - CPP_W_DEPRECATED, - CPP_W_COMMENTS, - CPP_W_MISSING_INCLUDE_DIRS, - CPP_W_TRIGRAPHS, - CPP_W_MULTICHAR, - CPP_W_TRADITIONAL, - CPP_W_LONG_LONG, - CPP_W_ENDIF_LABELS, - CPP_W_NUM_SIGN_CHANGE, - CPP_W_VARIADIC_MACROS, - CPP_W_BUILTIN_MACRO_REDEFINED, - CPP_W_DOLLARS, - CPP_W_UNDEF, - CPP_W_UNUSED_MACROS, - CPP_W_CXX_OPERATOR_NAMES, - CPP_W_NORMALIZE, - CPP_W_INVALID_PCH, - CPP_W_WARNING_DIRECTIVE, - CPP_W_LITERAL_SUFFIX, - CPP_W_SIZE_T_LITERALS, - CPP_W_DATE_TIME, - CPP_W_PEDANTIC, - CPP_W_C90_C99_COMPAT, - CPP_W_C11_C2X_COMPAT, - CPP_W_CXX11_COMPAT, - CPP_W_EXPANSION_TO_DEFINED, - CPP_W_BIDIRECTIONAL -}; - -/* Callback for header lookup for HEADER, which is the name of a - source file. It is used as a method of last resort to find headers - that are not otherwise found during the normal include processing. - The return value is the malloced name of a header to try and open, - if any, or NULL otherwise. This callback is called only if the - header is otherwise unfound. */ -typedef const char *(*missing_header_cb)(cpp_reader *, const char *header, cpp_dir **); - -/* Call backs to cpplib client. */ -struct cpp_callbacks -{ - /* Called when a new line of preprocessed output is started. */ - void (*line_change) (cpp_reader *, const cpp_token *, int); - - /* Called when switching to/from a new file. - The line_map is for the new file. It is NULL if there is no new file. - (In C this happens when done with + and also - when done with a main file.) This can be used for resource cleanup. 
*/ - void (*file_change) (cpp_reader *, const line_map_ordinary *); - - void (*dir_change) (cpp_reader *, const char *); - void (*include) (cpp_reader *, location_t, const unsigned char *, - const char *, int, const cpp_token **); - void (*define) (cpp_reader *, location_t, cpp_hashnode *); - void (*undef) (cpp_reader *, location_t, cpp_hashnode *); - void (*ident) (cpp_reader *, location_t, const cpp_string *); - void (*def_pragma) (cpp_reader *, location_t); - int (*valid_pch) (cpp_reader *, const char *, int); - void (*read_pch) (cpp_reader *, const char *, int, const char *); - missing_header_cb missing_header; - - /* Context-sensitive macro support. Returns macro (if any) that should - be expanded. */ - cpp_hashnode * (*macro_to_expand) (cpp_reader *, const cpp_token *); - - /* Called to emit a diagnostic. This callback receives the - translated message. */ - bool (*diagnostic) (cpp_reader *, - enum cpp_diagnostic_level, - enum cpp_warning_reason, - rich_location *, - const char *, va_list *) - ATTRIBUTE_FPTR_PRINTF(5,0); - - /* Callbacks for when a macro is expanded, or tested (whether - defined or not at the time) in #ifdef, #ifndef or "defined". */ - void (*used_define) (cpp_reader *, location_t, cpp_hashnode *); - void (*used_undef) (cpp_reader *, location_t, cpp_hashnode *); - /* Called before #define and #undef or other macro definition - changes are processed. */ - void (*before_define) (cpp_reader *); - /* Called whenever a macro is expanded or tested. - Second argument is the location of the start of the current expansion. */ - void (*used) (cpp_reader *, location_t, cpp_hashnode *); - - /* Callback to identify whether an attribute exists. */ - int (*has_attribute) (cpp_reader *, bool); - - /* Callback to determine whether a built-in function is recognized. */ - int (*has_builtin) (cpp_reader *); - - /* Callback that can change a user lazy into normal macro. */ - void (*user_lazy_macro) (cpp_reader *, cpp_macro *, unsigned); - - /* Callback to handle deferred cpp_macros. */ - cpp_macro *(*user_deferred_macro) (cpp_reader *, location_t, cpp_hashnode *); - - /* Callback to parse SOURCE_DATE_EPOCH from environment. */ - time_t (*get_source_date_epoch) (cpp_reader *); - - /* Callback for providing suggestions for misspelled directives. */ - const char *(*get_suggestion) (cpp_reader *, const char *, const char *const *); - - /* Callback for when a comment is encountered, giving the location - of the opening slash, a pointer to the content (which is not - necessarily 0-terminated), and the length of the content. - The content contains the opening slash-star (or slash-slash), - and for C-style comments contains the closing star-slash. For - C++-style comments it does not include the terminating newline. */ - void (*comment) (cpp_reader *, location_t, const unsigned char *, - size_t); - - /* Callback for filename remapping in __FILE__ and __BASE_FILE__ macro - expansions. */ - const char *(*remap_filename) (const char*); - - /* Maybe translate a #include into something else. Return a - cpp_buffer containing the translation if translating. */ - char *(*translate_include) (cpp_reader *, line_maps *, location_t, - const char *path); -}; - -#ifdef VMS -#define INO_T_CPP ino_t ino[3] -#elif defined (_AIX) && SIZEOF_INO_T == 4 -#define INO_T_CPP ino64_t ino -#else -#define INO_T_CPP ino_t ino -#endif - -#if defined (_AIX) && SIZEOF_DEV_T == 4 -#define DEV_T_CPP dev64_t dev -#else -#define DEV_T_CPP dev_t dev -#endif - -/* Chain of directories to look for include files in. 
*/ -struct cpp_dir -{ - /* NULL-terminated singly-linked list. */ - struct cpp_dir *next; - - /* NAME of the directory, NUL-terminated. */ - char *name; - unsigned int len; - - /* One if a system header, two if a system header that has extern - "C" guards for C++. */ - unsigned char sysp; - - /* Is this a user-supplied directory? */ - bool user_supplied_p; - - /* The canonicalized NAME as determined by lrealpath. This field - is only used by hosts that lack reliable inode numbers. */ - char *canonical_name; - - /* Mapping of file names for this directory for MS-DOS and related - platforms. A NULL-terminated array of (from, to) pairs. */ - const char **name_map; - - /* Routine to construct pathname, given the search path name and the - HEADER we are trying to find, return a constructed pathname to - try and open. If this is NULL, the constructed pathname is as - constructed by append_file_to_dir. */ - char *(*construct) (const char *header, cpp_dir *dir); - - /* The C front end uses these to recognize duplicated - directories in the search path. */ - INO_T_CPP; - DEV_T_CPP; -}; - -/* The kind of the cpp_macro. */ -enum cpp_macro_kind { - cmk_macro, /* An ISO macro (token expansion). */ - cmk_assert, /* An assertion. */ - cmk_traditional /* A traditional macro (text expansion). */ -}; - -/* Each macro definition is recorded in a cpp_macro structure. - Variadic macros cannot occur with traditional cpp. */ -struct GTY(()) cpp_macro { - union cpp_parm_u - { - /* Parameters, if any. If parameter names use extended identifiers, - the original spelling of those identifiers, not the canonical - UTF-8 spelling, goes here. */ - cpp_hashnode ** GTY ((tag ("false"), - nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"), - length ("%1.paramc"))) params; - - /* If this is an assertion, the next one in the chain. */ - cpp_macro *GTY ((tag ("true"))) next; - } GTY ((desc ("%1.kind == cmk_assert"))) parm; - - /* Definition line number. */ - location_t line; - - /* Number of tokens in body, or bytes for traditional macros. */ - /* Do we really need 2^32-1 range here? */ - unsigned int count; - - /* Number of parameters. */ - unsigned short paramc; - - /* Non-zero if this is a user-lazy macro, value provided by user. */ - unsigned char lazy; - - /* The kind of this macro (ISO, trad or assert) */ - unsigned kind : 2; - - /* If a function-like macro. */ - unsigned int fun_like : 1; - - /* If a variadic macro. */ - unsigned int variadic : 1; - - /* If macro defined in system header. */ - unsigned int syshdr : 1; - - /* Nonzero if it has been expanded or had its existence tested. */ - unsigned int used : 1; - - /* Indicate whether the tokens include extra CPP_PASTE tokens at the - end to track invalid redefinitions with consecutive CPP_PASTE - tokens. */ - unsigned int extra_tokens : 1; - - /* Imported C++20 macro (from a header unit). */ - unsigned int imported_p : 1; - - /* 0 bits spare (32-bit). 32 on 64-bit target. */ - - union cpp_exp_u - { - /* Trailing array of replacement tokens (ISO), or assertion body value. */ - cpp_token GTY ((tag ("false"), length ("%1.count"))) tokens[1]; - - /* Pointer to replacement text (traditional). See comment at top - of cpptrad.c for how traditional function-like macros are - encoded. */ - const unsigned char *GTY ((tag ("true"))) text; - } GTY ((desc ("%1.kind == cmk_traditional"))) exp; -}; - -/* Poisoned identifiers are flagged NODE_POISONED. 
NODE_OPERATOR (C++ - only) indicates an identifier that behaves like an operator such as - "xor". NODE_DIAGNOSTIC is for speed in lex_token: it indicates a - diagnostic may be required for this node. Currently this only - applies to __VA_ARGS__, poisoned identifiers, and -Wc++-compat - warnings about NODE_OPERATOR. */ - -/* Hash node flags. */ -#define NODE_OPERATOR (1 << 0) /* C++ named operator. */ -#define NODE_POISONED (1 << 1) /* Poisoned identifier. */ -#define NODE_DIAGNOSTIC (1 << 2) /* Possible diagnostic when lexed. */ -#define NODE_WARN (1 << 3) /* Warn if redefined or undefined. */ -#define NODE_DISABLED (1 << 4) /* A disabled macro. */ -#define NODE_USED (1 << 5) /* Dumped with -dU. */ -#define NODE_CONDITIONAL (1 << 6) /* Conditional macro */ -#define NODE_WARN_OPERATOR (1 << 7) /* Warn about C++ named operator. */ -#define NODE_MODULE (1 << 8) /* C++-20 module-related name. */ - -/* Different flavors of hash node. */ -enum node_type -{ - NT_VOID = 0, /* Maybe an assert? */ - NT_MACRO_ARG, /* A macro arg. */ - NT_USER_MACRO, /* A user macro. */ - NT_BUILTIN_MACRO, /* A builtin macro. */ - NT_MACRO_MASK = NT_USER_MACRO /* Mask for either macro kind. */ -}; - -/* Different flavors of builtin macro. _Pragma is an operator, but we - handle it with the builtin code for efficiency reasons. */ -enum cpp_builtin_type -{ - BT_SPECLINE = 0, /* `__LINE__' */ - BT_DATE, /* `__DATE__' */ - BT_FILE, /* `__FILE__' */ - BT_FILE_NAME, /* `__FILE_NAME__' */ - BT_BASE_FILE, /* `__BASE_FILE__' */ - BT_INCLUDE_LEVEL, /* `__INCLUDE_LEVEL__' */ - BT_TIME, /* `__TIME__' */ - BT_STDC, /* `__STDC__' */ - BT_PRAGMA, /* `_Pragma' operator */ - BT_TIMESTAMP, /* `__TIMESTAMP__' */ - BT_COUNTER, /* `__COUNTER__' */ - BT_HAS_ATTRIBUTE, /* `__has_attribute(x)' */ - BT_HAS_STD_ATTRIBUTE, /* `__has_c_attribute(x)' */ - BT_HAS_BUILTIN, /* `__has_builtin(x)' */ - BT_HAS_INCLUDE, /* `__has_include(x)' */ - BT_HAS_INCLUDE_NEXT, /* `__has_include_next(x)' */ - - // RT Extension - BT_RT_ASSIGN, - BT_RT_TO_ARG_LIST, - BT_RT_TO_TOKEN_LIST, - BT_RT_FIRST, - BT_RT_REST, - BT_RT_MAP, - BT_RT_AL_MAP, - BT_RT_IF, - BT_RT_NOT, - BT_RT_AND, - BT_RT_OR, - BT_RT_IS_IDENTIFIER, - BT_RT_IS_NAME, - BT_RT_PASTE -}; - -#define CPP_HASHNODE(HNODE) ((cpp_hashnode *) (HNODE)) -#define HT_NODE(NODE) (&(NODE)->ident) -#define NODE_LEN(NODE) HT_LEN (HT_NODE (NODE)) -#define NODE_NAME(NODE) HT_STR (HT_NODE (NODE)) - -/* The common part of an identifier node shared amongst all 3 C front - ends. Also used to store CPP identifiers, which are a superset of - identifiers in the grammatical sense. */ - -union GTY(()) _cpp_hashnode_value { - /* Assert (maybe NULL) */ - cpp_macro * GTY((tag ("NT_VOID"))) answers; - /* Macro (maybe NULL) */ - cpp_macro * GTY((tag ("NT_USER_MACRO"))) macro; - /* Code for a builtin macro. */ - enum cpp_builtin_type GTY ((tag ("NT_BUILTIN_MACRO"))) builtin; - /* Macro argument index. */ - unsigned short GTY ((tag ("NT_MACRO_ARG"))) arg_index; -}; - -struct GTY(()) cpp_hashnode { - struct ht_identifier ident; - unsigned int is_directive : 1; - unsigned int directive_index : 7; /* If is_directive, - then index into directive table. - Otherwise, a NODE_OPERATOR. */ - unsigned int rid_code : 8; /* Rid code - for front ends. */ - unsigned int flags : 9; /* CPP flags. */ - ENUM_BITFIELD(node_type) type : 2; /* CPP node type. */ - - /* 5 bits spare. */ - - /* The deferred cookie is applicable to NT_USER_MACRO or NT_VOID. - The latter for when a macro had a prevailing undef. 
- On a 64-bit system there would be 32-bits of padding to the value - field. So placing the deferred index here is not costly. */ - unsigned deferred; /* Deferred cookie */ - - union _cpp_hashnode_value GTY ((desc ("%1.type"))) value; -}; - -/* A class for iterating through the source locations within a - string token (before escapes are interpreted, and before - concatenation). */ - -class cpp_string_location_reader { - public: - cpp_string_location_reader (location_t src_loc, - line_maps *line_table); - - source_range get_next (); - - private: - location_t m_loc; - int m_offset_per_column; -}; - -/* A class for storing the source ranges of all of the characters within - a string literal, after escapes are interpreted, and after - concatenation. - - This is not GTY-marked, as instances are intended to be temporary. */ - -class cpp_substring_ranges -{ - public: - cpp_substring_ranges (); - ~cpp_substring_ranges (); - - int get_num_ranges () const { return m_num_ranges; } - source_range get_range (int idx) const - { - linemap_assert (idx < m_num_ranges); - return m_ranges[idx]; - } - - void add_range (source_range range); - void add_n_ranges (int num, cpp_string_location_reader &loc_reader); - - private: - source_range *m_ranges; - int m_num_ranges; - int m_alloc_ranges; -}; - -/* Call this first to get a handle to pass to other functions. - - If you want cpplib to manage its own hashtable, pass in a NULL - pointer. Otherwise you should pass in an initialized hash table - that cpplib will share; this technique is used by the C front - ends. */ -extern cpp_reader *cpp_create_reader (enum c_lang, struct ht *, - class line_maps *); - -/* Reset the cpp_reader's line_map. This is only used after reading a - PCH file. */ -extern void cpp_set_line_map (cpp_reader *, class line_maps *); - -/* Call this to change the selected language standard (e.g. because of - command line options). */ -extern void cpp_set_lang (cpp_reader *, enum c_lang); - -/* Set the include paths. */ -extern void cpp_set_include_chains (cpp_reader *, cpp_dir *, cpp_dir *, int); - -/* Call these to get pointers to the options, callback, and deps - structures for a given reader. These pointers are good until you - call cpp_finish on that reader. You can either edit the callbacks - through the pointer returned from cpp_get_callbacks, or set them - with cpp_set_callbacks. */ -extern cpp_options *cpp_get_options (cpp_reader *) ATTRIBUTE_PURE; -extern cpp_callbacks *cpp_get_callbacks (cpp_reader *) ATTRIBUTE_PURE; -extern void cpp_set_callbacks (cpp_reader *, cpp_callbacks *); -extern class mkdeps *cpp_get_deps (cpp_reader *) ATTRIBUTE_PURE; - -extern const char *cpp_probe_header_unit (cpp_reader *, const char *file, - bool angle_p, location_t); - -/* Call these to get name data about the various compile-time - charsets. */ -extern const char *cpp_get_narrow_charset_name (cpp_reader *) ATTRIBUTE_PURE; -extern const char *cpp_get_wide_charset_name (cpp_reader *) ATTRIBUTE_PURE; - -/* This function reads the file, but does not start preprocessing. It - returns the name of the original file; this is the same as the - input file, except for preprocessed input. This will generate at - least one file change callback, and possibly a line change callback - too. If there was an error opening the file, it returns NULL. */ -extern const char *cpp_read_main_file (cpp_reader *, const char *, - bool injecting = false); -extern location_t cpp_main_loc (const cpp_reader *); - -/* Adjust for the main file to be an include. 
*/ -extern void cpp_retrofit_as_include (cpp_reader *); - -/* Set up built-ins with special behavior. Use cpp_init_builtins() - instead unless your know what you are doing. */ -extern void cpp_init_special_builtins (cpp_reader *); - -/* Set up built-ins like __FILE__. */ -extern void cpp_init_builtins (cpp_reader *, int); - -/* This is called after options have been parsed, and partially - processed. */ -extern void cpp_post_options (cpp_reader *); - -/* Set up translation to the target character set. */ -extern void cpp_init_iconv (cpp_reader *); - -/* Call this to finish preprocessing. If you requested dependency - generation, pass an open stream to write the information to, - otherwise NULL. It is your responsibility to close the stream. */ -extern void cpp_finish (cpp_reader *, FILE *deps_stream); - -/* Call this to release the handle at the end of preprocessing. Any - use of the handle after this function returns is invalid. */ -extern void cpp_destroy (cpp_reader *); - -extern unsigned int cpp_token_len (const cpp_token *); -extern unsigned char *cpp_token_as_text (cpp_reader *, const cpp_token *); -extern unsigned char *cpp_spell_token (cpp_reader *, const cpp_token *, - unsigned char *, bool); -extern void cpp_register_pragma (cpp_reader *, const char *, const char *, - void (*) (cpp_reader *), bool); -extern void cpp_register_deferred_pragma (cpp_reader *, const char *, - const char *, unsigned, bool, bool); -extern int cpp_avoid_paste (cpp_reader *, const cpp_token *, - const cpp_token *); -extern const cpp_token *cpp_get_token (cpp_reader *); -extern const cpp_token *cpp_get_token_with_location (cpp_reader *, - location_t *); -inline bool cpp_user_macro_p (const cpp_hashnode *node) -{ - return node->type == NT_USER_MACRO; -} -inline bool cpp_builtin_macro_p (const cpp_hashnode *node) -{ - return node->type == NT_BUILTIN_MACRO; -} -inline bool cpp_macro_p (const cpp_hashnode *node) -{ - return node->type & NT_MACRO_MASK; -} -inline cpp_macro *cpp_set_deferred_macro (cpp_hashnode *node, - cpp_macro *forced = NULL) -{ - cpp_macro *old = node->value.macro; - - node->value.macro = forced; - node->type = NT_USER_MACRO; - node->flags &= ~NODE_USED; - - return old; -} -cpp_macro *cpp_get_deferred_macro (cpp_reader *, cpp_hashnode *, location_t); - -/* Returns true if NODE is a function-like user macro. */ -inline bool cpp_fun_like_macro_p (cpp_hashnode *node) -{ - return cpp_user_macro_p (node) && node->value.macro->fun_like; -} - -extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *); -extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *, - const cpp_macro *); -inline location_t cpp_macro_definition_location (cpp_hashnode *node) -{ - const cpp_macro *macro = node->value.macro; - return macro ? macro->line : 0; -} -/* Return an idempotent time stamp (possibly from SOURCE_DATE_EPOCH). */ -enum class CPP_time_kind -{ - FIXED = -1, /* Fixed time via source epoch. */ - DYNAMIC = -2, /* Dynamic via time(2). */ - UNKNOWN = -3 /* Wibbly wobbly, timey wimey. */ -}; -extern CPP_time_kind cpp_get_date (cpp_reader *, time_t *); - -extern void _cpp_backup_tokens (cpp_reader *, unsigned int); -extern const cpp_token *cpp_peek_token (cpp_reader *, int); - -/* Evaluate a CPP_*CHAR* token. */ -extern cppchar_t cpp_interpret_charconst (cpp_reader *, const cpp_token *, - unsigned int *, int *); -/* Evaluate a vector of CPP_*STRING* tokens. 
*/ -extern bool cpp_interpret_string (cpp_reader *, - const cpp_string *, size_t, - cpp_string *, enum cpp_ttype); -extern const char *cpp_interpret_string_ranges (cpp_reader *pfile, - const cpp_string *from, - cpp_string_location_reader *, - size_t count, - cpp_substring_ranges *out, - enum cpp_ttype type); -extern bool cpp_interpret_string_notranslate (cpp_reader *, - const cpp_string *, size_t, - cpp_string *, enum cpp_ttype); - -/* Convert a host character constant to the execution character set. */ -extern cppchar_t cpp_host_to_exec_charset (cpp_reader *, cppchar_t); - -/* Used to register macros and assertions, perhaps from the command line. - The text is the same as the command line argument. */ -extern void cpp_define (cpp_reader *, const char *); -extern void cpp_define_unused (cpp_reader *, const char *); -extern void cpp_define_formatted (cpp_reader *pfile, - const char *fmt, ...) ATTRIBUTE_PRINTF_2; -extern void cpp_define_formatted_unused (cpp_reader *pfile, - const char *fmt, - ...) ATTRIBUTE_PRINTF_2; -extern void cpp_assert (cpp_reader *, const char *); -extern void cpp_undef (cpp_reader *, const char *); -extern void cpp_unassert (cpp_reader *, const char *); - -/* Mark a node as a lazily defined macro. */ -extern void cpp_define_lazily (cpp_reader *, cpp_hashnode *node, unsigned N); - -/* Undefine all macros and assertions. */ -extern void cpp_undef_all (cpp_reader *); - -extern cpp_buffer *cpp_push_buffer (cpp_reader *, const unsigned char *, - size_t, int); -extern int cpp_defined (cpp_reader *, const unsigned char *, int); - -/* A preprocessing number. Code assumes that any unused high bits of - the double integer are set to zero. */ - -/* This type has to be equal to unsigned HOST_WIDE_INT, see - gcc/c-family/c-lex.cc. */ -typedef uint64_t cpp_num_part; -typedef struct cpp_num cpp_num; -struct cpp_num -{ - cpp_num_part high; - cpp_num_part low; - bool unsignedp; /* True if value should be treated as unsigned. */ - bool overflow; /* True if the most recent calculation overflowed. */ -}; - -/* cpplib provides two interfaces for interpretation of preprocessing - numbers. - - cpp_classify_number categorizes numeric constants according to - their field (integer, floating point, or invalid), radix (decimal, - octal, hexadecimal), and type suffixes. */ - -#define CPP_N_CATEGORY 0x000F -#define CPP_N_INVALID 0x0000 -#define CPP_N_INTEGER 0x0001 -#define CPP_N_FLOATING 0x0002 - -#define CPP_N_WIDTH 0x00F0 -#define CPP_N_SMALL 0x0010 /* int, float, short _Fract/Accum */ -#define CPP_N_MEDIUM 0x0020 /* long, double, long _Fract/_Accum. */ -#define CPP_N_LARGE 0x0040 /* long long, long double, - long long _Fract/Accum. */ - -#define CPP_N_WIDTH_MD 0xF0000 /* machine defined. */ -#define CPP_N_MD_W 0x10000 -#define CPP_N_MD_Q 0x20000 - -#define CPP_N_RADIX 0x0F00 -#define CPP_N_DECIMAL 0x0100 -#define CPP_N_HEX 0x0200 -#define CPP_N_OCTAL 0x0400 -#define CPP_N_BINARY 0x0800 - -#define CPP_N_UNSIGNED 0x1000 /* Properties. */ -#define CPP_N_IMAGINARY 0x2000 -#define CPP_N_DFLOAT 0x4000 -#define CPP_N_DEFAULT 0x8000 - -#define CPP_N_FRACT 0x100000 /* Fract types. */ -#define CPP_N_ACCUM 0x200000 /* Accum types. */ -#define CPP_N_FLOATN 0x400000 /* _FloatN types. */ -#define CPP_N_FLOATNX 0x800000 /* _FloatNx types. */ - -#define CPP_N_USERDEF 0x1000000 /* C++11 user-defined literal. */ - -#define CPP_N_SIZE_T 0x2000000 /* C++23 size_t literal. */ - -#define CPP_N_WIDTH_FLOATN_NX 0xF0000000 /* _FloatN / _FloatNx value - of N, divided by 16. 
*/ -#define CPP_FLOATN_SHIFT 24 -#define CPP_FLOATN_MAX 0xF0 - -/* Classify a CPP_NUMBER token. The return value is a combination of - the flags from the above sets. */ -extern unsigned cpp_classify_number (cpp_reader *, const cpp_token *, - const char **, location_t); - -/* Return the classification flags for a float suffix. */ -extern unsigned int cpp_interpret_float_suffix (cpp_reader *, const char *, - size_t); - -/* Return the classification flags for an int suffix. */ -extern unsigned int cpp_interpret_int_suffix (cpp_reader *, const char *, - size_t); - -/* Evaluate a token classified as category CPP_N_INTEGER. */ -extern cpp_num cpp_interpret_integer (cpp_reader *, const cpp_token *, - unsigned int); - -/* Sign extend a number, with PRECISION significant bits and all - others assumed clear, to fill out a cpp_num structure. */ -cpp_num cpp_num_sign_extend (cpp_num, size_t); - -/* Output a diagnostic of some kind. */ -extern bool cpp_error (cpp_reader *, enum cpp_diagnostic_level, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; -extern bool cpp_warning (cpp_reader *, enum cpp_warning_reason, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; -extern bool cpp_pedwarning (cpp_reader *, enum cpp_warning_reason, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; -extern bool cpp_warning_syshdr (cpp_reader *, enum cpp_warning_reason reason, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; - -/* As their counterparts above, but use RICHLOC. */ -extern bool cpp_warning_at (cpp_reader *, enum cpp_warning_reason, - rich_location *richloc, const char *msgid, ...) - ATTRIBUTE_PRINTF_4; -extern bool cpp_pedwarning_at (cpp_reader *, enum cpp_warning_reason, - rich_location *richloc, const char *msgid, ...) - ATTRIBUTE_PRINTF_4; - -/* Output a diagnostic with "MSGID: " preceding the - error string of errno. No location is printed. */ -extern bool cpp_errno (cpp_reader *, enum cpp_diagnostic_level, - const char *msgid); -/* Similarly, but with "FILENAME: " instead of "MSGID: ", where - the filename is not localized. */ -extern bool cpp_errno_filename (cpp_reader *, enum cpp_diagnostic_level, - const char *filename, location_t loc); - -/* Same as cpp_error, except additionally specifies a position as a - (translation unit) physical line and physical column. If the line is - zero, then no location is printed. */ -extern bool cpp_error_with_line (cpp_reader *, enum cpp_diagnostic_level, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; -extern bool cpp_warning_with_line (cpp_reader *, enum cpp_warning_reason, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; -extern bool cpp_pedwarning_with_line (cpp_reader *, enum cpp_warning_reason, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; -extern bool cpp_warning_with_line_syshdr (cpp_reader *, enum cpp_warning_reason, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; - -extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, - location_t src_loc, const char *msgid, ...) - ATTRIBUTE_PRINTF_4; - -extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, - rich_location *richloc, const char *msgid, ...) 
- ATTRIBUTE_PRINTF_4; - -/* In lex.cc */ -extern int cpp_ideq (const cpp_token *, const char *); -extern void cpp_output_line (cpp_reader *, FILE *); -extern unsigned char *cpp_output_line_to_string (cpp_reader *, - const unsigned char *); -extern const unsigned char *cpp_alloc_token_string - (cpp_reader *, const unsigned char *, unsigned); -extern void cpp_output_token (const cpp_token *, FILE *); -extern const char *cpp_type2name (enum cpp_ttype, unsigned char flags); -/* Returns the value of an escape sequence, truncated to the correct - target precision. PSTR points to the input pointer, which is just - after the backslash. LIMIT is how much text we have. WIDE is true - if the escape sequence is part of a wide character constant or - string literal. Handles all relevant diagnostics. */ -extern cppchar_t cpp_parse_escape (cpp_reader *, const unsigned char ** pstr, - const unsigned char *limit, int wide); - -/* Structure used to hold a comment block at a given location in the - source code. */ - -typedef struct -{ - /* Text of the comment including the terminators. */ - char *comment; - - /* source location for the given comment. */ - location_t sloc; -} cpp_comment; - -/* Structure holding all comments for a given cpp_reader. */ - -typedef struct -{ - /* table of comment entries. */ - cpp_comment *entries; - - /* number of actual entries entered in the table. */ - int count; - - /* number of entries allocated currently. */ - int allocated; -} cpp_comment_table; - -/* Returns the table of comments encountered by the preprocessor. This - table is only populated when pfile->state.save_comments is true. */ -extern cpp_comment_table *cpp_get_comments (cpp_reader *); - -/* In hash.c */ - -/* Lookup an identifier in the hashtable. Puts the identifier in the - table if it is not already there. */ -extern cpp_hashnode *cpp_lookup (cpp_reader *, const unsigned char *, - unsigned int); - -typedef int (*cpp_cb) (cpp_reader *, cpp_hashnode *, void *); -extern void cpp_forall_identifiers (cpp_reader *, cpp_cb, void *); - -/* In macro.cc */ -extern void cpp_scan_nooutput (cpp_reader *); -extern int cpp_sys_macro_p (cpp_reader *); -extern unsigned char *cpp_quote_string (unsigned char *, const unsigned char *, - unsigned int); -extern bool cpp_compare_macros (const cpp_macro *macro1, - const cpp_macro *macro2); - -/* In files.cc */ -extern bool cpp_included (cpp_reader *, const char *); -extern bool cpp_included_before (cpp_reader *, const char *, location_t); -extern void cpp_make_system_header (cpp_reader *, int, int); -extern bool cpp_push_include (cpp_reader *, const char *); -extern bool cpp_push_default_include (cpp_reader *, const char *); -extern void cpp_change_file (cpp_reader *, enum lc_reason, const char *); -extern const char *cpp_get_path (struct _cpp_file *); -extern cpp_dir *cpp_get_dir (struct _cpp_file *); -extern cpp_buffer *cpp_get_buffer (cpp_reader *); -extern struct _cpp_file *cpp_get_file (cpp_buffer *); -extern cpp_buffer *cpp_get_prev (cpp_buffer *); -extern void cpp_clear_file_cache (cpp_reader *); - -/* cpp_get_converted_source returns the contents of the given file, as it exists - after cpplib has read it and converted it from the input charset to the - source charset. Return struct will be zero-filled if the data could not be - read for any reason. The data starts at the DATA pointer, but the TO_FREE - pointer is what should be passed to free(), as there may be an offset. 
*/ -struct cpp_converted_source -{ - char *to_free; - char *data; - size_t len; -}; -cpp_converted_source cpp_get_converted_source (const char *fname, - const char *input_charset); - -/* In pch.cc */ -struct save_macro_data; -extern int cpp_save_state (cpp_reader *, FILE *); -extern int cpp_write_pch_deps (cpp_reader *, FILE *); -extern int cpp_write_pch_state (cpp_reader *, FILE *); -extern int cpp_valid_state (cpp_reader *, const char *, int); -extern void cpp_prepare_state (cpp_reader *, struct save_macro_data **); -extern int cpp_read_state (cpp_reader *, const char *, FILE *, - struct save_macro_data *); - -/* In lex.cc */ -extern void cpp_force_token_locations (cpp_reader *, location_t); -extern void cpp_stop_forcing_token_locations (cpp_reader *); -enum CPP_DO_task -{ - CPP_DO_print, - CPP_DO_location, - CPP_DO_token -}; - -extern void cpp_directive_only_process (cpp_reader *pfile, - void *data, - void (*cb) (cpp_reader *, - CPP_DO_task, - void *data, ...)); - -/* In expr.cc */ -extern enum cpp_ttype cpp_userdef_string_remove_type - (enum cpp_ttype type); -extern enum cpp_ttype cpp_userdef_string_add_type - (enum cpp_ttype type); -extern enum cpp_ttype cpp_userdef_char_remove_type - (enum cpp_ttype type); -extern enum cpp_ttype cpp_userdef_char_add_type - (enum cpp_ttype type); -extern bool cpp_userdef_string_p - (enum cpp_ttype type); -extern bool cpp_userdef_char_p - (enum cpp_ttype type); -extern const char * cpp_get_userdef_suffix - (const cpp_token *); - -/* In charset.cc */ - -/* The result of attempting to decode a run of UTF-8 bytes. */ - -struct cpp_decoded_char -{ - const char *m_start_byte; - const char *m_next_byte; - - bool m_valid_ch; - cppchar_t m_ch; -}; - -/* Information for mapping between code points and display columns. - - This is a tabstop value, along with a callback for getting the - widths of characters. Normally this callback is cpp_wcwidth, but we - support other schemes for escaping non-ASCII unicode as a series of - ASCII chars when printing the user's source code in diagnostic-show-locus.cc - - For example, consider: - - the Unicode character U+03C0 "GREEK SMALL LETTER PI" (UTF-8: 0xCF 0x80) - - the Unicode character U+1F642 "SLIGHTLY SMILING FACE" - (UTF-8: 0xF0 0x9F 0x99 0x82) - - the byte 0xBF (a stray trailing byte of a UTF-8 character) - Normally U+03C0 would occupy one display column, U+1F642 - would occupy two display columns, and the stray byte would be - printed verbatim as one display column. - - However when escaping them as unicode code points as "" - and "" they occupy 8 and 9 display columns respectively, - and when escaping them as bytes as "<80>" and "<9F><99><82>" - they occupy 8 and 16 display columns respectively. In both cases - the stray byte is escaped to as 4 display columns. */ - -struct cpp_char_column_policy -{ - cpp_char_column_policy (int tabstop, - int (*width_cb) (cppchar_t c)) - : m_tabstop (tabstop), - m_undecoded_byte_width (1), - m_width_cb (width_cb) - {} - - int m_tabstop; - /* Width in display columns of a stray byte that isn't decodable - as UTF-8. */ - int m_undecoded_byte_width; - int (*m_width_cb) (cppchar_t c); -}; - -/* A class to manage the state while converting a UTF-8 sequence to cppchar_t - and computing the display width one character at a time. 
*/ -class cpp_display_width_computation { - public: - cpp_display_width_computation (const char *data, int data_length, - const cpp_char_column_policy &policy); - const char *next_byte () const { return m_next; } - int bytes_processed () const { return m_next - m_begin; } - int bytes_left () const { return m_bytes_left; } - bool done () const { return !bytes_left (); } - int display_cols_processed () const { return m_display_cols; } - - int process_next_codepoint (cpp_decoded_char *out); - int advance_display_cols (int n); - - private: - const char *const m_begin; - const char *m_next; - size_t m_bytes_left; - const cpp_char_column_policy &m_policy; - int m_display_cols; -}; - -/* Convenience functions that are simple use cases for class - cpp_display_width_computation. Tab characters will be expanded to spaces - as determined by POLICY.m_tabstop, and non-printable-ASCII characters - will be escaped as per POLICY. */ - -int cpp_byte_column_to_display_column (const char *data, int data_length, - int column, - const cpp_char_column_policy &policy); -inline int cpp_display_width (const char *data, int data_length, - const cpp_char_column_policy &policy) -{ - return cpp_byte_column_to_display_column (data, data_length, data_length, - policy); -} -int cpp_display_column_to_byte_column (const char *data, int data_length, - int display_col, - const cpp_char_column_policy &policy); -int cpp_wcwidth (cppchar_t c); - -bool cpp_input_conversion_is_trivial (const char *input_charset); -int cpp_check_utf8_bom (const char *data, size_t data_length); - -#endif /* ! LIBCPP_CPPLIB_H */ diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/include/cpplib.h" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/include/cpplib.h" deleted file mode 100644 index aea752f..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/include/cpplib.h" +++ /dev/null @@ -1,1585 +0,0 @@ -/* Definitions for CPP library. - Copyright (C) 1995-2022 Free Software Foundation, Inc. - Written by Per Bothner, 1994-95. - -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 3, or (at your option) any -later version. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -You should have received a copy of the GNU General Public License -along with this program; see the file COPYING3. If not see -. - - In other words, you are welcome to use, share and improve this program. - You are forbidden to forbid anyone else to use, share and improve - what you give them. Help stamp out software-hoarding! */ -#ifndef LIBCPP_CPPLIB_H -#define LIBCPP_CPPLIB_H - -#include -#include "symtab.h" -#include "line-map.h" - -typedef struct cpp_reader cpp_reader; -typedef struct cpp_buffer cpp_buffer; -typedef struct cpp_options cpp_options; -typedef struct cpp_token cpp_token; -typedef struct cpp_string cpp_string; -typedef struct cpp_hashnode cpp_hashnode; -typedef struct cpp_macro cpp_macro; -typedef struct cpp_callbacks cpp_callbacks; -typedef struct cpp_dir cpp_dir; - -struct _cpp_file; - -/* The first three groups, apart from '=', can appear in preprocessor - expressions (+= and -= are used to indicate unary + and - resp.). 
- This allows a lookup table to be implemented in _cpp_parse_expr. - - The first group, to CPP_LAST_EQ, can be immediately followed by an - '='. The lexer needs operators ending in '=', like ">>=", to be in - the same order as their counterparts without the '=', like ">>". - - See the cpp_operator table optab in expr.cc if you change the order or - add or remove anything in the first group. */ - -#define TTYPE_TABLE \ - OP(EQ, "=") \ - OP(NOT, "!") \ - OP(GREATER, ">") /* compare */ \ - OP(LESS, "<") \ - OP(PLUS, "+") /* math */ \ - OP(MINUS, "-") \ - OP(MULT, "*") \ - OP(DIV, "/") \ - OP(MOD, "%") \ - OP(AND, "&") /* bit ops */ \ - OP(OR, "|") \ - OP(XOR, "^") \ - OP(RSHIFT, ">>") \ - OP(LSHIFT, "<<") \ - \ - OP(COMPL, "~") \ - OP(AND_AND, "&&") /* logical */ \ - OP(OR_OR, "||") \ - OP(QUERY, "?") \ - OP(COLON, ":") \ - OP(COMMA, ",") /* grouping */ \ - OP(OPEN_PAREN, "(") \ - OP(CLOSE_PAREN, ")") \ - TK(EOF, NONE) \ - OP(EQ_EQ, "==") /* compare */ \ - OP(NOT_EQ, "!=") \ - OP(GREATER_EQ, ">=") \ - OP(LESS_EQ, "<=") \ - OP(SPACESHIP, "<=>") \ - \ - /* These two are unary + / - in preprocessor expressions. */ \ - OP(PLUS_EQ, "+=") /* math */ \ - OP(MINUS_EQ, "-=") \ - \ - OP(MULT_EQ, "*=") \ - OP(DIV_EQ, "/=") \ - OP(MOD_EQ, "%=") \ - OP(AND_EQ, "&=") /* bit ops */ \ - OP(OR_EQ, "|=") \ - OP(XOR_EQ, "^=") \ - OP(RSHIFT_EQ, ">>=") \ - OP(LSHIFT_EQ, "<<=") \ - /* Digraphs together, beginning with CPP_FIRST_DIGRAPH. */ \ - OP(HASH, "#") /* digraphs */ \ - OP(PASTE, "##") \ - OP(OPEN_SQUARE, "[") \ - OP(CLOSE_SQUARE, "]") \ - OP(OPEN_BRACE, "{") \ - OP(CLOSE_BRACE, "}") \ - /* The remainder of the punctuation. Order is not significant. */ \ - OP(SEMICOLON, ";") /* structure */ \ - OP(ELLIPSIS, "...") \ - OP(PLUS_PLUS, "++") /* increment */ \ - OP(MINUS_MINUS, "--") \ - OP(DEREF, "->") /* accessors */ \ - OP(DOT, ".") \ - OP(SCOPE, "::") \ - OP(DEREF_STAR, "->*") \ - OP(DOT_STAR, ".*") \ - OP(ATSIGN, "@") /* used in Objective-C */ \ - \ - TK(NAME, IDENT) /* word */ \ - TK(AT_NAME, IDENT) /* @word - Objective-C */ \ - TK(NUMBER, LITERAL) /* 34_be+ta */ \ - \ - TK(CHAR, LITERAL) /* 'char' */ \ - TK(WCHAR, LITERAL) /* L'char' */ \ - TK(CHAR16, LITERAL) /* u'char' */ \ - TK(CHAR32, LITERAL) /* U'char' */ \ - TK(UTF8CHAR, LITERAL) /* u8'char' */ \ - TK(OTHER, LITERAL) /* stray punctuation */ \ - \ - TK(STRING, LITERAL) /* "string" */ \ - TK(WSTRING, LITERAL) /* L"string" */ \ - TK(STRING16, LITERAL) /* u"string" */ \ - TK(STRING32, LITERAL) /* U"string" */ \ - TK(UTF8STRING, LITERAL) /* u8"string" */ \ - TK(OBJC_STRING, LITERAL) /* @"string" - Objective-C */ \ - TK(HEADER_NAME, LITERAL) /* in #include */ \ - \ - TK(CHAR_USERDEF, LITERAL) /* 'char'_suffix - C++-0x */ \ - TK(WCHAR_USERDEF, LITERAL) /* L'char'_suffix - C++-0x */ \ - TK(CHAR16_USERDEF, LITERAL) /* u'char'_suffix - C++-0x */ \ - TK(CHAR32_USERDEF, LITERAL) /* U'char'_suffix - C++-0x */ \ - TK(UTF8CHAR_USERDEF, LITERAL) /* u8'char'_suffix - C++-0x */ \ - TK(STRING_USERDEF, LITERAL) /* "string"_suffix - C++-0x */ \ - TK(WSTRING_USERDEF, LITERAL) /* L"string"_suffix - C++-0x */ \ - TK(STRING16_USERDEF, LITERAL) /* u"string"_suffix - C++-0x */ \ - TK(STRING32_USERDEF, LITERAL) /* U"string"_suffix - C++-0x */ \ - TK(UTF8STRING_USERDEF,LITERAL) /* u8"string"_suffix - C++-0x */ \ - \ - TK(COMMENT, LITERAL) /* Only if output comments. */ \ - /* SPELL_LITERAL happens to DTRT. */ \ - TK(MACRO_ARG, NONE) /* Macro argument. */ \ - TK(PRAGMA, NONE) /* Only for deferred pragmas. */ \ - TK(PRAGMA_EOL, NONE) /* End-of-line for deferred pragmas. 
*/ \ - TK(PADDING, NONE) /* Whitespace for -E. */ - -#define OP(e, s) CPP_ ## e, -#define TK(e, s) CPP_ ## e, -enum cpp_ttype -{ - TTYPE_TABLE - N_TTYPES, - - /* A token type for keywords, as opposed to ordinary identifiers. */ - CPP_KEYWORD, - - /* Positions in the table. */ - CPP_LAST_EQ = CPP_LSHIFT, - CPP_FIRST_DIGRAPH = CPP_HASH, - CPP_LAST_PUNCTUATOR= CPP_ATSIGN, - CPP_LAST_CPP_OP = CPP_LESS_EQ -}; -#undef OP -#undef TK - -/* C language kind, used when calling cpp_create_reader. */ -enum c_lang {CLK_GNUC89 = 0, CLK_GNUC99, CLK_GNUC11, CLK_GNUC17, CLK_GNUC2X, - CLK_STDC89, CLK_STDC94, CLK_STDC99, CLK_STDC11, CLK_STDC17, - CLK_STDC2X, - CLK_GNUCXX, CLK_CXX98, CLK_GNUCXX11, CLK_CXX11, - CLK_GNUCXX14, CLK_CXX14, CLK_GNUCXX17, CLK_CXX17, - CLK_GNUCXX20, CLK_CXX20, CLK_GNUCXX23, CLK_CXX23, - CLK_ASM}; - -/* Payload of a NUMBER, STRING, CHAR or COMMENT token. */ -struct GTY(()) cpp_string { - unsigned int len; - const unsigned char *text; -}; - -/* Flags for the cpp_token structure. */ -#define PREV_WHITE (1 << 0) /* If whitespace before this token. */ -#define DIGRAPH (1 << 1) /* If it was a digraph. */ -#define STRINGIFY_ARG (1 << 2) /* If macro argument to be stringified. */ -#define PASTE_LEFT (1 << 3) /* If on LHS of a ## operator. */ -#define NAMED_OP (1 << 4) /* C++ named operators. */ -#define PREV_FALLTHROUGH (1 << 5) /* On a token preceeded by FALLTHROUGH - comment. */ -#define BOL (1 << 6) /* Token at beginning of line. */ -#define PURE_ZERO (1 << 7) /* Single 0 digit, used by the C++ frontend, - set in c-lex.cc. */ -#define COLON_SCOPE PURE_ZERO /* Adjacent colons in C < 23. */ -#define SP_DIGRAPH (1 << 8) /* # or ## token was a digraph. */ -#define SP_PREV_WHITE (1 << 9) /* If whitespace before a ## - operator, or before this token - after a # operator. */ -#define NO_EXPAND (1 << 10) /* Do not macro-expand this token. */ -#define PRAGMA_OP (1 << 11) /* _Pragma token. */ - -/* Specify which field, if any, of the cpp_token union is used. */ - -enum cpp_token_fld_kind { - CPP_TOKEN_FLD_NODE, - CPP_TOKEN_FLD_SOURCE, - CPP_TOKEN_FLD_STR, - CPP_TOKEN_FLD_ARG_NO, - CPP_TOKEN_FLD_TOKEN_NO, - CPP_TOKEN_FLD_PRAGMA, - CPP_TOKEN_FLD_NONE -}; - -/* A macro argument in the cpp_token union. */ -struct GTY(()) cpp_macro_arg { - /* Argument number. */ - unsigned int arg_no; - /* The original spelling of the macro argument token. */ - cpp_hashnode * - GTY ((nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) - spelling; -}; - -/* An identifier in the cpp_token union. */ -struct GTY(()) cpp_identifier { - /* The canonical (UTF-8) spelling of the identifier. */ - cpp_hashnode * - GTY ((nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) - node; - /* The original spelling of the identifier. */ - cpp_hashnode * - GTY ((nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"))) - spelling; -}; - -/* A preprocessing token. This has been carefully packed and should - occupy 16 bytes on 32-bit hosts and 24 bytes on 64-bit hosts. */ -struct GTY(()) cpp_token { - - /* Location of first char of token, together with range of full token. */ - location_t src_loc; - - ENUM_BITFIELD(cpp_ttype) type : CHAR_BIT; /* token type */ - unsigned short flags; /* flags - see above */ - - union cpp_token_u - { - /* An identifier. 
*/ - struct cpp_identifier GTY ((tag ("CPP_TOKEN_FLD_NODE"))) node; - - /* Inherit padding from this token. */ - cpp_token * GTY ((tag ("CPP_TOKEN_FLD_SOURCE"))) source; - - /* A string, or number. */ - struct cpp_string GTY ((tag ("CPP_TOKEN_FLD_STR"))) str; - - /* Argument no. (and original spelling) for a CPP_MACRO_ARG. */ - struct cpp_macro_arg GTY ((tag ("CPP_TOKEN_FLD_ARG_NO"))) macro_arg; - - /* Original token no. for a CPP_PASTE (from a sequence of - consecutive paste tokens in a macro expansion). */ - unsigned int GTY ((tag ("CPP_TOKEN_FLD_TOKEN_NO"))) token_no; - - /* Caller-supplied identifier for a CPP_PRAGMA. */ - unsigned int GTY ((tag ("CPP_TOKEN_FLD_PRAGMA"))) pragma; - } GTY ((desc ("cpp_token_val_index (&%1)"))) val; -}; - -/* Say which field is in use. */ -extern enum cpp_token_fld_kind cpp_token_val_index (const cpp_token *tok); - -/* A type wide enough to hold any multibyte source character. - cpplib's character constant interpreter requires an unsigned type. - Also, a typedef for the signed equivalent. - The width of this type is capped at 32 bits; there do exist targets - where wchar_t is 64 bits, but only in a non-default mode, and there - would be no meaningful interpretation for a wchar_t value greater - than 2^32 anyway -- the widest wide-character encoding around is - ISO 10646, which stops at 2^31. */ -#if CHAR_BIT * SIZEOF_INT >= 32 -# define CPPCHAR_SIGNED_T int -#elif CHAR_BIT * SIZEOF_LONG >= 32 -# define CPPCHAR_SIGNED_T long -#else -# error "Cannot find a least-32-bit signed integer type" -#endif -typedef unsigned CPPCHAR_SIGNED_T cppchar_t; -typedef CPPCHAR_SIGNED_T cppchar_signed_t; - -/* Style of header dependencies to generate. */ -enum cpp_deps_style { DEPS_NONE = 0, DEPS_USER, DEPS_SYSTEM }; - -/* The possible normalization levels, from most restrictive to least. */ -enum cpp_normalize_level { - /* In NFKC. */ - normalized_KC = 0, - /* In NFC. */ - normalized_C, - /* In NFC, except for subsequences where being in NFC would make - the identifier invalid. */ - normalized_identifier_C, - /* Not normalized at all. */ - normalized_none -}; - -enum cpp_main_search -{ - CMS_none, /* A regular source file. */ - CMS_header, /* Is a directly-specified header file (eg PCH or - header-unit). */ - CMS_user, /* Search the user INCLUDE path. */ - CMS_system, /* Search the system INCLUDE path. */ -}; - -/* The possible bidirectional control characters checking levels. */ -enum cpp_bidirectional_level { - /* No checking. */ - bidirectional_none = 0, - /* Only detect unpaired uses of bidirectional control characters. */ - bidirectional_unpaired = 1, - /* Detect any use of bidirectional control characters. */ - bidirectional_any = 2, - /* Also warn about UCNs. */ - bidirectional_ucn = 4 -}; - -/* This structure is nested inside struct cpp_reader, and - carries all the options visible to the command line. */ -struct cpp_options -{ - /* The language we're preprocessing. */ - enum c_lang lang; - - /* Nonzero means use extra default include directories for C++. */ - unsigned char cplusplus; - - /* Nonzero means handle cplusplus style comments. */ - unsigned char cplusplus_comments; - - /* Nonzero means define __OBJC__, treat @ as a special token, use - the OBJC[PLUS]_INCLUDE_PATH environment variable, and allow - "#import". */ - unsigned char objc; - - /* Nonzero means don't copy comments into the output file. */ - unsigned char discard_comments; - - /* Nonzero means don't copy comments into the output file during - macro expansion. 
*/ - unsigned char discard_comments_in_macro_exp; - - /* Nonzero means process the ISO trigraph sequences. */ - unsigned char trigraphs; - - /* Nonzero means process the ISO digraph sequences. */ - unsigned char digraphs; - - /* Nonzero means to allow hexadecimal floats and LL suffixes. */ - unsigned char extended_numbers; - - /* Nonzero means process u/U prefix literals (UTF-16/32). */ - unsigned char uliterals; - - /* Nonzero means process u8 prefixed character literals (UTF-8). */ - unsigned char utf8_char_literals; - - /* Nonzero means process r/R raw strings. If this is set, uliterals - must be set as well. */ - unsigned char rliterals; - - /* Nonzero means print names of header files (-H). */ - unsigned char print_include_names; - - /* Nonzero means complain about deprecated features. */ - unsigned char cpp_warn_deprecated; - - /* Nonzero means warn if slash-star appears in a comment. */ - unsigned char warn_comments; - - /* Nonzero means to warn about __DATA__, __TIME__ and __TIMESTAMP__ usage. */ - unsigned char warn_date_time; - - /* Nonzero means warn if a user-supplied include directory does not - exist. */ - unsigned char warn_missing_include_dirs; - - /* Nonzero means warn if there are any trigraphs. */ - unsigned char warn_trigraphs; - - /* Nonzero means warn about multicharacter charconsts. */ - unsigned char warn_multichar; - - /* Nonzero means warn about various incompatibilities with - traditional C. */ - unsigned char cpp_warn_traditional; - - /* Nonzero means warn about long long numeric constants. */ - unsigned char cpp_warn_long_long; - - /* Nonzero means warn about text after an #endif (or #else). */ - unsigned char warn_endif_labels; - - /* Nonzero means warn about implicit sign changes owing to integer - promotions. */ - unsigned char warn_num_sign_change; - - /* Zero means don't warn about __VA_ARGS__ usage in c89 pedantic mode. - Presumably the usage is protected by the appropriate #ifdef. */ - unsigned char warn_variadic_macros; - - /* Nonzero means warn about builtin macros that are redefined or - explicitly undefined. */ - unsigned char warn_builtin_macro_redefined; - - /* Different -Wimplicit-fallthrough= levels. */ - unsigned char cpp_warn_implicit_fallthrough; - - /* Nonzero means we should look for header.gcc files that remap file - names. */ - unsigned char remap; - - /* Zero means dollar signs are punctuation. */ - unsigned char dollars_in_ident; - - /* Nonzero means UCNs are accepted in identifiers. */ - unsigned char extended_identifiers; - - /* True if we should warn about dollars in identifiers or numbers - for this translation unit. */ - unsigned char warn_dollars; - - /* Nonzero means warn if undefined identifiers are evaluated in an #if. */ - unsigned char warn_undef; - - /* Nonzero means warn if "defined" is encountered in a place other than - an #if. */ - unsigned char warn_expansion_to_defined; - - /* Nonzero means warn of unused macros from the main file. */ - unsigned char warn_unused_macros; - - /* Nonzero for the 1999 C Standard, including corrigenda and amendments. */ - unsigned char c99; - - /* Nonzero if we are conforming to a specific C or C++ standard. */ - unsigned char std; - - /* Nonzero means give all the error messages the ANSI standard requires. */ - unsigned char cpp_pedantic; - - /* Nonzero means we're looking at already preprocessed code, so don't - bother trying to do macro expansion and whatnot. */ - unsigned char preprocessed; - - /* Nonzero means we are going to emit debugging logs during - preprocessing. 
*/ - unsigned char debug; - - /* Nonzero means we are tracking locations of tokens involved in - macro expansion. 1 Means we track the location in degraded mode - where we do not track locations of tokens resulting from the - expansion of arguments of function-like macro. 2 Means we do - track all macro expansions. This last option is the one that - consumes the highest amount of memory. */ - unsigned char track_macro_expansion; - - /* Nonzero means handle C++ alternate operator names. */ - unsigned char operator_names; - - /* Nonzero means warn about use of C++ alternate operator names. */ - unsigned char warn_cxx_operator_names; - - /* True for traditional preprocessing. */ - unsigned char traditional; - - /* Nonzero for C++ 2011 Standard user-defined literals. */ - unsigned char user_literals; - - /* Nonzero means warn when a string or character literal is followed by a - ud-suffix which does not beging with an underscore. */ - unsigned char warn_literal_suffix; - - /* Nonzero means interpret imaginary, fixed-point, or other gnu extension - literal number suffixes as user-defined literal number suffixes. */ - unsigned char ext_numeric_literals; - - /* Nonzero means extended identifiers allow the characters specified - in C11. */ - unsigned char c11_identifiers; - - /* Nonzero for C++ 2014 Standard binary constants. */ - unsigned char binary_constants; - - /* Nonzero for C++ 2014 Standard digit separators. */ - unsigned char digit_separators; - - /* Nonzero for C2X decimal floating-point constants. */ - unsigned char dfp_constants; - - /* Nonzero for C++20 __VA_OPT__ feature. */ - unsigned char va_opt; - - /* Nonzero for the '::' token. */ - unsigned char scope; - - /* Nonzero for the '#elifdef' and '#elifndef' directives. */ - unsigned char elifdef; - - /* Nonzero means tokenize C++20 module directives. */ - unsigned char module_directives; - - /* Nonzero for C++23 size_t literals. */ - unsigned char size_t_literals; - - /* Holds the name of the target (execution) character set. */ - const char *narrow_charset; - - /* Holds the name of the target wide character set. */ - const char *wide_charset; - - /* Holds the name of the input character set. */ - const char *input_charset; - - /* The minimum permitted level of normalization before a warning - is generated. See enum cpp_normalize_level. */ - int warn_normalize; - - /* True to warn about precompiled header files we couldn't use. */ - bool warn_invalid_pch; - - /* True if dependencies should be restored from a precompiled header. */ - bool restore_pch_deps; - - /* True if warn about differences between C90 and C99. */ - signed char cpp_warn_c90_c99_compat; - - /* True if warn about differences between C11 and C2X. */ - signed char cpp_warn_c11_c2x_compat; - - /* True if warn about differences between C++98 and C++11. */ - bool cpp_warn_cxx11_compat; - - /* Nonzero if bidirectional control characters checking is on. See enum - cpp_bidirectional_level. */ - unsigned char cpp_warn_bidirectional; - - /* Dependency generation. */ - struct - { - /* Style of header dependencies to generate. */ - enum cpp_deps_style style; - - /* Assume missing files are generated files. */ - bool missing_files; - - /* Generate phony targets for each dependency apart from the first - one. */ - bool phony_targets; - - /* Generate dependency info for modules. */ - bool modules; - - /* If true, no dependency is generated on the main file. 
*/ - bool ignore_main_file; - - /* If true, intend to use the preprocessor output (e.g., for compilation) - in addition to the dependency info. */ - bool need_preprocessor_output; - } deps; - - /* Target-specific features set by the front end or client. */ - - /* Precision for target CPP arithmetic, target characters, target - ints and target wide characters, respectively. */ - size_t precision, char_precision, int_precision, wchar_precision; - - /* True means chars (wide chars) are unsigned. */ - bool unsigned_char, unsigned_wchar; - - /* True if the most significant byte in a word has the lowest - address in memory. */ - bool bytes_big_endian; - - /* Nonzero means __STDC__ should have the value 0 in system headers. */ - unsigned char stdc_0_in_system_headers; - - /* True disables tokenization outside of preprocessing directives. */ - bool directives_only; - - /* True enables canonicalization of system header file paths. */ - bool canonical_system_headers; - - /* The maximum depth of the nested #include. */ - unsigned int max_include_depth; - - cpp_main_search main_search : 8; -}; - -/* Diagnostic levels. To get a diagnostic without associating a - position in the translation unit with it, use cpp_error_with_line - with a line number of zero. */ - -enum cpp_diagnostic_level { - /* Warning, an error with -Werror. */ - CPP_DL_WARNING = 0, - /* Same as CPP_DL_WARNING, except it is not suppressed in system headers. */ - CPP_DL_WARNING_SYSHDR, - /* Warning, an error with -pedantic-errors or -Werror. */ - CPP_DL_PEDWARN, - /* An error. */ - CPP_DL_ERROR, - /* An internal consistency check failed. Prints "internal error: ", - otherwise the same as CPP_DL_ERROR. */ - CPP_DL_ICE, - /* An informative note following a warning. */ - CPP_DL_NOTE, - /* A fatal error. */ - CPP_DL_FATAL -}; - -/* Warning reason codes. Use a reason code of CPP_W_NONE for unclassified - warnings and diagnostics that are not warnings. */ - -enum cpp_warning_reason { - CPP_W_NONE = 0, - CPP_W_DEPRECATED, - CPP_W_COMMENTS, - CPP_W_MISSING_INCLUDE_DIRS, - CPP_W_TRIGRAPHS, - CPP_W_MULTICHAR, - CPP_W_TRADITIONAL, - CPP_W_LONG_LONG, - CPP_W_ENDIF_LABELS, - CPP_W_NUM_SIGN_CHANGE, - CPP_W_VARIADIC_MACROS, - CPP_W_BUILTIN_MACRO_REDEFINED, - CPP_W_DOLLARS, - CPP_W_UNDEF, - CPP_W_UNUSED_MACROS, - CPP_W_CXX_OPERATOR_NAMES, - CPP_W_NORMALIZE, - CPP_W_INVALID_PCH, - CPP_W_WARNING_DIRECTIVE, - CPP_W_LITERAL_SUFFIX, - CPP_W_SIZE_T_LITERALS, - CPP_W_DATE_TIME, - CPP_W_PEDANTIC, - CPP_W_C90_C99_COMPAT, - CPP_W_C11_C2X_COMPAT, - CPP_W_CXX11_COMPAT, - CPP_W_EXPANSION_TO_DEFINED, - CPP_W_BIDIRECTIONAL -}; - -/* Callback for header lookup for HEADER, which is the name of a - source file. It is used as a method of last resort to find headers - that are not otherwise found during the normal include processing. - The return value is the malloced name of a header to try and open, - if any, or NULL otherwise. This callback is called only if the - header is otherwise unfound. */ -typedef const char *(*missing_header_cb)(cpp_reader *, const char *header, cpp_dir **); - -/* Call backs to cpplib client. */ -struct cpp_callbacks -{ - /* Called when a new line of preprocessed output is started. */ - void (*line_change) (cpp_reader *, const cpp_token *, int); - - /* Called when switching to/from a new file. - The line_map is for the new file. It is NULL if there is no new file. - (In C this happens when done with + and also - when done with a main file.) This can be used for resource cleanup. 
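To make the use of these option flags concrete, here is a minimal, hypothetical sketch of how a client front end might enable a few warnings on an existing reader through cpp_get_options, which is declared further down in this header; the helper name and the particular options chosen are assumptions.

/* Hypothetical sketch: turn on a few warnings for an existing reader.  */
static void
enable_extra_warnings (cpp_reader *pfile)
{
  cpp_options *opts = cpp_get_options (pfile);
  opts->warn_comments = 1;        /* warn about slash-star within a comment */
  opts->warn_undef = 1;           /* warn about undefined identifiers in #if */
  opts->warn_unused_macros = 1;   /* warn about unused macros in the main file */
}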
*/ - void (*file_change) (cpp_reader *, const line_map_ordinary *); - - void (*dir_change) (cpp_reader *, const char *); - void (*include) (cpp_reader *, location_t, const unsigned char *, - const char *, int, const cpp_token **); - void (*define) (cpp_reader *, location_t, cpp_hashnode *); - void (*undef) (cpp_reader *, location_t, cpp_hashnode *); - void (*ident) (cpp_reader *, location_t, const cpp_string *); - void (*def_pragma) (cpp_reader *, location_t); - int (*valid_pch) (cpp_reader *, const char *, int); - void (*read_pch) (cpp_reader *, const char *, int, const char *); - missing_header_cb missing_header; - - /* Context-sensitive macro support. Returns macro (if any) that should - be expanded. */ - cpp_hashnode * (*macro_to_expand) (cpp_reader *, const cpp_token *); - - /* Called to emit a diagnostic. This callback receives the - translated message. */ - bool (*diagnostic) (cpp_reader *, - enum cpp_diagnostic_level, - enum cpp_warning_reason, - rich_location *, - const char *, va_list *) - ATTRIBUTE_FPTR_PRINTF(5,0); - - /* Callbacks for when a macro is expanded, or tested (whether - defined or not at the time) in #ifdef, #ifndef or "defined". */ - void (*used_define) (cpp_reader *, location_t, cpp_hashnode *); - void (*used_undef) (cpp_reader *, location_t, cpp_hashnode *); - /* Called before #define and #undef or other macro definition - changes are processed. */ - void (*before_define) (cpp_reader *); - /* Called whenever a macro is expanded or tested. - Second argument is the location of the start of the current expansion. */ - void (*used) (cpp_reader *, location_t, cpp_hashnode *); - - /* Callback to identify whether an attribute exists. */ - int (*has_attribute) (cpp_reader *, bool); - - /* Callback to determine whether a built-in function is recognized. */ - int (*has_builtin) (cpp_reader *); - - /* Callback that can change a user lazy into normal macro. */ - void (*user_lazy_macro) (cpp_reader *, cpp_macro *, unsigned); - - /* Callback to handle deferred cpp_macros. */ - cpp_macro *(*user_deferred_macro) (cpp_reader *, location_t, cpp_hashnode *); - - /* Callback to parse SOURCE_DATE_EPOCH from environment. */ - time_t (*get_source_date_epoch) (cpp_reader *); - - /* Callback for providing suggestions for misspelled directives. */ - const char *(*get_suggestion) (cpp_reader *, const char *, const char *const *); - - /* Callback for when a comment is encountered, giving the location - of the opening slash, a pointer to the content (which is not - necessarily 0-terminated), and the length of the content. - The content contains the opening slash-star (or slash-slash), - and for C-style comments contains the closing star-slash. For - C++-style comments it does not include the terminating newline. */ - void (*comment) (cpp_reader *, location_t, const unsigned char *, - size_t); - - /* Callback for filename remapping in __FILE__ and __BASE_FILE__ macro - expansions. */ - const char *(*remap_filename) (const char*); - - /* Maybe translate a #include into something else. Return a - cpp_buffer containing the translation if translating. */ - char *(*translate_include) (cpp_reader *, line_maps *, location_t, - const char *path); -}; - -#ifdef VMS -#define INO_T_CPP ino_t ino[3] -#elif defined (_AIX) && SIZEOF_INO_T == 4 -#define INO_T_CPP ino64_t ino -#else -#define INO_T_CPP ino_t ino -#endif - -#if defined (_AIX) && SIZEOF_DEV_T == 4 -#define DEV_T_CPP dev64_t dev -#else -#define DEV_T_CPP dev_t dev -#endif - -/* Chain of directories to look for include files in. 
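The callback structure above is normally filled in by the client after the reader is created. The following hypothetical sketch logs every macro definition through the define hook; the helper names are assumptions, it presumes stdio is available, and the cast reflects that NODE_NAME (defined later in this header) yields unsigned characters.

/* Hypothetical sketch: log each #define the preprocessor sees.  */
static void
on_define (cpp_reader *pfile, location_t loc, cpp_hashnode *node)
{
  (void) pfile;
  (void) loc;
  fprintf (stderr, "defined: %s\n", (const char *) NODE_NAME (node));
}

static void
install_callbacks (cpp_reader *pfile)
{
  cpp_callbacks *cb = cpp_get_callbacks (pfile);
  cb->define = on_define;
}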
*/ -struct cpp_dir -{ - /* NULL-terminated singly-linked list. */ - struct cpp_dir *next; - - /* NAME of the directory, NUL-terminated. */ - char *name; - unsigned int len; - - /* One if a system header, two if a system header that has extern - "C" guards for C++. */ - unsigned char sysp; - - /* Is this a user-supplied directory? */ - bool user_supplied_p; - - /* The canonicalized NAME as determined by lrealpath. This field - is only used by hosts that lack reliable inode numbers. */ - char *canonical_name; - - /* Mapping of file names for this directory for MS-DOS and related - platforms. A NULL-terminated array of (from, to) pairs. */ - const char **name_map; - - /* Routine to construct pathname, given the search path name and the - HEADER we are trying to find, return a constructed pathname to - try and open. If this is NULL, the constructed pathname is as - constructed by append_file_to_dir. */ - char *(*construct) (const char *header, cpp_dir *dir); - - /* The C front end uses these to recognize duplicated - directories in the search path. */ - INO_T_CPP; - DEV_T_CPP; -}; - -/* The kind of the cpp_macro. */ -enum cpp_macro_kind { - cmk_macro, /* An ISO macro (token expansion). */ - cmk_assert, /* An assertion. */ - cmk_traditional /* A traditional macro (text expansion). */ -}; - -/* Each macro definition is recorded in a cpp_macro structure. - Variadic macros cannot occur with traditional cpp. */ -struct GTY(()) cpp_macro { - union cpp_parm_u - { - /* Parameters, if any. If parameter names use extended identifiers, - the original spelling of those identifiers, not the canonical - UTF-8 spelling, goes here. */ - cpp_hashnode ** GTY ((tag ("false"), - nested_ptr (union tree_node, - "%h ? CPP_HASHNODE (GCC_IDENT_TO_HT_IDENT (%h)) : NULL", - "%h ? HT_IDENT_TO_GCC_IDENT (HT_NODE (%h)) : NULL"), - length ("%1.paramc"))) params; - - /* If this is an assertion, the next one in the chain. */ - cpp_macro *GTY ((tag ("true"))) next; - } GTY ((desc ("%1.kind == cmk_assert"))) parm; - - /* Definition line number. */ - location_t line; - - /* Number of tokens in body, or bytes for traditional macros. */ - /* Do we really need 2^32-1 range here? */ - unsigned int count; - - /* Number of parameters. */ - unsigned short paramc; - - /* Non-zero if this is a user-lazy macro, value provided by user. */ - unsigned char lazy; - - /* The kind of this macro (ISO, trad or assert) */ - unsigned kind : 2; - - /* If a function-like macro. */ - unsigned int fun_like : 1; - - /* If a variadic macro. */ - unsigned int variadic : 1; - - /* If macro defined in system header. */ - unsigned int syshdr : 1; - - /* Nonzero if it has been expanded or had its existence tested. */ - unsigned int used : 1; - - /* Indicate whether the tokens include extra CPP_PASTE tokens at the - end to track invalid redefinitions with consecutive CPP_PASTE - tokens. */ - unsigned int extra_tokens : 1; - - /* Imported C++20 macro (from a header unit). */ - unsigned int imported_p : 1; - - /* 0 bits spare (32-bit). 32 on 64-bit target. */ - - union cpp_exp_u - { - /* Trailing array of replacement tokens (ISO), or assertion body value. */ - cpp_token GTY ((tag ("false"), length ("%1.count"))) tokens[1]; - - /* Pointer to replacement text (traditional). See comment at top - of cpptrad.c for how traditional function-like macros are - encoded. */ - const unsigned char *GTY ((tag ("true"))) text; - } GTY ((desc ("%1.kind == cmk_traditional"))) exp; -}; - -/* Poisoned identifiers are flagged NODE_POISONED. 
NODE_OPERATOR (C++ - only) indicates an identifier that behaves like an operator such as - "xor". NODE_DIAGNOSTIC is for speed in lex_token: it indicates a - diagnostic may be required for this node. Currently this only - applies to __VA_ARGS__, poisoned identifiers, and -Wc++-compat - warnings about NODE_OPERATOR. */ - -/* Hash node flags. */ -#define NODE_OPERATOR (1 << 0) /* C++ named operator. */ -#define NODE_POISONED (1 << 1) /* Poisoned identifier. */ -#define NODE_DIAGNOSTIC (1 << 2) /* Possible diagnostic when lexed. */ -#define NODE_WARN (1 << 3) /* Warn if redefined or undefined. */ -#define NODE_DISABLED (1 << 4) /* A disabled macro. */ -#define NODE_USED (1 << 5) /* Dumped with -dU. */ -#define NODE_CONDITIONAL (1 << 6) /* Conditional macro */ -#define NODE_WARN_OPERATOR (1 << 7) /* Warn about C++ named operator. */ -#define NODE_MODULE (1 << 8) /* C++-20 module-related name. */ - -/* Different flavors of hash node. */ -enum node_type -{ - NT_VOID = 0, /* Maybe an assert? */ - NT_MACRO_ARG, /* A macro arg. */ - NT_USER_MACRO, /* A user macro. */ - NT_BUILTIN_MACRO, /* A builtin macro. */ - NT_MACRO_MASK = NT_USER_MACRO /* Mask for either macro kind. */ -}; - -/* Different flavors of builtin macro. _Pragma is an operator, but we - handle it with the builtin code for efficiency reasons. */ -enum cpp_builtin_type -{ - BT_SPECLINE = 0, /* `__LINE__' */ - BT_DATE, /* `__DATE__' */ - BT_FILE, /* `__FILE__' */ - BT_FILE_NAME, /* `__FILE_NAME__' */ - BT_BASE_FILE, /* `__BASE_FILE__' */ - BT_INCLUDE_LEVEL, /* `__INCLUDE_LEVEL__' */ - BT_TIME, /* `__TIME__' */ - BT_STDC, /* `__STDC__' */ - BT_PRAGMA, /* `_Pragma' operator */ - BT_TIMESTAMP, /* `__TIMESTAMP__' */ - BT_COUNTER, /* `__COUNTER__' */ - BT_HAS_ATTRIBUTE, /* `__has_attribute(x)' */ - BT_HAS_STD_ATTRIBUTE, /* `__has_c_attribute(x)' */ - BT_HAS_BUILTIN, /* `__has_builtin(x)' */ - BT_HAS_INCLUDE, /* `__has_include(x)' */ - BT_HAS_INCLUDE_NEXT, /* `__has_include_next(x)' */ - - // RT Extension - BT_RT_ASSIGN, - BT_RT_TO_ARG_LIST, - BT_RT_TO_TOKEN_LIST, - BT_RT_FIRST, - BT_RT_REST, - BT_RT_MAP, - BT_RT_AL_MAP, - BT_RT_IF, - BT_RT_NOT, - BT_RT_AND, - BT_RT_OR, - BT_RT_IS_IDENTIFIER, - BT_RT_IS_NAME, - BT_RT_PASTE -}; - -#define CPP_HASHNODE(HNODE) ((cpp_hashnode *) (HNODE)) -#define HT_NODE(NODE) (&(NODE)->ident) -#define NODE_LEN(NODE) HT_LEN (HT_NODE (NODE)) -#define NODE_NAME(NODE) HT_STR (HT_NODE (NODE)) - -/* The common part of an identifier node shared amongst all 3 C front - ends. Also used to store CPP identifiers, which are a superset of - identifiers in the grammatical sense. */ - -union GTY(()) _cpp_hashnode_value { - /* Assert (maybe NULL) */ - cpp_macro * GTY((tag ("NT_VOID"))) answers; - /* Macro (maybe NULL) */ - cpp_macro * GTY((tag ("NT_USER_MACRO"))) macro; - /* Code for a builtin macro. */ - enum cpp_builtin_type GTY ((tag ("NT_BUILTIN_MACRO"))) builtin; - /* Macro argument index. */ - unsigned short GTY ((tag ("NT_MACRO_ARG"))) arg_index; -}; - -struct GTY(()) cpp_hashnode { - struct ht_identifier ident; - unsigned int is_directive : 1; - unsigned int directive_index : 7; /* If is_directive, - then index into directive table. - Otherwise, a NODE_OPERATOR. */ - unsigned int rid_code : 8; /* Rid code - for front ends. */ - unsigned int flags : 9; /* CPP flags. */ - ENUM_BITFIELD(node_type) type : 2; /* CPP node type. */ - - /* 5 bits spare. */ - - /* The deferred cookie is applicable to NT_USER_MACRO or NT_VOID. - The latter for when a macro had a prevailing undef. 
- On a 64-bit system there would be 32-bits of padding to the value - field. So placing the deferred index here is not costly. */ - unsigned deferred; /* Deferred cookie */ - - union _cpp_hashnode_value GTY ((desc ("%1.type"))) value; -}; - -/* A class for iterating through the source locations within a - string token (before escapes are interpreted, and before - concatenation). */ - -class cpp_string_location_reader { - public: - cpp_string_location_reader (location_t src_loc, - line_maps *line_table); - - source_range get_next (); - - private: - location_t m_loc; - int m_offset_per_column; -}; - -/* A class for storing the source ranges of all of the characters within - a string literal, after escapes are interpreted, and after - concatenation. - - This is not GTY-marked, as instances are intended to be temporary. */ - -class cpp_substring_ranges -{ - public: - cpp_substring_ranges (); - ~cpp_substring_ranges (); - - int get_num_ranges () const { return m_num_ranges; } - source_range get_range (int idx) const - { - linemap_assert (idx < m_num_ranges); - return m_ranges[idx]; - } - - void add_range (source_range range); - void add_n_ranges (int num, cpp_string_location_reader &loc_reader); - - private: - source_range *m_ranges; - int m_num_ranges; - int m_alloc_ranges; -}; - -/* Call this first to get a handle to pass to other functions. - - If you want cpplib to manage its own hashtable, pass in a NULL - pointer. Otherwise you should pass in an initialized hash table - that cpplib will share; this technique is used by the C front - ends. */ -extern cpp_reader *cpp_create_reader (enum c_lang, struct ht *, - class line_maps *); - -/* Reset the cpp_reader's line_map. This is only used after reading a - PCH file. */ -extern void cpp_set_line_map (cpp_reader *, class line_maps *); - -/* Call this to change the selected language standard (e.g. because of - command line options). */ -extern void cpp_set_lang (cpp_reader *, enum c_lang); - -/* Set the include paths. */ -extern void cpp_set_include_chains (cpp_reader *, cpp_dir *, cpp_dir *, int); - -/* Call these to get pointers to the options, callback, and deps - structures for a given reader. These pointers are good until you - call cpp_finish on that reader. You can either edit the callbacks - through the pointer returned from cpp_get_callbacks, or set them - with cpp_set_callbacks. */ -extern cpp_options *cpp_get_options (cpp_reader *) ATTRIBUTE_PURE; -extern cpp_callbacks *cpp_get_callbacks (cpp_reader *) ATTRIBUTE_PURE; -extern void cpp_set_callbacks (cpp_reader *, cpp_callbacks *); -extern class mkdeps *cpp_get_deps (cpp_reader *) ATTRIBUTE_PURE; - -extern const char *cpp_probe_header_unit (cpp_reader *, const char *file, - bool angle_p, location_t); - -/* Call these to get name data about the various compile-time - charsets. */ -extern const char *cpp_get_narrow_charset_name (cpp_reader *) ATTRIBUTE_PURE; -extern const char *cpp_get_wide_charset_name (cpp_reader *) ATTRIBUTE_PURE; - -/* This function reads the file, but does not start preprocessing. It - returns the name of the original file; this is the same as the - input file, except for preprocessed input. This will generate at - least one file change callback, and possibly a line change callback - too. If there was an error opening the file, it returns NULL. */ -extern const char *cpp_read_main_file (cpp_reader *, const char *, - bool injecting = false); -extern location_t cpp_main_loc (const cpp_reader *); - -/* Adjust for the main file to be an include. 
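Because a hash node records both its node_type and, for built-ins, the cpp_builtin_type code, a client can test whether an identifier names one of the RT built-ins listed above. The sketch below is hypothetical: it relies on cpp_lookup, declared further down in this header, and assumes BT_RT_ASSIGN and BT_RT_PASTE remain the first and last RT entries of the enum.

/* Hypothetical sketch: does IDENT (LEN bytes) name an RT built-in macro?  */
static bool
is_rt_builtin (cpp_reader *pfile, const unsigned char *ident, unsigned int len)
{
  cpp_hashnode *node = cpp_lookup (pfile, ident, len);
  return node->type == NT_BUILTIN_MACRO
         && node->value.builtin >= BT_RT_ASSIGN
         && node->value.builtin <= BT_RT_PASTE;
}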
*/ -extern void cpp_retrofit_as_include (cpp_reader *); - -/* Set up built-ins with special behavior. Use cpp_init_builtins() - instead unless your know what you are doing. */ -extern void cpp_init_special_builtins (cpp_reader *); - -/* Set up built-ins like __FILE__. */ -extern void cpp_init_builtins (cpp_reader *, int); - -/* This is called after options have been parsed, and partially - processed. */ -extern void cpp_post_options (cpp_reader *); - -/* Set up translation to the target character set. */ -extern void cpp_init_iconv (cpp_reader *); - -/* Call this to finish preprocessing. If you requested dependency - generation, pass an open stream to write the information to, - otherwise NULL. It is your responsibility to close the stream. */ -extern void cpp_finish (cpp_reader *, FILE *deps_stream); - -/* Call this to release the handle at the end of preprocessing. Any - use of the handle after this function returns is invalid. */ -extern void cpp_destroy (cpp_reader *); - -extern unsigned int cpp_token_len (const cpp_token *); -extern unsigned char *cpp_token_as_text (cpp_reader *, const cpp_token *); -extern unsigned char *cpp_spell_token (cpp_reader *, const cpp_token *, - unsigned char *, bool); -extern void cpp_register_pragma (cpp_reader *, const char *, const char *, - void (*) (cpp_reader *), bool); -extern void cpp_register_deferred_pragma (cpp_reader *, const char *, - const char *, unsigned, bool, bool); -extern int cpp_avoid_paste (cpp_reader *, const cpp_token *, - const cpp_token *); -extern const cpp_token *cpp_get_token (cpp_reader *); -extern const cpp_token *cpp_get_token_with_location (cpp_reader *, - location_t *); -inline bool cpp_user_macro_p (const cpp_hashnode *node) -{ - return node->type == NT_USER_MACRO; -} -inline bool cpp_builtin_macro_p (const cpp_hashnode *node) -{ - return node->type == NT_BUILTIN_MACRO; -} -inline bool cpp_macro_p (const cpp_hashnode *node) -{ - return node->type & NT_MACRO_MASK; -} -inline cpp_macro *cpp_set_deferred_macro (cpp_hashnode *node, - cpp_macro *forced = NULL) -{ - cpp_macro *old = node->value.macro; - - node->value.macro = forced; - node->type = NT_USER_MACRO; - node->flags &= ~NODE_USED; - - return old; -} -cpp_macro *cpp_get_deferred_macro (cpp_reader *, cpp_hashnode *, location_t); - -/* Returns true if NODE is a function-like user macro. */ -inline bool cpp_fun_like_macro_p (cpp_hashnode *node) -{ - return cpp_user_macro_p (node) && node->value.macro->fun_like; -} - -extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *); -extern const unsigned char *cpp_macro_definition (cpp_reader *, cpp_hashnode *, - const cpp_macro *); -inline location_t cpp_macro_definition_location (cpp_hashnode *node) -{ - const cpp_macro *macro = node->value.macro; - return macro ? macro->line : 0; -} -/* Return an idempotent time stamp (possibly from SOURCE_DATE_EPOCH). */ -enum class CPP_time_kind -{ - FIXED = -1, /* Fixed time via source epoch. */ - DYNAMIC = -2, /* Dynamic via time(2). */ - UNKNOWN = -3 /* Wibbly wobbly, timey wimey. */ -}; -extern CPP_time_kind cpp_get_date (cpp_reader *, time_t *); - -extern void _cpp_backup_tokens (cpp_reader *, unsigned int); -extern const cpp_token *cpp_peek_token (cpp_reader *, int); - -/* Evaluate a CPP_*CHAR* token. */ -extern cppchar_t cpp_interpret_charconst (cpp_reader *, const cpp_token *, - unsigned int *, int *); -/* Evaluate a vector of CPP_*STRING* tokens. 
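Taken together, the creation, input, token, and teardown entry points above are enough for a very small driver. The sketch below is hypothetical: it assumes the caller already owns an initialised line_maps instance, picks CLK_GNUC17 arbitrarily, and simply discards every token up to end of file.

/* Hypothetical sketch: tokenize PATH to EOF, then tear the reader down.  */
static void
scan_file (const char *path, class line_maps *line_table)
{
  cpp_reader *reader = cpp_create_reader (CLK_GNUC17, NULL, line_table);
  if (cpp_read_main_file (reader, path) == NULL)
    {
      cpp_destroy (reader);
      return;
    }
  const cpp_token *tok;
  do
    tok = cpp_get_token (reader);
  while (tok->type != CPP_EOF);
  cpp_finish (reader, NULL);    /* no dependency stream requested */
  cpp_destroy (reader);
}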
*/ -extern bool cpp_interpret_string (cpp_reader *, - const cpp_string *, size_t, - cpp_string *, enum cpp_ttype); -extern const char *cpp_interpret_string_ranges (cpp_reader *pfile, - const cpp_string *from, - cpp_string_location_reader *, - size_t count, - cpp_substring_ranges *out, - enum cpp_ttype type); -extern bool cpp_interpret_string_notranslate (cpp_reader *, - const cpp_string *, size_t, - cpp_string *, enum cpp_ttype); - -/* Convert a host character constant to the execution character set. */ -extern cppchar_t cpp_host_to_exec_charset (cpp_reader *, cppchar_t); - -/* Used to register macros and assertions, perhaps from the command line. - The text is the same as the command line argument. */ -extern void cpp_define (cpp_reader *, const char *); -extern void cpp_define_unused (cpp_reader *, const char *); -extern void cpp_define_formatted (cpp_reader *pfile, - const char *fmt, ...) ATTRIBUTE_PRINTF_2; -extern void cpp_define_formatted_unused (cpp_reader *pfile, - const char *fmt, - ...) ATTRIBUTE_PRINTF_2; -extern void cpp_assert (cpp_reader *, const char *); -extern void cpp_undef (cpp_reader *, const char *); -extern void cpp_unassert (cpp_reader *, const char *); - -/* Mark a node as a lazily defined macro. */ -extern void cpp_define_lazily (cpp_reader *, cpp_hashnode *node, unsigned N); - -/* Undefine all macros and assertions. */ -extern void cpp_undef_all (cpp_reader *); - -extern cpp_buffer *cpp_push_buffer (cpp_reader *, const unsigned char *, - size_t, int); -extern int cpp_defined (cpp_reader *, const unsigned char *, int); - -/* A preprocessing number. Code assumes that any unused high bits of - the double integer are set to zero. */ - -/* This type has to be equal to unsigned HOST_WIDE_INT, see - gcc/c-family/c-lex.cc. */ -typedef uint64_t cpp_num_part; -typedef struct cpp_num cpp_num; -struct cpp_num -{ - cpp_num_part high; - cpp_num_part low; - bool unsignedp; /* True if value should be treated as unsigned. */ - bool overflow; /* True if the most recent calculation overflowed. */ -}; - -/* cpplib provides two interfaces for interpretation of preprocessing - numbers. - - cpp_classify_number categorizes numeric constants according to - their field (integer, floating point, or invalid), radix (decimal, - octal, hexadecimal), and type suffixes. */ - -#define CPP_N_CATEGORY 0x000F -#define CPP_N_INVALID 0x0000 -#define CPP_N_INTEGER 0x0001 -#define CPP_N_FLOATING 0x0002 - -#define CPP_N_WIDTH 0x00F0 -#define CPP_N_SMALL 0x0010 /* int, float, short _Fract/Accum */ -#define CPP_N_MEDIUM 0x0020 /* long, double, long _Fract/_Accum. */ -#define CPP_N_LARGE 0x0040 /* long long, long double, - long long _Fract/Accum. */ - -#define CPP_N_WIDTH_MD 0xF0000 /* machine defined. */ -#define CPP_N_MD_W 0x10000 -#define CPP_N_MD_Q 0x20000 - -#define CPP_N_RADIX 0x0F00 -#define CPP_N_DECIMAL 0x0100 -#define CPP_N_HEX 0x0200 -#define CPP_N_OCTAL 0x0400 -#define CPP_N_BINARY 0x0800 - -#define CPP_N_UNSIGNED 0x1000 /* Properties. */ -#define CPP_N_IMAGINARY 0x2000 -#define CPP_N_DFLOAT 0x4000 -#define CPP_N_DEFAULT 0x8000 - -#define CPP_N_FRACT 0x100000 /* Fract types. */ -#define CPP_N_ACCUM 0x200000 /* Accum types. */ -#define CPP_N_FLOATN 0x400000 /* _FloatN types. */ -#define CPP_N_FLOATNX 0x800000 /* _FloatNx types. */ - -#define CPP_N_USERDEF 0x1000000 /* C++11 user-defined literal. */ - -#define CPP_N_SIZE_T 0x2000000 /* C++23 size_t literal. */ - -#define CPP_N_WIDTH_FLOATN_NX 0xF0000000 /* _FloatN / _FloatNx value - of N, divided by 16. 
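The cpp_define and cpp_undef entry points above take the same text as the corresponding command-line arguments. A brief, hypothetical sketch; the helper name and the macro names are assumptions:

/* Hypothetical sketch: mirror -DDEBUG=1 and -UNDEBUG on an existing reader.  */
static void
apply_command_line_macros (cpp_reader *pfile)
{
  cpp_define (pfile, "DEBUG=1");
  cpp_undef (pfile, "NDEBUG");
}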
*/ -#define CPP_FLOATN_SHIFT 24 -#define CPP_FLOATN_MAX 0xF0 - -/* Classify a CPP_NUMBER token. The return value is a combination of - the flags from the above sets. */ -extern unsigned cpp_classify_number (cpp_reader *, const cpp_token *, - const char **, location_t); - -/* Return the classification flags for a float suffix. */ -extern unsigned int cpp_interpret_float_suffix (cpp_reader *, const char *, - size_t); - -/* Return the classification flags for an int suffix. */ -extern unsigned int cpp_interpret_int_suffix (cpp_reader *, const char *, - size_t); - -/* Evaluate a token classified as category CPP_N_INTEGER. */ -extern cpp_num cpp_interpret_integer (cpp_reader *, const cpp_token *, - unsigned int); - -/* Sign extend a number, with PRECISION significant bits and all - others assumed clear, to fill out a cpp_num structure. */ -cpp_num cpp_num_sign_extend (cpp_num, size_t); - -/* Output a diagnostic of some kind. */ -extern bool cpp_error (cpp_reader *, enum cpp_diagnostic_level, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; -extern bool cpp_warning (cpp_reader *, enum cpp_warning_reason, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; -extern bool cpp_pedwarning (cpp_reader *, enum cpp_warning_reason, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; -extern bool cpp_warning_syshdr (cpp_reader *, enum cpp_warning_reason reason, - const char *msgid, ...) - ATTRIBUTE_PRINTF_3; - -/* As their counterparts above, but use RICHLOC. */ -extern bool cpp_warning_at (cpp_reader *, enum cpp_warning_reason, - rich_location *richloc, const char *msgid, ...) - ATTRIBUTE_PRINTF_4; -extern bool cpp_pedwarning_at (cpp_reader *, enum cpp_warning_reason, - rich_location *richloc, const char *msgid, ...) - ATTRIBUTE_PRINTF_4; - -/* Output a diagnostic with "MSGID: " preceding the - error string of errno. No location is printed. */ -extern bool cpp_errno (cpp_reader *, enum cpp_diagnostic_level, - const char *msgid); -/* Similarly, but with "FILENAME: " instead of "MSGID: ", where - the filename is not localized. */ -extern bool cpp_errno_filename (cpp_reader *, enum cpp_diagnostic_level, - const char *filename, location_t loc); - -/* Same as cpp_error, except additionally specifies a position as a - (translation unit) physical line and physical column. If the line is - zero, then no location is printed. */ -extern bool cpp_error_with_line (cpp_reader *, enum cpp_diagnostic_level, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; -extern bool cpp_warning_with_line (cpp_reader *, enum cpp_warning_reason, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; -extern bool cpp_pedwarning_with_line (cpp_reader *, enum cpp_warning_reason, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; -extern bool cpp_warning_with_line_syshdr (cpp_reader *, enum cpp_warning_reason, - location_t, unsigned, - const char *msgid, ...) - ATTRIBUTE_PRINTF_5; - -extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, - location_t src_loc, const char *msgid, ...) - ATTRIBUTE_PRINTF_4; - -extern bool cpp_error_at (cpp_reader * pfile, enum cpp_diagnostic_level, - rich_location *richloc, const char *msgid, ...) 
- ATTRIBUTE_PRINTF_4; - -/* In lex.cc */ -extern int cpp_ideq (const cpp_token *, const char *); -extern void cpp_output_line (cpp_reader *, FILE *); -extern unsigned char *cpp_output_line_to_string (cpp_reader *, - const unsigned char *); -extern const unsigned char *cpp_alloc_token_string - (cpp_reader *, const unsigned char *, unsigned); -extern void cpp_output_token (const cpp_token *, FILE *); -extern const char *cpp_type2name (enum cpp_ttype, unsigned char flags); -/* Returns the value of an escape sequence, truncated to the correct - target precision. PSTR points to the input pointer, which is just - after the backslash. LIMIT is how much text we have. WIDE is true - if the escape sequence is part of a wide character constant or - string literal. Handles all relevant diagnostics. */ -extern cppchar_t cpp_parse_escape (cpp_reader *, const unsigned char ** pstr, - const unsigned char *limit, int wide); - -/* Structure used to hold a comment block at a given location in the - source code. */ - -typedef struct -{ - /* Text of the comment including the terminators. */ - char *comment; - - /* source location for the given comment. */ - location_t sloc; -} cpp_comment; - -/* Structure holding all comments for a given cpp_reader. */ - -typedef struct -{ - /* table of comment entries. */ - cpp_comment *entries; - - /* number of actual entries entered in the table. */ - int count; - - /* number of entries allocated currently. */ - int allocated; -} cpp_comment_table; - -/* Returns the table of comments encountered by the preprocessor. This - table is only populated when pfile->state.save_comments is true. */ -extern cpp_comment_table *cpp_get_comments (cpp_reader *); - -/* In hash.c */ - -/* Lookup an identifier in the hashtable. Puts the identifier in the - table if it is not already there. */ -extern cpp_hashnode *cpp_lookup (cpp_reader *, const unsigned char *, - unsigned int); - -typedef int (*cpp_cb) (cpp_reader *, cpp_hashnode *, void *); -extern void cpp_forall_identifiers (cpp_reader *, cpp_cb, void *); - -/* In macro.cc */ -extern void cpp_scan_nooutput (cpp_reader *); -extern int cpp_sys_macro_p (cpp_reader *); -extern unsigned char *cpp_quote_string (unsigned char *, const unsigned char *, - unsigned int); -extern bool cpp_compare_macros (const cpp_macro *macro1, - const cpp_macro *macro2); - -/* In files.cc */ -extern bool cpp_included (cpp_reader *, const char *); -extern bool cpp_included_before (cpp_reader *, const char *, location_t); -extern void cpp_make_system_header (cpp_reader *, int, int); -extern bool cpp_push_include (cpp_reader *, const char *); -extern bool cpp_push_default_include (cpp_reader *, const char *); -extern void cpp_change_file (cpp_reader *, enum lc_reason, const char *); -extern const char *cpp_get_path (struct _cpp_file *); -extern cpp_dir *cpp_get_dir (struct _cpp_file *); -extern cpp_buffer *cpp_get_buffer (cpp_reader *); -extern struct _cpp_file *cpp_get_file (cpp_buffer *); -extern cpp_buffer *cpp_get_prev (cpp_buffer *); -extern void cpp_clear_file_cache (cpp_reader *); - -/* cpp_get_converted_source returns the contents of the given file, as it exists - after cpplib has read it and converted it from the input charset to the - source charset. Return struct will be zero-filled if the data could not be - read for any reason. The data starts at the DATA pointer, but the TO_FREE - pointer is what should be passed to free(), as there may be an offset. 
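The CPP_N_* classification flags combine with cpp_classify_number and cpp_interpret_integer, both declared above, in the expected way. The following hypothetical sketch evaluates an integer token and reports overflow through the diagnostic interface; the helper name and the choice of CPP_DL_PEDWARN are assumptions.

/* Hypothetical sketch: evaluate TOK when it is an integer constant.  */
static void
note_integer_token (cpp_reader *pfile, const cpp_token *tok, location_t loc)
{
  const char *suffix = NULL;
  unsigned int flags = cpp_classify_number (pfile, tok, &suffix, loc);
  if ((flags & CPP_N_CATEGORY) == CPP_N_INTEGER)
    {
      cpp_num value = cpp_interpret_integer (pfile, tok, flags);
      if (value.overflow)
        cpp_error (pfile, CPP_DL_PEDWARN, "integer constant is too large");
    }
}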
*/ -struct cpp_converted_source -{ - char *to_free; - char *data; - size_t len; -}; -cpp_converted_source cpp_get_converted_source (const char *fname, - const char *input_charset); - -/* In pch.cc */ -struct save_macro_data; -extern int cpp_save_state (cpp_reader *, FILE *); -extern int cpp_write_pch_deps (cpp_reader *, FILE *); -extern int cpp_write_pch_state (cpp_reader *, FILE *); -extern int cpp_valid_state (cpp_reader *, const char *, int); -extern void cpp_prepare_state (cpp_reader *, struct save_macro_data **); -extern int cpp_read_state (cpp_reader *, const char *, FILE *, - struct save_macro_data *); - -/* In lex.cc */ -extern void cpp_force_token_locations (cpp_reader *, location_t); -extern void cpp_stop_forcing_token_locations (cpp_reader *); -enum CPP_DO_task -{ - CPP_DO_print, - CPP_DO_location, - CPP_DO_token -}; - -extern void cpp_directive_only_process (cpp_reader *pfile, - void *data, - void (*cb) (cpp_reader *, - CPP_DO_task, - void *data, ...)); - -/* In expr.cc */ -extern enum cpp_ttype cpp_userdef_string_remove_type - (enum cpp_ttype type); -extern enum cpp_ttype cpp_userdef_string_add_type - (enum cpp_ttype type); -extern enum cpp_ttype cpp_userdef_char_remove_type - (enum cpp_ttype type); -extern enum cpp_ttype cpp_userdef_char_add_type - (enum cpp_ttype type); -extern bool cpp_userdef_string_p - (enum cpp_ttype type); -extern bool cpp_userdef_char_p - (enum cpp_ttype type); -extern const char * cpp_get_userdef_suffix - (const cpp_token *); - -/* In charset.cc */ - -/* The result of attempting to decode a run of UTF-8 bytes. */ - -struct cpp_decoded_char -{ - const char *m_start_byte; - const char *m_next_byte; - - bool m_valid_ch; - cppchar_t m_ch; -}; - -/* Information for mapping between code points and display columns. - - This is a tabstop value, along with a callback for getting the - widths of characters. Normally this callback is cpp_wcwidth, but we - support other schemes for escaping non-ASCII unicode as a series of - ASCII chars when printing the user's source code in diagnostic-show-locus.cc - - For example, consider: - - the Unicode character U+03C0 "GREEK SMALL LETTER PI" (UTF-8: 0xCF 0x80) - - the Unicode character U+1F642 "SLIGHTLY SMILING FACE" - (UTF-8: 0xF0 0x9F 0x99 0x82) - - the byte 0xBF (a stray trailing byte of a UTF-8 character) - Normally U+03C0 would occupy one display column, U+1F642 - would occupy two display columns, and the stray byte would be - printed verbatim as one display column. - - However when escaping them as unicode code points as "" - and "" they occupy 8 and 9 display columns respectively, - and when escaping them as bytes as "<80>" and "<9F><99><82>" - they occupy 8 and 16 display columns respectively. In both cases - the stray byte is escaped to as 4 display columns. */ - -struct cpp_char_column_policy -{ - cpp_char_column_policy (int tabstop, - int (*width_cb) (cppchar_t c)) - : m_tabstop (tabstop), - m_undecoded_byte_width (1), - m_width_cb (width_cb) - {} - - int m_tabstop; - /* Width in display columns of a stray byte that isn't decodable - as UTF-8. */ - int m_undecoded_byte_width; - int (*m_width_cb) (cppchar_t c); -}; - -/* A class to manage the state while converting a UTF-8 sequence to cppchar_t - and computing the display width one character at a time. 
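A short, hypothetical sketch of the ownership contract described above: DATA is what you read, TO_FREE is what you release. The "UTF-8" input charset name and the helper name are assumptions.

/* Hypothetical sketch: read a file through cpplib's charset conversion.  */
static void
dump_converted (const char *fname)
{
  cpp_converted_source src = cpp_get_converted_source (fname, "UTF-8");
  if (src.data == NULL)
    return;                     /* could not be read; struct is zero-filled */
  fwrite (src.data, 1, src.len, stdout);
  free (src.to_free);           /* free the allocation, not the data offset */
}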
*/ -class cpp_display_width_computation { - public: - cpp_display_width_computation (const char *data, int data_length, - const cpp_char_column_policy &policy); - const char *next_byte () const { return m_next; } - int bytes_processed () const { return m_next - m_begin; } - int bytes_left () const { return m_bytes_left; } - bool done () const { return !bytes_left (); } - int display_cols_processed () const { return m_display_cols; } - - int process_next_codepoint (cpp_decoded_char *out); - int advance_display_cols (int n); - - private: - const char *const m_begin; - const char *m_next; - size_t m_bytes_left; - const cpp_char_column_policy &m_policy; - int m_display_cols; -}; - -/* Convenience functions that are simple use cases for class - cpp_display_width_computation. Tab characters will be expanded to spaces - as determined by POLICY.m_tabstop, and non-printable-ASCII characters - will be escaped as per POLICY. */ - -int cpp_byte_column_to_display_column (const char *data, int data_length, - int column, - const cpp_char_column_policy &policy); -inline int cpp_display_width (const char *data, int data_length, - const cpp_char_column_policy &policy) -{ - return cpp_byte_column_to_display_column (data, data_length, data_length, - policy); -} -int cpp_display_column_to_byte_column (const char *data, int data_length, - int display_col, - const cpp_char_column_policy &policy); -int cpp_wcwidth (cppchar_t c); - -bool cpp_input_conversion_is_trivial (const char *input_charset); -int cpp_check_utf8_bom (const char *data, size_t data_length); - -#endif /* ! LIBCPP_CPPLIB_H */ diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/init.cc" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/init.cc" deleted file mode 100644 index 36cdc6a..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/init.cc" +++ /dev/null @@ -1,935 +0,0 @@ -/* CPP Library. - Copyright (C) 1986-2022 Free Software Foundation, Inc. - Contributed by Per Bothner, 1994-95. - Based on CCCP program by Paul Rubin, June 1986 - Adapted to ANSI C, Richard Stallman, Jan 1987 - -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 3, or (at your option) any -later version. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -You should have received a copy of the GNU General Public License -along with this program; see the file COPYING3. If not see -. */ - -#include "config.h" -#include "system.h" -#include "cpplib.h" -#include "internal.h" -#include "mkdeps.h" -#include "localedir.h" -#include "filenames.h" - -#ifndef ENABLE_CANONICAL_SYSTEM_HEADERS -#ifdef HAVE_DOS_BASED_FILE_SYSTEM -#define ENABLE_CANONICAL_SYSTEM_HEADERS 1 -#else -#define ENABLE_CANONICAL_SYSTEM_HEADERS 0 -#endif -#endif - -static void init_library (void); -static void mark_named_operators (cpp_reader *, int); -static bool read_original_filename (cpp_reader *); -static void read_original_directory (cpp_reader *); -static void post_options (cpp_reader *); - -/* If we have designated initializers (GCC >2.7) these tables can be - initialized, constant data. Otherwise, they have to be filled in at - runtime. */ -#if HAVE_DESIGNATED_INITIALIZERS - -#define init_trigraph_map() /* Nothing. 
*/ -#define TRIGRAPH_MAP \ -__extension__ const uchar _cpp_trigraph_map[UCHAR_MAX + 1] = { - -#define END }; -#define s(p, v) [p] = v, - -#else - -#define TRIGRAPH_MAP uchar _cpp_trigraph_map[UCHAR_MAX + 1] = { 0 }; \ - static void init_trigraph_map (void) { \ - unsigned char *x = _cpp_trigraph_map; - -#define END } -#define s(p, v) x[p] = v; - -#endif - -TRIGRAPH_MAP - s('=', '#') s(')', ']') s('!', '|') - s('(', '[') s('\'', '^') s('>', '}') - s('/', '\\') s('<', '{') s('-', '~') -END - -#undef s -#undef END -#undef TRIGRAPH_MAP - -/* A set of booleans indicating what CPP features each source language - requires. */ -struct lang_flags -{ - char c99; - char cplusplus; - char extended_numbers; - char extended_identifiers; - char c11_identifiers; - char std; - char digraphs; - char uliterals; - char rliterals; - char user_literals; - char binary_constants; - char digit_separators; - char trigraphs; - char utf8_char_literals; - char va_opt; - char scope; - char dfp_constants; - char size_t_literals; - char elifdef; -}; - -static const struct lang_flags lang_defaults[] = -{ /* c99 c++ xnum xid c11 std digr ulit rlit udlit bincst digsep trig u8chlit vaopt scope dfp szlit elifdef */ - /* GNUC89 */ { 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, - /* GNUC99 */ { 1, 0, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, - /* GNUC11 */ { 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, - /* GNUC17 */ { 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, - /* GNUC2X */ { 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1 }, - /* STDC89 */ { 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, - /* STDC94 */ { 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, - /* STDC99 */ { 1, 0, 1, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, - /* STDC11 */ { 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, - /* STDC17 */ { 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0 }, - /* STDC2X */ { 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1 }, - /* GNUCXX */ { 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, - /* CXX98 */ { 0, 1, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0 }, - /* GNUCXX11 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0 }, - /* CXX11 */ { 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0 }, - /* GNUCXX14 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 0, 0, 0 }, - /* CXX14 */ { 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0 }, - /* GNUCXX17 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0 }, - /* CXX17 */ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 0 }, - /* GNUCXX20 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0 }, - /* CXX20 */ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 0, 0 }, - /* GNUCXX23 */ { 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1 }, - /* CXX23 */ { 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1 }, - /* ASM */ { 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 } -}; - -/* Sets internal flags correctly for a given language. 
*/ -void -cpp_set_lang (cpp_reader *pfile, enum c_lang lang) -{ - const struct lang_flags *l = &lang_defaults[(int) lang]; - - CPP_OPTION (pfile, lang) = lang; - - CPP_OPTION (pfile, c99) = l->c99; - CPP_OPTION (pfile, cplusplus) = l->cplusplus; - CPP_OPTION (pfile, extended_numbers) = l->extended_numbers; - CPP_OPTION (pfile, extended_identifiers) = l->extended_identifiers; - CPP_OPTION (pfile, c11_identifiers) = l->c11_identifiers; - CPP_OPTION (pfile, std) = l->std; - CPP_OPTION (pfile, digraphs) = l->digraphs; - CPP_OPTION (pfile, uliterals) = l->uliterals; - CPP_OPTION (pfile, rliterals) = l->rliterals; - CPP_OPTION (pfile, user_literals) = l->user_literals; - CPP_OPTION (pfile, binary_constants) = l->binary_constants; - CPP_OPTION (pfile, digit_separators) = l->digit_separators; - CPP_OPTION (pfile, trigraphs) = l->trigraphs; - CPP_OPTION (pfile, utf8_char_literals) = l->utf8_char_literals; - CPP_OPTION (pfile, va_opt) = l->va_opt; - CPP_OPTION (pfile, scope) = l->scope; - CPP_OPTION (pfile, dfp_constants) = l->dfp_constants; - CPP_OPTION (pfile, size_t_literals) = l->size_t_literals; - CPP_OPTION (pfile, elifdef) = l->elifdef; -} - -/* Initialize library global state. */ -static void -init_library (void) -{ - static int initialized = 0; - - if (! initialized) - { - initialized = 1; - - _cpp_init_lexer (); - - /* Set up the trigraph map. This doesn't need to do anything if - we were compiled with a compiler that supports C99 designated - initializers. */ - init_trigraph_map (); - -#ifdef ENABLE_NLS - (void) bindtextdomain (PACKAGE, LOCALEDIR); -#endif - } -} - -/* Initialize a cpp_reader structure. */ -cpp_reader * -cpp_create_reader (enum c_lang lang, cpp_hash_table *table, - class line_maps *line_table) -{ - cpp_reader *pfile; - - /* Initialize this instance of the library if it hasn't been already. */ - init_library (); - - pfile = XCNEW (cpp_reader); - memset (&pfile->base_context, 0, sizeof (pfile->base_context)); - - cpp_set_lang (pfile, lang); - CPP_OPTION (pfile, warn_multichar) = 1; - CPP_OPTION (pfile, discard_comments) = 1; - CPP_OPTION (pfile, discard_comments_in_macro_exp) = 1; - CPP_OPTION (pfile, max_include_depth) = 200; - CPP_OPTION (pfile, operator_names) = 1; - CPP_OPTION (pfile, warn_trigraphs) = 2; - CPP_OPTION (pfile, warn_endif_labels) = 1; - CPP_OPTION (pfile, cpp_warn_c90_c99_compat) = -1; - CPP_OPTION (pfile, cpp_warn_c11_c2x_compat) = -1; - CPP_OPTION (pfile, cpp_warn_cxx11_compat) = 0; - CPP_OPTION (pfile, cpp_warn_deprecated) = 1; - CPP_OPTION (pfile, cpp_warn_long_long) = 0; - CPP_OPTION (pfile, dollars_in_ident) = 1; - CPP_OPTION (pfile, warn_dollars) = 1; - CPP_OPTION (pfile, warn_variadic_macros) = 1; - CPP_OPTION (pfile, warn_builtin_macro_redefined) = 1; - CPP_OPTION (pfile, cpp_warn_implicit_fallthrough) = 0; - /* By default, track locations of tokens resulting from macro - expansion. The '2' means, track the locations with the highest - accuracy. Read the comments for struct - cpp_options::track_macro_expansion to learn about the other - values. */ - CPP_OPTION (pfile, track_macro_expansion) = 2; - CPP_OPTION (pfile, warn_normalize) = normalized_C; - CPP_OPTION (pfile, warn_literal_suffix) = 1; - CPP_OPTION (pfile, canonical_system_headers) - = ENABLE_CANONICAL_SYSTEM_HEADERS; - CPP_OPTION (pfile, ext_numeric_literals) = 1; - CPP_OPTION (pfile, warn_date_time) = 0; - CPP_OPTION (pfile, cpp_warn_bidirectional) = bidirectional_unpaired; - - /* Default CPP arithmetic to something sensible for the host for the - benefit of dumb users like fix-header. 
*/ - CPP_OPTION (pfile, precision) = CHAR_BIT * sizeof (long); - CPP_OPTION (pfile, char_precision) = CHAR_BIT; - CPP_OPTION (pfile, wchar_precision) = CHAR_BIT * sizeof (int); - CPP_OPTION (pfile, int_precision) = CHAR_BIT * sizeof (int); - CPP_OPTION (pfile, unsigned_char) = 0; - CPP_OPTION (pfile, unsigned_wchar) = 1; - CPP_OPTION (pfile, bytes_big_endian) = 1; /* does not matter */ - - /* Default to no charset conversion. */ - CPP_OPTION (pfile, narrow_charset) = _cpp_default_encoding (); - CPP_OPTION (pfile, wide_charset) = 0; - - /* Default the input character set to UTF-8. */ - CPP_OPTION (pfile, input_charset) = _cpp_default_encoding (); - - /* A fake empty "directory" used as the starting point for files - looked up without a search path. Name cannot be '/' because we - don't want to prepend anything at all to filenames using it. All - other entries are correct zero-initialized. */ - pfile->no_search_path.name = (char *) ""; - - /* Initialize the line map. */ - pfile->line_table = line_table; - - /* Initialize lexer state. */ - pfile->state.save_comments = ! CPP_OPTION (pfile, discard_comments); - - /* Set up static tokens. */ - pfile->avoid_paste.type = CPP_PADDING; - pfile->avoid_paste.val.source = NULL; - pfile->avoid_paste.src_loc = 0; - pfile->endarg.type = CPP_EOF; - pfile->endarg.flags = 0; - pfile->endarg.src_loc = 0; - - /* Create a token buffer for the lexer. */ - _cpp_init_tokenrun (&pfile->base_run, 250); - pfile->cur_run = &pfile->base_run; - pfile->cur_token = pfile->base_run.base; - - /* Initialize the base context. */ - pfile->context = &pfile->base_context; - pfile->base_context.c.macro = 0; - pfile->base_context.prev = pfile->base_context.next = 0; - - /* Aligned and unaligned storage. */ - pfile->a_buff = _cpp_get_buff (pfile, 0); - pfile->u_buff = _cpp_get_buff (pfile, 0); - - /* Initialize table for push_macro/pop_macro. */ - pfile->pushed_macros = 0; - - /* Do not force token locations by default. */ - pfile->forced_token_location = 0; - - /* Note the timestamp is unset. */ - pfile->time_stamp = time_t (-1); - pfile->time_stamp_kind = 0; - - /* The expression parser stack. */ - _cpp_expand_op_stack (pfile); - - /* Initialize the buffer obstack. */ - obstack_specify_allocation (&pfile->buffer_ob, 0, 0, xmalloc, free); - - _cpp_init_files (pfile); - - _cpp_init_hashtable (pfile, table); - - return pfile; -} - -/* Set the line_table entry in PFILE. This is called after reading a - PCH file, as the old line_table will be incorrect. */ -void -cpp_set_line_map (cpp_reader *pfile, class line_maps *line_table) -{ - pfile->line_table = line_table; -} - -/* Free resources used by PFILE. Accessing PFILE after this function - returns leads to undefined behavior. Returns the error count. 
*/ -void -cpp_destroy (cpp_reader *pfile) -{ - cpp_context *context, *contextn; - struct def_pragma_macro *pmacro; - tokenrun *run, *runn; - int i; - - free (pfile->op_stack); - - while (CPP_BUFFER (pfile) != NULL) - _cpp_pop_buffer (pfile); - - free (pfile->out.base); - - if (pfile->macro_buffer) - { - free (pfile->macro_buffer); - pfile->macro_buffer = NULL; - pfile->macro_buffer_len = 0; - } - - if (pfile->deps) - deps_free (pfile->deps); - obstack_free (&pfile->buffer_ob, 0); - - _cpp_destroy_hashtable (pfile); - _cpp_cleanup_files (pfile); - _cpp_destroy_iconv (pfile); - - _cpp_free_buff (pfile->a_buff); - _cpp_free_buff (pfile->u_buff); - _cpp_free_buff (pfile->free_buffs); - - for (run = &pfile->base_run; run; run = runn) - { - runn = run->next; - free (run->base); - if (run != &pfile->base_run) - free (run); - } - - for (context = pfile->base_context.next; context; context = contextn) - { - contextn = context->next; - free (context); - } - - if (pfile->comments.entries) - { - for (i = 0; i < pfile->comments.count; i++) - free (pfile->comments.entries[i].comment); - - free (pfile->comments.entries); - } - if (pfile->pushed_macros) - { - do - { - pmacro = pfile->pushed_macros; - pfile->pushed_macros = pmacro->next; - free (pmacro->name); - free (pmacro); - } - while (pfile->pushed_macros); - } - - free (pfile); -} - -/* This structure defines one built-in identifier. A node will be - entered in the hash table under the name NAME, with value VALUE. - - There are two tables of these. builtin_array holds all the - "builtin" macros: these are handled by builtin_macro() in - macro.cc. Builtin is somewhat of a misnomer -- the property of - interest is that these macros require special code to compute their - expansions. The value is a "cpp_builtin_type" enumerator. - - operator_array holds the C++ named operators. These are keywords - which act as aliases for punctuators. In C++, they cannot be - altered through #define, and #if recognizes them as operators. In - C, these are not entered into the hash table at all (but see - ). The value is a token-type enumerator. 
*/ -struct builtin_macro -{ - const uchar *const name; - const unsigned short len; - const unsigned short value; - const bool always_warn_if_redefined; -}; - -#define B(n, t, f) { DSC(n), t, f } -static const struct builtin_macro builtin_array[] = -{ - - B("_ASSIGN", BT_RT_ASSIGN, true), - B("_TO_ARG_LIST", BT_RT_TO_ARG_LIST, true), - B("_TO_TOKEN_LIST", BT_RT_TO_TOKEN_LIST, true), - B("_FIRST", BT_RT_FIRST, true), - B("_REST", BT_RT_REST, true), - B("_MAP", BT_RT_MAP, true), - B("_AL_MAP", BT_RT_AL_MAP, true), - B("_IF", BT_RT_IF, true), - B("_NOT", BT_RT_NOT, true), - B("_AND", BT_RT_AND, true), - B("_OR", BT_RT_OR, true), - B("_IS_IDENTIFIER", BT_RT_IS_IDENTIFIER, true), - B("_IS_NAME", BT_RT_IS_NAME, true), - B("_PASTE", BT_RT_PASTE, true), - - - B("__TIMESTAMP__", BT_TIMESTAMP, false), - B("__TIME__", BT_TIME, false), - B("__DATE__", BT_DATE, false), - B("__FILE__", BT_FILE, false), - B("__FILE_NAME__", BT_FILE_NAME, false), - B("__BASE_FILE__", BT_BASE_FILE, false), - B("__LINE__", BT_SPECLINE, true), - B("__INCLUDE_LEVEL__", BT_INCLUDE_LEVEL, true), - B("__COUNTER__", BT_COUNTER, true), - /* Make sure to update the list of built-in - function-like macros in traditional.cc: - fun_like_macro() when adding more following */ - B("__has_attribute", BT_HAS_ATTRIBUTE, true), - B("__has_c_attribute", BT_HAS_STD_ATTRIBUTE, true), - B("__has_cpp_attribute", BT_HAS_ATTRIBUTE, true), - B("__has_builtin", BT_HAS_BUILTIN, true), - B("__has_include", BT_HAS_INCLUDE, true), - B("__has_include_next",BT_HAS_INCLUDE_NEXT, true), - /* The following macros are excluded when -traditional-cpp is used. - Therefore, they must appear at the end of this array so that they can be - easily removed by slicing in cpp_init_special_builtins(). - (If you add new built-ins that should be excluded in traditional mode, - place them *before* __STDC__ and update cpp_init_special_builtins() accordingly.) - */ - B("_Pragma", BT_PRAGMA, true), - B("__STDC__", BT_STDC, true) -}; -#undef B - -struct builtin_operator -{ - const uchar *const name; - const unsigned short len; - const unsigned short value; -}; - -#define B(n, t) { DSC(n), t } -static const struct builtin_operator operator_array[] = -{ - B("and", CPP_AND_AND), - B("and_eq", CPP_AND_EQ), - B("bitand", CPP_AND), - B("bitor", CPP_OR), - B("compl", CPP_COMPL), - B("not", CPP_NOT), - B("not_eq", CPP_NOT_EQ), - B("or", CPP_OR_OR), - B("or_eq", CPP_OR_EQ), - B("xor", CPP_XOR), - B("xor_eq", CPP_XOR_EQ) -}; -#undef B - -/* Mark the C++ named operators in the hash table. */ -static void -mark_named_operators (cpp_reader *pfile, int flags) -{ - const struct builtin_operator *b; - - for (b = operator_array; - b < (operator_array + ARRAY_SIZE (operator_array)); - b++) - { - cpp_hashnode *hp = cpp_lookup (pfile, b->name, b->len); - hp->flags |= flags; - hp->is_directive = 0; - hp->directive_index = b->value; - } -} - -/* Helper function of cpp_type2name. Return the string associated with - named operator TYPE. */ -const char * -cpp_named_operator2name (enum cpp_ttype type) -{ - const struct builtin_operator *b; - - for (b = operator_array; - b < (operator_array + ARRAY_SIZE (operator_array)); - b++) - { - if (type == b->value) - return (const char *) b->name; - } - - return NULL; -} - -void -cpp_init_special_builtins (cpp_reader *pfile) -{ - const struct builtin_macro *b; - size_t n = ARRAY_SIZE (builtin_array); - - if (CPP_OPTION (pfile, traditional)) - n -= 2; - else if (! 
CPP_OPTION (pfile, stdc_0_in_system_headers) - || CPP_OPTION (pfile, std)) - n--; - - for (b = builtin_array; b < builtin_array + n; b++) - { - if ((b->value == BT_HAS_ATTRIBUTE - || b->value == BT_HAS_STD_ATTRIBUTE - || b->value == BT_HAS_BUILTIN) - && (CPP_OPTION (pfile, lang) == CLK_ASM - || pfile->cb.has_attribute == NULL)) - continue; - cpp_hashnode *hp = cpp_lookup (pfile, b->name, b->len); - hp->type = NT_BUILTIN_MACRO; - if (b->always_warn_if_redefined) - hp->flags |= NODE_WARN; - hp->value.builtin = (enum cpp_builtin_type) b->value; - } -} - -/* Restore macro C to builtin macro definition. */ - -void -_cpp_restore_special_builtin (cpp_reader *pfile, struct def_pragma_macro *c) -{ - size_t len = strlen (c->name); - - for (const struct builtin_macro *b = builtin_array; - b < builtin_array + ARRAY_SIZE (builtin_array); b++) - if (b->len == len && memcmp (c->name, b->name, len + 1) == 0) - { - cpp_hashnode *hp = cpp_lookup (pfile, b->name, b->len); - hp->type = NT_BUILTIN_MACRO; - if (b->always_warn_if_redefined) - hp->flags |= NODE_WARN; - hp->value.builtin = (enum cpp_builtin_type) b->value; - } -} - -/* Read the builtins table above and enter them, and language-specific - macros, into the hash table. HOSTED is true if this is a hosted - environment. */ -void -cpp_init_builtins (cpp_reader *pfile, int hosted) -{ - cpp_init_special_builtins (pfile); - - if (!CPP_OPTION (pfile, traditional) - && (! CPP_OPTION (pfile, stdc_0_in_system_headers) - || CPP_OPTION (pfile, std))) - _cpp_define_builtin (pfile, "__STDC__ 1"); - - if (CPP_OPTION (pfile, cplusplus)) - { - /* C++23 is not yet a standard. For now, use an invalid - * year/month, 202100L, which is larger than 202002L. */ - if (CPP_OPTION (pfile, lang) == CLK_CXX23 - || CPP_OPTION (pfile, lang) == CLK_GNUCXX23) - _cpp_define_builtin (pfile, "__cplusplus 202100L"); - else if (CPP_OPTION (pfile, lang) == CLK_CXX20 - || CPP_OPTION (pfile, lang) == CLK_GNUCXX20) - _cpp_define_builtin (pfile, "__cplusplus 202002L"); - else if (CPP_OPTION (pfile, lang) == CLK_CXX17 - || CPP_OPTION (pfile, lang) == CLK_GNUCXX17) - _cpp_define_builtin (pfile, "__cplusplus 201703L"); - else if (CPP_OPTION (pfile, lang) == CLK_CXX14 - || CPP_OPTION (pfile, lang) == CLK_GNUCXX14) - _cpp_define_builtin (pfile, "__cplusplus 201402L"); - else if (CPP_OPTION (pfile, lang) == CLK_CXX11 - || CPP_OPTION (pfile, lang) == CLK_GNUCXX11) - _cpp_define_builtin (pfile, "__cplusplus 201103L"); - else - _cpp_define_builtin (pfile, "__cplusplus 199711L"); - } - else if (CPP_OPTION (pfile, lang) == CLK_ASM) - _cpp_define_builtin (pfile, "__ASSEMBLER__ 1"); - else if (CPP_OPTION (pfile, lang) == CLK_STDC94) - _cpp_define_builtin (pfile, "__STDC_VERSION__ 199409L"); - else if (CPP_OPTION (pfile, lang) == CLK_STDC2X - || CPP_OPTION (pfile, lang) == CLK_GNUC2X) - _cpp_define_builtin (pfile, "__STDC_VERSION__ 202000L"); - else if (CPP_OPTION (pfile, lang) == CLK_STDC17 - || CPP_OPTION (pfile, lang) == CLK_GNUC17) - _cpp_define_builtin (pfile, "__STDC_VERSION__ 201710L"); - else if (CPP_OPTION (pfile, lang) == CLK_STDC11 - || CPP_OPTION (pfile, lang) == CLK_GNUC11) - _cpp_define_builtin (pfile, "__STDC_VERSION__ 201112L"); - else if (CPP_OPTION (pfile, c99)) - _cpp_define_builtin (pfile, "__STDC_VERSION__ 199901L"); - - if (CPP_OPTION (pfile, uliterals) - && !(CPP_OPTION (pfile, cplusplus) - && (CPP_OPTION (pfile, lang) == CLK_GNUCXX - || CPP_OPTION (pfile, lang) == CLK_CXX98))) - { - _cpp_define_builtin (pfile, "__STDC_UTF_16__ 1"); - _cpp_define_builtin (pfile, "__STDC_UTF_32__ 
1"); - } - - if (hosted) - _cpp_define_builtin (pfile, "__STDC_HOSTED__ 1"); - else - _cpp_define_builtin (pfile, "__STDC_HOSTED__ 0"); - - if (CPP_OPTION (pfile, objc)) - _cpp_define_builtin (pfile, "__OBJC__ 1"); -} - -/* Sanity-checks are dependent on command-line options, so it is - called as a subroutine of cpp_read_main_file. */ -#if CHECKING_P -static void sanity_checks (cpp_reader *); -static void sanity_checks (cpp_reader *pfile) -{ - cppchar_t test = 0; - size_t max_precision = 2 * CHAR_BIT * sizeof (cpp_num_part); - - /* Sanity checks for assumptions about CPP arithmetic and target - type precisions made by cpplib. */ - test--; - if (test < 1) - cpp_error (pfile, CPP_DL_ICE, "cppchar_t must be an unsigned type"); - - if (CPP_OPTION (pfile, precision) > max_precision) - cpp_error (pfile, CPP_DL_ICE, - "preprocessor arithmetic has maximum precision of %lu bits;" - " target requires %lu bits", - (unsigned long) max_precision, - (unsigned long) CPP_OPTION (pfile, precision)); - - if (CPP_OPTION (pfile, precision) < CPP_OPTION (pfile, int_precision)) - cpp_error (pfile, CPP_DL_ICE, - "CPP arithmetic must be at least as precise as a target int"); - - if (CPP_OPTION (pfile, char_precision) < 8) - cpp_error (pfile, CPP_DL_ICE, "target char is less than 8 bits wide"); - - if (CPP_OPTION (pfile, wchar_precision) < CPP_OPTION (pfile, char_precision)) - cpp_error (pfile, CPP_DL_ICE, - "target wchar_t is narrower than target char"); - - if (CPP_OPTION (pfile, int_precision) < CPP_OPTION (pfile, char_precision)) - cpp_error (pfile, CPP_DL_ICE, - "target int is narrower than target char"); - - /* This is assumed in eval_token() and could be fixed if necessary. */ - if (sizeof (cppchar_t) > sizeof (cpp_num_part)) - cpp_error (pfile, CPP_DL_ICE, - "CPP half-integer narrower than CPP character"); - - if (CPP_OPTION (pfile, wchar_precision) > BITS_PER_CPPCHAR_T) - cpp_error (pfile, CPP_DL_ICE, - "CPP on this host cannot handle wide character constants over" - " %lu bits, but the target requires %lu bits", - (unsigned long) BITS_PER_CPPCHAR_T, - (unsigned long) CPP_OPTION (pfile, wchar_precision)); -} -#else -# define sanity_checks(PFILE) -#endif - -/* This is called after options have been parsed, and partially - processed. */ -void -cpp_post_options (cpp_reader *pfile) -{ - int flags; - - sanity_checks (pfile); - - post_options (pfile); - - /* Mark named operators before handling command line macros. */ - flags = 0; - if (CPP_OPTION (pfile, cplusplus) && CPP_OPTION (pfile, operator_names)) - flags |= NODE_OPERATOR; - if (CPP_OPTION (pfile, warn_cxx_operator_names)) - flags |= NODE_DIAGNOSTIC | NODE_WARN_OPERATOR; - if (flags != 0) - mark_named_operators (pfile, flags); -} - -/* Setup for processing input from the file named FNAME, or stdin if - it is the empty string. Return the original filename on success - (e.g. foo.i->foo.c), or NULL on failure. INJECTING is true if - there may be injected headers before line 1 of the main file. */ -const char * -cpp_read_main_file (cpp_reader *pfile, const char *fname, bool injecting) -{ - if (mkdeps *deps = cpp_get_deps (pfile)) - /* Set the default target (if there is none already). */ - deps_add_default_target (deps, fname); - - pfile->main_file - = _cpp_find_file (pfile, fname, - CPP_OPTION (pfile, preprocessed) ? &pfile->no_search_path - : CPP_OPTION (pfile, main_search) == CMS_user - ? pfile->quote_include - : CPP_OPTION (pfile, main_search) == CMS_system - ? 
pfile->bracket_include : &pfile->no_search_path, - /*angle=*/0, _cpp_FFK_NORMAL, 0); - - if (_cpp_find_failed (pfile->main_file)) - return NULL; - - _cpp_stack_file (pfile, pfile->main_file, - injecting || CPP_OPTION (pfile, preprocessed) - ? IT_PRE_MAIN : IT_MAIN, 0); - - /* For foo.i, read the original filename foo.c now, for the benefit - of the front ends. */ - if (CPP_OPTION (pfile, preprocessed)) - if (!read_original_filename (pfile)) - { - /* We're on line 1 after all. */ - auto *last = linemap_check_ordinary - (LINEMAPS_LAST_MAP (pfile->line_table, false)); - last->to_line = 1; - /* Inform of as-if a file change. */ - _cpp_do_file_change (pfile, LC_RENAME_VERBATIM, LINEMAP_FILE (last), - LINEMAP_LINE (last), LINEMAP_SYSP (last)); - } - - auto *map = LINEMAPS_LAST_ORDINARY_MAP (pfile->line_table); - pfile->main_loc = MAP_START_LOCATION (map); - - return ORDINARY_MAP_FILE_NAME (map); -} - -location_t -cpp_main_loc (const cpp_reader *pfile) -{ - return pfile->main_loc; -} - -/* For preprocessed files, if the very first characters are - '#[01]', then handle a line directive so we know the - original file name. This will generate file_change callbacks, - which the front ends must handle appropriately given their state of - initialization. We peek directly into the character buffer, so - that we're not confused by otherwise-skipped white space & - comments. We can be very picky, because this should have been - machine-generated text (by us, no less). This way we do not - interfere with the module directive state machine. */ - -static bool -read_original_filename (cpp_reader *pfile) -{ - auto *buf = pfile->buffer->next_line; - - if (pfile->buffer->rlimit - buf > 4 - && buf[0] == '#' - && buf[1] == ' ' - // Also permit '1', as that's what used to be here - && (buf[2] == '0' || buf[2] == '1') - && buf[3] == ' ') - { - const cpp_token *token = _cpp_lex_direct (pfile); - gcc_checking_assert (token->type == CPP_HASH); - if (_cpp_handle_directive (pfile, token->flags & PREV_WHITE)) - { - read_original_directory (pfile); - - auto *penult = &linemap_check_ordinary - (LINEMAPS_LAST_MAP (pfile->line_table, false))[-1]; - if (penult[1].reason == LC_RENAME_VERBATIM) - { - /* Expunge any evidence of the original linemap. */ - pfile->line_table->highest_location - = pfile->line_table->highest_line - = penult[0].start_location; - - penult[1].start_location = penult[0].start_location; - penult[1].reason = penult[0].reason; - penult[0] = penult[1]; - pfile->line_table->info_ordinary.used--; - pfile->line_table->info_ordinary.cache = 0; - } - - return true; - } - } - - return false; -} - -/* For preprocessed files, if the tokens following the first filename - line is of the form # "/path/name//", handle the - directive so we know the original current directory. - - As with the first line peeking, we can do this without lexing by - being picky. 
*/ -static void -read_original_directory (cpp_reader *pfile) -{ - auto *buf = pfile->buffer->next_line; - - if (pfile->buffer->rlimit - buf > 4 - && buf[0] == '#' - && buf[1] == ' ' - // Also permit '1', as that's what used to be here - && (buf[2] == '0' || buf[2] == '1') - && buf[3] == ' ') - { - const cpp_token *hash = _cpp_lex_direct (pfile); - gcc_checking_assert (hash->type == CPP_HASH); - pfile->state.in_directive = 1; - const cpp_token *number = _cpp_lex_direct (pfile); - gcc_checking_assert (number->type == CPP_NUMBER); - const cpp_token *string = _cpp_lex_direct (pfile); - pfile->state.in_directive = 0; - - const unsigned char *text = nullptr; - size_t len = 0; - if (string->type == CPP_STRING) - { - /* The string value includes the quotes. */ - text = string->val.str.text; - len = string->val.str.len; - } - if (len < 5 - || !IS_DIR_SEPARATOR (text[len - 2]) - || !IS_DIR_SEPARATOR (text[len - 3])) - { - /* That didn't work out, back out. */ - _cpp_backup_tokens (pfile, 3); - return; - } - - if (pfile->cb.dir_change) - { - /* Smash the string directly, it's dead at this point */ - char *smashy = (char *)text; - smashy[len - 3] = 0; - - pfile->cb.dir_change (pfile, smashy + 1); - } - - /* We should be at EOL. */ - } -} - -/* This is called at the end of preprocessing. It pops the last - buffer and writes dependency output. - - Maybe it should also reset state, such that you could call - cpp_start_read with a new filename to restart processing. */ -void -cpp_finish (cpp_reader *pfile, FILE *deps_stream) -{ - /* Warn about unused macros before popping the final buffer. */ - if (CPP_OPTION (pfile, warn_unused_macros)) - cpp_forall_identifiers (pfile, _cpp_warn_if_unused_macro, NULL); - - /* lex.cc leaves the final buffer on the stack. This it so that - it returns an unending stream of CPP_EOFs to the client. If we - popped the buffer, we'd dereference a NULL buffer pointer and - segfault. It's nice to allow the client to do worry-free excess - cpp_get_token calls. */ - while (pfile->buffer) - _cpp_pop_buffer (pfile); - - if (deps_stream) - deps_write (pfile, deps_stream, 72); - - /* Report on headers that could use multiple include guards. */ - if (CPP_OPTION (pfile, print_include_names)) - _cpp_report_missing_guards (pfile); -} - -static void -post_options (cpp_reader *pfile) -{ - /* -Wtraditional is not useful in C++ mode. */ - if (CPP_OPTION (pfile, cplusplus)) - CPP_OPTION (pfile, cpp_warn_traditional) = 0; - - /* Permanently disable macro expansion if we are rescanning - preprocessed text. Read preprocesed source in ISO mode. */ - if (CPP_OPTION (pfile, preprocessed)) - { - if (!CPP_OPTION (pfile, directives_only)) - pfile->state.prevent_expansion = 1; - CPP_OPTION (pfile, traditional) = 0; - } - - if (CPP_OPTION (pfile, warn_trigraphs) == 2) - CPP_OPTION (pfile, warn_trigraphs) = !CPP_OPTION (pfile, trigraphs); - - if (CPP_OPTION (pfile, traditional)) - { - CPP_OPTION (pfile, trigraphs) = 0; - CPP_OPTION (pfile, warn_trigraphs) = 0; - } - - if (CPP_OPTION (pfile, module_directives)) - { - /* These unspellable tokens have a leading space. */ - const char *const inits[spec_nodes::M_HWM] - = {"export ", "module ", "import ", "__import"}; - - for (int ix = 0; ix != spec_nodes::M_HWM; ix++) - { - cpp_hashnode *node = cpp_lookup (pfile, UC (inits[ix]), - strlen (inits[ix])); - - /* Token we pass to the compiler. */ - pfile->spec_nodes.n_modules[ix][1] = node; - - if (ix != spec_nodes::M__IMPORT) - /* Token we recognize when lexing, drop the trailing ' '. 
*/ - node = cpp_lookup (pfile, NODE_NAME (node), NODE_LEN (node) - 1); - - node->flags |= NODE_MODULE; - pfile->spec_nodes.n_modules[ix][0] = node; - } - } -} diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/macro.cc" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/macro.cc" deleted file mode 100644 index 56c3b98..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/library/macro.cc" +++ /dev/null @@ -1,5537 +0,0 @@ -/* Part of CPP library. (Macro and #define handling.) - Copyright (C) 1986-2022 Free Software Foundation, Inc. - Written by Per Bothner, 1994. - Based on CCCP program by Paul Rubin, June 1986 - Adapted to ANSI C, Richard Stallman, Jan 1987 - -This program is free software; you can redistribute it and/or modify it -under the terms of the GNU General Public License as published by the -Free Software Foundation; either version 3, or (at your option) any -later version. - -This program is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -You should have received a copy of the GNU General Public License -along with this program; see the file COPYING3. If not see -. - - In other words, you are welcome to use, share and improve this program. - You are forbidden to forbid anyone else to use, share and improve - what you give them. Help stamp out software-hoarding! */ - -#pragma GCC diagnostic ignored "-Wparentheses" - - -#include "config.h" -#include "system.h" -#include "cpplib.h" -#include "internal.h" - -// RT extension - static const uchar *evaluate_RT_ASSIGN(cpp_reader *pfile); - static const uchar *evaluate_RT_TO_ARG_LIST(cpp_reader *pfile); - static const uchar *evaluate_RT_TO_TOKEN_LIST(cpp_reader *pfile); - static const uchar *evaluate_RT_FIRST(cpp_reader *pfile); - static const uchar *evaluate_RT_REST(cpp_reader *pfile); - static const uchar *evaluate_RT_MAP(cpp_reader *pfile); - static const uchar *evaluate_RT_AL_MAP(cpp_reader *pfile); - static const uchar *evaluate_RT_IF(cpp_reader *pfile); - static const uchar *evaluate_RT_NOT(cpp_reader *pfile); - static const uchar *evaluate_RT_AND(cpp_reader *pfile); - static const uchar *evaluate_RT_OR(cpp_reader *pfile); - static const uchar *evaluate_RT_IS_IDENTIFIER(cpp_reader *pfile); - static const uchar *evaluate_RT_IS_NAME(cpp_reader *pfile); - static const uchar *evaluate_RT_PASTE(cpp_reader *pfile); - -typedef struct macro_arg macro_arg; -/* This structure represents the tokens of a macro argument. These - tokens can be macro themselves, in which case they can be either - expanded or unexpanded. When they are expanded, this data - structure keeps both the expanded and unexpanded forms. */ -struct macro_arg -{ - const cpp_token **first; /* First token in unexpanded argument. */ - const cpp_token **expanded; /* Macro-expanded argument. */ - const cpp_token *stringified; /* Stringified argument. */ - unsigned int count; /* # of tokens in argument. */ - unsigned int expanded_count; /* # of tokens in expanded argument. */ - location_t *virt_locs; /* Where virtual locations for - unexpanded tokens are stored. */ - location_t *expanded_virt_locs; /* Where virtual locations for - expanded tokens are - stored. */ -}; - -/* The kind of macro tokens which the instance of - macro_arg_token_iter is supposed to iterate over. 
*/ -enum macro_arg_token_kind { - MACRO_ARG_TOKEN_NORMAL, - /* This is a macro argument token that got transformed into a string - literal, e.g. #foo. */ - MACRO_ARG_TOKEN_STRINGIFIED, - /* This is a token resulting from the expansion of a macro - argument that was itself a macro. */ - MACRO_ARG_TOKEN_EXPANDED -}; - -/* An iterator over tokens coming from a function-like macro - argument. */ -typedef struct macro_arg_token_iter macro_arg_token_iter; -struct macro_arg_token_iter -{ - /* Whether or not -ftrack-macro-expansion is used. */ - bool track_macro_exp_p; - /* The kind of token over which we are supposed to iterate. */ - enum macro_arg_token_kind kind; - /* A pointer to the current token pointed to by the iterator. */ - const cpp_token **token_ptr; - /* A pointer to the "full" location of the current token. If - -ftrack-macro-expansion is used this location tracks loci across - macro expansion. */ - const location_t *location_ptr; -#if CHECKING_P - /* The number of times the iterator went forward. This useful only - when checking is enabled. */ - size_t num_forwards; -#endif -}; - -/* Saved data about an identifier being used as a macro argument - name. */ -struct macro_arg_saved_data { - /* The canonical (UTF-8) spelling of this identifier. */ - cpp_hashnode *canonical_node; - /* The previous value & type of this identifier. */ - union _cpp_hashnode_value value; - node_type type; -}; - -static const char *vaopt_paste_error = - N_("'##' cannot appear at either end of __VA_OPT__"); - -static void expand_arg (cpp_reader *, macro_arg *); - -/* A class for tracking __VA_OPT__ state while iterating over a - sequence of tokens. This is used during both macro definition and - expansion. */ -class vaopt_state { - - public: - - enum update_type - { - ERROR, - DROP, - INCLUDE, - BEGIN, - END - }; - - /* Initialize the state tracker. ANY_ARGS is true if variable - arguments were provided to the macro invocation. */ - vaopt_state (cpp_reader *pfile, bool is_variadic, macro_arg *arg) - : m_pfile (pfile), - m_arg (arg), - m_variadic (is_variadic), - m_last_was_paste (false), - m_stringify (false), - m_state (0), - m_paste_location (0), - m_location (0), - m_update (ERROR) - { - } - - /* Given a token, update the state of this tracker and return a - boolean indicating whether the token should be be included in the - expansion. */ - update_type update (const cpp_token *token) - { - /* If the macro isn't variadic, just don't bother. 
*/ - if (!m_variadic) - return INCLUDE; - - if (token->type == CPP_NAME - && token->val.node.node == m_pfile->spec_nodes.n__VA_OPT__) - { - if (m_state > 0) - { - cpp_error_at (m_pfile, CPP_DL_ERROR, token->src_loc, - "__VA_OPT__ may not appear in a __VA_OPT__"); - return ERROR; - } - ++m_state; - m_location = token->src_loc; - m_stringify = (token->flags & STRINGIFY_ARG) != 0; - return BEGIN; - } - else if (m_state == 1) - { - if (token->type != CPP_OPEN_PAREN) - { - cpp_error_at (m_pfile, CPP_DL_ERROR, m_location, - "__VA_OPT__ must be followed by an " - "open parenthesis"); - return ERROR; - } - ++m_state; - if (m_update == ERROR) - { - if (m_arg == NULL) - m_update = INCLUDE; - else - { - m_update = DROP; - if (!m_arg->expanded) - expand_arg (m_pfile, m_arg); - for (unsigned idx = 0; idx < m_arg->expanded_count; ++idx) - if (m_arg->expanded[idx]->type != CPP_PADDING) - { - m_update = INCLUDE; - break; - } - } - } - return DROP; - } - else if (m_state >= 2) - { - if (m_state == 2 && token->type == CPP_PASTE) - { - cpp_error_at (m_pfile, CPP_DL_ERROR, token->src_loc, - vaopt_paste_error); - return ERROR; - } - /* Advance states before further considering this token, in - case we see a close paren immediately after the open - paren. */ - if (m_state == 2) - ++m_state; - - bool was_paste = m_last_was_paste; - m_last_was_paste = false; - if (token->type == CPP_PASTE) - { - m_last_was_paste = true; - m_paste_location = token->src_loc; - } - else if (token->type == CPP_OPEN_PAREN) - ++m_state; - else if (token->type == CPP_CLOSE_PAREN) - { - --m_state; - if (m_state == 2) - { - /* Saw the final paren. */ - m_state = 0; - - if (was_paste) - { - cpp_error_at (m_pfile, CPP_DL_ERROR, token->src_loc, - vaopt_paste_error); - return ERROR; - } - - return END; - } - } - return m_update; - } - - /* Nothing to do with __VA_OPT__. */ - return INCLUDE; - } - - /* Ensure that any __VA_OPT__ was completed. If ok, return true. - Otherwise, issue an error and return false. */ - bool completed () - { - if (m_variadic && m_state != 0) - cpp_error_at (m_pfile, CPP_DL_ERROR, m_location, - "unterminated __VA_OPT__"); - return m_state == 0; - } - - /* Return true for # __VA_OPT__. */ - bool stringify () const - { - return m_stringify; - } - - private: - - /* The cpp_reader. */ - cpp_reader *m_pfile; - - /* The __VA_ARGS__ argument. */ - macro_arg *m_arg; - - /* True if the macro is variadic. */ - bool m_variadic; - /* If true, the previous token was ##. This is used to detect when - a paste occurs at the end of the sequence. */ - bool m_last_was_paste; - /* True for #__VA_OPT__. */ - bool m_stringify; - - /* The state variable: - 0 means not parsing - 1 means __VA_OPT__ seen, looking for "(" - 2 means "(" seen (so the next token can't be "##") - >= 3 means looking for ")", the number encodes the paren depth. */ - int m_state; - - /* The location of the paste token. */ - location_t m_paste_location; - - /* Location of the __VA_OPT__ token. */ - location_t m_location; - - /* If __VA_ARGS__ substitutes to no preprocessing tokens, - INCLUDE, otherwise DROP. ERROR when unknown yet. */ - update_type m_update; -}; - -/* Macro expansion. 
*/ - -static cpp_macro *get_deferred_or_lazy_macro (cpp_reader *, cpp_hashnode *, - location_t); -static int enter_macro_context (cpp_reader *, cpp_hashnode *, - const cpp_token *, location_t); -static int builtin_macro (cpp_reader *, cpp_hashnode *, - location_t, location_t); -static void push_ptoken_context (cpp_reader *, cpp_hashnode *, _cpp_buff *, - const cpp_token **, unsigned int); -static void push_extended_tokens_context (cpp_reader *, cpp_hashnode *, - _cpp_buff *, location_t *, - const cpp_token **, unsigned int); -static _cpp_buff *collect_args (cpp_reader *, const cpp_hashnode *, - _cpp_buff **, unsigned *); -static cpp_context *next_context (cpp_reader *); -static const cpp_token *padding_token (cpp_reader *, const cpp_token *); -static const cpp_token *new_string_token (cpp_reader *, uchar *, unsigned int); -static const cpp_token *stringify_arg (cpp_reader *, const cpp_token **, - unsigned int); -static void paste_all_tokens (cpp_reader *, const cpp_token *); -static bool paste_tokens (cpp_reader *, location_t, - const cpp_token **, const cpp_token *); -static void alloc_expanded_arg_mem (cpp_reader *, macro_arg *, size_t); -static void ensure_expanded_arg_room (cpp_reader *, macro_arg *, size_t, size_t *); -static void delete_macro_args (_cpp_buff*, unsigned num_args); -static void set_arg_token (macro_arg *, const cpp_token *, - location_t, size_t, - enum macro_arg_token_kind, - bool); -static const location_t *get_arg_token_location (const macro_arg *, - enum macro_arg_token_kind); -static const cpp_token **arg_token_ptr_at (const macro_arg *, - size_t, - enum macro_arg_token_kind, - location_t **virt_location); - -static void macro_arg_token_iter_init (macro_arg_token_iter *, bool, - enum macro_arg_token_kind, - const macro_arg *, - const cpp_token **); -static const cpp_token *macro_arg_token_iter_get_token -(const macro_arg_token_iter *it); -static location_t macro_arg_token_iter_get_location -(const macro_arg_token_iter *); -static void macro_arg_token_iter_forward (macro_arg_token_iter *); -static _cpp_buff *tokens_buff_new (cpp_reader *, size_t, - location_t **); -static size_t tokens_buff_count (_cpp_buff *); -static const cpp_token **tokens_buff_last_token_ptr (_cpp_buff *); -static inline const cpp_token **tokens_buff_put_token_to (const cpp_token **, - location_t *, - const cpp_token *, - location_t, - location_t, - const line_map_macro *, - unsigned int); - -static const cpp_token **tokens_buff_add_token (_cpp_buff *, - location_t *, - const cpp_token *, - location_t, - location_t, - const line_map_macro *, - unsigned int); -static inline void tokens_buff_remove_last_token (_cpp_buff *); -static void replace_args (cpp_reader *, cpp_hashnode *, cpp_macro *, - macro_arg *, location_t); -static _cpp_buff *funlike_invocation_p (cpp_reader *, cpp_hashnode *, - _cpp_buff **, unsigned *); -static cpp_macro *create_iso_definition (cpp_reader *); - -/* #define directive parsing and handling. 
*/ - -static cpp_macro *lex_expansion_token (cpp_reader *, cpp_macro *); -static bool parse_params (cpp_reader *, unsigned *, bool *); -static void check_trad_stringification (cpp_reader *, const cpp_macro *, - const cpp_string *); -static bool reached_end_of_context (cpp_context *); -static void consume_next_token_from_context (cpp_reader *pfile, - const cpp_token **, - location_t *); -static const cpp_token* cpp_get_token_1 (cpp_reader *, location_t *); - -static cpp_hashnode* macro_of_context (cpp_context *context); - -/* Statistical counter tracking the number of macros that got - expanded. */ -unsigned num_expanded_macros_counter = 0; -/* Statistical counter tracking the total number tokens resulting - from macro expansion. */ -unsigned num_macro_tokens_counter = 0; - -/* Wrapper around cpp_get_token to skip CPP_PADDING tokens - and not consume CPP_EOF. */ -static const cpp_token * -cpp_get_token_no_padding (cpp_reader *pfile) -{ - for (;;) - { - const cpp_token *ret = cpp_peek_token (pfile, 0); - if (ret->type == CPP_EOF) - return ret; - ret = cpp_get_token (pfile); - if (ret->type != CPP_PADDING) - return ret; - } -} - -/* Handle meeting "__has_include" builtin macro. */ - -static int -builtin_has_include (cpp_reader *pfile, cpp_hashnode *op, bool has_next) -{ - int result = 0; - - if (!pfile->state.in_directive) - cpp_error (pfile, CPP_DL_ERROR, - "\"%s\" used outside of preprocessing directive", - NODE_NAME (op)); - - pfile->state.angled_headers = true; - const cpp_token *token = cpp_get_token_no_padding (pfile); - bool paren = token->type == CPP_OPEN_PAREN; - if (paren) - token = cpp_get_token_no_padding (pfile); - else - cpp_error (pfile, CPP_DL_ERROR, - "missing '(' before \"%s\" operand", NODE_NAME (op)); - pfile->state.angled_headers = false; - - bool bracket = token->type != CPP_STRING; - char *fname = NULL; - if (token->type == CPP_STRING || token->type == CPP_HEADER_NAME) - { - fname = XNEWVEC (char, token->val.str.len - 1); - memcpy (fname, token->val.str.text + 1, token->val.str.len - 2); - fname[token->val.str.len - 2] = '\0'; - } - else if (token->type == CPP_LESS) - fname = _cpp_bracket_include (pfile); - else - cpp_error (pfile, CPP_DL_ERROR, - "operator \"%s\" requires a header-name", NODE_NAME (op)); - - if (fname) - { - /* Do not do the lookup if we're skipping, that's unnecessary - IO. */ - if (!pfile->state.skip_eval - && _cpp_has_header (pfile, fname, bracket, - has_next ? IT_INCLUDE_NEXT : IT_INCLUDE)) - result = 1; - - XDELETEVEC (fname); - } - - if (paren - && cpp_get_token_no_padding (pfile)->type != CPP_CLOSE_PAREN) - cpp_error (pfile, CPP_DL_ERROR, - "missing ')' after \"%s\" operand", NODE_NAME (op)); - - return result; -} - -/* Emits a warning if NODE is a macro defined in the main file that - has not been used. */ -int -_cpp_warn_if_unused_macro (cpp_reader *pfile, cpp_hashnode *node, - void *v ATTRIBUTE_UNUSED) -{ - if (cpp_user_macro_p (node)) - { - cpp_macro *macro = node->value.macro; - - if (!macro->used - && MAIN_FILE_P (linemap_check_ordinary - (linemap_lookup (pfile->line_table, - macro->line)))) - cpp_warning_with_line (pfile, CPP_W_UNUSED_MACROS, macro->line, 0, - "macro \"%s\" is not used", NODE_NAME (node)); - } - - return 1; -} - -/* Allocates and returns a CPP_STRING token, containing TEXT of length - LEN, after null-terminating it. TEXT must be in permanent storage. 
*/ -static const cpp_token * -new_string_token (cpp_reader *pfile, unsigned char *text, unsigned int len) -{ - cpp_token *token = _cpp_temp_token (pfile); - - text[len] = '\0'; - token->type = CPP_STRING; - token->val.str.len = len; - token->val.str.text = text; - token->flags = 0; - return token; -} - -static const char * const monthnames[] = -{ - "Jan", "Feb", "Mar", "Apr", "May", "Jun", - "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" -}; - -/* Helper function for builtin_macro. Returns the text generated by - a builtin macro. */ -const uchar * -_cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node, - location_t loc) -{ - const uchar *result = NULL; - linenum_type number = 1; - - switch (node->value.builtin) - { - default: - cpp_error (pfile, CPP_DL_ICE, "invalid built-in macro \"%s\"", - NODE_NAME (node)); - break; - - case BT_TIMESTAMP: - { - if (CPP_OPTION (pfile, warn_date_time)) - cpp_warning (pfile, CPP_W_DATE_TIME, "macro \"%s\" might prevent " - "reproducible builds", NODE_NAME (node)); - - cpp_buffer *pbuffer = cpp_get_buffer (pfile); - if (pbuffer->timestamp == NULL) - { - /* Initialize timestamp value of the assotiated file. */ - struct _cpp_file *file = cpp_get_file (pbuffer); - if (file) - { - /* Generate __TIMESTAMP__ string, that represents - the date and time of the last modification - of the current source file. The string constant - looks like "Sun Sep 16 01:03:52 1973". */ - struct tm *tb = NULL; - struct stat *st = _cpp_get_file_stat (file); - if (st) - tb = localtime (&st->st_mtime); - if (tb) - { - char *str = asctime (tb); - size_t len = strlen (str); - unsigned char *buf = _cpp_unaligned_alloc (pfile, len + 2); - buf[0] = '"'; - strcpy ((char *) buf + 1, str); - buf[len] = '"'; - pbuffer->timestamp = buf; - } - else - { - cpp_errno (pfile, CPP_DL_WARNING, - "could not determine file timestamp"); - pbuffer->timestamp = UC"\"??? ??? ?? ??:??:?? ????\""; - } - } - } - result = pbuffer->timestamp; - } - break; - case BT_FILE: - case BT_FILE_NAME: - case BT_BASE_FILE: - { - unsigned int len; - const char *name; - uchar *buf; - - if (node->value.builtin == BT_FILE - || node->value.builtin == BT_FILE_NAME) - { - name = linemap_get_expansion_filename (pfile->line_table, - pfile->line_table->highest_line); - if ((node->value.builtin == BT_FILE_NAME) && name) - name = lbasename (name); - } - else - { - name = _cpp_get_file_name (pfile->main_file); - if (!name) - abort (); - } - if (pfile->cb.remap_filename) - name = pfile->cb.remap_filename (name); - len = strlen (name); - buf = _cpp_unaligned_alloc (pfile, len * 2 + 3); - result = buf; - *buf = '"'; - buf = cpp_quote_string (buf + 1, (const unsigned char *) name, len); - *buf++ = '"'; - *buf = '\0'; - } - break; - - case BT_INCLUDE_LEVEL: - /* The line map depth counts the primary source as level 1, but - historically __INCLUDE_DEPTH__ has called the primary source - level 0. */ - number = pfile->line_table->depth - 1; - break; - - case BT_SPECLINE: - /* If __LINE__ is embedded in a macro, it must expand to the - line of the macro's invocation, not its definition. - Otherwise things like assert() will not work properly. - See WG14 N1911, WG21 N4220 sec 6.5, and PR 61861. */ - if (CPP_OPTION (pfile, traditional)) - loc = pfile->line_table->highest_line; - else - loc = linemap_resolve_location (pfile->line_table, loc, - LRK_MACRO_EXPANSION_POINT, NULL); - number = linemap_get_expansion_line (pfile->line_table, loc); - break; - - /* __STDC__ has the value 1 under normal circumstances. 
- However, if (a) we are in a system header, (b) the option - stdc_0_in_system_headers is true (set by target config), and - (c) we are not in strictly conforming mode, then it has the - value 0. (b) and (c) are already checked in cpp_init_builtins. */ - case BT_STDC: - if (_cpp_in_system_header (pfile)) - number = 0; - else - number = 1; - break; - - case BT_DATE: - case BT_TIME: - if (CPP_OPTION (pfile, warn_date_time)) - cpp_warning (pfile, CPP_W_DATE_TIME, "macro \"%s\" might prevent " - "reproducible builds", NODE_NAME (node)); - if (pfile->date == NULL) - { - /* Allocate __DATE__ and __TIME__ strings from permanent - storage. We only do this once, and don't generate them - at init time, because time() and localtime() are very - slow on some systems. */ - time_t tt; - auto kind = cpp_get_date (pfile, &tt); - - if (kind == CPP_time_kind::UNKNOWN) - { - cpp_errno (pfile, CPP_DL_WARNING, - "could not determine date and time"); - - pfile->date = UC"\"??? ?? ????\""; - pfile->time = UC"\"??:??:??\""; - } - else - { - struct tm *tb = (kind == CPP_time_kind::FIXED - ? gmtime : localtime) (&tt); - - pfile->date = _cpp_unaligned_alloc (pfile, - sizeof ("\"Oct 11 1347\"")); - sprintf ((char *) pfile->date, "\"%s %2d %4d\"", - monthnames[tb->tm_mon], tb->tm_mday, - tb->tm_year + 1900); - - pfile->time = _cpp_unaligned_alloc (pfile, - sizeof ("\"12:34:56\"")); - sprintf ((char *) pfile->time, "\"%02d:%02d:%02d\"", - tb->tm_hour, tb->tm_min, tb->tm_sec); - } - } - - if (node->value.builtin == BT_DATE) - result = pfile->date; - else - result = pfile->time; - break; - - case BT_COUNTER: - if (CPP_OPTION (pfile, directives_only) && pfile->state.in_directive) - cpp_error (pfile, CPP_DL_ERROR, - "__COUNTER__ expanded inside directive with -fdirectives-only"); - number = pfile->counter++; - break; - - case BT_HAS_ATTRIBUTE: - number = pfile->cb.has_attribute (pfile, false); - break; - - case BT_HAS_STD_ATTRIBUTE: - number = pfile->cb.has_attribute (pfile, true); - break; - - case BT_HAS_BUILTIN: - number = pfile->cb.has_builtin (pfile); - break; - - case BT_HAS_INCLUDE: - case BT_HAS_INCLUDE_NEXT: - number = builtin_has_include (pfile, node, - node->value.builtin == BT_HAS_INCLUDE_NEXT); - break; - - case BT_RT_ASSIGN: - result = evaluate_RT_ASSIGN(pfile); - break; - - case BT_RT_TO_ARG_LIST: - result = evaluate_RT_TO_ARG_LIST(pfile); - break; - - case BT_RT_TO_TOKEN_LIST: - result = evaluate_RT_TO_TOKEN_LIST(pfile); - break; - - case BT_RT_FIRST: - result = evaluate_RT_FIRST(pfile); - break; - - case BT_RT_REST: - result = evaluate_RT_REST(pfile); - break; - - case BT_RT_MAP: - result = evaluate_RT_MAP(pfile); - break; - - case BT_RT_AL_MAP: - result = evaluate_RT_AL_MAP(pfile); - break; - - case BT_RT_IF: - result = evaluate_RT_IF(pfile); - break; - - case BT_RT_NOT: - result = evaluate_RT_NOT(pfile); - break; - - case BT_RT_AND: - result = evaluate_RT_AND(pfile); - break; - - case BT_RT_OR: - result = evaluate_RT_OR(pfile); - break; - - case BT_RT_IS_IDENTIFIER: - result = evaluate_RT_IS_IDENTIFIER(pfile); - break; - - case BT_RT_IS_NAME: - result = evaluate_RT_IS_NAME(pfile); - break; - - case BT_RT_PASTE: - result = evaluate_RT_PASTE(pfile); - break; - - } - - if (result == NULL) - { - /* 21 bytes holds all NUL-terminated unsigned 64-bit numbers. */ - result = _cpp_unaligned_alloc (pfile, 21); - sprintf ((char *) result, "%u", number); - } - - return result; -} - -/* Get an idempotent date. Either the cached value, the value from - source epoch, or failing that, the value from time(2). 
Use this - during compilation so that every time stamp is the same. */ -CPP_time_kind -cpp_get_date (cpp_reader *pfile, time_t *result) -{ - if (!pfile->time_stamp_kind) - { - int kind = 0; - if (pfile->cb.get_source_date_epoch) - { - /* Try reading the fixed epoch. */ - pfile->time_stamp = pfile->cb.get_source_date_epoch (pfile); - if (pfile->time_stamp != time_t (-1)) - kind = int (CPP_time_kind::FIXED); - } - - if (!kind) - { - /* Pedantically time_t (-1) is a legitimate value for - "number of seconds since the Epoch". It is a silly - time. */ - errno = 0; - pfile->time_stamp = time (nullptr); - /* Annoyingly a library could legally set errno and return a - valid time! Bad library! */ - if (pfile->time_stamp == time_t (-1) && errno) - kind = errno; - else - kind = int (CPP_time_kind::DYNAMIC); - } - - pfile->time_stamp_kind = kind; - } - - *result = pfile->time_stamp; - if (pfile->time_stamp_kind >= 0) - { - errno = pfile->time_stamp_kind; - return CPP_time_kind::UNKNOWN; - } - - return CPP_time_kind (pfile->time_stamp_kind); -} - -/* Convert builtin macros like __FILE__ to a token and push it on the - context stack. Also handles _Pragma, for which a new token may not - be created. Returns 1 if it generates a new token context, 0 to - return the token to the caller. LOC is the location of the expansion - point of the macro. */ -static int -builtin_macro (cpp_reader *pfile, cpp_hashnode *node, - location_t loc, location_t expand_loc) -{ - const uchar *buf; - size_t len; - char *nbuf; - - if (node->value.builtin == BT_PRAGMA) - { - /* Don't interpret _Pragma within directives. The standard is - not clear on this, but to me this makes most sense. - Similarly, don't interpret _Pragma inside expand_args, we might - need to stringize it later on. */ - if (pfile->state.in_directive || pfile->state.ignore__Pragma) - return 0; - - return _cpp_do__Pragma (pfile, loc); - } - - buf = _cpp_builtin_macro_text (pfile, node, expand_loc); - len = ustrlen (buf); - nbuf = (char *) alloca (len + 1); - memcpy (nbuf, buf, len); - nbuf[len]='\n'; - - cpp_push_buffer (pfile, (uchar *) nbuf, len, /* from_stage3 */ true); - _cpp_clean_line (pfile); - - /* Set pfile->cur_token as required by _cpp_lex_direct. */ - pfile->cur_token = _cpp_temp_token (pfile); - cpp_token *token = _cpp_lex_direct (pfile); - /* We should point to the expansion point of the builtin macro. */ - token->src_loc = loc; - if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED) - { - /* We are tracking tokens resulting from macro expansion. - Create a macro line map and generate a virtual location for - the token resulting from the expansion of the built-in - macro. */ - location_t *virt_locs = NULL; - _cpp_buff *token_buf = tokens_buff_new (pfile, 1, &virt_locs); - const line_map_macro * map = - linemap_enter_macro (pfile->line_table, node, loc, 1); - tokens_buff_add_token (token_buf, virt_locs, token, - pfile->line_table->builtin_location, - pfile->line_table->builtin_location, - map, /*macro_token_index=*/0); - push_extended_tokens_context (pfile, node, token_buf, virt_locs, - (const cpp_token **)token_buf->base, - 1); - } - else - _cpp_push_token_context (pfile, NULL, token, 1); - if (pfile->buffer->cur != pfile->buffer->rlimit) - cpp_error (pfile, CPP_DL_ICE, "invalid built-in macro \"%s\"", - NODE_NAME (node)); - _cpp_pop_buffer (pfile); - - return 1; -} - -/* Copies SRC, of length LEN, to DEST, adding backslashes before all - backslashes and double quotes. DEST must be of sufficient size. - Returns a pointer to the end of the string. 
*/ -uchar * -cpp_quote_string (uchar *dest, const uchar *src, unsigned int len) -{ - while (len--) - { - uchar c = *src++; - - switch (c) - { - case '\n': - /* Naked LF can appear in raw string literals */ - c = 'n'; - /* FALLTHROUGH */ - - case '\\': - case '"': - *dest++ = '\\'; - /* FALLTHROUGH */ - - default: - *dest++ = c; - } - } - - return dest; -} - -/* Convert a token sequence FIRST to FIRST+COUNT-1 to a single string token - according to the rules of the ISO C #-operator. */ -static const cpp_token * -stringify_arg (cpp_reader *pfile, const cpp_token **first, unsigned int count) -{ - unsigned char *dest; - unsigned int i, escape_it, backslash_count = 0; - const cpp_token *source = NULL; - size_t len; - - if (BUFF_ROOM (pfile->u_buff) < 3) - _cpp_extend_buff (pfile, &pfile->u_buff, 3); - dest = BUFF_FRONT (pfile->u_buff); - *dest++ = '"'; - - /* Loop, reading in the argument's tokens. */ - for (i = 0; i < count; i++) - { - const cpp_token *token = first[i]; - - if (token->type == CPP_PADDING) - { - if (source == NULL - || (!(source->flags & PREV_WHITE) - && token->val.source == NULL)) - source = token->val.source; - continue; - } - - escape_it = (token->type == CPP_STRING || token->type == CPP_CHAR - || token->type == CPP_WSTRING || token->type == CPP_WCHAR - || token->type == CPP_STRING32 || token->type == CPP_CHAR32 - || token->type == CPP_STRING16 || token->type == CPP_CHAR16 - || token->type == CPP_UTF8STRING || token->type == CPP_UTF8CHAR - || cpp_userdef_string_p (token->type) - || cpp_userdef_char_p (token->type)); - - /* Room for each char being written in octal, initial space and - final quote and NUL. */ - len = cpp_token_len (token); - if (escape_it) - len *= 4; - len += 3; - - if ((size_t) (BUFF_LIMIT (pfile->u_buff) - dest) < len) - { - size_t len_so_far = dest - BUFF_FRONT (pfile->u_buff); - _cpp_extend_buff (pfile, &pfile->u_buff, len); - dest = BUFF_FRONT (pfile->u_buff) + len_so_far; - } - - /* Leading white space? */ - if (dest - 1 != BUFF_FRONT (pfile->u_buff)) - { - if (source == NULL) - source = token; - if (source->flags & PREV_WHITE) - *dest++ = ' '; - } - source = NULL; - - if (escape_it) - { - _cpp_buff *buff = _cpp_get_buff (pfile, len); - unsigned char *buf = BUFF_FRONT (buff); - len = cpp_spell_token (pfile, token, buf, true) - buf; - dest = cpp_quote_string (dest, buf, len); - _cpp_release_buff (pfile, buff); - } - else - dest = cpp_spell_token (pfile, token, dest, true); - - if (token->type == CPP_OTHER && token->val.str.text[0] == '\\') - backslash_count++; - else - backslash_count = 0; - } - - /* Ignore the final \ of invalid string literals. */ - if (backslash_count & 1) - { - cpp_error (pfile, CPP_DL_WARNING, - "invalid string literal, ignoring final '\\'"); - dest--; - } - - /* Commit the memory, including NUL, and return the token. */ - *dest++ = '"'; - len = dest - BUFF_FRONT (pfile->u_buff); - BUFF_FRONT (pfile->u_buff) = dest + 1; - return new_string_token (pfile, dest - len, len); -} - -/* Try to paste two tokens. On success, return nonzero. In any - case, PLHS is updated to point to the pasted token, which is - guaranteed to not have the PASTE_LEFT flag set. LOCATION is - the virtual location used for error reporting. 
*/ -static bool -paste_tokens (cpp_reader *pfile, location_t location, - const cpp_token **plhs, const cpp_token *rhs) -{ - unsigned char *buf, *end, *lhsend; - cpp_token *lhs; - unsigned int len; - - len = cpp_token_len (*plhs) + cpp_token_len (rhs) + 2; - buf = (unsigned char *) alloca (len); - end = lhsend = cpp_spell_token (pfile, *plhs, buf, true); - - /* Avoid comment headers, since they are still processed in stage 3. - It is simpler to insert a space here, rather than modifying the - lexer to ignore comments in some circumstances. Simply returning - false doesn't work, since we want to clear the PASTE_LEFT flag. */ - if ((*plhs)->type == CPP_DIV && rhs->type != CPP_EQ) - *end++ = ' '; - /* In one obscure case we might see padding here. */ - if (rhs->type != CPP_PADDING) - end = cpp_spell_token (pfile, rhs, end, true); - *end = '\n'; - - cpp_push_buffer (pfile, buf, end - buf, /* from_stage3 */ true); - _cpp_clean_line (pfile); - - /* Set pfile->cur_token as required by _cpp_lex_direct. */ - pfile->cur_token = _cpp_temp_token (pfile); - lhs = _cpp_lex_direct (pfile); - if (pfile->buffer->cur != pfile->buffer->rlimit) - { - location_t saved_loc = lhs->src_loc; - - _cpp_pop_buffer (pfile); - - unsigned char *rhsstart = lhsend; - if ((*plhs)->type == CPP_DIV && rhs->type != CPP_EQ) - rhsstart++; - - /* We have to remove the PASTE_LEFT flag from the old lhs, but - we want to keep the new location. */ - *lhs = **plhs; - *plhs = lhs; - lhs->src_loc = saved_loc; - lhs->flags &= ~PASTE_LEFT; - - /* Mandatory error for all apart from assembler. */ - if (CPP_OPTION (pfile, lang) != CLK_ASM) - cpp_error_with_line (pfile, CPP_DL_ERROR, location, 0, - "pasting \"%.*s\" and \"%.*s\" does not give " - "a valid preprocessing token", - (int) (lhsend - buf), buf, - (int) (end - rhsstart), rhsstart); - return false; - } - - lhs->flags |= (*plhs)->flags & (PREV_WHITE | PREV_FALLTHROUGH); - *plhs = lhs; - _cpp_pop_buffer (pfile); - return true; -} - -/* Handles an arbitrarily long sequence of ## operators, with initial - operand LHS. This implementation is left-associative, - non-recursive, and finishes a paste before handling succeeding - ones. If a paste fails, we back up to the RHS of the failing ## - operator before pushing the context containing the result of prior - successful pastes, with the effect that the RHS appears in the - output stream after the pasted LHS normally. */ -static void -paste_all_tokens (cpp_reader *pfile, const cpp_token *lhs) -{ - const cpp_token *rhs = NULL; - cpp_context *context = pfile->context; - location_t virt_loc = 0; - - /* We are expanding a macro and we must have been called on a token - that appears at the left hand side of a ## operator. */ - if (macro_of_context (pfile->context) == NULL - || (!(lhs->flags & PASTE_LEFT))) - abort (); - - if (context->tokens_kind == TOKENS_KIND_EXTENDED) - /* The caller must have called consume_next_token_from_context - right before calling us. That has incremented the pointer to - the current virtual location. So it now points to the location - of the token that comes right after *LHS. We want the - resulting pasted token to have the location of the current - *LHS, though. */ - virt_loc = context->c.mc->cur_virt_loc[-1]; - else - /* We are not tracking macro expansion. So the best virtual - location we can get here is the expansion point of the macro we - are currently expanding. */ - virt_loc = pfile->invocation_location; - - do - { - /* Take the token directly from the current context. 
We can do - this, because we are in the replacement list of either an - object-like macro, or a function-like macro with arguments - inserted. In either case, the constraints to #define - guarantee we have at least one more token. */ - if (context->tokens_kind == TOKENS_KIND_DIRECT) - rhs = FIRST (context).token++; - else if (context->tokens_kind == TOKENS_KIND_INDIRECT) - rhs = *FIRST (context).ptoken++; - else if (context->tokens_kind == TOKENS_KIND_EXTENDED) - { - /* So we are in presence of an extended token context, which - means that each token in this context has a virtual - location attached to it. So let's not forget to update - the pointer to the current virtual location of the - current token when we update the pointer to the current - token */ - - rhs = *FIRST (context).ptoken++; - /* context->c.mc must be non-null, as if we were not in a - macro context, context->tokens_kind could not be equal to - TOKENS_KIND_EXTENDED. */ - context->c.mc->cur_virt_loc++; - } - - if (rhs->type == CPP_PADDING) - { - if (rhs->flags & PASTE_LEFT) - abort (); - } - if (!paste_tokens (pfile, virt_loc, &lhs, rhs)) - { - _cpp_backup_tokens (pfile, 1); - break; - } - } - while (rhs->flags & PASTE_LEFT); - - /* Put the resulting token in its own context. */ - if (context->tokens_kind == TOKENS_KIND_EXTENDED) - { - location_t *virt_locs = NULL; - _cpp_buff *token_buf = tokens_buff_new (pfile, 1, &virt_locs); - tokens_buff_add_token (token_buf, virt_locs, lhs, - virt_loc, 0, NULL, 0); - push_extended_tokens_context (pfile, context->c.mc->macro_node, - token_buf, virt_locs, - (const cpp_token **)token_buf->base, 1); - } - else - _cpp_push_token_context (pfile, NULL, lhs, 1); -} - -/* Returns TRUE if the number of arguments ARGC supplied in an - invocation of the MACRO referenced by NODE is valid. An empty - invocation to a macro with no parameters should pass ARGC as zero. - - Note that MACRO cannot necessarily be deduced from NODE, in case - NODE was redefined whilst collecting arguments. */ -bool -_cpp_arguments_ok (cpp_reader *pfile, cpp_macro *macro, const cpp_hashnode *node, unsigned int argc) -{ - if (argc == macro->paramc) - return true; - - if (argc < macro->paramc) - { - /* In C++20 (here the va_opt flag is used), and also as a GNU - extension, variadic arguments are allowed to not appear in - the invocation at all. - e.g. #define debug(format, args...) something - debug("string"); - - This is exactly the same as if an empty variadic list had been - supplied - debug("string", ). */ - - if (argc + 1 == macro->paramc && macro->variadic) - { - if (CPP_PEDANTIC (pfile) && ! macro->syshdr - && ! CPP_OPTION (pfile, va_opt)) - { - if (CPP_OPTION (pfile, cplusplus)) - cpp_error (pfile, CPP_DL_PEDWARN, - "ISO C++11 requires at least one argument " - "for the \"...\" in a variadic macro"); - else - cpp_error (pfile, CPP_DL_PEDWARN, - "ISO C99 requires at least one argument " - "for the \"...\" in a variadic macro"); - } - return true; - } - - cpp_error (pfile, CPP_DL_ERROR, - "macro \"%s\" requires %u arguments, but only %u given", - NODE_NAME (node), macro->paramc, argc); - } - else - cpp_error (pfile, CPP_DL_ERROR, - "macro \"%s\" passed %u arguments, but takes just %u", - NODE_NAME (node), argc, macro->paramc); - - if (macro->line > RESERVED_LOCATION_COUNT) - cpp_error_at (pfile, CPP_DL_NOTE, macro->line, "macro \"%s\" defined here", - NODE_NAME (node)); - - return false; -} - -/* Reads and returns the arguments to a function-like macro - invocation. Assumes the opening parenthesis has been processed. 
- If there is an error, emits an appropriate diagnostic and returns - NULL. Each argument is terminated by a CPP_EOF token, for the - future benefit of expand_arg(). If there are any deferred - #pragma directives among macro arguments, store pointers to the - CPP_PRAGMA ... CPP_PRAGMA_EOL tokens into *PRAGMA_BUFF buffer. - - What is returned is the buffer that contains the memory allocated - to hold the macro arguments. NODE is the name of the macro this - function is dealing with. If NUM_ARGS is non-NULL, *NUM_ARGS is - set to the actual number of macro arguments allocated in the - returned buffer. */ -static _cpp_buff * -collect_args (cpp_reader *pfile, const cpp_hashnode *node, - _cpp_buff **pragma_buff, unsigned *num_args) -{ - _cpp_buff *buff, *base_buff; - cpp_macro *macro; - macro_arg *args, *arg; - const cpp_token *token; - unsigned int argc; - location_t virt_loc; - bool track_macro_expansion_p = CPP_OPTION (pfile, track_macro_expansion); - unsigned num_args_alloced = 0; - - macro = node->value.macro; - if (macro->paramc) - argc = macro->paramc; - else - argc = 1; - -#define DEFAULT_NUM_TOKENS_PER_MACRO_ARG 50 -#define ARG_TOKENS_EXTENT 1000 - - buff = _cpp_get_buff (pfile, argc * (DEFAULT_NUM_TOKENS_PER_MACRO_ARG - * sizeof (cpp_token *) - + sizeof (macro_arg))); - base_buff = buff; - args = (macro_arg *) buff->base; - memset (args, 0, argc * sizeof (macro_arg)); - buff->cur = (unsigned char *) &args[argc]; - arg = args, argc = 0; - - /* Collect the tokens making up each argument. We don't yet know - how many arguments have been supplied, whether too many or too - few. Hence the slightly bizarre usage of "argc" and "arg". */ - do - { - unsigned int paren_depth = 0; - unsigned int ntokens = 0; - unsigned virt_locs_capacity = DEFAULT_NUM_TOKENS_PER_MACRO_ARG; - num_args_alloced++; - - argc++; - arg->first = (const cpp_token **) buff->cur; - if (track_macro_expansion_p) - { - virt_locs_capacity = DEFAULT_NUM_TOKENS_PER_MACRO_ARG; - arg->virt_locs = XNEWVEC (location_t, - virt_locs_capacity); - } - - for (;;) - { - /* Require space for 2 new tokens (including a CPP_EOF). */ - if ((unsigned char *) &arg->first[ntokens + 2] > buff->limit) - { - buff = _cpp_append_extend_buff (pfile, buff, - ARG_TOKENS_EXTENT - * sizeof (cpp_token *)); - arg->first = (const cpp_token **) buff->cur; - } - if (track_macro_expansion_p - && (ntokens + 2 > virt_locs_capacity)) - { - virt_locs_capacity += ARG_TOKENS_EXTENT; - arg->virt_locs = XRESIZEVEC (location_t, - arg->virt_locs, - virt_locs_capacity); - } - - token = cpp_get_token_1 (pfile, &virt_loc); - - if (token->type == CPP_PADDING) - { - /* Drop leading padding. */ - if (ntokens == 0) - continue; - } - else if (token->type == CPP_OPEN_PAREN) - paren_depth++; - else if (token->type == CPP_CLOSE_PAREN) - { - if (paren_depth-- == 0) - break; - } - else if (token->type == CPP_COMMA) - { - /* A comma does not terminate an argument within - parentheses or as part of a variable argument. */ - if (paren_depth == 0 - && ! (macro->variadic && argc == macro->paramc)) - break; - } - else if (token->type == CPP_EOF - || (token->type == CPP_HASH && token->flags & BOL)) - break; - else if (token->type == CPP_PRAGMA && !(token->flags & PRAGMA_OP)) - { - cpp_token *newtok = _cpp_temp_token (pfile); - - /* CPP_PRAGMA token lives in directive_result, which will - be overwritten on the next directive. 
*/ - *newtok = *token; - token = newtok; - do - { - if (*pragma_buff == NULL - || BUFF_ROOM (*pragma_buff) < sizeof (cpp_token *)) - { - _cpp_buff *next; - if (*pragma_buff == NULL) - *pragma_buff - = _cpp_get_buff (pfile, 32 * sizeof (cpp_token *)); - else - { - next = *pragma_buff; - *pragma_buff - = _cpp_get_buff (pfile, - (BUFF_FRONT (*pragma_buff) - - (*pragma_buff)->base) * 2); - (*pragma_buff)->next = next; - } - } - *(const cpp_token **) BUFF_FRONT (*pragma_buff) = token; - BUFF_FRONT (*pragma_buff) += sizeof (cpp_token *); - if (token->type == CPP_PRAGMA_EOL) - break; - token = cpp_get_token_1 (pfile, &virt_loc); - } - while (token->type != CPP_EOF); - - /* In deferred pragmas parsing_args and prevent_expansion - had been changed, reset it. */ - pfile->state.parsing_args = 2; - pfile->state.prevent_expansion = 1; - - if (token->type == CPP_EOF) - break; - else - continue; - } - set_arg_token (arg, token, virt_loc, - ntokens, MACRO_ARG_TOKEN_NORMAL, - CPP_OPTION (pfile, track_macro_expansion)); - ntokens++; - } - - /* Drop trailing padding. */ - while (ntokens > 0 && arg->first[ntokens - 1]->type == CPP_PADDING) - ntokens--; - - arg->count = ntokens; - /* Append an EOF to mark end-of-argument. */ - set_arg_token (arg, &pfile->endarg, token->src_loc, - ntokens, MACRO_ARG_TOKEN_NORMAL, - CPP_OPTION (pfile, track_macro_expansion)); - - /* Terminate the argument. Excess arguments loop back and - overwrite the final legitimate argument, before failing. */ - if (argc <= macro->paramc) - { - buff->cur = (unsigned char *) &arg->first[ntokens + 1]; - if (argc != macro->paramc) - arg++; - } - } - while (token->type != CPP_CLOSE_PAREN && token->type != CPP_EOF); - - if (token->type == CPP_EOF) - { - /* Unless the EOF is marking the end of an argument, it's a fake - one from the end of a file that _cpp_clean_line will not have - advanced past. */ - if (token == &pfile->endarg) - _cpp_backup_tokens (pfile, 1); - cpp_error (pfile, CPP_DL_ERROR, - "unterminated argument list invoking macro \"%s\"", - NODE_NAME (node)); - } - else - { - /* A single empty argument is counted as no argument. */ - if (argc == 1 && macro->paramc == 0 && args[0].count == 0) - argc = 0; - if (_cpp_arguments_ok (pfile, macro, node, argc)) - { - /* GCC has special semantics for , ## b where b is a varargs - parameter: we remove the comma if b was omitted entirely. - If b was merely an empty argument, the comma is retained. - If the macro takes just one (varargs) parameter, then we - retain the comma only if we are standards conforming. - - If FIRST is NULL replace_args () swallows the comma. */ - if (macro->variadic && (argc < macro->paramc - || (argc == 1 && args[0].count == 0 - && !CPP_OPTION (pfile, std)))) - args[macro->paramc - 1].first = NULL; - if (num_args) - *num_args = num_args_alloced; - return base_buff; - } - } - - /* An error occurred. */ - _cpp_release_buff (pfile, base_buff); - return NULL; -} - -/* Search for an opening parenthesis to the macro of NODE, in such a - way that, if none is found, we don't lose the information in any - intervening padding tokens. If we find the parenthesis, collect - the arguments and return the buffer containing them. PRAGMA_BUFF - argument is the same as in collect_args. If NUM_ARGS is non-NULL, - *NUM_ARGS is set to the number of arguments contained in the - returned buffer. 
*/ -static _cpp_buff * -funlike_invocation_p (cpp_reader *pfile, cpp_hashnode *node, - _cpp_buff **pragma_buff, unsigned *num_args) -{ - const cpp_token *token, *padding = NULL; - - for (;;) - { - token = cpp_get_token (pfile); - if (token->type != CPP_PADDING) - break; - gcc_assert ((token->flags & PREV_WHITE) == 0); - if (padding == NULL - || padding->val.source == NULL - || (!(padding->val.source->flags & PREV_WHITE) - && token->val.source == NULL)) - padding = token; - } - - if (token->type == CPP_OPEN_PAREN) - { - pfile->state.parsing_args = 2; - return collect_args (pfile, node, pragma_buff, num_args); - } - - /* Back up. A CPP_EOF is either an EOF from an argument we're - expanding, or a fake one from lex_direct. We want to backup the - former, but not the latter. We may have skipped padding, in - which case backing up more than one token when expanding macros - is in general too difficult. We re-insert it in its own - context. */ - if (token->type != CPP_EOF || token == &pfile->endarg) - { - _cpp_backup_tokens (pfile, 1); - if (padding) - _cpp_push_token_context (pfile, NULL, padding, 1); - } - - return NULL; -} - -/* Return the real number of tokens in the expansion of MACRO. */ -static inline unsigned int -macro_real_token_count (const cpp_macro *macro) -{ - if (__builtin_expect (!macro->extra_tokens, true)) - return macro->count; - - for (unsigned i = macro->count; i--;) - if (macro->exp.tokens[i].type != CPP_PASTE) - return i + 1; - - return 0; -} - -/* Push the context of a macro with hash entry NODE onto the context - stack. If we can successfully expand the macro, we push a context - containing its yet-to-be-rescanned replacement list and return one. - If there were additionally any unexpanded deferred #pragma - directives among macro arguments, push another context containing - the pragma tokens before the yet-to-be-rescanned replacement list - and return two. Otherwise, we don't push a context and return - zero. LOCATION is the location of the expansion point of the - macro. */ -static int -enter_macro_context (cpp_reader *pfile, cpp_hashnode *node, - const cpp_token *result, location_t location) -{ - /* The presence of a macro invalidates a file's controlling macro. */ - pfile->mi_valid = false; - - pfile->state.angled_headers = false; - - /* From here to when we push the context for the macro later down - this function, we need to flag the fact that we are about to - expand a macro. This is useful when -ftrack-macro-expansion is - turned off. In that case, we need to record the location of the - expansion point of the top-most macro we are about to to expand, - into pfile->invocation_location. But we must not record any such - location once the process of expanding the macro starts; that is, - we must not do that recording between now and later down this - function where set this flag to FALSE. */ - pfile->about_to_expand_macro_p = true; - - if (cpp_user_macro_p (node)) - { - cpp_macro *macro = node->value.macro; - _cpp_buff *pragma_buff = NULL; - - if (macro->fun_like) - { - _cpp_buff *buff; - unsigned num_args = 0; - - pfile->state.prevent_expansion++; - pfile->keep_tokens++; - pfile->state.parsing_args = 1; - buff = funlike_invocation_p (pfile, node, &pragma_buff, - &num_args); - pfile->state.parsing_args = 0; - pfile->keep_tokens--; - pfile->state.prevent_expansion--; - - if (buff == NULL) - { - if (CPP_WTRADITIONAL (pfile) && ! 
node->value.macro->syshdr) - cpp_warning (pfile, CPP_W_TRADITIONAL, - "function-like macro \"%s\" must be used with arguments in traditional C", - NODE_NAME (node)); - - if (pragma_buff) - _cpp_release_buff (pfile, pragma_buff); - - pfile->about_to_expand_macro_p = false; - return 0; - } - - if (macro->paramc > 0) - replace_args (pfile, node, macro, - (macro_arg *) buff->base, - location); - /* Free the memory used by the arguments of this - function-like macro. This memory has been allocated by - funlike_invocation_p and by replace_args. */ - delete_macro_args (buff, num_args); - } - - /* Disable the macro within its expansion. */ - node->flags |= NODE_DISABLED; - - /* Laziness can only affect the expansion tokens of the macro, - not its fun-likeness or parameters. */ - _cpp_maybe_notify_macro_use (pfile, node, location); - if (pfile->cb.used) - pfile->cb.used (pfile, location, node); - - macro->used = 1; - - if (macro->paramc == 0) - { - unsigned tokens_count = macro_real_token_count (macro); - if (CPP_OPTION (pfile, track_macro_expansion)) - { - unsigned int i; - const cpp_token *src = macro->exp.tokens; - const line_map_macro *map; - location_t *virt_locs = NULL; - _cpp_buff *macro_tokens - = tokens_buff_new (pfile, tokens_count, &virt_locs); - - /* Create a macro map to record the locations of the - tokens that are involved in the expansion. LOCATION - is the location of the macro expansion point. */ - map = linemap_enter_macro (pfile->line_table, - node, location, tokens_count); - for (i = 0; i < tokens_count; ++i) - { - tokens_buff_add_token (macro_tokens, virt_locs, - src, src->src_loc, - src->src_loc, map, i); - ++src; - } - push_extended_tokens_context (pfile, node, - macro_tokens, - virt_locs, - (const cpp_token **) - macro_tokens->base, - tokens_count); - } - else - _cpp_push_token_context (pfile, node, macro->exp.tokens, - tokens_count); - num_macro_tokens_counter += tokens_count; - } - - if (pragma_buff) - { - if (!pfile->state.in_directive) - _cpp_push_token_context (pfile, NULL, - padding_token (pfile, result), 1); - do - { - unsigned tokens_count; - _cpp_buff *tail = pragma_buff->next; - pragma_buff->next = NULL; - tokens_count = ((const cpp_token **) BUFF_FRONT (pragma_buff) - - (const cpp_token **) pragma_buff->base); - push_ptoken_context (pfile, NULL, pragma_buff, - (const cpp_token **) pragma_buff->base, - tokens_count); - pragma_buff = tail; - if (!CPP_OPTION (pfile, track_macro_expansion)) - num_macro_tokens_counter += tokens_count; - - } - while (pragma_buff != NULL); - pfile->about_to_expand_macro_p = false; - return 2; - } - - pfile->about_to_expand_macro_p = false; - return 1; - } - - pfile->about_to_expand_macro_p = false; - /* Handle built-in macros and the _Pragma operator. */ - { - location_t expand_loc; - - if (/* The top-level macro invocation that triggered the expansion - we are looking at is with a function-like user macro ... */ - cpp_fun_like_macro_p (pfile->top_most_macro_node) - /* ... and we are tracking the macro expansion. */ - && CPP_OPTION (pfile, track_macro_expansion)) - /* Then the location of the end of the macro invocation is the - location of the expansion point of this macro. */ - expand_loc = location; - else - /* Otherwise, the location of the end of the macro invocation is - the location of the expansion point of that top-level macro - invocation. 
*/ - expand_loc = pfile->invocation_location; - - return builtin_macro (pfile, node, location, expand_loc); - } -} - -/* De-allocate the memory used by BUFF which is an array of instances - of macro_arg. NUM_ARGS is the number of instances of macro_arg - present in BUFF. */ -static void -delete_macro_args (_cpp_buff *buff, unsigned num_args) -{ - macro_arg *macro_args; - unsigned i; - - if (buff == NULL) - return; - - macro_args = (macro_arg *) buff->base; - - /* Walk instances of macro_arg to free their expanded tokens as well - as their macro_arg::virt_locs members. */ - for (i = 0; i < num_args; ++i) - { - if (macro_args[i].expanded) - { - free (macro_args[i].expanded); - macro_args[i].expanded = NULL; - } - if (macro_args[i].virt_locs) - { - free (macro_args[i].virt_locs); - macro_args[i].virt_locs = NULL; - } - if (macro_args[i].expanded_virt_locs) - { - free (macro_args[i].expanded_virt_locs); - macro_args[i].expanded_virt_locs = NULL; - } - } - _cpp_free_buff (buff); -} - -/* Set the INDEXth token of the macro argument ARG. TOKEN is the token - to set, LOCATION is its virtual location. "Virtual" location means - the location that encodes loci across macro expansion. Otherwise - it has to be TOKEN->SRC_LOC. KIND is the kind of tokens the - argument ARG is supposed to contain. Note that ARG must be - tailored so that it has enough room to contain INDEX + 1 numbers of - tokens, at least. */ -static void -set_arg_token (macro_arg *arg, const cpp_token *token, - location_t location, size_t index, - enum macro_arg_token_kind kind, - bool track_macro_exp_p) -{ - const cpp_token **token_ptr; - location_t *loc = NULL; - - token_ptr = - arg_token_ptr_at (arg, index, kind, - track_macro_exp_p ? &loc : NULL); - *token_ptr = token; - - if (loc != NULL) - { - /* We can't set the location of a stringified argument - token and we can't set any location if we aren't tracking - macro expansion locations. */ - gcc_checking_assert (kind != MACRO_ARG_TOKEN_STRINGIFIED - && track_macro_exp_p); - *loc = location; - } -} - -/* Get the pointer to the location of the argument token of the - function-like macro argument ARG. This function must be called - only when we -ftrack-macro-expansion is on. */ -static const location_t * -get_arg_token_location (const macro_arg *arg, - enum macro_arg_token_kind kind) -{ - const location_t *loc = NULL; - const cpp_token **token_ptr = - arg_token_ptr_at (arg, 0, kind, (location_t **) &loc); - - if (token_ptr == NULL) - return NULL; - - return loc; -} - -/* Return the pointer to the INDEXth token of the macro argument ARG. - KIND specifies the kind of token the macro argument ARG contains. - If VIRT_LOCATION is non NULL, *VIRT_LOCATION is set to the address - of the virtual location of the returned token if the - -ftrack-macro-expansion flag is on; otherwise, it's set to the - spelling location of the returned token. */ -static const cpp_token ** -arg_token_ptr_at (const macro_arg *arg, size_t index, - enum macro_arg_token_kind kind, - location_t **virt_location) -{ - const cpp_token **tokens_ptr = NULL; - - switch (kind) - { - case MACRO_ARG_TOKEN_NORMAL: - tokens_ptr = arg->first; - break; - case MACRO_ARG_TOKEN_STRINGIFIED: - tokens_ptr = (const cpp_token **) &arg->stringified; - break; - case MACRO_ARG_TOKEN_EXPANDED: - tokens_ptr = arg->expanded; - break; - } - - if (tokens_ptr == NULL) - /* This can happen for e.g, an empty token argument to a - funtion-like macro. 
*/ - return tokens_ptr; - - if (virt_location) - { - if (kind == MACRO_ARG_TOKEN_NORMAL) - *virt_location = &arg->virt_locs[index]; - else if (kind == MACRO_ARG_TOKEN_EXPANDED) - *virt_location = &arg->expanded_virt_locs[index]; - else if (kind == MACRO_ARG_TOKEN_STRINGIFIED) - *virt_location = - (location_t *) &tokens_ptr[index]->src_loc; - } - return &tokens_ptr[index]; -} - -/* Initialize an iterator so that it iterates over the tokens of a - function-like macro argument. KIND is the kind of tokens we want - ITER to iterate over. TOKEN_PTR points the first token ITER will - iterate over. */ -static void -macro_arg_token_iter_init (macro_arg_token_iter *iter, - bool track_macro_exp_p, - enum macro_arg_token_kind kind, - const macro_arg *arg, - const cpp_token **token_ptr) -{ - iter->track_macro_exp_p = track_macro_exp_p; - iter->kind = kind; - iter->token_ptr = token_ptr; - /* Unconditionally initialize this so that the compiler doesn't warn - about iter->location_ptr being possibly uninitialized later after - this code has been inlined somewhere. */ - iter->location_ptr = NULL; - if (track_macro_exp_p) - iter->location_ptr = get_arg_token_location (arg, kind); -#if CHECKING_P - iter->num_forwards = 0; - if (track_macro_exp_p - && token_ptr != NULL - && iter->location_ptr == NULL) - abort (); -#endif -} - -/* Move the iterator one token forward. Note that if IT was - initialized on an argument that has a stringified token, moving it - forward doesn't make sense as a stringified token is essentially one - string. */ -static void -macro_arg_token_iter_forward (macro_arg_token_iter *it) -{ - switch (it->kind) - { - case MACRO_ARG_TOKEN_NORMAL: - case MACRO_ARG_TOKEN_EXPANDED: - it->token_ptr++; - if (it->track_macro_exp_p) - it->location_ptr++; - break; - case MACRO_ARG_TOKEN_STRINGIFIED: -#if CHECKING_P - if (it->num_forwards > 0) - abort (); -#endif - break; - } - -#if CHECKING_P - it->num_forwards++; -#endif -} - -/* Return the token pointed to by the iterator. */ -static const cpp_token * -macro_arg_token_iter_get_token (const macro_arg_token_iter *it) -{ -#if CHECKING_P - if (it->kind == MACRO_ARG_TOKEN_STRINGIFIED - && it->num_forwards > 0) - abort (); -#endif - if (it->token_ptr == NULL) - return NULL; - return *it->token_ptr; -} - -/* Return the location of the token pointed to by the iterator.*/ -static location_t -macro_arg_token_iter_get_location (const macro_arg_token_iter *it) -{ -#if CHECKING_P - if (it->kind == MACRO_ARG_TOKEN_STRINGIFIED - && it->num_forwards > 0) - abort (); -#endif - if (it->track_macro_exp_p) - return *it->location_ptr; - else - return (*it->token_ptr)->src_loc; -} - -/* Return the index of a token [resulting from macro expansion] inside - the total list of tokens resulting from a given macro - expansion. The index can be different depending on whether if we - want each tokens resulting from function-like macro arguments - expansion to have a different location or not. - - E.g, consider this function-like macro: - - #define M(x) x - 3 - - Then consider us "calling" it (and thus expanding it) like: - - M(1+4) - - It will be expanded into: - - 1+4-3 - - Let's consider the case of the token '4'. - - Its index can be 2 (it's the third token of the set of tokens - resulting from the expansion) or it can be 0 if we consider that - all tokens resulting from the expansion of the argument "1+2" have - the same index, which is 0. In this later case, the index of token - '-' would then be 1 and the index of token '3' would be 2. 
- - The later case is useful to use less memory e.g, for the case of - the user using the option -ftrack-macro-expansion=1. - - ABSOLUTE_TOKEN_INDEX is the index of the macro argument token we - are interested in. CUR_REPLACEMENT_TOKEN is the token of the macro - parameter (inside the macro replacement list) that corresponds to - the macro argument for which ABSOLUTE_TOKEN_INDEX is a token index - of. - - If we refer to the example above, for the '4' argument token, - ABSOLUTE_TOKEN_INDEX would be set to 2, and CUR_REPLACEMENT_TOKEN - would be set to the token 'x', in the replacement list "x - 3" of - macro M. - - This is a subroutine of replace_args. */ -inline static unsigned -expanded_token_index (cpp_reader *pfile, cpp_macro *macro, - const cpp_token *cur_replacement_token, - unsigned absolute_token_index) -{ - if (CPP_OPTION (pfile, track_macro_expansion) > 1) - return absolute_token_index; - return cur_replacement_token - macro->exp.tokens; -} - -/* Copy whether PASTE_LEFT is set from SRC to *PASTE_FLAG. */ - -static void -copy_paste_flag (cpp_reader *pfile, const cpp_token **paste_flag, - const cpp_token *src) -{ - cpp_token *token = _cpp_temp_token (pfile); - token->type = (*paste_flag)->type; - token->val = (*paste_flag)->val; - if (src->flags & PASTE_LEFT) - token->flags = (*paste_flag)->flags | PASTE_LEFT; - else - token->flags = (*paste_flag)->flags & ~PASTE_LEFT; - *paste_flag = token; -} - -/* True IFF the last token emitted into BUFF (if any) is PTR. */ - -static bool -last_token_is (_cpp_buff *buff, const cpp_token **ptr) -{ - return (ptr && tokens_buff_last_token_ptr (buff) == ptr); -} - -/* Replace the parameters in a function-like macro of NODE with the - actual ARGS, and place the result in a newly pushed token context. - Expand each argument before replacing, unless it is operated upon - by the # or ## operators. EXPANSION_POINT_LOC is the location of - the expansion point of the macro. E.g, the location of the - function-like macro invocation. */ -static void -replace_args (cpp_reader *pfile, cpp_hashnode *node, cpp_macro *macro, - macro_arg *args, location_t expansion_point_loc) -{ - unsigned int i, total; - const cpp_token *src, *limit; - const cpp_token **first = NULL; - macro_arg *arg; - _cpp_buff *buff = NULL; - location_t *virt_locs = NULL; - unsigned int exp_count; - const line_map_macro *map = NULL; - int track_macro_exp; - - /* First, fully macro-expand arguments, calculating the number of - tokens in the final expansion as we go. The ordering of the if - statements below is subtle; we must handle stringification before - pasting. */ - - /* EXP_COUNT is the number of tokens in the macro replacement - list. TOTAL is the number of tokens /after/ macro parameters - have been replaced by their arguments. */ - exp_count = macro_real_token_count (macro); - total = exp_count; - limit = macro->exp.tokens + exp_count; - - for (src = macro->exp.tokens; src < limit; src++) - if (src->type == CPP_MACRO_ARG) - { - /* Leading and trailing padding tokens. */ - total += 2; - /* Account for leading and padding tokens in exp_count too. - This is going to be important later down this function, - when we want to handle the case of (track_macro_exp < - 2). */ - exp_count += 2; - - /* We have an argument. If it is not being stringified or - pasted it is macro-replaced before insertion. 
*/ - arg = &args[src->val.macro_arg.arg_no - 1]; - - if (src->flags & STRINGIFY_ARG) - { - if (!arg->stringified) - arg->stringified = stringify_arg (pfile, arg->first, arg->count); - } - else if ((src->flags & PASTE_LEFT) - || (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT))) - total += arg->count - 1; - else - { - if (!arg->expanded) - expand_arg (pfile, arg); - total += arg->expanded_count - 1; - } - } - - /* When the compiler is called with the -ftrack-macro-expansion - flag, we need to keep track of the location of each token that - results from macro expansion. - - A token resulting from macro expansion is not a new token. It is - simply the same token as the token coming from the macro - definition. The new things that are allocated are the buffer - that holds the tokens resulting from macro expansion and a new - location that records many things like the locus of the expansion - point as well as the original locus inside the definition of the - macro. This location is called a virtual location. - - So the buffer BUFF holds a set of cpp_token*, and the buffer - VIRT_LOCS holds the virtual locations of the tokens held by BUFF. - - Both of these two buffers are going to be hung off of the macro - context, when the latter is pushed. The memory allocated to - store the tokens and their locations is going to be freed once - the context of macro expansion is popped. - - As far as tokens are concerned, the memory overhead of - -ftrack-macro-expansion is proportional to the number of - macros that get expanded multiplied by sizeof (location_t). - The good news is that extra memory gets freed when the macro - context is freed, i.e shortly after the macro got expanded. */ - - /* Is the -ftrack-macro-expansion flag in effect? */ - track_macro_exp = CPP_OPTION (pfile, track_macro_expansion); - - /* Now allocate memory space for tokens and locations resulting from - the macro expansion, copy the tokens and replace the arguments. - This memory must be freed when the context of the macro MACRO is - popped. */ - buff = tokens_buff_new (pfile, total, track_macro_exp ? &virt_locs : NULL); - - first = (const cpp_token **) buff->base; - - /* Create a macro map to record the locations of the tokens that are - involved in the expansion. Note that the expansion point is set - to the location of the closing parenthesis. Otherwise, the - subsequent map created for the first token that comes after the - macro map might have a wrong line number. That would lead to - tokens with wrong line numbers after the macro expansion. This - adds up to the memory overhead of the -ftrack-macro-expansion - flag; for every macro that is expanded, a "macro map" is - created. */ - if (track_macro_exp) - { - int num_macro_tokens = total; - if (track_macro_exp < 2) - /* Then the number of macro tokens won't take in account the - fact that function-like macro arguments can expand to - multiple tokens. This is to save memory at the expense of - accuracy. - - Suppose we have #define SQUARE(A) A * A - - And then we do SQUARE(2+3) - - Then the tokens 2, +, 3, will have the same location, - saying they come from the expansion of the argument A. 
*/ - num_macro_tokens = exp_count; - map = linemap_enter_macro (pfile->line_table, node, - expansion_point_loc, - num_macro_tokens); - } - i = 0; - vaopt_state vaopt_tracker (pfile, macro->variadic, &args[macro->paramc - 1]); - const cpp_token **vaopt_start = NULL; - for (src = macro->exp.tokens; src < limit; src++) - { - unsigned int arg_tokens_count; - macro_arg_token_iter from; - const cpp_token **paste_flag = NULL; - const cpp_token **tmp_token_ptr; - - /* __VA_OPT__ handling. */ - vaopt_state::update_type vostate = vaopt_tracker.update (src); - if (__builtin_expect (vostate != vaopt_state::INCLUDE, false)) - { - if (vostate == vaopt_state::BEGIN) - { - /* Padding on the left of __VA_OPT__ (unless RHS of ##). */ - if (src != macro->exp.tokens && !(src[-1].flags & PASTE_LEFT)) - { - const cpp_token *t = padding_token (pfile, src); - unsigned index = expanded_token_index (pfile, macro, src, i); - /* Allocate a virtual location for the padding token and - append the token and its location to BUFF and - VIRT_LOCS. */ - tokens_buff_add_token (buff, virt_locs, t, - t->src_loc, t->src_loc, - map, index); - } - vaopt_start = tokens_buff_last_token_ptr (buff); - } - else if (vostate == vaopt_state::END) - { - const cpp_token **start = vaopt_start; - vaopt_start = NULL; - - paste_flag = tokens_buff_last_token_ptr (buff); - - if (vaopt_tracker.stringify ()) - { - unsigned int count - = start ? paste_flag - start : tokens_buff_count (buff); - const cpp_token **first - = start ? start + 1 - : (const cpp_token **) (buff->base); - unsigned int i, j; - - /* Paste any tokens that need to be pasted before calling - stringify_arg, because stringify_arg uses pfile->u_buff - which paste_tokens can use as well. */ - for (i = 0, j = 0; i < count; i++, j++) - { - const cpp_token *token = first[i]; - - if (token->flags & PASTE_LEFT) - { - location_t virt_loc = pfile->invocation_location; - const cpp_token *rhs; - do - { - if (i == count) - abort (); - rhs = first[++i]; - if (!paste_tokens (pfile, virt_loc, &token, rhs)) - { - --i; - break; - } - } - while (rhs->flags & PASTE_LEFT); - } - - first[j] = token; - } - if (j != i) - { - while (i-- != j) - tokens_buff_remove_last_token (buff); - count = j; - } - - const cpp_token *t = stringify_arg (pfile, first, count); - while (count--) - tokens_buff_remove_last_token (buff); - if (src->flags & PASTE_LEFT) - copy_paste_flag (pfile, &t, src); - tokens_buff_add_token (buff, virt_locs, - t, t->src_loc, t->src_loc, - NULL, 0); - continue; - } - if (start && paste_flag == start && (*start)->flags & PASTE_LEFT) - /* If __VA_OPT__ expands to nothing (either because __VA_ARGS__ - is empty or because it is __VA_OPT__() ), drop PASTE_LEFT - flag from previous token. */ - copy_paste_flag (pfile, start, &pfile->avoid_paste); - if (src->flags & PASTE_LEFT) - { - /* Don't avoid paste after all. */ - while (paste_flag && paste_flag != start - && *paste_flag == &pfile->avoid_paste) - { - tokens_buff_remove_last_token (buff); - paste_flag = tokens_buff_last_token_ptr (buff); - } - - /* With a non-empty __VA_OPT__ on the LHS of ##, the last - token should be flagged PASTE_LEFT. */ - if (paste_flag && (*paste_flag)->type != CPP_PADDING) - copy_paste_flag (pfile, paste_flag, src); - } - else - { - /* Otherwise, avoid paste on RHS, __VA_OPT__(c)d or - __VA_OPT__(c)__VA_OPT__(d). 
*/ - const cpp_token *t = &pfile->avoid_paste; - tokens_buff_add_token (buff, virt_locs, - t, t->src_loc, t->src_loc, - NULL, 0); - } - } - continue; - } - - if (src->type != CPP_MACRO_ARG) - { - /* Allocate a virtual location for token SRC, and add that - token and its virtual location into the buffers BUFF and - VIRT_LOCS. */ - unsigned index = expanded_token_index (pfile, macro, src, i); - tokens_buff_add_token (buff, virt_locs, src, - src->src_loc, src->src_loc, - map, index); - i += 1; - continue; - } - - paste_flag = 0; - arg = &args[src->val.macro_arg.arg_no - 1]; - /* SRC is a macro parameter that we need to replace with its - corresponding argument. So at some point we'll need to - iterate over the tokens of the macro argument and copy them - into the "place" now holding the correspondig macro - parameter. We are going to use the iterator type - macro_argo_token_iter to handle that iterating. The 'if' - below is to initialize the iterator depending on the type of - tokens the macro argument has. It also does some adjustment - related to padding tokens and some pasting corner cases. */ - if (src->flags & STRINGIFY_ARG) - { - arg_tokens_count = 1; - macro_arg_token_iter_init (&from, - CPP_OPTION (pfile, - track_macro_expansion), - MACRO_ARG_TOKEN_STRINGIFIED, - arg, &arg->stringified); - } - else if (src->flags & PASTE_LEFT) - { - arg_tokens_count = arg->count; - macro_arg_token_iter_init (&from, - CPP_OPTION (pfile, - track_macro_expansion), - MACRO_ARG_TOKEN_NORMAL, - arg, arg->first); - } - else if (src != macro->exp.tokens && (src[-1].flags & PASTE_LEFT)) - { - int num_toks; - arg_tokens_count = arg->count; - macro_arg_token_iter_init (&from, - CPP_OPTION (pfile, - track_macro_expansion), - MACRO_ARG_TOKEN_NORMAL, - arg, arg->first); - - num_toks = tokens_buff_count (buff); - - if (num_toks != 0) - { - /* So the current parameter token is pasted to the previous - token in the replacement list. Let's look at what - we have as previous and current arguments. */ - - /* This is the previous argument's token ... */ - tmp_token_ptr = tokens_buff_last_token_ptr (buff); - - if ((*tmp_token_ptr)->type == CPP_COMMA - && macro->variadic - && src->val.macro_arg.arg_no == macro->paramc) - { - /* ... which is a comma; and the current parameter - is the last parameter of a variadic function-like - macro. If the argument to the current last - parameter is NULL, then swallow the comma, - otherwise drop the paste flag. */ - if (macro_arg_token_iter_get_token (&from) == NULL) - tokens_buff_remove_last_token (buff); - else - paste_flag = tmp_token_ptr; - } - /* Remove the paste flag if the RHS is a placemarker. */ - else if (arg_tokens_count == 0) - paste_flag = tmp_token_ptr; - } - } - else - { - arg_tokens_count = arg->expanded_count; - macro_arg_token_iter_init (&from, - CPP_OPTION (pfile, - track_macro_expansion), - MACRO_ARG_TOKEN_EXPANDED, - arg, arg->expanded); - - if (last_token_is (buff, vaopt_start)) - { - /* We're expanding an arg at the beginning of __VA_OPT__. - Skip padding. */ - while (arg_tokens_count) - { - const cpp_token *t = macro_arg_token_iter_get_token (&from); - if (t->type != CPP_PADDING) - break; - macro_arg_token_iter_forward (&from); - --arg_tokens_count; - } - } - } - - /* Padding on the left of an argument (unless RHS of ##). 
*/ - if ((!pfile->state.in_directive || pfile->state.directive_wants_padding) - && src != macro->exp.tokens - && !(src[-1].flags & PASTE_LEFT) - && !last_token_is (buff, vaopt_start)) - { - const cpp_token *t = padding_token (pfile, src); - unsigned index = expanded_token_index (pfile, macro, src, i); - /* Allocate a virtual location for the padding token and - append the token and its location to BUFF and - VIRT_LOCS. */ - tokens_buff_add_token (buff, virt_locs, t, - t->src_loc, t->src_loc, - map, index); - } - - if (arg_tokens_count) - { - /* So now we've got the number of tokens that make up the - argument that is going to replace the current parameter - in the macro's replacement list. */ - unsigned int j; - for (j = 0; j < arg_tokens_count; ++j) - { - /* So if track_macro_exp is < 2, the user wants to - save extra memory while tracking macro expansion - locations. So in that case here is what we do: - - Suppose we have #define SQUARE(A) A * A - - And then we do SQUARE(2+3) - - Then the tokens 2, +, 3, will have the same location, - saying they come from the expansion of the argument - A. - - So that means we are going to ignore the COUNT tokens - resulting from the expansion of the current macro - argument. In other words all the ARG_TOKENS_COUNT tokens - resulting from the expansion of the macro argument will - have the index I. Normally, each of those tokens should - have index I+J. */ - unsigned token_index = i; - unsigned index; - if (track_macro_exp > 1) - token_index += j; - - index = expanded_token_index (pfile, macro, src, token_index); - const cpp_token *tok = macro_arg_token_iter_get_token (&from); - tokens_buff_add_token (buff, virt_locs, tok, - macro_arg_token_iter_get_location (&from), - src->src_loc, map, index); - macro_arg_token_iter_forward (&from); - } - - /* With a non-empty argument on the LHS of ##, the last - token should be flagged PASTE_LEFT. */ - if (src->flags & PASTE_LEFT) - paste_flag - = (const cpp_token **) tokens_buff_last_token_ptr (buff); - } - else if (CPP_PEDANTIC (pfile) && ! CPP_OPTION (pfile, c99) - && ! macro->syshdr && ! _cpp_in_system_header (pfile)) - { - if (CPP_OPTION (pfile, cplusplus)) - cpp_pedwarning (pfile, CPP_W_PEDANTIC, - "invoking macro %s argument %d: " - "empty macro arguments are undefined" - " in ISO C++98", - NODE_NAME (node), src->val.macro_arg.arg_no); - else if (CPP_OPTION (pfile, cpp_warn_c90_c99_compat)) - cpp_pedwarning (pfile, - CPP_OPTION (pfile, cpp_warn_c90_c99_compat) > 0 - ? CPP_W_C90_C99_COMPAT : CPP_W_PEDANTIC, - "invoking macro %s argument %d: " - "empty macro arguments are undefined" - " in ISO C90", - NODE_NAME (node), src->val.macro_arg.arg_no); - } - else if (CPP_OPTION (pfile, cpp_warn_c90_c99_compat) > 0 - && ! CPP_OPTION (pfile, cplusplus) - && ! macro->syshdr && ! _cpp_in_system_header (pfile)) - cpp_warning (pfile, CPP_W_C90_C99_COMPAT, - "invoking macro %s argument %d: " - "empty macro arguments are undefined" - " in ISO C90", - NODE_NAME (node), src->val.macro_arg.arg_no); - - /* Avoid paste on RHS (even case count == 0). */ - if (!pfile->state.in_directive && !(src->flags & PASTE_LEFT)) - { - const cpp_token *t = &pfile->avoid_paste; - tokens_buff_add_token (buff, virt_locs, - t, t->src_loc, t->src_loc, - NULL, 0); - } - - /* Add a new paste flag, or remove an unwanted one. 
*/ - if (paste_flag) - copy_paste_flag (pfile, paste_flag, src); - - i += arg_tokens_count; - } - - if (track_macro_exp) - push_extended_tokens_context (pfile, node, buff, virt_locs, first, - tokens_buff_count (buff)); - else - push_ptoken_context (pfile, node, buff, first, - tokens_buff_count (buff)); - - num_macro_tokens_counter += tokens_buff_count (buff); -} - -/* Return a special padding token, with padding inherited from SOURCE. */ -static const cpp_token * -padding_token (cpp_reader *pfile, const cpp_token *source) -{ - cpp_token *result = _cpp_temp_token (pfile); - - result->type = CPP_PADDING; - - /* Data in GCed data structures cannot be made const so far, so we - need a cast here. */ - result->val.source = (cpp_token *) source; - result->flags = 0; - return result; -} - -/* Get a new uninitialized context. Create a new one if we cannot - re-use an old one. */ -static cpp_context * -next_context (cpp_reader *pfile) -{ - cpp_context *result = pfile->context->next; - - if (result == 0) - { - result = XNEW (cpp_context); - memset (result, 0, sizeof (cpp_context)); - result->prev = pfile->context; - result->next = 0; - pfile->context->next = result; - } - - pfile->context = result; - return result; -} - -/* Push a list of pointers to tokens. */ -static void -push_ptoken_context (cpp_reader *pfile, cpp_hashnode *macro, _cpp_buff *buff, - const cpp_token **first, unsigned int count) -{ - cpp_context *context = next_context (pfile); - - context->tokens_kind = TOKENS_KIND_INDIRECT; - context->c.macro = macro; - context->buff = buff; - FIRST (context).ptoken = first; - LAST (context).ptoken = first + count; -} - -/* Push a list of tokens. - - A NULL macro means that we should continue the current macro - expansion, in essence. That means that if we are currently in a - macro expansion context, we'll make the new pfile->context refer to - the current macro. */ -void -_cpp_push_token_context (cpp_reader *pfile, cpp_hashnode *macro, - const cpp_token *first, unsigned int count) -{ - cpp_context *context; - - if (macro == NULL) - macro = macro_of_context (pfile->context); - - context = next_context (pfile); - context->tokens_kind = TOKENS_KIND_DIRECT; - context->c.macro = macro; - context->buff = NULL; - FIRST (context).token = first; - LAST (context).token = first + count; -} - -/* Build a context containing a list of tokens as well as their - virtual locations and push it. TOKENS_BUFF is the buffer that - contains the tokens pointed to by FIRST. If TOKENS_BUFF is - non-NULL, it means that the context owns it, meaning that - _cpp_pop_context will free it as well as VIRT_LOCS_BUFF that - contains the virtual locations. - - A NULL macro means that we should continue the current macro - expansion, in essence. That means that if we are currently in a - macro expansion context, we'll make the new pfile->context refer to - the current macro. 
*/ -static void -push_extended_tokens_context (cpp_reader *pfile, - cpp_hashnode *macro, - _cpp_buff *token_buff, - location_t *virt_locs, - const cpp_token **first, - unsigned int count) -{ - cpp_context *context; - macro_context *m; - - if (macro == NULL) - macro = macro_of_context (pfile->context); - - context = next_context (pfile); - context->tokens_kind = TOKENS_KIND_EXTENDED; - context->buff = token_buff; - - m = XNEW (macro_context); - m->macro_node = macro; - m->virt_locs = virt_locs; - m->cur_virt_loc = virt_locs; - context->c.mc = m; - FIRST (context).ptoken = first; - LAST (context).ptoken = first + count; -} - -/* Push a traditional macro's replacement text. */ -void -_cpp_push_text_context (cpp_reader *pfile, cpp_hashnode *macro, - const uchar *start, size_t len) -{ - cpp_context *context = next_context (pfile); - - context->tokens_kind = TOKENS_KIND_DIRECT; - context->c.macro = macro; - context->buff = NULL; - CUR (context) = start; - RLIMIT (context) = start + len; - macro->flags |= NODE_DISABLED; -} - -/* Creates a buffer that holds tokens a.k.a "token buffer", usually - for the purpose of storing them on a cpp_context. If VIRT_LOCS is - non-null (which means that -ftrack-macro-expansion is on), - *VIRT_LOCS is set to a newly allocated buffer that is supposed to - hold the virtual locations of the tokens resulting from macro - expansion. */ -static _cpp_buff* -tokens_buff_new (cpp_reader *pfile, size_t len, - location_t **virt_locs) -{ - size_t tokens_size = len * sizeof (cpp_token *); - size_t locs_size = len * sizeof (location_t); - - if (virt_locs != NULL) - *virt_locs = XNEWVEC (location_t, locs_size); - return _cpp_get_buff (pfile, tokens_size); -} - -/* Returns the number of tokens contained in a token buffer. The - buffer holds a set of cpp_token*. */ -static size_t -tokens_buff_count (_cpp_buff *buff) -{ - return (BUFF_FRONT (buff) - buff->base) / sizeof (cpp_token *); -} - -/* Return a pointer to the last token contained in the token buffer - BUFF. */ -static const cpp_token ** -tokens_buff_last_token_ptr (_cpp_buff *buff) -{ - if (BUFF_FRONT (buff) == buff->base) - return NULL; - return &((const cpp_token **) BUFF_FRONT (buff))[-1]; -} - -/* Remove the last token contained in the token buffer TOKENS_BUFF. - If VIRT_LOCS_BUFF is non-NULL, it should point at the buffer - containing the virtual locations of the tokens in TOKENS_BUFF; in - which case the function updates that buffer as well. */ -static inline void -tokens_buff_remove_last_token (_cpp_buff *tokens_buff) - -{ - if (BUFF_FRONT (tokens_buff) > tokens_buff->base) - BUFF_FRONT (tokens_buff) = - (unsigned char *) &((cpp_token **) BUFF_FRONT (tokens_buff))[-1]; -} - -/* Insert a token into the token buffer at the position pointed to by - DEST. Note that the buffer is not enlarged so the previous token - that was at *DEST is overwritten. VIRT_LOC_DEST, if non-null, - means -ftrack-macro-expansion is effect; it then points to where to - insert the virtual location of TOKEN. TOKEN is the token to - insert. VIRT_LOC is the virtual location of the token, i.e, the - location possibly encoding its locus across macro expansion. If - TOKEN is an argument of a function-like macro (inside a macro - replacement list), PARM_DEF_LOC is the spelling location of the - macro parameter that TOKEN is replacing, in the replacement list of - the macro. If TOKEN is not an argument of a function-like macro or - if it doesn't come from a macro expansion, then VIRT_LOC can just - be set to the same value as PARM_DEF_LOC. 
If MAP is non null, it - means TOKEN comes from a macro expansion and MAP is the macro map - associated to the macro. MACRO_TOKEN_INDEX points to the index of - the token in the macro map; it is not considered if MAP is NULL. - - Upon successful completion this function returns the a pointer to - the position of the token coming right after the insertion - point. */ -static inline const cpp_token ** -tokens_buff_put_token_to (const cpp_token **dest, - location_t *virt_loc_dest, - const cpp_token *token, - location_t virt_loc, - location_t parm_def_loc, - const line_map_macro *map, - unsigned int macro_token_index) -{ - location_t macro_loc = virt_loc; - const cpp_token **result; - - if (virt_loc_dest) - { - /* -ftrack-macro-expansion is on. */ - if (map) - macro_loc = linemap_add_macro_token (map, macro_token_index, - virt_loc, parm_def_loc); - *virt_loc_dest = macro_loc; - } - *dest = token; - result = &dest[1]; - - return result; -} - -/* Adds a token at the end of the tokens contained in BUFFER. Note - that this function doesn't enlarge BUFFER when the number of tokens - reaches BUFFER's size; it aborts in that situation. - - TOKEN is the token to append. VIRT_LOC is the virtual location of - the token, i.e, the location possibly encoding its locus across - macro expansion. If TOKEN is an argument of a function-like macro - (inside a macro replacement list), PARM_DEF_LOC is the location of - the macro parameter that TOKEN is replacing. If TOKEN doesn't come - from a macro expansion, then VIRT_LOC can just be set to the same - value as PARM_DEF_LOC. If MAP is non null, it means TOKEN comes - from a macro expansion and MAP is the macro map associated to the - macro. MACRO_TOKEN_INDEX points to the index of the token in the - macro map; It is not considered if MAP is NULL. If VIRT_LOCS is - non-null, it means -ftrack-macro-expansion is on; in which case - this function adds the virtual location DEF_LOC to the VIRT_LOCS - array, at the same index as the one of TOKEN in BUFFER. Upon - successful completion this function returns the a pointer to the - position of the token coming right after the insertion point. */ -static const cpp_token ** -tokens_buff_add_token (_cpp_buff *buffer, - location_t *virt_locs, - const cpp_token *token, - location_t virt_loc, - location_t parm_def_loc, - const line_map_macro *map, - unsigned int macro_token_index) -{ - const cpp_token **result; - location_t *virt_loc_dest = NULL; - unsigned token_index = - (BUFF_FRONT (buffer) - buffer->base) / sizeof (cpp_token *); - - /* Abort if we pass the end the buffer. */ - if (BUFF_FRONT (buffer) > BUFF_LIMIT (buffer)) - abort (); - - if (virt_locs != NULL) - virt_loc_dest = &virt_locs[token_index]; - - result = - tokens_buff_put_token_to ((const cpp_token **) BUFF_FRONT (buffer), - virt_loc_dest, token, virt_loc, parm_def_loc, - map, macro_token_index); - - BUFF_FRONT (buffer) = (unsigned char *) result; - return result; -} - -/* Allocate space for the function-like macro argument ARG to store - the tokens resulting from the macro-expansion of the tokens that - make up ARG itself. That space is allocated in ARG->expanded and - needs to be freed using free. 
*/ -static void -alloc_expanded_arg_mem (cpp_reader *pfile, macro_arg *arg, size_t capacity) -{ - gcc_checking_assert (arg->expanded == NULL - && arg->expanded_virt_locs == NULL); - - arg->expanded = XNEWVEC (const cpp_token *, capacity); - if (CPP_OPTION (pfile, track_macro_expansion)) - arg->expanded_virt_locs = XNEWVEC (location_t, capacity); - -} - -/* If necessary, enlarge ARG->expanded to so that it can contain SIZE - tokens. */ -static void -ensure_expanded_arg_room (cpp_reader *pfile, macro_arg *arg, - size_t size, size_t *expanded_capacity) -{ - if (size <= *expanded_capacity) - return; - - size *= 2; - - arg->expanded = - XRESIZEVEC (const cpp_token *, arg->expanded, size); - *expanded_capacity = size; - - if (CPP_OPTION (pfile, track_macro_expansion)) - { - if (arg->expanded_virt_locs == NULL) - arg->expanded_virt_locs = XNEWVEC (location_t, size); - else - arg->expanded_virt_locs = XRESIZEVEC (location_t, - arg->expanded_virt_locs, - size); - } -} - -/* - Expand an argument ARG before replacing parameters in a - function-like macro. This works by pushing a context with the - argument's tokens, and then expanding that into a temporary buffer - as if it were a normal part of the token stream. collect_args() - has terminated the argument's tokens with a CPP_EOF so that we know - when we have fully expanded the argument. - */ -static void -expand_arg (cpp_reader *pfile, macro_arg *arg) -{ - size_t capacity; - bool saved_warn_trad; - bool track_macro_exp_p = CPP_OPTION (pfile, track_macro_expansion); - bool saved_ignore__Pragma; - - if (arg->count == 0 - || arg->expanded != NULL) - return; - - /* Don't warn about funlike macros when pre-expanding. */ - saved_warn_trad = CPP_WTRADITIONAL (pfile); - CPP_WTRADITIONAL (pfile) = 0; - - /* Loop, reading in the tokens of the argument. */ - capacity = 256; - alloc_expanded_arg_mem (pfile, arg, capacity); - - if (track_macro_exp_p) - push_extended_tokens_context (pfile, NULL, NULL, - arg->virt_locs, - arg->first, - arg->count + 1); - else - push_ptoken_context (pfile, NULL, NULL, - arg->first, arg->count + 1); - - saved_ignore__Pragma = pfile->state.ignore__Pragma; - pfile->state.ignore__Pragma = 1; - - for (;;) - { - const cpp_token *token; - location_t location; - - ensure_expanded_arg_room (pfile, arg, arg->expanded_count + 1, - &capacity); - - token = cpp_get_token_1 (pfile, &location); - - if (token->type == CPP_EOF) - break; - - set_arg_token (arg, token, location, - arg->expanded_count, MACRO_ARG_TOKEN_EXPANDED, - CPP_OPTION (pfile, track_macro_expansion)); - arg->expanded_count++; - } - - _cpp_pop_context (pfile); - - CPP_WTRADITIONAL (pfile) = saved_warn_trad; - pfile->state.ignore__Pragma = saved_ignore__Pragma; -} - -/* Returns the macro associated to the current context if we are in - the context a macro expansion, NULL otherwise. */ -static cpp_hashnode* -macro_of_context (cpp_context *context) -{ - if (context == NULL) - return NULL; - - return (context->tokens_kind == TOKENS_KIND_EXTENDED) - ? context->c.mc->macro_node - : context->c.macro; -} - -/* Return TRUE iff we are expanding a macro or are about to start - expanding one. If we are effectively expanding a macro, the - function macro_of_context returns a pointer to the macro being - expanded. 
*/ -static bool -in_macro_expansion_p (cpp_reader *pfile) -{ - if (pfile == NULL) - return false; - - return (pfile->about_to_expand_macro_p - || macro_of_context (pfile->context)); -} - -/* Pop the current context off the stack, re-enabling the macro if the - context represented a macro's replacement list. Initially the - context structure was not freed so that we can re-use it later, but - now we do free it to reduce peak memory consumption. */ -void -_cpp_pop_context (cpp_reader *pfile) -{ - cpp_context *context = pfile->context; - - /* We should not be popping the base context. */ - gcc_assert (context != &pfile->base_context); - - if (context->c.macro) - { - cpp_hashnode *macro; - if (context->tokens_kind == TOKENS_KIND_EXTENDED) - { - macro_context *mc = context->c.mc; - macro = mc->macro_node; - /* If context->buff is set, it means the life time of tokens - is bound to the life time of this context; so we must - free the tokens; that means we must free the virtual - locations of these tokens too. */ - if (context->buff && mc->virt_locs) - { - free (mc->virt_locs); - mc->virt_locs = NULL; - } - free (mc); - context->c.mc = NULL; - } - else - macro = context->c.macro; - - /* Beware that MACRO can be NULL in cases like when we are - called from expand_arg. In those cases, a dummy context with - tokens is pushed just for the purpose of walking them using - cpp_get_token_1. In that case, no 'macro' field is set into - the dummy context. */ - if (macro != NULL - /* Several contiguous macro expansion contexts can be - associated to the same macro; that means it's the same - macro expansion that spans across all these (sub) - contexts. So we should re-enable an expansion-disabled - macro only when we are sure we are really out of that - macro expansion. */ - && macro_of_context (context->prev) != macro) - macro->flags &= ~NODE_DISABLED; - - if (macro == pfile->top_most_macro_node && context->prev == NULL) - /* We are popping the context of the top-most macro node. */ - pfile->top_most_macro_node = NULL; - } - - if (context->buff) - { - /* Decrease memory peak consumption by freeing the memory used - by the context. */ - _cpp_free_buff (context->buff); - } - - pfile->context = context->prev; - /* decrease peak memory consumption by feeing the context. */ - pfile->context->next = NULL; - free (context); -} - -/* Return TRUE if we reached the end of the set of tokens stored in - CONTEXT, FALSE otherwise. */ -static inline bool -reached_end_of_context (cpp_context *context) -{ - if (context->tokens_kind == TOKENS_KIND_DIRECT) - return FIRST (context).token == LAST (context).token; - else if (context->tokens_kind == TOKENS_KIND_INDIRECT - || context->tokens_kind == TOKENS_KIND_EXTENDED) - return FIRST (context).ptoken == LAST (context).ptoken; - else - abort (); -} - -/* Consume the next token contained in the current context of PFILE, - and return it in *TOKEN. It's "full location" is returned in - *LOCATION. If -ftrack-macro-location is in effeect, fFull location" - means the location encoding the locus of the token across macro - expansion; otherwise it's just is the "normal" location of the - token which (*TOKEN)->src_loc. 
*/ -static inline void -consume_next_token_from_context (cpp_reader *pfile, - const cpp_token ** token, - location_t *location) -{ - cpp_context *c = pfile->context; - - if ((c)->tokens_kind == TOKENS_KIND_DIRECT) - { - *token = FIRST (c).token; - *location = (*token)->src_loc; - FIRST (c).token++; - } - else if ((c)->tokens_kind == TOKENS_KIND_INDIRECT) - { - *token = *FIRST (c).ptoken; - *location = (*token)->src_loc; - FIRST (c).ptoken++; - } - else if ((c)->tokens_kind == TOKENS_KIND_EXTENDED) - { - macro_context *m = c->c.mc; - *token = *FIRST (c).ptoken; - if (m->virt_locs) - { - *location = *m->cur_virt_loc; - m->cur_virt_loc++; - } - else - *location = (*token)->src_loc; - FIRST (c).ptoken++; - } - else - abort (); -} - -/* In the traditional mode of the preprocessor, if we are currently in - a directive, the location of a token must be the location of the - start of the directive line. This function returns the proper - location if we are in the traditional mode, and just returns - LOCATION otherwise. */ - -static inline location_t -maybe_adjust_loc_for_trad_cpp (cpp_reader *pfile, location_t location) -{ - if (CPP_OPTION (pfile, traditional)) - { - if (pfile->state.in_directive) - return pfile->directive_line; - } - return location; -} - -/* Routine to get a token as well as its location. - - Macro expansions and directives are transparently handled, - including entering included files. Thus tokens are post-macro - expansion, and after any intervening directives. External callers - see CPP_EOF only at EOF. Internal callers also see it when meeting - a directive inside a macro call, when at the end of a directive and - state.in_directive is still 1, and at the end of argument - pre-expansion. - - LOC is an out parameter; *LOC is set to the location "as expected - by the user". Please read the comment of - cpp_get_token_with_location to learn more about the meaning of this - location. */ -static const cpp_token* -cpp_get_token_1 (cpp_reader *pfile, location_t *location) -{ - const cpp_token *result; - /* This token is a virtual token that either encodes a location - related to macro expansion or a spelling location. */ - location_t virt_loc = 0; - /* pfile->about_to_expand_macro_p can be overriden by indirect calls - to functions that push macro contexts. So let's save it so that - we can restore it when we are about to leave this routine. */ - bool saved_about_to_expand_macro = pfile->about_to_expand_macro_p; - - for (;;) - { - cpp_hashnode *node; - cpp_context *context = pfile->context; - - /* Context->prev == 0 <=> base context. 
*/ - if (!context->prev) - { - result = _cpp_lex_token (pfile); - virt_loc = result->src_loc; - } - else if (!reached_end_of_context (context)) - { - consume_next_token_from_context (pfile, &result, - &virt_loc); - if (result->flags & PASTE_LEFT) - { - paste_all_tokens (pfile, result); - if (pfile->state.in_directive) - continue; - result = padding_token (pfile, result); - goto out; - } - } - else - { - if (pfile->context->c.macro) - ++num_expanded_macros_counter; - _cpp_pop_context (pfile); - if (pfile->state.in_directive) - continue; - result = &pfile->avoid_paste; - goto out; - } - - if (pfile->state.in_directive && result->type == CPP_COMMENT) - continue; - - if (result->type != CPP_NAME) - break; - - node = result->val.node.node; - - if (node->type == NT_VOID || (result->flags & NO_EXPAND)) - break; - - if (!(node->flags & NODE_USED) - && node->type == NT_USER_MACRO - && !node->value.macro - && !cpp_get_deferred_macro (pfile, node, result->src_loc)) - break; - - if (!(node->flags & NODE_DISABLED)) - { - int ret = 0; - /* If not in a macro context, and we're going to start an - expansion, record the location and the top level macro - about to be expanded. */ - if (!in_macro_expansion_p (pfile)) - { - pfile->invocation_location = result->src_loc; - pfile->top_most_macro_node = node; - } - if (pfile->state.prevent_expansion) - break; - - /* Conditional macros require that a predicate be evaluated - first. */ - if ((node->flags & NODE_CONDITIONAL) != 0) - { - if (pfile->cb.macro_to_expand) - { - bool whitespace_after; - const cpp_token *peek_tok = cpp_peek_token (pfile, 0); - - whitespace_after = (peek_tok->type == CPP_PADDING - || (peek_tok->flags & PREV_WHITE)); - node = pfile->cb.macro_to_expand (pfile, result); - if (node) - ret = enter_macro_context (pfile, node, result, virt_loc); - else if (whitespace_after) - { - /* If macro_to_expand hook returned NULL and it - ate some tokens, see if we don't need to add - a padding token in between this and the - next token. */ - peek_tok = cpp_peek_token (pfile, 0); - if (peek_tok->type != CPP_PADDING - && (peek_tok->flags & PREV_WHITE) == 0) - _cpp_push_token_context (pfile, NULL, - padding_token (pfile, - peek_tok), 1); - } - } - } - else - ret = enter_macro_context (pfile, node, result, virt_loc); - if (ret) - { - if (pfile->state.in_directive || ret == 2) - continue; - result = padding_token (pfile, result); - goto out; - } - } - else - { - /* Flag this token as always unexpandable. FIXME: move this - to collect_args()?. */ - cpp_token *t = _cpp_temp_token (pfile); - t->type = result->type; - t->flags = result->flags | NO_EXPAND; - t->val = result->val; - result = t; - } - - break; - } - - out: - if (location != NULL) - { - if (virt_loc == 0) - virt_loc = result->src_loc; - *location = virt_loc; - - if (!CPP_OPTION (pfile, track_macro_expansion) - && macro_of_context (pfile->context) != NULL) - /* We are in a macro expansion context, are not tracking - virtual location, but were asked to report the location - of the expansion point of the macro being expanded. */ - *location = pfile->invocation_location; - - *location = maybe_adjust_loc_for_trad_cpp (pfile, *location); - } - - pfile->about_to_expand_macro_p = saved_about_to_expand_macro; - - if (pfile->state.directive_file_token - && !pfile->state.parsing_args - && !(result->type == CPP_PADDING || result->type == CPP_COMMENT) - && !(15 & --pfile->state.directive_file_token)) - { - /* Do header-name frobbery. Concatenate < ... > as approprate. 
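   A hedged example: for a line such as  import <vector>;  the tokens
   CPP_LESS, the name "vector" and CPP_GREATER are glued back together
   below (via _cpp_bracket_include) into a single CPP_HEADER_NAME, while
   a plain "file.h" string is taken as-is; either way the result is then
   looked up with _cpp_find_header_unit when a search is needed.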
- Do header search if needed, and finally drop the outer <> or - "". */ - pfile->state.angled_headers = false; - - /* Do angle-header reconstitution. Then do include searching. - We'll always end up with a ""-quoted header-name in that - case. If searching finds nothing, we emit a diagnostic and - an empty string. */ - size_t len = 0; - char *fname = NULL; - - cpp_token *tmp = _cpp_temp_token (pfile); - *tmp = *result; - - tmp->type = CPP_HEADER_NAME; - bool need_search = !pfile->state.directive_file_token; - pfile->state.directive_file_token = 0; - - bool angle = result->type != CPP_STRING; - if (result->type == CPP_HEADER_NAME - || (result->type == CPP_STRING && result->val.str.text[0] != 'R')) - { - len = result->val.str.len - 2; - fname = XNEWVEC (char, len + 1); - memcpy (fname, result->val.str.text + 1, len); - fname[len] = 0; - } - else if (result->type == CPP_LESS) - fname = _cpp_bracket_include (pfile); - - if (fname) - { - /* We have a header-name. Look it up. This will emit an - unfound diagnostic. Canonicalize the found name. */ - const char *found = fname; - - if (need_search) - { - found = _cpp_find_header_unit (pfile, fname, angle, tmp->src_loc); - if (!found) - found = ""; - len = strlen (found); - } - /* Force a leading './' if it's not absolute. */ - bool dotme = (found[0] == '.' ? !IS_DIR_SEPARATOR (found[1]) - : found[0] && !IS_ABSOLUTE_PATH (found)); - - if (BUFF_ROOM (pfile->u_buff) < len + 1 + dotme * 2) - _cpp_extend_buff (pfile, &pfile->u_buff, len + 1 + dotme * 2); - unsigned char *buf = BUFF_FRONT (pfile->u_buff); - size_t pos = 0; - - if (dotme) - { - buf[pos++] = '.'; - /* Apparently '/' is unconditional. */ - buf[pos++] = '/'; - } - memcpy (&buf[pos], found, len); - pos += len; - buf[pos] = 0; - - tmp->val.str.len = pos; - tmp->val.str.text = buf; - - tmp->type = CPP_HEADER_NAME; - XDELETEVEC (fname); - - result = tmp; - } - } - - return result; -} - -/* External routine to get a token. Also used nearly everywhere - internally, except for places where we know we can safely call - _cpp_lex_token directly, such as lexing a directive name. - - Macro expansions and directives are transparently handled, - including entering included files. Thus tokens are post-macro - expansion, and after any intervening directives. External callers - see CPP_EOF only at EOF. Internal callers also see it when meeting - a directive inside a macro call, when at the end of a directive and - state.in_directive is still 1, and at the end of argument - pre-expansion. */ -const cpp_token * -cpp_get_token (cpp_reader *pfile) -{ - return cpp_get_token_1 (pfile, NULL); -} - -/* Like cpp_get_token, but also returns a virtual token location - separate from the spelling location carried by the returned token. - - LOC is an out parameter; *LOC is set to the location "as expected - by the user". This matters when a token results from macro - expansion; in that case the token's spelling location indicates the - locus of the token in the definition of the macro but *LOC - virtually encodes all the other meaningful locuses associated to - the token. - - What? virtual location? Yes, virtual location. - - If the token results from macro expansion and if macro expansion - location tracking is enabled its virtual location encodes (at the - same time): - - - the spelling location of the token - - - the locus of the macro expansion point - - - the locus of the point where the token got instantiated as part - of the macro expansion process. 
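   For instance (a sketch only; the linemap API described next provides
   the exact interface), a diagnostic routine holding such a virtual
   location 'loc' might recover two of those loci as:

     location_t exp_point = linemap_resolve_location
       (pfile->line_table, loc, LRK_MACRO_EXPANSION_POINT, NULL);
     location_t spelling  = linemap_resolve_location
       (pfile->line_table, loc, LRK_SPELLING_LOCATION, NULL);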
- - You have to use the linemap API to get the locus you are interested - in from a given virtual location. - - Note however that virtual locations are not necessarily ordered for - relations '<' and '>'. One must use the function - linemap_location_before_p instead of using the relational operator - '<'. - - If macro expansion tracking is off and if the token results from - macro expansion the virtual location is the expansion point of the - macro that got expanded. - - When the token doesn't result from macro expansion, the virtual - location is just the same thing as its spelling location. */ - -const cpp_token * -cpp_get_token_with_location (cpp_reader *pfile, location_t *loc) -{ - return cpp_get_token_1 (pfile, loc); -} - -/* Returns true if we're expanding an object-like macro that was - defined in a system header. Just checks the macro at the top of - the stack. Used for diagnostic suppression. - Also return true for builtin macros. */ -int -cpp_sys_macro_p (cpp_reader *pfile) -{ - cpp_hashnode *node = NULL; - - if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED) - node = pfile->context->c.mc->macro_node; - else - node = pfile->context->c.macro; - - if (!node) - return false; - if (cpp_builtin_macro_p (node)) - return true; - return node->value.macro && node->value.macro->syshdr; -} - -/* Read each token in, until end of the current file. Directives are - transparently processed. */ -void -cpp_scan_nooutput (cpp_reader *pfile) -{ - /* Request a CPP_EOF token at the end of this file, rather than - transparently continuing with the including file. */ - pfile->buffer->return_at_eof = true; - - pfile->state.discarding_output++; - pfile->state.prevent_expansion++; - - if (CPP_OPTION (pfile, traditional)) - while (_cpp_read_logical_line_trad (pfile)) - ; - else - while (cpp_get_token (pfile)->type != CPP_EOF) - ; - - pfile->state.discarding_output--; - pfile->state.prevent_expansion--; -} - -/* Step back one or more tokens obtained from the lexer. */ -void -_cpp_backup_tokens_direct (cpp_reader *pfile, unsigned int count) -{ - pfile->lookaheads += count; - while (count--) - { - pfile->cur_token--; - if (pfile->cur_token == pfile->cur_run->base - /* Possible with -fpreprocessed and no leading #line. */ - && pfile->cur_run->prev != NULL) - { - pfile->cur_run = pfile->cur_run->prev; - pfile->cur_token = pfile->cur_run->limit; - } - } -} - -/* Step back one (or more) tokens. Can only step back more than 1 if - they are from the lexer, and not from macro expansion. */ -void -_cpp_backup_tokens (cpp_reader *pfile, unsigned int count) -{ - if (pfile->context->prev == NULL) - _cpp_backup_tokens_direct (pfile, count); - else - { - if (count != 1) - abort (); - if (pfile->context->tokens_kind == TOKENS_KIND_DIRECT) - FIRST (pfile->context).token--; - else if (pfile->context->tokens_kind == TOKENS_KIND_INDIRECT) - FIRST (pfile->context).ptoken--; - else if (pfile->context->tokens_kind == TOKENS_KIND_EXTENDED) - { - FIRST (pfile->context).ptoken--; - if (pfile->context->c.macro) - { - macro_context *m = pfile->context->c.mc; - m->cur_virt_loc--; - gcc_checking_assert (m->cur_virt_loc >= m->virt_locs); - } - else - abort (); - } - else - abort (); - } -} - -/* #define directive parsing and handling. */ - -/* Returns true if a macro redefinition warning is required. */ -static bool -warn_of_redefinition (cpp_reader *pfile, cpp_hashnode *node, - const cpp_macro *macro2) -{ - /* Some redefinitions need to be warned about regardless. 
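   For illustration: an identical redefinition such as
       #define N 10
       #define N 10
   stays silent, because cpp_compare_macros below finds no difference,
   whereas a later  #define N 11  warns.  The early checks here cover
   macros that must warn, or must stay silent, regardless of any such
   textual comparison.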
*/ - if (node->flags & NODE_WARN) - return true; - - /* Suppress warnings for builtins that lack the NODE_WARN flag, - unless Wbuiltin-macro-redefined. */ - if (cpp_builtin_macro_p (node)) - return CPP_OPTION (pfile, warn_builtin_macro_redefined); - - /* Redefinitions of conditional (context-sensitive) macros, on - the other hand, must be allowed silently. */ - if (node->flags & NODE_CONDITIONAL) - return false; - - if (cpp_macro *macro1 = get_deferred_or_lazy_macro (pfile, node, macro2->line)) - return cpp_compare_macros (macro1, macro2); - return false; -} - -/* Return TRUE if MACRO1 and MACRO2 differ. */ - -bool -cpp_compare_macros (const cpp_macro *macro1, const cpp_macro *macro2) -{ - /* Redefinition of a macro is allowed if and only if the old and new - definitions are the same. (6.10.3 paragraph 2). */ - - /* Don't check count here as it can be different in valid - traditional redefinitions with just whitespace differences. */ - if (macro1->paramc != macro2->paramc - || macro1->fun_like != macro2->fun_like - || macro1->variadic != macro2->variadic) - return true; - - /* Check parameter spellings. */ - for (unsigned i = macro1->paramc; i--; ) - if (macro1->parm.params[i] != macro2->parm.params[i]) - return true; - - /* Check the replacement text or tokens. */ - if (macro1->kind == cmk_traditional) - return _cpp_expansions_different_trad (macro1, macro2); - - if (macro1->count != macro2->count) - return true; - - for (unsigned i= macro1->count; i--; ) - if (!_cpp_equiv_tokens (¯o1->exp.tokens[i], ¯o2->exp.tokens[i])) - return true; - - return false; -} - -/* Free the definition of hashnode H. */ -void -_cpp_free_definition (cpp_hashnode *h) -{ - /* Macros and assertions no longer have anything to free. */ - h->type = NT_VOID; - h->value.answers = NULL; - h->flags &= ~(NODE_DISABLED | NODE_USED); -} - -/* Save parameter NODE (spelling SPELLING) to the parameter list of - macro MACRO. Returns true on success, false on failure. */ -bool -_cpp_save_parameter (cpp_reader *pfile, unsigned n, cpp_hashnode *node, - cpp_hashnode *spelling) -{ - /* Constraint 6.10.3.6 - duplicate parameter names. */ - if (node->type == NT_MACRO_ARG) - { - cpp_error (pfile, CPP_DL_ERROR, "duplicate macro parameter \"%s\"", - NODE_NAME (node)); - return false; - } - - unsigned len = (n + 1) * sizeof (struct macro_arg_saved_data); - if (len > pfile->macro_buffer_len) - { - pfile->macro_buffer - = XRESIZEVEC (unsigned char, pfile->macro_buffer, len); - pfile->macro_buffer_len = len; - } - - macro_arg_saved_data *saved = (macro_arg_saved_data *)pfile->macro_buffer; - saved[n].canonical_node = node; - saved[n].value = node->value; - saved[n].type = node->type; - - void *base = _cpp_reserve_room (pfile, n * sizeof (cpp_hashnode *), - sizeof (cpp_hashnode *)); - ((cpp_hashnode **)base)[n] = spelling; - - /* Morph into a macro arg. */ - node->type = NT_MACRO_ARG; - /* Index is 1 based. */ - node->value.arg_index = n + 1; - - return true; -} - -/* Restore the parameters to their previous state. */ -void -_cpp_unsave_parameters (cpp_reader *pfile, unsigned n) -{ - /* Clear the fast argument lookup indices. */ - while (n--) - { - struct macro_arg_saved_data *save = - &((struct macro_arg_saved_data *) pfile->macro_buffer)[n]; - - struct cpp_hashnode *node = save->canonical_node; - node->type = save->type; - node->value = save->value; - } -} - -/* Check the syntax of the parameters in a MACRO definition. Return - false on failure. Set *N_PTR and *VARADIC_PTR as appropriate. 
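   For example (illustrative only), the lists (), (a), (a, b), (a, ...)
   and (...) are all accepted by the grammar below, while (a,) and
   (..., a) are rejected.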
- '(' ')' - '(' parm-list ',' last-parm ')' - '(' last-parm ')' - parm-list: name - | parm-list, name - last-parm: name - | name '...' - | '...' -*/ - -static bool -parse_params (cpp_reader *pfile, unsigned *n_ptr, bool *varadic_ptr) -{ - unsigned nparms = 0; - bool ok = false; - - for (bool prev_ident = false;;) - { - const cpp_token *token = _cpp_lex_token (pfile); - - switch (token->type) - { - case CPP_COMMENT: - /* Allow/ignore comments in parameter lists if we are - preserving comments in macro expansions. */ - if (!CPP_OPTION (pfile, discard_comments_in_macro_exp)) - break; - - /* FALLTHRU */ - default: - bad: - { - const char *const msgs[5] = - { - N_("expected parameter name, found \"%s\""), - N_("expected ',' or ')', found \"%s\""), - N_("expected parameter name before end of line"), - N_("expected ')' before end of line"), - N_("expected ')' after \"...\"") - }; - unsigned ix = prev_ident; - const unsigned char *as_text = NULL; - if (*varadic_ptr) - ix = 4; - else if (token->type == CPP_EOF) - ix += 2; - else - as_text = cpp_token_as_text (pfile, token); - cpp_error (pfile, CPP_DL_ERROR, msgs[ix], as_text); - } - goto out; - - case CPP_NAME: - if (prev_ident || *varadic_ptr) - goto bad; - prev_ident = true; - - if (!_cpp_save_parameter (pfile, nparms, token->val.node.node, - token->val.node.spelling)) - goto out; - nparms++; - break; - - case CPP_CLOSE_PAREN: - if (prev_ident || !nparms || *varadic_ptr) - { - ok = true; - goto out; - } - - /* FALLTHRU */ - case CPP_COMMA: - if (!prev_ident || *varadic_ptr) - goto bad; - prev_ident = false; - break; - - case CPP_ELLIPSIS: - if (*varadic_ptr) - goto bad; - *varadic_ptr = true; - if (!prev_ident) - { - /* An ISO bare ellipsis. */ - _cpp_save_parameter (pfile, nparms, - pfile->spec_nodes.n__VA_ARGS__, - pfile->spec_nodes.n__VA_ARGS__); - nparms++; - pfile->state.va_args_ok = 1; - if (! CPP_OPTION (pfile, c99) - && CPP_OPTION (pfile, cpp_pedantic) - && CPP_OPTION (pfile, warn_variadic_macros)) - cpp_pedwarning - (pfile, CPP_W_VARIADIC_MACROS, - CPP_OPTION (pfile, cplusplus) - ? N_("anonymous variadic macros were introduced in C++11") - : N_("anonymous variadic macros were introduced in C99")); - else if (CPP_OPTION (pfile, cpp_warn_c90_c99_compat) > 0 - && ! CPP_OPTION (pfile, cplusplus)) - cpp_error (pfile, CPP_DL_WARNING, - "anonymous variadic macros were introduced in C99"); - } - else if (CPP_OPTION (pfile, cpp_pedantic) - && CPP_OPTION (pfile, warn_variadic_macros)) - cpp_pedwarning (pfile, CPP_W_VARIADIC_MACROS, - CPP_OPTION (pfile, cplusplus) - ? N_("ISO C++ does not permit named variadic macros") - : N_("ISO C does not permit named variadic macros")); - break; - } - } - - out: - *n_ptr = nparms; - - return ok; -} - -/* - "Lex a token from the expansion of MACRO, but mark parameters as we - find them and warn of traditional stringification." -original comment. - - This routine, despite its name, does no expansion. It redirects the token - pointer inside the lexer so that it writes the next token raw from the - input file into the 'expansion array' of the macro. - - The expansion array in the macro is an expandable (sort of) array of tokens, - used for holding the body of the macro. I.e. 'expansion' here refers to - the token array getting longer, not to the expansion of macros. - - This routine requires that the macro passed in is on the bump pointer buffer (the buffer returned by _cpp_reserve_room). 
Given this requirement, there was no need to pass the original macro as they could instead recovered it from the buffer: - - cpp_macro *macro = (cpp_macro *)BUFF_FRONT(pfile->a_buff); - - The return value is that of the buffer holding the macro with one more token on its expansion array than it had when it was passed in. -*/ -static cpp_macro * -lex_expansion_token (cpp_reader *pfile, cpp_macro *macro) -{ - macro = (cpp_macro *)_cpp_reserve_room ( - pfile - ,sizeof (cpp_macro) - - sizeof (cpp_token) - + macro->count * sizeof (cpp_token) - ,sizeof (cpp_token) - ); - - // Tells the lexer to lex the next token into ¯o->exp.tokens[macro->count]. - // Perhaps the lexer was already set to lex the next token to a different buffer? - // So just in case, its value original value is saved then restored. - cpp_token *saved_cur_token = pfile->cur_token; - pfile->cur_token = ¯o->exp.tokens[macro->count]; - cpp_token *token = _cpp_lex_direct (pfile); - pfile->cur_token = saved_cur_token; - - /* Is this a parameter? */ - if (token->type == CPP_NAME && token->val.node.node->type == NT_MACRO_ARG) - { - /* Morph into a parameter reference. */ - cpp_hashnode *spelling = token->val.node.spelling; - token->type = CPP_MACRO_ARG; - token->val.macro_arg.arg_no = token->val.node.node->value.arg_index; - token->val.macro_arg.spelling = spelling; - } - else if (CPP_WTRADITIONAL (pfile) && macro->paramc > 0 - && (token->type == CPP_STRING || token->type == CPP_CHAR)) - check_trad_stringification (pfile, macro, &token->val.str); - - return macro; -} - -static cpp_macro * -create_iso_definition (cpp_reader *pfile) -{ - bool following_paste_op = false; - const char *paste_op_error_msg = - N_("'##' cannot appear at either end of a macro expansion"); - unsigned int num_extra_tokens = 0; - unsigned nparms = 0; - cpp_hashnode **params = NULL; - bool varadic = false; - bool ok = false; - cpp_macro *macro = NULL; - - /* Look at the first token, to see if this is a function-like - macro. */ - cpp_token first; - cpp_token *saved_cur_token = pfile->cur_token; - pfile->cur_token = &first; - cpp_token *token = _cpp_lex_direct (pfile); - pfile->cur_token = saved_cur_token; - - if (token->flags & PREV_WHITE) - /* Preceeded by space, must be part of expansion. */; - else if (token->type == CPP_OPEN_PAREN) - { - /* An open-paren, get a parameter list. */ - if (!parse_params (pfile, &nparms, &varadic)) - goto out; - - params = (cpp_hashnode **)_cpp_commit_buff - (pfile, sizeof (cpp_hashnode *) * nparms); - token = NULL; - } - else if (token->type != CPP_EOF - && !(token->type == CPP_COMMENT - && ! CPP_OPTION (pfile, discard_comments_in_macro_exp))) - { - /* While ISO C99 requires whitespace before replacement text - in a macro definition, ISO C90 with TC1 allows characters - from the basic source character set there. */ - if (CPP_OPTION (pfile, c99)) - cpp_error (pfile, CPP_DL_PEDWARN, - CPP_OPTION (pfile, cplusplus) - ? N_("ISO C++11 requires whitespace after the macro name") - : N_("ISO C99 requires whitespace after the macro name")); - else - { - enum cpp_diagnostic_level warntype = CPP_DL_WARNING; - switch (token->type) - { - case CPP_ATSIGN: - case CPP_AT_NAME: - case CPP_OBJC_STRING: - /* '@' is not in basic character set. */ - warntype = CPP_DL_PEDWARN; - break; - case CPP_OTHER: - /* Basic character set sans letters, digits and _. 
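      For example, in C90 mode  #define X`1  draws the pedwarn because
      '`' lies outside that set, while  #define X+1  gets only the plain
      warning.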
*/ - if (strchr ("!\"#%&'()*+,-./:;<=>?[\\]^{|}~", - token->val.str.text[0]) == NULL) - warntype = CPP_DL_PEDWARN; - break; - default: - /* All other tokens start with a character from basic - character set. */ - break; - } - cpp_error (pfile, warntype, - "missing whitespace after the macro name"); - } - } - - macro = _cpp_new_macro (pfile, cmk_macro, - _cpp_reserve_room (pfile, 0, sizeof (cpp_macro))); - - if (!token) - { - macro->variadic = varadic; - macro->paramc = nparms; - macro->parm.params = params; - macro->fun_like = true; - } - else - { - /* Preserve the token we peeked, there is already a single slot for it. */ - macro->exp.tokens[0] = *token; - token = ¯o->exp.tokens[0]; - macro->count = 1; - } - - for (vaopt_state vaopt_tracker (pfile, macro->variadic, NULL);; token = NULL) - { - if (!token) - { - macro = lex_expansion_token (pfile, macro); - token = ¯o->exp.tokens[macro->count++]; - } - - /* Check the stringifying # constraint 6.10.3.2.1 of - function-like macros when lexing the subsequent token. */ - if (macro->count > 1 && token[-1].type == CPP_HASH && macro->fun_like) - { - if (token->type == CPP_MACRO_ARG - || (macro->variadic - && token->type == CPP_NAME - && token->val.node.node == pfile->spec_nodes.n__VA_OPT__)) - { - if (token->flags & PREV_WHITE) - token->flags |= SP_PREV_WHITE; - if (token[-1].flags & DIGRAPH) - token->flags |= SP_DIGRAPH; - token->flags &= ~PREV_WHITE; - token->flags |= STRINGIFY_ARG; - token->flags |= token[-1].flags & PREV_WHITE; - token[-1] = token[0]; - macro->count--; - } - /* Let assembler get away with murder. */ - else if (CPP_OPTION (pfile, lang) != CLK_ASM) - { - cpp_error (pfile, CPP_DL_ERROR, - "'#' is not followed by a macro parameter"); - goto out; - } - } - - if (token->type == CPP_EOF) - { - /* Paste operator constraint 6.10.3.3.1: - Token-paste ##, can appear in both object-like and - function-like macros, but not at the end. */ - if (following_paste_op) - { - cpp_error (pfile, CPP_DL_ERROR, paste_op_error_msg); - goto out; - } - if (!vaopt_tracker.completed ()) - goto out; - break; - } - - /* Paste operator constraint 6.10.3.3.1. */ - if (token->type == CPP_PASTE) - { - /* Token-paste ##, can appear in both object-like and - function-like macros, but not at the beginning. */ - if (macro->count == 1) - { - cpp_error (pfile, CPP_DL_ERROR, paste_op_error_msg); - goto out; - } - - if (following_paste_op) - { - /* Consecutive paste operators. This one will be moved - to the end. */ - num_extra_tokens++; - token->val.token_no = macro->count - 1; - } - else - { - /* Drop the paste operator. */ - --macro->count; - token[-1].flags |= PASTE_LEFT; - if (token->flags & DIGRAPH) - token[-1].flags |= SP_DIGRAPH; - if (token->flags & PREV_WHITE) - token[-1].flags |= SP_PREV_WHITE; - } - following_paste_op = true; - } - else - following_paste_op = false; - - if (vaopt_tracker.update (token) == vaopt_state::ERROR) - goto out; - } - - /* We're committed to winning now. */ - ok = true; - - /* Don't count the CPP_EOF. */ - macro->count--; - - macro = (cpp_macro *)_cpp_commit_buff - (pfile, sizeof (cpp_macro) - sizeof (cpp_token) - + sizeof (cpp_token) * macro->count); - - /* Clear whitespace on first token. 
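   For illustration: in  #define ID(x)   x  the first replacement-list
   token 'x' was preceded by spaces in the source, but on expansion any
   spacing should come from the context of use (padding tokens), so the
   flag is dropped here.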
*/ - if (macro->count) - macro->exp.tokens[0].flags &= ~PREV_WHITE; - - if (num_extra_tokens) - { - /* Place second and subsequent ## or %:%: tokens in sequences of - consecutive such tokens at the end of the list to preserve - information about where they appear, how they are spelt and - whether they are preceded by whitespace without otherwise - interfering with macro expansion. Remember, this is - extremely rare, so efficiency is not a priority. */ - cpp_token *temp = (cpp_token *)_cpp_reserve_room - (pfile, 0, num_extra_tokens * sizeof (cpp_token)); - unsigned extra_ix = 0, norm_ix = 0; - cpp_token *exp = macro->exp.tokens; - for (unsigned ix = 0; ix != macro->count; ix++) - if (exp[ix].type == CPP_PASTE) - temp[extra_ix++] = exp[ix]; - else - exp[norm_ix++] = exp[ix]; - memcpy (&exp[norm_ix], temp, num_extra_tokens * sizeof (cpp_token)); - - /* Record there are extra tokens. */ - macro->extra_tokens = 1; - } - - out: - pfile->state.va_args_ok = 0; - _cpp_unsave_parameters (pfile, nparms); - - return ok ? macro : NULL; -} - -cpp_macro * -_cpp_new_macro (cpp_reader *pfile, cpp_macro_kind kind, void *placement) -{ - cpp_macro *macro = (cpp_macro *) placement; - - /* Zero init all the fields. This'll tell the compiler know all the - following inits are writing a virgin object. */ - memset (macro, 0, offsetof (cpp_macro, exp)); - - macro->line = pfile->directive_line; - macro->parm.params = 0; - macro->lazy = 0; - macro->paramc = 0; - macro->variadic = 0; - macro->used = !CPP_OPTION (pfile, warn_unused_macros); - macro->count = 0; - macro->fun_like = 0; - macro->imported_p = false; - macro->extra_tokens = 0; - /* To suppress some diagnostics. */ - macro->syshdr = pfile->buffer && pfile->buffer->sysp != 0; - - macro->kind = kind; - - return macro; -} - -/* Parse a macro and save its expansion. Returns nonzero on success. */ -bool -_cpp_create_definition (cpp_reader *pfile, cpp_hashnode *node) -{ - cpp_macro *macro; - - if (CPP_OPTION (pfile, traditional)) - macro = _cpp_create_trad_definition (pfile); - else - macro = create_iso_definition (pfile); - - if (!macro) - return false; - - if (cpp_macro_p (node)) - { - if (CPP_OPTION (pfile, warn_unused_macros)) - _cpp_warn_if_unused_macro (pfile, node, NULL); - - if (warn_of_redefinition (pfile, node, macro)) - { - const enum cpp_warning_reason reason - = (cpp_builtin_macro_p (node) && !(node->flags & NODE_WARN)) - ? CPP_W_BUILTIN_MACRO_REDEFINED : CPP_W_NONE; - - bool warned = - cpp_pedwarning_with_line (pfile, reason, - pfile->directive_line, 0, - "\"%s\" redefined", NODE_NAME (node)); - - if (warned && cpp_user_macro_p (node)) - cpp_error_with_line (pfile, CPP_DL_NOTE, - node->value.macro->line, 0, - "this is the location of the previous definition"); - } - _cpp_free_definition (node); - } - - /* Enter definition in hash table. */ - node->type = NT_USER_MACRO; - node->value.macro = macro; - if (! ustrncmp (NODE_NAME (node), DSC ("__STDC_")) - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_FORMAT_MACROS") - /* __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are mentioned - in the C standard, as something that one must use in C++. - However DR#593 and C++11 indicate that they play no role in C++. - We special-case them anyway. 
*/ - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_LIMIT_MACROS") - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_CONSTANT_MACROS")) - node->flags |= NODE_WARN; - - /* If user defines one of the conditional macros, remove the - conditional flag */ - node->flags &= ~NODE_CONDITIONAL; - - return true; -} - -extern void -cpp_define_lazily (cpp_reader *pfile, cpp_hashnode *node, unsigned num) -{ - cpp_macro *macro = node->value.macro; - - gcc_checking_assert (pfile->cb.user_lazy_macro && macro && num < UCHAR_MAX); - - macro->lazy = num + 1; -} - -/* NODE is a deferred macro, resolve it, returning the definition - (which may be NULL). */ -cpp_macro * -cpp_get_deferred_macro (cpp_reader *pfile, cpp_hashnode *node, - location_t loc) -{ - gcc_checking_assert (node->type == NT_USER_MACRO); - - node->value.macro = pfile->cb.user_deferred_macro (pfile, loc, node); - - if (!node->value.macro) - node->type = NT_VOID; - - return node->value.macro; -} - -static cpp_macro * -get_deferred_or_lazy_macro (cpp_reader *pfile, cpp_hashnode *node, - location_t loc) -{ - cpp_macro *macro = node->value.macro; - if (!macro) - { - macro = cpp_get_deferred_macro (pfile, node, loc); - gcc_checking_assert (!macro || !macro->lazy); - } - else if (macro->lazy) - { - pfile->cb.user_lazy_macro (pfile, macro, macro->lazy - 1); - macro->lazy = 0; - } - - return macro; -} - -/* Notify the use of NODE in a macro-aware context (i.e. expanding it, - or testing its existance). Also applies any lazy definition. - Return FALSE if the macro isn't really there. */ - -extern bool -_cpp_notify_macro_use (cpp_reader *pfile, cpp_hashnode *node, - location_t loc) -{ - node->flags |= NODE_USED; - switch (node->type) - { - case NT_USER_MACRO: - if (!get_deferred_or_lazy_macro (pfile, node, loc)) - return false; - /* FALLTHROUGH. */ - - case NT_BUILTIN_MACRO: - if (pfile->cb.used_define) - pfile->cb.used_define (pfile, loc, node); - break; - - case NT_VOID: - if (pfile->cb.used_undef) - pfile->cb.used_undef (pfile, loc, node); - break; - - default: - abort (); - } - - return true; -} - -/* Warn if a token in STRING matches one of a function-like MACRO's - parameters. */ -static void -check_trad_stringification (cpp_reader *pfile, const cpp_macro *macro, - const cpp_string *string) -{ - unsigned int i, len; - const uchar *p, *q, *limit; - - /* Loop over the string. */ - limit = string->text + string->len - 1; - for (p = string->text + 1; p < limit; p = q) - { - /* Find the start of an identifier. */ - while (p < limit && !is_idstart (*p)) - p++; - - /* Find the end of the identifier. */ - q = p; - while (q < limit && is_idchar (*q)) - q++; - - len = q - p; - - /* Loop over the function macro arguments to see if the - identifier inside the string matches one of them. */ - for (i = 0; i < macro->paramc; i++) - { - const cpp_hashnode *node = macro->parm.params[i]; - - if (NODE_LEN (node) == len - && !memcmp (p, NODE_NAME (node), len)) - { - cpp_warning (pfile, CPP_W_TRADITIONAL, - "macro argument \"%s\" would be stringified in traditional C", - NODE_NAME (node)); - break; - } - } - } -} - -/* Returns the name, arguments and expansion of a macro, in a format - suitable to be read back in again, and therefore also for DWARF 2 - debugging info. e.g. "PASTE(X, Y) X ## Y", or "MACNAME EXPANSION". - Caller is expected to generate the "#define" bit if needed. The - returned text is temporary, and automatically freed later. 
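   A usage sketch (caller side; the output stream name is assumed): a
   -dD style client could simply emit

     fprintf (out, "#define %s\n",
              (const char *) cpp_macro_definition (pfile, node));

   since the "#define " prefix is deliberately left to the caller.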
*/ -const unsigned char * -cpp_macro_definition (cpp_reader *pfile, cpp_hashnode *node) -{ - gcc_checking_assert (cpp_user_macro_p (node)); - - if (const cpp_macro *macro = get_deferred_or_lazy_macro (pfile, node, 0)) - return cpp_macro_definition (pfile, node, macro); - return NULL; -} - -const unsigned char * -cpp_macro_definition (cpp_reader *pfile, cpp_hashnode *node, - const cpp_macro *macro) -{ - unsigned int i, len; - unsigned char *buffer; - - /* Calculate length. */ - len = NODE_LEN (node) * 10 + 2; /* ' ' and NUL. */ - if (macro->fun_like) - { - len += 4; /* "()" plus possible final ".." of named - varargs (we have + 1 below). */ - for (i = 0; i < macro->paramc; i++) - len += NODE_LEN (macro->parm.params[i]) + 1; /* "," */ - } - - /* This should match below where we fill in the buffer. */ - if (CPP_OPTION (pfile, traditional)) - len += _cpp_replacement_text_len (macro); - else - { - unsigned int count = macro_real_token_count (macro); - for (i = 0; i < count; i++) - { - const cpp_token *token = ¯o->exp.tokens[i]; - - if (token->type == CPP_MACRO_ARG) - len += NODE_LEN (token->val.macro_arg.spelling); - else - len += cpp_token_len (token); - - if (token->flags & STRINGIFY_ARG) - len++; /* "#" */ - if (token->flags & PASTE_LEFT) - len += 3; /* " ##" */ - if (token->flags & PREV_WHITE) - len++; /* " " */ - } - } - - if (len > pfile->macro_buffer_len) - { - pfile->macro_buffer = XRESIZEVEC (unsigned char, - pfile->macro_buffer, len); - pfile->macro_buffer_len = len; - } - - /* Fill in the buffer. Start with the macro name. */ - buffer = pfile->macro_buffer; - buffer = _cpp_spell_ident_ucns (buffer, node); - - /* Parameter names. */ - if (macro->fun_like) - { - *buffer++ = '('; - for (i = 0; i < macro->paramc; i++) - { - cpp_hashnode *param = macro->parm.params[i]; - - if (param != pfile->spec_nodes.n__VA_ARGS__) - { - memcpy (buffer, NODE_NAME (param), NODE_LEN (param)); - buffer += NODE_LEN (param); - } - - if (i + 1 < macro->paramc) - /* Don't emit a space after the comma here; we're trying - to emit a Dwarf-friendly definition, and the Dwarf spec - forbids spaces in the argument list. */ - *buffer++ = ','; - else if (macro->variadic) - *buffer++ = '.', *buffer++ = '.', *buffer++ = '.'; - } - *buffer++ = ')'; - } - - /* The Dwarf spec requires a space after the macro name, even if the - definition is the empty string. */ - *buffer++ = ' '; - - if (CPP_OPTION (pfile, traditional)) - buffer = _cpp_copy_replacement_text (macro, buffer); - else if (macro->count) - /* Expansion tokens. */ - { - unsigned int count = macro_real_token_count (macro); - for (i = 0; i < count; i++) - { - const cpp_token *token = ¯o->exp.tokens[i]; - - if (token->flags & PREV_WHITE) - *buffer++ = ' '; - if (token->flags & STRINGIFY_ARG) - *buffer++ = '#'; - - if (token->type == CPP_MACRO_ARG) - { - memcpy (buffer, - NODE_NAME (token->val.macro_arg.spelling), - NODE_LEN (token->val.macro_arg.spelling)); - buffer += NODE_LEN (token->val.macro_arg.spelling); - } - else - buffer = cpp_spell_token (pfile, token, buffer, true); - - if (token->flags & PASTE_LEFT) - { - *buffer++ = ' '; - *buffer++ = '#'; - *buffer++ = '#'; - /* Next has PREV_WHITE; see _cpp_create_definition. 
*/ - } - } - } - - *buffer = '\0'; - return pfile->macro_buffer; -} - - -/*-------------------------------------------------------------------------------- - RT extensions ---------------------------------------------------------------------------------*/ - -/*-------------------------------------------------------------------------------- - shared declarations -*/ - - typedef struct{ - unsigned int count; - cpp_token *token_array; // _cpp_reserve_room buffer set by clause parser - } token_list; - - typedef struct{ - unsigned int count; - token_list token_list[1]; - } argument_list; - - typedef enum clause_parse_delimiting { - CPD_EOF - ,CPD_BALANCED - } clause_parse_delimiting; - - typedef enum clause_parse_comma { - CPC_ERROR - ,CPC_TERMINATOR - ,CPD_IS_TOKEN - } clause_parse_comma; - - typedef enum clause_parse_expand { - CPE_NOEXPAND - ,CPE_EXPAND_OPTION - ,CPE_EXPAND - } clause_parse_expand; - - typedef enum clause_parse_status { - PCS_COMPLETE // Clause completely parsed - ,PCS_ERR_EXPECTED_OPEN_DELIM // Failed to find expected opening '(' - ,PCS_ERR_UNEXPECTED_COMMA // probably has too many arguments in list - ,PCS_ERR_UNEXPECTED_EOF // Hit real EOF before matching ')' - ,PCS_ERR_PASTE_AT_END // Trailing '##' paste operator - ,PCS_ERR_HASH_NOT_FOLLOWED_BY_ARG // '#' not followed by macro parameter - ,PCS_ERR_VAOPT_STATE_INVALID // __VA_OPT__ or variadic tracking error - ,PCS_ERR_EOF_FETCH_FAILED // Failed to fetch next line after EOF - ,PCS_ERR_UNKNOWN // Fallback error (should not occur) - ,PCS_ERR_STATUS_NOT_SET // function did not set the status - } clause_parse_status; - - -/*-------------------------------------------------------------------------------- - debug helpers -*/ - - // debug info for clause parsing - #define DebugParseClause 1 - - // debug info for the macro directive - #define DebugRTMacro 1 - - // debug info for the assign directive and built in macro - #define DebugAssign 1 - - // gates compilation of functions that were defined specifically to assist with debug - #define DebugHelpers 1 - - #if DebugHelpers - - static const char * - ttype_to_text(enum cpp_ttype ttype) - { - switch (ttype) - { - case CPP_EOF: return "EOF"; - case CPP_PADDING: return "PADDING"; - case CPP_COMMENT: return "COMMENT"; - // case CPP_HSPACE: return "HSPACE"; - // case CPP_VSPACE: return "VSPACE"; - case CPP_OTHER: return "OTHER"; - case CPP_OPEN_PAREN: return "OPEN_PAREN"; - case CPP_CLOSE_PAREN: return "CLOSE_PAREN"; - case CPP_OPEN_SQUARE: return "OPEN_SQUARE"; - case CPP_CLOSE_SQUARE: return "CLOSE_SQUARE"; - case CPP_OPEN_BRACE: return "OPEN_BRACE"; - case CPP_CLOSE_BRACE: return "CLOSE_BRACE"; - case CPP_COMMA: return "COMMA"; - case CPP_SEMICOLON: return "SEMICOLON"; - case CPP_ELLIPSIS: return "ELLIPSIS"; - case CPP_NAME: return "NAME"; - case CPP_NUMBER: return "NUMBER"; - case CPP_CHAR: return "CHAR"; - case CPP_STRING: return "STRING"; - case CPP_HEADER_NAME: return "HEADER_NAME"; - case CPP_PLUS: return "PLUS"; - case CPP_MINUS: return "MINUS"; - case CPP_MULT: return "MULT"; - case CPP_DIV: return "DIV"; - case CPP_MOD: return "MOD"; - case CPP_AND: return "AND"; - case CPP_OR: return "OR"; - case CPP_XOR: return "XOR"; - case CPP_NOT: return "NOT"; - case CPP_LSHIFT: return "LSHIFT"; - case CPP_RSHIFT: return "RSHIFT"; - case CPP_EQ: return "EQ"; - // case CPP_NE: return "NE"; - // case CPP_LE: return "LE"; - // case CPP_GE: return "GE"; - // case CPP_LT: return "LT"; - // case CPP_GT: return "GT"; - case CPP_ATSIGN: return "@"; - case CPP_PLUS_EQ: return "PLUS_EQ"; - case 
CPP_MINUS_EQ: return "MINUS_EQ"; - case CPP_MULT_EQ: return "MULT_EQ"; - case CPP_DIV_EQ: return "DIV_EQ"; - case CPP_MOD_EQ: return "MOD_EQ"; - case CPP_AND_EQ: return "AND_EQ"; - case CPP_OR_EQ: return "OR_EQ"; - case CPP_XOR_EQ: return "XOR_EQ"; - case CPP_LSHIFT_EQ: return "LSHIFT_EQ"; - case CPP_RSHIFT_EQ: return "RSHIFT_EQ"; - // case CPP_CONDITIONAL: return "CONDITIONAL"; - case CPP_COLON: return "COLON"; - case CPP_DEREF: return "DEREF"; - case CPP_DOT: return "DOT"; - case CPP_DEREF_STAR: return "DEREF_STAR"; - case CPP_DOT_STAR: return "DOT_STAR"; - // case CPP_INCREMENT: return "INCREMENT"; - // case CPP_DECREMENT: return "DECREMENT"; - default: return ""; - } - } - - static void - print_ttype(enum cpp_ttype ttype){ - fprintf(stderr, "%s (%d)", ttype_to_text(ttype), ttype); - } - - const char *cpp_token_as_text(const cpp_token *token){ - static char buffer[256]; - - switch (token->type) - { - case CPP_NAME: - snprintf(buffer, sizeof(buffer), "CPP_NAME: '%s'", - NODE_NAME(token->val.node.node)); - break; - - case CPP_NUMBER: - case CPP_STRING: - case CPP_CHAR: - case CPP_HEADER_NAME: - snprintf(buffer, sizeof(buffer), "'%.*s'", - token->val.str.len, - token->val.str.text); - break; - - case CPP_EOF: - return ""; - case CPP_OTHER: - return ""; - case CPP_OPEN_PAREN: - return "'('"; - case CPP_CLOSE_PAREN: - return "')'"; - case CPP_COMMA: - return "','"; - case CPP_SEMICOLON: - return "';'"; - case CPP_PLUS: - return "'+'"; - case CPP_MINUS: - return "'-'"; - case CPP_MULT: - return "'*'"; - case CPP_DIV: - return "'/'"; - case CPP_MOD: - return "'%'"; - case CPP_MACRO_ARG: - snprintf( - buffer - ,sizeof(buffer) - ,"CPP_MACRO_ARG: '%s'" - ,NODE_NAME(token->val.macro_arg.spelling) - ); - break; - case CPP_PADDING: return ""; - case CPP_COMMENT: return ""; - case CPP_HASH: return "'#'"; - case CPP_PASTE: return "'##'"; - case CPP_ELLIPSIS: return "'...'"; - case CPP_COLON: return "':'"; - case CPP_OPEN_SQUARE: return "'['"; - case CPP_CLOSE_SQUARE: return "']'"; - case CPP_OPEN_BRACE: return "'{'"; - case CPP_CLOSE_BRACE: return "'}'"; - case CPP_DOT: return "'.'"; - case CPP_DEREF: return "'->'"; - case CPP_SCOPE: return "'::'"; - case CPP_DOT_STAR: return "'.*'"; - case CPP_DEREF_STAR: return "'->*'"; - case CPP_PRAGMA: return "<_Pragma>"; - case CPP_KEYWORD: return ""; - - default: - snprintf(buffer, sizeof(buffer), "", token->type); - break; - } - - // Append token flags if any are set - if (token->flags & (PREV_WHITE | DIGRAPH | STRINGIFY_ARG | - PASTE_LEFT | NAMED_OP | BOL | PURE_ZERO | - SP_DIGRAPH | SP_PREV_WHITE | NO_EXPAND | PRAGMA_OP)) - { - size_t len = strlen(buffer); - snprintf(buffer + len, sizeof(buffer) - len, " [flags:"); - - if (token->flags & PREV_WHITE) - strncat(buffer, " PREV_WHITE", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & DIGRAPH) - strncat(buffer, " DIGRAPH", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & STRINGIFY_ARG) - strncat(buffer, " STRINGIFY", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & PASTE_LEFT) - strncat(buffer, " ##L", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & NAMED_OP) - strncat(buffer, " NAMED_OP", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & BOL) - strncat(buffer, " BOL", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & PURE_ZERO) - strncat(buffer, " ZERO", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & SP_DIGRAPH) - strncat(buffer, " ##DIGRAPH", sizeof(buffer) - strlen(buffer) - 1); - if (token->flags & SP_PREV_WHITE) - strncat(buffer, " 
SP_WHITE", sizeof(buffer) - strlen(buffer) - 1);
-      if (token->flags & NO_EXPAND)
-        strncat(buffer, " NO_EXPAND", sizeof(buffer) - strlen(buffer) - 1);
-      if (token->flags & PRAGMA_OP)
-        strncat(buffer, " _Pragma", sizeof(buffer) - strlen(buffer) - 1);
-
-      strncat(buffer, " ]", sizeof(buffer) - strlen(buffer) - 1);
-    }
-
-    return buffer;
-  }
-
-  void print_token_list(const cpp_token *tokens ,size_t count){
-    for (size_t i = 0; i < count; ++i)
-      fprintf( stderr ,"[%zu] %s\n" ,i ,cpp_token_as_text(&tokens[i]) );
-  }
-
-  void print_clause_parse_status(enum clause_parse_status status){
-    const char *message = NULL;
-    switch (status)
-    {
-    case PCS_COMPLETE:
-      message = "parse_clause status is OK";
-      break;
-    case PCS_ERR_EXPECTED_OPEN_DELIM:
-      message = "expected opening delimiter such as '(' but did not find it.";
-      break;
-    case PCS_ERR_UNEXPECTED_EOF:
-      message = "unexpected EOF before closing ')'.";
-      break;
-    case PCS_ERR_PASTE_AT_END:
-      message = "paste operator '##' appeared at the beginning or end of macro body.";
-      break;
-    case PCS_ERR_HASH_NOT_FOLLOWED_BY_ARG:
-      message = "'#' was not followed by a valid macro parameter.";
-      break;
-    case PCS_ERR_VAOPT_STATE_INVALID:
-      message = "invalid __VA_OPT__ tracking state.";
-      break;
-    case PCS_ERR_EOF_FETCH_FAILED:
-      message = "_cpp_get_fresh_line() failed to fetch next line.";
-      break;
-    case PCS_ERR_STATUS_NOT_SET:
-      message = "Internal Error, status was not set";
-      break;
-    case PCS_ERR_UNKNOWN:
-    default:
-      message = "unknown or unhandled error.";
-      break;
-    }
-    fprintf(stderr, "%s\n", message);
-  }
-
-  // a helper function for probing where the parser thinks it is in the source
-  void debug_peek_token (cpp_reader *pfile){
-    const cpp_token *tok = _cpp_lex_token(pfile);
-
-    cpp_error_with_line(
-      pfile,
-      CPP_DL_ERROR,
-      tok->src_loc,
-      0,
-      "DEBUG: next token is: `%s`",
-      (const char *) cpp_token_as_text(tok)
-    );
-
-    _cpp_backup_tokens (pfile, 1);
-
-  }
-
-#endif
-
-/*--------------------------------------------------------------------------------
-  Parse clauses
-
-  Clause parsers are intended to be grammatical, with CPP semantics added by
-  functions that are given the resulting token list as an argument. We will see
-  soon enough whether this paints us into a corner.
-
-  A clause is a delimited token list. Delimiters include:
-  - balanced delimiters, currently parens or brackets
-  - end of line (CPP quirk: the lexer replaces newline with EOF)
-  - comma
-
-  A comma in a clause is optionally:
-  - an error
-  - an alternative terminator
-  - merely another token
-
-  Optionally, the lexer can be told to expand tokens before they arrive at the
-  clause parsing routine, or to not expand them.
-
-*/
-
-/*
-  Similar to `macro.cc::lex_expansion_token`, but returns the token rather than
-  writing directly into a macro structure.
-
-  This would provide the standard behavior when parsing a function body, but we
-  instead elected to use _cpp_lex_token. See the doc `fetching_a_token.org`.
-
-*/
-#if 0 // no longer used
-  cpp_token get_token_noexpand(cpp_reader *pfile){
-    cpp_token result;
-
-    // Tells the lexer to lex the next token into result.
-    // Perhaps the lexer was already set to lex the next token to a different buffer?
-    // So just in case, its original value is saved then restored.
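-    // The same save/redirect/restore pattern appears in
-    // lex_expansion_token() above: _cpp_lex_direct() writes into the
-    // slot pfile->cur_token points at, so aiming cur_token at a private
-    // slot lets one raw token be lexed without disturbing the token runs.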
- cpp_token *saved_cur_token = pfile->cur_token; - pfile->cur_token = &result; - _cpp_lex_direct(pfile); - pfile->cur_token = saved_cur_token; - - return result; - } -#endif - -/* - Given the pfile and mode of clause parsing, returns a token list and the terminating delimiter. The token_array for the token list is allocated on the bump buffer, `pfile->a`. - - When parsing a balanced paren clause, the opening paren has already been parsed, perhaps by `clause_parse`. This function then completes the parse. - - When parsing a line clause, this parses the clause. - - Optionally stops parsing at a comma. -*/ -static enum clause_parse_status clause_parse_1( - // inputs - cpp_reader *pfile - ,clause_parse_delimiting cpd - ,enum cpp_ttype opening // needed for counting in balanced delimiters mode - ,enum cpp_ttype closing // " - ,clause_parse_comma cpc - ,bool expand // whether tokens should be expanded when lexed - - // outputs - ,token_list *tl // caller allocates the pointed to token_list - ,cpp_token *terminator // caller allocates the pointed to token, or sets terminator to null -){ - #if DebugParseClause - fprintf(stderr, ">> parse_clause\n"); - fprintf(stderr, " delimiter_matching: %s\n", delimiter_matching ? "true" : "false"); - fprintf(stderr, " opening token: %s (%d)\n", ttype_to_text(opening), opening); - fprintf(stderr, " closing token: %s (%d)\n", ttype_to_text(closing), closing); - fprintf(stderr, " comma_list: %s\n", comma_list ? "true" : "false"); - // ./include/line-map.h:typedef unsigned int location_t; - fprintf(stderr, " src_loc: %u\n", src_loc); - #endif - - int nesting_depth = 1; - cpp_token token; - tl->count = 0; - tl->token_array = NULL; - location_t src_loc; - bool first = true; - - for(;;){ - - /* get a token - - For cpp_get_token_1 src_loc is an out parameter, = the location user would expect. - cpp_get_token_1 is defined in this file (macro.cc). - */ - if(expand){ - token = *cpp_get_token_1(pfile, &src_loc); - }else{ - token = *_cpp_lex_token(pfile); - src_loc = token.src_loc; - } - - /* skip padding - - This is necessary for the name expr, but does it impact potential other uses of parse_clause? Another flag needed for this perhaps? - - Didn't use `macro.cc::cpp_get_token_no_padding` due to the two lex options above, also because that routine does not return location. - */ - if(token.type == CPP_PADDING) continue; - - - /* Note that the lexer replaces newline with EOF when parsing a directive, but we - allow for multiple line clauses in directives. 
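      For example (using the #assign directive of the RT extensions), in

        #assign (PAIR) (alpha
                        beta)

      the lexer reports CPP_EOF at the end of the first physical line of
      the body clause; because the clause is balanced-delimited we fetch
      the next line with _cpp_get_fresh_line() below and keep collecting
      tokens until the matching ')'.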
*/
-      if(cpd != CPD_EOF && token.type == CPP_EOF){
-        #if DebugParseClause
-          fprintf( stderr, "CPP_EOF during parse with parentheses matching \n");
-        #endif
-        if(!_cpp_get_fresh_line(pfile)){
-          if(terminator) *terminator = token;
-          return PCS_ERR_EOF_FETCH_FAILED;
-        }
-        continue;
-      }
-
-      /* if we shouldn't see a comma
-      */
-      if( cpc == CPC_ERROR && token.type == CPP_COMMA ){
-        if(terminator) *terminator = token;
-        return PCS_ERR_UNEXPECTED_COMMA;
-      }
-
-      /* parentheses matching overhead
-      */
-      if(cpd == CPD_BALANCED){
-        if (token.type == opening) {
-          nesting_depth++;
-        }
-        else if (token.type == closing) {
-          nesting_depth--;
-          if (nesting_depth < 0) {
-            cpp_error(pfile, CPP_DL_ERROR, "unmatched closing delimiter");
-            if(terminator) *terminator = token;
-            return PCS_ERR_UNEXPECTED_EOF;
-          }
-        }
-        #if DebugParseClause
-          if( token.type == opening || token.type == closing){
-            fprintf( stderr, "new nesting_depth: %d\n", nesting_depth);
-          }
-        #endif
-      }
-
-      /* Determine if clause has reached a terminator
-      */
-      bool terminated_by_matched_delimiter =
-        cpd == CPD_BALANCED
-        && nesting_depth == 0
-        && token.type == closing
-        ;
-
-      bool terminated_by_comma =
-        cpc == CPC_TERMINATOR
-        && token.type == CPP_COMMA
-        ;
-
-      bool terminated_by_EOL =
-        cpd == CPD_EOF
-        && token.type == CPP_EOF
-        ;
-
-      if(
-        terminated_by_matched_delimiter
-        || terminated_by_comma
-        || terminated_by_EOL
-      ){
-        if(terminator) *terminator = token;
-        return PCS_COMPLETE;
-      }
-
-      // store the token in the token list
-      tl->token_array = (cpp_token *)_cpp_reserve_room(
-        pfile
-        ,tl->count * sizeof(cpp_token)
-        ,sizeof(cpp_token)
-      );
-      tl->token_array[tl->count] = token;
-      tl->count++;
-      #if DebugParseClause
-        fprintf( stderr, "token: %s\n", cpp_token_as_text(&token) );
-      #endif
-
-    }// end for next token loop
-
-}
-
-
-/*
-  Request to parse a clause
-
-  If given the expand option, square brackets are taken as expand, and parens
-  as noexpand.
-
-  Check the returned terminator token to determine which option occurred.
-
-  When there is an optional comma for termination, check the terminator to see
-  if the clause parse ran into a comma.
-
-  For the continuing, end of line, and expand option clause types, the input
-  opening and closing arguments are ignored.
-
-*/
-static enum clause_parse_status clause_parse(
-  // inputs
-  cpp_reader *pfile
-  ,clause_parse_delimiting cpd
-  ,enum cpp_ttype opening // needed for counting in balanced delimiters mode
-  ,enum cpp_ttype closing // "
-  ,bool continuing // typically true when continuing after bumping into a comma terminator
-  ,clause_parse_comma cpc
-  ,clause_parse_expand cpe
-
-  // outputs
-  ,token_list *tl // caller allocates the pointed to token_list
-  ,cpp_token *terminator // caller allocates the pointed to token, or sets terminator to null
-){
-
-  #if DebugParseClause
-    fprintf(stderr, ">> clause_parse\n");
-    fprintf(stderr, " opening token: %s (%d)\n", ttype_to_text(opening), opening);
-    fprintf(stderr, " closing token: %s (%d)\n", ttype_to_text(closing), closing);
-    fprintf(stderr, " comma terminator: %s\n", cpc == CPC_TERMINATOR ? "true" : "false");
-    fprintf(stderr, " expand: %s\n", cpe != CPE_NOEXPAND ?
"true" : "false"); - fprintf(stderr, " src_loc_pt: %p\n", (void *)src_loc_pt); - #endif - - if( - continuing - || cpd == CPD_EOF - ){ - return clause_parse_1( - pfile - ,cpd - ,opening - ,closing - ,cpc - ,cpe - ,tl - ,terminator - ); - } - - const cpp_token *token = _cpp_lex_token(pfile); - #if DebugParseClause - fprintf(stderr, "opening token: %s" ,cpp_token_as_text(token)); - #endif - - if(cpe == CPE_EXPAND_OPTION){ - switch(token.type){ - CPP_OPEN_SQUARE: - return clause_parse_1( - pfile - ,CPD_BALANCED - ,CPP_OPEN_SQUARE - ,CPP_CLOSE_SQUARE - ,cpc - ,CPE_EXPAND - ,tl - ,terminator - ); - - CPP_OPEN_PAREN: - return clause_parse_1( - pfile - ,CPD_BALANCED - ,CPP_OPEN_PAREN - ,CPP_CLOSE_PAREN - ,cpc - ,CPE_NOEXPAND - ,tl - ,terminator - ); - - default: - return PCS_ERR_EXPECTED_OPEN_DELIM; - } - } - - // at this point we must be doing CPD_BALANCED with no square/paren option - - if(token.type != opening) return PCS_ERR_EXPECTED_OPEN_DELIM; - - return clause_parse_1( - pfile - ,cpd - ,opening - ,closing - ,cpc - ,cpe - ,tl - ,terminator - ); - -} - - -static cpp_hashnode * -name_clause_is_name(cpp_reader *pfile, const cpp_macro *macro) -{ - if (!macro || macro->count != 1) - { - cpp_error(pfile, CPP_DL_ERROR, - "expected exactly one token in assign name expression, got %u", - macro ? macro->count : 0); - return NULL; - } - - const cpp_token *tok = ¯o->exp.tokens[0]; - - if (tok->type != CPP_NAME) - { - cpp_error(pfile, CPP_DL_ERROR, - "expected identifier in assign name expression, got: %s", - cpp_token_as_text(tok)); - return NULL; - } - - return tok->val.node.node; -} - - -/*-------------------------------------------------------------------------------- - `#assign` directive RT extension - - called from directives.cc::do_assign() - -*/ - -bool _cpp_create_assign(cpp_reader *pfile){ - - clause_parse_status status; - location_t src_loc; - unsigned int num_extra_tokens = 0; - PCPSO_choice choice; - - /* Parse name clause into a temporary macro. - - This macro will not be committed, so it will be overwritten on the next _cpp_new_macro call. - */ - cpp_macro *name_macro = _cpp_new_macro( - pfile - ,cmk_macro - ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) - ); - name_macro->variadic = false; - name_macro->paramc = 0; - name_macro->parm.params = NULL; - name_macro->fun_like = false; - - status = parse_clause_paren_square_option( - pfile - ,name_macro - ,&choice // square brackets, or round parents? - ,false // not a commas list - ,&src_loc - ,NULL // don't need to know the terminator - ,&num_extra_tokens - ); - - - #if DebugAssign - fprintf(stderr,"name_macro->count: %d\n" ,name_macro->count); - fprintf(stderr,"assign directive name tokens:\n"); - print_token_list(name_macro->exp.tokens ,name_macro->count); - #endif - - /* The name clause must be either a noexpandly valid name, or it must expand into - a valid name, depending if the programmer used () or []. - If valid, keep the name node. - */ - cpp_hashnode *name_node = name_clause_is_name(pfile ,name_macro); - if(name_node){ - #if DebugAssign - fprintf( - stderr - ,"assign macro name: '%.*s'\n" - ,(int) NODE_LEN(name_node) - ,NODE_NAME(name_node) - ); - #endif - }else{ - #if DebugAssign - fprintf(stderr, "node is not a name\n"); - #endif - return false; - } - - /* Unpaint name_node - - There are three scenarios where name_node will already exist in the symbol table - before the name clause of `#assign` is evaluated: - - 1. 
A macro definition already exists for name_node, and the name clause - is not expanded (i.e., it was delineated with '()'). - - 2. A macro definition exists, and the name clause *is* expanded (i.e., it - was delineated with '[]'), but name_node was painted and thus skipped - during expansion. - - 3. A macro definition exists and was not painted initially, but the name - clause expands recursively to itself (e.g., `list -> list`), resulting - in name_node being painted *during* the name clause evaluation. - - After the name clause is parsed, the body clause might be expanded. If so, - name_node must not be painted — this ensures that it will expand at least once. This enables patterns like: - - #assign ()(list)(list second) - - ...to work even if 'list' was painted prior to entering #assign. - - If the macro recurs during evaluation of the body clause, it will be automatically painted by the expansion engine, as usual. - - Note also: upon exit from this routine, the newly created macro will *not* be painted. Its disabled flag will remain clear. - - Consequently, for a recursive macro, assign can be called repeatedly to get 'one more level' of evaluation upon each call. - */ - if (cpp_macro_p(name_node)) { - name_node->flags &= ~NODE_DISABLED; - } - - /* create a new macro and put the #assign body clause in it - */ - cpp_macro *body_macro = _cpp_new_macro( - pfile - ,cmk_macro - ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) - ); - body_macro->variadic = false; - body_macro->paramc = 0; - body_macro->parm.params = NULL; - body_macro->fun_like = false; - - status = parse_clause_paren_square_option( - pfile - ,body_macro - ,&choice // square brackets, or round parents? - ,false // not a commas list - ,&src_loc - ,NULL // don't need to know the terminator - ,&num_extra_tokens - ); - #if DebugAssign - fprintf(stderr,"assign directive body tokens:\n"); - print_token_list(body_macro->exp.tokens ,body_macro->count); - #endif - - - cpp_macro *assign_macro = (cpp_macro *)_cpp_commit_buff( - pfile - ,sizeof(cpp_macro) - sizeof(cpp_token) + sizeof(cpp_token) * body_macro->count - ); - assign_macro->count = body_macro->count; - memcpy( - assign_macro->exp.tokens - ,body_macro->exp.tokens - ,sizeof(cpp_token) * body_macro->count - ); - body_macro->variadic = false; - body_macro->paramc = 0; - body_macro->parm.params = NULL; - body_macro->fun_like = false; - - /* Install the assign macro under name_node. - - If name_node previously had a macro definition, discard it. - Then install the new macro, and clear any disabled flag. - - This ensures the assigned macro can be expanded immediately, - even if it appeared in its own body clause and was painted. - */ - name_node->flags &= ~NODE_USED; - - if (cpp_macro_p(name_node)) { - // There is no mechanism in libcpp to free the memory taken by a committed macro, but wec an cast it adrift. - name_node->value.macro = NULL; - } - name_node->type = NT_USER_MACRO; - name_node->value.macro = assign_macro; - name_node->flags &= ~NODE_DISABLED; - - /* all done - */ - #if DebugAssign - fprintf( - stderr - ,"macro '%.*s' assigned successfully.\n\n" - ,(int) NODE_LEN(name_node) - ,NODE_NAME(name_node) - ); - #endif - - return true; - -} - -/*-------------------------------------------------------------------------------- - `#macro` directive RT extension - - Given a pfile, returns a macro definition. - - #macro name (parameter [,parameter] ...) 
(body_expr) - #macro name () (body_expr) - - Upon entry, the name was already been parsed in directives.cc::do_macro, so the next token will be the opening paren of the parameter list. - - Thi code is similar to `_cpp_create_definition` though uses paren blancing around the body, instead of requiring the macro body be on a single line. - - The cpp_macro struct is defined in cpplib.h: `struct GTY(()) cpp_macro {` it has a flexible array field in a union as a last member: cpp_token tokens[1]; - - This code was derived from create_iso_definition(). The break out portions shared - with create_macro_definition code should be shared with the main code, so that there - is only one place for edits. - -*/ -static cpp_macro *create_rt_macro (cpp_reader *pfile){ - - #if DebugRTMacro - fprintf(stderr,"entering create_rt_macro\n"); - #endif - - unsigned int num_extra_tokens = 0; - unsigned paramc = 0; - cpp_hashnode **params = NULL; - bool varadic = false; - bool ok = false; - cpp_macro *macro = NULL; - clause_parse_status status; - - /* parse parameter list - - after parse_parms runs, the next token returned by pfile will be subsequent to the parameter list, e.g.: - 7 | #macro Q(f ,...) printf(f ,__VA_ARGS__) - | ^~~~~~ - */ - const cpp_token *token = _cpp_lex_token(pfile); - location_t src_loc = token->src_loc; - - if(token->type != CPP_OPEN_PAREN){ - cpp_error_with_line( - pfile - ,CPP_DL_ERROR - ,src_loc - ,0 - ,"expected '(' to open arguments list, but found: %s" - ,cpp_token_as_text(token) - ); - goto out; - } - - if( !parse_params(pfile, ¶mc, &varadic) ) goto out; - - // finalizes the reserved room, otherwise it will be reused on the next reserve room call. - params = (cpp_hashnode **)_cpp_commit_buff( pfile, sizeof (cpp_hashnode *) * paramc ); - token = NULL; - - /* parse body macro - - A macro struct instance is variable size, due to tokens added to the macro.exp.tokens - during parse, and possible reallocations. - - Function like macros will later need space to hold parameter values. - */ - macro = _cpp_new_macro( - pfile - ,cmk_macro - ,_cpp_reserve_room( pfile, 0, sizeof(cpp_macro) ) - ); - // used by parse_clause_noexpand - macro->variadic = varadic; - macro->paramc = paramc; - macro->parm.params = params; - macro->fun_like = true; - - PCPSO_choice choice; - status = parse_clause_paren_square_option( - pfile - ,macro - ,&choice - ,false // not a commas list - ,&src_loc - ,NULL // don't need to know the terminator - ,&num_extra_tokens - ); - if( status != PCS_COMPLETE ){ - fprintf(stderr, "parse_paren_clause returned: "); - print_clause_parse_status(status); - goto out; - } - - #if DebugRTMacro - fprintf(stderr,"rt_macro directive body tokens:\n"); - print_token_list(macro->exp.tokens ,macro->count); - #endif - - // commit the macro, attach the parameter list - ok = true; - macro = (cpp_macro *)_cpp_commit_buff( - pfile - , - sizeof (cpp_macro) - - sizeof (cpp_token) - + sizeof (cpp_token) * macro->count - + sizeof(cpp_hashnode *) * paramc - ); - macro->variadic = varadic; - macro->paramc = paramc; - macro->parm.params = params; - macro->fun_like = true; - - /* some end cases we must clean up - */ - /* - It might be that the first token of the macro body was preceded by white space, so - the white space flag is set. However, upon expansion, there might not be a white - space before said token, so the following code clears the flag. - */ - if (macro->count) - macro->exp.tokens[0].flags &= ~PREV_WHITE; - - /* - Identifies consecutive ## tokens (a.k.a. 
CPP_PASTE) that were invalid or ambiguous, - - Removes them from the main macro body, - - Stashes them at the end of the tokens[] array in the same memory, - - Sets macro->extra_tokens = 1 to signal their presence. - */ - if (num_extra_tokens) - { - /* Place second and subsequent ## or %:%: tokens in sequences of - consecutive such tokens at the end of the list to preserve - information about where they appear, how they are spelt and - whether they are preceded by whitespace without otherwise - interfering with macro expansion. Remember, this is - extremely rare, so efficiency is not a priority. */ - cpp_token *temp = (cpp_token *)_cpp_reserve_room - (pfile, 0, num_extra_tokens * sizeof (cpp_token)); - unsigned extra_ix = 0, norm_ix = 0; - cpp_token *exp = macro->exp.tokens; - for (unsigned ix = 0; ix != macro->count; ix++) - if (exp[ix].type == CPP_PASTE) - temp[extra_ix++] = exp[ix]; - else - exp[norm_ix++] = exp[ix]; - memcpy (&exp[norm_ix], temp, num_extra_tokens * sizeof (cpp_token)); - - /* Record there are extra tokens. */ - macro->extra_tokens = 1; - } - - out: - - /* - - This resets a flag in the parser’s state machine, pfile. - - The field `va_args_ok` tracks whether the current macro body is allowed to reference `__VA_ARGS__` (or more precisely, `__VA_OPT__`). - - It's set **while parsing a macro body** that might use variadic logic — particularly in `vaopt_state` tracking. - - Resetting it here ensures that future macros aren't accidentally parsed under the assumption that variadic substitution is valid. - */ - pfile->state.va_args_ok = 0; - - /* - Earlier we did: - if (!parse_params(pfile, ¶mc, &variadic)) goto out; - This cleans up temporary memory used by parse_params. - */ - _cpp_unsave_parameters (pfile, paramc); - - return ok ? macro : NULL; - -} - -/* - called from directives.cc:: do_macro -*/ -bool -_cpp_create_rt_macro(cpp_reader *pfile, cpp_hashnode *node){ - - #if DebugRTMacro - fprintf(stderr,"entering _cpp_create_macro\n"); - #endif - - cpp_macro *macro; - macro = create_rt_macro (pfile); - - if (!macro) - return false; - - if (cpp_macro_p (node)) - { - if (CPP_OPTION (pfile, warn_unused_macros)) - _cpp_warn_if_unused_macro (pfile, node, NULL); - - if (warn_of_redefinition (pfile, node, macro)) - { - const enum cpp_warning_reason reason - = (cpp_builtin_macro_p (node) && !(node->flags & NODE_WARN)) - ? CPP_W_BUILTIN_MACRO_REDEFINED : CPP_W_NONE; - - bool warned = - cpp_pedwarning_with_line (pfile, reason, - pfile->directive_line, 0, - "\"%s\" redefined", NODE_NAME (node)); - - if (warned && cpp_user_macro_p (node)) - cpp_error_with_line (pfile, CPP_DL_NOTE, - node->value.macro->line, 0, - "this is the location of the previous definition"); - } - _cpp_free_definition (node); - } - - /* Enter definition in hash table. */ - node->type = NT_USER_MACRO; - node->value.macro = macro; - if (! ustrncmp (NODE_NAME (node), DSC ("__STDC_")) - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_FORMAT_MACROS") - /* __STDC_LIMIT_MACROS and __STDC_CONSTANT_MACROS are mentioned - in the C standard, as something that one must use in C++. - However DR#593 and C++11 indicate that they play no role in C++. - We special-case them anyway. 
*/ - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_LIMIT_MACROS") - && ustrcmp (NODE_NAME (node), (const uchar *) "__STDC_CONSTANT_MACROS")) - node->flags |= NODE_WARN; - - /* If user defines one of the conditional macros, remove the - conditional flag */ - node->flags &= ~NODE_CONDITIONAL; - - return true; -} - -/*-------------------------------------------------------------------------------- - RT builtin macro extensions - -*/ - - static const uchar *evaluate_RT_ASSIGN(cpp_reader *pfile){ - if( ! _cpp_create_assign(pfile) ){ - cpp_error( - pfile - ,CPP_DL_ERROR - ,"#assign macro failed" - ); - } - // return UC""; // returning null string gave an internal compiler error - return NULL; // expands as `1` dunno why, but the code is there to make it happen - // return UC" "; // another internal compiler error - } - - static const uchar *evaluate_RT_TO_ARG_LIST(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#to_arg_list macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_TO_TOKEN_LIST(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#to_token_list macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_FIRST(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#first macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_REST(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#rest macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_MAP(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#map macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_AL_MAP(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#al_map macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_IF(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#if macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_NOT(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#not macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_AND(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#and macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_OR(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#or macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_IS_IDENTIFIER(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#is_identifier macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_IS_NAME(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#is_name macro evaluated" - ); - return NULL; - } - - static const uchar *evaluate_RT_PASTE(cpp_reader *pfile){ - cpp_error( - pfile - ,CPP_DL_NOTE - ,"#paste macro evaluated" - ); - return NULL; - } - - -#if 0 - static const uchar *evaluate_RT_ASSIGN(cpp_reader *pfile){ - return UC"_ASSIGN"; - } - - static const uchar *evaluate_RT_TO_ARG_LIST(cpp_reader *pfile){ - return UC"_TO_ARG_LIST"; - } - - static const uchar *evaluate_RT_TO_TOKEN_LIST(cpp_reader *pfile){ - return UC"_TO_TOKEN_LIST"; - } - - static const uchar *evaluate_RT_FIRST(cpp_reader *pfile){ - return UC"_FIRST"; - } - - static const uchar *evaluate_RT_REST(cpp_reader *pfile){ - return UC"_REST"; - } - - static const uchar *evaluate_RT_MAP(cpp_reader *pfile){ - return UC"_MAP"; - } - - static const uchar *evaluate_RT_AL_MAP(cpp_reader *pfile){ - return UC"_AL_MAP"; - } - - static const uchar *evaluate_RT_IF(cpp_reader *pfile){ - return UC"_IF"; - } - - static const uchar 
*evaluate_RT_NOT(cpp_reader *pfile){ - return UC"_NOT"; - } - - static const uchar *evaluate_RT_AND(cpp_reader *pfile){ - return UC"_AND"; - } - - static const uchar *evaluate_RT_OR(cpp_reader *pfile){ - return UC"_OR"; - } - - static const uchar *evaluate_RT_IS_IDENTIFIER(cpp_reader *pfile){ - return UC"_IS_IDENTIFIER"; - } - - static const uchar *evaluate_RT_IS_NAME(cpp_reader *pfile){ - return UC"_IS_NAME"; - } - - static const uchar *evaluate_RT_PASTE(cpp_reader *pfile){ - return UC"_PASTE"; - } -#endif -#if 0 -/*───────────────────────── RT helper utilities ─────────────────────────*/ - -/* rt_read_paren_argument - Consume exactly one parenthesised argument list, collecting every token - between the outer ‘( … )’ into OUT_LIST. Returns true on success and - emits its own diagnostic on failure. */ -static bool rt_read_paren_argument(cpp_reader *pfile ,vec &out_list){ - const cpp_token *tok = cpp_get_token_no_padding(pfile); - if(tok->type != CPP_OPEN_PAREN){ - cpp_error(pfile ,CPP_DL_ERROR ,"missing '(' after built-in macro"); - return false; - } - unsigned depth = 1; - while(depth){ - tok = cpp_get_token_1(pfile ,nullptr); - if(tok->type == CPP_EOF){ - cpp_error(pfile ,CPP_DL_ERROR ,"unterminated argument list in built-in macro"); - return false; - } - if(tok->type == CPP_OPEN_PAREN) depth++; - else if(tok->type == CPP_CLOSE_PAREN){ - if(--depth == 0) break; - } - if(depth) out_list.safe_push(*tok); /* omit the final ‘)’ */ - } - return true; -} - -/* rt_tokens_as_text - Spell a vector of tokens back into a single space-separated byte string - allocated from the preprocessor’s permanent pool. */ -static const uchar *rt_tokens_as_text(cpp_reader *pfile ,const vec &src){ - size_t reserve = src.length()*20 + 1; /* generous bound */ - uchar *buf = _cpp_unaligned_alloc(pfile ,reserve); - uchar *dst = buf; - for(unsigned i = 0 ;i < src.length() ;++i){ - if(i) *dst++ = ' '; - dst = cpp_spell_token(pfile ,&src[i] ,dst ,true); - } - *dst = '\0'; - return buf; -} - -/*──────────────────── _FIRST(token_list) implementation ─────────────────*/ - -static const uchar *evaluate_RT_FIRST(cpp_reader *pfile){ - vec list; - if(!rt_read_paren_argument(pfile ,list)) return UC""; - unsigned idx = 0; - while(idx < list.length() && list[idx].type == CPP_PADDING) ++idx; - if(idx == list.length()) return UC""; /* empty list */ - vec one; - one.safe_push(list[idx]); - return rt_tokens_as_text(pfile ,one); -} - -/*──────────────────── _REST(token_list) implementation ──────────────────*/ - -static const uchar *evaluate_RT_REST(cpp_reader *pfile){ - vec list; - if(!rt_read_paren_argument(pfile ,list)) return UC""; - unsigned first = 0; - while(first < list.length() && list[first].type == CPP_PADDING) ++first; - if(first >= list.length() - 1) return UC""; /* one-token list */ - vec rest; - for(unsigned i = first + 1 ;i < list.length() ;++i) rest.safe_push(list[i]); - return rt_tokens_as_text(pfile ,rest); -} -#endif diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/mv_libs_to_gcc.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/mv_libs_to_gcc.sh" deleted file mode 100755 index 9e4b5e5..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/mv_libs_to_gcc.sh" +++ /dev/null @@ -1,39 +0,0 @@ -#!/bin/bash -# mv_libs_to_gcc.sh -# Move prerequisite libraries into the GCC source tree, replacing stale copies. -# This script can be run multiple times for incremental moves when more sources become available. 
- -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -LIB_LIST=( - "gmp" "$GMP_SRC" - "mpfr" "$MPFR_SRC" - "mpc" "$MPC_SRC" - "isl" "$ISL_SRC" - "zstd" "$ZSTD_SRC" -) - -i=0 -while [ $i -lt ${#LIB_LIST[@]} ]; do - lib="${LIB_LIST[$i]}" - src="${LIB_LIST[$((i + 1))]}" - dest="$GCC_SRC/$lib" - i=$((i + 2)) - - if [[ ! -d "$src" ]]; then - echo "Source not found, skipping: $src" - continue - fi - - if [[ -d "$dest" ]]; then - echo "Removing stale: $dest" - rm -rf "$dest" - fi - - echo "mv $src $dest" - mv "$src" "$dest" -done - -echo "completed mv_libs_to_gcc.sh" diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_download.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_download.sh" deleted file mode 100755 index 0aa20bc..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_download.sh" +++ /dev/null @@ -1,131 +0,0 @@ -#!/bin/bash -# This script can be run multiple times to download what was missed on prior invocations -# If there is a corrupt tarball, delete it and run this again -# Sometimes a connection test will fails, then the downloads runs anyway - -set -uo pipefail # no `-e`, we want to continue on error - -source "$(dirname "$0")/environment.sh" - -check_internet_connection() { - echo "🌐 Checking internet connection..." - # Use a quick connection check without blocking the whole script - if ! curl -s --connect-timeout 5 https://google.com > /dev/null; then - echo "⚠️ No internet connection detected (proceeding with download anyway)" - else - echo "✅ Internet connection detected" - fi -} - -# check_server_reachability() { -# local url=$1 -# if ! curl -s --head "$url" | head -n 1 | grep -q "HTTP/1.1 200 OK"; then -# echo "⚠️ Cannot reach $url (proceeding with download anyway)" -# fi -# } - -check_server_reachability() { - local url=$1 - echo "checking is reachable: $url " - - # Attempt to get the HTTP response code without following redirects - http_code=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "$url") - - # If the HTTP code is between 200 and 299, consider it reachable - if [[ "$http_code" -ge 200 && "$http_code" -lt 300 ]]; then - echo "✅ Server reachable (HTTP $http_code): $url " - else - # If not 2xx, print the status code for transparency - echo "⚠️ Server HTTP $http_code not 2xx, will try anyway: $url" - fi -} - -check_file_exists() { - local file=$1 - [[ -f "$UPSTREAM/$file" ]] -} - -download_file() { - local file=$1 - local url=$2 - - echo "Downloading $file from $url..." - if (cd "$UPSTREAM" && curl -LO "$url"); then - if file "$UPSTREAM/$file" | grep -qi 'html'; then - echo "❌ Invalid download (HTML, not archive): $file" - rm -f "$UPSTREAM/$file" - return 1 - elif [[ -f "$UPSTREAM/$file" ]]; then - echo "✅ Successfully downloaded: $file" - return 0 - # Validate it's not an HTML error page - else - echo "❌ Did not appear after download: $file " - return 1 - fi - else - echo "❌ Failed to download: $file" - return 1 - fi -} - -download_tarballs() { - i=0 - while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - url="${UPSTREAM_TARBALL_LIST[$((i+1))]}" - i=$((i + 3)) - - if check_file_exists "$tarball"; then - echo "⚡ already exists, skipping download: $tarball " - continue - fi - - check_server_reachability "$url" - - if ! 
download_file "$tarball" "$url"; then - echo "⚠️ Skipping due to previous error: $tarball " - fi - - done -} - -download_git_repos() { - i=0 - while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do - repo="${UPSTREAM_GIT_REPO_LIST[$i]}" - branch="${UPSTREAM_GIT_REPO_LIST[$((i+1))]}" - dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" - - if [[ -d "$dir/.git" ]]; then - echo "⚡ Already exists, skipping git clone: $dir " - i=$((i + 3)) - continue - fi - - echo "Cloning $repo into $dir..." - if ! git clone --branch "$branch" "$repo" "$dir"; then - echo "❌ Failed to clone $repo → $dir" - fi - - i=$((i + 3)) - done -} - -# do the downloads - -check_internet_connection - -echo "Downloading tarballs:" -for ((i=0; i<${#UPSTREAM_TARBALL_LIST[@]}; i+=3)); do - echo " - ${UPSTREAM_TARBALL_LIST[i]}" -done -download_tarballs - -echo "Cloning Git repositories:" -for ((i=0; i<${#UPSTREAM_GIT_REPO_LIST[@]}; i+=3)); do - echo " - ${UPSTREAM_GIT_REPO_LIST[i]} (branch ${UPSTREAM_GIT_REPO_LIST[i+1]})" -done -download_git_repos - -echo "project_download.sh completed" diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_extract.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_extract.sh" deleted file mode 100755 index 272470d..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_extract.sh" +++ /dev/null @@ -1,55 +0,0 @@ -#!/bin/bash -# extracts (unpacks) the source tarballs held in upstream/ into the source/ directory. -# Will not extract if target already exists -# Delete any malformed extractions before running again -# -# gcc is not installed as a tar file, rather it is git cloned directly into source/ as part of the downloading from upstream sources. Hence, there is nothing to extract. - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -had_error=0 -i=0 - -while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - i=$((i + 3)) - - src_path="$UPSTREAM/$tarball" - - # Strip compression suffix to guess subdirectory name - base_name="${tarball%%.tar.*}" # safer across .tar.gz, .tar.zst, etc. - target_dir="$SRC/$base_name" - - if [[ -d "$target_dir" ]]; then - echo "⚡ Already exists, skipping: $target_dir" - continue - fi - - if [[ ! -f "$src_path" ]]; then - echo "❌ Missing tarball: $src_path" - had_error=1 - continue - fi - - echo "tar -xf $tarball" - if ! (cd "$SRC" && tar -xf "$src_path"); then - echo "❌ Extraction failed: $tarball" - had_error=1 - continue - fi - - if [[ -d "$target_dir" ]]; then - echo "Extracted to: $target_dir" - else - echo "❌ Target not found after extraction: $target_dir" - had_error=1 - fi -done - -if [[ $had_error -eq 0 ]]; then - echo "✅ All tarballs extracted successfully" -else - echo "❌ Some extractions failed or were incomplete" -fi diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_requisites.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_requisites.sh" deleted file mode 100755 index 1688e92..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_requisites.sh" +++ /dev/null @@ -1,161 +0,0 @@ -#!/bin/bash -# project_requisites.sh -# Checks that all required tools, libraries, and sources are available -# before proceeding with the GCC build. - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "Checking requisites for native standalone GCC build." - -if ! 
command -v pkg-config >/dev/null; then - echo "❌ pkg-config command required for this script" - echo " Debian: sudo apt install pkg-config" - echo " Fedora: sudo dnf install pkg-config" - exit 1 -fi - -missing_requisite_list=() -failed_pkg_config_list=() -found_requisite_list=() - -# --- Required Script Tools (must be usable by this script itself) --- -script_tools=( - bash - awk - sed - grep -) - -echo "Checking for essential script dependencies." -for tool in "${script_tools[@]}"; do - location=$(command -v "$tool") - if [ $? -eq 0 ]; then - found_requisite_list+=("$location") - else - missing_requisite_list+=("tool: $tool") - fi -done - -# --- Build Tools --- -build_tools=( - gcc - g++ - make - tar - gzip - bzip2 - perl - patch - diff - python3 -) - -echo "Checking for required build tools." -for tool in "${build_tools[@]}"; do - location=$(command -v "$tool") - if [ $? -eq 0 ]; then - found_requisite_list+=("$location") - else - missing_requisite_list+=("tool: $tool") - fi -done - -# --- Libraries via pkg-config --- -required_pkgs=( - gmp - mpfr - mpc - isl - zstd -) - -echo "Checking for required development libraries (via pkg-config)." -for lib in "${required_pkgs[@]}"; do - if pkg-config --exists "$lib"; then - libdir=$(pkg-config --variable=libdir "$lib" 2>/dev/null) - soname="lib$lib.so" - - if [[ -f "$libdir/$soname" ]]; then - found_requisite_list+=("library: $lib @ $libdir/$soname") - else - found_requisite_list+=("library: $lib @ (not found in $libdir)") - fi - else - failed_pkg_config_list+=("library: $lib") - fi -done - -# --- Source Trees --- -echo "Checking for required source directories." -echo "These will be installed by project_download.sh and project_extract.sh" -for src in "${SOURCE_DIR_LIST[@]}"; do - if [[ -d "$src" && "$(ls -A "$src")" ]]; then - found_requisite_list+=("source: $src") - else - missing_requisite_list+=("source: $src") - fi -done - -# --- Optional Python Modules --- -optional_py_modules=( - re sys os json gzip pathlib shutil time tempfile -) - -echo "Checking optional Python3 modules." -for mod in "${optional_py_modules[@]}"; do - if python3 -c "import $mod" &>/dev/null; then - found_requisite_list+=("python: module $mod") - else - missing_requisite_list+=("python (optional): module $mod") - fi -done - -glibc_version=$(ldd --version 2>/dev/null | head -n1 | grep -oE '[0-9]+\.[0-9]+' | head -n1) -glibc_path=$(ldd /bin/ls | grep 'libc.so.6' | awk '{print $3}') -if [[ -n "$glibc_version" && -f "$glibc_path" ]]; then - found_requisite_list+=("library: glibc @ $glibc_path (version $glibc_version)") -else - missing_requisite_list+=("library: glibc") -fi - - -echo -echo "Summary:" -echo "--------" - -for item in "${found_requisite_list[@]}"; do - echo " found: $item" -done - -for item in "${missing_requisite_list[@]:-}"; do - echo "❌ missing required: $item" -done - -for item in "${failed_pkg_config_list[@]:-}"; do - echo "⚠️ pkg-config could not find: $item" -done - -echo - -if [[ ${#missing_requisite_list[@]} -eq 0 && ${#failed_pkg_config_list[@]} -eq 0 ]]; then - echo "✅ All required tools and sources are present." -else - echo "❌ Some requisites are missing or unresolved." 
- if [[ ${#failed_pkg_config_list[@]} -gt 0 ]]; then - echo - echo "Note: The following libraries were not found by pkg-config:" - for item in "${failed_pkg_config_list[@]}"; do - echo " - $item" - done - echo - echo "The following are expected to be missing if you are building them from source:" - echo " - mpc" - echo " - isl" - echo " - zstd" - echo "If not, consider installing the appropriate development packages:" - echo " Debian: sudo apt install libmpc-dev libisl-dev libzstd-dev" - echo " Fedora: sudo dnf install libmpc-devel isl-devel libzstd-devel" - fi -fi diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_setup.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_setup.sh" deleted file mode 100755 index 953a99c..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/project_setup.sh" +++ /dev/null @@ -1,47 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -# Create top-level project directories -for dir in "${PROJECT_DIR_LIST[@]}"; do - echo "mkdir -p $dir" - mkdir -p "$dir" -done - -# Create subdirectories within SYSROOT -for subdir in "${PROJECT_SUBDIR_LIST[@]}"; do - echo "mkdir -p $subdir" - mkdir -p "$subdir" -done - -# Ensure TMPDIR exists and add .gitignore -if [[ ! -d "$TMPDIR" ]]; then - echo "mkdir -p $TMPDIR" - mkdir -p "$TMPDIR" - - echo "$TMPDIR/" > "$TMPDIR/.gitignore" -else - echo "⚠️ TMPDIR already exists" -fi - -# Create root-level .gitignore if missing -if [[ -f "$REPO_HOME/.gitignore" ]]; then - echo "⚠️ $REPO_HOME/.gitignore already exists" -else - echo "create $REPO_HOME/.gitignore" - { - echo "# Ignore synthesized top-level directories" - for dir in "${PROJECT_DIR_LIST[@]}"; do - rel_path="${dir#$REPO_HOME/}" - echo "/$rel_path" - done - echo "# Ignore synthesized files" - echo "/.gitignore" - } > "$REPO_HOME/.gitignore" -fi - -echo -echo "Created project structure:" -# tree -L 2 "$REPO_HOME" 2>/dev/null || find "$REPO_HOME" -maxdepth 2 - diff --git "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/rebuild_gcc.sh" "b/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/rebuild_gcc.sh" deleted file mode 100755 index 447442a..0000000 --- "a/developer/script_Deb-12.10_gcc-12.4.1\360\237\226\211/rebuild_gcc.sh" +++ /dev/null @@ -1,21 +0,0 @@ -#!/bin/bash -# rebuild_gcc.sh – no structural changes, and build directory is still intact - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🔧 Starting GCC rebuild..." - -pushd "$GCC_BUILD" - - echo "gcc: $(command -v gcc)" - echo "toolchain: $TOOLCHAIN" - - $MAKE -j"$MAKE_JOBS" - $MAKE install - -popd - -echo "✅ GCC re-installed to $TOOLCHAIN/bin" -"$TOOLCHAIN/bin/gcc" --version diff --git a/developer/script_gcc-15/README.org b/developer/script_gcc-15/README.org new file mode 100755 index 0000000..44936b8 --- /dev/null +++ b/developer/script_gcc-15/README.org @@ -0,0 +1,34 @@ + +This was a bear to get running, and I have given up on it for now. + +It dies early in the chain at the making of glibc and the crt.o +etc. files. + +# Scripts for building standalone gcc + +The scripts here will build a standalone gcc along with version compatible tools. + +There is a lot more to a gcc than one might imagine. It was developed as though an integral part of Unix. Hence, the standalone build has top level directories with many things in them, in parallel to the top level directories a person would find on a Unix system. + +## .gitignore + +* If there is no top level `.gitignore`, `setup_project.sh` will create one. 
+* The synthesized `.gitignore` references itself, so it will not get checked in. +* No script, including`clean_dist.sh`, will delete an existing `.gitignore`. + +## Clean + +* clean_build.sh - for saving space after the build is done. The build scripts are idempotent, so in an ideal world this need not be run to do a rebuild. + +* clean_dist.sh - with on exception, this will delete everything that was synthesized. The one exception is that .gitignore is moved to the tmp directory so as to preserve any changes a user might have been made, and the contents of the tmp directory are not removed. + +* clean_tmp.sh - wipes clean all contents of the temporary directory. + +## Setup + +* setup_project.sh - makes the directory structure for the build, creates a `tmp/` directory under the project. If it does not already exist, creates a .gitignore file with all the created directories listed. + +## Download + +* download_upstream_sources.sh - goes to the Internet, fetches all the sources that have not already been fetched. Then expands the sources into the proper sub-directory under `source/1. + diff --git a/developer/script_gcc-15/audit_glibc_headers.sh b/developer/script_gcc-15/audit_glibc_headers.sh new file mode 100755 index 0000000..64ddd57 --- /dev/null +++ b/developer/script_gcc-15/audit_glibc_headers.sh @@ -0,0 +1,50 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🔎 Auditing glibc build state..." + +declare -a missing +declare -a present + +# GLIBC_BUILD sanity +[[ -d "$GLIBC_BUILD" ]] && present+=("GLIBC_BUILD exists: $GLIBC_BUILD") || missing+=("GLIBC_BUILD missing") + +# Check for Makefile +if [[ -s "$GLIBC_BUILD/Makefile" ]]; then + present+=("Makefile exists and non-empty") +else + missing+=("Makefile missing or empty in $GLIBC_BUILD") +fi + +# Check bits/stdio_lim.h +if [[ -f "$GLIBC_BUILD/bits/stdio_lim.h" ]]; then + present+=("bits/stdio_lim.h exists (post-header install marker)") +else + missing+=("bits/stdio_lim.h missing — make install-headers likely incomplete") +fi + +# Check csu/Makefile +if [[ -f "$GLIBC_BUILD/csu/Makefile" ]]; then + present+=("csu/Makefile exists") + grep -q 'crt1\.o' "$GLIBC_BUILD/csu/Makefile" \ + && present+=("csu/Makefile defines crt1.o") \ + || missing+=("csu/Makefile missing rule for crt1.o") +else + missing+=("csu/Makefile missing") +fi + +# Show report +echo "✅ Present:" +for p in "${present[@]}"; do echo " $p"; done + +echo +if (( ${#missing[@]} )); then + echo "❌ Missing:" + for m in "${missing[@]}"; do echo " $m"; done + exit 1 +else + echo "🎉 All bootstrap prerequisites are in place" + exit 0 +fi diff --git a/developer/script_gcc-15/build_Linux_requisites.sh b/developer/script_gcc-15/build_Linux_requisites.sh new file mode 100755 index 0000000..42fa7b7 --- /dev/null +++ b/developer/script_gcc-15/build_Linux_requisites.sh @@ -0,0 +1,62 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking requisites for Linux kernel headers build." + +missing_requisite_list=() +found_requisite_list=() + +# tools required for build +# +required_tools=( + "gcc" + "g++" + "make" +) + +for tool in "${required_tools[@]}"; do + location=$(command -v "$tool") # Fixed this part to use $tool instead of "tool" + if [ $? 
-eq 0 ]; then # Check if the command was successful + found_requisite_list+=("$location") + else + missing_requisite_list+=("$tool") + fi +done + +# source code required for build +# +if [ -d "$LINUX_SRC" ] && [ "$(ls -A "$LINUX_SRC")" ]; then + found_requisite_list+=("$LINUX_SRC") +else + missing_requisite_list+=("$LINUX_SRC") +fi + +# print requisites found +# +if (( ${#found_requisite_list[@]} != 0 )); then + echo "found:" + for found_requisite in "${found_requisite_list[@]}"; do + echo " $found_requisite" + done +fi + +# print requisites missing +# +if (( ${#missing_requisite_list[@]} != 0 )); then + echo "missing:" + for missing_requisite in "${missing_requisite_list[@]}"; do + echo " $missing_requisite" + done +fi + +# in conclusion +# +if (( ${#missing_requisite_list[@]} > 0 )); then + echo "❌ Missing requisites" + exit 1 +else + echo "✅ All checked specified requisites found" + exit 0 +fi diff --git a/developer/script_gcc-15/build_all.sh b/developer/script_gcc-15/build_all.sh new file mode 100755 index 0000000..6f55a28 --- /dev/null +++ b/developer/script_gcc-15/build_all.sh @@ -0,0 +1,36 @@ +#!/bin/bash +set -euo pipefail + +cd "$(dirname "$0")" +SCRIPT_DIR="$PWD" + +echo "loading environment" +source "$SCRIPT_DIR/environment.sh" + +cd "$SCRIPT_DIR" + +./project_setup.sh +./project_download.sh +./project_extract.sh +./project_requisites + +./build_binutils_requisites.sh +./build_binutils.sh + +./build_linux_requisites.sh +./build_linux.sh + +#./build_glibc_bootstrap_requisites.sh +./build_glibc_bootstrap.sh + +./build_gcc_stage1_requisites.sh +./build_gcc_stage1.sh + +./build_glibc_requisites.sh +./build_glibc.sh + +./build_gcc_final_requisites.sh +./build_gcc_final.sh + +echo "✅ Toolchain build complete" +"$TOOLCHAIN/bin/gcc" --version diff --git a/developer/script_gcc-15/build_binutils.sh b/developer/script_gcc-15/build_binutils.sh new file mode 100755 index 0000000..959e91c --- /dev/null +++ b/developer/script_gcc-15/build_binutils.sh @@ -0,0 +1,32 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +mkdir -p "$BINUTILS_BUILD" +pushd "$BINUTILS_BUILD" + + "$BINUTILS_SRC/configure" \ + --prefix="$TOOLCHAIN" \ + --with-sysroot="$SYSROOT" \ + --disable-nls \ + --disable-werror \ + --disable-multilib \ + --enable-deterministic-archives \ + --enable-plugins \ + --with-lib-path="$SYSROOT/lib:$SYSROOT/usr/lib" + + $MAKE + $MAKE install + + # Verify installation + if [[ -x "$TOOLCHAIN/bin/ld" ]]; then + echo "✅ Binutils installed in $TOOLCHAIN/bin" + exit 0 + fi + + echo "❌ Binutils install incomplete" + exit 1 + +popd + diff --git a/developer/script_gcc-15/build_binutils_requisites.sh b/developer/script_gcc-15/build_binutils_requisites.sh new file mode 100755 index 0000000..317378e --- /dev/null +++ b/developer/script_gcc-15/build_binutils_requisites.sh @@ -0,0 +1,62 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking requisites for binutils (bootstrap)." + +missing_requisite_list=() +found_requisite_list=() + +# tool required for build +# + required_tools=( + "gcc" + "g++" + "make" + ) + + for tool in "${required_tools[@]}"; do + location=$(command -v "$tool") + if [ $? 
-eq 0 ]; then + found_requisite_list+=("$location") + else + missing_requisite_list+=("$tool") + fi + done + +# source code required for build +# + if [ -d "$BINUTILS_SRC" ] && [ "$(ls -A "$BINUTILS_SRC")" ]; then + found_requisite_list+=("$BINUTILS_SRC") + else + missing_requisite_list+=("$BINUTILS_SRC") + fi + +# print requisites found +# + if (( ${#found_requisite_list[@]} != 0 )); then + echo "found:" + for found_requisite in "${found_requisite_list[@]}"; do + echo " $found_requisite" + done + fi + +# print requisites missing +# + if (( ${#missing_requisite_list[@]} != 0 )); then + echo "missing:" + for missing_requisite in "${missing_requisite_list[@]}"; do + echo " $missing_requisite" + done + fi + +# in conclusion +# + if (( ${#missing_requisite_list[@]} > 0 )); then + echo "❌ Missing requisites" + exit 1 + else + echo "✅ All checked specified requisites found" + exit 0 + fi diff --git a/developer/script_gcc-15/build_gcc_final.sh b/developer/script_gcc-15/build_gcc_final.sh new file mode 100755 index 0000000..1e6ca88 --- /dev/null +++ b/developer/script_gcc-15/build_gcc_final.sh @@ -0,0 +1,31 @@ +#!/bin/bash +set -euo pipefail + +# Load environment +source "$(dirname "$0")/environment.sh" + +echo "🔧 Starting final GCC build..." + +mkdir -p "$GCC_BUILD_FINAL" +pushd "$GCC_BUILD_FINAL" + +"$GCC_SRC/configure" \ + --prefix="$TOOLCHAIN" \ + --with-sysroot="$SYSROOT" \ + --with-native-system-header-dir=/usr/include \ + --target="$TARGET" \ + --enable-languages=c,c++ \ + --enable-threads=posix \ + --enable-shared \ + --disable-nls \ + --disable-multilib \ + --disable-bootstrap \ + --disable-libsanitizer \ + $CONFIGURE_FLAGS + +$MAKE +$MAKE install + +popd + +echo "✅ Final GCC installed to $TOOLCHAIN/bin" diff --git a/developer/script_gcc-15/build_gcc_final_requisites.sh b/developer/script_gcc-15/build_gcc_final_requisites.sh new file mode 100755 index 0000000..5d36697 --- /dev/null +++ b/developer/script_gcc-15/build_gcc_final_requisites.sh @@ -0,0 +1,46 @@ +#!/bin/bash +set -euo pipefail + +# Load shared environment +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking prerequisites for final GCC..." + +# Required host tools +required_tools=(gcc g++ make curl tar gawk bison flex) +missing_tools=() + +for tool in "${required_tools[@]}"; do + if ! type -P "$tool" > /dev/null; then + missing_tools+=("$tool") + fi +done + +if (( ${#missing_tools[@]} )); then + echo "❌ Missing required tools: ${missing_tools[*]}" + exit 1 +fi + +# Check for libc headers and startup objects in sysroot +required_headers=("$SYSROOT/include/stdio.h") +required_crt_objects=( + "$SYSROOT/lib/crt1.o" + "$SYSROOT/lib/crti.o" + "$SYSROOT/lib/crtn.o" +) + +for hdr in "${required_headers[@]}"; do + if [ ! -f "$hdr" ]; then + echo "❌ C library header missing: $hdr" + exit 1 + fi +done + +for obj in "${required_crt_objects[@]}"; do + if [ ! -f "$obj" ]; then + echo "❌ Startup object missing: $obj" + exit 1 + fi +done + +echo "✅ Prerequisites for final GCC met." diff --git a/developer/script_gcc-15/build_gcc_stage1.sh b/developer/script_gcc-15/build_gcc_stage1.sh new file mode 100755 index 0000000..2aae2c7 --- /dev/null +++ b/developer/script_gcc-15/build_gcc_stage1.sh @@ -0,0 +1,69 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🔧 Starting stage 1 GCC build (native layout)..." 
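# Context for this stage (explanatory sketch, not used by the build): stage 1 is a
# minimal C-only compiler built without libc headers (--without-headers --with-newlib
# below). Its only job is to compile glibc; build_gcc_final.sh later rebuilds GCC
# against the populated sysroot. A quick way to confirm the result is wired to the
# project sysroot rather than the host (hypothetical check, variables from environment.sh):
#
#   "$TOOLCHAIN/bin/gcc" -print-sysroot                    # expected to print $SYSROOT
#   echo 'int main(void){return 0;}' > "$TMPDIR/probe.c"
#   "$TOOLCHAIN/bin/gcc" -c "$TMPDIR/probe.c" -o "$TMPDIR/probe.o"   # compile-only; linking still needs glibc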
+ +# 🧼 Clean optionally if forced +if [[ "${CLEAN_STAGE1:-0}" == "1" ]]; then + echo "🧹 Forcing rebuild: cleaning $GCC_BUILD_STAGE1" + rm -rf "$GCC_BUILD_STAGE1" +fi + +mkdir -p "$GCC_BUILD_STAGE1" +pushd "$GCC_BUILD_STAGE1" + +# 🛠️ Configure only if not yet configured +if [[ ! -f Makefile ]]; then + echo "⚙️ Configuring GCC stage 1..." + "$GCC_SRC/configure" \ + --prefix="$TOOLCHAIN" \ + --with-sysroot="$SYSROOT" \ + --with-build-sysroot="$SYSROOT" \ + --with-native-system-header-dir=/include \ + --enable-languages=c \ + --disable-nls \ + --disable-shared \ + --disable-threads \ + --disable-libatomic \ + --disable-libgomp \ + --disable-libquadmath \ + --disable-libssp \ + --disable-multilib \ + --disable-bootstrap \ + --disable-libstdcxx \ + --disable-fixincludes \ + --without-headers \ + --with-newlib +else + echo "✅ GCC already configured, skipping." +fi + +# 🧾 Ensure proper sysroot handling for internal libgcc +export CFLAGS_FOR_TARGET="--sysroot=$SYSROOT" +export CXXFLAGS_FOR_TARGET="--sysroot=$SYSROOT" +export CPPFLAGS_FOR_TARGET="--sysroot=$SYSROOT" +export CFLAGS="--sysroot=$SYSROOT" +export CXXFLAGS="--sysroot=$SYSROOT" + +# 🏗️ Build only if not built +if [[ ! -x "$TOOLCHAIN/bin/gcc" ]]; then + echo "⚙️ Building GCC stage 1..." + make -j"$(nproc)" all-gcc all-target-libgcc + + echo "📦 Installing GCC stage 1 to $TOOLCHAIN" + make install-gcc install-target-libgcc +else + echo "✅ GCC stage 1 already installed at $TOOLCHAIN/bin/gcc, skipping build." +fi + +popd + +# ✅ Final check +if [[ ! -x "$TOOLCHAIN/bin/gcc" ]]; then + echo "❌ Stage 1 GCC not found at $TOOLCHAIN/bin/gcc — build may have failed." + exit 1 +fi + +echo "✅ Stage 1 GCC successfully installed in $TOOLCHAIN/bin" diff --git a/developer/script_gcc-15/build_gcc_stage1_requisites.sh b/developer/script_gcc-15/build_gcc_stage1_requisites.sh new file mode 100755 index 0000000..3098e0f --- /dev/null +++ b/developer/script_gcc-15/build_gcc_stage1_requisites.sh @@ -0,0 +1,28 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking prerequisites for stage 1 GCC (bootstrap)..." + +required_tools=(gcc g++ make tar gawk bison flex) +missing=() + +for tool in "${required_tools[@]}"; do + if ! type -P "$tool" &>/dev/null; then + missing+=("$tool") + fi +done + +if (( ${#missing[@]} )); then + echo "❌ Missing required tools: ${missing[*]}" + exit 1 +fi + +if [ ! -d "$GCC_SRC" ]; then + echo "❌ GCC source not found at $GCC_SRC" + echo "💡 You may need to run: prepare_gcc_sources.sh" + exit 1 +fi + +echo "✅ Prerequisites for stage 1 GCC met." diff --git a/developer/script_gcc-15/build_glibc.sh b/developer/script_gcc-15/build_glibc.sh new file mode 100755 index 0000000..652ebc3 --- /dev/null +++ b/developer/script_gcc-15/build_glibc.sh @@ -0,0 +1,26 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🔧 Building full glibc..." 
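# How the install below lands in the sysroot (explanatory sketch): glibc is configured
# with --prefix=/usr, but the install step runs as DESTDIR="$SYSROOT" $MAKE install, so
# the files are staged under the sysroot while their recorded paths stay /usr-relative.
# A post-install spot check might look like this (hypothetical, not executed by this script):
#
#   ls "$SYSROOT/usr/lib/libc.so"* "$SYSROOT/usr/include/stdio.h"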
+ +mkdir -p "$GLIBC_BUILD" +pushd "$GLIBC_BUILD" + +"$GLIBC_SRC/configure" \ + --prefix=/usr \ + --host="$TARGET" \ + --build="$(gcc -dumpmachine)" \ + --with-headers="$SYSROOT/usr/include" \ + --disable-multilib \ + --enable-static \ + --enable-shared \ + libc_cv_slibdir="/usr/lib" + +$MAKE +DESTDIR="$SYSROOT" $MAKE install + +popd + +echo "✅ Full glibc installed in $SYSROOT" diff --git a/developer/script_gcc-15/build_glibc_bootstrap.sh b/developer/script_gcc-15/build_glibc_bootstrap.sh new file mode 100755 index 0000000..b1c5c73 --- /dev/null +++ b/developer/script_gcc-15/build_glibc_bootstrap.sh @@ -0,0 +1,40 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Building glibc startup files (crt*.o)..." + +pushd "$GLIBC_BUILD" + + # Confirm that required build artifacts are present + if [[ ! -f bits/stdio_lim.h ]]; then + echo "❌ Missing bits/stdio_lim.h — did you run build_glibc_headers.sh?" + exit 1 + fi + + if [[ ! -f csu/Makefile ]]; then + echo "❌ Missing csu/Makefile — glibc configure phase may have failed" + exit 1 + fi + + # Attempt to build the crt startup object files + make csu/crt1.o csu/crti.o csu/crtn.o -j"$MAKE_JOBS" + + # Install them to the sysroot + install -m 644 csu/crt1.o csu/crti.o csu/crtn.o "$SYSROOT/usr/lib" + + # Create a dummy libc.so to satisfy linker if needed + touch "$SYSROOT/usr/lib/libc.so" + + # ✅ Verify installation + for f in crt1.o crti.o crtn.o; do + if [[ ! -f "$SYSROOT/usr/lib/$f" ]]; then + echo "❌ Missing startup file after install: $f" + exit 1 + fi + done + +popd + +echo "✅ glibc startup files installed to $SYSROOT/usr/lib" diff --git a/developer/script_gcc-15/build_glibc_bootstrap_requisites.sh b/developer/script_gcc-15/build_glibc_bootstrap_requisites.sh new file mode 100755 index 0000000..e10ea96 --- /dev/null +++ b/developer/script_gcc-15/build_glibc_bootstrap_requisites.sh @@ -0,0 +1,138 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking requisites for glibc startup file build (crt1.o, crti.o, crtn.o)." + +missing_requisite_list=() +found_requisite_list=() + +# ──────────────────────────────────────────────── +# 1. Required tools +# + required_tools=( + "gcc" + "g++" + "make" + "ld" + "as" + ) + + for tool in "${required_tools[@]}"; do + if location=$(command -v "$tool"); then + found_requisite_list+=("$location") + else + missing_requisite_list+=("$tool") + fi + done + +# ──────────────────────────────────────────────── +# 2. Required directories and sources +# + if [ -d "$GLIBC_SRC" ] && [ "$(ls -A "$GLIBC_SRC")" ]; then + found_requisite_list+=("$GLIBC_SRC") + else + missing_requisite_list+=("$GLIBC_SRC (empty or missing)") + fi + +# ──────────────────────────────────────────────── +# 3. Required sysroot include path with Linux headers +# + linux_headers=( + "$SYSROOT/usr/include/linux/version.h" + "$SYSROOT/usr/include/asm/unistd.h" + "$SYSROOT/usr/include/bits/types.h" + ) + + for header in "${linux_headers[@]}"; do + if [[ -f "$header" ]]; then + found_requisite_list+=("$header") + else + missing_requisite_list+=("$header") + fi + done + +# ──────────────────────────────────────────────── +# 4. Confirm SYSROOT write access +# + if [[ -w "$SYSROOT/usr/include" ]]; then + found_requisite_list+=("SYSROOT writable: $SYSROOT/usr/include") + else + missing_requisite_list+=("SYSROOT not writable: $SYSROOT/usr/include") + fi + +# ──────────────────────────────────────────────── +# Additional sanity checks before header & crt build +# + + # 1. 
Check that the C preprocessor works and headers can be found + echo '#include ' | gcc -E - > /dev/null 2>&1 + if [[ $? -eq 0 ]]; then + found_requisite_list+=("C preprocessor operational: gcc -E works") + else + missing_requisite_list+=("C preprocessor failed: gcc -E on failed") + fi + + # 2. Check that bits/stdio_lim.h exists after headers install (glibc marker) + if [[ -f "$GLIBC_BUILD/bits/stdio_lim.h" ]]; then + found_requisite_list+=("$GLIBC_BUILD/bits/stdio_lim.h (glibc headers marker found)") + else + missing_requisite_list+=("$GLIBC_BUILD/bits/stdio_lim.h missing — headers may not be fully installed") + fi + + # 3. Check for crt objects already present (optional) + for f in crt1.o crti.o crtn.o; do + if [[ -f "$GLIBC_BUILD/csu/$f" ]]; then + found_requisite_list+=("$GLIBC_BUILD/csu/$f (already built)") + fi + done + + # 4. Check that Makefile exists and is non-empty + if [[ -f "$GLIBC_BUILD/Makefile" ]]; then + if [[ -s "$GLIBC_BUILD/Makefile" ]]; then + found_requisite_list+=("$GLIBC_BUILD/Makefile exists and is populated") + else + missing_requisite_list+=("$GLIBC_BUILD/Makefile exists but is empty — incomplete configure?") + fi + else + missing_requisite_list+=("$GLIBC_BUILD/Makefile missing — did configure run?") + fi + + # 5. Check that csu Makefile has rules for crt1.o + if [[ -f "$GLIBC_BUILD/csu/Makefile" ]]; then + if grep -q 'crt1\.o' "$GLIBC_BUILD/csu/Makefile"; then + found_requisite_list+=("csu/Makefile defines crt1.o") + else + missing_requisite_list+=("csu/Makefile does not define crt1.o — possible misconfigure") + fi + else + missing_requisite_list+=("csu/Makefile missing — subdir config likely failed") + fi + +# ──────────────────────────────────────────────── +# Print results +# + if (( ${#found_requisite_list[@]} > 0 )); then + echo "found:" + for item in "${found_requisite_list[@]}"; do + echo " $item" + done + fi + + if (( ${#missing_requisite_list[@]} > 0 )); then + echo "missing:" + for item in "${missing_requisite_list[@]}"; do + echo " $item" + done + fi + + # Final verdict + # + if (( ${#missing_requisite_list[@]} > 0 )); then + echo "❌ Missing requisites for glibc bootstrap" + exit 1 + else + echo "✅ All specified requisites found" + exit 0 + fi diff --git a/developer/script_gcc-15/build_glibc_crt.sh b/developer/script_gcc-15/build_glibc_crt.sh new file mode 100755 index 0000000..cb6d0b5 --- /dev/null +++ b/developer/script_gcc-15/build_glibc_crt.sh @@ -0,0 +1,56 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +# Use separate dir to avoid conflicts with headers +# this build variable should be moved to environment.sh, so that the clean scripts will work: +GLIBC_BUILD_CRT="$ROOT/build/glibc-crt" +rm -rf "$GLIBC_BUILD_CRT" +mkdir -p "$GLIBC_BUILD_CRT" +mkdir -p /home/Thomas/subu_data/developer/RT_gcc/build/glibc-crt/csu +touch /home/Thomas/subu_data/developer/RT_gcc/build/glibc-crt/csu/grcrt1.o +mkdir -p "$GLIBC_BUILD_CRT/include" +cp -r "$SYSROOT/usr/include"/* "$GLIBC_BUILD_CRT/include" + + +pushd "$GLIBC_BUILD_CRT" + + + echo "🧱 Configuring glibc for startup file build..." 
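# Why only the csu targets (explanatory note): crt1.o supplies _start, and crti.o/crtn.o
# supply the init/fini prologue and epilogue that the linker wraps around every program.
# They must exist in the sysroot before a full glibc can be linked, so this script builds
# just those objects out of tree via "make -C $GLIBC_SRC objdir=... csu/crt1.o ...".
# A quick sanity check on the result might be (hypothetical):
#
#   file "$GLIBC_BUILD_CRT/csu/crt1.o"   # expect roughly: "ELF 64-bit ... relocatable"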
+ + # Invoke configure explicitly + "$GLIBC_SRC/configure" \ + --prefix=/usr \ + --build="$HOST" \ + --host="$HOST" \ + --with-headers="$SYSROOT/usr/include" \ + --disable-multilib \ + --enable-static \ + --disable-shared \ + --enable-kernel=4.4.0 \ + libc_cv_slibdir="/usr/lib" + + # Ensure csu/Makefile is generated + #make -C "$GLIBC_SRC" objdir="$GLIBC_BUILD_CRT" csu/subdir_lib -j"$MAKE_JOBS" + make -C "$GLIBC_SRC" objdir="$GLIBC_BUILD_CRT" csu/crt1.o csu/crti.o csu/crtn.o -j"$MAKE_JOBS" + + # Now check and continue + if [[ ! -f "$GLIBC_BUILD_CRT/csu/Makefile" ]]; then + echo "❌ csu/Makefile still not found after configure. Startup build failed." + exit 1 + fi + + echo "🔨 Building crt objects..." + make -C "$GLIBC_SRC" objdir="$GLIBC_BUILD_CRT" csu/crt1.o csu/crti.o csu/crtn.o -j"$MAKE_JOBS" + + echo "📦 Installing crt objects to sysroot..." + install -m 644 "$GLIBC_BUILD_CRT/csu/crt1.o" "$GLIBC_BUILD_CRT/csu/crti.o" "$GLIBC_BUILD_CRT/csu/crtn.o" "$SYSROOT/usr/lib" + touch "$SYSROOT/usr/lib/libc.so" + + for f in crt1.o crti.o crtn.o; do + [[ -f "$SYSROOT/usr/lib/$f" ]] || { echo "❌ Missing $f after install"; exit 1; } + done + +popd +echo "✅ Startup files installed from isolated build dir: $GLIBC_BUILD_CRT" diff --git a/developer/script_gcc-15/build_glibc_crt_requisites.sh b/developer/script_gcc-15/build_glibc_crt_requisites.sh new file mode 100755 index 0000000..7875681 --- /dev/null +++ b/developer/script_gcc-15/build_glibc_crt_requisites.sh @@ -0,0 +1,60 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking requisites for glibc crt (startup files)" + +missing=() +found=() + +# Core toolchain utilities +for tool in gcc g++ make ld as; do + if command -v "$tool" >/dev/null; then + found+=("tool: $tool at $(command -v "$tool")") + else + missing+=("missing tool: $tool") + fi +done + +# GLIBC source/csu +if [[ -d "$GLIBC_SRC/csu" ]]; then + found+=("glibc source/csu present") +else + missing+=("missing: $GLIBC_SRC/csu") +fi + +# Expected headers from sysroot +for h in gnu/libc-version.h stdio.h unistd.h; do + if [[ -f "$SYSROOT/usr/include/$h" ]]; then + found+=("header present: $h") + else + missing+=("missing header: $SYSROOT/usr/include/$h") + fi +done + +# Writable sysroot lib dir +if [[ -w "$SYSROOT/usr/lib" ]]; then + found+=("SYSROOT writable: $SYSROOT/usr/lib") +else + missing+=("not writable: $SYSROOT/usr/lib") +fi + +# Summary output +echo +if (( ${#found[@]} > 0 )); then + echo "✅ Found:" + for item in "${found[@]}"; do echo " $item"; done +fi + +echo +if (( ${#missing[@]} > 0 )); then + echo "❌ Missing:" + for item in "${missing[@]}"; do echo " $item"; done + echo + echo "❌ Requisites check failed" + exit 1 +else + echo "✅ All crt requisites satisfied" + exit 0 +fi diff --git a/developer/script_gcc-15/build_glibc_headers.sh b/developer/script_gcc-15/build_glibc_headers.sh new file mode 100755 index 0000000..abcb5a4 --- /dev/null +++ b/developer/script_gcc-15/build_glibc_headers.sh @@ -0,0 +1,47 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Building and installing glibc headers..." 
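# What install-headers provides (explanatory note): it copies the glibc API headers into
# the sysroot without compiling libc itself, so later stages can compile against the API
# before any libc.so exists. The marker that the audit and requisite scripts look for
# afterwards is bits/stdio_lim.h in the build tree, e.g. (hypothetical check):
#
#   test -f "$GLIBC_BUILD/bits/stdio_lim.h" && echo "glibc header phase marker present"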
+ +mkdir -p "$GLIBC_BUILD" +pushd "$GLIBC_BUILD" + + # Configure glibc with minimal bootstrap options + "$GLIBC_SRC/configure" \ + --prefix=/usr \ + --build="$HOST" \ + --host="$HOST" \ + --with-headers="$SYSROOT/usr/include" \ + --disable-multilib \ + --enable-static \ + --disable-shared \ + --enable-kernel=4.4.0 \ + libc_cv_slibdir="/usr/lib" + + + # Install headers into sysroot + make install-headers -j"$MAKE_JOBS" DESTDIR="$SYSROOT" + + # ✅ Verify headers were installed + required_headers=( + "$SYSROOT/usr/include/gnu/libc-version.h" + "$SYSROOT/usr/include/stdio.h" + "$SYSROOT/usr/include/unistd.h" + ) + + missing=() + for h in "${required_headers[@]}"; do + [[ -f "$h" ]] || missing+=("$h") + done + + if (( ${#missing[@]} > 0 )); then + echo "❌ Missing required glibc headers:" + printf ' %s\n' "${missing[@]}" + exit 1 + fi + +popd + +echo "✅ glibc headers successfully installed to $SYSROOT/usr/include" diff --git a/developer/script_gcc-15/build_glibc_headers_requisites.sh b/developer/script_gcc-15/build_glibc_headers_requisites.sh new file mode 100755 index 0000000..0264c81 --- /dev/null +++ b/developer/script_gcc-15/build_glibc_headers_requisites.sh @@ -0,0 +1,99 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking requisites for glibc headers installation." + +missing_requisite_list=() +found_requisite_list=() + +# ──────────────────────────────────────────────── +# 1. Required tools for headers phase +# +required_tools=( + "gcc" + "g++" + "make" + "ld" + "as" +) + +for tool in "${required_tools[@]}"; do + if location=$(command -v "$tool"); then + found_requisite_list+=("$location") + else + missing_requisite_list+=("$tool") + fi +done + +# ──────────────────────────────────────────────── +# 2. glibc source directory check +# +if [ -d "$GLIBC_SRC" ] && [ "$(ls -A "$GLIBC_SRC")" ]; then + found_requisite_list+=("$GLIBC_SRC") +else + missing_requisite_list+=("$GLIBC_SRC (empty or missing)") +fi + +# ──────────────────────────────────────────────── +# 3. Kernel headers required for bootstrap glibc +# +linux_headers=( + "$SYSROOT/usr/include/linux/version.h" + "$SYSROOT/usr/include/asm/unistd.h" + "$SYSROOT/usr/include/bits/types.h" +) + +for header in "${linux_headers[@]}"; do + if [[ -f "$header" ]]; then + found_requisite_list+=("$header") + else + missing_requisite_list+=("$header") + fi +done + +# ──────────────────────────────────────────────── +# 4. Confirm SYSROOT write access for header install +# +if [[ -w "$SYSROOT/usr/include" ]]; then + found_requisite_list+=("SYSROOT writable: $SYSROOT/usr/include") +else + missing_requisite_list+=("SYSROOT not writable: $SYSROOT/usr/include") +fi + +# ──────────────────────────────────────────────── +# 5. Check C preprocessor is operational +# +echo '#include ' | gcc -E - > /dev/null 2>&1 +if [[ $? 
-eq 0 ]]; then + found_requisite_list+=("C preprocessor operational: gcc -E works") +else + missing_requisite_list+=("C preprocessor failed: gcc -E on failed") +fi + +# ──────────────────────────────────────────────── +# Print results +# +if (( ${#found_requisite_list[@]} > 0 )); then + echo "found:" + for item in "${found_requisite_list[@]}"; do + echo " $item" + done +fi + +if (( ${#missing_requisite_list[@]} > 0 )); then + echo "missing:" + for item in "${missing_requisite_list[@]}"; do + echo " $item" + done +fi + +# Final verdict +if (( ${#missing_requisite_list[@]} > 0 )); then + echo "❌ Missing requisites for glibc header install" + exit 1 +else + echo "✅ All specified requisites found for glibc headers" + exit 0 +fi diff --git a/developer/script_gcc-15/build_glibc_requisites.sh b/developer/script_gcc-15/build_glibc_requisites.sh new file mode 100755 index 0000000..0d25cf0 --- /dev/null +++ b/developer/script_gcc-15/build_glibc_requisites.sh @@ -0,0 +1,51 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Checking prerequisites for glibc..." + +required_tools=(gcc make curl tar perl python3 gawk bison) +missing_tools=() + +for tool in "${required_tools[@]}"; do + if ! command -v "$tool" > /dev/null; then + missing_tools+=("$tool") + fi +done + +if (( ${#missing_tools[@]} )); then + echo "❌ Missing required tools:" + printf ' %s\n' "${missing_tools[@]}" + exit 1 +fi + +# Check that expected headers exist +glibc_headers=( + "$SYSROOT/usr/include/stdio.h" + "$SYSROOT/usr/include/unistd.h" +) + +# Check that expected startup objects exist +startup_objects=( + "$SYSROOT/usr/lib/crt1.o" + "$SYSROOT/usr/lib/crti.o" + "$SYSROOT/usr/lib/crtn.o" +) + +missing_files=() +for f in "${glibc_headers[@]}" "${startup_objects[@]}"; do + if [ ! -f "$f" ]; then + missing_files+=("$f") + fi +done + +if (( ${#missing_files[@]} )); then + echo "❌ Missing required files in sysroot:" + printf ' %s\n' "${missing_files[@]}" + echo + echo "Hint: these files should have been generated by build_glibc_headers.sh" + exit 1 +fi + +echo "✅ All prerequisites for glibc are met and sysroot is correctly populated." diff --git a/developer/script_gcc-15/build_linux.sh b/developer/script_gcc-15/build_linux.sh new file mode 100755 index 0000000..4b7ac0c --- /dev/null +++ b/developer/script_gcc-15/build_linux.sh @@ -0,0 +1,21 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "📦 Preparing Linux kernel headers for glibc and GCC..." + +pushd "$LINUX_SRC" + + $MAKE mrproper + $MAKE headers_install ARCH=x86_64 INSTALL_HDR_PATH="$SYSROOT/usr" + + if [[ -f "$SYSROOT/usr/include/linux/version.h" ]]; then + echo "✅ Linux headers installed to $SYSROOT/usr/include" + exit 0 + fi + + echo "❌ Kernel headers not found at expected location." + exit 1 + +popd diff --git a/developer/script_gcc-15/clean_build.sh b/developer/script_gcc-15/clean_build.sh new file mode 100755 index 0000000..7c9bca3 --- /dev/null +++ b/developer/script_gcc-15/clean_build.sh @@ -0,0 +1,15 @@ +#!/bin/bash +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +echo "🧹 Cleaning build directories..." + +for dir in "${BUILD_DIR_LIST[@]}"; do + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + rm -rf "$dir" + fi +done + +echo "✅ Build directories cleaned." 
diff --git a/developer/script_gcc-15/clean_dist.sh b/developer/script_gcc-15/clean_dist.sh new file mode 100755 index 0000000..009e01e --- /dev/null +++ b/developer/script_gcc-15/clean_dist.sh @@ -0,0 +1,31 @@ +#!/bin/bash +set -euo pipefail + +echo "removing: build, source, upstream, and project directories" + +source "$(dirname "$0")/environment.sh" + +# Remove build +# + "./clean_build.sh" + ! ! rmdir "$BUILD_DIR" >& /dev/null && echo "rmdir $BUILD_DIR" + +# Remove source +# note that repos are removed with clean_upstream +# + "./clean_source.sh" + "./clean_upstream.sh" + + ! ! rmdir "$SRC" >& /dev/null && echo "rmdir $SRC" + ! ! rmdir "$UPSTREAM" >& /dev/null && echo "rmdir $UPSTREAM" + +# Remove project directories +# + for dir in "${PROJECT_SUBDIR_LIST[@]}" "${PROJECT_DIR_LIST[@]}"; do + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + ! rm -rf "$dir" && echo "could not remove $dir" + fi + done + +echo "✅ clean_dist.sh" diff --git a/developer/script_gcc-15/clean_source.sh b/developer/script_gcc-15/clean_source.sh new file mode 100755 index 0000000..2f5beb0 --- /dev/null +++ b/developer/script_gcc-15/clean_source.sh @@ -0,0 +1,29 @@ +#!/bin/bash +# removes project tarball expansions from source/ +# git repos are part of `upstream` so are not removed + +set -euo pipefail + + +source "$(dirname "$0")/environment.sh" + +i=0 +while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do + tarball="${UPSTREAM_TARBALL_LIST[$i]}" + # skip url + i=$((i + 1)) + # skip explicit dest dir + i=$((i + 1)) + + base_name="${tarball%.tar.*}" + dir="$SRC/$base_name" + + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + rm -rf "$dir" + fi + + i=$((i + 1)) +done + +echo "✅ clean_source.sh" diff --git a/developer/script_gcc-15/clean_upstream.sh b/developer/script_gcc-15/clean_upstream.sh new file mode 100755 index 0000000..50e8d98 --- /dev/null +++ b/developer/script_gcc-15/clean_upstream.sh @@ -0,0 +1,38 @@ +#!/bin/bash +# run this to force repeat of the downloads +# removes project tarballs from upstream/ +# removes project repos from source/ +# does not remove non-project files + +set -euo pipefail + +source "$(dirname "$0")/environment.sh" + +# Remove tarballs +i=0 +while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do + tarball="${UPSTREAM_TARBALL_LIST[$i]}" + path="$UPSTREAM/$tarball" + + if [[ -f "$path" ]]; then + echo "rm $path" + rm "$path" + fi + + i=$((i + 3)) +done + +# Remove Git repositories +i=0 +while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do + dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" + + if [[ -d "$dir" ]]; then + echo "rm -rf $dir" + rm -rf "$dir" + fi + + i=$((i + 3)) +done + +echo "✅ clean_upstream.sh" diff --git a/developer/script_gcc-15/environment.sh b/developer/script_gcc-15/environment.sh new file mode 100755 index 0000000..1991f40 --- /dev/null +++ b/developer/script_gcc-15/environment.sh @@ -0,0 +1,160 @@ +# === environment.sh === +# Source this file in each build script to ensure consistent paths and settings + +echo "ROOT: $ROOT" +cd $SCRIPT_DIR + +#-------------------------------------------------------------------------------- +# tools + + # machine target + export HOST=$(gcc -dumpmachine) + +# export MAKE_JOBS=$(nproc) +# export MAKE="make -j$MAKE_JOBS" + export MAKE_JOBS=$(getconf _NPROCESSORS_ONLN) + export MAKE=make + + + # Compiler path prefixes + export CC_FOR_BUILD=$(command -v gcc) + export CXX_FOR_BUILD=$(command -v g++) + +#-------------------------------------------------------------------------------- +# tool versions + + export LINUX_VER=6.8 + export BINUTILS_VER=2.42 + 
export GCC_VER=15.1.0 + export GLIBC_VER=2.39 + + # Library versions: required minimums or recommended tested versions + export GMP_VER=6.3.0 # Compatible with GCC 15, latest stable from GMP site + export MPFR_VER=4.2.1 # Latest stable, tested with GCC 15 + export MPC_VER=1.3.1 # Works with GCC 15, matches default in-tree + export ISL_VER=0.26 # Matches upstream GCC infrastructure repo + export ZSTD_VER=1.5.5 # Stable release, supported by GCC for LTO compression + +#-------------------------------------------------------------------------------- +# project structure + + # temporary directory + export TMPDIR="$ROOT/tmp" + + # Project directories + export SYSROOT="$ROOT/sysroot" + export TOOLCHAIN="$ROOT/toolchain" + export BUILD_DIR="$ROOT/build" + export LOGDIR="$ROOT/log" + export UPSTREAM="$ROOT/upstream" + export SRC=$ROOT/source + + # Synthesized directory lists + PROJECT_DIR_LIST=( + "$LOGDIR" + "$SYSROOT" "$TOOLCHAIN" "$BUILD_DIR" + "$UPSTREAM" "$SRC" + ) + # list these in the order they can be deleted + PROJECT_SUBDIR_LIST=( + "$SYSROOT/usr/lib" + "$SYSROOT/lib" + "$SYSROOT/usr/include" + ) + + # Source directories + export LINUX_SRC="$SRC/linux-$LINUX_VER" + export BINUTILS_SRC="$SRC/binutils-$BINUTILS_VER" + export GCC_SRC="$SRC/gcc-$GCC_VER" + export GLIBC_SRC="$SRC/glibc-$GLIBC_VER" + export GMP_SRC="$SRC/gmp-$GMP_VER" + export MPFR_SRC="$SRC/mpfr-$MPFR_VER" + export MPC_SRC="$SRC/mpc-$MPC_VER" + export ISL_SRC="$SRC/isl-$ISL_VER" + export ZSTD_SRC="$SRC/zstd-$ZSTD_VER" + + SOURCE_DIR_LIST=( + "$LINUX_SRC" + "$BINUTILS_SRC" + "$GCC_SRC" + "$GLIBC_SRC" + "$GMP_SRC" + "$MPFR_SRC" + "$MPC_SRC" + "$ISL_SRC" + "$ZSTD_SRC" + ) + + # Build directories + export BINUTILS_BUILD="$BUILD_DIR/binutils" + export GCC_BUILD_STAGE1="$BUILD_DIR/gcc-stage1" + export GCC_BUILD_FINAL="$BUILD_DIR/gcc-final" + export GLIBC_BUILD="$BUILD_DIR/glibc" + BUILD_DIR_LIST=( + "$BINUTILS_BUILD" + "$GCC_BUILD_STAGE1" + "$GCC_BUILD_FINAL" + "$GLIBC_BUILD" + ) + +#-------------------------------------------------------------------------------- +# upstream -> local stuff + + # see top of this file for the _VER variables + + # Tarball Download Info (Name, URL, Destination Directory) + export UPSTREAM_TARBALL_LIST=( + "linux-${LINUX_VER}.tar.xz" + "https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-${LINUX_VER}.tar.xz" + "$UPSTREAM/linux-$LINUX_VER" + + "binutils-${BINUTILS_VER}.tar.xz" + "https://ftp.gnu.org/gnu/binutils/binutils-${BINUTILS_VER}.tar.xz" + "$UPSTREAM/binutils-$BINUTILS_VER" + + # using repo + # "gcc-${GCC_VER}.tar.xz" + # "https://ftp.gnu.org/gnu/gcc/gcc-${GCC_VER}/gcc-${GCC_VER}.tar.xz" + # "$UPSTREAM/gcc-$GCC_VER" + + "glibc-${GLIBC_VER}.tar.xz" + "https://ftp.gnu.org/gnu/libc/glibc-${GLIBC_VER}.tar.xz" + "$UPSTREAM/glibc-$GLIBC_VER" + + "gmp-${GMP_VER}.tar.xz" + "https://ftp.gnu.org/gnu/gmp/gmp-${GMP_VER}.tar.xz" + "$UPSTREAM/gmp-$GMP_VER" + + "mpfr-${MPFR_VER}.tar.xz" + "https://www.mpfr.org/mpfr-${MPFR_VER}/mpfr-${MPFR_VER}.tar.xz" + "$UPSTREAM/mpfr-$MPFR_VER" + + "mpc-${MPC_VER}.tar.gz" + "https://ftp.gnu.org/gnu/mpc/mpc-${MPC_VER}.tar.gz" + "$UPSTREAM/mpc-$MPC_VER" + + "isl-${ISL_VER}.tar.bz2" +# "https://gcc.gnu.org/pub/gcc/infrastructure/isl-${ISL_VER}.tar.bz2" + "https://libisl.sourceforge.io/isl-0.26.tar.bz2" +# "https://github.com/Meinersbur/isl/archive/refs/tags/isl-0.26.tar.gz" + "$UPSTREAM/isl-$ISL_VER" + + "zstd-${ZSTD_VER}.tar.zst" + "https://github.com/facebook/zstd/releases/download/v${ZSTD_VER}/zstd-${ZSTD_VER}.tar.zst" + "$UPSTREAM/zstd-$ZSTD_VER" + ) + + + # Git Repo Info + # 
Each entry is triple: Repository URL, Branch, Destination Directory + # Repos clone directly into $SRC + export UPSTREAM_GIT_REPO_LIST=( + + "git://gcc.gnu.org/git/gcc.git" + "releases/gcc-15" + "$SRC/gcc-$GCC_VER" + + #currently there is no second repo + ) + + diff --git a/developer/script_gcc-15/project_download.sh b/developer/script_gcc-15/project_download.sh new file mode 100755 index 0000000..0aa20bc --- /dev/null +++ b/developer/script_gcc-15/project_download.sh @@ -0,0 +1,131 @@ +#!/bin/bash +# This script can be run multiple times to download what was missed on prior invocations +# If there is a corrupt tarball, delete it and run this again +# Sometimes a connection test will fails, then the downloads runs anyway + +set -uo pipefail # no `-e`, we want to continue on error + +source "$(dirname "$0")/environment.sh" + +check_internet_connection() { + echo "🌐 Checking internet connection..." + # Use a quick connection check without blocking the whole script + if ! curl -s --connect-timeout 5 https://google.com > /dev/null; then + echo "⚠️ No internet connection detected (proceeding with download anyway)" + else + echo "✅ Internet connection detected" + fi +} + +# check_server_reachability() { +# local url=$1 +# if ! curl -s --head "$url" | head -n 1 | grep -q "HTTP/1.1 200 OK"; then +# echo "⚠️ Cannot reach $url (proceeding with download anyway)" +# fi +# } + +check_server_reachability() { + local url=$1 + echo "checking is reachable: $url " + + # Attempt to get the HTTP response code without following redirects + http_code=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "$url") + + # If the HTTP code is between 200 and 299, consider it reachable + if [[ "$http_code" -ge 200 && "$http_code" -lt 300 ]]; then + echo "✅ Server reachable (HTTP $http_code): $url " + else + # If not 2xx, print the status code for transparency + echo "⚠️ Server HTTP $http_code not 2xx, will try anyway: $url" + fi +} + +check_file_exists() { + local file=$1 + [[ -f "$UPSTREAM/$file" ]] +} + +download_file() { + local file=$1 + local url=$2 + + echo "Downloading $file from $url..." + if (cd "$UPSTREAM" && curl -LO "$url"); then + if file "$UPSTREAM/$file" | grep -qi 'html'; then + echo "❌ Invalid download (HTML, not archive): $file" + rm -f "$UPSTREAM/$file" + return 1 + elif [[ -f "$UPSTREAM/$file" ]]; then + echo "✅ Successfully downloaded: $file" + return 0 + # Validate it's not an HTML error page + else + echo "❌ Did not appear after download: $file " + return 1 + fi + else + echo "❌ Failed to download: $file" + return 1 + fi +} + +download_tarballs() { + i=0 + while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do + tarball="${UPSTREAM_TARBALL_LIST[$i]}" + url="${UPSTREAM_TARBALL_LIST[$((i+1))]}" + i=$((i + 3)) + + if check_file_exists "$tarball"; then + echo "⚡ already exists, skipping download: $tarball " + continue + fi + + check_server_reachability "$url" + + if ! download_file "$tarball" "$url"; then + echo "⚠️ Skipping due to previous error: $tarball " + fi + + done +} + +download_git_repos() { + i=0 + while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do + repo="${UPSTREAM_GIT_REPO_LIST[$i]}" + branch="${UPSTREAM_GIT_REPO_LIST[$((i+1))]}" + dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" + + if [[ -d "$dir/.git" ]]; then + echo "⚡ Already exists, skipping git clone: $dir " + i=$((i + 3)) + continue + fi + + echo "Cloning $repo into $dir..." + if ! 
git clone --branch "$branch" "$repo" "$dir"; then
+      echo "❌ Failed to clone $repo → $dir"
+    fi
+
+    i=$((i + 3))
+  done
+}
+
+# do the downloads
+
+check_internet_connection
+
+echo "Downloading tarballs:"
+for ((i=0; i<${#UPSTREAM_TARBALL_LIST[@]}; i+=3)); do
+  echo " - ${UPSTREAM_TARBALL_LIST[i]}"
+done
+download_tarballs
+
+echo "Cloning Git repositories:"
+for ((i=0; i<${#UPSTREAM_GIT_REPO_LIST[@]}; i+=3)); do
+  echo " - ${UPSTREAM_GIT_REPO_LIST[i]} (branch ${UPSTREAM_GIT_REPO_LIST[i+1]})"
+done
+download_git_repos
+
+echo "project_download.sh completed"
diff --git a/developer/script_gcc-15/project_extract.sh b/developer/script_gcc-15/project_extract.sh
new file mode 100755
index 0000000..e114d34
--- /dev/null
+++ b/developer/script_gcc-15/project_extract.sh
@@ -0,0 +1,52 @@
+#!/bin/bash
+# Will not extract if the target already exists
+# Delete any malformed extractions before running again
+
+set -euo pipefail
+
+source "$(dirname "$0")/environment.sh"
+
+had_error=0
+i=0
+
+while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do
+  tarball="${UPSTREAM_TARBALL_LIST[$i]}"
+  i=$((i + 3))
+
+  src_path="$UPSTREAM/$tarball"
+
+  # Strip compression suffix to guess subdirectory name
+  base_name="${tarball%%.tar.*}"   # safer across .tar.gz, .tar.zst, etc.
+  target_dir="$SRC/$base_name"
+
+  if [[ -d "$target_dir" ]]; then
+    echo "⚡ Already exists, skipping: $target_dir"
+    continue
+  fi
+
+  if [[ ! -f "$src_path" ]]; then
+    echo "❌ Missing tarball: $src_path"
+    had_error=1
+    continue
+  fi
+
+  echo "tar -xf $tarball"
+  if ! (cd "$SRC" && tar -xf "$src_path"); then
+    echo "❌ Extraction failed: $tarball"
+    had_error=1
+    continue
+  fi
+
+  if [[ -d "$target_dir" ]]; then
+    echo "Extracted to: $target_dir"
+  else
+    echo "❌ Target not found after extraction: $target_dir"
+    had_error=1
+  fi
+done
+
+if [[ $had_error -eq 0 ]]; then
+  echo "✅ All tarballs extracted successfully"
+else
+  echo "❌ Some extractions failed or were incomplete"
+fi
diff --git a/developer/script_gcc-15/project_requisites.sh b/developer/script_gcc-15/project_requisites.sh
new file mode 100755
index 0000000..76bf171
--- /dev/null
+++ b/developer/script_gcc-15/project_requisites.sh
@@ -0,0 +1,155 @@
+#!/bin/bash
+set -euo pipefail
+
+source "$(dirname "$0")/environment.sh"
+
+echo "Checking requisites for native standalone GCC build."
+
+if ! command -v pkg-config > /dev/null; then
+  echo "pkg-config command required for this script"
+  echo "Debian: sudo apt install pkg-config"
+  echo "Fedora: sudo dnf install pkg-config"
+  exit 1
+fi
+
+
+missing_requisite_list=()
+failed_pkg_config_list=()
+found_requisite_list=()
+
+# --- Required Script Tools (must be usable by this script itself) ---
+script_tools=(
+  bash
+  awk
+  sed
+  grep
+)
+
+echo "Checking for essential script dependencies."
+for tool in "${script_tools[@]}"; do
+  location=$(command -v "$tool") || location=""
+  if [[ -n "$location" ]]; then
+    found_requisite_list+=("$location")
+  else
+    missing_requisite_list+=("$tool")
+  fi
+done
+
+# --- Build Tools ---
+build_tools=(
+  gcc
+  g++
+  make
+  tar
+  gzip
+  bzip2
+  perl
+  patch
+  diff
+  python3
+)
+
+echo "Checking for required build tools."
+for tool in "${build_tools[@]}"; do
+  location=$(command -v "$tool") || location=""
+  if [[ -n "$location" ]]; then
+    found_requisite_list+=("$location")
+  else
+    missing_requisite_list+=("$tool")
+  fi
+done
+
+# --- Libraries ---
+required_pkgs=(
+  gmp
+  mpfr
+  mpc
+  isl
+  zstd
+)
+
+echo "Checking for required development libraries (via pkg-config)."
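+# For reference, the loop below is equivalent to running, by hand, something like
+# (the library name here is only an illustration):
+#   pkg-config --exists gmp && pkg-config --modversion gmp
+# which prints the installed version (e.g. 6.3.0) when the development package is present.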
+for lib in "${required_pkgs[@]}"; do
+  if pkg-config --exists "$lib"; then
+    found_requisite_list+=("library: $lib => $(pkg-config --modversion "$lib")")
+  else
+    failed_pkg_config_list+=("library: $lib")
+  fi
+done
+
+# --- Source Trees ---
+required_sources=(
+  "$GCC_SRC"
+  "$BINUTILS_SRC"
+  "$GLIBC_SRC"
+  "$LINUX_SRC"
+  "$GMP_SRC"
+  "$MPFR_SRC"
+  "$MPC_SRC"
+  "$ISL_SRC"
+  "$ZSTD_SRC"
+)
+
+echo "Checking for required source directories."
+echo "These will be installed by project_download.sh and project_extract.sh"
+for src in "${required_sources[@]}"; do
+  if [[ -d "$src" && "$(ls -A "$src")" ]]; then
+    found_requisite_list+=("source: $src")
+  else
+    missing_requisite_list+=("source: $src")
+  fi
+done
+
+# --- Python Modules (non-fatal) ---
+optional_py_modules=(re sys os json gzip pathlib shutil time tempfile)
+
+echo "Checking optional Python3 modules."
+for mod in "${optional_py_modules[@]}"; do
+  if python3 -c "import $mod" &>/dev/null; then
+    found_requisite_list+=("python: module $mod")
+  else
+    missing_requisite_list+=("python (optional): module $mod")
+  fi
+done
+
+echo
+echo "Summary:"
+echo "--------"
+
+# Found requisites
+for item in "${found_requisite_list[@]}"; do
+  echo "  found: $item"
+done
+
+# Missing essentials
+for item in "${missing_requisite_list[@]:-}"; do
+  echo "❌ missing requisite: $item"
+done
+
+# pkg-config failures
+for item in "${failed_pkg_config_list[@]:-}"; do
+  echo "⚠️ pkg-config could not find: $item"
+done
+
+# Final verdict
+echo
+
+if [[ ${#missing_requisite_list[@]} -eq 0 && ${#failed_pkg_config_list[@]} -eq 0 ]]; then
+  echo "✅ All required tools and libraries found."
+else
+  echo "❌ Some requisites are missing."
+
+  if [[ ${#failed_pkg_config_list[@]} -gt 0 ]]; then
+    echo
+    echo "Note: pkg-config did not find some libraries:"
+    echo "  These are expected to be missing if you are building them from source:"
+    echo "  - mpc"
+    echo "  - isl"
+    echo "  - zstd"
+    echo "  If not, consider installing the corresponding dev packages."
+    echo "    Debian: sudo apt install libmpc-dev libisl-dev libzstd-dev"
+    echo "    Fedora: sudo dnf install libmpc-devel isl-devel libzstd-devel"
+  fi
+fi
+
+
diff --git a/developer/script_gcc-15/project_setup.sh b/developer/script_gcc-15/project_setup.sh
new file mode 100755
index 0000000..31516dc
--- /dev/null
+++ b/developer/script_gcc-15/project_setup.sh
@@ -0,0 +1,45 @@
+#!/bin/bash
+set -euo pipefail
+
+source "$(dirname "$0")/environment.sh"
+
+# Create top-level project directories
+for dir in "${PROJECT_DIR_LIST[@]}"; do
+  echo "mkdir -p $dir"
+  mkdir -p "$dir"
+done
+
+# Create subdirectories within SYSROOT
+for subdir in "${PROJECT_SUBDIR_LIST[@]}"; do
+  echo "mkdir -p $subdir"
+  mkdir -p "$subdir"
+done
+
+# Ensure TMPDIR exists and add .gitignore
+if [[ ! 
-d "$TMPDIR" ]]; then + echo "mkdir -p $TMPDIR" + mkdir -p "$TMPDIR" + + echo "echo $TMPDIR/ > $TMPDIR/.gitignore" + echo "$TMPDIR/" > "$TMPDIR/.gitignore" +else + echo "⚠️ TMPDIR already exists" +fi + +# Create root-level .gitignore if missing +if [[ -f "$ROOT/.gitignore" ]]; then + echo "⚠️ $ROOT/.gitignore already exists" +else + echo "create $ROOT/.gitignore" + { + echo "# Ignore synthesized top-level directories" + for dir in "${PROJECT_DIR_LIST[@]}"; do + rel_path="${dir#$ROOT/}" + echo "/$rel_path" + done + echo "# Ignore synthesized files" + echo "/.gitignore" + } > "$ROOT/.gitignore" +fi + +echo "✅ setup_project.sh" diff --git "a/developer/script_gcc-15\360\237\226\211/README.org" "b/developer/script_gcc-15\360\237\226\211/README.org" deleted file mode 100755 index 44936b8..0000000 --- "a/developer/script_gcc-15\360\237\226\211/README.org" +++ /dev/null @@ -1,34 +0,0 @@ - -This was a bear to get running, and I have given up on it for now. - -It dies early in the chain at the making of glibc and the crt.o -etc. files. - -# Scripts for building standalone gcc - -The scripts here will build a standalone gcc along with version compatible tools. - -There is a lot more to a gcc than one might imagine. It was developed as though an integral part of Unix. Hence, the standalone build has top level directories with many things in them, in parallel to the top level directories a person would find on a Unix system. - -## .gitignore - -* If there is no top level `.gitignore`, `setup_project.sh` will create one. -* The synthesized `.gitignore` references itself, so it will not get checked in. -* No script, including`clean_dist.sh`, will delete an existing `.gitignore`. - -## Clean - -* clean_build.sh - for saving space after the build is done. The build scripts are idempotent, so in an ideal world this need not be run to do a rebuild. - -* clean_dist.sh - with on exception, this will delete everything that was synthesized. The one exception is that .gitignore is moved to the tmp directory so as to preserve any changes a user might have been made, and the contents of the tmp directory are not removed. - -* clean_tmp.sh - wipes clean all contents of the temporary directory. - -## Setup - -* setup_project.sh - makes the directory structure for the build, creates a `tmp/` directory under the project. If it does not already exist, creates a .gitignore file with all the created directories listed. - -## Download - -* download_upstream_sources.sh - goes to the Internet, fetches all the sources that have not already been fetched. Then expands the sources into the proper sub-directory under `source/1. - diff --git "a/developer/script_gcc-15\360\237\226\211/audit_glibc_headers.sh" "b/developer/script_gcc-15\360\237\226\211/audit_glibc_headers.sh" deleted file mode 100755 index 64ddd57..0000000 --- "a/developer/script_gcc-15\360\237\226\211/audit_glibc_headers.sh" +++ /dev/null @@ -1,50 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🔎 Auditing glibc build state..." 
- -declare -a missing -declare -a present - -# GLIBC_BUILD sanity -[[ -d "$GLIBC_BUILD" ]] && present+=("GLIBC_BUILD exists: $GLIBC_BUILD") || missing+=("GLIBC_BUILD missing") - -# Check for Makefile -if [[ -s "$GLIBC_BUILD/Makefile" ]]; then - present+=("Makefile exists and non-empty") -else - missing+=("Makefile missing or empty in $GLIBC_BUILD") -fi - -# Check bits/stdio_lim.h -if [[ -f "$GLIBC_BUILD/bits/stdio_lim.h" ]]; then - present+=("bits/stdio_lim.h exists (post-header install marker)") -else - missing+=("bits/stdio_lim.h missing — make install-headers likely incomplete") -fi - -# Check csu/Makefile -if [[ -f "$GLIBC_BUILD/csu/Makefile" ]]; then - present+=("csu/Makefile exists") - grep -q 'crt1\.o' "$GLIBC_BUILD/csu/Makefile" \ - && present+=("csu/Makefile defines crt1.o") \ - || missing+=("csu/Makefile missing rule for crt1.o") -else - missing+=("csu/Makefile missing") -fi - -# Show report -echo "✅ Present:" -for p in "${present[@]}"; do echo " $p"; done - -echo -if (( ${#missing[@]} )); then - echo "❌ Missing:" - for m in "${missing[@]}"; do echo " $m"; done - exit 1 -else - echo "🎉 All bootstrap prerequisites are in place" - exit 0 -fi diff --git "a/developer/script_gcc-15\360\237\226\211/build_Linux_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_Linux_requisites.sh" deleted file mode 100755 index 42fa7b7..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_Linux_requisites.sh" +++ /dev/null @@ -1,62 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking requisites for Linux kernel headers build." - -missing_requisite_list=() -found_requisite_list=() - -# tools required for build -# -required_tools=( - "gcc" - "g++" - "make" -) - -for tool in "${required_tools[@]}"; do - location=$(command -v "$tool") # Fixed this part to use $tool instead of "tool" - if [ $? 
-eq 0 ]; then # Check if the command was successful - found_requisite_list+=("$location") - else - missing_requisite_list+=("$tool") - fi -done - -# source code required for build -# -if [ -d "$LINUX_SRC" ] && [ "$(ls -A "$LINUX_SRC")" ]; then - found_requisite_list+=("$LINUX_SRC") -else - missing_requisite_list+=("$LINUX_SRC") -fi - -# print requisites found -# -if (( ${#found_requisite_list[@]} != 0 )); then - echo "found:" - for found_requisite in "${found_requisite_list[@]}"; do - echo " $found_requisite" - done -fi - -# print requisites missing -# -if (( ${#missing_requisite_list[@]} != 0 )); then - echo "missing:" - for missing_requisite in "${missing_requisite_list[@]}"; do - echo " $missing_requisite" - done -fi - -# in conclusion -# -if (( ${#missing_requisite_list[@]} > 0 )); then - echo "❌ Missing requisites" - exit 1 -else - echo "✅ All checked specified requisites found" - exit 0 -fi diff --git "a/developer/script_gcc-15\360\237\226\211/build_all.sh" "b/developer/script_gcc-15\360\237\226\211/build_all.sh" deleted file mode 100755 index 6f55a28..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_all.sh" +++ /dev/null @@ -1,36 +0,0 @@ -#!/bin/bash -set -euo pipefail - -cd "$(dirname "$0")" -SCRIPT_DIR="$PWD" - -echo "loading environment" -source "$SCRIPT_DIR/environment.sh" - -cd "$SCRIPT_DIR" - -./project_setup.sh -./project_download.sh -./project_extract.sh -./project_requisites - -./build_binutils_requisites.sh -./build_binutils.sh - -./build_linux_requisites.sh -./build_linux.sh - -#./build_glibc_bootstrap_requisites.sh -./build_glibc_bootstrap.sh - -./build_gcc_stage1_requisites.sh -./build_gcc_stage1.sh - -./build_glibc_requisites.sh -./build_glibc.sh - -./build_gcc_final_requisites.sh -./build_gcc_final.sh - -echo "✅ Toolchain build complete" -"$TOOLCHAIN/bin/gcc" --version diff --git "a/developer/script_gcc-15\360\237\226\211/build_binutils.sh" "b/developer/script_gcc-15\360\237\226\211/build_binutils.sh" deleted file mode 100755 index 959e91c..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_binutils.sh" +++ /dev/null @@ -1,32 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -mkdir -p "$BINUTILS_BUILD" -pushd "$BINUTILS_BUILD" - - "$BINUTILS_SRC/configure" \ - --prefix="$TOOLCHAIN" \ - --with-sysroot="$SYSROOT" \ - --disable-nls \ - --disable-werror \ - --disable-multilib \ - --enable-deterministic-archives \ - --enable-plugins \ - --with-lib-path="$SYSROOT/lib:$SYSROOT/usr/lib" - - $MAKE - $MAKE install - - # Verify installation - if [[ -x "$TOOLCHAIN/bin/ld" ]]; then - echo "✅ Binutils installed in $TOOLCHAIN/bin" - exit 0 - fi - - echo "❌ Binutils install incomplete" - exit 1 - -popd - diff --git "a/developer/script_gcc-15\360\237\226\211/build_binutils_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_binutils_requisites.sh" deleted file mode 100755 index 317378e..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_binutils_requisites.sh" +++ /dev/null @@ -1,62 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking requisites for binutils (bootstrap)." - -missing_requisite_list=() -found_requisite_list=() - -# tool required for build -# - required_tools=( - "gcc" - "g++" - "make" - ) - - for tool in "${required_tools[@]}"; do - location=$(command -v "$tool") - if [ $? 
-eq 0 ]; then - found_requisite_list+=("$location") - else - missing_requisite_list+=("$tool") - fi - done - -# source code required for build -# - if [ -d "$BINUTILS_SRC" ] && [ "$(ls -A "$BINUTILS_SRC")" ]; then - found_requisite_list+=("$BINUTILS_SRC") - else - missing_requisite_list+=("$BINUTILS_SRC") - fi - -# print requisites found -# - if (( ${#found_requisite_list[@]} != 0 )); then - echo "found:" - for found_requisite in "${found_requisite_list[@]}"; do - echo " $found_requisite" - done - fi - -# print requisites missing -# - if (( ${#missing_requisite_list[@]} != 0 )); then - echo "missing:" - for missing_requisite in "${missing_requisite_list[@]}"; do - echo " $missing_requisite" - done - fi - -# in conclusion -# - if (( ${#missing_requisite_list[@]} > 0 )); then - echo "❌ Missing requisites" - exit 1 - else - echo "✅ All checked specified requisites found" - exit 0 - fi diff --git "a/developer/script_gcc-15\360\237\226\211/build_gcc_final.sh" "b/developer/script_gcc-15\360\237\226\211/build_gcc_final.sh" deleted file mode 100755 index 1e6ca88..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_gcc_final.sh" +++ /dev/null @@ -1,31 +0,0 @@ -#!/bin/bash -set -euo pipefail - -# Load environment -source "$(dirname "$0")/environment.sh" - -echo "🔧 Starting final GCC build..." - -mkdir -p "$GCC_BUILD_FINAL" -pushd "$GCC_BUILD_FINAL" - -"$GCC_SRC/configure" \ - --prefix="$TOOLCHAIN" \ - --with-sysroot="$SYSROOT" \ - --with-native-system-header-dir=/usr/include \ - --target="$TARGET" \ - --enable-languages=c,c++ \ - --enable-threads=posix \ - --enable-shared \ - --disable-nls \ - --disable-multilib \ - --disable-bootstrap \ - --disable-libsanitizer \ - $CONFIGURE_FLAGS - -$MAKE -$MAKE install - -popd - -echo "✅ Final GCC installed to $TOOLCHAIN/bin" diff --git "a/developer/script_gcc-15\360\237\226\211/build_gcc_final_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_gcc_final_requisites.sh" deleted file mode 100755 index 5d36697..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_gcc_final_requisites.sh" +++ /dev/null @@ -1,46 +0,0 @@ -#!/bin/bash -set -euo pipefail - -# Load shared environment -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking prerequisites for final GCC..." - -# Required host tools -required_tools=(gcc g++ make curl tar gawk bison flex) -missing_tools=() - -for tool in "${required_tools[@]}"; do - if ! type -P "$tool" > /dev/null; then - missing_tools+=("$tool") - fi -done - -if (( ${#missing_tools[@]} )); then - echo "❌ Missing required tools: ${missing_tools[*]}" - exit 1 -fi - -# Check for libc headers and startup objects in sysroot -required_headers=("$SYSROOT/include/stdio.h") -required_crt_objects=( - "$SYSROOT/lib/crt1.o" - "$SYSROOT/lib/crti.o" - "$SYSROOT/lib/crtn.o" -) - -for hdr in "${required_headers[@]}"; do - if [ ! -f "$hdr" ]; then - echo "❌ C library header missing: $hdr" - exit 1 - fi -done - -for obj in "${required_crt_objects[@]}"; do - if [ ! -f "$obj" ]; then - echo "❌ Startup object missing: $obj" - exit 1 - fi -done - -echo "✅ Prerequisites for final GCC met." diff --git "a/developer/script_gcc-15\360\237\226\211/build_gcc_stage1.sh" "b/developer/script_gcc-15\360\237\226\211/build_gcc_stage1.sh" deleted file mode 100755 index 2aae2c7..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_gcc_stage1.sh" +++ /dev/null @@ -1,69 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🔧 Starting stage 1 GCC build (native layout)..." 
- -# 🧼 Clean optionally if forced -if [[ "${CLEAN_STAGE1:-0}" == "1" ]]; then - echo "🧹 Forcing rebuild: cleaning $GCC_BUILD_STAGE1" - rm -rf "$GCC_BUILD_STAGE1" -fi - -mkdir -p "$GCC_BUILD_STAGE1" -pushd "$GCC_BUILD_STAGE1" - -# 🛠️ Configure only if not yet configured -if [[ ! -f Makefile ]]; then - echo "⚙️ Configuring GCC stage 1..." - "$GCC_SRC/configure" \ - --prefix="$TOOLCHAIN" \ - --with-sysroot="$SYSROOT" \ - --with-build-sysroot="$SYSROOT" \ - --with-native-system-header-dir=/include \ - --enable-languages=c \ - --disable-nls \ - --disable-shared \ - --disable-threads \ - --disable-libatomic \ - --disable-libgomp \ - --disable-libquadmath \ - --disable-libssp \ - --disable-multilib \ - --disable-bootstrap \ - --disable-libstdcxx \ - --disable-fixincludes \ - --without-headers \ - --with-newlib -else - echo "✅ GCC already configured, skipping." -fi - -# 🧾 Ensure proper sysroot handling for internal libgcc -export CFLAGS_FOR_TARGET="--sysroot=$SYSROOT" -export CXXFLAGS_FOR_TARGET="--sysroot=$SYSROOT" -export CPPFLAGS_FOR_TARGET="--sysroot=$SYSROOT" -export CFLAGS="--sysroot=$SYSROOT" -export CXXFLAGS="--sysroot=$SYSROOT" - -# 🏗️ Build only if not built -if [[ ! -x "$TOOLCHAIN/bin/gcc" ]]; then - echo "⚙️ Building GCC stage 1..." - make -j"$(nproc)" all-gcc all-target-libgcc - - echo "📦 Installing GCC stage 1 to $TOOLCHAIN" - make install-gcc install-target-libgcc -else - echo "✅ GCC stage 1 already installed at $TOOLCHAIN/bin/gcc, skipping build." -fi - -popd - -# ✅ Final check -if [[ ! -x "$TOOLCHAIN/bin/gcc" ]]; then - echo "❌ Stage 1 GCC not found at $TOOLCHAIN/bin/gcc — build may have failed." - exit 1 -fi - -echo "✅ Stage 1 GCC successfully installed in $TOOLCHAIN/bin" diff --git "a/developer/script_gcc-15\360\237\226\211/build_gcc_stage1_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_gcc_stage1_requisites.sh" deleted file mode 100755 index 3098e0f..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_gcc_stage1_requisites.sh" +++ /dev/null @@ -1,28 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking prerequisites for stage 1 GCC (bootstrap)..." - -required_tools=(gcc g++ make tar gawk bison flex) -missing=() - -for tool in "${required_tools[@]}"; do - if ! type -P "$tool" &>/dev/null; then - missing+=("$tool") - fi -done - -if (( ${#missing[@]} )); then - echo "❌ Missing required tools: ${missing[*]}" - exit 1 -fi - -if [ ! -d "$GCC_SRC" ]; then - echo "❌ GCC source not found at $GCC_SRC" - echo "💡 You may need to run: prepare_gcc_sources.sh" - exit 1 -fi - -echo "✅ Prerequisites for stage 1 GCC met." diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc.sh" deleted file mode 100755 index 652ebc3..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc.sh" +++ /dev/null @@ -1,26 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🔧 Building full glibc..." 
- -mkdir -p "$GLIBC_BUILD" -pushd "$GLIBC_BUILD" - -"$GLIBC_SRC/configure" \ - --prefix=/usr \ - --host="$TARGET" \ - --build="$(gcc -dumpmachine)" \ - --with-headers="$SYSROOT/usr/include" \ - --disable-multilib \ - --enable-static \ - --enable-shared \ - libc_cv_slibdir="/usr/lib" - -$MAKE -DESTDIR="$SYSROOT" $MAKE install - -popd - -echo "✅ Full glibc installed in $SYSROOT" diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc_bootstrap.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc_bootstrap.sh" deleted file mode 100755 index b1c5c73..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc_bootstrap.sh" +++ /dev/null @@ -1,40 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Building glibc startup files (crt*.o)..." - -pushd "$GLIBC_BUILD" - - # Confirm that required build artifacts are present - if [[ ! -f bits/stdio_lim.h ]]; then - echo "❌ Missing bits/stdio_lim.h — did you run build_glibc_headers.sh?" - exit 1 - fi - - if [[ ! -f csu/Makefile ]]; then - echo "❌ Missing csu/Makefile — glibc configure phase may have failed" - exit 1 - fi - - # Attempt to build the crt startup object files - make csu/crt1.o csu/crti.o csu/crtn.o -j"$MAKE_JOBS" - - # Install them to the sysroot - install -m 644 csu/crt1.o csu/crti.o csu/crtn.o "$SYSROOT/usr/lib" - - # Create a dummy libc.so to satisfy linker if needed - touch "$SYSROOT/usr/lib/libc.so" - - # ✅ Verify installation - for f in crt1.o crti.o crtn.o; do - if [[ ! -f "$SYSROOT/usr/lib/$f" ]]; then - echo "❌ Missing startup file after install: $f" - exit 1 - fi - done - -popd - -echo "✅ glibc startup files installed to $SYSROOT/usr/lib" diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc_bootstrap_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc_bootstrap_requisites.sh" deleted file mode 100755 index e10ea96..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc_bootstrap_requisites.sh" +++ /dev/null @@ -1,138 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking requisites for glibc startup file build (crt1.o, crti.o, crtn.o)." - -missing_requisite_list=() -found_requisite_list=() - -# ──────────────────────────────────────────────── -# 1. Required tools -# - required_tools=( - "gcc" - "g++" - "make" - "ld" - "as" - ) - - for tool in "${required_tools[@]}"; do - if location=$(command -v "$tool"); then - found_requisite_list+=("$location") - else - missing_requisite_list+=("$tool") - fi - done - -# ──────────────────────────────────────────────── -# 2. Required directories and sources -# - if [ -d "$GLIBC_SRC" ] && [ "$(ls -A "$GLIBC_SRC")" ]; then - found_requisite_list+=("$GLIBC_SRC") - else - missing_requisite_list+=("$GLIBC_SRC (empty or missing)") - fi - -# ──────────────────────────────────────────────── -# 3. Required sysroot include path with Linux headers -# - linux_headers=( - "$SYSROOT/usr/include/linux/version.h" - "$SYSROOT/usr/include/asm/unistd.h" - "$SYSROOT/usr/include/bits/types.h" - ) - - for header in "${linux_headers[@]}"; do - if [[ -f "$header" ]]; then - found_requisite_list+=("$header") - else - missing_requisite_list+=("$header") - fi - done - -# ──────────────────────────────────────────────── -# 4. 
Confirm SYSROOT write access -# - if [[ -w "$SYSROOT/usr/include" ]]; then - found_requisite_list+=("SYSROOT writable: $SYSROOT/usr/include") - else - missing_requisite_list+=("SYSROOT not writable: $SYSROOT/usr/include") - fi - -# ──────────────────────────────────────────────── -# Additional sanity checks before header & crt build -# - - # 1. Check that the C preprocessor works and headers can be found - echo '#include ' | gcc -E - > /dev/null 2>&1 - if [[ $? -eq 0 ]]; then - found_requisite_list+=("C preprocessor operational: gcc -E works") - else - missing_requisite_list+=("C preprocessor failed: gcc -E on failed") - fi - - # 2. Check that bits/stdio_lim.h exists after headers install (glibc marker) - if [[ -f "$GLIBC_BUILD/bits/stdio_lim.h" ]]; then - found_requisite_list+=("$GLIBC_BUILD/bits/stdio_lim.h (glibc headers marker found)") - else - missing_requisite_list+=("$GLIBC_BUILD/bits/stdio_lim.h missing — headers may not be fully installed") - fi - - # 3. Check for crt objects already present (optional) - for f in crt1.o crti.o crtn.o; do - if [[ -f "$GLIBC_BUILD/csu/$f" ]]; then - found_requisite_list+=("$GLIBC_BUILD/csu/$f (already built)") - fi - done - - # 4. Check that Makefile exists and is non-empty - if [[ -f "$GLIBC_BUILD/Makefile" ]]; then - if [[ -s "$GLIBC_BUILD/Makefile" ]]; then - found_requisite_list+=("$GLIBC_BUILD/Makefile exists and is populated") - else - missing_requisite_list+=("$GLIBC_BUILD/Makefile exists but is empty — incomplete configure?") - fi - else - missing_requisite_list+=("$GLIBC_BUILD/Makefile missing — did configure run?") - fi - - # 5. Check that csu Makefile has rules for crt1.o - if [[ -f "$GLIBC_BUILD/csu/Makefile" ]]; then - if grep -q 'crt1\.o' "$GLIBC_BUILD/csu/Makefile"; then - found_requisite_list+=("csu/Makefile defines crt1.o") - else - missing_requisite_list+=("csu/Makefile does not define crt1.o — possible misconfigure") - fi - else - missing_requisite_list+=("csu/Makefile missing — subdir config likely failed") - fi - -# ──────────────────────────────────────────────── -# Print results -# - if (( ${#found_requisite_list[@]} > 0 )); then - echo "found:" - for item in "${found_requisite_list[@]}"; do - echo " $item" - done - fi - - if (( ${#missing_requisite_list[@]} > 0 )); then - echo "missing:" - for item in "${missing_requisite_list[@]}"; do - echo " $item" - done - fi - - # Final verdict - # - if (( ${#missing_requisite_list[@]} > 0 )); then - echo "❌ Missing requisites for glibc bootstrap" - exit 1 - else - echo "✅ All specified requisites found" - exit 0 - fi diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc_crt.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc_crt.sh" deleted file mode 100755 index cb6d0b5..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc_crt.sh" +++ /dev/null @@ -1,56 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -# Use separate dir to avoid conflicts with headers -# this build variable should be moved to environment.sh, so that the clean scripts will work: -GLIBC_BUILD_CRT="$ROOT/build/glibc-crt" -rm -rf "$GLIBC_BUILD_CRT" -mkdir -p "$GLIBC_BUILD_CRT" -mkdir -p /home/Thomas/subu_data/developer/RT_gcc/build/glibc-crt/csu -touch /home/Thomas/subu_data/developer/RT_gcc/build/glibc-crt/csu/grcrt1.o -mkdir -p "$GLIBC_BUILD_CRT/include" -cp -r "$SYSROOT/usr/include"/* "$GLIBC_BUILD_CRT/include" - - -pushd "$GLIBC_BUILD_CRT" - - - echo "🧱 Configuring glibc for startup file build..." 
- - # Invoke configure explicitly - "$GLIBC_SRC/configure" \ - --prefix=/usr \ - --build="$HOST" \ - --host="$HOST" \ - --with-headers="$SYSROOT/usr/include" \ - --disable-multilib \ - --enable-static \ - --disable-shared \ - --enable-kernel=4.4.0 \ - libc_cv_slibdir="/usr/lib" - - # Ensure csu/Makefile is generated - #make -C "$GLIBC_SRC" objdir="$GLIBC_BUILD_CRT" csu/subdir_lib -j"$MAKE_JOBS" - make -C "$GLIBC_SRC" objdir="$GLIBC_BUILD_CRT" csu/crt1.o csu/crti.o csu/crtn.o -j"$MAKE_JOBS" - - # Now check and continue - if [[ ! -f "$GLIBC_BUILD_CRT/csu/Makefile" ]]; then - echo "❌ csu/Makefile still not found after configure. Startup build failed." - exit 1 - fi - - echo "🔨 Building crt objects..." - make -C "$GLIBC_SRC" objdir="$GLIBC_BUILD_CRT" csu/crt1.o csu/crti.o csu/crtn.o -j"$MAKE_JOBS" - - echo "📦 Installing crt objects to sysroot..." - install -m 644 "$GLIBC_BUILD_CRT/csu/crt1.o" "$GLIBC_BUILD_CRT/csu/crti.o" "$GLIBC_BUILD_CRT/csu/crtn.o" "$SYSROOT/usr/lib" - touch "$SYSROOT/usr/lib/libc.so" - - for f in crt1.o crti.o crtn.o; do - [[ -f "$SYSROOT/usr/lib/$f" ]] || { echo "❌ Missing $f after install"; exit 1; } - done - -popd -echo "✅ Startup files installed from isolated build dir: $GLIBC_BUILD_CRT" diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc_crt_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc_crt_requisites.sh" deleted file mode 100755 index 7875681..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc_crt_requisites.sh" +++ /dev/null @@ -1,60 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking requisites for glibc crt (startup files)" - -missing=() -found=() - -# Core toolchain utilities -for tool in gcc g++ make ld as; do - if command -v "$tool" >/dev/null; then - found+=("tool: $tool at $(command -v "$tool")") - else - missing+=("missing tool: $tool") - fi -done - -# GLIBC source/csu -if [[ -d "$GLIBC_SRC/csu" ]]; then - found+=("glibc source/csu present") -else - missing+=("missing: $GLIBC_SRC/csu") -fi - -# Expected headers from sysroot -for h in gnu/libc-version.h stdio.h unistd.h; do - if [[ -f "$SYSROOT/usr/include/$h" ]]; then - found+=("header present: $h") - else - missing+=("missing header: $SYSROOT/usr/include/$h") - fi -done - -# Writable sysroot lib dir -if [[ -w "$SYSROOT/usr/lib" ]]; then - found+=("SYSROOT writable: $SYSROOT/usr/lib") -else - missing+=("not writable: $SYSROOT/usr/lib") -fi - -# Summary output -echo -if (( ${#found[@]} > 0 )); then - echo "✅ Found:" - for item in "${found[@]}"; do echo " $item"; done -fi - -echo -if (( ${#missing[@]} > 0 )); then - echo "❌ Missing:" - for item in "${missing[@]}"; do echo " $item"; done - echo - echo "❌ Requisites check failed" - exit 1 -else - echo "✅ All crt requisites satisfied" - exit 0 -fi diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc_headers.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc_headers.sh" deleted file mode 100755 index abcb5a4..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc_headers.sh" +++ /dev/null @@ -1,47 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Building and installing glibc headers..." 
- -mkdir -p "$GLIBC_BUILD" -pushd "$GLIBC_BUILD" - - # Configure glibc with minimal bootstrap options - "$GLIBC_SRC/configure" \ - --prefix=/usr \ - --build="$HOST" \ - --host="$HOST" \ - --with-headers="$SYSROOT/usr/include" \ - --disable-multilib \ - --enable-static \ - --disable-shared \ - --enable-kernel=4.4.0 \ - libc_cv_slibdir="/usr/lib" - - - # Install headers into sysroot - make install-headers -j"$MAKE_JOBS" DESTDIR="$SYSROOT" - - # ✅ Verify headers were installed - required_headers=( - "$SYSROOT/usr/include/gnu/libc-version.h" - "$SYSROOT/usr/include/stdio.h" - "$SYSROOT/usr/include/unistd.h" - ) - - missing=() - for h in "${required_headers[@]}"; do - [[ -f "$h" ]] || missing+=("$h") - done - - if (( ${#missing[@]} > 0 )); then - echo "❌ Missing required glibc headers:" - printf ' %s\n' "${missing[@]}" - exit 1 - fi - -popd - -echo "✅ glibc headers successfully installed to $SYSROOT/usr/include" diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc_headers_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc_headers_requisites.sh" deleted file mode 100755 index 0264c81..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc_headers_requisites.sh" +++ /dev/null @@ -1,99 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking requisites for glibc headers installation." - -missing_requisite_list=() -found_requisite_list=() - -# ──────────────────────────────────────────────── -# 1. Required tools for headers phase -# -required_tools=( - "gcc" - "g++" - "make" - "ld" - "as" -) - -for tool in "${required_tools[@]}"; do - if location=$(command -v "$tool"); then - found_requisite_list+=("$location") - else - missing_requisite_list+=("$tool") - fi -done - -# ──────────────────────────────────────────────── -# 2. glibc source directory check -# -if [ -d "$GLIBC_SRC" ] && [ "$(ls -A "$GLIBC_SRC")" ]; then - found_requisite_list+=("$GLIBC_SRC") -else - missing_requisite_list+=("$GLIBC_SRC (empty or missing)") -fi - -# ──────────────────────────────────────────────── -# 3. Kernel headers required for bootstrap glibc -# -linux_headers=( - "$SYSROOT/usr/include/linux/version.h" - "$SYSROOT/usr/include/asm/unistd.h" - "$SYSROOT/usr/include/bits/types.h" -) - -for header in "${linux_headers[@]}"; do - if [[ -f "$header" ]]; then - found_requisite_list+=("$header") - else - missing_requisite_list+=("$header") - fi -done - -# ──────────────────────────────────────────────── -# 4. Confirm SYSROOT write access for header install -# -if [[ -w "$SYSROOT/usr/include" ]]; then - found_requisite_list+=("SYSROOT writable: $SYSROOT/usr/include") -else - missing_requisite_list+=("SYSROOT not writable: $SYSROOT/usr/include") -fi - -# ──────────────────────────────────────────────── -# 5. Check C preprocessor is operational -# -echo '#include ' | gcc -E - > /dev/null 2>&1 -if [[ $? 
-eq 0 ]]; then - found_requisite_list+=("C preprocessor operational: gcc -E works") -else - missing_requisite_list+=("C preprocessor failed: gcc -E on failed") -fi - -# ──────────────────────────────────────────────── -# Print results -# -if (( ${#found_requisite_list[@]} > 0 )); then - echo "found:" - for item in "${found_requisite_list[@]}"; do - echo " $item" - done -fi - -if (( ${#missing_requisite_list[@]} > 0 )); then - echo "missing:" - for item in "${missing_requisite_list[@]}"; do - echo " $item" - done -fi - -# Final verdict -if (( ${#missing_requisite_list[@]} > 0 )); then - echo "❌ Missing requisites for glibc header install" - exit 1 -else - echo "✅ All specified requisites found for glibc headers" - exit 0 -fi diff --git "a/developer/script_gcc-15\360\237\226\211/build_glibc_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/build_glibc_requisites.sh" deleted file mode 100755 index 0d25cf0..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_glibc_requisites.sh" +++ /dev/null @@ -1,51 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Checking prerequisites for glibc..." - -required_tools=(gcc make curl tar perl python3 gawk bison) -missing_tools=() - -for tool in "${required_tools[@]}"; do - if ! command -v "$tool" > /dev/null; then - missing_tools+=("$tool") - fi -done - -if (( ${#missing_tools[@]} )); then - echo "❌ Missing required tools:" - printf ' %s\n' "${missing_tools[@]}" - exit 1 -fi - -# Check that expected headers exist -glibc_headers=( - "$SYSROOT/usr/include/stdio.h" - "$SYSROOT/usr/include/unistd.h" -) - -# Check that expected startup objects exist -startup_objects=( - "$SYSROOT/usr/lib/crt1.o" - "$SYSROOT/usr/lib/crti.o" - "$SYSROOT/usr/lib/crtn.o" -) - -missing_files=() -for f in "${glibc_headers[@]}" "${startup_objects[@]}"; do - if [ ! -f "$f" ]; then - missing_files+=("$f") - fi -done - -if (( ${#missing_files[@]} )); then - echo "❌ Missing required files in sysroot:" - printf ' %s\n' "${missing_files[@]}" - echo - echo "Hint: these files should have been generated by build_glibc_headers.sh" - exit 1 -fi - -echo "✅ All prerequisites for glibc are met and sysroot is correctly populated." diff --git "a/developer/script_gcc-15\360\237\226\211/build_linux.sh" "b/developer/script_gcc-15\360\237\226\211/build_linux.sh" deleted file mode 100755 index 4b7ac0c..0000000 --- "a/developer/script_gcc-15\360\237\226\211/build_linux.sh" +++ /dev/null @@ -1,21 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "📦 Preparing Linux kernel headers for glibc and GCC..." - -pushd "$LINUX_SRC" - - $MAKE mrproper - $MAKE headers_install ARCH=x86_64 INSTALL_HDR_PATH="$SYSROOT/usr" - - if [[ -f "$SYSROOT/usr/include/linux/version.h" ]]; then - echo "✅ Linux headers installed to $SYSROOT/usr/include" - exit 0 - fi - - echo "❌ Kernel headers not found at expected location." - exit 1 - -popd diff --git "a/developer/script_gcc-15\360\237\226\211/clean_build.sh" "b/developer/script_gcc-15\360\237\226\211/clean_build.sh" deleted file mode 100755 index 7c9bca3..0000000 --- "a/developer/script_gcc-15\360\237\226\211/clean_build.sh" +++ /dev/null @@ -1,15 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "🧹 Cleaning build directories..." - -for dir in "${BUILD_DIR_LIST[@]}"; do - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - rm -rf "$dir" - fi -done - -echo "✅ Build directories cleaned." 
diff --git "a/developer/script_gcc-15\360\237\226\211/clean_dist.sh" "b/developer/script_gcc-15\360\237\226\211/clean_dist.sh" deleted file mode 100755 index 009e01e..0000000 --- "a/developer/script_gcc-15\360\237\226\211/clean_dist.sh" +++ /dev/null @@ -1,31 +0,0 @@ -#!/bin/bash -set -euo pipefail - -echo "removing: build, source, upstream, and project directories" - -source "$(dirname "$0")/environment.sh" - -# Remove build -# - "./clean_build.sh" - ! ! rmdir "$BUILD_DIR" >& /dev/null && echo "rmdir $BUILD_DIR" - -# Remove source -# note that repos are removed with clean_upstream -# - "./clean_source.sh" - "./clean_upstream.sh" - - ! ! rmdir "$SRC" >& /dev/null && echo "rmdir $SRC" - ! ! rmdir "$UPSTREAM" >& /dev/null && echo "rmdir $UPSTREAM" - -# Remove project directories -# - for dir in "${PROJECT_SUBDIR_LIST[@]}" "${PROJECT_DIR_LIST[@]}"; do - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - ! rm -rf "$dir" && echo "could not remove $dir" - fi - done - -echo "✅ clean_dist.sh" diff --git "a/developer/script_gcc-15\360\237\226\211/clean_source.sh" "b/developer/script_gcc-15\360\237\226\211/clean_source.sh" deleted file mode 100755 index 2f5beb0..0000000 --- "a/developer/script_gcc-15\360\237\226\211/clean_source.sh" +++ /dev/null @@ -1,29 +0,0 @@ -#!/bin/bash -# removes project tarball expansions from source/ -# git repos are part of `upstream` so are not removed - -set -euo pipefail - - -source "$(dirname "$0")/environment.sh" - -i=0 -while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - # skip url - i=$((i + 1)) - # skip explicit dest dir - i=$((i + 1)) - - base_name="${tarball%.tar.*}" - dir="$SRC/$base_name" - - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - rm -rf "$dir" - fi - - i=$((i + 1)) -done - -echo "✅ clean_source.sh" diff --git "a/developer/script_gcc-15\360\237\226\211/clean_upstream.sh" "b/developer/script_gcc-15\360\237\226\211/clean_upstream.sh" deleted file mode 100755 index 50e8d98..0000000 --- "a/developer/script_gcc-15\360\237\226\211/clean_upstream.sh" +++ /dev/null @@ -1,38 +0,0 @@ -#!/bin/bash -# run this to force repeat of the downloads -# removes project tarballs from upstream/ -# removes project repos from source/ -# does not remove non-project files - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -# Remove tarballs -i=0 -while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - path="$UPSTREAM/$tarball" - - if [[ -f "$path" ]]; then - echo "rm $path" - rm "$path" - fi - - i=$((i + 3)) -done - -# Remove Git repositories -i=0 -while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do - dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" - - if [[ -d "$dir" ]]; then - echo "rm -rf $dir" - rm -rf "$dir" - fi - - i=$((i + 3)) -done - -echo "✅ clean_upstream.sh" diff --git "a/developer/script_gcc-15\360\237\226\211/environment.sh" "b/developer/script_gcc-15\360\237\226\211/environment.sh" deleted file mode 100755 index 1991f40..0000000 --- "a/developer/script_gcc-15\360\237\226\211/environment.sh" +++ /dev/null @@ -1,160 +0,0 @@ -# === environment.sh === -# Source this file in each build script to ensure consistent paths and settings - -echo "ROOT: $ROOT" -cd $SCRIPT_DIR - -#-------------------------------------------------------------------------------- -# tools - - # machine target - export HOST=$(gcc -dumpmachine) - -# export MAKE_JOBS=$(nproc) -# export MAKE="make -j$MAKE_JOBS" - export MAKE_JOBS=$(getconf _NPROCESSORS_ONLN) - export MAKE=make - - - # Compiler path prefixes - export 
CC_FOR_BUILD=$(command -v gcc) - export CXX_FOR_BUILD=$(command -v g++) - -#-------------------------------------------------------------------------------- -# tool versions - - export LINUX_VER=6.8 - export BINUTILS_VER=2.42 - export GCC_VER=15.1.0 - export GLIBC_VER=2.39 - - # Library versions: required minimums or recommended tested versions - export GMP_VER=6.3.0 # Compatible with GCC 15, latest stable from GMP site - export MPFR_VER=4.2.1 # Latest stable, tested with GCC 15 - export MPC_VER=1.3.1 # Works with GCC 15, matches default in-tree - export ISL_VER=0.26 # Matches upstream GCC infrastructure repo - export ZSTD_VER=1.5.5 # Stable release, supported by GCC for LTO compression - -#-------------------------------------------------------------------------------- -# project structure - - # temporary directory - export TMPDIR="$ROOT/tmp" - - # Project directories - export SYSROOT="$ROOT/sysroot" - export TOOLCHAIN="$ROOT/toolchain" - export BUILD_DIR="$ROOT/build" - export LOGDIR="$ROOT/log" - export UPSTREAM="$ROOT/upstream" - export SRC=$ROOT/source - - # Synthesized directory lists - PROJECT_DIR_LIST=( - "$LOGDIR" - "$SYSROOT" "$TOOLCHAIN" "$BUILD_DIR" - "$UPSTREAM" "$SRC" - ) - # list these in the order they can be deleted - PROJECT_SUBDIR_LIST=( - "$SYSROOT/usr/lib" - "$SYSROOT/lib" - "$SYSROOT/usr/include" - ) - - # Source directories - export LINUX_SRC="$SRC/linux-$LINUX_VER" - export BINUTILS_SRC="$SRC/binutils-$BINUTILS_VER" - export GCC_SRC="$SRC/gcc-$GCC_VER" - export GLIBC_SRC="$SRC/glibc-$GLIBC_VER" - export GMP_SRC="$SRC/gmp-$GMP_VER" - export MPFR_SRC="$SRC/mpfr-$MPFR_VER" - export MPC_SRC="$SRC/mpc-$MPC_VER" - export ISL_SRC="$SRC/isl-$ISL_VER" - export ZSTD_SRC="$SRC/zstd-$ZSTD_VER" - - SOURCE_DIR_LIST=( - "$LINUX_SRC" - "$BINUTILS_SRC" - "$GCC_SRC" - "$GLIBC_SRC" - "$GMP_SRC" - "$MPFR_SRC" - "$MPC_SRC" - "$ISL_SRC" - "$ZSTD_SRC" - ) - - # Build directories - export BINUTILS_BUILD="$BUILD_DIR/binutils" - export GCC_BUILD_STAGE1="$BUILD_DIR/gcc-stage1" - export GCC_BUILD_FINAL="$BUILD_DIR/gcc-final" - export GLIBC_BUILD="$BUILD_DIR/glibc" - BUILD_DIR_LIST=( - "$BINUTILS_BUILD" - "$GCC_BUILD_STAGE1" - "$GCC_BUILD_FINAL" - "$GLIBC_BUILD" - ) - -#-------------------------------------------------------------------------------- -# upstream -> local stuff - - # see top of this file for the _VER variables - - # Tarball Download Info (Name, URL, Destination Directory) - export UPSTREAM_TARBALL_LIST=( - "linux-${LINUX_VER}.tar.xz" - "https://cdn.kernel.org/pub/linux/kernel/v6.x/linux-${LINUX_VER}.tar.xz" - "$UPSTREAM/linux-$LINUX_VER" - - "binutils-${BINUTILS_VER}.tar.xz" - "https://ftp.gnu.org/gnu/binutils/binutils-${BINUTILS_VER}.tar.xz" - "$UPSTREAM/binutils-$BINUTILS_VER" - - # using repo - # "gcc-${GCC_VER}.tar.xz" - # "https://ftp.gnu.org/gnu/gcc/gcc-${GCC_VER}/gcc-${GCC_VER}.tar.xz" - # "$UPSTREAM/gcc-$GCC_VER" - - "glibc-${GLIBC_VER}.tar.xz" - "https://ftp.gnu.org/gnu/libc/glibc-${GLIBC_VER}.tar.xz" - "$UPSTREAM/glibc-$GLIBC_VER" - - "gmp-${GMP_VER}.tar.xz" - "https://ftp.gnu.org/gnu/gmp/gmp-${GMP_VER}.tar.xz" - "$UPSTREAM/gmp-$GMP_VER" - - "mpfr-${MPFR_VER}.tar.xz" - "https://www.mpfr.org/mpfr-${MPFR_VER}/mpfr-${MPFR_VER}.tar.xz" - "$UPSTREAM/mpfr-$MPFR_VER" - - "mpc-${MPC_VER}.tar.gz" - "https://ftp.gnu.org/gnu/mpc/mpc-${MPC_VER}.tar.gz" - "$UPSTREAM/mpc-$MPC_VER" - - "isl-${ISL_VER}.tar.bz2" -# "https://gcc.gnu.org/pub/gcc/infrastructure/isl-${ISL_VER}.tar.bz2" - "https://libisl.sourceforge.io/isl-0.26.tar.bz2" -# 
"https://github.com/Meinersbur/isl/archive/refs/tags/isl-0.26.tar.gz" - "$UPSTREAM/isl-$ISL_VER" - - "zstd-${ZSTD_VER}.tar.zst" - "https://github.com/facebook/zstd/releases/download/v${ZSTD_VER}/zstd-${ZSTD_VER}.tar.zst" - "$UPSTREAM/zstd-$ZSTD_VER" - ) - - - # Git Repo Info - # Each entry is triple: Repository URL, Branch, Destination Directory - # Repos clone directly into $SRC - export UPSTREAM_GIT_REPO_LIST=( - - "git://gcc.gnu.org/git/gcc.git" - "releases/gcc-15" - "$SRC/gcc-$GCC_VER" - - #currently there is no second repo - ) - - diff --git "a/developer/script_gcc-15\360\237\226\211/project_download.sh" "b/developer/script_gcc-15\360\237\226\211/project_download.sh" deleted file mode 100755 index 0aa20bc..0000000 --- "a/developer/script_gcc-15\360\237\226\211/project_download.sh" +++ /dev/null @@ -1,131 +0,0 @@ -#!/bin/bash -# This script can be run multiple times to download what was missed on prior invocations -# If there is a corrupt tarball, delete it and run this again -# Sometimes a connection test will fails, then the downloads runs anyway - -set -uo pipefail # no `-e`, we want to continue on error - -source "$(dirname "$0")/environment.sh" - -check_internet_connection() { - echo "🌐 Checking internet connection..." - # Use a quick connection check without blocking the whole script - if ! curl -s --connect-timeout 5 https://google.com > /dev/null; then - echo "⚠️ No internet connection detected (proceeding with download anyway)" - else - echo "✅ Internet connection detected" - fi -} - -# check_server_reachability() { -# local url=$1 -# if ! curl -s --head "$url" | head -n 1 | grep -q "HTTP/1.1 200 OK"; then -# echo "⚠️ Cannot reach $url (proceeding with download anyway)" -# fi -# } - -check_server_reachability() { - local url=$1 - echo "checking is reachable: $url " - - # Attempt to get the HTTP response code without following redirects - http_code=$(curl -s -o /dev/null -w "%{http_code}" --connect-timeout 5 "$url") - - # If the HTTP code is between 200 and 299, consider it reachable - if [[ "$http_code" -ge 200 && "$http_code" -lt 300 ]]; then - echo "✅ Server reachable (HTTP $http_code): $url " - else - # If not 2xx, print the status code for transparency - echo "⚠️ Server HTTP $http_code not 2xx, will try anyway: $url" - fi -} - -check_file_exists() { - local file=$1 - [[ -f "$UPSTREAM/$file" ]] -} - -download_file() { - local file=$1 - local url=$2 - - echo "Downloading $file from $url..." - if (cd "$UPSTREAM" && curl -LO "$url"); then - if file "$UPSTREAM/$file" | grep -qi 'html'; then - echo "❌ Invalid download (HTML, not archive): $file" - rm -f "$UPSTREAM/$file" - return 1 - elif [[ -f "$UPSTREAM/$file" ]]; then - echo "✅ Successfully downloaded: $file" - return 0 - # Validate it's not an HTML error page - else - echo "❌ Did not appear after download: $file " - return 1 - fi - else - echo "❌ Failed to download: $file" - return 1 - fi -} - -download_tarballs() { - i=0 - while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - url="${UPSTREAM_TARBALL_LIST[$((i+1))]}" - i=$((i + 3)) - - if check_file_exists "$tarball"; then - echo "⚡ already exists, skipping download: $tarball " - continue - fi - - check_server_reachability "$url" - - if ! 
download_file "$tarball" "$url"; then - echo "⚠️ Skipping due to previous error: $tarball " - fi - - done -} - -download_git_repos() { - i=0 - while [ $i -lt ${#UPSTREAM_GIT_REPO_LIST[@]} ]; do - repo="${UPSTREAM_GIT_REPO_LIST[$i]}" - branch="${UPSTREAM_GIT_REPO_LIST[$((i+1))]}" - dir="${UPSTREAM_GIT_REPO_LIST[$((i+2))]}" - - if [[ -d "$dir/.git" ]]; then - echo "⚡ Already exists, skipping git clone: $dir " - i=$((i + 3)) - continue - fi - - echo "Cloning $repo into $dir..." - if ! git clone --branch "$branch" "$repo" "$dir"; then - echo "❌ Failed to clone $repo → $dir" - fi - - i=$((i + 3)) - done -} - -# do the downloads - -check_internet_connection - -echo "Downloading tarballs:" -for ((i=0; i<${#UPSTREAM_TARBALL_LIST[@]}; i+=3)); do - echo " - ${UPSTREAM_TARBALL_LIST[i]}" -done -download_tarballs - -echo "Cloning Git repositories:" -for ((i=0; i<${#UPSTREAM_GIT_REPO_LIST[@]}; i+=3)); do - echo " - ${UPSTREAM_GIT_REPO_LIST[i]} (branch ${UPSTREAM_GIT_REPO_LIST[i+1]})" -done -download_git_repos - -echo "project_download.sh completed" diff --git "a/developer/script_gcc-15\360\237\226\211/project_extract.sh" "b/developer/script_gcc-15\360\237\226\211/project_extract.sh" deleted file mode 100755 index e114d34..0000000 --- "a/developer/script_gcc-15\360\237\226\211/project_extract.sh" +++ /dev/null @@ -1,52 +0,0 @@ -#!/bin/bash -# Will not extract if target already exists -# Delete any malformed extractions before running again - -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -had_error=0 -i=0 - -while [ $i -lt ${#UPSTREAM_TARBALL_LIST[@]} ]; do - tarball="${UPSTREAM_TARBALL_LIST[$i]}" - i=$((i + 3)) - - src_path="$UPSTREAM/$tarball" - - # Strip compression suffix to guess subdirectory name - base_name="${tarball%%.tar.*}" # safer across .tar.gz, .tar.zst, etc. - target_dir="$SRC/$base_name" - - if [[ -d "$target_dir" ]]; then - echo "⚡ Already exists, skipping: $target_dir" - continue - fi - - if [[ ! -f "$src_path" ]]; then - echo "❌ Missing tarball: $src_path" - had_error=1 - continue - fi - - echo "tar -xf $tarball" - if ! (cd "$SRC" && tar -xf "$src_path"); then - echo "❌ Extraction failed: $tarball" - had_error=1 - continue - fi - - if [[ -d "$target_dir" ]]; then - echo "Extracted to: $target_dir" - else - echo "❌ Target not found after extraction: $target_dir" - had_error=1 - fi -done - -if [[ $had_error -eq 0 ]]; then - echo "✅ All tarballs extracted successfully" -else - echo "❌ Some extractions failed or were incomplete" -fi diff --git "a/developer/script_gcc-15\360\237\226\211/project_requisites.sh" "b/developer/script_gcc-15\360\237\226\211/project_requisites.sh" deleted file mode 100755 index 76bf171..0000000 --- "a/developer/script_gcc-15\360\237\226\211/project_requisites.sh" +++ /dev/null @@ -1,155 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -echo "Checking requisites for native standalone GCC build." - -if [ ! $(command -v pkg-config) ]; then - echo "pkg-config command required for this script" - echo "Debian: sudo apt install pkg-config" - echo "Fedora: sudo dnf install pkg-config" - exit 1 -fi - - -missing_requisite_list=() -failed_pkg_config_list=() -found_reequisite_list=() - -# --- Required Script Tools (must be usable by this script itself) --- -script_tools=( - bash - awk - sed - grep -) - -echo "Checking for essential script dependencies." -for tool in "${script_tools[@]}"; do - location=$(command -v "$tool") - if [ $? 
-eq 0 ]; then - found_requisite_list+=("$location") - else - missing_requisite_list+=("$tool") - fi -done - -# --- Build Tools --- -build_tools=( - gcc - g++ - make - tar - gzip - bzip2 - perl - patch - diff - python3 -) - -echo "Checking for required build tools." -for tool in "${build_tools[@]}"; do - location=$(command -v "$tool") - if [ $? -eq 0 ]; then - found_requisite_list+=("$location") - else - missing_requisite_list+=("$tool") - fi -done - -# --- Libraries --- -required_pkgs=( - gmp - mpfr - mpc - isl - zstd -) - -echo "Checking for required development libraries (via pkg-config)." -for lib in "${required_pkgs[@]}"; do - if pkg-config --exists "$lib"; then - found_reequisite_list+=("library: $lib => $(pkg-config --modversion "$lib")") - else - failed_pkg_config_list+=("library: $lib") - fi -done - -# --- Source Trees --- -required_sources=( - "$GCC_SRC" - "$BINUTILS_SRC" - "$GLIBC_SRC" - "$LINUX_SRC" - "$GMP_SRC" - "$MPFR_SRC" - "$MPC_SRC" - "$ISL_SRC" - "$ZSTD_SRC" -) - -echo "Checking for required source directories." -echo "These will be installed by download_sources.sh and extract_from_tar.sh" -for src in "${required_sources[@]}"; do - if [[ -d "$src" && "$(ls -A "$src")" ]]; then - found_reequisite_list+=("source: $src") - else - missing_requisite_list+=("source: $src") - fi -done - -# --- Python Modules (non-fatal) --- -optional_py_modules=(re sys os json gzip pathlib shutil time tempfile) - -echo "Checking optional Python3 modules." -for mod in "${optional_py_modules[@]}"; do - if python3 -c "import $mod" &>/dev/null; then - found_reequisite_list+=("python: module $mod") - else - missing_requisite_list+=("python (optional): module $mod") - fi -done - -echo -echo "Summary:" -echo "--------" - -# Found tools -for item in "${found_reequisite_list[@]}"; do - echo " found: $item" -done - -# Missing essentials -for item in "${missing_requisite_list[@]:-}"; do - echo "❌ missing required tool: $item" -done - -# pkg-config failures -for item in "${failed_pkg_config_list[@]:-}"; do - echo "⚠️ pkg-config could not find: $item" -done - -# Final verdict -echo - -if [[ ${#missing_requisite_list[@]} -eq 0 && ${#failed_pkg_config_list[@]} -eq 0 ]]; then - echo "✅ All required tools and libraries found." -else - echo "❌ Some requisites are missing." - - if [[ ${#failed_pkg_config_list[@]} -gt 0 ]]; then - echo - echo "Note: pkg-config did not find some libraries:" - echo " These are expected if you are building them from source:" - echo " - mpc" - echo " - isl" - echo " - zstd" - echo " If not, consider installing the corresponding dev packages." - echo " Debian: sudo apt install libmpc-dev libisl-dev libzstd-dev" - echo " Fedora: sudo dnf install libmpc-devel isl-devel libzstd-devel" - fi -fi - - diff --git "a/developer/script_gcc-15\360\237\226\211/project_setup.sh" "b/developer/script_gcc-15\360\237\226\211/project_setup.sh" deleted file mode 100755 index 31516dc..0000000 --- "a/developer/script_gcc-15\360\237\226\211/project_setup.sh" +++ /dev/null @@ -1,45 +0,0 @@ -#!/bin/bash -set -euo pipefail - -source "$(dirname "$0")/environment.sh" - -# Create top-level project directories -for dir in "${PROJECT_DIR_LIST[@]}"; do - echo "mkdir -p $dir" - mkdir -p "$dir" -done - -# Create subdirectories within SYSROOT -for subdir in "${PROJECT_SUBDIR_LIST[@]}"; do - echo "mkdir -p $subdir" - mkdir -p "$subdir" -done - -# Ensure TMPDIR exists and add .gitignore -if [[ ! 
-d "$TMPDIR" ]]; then - echo "mkdir -p $TMPDIR" - mkdir -p "$TMPDIR" - - echo "echo $TMPDIR/ > $TMPDIR/.gitignore" - echo "$TMPDIR/" > "$TMPDIR/.gitignore" -else - echo "⚠️ TMPDIR already exists" -fi - -# Create root-level .gitignore if missing -if [[ -f "$ROOT/.gitignore" ]]; then - echo "⚠️ $ROOT/.gitignore already exists" -else - echo "create $ROOT/.gitignore" - { - echo "# Ignore synthesized top-level directories" - for dir in "${PROJECT_DIR_LIST[@]}"; do - rel_path="${dir#$ROOT/}" - echo "/$rel_path" - done - echo "# Ignore synthesized files" - echo "/.gitignore" - } > "$ROOT/.gitignore" -fi - -echo "✅ setup_project.sh" diff --git a/document/configure/RT metagdata.org b/document/configure/RT metagdata.org new file mode 100644 index 0000000..bbed67a --- /dev/null +++ b/document/configure/RT metagdata.org @@ -0,0 +1,84 @@ +#+TITLE: RT_gcc Build Branding and Documentation URLs +#+AUTHOR: Thomas Walker Lynch +#+DATE: 2025-05-06 +#+OPTIONS: toc:nil num:nil +#+LANGUAGE: en + +This document outlines the recommended settings for GCC distributor metadata options when building the standalone ~RT_gcc~ compiler toolchain. These settings help clarify origin, version, and support paths for any distributed binaries based on this toolchain. + +* Purpose +GCC offers metadata fields to help identify binary builds, especially when they include custom patches or are not compiled from the official FSF release tree. Since RT_gcc introduces features like `#assign` and modular build scripts, we recommend setting these fields clearly. + +* Recommended Configuration + +** --with-pkgversion +:PROPERTIES: +:default: GCC +:controls: Label shown by `gcc --version` to identify the binary origin +:END: + +** Recommended setting +#+BEGIN_EXAMPLE +--with-pkgversion="RT_gcc standalone by Reasoning Technology" +#+END_EXAMPLE + +This distinguishes RT_gcc binaries from FSF-provided GCC builds. + +You can optionally append a build number or commit hash: +#+BEGIN_EXAMPLE +--with-pkgversion="RT_gcc standalone r1 (commit 9f3b123)" +#+END_EXAMPLE + +** --with-bugurl +:PROPERTIES: +:default: https://gcc.gnu.org/bugs/ +:controls: Where users are told to report bugs +:END: + +** Recommended setting +#+BEGIN_EXAMPLE +--with-bugurl="https://github.com/Thomas-Walker-Lynch/RT_gcc/issues" +#+END_EXAMPLE + +Use the GitHub Issues tracker unless you prefer an email or internal system. + +** --with-documentation-root-url +:PROPERTIES: +:default: https://gcc.gnu.org/onlinedocs/ +:controls: Where documentation links in error messages point +:END: + +** Recommended setting +#+BEGIN_EXAMPLE +--with-documentation-root-url="https://gcc.gnu.org/onlinedocs/" +#+END_EXAMPLE + +You may leave this as-is unless you plan to host modified docs yourself. 
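+
+Taken together, a configure invocation using the options discussed so far might look like the sketch below. The relative path to the configure script, the prefix, and the version string are illustrative only; adjust them to the actual build layout.
+
+#+BEGIN_EXAMPLE
+../gcc/configure \
+  --prefix=$TOOLCHAIN \
+  --with-pkgversion="RT_gcc standalone by Reasoning Technology" \
+  --with-bugurl="https://github.com/Thomas-Walker-Lynch/RT_gcc/issues" \
+  --with-documentation-root-url="https://gcc.gnu.org/onlinedocs/"
+#+END_EXAMPLE
+
+After installation, `gcc --version` shows the pkgversion string in parentheses, and `gcc -v` echoes the full "Configured with: ..." line, which is a quick way to confirm that the metadata took effect.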
+ +** --with-changes-root-url +:PROPERTIES: +:default: https://gcc.gnu.org/ +:controls: Where version-specific release notes are linked +:END: + +** Recommended setting +#+BEGIN_EXAMPLE +--with-changes-root-url="https://gcc.gnu.org/" +#+END_EXAMPLE + +Alternatively, if you maintain a changelog of your patched builds: +#+BEGIN_EXAMPLE +--with-changes-root-url="https://github.com/Thomas-Walker-Lynch/RT_gcc/releases/" +#+END_EXAMPLE + +* Summary Table +#+LATEX_HEADER: \usepackage{booktabs} +#+ATTR_LATEX: :environment tabular :align lll +#+NAME: Distributor Option Summary +| Option | Purpose | Suggested Setting | +|------------------------------+-------------------------------------+--------------------------------------------------------------------| +| --with-pkgversion | Identify GCC build source | RT_gcc standalone by Reasoning Technology | +| --with-bugurl | Where users report issues | https://github.com/Thomas-Walker-Lynch/RT_gcc/issues | +| --with-documentation-root-url | Docs base for online help | https://gcc.gnu.org/onlinedocs/ | +| --with-changes-root-url | Changelog root | https://github.com/Thomas-Walker-Lynch/RT_gcc/releases/ | + diff --git a/document/configure/directory settings.org b/document/configure/directory settings.org new file mode 100644 index 0000000..bde87a1 --- /dev/null +++ b/document/configure/directory settings.org @@ -0,0 +1,156 @@ +#+TITLE: GCC Configure Directory Settings +#+AUTHOR: Thomas Walker Lynch +#+DATE: 2025-05-06 +#+OPTIONS: toc:nil num:nil +#+LANGUAGE: en + +See `terms - machine roles` and `compiler deployment scenarios` for context on build environments. + +The one or more GCC programs used during the build, will by necessity be different from the GCC produced by the build. We call the GCC produced by the build the 'target GCC'. The GCC used to do the build is the `build GCC`. + +We talk about GCC compiling C programs. This is a simplification as target GCC compilers can be made for a number of languages. + +* Settings that customize the build process itself + +** --prefix $PREFIX + +This setting directly affects the build. + +Sets the build target directory to $PREFIX. This is where GCC build will write the build products: binaries, libraries, and include files. + +The GCC build does not read from this directory. There are other options for specifying where GCC reads libraries and header files. + +Default: /usr/local + +Examples: + +- If you want to install GCC into a custom toolchain directory called `/opt/gcc-12` use + `--prefix=/opt/gcc-12` + +- Thomas is building a standalone toolchain into `$TOOLCHAIN`, so he sets: + `--prefix=$TOOLCHAIN` + to ensure all artifacts are installed in that controlled location. + +For splitting apart the --prefix setting: + +*** --exec-prefix + +Sets the prefix for machine-specific files (binaries and libraries). If omitted, defaults to the value of `--prefix`. + +Rarely needed unless separating machine-independent files (docs, configs) from machine-dependent binaries. + +Default: Same as `--prefix` + +Example: + +- To split architecture-specific files into `/opt/gcc-12/x86_64`, you could use: + `--prefix=/opt/gcc-12` + `--exec-prefix=/opt/gcc-12/x86_64` + +The following standard autoconf options are supported. Normally you should not need to use these options. + +*** --bindir=dirname +Specify the installation directory for the executables called by users (such as gcc and g++). The default is exec-prefix/bin. 
+
+*** --libdir=dirname
+Specify the installation directory for object code libraries and internal data files of GCC. The default is exec-prefix/lib.
+
+*** --libexecdir=dirname
+Specify the installation directory for internal executables of GCC. The default is exec-prefix/libexec.
+
+*** --with-slibdir=dirname
+Specify the installation directory for the shared libgcc library. The default is libdir.
+
+*** --datarootdir=dirname
+Specify the root of the directory tree for read-only architecture-independent data files referenced by GCC. The default is prefix/share.
+
+*** --infodir=dirname
+Specify the installation directory for documentation in info format. The default is datarootdir/info.
+
+*** --datadir=dirname
+Specify the installation directory for some architecture-independent data files referenced by GCC. The default is datarootdir.
+
+*** --docdir=dirname
+Specify the installation directory for documentation files (other than Info) for GCC. The default is datarootdir/doc.
+
+*** --htmldir=dirname
+Specify the installation directory for HTML documentation files. The default is docdir.
+
+*** --pdfdir=dirname
+Specify the installation directory for PDF documentation files. The default is docdir.
+
+*** --mandir=dirname
+Specify the installation directory for manual pages. The default is datarootdir/man. (Note that the manual pages are only extracts from the full GCC manuals, which are provided in Texinfo format. The manpages are derived by an automatic conversion process from parts of the full manual.)
+
+*** --with-gxx-include-dir=dirname
+Specify the installation directory for G++ header files. The default depends on other configuration options, and differs between cross and native configurations.
+
+*** --with-specs=specs
+Specify additional command line driver SPECS. This can be useful if you need to turn on a non-standard feature by default without modifying the compiler’s source code, for instance --with-specs=%{!fcommon:%{!fno-common:-fno-common}}. See “Spec Files” in the main manual.
+
+
+* Settings that affect the behavior of the target compiler
+
+** --with-local-prefix $INCDIR
+
+This setting affects the behavior of the target GCC, but apart from that, does not affect the build.
+
+The target GCC, i.e. the one that configure and make is producing, will presumably be used later by programmers to compile programs. When the target GCC is called upon to compile a C program, that C program might specify one or more include files.
+
+By default, when the target GCC is invoked and the source file directs the inclusion of a system header file, the target GCC will first search the directory `/usr/include` for that header file, and if it does not find it, it will then search `/usr/local/include`.
+
+When `--with-local-prefix $INCDIR` is specified, the target GCC will first search `/usr/include`, as before; however, if it does not find the header file, it will instead search in `$INCDIR/include`. Thus `/usr/local/include` does not get searched in these first two steps.
+
+This feature is intended to support sites that use a different convention than `/usr/local` for the installation of site local header files.
+
+As for any GCC compile, additional include directories to be searched can be specified with the `-I` option given to the target GCC at the time it is invoked. The user can provide arbitrary values after the -I, though the target GCC will require read access to the specified directories.
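+
+If there is doubt about what the finished compiler will actually search, the header search list can be printed directly. The command below is a generic GCC idiom rather than anything specific to this build; substitute the path of the target GCC produced by the build for `$TOOLCHAIN/bin/gcc`.
+
+#+BEGIN_EXAMPLE
+echo | $TOOLCHAIN/bin/gcc -E -Wp,-v - 2>&1 | grep -A20 'search starts here'
+#+END_EXAMPLE
+
+The lines between "#include <...> search starts here:" and "End of search list." show the directories in the order they are consulted, which makes the effect of `--with-local-prefix` easy to verify.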
+
+The document at https://gcc.gnu.org/install/configure.html describes this option with a potentially confusing first sentence of "Specify the installation directory for local include files." By 'installation directory' they do not mean the target directory where the target GCC is being put by the build process, unless by coincidence. Rather they mean the place that site local header files will be found in the future when the target GCC is running and compiling source code.
+
+The site local directory can be removed from the include file search path by setting `--with-local-prefix` to `/dev/null`.
+
+- **Default**: `/usr/local`
+
+** --with-sysroot $SYSROOT
+
+This setting affects the behavior of the target GCC, but apart from that, does not affect the build.
+
+By default, when the target GCC is later invoked to compile a program, it will search absolute paths for libraries and include files, namely in `/usr/include`, `/usr/lib` and `/usr/local/include` for header and library files to read. This happens downstream when the target GCC is invoked to do work for programmers.
+
+The `--with-sysroot $SYSROOT` option causes the library and include searches to instead be relative to $SYSROOT. It will instead search `$SYSROOT/usr/include`, `$SYSROOT/usr/lib`, and `$SYSROOT/usr/local/include` for header and library files to read.
+
+Various other options here can modify the defaults of where the target GCC will search for libraries and include files. When given `--with-sysroot $SYSROOT` those default overrides will also be taken to be relative to `$SYSROOT`.
+
+The target GCC's built-in default include and library paths are adjusted to be relative to $SYSROOT. However, user-supplied -I and -L paths remain absolute unless written as relative paths.
+
+This setting is useful when cross compiling or when building an isolated GCC.
+
+- Default: (none)
+- Usage:
+  During cross-compilation or isolated builds, you use this to point to a directory that mimics the target system's root.
+- Example: Thomas builds a version of GCC that is not supported by the system, so he installs all requisite run time files to be used by this version of GCC into `$SYSROOT`, so at build time he passes to configure:
+  `--with-sysroot=$SYSROOT`
+
+** --with-native-system-header-dir
+
+Specifies the directory where the target GCC will look for **native system headers** by default, when it is later invoked to compile programs.
+
+The term *native* here refers to the **target system** that the built GCC is intended to compile programs for — whether that system is the same as the host (in a native build) or different (in a cross-compilation scenario). These headers typically come from the target's C library (like glibc or musl), and include standard files such as `stdio.h`, `stdlib.h`, etc.
+
+This setting controls where those headers are expected to reside. It is especially useful when building an isolated toolchain or a cross-compiler, where the target's system headers do not follow the default `/usr/include` layout.
+
+When used in combination with `--with-sysroot`, this path is interpreted relative to the given sysroot. For example, if `--with-native-system-header-dir=/usr/include` and `--with-sysroot=$SYSROOT` are both provided, the target GCC will search for headers in `$SYSROOT/usr/include`.
+
+This option overrides the default path implied by `--with-local-prefix`, and affects only the behavior of the resulting target GCC — it does not influence how GCC itself is built.
+ +- **Default**: `/usr/include` + +- **Example**: Inside a musl-based sysroot located at `$SYSROOT`, system headers live in `$SYSROOT/usr/include`. Thomas configures GCC with: + #+begin_src bash + --with-sysroot=$SYSROOT + --with-native-system-header-dir=/usr/include + #+end_src + + This causes the resulting GCC to treat `/usr/include` as the location of system headers — *relative to* `$SYSROOT`. + + diff --git a/document/cpp.txt b/document/cpp.txt new file mode 100644 index 0000000..376e5bd --- /dev/null +++ b/document/cpp.txt @@ -0,0 +1,489 @@ +by Jonathan Heathcote + +C Pre-Processor Magic +The C Pre-Processor (CPP) is the somewhat basic macro system used by the C programming language to implement features such as #include and #define which allow very simple text-substitutions to be carried out at compile time. In this article we abuse the humble #define to implement if-statements and iteration. + +Before we begin, a disclaimer: these tricks, while perfectly valid C, should not be considered good development practice and should almost certainly not be used for "real work". That said it can totally be used for fun home-automation projects... Finally, whilst these tricks have been found to work under GCC and Clang's CPP implementations, I've heard that they might not under Microsoft's compilers. + +The humble #define +Most C programmers will be familiar with the common-or-garden #define preprocessor directive. This directive allows the programmer to define a simple text-substitution macro. For example: + +#define VERSION 123 + +// ... later ... +printf("Version: %d\n", VERSION); +In this snippet we define a macro VERSION which the CPP will look for and replace with 123. We can specify any valid sequence of C tokens (that is, fragments of valid C though these need not be syntactically valid, for example , 123 { hello would be acceptable). We can see this in action by feeding this to CPP like so: + +cpp << EOF +#define VERSION 123 + +// ... later ... +printf("Version: %d\n", VERSION); +EOF +Which produces: + +# 1 "" +# 1 "" +# 1 "" +# 1 "/usr/include/stdc-predef.h" 1 3 4 +# 1 "" 2 +# 1 "" + + + +printf("Version: %d\n", 123); +This is actually the raw input that your compiler sees and compiles. The lines starting with # are not preprocessor directives but rather compiler hints which help the compiler work out line-numbers after #includes have been added and comments removed and thus produce helpful error messages. You can suppress these lines using -P. + +We can also define 'function-style' macros which take a number of arguments: + +#define MULTIPLY(a, b) a * b + +// ... later ... + +printf("4*8 = %d\n", MULTIPLY(4, 8)); +Which expands to: + +printf("4*8 = %d\n", 4 * 8); +Note that when using these in normal code it is common to place brackets around the macro substitution and also around the arguments: + +#define MULTIPLY(a, b) ((a) * (b)) +The reason for this is that without the brackets the following may not do what you'd expect: + +printf("%d\n", MULTIPLY(4 + 2, 2 + 8) * 2); +Without brackets this expands to: + +printf("%d\n", 4 + 2 * 2 + 8 * 2); +Which due to operator precedence rules (multiplies are evaluated first) would not evaluate how you'd expect. The bracketed version, however, works as you'd expect: + +printf("%d\n", ((4 + 2) * (2 + 8)) * 2); +As a final advanced twist, we can define function style macros with varadic arguments. You'll see these most often looking like this: + +#define DEBUG(...) fprintf(stderr, __VA_ARGS__) + +// ... later, inside a for-loop ... 
+ +DEBUG("Something went wrong in iteration: %d", i); +If we specify the final argument to our macro as being ..., the macro will accept any number of arguments (even zero). These arguments are inserted into your substitution if you write __VA_ARGS__, complete with separating commas between each of the arguments. + +This is where sane usage of macros in C ends. + +If-statements +Time for our first bit of magic. Let's try and produce a macro that does the following: + +IF_ELSE(condition)( + expand to this if condition is not 0 +)( + expand to this otherwise +) +Unlike a C if-else-statement, the condition will be evaluated in the preprocessor, before your code is even compiled. The usefulness of this will become more apparent later on. + +Pattern Matching +The key to our if-else statement is abusing CPP to perform pattern matching like so: + +#define IF_ELSE(condition) _IF_ ## condition +#define _IF_1(...) __VA_ARGS__ _IF_1_ELSE +#define _IF_0(...) _IF_0_ELSE + +#define _IF_1_ELSE(...) +#define _IF_0_ELSE(...) __VA_ARGS__ +Download & Try Me! (Hint: cpp -P filename.txt) + +First notice that IF_ELSE takes a single argument: a condition. In our example above you can see that this is then followed by two parenthesised expressions corresponding to the true and false case for the condition respectively. + +Lets see how this works in practice by walking through the expansion of the 1 and 0 cases: + +Condition is 1 case: + +IF_ELSE(1)(it was one)(it was zero) +_IF_ ## 1 (it was one)(it was zero) +_IF_1 (it was one)(it was zero) +it was one _IF_1_ELSE (it was zero) +it was one +Condition is 0 case: + +IF_ELSE(0)(it was one)(it was zero) +_IF_ ## 0 (it was one)(it was zero) +_IF_0 (it was one)(it was zero) +_IF_0_ELSE (it was zero) +it was zero +The trick here is using the CPP concatenation operator (##) to concatenate _IF_ and the condition argument. In this case we expect condition to be either 0 or 1 and so the result is either _IF_1 or _IF_0. These two macros combine with the second set of brackets (the true clause) and either reproduce their arguments or swallow them respectively. They also produce a matching _IF_1_ELSE or _IF_0_ELSE macro which combines with the third set of brackets, swallowing or reproducing the arguments respectively. + +Cast to bool!! (and negation) +Our IF_ELSE is looking pretty good at this point but what if we write: + +IF_ELSE(123)(non-zero)(zero) +Download & Try Me! + +In this case it expands to: + +_IF_123(non-zero)(zero) +Unfortunately because _IF_123 is not defined, the macro expansion stops here and we're stuck. What we need is a macro which expands to 0 when its argument is 0 and 1 in any other case. To do this we'll attempt to implement C's famous cast to bool "operator" !!. This "operator" works by first negating the value with ! yielding either 0 or 1, negating this will then yield 0 if the original value was 0 and 1 otherwise. In order to do this with a macro we'll need to implement logical negation. + +Before we can implement negation we need a handful of simple macros which at first sight will seem a bit random but don't worry, we'll get there! The first is this guy: + +#define SECOND(a, b, ...) b +Download & Try Me! + +This macro takes an two or more arguments and expands to the second argument. + +The second macro is a 'probing' macro IS_PROBE: it takes a single argument and expands to 1 if the argument is PROBE() and 0 for any other argument. The macro is implemented as follows: + +#define IS_PROBE(...) 
SECOND(__VA_ARGS__, 0) +#define PROBE() ~, 1 +Download & Try Me! + +The dirty secret to IS_PROBE is that it takes a variable number of arguments and yet we just said it only takes a single argument. If we do indeed give IS_PROBE a single argument, you'll notice that the SECOND macro will in turn expand to 0. If we pass PROBE(), however, this expands to ~, 1 and as a result SECOND(~, 1, 0) will expand to 1. + +The trick here is that we can always spot the PROBE() because it (secretly) expands to two arguments (the second of which is 1 while all other valid inputs expand to one input. The choice of ~ as a first argument is essentially arbitrary (since SECOND will always cause it to disappear). However this particular character is a popular convention since if a bug in your macros results in one sneaking out into the final expansion it frequently results in a syntax error in the compiler alerting you to the problem. + +Now, using the above we can write our negation macro using the above and a little pattern matching like so: + +#define NOT(x) IS_PROBE(_NOT_ ## x) +#define _NOT_0 PROBE() +Download & Try Me! + +Here we pattern match the case where NOT's argument is 0 and substitute it for the PROBE(). When the argument is non-zero we simply get something like _NOT_some stuff here. Since IS_PROBE will only result in a 1 if it gets the PROBE() and that only happens when we pass in 0, we have a working negation! + +Here's a walk-through of the substitutions happening for both the zero and non-zero cases: + +Non-zero case: +NOT(not zero) +IS_PROBE(_NOT_ ## not zero) +IS_PROBE(_NOT_not zero) +SECOND(_NOT_not zero, 0) +0 +Zero case: +NOT(0) +IS_PROBE(_NOT_ ## 0) +IS_PROBE(_NOT_0) +IS_PROBE(PROBE()) +IS_PROBE(~, 1) +SECOND(~, 1, 0) +1 +With our negation macro working we can trivially make a cast-to-bool macro: + +#define BOOL(x) NOT(NOT(x)) +Download & Try Me! + +Unfortunately, this doesn't work: + +BOOL(0) +BOOL(123) +Becomes: + +0 +0 +This is all due to the slightly unusual rules surrounding the ## (concatenation) operator. Typically, macro arguments are expanded before they are inserted into the macro body, however, if the argument is inserted next to a ##, it will not be expanded. The result is as follows: + +BOOL(123) +NOT(NOT(123)) +IS_PROBE(_NOT_ ## NOT(123)) +IS_PROBE(_NOT_NOT(123)) +SECOND(_NOT_NOT(123), 0) +0 +We can force the expansion to take place by hiding the ## using a macro as follows: + +#define CAT(a,b) a ## b +If we redefine NOT as follows: + +#define NOT(x) IS_PROBE(CAT(_NOT_, x)) +#define _NOT_0 PROBE() +Download & Try Me! + +This time, because the expansion of NOT does not (directly) contain a ## near its argument, the arguments to NOT are macro expanded before insertion into the macro body. As a result, our double negation, and thus our BOOL macro, works: + +BOOL(123) +NOT(NOT(123)) +NOT(IS_PROBE(CAT(_NOT_, 123))) +NOT(IS_PROBE(_NOT_123)) +NOT(SECOND(_NOT_123, 0)) +NOT(0) +IS_PROBE(CAT(_NOT_, 0)) +IS_PROBE(_NOT_0) +IS_PROBE(PROBE()) +IS_PROBE(~, 1) +SECOND(~, 1, 0) +1 +Unfortunately, though, we hit a related problem when we stick this into our IF_ELSE macro: + +#define IF_ELSE(condition) _IF_ ## BOOL(condition) +#define _IF_1(...) __VA_ARGS__ _IF_1_ELSE +#define _IF_0(...) _IF_0_ELSE + +#define _IF_1_ELSE(...) +#define _IF_0_ELSE(...) __VA_ARGS__ + +IF_ELSE(123)(is non zero)(is zero) +Download & Try Me! 
+ +Which produces: + +_IF_BOOL(123)(is non zero)(is zero) +The concatenation is happening before our BOOL macro is expanded and of course _IF_BOOL(123) is not a macro and so it remains unexpanded. In order to force the BOOL macro to be expanded before the concatenation we need to do the following: + +#define IF_ELSE(condition) _IF_ELSE(BOOL(condition)) +#define _IF_ELSE(condition) CAT(_IF_, condition) + +#define _IF_1(...) __VA_ARGS__ _IF_1_ELSE +#define _IF_0(...) _IF_0_ELSE + +#define _IF_1_ELSE(...) +#define _IF_0_ELSE(...) __VA_ARGS__ +Download & Try Me! + +We now have two layers of indirection between IF_ELSE and the concatenation. The expansion looks like this: + +IF_ELSE(123)(is non zero)(is zero) +_IF_ELSE(BOOL(123))(is non zero)(is zero) +(Expansion of BOOL not shown) +_IF_ELSE(1)(is non zero)(is zero) +CAT(_IF_, 1)(is non zero)(is zero) +_IF_ ## 1(is non zero)(is zero) +_IF_1(is non zero)(is zero) +is non zero _IF_1_ELSE(is zero) +is non zero +Since IF_ELSE ensures that BOOL(condition) is expanded when it is an argument to _IF_ELSE, the BOOL macro is fully expanded to 1 when it reaches the CAT macro. Since CAT uses ##, its inputs would not be expanded and so this step is critical. Also note that if we didn't use the CAT macro and instead just used ##, _IF_ELSE's arguments would not get expanded and so we'd hit the problem we saw before. + +So after all that, at long last, we now have a working IF_ELSE macro! + +Iterators +Iterators are a tricky beast in CPP because recursive macros are explicitly blocked. For example if we write: + +#define RECURSIVE() I am RECURSIVE() +RECURSIVE() rather boringly expands to + +I am RECURSIVE() +What happens is that when RECURSIVE() is expanded, if RECURSIVE appears in its expansion it is 'painted blue' (macro language jargon) which prevents it ever being expanded as a macro. Clearly we can't rely on simple recursion to implement. One could of course exhaustively define $n$ macros for iterating up to $n$ times but this would be tiresome and extremely limiting. + +Luckily with some more detailed knowledge of how CPP expands macros we can create an iterator which can iterate $O(2^n)$ times with only $O(n)$ macro definitions. + +Forcing CPP to make multiple passes +Lets start by looking at the following snippet: + +#define EMPTY() +#define A(n) I like the number n + +A (123) +A EMPTY() (123) +Download & Try Me! + +This expands to: + +I like the number 123 +A (123) +This might seem a little odd since we can see that A (123) should have be expanded but hasn't been. Lets look at the process CPP goes through when expanding the last line of the example: + +CPP sees the token A but since it isn't followed by a set of brackets, it is not considered a macro. + +A EMPTY() (123) +^ +Next it sees EMPTY() + +A EMPTY() (123) + ^ +This is substituted according to our #define: + +A (123) + ^ +At this point the macro expander sees (123) which cannot be expanded and so expansion completes: + +A (123) + ^ +Clearly, an second expansion pass is required. We can force this to happen by making a macro like so: + +#define EVAL1(...) __VA_ARGS__ + +EVAL1(A EMPTY() (123)) +Download & Try Me! + +This expands to: + +I like the number 123 +The reason this works is that when CPP encounters a function-style macro, it recursively expands the macro's arguments before substituting the macro's body and expanding that. + +So, with that in mind, lets watch what happens in detail: + +CPP sees the token EVAL1 followed by a set of brackets. This corresponds with a function-style macro. 
+ +EVAL1(A EMPTY() (123)) +^ +CPP now takes the arguments and separately expands these. It sees the token A but since it isn't followed by a set of brackets, it is not considered a macro. + +A EMPTY() (123) +^ +Next it sees EMPTY() + +A EMPTY() (123) + ^ +CPP takes the arguments and exppands these, which is rather boring since there are no arguments: +^ +The equally empty body of EMPTY() is then substituted in and expansion continues. + +A (123) + ^ +At this point the macro expander sees (123) which cannot be expanded and so expansion completes. + +A (123) + ^ +The arguments to EVAL1, having been expanded, are now substituted into the macro body of EVAL1 and the macro expander continues from the start of EVAL1's expansion. It sees A (123) and expands this. + +A (123) +^ +This leaves us with the following: + +I like the number 123 +^ +A pass of the macro expander will now find no further substitutions and so the expansion completes. + +I like the number 123 + ^ +OK, so clearly using this trick we can force the macro expander to make additional passes when expanding macros. In fact, we can cause the macro expander to make many additional passes: + +#define EVAL(...) EVAL1024(__VA_ARGS__) +#define EVAL1024(...) EVAL512(EVAL512(__VA_ARGS__)) +#define EVAL512(...) EVAL256(EVAL256(__VA_ARGS__)) +#define EVAL256(...) EVAL128(EVAL128(__VA_ARGS__)) +#define EVAL128(...) EVAL64(EVAL64(__VA_ARGS__)) +#define EVAL64(...) EVAL32(EVAL32(__VA_ARGS__)) +#define EVAL32(...) EVAL16(EVAL16(__VA_ARGS__)) +#define EVAL16(...) EVAL8(EVAL8(__VA_ARGS__)) +#define EVAL8(...) EVAL4(EVAL4(__VA_ARGS__)) +#define EVAL4(...) EVAL2(EVAL2(__VA_ARGS__)) +#define EVAL2(...) EVAL1(EVAL1(__VA_ARGS__)) +#define EVAL1(...) __VA_ARGS__ +Before we move on, lets define the following macro based on what we've just learnt: + +#define DEFER1(m) m EMPTY() +This macro can be used to defer the expansion of another macro to the next expansion pass as follows: + +#define B(n) n is my favourite! +DEFER1(B)(321) +Which expands to: + +B (321) +Requiring a further expansion pass to become: + +321 is my favourite! +Turning multiple expansion passes into recursion +As this subheading suggests, its time to try and implement recursion: one possible basis for any form of iteration. Previously we stated that we cannot allow a macro to expand to itself because it will get painted blue and thus never get expanded. However, if the recursive expansion to takes place in a separate macro expansion pass, CPP won't realise recursion is taking place and thus we can get away with it. + +Take a look at the following macro: + +#define RECURSE() I am recursive, look: DEFER1(_RECURSE)()() +#define _RECURSE() RECURSE + +RECURSE() +Download & Try Me! + +It will expand in a single pass to: + +I am recursive, look: _RECURSE ()() +Note that we didn't expand to the RECURSE() macro so everything is fine so far. + +If we force a second macro expansion, for example: + +EVAL1(RECURSE()) +We get: + +I am recursive, look: I am recursive, look: _RECURSE ()() +In this second pass, the macro expander expands _RECURSE () giving: + +I am recursive, look: RECURSE () +It is critical that _RECURSE expands to RECURSE and not RECURSE() since the former cannot be expanded in isolation since RECURSE is a function-style macro. If we used the latter, RECURSE would expand to contain _RECURSE which would then be painted blue and no number of evaluations will cause it to expand. 
+ +The expansion of _RECURSE(), RECURSE, combines with the extra pair of brackets which follows which, completes a second iteration of our recursion. + +Each additional evaluation results in an iteration of the recursive macro. Obviously execution is not truly recursive in that the recursion depth is limited to the number of evaluations which in turn is bounded. However, since the number of evaluations can be trivially made very large, for any practical purpose the recursion depth should be sufficient. + +Turning recursion into an iterator +Finally, lets do what we set out to do and implement a MAP macro which applies the specified macro to every element of a list of arguments. Clearly we can make a start based on our recursion example: + +#define MAP(m, first, ...) m(first) DEFER1(_MAP)()(m, __VA_ARGS__) +#define _MAP() MAP +Our MAP macro evaluates the passed macro passing in the first argument. We then recursively call ourselves passing the supplied macro and remaining arguments. For example: + +#define GREET(x) Hello, x! +EVAL8(MAP(GREET, Mum, Dad, Adam, Joe)) +Download & Try Me! + +And it largely works: + +Hello, Mum! Hello, Dad! Hello, Adam! Hello, Joe! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! Hello, ! _MAP ()(GREET, ) +But alas, our recursion never terminates. We must come up with a macro which can detect when there are no more arguments and terminate the recursion. We can do this as follows: + +#define FIRST(a, ...) a + +#define HAS_ARGS(...) BOOL(FIRST(_END_OF_ARGUMENTS_ __VA_ARGS__)()) +#define _END_OF_ARGUMENTS_() 0 +Download & Try Me! + +When HAS_ARGS is passed no arguments, expansion proceeds as follows: + +HAS_ARGS() +BOOL(FIRST(_END_OF_ARGUMENTS_)()) +BOOL(_END_OF_ARGUMENTS_()) +BOOL(0) +0 +However, when HAS_ARGS has any number of arguments: + +HAS_ARGS(some, arguments, here) +BOOL(FIRST(_END_OF_ARGUMENTS_ some, arguments, here)()) +BOOL(_END_OF_ARGUMENTS_ some()) +1 +The key trick is that the first argument passed to HAS_ARGS is placed between _END_OF_ARGUMENTS_ and the brackets that would cause it to expand to 0. As a result, the BOOL macro's argument will be non-zero and thus evaluate to 1. + +Now we can use HAS_ARGS along with our IF_ELSE macro to make our MAP terminate cleanly: + +#define MAP(m, first, ...) \ + m(first) \ + IF_ELSE(HAS_ARGS(__VA_ARGS__))( \ + DEFER1(_MAP)()(m, __VA_ARGS__) \ + )( \ + /* Do nothing, just terminate */ \ + ) +#define _MAP() MAP +Download & Try Me! + +Lets try it out: + +EVAL(MAP(GREET, Mum, Dad, Adam, Joe)) +And lo and behold: + +Hello, Mum! MAP(GREET, Dad, Adam, Joe) +Oh no! Now it doesn't recurse! You'll remember that earlier we prevented _MAP from being expanded in a single pass (using DEFER1) in order to hide the recursive expansion from CPP. What's happened here is that our DEFER1 has been moved into the true-clause of our IF_ELSE statement which becomes an argument to a macro and, as a result, receives an additional pass of the macro expander and so _MAP() gets expanded to MAP which in turn gets painted blue, killing the recursion. + +To get around this we need to defer the expansion of _MAP for two passes of macro expansion. We can extend our DEFER1 macro like so to make version which defer for successively greater numbers of expansion: + +#define DEFER2(m) m EMPTY EMPTY()() +#define DEFER3(m) m EMPTY EMPTY EMPTY()()() +#define DEFER4(m) m EMPTY EMPTY EMPTY EMPTY()()()() +// ... 
+In this case we simply need to defer for two expansion passes so DEFER2 is sufficient: + +#define MAP(m, first, ...) \ + m(first) \ + IF_ELSE(HAS_ARGS(__VA_ARGS__))( \ + DEFER2(_MAP)()(m, __VA_ARGS__) \ + )( \ + /* Do nothing, just terminate */ \ + ) +#define _MAP() MAP +Download & Try Me! + +And so, at long last: + +EVAL(MAP(GREET, Mum, Dad, Adam, Joe)) +Expands to: + +Hello, Mum! Hello, Dad! Hello, Adam! Hello, Joe! +And with that, congratulations, you've just implemented iteration in the C-preprocessor! + +Learning More +The CPP manual is, of course, a good place for detailed and precise information on CPP's behaviour though it doesn't cover interesting abuses such as those above. + +My own first introduction to CPP abuse came from pfultz2's excellent introduction to the matter. + +If you're interested in a non-trivial application of these techniques, take a look at my own 'EZSHET' library which can be found on GitHub as part of uSHET, a home automation library. EZSHET generates C code to unpack and type-check arbitrary JSON strings at compile time using some of these tricks. You can start with lib/cpp_magic.h for the generic CPP abuse (from which much of this article is taken). You should then move on to the innards to discover how this is used to generate code like that listed in the appendix of the EZSHET tutorial. + +Another popular place to see abusive CPP macros in action is Boost's CPP library, built for the days when template meta-programming wasn't hard-core enough to stand alone. + diff --git a/document/how_it_works/cpp.org b/document/how_it_works/cpp.org new file mode 100644 index 0000000..b38c7ba --- /dev/null +++ b/document/how_it_works/cpp.org @@ -0,0 +1,503 @@ +#+TITLE: C Preprocessor Overview +#+AUTHOR: Thomas Walker Lynch & Caelestis Index +#+DESCRIPTION: High-level architectural partitioning of cpp (GCC 12.x) +#+FILETAGS: cpp preprocessor architecture gcc +#+OPTIONS: toc:nil + +* Preprocessing Pipeline (Diagram) + +#+BEGIN_SRC text + C Preprocessor (cpp) + ===================== + ++----------------------+ +| Source Code | ++----------------------+ + | + v ++----------------------+ +| Lexical Analysis | <- Part of: Lexical Analysis +| (tokenize input) | ++----------------------+ + | + v ++----------------------+ +| Directive Engine | <- Part of: Directive Handling +| (#define, #if, etc.) | ++----------------------+ + | + v ++----------------------+ +| Conditional Logic | <- Part of: Conditional Compilation +| (#if/#ifdef/#else) | ++----------------------+ + | + v ++----------------------+ +| Macro Expansion | <- Part of: Macro Expansion +| (object/function) | ++----------------------+ + | + v ++----------------------+ +| Callback Hooks | <- Part of: Hook and Callback Interface +| (cpp_callbacks) | ++----------------------+ + | + v ++----------------------+ +| Output Tokens | <- Output stream to compiler frontend +| (to GCC parser) | ++----------------------+ +#+END_SRC + +Each block corresponds to a major processing stage in `cpp`. The functional groups defined earlier align to these blocks as indicated, though some (like state management and diagnostics) operate globally across the pipeline. + + + +* Major Functional Partitions of the C Preprocessor (cpp) + +This section outlines the primary architectural components of the C preprocessor as implemented in GCC 12.x. These functional partitions help frame how cpp processes input and how its internal modules interact. + +** 1. Lexical Analysis +- Tokenizes input into =cpp_token= streams. 
+- Decodes: + - UTF-8 characters + - Trigraphs (e.g., =??=) + - Digraphs (e.g., =<: = for =[=) +- Central structure: =cpp_lexer= +- Produces tokens for macro expansion and conditional evaluation. +** 2. Directive Handling +- Processes all =#= directives, including: + - =#define=, =#undef=, =#include=, =#line=, =#error=, =#pragma= + - Extended directives like =#assign=, =#call= if supported. +- Managed via =directive_table= and dispatch functions like =do_define=, =do_include=, etc. + +** 3. Conditional Compilation +- Handles constructs like: + - =#if=, =#ifdef=, =#ifndef=, =#elif=, =#else=, =#endif= +- Used to include or exclude code based on macro definitions and constant expressions. +- Driven by the =if_stack= in =cpp_reader=. +- Central to controlling variant builds, platform-specific code, or staged compilation. +** 4. File Inclusion and Search Paths +- Resolves =#include= and maintains include history. +- Handles: + - System vs user includes (<...> vs "..."). + - Include path resolution via =cpp_search_path=. + - File change tracking via =file_stack=. +** 5. Macro Expansion +- Handles object-like and function-like macros: + - =#define PI 3.14= + - =#define SQR(x) ((x)*(x))= +- Manages: + - Argument collection and expansion + - Token-pasting (=##=) and stringification (=#=) +- Involves =macro_table=, =collect_args=, and =expand_macro()= + +** 6. Diagnostics and Error Recovery +- Reports syntax errors, macro misuse, directive misuse. +- Uses: + - =cpp_error=, =cpp_warning=, =cpp_notice= + - Tracks macro nesting, input location, and file state for context. + +** 7. Hook and Callback Interface +- Interface: =cpp_callbacks= +- Allows frontend or plugin to observe: + - Macro definitions + - File changes + - Token output stream +- Enables debugging tools, IDEs, or language servers to integrate preprocessor awareness. + +** 8. State Management and Scoping +- Maintains global and file-level preprocessor state. +- Tracks: + - Nested conditional state via =if_stack= + - Macro table lifetimes and shadowing + - Include guards and =#pragma once= heuristics + + +* cpplib.h -- Application Interface Overview + +This section documents the **interface** and **in-memory model** of the C preprocessor (`libcpp`) from GCC 12.2.0. +It covers core data structures (tokens, macros, readers) and the primary functions for working with them. + +** Key Data Structures + +*** Token & Token Metadata +- `enum cpp_ttype` :: All possible token types (operators, names, literals, etc.) +- `struct cpp_token` :: Represents a token in the stream (with union-based payload) +- `enum cpp_token_fld_kind` :: Discriminates the active field in `cpp_token.val` +- `struct cpp_string` :: Raw string representation with length and pointer + +*** Macros & Identifiers +- `struct cpp_macro` :: Describes macro kind, parameter list, and token expansion +- `enum cpp_macro_kind` :: ISO-style, traditional-style, and assertion macros +- `struct cpp_identifier` :: Canonical and original spellings of a name +- `struct cpp_macro_arg` :: Argument number and spelling for macro arguments + +*** Symbol Table +- `struct cpp_hashnode` :: Hash table node for identifiers/macros +- `enum node_type` :: Distinguishes macro types (arg/user/builtin) +- `union _cpp_hashnode_value` :: Payload (macro, arg index, etc.) +- `enum cpp_builtin_type` :: Reserved built-ins like `__LINE__`, `__FILE__`, `_Pragma` + +*** Reader & Configuration +- `struct cpp_reader` :: Forward-declared. Central structure for preprocessing. 
+- `struct cpp_options` :: Stores all language mode flags, warning flags, and feature toggles. +- `struct cpp_callbacks` :: Client hook interface for diagnostic, macro, and file events. +- `struct cpp_dir` :: Represents an `#include` search directory. + +*** Numerics +- `struct cpp_num` :: Two-part 64-bit integer (high, low), overflow flags +- `cpp_classify_number` :: Categorizes radix/type (e.g., `0x`, `u`, `LL`) +- Defines :: `CPP_N_*` classify bits (INTEGER, FLOATING, WIDTH, RADIX, SUFFIX) + +*** Charset Handling +- `typedef cppchar_t` :: 32-bit safe character representation +- `struct cpp_decoded_char` :: Result of UTF-8 decoding step +- `struct cpp_char_column_policy` :: Visual column handling for diagnostics +- `class cpp_display_width_computation` :: Converts UTF-8 sequence to visual width + +*** Comment Tracking +- `struct cpp_comment`, `cpp_comment_table` :: Captures all parsed comments (if enabled) + +** Core Functions + +*** Lifecycle & Reader Setup +- `cpp_create_reader(enum c_lang, ...)` :: Allocates and initializes `cpp_reader` +- `cpp_finish`, `cpp_destroy` :: Finalize and free the reader +- `cpp_post_options` :: Commit option changes after parsing flags + +*** Preprocessing Input +- `cpp_read_main_file` :: Begin reading and preprocessing a source file +- `cpp_get_token()` :: Fetch next token from stream +- `cpp_peek_token()` :: Peek ahead without consuming +- `cpp_backup_tokens()` :: Push tokens back for re-parsing +- `cpp_retrofit_as_include()` :: Treat main file as if included + +*** Macro System +- `cpp_define()`, `cpp_define_unused()`, `cpp_define_lazily()` :: Define macros +- `cpp_macro_definition()` :: Dump macro body as string +- `cpp_compare_macros()` :: Deep compare two macros +- `cpp_undef()`, `cpp_undef_all()` :: Remove macro(s) +- `cpp_set_deferred_macro()`, `cpp_get_deferred_macro()` :: Lazy macro substitution + +*** Symbol Lookup +- `cpp_lookup()` :: Lookup or create an identifier hashnode +- `cpp_forall_identifiers()` :: Iterate over all identifiers + +*** String & Char Evaluation +- `cpp_interpret_charconst()` :: Parse a character constant (e.g. `'a'`) +- `cpp_interpret_string()` :: Parse string literal(s) into `cpp_string` +- `cpp_interpret_integer()` :: Parse numeric token into `cpp_num` + +*** Diagnostics +- `cpp_error()`, `cpp_warning()`, `cpp_pedwarning()` :: General messages +- `cpp_error_at()` :: Message with source location (rich_location optional) +- `cpp_errno()` / `cpp_errno_filename()` :: Errors based on `errno` +- `cpp_warning_with_line()` :: Fallback location-based warnings +- `cpp_get_callbacks()` / `cpp_set_callbacks()` :: Manage diagnostic hooks + +*** Extension Hooks & Pragma +- `cpp_register_pragma()` :: Register custom `#pragma` handler +- `cpp_get_callbacks()` :: Access to client-supplied hook table +- `cpp_define_formatted()` :: Macro with `printf`-style input +- `cpp_directive_only_process()` :: Run directive-only logic on a token stream + +*** Includes & File Management +- `cpp_set_include_chains()` :: Set system and user include paths +- `cpp_push_buffer()` :: Manually push a buffer for parsing +- `cpp_included()`, `cpp_included_before()` :: Has this file been included? +- `cpp_get_converted_source()` :: Read a file in input charset, return decoded buffer + +** Token Types (cpp_ttype) + +A full enumeration of all tokens in the preprocessor: +- Operators: `CPP_PLUS`, `CPP_MINUS`, `CPP_EQ_EQ`, etc. 
+- Punctuation: `CPP_OPEN_PAREN`, `CPP_HASH`, `CPP_SEMICOLON` +- Literals: `CPP_STRING`, `CPP_WCHAR`, `CPP_NUMBER` +- Special: `CPP_MACRO_ARG`, `CPP_PRAGMA`, `CPP_EOF` + +Each token has: +- Type (`enum cpp_ttype`) +- Flags (`PREV_WHITE`, `DIGRAPH`, `NO_EXPAND`, etc.) +- Source location +- Union payload (e.g., string, macro arg, hashnode) + +** Interface Concepts Beyond Code +*** Unicode Handling +- Input is normalized per `cpp_normalize_level` +- UTF-8 is expanded into 32-bit code points (`cppchar_t`) +- Display width of characters is estimated for diagnostics +- Bidi (bidirectional) controls are optionally scanned/warned + +*** Client Extension Hooks +- Most preprocessing operations (macro use, `#include`, comments, errors) are callback-hooked +- Used by GCC frontend to track macro use, implement diagnostics, and guide `#pragma` processing + +*** Dependency Generation +- `cpp_finish()` accepts an output stream for dependency info +- Options control whether main file is included, phony targets are added, etc. + +** Summary + +`cpplib.h` serves as both API contract and internal representation guide. +- It offers a high-fidelity view of source tokens for later compiler stages. +- The entire macro system, character encoding, and diagnostic lifecycle are managed through this interface. + + + + +* Callback Hooks (cpp_callbacks) + +The `cpp_callbacks` struct in `cpplib.h` allows external consumers (e.g., GCC frontend, IDE integrations, or plugins) to receive notifications during preprocessing. Each function pointer in this struct represents a hookable event. + +** Overview + +Hooks are triggered at specific stages: +- After macro definition or undefinition +- Before and after file inclusion +- When tokens are emitted +- Upon encountering diagnostics +- During comment scanning (if enabled) +- On encountering special directives (e.g., `#pragma`) + +** Hook Structure + +#+BEGIN_SRC c +struct cpp_callbacks { + void (*define)(cpp_reader *, source_location, const cpp_hashnode *); + void (*undef)(cpp_reader *, source_location, const cpp_hashnode *); + void (*include)(cpp_reader *, const char *filename, int angle_brackets); + void (*file_change)(cpp_reader *, const struct line_map *); + void (*line_change)(cpp_reader *, source_location, int to_file, int to_line); + void (*ident)(cpp_reader *, const cpp_string *); + void (*invalid_directive)(cpp_reader *); + void (*def_pragma)(cpp_reader *, const cpp_token *); + void (*cb_comment)(cpp_reader *, const cpp_token *); +}; +#+END_SRC + +Each callback receives either a pointer to the `cpp_reader`, the affected token or structure, and optional contextual data. + +--- + +** `define` + +*** Trigger +- Fired immediately after a macro is defined with `#define`. + +*** Parameters +- `cpp_reader *pfile`: global preprocessor state (read-write). +- `source_location loc`: location of the `#define`. +- `const cpp_hashnode *node`: the macro name and metadata (read-only in this context). + +*** Semantics +- The `cpp_hashnode` holds the macro's name and a pointer to its `cpp_macro` definition. +- Modifying the macro at this point is possible but discouraged. Use `cpp_undef()` + `cpp_define()` instead if redefinition is needed. + +*** Uses +- GCC uses this to update dependency tracking and debug tables. +- Tools may track macro definitions, emit logs, or enforce naming policies. + +--- + +** `undef` + +*** Trigger +- Fired after `#undef` removes a macro. + +*** Parameters +- Same as `define`. 
+ +*** Semantics +- The node is marked `undefined`, but the symbol remains in the hash table. +- No mutation should occur—only inspection or logging. + +*** Uses +- Enables reversal tracking or macro scoping analysis. + +--- + +** `include` + +*** Trigger +- Fired just before a file is opened via `#include`. + +*** Parameters +- `cpp_reader *pfile` +- `const char *filename`: string from the include directive (not normalized). +- `int angle_brackets`: nonzero for `<...>`, zero for `"..."`. + +*** Semantics +- Purely informational; does not affect include search or suppression. +- The filename is unverified and not guaranteed to exist. + +*** Uses +- IDEs and build tools use this to build include graphs. +- LSPs use it to track file references and symbol origins. + +--- + +** `file_change` + +*** Trigger +- Called when the active input file changes (entry or exit of `#include`). + +*** Parameters +- `cpp_reader *pfile` +- `const struct line_map *map`: describes the current file's location and context. + +*** Semantics +- `line_map` gives full access to file/line/column mapping. +- This structure is read-only; mutating it will corrupt diagnostics and tokenization. + +*** Uses +- Debug info (DWARF line tables), logging, stack-based include tracking. + +--- + +** `line_change` + +*** Trigger +- Fired on `#line` directives or line-mapping transitions. + +*** Parameters +- `cpp_reader *pfile` +- `source_location loc`: location in input stream. +- `int to_file`: non-zero if a new file name is being used. +- `int to_line`: new logical line number. + +*** Semantics +- Use this to remap locations or re-synchronize overlays. +- These values are inputs to the line map; do not write back. + +*** Uses +- Used in DWARF debug info to support accurate line-based breakpoints. + +--- + +** `ident` + +*** Trigger +- Called when a `#ident` directive is parsed. + +*** Parameters +- `cpp_reader *pfile` +- `const cpp_string *text`: payload of the identifier message. + +*** Semantics +- Informational only. Common in legacy systems or codegen traces. + +*** Uses +- Collect module identity, versioning hints, or logmarks. + +--- + +** `invalid_directive` + +*** Trigger +- Fired when an unrecognized or malformed directive is encountered. + +*** Parameters +- `cpp_reader *pfile` + +*** Semantics +- Hook has no extra context; use `cpp_get_token()` to recover. +- Hook may trigger fallback behavior or custom directive logic. + +*** Uses +- Used in `-fpreprocessed` mode to suppress diagnostics. +- External tools can use this to extend the directive set. + +--- + +** `def_pragma` + +*** Trigger +- Fired when a `#pragma` directive is parsed. + +*** Parameters +- `cpp_reader *pfile` +- `const cpp_token *pragma`: token stream beginning with `CPP_PRAGMA`. + +*** Semantics +- Read-only access to token stream. +- Mutation possible via `cpp_push_buffer()` to inject expanded tokens. + +*** Uses +- GCC plugins hook this to implement custom `#pragma` behavior. +- Can trigger front-end features (like `#pragma GCC diagnostic`). + +--- + +** `cb_comment` + +*** Trigger +- Optional. Enabled if comment tracking is requested. + +*** Parameters +- `cpp_reader *pfile` +- `const cpp_token *comment`: holds text of comment. + +*** Semantics +- Only line/block comment content is captured, not semantics. +- Read-only token; do not mutate token payload. + +*** Uses +- Used by source-to-source translators and formatters. +- Some static analyzers inspect comments for hints or disables. 
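+
+As a concrete illustration of the observational style described above, here is a minimal sketch of an `include` hook. It follows the simplified signature from the struct listing at the top of this section (the in-tree prototypes carry additional location and token parameters), and the handler name `log_include` is hypothetical.
+
+#+BEGIN_SRC c
+#include <stdio.h>
+#include "cpplib.h"   /* libcpp public header; adjust the include path to the build tree */
+
+/* Log every #include as it is seen.  Inspection only: the hook does
+   not touch the cpp_reader and does not alter include resolution. */
+static void
+log_include (cpp_reader *pfile, const char *filename, int angle_brackets)
+{
+  (void) pfile;  /* state is available here, but deliberately left untouched */
+  fprintf (stderr, "include %c%s%c\n",
+           angle_brackets ? '<' : '"',
+           filename,
+           angle_brackets ? '>' : '"');
+}
+
+/* Registration follows the pattern shown in the next section:
+   cpp_get_callbacks (reader)->include = log_include;             */
+#+END_SRC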
+ +--- + +** Summary + +The `cpp_callbacks` interface enables observational and limited transformational interaction with the preprocessor pipeline. + +- Most parameters are read-only or shallow copies. +- For transformations, prefer using `cpp_define()`, `cpp_push_buffer()`, or `cpp_backup_tokens()` externally. +- Internal structures like `cpp_reader`, `cpp_token`, and `cpp_macro` should not be mutated unless explicitly permitted. + + + +* Plugin-Like Integration in libcpp + +Unlike the main GCC compiler, which supports a formal plugin system (`gcc-plugin.h`), `libcpp` (the C preprocessor library) does *not* support plugins in the dynamic or runtime-loaded sense. There is no system for loading shared libraries, registering handlers via symbols, or extending preprocessor behavior through runtime modules. + +** Static Hook Interface via cpp_callbacks + +Instead, `libcpp` exposes a *statically defined interface* (`struct cpp_callbacks`) for embedding applications to receive notifications of preprocessor events. These include: + +- Macro definitions and undefinitions +- Source file entry/exit +- Comment and pragma parsing +- Token emission and buffer transitions + +An embedding client (such as GCC's C/C++ frontend, or a third-party tool using libcpp) may assign function pointers directly into this struct during reader setup. + +#+BEGIN_SRC c +cpp_reader *r = cpp_create_reader(...); +cpp_callbacks *cb = cpp_get_callbacks(r); +cb->macro_defined = my_macro_handler; +cb->file_change = my_file_tracker; +#+END_SRC + +This pattern is analogous to a *plugin interface*, but all logic is statically linked at compile time. + +** Mutability and Access Scope + +The callback interface is primarily **observational**—that is, hooks are expected to inspect events, not mutate the `cpp_reader` state directly. However, advanced users can, with care, reach into the data structures passed to them (e.g., `cpp_macro`, `cpp_hashnode`) and affect behavior, though this is neither documented nor officially supported. + +In summary: + +| Feature | GCC Frontend Plugin | libcpp Callback Interface | +|--------------------------+---------------------+----------------------------| +| Dynamically loadable | Yes | No | +| Runtime extension API | Yes (`gcc-plugin.h`) | No | +| Assign custom handlers | Yes | Yes (via `cpp_callbacks`) | +| Mutate core structures | With care | With care (not endorsed) | +| Stability across versions| Best-effort | Internal API, may break | + +** Recommendation + +Use `cpp_callbacks` as a read-only interface to monitor preprocessing behavior. If deeper mutation or instrumentation is required, consider modifying or forking `libcpp` itself. There is currently no officially supported way to extend it at runtime. diff --git a/document/how_it_works/cpp_reader.org b/document/how_it_works/cpp_reader.org new file mode 100644 index 0000000..bc87d15 --- /dev/null +++ b/document/how_it_works/cpp_reader.org @@ -0,0 +1,147 @@ +#+TITLE: cpp_reader: Preprocessor State and Interface Guide +#+AUTHOR: Caelestis Index +#+FILETAGS: cpp, GCC internals, preprocessor, architecture + +* Overview +The =cpp_reader= struct in GCC's =libcpp= encapsulates the complete state of a single C preprocessor session. It governs token input, macro expansion, directive parsing, include stack management, and source map resolution. It is the central state object passed through nearly all parts of the C preprocessor. + +* 1. 
State Data + +** 1.1 Buffer and Lexing State +- ~buffer~, ~overlaid_buffer~: Input buffer stack for file and macro streams. +- ~cur_token~, ~cur_run~, ~base_run~: Active token buffer and tokenrun tracking. +- ~keep_tokens~: Whether to preserve old tokens (e.g., for diagnostics). +- ~a_buff~, ~u_buff~, ~free_buffs~: Temporary memory allocation pools. + +** 1.2 Parsing and Directive State +- ~state~: General lexer state (includes ~in_directive~ flag). +- ~state.in_directive~: Boolean flag indicating whether the preprocessor is currently parsing a directive line. If ~true~, token behavior (e.g., whitespace and line continuation) may differ. +- ~directive~, ~directive_line~: Currently parsed directive and its location. +- ~directive_result~: Token synthesized by a directive (if any). + +** 1.3 Macro Context and Expansion +- ~context~, ~base_context~: Macro expansion call stack. +- ~top_most_macro_node~: Current top-level macro under expansion. +- ~about_to_expand_macro_p~: Indicates if a macro is about to expand. +- ~macro_buffer~, ~macro_buffer_len~: Buffers for rendering macro string forms. + +** 1.4 Include and File Lookup State +- ~quote_include~, ~bracket_include~, ~no_search_path~: Search paths. +- ~all_files~, ~main_file~: Linked list of all known input files. +- ~file_hash~, ~dir_hash~: Hashtables for file path caching. +- ~nonexistent_file_hash~: Optimizes negative lookup caching. +- ~seen_once_only~: Tracks ~#pragma once~ semantics. + +** 1.5 Character Set Conversion +- ~narrow_cset_desc~, ~utf8_cset_desc~, ~wide_cset_desc~, etc.: Converters for source to execution character encodings. + +** 1.6 Location Mapping and Source Positioning +- ~line_table~: GCC's =line_maps= structure for virtual location tracking. +- ~invocation_location~, ~main_loc~, ~forced_token_location~: Positional context for diagnostics, token creation. + +** 1.7 Miscellaneous Flags and Utilities +- ~quote_ignores_source_dir~: Include resolution behavior flag. +- ~counter~: Value of the ~__COUNTER__~ macro. +- ~out~: Output buffer for traditional preprocessing mode. +- ~savedstate~: Used for dependency tracking with precompiled headers. +- ~comments~: Optional comment capture buffer. + +* 2. Core Interface Functions +** 2.1 Token Retrieval +- ~cpp_get_token(pfile)~: Public interface for retrieving the next logical token. +- ~cpp_peek_token(pfile, N)~: Look ahead without consuming. +- ~cpp_get_token_1(pfile)~: Internal token fetch used during macro expansion. + +** 2.2 Macro Definition and Expansion +- ~_cpp_new_macro(pfile, cmk_macro, obstack_ptr)~: Allocate and initialize a new macro definition. +- ~_cpp_mark_macro_used(node)~: Mark a macro as having been used. +- ~replace_args(...)~: Expand and replace macro arguments (not used during directive handling). +- ~collect_args(...)~: Collects arguments for a function-like macro invocation. +- ~collect_single_argument(...)~: Parses one macro argument and handles token accumulation. +- ~cpp_arguments_ok(...)~: Checks argument count and matching for a macro invocation. +- ~set_arg_token(...)~: Sets or appends a token in an argument’s expansion list. + +** 2.3 Directive Handling Helpers +- ~_cpp_skip_rest_of_line(pfile)~: Skip trailing tokens after directive arguments. +- ~lex_macro_node(pfile)~: Specialized lexer for parsing macro names. + +** 2.4 File/Include Handling +- ~cpp_push_include(pfile, filename)~: Add a new include to the stack. +- ~cpp_find_include_file(...)~: Path search logic. 
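+
+A short sketch of how the include-handling and token-retrieval entry points above fit together is given below. It assumes a ~pfile~ already created with ~cpp_create_reader~ and configured (reader setup and error handling are omitted), and the forced header name is hypothetical.
+
+#+BEGIN_SRC c
+/* Push a header onto the include stack, as if the source had
+   #include'd it, then drain tokens until end of input. */
+if (!cpp_push_include (pfile, "rt_prelude.h"))   /* hypothetical header */
+  cpp_error (pfile, CPP_DL_ERROR, "unable to push forced include");
+
+for (;;)
+  {
+    const cpp_token *tok = cpp_get_token (pfile);
+    if (tok->type == CPP_EOF)
+      break;
+    /* hand tok to the consumer, e.g. a parser or a pretty-printer */
+  }
+#+END_SRC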
+
+** 2.5 Location Utilities
+- ~cpp_token_location(token)~: Extracts a =location_t= from a token.
+- ~linemap_add(...)~: Adds a mapping between logical and physical line/column.
+
+** 2.6 Miscellaneous
+- ~cpp_warning_with_line(...)~, ~cpp_error_with_line(...)~: Emit diagnostics with location.
+- ~cpp_lookup(pfile, name, length)~: Interns an identifier and returns a ~cpp_hashnode *~.
+- ~NODE_NAME(node)~: Expands to the null-terminated name of a macro node.
+
+* 3. Usage Examples
+
+** 3.1 Defining a Macro from a Directive
+#+BEGIN_SRC c
+cpp_hashnode *node = lex_macro_node(pfile);
+cpp_macro *macro = _cpp_new_macro(pfile, cmk_macro, _cpp_reserve_room(pfile, 0, sizeof(cpp_macro)));
+macro->count = 1;
+/* make_number_token is an illustrative helper, not a libcpp function; a real
+   implementation must construct the CPP_NUMBER token itself. */
+macro->exp.tokens[0] = make_number_token("42");
+node->type = NT_USER_MACRO;
+node->value.macro = macro;
+_cpp_mark_macro_used(node);
+#+END_SRC
+
+** 3.2 Parsing a Directive With Two Arguments
+#+BEGIN_SRC c
+const cpp_token *arg1 = cpp_get_token(pfile);
+const cpp_token *comma = cpp_get_token(pfile);
+if (comma->type != CPP_COMMA)
+  cpp_error(pfile, CPP_DL_ERROR, "expected ',' after macro name");
+const cpp_token *arg2 = cpp_get_token(pfile);
+_cpp_skip_rest_of_line(pfile);
+#+END_SRC
+
+** 3.3 Controlling Directive Context
+#+BEGIN_SRC c
+bool saved = pfile->state.in_directive;
+pfile->state.in_directive = false;
+assign_handler(pfile);
+pfile->state.in_directive = saved;
+#+END_SRC
+
+** 3.4 Tokenization and Location Debugging
+#+BEGIN_SRC c
+const cpp_token *tok = cpp_get_token(pfile);
+location_t loc = tok->src_loc;
+printf("token at line: %d\n", LOCATION_LINE(loc));
+#+END_SRC
+
+* 4. directives.cc extensions to the reader
+- ~lex_macro_node(pfile)~: Returns a ~cpp_hashnode *~ for the next identifier, used for directives like ~#define~ or custom ones like ~#assign~.
+- ~_cpp_skip_rest_of_line(pfile)~: Advances the token stream to the next physical line.
+- ~cpp_error_with_line(...)~, ~cpp_warning_with_line(...)~: Used for directive diagnostics.
+- ~cpp_lookup(pfile, name, length)~: Interns a name as a hashnode symbol.
+- ~cpp_reader->directive_result~: Used to push a synthesized token result into the stream (e.g., for ~#include_next~).
+- ~pfile->state.in_directive~: Must be manually toggled when directive code calls into macro infrastructure.
+* 5. macro.cc extensions to the reader
+
+** 5.1 collect_args(...)
+Accumulates macro arguments for a function-like macro. Reads and segments the input stream into a series of ~macro_arg~ entries, tracking nesting of parentheses and token boundaries.
+
+** 5.2 collect_single_argument(...)
+Parses and collects one macro argument, terminating on a comma or closing paren. Used internally by ~collect_args~, but can be called separately for single-argument macro handling.
+
+** 5.3 replace_args(...)
+Performs full substitution of macro arguments into the macro body. Handles token pasting (~##~), stringification (~#~), and recursive macro expansion.
+
+** 5.4 cpp_arguments_ok(...)
+Checks whether the number of provided arguments matches the macro’s parameter list. Validates ~paramc~ and variadic status.
+
+** 5.5 set_arg_token(...)
+Helper to insert or append a token into a ~macro_arg~. Used when building argument streams in ~collect_single_argument~.
+
+These routines enable fine-grained control over macro behavior and can be selectively reused to simulate macro expansion at directive time (e.g., ~#assign~, ~#bind~, or macro templating extensions).
+* 6. 
Conclusion +~cpp_reader~ is the heart of the preprocessor, acting as a unifying context for token streams, macro tables, buffer management, diagnostics, and parser state. Understanding and safely manipulating it is key to extending the preprocessor (e.g., adding new directives like ~#assign~) without destabilizing expansion or include logic. + +Use ~in_directive~, ~context~, and ~cur_token~ fields with care, and follow the established patterns in ~directives.cc~ and ~macro.cc~ to ensure consistent behavior across parse and expansion phases. diff --git a/document/how_it_works/lexing.org b/document/how_it_works/lexing.org new file mode 100644 index 0000000..a6bed25 --- /dev/null +++ b/document/how_it_works/lexing.org @@ -0,0 +1,230 @@ +#+TITLE: GCC libcpp Lexer: Structure, Usage, and Extension +#+AUTHOR: Caelus (OpenAI) and Thomas Walker Lynch +#+DATE: 2025-05-09 + +* Overview +The C preprocessor lexer (`lex.cc`) in GCC's `libcpp` is responsible for scanning raw source characters and emitting `cpp_token` structures. It is Unicode-aware, macro-sensitive, context-tracking, and supports multiple levels of token buffering. This lexer is both a general-purpose lexical analyzer and a specialized component for preprocessing. + +This document provides: +1. An architectural overview of how the lexer operates. +2. Guidance on how to interface with it (i.e., how to invoke, initialize, and consume it). +3. Examples demonstrating token flow and useful idioms. + +* 1. About the Lexer + +** 1.1 Services Provided +The lexer transforms a stream of characters into a stream of `cpp_token`s. It performs: +- UCN (Universal Character Name) expansion. +- Unicode normalization for identifiers. +- Detection of digraphs/trigraphs. +- Skipping of whitespace and comments. +- Classification into token types (`cpp_ttype`). +- Optional macro expansion (via higher-level coordination with macro subsystem). + +The function `_cpp_lex_token()` is the main entry point for lexing one token from the input stream. + +** 1.2 Token Types and Structures +Tokens are represented as `struct cpp_token`, which contains: +- `type`: token kind (from `cpp_ttype`) +- `val`: a union holding the value (e.g. number, string, identifier) +- `flags`: indicators such as `PREV_WHITE` or `DIGRAPH` +- `src_loc`: location for diagnostics +- `spelling`: optional cached spelling (may be recomputed) + +Auxiliary structures include: +- `cpp_hashnode`: interned identifiers and macro names +- `normalize_state`: for handling normalization and BiDi context +- `_cpp_buff`: dynamic buffers used for temporary token storage + +** 1.3 Unicode and Normalization +Lexer supports bidirectional Unicode enforcement using: +- `context`, `normalize_state`: track BiDi embeddings and UCN states +- `on_char`, `on_close`, `maybe_warn_bidi_on_close`: enforce structure + +** 1.4 Vectorized Fast Path +Several functions (e.g. `search_line_sse2`) accelerate scanning on x86 via SIMD. These are conditionally invoked from `search_line_fast` when alignment and CPU features allow. + +** 1.5 Token Buffers and Pools +Token buffers are managed using `_cpp_get_buff`, `_cpp_extend_buff`, `_cpp_commit_buff`, and `_cpp_release_buff`. These form a scratch/reuse pool and reduce allocations in macro processing or lexing multiple tokens rapidly. + +* 2. 
How to Use the Lexer API
+
+** 2.1 Initialization
+Before lexing, the preprocessor must initialize its state:
+
+#+begin_src c
+/* line_table names a line_maps instance owned by the embedding client, and
+   "input.c" is an illustrative file name.  The internal lexer state (token
+   runs, scratch buffers, the vectorized scanner) is set up by libcpp itself,
+   via _cpp_init_lexer and _cpp_init_tokenrun, when the reader is created. */
+cpp_reader *pfile = cpp_create_reader(CLK_GNUC89, NULL, line_table);
+cpp_read_main_file(pfile, "input.c");
+#+end_src
+
+** 2.2 Lexing Tokens
+To retrieve the next token:
+
+#+begin_src c
+const cpp_token *token = _cpp_lex_token(pfile);
+#+end_src
+
+For directive-specific parsing (no macro expansion):
+
+#+begin_src c
+cpp_token *token = _cpp_lex_direct(pfile);
+#+end_src
+
+** 2.3 Token Inspection
+Each token has type and value fields:
+
+#+begin_src c
+if (token->type == CPP_NUMBER) {
+  printf("Numeric token: %s\n", cpp_token_as_text(pfile, token));
+}
+#+end_src
+
+** 2.4 Identifier Handling
+Lex identifiers directly (e.g., for macro lookup):
+
+#+begin_src c
+cpp_hashnode *node = _cpp_lex_identifier(pfile);
+if (cpp_macro_p(node)) {
+  // Node is a macro
+}
+#+end_src
+
+** 2.5 Stringification and Output
+To spell a token or output lines:
+
+#+begin_src c
+unsigned char *text = cpp_token_as_text(pfile, token);
+cpp_output_token(token, stdout);
+#+end_src
+
+* 3. Examples and Advanced Use
+
+** 3.1 Simple Token Stream
+Lex a stream from input and print token types:
+
+#+begin_src c
+while (true) {
+  const cpp_token *tok = _cpp_lex_token(pfile);
+  if (tok->type == CPP_EOF)
+    break;
+  printf("Token: %s\n", cpp_type2name(tok->type, tok->flags));
+}
+#+end_src
+
+** 3.2 Peeking and Lookahead
+Use `cpp_peek_token` to look ahead:
+
+#+begin_src c
+const cpp_token *next = cpp_peek_token(pfile, 0);
+if (next->type == CPP_OPEN_PAREN)
+  printf("Function call?\n");
+#+end_src
+
+** 3.3 Handling Unicode Identifiers
+To support identifiers with UCNs:
+
+#+begin_src c
+cpp_hashnode *ident = _cpp_lex_identifier(pfile);
+const uchar *spell = _cpp_spell_ident_ucns(pfile, ident);
+printf("Normalized: %s\n", spell);
+#+end_src
+
+** 3.4 Example: Skipping Comments
+Use `_cpp_skip_block_comment` or `skip_line_comment`:
+
+#+begin_src c
+bool changed_line = _cpp_skip_block_comment(pfile);
+if (changed_line)
+  _cpp_clean_line(pfile);
+#+end_src
+
+** 3.5 Buffer Usage Examples
+
+*** 3.5.1 Allocate and Fill a Temporary Buffer
+Use `_cpp_get_buff` to allocate a scratch buffer. Always check and ensure space before writing. Then commit the buffer and retrieve its contents.
+
+#+begin_src c
+_cpp_buff *buff = _cpp_get_buff(pfile);
+size_t len = 5; // Number of bytes to write
+
+// Ensure buffer has enough room
+if ((size_t)(buff->limit - buff->cur) < len)
+  _cpp_extend_buff(pfile, &buff);
+
+// Write data safely
+memcpy(buff->cur, "hello", len);
+buff->cur += len;
+
+// Commit buffer and retrieve stable pointer
+unsigned char *data = (unsigned char *) _cpp_commit_buff(pfile, buff, len);
+printf("Buffer contents: %.*s\n", (int)len, data);
+#+end_src
+*** 3.5.2 Extend a Buffer Dynamically
+Extend a buffer when you exceed its original size.
+
+#+begin_src c
+_cpp_buff *buff = _cpp_get_buff(pfile);
+
+// Simulate a long write
+for (int i = 0; i < 300; ++i) {
+  if ((size_t)(buff->limit - buff->cur) < 1) {
+    _cpp_extend_buff(pfile, &buff);
+  }
+  *buff->cur++ = 'A';
+}
+
+unsigned char *text = (unsigned char *) _cpp_commit_buff(pfile, buff, 300);
+printf("Expanded buffer: %.*s\n", 10, text); // First 10 chars
+#+end_src
+
+*** 3.5.3 Use Buffers in Token Construction
+Construct a macro expansion or synthetic token string. 
+ +#+begin_src c +_cpp_buff *buff = _cpp_get_buff(pfile); +buff->cur = stpcpy((char *)buff->cur, "MY_MACRO("); +buff->cur = stpcpy((char *)buff->cur, "123 + 456"); +*buff->cur++ = ')'; + +unsigned char *macro_text = (unsigned char *) _cpp_commit_buff(pfile, buff, + buff->cur - buff->base); +printf("Token string: %s\n", macro_text); +#+end_src + +*** 3.5.4 Releasing a Buffer +After using a buffer temporarily (e.g., in lookahead), release it. + +#+begin_src c +_cpp_buff *buff = _cpp_get_buff(pfile); +// ... use the buffer ... +_cpp_release_buff(pfile, buff); +#+end_src + +*** 3.5.5 Commit and Reuse +After committing a buffer, you may allocate another for reuse: + +#+begin_src c +unsigned char *first = (unsigned char *) _cpp_commit_buff(pfile, buff, len); +_cpp_buff *next = _cpp_get_buff(pfile); +// next->base points to fresh or recycled memory +#+end_src + +* 4. Notes on Extension + +- You may insert a new directive (e.g., `#assign`) by defining it in `directives.cc` and adding handler logic in `macro.cc` or your own file. +- If you want to extend the lexer for new token kinds, you must: + - Add a new `cpp_ttype` enum value. + - Extend `_cpp_lex_token` or `lex_string` to recognize and classify it. + - Update `cpp_type2name` and spelling functions. + +* 5. Recommended Reading +- `libcpp/include/cpp-id-data.h`: For macro flags and token identifiers +- `libcpp/lex.cc`: Lexer core implementation +- `libcpp/directives.cc`: Directive parsing +- `libcpp/macro.cc`: Macro expansion +- `libcpp/line-map.cc`: Location tracking and diagnostics + + + + diff --git a/document/how_it_works/tool_chain_dependency_layers.org b/document/how_it_works/tool_chain_dependency_layers.org new file mode 100644 index 0000000..94c7a70 --- /dev/null +++ b/document/how_it_works/tool_chain_dependency_layers.org @@ -0,0 +1,78 @@ +#+TITLE: Toolchain Dependency Layers +#+AUTHOR: Thomas Walker Lynch +#+DATE: 2025-05-06 +#+OPTIONS: toc:nil num:nil +#+LANGUAGE: en + +* Purpose + +This document outlines the dependencies involved in building a standalone GCC toolchain. It compares two approaches: + +1. Using system-provided tools and headers to build GCC +2. Building a fully self-consistent standalone toolchain + +Understanding the bootstrap sequence is critical for modifying or reproducing GCC builds, especially when building in isolation. + +* The Story: Bootstrap Spiral + +So this programmer — he wanted to add a new directive to GCC. + +So he downloaded the GCC sources with the intent to make a modified, standalone copy. + +But to compile GCC, he needed standard C library headers — which meant downloading glibc. + +But to compile glibc, he needed a working C compiler. So he would first need a minimal GCC — stage 1. + +But to build that stage 1 GCC, he needed glibc headers. + +So he compiled the glibc headers first. + +Then he compiled stage 1 GCC. + +Then he compiled the full glibc. + +Then he compiled the full GCC. + +Ah, but to compile the glibc headers, he first needed the Linux kernel headers... + +There was an old lady who swallowed a fly. I don’t know why she... + +* Approach 1: System-Assisted Bootstrap + +This method uses the host system’s tools and headers to provide bootstrap support. It is simpler and faster, but not fully isolated. + +** Dependencies: + +- System-provided: + - C compiler (e.g. GCC) + - libc (headers and shared objects) + - binutils + - Linux kernel headers + +- Build Steps: + 1. Build binutils using system GCC + 2. 
Build GCC using system libc and headers
+
+** Characteristics:
+- Fast
+- Relies on host environment
+- Not self-contained
+
+** Use Case:
+- Building a local variant of GCC that will be used on the same system
+- Development where purity or relocatability isn’t required
+
+* Approach 2: Fully Self-Consistent Toolchain
+
+This method builds every component of the toolchain in a clean directory, using only upstream sources. It isolates the build from host interference.
+
+** Dependencies:
+
+- Linux kernel headers (must be provided up front)
+- Binutils source
+- Glibc source
+- GCC source
+
+** Build Sequence:
+
+1. Install Linux kernel headers → needed to build glibc
diff --git a/document/source/internal_h.org b/document/source/internal_h.org
new file mode 100644
index 0000000..97d10de
--- /dev/null
+++ b/document/source/internal_h.org
@@ -0,0 +1,227 @@
+#+TITLE: internal.h - Documentation Reference (Emacs Org Format)
+#+AUTHOR: Thomas Walker Lynch & Caelestis Index
+#+DESCRIPTION: Reference breakdown of types, macros, and helper declarations in GCC's libcpp/internal.h
+#+FILETAGS: cpp preprocessor gcc headers internal
+#+OPTIONS: toc:nil
+
+* Overview
+`internal.h` contains declarations for data structures, constants, and utility macros central to GCC's internal C preprocessor logic. It defines memory buffers, macro contexts, lexer state tracking, token kinds, character classes, and preprocessor infrastructure such as file buffers and include tracking.
+
+* Included Headers
+- `symtab.h`: Symbol table definitions used internally.
+- `cpplib.h`: Public CPP interfaces for tokens and readers.
+- `<iconv.h>`: Included conditionally (under `HAVE_ICONV`) for the iconv conversion API.
+
+* Core Data Structures
+** `_cpp_buff`
+A generic buffer with pointer markers. Used throughout macro processing and string/token accumulation.
+
+** `cpp_context`
+Represents the current token expansion context. May hold ISO macro token runs or traditional literal input. Tracks virtual locations if macro tracking is enabled.
+
+** `macro_context`
+Holds virtual locations and associated macro node. Used to support `-ftrack-macro-expansion`.
+
+** `cpp_reader`
+Global object managing state for a preprocessing run, including lexer state, file buffers, context stack, macro table, callbacks, charset converters, and diagnostics.
+
+** `cpp_buffer`
+Represents the input buffer of a file or command. Tracks physical and logical line positions, associated file, character set conversion, and line notes.
+
+** `lexer_state`
+Bitfield flags tracking parsing state, expansion behavior, and preprocessor conditionals.
+
+** `tokenrun`
+Represents a sequence of `cpp_token`s. Token runs are chained and form a circular buffer.
+
+** `spec_nodes`
+Holds special pre-defined nodes like `defined`, `true`, `__VA_ARGS__`, etc. Used by conditional expressions and macro substitution.
+
+** `def_pragma_macro`
+Stores push/pop state for macros affected by `#pragma push_macro` and `#pragma pop_macro`.
+
+* cpp_reader Structure
+The `cpp_reader` structure is the central object representing the full state of a preprocessor session in GCC's `libcpp`. It is passed to nearly every function across the subsystem and serves as the orchestration hub for lexing, macro expansion, buffer management, file inclusion, diagnostics, encoding conversion, and callback integration.
+
+** Purpose
+`cpp_reader` encapsulates:
+- The lexical stream and its position.
+- The active and historical context stack.
+- Preprocessor directives and include file tracking.
+- Memory and token buffer management. 
+- Charset encoding conversions.
+- Diagnostics and frontend callbacks.
+
+This makes it the definitive state carrier for a preprocessor run.
+
+** Usage in the Preprocessor
+
+The `cpp_reader` structure is instantiated once at the beginning of a preprocessing session via a function like `cpp_create_reader`. It is then initialized with options, encoding settings, and source input before being passed into most libcpp functions. It acts as the persistent environment for all operations and carries forward lexical position, macro state, memory buffers, and file context.
+
+Typical usage involves:
+
+1. **Initialization**:
+   - Create with `cpp_create_reader`.
+   - Configure options via `cpp_get_options`.
+   - Setup include paths and callbacks.
+   - Load source with `cpp_read_main_file`.
+
+2. **Tokenization Loop**:
+   - Repeatedly call `cpp_get_token(pfile)` to read tokens.
+   - Tokens are drawn from buffers, macro expansions, or virtual sources.
+   - `pfile->context` may be manipulated during macro expansions.
+
+3. **Directive and Macro Handling**:
+   - Functions like `_cpp_handle_directive`, `_cpp_create_definition`, or `_cpp_push_token_context` all mutate or inspect `pfile` to reflect state changes during preprocessing.
+
+4. **Finalization**:
+   - Clean up with `cpp_finish`, `cpp_destroy`, or related resource freeing logic.
+
+**Example** (simplified and partial):
+
+#+BEGIN_SRC c
+cpp_reader *pfile = cpp_create_reader(CLK_GNUC89, NULL, linemap);
+cpp_get_options(pfile)->lang = CLK_GNUC89;
+cpp_get_callbacks(pfile)->diagnostic = my_diagnostic_callback;
+cpp_read_main_file(pfile, "myheader.h");
+
+const cpp_token *tok;
+while ((tok = cpp_get_token(pfile))->type != CPP_EOF) {
+    // Process token
+}
+
+cpp_finish(pfile, NULL);  /* NULL: no dependency (-M) output stream */
+#+END_SRC
+
+** Member-by-Member Overview
+
+- `cpp_buffer *buffer` :: Current input buffer, holding the text being preprocessed.
+- `cpp_buffer *overlaid_buffer` :: A temporary buffer overlayed for special cases (e.g. `#include` insertions).
+- `struct lexer_state state` :: Tracks current directive state (e.g., inside `#define`), comment retention, and conditional skipping.
+- `class line_maps *line_table` :: Manages source line and file mapping for diagnostics and `__LINE__`/`__FILE__`.
+- `location_t directive_line` :: Source location of the last encountered directive.
+- `_cpp_buff *a_buff` :: Aligned buffer for allocations requiring native alignment (e.g., tokens).
+- `_cpp_buff *u_buff` :: Unaligned buffer for simpler memory allocations.
+- `_cpp_buff *free_buffs` :: Chain of reusable `_cpp_buff` structures.
+- `cpp_context base_context` :: The base (top-level) token context.
+- `cpp_context *context` :: Pointer to the current context on the expansion stack.
+- `const struct directive *directive` :: Active directive, if in one.
+- `cpp_token directive_result` :: The token result of directive evaluation.
+- `location_t invocation_location` :: Location of a macro's invocation, used for expansion diagnostics.
+- `cpp_hashnode *top_most_macro_node` :: Node of the macro currently being expanded at top level.
+- `bool about_to_expand_macro_p` :: True if a macro is queued for expansion.
+- `cpp_dir *quote_include` :: `#include "..."` search path.
+- `cpp_dir *bracket_include` :: `#include <...>` search path.
+- `cpp_dir no_search_path` :: A dummy path that disables search.
+- `_cpp_file *all_files` :: List of all known files encountered.
+- `_cpp_file *main_file` :: The initial input source file.
+- `htab *file_hash`, `htab *dir_hash` :: Hash tables for file and directory caching. 
+- `file_hash_entry_pool *file_hash_entries` :: Pool allocator for file hash entries. +- `htab *nonexistent_file_hash` :: Cache of known missing files (for fast rejection). +- `obstack nonexistent_file_ob` :: Memory store for missing file data. +- `bool quote_ignores_source_dir` :: Controls whether to skip the current file's directory when resolving `#include "..."`. +- `bool seen_once_only` :: True if any `#pragma once` or `#import` was used. +- `const cpp_hashnode *mi_cmacro`, `mi_ind_cmacro` :: Cached macro guards used for multiple-include optimization. +- `bool mi_valid` :: Whether the multiple-inclusion optimization is currently valid. +- `cpp_token *cur_token` :: The current token being read or expanded. +- `tokenrun base_run, *cur_run` :: Token run (buffer) chain for macro-expanded tokens. +- `unsigned int lookaheads` :: Number of lookahead tokens buffered. +- `unsigned int keep_tokens` :: Whether to retain tokens for re-use or reprocessing. +- `unsigned char *macro_buffer` :: Buffer holding macro definition text for diagnostics or display. +- `unsigned int macro_buffer_len` :: Length of `macro_buffer`. +- `cset_converter narrow_cset_desc` :: Converter from source charset to execution charset (e.g. UTF-8). +- `cset_converter utf8_cset_desc`, `char16_cset_desc`, `char32_cset_desc`, `wide_cset_desc` :: Charset converters for UTF and wide characters. +- `const unsigned char *date`, `*time` :: Cached date/time strings used for `__DATE__` and `__TIME__`. +- `time_t time_stamp` :: Internal timestamp used for `__TIMESTAMP__`. +- `int time_stamp_kind` :: Metadata on how timestamp was acquired. +- `cpp_token avoid_paste`, `endarg` :: Special tokens used for controlling macro pasting behavior and argument marking. +- `mkdeps *deps` :: Opaque pointer to dependency tracking system (used for `-M` options). +- `obstack hash_ob`, `buffer_ob` :: Obstack memory pools for hash nodes and buffers, respectively. +- `pragma_entry *pragmas` :: List of user-defined or built-in pragma handlers. +- `cpp_callbacks cb` :: Callback structure for emitting diagnostics or user-visible events. +- `ht *hash_table` :: Identifier hash table. +- `op *op_stack`, `*op_limit` :: Stack used for evaluating constant expressions (e.g., in `#if`). +- `cpp_options opts` :: Holds all preprocessor option settings (e.g. pedantic mode, line directives). +- `spec_nodes spec_nodes` :: Special identifiers (`__VA_ARGS__`, `defined`, etc.). +- `bool our_hashtable` :: Whether this instance owns the hash table memory. +- `out { base, limit, cur, first_line }` :: Traditional output buffer. +- `saved_cur`, `saved_rlimit`, `saved_line_base` :: Saved pointers for buffer overlays. +- `cpp_savedstate *savedstate` :: Saved state for precompiled header support. +- `unsigned int counter` :: Value of `__COUNTER__` macro. +- `cpp_comment_table comments` :: Stores comments if `save_comments` is enabled. +- `def_pragma_macro *pushed_macros` :: List of macros pushed via `#pragma push_macro`. +- `location_t forced_token_location` :: Override location used for the next emitted token. +- `location_t main_loc` :: Marker for the location of the main file’s first line. + +** Summary +`cpp_reader` is a highly stateful construct. It abstracts preprocessing into a cooperative sequence of stages: file loading, lexical analysis, macro handling, directive parsing, and token expansion. Each of these is enabled or modulated via member fields. 
The design permits reuse of storage buffers, incremental context stacking, and precise location tracking across deeply nested macro expansions and file inclusions. +* Enums and Constants +** `include_type` +Represents how a file was included (e.g., `#include`, `#import`, `-include`, etc.). Used to manage buffer overlays and inclusion depth. + +** `context_tokens_kind` +Distinguishes how tokens are held in a `cpp_context`: direct, indirect, or extended. + +** Alignment Helpers +- `DEFAULT_ALIGNMENT`: Derived from struct alignment. +- `CPP_ALIGN2`, `CPP_ALIGN`: Ensure proper memory alignment. + +** Character Class Macros +- `is_idchar`, `is_numchar`, `is_hspace`, `is_vspace`, etc.: Type-safe wrappers over libc ctype behavior with preprocessor-specific adjustments. + +* Buffers and Memory +- `_cpp_get_buff`, `_cpp_release_buff`, `_cpp_extend_buff`, `_cpp_aligned_alloc`, `_cpp_unaligned_alloc`: Allocate and manage working buffers used during expansion. + +* Token and Macro Helpers +- `_cpp_mark_macro_used`: Marks a macro as used for diagnostics. +- `CPP_OPTION`, `CPP_BUFFER`, `CPP_INCREMENT_LINE`: Common access macros for reader state and buffer internals. +- `SEEN_EOL()`: Helper to check if the last token was EOF. + +* Function Declarations by File +** From macro.cc +- `_cpp_create_definition`, `_cpp_new_macro`, `_cpp_notify_macro_use`, `_cpp_push_token_context`, etc.: Manage macro creation, expansion, and context. + +** From directives.cc +- `_cpp_define_builtin`, `_cpp_handle_directive`, `_cpp_do__Pragma`, etc.: Directive parsing and #pragma handlers. + +** From files.cc +- `_cpp_find_file`, `_cpp_stack_include`, `_cpp_pop_file_buffer`: File inclusion management and include guards. + +** From lex.cc +- `_cpp_lex_token`, `_cpp_temp_token`, `_cpp_equiv_tokens`: Token lexing and temporary token generation. + +** From expr.cc +- `_cpp_parse_expr`, `_cpp_expand_op_stack`: Expression parsing in `#if`/`#elif`. + +** From charset.cc +- `_cpp_valid_utf8`, `_cpp_convert_input`, `_cpp_destroy_iconv`: Character encoding conversion routines. + +** From init.cc +- `_cpp_restore_special_builtin`, `cpp_named_operator2name`: Initialization helpers for macro state. + +** From identifiers.cc +- `_cpp_init_hashtable`, `_cpp_destroy_hashtable`: Identifier table setup and teardown. + +* Encoding and Normalization +** `normalize_state` +Tracks normalization level and combining characters for UCN validation and identifier processing. + +** `cset_converter` +Holds state for iconv-based charset conversion. Used for input and output charset normalization. + +* Accessor Inline Functions +- `_cpp_in_system_header`, `_cpp_in_main_source_file`, `_cpp_defined_macro_p`: Context-sensitive accessors. +- `ustrcmp`, `ustrlen`, `uxstrdup`, `ufputs`, etc.: UTF-aware string handling routines. + +* Diagnostic Integration +** `encoding_rich_location` +Subclass of `rich_location` that forces encoding escape visibility for diagnostics. Constructed from `cpp_reader`. + +* Notes +- This file is not compiled standalone but included in many CPP components. +- It contains bridge-level API elements that link between token processing, buffer management, and frontend logic. +- Care must be taken when editing alignment or buffer routines as they affect all downstream expansion logic. + +* TODO +- Document how iconv fallback works when `HAVE_ICONV` is not defined. +- Clarify lifecycle of pushed macro contexts during nested `#pragma push_macro` chains. +- Integrate doc with `macro.cc` and `lex.cc` references for cross-module tracing. 
diff --git a/document/source/lex_cc.org b/document/source/lex_cc.org new file mode 100644 index 0000000..4f3c628 --- /dev/null +++ b/document/source/lex_cc.org @@ -0,0 +1,455 @@ +#+TITLE: lex.cc Detailed Structure and Function Index +#+Author: Caelus, code formalist (GPT-4, OpenAI), Thomas +#+Date:2025-05-09 + +* Data Structures Found in Non-Static Function Signatures +** struct context +Used in lexer or normalization stages to track state during token reclassification or Unicode normalization. + +** enum cpp_token_fld_kind +Enumeration describing the internal storage kind for a preprocessor token's value — distinguishes between identifiers, numbers, etc. + +** enum cpp_ttype +Enumeration of token types recognized by the preprocessor (e.g., identifiers, punctuators, literals, etc.). + +** struct lit_accum +Helper structure that accumulates string or character literal fragments during lexing. + +** struct normalize_state +Tracks intermediate state during Unicode normalization of identifiers or literals. + +** struct token_spelling +Structure used to store or compute the textual spelling of a token, including alternate representations (e.g., digraphs). +* Data Structures Shared Among Functions in lex.cc +** _cpp_buff +Used in: _cpp_aligned_alloc, _cpp_extend_buff, _cpp_free_buff, _cpp_get_buff, _cpp_release_buff, free, is_macro, new_buff, usage +Temporary token buffer used during macro argument collection and expansion. Shared to manage input buffering across stages. + +** context +Used in: _cpp_remaining_tokens_num_in_context, character, if, maybe_warn_bidi_on_close, on_char, rich_loc +State struct used in bidirectional text normalization and context-aware lexing. Functions reference it to apply UCN and bidi safety rules. + +** cpp_hashnode +Used in: cpp_error, if, is_macro, lex_identifier, lex_identifier_intern, line, linemap_included_from +Represents identifiers and macro definitions. Shared among symbol lookup, macro parsing, and token classification functions. + +** cpp_token +Used in: RESULT, _cpp_temp_token, cpp_directive_only_process, cpp_output_line_to_string, if, line, linemap_included_from, own, return +Token structure used to represent lexed entities passed between scanners, macro collectors, and diagnostic routines. + +** cpp_ttype +Used in: is_macro, lex_string, own, return +Enumeration of token types (e.g., identifiers, keywords, operators). Shared by scanners and type-check logic to interpret input. +* Non-Static Functions +** _cpp_aligned_alloc +- Signature: `unsigned char * _cpp_aligned_alloc (...)` +- Purpose: Allocates a buffer with alignment suitable for vectorized scanning operations (e.g., SSE, AVX). + +** _cpp_append_extend_buff +- Signature: `_cpp_buff * _cpp_append_extend_buff (...)` +- Purpose: Appends additional space to an existing token buffer, used when macro expansions exceed initial estimates. + +** _cpp_clean_line +- Signature: `void _cpp_clean_line (...)` +- Purpose: Cleans lexer line state after processing a complete logical line. + +** _cpp_commit_buff +- Signature: `void * _cpp_commit_buff (...)` +- Purpose: Finalizes a temporary token buffer and returns a stable pointer to the committed data. + +** _cpp_equiv_tokens +- Signature: `int _cpp_equiv_tokens (...)` +- Purpose: Determines whether two tokens are equivalent, ignoring cosmetic differences such as spacing. + +** _cpp_extend_buff +- Signature: `void _cpp_extend_buff (...)` +- Purpose: Increases the capacity of a token buffer to accommodate additional tokens during macro processing. 
+ +** _cpp_free_buff +- Signature: `void _cpp_free_buff (...)` +- Purpose: Releases memory allocated for a temporary or committed token buffer. + +** _cpp_get_buff +- Signature: `_cpp_buff * _cpp_get_buff (...)` +- Purpose: Returns a new or recycled token buffer from the internal pool, minimizing allocations. + +** _cpp_get_fresh_line +- Signature: `bool _cpp_get_fresh_line (...)` +- Purpose: Consumes input until a logical line is ready. Handles escaped newlines. + +** _cpp_init_lexer +- Signature: `void _cpp_init_lexer (...)` +- Purpose: Initializes the core lexer state: buffers, token rings, and diagnostic counters. + +** _cpp_init_tokenrun +- Signature: `void _cpp_init_tokenrun (...)` +- Purpose: Initializes a ring buffer or region for holding tokens during lexing. + +** _cpp_lex_direct +- Signature: `cpp_token * _cpp_lex_direct (...)` +- Purpose: Lexes a single token from the input without macro expansion — used for directive parsing. + +** _cpp_lex_identifier +- Signature: `cpp_hashnode * _cpp_lex_identifier (...)` +- Purpose: Lexes an identifier and returns a hashnode for it, performing UCN expansion and keyword recognition. + +** _cpp_lex_token +- Signature: `const cpp_token * _cpp_lex_token (...)` +- Purpose: Lexes the next token from the input stream, handling macro expansion and buffering. + +** _cpp_process_line_notes +- Signature: `void _cpp_process_line_notes (...)` +- Purpose: Handles mapping #line notes and diagnostic position metadata. + +** _cpp_release_buff +- Signature: `void _cpp_release_buff (...)` +- Purpose: Returns a previously used token buffer back to the internal pool for reuse. + +** _cpp_remaining_tokens_num_in_context +- Signature: `int _cpp_remaining_tokens_num_in_context (...)` +- Purpose: Returns how many tokens are left within the current lexing context. + +** _cpp_skip_block_comment +- Signature: `bool _cpp_skip_block_comment (...)` +- Purpose: Skips over block comments, optionally returning whether line state changed. + +** _cpp_spell_ident_ucns +- Signature: `unsigned char * _cpp_spell_ident_ucns (...)` +- Purpose: Generates a UTF-8 spelling for identifiers that contain Universal Character Names (UCNs). + +** _cpp_temp_token +- Signature: `cpp_token * _cpp_temp_token (...)` +- Purpose: Allocates space for a temporary token during parsing or lookahead. + +** _cpp_unaligned_alloc +- Signature: `unsigned char * _cpp_unaligned_alloc (...)` +- Purpose: Allocates unaligned memory for fallback lexers or comment scanning buffers. + +** cpp_alloc_token_string +- Signature: `const uchar * cpp_alloc_token_string (...)` +- Purpose: Allocates a fresh string buffer for a token's textual content, typically used in output or diagnostics. + +** cpp_avoid_paste +- Signature: `int cpp_avoid_paste (...)` +- Purpose: Determines whether a space is needed between two tokens to avoid unintended pasting. + +** cpp_force_token_locations +- Signature: `void cpp_force_token_locations (...)` +- Purpose: Forces the preprocessor to track source locations for all tokens, overriding lazy behavior. + +** cpp_get_comments +- Signature: `cpp_comment_table * cpp_get_comments (...)` +- Purpose: Returns a pointer to the internal comment table used for diagnostics or pretty-printing. + +** cpp_ideq +- Signature: `int cpp_ideq (...)` +- Purpose: Compares two identifiers for equality in a normalized preprocessor sense. + +** cpp_output_line +- Signature: `void cpp_output_line (...)` +- Purpose: Outputs an entire preprocessor line, including comments or tokens, to a file. 
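+
+As a brief aside, the output helpers above are typically used together when re-emitting a token stream. A minimal sketch, assuming `pfile`, `prev`, and `tok` come from the embedding client's own token loop:
+
+#+BEGIN_SRC c
+/* Print two adjacent tokens, adding a space only when writing them
+   back-to-back would lex as a different token (accidental pasting). */
+cpp_output_token(prev, stdout);
+if (cpp_avoid_paste(pfile, prev, tok))
+  putc(' ', stdout);
+cpp_output_token(tok, stdout);
+#+END_SRC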
+ +** cpp_output_line_to_string +- Signature: `unsigned char * cpp_output_line_to_string (...)` +- Purpose: Generates a string representation of a preprocessed line for diagnostics. + +** cpp_output_token +- Signature: `void cpp_output_token (...)` +- Purpose: Writes a token to an output stream, respecting spacing and formatting rules. + +** cpp_peek_token +- Signature: `const cpp_token * cpp_peek_token (...)` +- Purpose: Returns a pointer to the next token without consuming it. Used in lookahead. + +** cpp_spell_token +- Signature: `unsigned char * cpp_spell_token (...)` +- Purpose: Computes or reconstructs the text spelling of a token from internal data. + +** cpp_stop_forcing_token_locations +- Signature: `void cpp_stop_forcing_token_locations (...)` +- Purpose: Stops forcibly tracking token locations, restoring default behavior. + +** cpp_token_as_text +- Signature: `unsigned char * cpp_token_as_text (...)` +- Purpose: Converts a token into its textual representation (used for macro debug output or trace logs). + +** cpp_token_len +- Signature: `unsigned int cpp_token_len (...)` +- Purpose: Computes the length of a token for buffer management or output purposes. + +** cpp_token_val_index +- Signature: `enum cpp_token_fld_kind cpp_token_val_index (...)` +- Purpose: Returns the kind of value stored in the token (e.g., string, identifier, number). + +** cpp_type2name +- Signature: `const char * cpp_type2name (...)` +- Purpose: Maps internal token types (e.g., CPP_NUMBER) to human-readable strings like "number". + +** current_ctx +- Signature: `kind current_ctx (...)` +- Purpose: Returns the current Unicode bidirectional context (e.g., LTR, RTL) used during lexing. + +** current_ctx_loc +- Signature: `location_t current_ctx_loc (...)` +- Purpose: Returns the source location associated with the current bidi context — for diagnostics. + +** current_ctx_ucn_p +- Signature: `bool current_ctx_ucn_p (...)` +- Purpose: Returns whether the current Unicode context allows Universal Character Names (UCNs). + +** init_vectorized_lexer +- Signature: `define HAVE_init_vectorized_lexer 1 +static inline void init_vectorized_lexer (...)` +- Purpose: Initializes vectorized scanning function pointers depending on CPU features. + +** on_char +- Signature: `void on_char (...)` +- Purpose: Handles logic when a character is encountered that might affect bidirectional or normalization context. + +** on_close +- Signature: `void on_close (...)` +- Purpose: Called when a bidirectional context-closing token (e.g., PDF) is encountered. + +** pop +- Signature: `void pop (...)` +- Purpose: Pops the current normalization or bidi context off the internal context stack. + +** pop_kind_at +- Signature: `kind pop_kind_at (...)` +- Purpose: Returns the kind of context that would be popped at a given depth (used for lookahead). + +** read_char +- Signature: `char read_char (...)` +- Purpose: Reads a character from the input buffer, optionally applying normalization or escaping rules. + +** search_line_fast +- Signature: `ATTRIBUTE_NO_SANITIZE_UNDEFINED +static const uchar * search_line_fast (...)` +- Purpose: Fallback vectorized line scanner for supported architectures. Tries MMX, SSE, etc. + +** search_line_fast +- Signature: `define AARCH64_MIN_PAGE_SIZE 4096 + +static const uchar * search_line_fast (...)` +- Purpose: Fallback vectorized line scanner for supported architectures. Tries MMX, SSE, etc. 
+ +** search_line_mmx +- Signature: `endif search_line_mmx (...)` +- Purpose: Performs vectorized scanning of input using MMX instructions. + +** search_line_sse2 +- Signature: `endif search_line_sse2 (...)` +- Purpose: Performs fast input scanning using SSE2 instructions on aligned buffers. + +** search_line_sse42 +- Signature: `endif search_line_sse42 (...)` +- Purpose: Uses SSE4.2 instructions (e.g., `pcmpestri`) to scan for newline and comment sequences. +* File Scope Data Structures +- `CPP_TOKEN_FLD_ARG_NO` +- `CPP_TOKEN_FLD_NODE` +- `CPP_TOKEN_FLD_NONE` +- `CPP_TOKEN_FLD_PRAGMA` +- `CPP_TOKEN_FLD_SOURCE` +- `CPP_TOKEN_FLD_STR` +- `CPP_TOKEN_FLD_TOKEN_NO` +- `Foundation` +- `NULL` +- `SSE1` +- `WARRANTY` +- `a` +- `accum` +- `after_backslash` +- `all_upper` +- `alloced` +- `backup` +- `bad_string` +- `bol` +- `break` +- `buffer` +- `c` +- `category` +- `col` +- `cols` +- `combined_loc` +- `count` +- `data` +- `delim_len` +- `delimited_string` +- `dest` +- `dflt` +- `done` +- `done_comment` +- `done_string` +- `end` +- `end_loc` +- `end_offset` +- `eol` +- `esc` +- `extra_len` +- `f` +- `fallthrough_comment` +- `false` +- `found` +- `fresh_line` +- `hash` +- `header_count` +- `i` +- `impl` +- `import` +- `index` +- `is_block` +- `ix` +- `j` +- `l` +- `la` +- `len` +- `line_count` +- `loc` +- `m` +- `m_custom_label` +- `m_kind` +- `m_loc` +- `m_ucn` +- `magic` +- `mask` +- `maybe_number_start` +- `minimum` +- `misalign` +- `module_p` +- `n` +- `name` +- `new_buff` +- `next_line` +- `not_module` +- `nst` +- `num_bytes` +- `ok` +- `ones` +- `orig_line` +- `out` +- `p` +- `peek` +- `peek_R` +- `peek_u` +- `peek_u8` +- `peektok` +- `prefix_len` +- `program` +- `ptr` +- `quote_eight` +- `quote_first` +- `quote_peek` +- `raw` +- `read_note` +- `repl_bs` +- `repl_cr` +- `repl_nl` +- `repl_qm` +- `restart` +- `result` +- `ret` +- `room` +- `s` +- `saw_NUL` +- `search` +- `search_line_fast` +- `second_raw` +- `shift` +- `si` +- `size` +- `skipped_white` +- `slen` +- `sloc` +- `slow_path` +- `software` +- `spell_ident` +- `spelling` +- `src_loc` +- `src_range` +- `star` +- `start` +- `start_loc` +- `start_offset` +- `sv` +- `sz` +- `t` +- `terminator` +- `tok_range` +- `true` +- `type` +- `ucn_len` +- `ucn_len_c` +- `update_tokens_line` +- `utf32` +- `utf8_signifier` +- `utf8_start` +- `v` +- `want_number` +- `warn_bidi` +- `warn_bidi_p` +- `was` +- `word_type` +- `ws` +- `xmask` +- `zero` + +* Static Functions +- `void add_line_note (...)` +- `int skip_line_comment (...)` +- `void skip_whitespace (...)` +- `void lex_string (...)` +- `void save_comment (...)` +- `void store_comment (...)` +- `void create_literal (...)` +- `bool warn_in_comment (...)` +- `int name_p (...)` +- `void add_line_note (...)` +- `inline word_type acc_char_mask_misalign (...)` +- `inline word_type acc_char_replicate (...)` +- `inline word_type acc_char_cmp (...)` +- `inline int acc_char_index (...)` +- `const uchar * search_line_acc_char (...)` +- `const uchar * search_line_acc_char (...)` +- `const uchar * search_line_fast (...)` +- `const uchar * search_line_fast (...)` +- `bool warn_in_comment (...)` +- `location_t get_location_for_byte_range_in_cur_line (...)` +- `bidi::kind get_bidi_utf8_1 (...)` +- `bidi::kind get_bidi_utf8 (...)` +- `bidi::kind get_bidi_ucn_1 (...)` +- `bidi::kind get_bidi_ucn (...)` +- `void maybe_warn_bidi_on_close (...)` +- `void maybe_warn_bidi_on_char (...)` +- `int skip_line_comment (...)` +- `void skip_whitespace (...)` +- `int name_p (...)` +- `void warn_about_normalization (...)` +- `bool 
forms_identifier_p (...)` +- `void maybe_va_opt_error (...)` +- `cpp_hashnode * lex_identifier_intern (...)` +- `cpp_hashnode * lex_identifier (...)` +- `void lex_number (...)` +- `void create_literal (...)` +- `bool is_macro (...)` +- `bool is_macro_not_literal_suffix (...)` +- `void lex_raw_string (...)` +- `void lex_string (...)` +- `void store_comment (...)` +- `void save_comment (...)` +- `bool fallthrough_comment_p (...)` +- `tokenrun * next_tokenrun (...)` +- `const cpp_token* _cpp_token_from_context_at (...)` +- `void cpp_maybe_module_directive (...)` +- `size_t utf8_to_ucn (...)` +- `const unsigned char * cpp_digraph2name (...)` +- `_cpp_buff * new_buff (...)` +- `const unsigned char * do_peek_backslash (...)` +- `const unsigned char * do_peek_next (...)` +- `const unsigned char * do_peek_prev (...)` +- `const unsigned char * do_peek_ident (...)` +- `bool do_peek_module (...)` + + + + + diff --git a/document/source/libcpp_h.org b/document/source/libcpp_h.org new file mode 100644 index 0000000..e69de29 diff --git a/document/source/macro_cc.org b/document/source/macro_cc.org new file mode 100644 index 0000000..53d8b65 --- /dev/null +++ b/document/source/macro_cc.org @@ -0,0 +1,83 @@ +#+TITLE: macro.cc - Documentation Reference +#+AUTHOR: Thomas Walker Lynch & Caelestis Index +#+DESCRIPTION: High-level architectural partitioning of cpp (GCC 12.x) +#+FILETAGS: cpp preprocessor architecture gcc +#+OPTIONS: toc:nil + +* Overview +This file implements the core logic for macro parsing, macro definition, expansion, and deferred/lazy evaluation within the C preprocessor (CPP) subsystem in GCC's `libcpp`. It complements the infrastructure declared in `libcpp.h` and utilizes various helpers from supporting headers. + +* Included Headers and Their Purpose +- `config.h`: Compiler configuration macros. +- `system.h`: GCC-wide portability and utility macros. +- `intl.h`: Localization support. +- `cpplib.h`: Core interface for the C preprocessor. +- `internal.h`: Internal-only structures and definitions for the preprocessor. +- `macros.h`: Macro parsing and storage structures. +- `trad.h`: Traditional mode logic. +- `mkdeps.h`: Dependency output handling. +- `diagnostic-core.h`: Diagnostic emission interfaces. +- `cpp-id-data.h`: Identifier information, e.g. for argument naming. + +* Major Data Structures Used + +- `cpp_reader` (from `cpplib.h`): The global preprocessor context. +- `cpp_hashnode` (from `cpplib.h`): Represents identifiers, including macro definitions. +- `cpp_macro` (from `macros.h`): Stores a single macro definition, either traditional or ISO. +- `macro_arg` (from `macros.h`): Represents a single argument to a function-like macro. +- `macro_context` (internal): Used for tracking extended macro expansion location information. +- `_cpp_buff` (from `internal.h`): Temporary token or string storage buffer. +- `cpp_token` (from `cpplib.h`): Represents a preprocessor token. +- `cpp_string` (from `cpplib.h`): String-like wrapper for character sequences. + +* Functional Groups (Grouped per libcpp.h Theme) + +*** Token Context Management +- `_cpp_push_token_context`: Pushes a direct token sequence as context. +- `push_ptoken_context`: Pushes indirect token sequence. +- `push_extended_tokens_context`: Pushes context with virtual locations. +- `_cpp_pop_context`: Pops current macro or token context. + +*** Argument Expansion and Memory +- `expand_arg`: Expands a macro argument by recursively evaluating tokens. +- `alloc_expanded_arg_mem`: Allocates buffer space for an argument. 
+- `ensure_expanded_arg_room`: Doubles expansion buffer when needed. +- `set_arg_token` (external): Inserts expanded tokens into an argument. + +*** Macro Definition and Redefinition +- `_cpp_create_definition`: Top-level interface to create and store macro definition. +- `create_iso_definition`: Parses macro arguments and expansion tokens. +- `_cpp_save_parameter`: Saves a named parameter for a function-like macro. +- `_cpp_unsave_parameters`: Restores hashnodes after failed macro parse. +- `warn_of_redefinition`: Determines if a redefinition should trigger a warning. +- `cpp_compare_macros`: Compares two macros for semantic equality. + +*** Macro Instantiation and Lazy Expansion +- `get_deferred_or_lazy_macro`: Retrieves or forces realization of a deferred or lazy macro. +- `cpp_get_deferred_macro`: Resolves a deferred macro. +- `cpp_define_lazily`: Marks a macro for delayed definition. +- `_cpp_notify_macro_use`: Central notification hook that tracks macro use. + +*** Macro Definition Representation +- `cpp_macro_definition`: Renders a macro definition as a string. +- `cpp_macro_definition(pfile, node, macro)`: Core form with macro pointer. + +*** Lexing Helpers and Traditional Compatibility +- `lex_expansion_token`: Lexes one token in a macro body. +- `check_trad_stringification`: Warns if argument appears stringified in traditional C. +- `_cpp_new_macro`: Allocates and initializes a `cpp_macro`. + +* Integration with Other Subsystems +- Works closely with: `lex.c`, `directives.cc`, and `internal.c`. +- Interfaces with `linemap` for virtual location computation. +- Supports both ISO and traditional C macro handling. + +* Notes +- Token pasting (`##`) is carefully constrained per ISO rules. +- Parameter and macro use is tracked for diagnostics and DWARF output. +- Extra tokens such as padding and stringification markers carry encoded flags. + +* TODO +- Document edge cases and non-ISO behaviors (e.g., bare ellipsis). +- Link to relevant `libcpp.h` macro flags and diagnostic utilities. +- Cross-reference context expansion rules with `cpp_get_token_1`. diff --git a/document/source/macro_registration.org b/document/source/macro_registration.org new file mode 100644 index 0000000..56bf752 --- /dev/null +++ b/document/source/macro_registration.org @@ -0,0 +1,150 @@ +#+TITLE: Macro Symbol Registration in GCC 12 libcpp +#+AUTHOR: Caelestis Index +#+DESCRIPTION: Full lifecycle of defining and registering a macro in GCC's C preprocessor +#+OPTIONS: toc:nil +#+FILETAGS: gcc libcpp macro cpp_hashnode + +* Overview +This document explains the full lifecycle for defining a macro in GCC 12.x's =libcpp= preprocessor. It traces the required steps from token parsing through symbol table registration, highlighting where and how macro definitions become visible to the preprocessor engine. + +* 1. Obtaining the Macro Name’s Hash Node + +In =libcpp=, all identifiers — including macro names — are interned in a symbol table as =cpp_hashnode= entries. When the lexer emits a =CPP_NAME= token, it automatically fills: + +#+BEGIN_SRC c +token->val.node.node // type: cpp_hashnode * +#+END_SRC + +If the macro name comes from parsed input (e.g. `#assign` or `#define`), this node is already in the symbol table — no need to call =cpp_lookup= again. + +If you're defining a macro from a raw string (not a parsed token), you *would* use: + +#+BEGIN_SRC c +cpp_lookup(pfile ,name ,len); +#+END_SRC + +Note: =cpp_lookup= both interns new identifiers and retrieves existing ones. + +* 2. 
Creating and Populating a cpp_macro Object
+
+GCC uses a =cpp_macro= struct to hold the macro’s definition: number of parameters, replacement tokens, flags, etc.
+
+Allocation is done with:
+
+#+BEGIN_SRC c
+cpp_macro *macro = _cpp_new_macro(
+    pfile,
+    cmk_macro,
+    _cpp_reserve_room(pfile ,0 ,sizeof(cpp_macro))
+);
+#+END_SRC
+
+After that, populate its fields:
+
+#+BEGIN_SRC c
+macro->fun_like = 0;
+macro->paramc = 0;
+macro->variadic = 0;
+macro->count = 1;
+macro->used = 1;
+
+cpp_token *tok = &macro->exp.tokens[0];
+tok->type = CPP_NUMBER;
+tok->val.str.text = (const unsigned char *) "42";
+tok->val.str.len = 2;
+tok->flags = 0;
+#+END_SRC
+
+Note: These macros are obstack-allocated; you don't free them manually.
+
+* 3. Handling Redefinitions (Optional, but Expected)
+
+If the symbol already has a macro:
+
+#+BEGIN_SRC c
+if( cpp_macro_p(node) )
+  warn_of_redefinition(pfile ,node ,macro);
+#+END_SRC
+
+GCC allows redefinition only if the new macro is *identical*. If not, it issues a pedantic warning and overwrites the old definition.
+
+To remove the previous macro:
+
+#+BEGIN_SRC c
+_cpp_free_definition(node);
+#+END_SRC
+
+This clears the macro without deallocating it (obstack).
+
+* 4. Installing the Macro in the Symbol Table
+
+The macro is made active by assigning it to the symbol table:
+
+#+BEGIN_SRC c
+node->type = NT_USER_MACRO;
+node->value.macro = macro;
+#+END_SRC
+
+This effectively *registers* the macro for expansion.
+
+There is no separate "symbol table insertion" step — the hash node was already in the table.
+
+GCC may also set flags:
+
+- =NODE_WARN= → warn if redefining built-in
+- =NODE_CONDITIONAL= → cleared when explicitly defined
+
+* 5. Finalization Steps
+
+Some final steps after macro insertion:
+
+- Mark it used (optional):
+
+  #+BEGIN_SRC c
+  _cpp_mark_macro_used(node);
+  #+END_SRC
+
+- Emit a diagnostic:
+
+  #+BEGIN_SRC c
+  cpp_warning(pfile ,CPP_W_NONE ,"Assigned macro %s as 42" ,NODE_NAME(node));
+  #+END_SRC
+
+- Clear the =NODE_USED= flag to reset unused-macro warnings:
+
+  #+BEGIN_SRC c
+  node->flags &= ~NODE_USED;
+  #+END_SRC
+
+* Summary of Required Steps
+
+Here is the complete, valid sequence to register a macro manually:
+
+#+BEGIN_SRC c
+cpp_token *name_token = assign_name_argument(pfile);
+cpp_hashnode *node = name_token->val.node.node;
+
+cpp_macro *macro = _cpp_new_macro(...); // allocate and populate
+
+// fill token replacement list...
+macro->count = 1;
+macro->exp.tokens[0] = ...;
+
+node->type = NT_USER_MACRO;
+node->value.macro = macro;
+_cpp_mark_macro_used(node);
+#+END_SRC
+
+That is sufficient to define and register a macro. =NODE_NAME(node)= is useful for diagnostics, but not required for registration.
+
+* Notes
+
+- If you already have a =cpp_token= from parsing, the hash node is *already interned*.
+- Macros must be registered by setting =node->type= and =node->value.macro=.
+- Redefinitions are allowed only if semantically identical unless explicitly undefined.
+- No extra insertion or lookup step is needed unless building from raw text. 
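+
+As an optional sanity check (not required for registration), the freshly installed definition can be rendered back as text. A small sketch, assuming =pfile= and =node= are the ones used in the summary above:
+
+#+BEGIN_SRC c
+/* cpp_macro_definition returns the textual spelling of a user macro's
+   definition, which is handy when tracing newly registered macros. */
+if( cpp_user_macro_p(node) )
+  fprintf(stderr ,"registered: %s\n" ,(const char *) cpp_macro_definition(pfile ,node));
+#+END_SRC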
+ +* References +- =directives.cc= → =do_define= and redefinition checks +- =macro.cc= → =create_iso_definition= and macro assembly +- =cpplib.h= → =cpp_hashnode=, =cpp_macro=, enum flags diff --git a/document/terms/deployment scenarios.org b/document/terms/deployment scenarios.org new file mode 100644 index 0000000..c250c26 --- /dev/null +++ b/document/terms/deployment scenarios.org @@ -0,0 +1,49 @@ +#+TITLE: Compiler Deployment Scenarios +#+AUTHOR: Thomas Walker Lynch +#+DATE: 2025-05-06 +#+OPTIONS: toc:nil num:nil +#+LANGUAGE: en + +See `terms/machine and system roles' for the definitions of 'build', 'host', and 'target' system/machine. + +* Native Compilation: Build = Host = Target. + +In a native compilation, all three roles — the machine that builds GCC (`build`), the machine on which the resulting compiler will run (`host`), and the machine for which that compiler will generate code (`target`) — are the same. + +- Example 1: Alice is running Debian on an x86_64 machine. She builds GCC from source directly on that system. The compiler runs on her Debian machine and compiles code for that same environment. + In terms of machine categories: + `build = host = target = x86_64-pc-linux-gnu` + +- Example 2: A system administrator compiles GCC inside a Fedora x86_64 container. The resulting compiler is used within that same container to build programs for Fedora x86_64. + In terms of machine categories: + `build = host = target = x86_64-redhat-linux` + +* Cross Compilation: Build = Host ≠ Target. + +In a cross compilation, the compiler is built and will run on one type of system (`build = host`), but is used to produce binaries for a different kind of system (`target`). + +- Example 1: Bob is using an x86_64 Ubuntu machine to build a GCC toolchain that runs on Ubuntu but outputs code for a Raspberry Pi (ARM). + In terms of machine categories: + `build = host = x86_64-pc-linux-gnu`, + `target = arm-linux-gnueabihf` + +- Example 2: A developer builds a MIPS cross-compiler on an x86_64 workstation. The resulting GCC is used on that workstation to compile code for an embedded MIPS device. + In terms of machine categories: + `build = host = x86_64-pc-linux-gnu`, + `target = mipsel-linux-uclibc` + +* Canadian Cross: Build ≠ Host ≠ Target. + +A Canadian Cross involves three distinct machines. GCC is built on one system (`build`), but the resulting compiler is meant to run on a different system (`host`), and it will generate binaries for yet another system (`target`). + +- Example 1: A toolchain is built on an x86_64 Linux desktop (`build`). It produces a GCC that runs on PowerPC AIX (`host`) and generates code for ARM embedded systems (`target`). + In terms of machine categories: + `build = x86_64-pc-linux-gnu` + `host = powerpc-ibm-aix` + `target = arm-none-eabi` + +- Example 2: Charlie uses his Linux laptop to build a Windows-native GCC cross-compiler (`host = Windows`). This Windows-hosted compiler will produce binaries for Android ARM (`target`). 
+ In terms of machine categories: + `build = x86_64-pc-linux-gnu` + `host = x86_64-w64-mingw32` + `target = arm-linux-androideabi` diff --git a/document/terms/gcc directory terminology.org b/document/terms/gcc directory terminology.org new file mode 100644 index 0000000..545426a --- /dev/null +++ b/document/terms/gcc directory terminology.org @@ -0,0 +1,68 @@ + +#+TITLE: GCC Directory Terminology +#+AUTHOR: Thomas Walker Lynch +#+DATE: 2025-05-06 +#+OPTIONS: toc:nil num:nil +#+LANGUAGE: en + +* build directory (objdir) + +The directory where a build process takes place. + +For GCC this must be distinct from the source directory (srcdir). + +* current build + +Refers to the specific build being executed right now, as opposed to previously installed or referenced builds. + +Example: +- "The current build will output to `$BUILD_DIR/gcc`" +- "Avoid mixing headers from an old build with the current build" + +* current build target directory + +The destination directory into which the current build places its build products. + +This term combines the concepts of the `current build` and `target directory`, emphasizing that the output location is for the present compilation effort. + +Examples: +- If `make install DESTDIR=$SYSROOT`, then `$SYSROOT` is the current build target directory. +- For an isolated toolchain, `$TOOLCHAIN` might be the current build target directory + +* installation directory + +Any directory from which components are *intended to be used from*, whether built locally or installed from packages. A final resting place for binaries, scripts, libraries, headers, or anything else that gets 'used' as part of everyday work. + +In the GCC documents, an 'installation directory' is not limited to directories that the current build will put things into. In fact, sometimes it is used to refer to a directory the build will get things from. + +Examples include: +- System-wide locations like `/usr/bin` +- Custom toolchain roots like `$TOOLCHAIN/bin` + +* local-prefix + +Path where GCC searches for locally installed headers (like `/usr/local/include`). +This is not a build target directory, rather it is used for lookup during compilation. + +* prefix +The root path where GCC will install itself: +- Binaries to `$prefix/bin` +- Headers to `$prefix/include` +- Libraries to `$prefix/lib` + +* source directory (srcdir) +The directory where the original GCC (or binutils/glibc) source code resides. +Often deleted or archived after the build completes. +* sysroot +A directory that acts as a "fake root" (`/`) for headers and libraries. +Used in controlled builds or cross-compilation to isolate from the host system. +* target directory +The directory where build products are placed — typically the result of `make install`. + +This can be the same as `prefix`, or it may be a temporary staging area. It refers specifically to the destination for *this build’s* outputs, not a system path in general. + +Examples: +- `$TOOLCHAIN/bin` if that’s where you are installing +- `DESTDIR=/tmp/stage make install` — `/tmp/stage` is the target directory + + diff --git a/document/terms/glossary.org b/document/terms/glossary.org new file mode 100644 index 0000000..87c1cb5 --- /dev/null +++ b/document/terms/glossary.org @@ -0,0 +1,76 @@ +#+TITLE: RT_gcc Terminology Glossary +#+AUTHOR: Thomas Walker Lynch +#+DATE: 2025-05-06 +#+OPTIONS: toc:nil num:nil +#+LANGUAGE: en + +A glossary of commonly used terms and variables in the GNU toolchain documentation, +with emphasis on their meaning in the context of building GCC. 
+ +* General Terms + +** GCC +The GNU Compiler Collection — a suite of compilers for C, C++, Fortran, and other languages. + +** GNU +GNU's Not Unix — a free software project started by the Free Software Foundation (FSF). + +** FSF +The Free Software Foundation — original sponsor and maintainer of the GNU project and GCC. + +** toolchain +A set of programs used to build software, usually including a compiler (GCC), linker (ld), assembler (as), and C library (glibc or musl). + +** native build +A build where `build = host = target`. You are compiling GCC for the same system you're building it on. + +** cross-compiler +A compiler built on one machine (the build system), to run on another (the host), and generate code for yet another (the target). + +** bootstrap +The process of building GCC in multiple stages: +1. Build a minimal compiler (stage1) +2. Use it to build a full compiler (stage2) +3. Use the result to rebuild itself (stage3) and verify output is stable + + +* Build Triplets + +** build +The system on which the *build process itself* runs. + +** host +The system where the resulting GCC binary will *run*. + +** target +The system for which GCC will *generate code*. +Only meaningful when building a cross-compiler. + +* Build Options + +** --enable- +Turns on an optional feature at configure time. + +** --disable- +Explicitly disables an optional feature. + +** --with-=value +Sets a feature or path used by the build system (e.g., `--with-sysroot`, `--with-pkgversion`). + +** --with-pkgversion +Custom string shown in `gcc --version`, useful to identify modified builds. + +** --with-bugurl +Sets the URL shown in `gcc --version` for reporting bugs. + +** --enable-languages +Limits the frontends to be built (e.g., C, C++, Fortran). +Useful to speed up bootstrap or reduce size. + +** --disable-multilib +Disables building 32-bit and 64-bit variants together (relevant on x86_64 systems). + +* Components + +** binutils +A set of binary tools including `as` diff --git a/document/terms/machine and system roles.org b/document/terms/machine and system roles.org new file mode 100644 index 0000000..c2188ea --- /dev/null +++ b/document/terms/machine and system roles.org @@ -0,0 +1,57 @@ +#+TITLE: Terms - Machine and System Roles +#+AUTHOR: Thomas Walker Lynch +#+DATE: 2025-05-06 + +* Machine vs System + +In this context, "machine" and "system" are often used interchangeably, but they can carry slightly different connotations depending on the context in which they are used. + +** Machine + + - Machine typically refers to the physical hardware or the machine itself — a particular physical computing device, such as a laptop, server, or embedded device. + + - When we talk about the target machine, we are usually referring to the physical or virtual hardware on which the final compiled binaries will run. + +** System + + - System generally refers to the complete environment in which the software operates. This includes the hardware but also extends to the operating system (OS), libraries, and other software that makes up the environment in which the machine operates. + + - The target system refers not just to the physical hardware but to the entire environment where the program will execute, including the OS, libraries, etc. + +** Target Machine vs Target System + + - The target machine is indeed a specific instance of hardware (e.g., a Raspberry Pi or an ARM-based system). 
+
+ - The target system can also refer to the environment in which that machine operates, which includes not just the hardware (machine) but also the operating system and the software environment required to run the compiled binaries.
+
+In many contexts, the terms are used interchangeably, but there’s a subtle distinction:
+
+ - Target Machine emphasizes the hardware aspect.
+ - Target System emphasizes the complete environment, including hardware, OS, and libraries.
+
+** Examples
+
+ - Target Machine: "The target machine is an ARM Cortex-M processor." Here, the focus is on the physical hardware.
+ - Target System: "The target system is an ARM Cortex-M processor running a specific version of an embedded OS." Here, the focus is on both the hardware and the environment (OS, libraries, etc.).
+
+
+* System Roles
+
+** 'build'
+The system on which the GCC compiler **is being built**.
+
+- John downloads the GCC source code and compiles it on his x86_64 Debian laptop. The GCC compiler **is built on** this laptop, so the build system is Debian on x86_64.
+- Sarah uses her macOS machine to build a GCC compiler intended for use on a different system. Even though she won’t run it locally, the compiler **was constructed on macOS**, making macOS the build system.
+
+** 'host'
+The system where the **compiled GCC binary will run** — where you’ll actually invoke `gcc`.
+
+- Alice builds a GCC binary on Ubuntu, but installs and runs it on an Alpine Linux container. The host is Alpine Linux because that’s where the compiler binary will execute.
+- A CI pipeline builds GCC on an x86 Linux node but deploys the resulting `gcc` executable to a FreeBSD system, where users will compile programs. FreeBSD is the host.
+
+** 'target'
+The system for which the **compiled GCC binary will generate code** — where the output binaries are meant to run.
+
+- Mark compiles a version of GCC that runs on x86 Linux (host), but it produces binaries for ARM Cortex-M chips. The target is ARM Cortex-M.
+- A developer builds a GCC compiler on x86 that will run on macOS, and it generates code for RISC-V devices. The target is RISC-V — the eventual destination of the compiled programs.
+
diff --git a/document/why_version_12.org b/document/why_version_12.org
new file mode 100644
index 0000000..f2a8e38
--- /dev/null
+++ b/document/why_version_12.org
@@ -0,0 +1,38 @@
+#+TITLE: The Fictional Story of The GCC 15 Install
+#+AUTHOR: Thomas Walker Lynch
+#+DATE: 2025-05-06
+#+OPTIONS: toc:nil num:nil
+#+LANGUAGE: en
+
+Once upon a time, there was a programmer who wanted to add a new directive to the C preprocessor, cpp. cpp was a simple tool. It expanded macros and did basic text substitutions, nothing more. The programmer thought, “I’ll just modify cpp. That’s easy enough.”
+
+But he could not find cpp as a separate program. Rather, cpp is part of gcc, part and parcel. Well, he thought,
+gcc is a mature tool; after making mods to cpp, it will build merely by calling make.
+
+So he downloaded the latest gcc, gcc-16, but then discovered that there were no release branches. “It won’t be stable,” he thought, and backed off to gcc-15.
+
+Then, upon attempting to build gcc-15 as a baseline before making any mods, he discovered that the system glibc was not version compatible with the build.
+
+OK, simple matter, he thought, “I’ll download glibc too. It’s just part of the process.”
+
+But then, attempting to compile glibc, he encountered something he hadn’t expected. To compile glibc, he needed a version compatible C compiler, and the system compiler was not up to the task.
+
+“Ah, I remember,” he said, recalling that gcc can be built in two stages, with the
+first stage not requiring glibc, yet being sufficient to compile glibc.
+
+So he set to work on the stage 1 GCC, but as he got deeper, he realized that the stage 1 GCC required certain C runtime files — specifically crt1.o, crti.o, and crtn.o. “Ah, no problem,” he said. “I’ll just generate those.”
+
+The programmer found that the files could be generated using the glibc makefile, so he attempted that, but alas, headers were needed, and so was a version compatible binutils.
+
+So the programmer downloaded the Linux kernel headers and the binutils sources. Binutils could be built with the system gcc, so all was good.
+
+“Now, I’ve got the CRT files. Let’s move on,” he thought. With the CRT files in hand, he turned to compiling the glibc headers. This was simple enough, and once those were built, he could proceed with the stage 1 GCC.
+
+After completing the first stage of GCC, he thought, “Great. Now I can move on to the full glibc. It’s just the next step.”
+
+And so, he used his stage 1 GCC to compile the full glibc. It wasn’t a complex process, just a bit more time-consuming. But with glibc now fully built, the programmer could finally finish the GCC build. The full toolchain was nearly ready.
+
+With everything falling into place, the programmer turned to compiling the full GCC. And just like that, the toolchain was built. He linked everything together, a self-contained, fully functional GCC, ready for use.
+
+But you ask why this story is fiction? The answer is simple: the glibc makefile did not create the CRT files. Perhaps it simply does not do that; I do not know.
+
+So this is the story of how we ended up at version 12, which does compile with system tools.
+
diff --git "a/document\360\237\226\211/configure/RT metagdata.org" "b/document\360\237\226\211/configure/RT metagdata.org"
deleted file mode 100644
index bbed67a..0000000
--- "a/document\360\237\226\211/configure/RT metagdata.org"
+++ /dev/null
@@ -1,84 +0,0 @@
-#+TITLE: RT_gcc Build Branding and Documentation URLs
-#+AUTHOR: Thomas Walker Lynch
-#+DATE: 2025-05-06
-#+OPTIONS: toc:nil num:nil
-#+LANGUAGE: en
-
-This document outlines the recommended settings for GCC distributor metadata options when building the standalone ~RT_gcc~ compiler toolchain. These settings help clarify origin, version, and support paths for any distributed binaries based on this toolchain.
-
-* Purpose
-GCC offers metadata fields to help identify binary builds, especially when they include custom patches or are not compiled from the official FSF release tree. Since RT_gcc introduces features like `#assign` and modular build scripts, we recommend setting these fields clearly.
-
-* Recommended Configuration
-
-** --with-pkgversion
-:PROPERTIES:
-:default: GCC
-:controls: Label shown by `gcc --version` to identify the binary origin
-:END:
-
-** Recommended setting
-#+BEGIN_EXAMPLE
---with-pkgversion="RT_gcc standalone by Reasoning Technology"
-#+END_EXAMPLE
-
-This distinguishes RT_gcc binaries from FSF-provided GCC builds.
- -You can optionally append a build number or commit hash: -#+BEGIN_EXAMPLE ---with-pkgversion="RT_gcc standalone r1 (commit 9f3b123)" -#+END_EXAMPLE - -** --with-bugurl -:PROPERTIES: -:default: https://gcc.gnu.org/bugs/ -:controls: Where users are told to report bugs -:END: - -** Recommended setting -#+BEGIN_EXAMPLE ---with-bugurl="https://github.com/Thomas-Walker-Lynch/RT_gcc/issues" -#+END_EXAMPLE - -Use the GitHub Issues tracker unless you prefer an email or internal system. - -** --with-documentation-root-url -:PROPERTIES: -:default: https://gcc.gnu.org/onlinedocs/ -:controls: Where documentation links in error messages point -:END: - -** Recommended setting -#+BEGIN_EXAMPLE ---with-documentation-root-url="https://gcc.gnu.org/onlinedocs/" -#+END_EXAMPLE - -You may leave this as-is unless you plan to host modified docs yourself. - -** --with-changes-root-url -:PROPERTIES: -:default: https://gcc.gnu.org/ -:controls: Where version-specific release notes are linked -:END: - -** Recommended setting -#+BEGIN_EXAMPLE ---with-changes-root-url="https://gcc.gnu.org/" -#+END_EXAMPLE - -Alternatively, if you maintain a changelog of your patched builds: -#+BEGIN_EXAMPLE ---with-changes-root-url="https://github.com/Thomas-Walker-Lynch/RT_gcc/releases/" -#+END_EXAMPLE - -* Summary Table -#+LATEX_HEADER: \usepackage{booktabs} -#+ATTR_LATEX: :environment tabular :align lll -#+NAME: Distributor Option Summary -| Option | Purpose | Suggested Setting | -|------------------------------+-------------------------------------+--------------------------------------------------------------------| -| --with-pkgversion | Identify GCC build source | RT_gcc standalone by Reasoning Technology | -| --with-bugurl | Where users report issues | https://github.com/Thomas-Walker-Lynch/RT_gcc/issues | -| --with-documentation-root-url | Docs base for online help | https://gcc.gnu.org/onlinedocs/ | -| --with-changes-root-url | Changelog root | https://github.com/Thomas-Walker-Lynch/RT_gcc/releases/ | - diff --git "a/document\360\237\226\211/configure/directory settings.org" "b/document\360\237\226\211/configure/directory settings.org" deleted file mode 100644 index bde87a1..0000000 --- "a/document\360\237\226\211/configure/directory settings.org" +++ /dev/null @@ -1,156 +0,0 @@ -#+TITLE: GCC Configure Directory Settings -#+AUTHOR: Thomas Walker Lynch -#+DATE: 2025-05-06 -#+OPTIONS: toc:nil num:nil -#+LANGUAGE: en - -See `terms - machine roles` and `compiler deployment scenarios` for context on build environments. - -The one or more GCC programs used during the build, will by necessity be different from the GCC produced by the build. We call the GCC produced by the build the 'target GCC'. The GCC used to do the build is the `build GCC`. - -We talk about GCC compiling C programs. This is a simplification as target GCC compilers can be made for a number of languages. - -* Settings that customize the build process itself - -** --prefix $PREFIX - -This setting directly affects the build. - -Sets the build target directory to $PREFIX. This is where GCC build will write the build products: binaries, libraries, and include files. - -The GCC build does not read from this directory. There are other options for specifying where GCC reads libraries and header files. 
- -Default: /usr/local - -Examples: - -- If you want to install GCC into a custom toolchain directory called `/opt/gcc-12` use - `--prefix=/opt/gcc-12` - -- Thomas is building a standalone toolchain into `$TOOLCHAIN`, so he sets: - `--prefix=$TOOLCHAIN` - to ensure all artifacts are installed in that controlled location. - -For splitting apart the --prefix setting: - -*** --exec-prefix - -Sets the prefix for machine-specific files (binaries and libraries). If omitted, defaults to the value of `--prefix`. - -Rarely needed unless separating machine-independent files (docs, configs) from machine-dependent binaries. - -Default: Same as `--prefix` - -Example: - -- To split architecture-specific files into `/opt/gcc-12/x86_64`, you could use: - `--prefix=/opt/gcc-12` - `--exec-prefix=/opt/gcc-12/x86_64` - -The following standard autoconf options are supported. Normally you should not need to use these options. - -*** --bindir=dirname -Specify the installation directory for the executables called by users (such as gcc and g++). The default is exec-prefix/bin. - -*** --libdir=dirname -Specify the installation directory for object code libraries and internal data files of GCC. The default is exec-prefix/lib. - -*** --libexecdir=dirname -Specify the installation directory for internal executables of GCC. The default is exec-prefix/libexec. - -*** --with-slibdir=dirname -Specify the installation directory for the shared libgcc library. The default is libdir. - -*** --datarootdir=dirname -Specify the root of the directory tree for read-only architecture-independent data files referenced by GCC. The default is prefix/share. - -*** --infodir=dirname -Specify the installation directory for documentation in info format. The default is datarootdir/info. - -*** --datadir=dirname -Specify the installation directory for some architecture-independent data files referenced by GCC. The default is datarootdir. - -*** --docdir=dirname -Specify the installation directory for documentation files (other than Info) for GCC. The default is datarootdir/doc. - -*** --htmldir=dirname -Specify the installation directory for HTML documentation files. The default is docdir. - -*** --pdfdir=dirname -Specify the installation directory for PDF documentation files. The default is docdir. - -*** --mandir=dirname -Specify the installation directory for manual pages. The default is datarootdir/man. (Note that the manual pages are only extracts from the full GCC manuals, which are provided in Texinfo format. The manpages are derived by an automatic conversion process from parts of the full manual.) - -*** --with-gxx-include-dir=dirname -Specify the installation directory for G++ header files. The default depends on other configuration options, and differs between cross and native configurations. - -*** --with-specs=specs -Specify additional command line driver SPECS. This can be useful if you need to turn on a non-standard feature by default without modifying the compiler’s source code, for instance --with-specs=%{!fcommon:%{!fno-common:-fno-common}}. See “Spec Files” in the main manual - - -* Settings that affect the behavior of the target compiler - -** --with-local-prefix $INCDIR - -This setting affects the behavior of the target GCC, but apart from that, does not affect the build. - -The target GCC, i.e. the one that configure and make is producing, will presumably be used later by programmers to compile programs. When the target GCC is called upon to compile a C program, that C program might specify one or more include files. 
-
-By default, when the target GCC is invoked and the source file directs the inclusion of a system header file, the target GCC will first search the directory `/usr/include` for that header file, and if it does not find it, it will then search `/usr/local/include`.
-
-When `--with-local-prefix $INCDIR` is specified, the target GCC will first search `/usr/include`, as before; however, if it does not find the header file, it will instead search in `$INCDIR/include`. Thus `/usr/local/include` does not get searched in these first two steps.
-
-This feature is intended to support sites that use a different convention than `/usr/local` for the installation of site local header files.
-
-As for any GCC compile, additional include directories to be searched can be specified with the `-I` option given to the target GCC at the time it is invoked. The user can provide arbitrary values after the -I, though the target GCC will require read access to the specified directories.
-
-The document at https://gcc.gnu.org/install/configure.html describes this option with a potentially confusing first sentence: "Specify the installation directory for local include files." By 'installation directory' they do not mean the target directory where the target GCC is being put by the build process, unless by coincidence. Rather they mean the place where site local header files will be found in the future, when the target GCC is running and compiling source code.
-
-The site local directory can be removed from the include file search path by setting `--with-local-prefix` to `/dev/null`.
-
-- **Default**: `/usr/local`
-
-** --with-sysroot $SYSROOT
-
-This setting affects the behavior of the target GCC, but apart from that, does not affect the build.
-
-By default, when the target GCC is later invoked to compile a program, it will search absolute paths for libraries and include files, namely `/usr/include`, `/usr/lib` and `/usr/local/include`, for header and library files to read. This happens downstream, when the target GCC is invoked to do work for programmers.
-
-The `--with-sysroot $SYSROOT` option causes the library and include searches to instead be relative to $SYSROOT. The target GCC will then search `$SYSROOT/usr/include`, `$SYSROOT/usr/lib` and `$SYSROOT/usr/local/include` for header and library files to read.
-
-Various other options here can modify the defaults of where the target GCC will search for libraries and include files. When given `--with-sysroot $SYSROOT`, those default overrides will also be taken to be relative to `$SYSROOT`.
-
-The target GCC's built-in default include and library paths are adjusted to be relative to $SYSROOT. However, user-supplied -I and -L paths remain absolute unless written as relative paths.
-
-This setting is useful when cross compiling or when building an isolated GCC.
-
-- Default: (none)
-- Usage:
- During cross-compilation or isolated builds, you use this to point to a directory that mimics the target system's root.
-- Example: Thomas builds a version of GCC that is not supported by the system, so he installs all requisite runtime files to be used by this version of GCC into `$SYSROOT`, and at build time he passes to configure:
- `--with-sysroot=$SYSROOT`
-
-** --with-native-system-header-dir
-
-Specifies the directory where the target GCC will look for **native system headers** by default, when it is later invoked to compile programs.
- -The term *native* here refers to the **target system** that the built GCC is intended to compile programs for — whether that system is the same as the host (in a native build) or different (in a cross-compilation scenario). These headers typically come from the target's C library (like glibc or musl), and include standard files such as ``, ``, etc. - -This setting controls where those headers are expected to reside. It is especially useful when building an isolated toolchain or a cross-compiler, where the target's system headers do not follow the default `/usr/include` layout. - -When used in combination with `--with-sysroot`, this path is interpreted relative to the given sysroot. For example, if `--with-native-system-header-dir=/usr/include` and `--with-sysroot=$SYSROOT` are both provided, the target GCC will search for headers in `$SYSROOT/usr/include`. - -This option overrides the default path implied by `--with-local-prefix`, and affects only the behavior of the resulting target GCC — it does not influence how GCC itself is built. - -- **Default**: `/usr/include` - -- **Example**: Inside a musl-based sysroot located at `$SYSROOT`, system headers live in `$SYSROOT/usr/include`. Thomas configures GCC with: - #+begin_src bash - --with-sysroot=$SYSROOT - --with-native-system-header-dir=/usr/include - #+end_src - - This causes the resulting GCC to treat `/usr/include` as the location of system headers — *relative to* `$SYSROOT`. - - diff --git "a/document\360\237\226\211/how_it_works/cpp.org" "b/document\360\237\226\211/how_it_works/cpp.org" deleted file mode 100644 index b38c7ba..0000000 --- "a/document\360\237\226\211/how_it_works/cpp.org" +++ /dev/null @@ -1,503 +0,0 @@ -#+TITLE: C Preprocessor Overview -#+AUTHOR: Thomas Walker Lynch & Caelestis Index -#+DESCRIPTION: High-level architectural partitioning of cpp (GCC 12.x) -#+FILETAGS: cpp preprocessor architecture gcc -#+OPTIONS: toc:nil - -* Preprocessing Pipeline (Diagram) - -#+BEGIN_SRC text - C Preprocessor (cpp) - ===================== - -+----------------------+ -| Source Code | -+----------------------+ - | - v -+----------------------+ -| Lexical Analysis | <- Part of: Lexical Analysis -| (tokenize input) | -+----------------------+ - | - v -+----------------------+ -| Directive Engine | <- Part of: Directive Handling -| (#define, #if, etc.) | -+----------------------+ - | - v -+----------------------+ -| Conditional Logic | <- Part of: Conditional Compilation -| (#if/#ifdef/#else) | -+----------------------+ - | - v -+----------------------+ -| Macro Expansion | <- Part of: Macro Expansion -| (object/function) | -+----------------------+ - | - v -+----------------------+ -| Callback Hooks | <- Part of: Hook and Callback Interface -| (cpp_callbacks) | -+----------------------+ - | - v -+----------------------+ -| Output Tokens | <- Output stream to compiler frontend -| (to GCC parser) | -+----------------------+ -#+END_SRC - -Each block corresponds to a major processing stage in `cpp`. The functional groups defined earlier align to these blocks as indicated, though some (like state management and diagnostics) operate globally across the pipeline. - - - -* Major Functional Partitions of the C Preprocessor (cpp) - -This section outlines the primary architectural components of the C preprocessor as implemented in GCC 12.x. These functional partitions help frame how cpp processes input and how its internal modules interact. - -** 1. Lexical Analysis -- Tokenizes input into =cpp_token= streams. 
-- Decodes: - - UTF-8 characters - - Trigraphs (e.g., =??=) - - Digraphs (e.g., =<: = for =[=) -- Central structure: =cpp_lexer= -- Produces tokens for macro expansion and conditional evaluation. -** 2. Directive Handling -- Processes all =#= directives, including: - - =#define=, =#undef=, =#include=, =#line=, =#error=, =#pragma= - - Extended directives like =#assign=, =#call= if supported. -- Managed via =directive_table= and dispatch functions like =do_define=, =do_include=, etc. - -** 3. Conditional Compilation -- Handles constructs like: - - =#if=, =#ifdef=, =#ifndef=, =#elif=, =#else=, =#endif= -- Used to include or exclude code based on macro definitions and constant expressions. -- Driven by the =if_stack= in =cpp_reader=. -- Central to controlling variant builds, platform-specific code, or staged compilation. -** 4. File Inclusion and Search Paths -- Resolves =#include= and maintains include history. -- Handles: - - System vs user includes (<...> vs "..."). - - Include path resolution via =cpp_search_path=. - - File change tracking via =file_stack=. -** 5. Macro Expansion -- Handles object-like and function-like macros: - - =#define PI 3.14= - - =#define SQR(x) ((x)*(x))= -- Manages: - - Argument collection and expansion - - Token-pasting (=##=) and stringification (=#=) -- Involves =macro_table=, =collect_args=, and =expand_macro()= - -** 6. Diagnostics and Error Recovery -- Reports syntax errors, macro misuse, directive misuse. -- Uses: - - =cpp_error=, =cpp_warning=, =cpp_notice= - - Tracks macro nesting, input location, and file state for context. - -** 7. Hook and Callback Interface -- Interface: =cpp_callbacks= -- Allows frontend or plugin to observe: - - Macro definitions - - File changes - - Token output stream -- Enables debugging tools, IDEs, or language servers to integrate preprocessor awareness. - -** 8. State Management and Scoping -- Maintains global and file-level preprocessor state. -- Tracks: - - Nested conditional state via =if_stack= - - Macro table lifetimes and shadowing - - Include guards and =#pragma once= heuristics - - -* cpplib.h -- Application Interface Overview - -This section documents the **interface** and **in-memory model** of the C preprocessor (`libcpp`) from GCC 12.2.0. -It covers core data structures (tokens, macros, readers) and the primary functions for working with them. - -** Key Data Structures - -*** Token & Token Metadata -- `enum cpp_ttype` :: All possible token types (operators, names, literals, etc.) -- `struct cpp_token` :: Represents a token in the stream (with union-based payload) -- `enum cpp_token_fld_kind` :: Discriminates the active field in `cpp_token.val` -- `struct cpp_string` :: Raw string representation with length and pointer - -*** Macros & Identifiers -- `struct cpp_macro` :: Describes macro kind, parameter list, and token expansion -- `enum cpp_macro_kind` :: ISO-style, traditional-style, and assertion macros -- `struct cpp_identifier` :: Canonical and original spellings of a name -- `struct cpp_macro_arg` :: Argument number and spelling for macro arguments - -*** Symbol Table -- `struct cpp_hashnode` :: Hash table node for identifiers/macros -- `enum node_type` :: Distinguishes macro types (arg/user/builtin) -- `union _cpp_hashnode_value` :: Payload (macro, arg index, etc.) -- `enum cpp_builtin_type` :: Reserved built-ins like `__LINE__`, `__FILE__`, `_Pragma` - -*** Reader & Configuration -- `struct cpp_reader` :: Forward-declared. Central structure for preprocessing. 
-- `struct cpp_options` :: Stores all language mode flags, warning flags, and feature toggles. -- `struct cpp_callbacks` :: Client hook interface for diagnostic, macro, and file events. -- `struct cpp_dir` :: Represents an `#include` search directory. - -*** Numerics -- `struct cpp_num` :: Two-part 64-bit integer (high, low), overflow flags -- `cpp_classify_number` :: Categorizes radix/type (e.g., `0x`, `u`, `LL`) -- Defines :: `CPP_N_*` classify bits (INTEGER, FLOATING, WIDTH, RADIX, SUFFIX) - -*** Charset Handling -- `typedef cppchar_t` :: 32-bit safe character representation -- `struct cpp_decoded_char` :: Result of UTF-8 decoding step -- `struct cpp_char_column_policy` :: Visual column handling for diagnostics -- `class cpp_display_width_computation` :: Converts UTF-8 sequence to visual width - -*** Comment Tracking -- `struct cpp_comment`, `cpp_comment_table` :: Captures all parsed comments (if enabled) - -** Core Functions - -*** Lifecycle & Reader Setup -- `cpp_create_reader(enum c_lang, ...)` :: Allocates and initializes `cpp_reader` -- `cpp_finish`, `cpp_destroy` :: Finalize and free the reader -- `cpp_post_options` :: Commit option changes after parsing flags - -*** Preprocessing Input -- `cpp_read_main_file` :: Begin reading and preprocessing a source file -- `cpp_get_token()` :: Fetch next token from stream -- `cpp_peek_token()` :: Peek ahead without consuming -- `cpp_backup_tokens()` :: Push tokens back for re-parsing -- `cpp_retrofit_as_include()` :: Treat main file as if included - -*** Macro System -- `cpp_define()`, `cpp_define_unused()`, `cpp_define_lazily()` :: Define macros -- `cpp_macro_definition()` :: Dump macro body as string -- `cpp_compare_macros()` :: Deep compare two macros -- `cpp_undef()`, `cpp_undef_all()` :: Remove macro(s) -- `cpp_set_deferred_macro()`, `cpp_get_deferred_macro()` :: Lazy macro substitution - -*** Symbol Lookup -- `cpp_lookup()` :: Lookup or create an identifier hashnode -- `cpp_forall_identifiers()` :: Iterate over all identifiers - -*** String & Char Evaluation -- `cpp_interpret_charconst()` :: Parse a character constant (e.g. `'a'`) -- `cpp_interpret_string()` :: Parse string literal(s) into `cpp_string` -- `cpp_interpret_integer()` :: Parse numeric token into `cpp_num` - -*** Diagnostics -- `cpp_error()`, `cpp_warning()`, `cpp_pedwarning()` :: General messages -- `cpp_error_at()` :: Message with source location (rich_location optional) -- `cpp_errno()` / `cpp_errno_filename()` :: Errors based on `errno` -- `cpp_warning_with_line()` :: Fallback location-based warnings -- `cpp_get_callbacks()` / `cpp_set_callbacks()` :: Manage diagnostic hooks - -*** Extension Hooks & Pragma -- `cpp_register_pragma()` :: Register custom `#pragma` handler -- `cpp_get_callbacks()` :: Access to client-supplied hook table -- `cpp_define_formatted()` :: Macro with `printf`-style input -- `cpp_directive_only_process()` :: Run directive-only logic on a token stream - -*** Includes & File Management -- `cpp_set_include_chains()` :: Set system and user include paths -- `cpp_push_buffer()` :: Manually push a buffer for parsing -- `cpp_included()`, `cpp_included_before()` :: Has this file been included? -- `cpp_get_converted_source()` :: Read a file in input charset, return decoded buffer - -** Token Types (cpp_ttype) - -A full enumeration of all tokens in the preprocessor: -- Operators: `CPP_PLUS`, `CPP_MINUS`, `CPP_EQ_EQ`, etc. 
-- Punctuation: `CPP_OPEN_PAREN`, `CPP_HASH`, `CPP_SEMICOLON` -- Literals: `CPP_STRING`, `CPP_WCHAR`, `CPP_NUMBER` -- Special: `CPP_MACRO_ARG`, `CPP_PRAGMA`, `CPP_EOF` - -Each token has: -- Type (`enum cpp_ttype`) -- Flags (`PREV_WHITE`, `DIGRAPH`, `NO_EXPAND`, etc.) -- Source location -- Union payload (e.g., string, macro arg, hashnode) - -** Interface Concepts Beyond Code -*** Unicode Handling -- Input is normalized per `cpp_normalize_level` -- UTF-8 is expanded into 32-bit code points (`cppchar_t`) -- Display width of characters is estimated for diagnostics -- Bidi (bidirectional) controls are optionally scanned/warned - -*** Client Extension Hooks -- Most preprocessing operations (macro use, `#include`, comments, errors) are callback-hooked -- Used by GCC frontend to track macro use, implement diagnostics, and guide `#pragma` processing - -*** Dependency Generation -- `cpp_finish()` accepts an output stream for dependency info -- Options control whether main file is included, phony targets are added, etc. - -** Summary - -`cpplib.h` serves as both API contract and internal representation guide. -- It offers a high-fidelity view of source tokens for later compiler stages. -- The entire macro system, character encoding, and diagnostic lifecycle are managed through this interface. - - - - -* Callback Hooks (cpp_callbacks) - -The `cpp_callbacks` struct in `cpplib.h` allows external consumers (e.g., GCC frontend, IDE integrations, or plugins) to receive notifications during preprocessing. Each function pointer in this struct represents a hookable event. - -** Overview - -Hooks are triggered at specific stages: -- After macro definition or undefinition -- Before and after file inclusion -- When tokens are emitted -- Upon encountering diagnostics -- During comment scanning (if enabled) -- On encountering special directives (e.g., `#pragma`) - -** Hook Structure - -#+BEGIN_SRC c -struct cpp_callbacks { - void (*define)(cpp_reader *, source_location, const cpp_hashnode *); - void (*undef)(cpp_reader *, source_location, const cpp_hashnode *); - void (*include)(cpp_reader *, const char *filename, int angle_brackets); - void (*file_change)(cpp_reader *, const struct line_map *); - void (*line_change)(cpp_reader *, source_location, int to_file, int to_line); - void (*ident)(cpp_reader *, const cpp_string *); - void (*invalid_directive)(cpp_reader *); - void (*def_pragma)(cpp_reader *, const cpp_token *); - void (*cb_comment)(cpp_reader *, const cpp_token *); -}; -#+END_SRC - -Each callback receives either a pointer to the `cpp_reader`, the affected token or structure, and optional contextual data. - ---- - -** `define` - -*** Trigger -- Fired immediately after a macro is defined with `#define`. - -*** Parameters -- `cpp_reader *pfile`: global preprocessor state (read-write). -- `source_location loc`: location of the `#define`. -- `const cpp_hashnode *node`: the macro name and metadata (read-only in this context). - -*** Semantics -- The `cpp_hashnode` holds the macro's name and a pointer to its `cpp_macro` definition. -- Modifying the macro at this point is possible but discouraged. Use `cpp_undef()` + `cpp_define()` instead if redefinition is needed. - -*** Uses -- GCC uses this to update dependency tracking and debug tables. -- Tools may track macro definitions, emit logs, or enforce naming policies. - ---- - -** `undef` - -*** Trigger -- Fired after `#undef` removes a macro. - -*** Parameters -- Same as `define`. 
- -*** Semantics -- The node is marked `undefined`, but the symbol remains in the hash table. -- No mutation should occur—only inspection or logging. - -*** Uses -- Enables reversal tracking or macro scoping analysis. - ---- - -** `include` - -*** Trigger -- Fired just before a file is opened via `#include`. - -*** Parameters -- `cpp_reader *pfile` -- `const char *filename`: string from the include directive (not normalized). -- `int angle_brackets`: nonzero for `<...>`, zero for `"..."`. - -*** Semantics -- Purely informational; does not affect include search or suppression. -- The filename is unverified and not guaranteed to exist. - -*** Uses -- IDEs and build tools use this to build include graphs. -- LSPs use it to track file references and symbol origins. - ---- - -** `file_change` - -*** Trigger -- Called when the active input file changes (entry or exit of `#include`). - -*** Parameters -- `cpp_reader *pfile` -- `const struct line_map *map`: describes the current file's location and context. - -*** Semantics -- `line_map` gives full access to file/line/column mapping. -- This structure is read-only; mutating it will corrupt diagnostics and tokenization. - -*** Uses -- Debug info (DWARF line tables), logging, stack-based include tracking. - ---- - -** `line_change` - -*** Trigger -- Fired on `#line` directives or line-mapping transitions. - -*** Parameters -- `cpp_reader *pfile` -- `source_location loc`: location in input stream. -- `int to_file`: non-zero if a new file name is being used. -- `int to_line`: new logical line number. - -*** Semantics -- Use this to remap locations or re-synchronize overlays. -- These values are inputs to the line map; do not write back. - -*** Uses -- Used in DWARF debug info to support accurate line-based breakpoints. - ---- - -** `ident` - -*** Trigger -- Called when a `#ident` directive is parsed. - -*** Parameters -- `cpp_reader *pfile` -- `const cpp_string *text`: payload of the identifier message. - -*** Semantics -- Informational only. Common in legacy systems or codegen traces. - -*** Uses -- Collect module identity, versioning hints, or logmarks. - ---- - -** `invalid_directive` - -*** Trigger -- Fired when an unrecognized or malformed directive is encountered. - -*** Parameters -- `cpp_reader *pfile` - -*** Semantics -- Hook has no extra context; use `cpp_get_token()` to recover. -- Hook may trigger fallback behavior or custom directive logic. - -*** Uses -- Used in `-fpreprocessed` mode to suppress diagnostics. -- External tools can use this to extend the directive set. - ---- - -** `def_pragma` - -*** Trigger -- Fired when a `#pragma` directive is parsed. - -*** Parameters -- `cpp_reader *pfile` -- `const cpp_token *pragma`: token stream beginning with `CPP_PRAGMA`. - -*** Semantics -- Read-only access to token stream. -- Mutation possible via `cpp_push_buffer()` to inject expanded tokens. - -*** Uses -- GCC plugins hook this to implement custom `#pragma` behavior. -- Can trigger front-end features (like `#pragma GCC diagnostic`). - ---- - -** `cb_comment` - -*** Trigger -- Optional. Enabled if comment tracking is requested. - -*** Parameters -- `cpp_reader *pfile` -- `const cpp_token *comment`: holds text of comment. - -*** Semantics -- Only line/block comment content is captured, not semantics. -- Read-only token; do not mutate token payload. - -*** Uses -- Used by source-to-source translators and formatters. -- Some static analyzers inspect comments for hints or disables. 
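-
-Pulling these hooks together, the following is a minimal sketch of a client that logs every macro definition and undefinition. It is not code from GCC itself: the field names and parameter types follow the `cpp_callbacks` struct listed above (a given GCC release may spell the location type differently), and the handler names and log format are hypothetical.
-
-#+BEGIN_SRC c
-#include <stdio.h>
-#include "cpplib.h"  /* cpp_reader, cpp_hashnode, cpp_callbacks, NODE_NAME */
-
-/* Sketch: log each #define as it is seen.  The hook fires after the
-   definition has been recorded, so the node already refers to the macro. */
-static void
-log_define (cpp_reader *pfile, source_location loc, const cpp_hashnode *node)
-{
-  (void) pfile; (void) loc;
-  fprintf (stderr, "defined: %s\n", (const char *) NODE_NAME (node));
-}
-
-/* Sketch: log each #undef.  As noted above, the node stays in the hash
-   table; only its definition is removed. */
-static void
-log_undef (cpp_reader *pfile, source_location loc, const cpp_hashnode *node)
-{
-  (void) pfile; (void) loc;
-  fprintf (stderr, "undefined: %s\n", (const char *) NODE_NAME (node));
-}
-
-/* Install the hooks on an already-created reader. */
-static void
-install_macro_logging (cpp_reader *pfile)
-{
-  cpp_callbacks *cb = cpp_get_callbacks (pfile);
-  cb->define = log_define;
-  cb->undef  = log_undef;
-}
-#+END_SRC
-
-Installed this way, the hooks observe every definition without altering expansion behavior, in line with the read-only guidance in the summary below.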
-
----
-
-** Summary
-
-The `cpp_callbacks` interface enables observational and limited transformational interaction with the preprocessor pipeline.
-
-- Most parameters are read-only or shallow copies.
-- For transformations, prefer using `cpp_define()`, `cpp_push_buffer()`, or `cpp_backup_tokens()` externally.
-- Internal structures like `cpp_reader`, `cpp_token`, and `cpp_macro` should not be mutated unless explicitly permitted.
-
-
-
-* Plugin-Like Integration in libcpp
-
-Unlike the main GCC compiler, which supports a formal plugin system (`gcc-plugin.h`), `libcpp` (the C preprocessor library) does *not* support plugins in the dynamic or runtime-loaded sense. There is no system for loading shared libraries, registering handlers via symbols, or extending preprocessor behavior through runtime modules.
-
-** Static Hook Interface via cpp_callbacks
-
-Instead, `libcpp` exposes a *statically defined interface* (`struct cpp_callbacks`) for embedding applications to receive notifications of preprocessor events. These include:
-
-- Macro definitions and undefinitions
-- Source file entry/exit
-- Comment and pragma parsing
-- Token emission and buffer transitions
-
-An embedding client (such as GCC's C/C++ frontend, or a third-party tool using libcpp) may assign function pointers directly into this struct during reader setup.
-
-#+BEGIN_SRC c
-cpp_reader *r = cpp_create_reader(...);
-cpp_callbacks *cb = cpp_get_callbacks(r);
-cb->define = my_macro_handler;
-cb->file_change = my_file_tracker;
-#+END_SRC
-
-This pattern is analogous to a *plugin interface*, but all logic is statically linked at compile time.
-
-** Mutability and Access Scope
-
-The callback interface is primarily **observational**—that is, hooks are expected to inspect events, not mutate the `cpp_reader` state directly. However, advanced users can, with care, reach into the data structures passed to them (e.g., `cpp_macro`, `cpp_hashnode`) and affect behavior, though this is neither documented nor officially supported.
-
-In summary:
-
-| Feature                   | GCC Frontend Plugin  | libcpp Callback Interface |
-|---------------------------+----------------------+---------------------------|
-| Dynamically loadable      | Yes                  | No                        |
-| Runtime extension API     | Yes (`gcc-plugin.h`) | No                        |
-| Assign custom handlers    | Yes                  | Yes (via `cpp_callbacks`) |
-| Mutate core structures    | With care            | With care (not endorsed)  |
-| Stability across versions | Best-effort          | Internal API, may break   |
-
-** Recommendation
-
-Use `cpp_callbacks` as a read-only interface to monitor preprocessing behavior. If deeper mutation or instrumentation is required, consider modifying or forking `libcpp` itself. There is currently no officially supported way to extend it at runtime.
diff --git "a/document\360\237\226\211/how_it_works/cpp_reader.org" "b/document\360\237\226\211/how_it_works/cpp_reader.org"
deleted file mode 100644
index bc87d15..0000000
--- "a/document\360\237\226\211/how_it_works/cpp_reader.org"
+++ /dev/null
@@ -1,147 +0,0 @@
-#+TITLE: cpp_reader: Preprocessor State and Interface Guide
-#+AUTHOR: Caelestis Index
-#+FILETAGS: cpp, GCC internals, preprocessor, architecture
-
-* Overview
-The =cpp_reader= struct in GCC's =libcpp= encapsulates the complete state of a single C preprocessor session. It governs token input, macro expansion, directive parsing, include stack management, and source map resolution. It is the central state object passed through nearly all parts of the C preprocessor.
-
-* 1.
State Data - -** 1.1 Buffer and Lexing State -- ~buffer~, ~overlaid_buffer~: Input buffer stack for file and macro streams. -- ~cur_token~, ~cur_run~, ~base_run~: Active token buffer and tokenrun tracking. -- ~keep_tokens~: Whether to preserve old tokens (e.g., for diagnostics). -- ~a_buff~, ~u_buff~, ~free_buffs~: Temporary memory allocation pools. - -** 1.2 Parsing and Directive State -- ~state~: General lexer state (includes ~in_directive~ flag). -- ~state.in_directive~: Boolean flag indicating whether the preprocessor is currently parsing a directive line. If ~true~, token behavior (e.g., whitespace and line continuation) may differ. -- ~directive~, ~directive_line~: Currently parsed directive and its location. -- ~directive_result~: Token synthesized by a directive (if any). - -** 1.3 Macro Context and Expansion -- ~context~, ~base_context~: Macro expansion call stack. -- ~top_most_macro_node~: Current top-level macro under expansion. -- ~about_to_expand_macro_p~: Indicates if a macro is about to expand. -- ~macro_buffer~, ~macro_buffer_len~: Buffers for rendering macro string forms. - -** 1.4 Include and File Lookup State -- ~quote_include~, ~bracket_include~, ~no_search_path~: Search paths. -- ~all_files~, ~main_file~: Linked list of all known input files. -- ~file_hash~, ~dir_hash~: Hashtables for file path caching. -- ~nonexistent_file_hash~: Optimizes negative lookup caching. -- ~seen_once_only~: Tracks ~#pragma once~ semantics. - -** 1.5 Character Set Conversion -- ~narrow_cset_desc~, ~utf8_cset_desc~, ~wide_cset_desc~, etc.: Converters for source to execution character encodings. - -** 1.6 Location Mapping and Source Positioning -- ~line_table~: GCC's =line_maps= structure for virtual location tracking. -- ~invocation_location~, ~main_loc~, ~forced_token_location~: Positional context for diagnostics, token creation. - -** 1.7 Miscellaneous Flags and Utilities -- ~quote_ignores_source_dir~: Include resolution behavior flag. -- ~counter~: Value of the ~__COUNTER__~ macro. -- ~out~: Output buffer for traditional preprocessing mode. -- ~savedstate~: Used for dependency tracking with precompiled headers. -- ~comments~: Optional comment capture buffer. - -* 2. Core Interface Functions -** 2.1 Token Retrieval -- ~cpp_get_token(pfile)~: Public interface for retrieving the next logical token. -- ~cpp_peek_token(pfile, N)~: Look ahead without consuming. -- ~cpp_get_token_1(pfile)~: Internal token fetch used during macro expansion. - -** 2.2 Macro Definition and Expansion -- ~_cpp_new_macro(pfile, cmk_macro, obstack_ptr)~: Allocate and initialize a new macro definition. -- ~_cpp_mark_macro_used(node)~: Mark a macro as having been used. -- ~replace_args(...)~: Expand and replace macro arguments (not used during directive handling). -- ~collect_args(...)~: Collects arguments for a function-like macro invocation. -- ~collect_single_argument(...)~: Parses one macro argument and handles token accumulation. -- ~cpp_arguments_ok(...)~: Checks argument count and matching for a macro invocation. -- ~set_arg_token(...)~: Sets or appends a token in an argument’s expansion list. - -** 2.3 Directive Handling Helpers -- ~_cpp_skip_rest_of_line(pfile)~: Skip trailing tokens after directive arguments. -- ~lex_macro_node(pfile)~: Specialized lexer for parsing macro names. - -** 2.4 File/Include Handling -- ~cpp_push_include(pfile, filename)~: Add a new include to the stack. -- ~cpp_find_include_file(...)~: Path search logic. 
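-
-As a concrete illustration of these include-handling entry points, the following sketch forces a header onto the include stack before token retrieval begins, in the spirit of the -include command-line option. It is a sketch only: the header name is hypothetical, and the assumption that ~cpp_push_include~ reports failure by returning false is mine, not something stated above.
-
-#+BEGIN_SRC c
-#include "cpplib.h"  /* cpp_reader, cpp_push_include, cpp_error, CPP_DL_ERROR */
-
-/* Sketch: push a hypothetical prelude header so its tokens are delivered
-   by cpp_get_token() ahead of the main file's tokens. */
-static void
-push_prelude (cpp_reader *pfile)
-{
-  if (!cpp_push_include (pfile, "rt_prelude.h"))
-    cpp_error (pfile, CPP_DL_ERROR, "cannot force-include \"rt_prelude.h\"");
-}
-#+END_SRC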
- -** 2.5 Location Utilities -- ~cpp_token_location(token)~: Extracts a =location_t= from a token. -- ~linemap_add(...)~: Adds a mapping between logical and physical line/column. - -** 2.6 Miscellaneous -- ~cpp_warning_with_line(...)~, ~cpp_error_with_line(...)~: Emit diagnostics with location. -- ~cpp_lookup(pfile, name, length)~: Interns an identifier and returns a ~cpp_hashnode *~. -- ~NODE_NAME(node)~: Expands to the null-terminated name of a macro node. - -* 3. Usage Examples - -** 3.1 Defining a Macro from a Directive -#+BEGIN_SRC c -cpp_hashnode *node = lex_macro_node(pfile); -cpp_macro *macro = _cpp_new_macro(pfile, cmk_macro, _cpp_reserve_room(pfile, 0, sizeof(cpp_macro))); -macro->count = 1; -macro->exp.tokens[0] = make_number_token("42"); -node->type = NT_USER_MACRO; -node->value.macro = macro; -_cpp_mark_macro_used(node); -#+END_SRC - -** 3.2 Parsing a Directive With Two Arguments -#+BEGIN_SRC c -cpp_token *arg1 = cpp_get_token(pfile); -cpp_token *comma = cpp_get_token(pfile); -if (comma->type != CPP_COMMA) - cpp_error(pfile, CPP_DL_ERROR, "expected ',' after macro name"); -cpp_token *arg2 = cpp_get_token(pfile); -_cpp_skip_rest_of_line(pfile); -#+END_SRC - -** 3.3 Controlling Directive Context -#+BEGIN_SRC c -bool saved = pfile->state.in_directive; -pfile->state.in_directive = false; -assign_handler(pfile); -pfile->state.in_directive = saved; -#+END_SRC - -** 3.4 Tokenization and Location Debugging -#+BEGIN_SRC c -const cpp_token *tok = cpp_get_token(pfile); -location_t loc = tok->src_loc; -printf("token at line: %d\n", LOCATION_LINE(loc)); -#+END_SRC - -* 4. directive.cc extensions to the reader -- ~lex_macro_node(pfile)~: Returns a ~cpp_hashnode *~ for the next identifier, used for directives like ~#define~ or custom ones like ~#assign~. -- ~_cpp_skip_rest_of_line(pfile)~: Advances the token stream to the next physical line. -- ~cpp_error_with_line(...)~, ~cpp_warning_with_line(...)~: Used for directive diagnostics. -- ~cpp_lookup(pfile, name, length)~: Interns a name as a hashnode symbol. -- ~cpp_reader->directive_result~: Used to push a synthesized token result into the stream (e.g., for ~#include_next~). -- ~pfile->state.in_directive~: Must be manually toggled when directive code calls into macro infrastructure. -* 5. macro.cc extensions to the reader - -*** 4.2.1 collect_args(...) -Accumulates macro arguments for a function-like macro. Reads and segments the input stream into a series of ~macro_arg~ entries, tracking nesting of parentheses and token boundaries. - -*** 4.2.2 collect_single_argument(...) -Parses and collects one macro argument, terminating on a comma or closing paren. Used internally by ~collect_args~, but can be called separately for single-argument macro handling. - -*** 4.2.3 replace_args(...) -Performs full substitution of macro arguments into the macro body. Handles token pasting (~##~), stringification (~#~), and recursive macro expansion. - -*** 4.2.4 cpp_arguments_ok(...) -Checks whether the number of provided arguments matches the macro’s parameter list. Validates ~paramc~ and variadic status. - -*** 4.2.5 set_arg_token(...) -Helper to insert or append a token into a ~macro_arg~. Used when building argument streams in ~collect_single_argument~. - -These routines enable fine-grained control over macro behavior and can be selectively reused to simulate macro expansion at directive time (e.g., ~#assign~, ~#bind~, or macro templating extensions). -* 6. 
Conclusion -~cpp_reader~ is the heart of the preprocessor, acting as a unifying context for token streams, macro tables, buffer management, diagnostics, and parser state. Understanding and safely manipulating it is key to extending the preprocessor (e.g., adding new directives like ~#assign~) without destabilizing expansion or include logic. - -Use ~in_directive~, ~context~, and ~cur_token~ fields with care, and follow the established patterns in ~directives.cc~ and ~macro.cc~ to ensure consistent behavior across parse and expansion phases. diff --git "a/document\360\237\226\211/how_it_works/lexing.org" "b/document\360\237\226\211/how_it_works/lexing.org" deleted file mode 100644 index a6bed25..0000000 --- "a/document\360\237\226\211/how_it_works/lexing.org" +++ /dev/null @@ -1,230 +0,0 @@ -#+TITLE: GCC libcpp Lexer: Structure, Usage, and Extension -#+AUTHOR: Caelus (OpenAI) and Thomas Walker Lynch -#+DATE: 2025-05-09 - -* Overview -The C preprocessor lexer (`lex.cc`) in GCC's `libcpp` is responsible for scanning raw source characters and emitting `cpp_token` structures. It is Unicode-aware, macro-sensitive, context-tracking, and supports multiple levels of token buffering. This lexer is both a general-purpose lexical analyzer and a specialized component for preprocessing. - -This document provides: -1. An architectural overview of how the lexer operates. -2. Guidance on how to interface with it (i.e., how to invoke, initialize, and consume it). -3. Examples demonstrating token flow and useful idioms. - -* 1. About the Lexer - -** 1.1 Services Provided -The lexer transforms a stream of characters into a stream of `cpp_token`s. It performs: -- UCN (Universal Character Name) expansion. -- Unicode normalization for identifiers. -- Detection of digraphs/trigraphs. -- Skipping of whitespace and comments. -- Classification into token types (`cpp_ttype`). -- Optional macro expansion (via higher-level coordination with macro subsystem). - -The function `_cpp_lex_token()` is the main entry point for lexing one token from the input stream. - -** 1.2 Token Types and Structures -Tokens are represented as `struct cpp_token`, which contains: -- `type`: token kind (from `cpp_ttype`) -- `val`: a union holding the value (e.g. number, string, identifier) -- `flags`: indicators such as `PREV_WHITE` or `DIGRAPH` -- `src_loc`: location for diagnostics -- `spelling`: optional cached spelling (may be recomputed) - -Auxiliary structures include: -- `cpp_hashnode`: interned identifiers and macro names -- `normalize_state`: for handling normalization and BiDi context -- `_cpp_buff`: dynamic buffers used for temporary token storage - -** 1.3 Unicode and Normalization -Lexer supports bidirectional Unicode enforcement using: -- `context`, `normalize_state`: track BiDi embeddings and UCN states -- `on_char`, `on_close`, `maybe_warn_bidi_on_close`: enforce structure - -** 1.4 Vectorized Fast Path -Several functions (e.g. `search_line_sse2`) accelerate scanning on x86 via SIMD. These are conditionally invoked from `search_line_fast` when alignment and CPU features allow. - -** 1.5 Token Buffers and Pools -Token buffers are managed using `_cpp_get_buff`, `_cpp_extend_buff`, `_cpp_commit_buff`, and `_cpp_release_buff`. These form a scratch/reuse pool and reduce allocations in macro processing or lexing multiple tokens rapidly. - -* 2. 
How to Use the Lexer API - -** 2.1 Initialization -Before lexing, the preprocessor must initialize its state: - -#+begin_src c -cpp_reader *pfile = cpp_create_reader(GTK_TESTING, NULL, NULL); -_cpp_init_lexer(pfile); -_cpp_init_tokenrun(pfile); -#+end_src - -** 2.2 Lexing Tokens -To retrieve the next token: - -#+begin_src c -const cpp_token *token = _cpp_lex_token(pfile); -#+end_src - -For directive-specific parsing (no macro expansion): - -#+begin_src c -cpp_token *token = _cpp_lex_direct(pfile); -#+end_src - -** 2.3 Token Inspection -Each token has type and value fields: - -#+begin_src c -if (token->type == CPP_NUMBER) { - printf("Numeric token: %s\n", cpp_spell_token(pfile, token)); -} -#+end_src - -** 2.4 Identifier Handling -Lex identifiers directly (e.g., for macro lookup): - -#+begin_src c -cpp_hashnode *node = _cpp_lex_identifier(pfile); -if (cpp_macro_p(node)) { - // Node is a macro -} -#+end_src - -** 2.5 Stringification and Output -To spell a token or output lines: - -#+begin_src c -unsigned char *text = cpp_token_as_text(pfile, token); -cpp_output_token(pfile, token, stdout); -#+end_src - -* 3. Examples and Advanced Use - -** 3.1 Simple Token Stream -Lex a stream from input and print token types: - -#+begin_src c -while (true) { - const cpp_token *tok = _cpp_lex_token(pfile); - if (tok->type == CPP_EOF) - break; - printf("Token: %s\n", cpp_type2name(tok->type)); -} -#+end_src - -** 3.2 Peeking and Lookahead -Use `cpp_peek_token` to look ahead: - -#+begin_src c -const cpp_token *next = cpp_peek_token(pfile); -if (next->type == CPP_OPEN_PAREN) - printf("Function call?\n"); -#+end_src - -** 3.3 Handling Unicode Identifiers -To support identifiers with UCNs: - -#+begin_src c -cpp_hashnode *ident = _cpp_lex_identifier(pfile); -const uchar *spell = _cpp_spell_ident_ucns(pfile, ident); -printf("Normalized: %s\n", spell); -#+end_src - -** 3.4 Example: Skipping Comments -Use `_cpp_skip_block_comment` or `skip_line_comment`: - -#+begin_src c -bool changed_line = _cpp_skip_block_comment(pfile); -if (changed_line) - _cpp_clean_line(pfile); -#+end_src - -** 3.5 Buffer Usage Examples - -*** 3.5.1 Allocate and Fill a Temporary Buffer -Use `_cpp_get_buff` to allocate a scratch buffer. Always check and ensure space before writing. Then commit the buffer and retrieve its contents. - -#+begin_src c -_cpp_buff *buff = _cpp_get_buff(pfile); -size_t len = 5; // Number of bytes to write - -// Ensure buffer has enough room -if ((size_t)(buff->limit - buff->cur) < len) - _cpp_extend_buff(pfile, &buff); - -// Write data safely -memcpy(buff->cur, "hello", len); -buff->cur += len; - -// Commit buffer and retrieve stable pointer -unsigned char *data = (unsigned char *) _cpp_commit_buff(pfile, buff, len); -printf("Buffer contents: %.*s\n", (int)len, data); -#+end_src -*** 3.5.2 Extend a Buffer Dynamically -Extend a buffer when you exceed its original size. - -#+begin_src c -_cpp_buff *buff = _cpp_get_buff(pfile); - -// Simulate a long write -for (int i = 0; i < 300; ++i) { - if ((size_t)(buff->limit - buff->cur) < 1) { - _cpp_extend_buff(pfile, &buff); - } - *buff->cur++ = 'A'; -} - -unsigned char *text = (unsigned char *) _cpp_commit_buff(pfile, buff, 300); -printf("Expanded buffer: %.*s\n", 10, text); // First 10 chars -#+end_src - -*** 3.5.3 Use Buffers in Token Construction -Construct a macro expansion or synthetic token string. 
- -#+begin_src c -_cpp_buff *buff = _cpp_get_buff(pfile); -buff->cur = stpcpy((char *)buff->cur, "MY_MACRO("); -buff->cur = stpcpy((char *)buff->cur, "123 + 456"); -*buff->cur++ = ')'; - -unsigned char *macro_text = (unsigned char *) _cpp_commit_buff(pfile, buff, - buff->cur - buff->base); -printf("Token string: %s\n", macro_text); -#+end_src - -*** 3.5.4 Releasing a Buffer -After using a buffer temporarily (e.g., in lookahead), release it. - -#+begin_src c -_cpp_buff *buff = _cpp_get_buff(pfile); -// ... use the buffer ... -_cpp_release_buff(pfile, buff); -#+end_src - -*** 3.5.5 Commit and Reuse -After committing a buffer, you may allocate another for reuse: - -#+begin_src c -unsigned char *first = (unsigned char *) _cpp_commit_buff(pfile, buff, len); -_cpp_buff *next = _cpp_get_buff(pfile); -// next->base points to fresh or recycled memory -#+end_src - -* 4. Notes on Extension - -- You may insert a new directive (e.g., `#assign`) by defining it in `directives.cc` and adding handler logic in `macro.cc` or your own file. -- If you want to extend the lexer for new token kinds, you must: - - Add a new `cpp_ttype` enum value. - - Extend `_cpp_lex_token` or `lex_string` to recognize and classify it. - - Update `cpp_type2name` and spelling functions. - -* 5. Recommended Reading -- `libcpp/include/cpp-id-data.h`: For macro flags and token identifiers -- `libcpp/lex.cc`: Lexer core implementation -- `libcpp/directives.cc`: Directive parsing -- `libcpp/macro.cc`: Macro expansion -- `libcpp/line-map.cc`: Location tracking and diagnostics - - - - diff --git "a/document\360\237\226\211/how_it_works/tool_chain_dependency_layers.org" "b/document\360\237\226\211/how_it_works/tool_chain_dependency_layers.org" deleted file mode 100644 index 94c7a70..0000000 --- "a/document\360\237\226\211/how_it_works/tool_chain_dependency_layers.org" +++ /dev/null @@ -1,78 +0,0 @@ -#+TITLE: Toolchain Dependency Layers -#+AUTHOR: Thomas Walker Lynch -#+DATE: 2025-05-06 -#+OPTIONS: toc:nil num:nil -#+LANGUAGE: en - -* Purpose - -This document outlines the dependencies involved in building a standalone GCC toolchain. It compares two approaches: - -1. Using system-provided tools and headers to build GCC -2. Building a fully self-consistent standalone toolchain - -Understanding the bootstrap sequence is critical for modifying or reproducing GCC builds, especially when building in isolation. - -* The Story: Bootstrap Spiral - -So this programmer — he wanted to add a new directive to GCC. - -So he downloaded the GCC sources with the intent to make a modified, standalone copy. - -But to compile GCC, he needed standard C library headers — which meant downloading glibc. - -But to compile glibc, he needed a working C compiler. So he would first need a minimal GCC — stage 1. - -But to build that stage 1 GCC, he needed glibc headers. - -So he compiled the glibc headers first. - -Then he compiled stage 1 GCC. - -Then he compiled the full glibc. - -Then he compiled the full GCC. - -Ah, but to compile the glibc headers, he first needed the Linux kernel headers... - -There was an old lady who swallowed a fly. I don’t know why she... - -* Approach 1: System-Assisted Bootstrap - -This method uses the host system’s tools and headers to provide bootstrap support. It is simpler and faster, but not fully isolated. - -** Dependencies: - -- System-provided: - - C compiler (e.g. GCC) - - libc (headers and shared objects) - - binutils - - Linux kernel headers - -- Build Steps: - 1. Build binutils using system GCC - 2. 
Build GCC using system libc and headers - -** Characteristics: -- Fast -- Relies on host environment -- Not self-contained - -** Use Case: -- Building a local variant of GCC that will be used on the same system -- Development where purity or relocatability isn’t required - -* Approach 2: Fully Self-Consistent Toolchain - -This method builds every component of the toolchain in a clean directory, using only upstream sources. It isolates the build from host interference. - -** Dependencies: - -- Linux kernel headers (must be provided up front) -- Binutils source -- Glibc source -- GCC source - -** Build Sequence: - -1. Install Linux kernel headers → needed to build glib diff --git "a/document\360\237\226\211/source/internal_h.org" "b/document\360\237\226\211/source/internal_h.org" deleted file mode 100644 index 97d10de..0000000 --- "a/document\360\237\226\211/source/internal_h.org" +++ /dev/null @@ -1,227 +0,0 @@ -#+TITLE: internal.h - Documentation Reference (Emacs Org Format) -#+AUTHOR: Thomas Walker Lynch & Caelestis Index -#+DESCRIPTION: Reference breakdown of types, macros, and helper declarations in GCC's libcpp/internal.h -#+FILETAGS: cpp preprocessor gcc headers internal -#+OPTIONS: toc:nil - -* Overview -`internal.h` contains declarations for data structures, constants, and utility macros central to GCC's internal C preprocessor logic. It defines memory buffers, macro contexts, lexer state tracking, token kinds, character classes, and preprocessor infrastructure such as file buffers and include tracking. - -* Included Headers -- `symtab.h`: Symbol table definitions used internally. -- `cpplib.h`: Public CPP interfaces for tokens and readers. -- ``: (conditionally) iconv conversion API. - -* Core Data Structures -** `_cpp_buff` -A generic buffer with pointer markers. Used throughout macro processing and string/token accumulation. - -** `cpp_context` -Represents the current token expansion context. May hold ISO macro token runs or traditional literal input. Tracks virtual locations if macro tracking is enabled. - -** `macro_context` -Holds virtual locations and associated macro node. Used to support `-ftrack-macro-expansion`. - -** `cpp_reader` -Global object managing state for a preprocessing run, including lexer state, file buffers, context stack, macro table, callbacks, charset converters, and diagnostics. - -** `cpp_buffer` -Represents the input buffer of a file or command. Tracks physical and logical line positions, associated file, character set conversion, and line notes. - -** `lexer_state` -Bitfield flags tracking parsing state, expansion behavior, and preprocessor conditionals. - -** `tokenrun` -Represents a sequence of `cpp_token`s. Token runs are chained and form a circular buffer. - -** `spec_nodes` -Holds special pre-defined nodes like `defined`, `true`, `__VA_ARGS__`, etc. Used by conditional expressions and macro substitution. - -** `def_pragma_macro` -Stores push/pop state for macros affected by `#pragma push_macro` and `#pragma pop_macro`. - -* cpp_reader Structure -The `cpp_reader` structure is the central object representing the full state of a preprocessor session in GCC's `libcpp`. It is passed to nearly every function across the subsystem and serves as the orchestration hub for lexing, macro expansion, buffer management, file inclusion, diagnostics, encoding conversion, and callback integration. - -** Purpose -`cpp_reader` encapsulates: -- The lexical stream and its position. -- The active and historical context stack. 
-- Preprocessor directives and include file tracking. -- Memory and token buffer management. -- Charset encoding conversions. -- Diagnostics and frontend callbacks. - -This makes it the definitive state carrier for a preprocessor run. - -** Usage in the Preprocessor - -The `cpp_reader` structure is instantiated once at the beginning of a preprocessing session via a function like `cpp_create_reader`. It is then initialized with options, encoding settings, and source input before being passed into most libcpp functions. It acts as the persistent environment for all operations and carries forward lexical position, macro state, memory buffers, and file context. - -Typical usage involves: - -1. **Initialization**: - - Create with `cpp_create_reader`. - - Configure options via `cpp_get_options`. - - Setup include paths and callbacks. - - Load source with `cpp_read_main_file`. - -2. **Tokenization Loop**: - - Repeatedly call `cpp_get_token(pfile)` to read tokens. - - Tokens are drawn from buffers, macro expansions, or virtual sources. - - `pfile->context` may be manipulated during macro expansions. - -3. **Directive and Macro Handling**: - - Functions like `_cpp_handle_directive`, `_cpp_create_definition`, or `_cpp_push_token_context` all mutate or inspect `pfile` to reflect state changes during preprocessing. - -4. **Finalization**: - - Clean up with `cpp_finish`, `cpp_destroy`, or related resource freeing logic. - -**Example** (simplified and partial): -```c -cpp_reader *pfile = cpp_create_reader(CLK_GNUC89, NULL, linemap); -cpp_get_options(pfile)->lang = CLK_GNUC89; -cpp_get_callbacks(pfile)->diagnostic = my_diagnostic_callback; -cpp_read_main_file(pfile, "myheader.h"); - -const cpp_token *tok; -while ((tok = cpp_get_token(pfile))->type != CPP_EOF) { - // Process token -} - -cpp_finish(pfile); - - -** Member-by-Member Overview - -- `cpp_buffer *buffer` :: Current input buffer, holding the text being preprocessed. -- `cpp_buffer *overlaid_buffer` :: A temporary buffer overlayed for special cases (e.g. `#include` insertions). -- `struct lexer_state state` :: Tracks current directive state (e.g., inside `#define`), comment retention, and conditional skipping. -- `class line_maps *line_table` :: Manages source line and file mapping for diagnostics and `__LINE__`/`__FILE__`. -- `location_t directive_line` :: Source location of the last encountered directive. -- `_cpp_buff *a_buff` :: Aligned buffer for allocations requiring native alignment (e.g., tokens). -- `_cpp_buff *u_buff` :: Unaligned buffer for simpler memory allocations. -- `_cpp_buff *free_buffs` :: Chain of reusable `_cpp_buff` structures. -- `cpp_context base_context` :: The base (top-level) token context. -- `cpp_context *context` :: Pointer to the current context on the expansion stack. -- `const struct directive *directive` :: Active directive, if in one. -- `cpp_token directive_result` :: The token result of directive evaluation. -- `location_t invocation_location` :: Location of a macro's invocation, used for expansion diagnostics. -- `cpp_hashnode *top_most_macro_node` :: Node of the macro currently being expanded at top level. -- `bool about_to_expand_macro_p` :: True if a macro is queued for expansion. -- `cpp_dir *quote_include` :: `#include "..."` search path. -- `cpp_dir *bracket_include` :: `#include <...>` search path. -- `cpp_dir no_search_path` :: A dummy path that disables search. -- `_cpp_file *all_files` :: List of all known files encountered. -- `_cpp_file *main_file` :: The initial input source file. 
-- `htab *file_hash`, `htab *dir_hash` :: Hash tables for file and directory caching. -- `file_hash_entry_pool *file_hash_entries` :: Pool allocator for file hash entries. -- `htab *nonexistent_file_hash` :: Cache of known missing files (for fast rejection). -- `obstack nonexistent_file_ob` :: Memory store for missing file data. -- `bool quote_ignores_source_dir` :: Controls whether to skip the current file's directory when resolving `#include "..."`. -- `bool seen_once_only` :: True if any `#pragma once` or `#import` was used. -- `const cpp_hashnode *mi_cmacro`, `mi_ind_cmacro` :: Cached macro guards used for multiple-include optimization. -- `bool mi_valid` :: Whether the multiple-inclusion optimization is currently valid. -- `cpp_token *cur_token` :: The current token being read or expanded. -- `tokenrun base_run, *cur_run` :: Token run (buffer) chain for macro-expanded tokens. -- `unsigned int lookaheads` :: Number of lookahead tokens buffered. -- `unsigned int keep_tokens` :: Whether to retain tokens for re-use or reprocessing. -- `unsigned char *macro_buffer` :: Buffer holding macro definition text for diagnostics or display. -- `unsigned int macro_buffer_len` :: Length of `macro_buffer`. -- `cset_converter narrow_cset_desc` :: Converter from source charset to execution charset (e.g. UTF-8). -- `cset_converter utf8_cset_desc`, `char16_cset_desc`, `char32_cset_desc`, `wide_cset_desc` :: Charset converters for UTF and wide characters. -- `const unsigned char *date`, `*time` :: Cached date/time strings used for `__DATE__` and `__TIME__`. -- `time_t time_stamp` :: Internal timestamp used for `__TIMESTAMP__`. -- `int time_stamp_kind` :: Metadata on how timestamp was acquired. -- `cpp_token avoid_paste`, `endarg` :: Special tokens used for controlling macro pasting behavior and argument marking. -- `mkdeps *deps` :: Opaque pointer to dependency tracking system (used for `-M` options). -- `obstack hash_ob`, `buffer_ob` :: Obstack memory pools for hash nodes and buffers, respectively. -- `pragma_entry *pragmas` :: List of user-defined or built-in pragma handlers. -- `cpp_callbacks cb` :: Callback structure for emitting diagnostics or user-visible events. -- `ht *hash_table` :: Identifier hash table. -- `op *op_stack`, `*op_limit` :: Stack used for evaluating constant expressions (e.g., in `#if`). -- `cpp_options opts` :: Holds all preprocessor option settings (e.g. pedantic mode, line directives). -- `spec_nodes spec_nodes` :: Special identifiers (`__VA_ARGS__`, `defined`, etc.). -- `bool our_hashtable` :: Whether this instance owns the hash table memory. -- `out { base, limit, cur, first_line }` :: Traditional output buffer. -- `saved_cur`, `saved_rlimit`, `saved_line_base` :: Saved pointers for buffer overlays. -- `cpp_savedstate *savedstate` :: Saved state for precompiled header support. -- `unsigned int counter` :: Value of `__COUNTER__` macro. -- `cpp_comment_table comments` :: Stores comments if `save_comments` is enabled. -- `def_pragma_macro *pushed_macros` :: List of macros pushed via `#pragma push_macro`. -- `location_t forced_token_location` :: Override location used for the next emitted token. -- `location_t main_loc` :: Marker for the location of the main file’s first line. - -** Summary -`cpp_reader` is a highly stateful construct. It abstracts preprocessing into a cooperative sequence of stages: file loading, lexical analysis, macro handling, directive parsing, and token expansion. Each of these is enabled or modulated via member fields. 
The design permits reuse of storage buffers, incremental context stacking, and precise location tracking across deeply nested macro expansions and file inclusions. -* Enums and Constants -** `include_type` -Represents how a file was included (e.g., `#include`, `#import`, `-include`, etc.). Used to manage buffer overlays and inclusion depth. - -** `context_tokens_kind` -Distinguishes how tokens are held in a `cpp_context`: direct, indirect, or extended. - -** Alignment Helpers -- `DEFAULT_ALIGNMENT`: Derived from struct alignment. -- `CPP_ALIGN2`, `CPP_ALIGN`: Ensure proper memory alignment. - -** Character Class Macros -- `is_idchar`, `is_numchar`, `is_hspace`, `is_vspace`, etc.: Type-safe wrappers over libc ctype behavior with preprocessor-specific adjustments. - -* Buffers and Memory -- `_cpp_get_buff`, `_cpp_release_buff`, `_cpp_extend_buff`, `_cpp_aligned_alloc`, `_cpp_unaligned_alloc`: Allocate and manage working buffers used during expansion. - -* Token and Macro Helpers -- `_cpp_mark_macro_used`: Marks a macro as used for diagnostics. -- `CPP_OPTION`, `CPP_BUFFER`, `CPP_INCREMENT_LINE`: Common access macros for reader state and buffer internals. -- `SEEN_EOL()`: Helper to check if the last token was EOF. - -* Function Declarations by File -** From macro.cc -- `_cpp_create_definition`, `_cpp_new_macro`, `_cpp_notify_macro_use`, `_cpp_push_token_context`, etc.: Manage macro creation, expansion, and context. - -** From directives.cc -- `_cpp_define_builtin`, `_cpp_handle_directive`, `_cpp_do__Pragma`, etc.: Directive parsing and #pragma handlers. - -** From files.cc -- `_cpp_find_file`, `_cpp_stack_include`, `_cpp_pop_file_buffer`: File inclusion management and include guards. - -** From lex.cc -- `_cpp_lex_token`, `_cpp_temp_token`, `_cpp_equiv_tokens`: Token lexing and temporary token generation. - -** From expr.cc -- `_cpp_parse_expr`, `_cpp_expand_op_stack`: Expression parsing in `#if`/`#elif`. - -** From charset.cc -- `_cpp_valid_utf8`, `_cpp_convert_input`, `_cpp_destroy_iconv`: Character encoding conversion routines. - -** From init.cc -- `_cpp_restore_special_builtin`, `cpp_named_operator2name`: Initialization helpers for macro state. - -** From identifiers.cc -- `_cpp_init_hashtable`, `_cpp_destroy_hashtable`: Identifier table setup and teardown. - -* Encoding and Normalization -** `normalize_state` -Tracks normalization level and combining characters for UCN validation and identifier processing. - -** `cset_converter` -Holds state for iconv-based charset conversion. Used for input and output charset normalization. - -* Accessor Inline Functions -- `_cpp_in_system_header`, `_cpp_in_main_source_file`, `_cpp_defined_macro_p`: Context-sensitive accessors. -- `ustrcmp`, `ustrlen`, `uxstrdup`, `ufputs`, etc.: UTF-aware string handling routines. - -* Diagnostic Integration -** `encoding_rich_location` -Subclass of `rich_location` that forces encoding escape visibility for diagnostics. Constructed from `cpp_reader`. - -* Notes -- This file is not compiled standalone but included in many CPP components. -- It contains bridge-level API elements that link between token processing, buffer management, and frontend logic. -- Care must be taken when editing alignment or buffer routines as they affect all downstream expansion logic. - -* TODO -- Document how iconv fallback works when `HAVE_ICONV` is not defined. -- Clarify lifecycle of pushed macro contexts during nested `#pragma push_macro` chains. -- Integrate doc with `macro.cc` and `lex.cc` references for cross-module tracing. 
diff --git "a/document\360\237\226\211/source/lex_cc.org" "b/document\360\237\226\211/source/lex_cc.org" deleted file mode 100644 index 4f3c628..0000000 --- "a/document\360\237\226\211/source/lex_cc.org" +++ /dev/null @@ -1,455 +0,0 @@ -#+TITLE: lex.cc Detailed Structure and Function Index -#+Author: Caelus, code formalist (GPT-4, OpenAI), Thomas -#+Date:2025-05-09 - -* Data Structures Found in Non-Static Function Signatures -** struct context -Used in lexer or normalization stages to track state during token reclassification or Unicode normalization. - -** enum cpp_token_fld_kind -Enumeration describing the internal storage kind for a preprocessor token's value — distinguishes between identifiers, numbers, etc. - -** enum cpp_ttype -Enumeration of token types recognized by the preprocessor (e.g., identifiers, punctuators, literals, etc.). - -** struct lit_accum -Helper structure that accumulates string or character literal fragments during lexing. - -** struct normalize_state -Tracks intermediate state during Unicode normalization of identifiers or literals. - -** struct token_spelling -Structure used to store or compute the textual spelling of a token, including alternate representations (e.g., digraphs). -* Data Structures Shared Among Functions in lex.cc -** _cpp_buff -Used in: _cpp_aligned_alloc, _cpp_extend_buff, _cpp_free_buff, _cpp_get_buff, _cpp_release_buff, free, is_macro, new_buff, usage -Temporary token buffer used during macro argument collection and expansion. Shared to manage input buffering across stages. - -** context -Used in: _cpp_remaining_tokens_num_in_context, character, if, maybe_warn_bidi_on_close, on_char, rich_loc -State struct used in bidirectional text normalization and context-aware lexing. Functions reference it to apply UCN and bidi safety rules. - -** cpp_hashnode -Used in: cpp_error, if, is_macro, lex_identifier, lex_identifier_intern, line, linemap_included_from -Represents identifiers and macro definitions. Shared among symbol lookup, macro parsing, and token classification functions. - -** cpp_token -Used in: RESULT, _cpp_temp_token, cpp_directive_only_process, cpp_output_line_to_string, if, line, linemap_included_from, own, return -Token structure used to represent lexed entities passed between scanners, macro collectors, and diagnostic routines. - -** cpp_ttype -Used in: is_macro, lex_string, own, return -Enumeration of token types (e.g., identifiers, keywords, operators). Shared by scanners and type-check logic to interpret input. -* Non-Static Functions -** _cpp_aligned_alloc -- Signature: `unsigned char * _cpp_aligned_alloc (...)` -- Purpose: Allocates a buffer with alignment suitable for vectorized scanning operations (e.g., SSE, AVX). - -** _cpp_append_extend_buff -- Signature: `_cpp_buff * _cpp_append_extend_buff (...)` -- Purpose: Appends additional space to an existing token buffer, used when macro expansions exceed initial estimates. - -** _cpp_clean_line -- Signature: `void _cpp_clean_line (...)` -- Purpose: Cleans lexer line state after processing a complete logical line. - -** _cpp_commit_buff -- Signature: `void * _cpp_commit_buff (...)` -- Purpose: Finalizes a temporary token buffer and returns a stable pointer to the committed data. - -** _cpp_equiv_tokens -- Signature: `int _cpp_equiv_tokens (...)` -- Purpose: Determines whether two tokens are equivalent, ignoring cosmetic differences such as spacing. 
- -** _cpp_extend_buff -- Signature: `void _cpp_extend_buff (...)` -- Purpose: Increases the capacity of a token buffer to accommodate additional tokens during macro processing. - -** _cpp_free_buff -- Signature: `void _cpp_free_buff (...)` -- Purpose: Releases memory allocated for a temporary or committed token buffer. - -** _cpp_get_buff -- Signature: `_cpp_buff * _cpp_get_buff (...)` -- Purpose: Returns a new or recycled token buffer from the internal pool, minimizing allocations. - -** _cpp_get_fresh_line -- Signature: `bool _cpp_get_fresh_line (...)` -- Purpose: Consumes input until a logical line is ready. Handles escaped newlines. - -** _cpp_init_lexer -- Signature: `void _cpp_init_lexer (...)` -- Purpose: Initializes the core lexer state: buffers, token rings, and diagnostic counters. - -** _cpp_init_tokenrun -- Signature: `void _cpp_init_tokenrun (...)` -- Purpose: Initializes a ring buffer or region for holding tokens during lexing. - -** _cpp_lex_direct -- Signature: `cpp_token * _cpp_lex_direct (...)` -- Purpose: Lexes a single token from the input without macro expansion — used for directive parsing. - -** _cpp_lex_identifier -- Signature: `cpp_hashnode * _cpp_lex_identifier (...)` -- Purpose: Lexes an identifier and returns a hashnode for it, performing UCN expansion and keyword recognition. - -** _cpp_lex_token -- Signature: `const cpp_token * _cpp_lex_token (...)` -- Purpose: Lexes the next token from the input stream, handling macro expansion and buffering. - -** _cpp_process_line_notes -- Signature: `void _cpp_process_line_notes (...)` -- Purpose: Handles mapping #line notes and diagnostic position metadata. - -** _cpp_release_buff -- Signature: `void _cpp_release_buff (...)` -- Purpose: Returns a previously used token buffer back to the internal pool for reuse. - -** _cpp_remaining_tokens_num_in_context -- Signature: `int _cpp_remaining_tokens_num_in_context (...)` -- Purpose: Returns how many tokens are left within the current lexing context. - -** _cpp_skip_block_comment -- Signature: `bool _cpp_skip_block_comment (...)` -- Purpose: Skips over block comments, optionally returning whether line state changed. - -** _cpp_spell_ident_ucns -- Signature: `unsigned char * _cpp_spell_ident_ucns (...)` -- Purpose: Generates a UTF-8 spelling for identifiers that contain Universal Character Names (UCNs). - -** _cpp_temp_token -- Signature: `cpp_token * _cpp_temp_token (...)` -- Purpose: Allocates space for a temporary token during parsing or lookahead. - -** _cpp_unaligned_alloc -- Signature: `unsigned char * _cpp_unaligned_alloc (...)` -- Purpose: Allocates unaligned memory for fallback lexers or comment scanning buffers. - -** cpp_alloc_token_string -- Signature: `const uchar * cpp_alloc_token_string (...)` -- Purpose: Allocates a fresh string buffer for a token's textual content, typically used in output or diagnostics. - -** cpp_avoid_paste -- Signature: `int cpp_avoid_paste (...)` -- Purpose: Determines whether a space is needed between two tokens to avoid unintended pasting. - -** cpp_force_token_locations -- Signature: `void cpp_force_token_locations (...)` -- Purpose: Forces the preprocessor to track source locations for all tokens, overriding lazy behavior. - -** cpp_get_comments -- Signature: `cpp_comment_table * cpp_get_comments (...)` -- Purpose: Returns a pointer to the internal comment table used for diagnostics or pretty-printing. - -** cpp_ideq -- Signature: `int cpp_ideq (...)` -- Purpose: Compares two identifiers for equality in a normalized preprocessor sense. 
- -** cpp_output_line -- Signature: `void cpp_output_line (...)` -- Purpose: Outputs an entire preprocessor line, including comments or tokens, to a file. - -** cpp_output_line_to_string -- Signature: `unsigned char * cpp_output_line_to_string (...)` -- Purpose: Generates a string representation of a preprocessed line for diagnostics. - -** cpp_output_token -- Signature: `void cpp_output_token (...)` -- Purpose: Writes a token to an output stream, respecting spacing and formatting rules. - -** cpp_peek_token -- Signature: `const cpp_token * cpp_peek_token (...)` -- Purpose: Returns a pointer to the next token without consuming it. Used in lookahead. - -** cpp_spell_token -- Signature: `unsigned char * cpp_spell_token (...)` -- Purpose: Computes or reconstructs the text spelling of a token from internal data. - -** cpp_stop_forcing_token_locations -- Signature: `void cpp_stop_forcing_token_locations (...)` -- Purpose: Stops forcibly tracking token locations, restoring default behavior. - -** cpp_token_as_text -- Signature: `unsigned char * cpp_token_as_text (...)` -- Purpose: Converts a token into its textual representation (used for macro debug output or trace logs). - -** cpp_token_len -- Signature: `unsigned int cpp_token_len (...)` -- Purpose: Computes the length of a token for buffer management or output purposes. - -** cpp_token_val_index -- Signature: `enum cpp_token_fld_kind cpp_token_val_index (...)` -- Purpose: Returns the kind of value stored in the token (e.g., string, identifier, number). - -** cpp_type2name -- Signature: `const char * cpp_type2name (...)` -- Purpose: Maps internal token types (e.g., CPP_NUMBER) to human-readable strings like "number". - -** current_ctx -- Signature: `kind current_ctx (...)` -- Purpose: Returns the current Unicode bidirectional context (e.g., LTR, RTL) used during lexing. - -** current_ctx_loc -- Signature: `location_t current_ctx_loc (...)` -- Purpose: Returns the source location associated with the current bidi context — for diagnostics. - -** current_ctx_ucn_p -- Signature: `bool current_ctx_ucn_p (...)` -- Purpose: Returns whether the current Unicode context allows Universal Character Names (UCNs). - -** init_vectorized_lexer -- Signature: `define HAVE_init_vectorized_lexer 1 -static inline void init_vectorized_lexer (...)` -- Purpose: Initializes vectorized scanning function pointers depending on CPU features. - -** on_char -- Signature: `void on_char (...)` -- Purpose: Handles logic when a character is encountered that might affect bidirectional or normalization context. - -** on_close -- Signature: `void on_close (...)` -- Purpose: Called when a bidirectional context-closing token (e.g., PDF) is encountered. - -** pop -- Signature: `void pop (...)` -- Purpose: Pops the current normalization or bidi context off the internal context stack. - -** pop_kind_at -- Signature: `kind pop_kind_at (...)` -- Purpose: Returns the kind of context that would be popped at a given depth (used for lookahead). - -** read_char -- Signature: `char read_char (...)` -- Purpose: Reads a character from the input buffer, optionally applying normalization or escaping rules. - -** search_line_fast -- Signature: `ATTRIBUTE_NO_SANITIZE_UNDEFINED -static const uchar * search_line_fast (...)` -- Purpose: Fallback vectorized line scanner for supported architectures. Tries MMX, SSE, etc. 
- -** search_line_fast -- Signature: `define AARCH64_MIN_PAGE_SIZE 4096 - -static const uchar * search_line_fast (...)` -- Purpose: Fallback vectorized line scanner for supported architectures. Tries MMX, SSE, etc. - -** search_line_mmx -- Signature: `endif search_line_mmx (...)` -- Purpose: Performs vectorized scanning of input using MMX instructions. - -** search_line_sse2 -- Signature: `endif search_line_sse2 (...)` -- Purpose: Performs fast input scanning using SSE2 instructions on aligned buffers. - -** search_line_sse42 -- Signature: `endif search_line_sse42 (...)` -- Purpose: Uses SSE4.2 instructions (e.g., `pcmpestri`) to scan for newline and comment sequences. -* File Scope Data Structures -- `CPP_TOKEN_FLD_ARG_NO` -- `CPP_TOKEN_FLD_NODE` -- `CPP_TOKEN_FLD_NONE` -- `CPP_TOKEN_FLD_PRAGMA` -- `CPP_TOKEN_FLD_SOURCE` -- `CPP_TOKEN_FLD_STR` -- `CPP_TOKEN_FLD_TOKEN_NO` -- `Foundation` -- `NULL` -- `SSE1` -- `WARRANTY` -- `a` -- `accum` -- `after_backslash` -- `all_upper` -- `alloced` -- `backup` -- `bad_string` -- `bol` -- `break` -- `buffer` -- `c` -- `category` -- `col` -- `cols` -- `combined_loc` -- `count` -- `data` -- `delim_len` -- `delimited_string` -- `dest` -- `dflt` -- `done` -- `done_comment` -- `done_string` -- `end` -- `end_loc` -- `end_offset` -- `eol` -- `esc` -- `extra_len` -- `f` -- `fallthrough_comment` -- `false` -- `found` -- `fresh_line` -- `hash` -- `header_count` -- `i` -- `impl` -- `import` -- `index` -- `is_block` -- `ix` -- `j` -- `l` -- `la` -- `len` -- `line_count` -- `loc` -- `m` -- `m_custom_label` -- `m_kind` -- `m_loc` -- `m_ucn` -- `magic` -- `mask` -- `maybe_number_start` -- `minimum` -- `misalign` -- `module_p` -- `n` -- `name` -- `new_buff` -- `next_line` -- `not_module` -- `nst` -- `num_bytes` -- `ok` -- `ones` -- `orig_line` -- `out` -- `p` -- `peek` -- `peek_R` -- `peek_u` -- `peek_u8` -- `peektok` -- `prefix_len` -- `program` -- `ptr` -- `quote_eight` -- `quote_first` -- `quote_peek` -- `raw` -- `read_note` -- `repl_bs` -- `repl_cr` -- `repl_nl` -- `repl_qm` -- `restart` -- `result` -- `ret` -- `room` -- `s` -- `saw_NUL` -- `search` -- `search_line_fast` -- `second_raw` -- `shift` -- `si` -- `size` -- `skipped_white` -- `slen` -- `sloc` -- `slow_path` -- `software` -- `spell_ident` -- `spelling` -- `src_loc` -- `src_range` -- `star` -- `start` -- `start_loc` -- `start_offset` -- `sv` -- `sz` -- `t` -- `terminator` -- `tok_range` -- `true` -- `type` -- `ucn_len` -- `ucn_len_c` -- `update_tokens_line` -- `utf32` -- `utf8_signifier` -- `utf8_start` -- `v` -- `want_number` -- `warn_bidi` -- `warn_bidi_p` -- `was` -- `word_type` -- `ws` -- `xmask` -- `zero` - -* Static Functions -- `void add_line_note (...)` -- `int skip_line_comment (...)` -- `void skip_whitespace (...)` -- `void lex_string (...)` -- `void save_comment (...)` -- `void store_comment (...)` -- `void create_literal (...)` -- `bool warn_in_comment (...)` -- `int name_p (...)` -- `void add_line_note (...)` -- `inline word_type acc_char_mask_misalign (...)` -- `inline word_type acc_char_replicate (...)` -- `inline word_type acc_char_cmp (...)` -- `inline int acc_char_index (...)` -- `const uchar * search_line_acc_char (...)` -- `const uchar * search_line_acc_char (...)` -- `const uchar * search_line_fast (...)` -- `const uchar * search_line_fast (...)` -- `bool warn_in_comment (...)` -- `location_t get_location_for_byte_range_in_cur_line (...)` -- `bidi::kind get_bidi_utf8_1 (...)` -- `bidi::kind get_bidi_utf8 (...)` -- `bidi::kind get_bidi_ucn_1 (...)` -- `bidi::kind get_bidi_ucn (...)` -- 
`void maybe_warn_bidi_on_close (...)` -- `void maybe_warn_bidi_on_char (...)` -- `int skip_line_comment (...)` -- `void skip_whitespace (...)` -- `int name_p (...)` -- `void warn_about_normalization (...)` -- `bool forms_identifier_p (...)` -- `void maybe_va_opt_error (...)` -- `cpp_hashnode * lex_identifier_intern (...)` -- `cpp_hashnode * lex_identifier (...)` -- `void lex_number (...)` -- `void create_literal (...)` -- `bool is_macro (...)` -- `bool is_macro_not_literal_suffix (...)` -- `void lex_raw_string (...)` -- `void lex_string (...)` -- `void store_comment (...)` -- `void save_comment (...)` -- `bool fallthrough_comment_p (...)` -- `tokenrun * next_tokenrun (...)` -- `const cpp_token* _cpp_token_from_context_at (...)` -- `void cpp_maybe_module_directive (...)` -- `size_t utf8_to_ucn (...)` -- `const unsigned char * cpp_digraph2name (...)` -- `_cpp_buff * new_buff (...)` -- `const unsigned char * do_peek_backslash (...)` -- `const unsigned char * do_peek_next (...)` -- `const unsigned char * do_peek_prev (...)` -- `const unsigned char * do_peek_ident (...)` -- `bool do_peek_module (...)` - - - - - diff --git "a/document\360\237\226\211/source/libcpp_h.org" "b/document\360\237\226\211/source/libcpp_h.org" deleted file mode 100644 index e69de29..0000000 diff --git "a/document\360\237\226\211/source/macro_cc.org" "b/document\360\237\226\211/source/macro_cc.org" deleted file mode 100644 index 53d8b65..0000000 --- "a/document\360\237\226\211/source/macro_cc.org" +++ /dev/null @@ -1,83 +0,0 @@ -#+TITLE: macro.cc - Documentation Reference -#+AUTHOR: Thomas Walker Lynch & Caelestis Index -#+DESCRIPTION: High-level architectural partitioning of cpp (GCC 12.x) -#+FILETAGS: cpp preprocessor architecture gcc -#+OPTIONS: toc:nil - -* Overview -This file implements the core logic for macro parsing, macro definition, expansion, and deferred/lazy evaluation within the C preprocessor (CPP) subsystem in GCC's `libcpp`. It complements the infrastructure declared in `libcpp.h` and utilizes various helpers from supporting headers. - -* Included Headers and Their Purpose -- `config.h`: Compiler configuration macros. -- `system.h`: GCC-wide portability and utility macros. -- `intl.h`: Localization support. -- `cpplib.h`: Core interface for the C preprocessor. -- `internal.h`: Internal-only structures and definitions for the preprocessor. -- `macros.h`: Macro parsing and storage structures. -- `trad.h`: Traditional mode logic. -- `mkdeps.h`: Dependency output handling. -- `diagnostic-core.h`: Diagnostic emission interfaces. -- `cpp-id-data.h`: Identifier information, e.g. for argument naming. - -* Major Data Structures Used - -- `cpp_reader` (from `cpplib.h`): The global preprocessor context. -- `cpp_hashnode` (from `cpplib.h`): Represents identifiers, including macro definitions. -- `cpp_macro` (from `macros.h`): Stores a single macro definition, either traditional or ISO. -- `macro_arg` (from `macros.h`): Represents a single argument to a function-like macro. -- `macro_context` (internal): Used for tracking extended macro expansion location information. -- `_cpp_buff` (from `internal.h`): Temporary token or string storage buffer. -- `cpp_token` (from `cpplib.h`): Represents a preprocessor token. -- `cpp_string` (from `cpplib.h`): String-like wrapper for character sequences. - -* Functional Groups (Grouped per libcpp.h Theme) - -*** Token Context Management -- `_cpp_push_token_context`: Pushes a direct token sequence as context. -- `push_ptoken_context`: Pushes indirect token sequence. 
-- `push_extended_tokens_context`: Pushes context with virtual locations. -- `_cpp_pop_context`: Pops current macro or token context. - -*** Argument Expansion and Memory -- `expand_arg`: Expands a macro argument by recursively evaluating tokens. -- `alloc_expanded_arg_mem`: Allocates buffer space for an argument. -- `ensure_expanded_arg_room`: Doubles expansion buffer when needed. -- `set_arg_token` (external): Inserts expanded tokens into an argument. - -*** Macro Definition and Redefinition -- `_cpp_create_definition`: Top-level interface to create and store macro definition. -- `create_iso_definition`: Parses macro arguments and expansion tokens. -- `_cpp_save_parameter`: Saves a named parameter for a function-like macro. -- `_cpp_unsave_parameters`: Restores hashnodes after failed macro parse. -- `warn_of_redefinition`: Determines if a redefinition should trigger a warning. -- `cpp_compare_macros`: Compares two macros for semantic equality. - -*** Macro Instantiation and Lazy Expansion -- `get_deferred_or_lazy_macro`: Retrieves or forces realization of a deferred or lazy macro. -- `cpp_get_deferred_macro`: Resolves a deferred macro. -- `cpp_define_lazily`: Marks a macro for delayed definition. -- `_cpp_notify_macro_use`: Central notification hook that tracks macro use. - -*** Macro Definition Representation -- `cpp_macro_definition`: Renders a macro definition as a string. -- `cpp_macro_definition(pfile, node, macro)`: Core form with macro pointer. - -*** Lexing Helpers and Traditional Compatibility -- `lex_expansion_token`: Lexes one token in a macro body. -- `check_trad_stringification`: Warns if argument appears stringified in traditional C. -- `_cpp_new_macro`: Allocates and initializes a `cpp_macro`. - -* Integration with Other Subsystems -- Works closely with: `lex.c`, `directives.cc`, and `internal.c`. -- Interfaces with `linemap` for virtual location computation. -- Supports both ISO and traditional C macro handling. - -* Notes -- Token pasting (`##`) is carefully constrained per ISO rules. -- Parameter and macro use is tracked for diagnostics and DWARF output. -- Extra tokens such as padding and stringification markers carry encoded flags. - -* TODO -- Document edge cases and non-ISO behaviors (e.g., bare ellipsis). -- Link to relevant `libcpp.h` macro flags and diagnostic utilities. -- Cross-reference context expansion rules with `cpp_get_token_1`. diff --git "a/document\360\237\226\211/source/macro_registration.org" "b/document\360\237\226\211/source/macro_registration.org" deleted file mode 100644 index 56bf752..0000000 --- "a/document\360\237\226\211/source/macro_registration.org" +++ /dev/null @@ -1,150 +0,0 @@ -#+TITLE: Macro Symbol Registration in GCC 12 libcpp -#+AUTHOR: Caelestis Index -#+DESCRIPTION: Full lifecycle of defining and registering a macro in GCC's C preprocessor -#+OPTIONS: toc:nil -#+FILETAGS: gcc libcpp macro cpp_hashnode - -* Overview -This document explains the full lifecycle for defining a macro in GCC 12.x's =libcpp= preprocessor. It traces the required steps from token parsing through symbol table registration, highlighting where and how macro definitions become visible to the preprocessor engine. - -* 1. Obtaining the Macro Name’s Hash Node - -In =libcpp=, all identifiers — including macro names — are interned in a symbol table as =cpp_hashnode= entries. When the lexer emits a =CPP_NAME= token, it automatically fills: - -#+BEGIN_SRC c -token->val.node.node // type: cpp_hashnode * -#+END_SRC - -If the macro name comes from parsed input (e.g. 
`#assign` or `#define`), this node is already in the symbol table — no need to call =cpp_lookup= again. - -If you're defining a macro from a raw string (not a parsed token), you *would* use: - -#+BEGIN_SRC c -cpp_lookup(pfile ,name ,len); -#+END_SRC - -Note: =cpp_lookup= both interns new identifiers and retrieves existing ones. - -* 2. Creating and Populating a cpp_macro Object - -GCC uses a =cpp_macro= struct to hold the macro’s definition: number of parameters, replacement tokens, flags, etc. - -Allocation is done with: - -#+BEGIN_SRC c -cpp_macro *macro = _cpp_new_macro( - pfile, - cmk_macro, - _cpp_reserve_room(pfile ,0 ,sizeof(cpp_macro)) -); -#+END_SRC - -After that, populate its fields: - -#+BEGIN_SRC c -macro->fun_like = 0; -macro->paramc = 0; -macro->variadic = 0; -macro->count = 1; -macro->used = 1; - -cpp_token *tok = ¯o->exp.tokens[0]; -tok->type = CPP_NUMBER; -tok->val.str.text = (const unsigned char *) "42"; -tok->val.str.len = 2; -tok->flags = 0; -#+END_SRC - -Note: These macros are obstack-allocated; you don't free them manually. - -* 3. Handling Redefinitions (Optional, but Expected) - -If the symbol already has a macro: - -#+BEGIN_SRC c -if( cpp_macro_p(node) ) - warn_of_redefinition(pfile ,node ,macro); -#+END_SRC - -GCC allows redefinition only if the new macro is *identical*. If not, it issues a pedantic warning and overwrites the old definition. - -To remove the previous macro: - -#+BEGIN_SRC c -_cpp_free_definition(node); -#+END_SRC - -This clears the macro without deallocating it (obstack). - -* 4. Installing the Macro in the Symbol Table - -The macro is made active by assigning it to the symbol table: - -#+BEGIN_SRC c -node->type = NT_USER_MACRO; -node->value.macro = macro; -#+END_SRC - -This effectively *registers* the macro for expansion. - -There is no separate "symbol table insertion" step — the hash node was already in the table. - -GCC may also set flags: - -- =NODE_WARN= → warn if redefining built-in -- =NODE_CONDITIONAL= → cleared when explicitly defined - -* 5. Finalization Steps - -Some final steps after macro insertion: - -- Mark it used (optional): - - #+BEGIN_SRC c - _cpp_mark_macro_used(node); - #+END_SRC - -- Emit a diagnostic: - - #+BEGIN_SRC c - cpp_warning(pfile ,CPP_W_NONE ,"Assigned macro %s as 42" ,NODE_NAME(node)); - #+END_SRC - -- Clear the =NODE_USED= flag to reset unused-macro warnings: - - #+BEGIN_SRC c - node->flags &= ~NODE_USED; - #+END_SRC - -* Summary of Required Steps - -Here is the complete, valid sequence to register a macro manually: - -#+BEGIN_SRC c -cpp_token *name_token = assign_name_argument(pfile); -cpp_hashnode *node = name_token->val.node.node; - -cpp_macro *macro = _cpp_new_macro(...); // allocate and populate - -// fill token replacement list... -macro->count = 1; -macro->exp.tokens[0] = ...; - -node->type = NT_USER_MACRO; -node->value.macro = macro; -_cpp_mark_macro_used(node); -#+END_SRC - -That is sufficient to define and register a macro. =NODE_NAME(node)= is useful for diagnostics, but not required for registration. - -* Notes - -- If you already have a =cpp_token= from parsing, the hash node is *already interned*. -- Macros must be registered by setting =node->type= and =node->value.macro=. -- Redefinitions are allowed only if semantically identical unless explicitly undefined. -- No extra insertion or lookup step is needed unless building from raw text. 
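
Pulling the steps above together, the following is a hedged, end-to-end sketch that registers an object-like macro =ANSWER= whose single replacement token is =42=. Every call is one already named in this document; the helper name, the hard-coded spelling, and the assumption that the =_cpp_reserve_room= placement shown earlier yields usable storage for the token array are illustrative rather than verified against the sources.

#+BEGIN_SRC c
/* Hedged sketch: register ANSWER -> 42 using the calls described above.
   Flag handling and diagnostics are simplified.  */
static void
register_answer_macro (cpp_reader *pfile)
{
  /* Step 1: intern (or fetch) the name.  */
  cpp_hashnode *node = cpp_lookup(pfile ,(const unsigned char *) "ANSWER" ,6);

  /* Step 3: drop any previous, possibly different, definition.  */
  if( cpp_macro_p(node) )
    _cpp_free_definition(node);

  /* Step 2: allocate and populate the cpp_macro.  */
  cpp_macro *macro = _cpp_new_macro(
    pfile,
    cmk_macro,
    _cpp_reserve_room(pfile ,0 ,sizeof(cpp_macro))
  );
  macro->fun_like = 0;
  macro->paramc   = 0;
  macro->variadic = 0;
  macro->count    = 1;

  cpp_token *tok    = &macro->exp.tokens[0];
  tok->type         = CPP_NUMBER;
  tok->val.str.text = (const unsigned char *) "42";
  tok->val.str.len  = 2;
  tok->flags        = 0;

  /* Step 4: install it in the symbol table entry.  */
  node->type        = NT_USER_MACRO;
  node->value.macro = macro;

  /* Step 5: optional bookkeeping.  */
  _cpp_mark_macro_used(node);
}
#+END_SRC

When the name comes from an already parsed =CPP_NAME= token, as in a directive handler, the =cpp_lookup= call is skipped and =token->val.node.node= is used directly, as noted in section 1.
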
- -* References -- =directives.cc= → =do_define= and redefinition checks -- =macro.cc= → =create_iso_definition= and macro assembly -- =cpplib.h= → =cpp_hashnode=, =cpp_macro=, enum flags diff --git "a/document\360\237\226\211/terms/deployment scenarios.org" "b/document\360\237\226\211/terms/deployment scenarios.org" deleted file mode 100644 index c250c26..0000000 --- "a/document\360\237\226\211/terms/deployment scenarios.org" +++ /dev/null @@ -1,49 +0,0 @@ -#+TITLE: Compiler Deployment Scenarios -#+AUTHOR: Thomas Walker Lynch -#+DATE: 2025-05-06 -#+OPTIONS: toc:nil num:nil -#+LANGUAGE: en - -See `terms/machine and system roles' for the definitions of 'build', 'host', and 'target' system/machine. - -* Native Compilation: Build = Host = Target. - -In a native compilation, all three roles — the machine that builds GCC (`build`), the machine on which the resulting compiler will run (`host`), and the machine for which that compiler will generate code (`target`) — are the same. - -- Example 1: Alice is running Debian on an x86_64 machine. She builds GCC from source directly on that system. The compiler runs on her Debian machine and compiles code for that same environment. - In terms of machine categories: - `build = host = target = x86_64-pc-linux-gnu` - -- Example 2: A system administrator compiles GCC inside a Fedora x86_64 container. The resulting compiler is used within that same container to build programs for Fedora x86_64. - In terms of machine categories: - `build = host = target = x86_64-redhat-linux` - -* Cross Compilation: Build = Host ≠ Target. - -In a cross compilation, the compiler is built and will run on one type of system (`build = host`), but is used to produce binaries for a different kind of system (`target`). - -- Example 1: Bob is using an x86_64 Ubuntu machine to build a GCC toolchain that runs on Ubuntu but outputs code for a Raspberry Pi (ARM). - In terms of machine categories: - `build = host = x86_64-pc-linux-gnu`, - `target = arm-linux-gnueabihf` - -- Example 2: A developer builds a MIPS cross-compiler on an x86_64 workstation. The resulting GCC is used on that workstation to compile code for an embedded MIPS device. - In terms of machine categories: - `build = host = x86_64-pc-linux-gnu`, - `target = mipsel-linux-uclibc` - -* Canadian Cross: Build ≠ Host ≠ Target. - -A Canadian Cross involves three distinct machines. GCC is built on one system (`build`), but the resulting compiler is meant to run on a different system (`host`), and it will generate binaries for yet another system (`target`). - -- Example 1: A toolchain is built on an x86_64 Linux desktop (`build`). It produces a GCC that runs on PowerPC AIX (`host`) and generates code for ARM embedded systems (`target`). - In terms of machine categories: - `build = x86_64-pc-linux-gnu` - `host = powerpc-ibm-aix` - `target = arm-none-eabi` - -- Example 2: Charlie uses his Linux laptop to build a Windows-native GCC cross-compiler (`host = Windows`). This Windows-hosted compiler will produce binaries for Android ARM (`target`). 
- In terms of machine categories: - `build = x86_64-pc-linux-gnu` - `host = x86_64-w64-mingw32` - `target = arm-linux-androideabi` diff --git "a/document\360\237\226\211/terms/gcc directory terminology.org" "b/document\360\237\226\211/terms/gcc directory terminology.org" deleted file mode 100644 index 545426a..0000000 --- "a/document\360\237\226\211/terms/gcc directory terminology.org" +++ /dev/null @@ -1,68 +0,0 @@ - -#+TITLE: GCC Directory Terminology -#+AUTHOR: Thomas Walker Lynch -#+DATE: 2025-05-06 -#+OPTIONS: toc:nil num:nil -#+LANGUAGE: en - -* build directory (objdir) - -The directory where a build process takes place. - -For GCC this must be distinct from the source directory (srcdir). - -* current build - -Refers to the specific build being executed right now, as opposed to previously installed or referenced builds. - -Example: -- "The current build will output to `$BUILD_DIR/gcc`" -- "Avoid mixing headers from an old build with the current build" - -* current build target directory - -The destination directory into which the current build places its build products. - -This term combines the concepts of the `current build` and `target directory`, emphasizing that the output location is for the present compilation effort. - -Examples: -- If `make install DESTDIR=$SYSROOT`, then `$SYSROOT` is the current build target directory. -- For an isolated toolchain, `$TOOLCHAIN` might be the current build target directory - -* installation directory - -Any directory from which components are *intended to be used from*, whether built locally or installed from packages. A final resting place for binaries, scripts, libraries, headers, or anything else that gets 'used' as part of everyday work. - -In the GCC documents, an 'installation directory' is not limited to directories that the current build will put things into. In fact, sometimes it is used to refer to a directory the build will get things from. - -Examples include: -- System-wide locations like `/usr/bin` -- Custom toolchain roots like `$TOOLCHAIN/bin` - -* local-prefix - -Path where GCC searches for locally installed headers (like `/usr/local/include`). -This is not a build target directory, rather it is used for lookup during compilation. - -* prefix -The root path where GCC will install itself: -- Binaries to `$prefix/bin` -- Headers to `$prefix/include` -- Libraries to `$prefix/lib` - -* source directory (srcdir) -The directory where the original GCC (or binutils/glibc) source code resides. -Often deleted or archived after the build completes. -* sysroot -A directory that acts as a "fake root" (`/`) for headers and libraries. -Used in controlled builds or cross-compilation to isolate from the host system. -* target directory -The directory where build products are placed — typically the result of `make install`. - -This can be the same as `prefix`, or it may be a temporary staging area. It refers specifically to the destination for *this build’s* outputs, not a system path in general. 
- -Examples: -- `$TOOLCHAIN/bin` if that’s where you are installing -- `DESTDIR=/tmp/stage make install` — `/tmp/stage` is the target directory - - diff --git "a/document\360\237\226\211/terms/glossary.org" "b/document\360\237\226\211/terms/glossary.org" deleted file mode 100644 index 87c1cb5..0000000 --- "a/document\360\237\226\211/terms/glossary.org" +++ /dev/null @@ -1,76 +0,0 @@ -#+TITLE: RT_gcc Terminology Glossary -#+AUTHOR: Thomas Walker Lynch -#+DATE: 2025-05-06 -#+OPTIONS: toc:nil num:nil -#+LANGUAGE: en - -A glossary of commonly used terms and variables in the GNU toolchain documentation, -with emphasis on their meaning in the context of building GCC. - -* General Terms - -** GCC -The GNU Compiler Collection — a suite of compilers for C, C++, Fortran, and other languages. - -** GNU -GNU's Not Unix — a free software project started by the Free Software Foundation (FSF). - -** FSF -The Free Software Foundation — original sponsor and maintainer of the GNU project and GCC. - -** toolchain -A set of programs used to build software, usually including a compiler (GCC), linker (ld), assembler (as), and C library (glibc or musl). - -** native build -A build where `build = host = target`. You are compiling GCC for the same system you're building it on. - -** cross-compiler -A compiler built on one machine (the build system), to run on another (the host), and generate code for yet another (the target). - -** bootstrap -The process of building GCC in multiple stages: -1. Build a minimal compiler (stage1) -2. Use it to build a full compiler (stage2) -3. Use the result to rebuild itself (stage3) and verify output is stable - - -* Build Triplets - -** build -The system on which the *build process itself* runs. - -** host -The system where the resulting GCC binary will *run*. - -** target -The system for which GCC will *generate code*. -Only meaningful when building a cross-compiler. - -* Build Options - -** --enable- -Turns on an optional feature at configure time. - -** --disable- -Explicitly disables an optional feature. - -** --with-=value -Sets a feature or path used by the build system (e.g., `--with-sysroot`, `--with-pkgversion`). - -** --with-pkgversion -Custom string shown in `gcc --version`, useful to identify modified builds. - -** --with-bugurl -Sets the URL shown in `gcc --version` for reporting bugs. - -** --enable-languages -Limits the frontends to be built (e.g., C, C++, Fortran). -Useful to speed up bootstrap or reduce size. - -** --disable-multilib -Disables building 32-bit and 64-bit variants together (relevant on x86_64 systems). - -* Components - -** binutils -A set of binary tools including `as` diff --git "a/document\360\237\226\211/terms/machine and system roles.org" "b/document\360\237\226\211/terms/machine and system roles.org" deleted file mode 100644 index c2188ea..0000000 --- "a/document\360\237\226\211/terms/machine and system roles.org" +++ /dev/null @@ -1,57 +0,0 @@ -#+TITLE: Terms - Machine and System Roles -#+AUTHOR: Thomas Walker Lynch -#+DATE: 2025-05-06 - -* Machine vs System - -In this context, "machine" and "system" are often used interchangeably, but they can carry slightly different connotations depending on the context in which they are used. - -** Machine - - - Machine typically refers to the physical hardware or the machine itself — a particular physical computing device, such as a laptop, server, or embedded device. 
- - - When we talk about the target machine, we are usually referring to the physical or virtual hardware on which the final compiled binaries will run. - -** System - - - System generally refers to the complete environment in which the software operates. This includes the hardware but also extends to the operating system (OS), libraries, and other software that makes up the environment in which the machine operates. - - - The target system refers not just to the physical hardware but to the entire environment where the program will execute, including the OS, libraries, etc. - -** Target Machine vs Target System - - - The target machine is indeed a specific instance of hardware (e.g., a Raspberry Pi or an ARM-based system). - - - The target system can also refer to the environment in which that machine operates, which includes not just the hardware (machine) but also the operating system and the software environment required to run the compiled binaries. - -In many contexts, the terms are used interchangeably, but there’s a subtle distinction: - - - Target Machine emphasizes the hardware aspect. - - Target System emphasizes the complete environment, including hardware, OS, and libraries. - -** Examples - - - Target Machine: "The target machine is an ARM Cortex-M processor." Here, the focus is on the physical hardware. - - Target System: "The target system is an ARM Cortex-M processor running a specific version of an embedded OS." Here, the focus is on both the hardware and the environment (OS, libraries, etc.). - - -* System Roles - -* 'build' -The system on which the GCC compiler **is being built**. - -- John downloads the GCC source code and compiles it on his x86_64 Debian laptop. The GCC compiler **is built on** this laptop, so the build system is Debian on x86_64. -- Sarah uses her macOS machine to build a GCC compiler intended for use on a different system. Even though she won’t run it locally, the compiler **was constructed on macOS**, making macOS the build system. - -* 'host' -The system where the **compiled GCC binary will run** — where you’ll actually invoke `gcc`. - -- Alice builds a GCC binary on Ubuntu, but installs and runs it on an Alpine Linux container. The host is Alpine Linux because that’s where the compiler binary will execute. -- A CI pipeline builds GCC on an x86 Linux node but deploys the resulting `gcc` executable to a FreeBSD system, where users will compile programs. FreeBSD is the host. - -* 'target' -The system for which the **compiled GCC binary will generate code** — where the output binaries are meant to run. - -- Mark compiles a version of GCC that runs on x86 Linux (host), but it produces binaries for ARM Cortex-M chips. The target is ARM Cortex-M. -- A developer builds a GCC compiler on x86 that will run on macOS, and it generates code for RISC-V devices. The target is RISC-V — the eventual destination of the compiled programs. - diff --git "a/document\360\237\226\211/why_version_12.org" "b/document\360\237\226\211/why_version_12.org" deleted file mode 100644 index f2a8e38..0000000 --- "a/document\360\237\226\211/why_version_12.org" +++ /dev/null @@ -1,38 +0,0 @@ -#+TITLE: The Fictional Story of The GCC 15 Install -#+AUTHOR: Thomas Walker Lynch -#+DATE: 2025-05-06 -#+OPTIONS: toc:nil num:nil -#+LANGUAGE: en - -Once upon a time, there was a programmer who wanted to add a new directive to the C preprocessor, cpp. cpp was a simple tool. It expanded macros and did basic text substitutions, nothing more. The programmer thought, “I’ll just modify cpp. 
That’s easy enough. - -But he could not find cpp. Rather cpp is part of gcc, part and parcel. Well he thought, -gcc is a mature tool, and after making mods to cpp, it will build merely by calling make. - -So he downloaded the latest gcc, gcc-16, but then discovered, there were no release branches. "It won't be stable." He thought, then backed off to gcc-15. - -Then upon attempting to build gcc-15 as baseline before making any mods, he discovered that the system glibc was not version compatible for the build. - -Ok, simple matter, he thought, “I’ll download glibc too. It’s just part of the process.” - -But then attempting to compiling glibc, he encountered something he hadn’t expected. To compile glibc, he needed a version compatible C compiler, and the system compiler is not up to. "Ah I remember," he said as recalled, that gcc can be built in two stages, with the -first stage not requiring glibc, but being sufficient to compile glic. - -So he set to work on the stage 1 GCC, but as he got deeper, he realized that the stage 1 GCC required certain C runtime files — specifically crt1.o, crti.o, and crtn.o. “Ah, no problem,” he said. “I’ll just generate those.” - -The programmer found that the files could be generated using the glibc make file, so he attempted that, but alas, headers were needed, and a version compatible binutils was needed. - -So the programmer downloaded the linux headers and the binutils sources. Binutils could be built with the system gcc, so all was good. - -“Now, I’ve got the CRT files. Let’s move on,” he thought. With the CRT files in hand, he turned to compiling the glibc headers. This was simple enough, and once those were built, he could proceed with the stage 1 GCC. - -After completing the first stage of GCC, he thought, “Great. Now I can move on to the full glibc. It’s just the next step.” - -And so, he used his stage 1 GCC to compile the full glibc. It wasn’t a complex process, just a bit more time-consuming. But with glibc now fully built, the programmer could finally finish the GCC build. The full toolchain was nearly ready. - -With everything falling into place, the programmer turned to compile the full GCC. And just like that, the toolchain was built. He linked everything together, a self-contained, fully functional GCC, ready for use. - -But you ask me why this story is fiction? Well the answer is simple, because glibc make did not create the crt files. Perhaps it doesn't do that. IDK. - -So this is the story of how we ended up at version 12, which does compile with system tools. - diff --git a/tester/experiment/.gitignore b/tester/experiment/.gitignore new file mode 100644 index 0000000..cba7efc --- /dev/null +++ b/tester/experiment/.gitignore @@ -0,0 +1 @@ +a.out diff --git a/tester/experiment/RT_CAT_1.c b/tester/experiment/RT_CAT_1.c new file mode 100644 index 0000000..17663b1 --- /dev/null +++ b/tester/experiment/RT_CAT_1.c @@ -0,0 +1,6 @@ +#include + +int main(void){ + printf( "The answer is: %s\n", RT_CAT ); + return 0; +} diff --git a/tester/experiment/UC_list.c b/tester/experiment/UC_list.c new file mode 100644 index 0000000..952f231 --- /dev/null +++ b/tester/experiment/UC_list.c @@ -0,0 +1,28 @@ +#include +/* +tests that built in macro stubs are present, this will not work when the macros have implementations. 
+*/ + +#if 0 + _ASSIGN + _TO_ARG_LIST + _TO_TOKEN_LIST + _FIRST + _REST + _MAP + _AL_MAP + _IF + _NOT + _AND + _OR + _IS_IDENTIFIER + _IS_NAME + _PASTE +#endif + +int main(void){ + _ASSIGN(X)(5); + printf("X: %d" ,X); + + return 0; +} diff --git a/tester/experiment/assign_1.c b/tester/experiment/assign_1.c new file mode 100644 index 0000000..f58320c --- /dev/null +++ b/tester/experiment/assign_1.c @@ -0,0 +1,8 @@ +#include + +#assign (ANSWER) (42) + +int main(void){ + printf( "The answer is: %d\n", ANSWER ); + return 0; +} diff --git a/tester/experiment/assign_2.c b/tester/experiment/assign_2.c new file mode 100644 index 0000000..3a41a63 --- /dev/null +++ b/tester/experiment/assign_2.c @@ -0,0 +1,22 @@ +#include + +#if 0 +#define STRINGIFY(x) #x +#define TOSTRING(x) STRINGIFY(x) +#define SHOW_MACRO(x) _Pragma(TOSTRING(message(#x " → " TOSTRING(x)))) + +SHOW_MACRO(a) +SHOW_MACRO(b) + +SHOW_MACRO($a) +SHOW_MACRO($b) +#endif + +#define a 2 +#define b 3 +#assign (ADD) [a + b] + +int main(void){ + printf("2 + 3 = %d\n", ADD); + return 0; +} diff --git a/tester/experiment/macro_1.c b/tester/experiment/macro_1.c new file mode 100644 index 0000000..2ba3e31 --- /dev/null +++ b/tester/experiment/macro_1.c @@ -0,0 +1,8 @@ +#include + +#rt_macro I(x) (x) + +int main(void){ + printf("5: %x" ,I(5)); + return 0; +} diff --git a/tester/experiment/macro_2.c b/tester/experiment/macro_2.c new file mode 100644 index 0000000..421f176 --- /dev/null +++ b/tester/experiment/macro_2.c @@ -0,0 +1,11 @@ +#include + +#rt_macro Q(f ,...)( + printf(f ,__VA_ARGS__) +) + +int main(void){ + Q("%x %x %x" ,1 ,2 ,3); + putchar('\n'); + return 0; +} diff --git a/tester/experiment/paste_1.c b/tester/experiment/paste_1.c new file mode 100644 index 0000000..d438c6b --- /dev/null +++ b/tester/experiment/paste_1.c @@ -0,0 +1,10 @@ +#include + +#define CAT2(a ,b) a##b +#define STRING_1(a) #a +#define STRING(a) STRING_1(a) + +int main(){ + printf("ab: %s." ,STRING(CAT2(a ,b))); + printf("gcc -E this one: %s." ,STRING(CAT2(a ,+))); +} diff --git a/tester/experiment/recursive_define_0.c b/tester/experiment/recursive_define_0.c new file mode 100644 index 0000000..da2919a --- /dev/null +++ b/tester/experiment/recursive_define_0.c @@ -0,0 +1,52 @@ +#include + + +#define X Q()B()(C) +#define Q() A +#define A(x) 21 +#define B() + +int main(){ + printf("X %d" ,X); +} + +/* + +So cpp truly does walk the token list with `cpp_get_token_1(pfile, &loc);` expanding each token one by one left to right. We should be able to get the desired expansion by adding another layer of macros, so that the argument expansion catches the final A(c). 
+
+2025-05-12T02:53:59Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> gcc test_recursive_define_0.c
+test_recursive_define_0.c: In function ‘main’:
+test_recursive_define_0.c:5:13: warning: implicit declaration of function ‘A’ [-Wimplicit-function-declaration]
+    5 | #define Q() A
+      | ^
+test_recursive_define_0.c:4:12: note: in expansion of macro ‘Q’
+    4 | #define X Q()B()(C)
+      | ^
+test_recursive_define_0.c:10:18: note: in expansion of macro ‘X’
+   10 |   printf("X %d" ,X);
+      | ^
+test_recursive_define_0.c:4:19: error: ‘C’ undeclared (first use in this function)
+    4 | #define X Q()B()(C)
+      | ^
+test_recursive_define_0.c:10:18: note: in expansion of macro ‘X’
+   10 |   printf("X %d" ,X);
+      | ^
+test_recursive_define_0.c:4:19: note: each undeclared identifier is reported only once for each function it appears in
+    4 | #define X Q()B()(C)
+      | ^
+test_recursive_define_0.c:10:18: note: in expansion of macro ‘X’
+   10 |   printf("X %d" ,X);
+      | ^
+
+2025-05-12T02:54:17Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> gcc -E -P test_recursive_define_0.c | tail -n 5
+extern int __overflow (FILE *, int);
+
+int main(){
+  printf("X %d" ,A(C));
+}
+
+*/
diff --git a/tester/experiment/recursive_define_1.c b/tester/experiment/recursive_define_1.c
new file mode 100644
index 0000000..a7c6335
--- /dev/null
+++ b/tester/experiment/recursive_define_1.c
@@ -0,0 +1,44 @@
+#include <stdio.h>
+
+
+#define X Q()B()(C)
+#define Q() A
+#define A(x) 21
+#define B()
+
+#define P(x) x
+
+int main(){
+  printf("X %d" ,P(X));
+}
+
+/*
+2025-05-12T03:00:19Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> /bin/gcc test_recursive_define_1.c
+
+2025-05-12T03:00:51Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> ./a.out
+X 21
+2025-05-12T03:00:55Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> gcc test_recursive_define_1.c
+
+2025-05-12T03:01:03Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> ./a.out
+X 21
+2025-05-12T03:01:05Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> gcc -E -P test_recursive_define_1.c | tail -n 5
+extern int __overflow (FILE *, int);
+
+int main(){
+  printf("X %d" ,21);
+}
+
+2025-05-12T03:01:11Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+>
+*/
diff --git a/tester/experiment/recursive_define_2.c b/tester/experiment/recursive_define_2.c
new file mode 100644
index 0000000..ecd87cc
--- /dev/null
+++ b/tester/experiment/recursive_define_2.c
@@ -0,0 +1,58 @@
+#include <stdio.h>
+
+
+#define X Q()B()(C)
+#define Q() A
+#define A(x) 21
+#define B()
+
+#define P X
+
+int main(){
+  printf("X %d" ,P);
+}
+
+/*
+  This does not work because it does not cause an argument evaluation, as test_recursive_define_1.c did.
+
+2025-05-12T03:01:11Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+> gcc test_recursive_define_2.c
+test_recursive_define_2.c: In function ‘main’:
+test_recursive_define_2.c:5:13: warning: implicit declaration of function ‘A’ [-Wimplicit-function-declaration]
+    5 | #define Q() A
+      | ^
+test_recursive_define_2.c:4:12: note: in expansion of macro ‘Q’
+    4 | #define X Q()B()(C)
+      | ^
+test_recursive_define_2.c:9:11: note: in expansion of macro ‘X’
+    9 | #define P X
+      | ^
+test_recursive_define_2.c:12:18: note: in expansion of macro ‘P’
+   12 |   printf("X %d" ,P);
+      | ^
+test_recursive_define_2.c:4:19: error: ‘C’ undeclared (first use in this function)
+    4 | #define X Q()B()(C)
+      | ^
+test_recursive_define_2.c:9:11: note: in expansion of macro ‘X’
+    9 | #define P X
+      | ^
+test_recursive_define_2.c:12:18: note: in expansion of macro ‘P’
+   12 |   printf("X %d" ,P);
+      | ^
+test_recursive_define_2.c:4:19: note: each undeclared identifier is reported only once for each function it appears in
+    4 | #define X Q()B()(C)
+      | ^
+test_recursive_define_2.c:9:11: note: in expansion of macro ‘X’
+    9 | #define P X
+      | ^
+test_recursive_define_2.c:12:18: note: in expansion of macro ‘P’
+   12 |   printf("X %d" ,P);
+      | ^
+
+2025-05-12T03:02:36Z[developer]
+Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
+>
+*/
+
+
diff --git a/tester/experiment/temp b/tester/experiment/temp
new file mode 100644
index 0000000..32b29d9
--- /dev/null
+++ b/tester/experiment/temp
@@ -0,0 +1,9 @@
+git mv assign_test_1.c assign_1.c
+git mv assign_test_2.c assign_2.c
+git mv macro_test_1.c macro_1.c
+git mv macro_test_2.c macro_2.c
+git mv recursive_define_0.c recursive_define_0.c
+git mv recursive_define_1.c recursive_define_1.c
+git mv recursive_define_2.c recursive_define_2.c
+git mv RT_CAT_test_1.c RT_CAT_1.c
+git mv va_arg_test_1.c va_arg_1.c
diff --git a/tester/experiment/va_arg_test_1.c b/tester/experiment/va_arg_test_1.c
new file mode 100644
index 0000000..8870054
--- /dev/null
+++ b/tester/experiment/va_arg_test_1.c
@@ -0,0 +1,9 @@
+#include <stdio.h>
+
+#define A(...) (__VA_ARGS__)
+
+
+int main(void){
+  printf( "The answer is: %d\n", A(1,2,3) );
+  return 0;
+}
diff --git "a/tester/experiment\360\237\226\211/.gitignore" "b/tester/experiment\360\237\226\211/.gitignore"
deleted file mode 100644
index cba7efc..0000000
--- "a/tester/experiment\360\237\226\211/.gitignore"
+++ /dev/null
@@ -1 +0,0 @@
-a.out
diff --git "a/tester/experiment\360\237\226\211/RT_CAT_1.c" "b/tester/experiment\360\237\226\211/RT_CAT_1.c"
deleted file mode 100644
index 17663b1..0000000
--- "a/tester/experiment\360\237\226\211/RT_CAT_1.c"
+++ /dev/null
@@ -1,6 +0,0 @@
-#include <stdio.h>
-
-int main(void){
-  printf( "The answer is: %s\n", RT_CAT );
-  return 0;
-}
diff --git "a/tester/experiment\360\237\226\211/UC_list.c" "b/tester/experiment\360\237\226\211/UC_list.c"
deleted file mode 100644
index 952f231..0000000
--- "a/tester/experiment\360\237\226\211/UC_list.c"
+++ /dev/null
@@ -1,28 +0,0 @@
-#include <stdio.h>
-/*
-tests that built in macro stubs are present, this will not work when the macros have implementations.
-*/
-
-#if 0
-  _ASSIGN
-  _TO_ARG_LIST
-  _TO_TOKEN_LIST
-  _FIRST
-  _REST
-  _MAP
-  _AL_MAP
-  _IF
-  _NOT
-  _AND
-  _OR
-  _IS_IDENTIFIER
-  _IS_NAME
-  _PASTE
-#endif
-
-int main(void){
-  _ASSIGN(X)(5);
-  printf("X: %d" ,X);
-
-  return 0;
-}
diff --git "a/tester/experiment\360\237\226\211/assign_1.c" "b/tester/experiment\360\237\226\211/assign_1.c"
deleted file mode 100644
index f58320c..0000000
--- "a/tester/experiment\360\237\226\211/assign_1.c"
+++ /dev/null
@@ -1,8 +0,0 @@
-#include <stdio.h>
-
-#assign (ANSWER) (42)
-
-int main(void){
-  printf( "The answer is: %d\n", ANSWER );
-  return 0;
-}
diff --git "a/tester/experiment\360\237\226\211/assign_2.c" "b/tester/experiment\360\237\226\211/assign_2.c"
deleted file mode 100644
index 3a41a63..0000000
--- "a/tester/experiment\360\237\226\211/assign_2.c"
+++ /dev/null
@@ -1,22 +0,0 @@
-#include <stdio.h>
-
-#if 0
-#define STRINGIFY(x) #x
-#define TOSTRING(x) STRINGIFY(x)
-#define SHOW_MACRO(x) _Pragma(TOSTRING(message(#x " → " TOSTRING(x))))
-
-SHOW_MACRO(a)
-SHOW_MACRO(b)
-
-SHOW_MACRO($a)
-SHOW_MACRO($b)
-#endif
-
-#define a 2
-#define b 3
-#assign (ADD) [a + b]
-
-int main(void){
-  printf("2 + 3 = %d\n", ADD);
-  return 0;
-}
diff --git "a/tester/experiment\360\237\226\211/macro_1.c" "b/tester/experiment\360\237\226\211/macro_1.c"
deleted file mode 100644
index 2ba3e31..0000000
--- "a/tester/experiment\360\237\226\211/macro_1.c"
+++ /dev/null
@@ -1,8 +0,0 @@
-#include <stdio.h>
-
-#rt_macro I(x) (x)
-
-int main(void){
-  printf("5: %x" ,I(5));
-  return 0;
-}
diff --git "a/tester/experiment\360\237\226\211/macro_2.c" "b/tester/experiment\360\237\226\211/macro_2.c"
deleted file mode 100644
index 421f176..0000000
--- "a/tester/experiment\360\237\226\211/macro_2.c"
+++ /dev/null
@@ -1,11 +0,0 @@
-#include <stdio.h>
-
-#rt_macro Q(f ,...)(
-  printf(f ,__VA_ARGS__)
-)
-
-int main(void){
-  Q("%x %x %x" ,1 ,2 ,3);
-  putchar('\n');
-  return 0;
-}
diff --git "a/tester/experiment\360\237\226\211/paste_1.c" "b/tester/experiment\360\237\226\211/paste_1.c"
deleted file mode 100644
index d438c6b..0000000
--- "a/tester/experiment\360\237\226\211/paste_1.c"
+++ /dev/null
@@ -1,10 +0,0 @@
-#include <stdio.h>
-
-#define CAT2(a ,b) a##b
-#define STRING_1(a) #a
-#define STRING(a) STRING_1(a)
-
-int main(){
-  printf("ab: %s." ,STRING(CAT2(a ,b)));
-  printf("gcc -E this one: %s." ,STRING(CAT2(a ,+)));
-}
diff --git "a/tester/experiment\360\237\226\211/recursive_define_0.c" "b/tester/experiment\360\237\226\211/recursive_define_0.c"
deleted file mode 100644
index da2919a..0000000
--- "a/tester/experiment\360\237\226\211/recursive_define_0.c"
+++ /dev/null
@@ -1,52 +0,0 @@
-#include <stdio.h>
-
-
-#define X Q()B()(C)
-#define Q() A
-#define A(x) 21
-#define B()
-
-int main(){
-  printf("X %d" ,X);
-}
-
-/*
-
-So cpp truly does walk the token list with `cpp_get_token_1(pfile, &loc);` expanding each token one by one left to right. We should be able to get the desired expansion by adding another layer of macros, so that the argument expansion catches the final A(c).
-
-2025-05-12T02:53:59Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> gcc test_recursive_define_0.c
-test_recursive_define_0.c: In function ‘main’:
-test_recursive_define_0.c:5:13: warning: implicit declaration of function ‘A’ [-Wimplicit-function-declaration]
-    5 | #define Q() A
-      | ^
-test_recursive_define_0.c:4:12: note: in expansion of macro ‘Q’
-    4 | #define X Q()B()(C)
-      | ^
-test_recursive_define_0.c:10:18: note: in expansion of macro ‘X’
-   10 |   printf("X %d" ,X);
-      | ^
-test_recursive_define_0.c:4:19: error: ‘C’ undeclared (first use in this function)
-    4 | #define X Q()B()(C)
-      | ^
-test_recursive_define_0.c:10:18: note: in expansion of macro ‘X’
-   10 |   printf("X %d" ,X);
-      | ^
-test_recursive_define_0.c:4:19: note: each undeclared identifier is reported only once for each function it appears in
-    4 | #define X Q()B()(C)
-      | ^
-test_recursive_define_0.c:10:18: note: in expansion of macro ‘X’
-   10 |   printf("X %d" ,X);
-      | ^
-
-2025-05-12T02:54:17Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> gcc -E -P test_recursive_define_0.c | tail -n 5
-extern int __overflow (FILE *, int);
-
-int main(){
-  printf("X %d" ,A(C));
-}
-
-*?
diff --git "a/tester/experiment\360\237\226\211/recursive_define_1.c" "b/tester/experiment\360\237\226\211/recursive_define_1.c"
deleted file mode 100644
index a7c6335..0000000
--- "a/tester/experiment\360\237\226\211/recursive_define_1.c"
+++ /dev/null
@@ -1,44 +0,0 @@
-#include <stdio.h>
-
-
-#define X Q()B()(C)
-#define Q() A
-#define A(x) 21
-#define B()
-
-#define P(x) x
-
-int main(){
-  printf("X %d" ,P(X));
-}
-
-/*
-2025-05-12T03:00:19Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> /bin/gcc test_recursive_define_1.c
-
-2025-05-12T03:00:51Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> ./a.out
-X 21
-2025-05-12T03:00:55Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> gcc test_recursive_define_1.c
-
-2025-05-12T03:01:03Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> ./a.out
-X 21
-2025-05-12T03:01:05Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> gcc -E -P test_recursive_define_1.c | tail -n 5
-extern int __overflow (FILE *, int);
-
-int main(){
-  printf("X %d" ,21);
-}
-
-2025-05-12T03:01:11Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
->
-*/
diff --git "a/tester/experiment\360\237\226\211/recursive_define_2.c" "b/tester/experiment\360\237\226\211/recursive_define_2.c"
deleted file mode 100644
index ecd87cc..0000000
--- "a/tester/experiment\360\237\226\211/recursive_define_2.c"
+++ /dev/null
@@ -1,58 +0,0 @@
-#include <stdio.h>
-
-
-#define X Q()B()(C)
-#define Q() A
-#define A(x) 21
-#define B()
-
-#define P X
-
-int main(){
-  printf("X %d" ,P);
-}
-
-/*
-  This does not work because it does not cause an argument evaluation, as did test_recursive_define_1.c
-
-2025-05-12T03:01:11Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
-> gcc test_recursive_define_2.c
-test_recursive_define_2.c: In function ‘main’:
-test_recursive_define_2.c:5:13: warning: implicit declaration of function ‘A’ [-Wimplicit-function-declaration]
-    5 | #define Q() A
-      | ^
-test_recursive_define_2.c:4:12: note: in expansion of macro ‘Q’
-    4 | #define X Q()B()(C)
-      | ^
-test_recursive_define_2.c:9:11: note: in expansion of macro ‘X’
-    9 | #define P X
-      | ^
-test_recursive_define_2.c:12:18: note: in expansion of macro ‘P’
-   12 |   printf("X %d" ,P);
-      | ^
-test_recursive_define_2.c:4:19: error: ‘C’ undeclared (first use in this function)
-    4 | #define X Q()B()(C)
-      | ^
-test_recursive_define_2.c:9:11: note: in expansion of macro ‘X’
-    9 | #define P X
-      | ^
-test_recursive_define_2.c:12:18: note: in expansion of macro ‘P’
-   12 |   printf("X %d" ,P);
-      | ^
-test_recursive_define_2.c:4:19: note: each undeclared identifier is reported only once for each function it appears in
-    4 | #define X Q()B()(C)
-      | ^
-test_recursive_define_2.c:9:11: note: in expansion of macro ‘X’
-    9 | #define P X
-      | ^
-test_recursive_define_2.c:12:18: note: in expansion of macro ‘P’
-   12 |   printf("X %d" ,P);
-      | ^
-
-2025-05-12T03:02:36Z[developer]
-Thomas-developer@StanleyPark§/home/Thomas/subu_data/developer/N/developer/experiment§
->
-*/
-
-
diff --git "a/tester/experiment\360\237\226\211/temp" "b/tester/experiment\360\237\226\211/temp"
deleted file mode 100644
index 32b29d9..0000000
--- "a/tester/experiment\360\237\226\211/temp"
+++ /dev/null
@@ -1,9 +0,0 @@
-git mv assign_test_1.c assign_1.c
-git mv assign_test_2.c assign_2.c
-git mv macro_test_1.c macro_1.c
-git mv macro_test_2.c macro_2.c
-git mv recursive_define_0.c recursive_define_0.c
-git mv recursive_define_1.c recursive_define_1.c
-git mv recursive_define_2.c recursive_define_2.c
-git mv RT_CAT_test_1.c RT_CAT_1.c
-git mv va_arg_test_1.c va_arg_1.c
diff --git "a/tester/experiment\360\237\226\211/va_arg_test_1.c" "b/tester/experiment\360\237\226\211/va_arg_test_1.c"
deleted file mode 100644
index 8870054..0000000
--- "a/tester/experiment\360\237\226\211/va_arg_test_1.c"
+++ /dev/null
@@ -1,9 +0,0 @@
-#include <stdio.h>
-
-#define A(...) (__VA_ARGS__)
-
-
-int main(void){
-  printf( "The answer is: %d\n", A(1,2,3) );
-  return 0;
-}
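
Note on the recursive_define experiments above: recursive_define_1.c prints 21 while recursive_define_0.c and recursive_define_2.c fail because passing X through a function-like macro parameter, as P(x) x does, subjects it to argument pre-expansion and a rescan, and that extra pass is where A lands next to (C) and gets invoked. Below is a minimal sketch of the same idea generalized into a variadic helper. The name EXPAND is hypothetical, it is not part of the RT extensions or of these test files, and the sketch assumes only standard C99 preprocessor behavior.

#include <stdio.h>

#define X Q()B()(C)
#define Q() A
#define A(x) 21
#define B()

/* Hypothetical helper: argument pre-expansion plus the rescan of the
   substituted body give the token stream one extra expansion pass,
   so A ends up adjacent to (C) and is invoked. */
#define EXPAND(...) __VA_ARGS__

int main(void){
  printf("X %d\n" ,EXPAND(X));   /* prints "X 21", matching P(X) in recursive_define_1.c */
  return 0;
}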