2010-05-25Store parsed tokens as token list and print all text lines.Carl Worth
Still not doing any macro expansion just yet. But it should be fairly easy from here.
2010-05-25Delete some trailing whitespace.Carl Worth
This pernicious stuff managed to sneak in on us.
2010-05-25Add xtalloc_reference.Carl Worth
Yet another talloc wrapper that should come in handy.
2010-05-25Starting over with the C99 grammar for the preprocessor.Carl Worth
This is a fresh start with a much simpler approach for the flex/bison portions of the preprocessor. This isn't functional yet, (produces no output), but can at least read all of our test cases without any parse errors. The grammar here is based on the grammar provided for the preprocessor in the C99 specification.
2010-05-24Add test for '/', '<<', and '>>' in #if expressions.Carl Worth
These operators have been supported already, but were not covered in existing tests yet. So this test passes already.
2010-05-24Add test of bitwise operators and octal/hexadecimal literals.Carl Worth
This new test covers several features from the last few commits. This test passes already.
2010-05-24Add support for octal and hexadecimal integer literals.Carl Worth
In addition to the decimal literals which we already support. Note that we use strtoll here to get the large-width integers demanded by the specification.
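As a rough illustration of the literal forms involved, here is a minimal Python sketch of the prefix rules that strtoll applies when asked to auto-detect the base. The function name parse_integer_literal is hypothetical and is not part of glcpp; Python integers are also unbounded, so the width concerns mentioned above do not show up here.

```python
def parse_integer_literal(text: str) -> int:
    """Parse a C-style decimal, octal, or hexadecimal integer literal.

    Hypothetical helper for illustration only; the commit itself uses
    strtoll in C.
    """
    lowered = text.lower()
    if lowered.startswith("0x"):
        return int(lowered[2:], 16)   # hexadecimal, e.g. 0x1F
    if len(lowered) > 1 and lowered.startswith("0"):
        return int(lowered[1:], 8)    # octal, e.g. 017
    return int(lowered, 10)           # decimal, e.g. 42
```

With this sketch, "0x1F" parses to 31, "017" to 15, and "42" to 42.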
2010-05-24Switch to intmax_t (rather than int) for #if expressionsCarl Worth
This is what the C99 specification demands. And the GLSL specification says that we should follow the "standard C++" rules for #if condition expressions rather than the GLSL rules, (which only support a 32-bit integer).
2010-05-24Add the '~' operator to the lexer.Carl Worth
This was simply missing before, (and unnoticed since we had no test of the '~' operator).
2010-05-24Implement all operators specified for GLSL #if expressions (with tests).Carl Worth
The operator coverage here is quite complete. The one big thing missing is that we are not yet doing macro expansion in #if lines. This makes the whole support fairly useless, so we plan to fix that shortcoming right away.
2010-05-20Implement #if, #else, #elif, and #endif with tests.Carl Worth
So far the only expression implemented is a single integer literal, but obviously that's easy to extend. Various things including nesting are tested here.
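The nesting behavior tested here can be sketched with a stack of conditional frames. This is a simplified Python model under the same restriction as the commit (each condition is a single integer literal); the function name filter_conditionals and the frame layout are assumptions made for illustration, not the grammar glcpp actually uses.

```python
def filter_conditionals(lines):
    """Return the lines that survive #if/#elif/#else/#endif processing.

    Hypothetical model: every condition is a single integer literal.
    """
    output = []
    # One frame per open #if:
    #   parent: is the enclosing group active?
    #   taken:  has some branch of this #if already been selected?
    #   active: is the current branch being emitted?
    stack = []
    for line in lines:
        words = line.split()
        directive = words[0] if words else ""
        if directive == "#if":
            parent = all(frame["active"] for frame in stack)
            cond = parent and int(words[1]) != 0
            stack.append({"parent": parent, "taken": cond, "active": cond})
        elif directive == "#elif":
            frame = stack[-1]
            cond = (frame["parent"] and not frame["taken"]
                    and int(words[1]) != 0)
            frame["active"] = cond
            frame["taken"] = frame["taken"] or cond
        elif directive == "#else":
            frame = stack[-1]
            frame["active"] = frame["parent"] and not frame["taken"]
            frame["taken"] = True
        elif directive == "#endif":
            stack.pop()
        elif all(frame["active"] for frame in stack):
            output.append(line)
    return output
```

Tracking "taken" separately from "active" is what keeps a later #elif or #else from firing once an earlier branch of the same #if has been chosen.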
2010-05-20Implement (and add test) for token pasting.Carl Worth
This is *very* easy to implement now that macro arguments are pre-expanded.
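To show why pasting becomes simple once a replacement list is just a list of already-substituted tokens, here is a minimal Python sketch. The name paste_tokens and the use of plain strings for tokens are assumptions for illustration; the actual implementation lives in glcpp's C parser.

```python
def paste_tokens(tokens):
    """Join the tokens on either side of each '##' into a single token.

    Hypothetical sketch: tokens are plain strings and macro arguments are
    assumed to have been substituted already.
    """
    out = []
    i = 0
    while i < len(tokens):
        if tokens[i] == "##" and out and i + 1 < len(tokens):
            out[-1] = out[-1] + tokens[i + 1]  # glue right operand onto left
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out
```

For example, the token list one, ##, two becomes the single token onetwo, and chained pastes such as a ## b ## c collapse left to right into abc.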
2010-05-20Pre-expand macro arguments at time of invocation.Carl Worth
Previously, we were using the same lexing stack as we use for macro expansion to also expand macro arguments. Instead, we now do this earlier by simply recursing over the macro invocation's replacement list and constructing a new expanded list, (and pushing only *that* onto the stack). This is simpler, and also allows us to more easily implement token pasting in the future.
2010-05-20Add xtalloc_asprintfCarl Worth
I expect this to be useful in the upcoming implementation of token pasting.
2010-05-20Finish cleaning up whitespace differences.Carl Worth
The last remaining thing here was that when a line ended with a macro, and the parser looked ahead to the newline token, the lexer was printing that newline before the parser printed the expansion of the macro. The fix is simple, just make the lexer tell the parser that a newline is needed, and the parser can wait until reducing a production to print that newline. With this, we now pass the entire test suite with simply "diff -u", so we no longer have any diff options hiding whitespace bugs from us. Hurrah!
2010-05-20Avoid printing a space at the beginning of lines in the output.Carl Worth
This fixes more differences compared to "gcc -E" so removes several cases of erroneously failing test cases. The implementation isn't very elegant, but it is functional.
2010-05-20Fix bug of consuming excess whitespace.Carl Worth
We fix this by moving printing up to the top-level "input" action and tracking whether a space is needed between one token and the next. This fixes all actual bugs in test-suite output, but does leave some tests failing due to differences in the amount of whitespace produced, (which aren't actual bugs per se).
2010-05-20Remove unused function _print_string_listCarl Worth
The only good dead code is non-existing dead code.
2010-05-20Remove "unnecessary" whitespace from some tests.Carl Worth
This whitespace was not part of anything being tested, and it introduces differences (that we don't actually care about) between the output of "gcc -E" and glcpp. Just eliminate this extra whitespace to reduce spurious test-case failures.
2010-05-20Stop ignoring whitespace while testing.Carl Worth
Sometime back the output of glcpp started differing from the output of "gcc -E" in the amount of whitespace emitted. At the time, I switched the test suite to use "diff -w" to ignore this. This was a mistake since it ignores whitespace entirely. (I meant to use "diff -b" which ignores only changes in the amount of whitespace.) So bugs have since been introduced that the test suite doesn't notice. For example, glcpp is producing "twotokens" where it should be producing "two tokens". Let's stop ignoring whitespace in the test suite, which currently introduces lots of failures---some real and some spurious.
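The difference between the two diff options can be approximated by the comparison key each one effectively applies to a line. The Python sketch below is a rough model only; the function names are hypothetical, and real diff has further details this does not capture.

```python
import re


def key_ignore_all_whitespace(line: str) -> str:
    """Approximates 'diff -w': whitespace is dropped entirely."""
    return re.sub(r"\s+", "", line)


def key_ignore_whitespace_amount(line: str) -> str:
    """Approximates 'diff -b': runs of whitespace compare equal to one
    space, and whitespace at the end of the line is ignored."""
    return re.sub(r"\s+", " ", line).rstrip()
```

Under the first key, "twotokens" and "two tokens" compare equal, which is how the bug slipped through. Under the second key they differ, while "two   tokens" and "two tokens" still compare equal.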
2010-05-20Add test (and fix) for a function argument of a macro that expands with a comma.Carl Worth
The fix here is quite simple (and actually only deletes code). When expanding a macro, we don't return a ',' as a unique token type, but simply let it fall through to the generic case.
2010-05-20Add support for commas within parenthesized groups in function arguments.Carl Worth
The specification says that commas within a parenthesized group, (that's not a function-like macro invocation), are passed through literally and not considered argument separators in any outer macro invocation. Add support and a test for this case. This support makes a third occurrence of the same "FUNC_MACRO (" shift/reduce conflict appear, so expect that. This change does introduce a fairly large copy/paste block in the grammar which is unfortunate. Perhaps if I were more clever I'd find a way to share the common pieces between argument and argument_or_comma.
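The rule about commas can be sketched as a depth counter over the tokens between the invocation's outer parentheses. This Python version is illustrative only: split_arguments is a hypothetical name, tokens are plain strings, and glcpp expresses the same idea through grammar productions rather than a loop.

```python
def split_arguments(tokens):
    """Split a macro invocation's argument tokens at top-level commas.

    Hypothetical sketch: 'tokens' holds everything between the outer
    parentheses of the invocation. Commas nested inside inner parentheses
    are kept as ordinary tokens.
    """
    args, current, depth = [], [], 0
    for tok in tokens:
        if tok == "," and depth == 0:
            args.append(current)   # top-level comma ends the argument
            current = []
            continue
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
        current.append(tok)
    args.append(current)
    return args
```

Given the tokens a , ( b , c ) this yields two arguments: the single token a, and the group ( b , c ) with its inner comma preserved. The result is a list of token lists, one per argument.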
2010-05-20Avoid re-expanding a macro name that has once been rejected from expansion.Carl Worth
The specification of the preprocessor in C99 says that when we see a macro name that we are already expanding that we refuse to expand it now, (which we've done for a while), but also that we refuse to ever expand it later if seen in other contexts at which it would be legitimate to expand. We add a test case for that here, and fix it to work. The fix takes advantage of a new token_t value for tokens and argument words along with the recently added IDENTIFIER_FINALIZED token type which instructs the parser to not even look for another expansion.
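A minimal Python sketch of the finalization idea follows, restricted to object-like macros. Tokens are modeled as (text, finalized) pairs; the name expand_with_finalization and this representation are assumptions for illustration and do not mirror glcpp's token_t or IDENTIFIER_FINALIZED in detail.

```python
def expand_with_finalization(tokens, macros, active=()):
    """Expand object-like macros over (text, finalized) token pairs.

    Hypothetical sketch: a macro name met while that macro is already
    being expanded is marked finalized, so later scans never expand it.
    """
    out = []
    for text, finalized in tokens:
        if finalized or text not in macros:
            out.append((text, finalized))
        elif text in active:
            out.append((text, True))       # rejected now, and forever after
        else:
            body = [(t, False) for t in macros[text]]
            out.extend(
                expand_with_finalization(body, macros, active + (text,)))
    return out


macros = {"foo": ["foo", "+", "1"]}
first = expand_with_finalization([("foo", False)], macros)
rescan = expand_with_finalization(first, macros)
```

Here the first pass produces foo + 1 with the inner foo marked as finalized, and the rescan leaves the list unchanged. Without the finalized flag, the rescan would start with an empty active set and expand foo a second time.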
2010-05-19Use new token_list_t rather than string_list_t for macro values.Carl Worth
There's not yet any change in functionality here, (at least according to the test suite). But we now have the option of specifying a type for each string in the token list. This will allow us to finalize an unexpanded macro name so that it won't be subjected to excess expansion later.
2010-05-19Perform "re lexing" on string list values rather than on text.Carl Worth
Previously, we would pass original strings back to the original lexer whenever we needed to re-lex something, (such as an expanded macro or a macro argument). Now, we instead parse the macro or argument originally to a string list, and then re-lex by simply returning each string from this list in turn. We do this in the recently added glcpp_parser_lex function that sits on top of the lower-level glcpp_lex that only deals with text. This doesn't change any behavior (at least according to the existing test suite which all still passes) but it brings us much closer to being able to "finalize" an unexpanded macro as required by the specification.
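The layering described here can be sketched as a wrapper that serves queued, pre-parsed tokens before falling back to the text-level lexer. The Python class below is a loose model: the class and method names are hypothetical, and an iterator of strings stands in for the flex-generated glcpp_lex.

```python
class ParserLexer:
    """Hypothetical stand-in for a parser-level lexer wrapping a text lexer."""

    def __init__(self, text_tokens):
        self._source = iter(text_tokens)  # stands in for the text-level lexer
        self._pending = []                # pre-parsed tokens awaiting re-lexing

    def push_expansion(self, tokens):
        """Queue an expanded macro or argument so it is lexed next."""
        self._pending = list(tokens) + self._pending

    def lex(self):
        """Return the next token, preferring queued tokens over new text."""
        if self._pending:
            return self._pending.pop(0)
        return next(self._source, None)   # None signals end of input
```

Because re-lexed tokens come from a list rather than from text, each one can carry extra information (such as a type) that would be lost if it were turned back into a string.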
2010-05-19Remove unused NEWLINE token.Carl Worth
We fixed the lexer a while back to never return a NEWLINE token, but neglected to clean up this declaration.
2010-05-19Remove unneeded YYLEX_PARAM define.Carl Worth
I'm not sure where this came from, but it's clearly not needed.
2010-05-19Rename yylex to glcpp_parser_lex and give it a glcpp_parser_t* argument.Carl Worth
Much cleaner this way, (and now our custom lex function has access to all the parser state which it will need).
2010-05-19Add a wrapper function around the lexer.Carl Worth
We rename the generated lexer from yylex to glcpp_lex. Then we implement our own yylex function in glcpp-parse.y that calls glcpp_lex. This doesn't change the behavior at all yet, but gives us a place where we can implement alternate lexing in the future. (We want this because instead of re-lexing from strings for macro expansion, we want to lex from pre-parsed token lists. We need this so that when we terminate recursion due to an already active macro expansion, we can ensure that that symbol never gets expanded again later.)
2010-05-19Like previous fix, but for object-like macros (and add a test).Carl Worth
The support for an object-like macro within a macro-invocation argument was also implemented at one level too high in the grammar. Fortunately, this is a very simple fix.
2010-05-19Fix bug as in previous fix, but with multi-token argument.Carl Worth
The previous fix added FUNC_MACRO to a production one higher in the grammar than it should have. So it prevented a FUNC_MACRO from appearing as part of a multi-token argument rather than just alone as an argument. Fix this (and add a test).
2010-05-19Fix bug (and test) for an invocation using macro name as a non-macro argumentCarl Worth
This adds a second shift/reduce conflict to our grammar. It's basically the same conflict we had previously, (deciding to shift a '(' after a FUNC_MACRO) but this time in the "argument" context rather than the "content" context. It would be nice to not have these, but I think they are unavoidable (without a lot of pain at least) given the preprocessor specification.
2010-05-19Fix bug (and add tests) for a function-like macro defined as itself.Carl Worth
This case worked previously, but broke in the recent rewrite of function-like macro expansion. The recursion was still terminated correctly, but any parenthesized expression after the macro name was still being swallowed even though the identifier was not being expanded as a macro. The fix is to notice earlier that the identifier is an already-expanding macro. We let the lexer know this through the classify_token function so that an already-expanding macro is lexed as an identifier, not a FUNC_MACRO.
2010-05-18Rewrite macro handling to support function-like macro invocation in macro valuesCarl Worth
The rewrite here discards the functions that did direct, recursive expansion of macro values. Instead, the parser now pushes the macro definition string over to a stack of buffers for the lexer. This way, macro expansion gets access to all parsing machinery. This isn't a small change, but the result is simpler than before (I think). It passes the entire test suite, including the four tests added with the previous commit that were failing before.
2010-05-18Add several tests where the defined value of a macro is (or looks like) a macroCarl Worth
Many of these look quite similar to existing tests that are handled correctly, yet none of these work. For example, in test 30 we have a simple non-function macro "foo" that is defined as "bar(baz(success))" and obviously non-function macro expansion has been working for a long time. Similarly, if we had text of "bar(baz(success))" it would be expanded correctly as well. But when this otherwise functioning text appears as the body of a macro, things don't work at all. This is pointing out a fundamental problem with the current approach. The current code does a recursive expansion of a macro definition, but this doesn't involve the parsing machinery, so it can't actually handle things like an arbitrary nesting of parentheses. The fix will require the parser to stuff macro values back into the lexer to get at all of the existing machinery when expanding macros.
2010-05-17Fix (and add test for) function-like macro invocation with newlines.Carl Worth
The test has a newline before the left parenthesis, and newlines to separate the parentheses from the argument. The fix involves more state in the lexer to only return a NEWLINE token when terminating a directive. This is very similar to our previous fix with extra lexer state to only return the SPACE token when it would be significant for the parser. With this change, the exact number and positioning of newlines in the output is now different compared to "gcc -E" so we add a -B option to diff when testing to ignore that.
2010-05-17Expect 1 shift/reduce conflict.Carl Worth
The most recent fix to the parser introduced a shift/reduce conflict. We document this conflict here, and tell bison that it need not report it (since I verified that it's being resolved in the direction desired). For the record, I did write additional lexer code to eliminate this conflict, but it was quite fragile, (would not accept a newline between a function-like macro name and the left parenthesis, for example).
2010-05-17Fix bug (and add test) for a function-like-macro appearing as a non-macro.Carl Worth
That is, when a function-like macro appears in the content without parentheses it should be accepted and passed on through, (previously the parser was regarding this as a syntax error).
2010-05-17Add test and fix bug leading to infinite recursion.Carl Worth
The test case here is simply "#define foo foo" and "#define bar foo" and then attempting to expand "bar". Previously, our termination condition for the recursion was overly simple---just looking for the single identifier that began the expansion. We now fix this to maintain a stack of identifiers and terminate when any one of them occurs in the replacement list.
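The termination rule can be sketched in a few lines of Python for object-like macros. The function name expand and the dictionary of token lists are assumptions made for illustration; they model the idea of a stack of active identifiers rather than glcpp's actual data structures.

```python
def expand(tokens, macros, active=()):
    """Expand object-like macros, refusing any macro already being expanded.

    Hypothetical sketch: 'active' plays the role of the stack of
    identifiers whose expansion is currently in progress.
    """
    out = []
    for tok in tokens:
        if tok in macros and tok not in active:
            out.extend(expand(macros[tok], macros, active + (tok,)))
        else:
            out.append(tok)   # not a macro, or already expanding: pass through
    return out


# The test case from the commit: "#define foo foo" and "#define bar foo".
macros = {"foo": ["foo"], "bar": ["foo"]}
result = expand(["bar"], macros)
```

Expanding bar pushes bar, then foo, onto the active stack; the inner foo is then found on the stack and passed through, so the result is the single token foo. Checking only against the identifier that began the expansion (bar) would never stop the foo-to-foo cycle.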
2010-05-14Fix two whitespace bugs in the lexer.Carl Worth
The first bug was not allowing whitespace between '#' and the directive name. The second bug was swallowing a terminating newline along with any trailing whitespace on a line. With these two fixes, and the previous commit to stop emitting SPACE tokens, the recently added extra-whitespace test now passes.
2010-05-14Don't return SPACE tokens unless strictly needed.Carl Worth
This reverts the unconditional return of SPACE tokens from the lexer from commit 48b94da0994b44e41324a2419117dcd81facce8b. That commit seemed useful because it kept the lexer simpler, but the presence of SPACE tokens is causing lots of extra complication for the parser itself, (redundant productions other than whitespace differences, several productions buggy in the case of extra whitespace, etc.) Of course, we'd prefer to never have any whitespace token, but that's not possible with the need to distinguish between "#define foo()" and "#define foo ()". So we'll accept a little bit of pain in the lexer, (enough state to support this special-case token), in exchange for keeping most of the parser blissfully ignorant of whether tokens are separated by whitespace or not. This change does mean that our output now differs from that of "gcc -E", but only in whitespace. So we test with "diff -w" now to ignore those differences.
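The one place where whitespace stays significant can be illustrated with a small Python sketch that classifies a #define line by whether a '(' follows the macro name immediately. The regular expression and the name classify_define are hypothetical and far simpler than the lexer state the commit describes.

```python
import re

# Hypothetical pattern: '#', optional spaces, 'define', the macro name,
# then an optional '(' that must follow the name with no space between.
DEFINE = re.compile(r"#\s*define\s+([A-Za-z_]\w*)(\()?")


def classify_define(line):
    """Return (name, kind) for a #define line, or None if it is not one."""
    match = DEFINE.match(line)
    if match is None:
        return None
    kind = "function-like" if match.group(2) else "object-like"
    return match.group(1), kind
```

With this sketch, "#define foo()" is classified as function-like, while "#define foo ()" is an object-like macro whose replacement text happens to be "()".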
2010-05-14Add test with extra whitespace in macro definitions and invocations.Carl Worth
This whitespace is not dealt with in an elegant way yet so this test does not pass currently.
2010-05-14Provide implementation for macro arguments containing parentheses.Carl Worth
We were correctly parsing this already, but simply not returning any value (for no good reason). Fortunately the fix is quite simple. This makes the test added in the previous commit now pass.
2010-05-14Add test invoking a macro with an argument containing (non-macro) parentheses.Carl Worth
The macro invocation is defined to consume all text between a set of matched parentheses. We previously tested for inner parentheses from a nested function-like macro invocation. Here we test for inner parentheses occurring on their own, (not part of another macro invocation).
2010-05-14Fix expansion of composited macros.Carl Worth
This is a case such as "foo(bar(x))". The recently added test for this now passes.
2010-05-14Add test for composed invocation of function-like macros.Carl Worth
This is a case like "foo(bar(x))" where both foo and bar are defined function-like macros. This is not yet parsed correctly so this test fails.
2010-05-14Eliminate a shift/reduce conflict.Carl Worth
By simply allowing for the argument_list production to be empty rather than the lower-level argument production to be empty.
2010-05-14Support macro invocations with multiple tokens for a single argument.Carl Worth
We provide for this by changing the value of the argument-list production from a list of strings (string_list_t) to a new data-structure that holds a list of lists of strings (argument_list_t).
2010-05-14Add test for function-like macro invocations with multiple-token arguments.Carl Worth
These are not yet parsed correctly, so these tests fail.
2010-05-14Make macro-expansion productions create string-list values rather than printingCarl Worth
Then we print the final string list up at the top-level content production along with all other printing. Additionally, having macro-expansion productions that create values will make it easier to solve problems like composed function-like macro invocations in the future.