summaryrefslogtreecommitdiff
path: root/tests
AgeCommit message (Collapse)Author
2010-06-02Fix multi-line comment regular expression to handle (non) nested comments.Carl Worth
Ken reminded me of a couple cases that I should be testing. These are the non-nestedness of things that look like nested comments as well as potentially tricky things like "/*/" and "/*/*/". The (non) nested comment case was not working in the case of the comment terminator with multiple '*' characters. We fix this by not considering a '*' as the "non-slash" to terminate a sequence of '*' characters within the comment. We also fix the final match of the terminator to use '+' rather than '*' to require the presence of a final '*' character in the comment terminator.
2010-06-01Implement comment handling in the lexer (with test).Carl Worth
We support both single-line (//) and multi-line (/* ... */) comments and add a test for this, (trying to stress the rules just a bit by embedding one comment delimiter into a comment delimited with the other style, etc.). To keep the test suite passing we do now discard any output lines from glcpp that consist only of spacing, (in addition to blank lines as previously). We also discard any initial whitespace from gcc output. In neither case should the absence or presence of this whitespace affect correctness.
2010-06-01Fix #if-skipping to *really* skip the skipped group.Carl Worth
Previously we were avoiding printing within a skipped group, but we were still evluating directives such as #define and #undef and still emitting diagnostics for things such as macro calls with the wrong number of arguments. Add a test for this and fix it with a high-priority rule in the lexer that consumes the skipped content.
2010-05-29Add killer test case from the C99 specification.Carl Worth
Happily, this passes now, (since many of the previously added test cases were extracted from this one).
2010-05-29Add test and fix bugs with multiple token-pasting on the same line.Carl Worth
The list replacement when token pasting was broken, (failing to properly update the list's tail pointer). Also, memory management when pasting was broken, (modifying the original token's string which would cause problems with multiple calls to a macro which pasted a literal string). We didn't catch this with previous tests because they only pasted argument values.
2010-05-29Fix pass-through of '=' and add a test for it.Carl Worth
Previously '=' was not included in our PUNCTUATION regeular expression, but it *was* excldued from our OTHER regular expression, so we were getting the default (and hamful) lex action of just printing it. The test we add here is named "punctuator" with the idea that we can extend it as needed for other punctuator testing.
2010-05-28Perform macro by replacing tokens in original list.Carl Worth
We take the results of macro expansion and splice them into the original token list over which we are iterating. This makes it easy for function-like macro invocations to find their arguments since they are simply subsequent tokens on the list. This fixes the recently-introduced regressions (tests 55 and 56) and also passes new tests 60 and 61 introduced to strees this feature, (with macro-argument parentheses split between a macro value and the textual input).
2010-05-28Stop interrupting the test suite at the first failure.Carl Worth
This behavior was useful when starting the implementation over ("take-2") where the whole test suite was failing. This made it easy to focus on one test at a time and get each working. More recently, we got the whole suite working, so we don't need this feature anymore. And in the previous commit, we regressed a couple of tests, so it's nice to be able to see all the failures with a single run of the suite.
2010-05-27Remove blank lines from output files before comparing.Carl Worth
Recently I'm seeing cases where "gcc -E" mysteriously omits blank lines, (even though it prints the blank lines in other very similar cases). Rather than trying to decipher and imitate this, just get rid of the blank lines. This approach with sed to kill the lines before the diff is better than "diff -B" since when there is an actual difference, the presence of blank lines won't make the diff harder to read.
2010-05-27Implement token pasting of integers.Carl Worth
To do this correctly, we change the lexer to lex integers as string values, (new token type of INTEGER_STRING), and only convert to integer values when evaluating an expression value. Add a new test case for this, (which does pass now).
2010-05-27Add placeholder tokens to support pasting with empty arguments.Carl Worth
Along with a passing test to verify that this works.
2010-05-27Provide support for empty arguments in macro invocations.Carl Worth
For this we always add a new argument to the argument list as soon as possible, without waiting until we see some argument token. This does mean we need to take some extra care when comparing the number of arguments with the number of expected arguments. In addition to matching numbers, we also support one (empty) argument when zero arguments are expected. Add a test case here for this, which does pass.
2010-05-27Avoid treating an expanded comma as an argument separator.Carl Worth
That is, a function-like invocation foo(x) is valid as a single-argument invocation even if 'x' is a macro that expands into a value with a comma. Add a new COMMA_FINAL token type to handle this, and add a test for this case, (which passes).
2010-05-26Add support (and test) for an object-to-function chain with the parens in ↵Carl Worth
the content. That is, the following case: #define foo(x) (x) #define bar bar(baz) which now works with this (ugly) commit. I definitely want to come up with something cleaner than this.
2010-05-26Treat newlines as space when invoking a function-like macro invocation.Carl Worth
This adds three new pieces of state to the parser, (is_control_line, newline_as_space, and paren_count), and a large amount of messy code. I'd definitely like to see a cleaner solution for this. With this fix, the "define-func-extra-newlines" now passes so we put it back to test #26 where it was originally (lately it has been known as test #55). Also, we tweak test 25 slightly. Previously this test was ending a file function-like macro name that was not actually a macro (not followed by a left parenthesis). As is, this fix was making that test fail because the text_line production expects to see a terminating NEWLINE, but that NEWLINE is now getting turned into a SPACE here. This seems unlikely to be a problem in the wild, (function macros being used in a non-macro sense seems rare enough---but more than likely they won't happen at the end of a file). Still, we document this shortcoming in the README.
2010-05-26Implement (and test) support for macro expansion within conditional expressions.Carl Worth
To do this we have split the existing "HASH_IF expression" into two productions: First is HASH_IF pp_tokens which simply constructs a list of tokens. Then, with that resulting token list, we first evaluate all DEFINED operator tokens, then expand all macros, and finally start lexing from the resulting token list. This brings us to the second production, IF_EXPANDED expression This final production works just like our previous "HASH_IF expression", evaluating a constant integer expression. The new test (54) added for this case now passes.
2010-05-26Fix lexing of "defined" as an operator, not an identifier.Carl Worth
Simply need to move the rule for IDENTIFIER to be after "defined" and everything is happy. With this change, tests 50 through 53 all pass now.
2010-05-26Implement #if and friends.Carl Worth
With this change, tests 41 through 49 all pass. (The defined operator appears to be somehow broken so that test 50 doesn't pass yet.)
2010-05-26Defer test 26 until much later (to test 55).Carl Worth
Supporting embedded newlines in a macro invocation is going to be tricky with our current approach to lexing and parsing. Since this isn't really an important feature for us, we can defer this until more important things are resolved. With this test out of the way, tests 27 through 31 are passing.
2010-05-25Collapse multiple spaces in input down to a single space.Carl Worth
This is what gcc does, and it's actually less work to do this. Previously we were having to save the contents of space tokens as a string, but we don't need to do that now. We extend test #0 to exercise this feature here.
2010-05-25Add a test #0 to ensure that we don't do any inadvertent token pasting.Carl Worth
This simply ensures that spaces in input line are preserved.
2010-05-25Implement expansion of object-like macros.Carl Worth
For this we add an "active" string_list_t to the parser. This makes the current expansion_list_t in the parser obsolete, but we don't remove that yet. With this change we can now start passing some actual tests, so we turn on real testing in the test suite again. I expect to implement things more or less in the same order as before, so the test suite now halts on first error. With this change the first 8 tests in the suite pass, (object-like macros with chaining and recursion).
2010-05-25Make the lexer pass whitespace through (as OTHER tokens) for text lines.Carl Worth
With this change, we can recreate the original text-line input exactly. Previously we were inserting a space between every pair of tokens so our output had a lot more whitespace than our input. With this change, we can drop the "-b" option to diff and match the input exactly.
2010-05-25Store parsed tokens as token list and print all text lines.Carl Worth
Still not doing any macro expansion just yet. But it should be fairly easy from here.
2010-05-25Starting over with the C99 grammar for the preprocessor.Carl Worth
This is a fresh start with a much simpler approach for the flex/bison portions of the preprocessor. This isn't functional yet, (produces no output), but can at least read all of our test cases without any parse errors. The grammar here is based on the grammar provided for the preprocessor in the C99 specification.
2010-05-24Add test for '/', '<<', and '>>' in #if expressions.Carl Worth
These operators have been supported already, but were not covered in existing tests yet. So this test passes already.
2010-05-24Add test of bitwise operators and octal/hexadecimal literals.Carl Worth
This new test covers several features from the last few commits. This test passes already.
2010-05-24Implement all operators specified for GLSL #if expressions (with tests).Carl Worth
The operator coverage here is quite complete. The one big thing missing is that we are not yet doing macro expansion in #if lines. This makes the whole support fairly useless, so we plan to fix that shortcoming right away.
2010-05-20Implement #if, #else, #elif, and #endif with tests.Carl Worth
So far the only expression implemented is a single integer literal, but obviously that's easy to extend. Various things including nesting are tested here.
2010-05-20Remove "unnecessary" whitespace from some tests.Carl Worth
This whitespace was not part of anything being tested, and it introduces differences (that we don't actually care about) between the output of "gcc -E" and glcpp. Just eliminate this extra whitespace to reduce spurious test-case failures.
2010-05-20Stop ignoring whitespace while testing.Carl Worth
Sometime back the output of glcpp started differing from the output of "gcc -E" in the amount of whitespace in emitted. At the time, I switched the test suite to use "diff -w" to ignore this. This was a mistake since it ignores whitespace entirely. (I meant to use "diff -b" which ignores only changes in the amount of whitespace.) So bugs have since been introduced that the test suite doesn't notice. For example, glcpp is producing "twotokens" where it should be producing "two tokens". Let's stop ignoring whitespace in the test suite, which currently introduces lots of failures---some real and some spurious.
2010-05-20Add test (and fix) for a function argument of a macro that expands with a comma.Carl Worth
The fix here is quite simple (and actually only deletes code). When expanding a macro, we don't return a ',' as a unique token type, but simply let it fall through to the generic case.
2010-05-20Add support for commas within parenthesized groups in function arguments.Carl Worth
The specification says that commas within a parenthesized group, (that's not a function-like macro invocation), are passed through literally and not considered argument separators in any outer macro invocation. Add support and a test for this case. This support makes a third occurrence of the same "FUNC_MACRO (" shift/reduce conflict appear, so expect that. This change does introduce a fairly large copy/paste block in the grammar which is unfortunate. Perhaps if I were more clever I'd find a way to share the common pieces between argument and argument_or_comma.
2010-05-20Avoid re-expanding a macro name that has once been rejected from expansion.Carl Worth
The specification of the preprocessor in C99 says that when we see a macro name that we are already expanding that we refuse to expand it now, (which we've done for a while), but also that we refuse to ever expand it later if seen in other contexts at which it would be legitimate to expand. We add a test case for that here, and fix it to work. The fix takes advantage of a new token_t value for tokens and argument words along with the recently added IDENTIFIER_FINALIZED token type which instructs the parser to not even look for another expansion.
2010-05-19Like previous fix, but for object-like macros (and add a test).Carl Worth
The support for an object-like amcro within a macro-invocation argument was also implemented at one level too high in the grammar. Fortunately, this is a very simple fix.
2010-05-19Fix bug as in previous fix, but with multi-token argument.Carl Worth
The previous fix added FUNC_MACRO to a production one higher in teh grammar than it should have. So it prevented a FUNC_MACRO from appearing as part of a mutli-token argument rather than just alone as an argument. Fix this (and add a test).
2010-05-19Fix bug (and test) for an invocation using macro name as a non-macro argumentCarl Worth
This adds a second shift/reduce conflict to our grammar. It's basically the same conflict we had previously, (deciding to shift a '(' after a FUNC_MACRO) but this time in the "argument" context rather than the "content" context. It would be nice to not have these, but I think they are unavoidable (withotu a lot of pain at least) given the preprocessor specification.
2010-05-19Fix bug (and add tests) for a function-like macro defined as itself.Carl Worth
This case worked previously, but broke in the recent rewrite of function- like macro expansion. The recursion was still terminated correctly, but any parenthesized expression after the macro name was still being swallowed even though the identifier was not being expanded as a macro. The fix is to notice earlier that the identifier is an already-expanding macro. We let the lexer know this through the classify_token function so that an already-expanding macro is lexed as an identifier, not a FUNC_MACRO.
2010-05-18Add several tests where the defined value of a macro is (or looks like) a macroCarl Worth
Many of these look quite similar to existing tests that are handled correctly, yet none of these work. For example, in test 30 we have a simple non-function macro "foo" that is defined as "bar(baz(success))" and obviously non-function macro expansion has been working for a long time. Similarly, if we had text of "bar(baz(success))" it would be expanded correctly as well. But when this otherwise functioning text appears as the body of a macro, things don't work at all. This is pointing out a fundamental problem with the current approach. The current code does a recursive expansion of a macro definition, but this doesn't involve the parsing machinery, so it can't actually handle things like an arbitrary nesting of parentheses. The fix will require the parser to stuff macro values back into the lexer to get at all of the existing machinery when expanding macros.
2010-05-17Fix (and add test for) function-like macro invocation with newlines.Carl Worth
The test has a newline before the left parenthesis, and newlines to separate the parentheses from the argument. The fix involves more state in the lexer to only return a NEWLINE token when termniating a directive. This is very similar to our previous fix with extra lexer state to only return the SPACE token when it would be significant for the parser. With this change, the exact number and positioning of newlines in the output is now different compared to "gcc -E" so we add a -B option to diff when testing to ignore that.
2010-05-17Fix bug (and add test) for a function-like-macro appearing as a non-macro.Carl Worth
That is, when a function-like macro appears in the content without parentheses it should be accepted and passed on through, (previously the parser was regarding this as a syntax error).
2010-05-17Add test and fix bug leading to infinite recursion.Carl Worth
The test case here is simply "#define foo foo" and "#define bar foo" and then attempting to expand "bar". Previously, our termination condition for the recursion was overly simple---just looking for the single identifier that began the expansion. We now fix this to maintain a stack of identifiers and terminate when any one of them occurs in the replacement list.
2010-05-14Don't return SPACE tokens unless strictly needed.Carl Worth
This reverts the unconditional return of SPACE tokens from the lexer from commit 48b94da0994b44e41324a2419117dcd81facce8b . That commit seemed useful because it kept the lexer simpler, but the presence of SPACE tokens is causing lots of extra complication for the parser itself, (redundant productions other than whitespace differences, several productions buggy in the case of extra whitespace, etc.) Of course, we'd prefer to never have any whitespace token, but that's not possible with the need to distinguish between "#define foo()" and "#define foo ()". So we'll accept a little bit of pain in the lexer, (enough state to support this special-case token), in exchange for keeping most of the parser blissffully ignorant of whether tokens are separated by whitespace or not. This change does mean that our output now differs from that of "gcc -E", but only in whitespace. So we test with "diff -w now to ignore those differences.
2010-05-14Add test with extra whitespace in macro defintions and invocations.Carl Worth
This whitespace is not dealt with in an elegant way yet so this test does not pass currently.
2010-05-14Add test invoking a macro with an argument containing (non-macro) parentheses.Carl Worth
The macro invocation is defined to consume all text between a set of matched parentheses. We previously tested for inner parentheses from a nested function-like macro invocation. Here we test for inner parentheses occuring on their own, (not part of another macro invocation).
2010-05-14Add test for composed invocation of function-like macros.Carl Worth
This is a case like "foo(bar(x))" where both foo and bar are defined function-like macros. This is not yet parsed correctly so this test fails.
2010-05-14Add test for function-like macro invocations with multiple-token arguments.Carl Worth
These are not yet parsed correctly, so these tests fail.
2010-05-14Add test where a macro formal parameter is the same as an existing macro.Carl Worth
This is a well-defined condition, but something that currently trips up the implementation. Should be easy to fix.
2010-05-14Add tests exercising substitution of arguments in function-like macros.Carl Worth
This capability is the only thing that makes function-like macros interesting. This isn't supported yet so these tests fail for now.
2010-05-14Add some whitespace variations to test 15.Carl Worth
This shows two minor failures in our current parsing (resulting in whitespace-only changes, oso not that significant): 1. We are inserting extra whitespace between tokens not originally separated by whitespace in the replacement list of a macro definition. 2. We are swallowing whitespace separating tokens in the general content.