|
We support both single-line (//) and multi-line (/* ... */) comments
and add a test for this, (trying to stress the rules just a bit by
embedding one comment delimiter into a comment delimited with the
other style, etc.).
To keep the test suite passing we now discard any output lines from
glcpp that consist only of spacing, (in addition to blank lines as
previously). We also discard any initial whitespace from gcc output.
In neither case should the absence or presence of this whitespace
affect correctness.
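For illustration, input along these lines (not necessarily the literal
test content) exercises the delimiter-embedding rules:

    /* this comment contains // a single-line delimiter */
    // and this one contains /* an opening delimiter
    still outside any comment

The // inside the multi-line comment is inert, and the /* inside the
single-line comment dies with the end of that line.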
|
|
Previously we were avoiding printing within a skipped group, but we
were still evaluating directives such as #define and #undef and still
emitting diagnostics for things such as macro calls with the wrong
number of arguments.
Add a test for this and fix it with a high-priority rule in the lexer
that consumes the skipped content.
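As a sketch of the failure mode (hypothetical input, not the literal
test case), everything inside the skipped group below must be consumed
without side effects:

    #define foo(a,b) a b
    #if 0
    foo(just, one, too, many)
    #define bar baz
    #endif
    bar

Previously the mismatched invocation of foo would be diagnosed and bar
would actually be defined, so the final 'bar' would wrongly expand.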
|
|
The take-2 branch started over with a new grammar based directly on
the grammar from the C99 specification. It doesn't try to capture
things like balanced sets of parentheses for macro arguments in the
grammar. Instead, it merely captures things as token lists and then
performs operations like parsing arguments and expanding macros on
those lists.
We merge it here since it's currently behaving better, (passing the
entire test suite). But the code base has proven quite fragile
really. Several of the recently added test cases required additional
special cases in the take-2 branch while working trivially on master.
So this merge point may be useful in the future, since we might have a
cleaner code base by coming back to the state before this merge and
fixing it, rather than accepting all the fragile
imperative/list-munging code from the take-2 branch.
|
|
The 071-punctuator test is failing only trivially (a whitespace-only difference).
And the 072-token-pasting-same-line.c test passes just fine here, (more
evidence perhaps that the approach in take-2 is more trouble than it's
worth?).
The 099-c99-example test case is the inspiration for much of the rest
of the test suite. It amazingly passes on the take-2 branch, but
doesn't pass here yet.
|
|
Happily, this passes now, (since many of the previously added test
cases were extracted from this one).
|
|
The list replacement when token pasting was broken, (failing to
properly update the list's tail pointer). Also, memory management when
pasting was broken, (modifying the original token's string which would
cause problems with multiple calls to a macro which pasted a literal
string). We didn't catch this with previous tests because they only
pasted argument values.
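A sketch of the newly-covered case (hypothetical macro name): a macro
that pastes a literal token, invoked more than once:

    #define suffixed(x) x ## _tail
    suffixed(a)
    suffixed(b)

With the old code the second invocation saw the string already modified
by the first, since pasting wrote into the original token rather than
into a fresh one.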
|
|
Previously '=' was not included in our PUNCTUATION regular expression,
but it *was* excluded from our OTHER regular expression, so we were
getting the default (and harmful) lex action of just printing it.
The test we add here is named "punctuator" with the idea that we can
extend it as needed for other punctuator testing.
|
|
These tests were recently fixed on the take-2 branch, but will require
additional work before they will pass here.
|
|
These two tests were tricky to make work on take-2, but happen to
already be working here.
|
|
This isn't a problem here, but on the take-2 branch, it was trickier
at one point to make a non-macro work when it was the last token of the file.
So we use the simpler test case here and defer the other case until
later.
|
|
To match what we have done on the take-2 branch to these test cases.
|
|
We take the results of macro expansion and splice them into the
original token list over which we are iterating. This makes it easy
for function-like macro invocations to find their arguments since they
are simply subsequent tokens on the list.
This fixes the recently-introduced regressions (tests 55 and 56) and
also passes new tests 60 and 61 introduced to stress this feature,
(with macro-argument parentheses split between a macro value and the
textual input).
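The new tests stress input along these lines (hypothetical names),
where the splicing makes argument-gathering fall out naturally:

    #define bar(x) (x)
    #define foo bar(
    foo baz)

After 'foo' expands to 'bar(', the invocation's remaining tokens come
straight from the list being iterated over.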
|
|
We previously had a confusing thing where _expand_token_onto would
return a non-zero value to indicate that the caller should then call
_expand_function_onto. It's much cleaner for _expand_token_onto to
just do what's needed and call the necessary function.
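A rough sketch of the cleaner shape (simplified; the actual glcpp
functions and types differ, and _lookup_macro is a stand-in name for
the real hash-table lookup):

    static void
    _expand_token_onto (parser_t *parser, token_t *token, token_list_t *result)
    {
        macro_t *macro = _lookup_macro (parser, token);

        if (macro == NULL) {
            /* Not a macro: pass the token through unchanged. */
            _token_list_append (result, token);
            return;
        }

        if (macro->is_function) {
            /* Handled here now, rather than signaled to the caller. */
            _expand_function_onto (parser, token, result);
            return;
        }

        _token_list_append_list (result, macro->replacement);
    }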
|
|
This behavior was useful when starting the implementation over
("take-2") where the whole test suite was failing. This made it easy
to focus on one test at a time and get each working.
More recently, we got the whole suite working, so we don't need this
feature anymore. And in the previous commit, we regressed a couple of
tests, so it's nice to be able to see all the failures with a single
run of the suite.
|
|
content."
This reverts commit 7db2402a8009772a3f10d19cfc7f30be9ee79295
It doesn't revert the new test case from that commit, just the
extremely ugly second-pass implementation.
|
|
Recently I'm seeing cases where "gcc -E" mysteriously omits blank
lines, (even though it prints the blank lines in other very similar
cases). Rather than trying to decipher and imitate this, just get rid
of the blank lines.
This approach with sed to kill the lines before the diff is better
than "diff -B" since when there is an actual difference, the presence
of blank lines won't make the diff harder to read.
|
|
This test was tricky to make pass in the take-2 branch. It ends up
passing already here with no additional effort, (since we are lexing
integers as string-valued tokens except when in the ST_IF state in the
lexer anyway).
|
|
To do this correctly, we change the lexer to lex integers as string values,
(new token type of INTEGER_STRING), and only convert to integer values when
evaluating an expression value.
Add a new test case for this, (which does pass now).
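A sketch of the distinction (not the literal test case):

    #define NUMBER 12
    NUMBER
    #if NUMBER == 12
    yes
    #endif

On the text line, NUMBER is emitted via its string value "12"; only
when evaluating the #if expression is that string converted to an
integer.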
|
|
Along with a passing test to verify that this works.
|
|
This case was recently solved on the take-2 branch.
|
|
For this we always add a new argument to the argument list as soon as
possible, without waiting until we see some argument token. This does
mean we need to take some extra care when comparing the number of
arguments with the number of expected arguments. In addition to
matching numbers, we also support one (empty) argument when zero
arguments are expected.
Add a test case here for this, which does pass.
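That is, input like the following (hypothetical names) is accepted:

    #define zero() success
    zero()

The invocation parses as a single empty argument, which we then allow
to match a macro expecting zero arguments.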
|
|
This just makes these functions easier to understand all around. In
the case of _token_list_append_list this is an actual bug fix, (where
appending an empty list onto a non-empty list would previously scramble
the tail pointer of the original list).
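A sketch of the fix (field and function names simplified from the
actual glcpp code):

    static void
    _token_list_append_list (token_list_t *list, token_list_t *tail)
    {
        /* Appending an empty list must leave the original list,
         * including its tail pointer, completely untouched. */
        if (tail == NULL || tail->head == NULL)
            return;

        if (list->head == NULL)
            list->head = tail->head;
        else
            list->tail->next = tail->head;

        list->tail = tail->tail;
    }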
|
|
This case was tricky on the take-2 branch. It happens to be passing already
here.
|
|
That is, a function-like invocation foo(x) is valid as a
single-argument invocation even if 'x' is a macro that expands into a
value with a comma. Add a new COMMA_FINAL token type to handle this,
and add a test for this case, (which passes).
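For example (hypothetical names):

    #define single(x) [x]
    #define pair a,b
    single(pair)

The invocation single(pair) has exactly one argument; the comma
produced by expanding 'pair' becomes COMMA_FINAL so it no longer
separates arguments.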
|
|
the content.
That is, the following case:
#define foo(x) (x)
#define bar
bar(baz)
which now works with this (ugly) commit.
I definitely want to come up with something cleaner than this.
|
|
The define-chain-obj-to-func-parens-in-text test passes here while the
if-with-macros test fails.
|
|
This adds three new pieces of state to the parser, (is_control_line,
newline_as_space, and paren_count), and a large amount of messy
code. I'd definitely like to see a cleaner solution for this.
With this fix, the "define-func-extra-newlines" now passes so we put
it back to test #26 where it was originally (lately it has been known
as test #55).
Also, we tweak test 25 slightly. Previously this test ended the file
with a function-like macro name that was not actually a macro (not
followed by a left parenthesis). As is, this fix was making that test
fail because the text_line production expects to see a terminating
NEWLINE, but that NEWLINE is now getting turned into a SPACE here.
This seems unlikely to be a problem in the wild, (function macros
being used in a non-macro sense seems rare enough---but more than
likely they won't happen at the end of a file). Still, we document
this shortcoming in the README.
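The test exercises invocations split across lines, along the lines of
(not necessarily the literal test content):

    #define foo(a,b) (a)(b)
    foo(one,
        two)

The newline inside the parenthesized argument list must be treated as
a space rather than terminating the line.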
|
|
This is what I get for using a non-type-safe hash-table implementation.
|
|
To do this we have split the existing "HASH_IF expression" into two
productions:
First is HASH_IF pp_tokens which simply constructs a list of tokens.
Then, with that resulting token list, we first evaluate all DEFINED
operator tokens, then expand all macros, and finally start lexing from
the resulting token list. This brings us to the second production,
IF_EXPANDED expression
This final production works just like our previous "HASH_IF
expression", evaluating a constant integer expression.
The new test (54) added for this case now passes.
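The sort of input this enables (hypothetical names):

    #define FEATURE 1
    #if defined FEATURE && FEATURE > 0
    yes
    #endif

The 'defined FEATURE' operator is evaluated first (so FEATURE is not
macro-expanded there), then the remaining FEATURE expands to 1, and
the resulting token list is lexed and evaluated as a constant
expression.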
|
|
Simply need to move the rule for IDENTIFIER to be after "defined" and
everything is happy.
With this change, tests 50 through 53 all pass now.
|
|
With this change, tests 41 through 49 all pass. (The defined operator
appears to be somehow broken so that test 50 doesn't pass yet.)
|
|
Which makes test 40 now pass.
|
|
Now that we no longer have nested for loops with 'i' and 'j' we can
use the 'node' that we already have.
|
|
All the code referencing these was removed some time ago.
|
|
With this fix, tests 37 - 39 now pass.
|
|
None of these are fundamental---just a few things that haven't been
implemented yet.
|
|
Always better to use proper grammar in our grammar.
|
|
As required by the C99 specification of the preprocessor.
With this fix, tests 33 through 36 now pass.
|
|
This doesn't change any functionality here, but will allow us to make
future changes that were not possible with direct printing.
Specifically, we need to expand macros within macro arguments before
performing argument substitution. And *that* expansion cannot result
in immediate printing.
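For example (hypothetical names):

    #define bar baz
    #define foo(x) (x)
    foo(bar)

The argument 'bar' must expand to 'baz' before substitution into
'(x)', and that intermediate expansion has to land in a token list,
not on the output.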
|
|
With this fix, test 32 no longer recurses infinitely, but now passes.
|
|
Supporting embedded newlines in a macro invocation is going to be
tricky with our current approach to lexing and parsing. Since this
isn't really an important feature for us, we can defer this until more
important things are resolved.
With this test out of the way, tests 27 through 31 are passing.
|
|
This trailing whitespace was coming from macro definitions and from
macro arguments. We fix this with a little extra state in the
token_list. It now remembers the last non-space token added, so that
these can be trimmed off just before printing the list.
With this fix test 23 now passes. Tests 24 and 25 are also passing,
but they probably would have passed before this fix---just that they weren't
being run earlier.
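A sketch of the extra state (simplified from the actual structure):

    typedef struct token_list {
        token_node_t *head;
        token_node_t *tail;           /* last token added */
        token_node_t *non_space_tail; /* last non-space token added */
    } token_list_t;

Just before printing, the list is truncated at non_space_tail,
dropping any trailing space tokens.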
|
|
We're no longer using the expansion stack, so its functions can go
along with most of the body of glcpp_parser_lex that was using it.
|
|
We weren't including this left parenthesis in the argument's token
list so the nested function invocation was not being recognized.
With this fix, tests 21 and 22 now pass.
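The failing shape was roughly (hypothetical names):

    #define bar(x) [x]
    #define foo(y) y
    foo(bar(baz))

When gathering foo's argument, dropping the '(' after 'bar' meant that
expanding the argument never saw a function-like invocation of bar.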
|
|
This causes test 16 to pass. Tests 17-20 are also passing now, (though
they would probably have passed before this change and simply weren't
being run yet).
|
|
This makes tests 16 - 19 pass.
|
|
This is what gcc does, and it's actually less work to do
this. Previously we were having to save the contents of space tokens
as a string, but we don't need to do that now.
We extend test #0 to exercise this feature here.
|
|
This simply ensures that spaces in the input line are preserved.
|
|
This makes test 15 pass and also dramatically simplifies the lexer.
We were previously using a CONTROL state in the lexer to only emit
SPACE tokens when on text lines. But that's not actually what we
want. We need SPACE tokens in the replacement lists as well. Instead
of a lexer state for this, we now simply set a "space_tokens" flag
whenever we start constructing a pp_tokens list and clear the flag
whenever we see a '#' introducing a directive.
Much cleaner this way.
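A sketch of how the flag plays out in the lexer (condensed and with
hypothetical spellings; the real rules carry more state than this):

    "#"     { yyextra->space_tokens = 0; return HASH; }
    [ \t]+  { if (yyextra->space_tokens) return SPACE; }

with the parser setting space_tokens back to 1 whenever it begins
constructing a new pp_tokens list.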
|