- From: Tab Atkins Jr. <jackalmage@gmail.com>
- Date: Wed, 8 May 2013 17:13:32 -0700
- To: Simon Sapin <simon.sapin@kozea.fr>
- Cc: www-style list <www-style@w3.org>
On Tue, Feb 19, 2013 at 10:03 PM, Simon Sapin <simon.sapin@kozea.fr> wrote:
> http://lists.w3.org/Archives/Public/www-style/2013Feb/0278.html
>
> 1. Perhaps for team-legal rather than this WG? When a spec contains a detailed
> algorithm in English, implementing it may look like "translating" it to a
> computer language, similar to translating the spec to another human
> language.
>
> Clarify that implementing is not a "derivative work" forbidden by the W3C
> Document License?
>
>
> http://lists.w3.org/Archives/Public/www-style/2013Feb/0402.html
As noted by others, this is perhaps a Team legal issue, but shouldn't
be an issue for Syntax until/unless they say something about it.
> 2. Possible security issue: Taking the stylesheet’s character encoding from
> the referring document should be same-origin only.
Haven't looked into this yet. Added an issue.
> 3. Editorial: get rid of §3.2.1. "Preprocessing the input stream" by doing
> the same work in the tokenizer?
I haven't tried to do this yet. Is it necessary? You could consider
it part of tokenization, in the same way that "consume a component
value" is part of parsing.
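
For illustration only, here is a rough sketch of what keeping preprocessing as its own step amounts to. It assumes the preprocessing rules are newline normalization (CR LF, lone CR, and FF each become LF) plus replacing U+0000 with U+FFFD; the name `preprocess` is made up for this example and is not from the spec.

```python
import re


def preprocess(text: str) -> str:
    """Normalize an input stream before tokenizing (assumed rules, see above)."""
    # CR LF pairs, lone CR, and FF all collapse to a single LF.
    # The CR LF alternative is listed first so a pair yields one LF, not two.
    text = re.sub(r"\r\n|\r|\f", "\n", text)
    # U+0000 NULL is replaced with U+FFFD REPLACEMENT CHARACTER.
    return text.replace("\x00", "\ufffd")
```

A tokenizer could equally apply the same substitutions each time it consumes a character, which is the "part of tokenization" reading.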
> 4. Editorial: The tokenizer would be nicer (and could be less redundant)
> with a style closer to that of the parser: a bunch of "functions" that call
> each other rather than a state machine. (Not quite "recursive descent"
> though, there is no recursion.)
With the help of some identification/validation functions, I've
eliminated a *ton* of redundancy.  Let me know if you find any
remaining stuff that would benefit from being abstracted out.
> 5. Editorial: use more look-ahead to avoid "reconsuming"?
Most of the reconsuming is just for convenience. If you spot places
where you think I could use lookahead rather than reconsuming (and
which wouldn't violate my "three characters of lookahead, one token of
lookahead" rule), let me know.
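
To make the trade-off concrete, here is a minimal sketch of the two styles side by side. The `Stream` class, its method names, and the digit check are all hypothetical; the only thing taken from the message above is the three-character lookahead cap.

```python
class Stream:
    MAX_LOOKAHEAD = 3  # the "three characters of lookahead" rule

    def __init__(self, text: str) -> None:
        self.text = text
        self.pos = 0

    def peek(self, n: int = 1) -> str:
        """Return up to n upcoming characters without consuming them."""
        if n > self.MAX_LOOKAHEAD:
            raise ValueError("more lookahead than the rule allows")
        return self.text[self.pos:self.pos + n]

    def consume(self) -> str:
        """Consume and return the next character ('' at EOF)."""
        ch = self.text[self.pos:self.pos + 1]
        self.pos += len(ch)
        return ch

    def reconsume(self) -> None:
        """Push the most recently consumed character back onto the stream."""
        if self.pos > 0:
            self.pos -= 1


def starts_digit_by_reconsuming(s: Stream) -> bool:
    # Consume, inspect, then put the character back.
    ch = s.consume()
    if ch:
        s.reconsume()
    return ch.isdigit()


def starts_digit_by_lookahead(s: Stream) -> bool:
    # Same question answered without touching the stream position.
    return s.peek(1).isdigit()
```

Both functions leave the stream where they found it; the lookahead version just never has to undo anything.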
> 6. *-match tokens: maybe add now tokens for !#%+./?@ (each followed by an =
> sign) in addition to the current ~|^$* so that future additions to Selectors
> don’t need to add new tokens. Maybe have a single "match" token with a
> character value (like delim) rather than many tokens.
Could, or we could just wait and add to the parser later. Dunno what's best.
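
As a sketch of the "single match token with a character value" option: the `Match` and `Delim` classes and the function below are invented for illustration, and the sketch deliberately ignores every other tokenizer rule that starts with these characters (hash tokens, at-keywords, numbers, and so on).

```python
from dataclasses import dataclass


@dataclass
class Match:
    value: str  # the character before '=', e.g. '^' for '^='


@dataclass
class Delim:
    value: str


CURRENT_CHARS = set("~|^$*")      # match characters listed as current
PROPOSED_CHARS = set("!#%+./?@")  # extra characters suggested in item 6


def consume_match_or_delim(text, pos, chars=CURRENT_CHARS | PROPOSED_CHARS):
    """Return (token, next_pos) for the character at text[pos]."""
    ch = text[pos]
    # One character of lookahead decides between a match token and a delim.
    if ch in chars and text[pos + 1:pos + 2] == "=":
        return Match(ch), pos + 2
    return Delim(ch), pos + 1
```

With this shape, extending the set later is a one-line change to `chars` rather than a new token type, which is the point of the suggestion; waiting and adding tokens as needed would correspond to passing only `CURRENT_CHARS`.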
> 7. *If* SVG2 wants some of its attributes values to have CSS syntax *but*
> not allow CSS comments, add a "no comment" flag to the tokenizer. Tab and I
> would rather just allow comments, though, if that’s not a web-compat issue.
Up to the SVGWG, but I don't think they need this control.
> 8a. Should EOF in quoted strings or urls not be an error at all, to be
> consistent with the rest of the "unexpected EOF" rules?
I'd be fine with this, but I'd need to check compat again, as I've
forgotten the original details.
> 8b. (Special case of the above) §4.2 of CSS 2.1 has an example where EOF in
> a string is acceptable, in contradiction with its own Core Grammar in §4.4.1
> where it’s a bad-string token.
Yay!
> 9. There is concern with bad-string and bad-url being "preserved". (Should
> always be errors caught as early as possible?) But I don’t see how to do
> this while enabling Media Queries’s fine-grained error handling.
Right, they need to stick around for various reasons.
> 10. Editorial: §4.4.12 has some redundant checks, since this mode is only
> ever entered in specific cases.
Removed as many redundant checks as I could find. Let me know if
there are any left.
> 11. Apparently SVG requires scientific notation not only for numbers (which
> we now have in CSS) but also for percentages and dimensions.
Fixed.
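
A simplified sketch of what item 11 asks for: one numeric prefix, optionally with an exponent, shared by numbers, percentages, and dimensions. The regexes and the name `consume_numeric` are assumptions for this example; in particular the unit pattern is a cut-down stand-in for a real identifier.

```python
import re

# Optional sign, digits with an optional fraction, optional exponent.
# The exponent only counts if at least one digit follows the 'e'.
NUMBER = re.compile(r"[+-]?(?:\d*\.\d+|\d+)(?:[eE][+-]?\d+)?")
# Simplified unit: a letter followed by letters, digits, or hyphens.
UNIT = re.compile(r"[A-Za-z][A-Za-z0-9-]*")


def consume_numeric(text):
    """Classify the numeric token at the start of text, or return None."""
    m = NUMBER.match(text)
    if not m:
        return None
    value = float(m.group())
    rest = text[m.end():]
    if rest.startswith("%"):
        return ("percentage", value)
    u = UNIT.match(rest)
    if u:
        return ("dimension", value, u.group())
    return ("number", value)
```

Note the interaction with units that begin with "e": under these rules `1em` stays a dimension with unit `em` because no digit follows the `e`, while `2e1px` is read as 20 with unit `px`.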
> 12. Some concern about changes in bad-url tokenization. Did non-WebKit
> implementers discuss it? (No opinion from me.)
Not really. This needs to be discussed with the WG to make sure it's fine.
> 13. Proposal: make at-rule syntax completely generic: get rid of the
> "recognized at-rule", "declaration-filled" and "rule-filled" concepts. Parse
> ';' or a generic {} block for at rules. Definitions of specific at-rules can
> call back into Syntax with one entry point or another to parse the contents
> of a {} block.
Done.
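
Roughly, the generic shape proposed in item 13 looks like the sketch below. It works on a flat list of token strings rather than real tokens, only tracks {} nesting (not () or []), and the function names and the dict layout are invented for the example.

```python
def consume_block(tokens, i):
    """tokens[i] is '{'. Return (contents, index just past the matching '}')."""
    depth = 1
    i += 1  # skip the opening '{'
    contents = []
    while i < len(tokens):
        tok = tokens[i]
        if tok == "{":
            depth += 1
        elif tok == "}":
            depth -= 1
            if depth == 0:
                return contents, i + 1
        contents.append(tok)
        i += 1
    return contents, i  # EOF closes the block


def consume_at_rule(tokens, i):
    """tokens[i] is an at-keyword such as '@media'. Return (rule, next_index)."""
    rule = {"name": tokens[i][1:], "prelude": [], "block": None}
    i += 1
    while i < len(tokens):
        tok = tokens[i]
        if tok == ";":
            return rule, i + 1  # statement-style at-rule
        if tok == "{":
            rule["block"], i = consume_block(tokens, i)
            return rule, i  # block-style at-rule
        rule["prelude"].append(tok)
        i += 1
    return rule, i  # EOF ends the rule
```

Nothing here knows which at-rule it is looking at. A definition of a specific at-rule would take `rule["block"]` and hand it back to whichever entry point (declarations or rules) suits its contents.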
> 14. Editorial: Non-normative prose describing error recovery would be nice.
> (Like the diagrams describe valid syntax.)
Sounds fine. Will do.
> 15. Quirks mode and transform function whitespace do not belong in the
> generic Syntax module, but in the grammar of the relevant
> attributes/properties.
Quirks mode has been removed. Transform-function-whitespace needs to
be handled at Syntax level to be sane; it really is a tokenizer flag,
because it's invoked only when parsing the transform attribute in SVG.
> 16. Maybe an+b belongs in Selectors rather than Syntax?
I think it's appropriate to leave in Syntax, but my current approach
is bad. Instead, I need to describe it in terms of tokens (so it can
interact with other tokens in grammars), then do a "reserialize and
reparse with simpler rules" thing.
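
One way to picture the "reserialize and reparse" idea: concatenate the text of the tokens that were matched, then run a small dedicated grammar over the resulting string. The regex below is a simplified assumption, not the real an+b grammar (it is looser about whitespace), and `parse_an_plus_b` is a hypothetical name.

```python
import re

AN_PLUS_B = re.compile(
    r"""
    ^\s*(?:
        (?P<kw>odd|even)
      | (?P<a>[+-]?\d*)n(?:\s*(?P<sign>[+-])\s*(?P<b>\d+))?
      | (?P<only_b>[+-]?\d+)
    )\s*$
    """,
    re.VERBOSE | re.IGNORECASE,
)


def parse_an_plus_b(token_texts):
    """Reserialize the tokens, then reparse. Return (a, b) or None."""
    text = "".join(token_texts)
    m = AN_PLUS_B.match(text)
    if not m:
        return None
    if m.group("kw"):
        return (2, 1) if m.group("kw").lower() == "odd" else (2, 0)
    if m.group("only_b") is not None:
        return (0, int(m.group("only_b")))
    a_text = m.group("a")
    # A bare 'n', '+n', or '-n' means a coefficient of 1 or -1.
    a = int(a_text + "1") if a_text in ("", "+", "-") else int(a_text)
    b = 0
    if m.group("b") is not None:
        b = int(m.group("sign") + m.group("b"))
    return (a, b)
```

The appeal is that `2n-1` arriving as a single dimension-like token and `2n + 1` arriving as several tokens both end up going through the same simple string rules.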
> 17. Hash tokens need a new "is a valid ident" for ID selectors. The edit is
> not trivial: if 4. or 5. are to happen, might be better to do this
> afterwards or at the same time.
Done. (Though we're trying to drop that now, which would make me happy.)
~TJ
Received on Thursday, 9 May 2013 00:14:19 UTC