Django

Code

Ticket #7806 (new)

Opened 6 months ago

Last modified 2 months ago

django.template refactoring

Reported by: emulbreh Assigned to: nobody
Milestone: post-1.0 Component: Template system
Version: SVN Keywords: tplrf
Cc: carl@dirtcircle.com, research@einfallsreich.net Triage Stage: Accepted
Has patch: 1 Needs documentation: 1
Needs tests: 1 Patch needs improvement: 1

Description

Why should django.template be refactored?

Filter and Variable parsing is inconsistent. Many ad hoc parsers in defaulttags are fragile. Whitespace handling is ungraceful.

The patch provided splits __init__.py into separate modules and introduces a TokenStream class that allows parsing of literals, vars (called lookups) and filter expressions. No docs yet, have a look at the source (it all happens in ~150 loc).

The patch only touches django.template and django.templatetags. Tests are separate so you can run them with the old code.

Modules in django.template

  • context (no dependencies)
    Untouched.

  • expressions (no dependencies)
    Contains Lookup, Literal, and FilterExpression. All are subclasses of a marker class Expression. These replace Variable and the old FilterExpression class. A Variable equivalent is provided in compat.Variable.

  • nodes (depends on expressions)
    Contains Node, NodeList, TextNode, and ExpressionNode. The latter is the old VariableNode renamed. A VariableNode alias is provided in compat.

  • compiler (depends on context, expressions, nodes)
    Contains Parser, Lexer, Token, TemplateSyntaxError, Origin, StringOrgin, and compile_string(). Those are mostly untouched. Additionally there are TokenStream, and TokenSyntaxError. Those provide a safe way to parse Tokens containing expressions.

  • library (depends on compiler, nodes, context)
    Contains InvalidTemplateLibrary, Library, get_library(), and add_to_builtins(). Mostly untouched. Should be called libraries for consistency, but that would clash with the legacy dict.

  • loader
    Untouched. Moved TemplateDoesNotExist here.

  • utils
    ConditionalNode, base class for IfNode, IfEqualNode. EmptyNode. Helper functions: parse_conditional_nodelists(parser, name), parse_as(bits) parse_args_and_kwargs(bits), resolve_args_and_kwargs(args, kwargs)

  • compat
    Provides backwards compatibility for Variable, VariableNode, TokenParser, and resolve_variable()

  • defaulttags, loader_tags
    Mostly refactored to use TokenStream where appropriate.

Tickets

  • #4746 - allow whitespace before and after filter separator
    fixed. tests are there (filter-syntax03, filter-syntax04) but expect TemplateSyntaxError?.

  • #5270 - empty string literals
    fixed. tests included.

  • #5756 - accept filters (almost) everywhere
    fixed. a few tests included.

  • #5971 - token parser bug
    TokenParser will be deprecated.

  • #6271 - filter arguments with spaces
    fixed. test included. (seems to be fixed in trunk already)

  • #6296 - expressions for ifequal
    fixed. tests included (dup of #5756 ?)

  • #6510 - get_nodes_by_type() problem
    probably fixed.

  • #6535 - negative numeric literals
    fixed. tests included.

  • #7295 - quotes, escaping and translation of string literals handled inconsistently in templates
    fixed.

Performance

Comparing runtests.py templates measures, the refactored version is a bit slower.

Todo

{% if not %} is currently valid. The refactored implementation is greedy and reports a syntax error. Could be fixed, not sure it's worth it. Broken tests are: if-tag-not02, if-tag-not03. DDN.

Numeric literals ending with "." are now considered a syntax error. Broken test: ifequal-numeric07. DDN.

{% url %} currently accepts an arbitrary unquoted unicode string for the view name. The refactored implementation only accepts bare python identifiers or quoted string literals. Broken test: url05. DDN.

TemplateSyntaxError messages need some work. Eventually shortcut methods on TokenStream.

Find better names for TokenSyntaxError and TokenStream.

Provide a decorator that handles the boilerplate bits=parser.token_stream(token) and bits.assert_consumed() calls.

Investigate performance impact.

Docs and more tests.

Attachments

tplrf.diff (109.6 kB) - added by emulbreh on 07/18/08 06:46:57.
tplrf-tests.diff (6.3 kB) - added by emulbreh on 07/18/08 06:47:51.
tplrf.2.diff (116.9 kB) - added by emulbreh on 07/18/08 12:58:36.
tplrf-r8970.diff (129.2 kB) - added by emulbreh on 09/05/08 14:14:35.
with tests

Change History

07/18/08 06:46:57 changed by emulbreh

  • attachment tplrf.diff added.

07/18/08 06:47:51 changed by emulbreh

  • attachment tplrf-tests.diff added.

07/18/08 06:54:50 changed by emulbreh

  • needs_better_patch set to 1.
  • has_patch set to 1.
  • needs_tests set to 1.
  • needs_docs set to 1.

I'm going to add a tplrf-fixed keyword to #4746, #5270, #5756, #5862, #6271, #6296, #6510, #6535, and #7295 - unless someone tells me that's a bad idea.

07/18/08 12:58:36 changed by emulbreh

  • attachment tplrf.2.diff added.

07/18/08 13:05:41 changed by emulbreh

  • made decorators 2.3 compatible
  • added a (clusily named) decorator for parser functions using TokenStream: token_stream_parsed.
  • moved some common syntax error handling to TokenStream.
  • converted some remaining compile_filter() calls

Notes: {% cycle %} is still missing. The docstring/code ratio in defaultags starts to get ugly. A DebugTokenStream should capture the exact token positions for nicer error messages.

07/18/08 18:38:57 changed by Simon Greenhill

  • stage changed from Unreviewed to Accepted.
  • milestone set to post-1.0.

Accepted as post-1.0 following discussion on #7799.

07/24/08 17:59:07 changed by emulbreh

This would also fix #7377.

09/05/08 14:14:35 changed by emulbreh

  • attachment tplrf-r8970.diff added.

with tests

09/18/08 08:37:59 changed by carljm

  • cc set to carl@dirtcircle.com.

09/26/08 01:42:23 changed by anonymous

  • cc changed from carl@dirtcircle.com to carl@dirtcircle.com, research@einfallsreich.net.

10/02/08 20:47:46 changed by SmileyChris

This is all incredibly impressive, emulbreh.

I've got a tag token parser which has a different tact - you're more than welcome to a look. Send me an email if you're interested - [me]@gmail.com

10/24/08 08:59:11 changed by emulbreh

This would also fix #9315.

11/13/08 13:32:34 changed by jacob

This looks really good! I've got a few questions:

  • Have you done any benchmarking? I'd be interested in any speed implications.
  • The @uses_token_stream decorator seems a bit clumsy (as you note), but I see why you did it. Any thoughts on a more elegant way of getting the new functionality in in a backwards-compatible way? What about making Token.__iter__ return a TokenStream? That'd allow a pretty natural for x in token: approach in custom tags.
  • There's no documentation of the new TokenStream class, either as docs or docstrings. This needs to happen, if only so I can figure out what the hell pop_lexm means.

I'm sure I'll have more questions later, but for now that'll do. Thanks!

11/14/08 20:36:42 changed by emulbreh

I haven't done any real benchmarking. Comparing time spent in the "templates" test, it's about 1.18 times slower.

There's a tiny bit of documentation in this thread. Hopefully enough to dicuss the API. I'll add docstrings in the next patch.

Token.__iter__ would seldomly be useful as you'd typically use the higher level parser functions in custom tags. Perhaps TokenStream should be named TokenParser. Simon Willison proposed @register.tag(token_stream=True).


Add/Change #7806 (django.template refactoring)




Change Properties
Action