In Defiance of Titles

Just another weblog

Output Transformation in a Zend Framework Model Layer

with 4 comments

A few weeks back, Matthew Weier O’Phinney wrote a very helpful discussion of model layer infrastructure using various components of the Zend Framework. I especially appreciated his advice on using Zend_Form as an input filter inside the model class itself; it provides a very clean way to keep validation and filtering logic properly encapsulated.

Zend_Form’s use of Zend_Filter and Zend_Validate also makes it very easy to get precisely the filtering and validation rules you need. You can even filter through an external library like HTMLPurifier if you find you need the extra functionality, just by writing a new filter class; this has already been covered quite well (for example, see Part 8, Step 3 of Pádraic Brady’s Zend Framework blog tutorial). As Weier O’Phinney demonstrates, you can then use this Zend_Form object as a screening filter in your model class, so that certain properties must always pass through the form’s validation process before they are set in the model itself. I won’t duplicate his logic here either, but you should definitely take a look at it.

However, I’ve run into a minor problem, and I’m not sure my solution is ideal. See, the Zend_Form approach described above does a great job of implementing Chris Shiflett’s Filter Input, Escape Output principle: user input is filtered for invalid HTML before it’s ever saved to the model, and can then be escaped as appropriate in the view layer. But what happens if you need to be able to retrieve the user’s original unfiltered input later?

That might not sound like an appropriate thing to do, but consider this. Suppose that instead of simply sanitizing user-contributed HTML, you wanted to allow your users to use a simpler text input format (such as Markdown) and generate the HTML for them later? It wouldn’t be appropriate to save the generated HTML to the model, since your users would then be unable to retrieve their original Markdown version for later editing. However, if you don’t pre-generate the HTML, then you can’t perform your HTMLPurifier sanitizing at the input stage either, since there isn’t any HTML to sanitize yet.

In this situation, it looks to me like you’d be stuck doing all your input filtering in the presentation (output) layer, which doesn’t really dovetail well with Shiflett’s principle. But then again, there do appear to be two distinct types of “filtering” at work here, one of which is what Shiflett was talking about, and the other of which probably isn’t:

  1. Sanitization, or making sure that user input doesn’t contain any security risks.
  2. Transformation, or converting user input for presentational purposes. (I feel like this is different from escaping, since escaping is mainly concerned with defusing special characters?)
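A rough sketch of how the two differ from each other and from escaping. The strip_tags() call and the bold-only regex are assumptions for illustration, standing in for HTMLPurifier and a real Markdown parser respectively:

```php
<?php
$input = "Hello **world** <script>alert('x')</script>";

// 1. Sanitization (input side): remove anything risky before storing.
//    strip_tags() is only a crude stand-in for a real sanitizer.
$sanitized = strip_tags($input);

// 2. Transformation (output side): convert the stored text into markup.
//    This single rule is a toy placeholder for a Markdown library.
$transformed = preg_replace('/\*\*(.+?)\*\*/', '<strong>$1</strong>', $sanitized);

// 3. Escaping (output side): defuse special characters for an HTML context
//    without changing what the text means.
$escaped = htmlspecialchars($sanitized, ENT_QUOTES, 'UTF-8');
```

Sanitization and escaping both leave the text meaning what the user wrote; only transformation deliberately produces new markup.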

So what do you think? It’s clear that sanitization ought to be done immediately upon input (preferably in the form object), but where should transformation happen?

Rob Allen’s Zend Framework Overview from last year hints at implementing things like Markdown formatting in the view layer through the use of view helpers. This is certainly appropriate from a strict MVC perspective, as output transformation is definitely presentation-layer stuff. However, this isn’t particularly DRY; every time you write a view script that uses this data, you need to remember to run it through the appropriate chain of output filters.
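Here is a minimal sketch of the view-helper route. The class name My_View_Helper_Markdown and the bold-only regex are assumptions for illustration; a real application would call a full Markdown library, and in Zend Framework 1 the helper would normally extend Zend_View_Helper_Abstract:

```php
<?php
// Hypothetical helper, written as a plain class so it runs without the
// framework. In a ZF1 view script it would typically be invoked as
// $this->markdown($post->body).
class My_View_Helper_Markdown
{
    /**
     * Toy stand-in for a Markdown parser: escapes the raw text, converts
     * **bold** spans, and wraps blank-line-separated blocks in <p> tags.
     */
    public function markdown($text)
    {
        // Escape first so any raw HTML in the stored text is defused.
        $escaped = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');

        // Transformation step (placeholder for a full Markdown library).
        $html = preg_replace('/\*\*(.+?)\*\*/s', '<strong>$1</strong>', $escaped);

        $paragraphs = preg_split('/\n{2,}/', trim($html));
        return '<p>' . implode("</p>\n<p>", $paragraphs) . '</p>';
    }
}

// Every view script that displays the body has to remember this call.
$helper = new My_View_Helper_Markdown();
$html = $helper->markdown("Hello **world**\n\n<script>alert(1)</script>");
```

The stored text stays untouched; the cost is that each view script must make the helper call itself.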

So, my best overall idea (building on Weier O’Phinney’s examples) is to implement it in the getters in my model:

class My_Model
{
  // ...
  public function __get($property)
  {
    // Prefer an explicit getter (e.g. getBody) when one exists
    $method = 'get' . ucwords($property);
    if (method_exists($this, $method)) {
      return $this->$method();
    }
    if (array_key_exists($property, $this->_data)) {
      return $this->_data[$property];
    }
    return null;
  }

  public function getBody($applyOutputFilter = true)
  {
    $body = $this->_data['body'];
    if ($applyOutputFilter) {
      $body = $this->getOutputFilter()->filter($body);
    }
    return $body;
  }

  public function getOutputFilter()
  {
    $filterChain = new Zend_Filter();
    // add specific filter objects as appropriate, and then...
    return $filterChain;
  }
  // ...
}

This guarantees that whenever the “body” is accessed as a property, it’s correctly transformed for HTML output (a sensible default).
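To show how the two access paths behave, here is a self-contained sketch. Sketch_FilterChain is an assumed stand-in that only mimics the filter($value) method of a Zend_Filter chain, and the escape-plus-nl2br chain is a placeholder for whatever transformation you actually need:

```php
<?php
// Stand-in for a Zend_Filter chain so the sketch runs without the framework.
class Sketch_FilterChain
{
    private $callbacks = array();

    public function addFilter($callback)
    {
        $this->callbacks[] = $callback;
        return $this;
    }

    public function filter($value)
    {
        // Apply each filter in the order it was added.
        foreach ($this->callbacks as $callback) {
            $value = call_user_func($callback, $value);
        }
        return $value;
    }
}

// Hypothetical model mirroring the getter logic above.
class Sketch_Model
{
    private $_data = array();

    public function __construct(array $data)
    {
        $this->_data = $data;
    }

    public function __get($property)
    {
        $method = 'get' . ucfirst($property);
        if (method_exists($this, $method)) {
            return $this->$method();
        }
        return array_key_exists($property, $this->_data) ? $this->_data[$property] : null;
    }

    public function getBody($applyOutputFilter = true)
    {
        $body = $this->_data['body'];
        return $applyOutputFilter ? $this->getOutputFilter()->filter($body) : $body;
    }

    public function getOutputFilter()
    {
        $chain = new Sketch_FilterChain();
        // Placeholder chain: escape for HTML, then turn newlines into <br />.
        $chain->addFilter(function ($value) {
            return htmlspecialchars($value, ENT_QUOTES, 'UTF-8');
        });
        $chain->addFilter('nl2br');
        return $chain;
    }
}

$post = new Sketch_Model(array('body' => "Tom & Jerry\nline two"));
$forView = $post->body;              // filtered, for HTML views
$forEditing = $post->getBody(false); // raw text, e.g. for an edit form
```

Property access goes through the filter chain, while calling getBody(false) directly still returns the text exactly as the user entered it.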

However, both of these approaches still leave us with the same core problem: you almost inevitably end up doing all your input filtering at the presentation stage, rather than prior to saving it to the persistence layer as is usually recommended. This can be a security risk if you’re not careful, and is almost certainly a performance hit for the average visiting user.

Any ideas on how best to resolve these issues?


Written by jazzslider

April 6, 2009 at 7:11 am

4 Responses


  1. You’re mixing up the concepts a little. “Filter Input” correlates to validation and normalization: am I receiving sane input, and is there a particular (normalized) way I wish to store it? (The latter is particularly useful with data formats that might have different representations based on locale.) This is what Zend_Form’s filter and validation chains accomplish.

    The second half of the security mantra is “Escape Output.” When dealing with models, this typically happens in two very different places, with very different rules. In the first, you need to escape data going into your data storage to ensure it does so safely; when using a database, this typically means using prepared statements or utilizing the appropriate quoting mechanism for your database (Zend_Db does this for you). In the second place, you have the output generation for client consumption — which corresponds to transformation of markup and escaping the data in a fashion that is sane for the output you’re generating.

    I’d argue that you _do_ want to do your Markdown -> HTML transformation at the view layer. Markdown can be stored easily and safely, and then transformed to a variety of formats; it’s in the view layer that you select the final output format you will be using. A view helper makes this transformation trivial.

    Matthew Weier O'Phinney

    April 6, 2009 at 7:58 am

  2. Thanks for clearing that up, @weierophinney; I did feel a bit confused after I’d finished writing this morning 🙂

    I think my main goal in putting the transformation procedure in the model layer was to automate the process for the majority of use cases…if most of my view scripts are HTML, it’d save me a lot of typing in the view layer.

    But “saving typing” does not necessarily equal “responsible coding” 🙂 This approach can definitely lead to problems later on, especially if I start writing a lot of non-HTML views. Much better to be consistent about escaping from the get-go, I suppose.

    Thanks for your feedback!


    April 6, 2009 at 8:34 am

  3. Wouldn’t this be a case where using view helpers would allow you to easily have your output filter chain, but at the same time keep the amount of code in your views to a minimum?

    Bert JW Regeer

    June 15, 2009 at 3:05 pm

  4. Definitely. In the content module I just published, I ended up writing a simple “Content_View_Helper_Filter” helper that takes a string to be filtered along with the name of the filter object to use…very clean overall solution that I wish I’d thought of when I wrote this post 🙂


    June 15, 2009 at 4:03 pm
