julik live

Bad API design: a whirlwind tour of strong_parameters

Lately I have been doing time upgrading an application from Rails 3 to Rails 4. Obviously, one of the key parts of the operation in that case becomes a move from attribute protection (attr_accessible'/attr_protected') to strong parameters. Doing so, I have actually struggled a great deal. And after a number of hours spent on reflecting on the whole ordeal I came to the conclusion that it was not entirely my fault, and not even a problem with how the previous state of the application was when the migration had to be done.

The problem is strong_parameters itself. See, when you design an API, you usually assume a number of use cases, and then streamline the workflow for those specific use cases to the best of your ability. When situations arise where your API does not perform satisfactorily, you basically have two choices:

  • Press on and state that these situations are not covered by your API
  • Re-engineer the API to accomodate the new requirements

When designing strong_parameters, the Rails gang apparently went for the first approach. Except that since stating that "you are on your own" is not often encouraged as a message to developers-customers, it has been pretty much been swept under the rug. As a result, strong_parameters have been released (and codified as The solution to input validation) without (in my opinion) due thought process.

Since I was finally able to wrestle through it, behind all the frustrations I could actually see why strong_parameters did not work. It did not work because it is a badly designed API. And I would like to take some time to walk through it, in hopes that it can reveal what could have been done better, differently, or maybe even not at all.

So, let's run through it.

It is both a Builder and a Gatekeeper

By far the biggest issue with the API is this. It is both something we Rubyists tend to call a builder, and something we tend to call a gatekeeper - this is more of my personal moniker. Let's explain these two roles:

  • A Builder allows you to construct an arbitrarily-nested expression tree that is going to be used to perform some operation.
  • A Gatekeeper performs checks on values and creates early (and clear) failures when the input does not satisfy a condition.

For example, Arel is a good citizen of the Builders camp. A basic Arel operation is strictly separated into two phases - building the SQL expression out of chainable calls and converting the expression into raw SQL, or into a prepared query. Observe:

where(age: 19).where(gender: 'M').where(paying: true).to_sql

You can immediately see that the buildout part of the expression (the where() calls) and the execution (to_sql) are separated. The to_sql call is the last one in the chain, and we know that it won't get materialized before we have specified all the various conditions that the SQL statement has to contain. We can also let other methods collaborate on the creation of the SQL statement by chaining onto our last call and grabbing the return value of the chain.

XML Builder is another old friend from the Builders camp, probably the oldest of the pack. Here we can see the same pattern

b.payload do
  b.age 18
  b.name 'John Doe'
  b.bio 'Born and raised in Tennessee'
end
b._target # Returns the output

The obtaining of the result (the output of the Builder) is a definitely separate operation from the calls to the Builder proper. Even though calls to the Builder might have side effects when it outputs at-call-time, we know that it is not going to terminate early because the output object it uses is orthogonal to the Builder itself.

Strong parameters violate this convention brutally. The guides specify that you have to do this:

parameters.permit(:operation, :user => [:age, :name])

If you have strict strong parameters enabled - and this is the recommended approach since otherwise parameters you did not specify will simply get silently ditched - if you have even one single key besides :operation and :user at the top level, you will get an exception. If you supply a parameter that is not expected within :user - the same. The raise will occur at the very first call to permit or require. This means that the "Gatekeeper" function of strong_parameters is happening within the Builder function, and you do not really know where the Builder part will be aborted and the Gatekeeper part will take over.

Since we are so dead-set on validating the input outright, at the earliest possible opportunity, this mandates an API where you have to cram all of your permitted parameters into one method call. This produces monstrosities like thos:

params.require(:user).permit(
  :email,
  :password,
  :password_confirm,
  :full_name,
  :remember_me,
  :profile_image_setting,
  {
    paid_subscription_attributes: [
      {company_information_attributes: [:name, :street_and_number, :city, :zipcode, :country_code, :vat_number]},
      :terms_of_service,
      :coupon_code
    ]
  }
)

Those monstrosities are necessary because applications designed with use of nested attributes, and especially applications designed along the Rails Way of pre-rendered forms, will have complicated forms with nested values. And those forms are very hard to change, because they are usually surrounded by a whole mountain of hard-won CSS full of hacks, and have a very complex HTML structure to ensure they layout properly.

In practice, if we were to take the call above and "transform" it so that it becomes digestible, we would like to first specify all the various restrictions on the parameters, and then check for whether the input satisfies our constraint - divorce the "Builder" from the "Gatekeeper". For instance, like this:

user_params = params.require(:user)
user_params.permit(:email, :password, :password_confirm, :full_name, :remember_me, :profile_image_setting)

paid_subscription_params = user_params.within(:paid_subscription_attributes)
paid_subscription_params.permit(:terms_of_service, :coupon_code, :company_information_attributes)

company_params = paid_subscription_attributes.within(:company_information_attributes)
company_params.permit(:name, :street_and_number, :city, :zipcode, :country_code, :vat_number)

user_params.to_permitted_hash # The actual checks will run here

This way, if we do have a complex parameter structure, we can chain the calls to permit various attributes and do not have to cram al of them into one call.


A method that does too much not only in function, but in form

Let's take a closer look at the beast that is permit(). It effectively makes use of Ruby basic types to sort-of denote what types the parameters accept, and it does so very poorly, because those basic types do not do the input data justice. To remind, the basic syntax is:

 permit(:scalar_param1, :scalar_param2, :nested_param => [:scalar1, :scalar2])...

Due to the fact that method arguments are interpreted as splat arguments and an options hash, you effectively have to specify them in order that is required by Ruby's method arity, as opposed to specifying them in the order they are laid out in your form. If your form starts with a nested parameter set for token[key] and token[signature] you have to specify them last, because you can't have keyword arguments or options at the start of the argument list, ie. this is impossible to do:

permit(:token => [:key, :signature], :name, :email, ...)

You then have to first look at your form layout, and then mentally rearrange the form fields to befit the arity of Ruby's method calls. Which is absurd, because you would want the order of the permissions to match the order of elements in the form, if only for just plain visual diffing' sake.

When we go beyound scalar parameters, the situation becomes downright disastrous, because if you want to permit an array of scalars, you have to do it like this:

permit(:tags => [])

If you want to permit an array of attributes, you have to do it like this:

permit(:tags => [:name])

If you want to permit an array of arrays, or an array of arrays of nested records, well... you are going to need medicine. Yes, the whole idea of syntactic sugar versus syntactic vinegar is a sound one, to a point, but not when you serve people stale sugar with yeast and mire growing through it in cauliflower blossoms. If this was done properly, we would end up with something like this:

permit(:tags).as_array_of_scalars

or like this

permit(:phone_numbers).as_array_of_records_with(:number, :type)

But since strong_parameters downright insists of doing this in one, all-encompassing and all-embracing method call, this simple solution becomes impossible to execute.

permit() and require() have different return semantics

We have two methods - permit() and require. When we use one - it returns the receiver. If we use require it returns the value under the key matched under the argument. This means (in combination with the permit madness described above) that you can chain like this:

params.require(:user).permit(:password, :email)

but not like this:

params.permit(:user).permit(:password, :email)

Inconclusive exception messages

If you submit a parameter map that looks like this:

{
  "id" => "123",
  "email" => "person@eample.com",
  "password" => "[FILTERED]",
  "token" => {
    "id" => "89821",
    "value" => "AcEFghJYfg"
  }
}

and supply the folling "permistrictions" via the allmighty permit:

permit(:id, :email, :password, :token => [:value])

you get an exception:

Unpermitted parameter: :id.

The exception tells you the name of the parameter, but it does not tell you at which level of nesting this parameter has occurred. So if you are looking at a Rails error page, you have to scan through the error message first, and then scroll down and cross-check it against your request parameters to try to spot the "outsider". Additionally, the exception does not provide you with the permissions map - it does not tell you which parameter sets are allowed int the first place. Whereas being the devleoper you would really love to have that information.

Radically differnt runtime behavior hinging on one config flag

Rails offers us two options on how to deal with parameters we do not expect. We can set it the option to :raise which will abort the controller actions when unexpected parameters are encountered, and :log to write them to a log file. I hope your log collection tools are in order, because if they are not you will be missing out on lots of information that you need when your app starts misbehaving, and it will. Another issue is that the flow of controller execution is changed, drastically, depending on that setting. For instance, let's consider a controller test that submits parameters to a hypothetic Rails controller:

post :create, {:user => {name: 'Joe', password: 'abcdef'}}

Here is what can happen with this test, and what repercussions it has on the controller:

  • The test will pass, because password is silently ignored, but your record is not going to have the password changed. You won't notice it unless you explicitly test for the password attrbute to have been changed in the test.
  • The test will raise, because password creates an exception, and you won't even get to your model assertions
  • If password is permitted, and the config is set to :raise, the test will pass, but the application will fail in production, miserably. Because strong parameters forbig things like _method, authenticity_token and even the lovely utf8. Do you have these keyed in your controller tests. You better, or else...

This creates a situation where, effectively, and especially if you are migrating a legacy codebase, when you set the flag to :raise, one half of your tests will suddenly break. When you scare yourself breathless and switch it to :log, another half of your tests will break, because they do allow the model operations to go ahead but they silently toss important information submitted with the form. This makes for fun, fun, fun times, and is a result of too much config-dependent behavior. The changeover from :log to :raise is simply too extreme, with no gradations in between. And you basically have to engineer your entire application around one or another value of this configuration option, but switching from one to another halfway is pretty much a lost case, because of all the collateral damage.

Type semantics

I won't go into that here - suffice to say that strong parametrs basically differentiate three types of values: scalar, hash and array. If you want anything even a notch more specific (such as booleans, integers...) you are on your own. In theory it makes sense, because strong parameters were created to deal with url-encoded vanilla HTML forms. However, if you use something like an API interface and accept native JSON, which has richer types than forms, strong_parameters gives you zero tooling to validate on that.

More interesting even, that the stronger_parameters gem tries to make those type constraints richer, but it still crams them into the Übermethod. Which makes that method call even more verbose, even more nested, more unreadable. I do understand the sentiment to offer something that looks almost like a standard Rails method call, but what if that method call is badly designed to begin with?

So what do we do?

Contrary to popular opinion, you do have options when dealing with strong parameters. They differ depending on how far from the Rails way you want to stray. So far I have used three.

Option 1 is basically using a wrapper that turns those insane deep-calls to permit into something you can actually read. Here is what we use:

checker = Permitron.new
checker.permit(:utf8, :commit)
checker.within(:user).permit(:email_background)
checker.within(:user).within(:email_background).permit(:public_id, :file_extension)
checker.within(:user).permit(:profile_picture_attributes)
checker.within(:user).within(:profile_picture_attributes).permit(:public_id, :file_extension)
checker.resolve(params)

This does not solve all issues, but at the very least you can now understand the structure of the parameters you expect.

Option 2 is a slightly more radical one. Use something like dry-validation. If you do, you will see by the sheer amount of difference it has with strong parameters that those cases I am mentioning in this blog post have been taken care of. Aside from (in my opinion) overly eager usage of instance_exec the API the gem provides is very, very neat and certainly much clearer than strong_parameters as is.

Option 3 is an even more radical one, and is the one I use on my personal projects. Basically, I make use of the form builder objects to let the view specify the parameters. Imagine we were to send an extra key with each form that would be called parameter_policy. A form submit would then be:

{
  "email" => "...",
  "password" => "...",
  "_parameter_policy" => HMAC(["email", "password"], secret) 
}

Once this form comes in, the controller (or the form object) would look at the policy value, and compare it to the list of parameters actually received. If one of the parameters is missing from the form, or if there is an extra parameter set that wasn't expected, the HMAC value of the set of parameter keys will come out different. This will instantly signal parameter injection, and will prevent the form from being accepted. The beauty of this approach is that actually, it is the user interface that has to specify what input is possible. If you run your form through a form builder, it is very easy to ensure that the builder records every f.text_field that you call, and adds the name of the field to the parameter policy. I have been using this approach on my apps with great results so far, and if you are in for some adventurous experimentation I would suggest you do the same.

Option 4 is using form objects, where all of the parameter validation would be handled by the form object and not by the controller. If you use something like reform it is very much possible that the form object provides better validations and input checks than strong_parameters, and you can ditch the entire affair altogether.

The community aspect

This is actually something I am particularly sad about, and I what I am about to say is to be taken with a grain of salt. Call it entitlement, but I have been doing Ruby for more than a decade now. That is a long time. When I look at the commit list for strong parameters on Github I see 32 contributors, including DHH. Is it imaginable, that I am going to take all the above information, and try to convince 32 people, one by one, that what they have created is an abomination? Is it worth my time? Is it worth their time? And does it align with the party line of Rails in parcitular? Truth be told, I don't know and I do not have time to find out. What I do know is that in my particular case, strong_parameters shoot way, way short of what they promised to deliver, and do it in a way which eats into my productivity, my sanity and my mental well-being. So I will either give tips to others who might find them lacking, or try to steer away from the issue altogether by trying alternative approaches to the problem.

All in all - this is, for me, one of the things that makes Rails hard. There apparently is a "community-approved" Rails-way solution to a problem, but it is not well thought out, and sometimes just demotivating, And at the point when it gets accepted into Rails core, there is basically zero way to turn the tides and turn it into something better, because the amount of effort needed to convince the people involved is too big.

Luckily, when this kind of adversity strikes, we always have the rest of the Ruby ecosystem to come to our rescue.

Do not despair with strong_parameters. It is not your fault, trust me. Been there, done that, got the T-shirt.

Suspects: Веб-стройка

 
comments powered by Disqus

Aspirine not included.