Using nova policy.json to Enforce Availability Zone Selection

We recently had a requirement to enforce selection of an availability zone* when booting instances in nova.  There’s no way to do this in the stock API, we didn’t want to modify code, and a waffle seemed like overkill.  So I spent some time experimenting with the policy.json config to see whether we could enforce it there, and it turns out you can!

Data about the calling user and the called nova object are available for policy engine decisions.  In the nova boot case, the called object is actually the new instance being created.  Therefore you can key off of any of the parameters from that call, including availability zone.

Enforcing an Availability Zone Selection

To ensure the user chooses some availability zone, policy.json can be configured like this:

"compute:create": "not None:%(availability_zone)s"

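To make the rule above easier to reason about, here is a minimal Python sketch of how a check of the form 'literal':%(key)s appears to be evaluated: the right-hand side is string-interpolated against the target (the instance being created), and the result is compared with the string form of the left-hand literal. This is a simplified model for illustration, not the policy engine's actual code; the function names generic_check and az_required are hypothetical.

```python
import ast


def generic_check(kind, match, target):
    """Simplified model of a 'kind:match' policy check (illustrative only)."""
    try:
        # "%(availability_zone)s" % {"availability_zone": None} gives "None"
        rendered = match % target
    except KeyError:
        # The target has no such attribute, so the check fails.
        return False
    try:
        # Interpret the left-hand side as a Python literal (None, 'az-1', ...).
        left = ast.literal_eval(kind)
    except (ValueError, SyntaxError):
        # This sketch only handles literal left-hand sides.
        return False
    return rendered == str(left)


def az_required(target):
    """Model of the rule: not None:%(availability_zone)s"""
    return not generic_check("None", "%(availability_zone)s", target)
```

Under this model, a boot request with no availability zone has availability_zone set to None in the target, so the None check matches and the negated rule rejects the call; any explicit zone name passes.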
Enforcing only Particular Availability Zones

Alternatively, if you want to allow only certain availability zones, you can use this pattern:

"compute:create": "'az-1':%(availability_zone)s or

Restricting Access to Availability Zones

We may also have to restrict which network security zones (and therefore which availability zones) certain users may access (e.g. a PCI or PKI zone).  We can combine the above method for validating the availability zone with the traditional policy.json role checking:

"compute:create": "(not None:%(availability_zone)s and
                    not 'pki-az-1':%(availability_zone)s) or
                   ('pki-az-1':%(availability_zone)s and role:pkiaccess)"
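The boolean logic of this rule can be restated in plain Python, which makes it easy to check each case by hand. This is only a sketch of the intended behaviour, not how the policy engine evaluates it; the function name can_create is hypothetical, and the zone and role names are the ones used in the rule above.

```python
def can_create(availability_zone, roles):
    """Plain-Python restatement of the combined compute:create rule."""
    az_chosen = availability_zone is not None
    is_pki = availability_zone == "pki-az-1"
    has_pki_role = "pkiaccess" in roles
    # Either a non-PKI zone was explicitly chosen,
    # or the PKI zone was chosen by a user holding the pkiaccess role.
    return (az_chosen and not is_pki) or (is_pki and has_pki_role)
```

For example, can_create("az-1", []) is allowed, can_create("pki-az-1", []) is rejected, and can_create(None, ["pkiaccess"]) is still rejected because no zone was selected.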

This gets a little crazy if you have a number of availability zones to deal with, but for the purposes of a POC, we can see how this works.
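If the number of restricted zones grows, one option is to generate the rule string rather than maintain it by hand. The sketch below is a hypothetical helper (build_create_rule is not part of nova) that takes a mapping of restricted availability zone to required role and emits a rule with the same shape as the one above.

```python
def build_create_rule(restricted):
    """Build a compute:create rule from {restricted_az: required_role}.

    Hypothetical helper for illustration; not part of nova.
    """
    # Open clause: some AZ was chosen, and it is none of the restricted ones.
    open_parts = ["not None:%(availability_zone)s"]
    for az in sorted(restricted):
        open_parts.append("not '{}':%(availability_zone)s".format(az))
    clauses = ["(" + " and ".join(open_parts) + ")"]
    # One clause per restricted AZ, gated on its role.
    for az in sorted(restricted):
        clauses.append(
            "('{}':%(availability_zone)s and role:{})".format(az, restricted[az])
        )
    return " or ".join(clauses)
```

Calling build_create_rule({"pki-az-1": "pkiaccess"}) produces the single-zone rule shown earlier on one line; the result can then be pasted in as the value of "compute:create" in policy.json.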

Limited Error Reporting

One downside of enforcing this in policy.json is that any request that does not satisfy the policy results in a 403 error, like this:

ERROR (Forbidden): Policy doesn't allow compute:create to be performed.
      (HTTP 403) (Request-ID: req-0b5ddbd5-ae19-4b7d-8d19-2bec18d60577)

This is not particularly useful to the end user: there’s no indication of why the call is not allowed.  However, it is a quick and dirty way to get this validation (without code modifications) if it’s needed.

* We overload availability zones to also allow users to choose the network security zone they want.  We don’t want instances randomly scheduled to any security zone, so the end user has to tell us by selecting an appropriate availability zone for the desired security zone.