swagger-combine

knoxg | June 29, 2021 | Programming

This post has been written in two parts:

The first part, slagging off YAML, and YAML in swagger in particular (“the problem”), and
The second part, presenting a way of pre-processing your YAML in ways swagger doesn’t support out of the box (“the solution”).

So you can skip the first bit if you want to avoid that.

API versions

So anyway. Let’s say you’ve got an API.

You’re exposing that API to the public, so you’ve semantically versioned it so that users are aware of when it changes [1], and they’re also aware when it changes so much that you’ve broken their previously functioning online wicker sales platform.

So to combat that you phase in changes to the API and keep a few of them running at any one time, to give people a chance to upgrade their frobulator 2.0 code to use the latest and greatest frobulator 3.0.

So rather than having endpoints that evolve like this:

GET    /api/v1/thing              # version 1 lists the things

POST   /api/v2/thing/{id}         # version 2 lets the user update
DELETE /api/v2/thing/{id}         # and delete a thing

POST   /api/v3/thing/{id}         # version 3 adds a new parameter to the updateThing operation
POST   /api/v3/thing/{id}/frob    # and allows the API user to directly frob the thing

You’ll probably version the entire API, and have endpoints like this:

GET    /api/v1/thing              # version 1 of everything

GET    /api/v2/thing              # version 2 of everything
POST   /api/v2/thing/{id}         # (includes unchanged v1 endpoints)
DELETE /api/v2/thing/{id}

GET    /api/v3/thing              # version 3 of everything
POST   /api/v3/thing/{id}         # (includes unchanged v2 endpoints)
POST   /api/v3/thing/{id}/frob
DELETE /api/v3/thing/{id}

which is a bit easier to manage, but means you’ve got a lot of repeated endpoints that haven’t really changed much.

So you’re probably going to end up with a lot of repeated guff in your API definitions.

How does swagger reduce duplication?

Swagger uses YAML to define their APIs, because they looked at how easy it is to cut and paste the universally admired and never improved upon Makefile format and thought that making whitespace significant was definitely something you want in an interface definition language.

Swagger has this thing called a reference ( $ref ), which they stole from JSON Reference (an almost-RFC) and JSON Pointer (RFC 6901) even though it’s YAML, which allows you to link swagger definition files together. So you might have

paths:
  /thing:
    get:
      operationId: getThings
      description: get all the things
      parameters: []
      responses:
        200:
          description: OK
          schema:
            type: object
            title: getThingsOKResponse
            properties:
              status:
                description: OK
                type: string
              thing:
                type: array
                items:
                  $ref: 'objects.yaml#/definitions/ThingObject'

where ‘ThingObject’ is defined in a separate objects.yaml, or even

paths:
  /thing:
    $ref: 'paths.yaml#/paths/~1thing'

Where paths.yaml contains the first snippet above. The ~1 is the JSON-Pointer way of escaping forward slashes (forward slashes themselves are part of the JSON-Pointer syntax, so the “/thing” entry needs to be encoded as “~1thing”). This starts looking a bit shit when you have loads of them in a reference, but you can also use %2F because swagger likes to have multiple encodings of the same entry, which in no way is a security or readability problem.

So you could have this instead:

paths:
  /thing:
    $ref: 'paths.yaml#/paths/%2Fthing'

You can even URL-escape your tildes, so that this probably works as well, for a subset of the swagger toolchain:

paths:
  /thing:
    $ref: 'paths.yaml#/paths/%7E1thing'

Or some combination of all three.
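To make the three spellings concrete, here's a minimal Python sketch of the lenient decoding described above: split the fragment on '/', then percent-decode each token, then undo the ~1 and ~0 escapes. The names pointer_tokens and resolve are made up for illustration, and a plain dict stands in for a parsed paths.yaml; this is an assumption about how a forgiving parser might behave, not how any particular swagger tool is implemented.

```python
from urllib.parse import unquote


def pointer_tokens(fragment):
    """Split a fragment such as '/paths/~1thing' into JSON Pointer tokens."""
    if fragment == "":
        return []  # the empty pointer refers to the whole document
    if not fragment.startswith("/"):
        raise ValueError("not a JSON Pointer: %r" % fragment)
    tokens = []
    for raw in fragment[1:].split("/"):
        token = unquote(raw)  # '%2F' -> '/', '%7E' -> '~'
        # '~1' is unescaped before '~0' so that '~01' becomes '~1', not '/'
        tokens.append(token.replace("~1", "/").replace("~0", "~"))
    return tokens


def resolve(document, fragment):
    """Walk a parsed document (dicts and lists) along the pointer."""
    node = document
    for token in pointer_tokens(fragment):
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Stand-in for a parsed paths.yaml (hypothetical content)
paths_yaml = {"paths": {"/thing": {"get": {"operationId": "getThings"}}}}

# All three spellings from the post land on the same key
spellings = ["/paths/~1thing", "/paths/%2Fthing", "/paths/%7E1thing"]
resolved = [resolve(paths_yaml, s) for s in spellings]
```

The order of the steps matters here. A parser that percent-decodes the whole fragment before splitting would turn %2F into a separator instead, which is presumably part of why these variants only work for a subset of the toolchain.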

All well and good, and you might rightly expect that you might be able to do something like this:

paths:
  /thing:
    get:
      $ref: 'paths.yaml#/paths/~1thing/get'
    post:
      $ref: 'paths-v2.yaml#/paths/~1thing/post'

and you’d be wrong, because the creators of the spec thought they’d only allow $refs in one or two places, and that’s not one of the one or two places [3].

You might also think that you could use YAML node references like this

get-thing: &get-thing
  (some operation definition) 

post-thing: &post-thing
  (some operation definition)

paths:
  /thing:
    get: *get-thing
    post: *post-thing

or some combination of node references and $refs that allows you to externalise those definitions, and you’d also be wrong.

Combining swagger files

So anyway, here’s a thing you can run over your yaml files before swagger gets its mitts on them, to combine them in ways that you can’t with vanilla swagger.

It adds a new reference type called $xref, which is identical to $ref except:

you can use them anywhere in the yaml file
you can add or override keys in the referenced objects
the syntax allows forward slashes in path keys to appear as forward slashes

As a bonus, you can also merge multiple input YAMLs together just by merging all the paths, meaning you don’t need to use $xrefs at all.

So you could have

paths:
  /thing:
    get:
      # acts like a $ref but in a place where $refs aren't allowed
      $xref: 'paths.yaml#/paths/~1thing/get'     
    post:
      $xref: 'paths-v2.yaml#/paths/~1thing/post'

or

paths:
  /thing:
    get:
      # hashes in the path toggle between '/'-as-JSON-Pointer separators 
      # and '/' as characters in the key
      $xref: 'paths.yaml#/paths/#/thing#/get'    

    post:
      $xref: 'paths-v2.yaml#/paths/#/thing#/post'

or

paths:
  /thing:
    get:
      $xref: 'paths.yaml#/paths/#/thing#/get'

    post:
      # objects defined by xref can have other fields defined 
      # which are merged in with the xref'ed object
      $xref: 'paths.yaml#/paths/#/thing#/post'
      parameters:
        old-parameter:
          description: some old parameter whose definition has changed
        new-parameter: 
          description: the new parameter in v2 of the API that isn't in v1

or have two input files:

file-1.yaml

paths:
  /thing:
    get:
      (some operation definition)

file-2.yaml

paths:
  /thing:
    post:
      (some operation definition)

and specify both file-1.yaml and file-2.yaml as inputs to the preprocessor.
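Merging the inputs is roughly a recursive dictionary merge. Here's a minimal Python sketch, assuming the YAML files have already been parsed into dicts (the literals below stand in for file-1.yaml and file-2.yaml) and assuming that the later file wins when both define the same non-object value. The merge helper is hypothetical; the plugin's actual handling of conflicts and lists may differ.

```python
def merge(base, overlay):
    """Return a new structure with overlay merged on top of base."""
    if isinstance(base, dict) and isinstance(overlay, dict):
        merged = dict(base)
        for key, value in overlay.items():
            # keys present in both inputs are merged recursively
            merged[key] = merge(base[key], value) if key in base else value
        return merged
    # scalars, lists and mismatched types: the later input wins (an assumption)
    return overlay


# Stand-ins for the parsed input files
file_1 = {"paths": {"/thing": {"get": {"operationId": "getThings"}}}}
file_2 = {"paths": {"/thing": {"post": {"operationId": "updateThing"}}}}

combined = merge(file_1, file_2)
```

After this, combined has a single /thing path carrying both the get and the post operation, and neither input dict has been modified.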

$refs survive the preprocessing and are handled by swagger, as before.
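For the curious, here's a rough Python sketch of how $xref expansion could work, going only by the behaviour described in this post: hashes toggle whether '/' is a separator or a literal character, keys next to the $xref are merged over the referenced object, and $ref entries are copied through untouched. The function names and the files dict (standing in for parsed YAML files) are assumptions for illustration, not the plugin's actual code. It doesn't handle the ~1 escape or circular references.

```python
def xref_tokens(fragment):
    """Split '/paths/#/thing#/get' into ['paths', '/thing', 'get']."""
    tokens, current, literal = [], None, False
    for ch in fragment:
        if ch == "#":
            literal = not literal  # toggle: is '/' a separator or a character?
        elif ch == "/" and not literal:
            if current is not None:
                tokens.append(current)
            current = ""
        else:
            current = (current or "") + ch
    if current is not None:
        tokens.append(current)
    return tokens


def merge(base, overlay):
    """Recursive dict merge; the overlay wins on conflicts (an assumption)."""
    if isinstance(base, dict) and isinstance(overlay, dict):
        merged = dict(base)
        for key, value in overlay.items():
            merged[key] = merge(base[key], value) if key in base else value
        return merged
    return overlay


def expand(node, files):
    """Replace every $xref in node with the object it points to."""
    if isinstance(node, list):
        return [expand(item, files) for item in node]
    if not isinstance(node, dict):
        return node
    if "$xref" not in node:
        # ordinary keys, including $ref, are simply copied through
        return {k: expand(v, files) for k, v in node.items()}
    filename, _, fragment = node["$xref"].partition("#")
    target = files[filename]
    for token in xref_tokens(fragment):
        target = target[token]
    target = expand(target, files)  # the referenced object may hold xrefs too
    overrides = {k: expand(v, files) for k, v in node.items() if k != "$xref"}
    return merge(target, overrides) if overrides else target


# Hypothetical parsed inputs
files = {"paths.yaml": {"paths": {"/thing": {"post": {
    "operationId": "updateThing",
    "parameters": {"old-parameter": {"description": "v1 description"}},
    "responses": {"200": {"schema": {"$ref": "objects.yaml#/definitions/ThingObject"}}},
}}}}}
source = {"paths": {"/thing": {"post": {
    "$xref": "paths.yaml#/paths/#/thing#/post",
    "parameters": {"new-parameter": {"description": "new in v2"}},
}}}}

expanded = expand(source, files)
```

In expanded, the post operation ends up with both old-parameter and new-parameter, and the $ref under responses is left for swagger to deal with.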

Here it is as a Maven plugin:

com.randomnoun.maven.plugins:yaml-combine-maven-plugin

And here it is on GitHub:

git@github.com:randomnoun/yaml-combine-maven-plugin.git

It’s called yaml-combine-maven-plugin because I’m using it to combine YAML files. It used to be called swagger-combine-maven-plugin, but I’ve started using it in other non-swagger places now.

And here’s how you use it in your pom.xml:

<project>
  <build>
    <plugins>

      <plugin>
        <groupId>com.randomnoun.maven.plugins</groupId>
        <artifactId>yaml-combine-maven-plugin</artifactId>
        <version>2.0.1</version>
        <executions>
          <execution>
            <id>yaml-combine</id>
            <phase>generate-sources</phase>
            <goals>
                <goal>yaml-combine</goal>
            </goals>
            <configuration>
              <fileset>
                <includes>
                  <include>my-swagger-file-with-xrefs-in-it.yaml</include>
                </includes>
                <directory>${project.basedir}/src/main/swagger</directory>
              </fileset>
              <outputDirectory>${project.basedir}/target/swagger-combine</outputDirectory>
              <finalName>my-swagger-file-with-resolved-xrefs.yaml</finalName>
            </configuration>
          </execution>
        </executions>
      </plugin>
            
    </plugins>
  </build>
</project>

[1] until it gets close to version 10 when you decide to change the name of the product rather than fix your y1d problems [2]. If you’re really stuck, try using the roman numeral “X”, that seems to be popular these days.
[2] y1d = y2k for the number 10
[3] they also thought they’d implement $refs by causing the swagger parser to parse the entire referenced file every time it sees a $ref, but that’s neither here nor there. It’s here. [4]
[4] Which is definitely the kind of software quality you want in a product that underpins your everything-as-a-service infrastructure, and the longer build times give you time to catch up on your Tolstoy.