OpenAPI specifications have become the industry standard for documenting REST APIs, allowing developers to describe an API’s endpoints, operations, parameters, and responses in a standardized way.

    Introduction

    While having well-documented APIs is crucial for both internal development and external consumption, the approaches to creating these specifications vary widely.

    This article explores the pitfalls of using Domain-Specific Languages (DSLs) for OpenAPI documentation versus the more straightforward YAML approach, particularly in the context of Elixir development and in the age of AI assistance.

    Horror Story: Iteration 8080

    Your codebase has grown steadily over the months. You’ve been meticulously documenting your REST API, giving it that extra polish with the most popular OpenAPI spec library you could find on hex.pm (or pip, RubyGems, etc.). Life seems perfect as you admire your well-documented API…

    Until one fateful morning when you need to revisit that controller file. You open it and stare in disbelief at what’s become a monster—a file with more inlined OpenAPI specs than actual controller logic. What began as elegant documentation has morphed into an unreadable labyrinth of nested macros and obscure DSL keywords.

    “This needs cleaning up,” you think to yourself, determined to restore order. You decide to separate the code from the spec. Simple enough, right?

    Wrong. After hours of refactoring, nothing works. The OpenAPI library, it turns out, requires you to define the spec in the same file as the controller. No problem! You’ll just macro your way out of this mess:

    defmodule MyAppWeb.OpenAPI.UserControllerSpec do
      defmacro __using__(_) do
        quote do
          use OpenApiSpex.ControllerSpecs

          tags ["users"]
          security [%{}, %{"petstore_auth" => ["write:users", "read:users"]}]

          operation :update,
            summary: "Update user",
            parameters: [
              id: [in: :path, description: "User ID", type: :integer, example: 1001]
            ],
            request_body: {"User params", "application/json", UserParams},
            responses: [
              ok: {"User response", "application/json", UserResponse}
            ]

          # 400+ lines omitted for brevity.
          # And yes, I've seen specs this long in production...
        end
      end
    end

    defmodule MyAppWeb.UserController do
      use MyAppWeb.OpenAPI.UserControllerSpec
      # rest of code
    end

    “Brilliant!” you exclaim. You decide to refactor the entire codebase to follow this pattern. No more scrolling past those pesky operation definitions! Controllers become readable again, and you feel a sense of accomplishment as you commit the changes to the repository.

    Months pass by

    Business is booming. Your product has gained traction, and now customers are requesting API access to your SaaS platform. “Thank goodness I’ve been meticulously documenting those API endpoints,” you think, patting yourself on the back.

    You diligently add authentication, rate limiting, and all the other necessary components for a customer-facing API. The Swagger UI looks polished, and you announce the API’s availability to your eager customers.

    A week passes by

    Your inbox is suddenly flooded with support tickets. Customers are confused. They can’t seem to make sense of certain endpoints. “But don’t they have access to the Swagger instance?” you wonder, bewildered.

    Upon investigation, you discover several gaps in your documentation. Response types are missing. Headers aren’t properly documented. Edge cases aren’t covered. The reality of your API doesn’t quite match what’s in the specs.

    You roll up your sleeves and try to fix the issues, but you find yourself constantly bouncing between controller code, the DSL-based OpenAPI spec, DSL documentation, and OpenAPI docs. What should be a straightforward task feels like navigating a maze blindfolded.

    Desperate for help, you turn to ChatGPT, hoping AI might save the day. The results are disastrous. GPT keeps inventing non-existent DSL keywords. At one point, it suggests using Java decorators in your Elixir code. You stare at your screen in disbelief, realizing you’ll have to painstakingly write everything by hand.

    But there’s one more surprise waiting: a client needs a feature that was introduced in the latest OpenAPI spec version, but your DSL library hasn’t been updated to support it. With a sinking feeling, you realize you’ll need to fork the library to add support for this feature.

    What began as a simple documentation task has snowballed into a full-blown technical nightmare. But hey, it works! On your machine. At least. You think.

    Libraries, Libraries… How About Books?

    All this madness—the constant switching between DSL and OpenAPI documentation, the failed attempts at AI assistance, the library modifications—could have been avoided entirely. Why not simply use YAML? What benefit did that fancy library actually provide?

    After battling similar libraries in Python and Elixir, I can’t find a compelling reason why I chose them other than my persistent habit of wanting to solve everything in the most “pythonic” or “elixirish” way—which usually translates to installing a library so I don’t have to think too deeply about the problem.

    Let’s examine a simple example to illustrate the difference:

    The DSL Approach

    # In your controller:
    defmodule MyAppWeb.UserController do
      use MyAppWeb, :controller
      use OpenApiSpex.ControllerSpecs
      
      alias MyAppWeb.Schemas.{UserParams, UserResponse}
      
      tags ["users"]
      security [%{}, %{"petstore_auth" => ["write:users", "read:users"]}]
    
      operation :update,
        summary: "Update user",
        parameters: [
          id: [in: :path, description: "User ID", type: :integer, example: 1001]
        ],
        request_body: {"User params", "application/json", UserParams},
        responses: [
          ok: {"User response", "application/json", UserResponse}
        ]
    
      def update(conn, %{"id" => id}) do
        json(conn, %{
          data: %{
            id: id,
            name: "joe user",
            email: "joe@gmail.com"
          }
        })
      end
    end
    
    # In your schemas module:
    defmodule MyAppWeb.Schemas.UserParams do
      alias OpenApiSpex.Schema
      require OpenApiSpex
    
      OpenApiSpex.schema(%{
        type: :object,
        properties: %{
          email: %Schema{type: :string},
          name: %Schema{type: :string},
          callback_url: %Schema{type: :string}
        }
      })
    end
    
    defmodule MyAppWeb.Schemas.UserResponse do
      alias OpenApiSpex.Schema
      require OpenApiSpex
    
      OpenApiSpex.schema(%{
        type: :object,
        properties: %{
          data: %Schema{
            type: :object,
            properties: %{
              id: %Schema{type: :string},
              email: %Schema{type: :string},
              name: %Schema{type: :string}
            }
          }
        }
      })
    end

    This approach requires setup code scattered across multiple files, understanding of both Elixir syntax and the DSL’s unique abstractions, and careful coordination between your actual controller logic and the spec definition.

    What Gets Generated

    ---
    components:
      responses: {}
      schemas:
        UserParams:
          properties:
            callback_url:
              type: string
              x-struct:
              x-validate:
            email:
              type: string
              x-struct:
              x-validate:
            name:
              type: string
              x-struct:
              x-validate:
          title: UserParams
          type: object
          x-struct: Elixir.MyAppWeb.Schemas.UserParams
          x-validate:
        UserResponse:
          properties:
            email:
              type: string
              x-struct:
              x-validate:
            id:
              type: string
              x-struct:
              x-validate:
            name:
              type: string
              x-struct:
              x-validate:
          title: UserResponse
          type: object
          x-struct: Elixir.MyAppWeb.Schemas.UserResponse
          x-validate:
    info:
      title: My App
      version: '1.0'
    openapi: 3.0.0
    paths:
      /api/users/{id}:
        put:
          callbacks: {}
          operationId: MyAppWeb.UserController.update
          parameters:
            - description: User ID
              example: 1001
              in: path
              name: id
              required: true
              schema:
                type: integer
                x-struct:
                x-validate:
          requestBody:
            content:
              application/json:
                schema:
                  $ref: '#/components/schemas/UserParams'
            description: User params
            required: false
          responses:
            200:
              content:
                application/json:
                  schema:
                    $ref: '#/components/schemas/UserResponse'
              description: User response
          security:
            - {}
            - petstore_auth:
                - write:users
                - read:users
          summary: Update user
          tags:
            - users
    security: []
    servers:
      - url: http://localhost:4000
        variables: {}
    tags: []

    Notice the verbose output, littered with Elixir-specific annotations like x-struct and x-validate. Also note that the generated spec fails to reflect the actual response structure, which wraps the fields in a data object.

    The Direct YAML Approach

    openapi: 3.0.3
    info:
      title: User API
      description: API for managing users
      version: 1.0.0
    servers:
      - url: https://api.example.com/v1
        description: Main API server
    paths:
      /users/{id}:
        put:
          tags:
            - users
          summary: Update user
          operationId: updateUser
          parameters:
            - name: id
              in: path
              description: User ID
              required: true
              schema:
                type: integer
                example: 1001
          requestBody:
            description: User params
            required: true
            content:
              application/json:
                schema:
                  $ref: '#/components/schemas/UserParams'
          responses:
            '200':
              description: User response
              content:
                application/json:
                  schema:
                    $ref: '#/components/schemas/UserResponse'
          security:
            - {}
            - petstore_auth:
                - write:users
                - read:users
    components:
      schemas:
        UserParams:
          type: object
          properties:
            name:
              type: string
              example: "joe user"
            email:
              type: string
              format: email
              example: "joe@gmail.com"
            callback_url:
              type: string
          required:
            - name
            - email
        UserResponse:
          type: object
          properties:
            data:
              type: object
              properties:
                id:
                  type: integer
                  example: 1001
                name:
                  type: string
                  example: "joe user"
                email:
                  type: string
                  format: email
                  example: "joe@gmail.com"
              required:
                - id
                - name
                - email

    The YAML approach correctly captures the nested data structure and provides cleaner semantics with standard OpenAPI constructs like proper email format validation.

    The AI Challenge: DSLs vs. Standard Formats

    If you’ve compared how well LLMs fare with Elixir versus JavaScript, you’ll have noticed they do a much better job with JavaScript (although they still fumble the basics there too) than with less popular languages like Elixir.

    The problem with LLMs and less common language-DSL combinations runs deeper than simple unfamiliarity. These models are trained predominantly on popular languages and common patterns, giving them strong capabilities with JavaScript or Python but limited understanding of Elixir-specific constructs. When you add a DSL layer that translates to OpenAPI, you’re essentially asking the AI to:

    1. Understand your programming language’s syntax
    2. Interpret the custom DSL’s relationship to that syntax
    3. Map that back to standard OpenAPI constructs
    4. Generate appropriate documentation

    This creates multiple points of potential failure. Common hallucinations include inventing non-existent DSL keywords, mixing syntax from different programming languages, or suggesting impossible combinations of parameters.

    By contrast, when working with pure YAML OpenAPI specifications, you remove two layers of complexity, allowing the AI to focus directly on the specification format it has seen thousands of examples of during training.

    Now add the factor of using a less popular library AND the library’s reliance on a custom DSL. You’re in for LSD, not LLM code.

    When I switched to a single, simple OpenAPI spec YAML file, it required far less manual intervention than the DSL approach; honestly, at that point it was even easier to write the YAML by hand.

    For the AI Doubters

    “But I still want to understand what’s being generated by the LLM.” Fair enough. It is inevitable that you will encounter hallucinations at some point and will need to examine the YAML closely. Fear not! The OpenAPI ecosystem is mature and well-documented.

    If you’re new to writing raw OpenAPI specifications, there’s an abundance of well-maintained resources: the official OpenAPI Specification itself, the Swagger documentation and editor, and countless community examples.

    These resources are continually updated as the specification evolves, unlike many language-specific DSL libraries that may lag behind specification changes.

    You can start reading the OpenAPI docs or jump into StackOverflow right away. The world is your oyster.

    Addressing the Challenges of Pure OpenAPI

    The ‘Write It Yourself’ Argument

    I anticipate some pushback: “This is just a skill issue. You should be able to read the DSL’s source code and understand how it works. Just fork it and submit a PR! You just have to learn to write faster!”

    This argument misses the point. Even the most skilled developers shouldn’t need to become experts in a library’s internal implementation just to document an API. The goal is to communicate your API’s capabilities clearly, not to demonstrate mastery of esoteric DSLs.

    File Size and Management Issues

    I don’t want to oversell YAML specs. Most LLMs have file size/character/token limits, and your YAML file might grow beyond 2,000 lines. This is a legitimate concern with these specs.

    Managing Large Specifications

    When your specification grows beyond a few thousand lines, it becomes unwieldy and may exceed token limits for LLMs. Fortunately, tools like openapi-merge offer an elegant solution.

    This tool lets you split your specification into logical components (by resource type or domain) and merge them during your build process. While this introduces additional tooling, the organizational benefits outweigh the minimal complexity. Each team can own their portion of the API documentation, reducing merge conflicts and encouraging better maintenance.

    A typical setup might look like:

    openapi/
    ├── main.yaml
    ├── components/
    │   ├── schemas/
    │   │   ├── users.yaml
    │   │   └── products.yaml
    │   └── paths/
    │       ├── users.yaml
    │       └── products.yaml

    The main.yaml file references these components, and openapi-merge assembles them into a complete specification. This approach strikes a balance between maintainability and complexity.
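    With openapi-merge-cli (the CLI companion to openapi-merge), assembly is driven by a small JSON config. A sketch of what that config might look like for the layout above; the file names are illustrative and the exact option names should be double-checked against the tool’s own documentation:

```json
{
  "inputs": [
    { "inputFile": "./main.yaml" },
    { "inputFile": "./components/paths/users.yaml" },
    { "inputFile": "./components/paths/products.yaml" }
  ],
  "output": "./openapi.yaml"
}
```

    Running the CLI during your build then produces the single merged file that you hand to Swagger UI, linters, and test tools.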

    When working with split YAML files, you need tooling to stitch them together. This can make working with chat-based LLMs more challenging, though tools like Cursor are making progress in this area.
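    The stitching itself is not magic: conceptually it boils down to a recursive dictionary merge. A minimal stdlib sketch, using inline JSON fragments for illustration (the dicts main_spec, user_paths, and user_schemas are hypothetical stand-ins for the split files):

```python
import json

def deep_merge(base: dict, extra: dict) -> dict:
    """Recursively merge `extra` into `base`, favoring `extra` on conflicts."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(merged.get(key), dict) and isinstance(value, dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

# Hypothetical fragments standing in for main.yaml and the split component files.
main_spec = {"openapi": "3.0.3", "info": {"title": "User API", "version": "1.0.0"}, "paths": {}}
user_paths = {"paths": {"/users/{id}": {"put": {"summary": "Update user"}}}}
user_schemas = {"components": {"schemas": {"UserParams": {"type": "object"}}}}

# Fold the fragments into one complete specification document.
spec = deep_merge(deep_merge(main_spec, user_paths), user_schemas)
print(json.dumps(spec, indent=2))
```

    Real-world tooling adds $ref rewriting and conflict detection on top of this, which is why reaching for an existing merger usually beats rolling your own.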

    The Functionality Trade-Off

    Abandoning DSL libraries means losing some integrated features that initially seem valuable:

    1. Request validation: Many DSL libraries automatically validate incoming requests against your specification.
    2. Response formatting: Some libraries ensure your responses match the documented structure.
    3. Type generation: DSLs might generate type definitions from your specification.

    However, these conveniences come with hidden costs. In dynamically typed languages like Elixir and Python, the validation can be overly strict or incorrectly implemented. I’ve seen valid requests rejected due to minor specification inconsistencies, causing hard-to-debug production incidents.

    Instead, consider testing your spec in dev/staging environments and checking if the contracts are followed with tools like wiretap.

    Essential Tools for YAML-Based OpenAPI

    There was a moment of enlightenment when I searched for “[THING] awesome github.” Lo and behold! I found many interesting tools for OpenAPI specs at https://openapi.tools/.

    Here are some that I found particularly valuable:

    • Vacuum (https://quobix.com/vacuum): A powerful linter that identifies specification violations, inconsistencies, and areas for improvement in your OpenAPI documents. It helped me catch numerous subtle issues that would have caused problems for API consumers.

        # Example vacuum command
        vacuum lint openapi.yaml
    • Wiretap (https://pb33f.io/wiretap): This DevTools-style utility analyzes API compliance in real time, showing exactly how your implementation differs from your specification. This is pretty useful when refactoring existing endpoints to match your documentation.

    • Schemathesis (https://github.com/schemathesis/schemathesis): Perhaps the most impressive tool in the collection, Schemathesis automatically generates comprehensive test cases based on your OpenAPI specification. It found edge cases in our parameter handling that we hadn’t considered and significantly improved our API’s robustness.

        # Example schemathesis command
        st run http://localhost:4000/dev/swagger/swagger.yaml --experimental=openapi-3.1 --stateful links --checks all

    These tools form a powerful ecosystem that validates your specifications, ensures implementation compliance, and tests your API thoroughly—capabilities that most DSL approaches simply don’t offer.

    With these tools, I was able to identify overlooked issues in LLM-generated YAML and fix them by pasting the warnings/errors back into the LLM. It did still require some manual intervention to eliminate all warnings and make schemathesis happy, but the process was streamlined and educational.

    Practical Migration Strategy

    If you’re currently using a DSL-based approach but want to transition to YAML, here’s a pragmatic strategy:

    1. Extract your current spec: Most DSL libraries provide a way to export the generated OpenAPI specification. Use this as your starting point.
    2. Clean up the exported YAML: Remove library-specific extensions and fix any inconsistencies.
    3. Add validation tools to your CI pipeline: Ensure your specification remains valid as it evolves.
    4. Start documenting new endpoints in YAML: Don’t try to migrate everything at once. Use your clean YAML file for new endpoints.
    5. Gradually remove DSL specs: As you touch existing endpoints, move their documentation from the DSL to your YAML file.
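    Step 3 need not wait for a full linter setup: a CI smoke test can start as a few lines of stdlib Python. A hedged sketch assuming a JSON-formatted spec (check_spec and the checks it performs are illustrative, not a substitute for a real linter like vacuum):

```python
import json

REQUIRED_TOP_LEVEL = {"openapi", "info", "paths"}

def check_spec(raw: str) -> list[str]:
    """Return a list of problems found in a JSON-formatted OpenAPI document."""
    spec = json.loads(raw)
    problems = [f"missing top-level key: {key}"
                for key in sorted(REQUIRED_TOP_LEVEL - spec.keys())]
    for path, item in spec.get("paths", {}).items():
        if not path.startswith("/"):
            problems.append(f"path does not start with '/': {path}")
        for method, operation in item.items():
            if "responses" not in operation:
                problems.append(f"{method.upper()} {path} has no documented responses")
    return problems

# A deliberately incomplete spec: the PUT operation documents no responses.
raw = json.dumps({
    "openapi": "3.0.3",
    "info": {"title": "User API", "version": "1.0.0"},
    "paths": {"/users/{id}": {"put": {"summary": "Update user"}}},
})
problems = check_spec(raw)
print(problems)  # → ['PUT /users/{id} has no documented responses']
```

    Failing the build when problems is non-empty keeps obvious gaps out of the exported spec while you phase in the heavier tooling.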

    This approach minimizes risk while steadily moving toward a more maintainable documentation strategy.

    Conclusion

    Throughout this article, we’ve explored several key insights about OpenAPI documentation approaches:

    1. While DSL libraries promise integration and convenience, they often create more problems than they solve as projects grow in complexity.
    2. Domain-specific languages add an extra translation layer between your code and the actual OpenAPI specification, creating confusion and maintenance challenges.
    3. AI tools like ChatGPT struggle significantly with DSLs but perform reasonably well with standard YAML specifications.
    4. The ecosystem of tools supporting raw OpenAPI specifications is robust and language-agnostic, offering features like linting, testing, and compliance checking.
    5. Managing large YAML files does present challenges, but these can be addressed with appropriate tooling and organizational practices.

    Would I recommend rewriting your current DSL-based documentation? No! Unless it’s already broken beyond repair, of course. It would simply take too much time to rewrite a large API and then test it thoroughly.

    Would I recommend sticking to regular YAML for documenting REST APIs in a new project instead of using DSLs? Absolutely! You might not immediately notice how much future hassle you’re saving yourself. But trust me, it’s much better not to worry about issues like deprecated libraries.

    Would I recommend going with YAML for an existing project that lacks OpenAPI documentation? 100%. The tools listed above are extremely helpful, especially when you already have clients using your service and you want to verify that everything complies with the spec you’re writing.

    In the end, the simpler approach wins—particularly in a world increasingly augmented by AI assistance. By choosing standard formats over custom DSLs, you position yourself to leverage both existing tools and future AI capabilities more effectively.

    Maksymilian Jodłowski - Elixir Developer