While we can't say cheating on anyone is okay, we're not as absolutist when it comes to cheating on Elixir at times.

    Structs are there for a reason (we'll start from a brief overview), and that's certainly not for us to cheat on them. But we can if we have to - and we'll sometimes even justify that and get away with it!

    Today's article will come in handy especially for those who are interested in developing libraries for Elixir and making them usable across different dependency versions, which is always a problem when writing code intended to be pluggable into different applications.

    Welcome to Elixir Trickery, a series of articles telling stories about utilizing little-known language features, applying out-of-the-box thinking to programming, and going inventive and creative in your coding.

    Introduction to Structs

    What is a struct? According to Elixir's Getting Started tutorial, structs are extensions built on top of maps that provide compile-time checks and default values. So first there are maps, one of Elixir's basic data structures, which provide a means to store key-value pairs. Just to recap, or to show it off:

    # Defining a map
    > map = %{:key => :value}
    %{key: :value} # alternative syntax when a key is an Atom
    
    # Retrieving a value from a map
    > Map.get(map, :key)
    :value
    > map.key
    :value
    > map[:key] # This is the Access behaviour - we'll talk about it later
    :value
    
    # Trying to retrieve nonexistent key
    > Map.get(map, :foo)
    nil
    > map.foo
    ** (KeyError) key :foo not found in: %{key: :value}
    > map[:foo]
    nil

    Values can be retrieved from maps in three ways: Map.get/3 (the optional third argument is a default value), the "dot" syntax (which is, as you can see, quite strict, because it fails when the given key isn't in the map), and the [] syntax, which is courtesy of Elixir's Access behaviour - we'll return to that.
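
    As a small sketch of how these lookups differ (the variable names are ours and purely illustrative), Map.get/3 with a default and Map.fetch/2 let you tell a missing key apart from a key that holds nil:

```elixir
map = %{key: :value, empty: nil}

# Map.get/3 returns the given default only when the key is missing
with_default = Map.get(map, :foo, :fallback)
stored_nil = Map.get(map, :empty, :fallback)

# Map.fetch/2 makes the miss explicit instead of returning nil
found = Map.fetch(map, :key)
missing = Map.fetch(map, :foo)

IO.inspect({with_default, stored_nil, found, missing})
```

    Note that a key explicitly set to nil is still "found", so the default is not used for it.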

    When it comes to updating maps, you're really not doing what you might be used to in all sorts of different languages, because - since values in Elixir are immutable - you're creating a new map.

    # Returning a new map with a new key, or an updated value under :key
    > Map.put(map, :new_key, :new_value)
    %{key: :value, new_key: :new_value}
    > Map.put(map, :key, :new_value)
    %{key: :new_value}
    
    # Merging maps
    > Map.merge(map, %{new_key: :new_value, foo: :bar})
    %{foo: :bar, key: :value, new_key: :new_value}
    
    # Shorthand for returning a new map with updated value under :key
    > %{map | key: :new_value}
    %{key: :new_value}
    
    # ...the shorthand doesn't work for putting new keys, though:
    > %{map | new_key: :value}
    ** (KeyError) key :new_key not found in: %{key: :value}

    Maps can be pattern matched on:

    # Pattern matching on a map
    > %{key: matched_value} = map
    %{key: :value}
    > matched_value
    :value
    
    # Pattern matching on a map in a function argument
    > function = fn %{key: matched_value} ->
    >   String.upcase(matched_value)
    > end
    #Function<6.128620087/1 in :erl_eval.expr/5>
    > function.(%{key: "Awesome!"})
    "AWESOME!"

    The pattern matching part is particularly awesome, because you can pattern match on nested maps as well:

    > map = %{outer_key: :outer_value, inner_map: %{inner_key: :inner_value}}
    > %{outer_key: outer_match, inner_map: %{inner_key: inner_match}} = map
    > inner_match
    :inner_value
    > outer_match
    :outer_value

    Finally, Structs!

    Now, structs are an extension of maps. Defining a struct like this:

    defmodule CuriosumTime do
      defstruct [:hour, :minute, :second]
    end

    ...allows you to create maps on steroids, that is, maps that must only contain specific keys. In this case, we've created a module named CuriosumTime, which uses the Kernel.defstruct/1 macro to define a set of fields that all structs following the CuriosumTime contract will be restricted to. How to use this restriction? Here's an example:

    > time1 = %CuriosumTime{hour: 21, minute: 37, second: 42}
    %CuriosumTime{hour: 21, minute: 37, second: 42}
    
    # Missing values will be filled with nil
    > time2 = %CuriosumTime{}
    %CuriosumTime{hour: nil, minute: nil, second: nil}
    
    # Unknown keys will be rejected
    > %CuriosumTime{foo: 1}
    ** (KeyError) key :foo not found

    As you can see, default values for defined struct keys are nil, unless you use defstruct with a keyword list:

    defstruct [hour: 12, minute: 0, second: 0] # [] can be omitted

    ...so that these will default to what you've specified. With the original definition (no defaults), accessing the structs' keys returns the following:

    > time1.hour
    21
    
    > time2.hour
    nil

    The standard way to retrieve values under struct keys is to use the dot syntax, because it raises when you ask for a nonexistent key. You can also use Map.get/3 if you need to. How about the [] syntax, though?

    > time1[:hour]
    ** (UndefinedFunctionError) function CuriosumTime.fetch/2 is undefined (CuriosumTime does not implement the Access behaviour)
        CuriosumTime.fetch(%CuriosumTime{hour: 21, minute: 37, second: 42}, :hour)

    This is because the [] syntax relies on Elixir's Access behaviour, which for a struct dispatches to the fetch/2 function of the struct's module - here, CuriosumTime.fetch/2. For a struct to be accessible with [], you need to implement this behaviour in your struct's module, which means e.g. defining the fetch/2 function - we won't get into much detail on it, but let a library named StructAccess serve as an example of how you can do that.
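
    To give a rough idea of what implementing the behaviour involves, here is a minimal sketch. AccessibleTime is a hypothetical variant of our struct, and the way pop/2 resets a field to nil is our own design choice, not necessarily what StructAccess does:

```elixir
defmodule AccessibleTime do
  # Hypothetical variant of CuriosumTime that opts into the Access behaviour
  @behaviour Access
  defstruct [:hour, :minute, :second]

  @impl Access
  def fetch(time, key), do: Map.fetch(time, key)

  @impl Access
  def get_and_update(time, key, fun) do
    case fun.(Map.get(time, key)) do
      {current, new_value} -> {current, Map.put(time, key, new_value)}
      :pop -> pop(time, key)
    end
  end

  @impl Access
  def pop(time, key) do
    # A struct should keep all of its keys, so "popping" resets the field to nil
    {Map.get(time, key), Map.put(time, key, nil)}
  end
end

# struct!/2 builds the struct at runtime, which also works in a plain script
time = struct!(AccessibleTime, hour: 21, minute: 37, second: 42)

hour = time[:hour]
updated = put_in(time[:minute], 0)
IO.inspect({hour, updated.minute})
```

    With fetch/2 in place the [] syntax works, and get_and_update/3 additionally enables helpers such as put_in/2.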

    To cap off our brief introduction to structs, let's stress that you can also pattern match on a value being of a specific struct type:

    def process_time(%CuriosumTime{} = time) do # our custom time struct
      # ...
    end
    
    def process_time(%Time{} = time) do # Elixir's native time struct
      # ...
    end

    This is useful for cases where you need a single function to process differently structured data.

    And lastly, which is important for our further reasoning, you should know that internally, a struct is just a map with the __struct__ key referring to a specific module. Simple, ain't it?

    > time = %CuriosumTime{hour: 10, minute: 0, second: 0}
    > time.__struct__
    CuriosumTime
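
    A quick way to convince yourself of this is to treat a struct as the map it is. In the sketch below, PlainTime is a hypothetical stand-in for CuriosumTime, used only to keep the snippet self-contained:

```elixir
defmodule PlainTime do
  # Hypothetical stand-in for CuriosumTime
  defstruct [:hour, :minute, :second]
end

time = struct!(PlainTime, hour: 10, minute: 0, second: 0)

# A struct passes the is_map/1 check...
is_a_map = is_map(time)

# ...and :__struct__ shows up among its keys like any other key
keys = time |> Map.keys() |> Enum.sort()

# Map.from_struct/1 drops the :__struct__ key and returns a bare map
bare = Map.from_struct(time)

IO.inspect({is_a_map, keys, bare})
```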

    Pattern matching: %StructName{} vs. %{__struct__: StructName}

    As we've noted, in Elixir, the defstruct construct is used to define a specific structure that describes a Map's requirement for the keys it contains, as well as their default values. For example:

    > defmodule Dog, do: defstruct breed: :mongrel, age: nil
    > dog = %{__struct__: Dog, age: 5, breed: :husky}
    %Dog{age: 5, breed: :husky}

    What's underlying is just an ordinary Map where Dog is put under the :__struct__ key. This means that you can match it with both of the following syntaxes:

    > %Dog{} = dog
    %Dog{age: 5, breed: :husky}
    > %{__struct__: Dog} = dog
    %Dog{age: 5, breed: :husky}

    Is the %Dog{} syntax just syntactic sugar, then? Well, not exactly. Suppose you have an animal variable and you want to check whether it is a Dog or a Cat... but you don't have the Cat struct defined yet.

    > case animal do
    >   %Dog{} -> IO.puts("Woof!")
    >   %Cat{} -> IO.puts("Meow!")
    > end
    ** (CompileError) iex:37: Cat.__struct__/0 is undefined, cannot expand struct Cat
    
    > case animal do
    >   %{__struct__: Dog} -> IO.puts("Woof!")
    >   %{__struct__: Cat} -> IO.puts("Meow!")
    > end
    Woof!
    :ok

    See the difference? The %Cat{} syntax introduces an additional compile-time check for the actual existence of the matched struct, whereas when simply matching on the __struct__ key, Dog and Cat are just plain Erlang atoms!

    This can make a huge difference when developing a library that needs to be compatible with multiple versions of a dependency - for instance, when dealing with an Ecto.Query's from key, which was a tuple in Ecto 2, but is an Ecto.Query.FromExpr struct (undefined in Ecto 2) from Ecto 3 on.
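
    A minimal sketch of that idea follows. It assumes the shapes described above (a {source, schema} tuple in Ecto 2, a FromExpr struct carrying that tuple under a :source field in Ecto 3). QuerySource and source_name/1 are hypothetical names, and the maps below merely imitate queries so that the snippet runs without Ecto installed:

```elixir
defmodule QuerySource do
  # Hypothetical helper: returns the source name of a query-like map
  # without referencing any Ecto struct at compile time.

  # Ecto 3 style (assumed shape): :from holds an Ecto.Query.FromExpr struct
  def source_name(%{from: %{__struct__: Ecto.Query.FromExpr, source: {name, _schema}}}),
    do: name

  # Ecto 2 style (assumed shape): :from holds a {source, schema} tuple
  def source_name(%{from: {name, _schema}}), do: name
end

# Plain maps imitating the two query shapes
ecto2_like = %{from: {"users", nil}}
ecto3_like = %{from: %{__struct__: Ecto.Query.FromExpr, source: {"users", nil}}}

IO.inspect({QuerySource.source_name(ecto2_like), QuerySource.source_name(ecto3_like)})
```

    Because Ecto.Query.FromExpr is only ever used as an atom here, the module compiles regardless of which Ecto version (if any) is present.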

    Cheats (never) prosper

    As we've proven that you can cheat on Elixir when it comes to using struct definitions, you can also do it with the keys of a defined struct. Consider the following example, where we define a struct that has an enforced key - note that it is merely a compile-time check and doesn't come with any kind of validation, hence we're able to do this:

    defmodule Foo do
      @enforce_keys [:bar]
      defstruct @enforce_keys
    end
    
    good_foo = %Foo{bar: 1337} # OK
    bad_foo = %Foo{} # error - enforced key missing
    bad_foo = %Foo{bar: 1337, baz: 42} # error - key not found
    cheat_foo = %{__struct__: Foo} # apparently OK!
    cheat_foo = %{__struct__: Foo, bar: 1337, baz: 42} # apparently OK!

    Fine, but where to look for practical applications of this hack? Library developers usually avoid removing keys when creating new library versions, but this may not always be the case. While it's rare, it might turn out that an expected list of a struct's fields, often representing e.g. configuration options, will have an item removed or renamed in a future library revision. This might not sound exciting, but, realistically, you could find it handy in the future when pattern matching against such structs.
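
    Here is what that could look like in practice. Everything in this sketch is made up for illustration: SomeLib.Config is an imaginary struct whose :timeout field we assume was renamed to :recv_timeout in a later release, and ClientOptions is a hypothetical module of ours:

```elixir
defmodule ClientOptions do
  # Matching on the :__struct__ key keeps both clauses compilable no matter
  # which fields the installed version of the (imaginary) struct defines.
  def timeout(%{__struct__: SomeLib.Config, recv_timeout: value}), do: value
  def timeout(%{__struct__: SomeLib.Config, timeout: value}), do: value
end

# Plain maps imitating the struct as built by the old and the new version
old_config = %{__struct__: SomeLib.Config, timeout: 5_000}
new_config = %{__struct__: SomeLib.Config, recv_timeout: 8_000}

IO.inspect({ClientOptions.timeout(old_config), ClientOptions.timeout(new_config)})
```

    Had we written %SomeLib.Config{recv_timeout: value} instead, the clause would be checked against the struct definition at compile time and would fail for any version that lacks that field.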

    Structs from Maps: Kernel.struct/2

    When dealing with data coming from external sources, perhaps provided from an import or an external API, the need to sanitize the data often arises, and structs provide the basic means to do this.

    So let's suppose that you've parsed a dataset into a map. You can call Kernel.struct/2 to turn it into a specific struct, and what's important is that you can control how unknown keys are handled.

    Specifically, there are two similar functions defined in Kernel: struct/2 will filter out keys undefined in the struct's defstruct definition, and will not fail on missing keys defined in @enforce_keys. In contrast, struct!/2 is stricter, failing when it encounters an unknown key or when an enforced key is missing.

    defmodule CuriosumTime do
      @enforce_keys [:hour, :minute, :second]
      defstruct @enforce_keys
    end

    Since @enforce_keys is just a module attribute, you can directly reuse it in defstruct/1; alternatively, you can just provide a plain list, if you only want specific keys to be enforced.

    > data = %{hour: 12, minute: 30, millisecond: 45} # missing :second, extra :millisecond
    
    > struct(CuriosumTime, data)
    %CuriosumTime{hour: 12, minute: 30, second: nil}
    
    > struct!(CuriosumTime, data)
    ** (KeyError) key :millisecond not found in: %CuriosumTime{hour: 12, minute: nil, second: nil}
    
    > struct!(CuriosumTime, data |> Map.delete(:millisecond))
    ** (ArgumentError) the following keys must also be given when building struct CuriosumTime: [:second]

    Interestingly, a well-adopted library for parsing JSON data named Poison contains a decode!/2 function that will do the struct wrapping for you directly from a JSON dataset when passing a specific :as option. However, it looks to be flawed. While the following examples indicate that it's working:

    defmodule CuriosumTime do
      defstruct [:hour, :minute, :second]
    end
    
    json = ~s([
      {
        "hour": 12,
        "minute": 30,
        "second": 40
      },
      {
        "hour": 23,
        "minute": 15,
        "second": 50
      }
    ])
    
    > Poison.decode!(json)
    [
      %{"hour" => 12, "minute" => 30, "second" => 40},
      %{"hour" => 23, "minute" => 15, "second" => 50}
    ]
    
    > Poison.decode!(json, keys: :atoms)
    [%{hour: 12, minute: 30, second: 40}, %{hour: 23, minute: 15, second: 50}]
    
    > Poison.decode!(json, keys: :atoms, as: [%CuriosumTime{}])
    [
      %CuriosumTime{hour: 12, minute: 30, second: 40},
      %CuriosumTime{hour: 23, minute: 15, second: 50}
    ]

    ...problems arise when trying to use @enforce_keys:

    defmodule CuriosumTime do
      @enforce_keys [:hour, :minute, :second]
      defstruct @enforce_keys
    end
    
    > Poison.decode!(json)
    # same as above
    
    > Poison.decode!(json, keys: :atoms)
    # same as above
    
    > Poison.decode!(json, keys: :atoms, as: [%CuriosumTime{}])
    ** (ArgumentError) the following keys must also be given when building struct CuriosumTime: [:hour, :minute, :second]

    Something's looking rather off here. Note that the error message lists all three enforced keys: it is the empty %CuriosumTime{} literal passed in :as that cannot be built once keys are enforced. So while Poison.decode!/2 is fine to use with most of its options, when it comes to creating structs and not just maps from your JSON data, it's better to use Kernel.struct/2 to process data in a way that you control.
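
    One way to keep that control is sketched below, assuming the decoder hands you a string-keyed map. StrictTime and from_decoded/1 are hypothetical names, and returning {:ok, _} / {:error, _} tuples is our own design choice:

```elixir
defmodule StrictTime do
  @enforce_keys [:hour, :minute, :second]
  defstruct @enforce_keys

  # Builds a StrictTime from a string-keyed map, such as a JSON decoder returns
  def from_decoded(%{} = decoded) do
    fields =
      Enum.map(decoded, fn {key, value} ->
        # to_existing_atom/1 avoids creating new atoms from untrusted input
        {String.to_existing_atom(key), value}
      end)

    {:ok, struct!(__MODULE__, fields)}
  rescue
    # KeyError: unknown key; ArgumentError: missing enforced key or unknown atom
    error in [KeyError, ArgumentError] -> {:error, Exception.message(error)}
  end
end

good = StrictTime.from_decoded(%{"hour" => 12, "minute" => 30, "second" => 40})
incomplete = StrictTime.from_decoded(%{"hour" => 12, "minute" => 30})
IO.inspect({good, incomplete})
```

    The strictness of struct!/2 does the validation for us: both an extra key and a missing enforced key end up as an {:error, message} tuple instead of a half-built struct.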

    To go further...

    So we've discussed what Elixir has at its core about structs - they're very useful and used extensively throughout all sorts of well-adopted libraries such as Ecto, where each object retrieved from the database is represented as a struct.

    There are also several cool ways to build upon structs. As you may have noticed, structs are untyped, which means that our CuriosumTime struct can take :ten, "Ten" or anything as the hour - hell, in fact, Elixir's native Time struct also can. If you're into typed structs, it might be worth having a look at a library named typed_struct - though be aware that it relies on typespecs, which are not a true replacement for the type systems known from statically typed languages.
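
    To illustrate the gap between a typespec and an actual check, here is a sketch that uses only plain Elixir, not typed_struct. CheckedTime and new/3 are hypothetical names: the @type merely documents the intended field types for tools such as Dialyzer, while the guards are what reject bad values at runtime.

```elixir
defmodule CheckedTime do
  @enforce_keys [:hour, :minute, :second]
  defstruct @enforce_keys

  # Typespec: describes the intended field types, but enforces nothing at runtime
  @type t :: %__MODULE__{hour: 0..23, minute: 0..59, second: 0..59}

  # Runtime check: the guards reject values that the typespec alone would let through
  @spec new(term, term, term) :: {:ok, t} | :error
  def new(hour, minute, second)
      when hour in 0..23 and minute in 0..59 and second in 0..59 do
    {:ok, %__MODULE__{hour: hour, minute: minute, second: second}}
  end

  def new(_hour, _minute, _second), do: :error
end

IO.inspect({CheckedTime.new(21, 37, 42), CheckedTime.new(:ten, 0, 0)})
```

    Nothing stops other code from building the struct directly with invalid values, so a constructor like this only helps if callers agree to go through it.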

    If you've got something interesting to add to the topic of structs - let us know and drop a comment below!

    FAQ

    What is a struct in the Elixir programming language?

    Structs in Elixir are extensions of maps, providing compile-time checks and default values, essentially acting as more constrained and specific versions of maps.

    How can you retrieve and update values in Elixir structs?

    Values in Elixir structs can be retrieved using Map.get, dot syntax, and pattern matching. Structs are immutable, so updating a value results in the creation of a new struct.

    What is the difference between maps and structs in Elixir?

    While maps are a basic data structure for key-value storage in Elixir, structs add a layer of specificity, requiring defined keys and providing compile-time checks.

    How can you utilize pattern matching with Elixir structs?

    Elixir allows pattern matching on structs, enabling developers to match against specific struct types or nested map structures within structs for more precise data handling.

    What is the significance of the __struct__ key in Elixir structs?

    The __struct__ key in an Elixir struct points to its specific module, distinguishing it from regular maps and tying it to its defined structure.

    How does Elixir's Kernel.struct/2 function work with map data?

    The Kernel.struct/2 function in Elixir converts maps to structs, filtering out undefined keys and ensuring data matches the struct's expected format.

    Why might Elixir developers "cheat" on structs, and what are the implications?

    Developers might bypass struct constraints to accommodate varying dependency versions or handle data dynamically, though this approach requires careful consideration and an understanding of Elixir's compile-time checks.

    How do Elixir structs handle default values and missing keys?

    Elixir structs fill missing values with nil by default, and attempts to include unknown keys result in errors unless using specific Elixir functions designed to handle such scenarios.

    Michał Buszkiewicz, Elixir Developer
    Michał Buszkiewicz Curiosum Founder & CTO
