How database transactions work in Ecto and why Elixir makes it awesome?

Today we're going to look at how Ecto, which is Elixir's first-choice database access library, addresses the issue of handling database transactions. We'll briefly introduce you to the very concept of transaction, then focus on describing the Ecto way of handling them, and explaining how it feels superior to what other languages' libraries offer us in this department. We'll give plenty of examples corresponding to a simple app you can pull from our GitHub repository, so you can have some fun testing it out!

Table of contents

    This is the first instalment of our Elixir Matters series, in which we'll do our best to convince you that (surprise!) Elixir matters, and it matters in many different ways - both at the development side of things, and at the business end, which is why it's just wrong for anyone not to invest time in getting familiar with this language.

    Elixir is not only because of its functional character, concurrency capabilities or other obscure-sounding traits - it's the combination of all those qualities that makes it a very efficient tool in translating business needs to working algorithms running behind exquisite apps.

    What is a database transaction?

    If you're familiar with SQL databases, feel free to skip this section - otherwise, let's delve a bit deeper into the mechanisms of database services.

    Consider a bank transfer, consisting of subtracting money from account A and then adding to B. It is mandatory that both operations are performed inseparably, because no money can be lost.

    Since our systems are capable of handling multiple operations concurrently, we don't want any agent to read either account's balance before the transfer is completed. We require that it not be possible to transfer $100 from A when its balance is only $50 (surprise, eh?).

    On top of that, we obviously want all of this to hold true even when our system crashes.

    What does it mean in technical terms? It means that our database is ACID. This slightly scary term is just an acronym for: Atomicity, Consistency, Isolation & Durability.

    On a basic level, a transaction is a sequence of atomic operations - that is, inseparable from one another; when any of these fails, or it is decided that we don't proceed with the remainder, the database will be rolled back to the state before transaction started.

    Aside from the atomicity, transactions are also required to keep the database consistent, i.e. to keep all defined rules governing the relations between data satisfied at all points of time. This relies on atomicity in a way, because any aborting of a transaction must not leave the database in an inconsistent state.

    Perhaps slightly easier to grasp is the concept of isolation - as transactions are core to how databases handle concurrency, the results of operations executed as part of interweaving transactions should not be 'visible' to other transactions until committed. (Though actually, that's a simplification. Read up if you're interested in the full story).

    On top of all of these traits, we need durability - the database must be crash-resilient, which means all data committed before a crash must still be seen as committed after the system is up again.

    From the application's perspective...

    In object-oriented languages, at the core of popular web frameworks were object-relational mapping libraries, which dealt with managing database connection, constructing queries and mapping the results to objects. So in Java there is Hibernate, .NET has its Entity Framework, and Ruby has ActiveRecord.

    Elixir if free from OOP's heavy, bloated objects, which always carried a lot of obscure internal state information, and relies on simple data structures such as maps to carry all needed information. So the equivalent of ActiveRecord's duties is more or less done by Ecto, though we'll no longer use the object-relational mapping term here - let's just refer to it as the data access layer.

    It hardly matters, though. What matters for us is that these libraries talk to the database to execute operations and transactions, which are committed or rolled back dependent on the app's business logic.

    So in Ruby's beloved (well, really?) ActiveRecord you had this:

    transfer = 50
    
    ActiveRecord::Base.transaction do
      acc_a, acc_b = Account.find([1, 2])
    
      raise ActiveRecord::Rollback, :balance_too_low if acc_a.balance < transfer
    
      acc_a.balance -= transfer
      acc_b.balance += transfer
      acc_a.save!
      acc_b.save!
    
      [acc_a, acc_b]
    end
    
    => [<Account:...>, <Account:...>]

    OK, and what about Elixir and Ecto? Here's a similar code that runs within a transaction a single long function that contains a procedure to execute. Test it out by running iex -S mix run priv/script1.exs from our repository.

    import Ecto.Query
    alias TransactApp.Bank.Account
    alias TransactApp.Repo
    
    transfer = 50
    
    result =
      Repo.transaction(fn ->
        [acc_a, acc_b] = from(acc in Account, where: acc.id in [1, 2]) |> Repo.all()
    
        if acc_a.balance < transfer, do: Repo.rollback(:balance_too_low)
    
        update1 = acc_a |> Account.changeset(%{balance: acc_a.balance - 50}) |> Repo.update!()
        update2 = acc_b |> Account.changeset(%{balance: acc_b.balance + 50}) |> Repo.update!()
    
        {update1, update2}
      end)
    
    => {:ok, [%Account{...}, %Account{...}]}

    At first glance, this doesn't exactly look like much of a win - obviously the language's (and Ecto's) syntax is different, but it's still very procedural.

    We still have to roll the transaction back manually. The difference, though, is that when we do roll back, ActiveRecord will just return nil, and Ecto will return {:error, :balance_too_low}. This is immensely important, because in Rails we would often spend lots of precious time figuring out what the reason of the rollback was. In Ecto, you can easily pattern match on the normal {:ok, ...} scenario, or the expected rollback reasons, and let it fail on those that are not expected.

    There's way more to what you can do to manage transactions in Ecto, though!

    Enter Ecto.Multi

    Now this is something that truly leverages the functional nature of Elixir and allows you to build and manage transactions with better control and more confidence.

    Procedural code, as seen in examples above, is heavy on assigning data to variables and usually there is an assumption that the whole procedure succeeds, and when it doesn't, most languages rely on an exception handling mechanism to handle errors. Elixir can do that too, but it leads to writing solutions with coarse-grained error handling or even to ignoring errors, which is even worse.

    I can't count how many times I've seen folks (myself included...) trying to update a record inside a transaction, which then failed because of validation errors, but the transaction kept executing, and there was an assumption that the update did succeed.

    Functional programming promotes defining procedures as pipelines of functions, and every step (function) in that pipeline has a clear requirement on what data (arguments) it expects to receive. With Elixir's awesome and ubiquitous pattern matching it just feels natural.

    Here's an example code you could write as an Elixir script - it's also available in our repository:

    import Ecto.Query
    alias Ecto.Multi
    alias TransactApp.Bank.Account
    
    transfer_amount = 50
    
    retrieve_accounts = fn repo, _ ->
      case from(acc in Account, where: acc.id in [1, 2]) |> repo.all() do
        [acc_a, acc_b] -> {:ok, {acc_a, acc_b}}
        _ -> {:error, :account_not_found}
      end
    end
    
    verify_balances = fn _repo, %{retrieve_accounts_step: {acc_a, acc_b}} ->
      # we don't do anything to account B, but we could
      if acc_a.balance < transfer_amount,
        do: {:error, :balance_too_low},
        else: {:ok, {acc_a, acc_b, transfer_amount}}
    end
    
    subtract_from_a = fn repo, %{verify_balances_step: {acc_a, _, verified_amount}} ->
      # repo.update will return {:ok, %Account{...}} or {:error, #Ecto.Changeset<...>} -
      # {:ok, value} or {:error, value} is what these functions are expected to return.
      acc_a
      |> Account.changeset(%{balance: acc_a.balance - verified_amount})
      |> repo.update()
    end
    
    add_to_b = fn repo, %{verify_balances_step: {_, acc_b, verified_amount}} ->
      acc_b
      |> Account.changeset(%{balance: acc_b.balance + verified_amount})
      |> repo.update()
    end
    
    batch =
      Multi.new()
      |> Multi.run(:retrieve_accounts_step, retrieve_accounts)
      |> Multi.run(:verify_balances_step, verify_balances)
      |> Multi.run(:subtract_from_a_step, subtract_from_a)
      |> Multi.run(:add_to_b_step, add_to_b)

    Did we just actually run the transaction? Not yet. We defined a number of functions that represent single steps of our pipeline.

    Every function adheres to the same contract:

    • Return {:ok, value},
    • Expect as arguments: the current Repo, and a map.

    Then, a particular function will also pattern match the map to verify that a certain precondition has been met. For instance, if we mistakenly try to use the subtract_from_a function without having a chance to insert the returning value of verify_balances_step into our context, pattern matching will fail.

    Using Ecto.Multi, we then defined a batch of operations as a composition of these functions. Finding a way to have these batches well-organized is just up to you. You can throw them in one of your contexts, or separate it out to a different module.

    Notice how we define retrieve_accounts/2 and verify_balances/1 as functions that return functions, because every time we do a bank transfer, we do it on different accounts and with different amounts, so each time we build the Ecto.Multi pipeline we need these two functions to be slightly different. Aside from that, subtract_from_a/2 and add_to_b/2 are defined as functions straight away, because they rely on what earlier steps have given them as input (that is, both accounts, and the transfer amount) - in this sense, these steps are identical whenever we run the pipeline.

    defmodule TransactApp.Bank.Batches do
      alias Ecto.Multi
      alias TransactApp.Bank.Account
      import Ecto.Query, only: [from: 2]
    
      def transfer_money(acc1_id, acc2_id, amount) do
        Multi.new()
        |> Multi.run(:retrieve_accounts_step, retrieve_accounts(acc1_id, acc2_id))
        |> Multi.run(:verify_balances_step, verify_balances(amount))
        |> Multi.run(:subtract_from_a_step, &subtract_from_a/2)
        |> Multi.run(:add_to_b_step, &add_to_b/2)
      end
    
      defp retrieve_accounts(acc1_id, acc2_id) do
        fn repo, _ ->
          case from(acc in Account, where: acc.id in [^acc1_id, ^acc2_id]) |> repo.all() do
            [acc_a, acc_b] -> {:ok, {acc_a, acc_b}}
            _ -> {:error, :account_not_found}
          end
        end
      end
    
      defp verify_balances(transfer_amount) do
        fn _repo, %{retrieve_accounts_step: {acc_a, acc_b}} ->
          if acc_a.balance < transfer_amount,
            do: {:error, :balance_too_low},
            else: {:ok, {acc_a, acc_b, transfer_amount}}
        end
      end
    
      defp subtract_from_a(repo, %{verify_balances_step: {acc_a, _, verified_amount}}) do
        acc_a
        |> Account.changeset(%{balance: acc_a.balance - verified_amount})
        |> repo.update()
      end
    
      defp add_to_b(repo, %{verify_balances_step: {_, acc_b, verified_amount}}) do
        acc_b
        |> Account.changeset(%{balance: acc_b.balance + verified_amount})
        |> repo.update()
      end
    end

    Then, to actually execute the batch, use Repo.transaction/1. Each step will return {:ok, ...} or {:error, ...} - and Ecto will commit or rollback the transaction depending on that. No need to explicitly call Repo.rollback/1 or raise any errors!

    TransactApp.Bank.Batches.transfer_money(1, 2, 50) |> TransactApp.Repo.transaction()
    
    # returns tuple with map denoting results of each step:
    {:ok, %{retrieve_accounts_step: ..., verify_balances_step: ..., subtract_from_a_step: ..., add_to_b_step: ...}}
    
    # or if an error occurs, it'll return the exact point of failure _and_ all previous step results, which is awesome:
    {:error, :verify_balances_step, :balance_too_low, %{retrieve_accounts_step: ...}}

    A script that pattern matches on this call and the TransactApp.Bank.Batches module are also available in our GitHub repository, so you can play around with it.

    Multi structures can be merged and this is so awesome and powerful, because we can now fully leverage the composability of Elixir.

    There are also numerous awesome shorthands available. Oftentimes a step is just about updating, inserting or deleting a record - and for less complex scenarios we can use it like this (also available in GitHub):

    Multi.new()
    |> Multi.update(:update1_step, Account.changeset(account1, %{balance: 1337}))
    |> Multi.update(:update2_step, Account.changeset(account2, %{balance: 7331}))
    |> Multi.insert(:insert_step, Account.changeset(%Account{}, %{balance: 150}))
    |> Multi.delete_all(:delete_step, Ecto.assoc(account1, :activities))
    |> Multi.update(:update3_step, fn %{insert_step: account} ->
      Account.changeset(account, %{balance: 1234})
    end)
    |> Repo.transaction()

    See that you can operate on pre-existing loaded records if pipeline steps don't rely on context from previous steps, or otherwise you can use a function as the second argument to retrieve the context.

    Conclusion

    Several Elixir libraries, including Ecto, benefit from Elixir's nature of a functional language, and allow for easy definition of data processing pipelines. Its transaction handling facilities are just cool and simply get out of your way.

    Since handling database transactions is an aspect that no serious application can evade, it's awesome that Elixir allows developers to manage them elegantly, which leads to software that does its business well.

    There's more to come in the Elixir Matters series, so stay tuned for updates at Curiosum!

    Michał Buszkiewicz, Elixir Developer
    Michał Buszkiewicz Curiosum Founder & CTO

    Read more
    on #curiosum blog

    Phoenix LiveView Tutorial: Adding Phoenix PubSub and Pow Authentication to Messenger

    We've already bootstrapped our Phoenix LiveView-based Messenger app's database structure and a first LiveView page.

    This time, we're going to improve real-time communication between the app's users using Phoenix PubSub, and use the Pow library to add secure user authentication.

    As of November 2020, the latest Phoenix LiveView version is 0.14.8 - and the series has been updated to match it!

    5 top-tier companies that use Elixir

    Elixir is a pretty capable language - and it consistently ranks near the top of most loved and wanted languages rankings. It has a large following and some very persuasive preachers as well. But that would not be enough to make me like it – what I need as real proof of its strengths is real businesses that strive with Elixir.

    That’s what this list is all about – a bunch of stories from top companies that chose Elixir and never looked back. Let us show you how its power and versatility shows in practice.