Multiple pattern-matching concatenations for a single string

Pattern matching is one of Elixir's finest features. Once you have experienced the power of this tool, you will want to use it everywhere.

    Pattern matching makes it easy to split a string and compare part of it against a literal:

    iex(1)> "r" <> _ = "run"
    "run"

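The same prefix match can also bind the remainder of the string, and it works in function heads. The sketch below is only an illustration; the `PrefixDemo` module and its `strip_prefix/1` function are hypothetical names, not part of any library.

```elixir
# Hypothetical module, for illustration only.
defmodule PrefixDemo do
  # The literal prefix must match; the remainder is bound to `rest`.
  def strip_prefix("run" <> rest), do: {:ok, rest}
  def strip_prefix(_other), do: :error
end

# A literal on the left, a variable on the right: the size of the
# literal is known, so the variable simply takes whatever is left.
"r" <> rest = "run"
IO.inspect(rest)                                # "un"
IO.inspect(PrefixDemo.strip_prefix("running"))  # {:ok, "ning"}
IO.inspect(PrefixDemo.strip_prefix("walk"))     # :error
```
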
    The problem appears as soon as you try to bind more than one variable-length part of the string in a single pattern-match clause. Just like here:

    iex(2)> "f" <> _a <> "rre" <> _b <> "t" = "forrest"
    ** (ArgumentError) the left argument of <> operator inside a match should always be a literal binary because its size can't be verified.

    Why? Because with more than one part of unknown size, the match can have more than one solution. Take a look at the beekeeper's problem:

    iex(3)> "a" <> hive1 <> "b" <> hive2 <> "c" = "abbbbbc"

    How would you (or the compiler) know how many b's belong to hive1 and how many to hive2? There is more than one possibility, for example:

    • hive1 = "bbb" and hive2 = "b"
    • hive1 = "bb" and hive2 = "bb"
    • hive1 = "b" and hive2 = "bbb"

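To make the ambiguity concrete, here is a minimal sketch that brute-forces every way the beekeeper's pattern could match. It assumes the literals "a", "b" and "c" from the example above; `HiveDemo.splits/1` is a hypothetical helper written for this illustration. Giving each hive an explicit `binary-size` makes every individual match unambiguous, so the comprehension can simply try all size combinations.

```elixir
# Hypothetical helper, for illustration only.
defmodule HiveDemo do
  # Lists every {hive1, hive2} pair that satisfies
  # "a" <> hive1 <> "b" <> hive2 <> "c" == string.
  def splits(string) do
    max = byte_size(string)

    for n1 <- 0..max,
        n2 <- 0..max,
        # With explicit sizes the pattern either matches or is skipped.
        <<"a", hive1::binary-size(n1), "b", hive2::binary-size(n2), "c">> <- [string] do
      {hive1, hive2}
    end
  end
end

IO.inspect(HiveDemo.splits("abbbbbc"))
# [{"", "bbbb"}, {"b", "bbb"}, {"bb", "bb"}, {"bbb", "b"}, {"bbbb", ""}]
```

Note that empty hives are valid solutions too, so this input has five candidate matches, which is exactly why the compiler refuses to guess.
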
    So, what to do? Bitstrings! Or, more precisely, binaries, which are bitstrings whose number of bits is divisible by 8. Because the size of each segment is known in bytes, several of them can appear inside a single pattern-match clause.

    iex(4)> "f" <> <<_o>> <> "rre" <> <<_s>> <> "t" = "forrest"
    "forrest"
    
    iex(5)> "f" <> <<_o, _r, _r>> <> "est" = "forrest"
    "forrest"

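A short sketch of what those segments actually bind, assuming the same "forrest" string as above. A bare segment such as `<<o>>` is one byte bound as an integer, while `binary-size(n)` binds a run of n bytes as a binary.

```elixir
# A bare segment is a single byte, bound as an integer.
"f" <> <<o>> <> "rrest" = "forrest"
IO.inspect(o)  # 111, the byte value of "o"

# A sized binary segment binds three bytes as a binary.
<<"f", middle::binary-size(3), "est">> = "forrest"
IO.inspect(middle)  # "orr"

# Only the last segment may be left unsized; it takes whatever remains.
<<first::binary-size(1), rest::binary>> = "forrest"
IO.inspect({first, rest})  # {"f", "orrest"}
```
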
    In most cases this works well, as above. Sometimes, though, you will encounter multi-byte characters such as ü, which takes up two bytes in UTF-8.

    iex(6)> "f" <> <<_>> <> "rrest" = "forrest"
    "forrest"
    
    iex(7)> "f" <> <<_>> <> "rrest" = "fürrest"
    ** (MatchError) no match of right hand side value: "fürrest"
    
    iex(7)> "f" <> <<_, _>> <> "rrest" = "fürrest"
    "fürrest"

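The mismatch comes from the difference between bytes and characters. As a quick check, the sketch below compares the two counts for ü (assuming the precomposed character U+00FC in a UTF-8 source file) and shows the two raw bytes that the `<<_, _>>` pattern consumes.

```elixir
# One character, but two bytes in UTF-8.
IO.inspect(String.length("ü"))   # 1
IO.inspect(byte_size("ü"))       # 2
IO.inspect("ü" == <<195, 188>>)  # true

# Two single-byte segments therefore match the two halves of ü.
"f" <> <<a, b>> <> "rrest" = "fürrest"
IO.inspect({a, b})  # {195, 188}
```
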
    An easy workaround is the ::utf8 modifier, which matches one whole code point regardless of how many bytes it occupies.

    iex(8)> "f" <> <<_::utf8>> <> "rrest" = "forrest"
    "forrest"
    iex(9)> "f" <> <<_::utf8>> <> "rrest" = "fürrest"
    "fürrest"

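When the segment is bound to a variable, `::utf8` yields the code point as an integer, and the same modifier turns it back into a string. The `Utf8Demo` module and `pop_char/1` below are hypothetical names used only for this sketch. Note that a code point is not always a full user-perceived character, since combining sequences span several code points.

```elixir
# Hypothetical module, for illustration only.
defmodule Utf8Demo do
  # Splits off the first code point, whatever its byte length.
  def pop_char(<<char::utf8, rest::binary>>), do: {<<char::utf8>>, rest}
  # Empty or invalid UTF-8 input falls through to this clause.
  def pop_char(_other), do: :error
end

"f" <> <<char::utf8>> <> "rrest" = "fürrest"
IO.inspect(char)            # 252, the code point of ü
IO.inspect(<<char::utf8>>)  # "ü"

IO.inspect(Utf8Demo.pop_char("ürrest"))  # {"ü", "rrest"}
IO.inspect(Utf8Demo.pop_char(""))        # :error
```
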
    Happy hacking!

    Oskar Legner, Elixir & React Developer