Three Tiers of Decoupling in Elixir

Allow me to weave you a tale of progressively stronger decoupling in Elixir.

I am working on a library that involves access to graph data, informed by a schema or model. Just to get things rolling I started out with a convention of holding graph nodes in a map:

map_graph = %{
  "node1" => %{
    "class_id" => "class1",
    "data" => %{"testProp" => "value"}
  },
  "node2" => %{
    "class_id" => "class1",
    "data" => %{"testProp" => "value2"}
  }
}

# Look up a node by id, then drill into its data.
node = map_graph["node1"]
value = node["data"]["testProp"]

It’s quick, dirty, and gets the job done for proof-of-concept purposes. Clearly it’s inadequate for other data access patterns we’d like to be able to support. We can write transformation routines against this structure, but they won’t be able to work with anything else.
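To make that coupling concrete, here is a minimal sketch of a transformation routine written directly against the map convention. The module name `MapOnlyTransforms` and the function `prop_values/2` are hypothetical, invented for illustration; the point is only that the routine reaches into the `"data"` key and therefore cannot serve any other representation.

```elixir
defmodule MapOnlyTransforms do
  # Hypothetical helper: collect one property from every node.
  # It indexes straight into the "data" key, so it only works on the
  # plain-map convention and nothing else.
  def prop_values(map_graph, prop) do
    Map.new(map_graph, fn {node_id, node} -> {node_id, node["data"][prop]} end)
  end
end

map_graph = %{
  "node1" => %{"class_id" => "class1", "data" => %{"testProp" => "value"}},
  "node2" => %{"class_id" => "class1", "data" => %{"testProp" => "value2"}}
}

values = MapOnlyTransforms.prop_values(map_graph, "testProp")
IO.inspect(values)
```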

It’s time for some decoupling.

Behaviors

A Behavior defines an interface or contract that implementing modules must follow. It’s like saying “if you want to be a graph node accessor, you must implement these functions.”

defmodule Graph do
  # Two arguments, matching the get_node/2 implementations below.
  @callback get_node(data :: any(), node_id :: any()) :: any()
end

defmodule Graph.MapImpl do
  @behaviour Graph

  defstruct [:data]

  @impl true
  def get_node(data, node_id) do
    data[node_id]
  end
end

Now we can call the implementation directly:

Graph.MapImpl.get_node(map_graph, "node1")

This also gives us the flexibility to define a completely different implementation, one that dishes up nodes from the filesystem based on a root path:

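defmodule Graph.FilesystemImpl do
  @behaviour Graph

  @impl true
  def get_node(%{root: root_path}, node_id) do
    full_path = resolve_path(root_path, node_id)

    case File.stat(full_path) do
      {:ok, %File.Stat{type: :regular}} ->
        %{
          class: "file",
          data: file_data(full_path)
        }

      {:ok, %File.Stat{type: :directory}} ->
        %{
          class: "folder",
          data: folder_data(full_path)
        }

      # Other file types (devices and the like) are treated as "no node".
      {:ok, _other_type} -> nil
      {:error, _} -> nil
    end
  end

  # The helpers below are illustrative guesses, not the original definitions.
  defp resolve_path(root_path, node_id), do: Path.join(root_path, node_id)
  defp file_data(path), do: File.read!(path)
  defp folder_data(path), do: File.ls!(path)
end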

Notice that here the data parameter of get_node/2 serves a completely different purpose: instead of holding the nodes, it points to where the data lives. This is the first strange smell in this approach.

Now it’s possible to write functions that work with either the map or the filesystem implementation. But you still need to know which one you’re using, and the data needs to be shaped correctly for it:

Graph.MapImpl.get_node(%{"node" => ...}, "node")
# The filesystem implementation pattern-matches on the atom key :root.
Graph.FilesystemImpl.get_node(%{root: "."}, "node")

You can make things more convenient with a wrapper struct that carries the implementation and the data, and delegates the functions:

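defmodule Graph.Instance do
  # impl is the module implementing the Graph behaviour; data is whatever
  # that module expects as its first argument.
  defstruct [:impl, :data]

  @type t :: %__MODULE__{
          impl: module(),
          data: any()
        }

  # Delegate to the implementation carried by the instance.
  def get_node(%__MODULE__{impl: impl, data: data}, node_id) do
    impl.get_node(data, node_id)
  end
end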

Then you can write transformation routines that work on instances, and are really independent of the underlying representation.
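As a rough sketch of what such a routine might look like, the block below restates the behaviour and the wrapper, adds a second in-memory implementation, and runs one transform over both. `Graph.ListImpl`, `Graph.Transforms`, and `known_ids/2` are hypothetical names introduced here for illustration; the transform only asks whether a node exists, so it never depends on how either implementation shapes its nodes.

```elixir
defmodule Graph do
  # Behaviour contract: look up one node by id in some backing data.
  @callback get_node(data :: any(), node_id :: any()) :: any()
end

defmodule Graph.MapImpl do
  @behaviour Graph

  @impl true
  def get_node(data, node_id), do: data[node_id]
end

# Hypothetical second implementation: a list of {id, node} pairs.
defmodule Graph.ListImpl do
  @behaviour Graph

  @impl true
  def get_node(pairs, node_id) do
    case List.keyfind(pairs, node_id, 0) do
      {_id, node} -> node
      nil -> nil
    end
  end
end

defmodule Graph.Instance do
  defstruct [:impl, :data]

  def get_node(%__MODULE__{impl: impl, data: data}, node_id) do
    impl.get_node(data, node_id)
  end
end

defmodule Graph.Transforms do
  # Hypothetical routine: keep only the ids that resolve to a node.
  # It talks to Graph.Instance alone and never to a concrete implementation.
  def known_ids(%Graph.Instance{} = graph, node_ids) do
    Enum.filter(node_ids, fn id -> Graph.Instance.get_node(graph, id) != nil end)
  end
end

node = %{"class_id" => "class1"}
map_backed = %Graph.Instance{impl: Graph.MapImpl, data: %{"node1" => node}}
list_backed = %Graph.Instance{impl: Graph.ListImpl, data: [{"node1", node}]}

from_map = Graph.Transforms.known_ids(map_backed, ["node1", "missing"])
from_list = Graph.Transforms.known_ids(list_backed, ["node1", "missing"])
```

Both calls go through the same routine; the only place the implementation is named is where the instance is built.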

Protocols

Protocols provide a different form of decoupling. Here’s the equivalent to the Behavior definition from above:

defprotocol Graph do
  def get_node(data, node_id)
end

Then the implementation comes in two parts, a struct that holds the data and a defimpl that teaches the protocol about it:

defmodule Graph.MapGraph do
  defstruct [:nodes]

  def new() do
    %__MODULE__{
      nodes: %{}
    }
  end
end

defimpl Graph, for: Graph.MapGraph do

  @impl true
  def get_node(%{nodes: graph}, node_id) do
    graph[node_id]
  end
end

Now we don’t need to pass around the Instance wrapper with its impl. Our transforms are written for “anything that can do X” rather than “modules that promise to do X”. The data itself carries the behavior.

graph = Graph.MapGraph.new()
Graph.get_node(graph, "node")
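Here is a small sketch of a transform written against the protocol rather than against an instance wrapper. `Graph.Transforms` and `known_ids/2` are hypothetical names used only for illustration. The last part shows the flip side of “anything that can do X”: a value with no implementation is rejected when the call is made, with `Protocol.UndefinedError`.

```elixir
defprotocol Graph do
  def get_node(data, node_id)
end

defmodule Graph.MapGraph do
  defstruct nodes: %{}
end

defimpl Graph, for: Graph.MapGraph do
  def get_node(%{nodes: nodes}, node_id), do: nodes[node_id]
end

defmodule Graph.Transforms do
  # Hypothetical routine: keep only the ids that resolve to a node.
  # It accepts any value that has a Graph implementation.
  def known_ids(graph, node_ids) do
    Enum.filter(node_ids, fn id -> Graph.get_node(graph, id) != nil end)
  end
end

graph = %Graph.MapGraph{nodes: %{"node1" => %{"class_id" => "class1"}}}
known = Graph.Transforms.known_ids(graph, ["node1", "missing"])

# An atom has no Graph implementation, so dispatch fails at call time.
unsupported =
  try do
    Graph.get_node(:not_a_graph, "node1")
  rescue
    Protocol.UndefinedError -> :no_implementation
  end
```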

The extra cool part is that since the defimpl is separate from the defmodule, you can use protocols to extend existing types. So imagine we have SomeLibrary.DataStructure that could be interpreted as a graph. With Behaviors we would need to write an entire wrapper module. But with protocols we just write a defimpl Graph, for: SomeLibrary.DataStructure. Then when we pass that structure to some graph routine, the function calls look exactly the same:

graph = SomeLibrary.DataStructure.new()
Graph.get_node(graph, "node")

This is a level of decoupling and flexibility that’s hard to imagine in traditional OOP languages.
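Since SomeLibrary.DataStructure is only imagined, a runnable sketch of the same idea can use a type that really is outside our control: the built-in Map. Under that assumption, the plain map of nodes from the start of this post becomes a valid graph without any wrapper module.

```elixir
defprotocol Graph do
  def get_node(data, node_id)
end

# Map is a built-in type we do not own; the implementation lives
# entirely on our side and the type itself is left untouched.
defimpl Graph, for: Map do
  def get_node(map, node_id), do: Map.get(map, node_id)
end

map_graph = %{
  "node1" => %{"class_id" => "class1", "data" => %{"testProp" => "value"}},
  "node2" => %{"class_id" => "class1", "data" => %{"testProp" => "value2"}}
}

node = Graph.get_node(map_graph, "node1")
missing = Graph.get_node(map_graph, "no_such_node")
```

One thing to keep in mind with this sketch: an implementation for Map applies to plain maps only. Structs dispatch on their own module name, so each struct still needs its own defimpl.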

Summary

With behaviors, you call a specific implementation module, as in Graph.MapImpl.get_node(graph, id), or go through a wrapper like Graph.Instance. With protocols, you call Graph.get_node(graph, id) regardless of the graph’s actual type.

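Behaviors are about modules promising to implement certain functions. The module author decides to follow the behavior contract. Complexity is pushed into the infrastructure: something, whether the caller or a wrapper like Graph.Instance, has to keep track of which module goes with which data.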

Protocols are about data types being extended with new capabilities. You can implement a protocol for any existing type, even ones you didn’t create. Complexity is pushed into the type system: the implementation is selected from the type of the first argument.

I think there’s a nice categorical perspective as well: behaviors as functors in a slice category, and protocols as natural transformations, with the “legs” provided by the implementation. The naturality condition looks something like:

graph |> Graph.get_node("node") == graph.nodes |> Map.get("node")

But I’d need to spend some time with pen and paper to convince myself on this one.
