Article Software Mansion: Multimedia Guide in Elixir #229

@ndrean

Text from Software Mansion blog: https://blog.swmansion.com/all?topic=webrtc

This subject in action: an "old" Livebook demo: https://github.com/dwyl/WebRTC-SFU-demo

Elixir for Multimedia: A Practical Guide for Developers

Elixir is a great language for building distributed, scalable, and reliable multimedia apps. It runs on the Erlang VM (BEAM), which is known for handling huge numbers of concurrent tasks and recovering from failures instead of taking the whole system down. If you’re working with video streaming, image processing, or WebRTC, Elixir can make your life easier. This article shows you how to use Elixir for multimedia projects, with practical examples and tips.

Elixir Multimedia Libraries
Membrane

Elixir has some great tools for working with multimedia. Membrane is a really important one — it’s a framework built by Software Mansion specifically for creating multimedia pipelines. Think of it as a set of building blocks (elements) that you combine together to create a data flow that helps you process video and audio. An example pipeline responsible for playing MP3 files will consist of three such elements: File Source (responsible for reading data from a file), MP3 Decoder (responsible for converting audio to the raw format), and PortAudio Sink (responsible for playing raw audio in your headphones). So, if you’re working with video streaming, audio processing, or even real-time communication, Membrane is worth checking out.

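Playing an MP3 file with Membrane (this variant fetches the file over HTTP with the Hackney source rather than reading it with a File Source):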

defmodule MP3Player do
  use Membrane.Pipeline

  @impl true
  def handle_init(_ctx, mp3_url) do
    # Source (HTTP download) -> MP3 decoder -> audio output
    spec =
      child(%Membrane.Hackney.Source{
        location: mp3_url,
        hackney_opts: [follow_redirect: true]
      })
      |> child(Membrane.MP3.MAD.Decoder)
      |> child(Membrane.PortAudio.Sink)

    {[spec: spec], %{}}
  end
end

# mp3_url must be bound to the URL of an MP3 file before this call
Membrane.Pipeline.start_link(MP3Player, mp3_url)

Boombox

For those looking for a simpler approach to common streaming tasks, Boombox, built on top of Membrane, provides a more streamlined API. Membrane offers fine-grained control and customization, but you have to compose the pipeline yourself, which means knowing the exact elements and how to combine them. Boombox focuses on ease of use for typical streaming scenarios and provides an even more declarative API: you specify the input and the desired output, and Boombox constructs the appropriate pipeline under the hood. It handles much of the boilerplate, so you can quickly get a streaming server up and running without diving into Membrane’s lower-level components.

Streaming an MP4 file over WebRTC:

Boombox.run(input: "file.mp4", output: {:webrtc, "ws://localhost:8830"})

Or using the CLI:

./boombox -i file.mp4 -o --webrtc ws://localhost:8830

Xav

While Membrane and Boombox are high-level, declarative tools, sometimes you want to write more “imperative” code. This is where Xav comes in: a simple wrapper over FFmpeg, used mostly (but not only) for audio and video encoding and decoding.

Decoding a single video frame with Xav:

decoder = Xav.Decoder.new(:vp8, out_format: :rgb24)
{:ok, %Xav.Frame{} = frame} = Xav.Decoder.decode(decoder, <<"somebinary">>)

Image and Vix

Multimedia is not all about streaming; static images matter too! Libraries like Image and Vix let you manipulate a static image (or a single video frame). Supported operations include resizing, cropping, color conversions, drawing shapes, generating image histograms, working with masks, and much more. Image also integrates with Bumblebee, making it easier to perform image classification or generate images from text prompts.

Increasing image brightness with Image:

image = Image.open!("./test/support/images/cat.png")
{:ok, _brighter_image} = Image.brightness(image, 1.5)

Nx

Then there’s Elixir Nx — a set of machine-learning libraries for the Elixir ecosystem. Convert your audio samples or video frames into tensors, load audio and video models using Bumblebee, and do whatever you want (and need)! You can find more info on GitHub.
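As a rough illustration of the "frames into tensors" step, the sketch below turns a packed RGB24 binary into an Nx tensor and computes the mean of each color channel. The FrameTensor module, the tiny hand-written 2x2 frame, and the Nx version constraint are assumptions made for this example; in a real project the binary would come from a decoder.

```elixir
# Assumption: run as a standalone script; the version constraint is illustrative.
Mix.install([{:nx, "~> 0.9"}])

defmodule FrameTensor do
  # Hypothetical helper, not part of Nx.
  # Converts a packed RGB24 binary (3 bytes per pixel, row by row)
  # into a tensor of shape {height, width, 3} with type u8.
  def from_rgb24(binary, width, height)
      when byte_size(binary) == width * height * 3 do
    binary
    |> Nx.from_binary(:u8)
    |> Nx.reshape({height, width, 3})
  end

  # Averages over the height and width axes, leaving one value per channel.
  def channel_means(tensor), do: Nx.mean(tensor, axes: [0, 1])
end

# A made-up 2x2 frame: top row is pure red, bottom row is pure blue.
frame = <<255, 0, 0, 255, 0, 0, 0, 0, 255, 0, 0, 255>>

tensor = FrameTensor.from_rgb24(frame, 2, 2)
means = tensor |> FrameTensor.channel_means() |> Nx.to_flat_list()

IO.inspect(Nx.shape(tensor), label: "shape")
IO.inspect(means, label: "per-channel mean (R, G, B)")
```

Once a frame is a tensor, the same data can be passed to numerical code or to a model loaded with Bumblebee.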

Evision

Continuing the topic of machine learning, who hasn’t heard about OpenCV? Evision is a library that provides bindings to OpenCV in Elixir. Looking at the number of modules in the documentation, it is highly probable that everything you can do in Python, you can also do in Elixir.

Elixir WebRTC

WebRTC is a standard for real-time audio/video exchange between two peers, in particular between web browsers. Like Go (Pion), Rust (webrtc-rs), and Python (aiortc), Elixir has its own implementation, called Elixir WebRTC. It enables features like video chat, real-time streaming, interactive communication, and integration with LLM services like OpenAI or Google Gemini in your Elixir/Phoenix applications. What’s more, Elixir WebRTC comes with built-in observability and smooth integration with Membrane and Phoenix. Last but not least, it mimics the JavaScript API, so if you are already familiar with WebRTC in a web browser, you should have no trouble using it in Elixir.

For example, this snippet in JS:

const pc = new RTCPeerConnection();
const tr = pc.addTransceiver("audio");
const offer = await pc.createOffer();
await pc.setLocalDescription(offer);

maps to the following Elixir code:

# The PeerConnection module lives in the ExWebRTC namespace
alias ExWebRTC.PeerConnection

{:ok, pc} = PeerConnection.start_link()
{:ok, tr} = PeerConnection.add_transceiver(pc, :audio)
{:ok, offer} = PeerConnection.create_offer(pc)
:ok = PeerConnection.set_local_description(pc, offer)

Summing up

Which library you use depends on what you’re building. Membrane is great for creating robust and scalable pipelines, especially if you’re doing streaming or complex processing. Boombox is an excellent choice for quickly setting up common streaming scenarios. Nx is your friend when you need to do a lot of math, like with image or video processing that requires detailed analysis. Often, they work together. You might use Membrane (or Boombox) to manage the streaming and Nx to process the images or video frames. Understanding what each library does best helps you combine them to build some cool multimedia applications.

How you can use Elixir in Multimedia Applications
The table below shows a simple breakdown of how the libraries mentioned above can be used in the multimedia context. Hopefully, it inspires you to create an Elixir multimedia project on your own :)

[Table image: how to use libraries such as Membrane, Boombox, or Elixir WebRTC in multimedia development.]

Keep in mind that while Membrane does not come with built-in object detection or speech-to-text support, you can easily write your own element using e.g. Elixir Nx and Bumblebee.
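To give a feel for what "writing your own element" involves, here is a minimal sketch of a pass-through filter, assuming the Membrane Core 1.x callback names. The Example.FrameInspector module and its bytes_seen counter are hypothetical; the marked line is where a call into Nx or a Bumblebee serving would go.

```elixir
# Assumption: run as a standalone script against Membrane Core 1.x.
Mix.install([{:membrane_core, "~> 1.0"}])

defmodule Example.FrameInspector do
  # Hypothetical element: forwards every buffer unchanged and
  # keeps a running total of the payload bytes it has seen.
  use Membrane.Filter

  def_input_pad :input, accepted_format: _any
  def_output_pad :output, accepted_format: _any

  @impl true
  def handle_init(_ctx, _opts), do: {[], %{bytes_seen: 0}}

  @impl true
  def handle_buffer(:input, buffer, _ctx, state) do
    # Placeholder: object detection or speech-to-text on buffer.payload
    # would run here, e.g. by converting the payload to a tensor first.
    state = %{state | bytes_seen: state.bytes_seen + byte_size(buffer.payload)}

    # Pass the buffer downstream untouched.
    {[buffer: {:output, buffer}], state}
  end
end
```

Such an element can then be placed between a decoder and a sink with child/1, like the elements in the MP3 player pipeline above.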

Try it out!

The Elixir ecosystem is constantly growing, with new tools and libraries popping up all the time. This means even more possibilities for multimedia developers. If you’re interested in multimedia development, now’s a great time to start exploring Elixir. Experiment with the libraries, build some cool projects, and contribute to the growing community. And, of course, once you do, let us know! We love to see and share your Membrane, Elixir WebRTC, and other projects, so post them on X (make sure to tag @ElixirMembrane) or Discord.

The article was co-written with Michał Śledź, a Software Engineer at Software Mansion, who, among many other things, created Elixir WebRTC. We hope it will help you start your Elixir multimedia journey! :)

Have some questions?
As creators of Membrane, Boombox, Elixir WebRTC, Smelter, and many more multimedia tools, we are happy to discuss all kinds of issues that may be on your mind. Head over to our Discord and pop your question there — we won’t leave it unanswered 🫡

More of us
If you’d like to keep up with our work, we’re doing a monthly newsletter dedicated to all things multimedia, streaming, and WebRTC. One email per month, no spam included. See previous editions and sign up here.

If multimedia sounds right up your alley, we also have a little something for you. Each fall, we organize RTC.ON Conference, dedicated to topics like WebRTC, streaming, computer vision and AI. Read more here.

Smelter

https://smelter.dev/ (a GitHub repo is also available)