Most efficient way to search file for byte patterns in Elixir


Tag: elixir,id3

I am searching for id3 tags in a song file. A file can have id3v1, id3v1 extended tags (located at the end of the file) as well as id3v2 tags (usually located at the beginning). For the id3v1 tags, I can use File.read(song_file) and pull out the last 355 bytes (128 + 227 for the extended tag). However, for the id3v2 tags, I need to search through the file from the beginning looking for a 10 byte id3v2 pattern. I want to avoid any overhead from opening and closing the same file repeatedly as I search for the different tags, so I thought the best way would be to use File.stream!(song_file) and send the file stream to different functions to search for the different tags.

def parse(file_name) do
  file_stream = File.stream!(file_name, [], 1)
  |> add_tags(id3v2_tags(file_stream))
end

def id3v1_tags(file_stream) do
  tags = %Tags{} # struct containing desired tags
  << id3_extended_tag :: binary-size(227), id3_tag :: binary-size(128) >> = Stream.take(file_stream, -355)
  id3_tag = to_string(id3_tag)
  if String.slice(id3_tag, 0, 3) == "TAG" do
    Map.put(tags, :title, String.slice(id3_tag, 3, 30))
    Map.put(tags, :track_artist, String.slice(id3_tag, 33, 30))
  end
  if String.slice(id3_extended_tag, 0, 4) == "TAG+" do
    Map.put(tags, :title, tags.title <> String.slice(id3_extended_tag, 4, 60))
    Map.put(tags, :track_artist, tags.track_artist <> String.slice(id3_extended_tag, 64, 60))
  end
end

def id3v2_tags(file_stream) do
  # search for pattern:
  # <<0x49, 0x44, 0x33, version1, version2, flags, size1, size2, size3, size4>>
end

1) Am I saving any runtime by creating the File.stream! once and sending it to the different functions (I will be scanning tens of thousands of files, so saving a bit of time is important)? Or should I just use File.read for the id3v1 tags and File.stream! for the id3v2 tags?

2) I get an error in the line:

  << id3_extended_tag :: binary-size(227), id3_tag :: binary-size(128) >> = Stream.take(file_stream, -355)

because Stream.take(file_stream, -355) is a function, not a binary. How do I turn it into a binary that I can pattern match?


I believe your implementation is unnecessarily complex because it relies on streams. Make it work, make it pretty, then make it fast (but only if necessary).

For simplicity, I would first load everything into memory with File.read!/1. You can then use the functions in the :binary module to search for patterns (:binary.match/2), split the data (:binary.split/2), or grab a specific part (:binary.part/3). There is no need to mix File.stream! and File.read either: read the file once and pass that same binary around.
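The read-once approach can be sketched roughly as follows. This is a minimal illustration, not a complete ID3 parser: the module name Id3Sketch and the helpers parse_binary/1 and strip_padding/1 are hypothetical, the field offsets are the ones used in the question, and the "synchsafe" decoding of the size bytes (7 bits per byte) is an assumption taken from the ID3v2 format rather than from the question.

```elixir
defmodule Id3Sketch do
  # Hypothetical module; only File.read!/1 and the :binary functions come from the answer.
  import Bitwise

  @v1_size 128
  @ext_size 227

  # Read the file once and hand the same binary to every parser.
  def parse(path), do: path |> File.read!() |> parse_binary()

  def parse_binary(data) when is_binary(data) do
    %{v1: v1(data), v1_ext: v1_ext(data), v2: v2_header(data)}
  end

  # ID3v1 occupies the last 128 bytes and starts with "TAG".
  def v1(data) when byte_size(data) >= @v1_size do
    start = byte_size(data) - @v1_size

    case :binary.part(data, start, @v1_size) do
      <<"TAG", title::binary-size(30), artist::binary-size(30), _::binary>> ->
        %{title: strip_padding(title), track_artist: strip_padding(artist)}

      _ ->
        nil
    end
  end

  def v1(_data), do: nil

  # The extended tag is the 227 bytes right before the ID3v1 tag and starts with "TAG+".
  def v1_ext(data) when byte_size(data) >= @v1_size + @ext_size do
    start = byte_size(data) - @v1_size - @ext_size

    case :binary.part(data, start, @ext_size) do
      <<"TAG+", title::binary-size(60), artist::binary-size(60), _::binary>> ->
        %{title: strip_padding(title), track_artist: strip_padding(artist)}

      _ ->
        nil
    end
  end

  def v1_ext(_data), do: nil

  # Find the first "ID3" marker and decode the 10-byte header that starts there.
  def v2_header(data) do
    with {pos, 3} <- :binary.match(data, "ID3"),
         <<_::binary-size(pos), "ID3", major, minor, flags, s1, s2, s3, s4, _::binary>> <- data do
      # Assumption: the size field is synchsafe, i.e. 7 significant bits per byte.
      size = (s1 <<< 21) ||| (s2 <<< 14) ||| (s3 <<< 7) ||| s4
      %{version: {major, minor}, flags: flags, size: size}
    else
      _ -> nil
    end
  end

  # Keep everything before the first NUL padding byte, without using String.
  defp strip_padding(field), do: field |> :binary.split(<<0>>) |> hd()
end
```

Calling Id3Sketch.parse(song_file) opens and reads the file a single time. Note that the first "ID3" match is not guaranteed to be a real header, so a real parser would validate the version and size bytes before trusting it.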

Also, and this is very important, don't use the String module. String is meant to work with UTF-8 encoded binaries. Use the :binary module for all byte-level operations.
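A small illustration of why this matters, using a made-up five-byte value: String functions count characters, while :binary functions count bytes, so the same offsets select different data as soon as a multi-byte character appears.

```elixir
# "éabc" in UTF-8: 4 characters, but 5 bytes, because "é" takes two bytes.
data = <<0xC3, 0xA9, ?a, ?b, ?c>>

# String.slice/3 counts characters, so this returns 4 bytes ("éab").
by_chars = String.slice(data, 0, 3)

# :binary.part/3 counts bytes, so this returns exactly 3 bytes ("éa").
by_bytes = :binary.part(data, 0, 3)

IO.inspect(String.length(data), label: "characters")
IO.inspect(byte_size(data), label: "bytes")
IO.inspect(byte_size(by_chars), label: "bytes from String.slice")
IO.inspect(byte_size(by_bytes), label: "bytes from :binary.part")
```

ID3 fields are defined as fixed byte ranges, so byte-based offsets are the ones that line up with the tag layout.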

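Finally, Stream.take/2 is lazy, so it returns a stream (here, a function) rather than the data itself. Use Enum.take/2 instead (it accepts streams, since streams are also enumerables). Note that it returns a list of chunks, not a single binary, so you still have to join them before pattern matching. Although, as I said, I would skip the streams altogether.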
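To make the difference concrete, here is a rough sketch. To keep it runnable without a file, byte_stream is a hypothetical stand-in for File.stream!(file_name, [], 1): an enumerable that yields one byte per element.

```elixir
# Stand-in for a one-byte-at-a-time file stream (an assumption for illustration).
data = "some audio bytesTAG"
byte_stream = Stream.map(:binary.bin_to_list(data), &<<&1>>)

# Lazy: nothing has been read yet, so there is no binary to pattern match on.
lazy = Stream.take(byte_stream, -3)
IO.inspect(is_binary(lazy), label: "Stream.take gives a binary?")

# Eager: a list holding the last three one-byte chunks.
chunks = Enum.take(byte_stream, -3)

# Join the chunks into a single binary before matching.
tail = IO.iodata_to_binary(chunks)
<<marker::binary-size(3)>> = tail
IO.inspect(marker, label: "last three bytes")
```

Taking elements from the end forces the whole stream to be traversed, one byte at a time, which is another reason to read the file into memory once instead.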


Why Phoenix doesn't use Plug to start the server?

liveforeverx on irc has answered my original question and I have modified this question as a followup. Phoenix depends on Plug for many of its function. However, when it comes to starting cowboy server, why doesn't Phoenix start it using Plug's api in Phoenix.Endpoint.CowboyHandler.start_link? Why does start_link on ranch_listener_sup is...

How to rewrite Erlang combinations algorithm in Elixir?

I've been tinkering with Elixir for the last few weeks. I just came across this succinct combinations algorithm in Erlang, which I tried rewriting in Elixir but got stuck. Erlang version: comb(0,_) -> [[]]; comb(_,[]) -> []; comb(N,[H|T]) -> [[H|L] || L <- comb(N-1,T)]++comb(N,T). Elixir version I came up with...

Which OTP behavior should I use for an “endless” repetition of tasks?

I want to repeatedly run the same sequence of operations over and over again next to a Phoenix application (without crashing the whole web-app if something breaks in the worker of course) and don't really know whether I should use a GenServer, Elixir's Tasks, an Agent or something completely different...

Importing test code in elixir unit test

I'm writing tests of some Elixir code that interacts with SSH. In my tests, I'd like to start an SSH server that I can run my code against. I'd prefer to store this code in its own file in the test directory, and have it imported by various different tests....

Chunking list based on struct type changing

I have a list I want to chunk up based on a transition from struct type B to A. So for example, I have the following: iex(1)> defmodule A, do: defstruct [] {:module, A ... iex(2)> defmodule B, do: defstruct [] {:module, B ... iex(3)> values = [ %A{}, %A{},...

How to stub (or prevent running) of a call to a worker in my ExUnit test?

I have a Phoenix app (which is just a restful api with no front end) and one of the controllers does some stuff which I want to test, but at the end of the controller it calls a dispatcher which sends a payload off to a worker (run under poolboy)...

How to read and write id3v1 and id3v2 tags in Elixir

I would like to scan music files and read/write metadata using Elixir (this whole project is about learning Elixir - so please don't tell me to use Python!). As I understand it, I have two choices: call a system utility or (as no libraries exist in Erlang or Elixir that...

How to delete a Phoenix Session?

I'm going through the Phoenix Guide on Sessions. It explains it very well how I can bind data to a session using put_session and fetch the value later using get_session but it doesn't tell how I can delete a User's session. From the guide: defmodule HelloPhoenix.PageController do use Phoenix.Controller def...

What is the number that shows up after you define an anonymous function in elixir?

When you define an anonymous function in elixir you get a result like this. #Function<6.90072148/1 in :erl_eval.expr/5> What I've noticed is that the number is based on the arity of the function. So a 1 arg function is always #Function<6.90072148/1 in :erl_eval.expr/5> A two arg function is always #Function<12.90072148/2 in...

“Cannot begin test transaction because we are already inside one”

I followed this tutorial and my simple test always fail with this error 1) test /index returns a list of contacts (WorldNote.ChatsControllerTest) test/controllers/chats_controller_test.exs:16 ** (RuntimeError) cannot begin test transaction because we are already inside one stacktrace: (ecto) lib/ecto/adapters/sql.ex:321: anonymous fn/6 in Ecto.Adapters.SQL.start_test_transaction/3 (ecto) lib/ecto/adapters/sql.ex:615: Ecto.Adapters.SQL.pool_transaction/4 (ecto) lib/ecto/adapters/sql.ex:314: Ecto.Adapters.SQL.start_test_transaction/3...

Is that possible to get comments with macro?

I was trying to parse some code and reformat them, but it seems that quote will just ignore the comments. Is there any way to achieve this? I guess I have to dive into the erlang side?...

Fixing encoding of ID3 tags with mutagen

I'm trying to fix encoding of ID3 tags so that my Nokia Lumia 630 with windows 8 onboard would display correctly Cyrillic letters. I'm doing this with mutagen: # -*- coding: utf-8 -*- import os import mutagen.id3 for path in [u'Sergei Babkin - Aleksandr [pleer.com].mp3']: id3 = mutagen.id3.ID3(path) for key,...

Elixir: Pattern Match or Guard

I am curious when I should be using pattern matching vs guard clauses when defining functions. For example with pattern matching: defmodule Exponent do def power(value, 0), do: 1 def power(value, n), do: value*power(value, n-1) end vs guard clause: defmodule Exponent do def power(value, n) when n==0, do: 1 def...

Phoenix - controller with multiple render

Trying to create an app with Elixir + Phoenix, that would be able to handle both "browser" and "api" requests to handle its resources. Is it possible to do it without having to do something like that : scope "/", App do pipe_through :browser resources "/users", UserController end scope "/api",...

How Can We Clear the Screen in Iex on Windows

Please how can we clear the screen in Iex on Windows Documented method in Iex help does not work: clear/0 — clears the screen This StackOverflow Answer also does not work in Windows....

Step List With Elixir

Can someone please provide a suggestion on how to iterate a list BUT with a batch of x at a time? For example: If the functionality existed: ["1","2","3","4","5","6","7","8","9","10"].step(5)|> IO.puts Would produce in two iterations: 12345 678910 I believe Stream.iterate/2 is the solution but my attempts to do so given an...

Getting a sibling process in Elixir

I have an Elixir/Erlang process tree: parent (Supervisor) ├── child1 (GenServer) └── child2 (GenServer) child1 (a DB client) has information that child2 needs to use. What's a good way to pass a reference from the Supervisor process to child2 so that child2 will always have a valid reference to child1?...

Where can I put my Plugs and then use them from different controllers in my Phoenix app?

I'm creating my first Elixir-Phoenix app. I've written a few plugs that I want to use in multiple controllers, right now there is a lot of code duplication since the Plug code is being repeated in all of my controllers. My question is, is there a file where I can...

Why the syntax for defining sigils in Elixir doesn't use “defsigil”?

I was reading the page about sigils in the Elixir tutorial. I expected the syntax for defining sigils uses "defsigil" just like "defstruct", "defprotocol", and so on. But it was not so. Why?

Join two tables belong to two database in Elixir Ecto

In Elixir, with Ecto, is it possible to join two different tables (in the same host) belonging to different two databases. There are two databases called cloud and cloud_usage in this query When I execute the query, which Repo should I use? Billing.CloudUsage.Repo.all(query) or Billing.Cloud.Repo.all(query) query = from cucu in...

Elixir exrm release crashes on eredis start_link

I'm fairly new to Elixir and this is the first app that I'm attempting to release using exrm. My app interacts with a Redis database for consuming jobs from a queue (using exq), and also stores results of processed jobs in Redis using eredis. My app works perfectly when I...

Does spawning new processes use all CPU cores in elixir

Suppose I'm on a 4-core CPU machine. If I run the following in my Elixir VM: 1..4 |> Enum.map fn(x) -> spawn(computationally_heavy_process) end Does this use all 4 cores of my machine, one for each of the computationally heavy processes?...

How does Polymorphic association work with Ecto?

Ecto seems to support polymorphic association as I read through https://github.com/elixir-lang/ecto/issues/389 and its related issues linked from it. Let's say I need a Comment model association on Task and Event models. If my understanding of Ecto association with custom source is right, then we need four tables and three models,...

Optional POST parameters in Elixir Phoenix

I have a phoenix route that I want to POST some form data to, however there are about 4 fields of the form that are optional (the form is constructed by the end user and therefore these fields may not exist in the POST payload) In the Phoenix controller for...

Why Rem operator in Elixir returns negative numbers?

I am trying a simple operation, rem(-1, 25). I expect that to be the remainder of integer division and return 24 (the same as in Ruby, for example), but it returns -1. Am I doing something wrong? Is the behavior broken in Elixir?...

Setting up custom response for exception in Phoenix Application

I'm writing a Phoenix application with Ecto and have the following snippet in the test: {:ok, data} = Poison.encode(%{email: "[email protected]", password: "mypass"}) conn() |> put_req_header("content-type", "application/json") |> put_req_header("accept", "application/json") |> post(session_path(@endpoint, :create), data) > json_response(:not_found) == %{} This throws an Ecto.NoResultsError. I have this defined: defimpl Plug.Exception, for: Ecto.NoResultsError do def...

What does BEAM stand for in iex for the Elixir programming language?

I'm sort of curious as to what the B. E. A. and M. stand for. I recall seeing an explanation of the acronym BEAM, but I have not managed to find it again. It comes up in error codes: ➜ gentoo iex Erlang/OTP 17 [erts-6.4.1] [source] [64-bit] [smp:8:8] [async-threads:10] [kernel-poll:false]...

Sort List elements in Elixir Lang

I have a list of strings that I want to order in two ways. Alphabetically By string length ...

Elixir - Get Host By Name?

How do you gethostbyname with Elixir? There doesn't seem to be a supported API and the two solutions seem to revolve around, Erlang's Inet Fork to shell with System (hostname) ...

How to Log something in Controller when Phoenix Server is running?

Well, the question is pretty clear. I'm trying to print some debug information from one of my Controllers in my Phoenix App when the Server is running. defmodule PhoenixApp.TopicController do use PhoenixApp.Web, :controller alias PhoenixApp.Topic plug :action def index(conn, _params) do # ... log "this text" # ... render(conn, "index.html")...

receiving :badarg on File.write

I am starting to learn Elixir, and this is also my first dynamic language, so I am really lost working with functions without type declaration. What I am trying to do: def create_training_data(file_path, indices_path, result_path) do file_path |> File.stream! |> Stream.with_index |> filter_data_with_indices(indices_path) |> create_output_file(result_path) end def filter_data_with_indices(raw_data, indices_path) do...

Output tabular data with IO.ANSI

I would like to render a 2-dimensional list to a nice tabular output, using an ANSI escape sequences to control the formatting. So given this data: data = [ [ 100, 20, 30 ], [ 20, 10, 20 ], [ 50, 400, 20 ] ] I would like to output...

Rails' before_filter equivalent in Phoenix

I've just started working on my first Phoenix app, and the issue is that I have some common lines of code in every action in my controller, that I would like to separate out. These lines fetch data from multiple Ecto Models and saves them to variables for use. In...

Elixir - Download a File (Image) from a URL

What does the code to download a file/image from a URL look like in Elixir? Google searches seem to bring back how to download Elixir itself....

Wait for Node.connect before using :global.whereis_name

I have the following function: def join(id) do if Node.connect(:"#{id}@") do send :global.whereis_name(id), {:join, id} end end I receive the error: (ArgumentError) argument error :erlang.send(:undefined, ... which I assume is because Node.connect does some gathering of information and when I call :global.whereis_name it has not finished yet. If I throw...

How to do Elixir mixins

I'm trying to create a mixin for authentication login, so it can be applied to my models which should be able to log in, much like has_secure_password in Ruby. AFAIK this is done using the use statement, which essentially requires the module and calls the __using__ macro. So I...

Where does Elixir/erlang fit into the microservices approach? [closed]

Lately I've been doing some experiments with docker compose in order to deploy multiple collaborating microservices. I can see the many benefits that microservices provide, and now that there is a good toolset for managing them, I think that it's not extremely hard to jump into the microservices wagon. But,...

elixir, ecto, compare time in the where clause

When I create a query using ecto in Elixir, I'm not really sure about how to compare time in the 'where' clause. In the schema part I declare create_at as :datetime schema "tenant" do field :id, :integer field :created_at, :datetime # timestamps([{:inserted_at,:created_at}]) end and the query part is like def...

Disable Elixir Ecto Debug output

Whether in iex> or using mix run -e "My.code", when I run the mix project using Ecto, Ecto's debugging mechanism displays a bunch of SQL statements like the one below: 16:42:12.870 [debug] SELECT a0.`id` FROM `account` AS a0 WHERE (a0.`account_name` = ?) ["71000000313"] (39.6ms) ... When I don't need the debug output...

how to instruct ecto to not create the auto increment id field?

Ecto migrations automatically create an auto increment field by name 'id' in the table. How to avoid creating this field? How to set another column in the table as primary key (not auto increment)? ...

In Elixir how do you initialize a struct with a map variable

I know it's possible to create a struct via %User{ email: '[email protected]' }. But if I had a variable params = %{email: '[email protected]'}, is there a way to create that struct using that variable, e.g. %User{ params }? This gives an error; just wondering if you can explode it...

When to use compile-only dependencies in Elixir

When would it be appropriate to specify a dependency only in deps in my mix.exs and not as a runtime dependency in applications? I thought that applications are actual applications that need to be started before my own application can be started, but I run into a problem where exrm...

Querying by DateTime in Ecto

Here is what I have tried. date = Ecto.DateTime.from_erl(:calendar.universal_time()) query |> where([record], record.deadline >= ^date) I also tried date = Ecto.DateTime.from_erl(:calendar.universal_time()) query = from m in MyApp.SomeModel, where: m.deadline >= ^date, select: m Both return same message value `%Ecto.DateTime{..}` in `where` cannot be cast to type :datetime in query From...

Phoenix: Broadcasting from IEx console

I have built a small chat app like the one here: https://github.com/chrismccord/phoenix_chat_example/blob/master/web/channels/room_channel.ex And cannot figure out how to broadcast to all users in a topic a message. In the above application (which isn't updated to v0.13 like I'm using), how would I do that? Below is what I've tried with...

Horribly redundant Phoenix controller

I'm doing my first Phoenix application, and trying to do new/2 controller. The code I wrote is def new(conn, %{"fbid" => fbid, "latitude" => lat, "longitude" => lng, "content" => content}) do {fbid, _} = Integer.parse(fbid); {lat, _} = Float.parse(lat); {lng, _} = Float.parse(lng); chats = %Chat{:fbid => fbid, :latitude...

Set default value for date selector in phoenix framework to current date

In the app I'm developing I have a date selector which will mostly be used with the current date as value (or a date a few days later). In order to reduce work for my users, I want to set today's date as default value. I can easily set the...

How to run Elixir application?

What is the correct way to run an Elixir application? I'm creating a simple project by: mix new app and after that I can do: mix run which basically compiles my app once. So when I add: IO.puts "running" in lib/app.ex I see "running" only for the first time, each...