Saturday, November 9, 2013

Erlang: Anonymous Recursive Function

The Erlang shell evaluates expressions. Calling an anonymous function from within itself is a bit tricky. A variable is bound by pattern matching only after the fun expression on the right-hand side has been evaluated, so the function cannot refer to itself by that variable name inside its own body. The easy way around this is to pass the function to itself as one of its arguments.
In the example below, D(D, L) prints the elements of the list L.
1> D = fun(F, []) -> ok;
1> (F, [H|T]) -> io:format("~p~n", [H]), F(F,T) end.
#Fun<erl_eval.12.82930912>

2> D(D, [a,b,c]).
a
b
c
ok
If the binding took place before the function body was evaluated, we could have written
1> G = fun([]) -> ok;
1> ([H|T]) -> io:format("~p~n", [H]), G(T) end.
* 2: variable 'G' is unbound
but we can see that it gives an error. Passing the function to itself as an argument is similar to how a fixed-point combinator such as the Y combinator works.
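To make the comparison with a fixed-point combinator concrete, here is a minimal sketch of one for one-argument functions. It is the strict-evaluation variant (often called the Z combinator), since Erlang evaluates arguments eagerly. The module name ycomb and the functions y/1 and print_all/1 are made up for this illustration.

```erlang
-module(ycomb).
-export([y/1, print_all/1]).

%% Fixed-point combinator for one-argument functions.
%% F takes "the function itself" and returns the real one-argument fun.
%% The inner fun(A) delays X(X) so that evaluation does not loop forever.
y(F) ->
    G = fun(X) -> F(fun(A) -> (X(X))(A) end) end,
    G(G).

%% Print each element of a list. Self stands for the recursive call,
%% so the fun never needs to know the variable it is bound to.
print_all(L) ->
    Print = y(fun(Self) ->
                  fun([]) -> ok;
                     ([H|T]) -> io:format("~p~n", [H]), Self(T)
                  end
              end),
    Print(L).
```

Calling ycomb:print_all([a,b,c]) prints the three atoms and returns ok, just like D(D, [a,b,c]) above. On Erlang/OTP 17 and later, a named fun such as fun G([]) -> ok; G([H|T]) -> io:format("~p~n", [H]), G(T) end avoids the extra argument altogether.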

Wednesday, November 6, 2013

Erlang: Reading a Line of Integers from stdin

There are various functions that can be used for reading from standard input. I have a line of integers separated by single spaces. There can be 1000 to 10,000 or more numbers in a line. I want to read them all and use them for further processing. I thought of two ways. One is to read the integers one at a time and build a list. The other is to read the whole line in one go as a string, split it on spaces to get a list of strings, and convert each string to an integer. Now the question is, which of the two is faster?
#!/usr/bin/env escript

% Read input as an integer
read_input(Inp) ->
    case io:fread("", "~d") of
        eof ->
            % display the list
            io:format("~p~n", [lists:reverse(Inp)]),
            ok;
        {ok, [N]} ->
            read_input([N|Inp]);
        _ -> read_input(Inp)
    end.

split(S, D) ->
    re:split(S, D, [{return,list}]).

%% Read input as a string
read_input() ->
    case io:get_line("") of
        eof ->
            ok;
        N ->
            % display the list
            L = lists:map(fun(X) -> list_to_integer(X) end, split(string:strip(N--"\n"), " ")),
            io:format("~p~n", [L])
    end.

main(_) ->
    %read_input([]).
    read_input().
I tested this with escript without -mode(compile), so the code is interpreted.
The lowest time the read_input([]) function took to read and display a line of 1000 integers (around 7 digits each) was 2.235s, while read_input() took only 0.750s. The averages over 7 runs were 2.256s and 0.77s respectively. In the first method the list is built by prepending elements to the head, which is a really fast operation, so the list construction is not what makes it slow. This suggests that it is better to reduce the number of IO calls and get the input as a chunk.
With 10,000 integers, the first function takes 2m28.625s and the second one takes only 1.875s.
NB: Reading the input as a string can be slow in other programming languages. In Erlang, strings are represented as lists of integers.
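The split-and-convert step of the second approach could also be sketched with string:tokens/2. It skips runs of separator characters, so repeated spaces or a trailing newline do not produce empty strings, which list_to_integer/1 would reject. The module name parse_line and the function parse/1 are hypothetical, and this variant was not part of the timings above.

```erlang
-module(parse_line).
-export([parse/1]).

%% Convert one line of whitespace-separated integers, for example
%% "12 345  6789\n", into a list of integers.
parse(Line) ->
    %% string:tokens/2 treats every character in the second argument
    %% as a separator and never returns empty tokens.
    [list_to_integer(Tok) || Tok <- string:tokens(Line, " \t\r\n")].
```

In the script above, the N clause of read_input/0 would then reduce to something like io:format("~p~n", [parse_line:parse(N)]).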

Friday, November 1, 2013

Case Insensitive Regex in Erlang

Case-insensitive matching with regular expressions can be done by specifying caseless as one of the options to re:run/3.
1> S = "Hi there, hello there. Hello world!".
"Hi there, hello there. Hello world!"

2> re:run(S, "hello", [global, caseless]).
{match,[[{10,5}],[{23,5}]]}

3> re:run(S, "hello", [global]).
{match,[[{10,5}]]}

4> re:run(S, "hello", []).
{match,[{10,5}]}
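Each {10,5} tuple in the results above is a zero-based start offset and a length into the subject. When the matched text itself is wanted, the {capture, all, list} option makes re:run/3 return strings instead. A minimal sketch follows; the module name caseless_demo and the function find_all/2 are made up for this illustration.

```erlang
-module(caseless_demo).
-export([find_all/2]).

%% Return the text of every case-insensitive match of Pattern in Subject.
find_all(Subject, Pattern) ->
    Opts = [global, caseless, {capture, all, list}],
    case re:run(Subject, Pattern, Opts) of
        %% With global, each match is a list whose head is the whole match.
        {match, Matches} -> [Whole || [Whole | _] <- Matches];
        nomatch -> []
    end.
```

With the string S from the session above, caseless_demo:find_all(S, "hello") returns ["hello","Hello"].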