Facebook uses Erlang

August 15, 2008

Here is proof: http://www.joyent.com/?gclid=CLeIy4La-pQCFRMJewoddC7c3A

For Facebook Chat, we rolled our own subsystem for logging chat messages (in C++) as well as an epoll-driven web server (in Erlang) that holds online users’ conversations in-memory and serves the long-polled HTTP requests. Both subsystems are clustered and partitioned for reliability and efficient failover. Why Erlang? In short, because the problem domain fits Erlang like a glove. Erlang is a functional concurrency-oriented language with extremely low-weight user-space “processes”, share-nothing message-passing semantics, built-in distribution, and a “crash and recover” philosophy proven by two decades of deployment on large soft-realtime production systems.

Umm… did I mention that I yearn for hacking Erlang code, too? Following Bill Clementson’s lead, I set up Emacs+Erlang on my Ubuntu Hardy box.

Step 1: Install erlang and erlang-mode

I used the version of erlang and erlang-mode for emacs in Ubuntu’s repositories. I might come to regret it some day, but for now the following was just too simple to resist:

$ sudo aptitude install erlang
# The above will install erlang-mode too;
# if it does not just "apt-get install erlang-mode"

Step 2: Configure erlang-mode in Emacs

I just put the following in my .emacs:

;; Erlang-mode
(require 'erlang-start)
(add-hook 'erlang-mode-hook
          (lambda ()
            ;; when starting an Erlang shell in Emacs, the node name
            ;; by default should be "emacs"
            (setq inferior-erlang-machine-options '("-sname" "emacs"))
            ;; add Erlang functions to an imenu menu
            (imenu-add-to-menubar "imenu")))

Now I can launch Emacs and open an Erlang file, which puts me in Erlang mode. I can start an Erlang shell with “C-c C-z”, compile the Erlang code with “C-c C-k”, and view the compilation result in the Erlang buffer (if the buffer is hidden) with “C-c C-l”. Or, I can just switch to the Erlang shell with “C-c C-z”. The customizations above give me a menu item in the Emacs menubar listing the functions defined in the file I am visiting, which is a big help. I am used to “ecb-mode” when coding Python; ECB does not grok Erlang yet, so the imenu is a handy substitute when coding Erlang.

Step 3: Install Distel

Distel is to Erlang and erlang-mode what SLIME is to Lisp and lisp-mode.

To get distel:

$ cd ~/.emacs.d/
$ svn co http://distel.googlecode.com/svn/trunk/ distel
$ cd distel
$ make
$ cd doc
$ make postscript && make postscript # must run twice
$ make info && sudo make install # install the Info documentation
$ info distel # read the distel info documentation

Step 4: Configure Emacs to use Distel

I just need to put this in my .emacs:

(push "/home/parijat/.emacs.d/distel/elisp/" load-path)
(require 'distel)
(distel-setup)

Step 5: Configure Erlang

Distel is designed to work in a distributed Erlang system. It can connect to specified Erlang nodes. My strategy is to use the standard erlang-mode command “C-c C-z” to start an Erlang shell, and connect to it using Distel by “C-c C-d n”. The latter asks for a nodename, which is going to be “emacs@beowulf” (because beowulf is the short hostname of my laptop).
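
The nodename that Distel asks for can be worked out ahead of time. Here is a minimal shell sketch, assuming the short hostname is the part of “uname -n” before the first dot, which is what “-sname” normally uses. The function node_name is a hypothetical helper of mine, not part of Erlang or Distel:

```shell
#!/bin/sh
# node_name: hypothetical helper that prints NAME@SHORTHOST, the form
# expected at the "C-c C-d n" prompt. It assumes the short hostname is
# whatever "uname -n" reports, cut at the first dot.
node_name() {
    host=$(uname -n)
    printf '%s@%s\n' "$1" "${host%%.*}"
}

# Usage: node_name emacs
```

On a laptop called beowulf this should print the same “emacs@beowulf” mentioned above. If the Erlang shell shows a different name in its prompt, trust the prompt.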

But before we start using remote Erlang nodes, we should create a ~/.erlang.cookie file on each physical machine we will use. It holds a shared secret that Erlang nodes use to authenticate each other, so its contents must be the same on every machine:

$ echo "secret" > ~/.erlang.cookie
$ chmod 0400 ~/.erlang.cookie
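
Erlang refuses a cookie file that group or other users can access, so it is worth checking the permissions when a node will not connect. Here is a rough sketch of such a check, assuming GNU stat (as on Ubuntu). The function check_cookie is a hypothetical helper; it only inspects the file and never changes it:

```shell
#!/bin/sh
# check_cookie: hypothetical helper that reports whether a cookie file
# exists and is accessible by its owner only.
check_cookie() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "missing"
        return 1
    fi
    # "stat -c %a" prints the octal permission bits (GNU coreutils).
    mode=$(stat -c %a "$file")
    case "$mode" in
        [0-7]00) echo "ok" ;;
        *) echo "too open: $mode"; return 1 ;;
    esac
}

# Usage: check_cookie "$HOME/.erlang.cookie"
```

The check accepts any mode that grants nothing to group and others, so both 0400 and 0600 pass.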

Step 6: Play with Distel

So, now we can launch an Erlang node, either via Emacs using “C-c C-z”, or on the command line using “erl -sname mynode”. Then, from within Emacs, we can connect Distel to this node using “C-c C-d n”, specifying the nodename at the prompt. Now we can ask Distel to interrogate the Erlang node for various things.

C-c C-d l ; list erlang processes
PID/Name              Initial Call                              Reds     Msgs
init                  otp_ring0:start/2                         3821        0
erl_prim_loader       erlang:apply/2                           87239        0
error_logger          proc_lib:init_p/5                          229        0
application_controlle erlang:apply/2                            2501        0
<0.7.0>               proc_lib:init_p/5                           45        0
<0.8.0>               application_master:start_it/4               91        0
kernel_sup            proc_lib:init_p/5                         1498        0
rex                   proc_lib:init_p/5                          493        0
global_name_server    proc_lib:init_p/5                           69        0
<0.12.0>              erlang:apply/2                              25        0
<0.13.0>              erlang:apply/2                               4        0
<0.14.0>              erlang:apply/2                               3        0
inet_db               proc_lib:init_p/5                          129        0
net_sup               proc_lib:init_p/5                          312        0
erl_epmd              proc_lib:init_p/5                          147        0

While I normally program in Python, a part of me wants to quickly hack some lisp code every now and then. Especially during those long minutes when my automated tests are running.

I realized that long neglect had left me without a working lisp setup. Here is how I got it up and running again.

Step 1: Get rid of system lisp and slime

The version of sbcl and SLIME in my Ubuntu system is not recent enough for my liking. So I get rid of them:

$ sudo apt-get -y remove sbcl slime common-lisp-controller
$ sudo apt-get -y autoremove # get rid of packages we don't need anymore
$ sudo rm -rf /var/cache/common-lisp-controller

Step 2: Install common requirements

cd ~/
sudo apt-get update
sudo apt-get -y install emacs22
sudo apt-get -y install cvs
sudo apt-get -y install git-core
sudo apt-get -y install darcs
sudo apt-get -y install subversion
sudo apt-get -y install build-essential
sudo apt-get -y install autoconf
sudo apt-get -y install curl
sudo apt-get -y install sbcl
sudo apt-get -y install texinfo
sudo apt-get -y install tetex-bin
sudo apt-get -y install xloadimage

Step 3: Get clbuild

$ mkdir lisp
$ cd lisp
# for fresh install of clbuild
$ darcs get http://common-lisp.net/project/clbuild/clbuild
$ cd clbuild
# Or, to update existing clbuild
$ cd clbuild
$ darcs pull
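
The two cases above (fresh checkout versus update) can be folded into one small script. This is only a sketch. The function fetch_clbuild and the DARCS override are my own hypothetical additions, and the only darcs commands used are the two shown above:

```shell
#!/bin/sh
# fetch_clbuild: hypothetical helper. Runs "darcs pull" when PARENT/clbuild
# already exists, otherwise "darcs get" into PARENT.
# Set DARCS=echo for a dry run that only prints the darcs arguments.
fetch_clbuild() {
    parent=$1
    darcs=${DARCS:-darcs}
    if [ -d "$parent/clbuild" ]; then
        (cd "$parent/clbuild" && "$darcs" pull)
    else
        mkdir -p "$parent" &&
            (cd "$parent" &&
                "$darcs" get http://common-lisp.net/project/clbuild/clbuild)
    fi
}

# Usage: fetch_clbuild "$HOME/lisp"
```

Running it once with DARCS=echo is a cheap way to see which branch it would take before touching the network.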

Step 4: Get latest SBCL

$ ./clbuild update sbcl
$ cd source/sbcl
$ sh make.sh
$ cd doc/manual
$ make
$ cd ../../../.. # back to the clbuild directory
$ cat > ~/.sbclrc
(require :asdf)
(push "/home/parijat/lisp/clbuild/systems/" asdf:*central-registry*)
^D
$

Step 5: Setup SLIME

$ ./clbuild update slime

Now, we need to add this slime configuration to our .emacs file:

(push "/home/parijat/lisp/clbuild/source/slime" load-path)
;; Common Lisp Mode
(setq inferior-lisp-program "/usr/local/bin/sbcl")
(add-to-list 'auto-mode-alist '("\\.lisp$" . lisp-mode))
(add-to-list 'auto-mode-alist '("\\.cl$" . lisp-mode))
(add-to-list 'auto-mode-alist '("\\.asd$" . lisp-mode))
(require 'slime)
(slime-setup)
(eval-after-load "slime"
 '(progn
    (setq slime-complete-symbol*-fancy t
          slime-complete-symbol-function 'slime-fuzzy-complete-symbol
          slime-when-complete-filename-expand t
          slime-truncate-lines nil
          slime-autodoc-use-multiline-p t)
    (slime-setup '(slime-fancy slime-asdf))
    (define-key slime-repl-mode-map (kbd "C-c ;")
      'slime-insert-balanced-comments)
    (define-key slime-repl-mode-map (kbd "C-c M-;")
      'slime-remove-balanced-comments)
    (define-key slime-mode-map (kbd "C-c ;")
      'slime-insert-balanced-comments)
    (define-key slime-mode-map (kbd "C-c M-;")
      'slime-remove-balanced-comments)
    (define-key slime-mode-map (kbd "RET") 'newline-and-indent)
    (define-key slime-mode-map (kbd "C-j") 'newline)))
(add-hook 'lisp-mode-hook (lambda ()
                           (cond ((not (featurep 'slime))
                                  (require 'slime)
                                  (normal-mode)))
                           (setq indent-tabs-mode nil)
                           (pair-mode t)))

Note: pair-mode is a nice package for inserting balanced parentheses, quotes, braces and square-brackets, etc., which I installed earlier by hand.

Now we can launch emacs, and start slime with “M-x slime”.

Caveat

For some reason, after doing the above, slime would still not start. Doing a “M-x slime” would throw me into the sbcl debugger with the error message that sbcl could not find “/usr/share/common-lisp/systems/slime/swank-loader.lisp”, or something like that (I could be wrong about the exact path). After much head-scratching (and grepping my ~/lisp directory), I figured out I had to change the emacs customization variable slime-backend. This can be done by doing “M-x customize-group RET slime-lisp RET” and changing the value of “Slime Backend” to simply “swank-loader.lisp” (the file should be in the same directory as “slime.el”). We can also give an absolute path to the file. I found that the value of this variable was set wrongly (an artifact of using apt-get installed slime, I guess) and hence sbcl was not able to load the correct file.
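
To find the absolute path to put in “Slime Backend”, something like the following sketch should do. It assumes the clbuild checkout lives somewhere under the directory you pass in. The function find_swank_loader is a hypothetical helper, nothing that SLIME ships:

```shell
#!/bin/sh
# find_swank_loader: hypothetical helper that prints the path of every
# swank-loader.lisp found below the given directory.
find_swank_loader() {
    find "$1" -type f -name swank-loader.lisp 2>/dev/null
}

# Usage: find_swank_loader "$HOME/lisp"
```

If it prints more than one path, the one sitting next to the “slime.el” that Emacs actually loads is the one to use.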

The world is … flat?

August 12, 2008

From slashdot: The Flat Earth Society. What is this about? Well, people who really believe that the Earth is flat. Read the FAQ. It has gems like:

Q: “Why has no-one taken a photo of the Earth that proves it is flat?”
A: The government prevents people from getting close enough to the Ice Wall to take a picture.

At first sight, the flat-earthers seem much more civil to the ‘Round Earthers’ who visit their forums than other people who disagree with the mainstream are.

A friend sent in a link:

Subject: Why software estimation is so difficult
Or are we just bad at it.
Finally some mathematical rigor to it:
http://www.idiom.com/~zilla/Work/kcsest.pdf

A necessarily incomplete excerpt from the introduction of the paper:

… Is it possible to apply mathematical and scientific principles to software estimation, so that development schedules, productivity, and quality might be objectively ascertained or estimated rather than being a matter of opinion?

And can we do it in a way that does not take longer than, uh, doing the project itself? It would be even better if it did not take longer than the boss/customer/client takes to blow their collective tops.

… there are a large number of design methods, development processes, and programming methodologies that claim or hint at objective estimation of development schedules, project complexity, and programmer productivity. …

Yup, got that right.

Hardly any methodology comes up and honestly claims that the estimates are going to be rather shaky. There are those that emphasize change management and re-estimation as you go along. However, they blame it on “changing requirements” or “project risks”, not on “our estimates are way inaccurate anyway”.  Hardly anybody acknowledges that estimating how many tasks can be done in a sprint, or how long a sprint will take to do is a guesstimate. Instead, they just downplay the estimation bit, to a bullet point in a presentation: “thou shalt estimate thy tasks”. You come away thinking there must be something everyone else knows about making accurate estimates and you are the only idiot on the planet who does not.

… it would be easy to conclude that these methods are not being practiced.

Yeah, not in very many places. Just adopt a methodology and your schedules will all come out just right.

In sections three and four we will find that algorithmic complexity results can be directly interpreted as indicating that software complexity, development schedules, and productivity cannot be objectively and feasibly estimated and so will remain a matter of opinion. …. The situation here [with approximate and statistical estimators] is more optimistic, but we give reasons for maintaining a scepticism towards overly optimistic claims of estimation accuracy.

Well, that was that. No accurate estimates for you. Schedules are just guesswork.

In all the project management literature I have read, the work of project management consists of a lot of tasks or processes, and one teeny-weeny box in the big flow is “activity duration estimation”. And I have not seen a single description of how to actually go about doing it.

The PMBOK (Project Management Body of Knowledge) produced by the PMI (http://www.pmi.org) has more to say: it describes the inputs and the outputs of the process of “activity duration estimation” and the tools to do it:

  • Enterprise Environmental Factors (that is, everything about how the company does business)
  • Organizational Process Assets (aka Historical Information)
  • Expert Judgment

Wow. Expert Judgment. Hmm…. that reminds me:

Once, I was trying and failing to set up networking on Windows on a home machine. When I reluctantly fired up the help system by clicking on the “Troubleshooting” link on the networking tab, I was walked through some inane Q&A menus, finally landing on “for more information, ask the system administrator of the network”. You can see this kind of “help” in a lot of places, including user manuals.

The alleged system administrator is the equivalent of the “Expert” of the project management lore, one who would have the knowhow to give accurate, useful information.

The problem: I was the system administrator of the network. And I did not (yet) have a clue!

Finally found a blog that indicates that you can use an N95 like a business PDA without being shackled to a Windows machine: http://davehall.com.au/blog/dave/2007/11/18/my-new-toy-nokia-n95

I am giving it a spin. Will see how it works out.

Homeopathy

August 2, 2008

Via this blog belonging to the magician James Randi (a debunker of extraordinary claims, most famously those of Uri Geller), I learnt that on 25th July the BBC ran a story on the 20th anniversary of a scientific paper published in the journal Nature.

According to a charismatic French scientist named Jacques Benveniste, pure water could somehow remember what it had previously contained. … John Maddox, editor of Nature, realised that Benveniste’s research would be controversial, so it was accompanied by a disclaimer similar to one that had been run when he published research about Uri Geller’s supposed supernatural powers…. Unfortunately for Benveniste, the investigators soon discovered that the results in his laboratory were unreliable.

How were they unreliable? From someone who was there:

The test procedure involved counting – via a reticule in a microscope – the number of exploded basophils. This would seem to be a straightforward procedure: select an area, count the burst cells, and record that number, but that wasn’t quite sufficient. Ideally – and such an experiment can’t properly be done with less than optimum care being applied – each microscope slide should have been randomly coded, so that double-blind conditions were in place. That was not done; on one occasion, we saw a lab worker perform the count, record it, and then erase the number when it was realized that the slide that had been scanned was a “control,” not a randomly-selected sample. The lab worker replaced the slide in the holder, moved it about, and settled on an area that gave a count more in line with what had been expected. That is not the way science is done.

Ok, so the original experiment was flawed. So, file and forget. But no. This experiment is probably the cornerstone of explanations of how homeopathy could work, assuming it does. So people persist in claiming that the phenomenon exists and that proper experiments would prove it. Here is what the BBC says in the same article:

For example, the BBC science series Horizon attempted to test Benveniste’s claims in 2002, and the conclusion was announced by Professor Martin Bland, of St George’s Hospital Medical School.

He said: “There’s absolutely no evidence at all to say that there is any difference between the solution that started off as pure water and the solution that started off with the histamine [an allergen].”

Looks like some people are giving up on homeopathy.