
Create test MPI cluster using VirtualBox images

This is a mini howto for creating a small MPI test cluster using virtual machines under Linux.

We will be using the Debian netinst CD for this howto, as it is a convenient way to install a minimal system with no X subsystem, but any *nix environment would suffice.

This howto assumes you are running some form of Linux as your host operating system.

Steps:

Check whether your kernel and userland are 32-bit or 64-bit.
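For example, on most Linux hosts you can check both with:

(host)$ uname -m          # x86_64 = 64-bit kernel, i686 = 32-bit
(host)$ getconf LONG_BIT  # prints 64 or 32 for the userland
(host)$ file /bin/ls      # reports ELF 32-bit or ELF 64-bit for the installed binaries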

Grab the Debian netinst CD of the same bitness (this could save you a lot of trouble later with shared libraries in the wrong place across cluster virtual machines).

Install VirtualBox on the host operating system.
On the host, also install the following:
$ sudo apt-get install uml-utilities bridge-utils

Create a single instance of Debian with the following parameters:
1 CPU, 768 MB RAM, a 6 GB dynamically allocated hard disk image and a bridged-mode network interface. Select the expert install and do not choose a desktop environment (no X), but do select SSH server. Use the same username as on your main host PC for easy SSH setup.
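If you prefer the command line to the VirtualBox GUI, something along these lines creates an equivalent VM (a rough sketch only: the name node0 and the disk filename are placeholders, the --bridgeadapter1 value refers to the tap interfaces created by the bridge script later in this howto, and you still attach the disk and the netinst ISO and run the installer as usual):

(host)$ VBoxManage createvm --name node0 --register
(host)$ VBoxManage modifyvm node0 --cpus 1 --memory 768 --nic1 bridged --bridgeadapter1 tap0
(host)$ VBoxManage createhd --filename node0.vdi --size 6144   # size is in MB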

Fire up your new instance, log in as root and run

# apt-get install mpich2
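Before going further, you can check that the MPI toolchain is actually on the path:

# which mpicc mpiexec.hydra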

Log out and log in as your regular user and run

$ ssh-keygen -t rsa -b 2048

Use an empty passphrase, and copy the contents of ~/.ssh/id_rsa.pub on the virtual machine into ~/.ssh/authorized_keys on your host machine.

If you already have an SSH key on your host machine, you can also authorise that key on the virtual machine by doing the reverse; otherwise create a new key on the host too and repeat the process in the other direction.
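If ssh-copy-id is available (it ships with OpenSSH on Debian), it does the appending for you; otherwise a manual append works. For example, from the virtual machine (same username on both sides, as set up earlier):

(virt)$ ssh-copy-id IPAddressOfHost
# or, without ssh-copy-id:
(virt)$ cat ~/.ssh/id_rsa.pub | ssh IPAddressOfHost 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys'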

Now you should be able to SSH into your virtual machine without a password (using public key authentication) and check the date on the remote machine using:

(host)$ ssh IPAddressOfVirtual date

and likewise on the virtual machine terminal:

(virt)$ ssh IPAddressOfHost date

Clone your VM a few times (as many nodes as you need), then fix up the SSH authentication, IP addresses and hostnames on each clone (a sketch follows below):

A vital trick is to remove the persistent network interface udev rules on the cloned virtual image, otherwise the network card will not come up automatically:
(cloned)# rm /etc/udev/rules.d/70-persistent-net.rules
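Beyond the udev rule, each clone also needs its own hostname (and its own IP if you use static addressing). Recent VirtualBox releases can do the cloning itself from the command line, and a minimal hostname fix-up on Debian looks like this (node1 is just an example name):

(host)$ VBoxManage clonevm node0 --name node1 --register
(cloned)# echo node1 > /etc/hostname
(cloned)# hostname node1
# also update the 127.0.1.1 line in /etc/hosts to carry the new name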

Also, you need to put your host's network card into promiscuous mode, create a bridge interface on the host OS, and create as many virtual (tap) network interfaces as you have nodes.
Use this script to initialise your network before you fire up VirtualBox (replace damien with your username):

#!/bin/bash
# Don't need these, so they die
ifconfig vbox0 down
ifconfig eth1 down
# Pull down the main card
ifconfig eth0 down
# Throw up a bridge
brctl addbr br0
# Add my main card to the bridge and put it in promiscuous mode
brctl addif br0 eth0
ifconfig eth0 0.0.0.0 promisc
# Bridge goes up and obtains an IP address
ifconfig br0 up
dhclient br0

# Load the tun module and make the device node usable by ordinary users
modprobe tun
chmod 0666 /dev/net/tun

# Create one tap interface per node, attach it to the bridge and bring it up
for i in 0 1 2; do
    tunctl -t tap$i -u damien
    brctl addif br0 tap$i
    ifconfig tap$i up
done

This creates three tap interfaces and attaches them to the bridge. In VirtualBox, point each instance's bridged interface at a unique tap device (tap0 to tap2) so that each instance gets its own IP address. (Of course, you can create more if you need them.)
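If you prefer to script this part too, the same association can be made with VBoxManage (the VM names are placeholders); remember to run the bridge script above as root first so the tap devices actually exist:

(host)$ sudo ./net-setup.sh   # the script above, saved under any name you like
(host)$ VBoxManage modifyvm node0 --nic1 bridged --bridgeadapter1 tap0
(host)$ VBoxManage modifyvm node1 --nic1 bridged --bridgeadapter1 tap1
(host)$ VBoxManage modifyvm node2 --nic1 bridged --bridgeadapter1 tap2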

Now, the thing with MPI is that the executable must be accessible at the same path on all nodes!
A good solution would be to mount an NFS share at the same mountpoint on every node, but I hacked up a simple Makefile that copies the executable to the same directory on all nodes after rebuilding it.
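For reference, the NFS route only takes a few lines; a rough sketch, assuming the working directory is /home/damien/mpi and the guests sit on a 192.168.1.0/24 network (adjust both to your setup):

# On the host:
(host)# apt-get install nfs-kernel-server
(host)# echo '/home/damien/mpi 192.168.1.0/24(rw,sync,no_subtree_check)' >> /etc/exports
(host)# exportfs -ra
# On each node:
(virt)# apt-get install nfs-common
(virt)# mkdir -p /home/damien/mpi
(virt)# mount -t nfs IPAddressOfHost:/home/damien/mpi /home/damien/mpi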

Create a ‘hosts’ file containing one node IP address per line:
ipaddress1
ipaddress2
...
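The address of each node can be read off the node itself and pasted in here, e.g.:

(virt)$ ifconfig eth0 | grep 'inet addr'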

Makefile (note that the recipe lines must be indented with a real tab):
SHELL=/bin/bash
PWD=$(shell pwd)
N=$(shell cat hosts | wc -l)
PROG=hello

build:
	mpicc $(PROG).c -o $(PROG)

run: build
	for h in $$(cat hosts); do ssh $$h "mkdir -p $(PWD)"; scp $(PROG) $$h:$(PWD); done
	mpiexec.hydra -f hosts -n $(N) ./$(PROG)

clean:
	for h in $$(cat hosts); do ssh $$h "rm -f $(PWD)/$(PROG)"; done

Now try running a simple MPI program such as the following:
hello.c:

#include <stdio.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start MPI */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* get current process id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* get number of processes */
    printf("Hello world from process %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}

Execute ‘make run’ and you should see output like this (here with two nodes in the hosts file):
(host)$ make run
mpicc hello.c -o hello
mpiexec.hydra -f hosts -n 2 ./hello
Hello world from process 0 of 2
Hello world from process 1 of 2
$

Enjoy!
