Array Services

From Higher Intellect Wiki
Jump to: navigation, search

O/S level supported MPI implementation

Another great feature of IRIX is built in clustering using array services. Like all clustering, this allows you share the processing power and memory of multiple IRIX workstations to accomplish tasks more quickly. Being an older OS, I was nervous the process would be difficult, but it wasn’t bad at all.

The following is what I did to create my “Excelsis” cluster, consisting of my two SGI workstations, Metatron and Gabriel:

Install Array Services - array services can be found on the operating system disks that come with your workstation. Configure Your Array - IRIX comes with an arrayconfig script to generate the array configuration file, but chances are you’ll need to tweak it anyway, I find it easier to simply create the file in its entirety. The configuration file can be found at “/usr/lib/array/arrayd.conf”. Each array this machine participates in must be defined in this file. The first is normally an array labeled “me” which consists only of the localhost. Any further arrays require a name, plus a list of all machines in the array, which each need their own label and IP address or address to contact them at.

array me

    machine localhost

array excelsis

    machine metatron
         hostname "192.168.0.15"
    machine gabriel
         hostname "192.168.0.16"

According to the documentation, quotations are necessary for hostname. As can be seen, I have defined an array called “excelsis”, with two machines, one named “metatron” and one named “gabriel”. I then specified the IP address of each machine.

The default arrayd.conf file will define a number of array commands that can be run from the terminal. These are all well known UNIX commands such as top, ps, and who, only they can be run on the array as a whole, as opposed to just the local machine. E.g., running “array who” would list all users logged into the array (i.e. any machine connected to the array). Leave these commands where they are and jump to the end of the config file.

The end of the configuration file contains a “local” identifier that contains settings and defaults for array services running on the machine, such as the port array services listens on and the default array for array commands (remember, your machine can participate in more than 1 array. If you run an array command without specifying an array, it will default to the one specified here). You’ll want to change “destination array” to be the name of your default array. In my case, I changed it to “excelsis”. That’s it for this file!

Configure Array Security - By default, your array will be configured to not allow connections from remote machines. This allows messages to be passed via MPI between programs running on your machine, but doesn’t allow for clustering. Edit the “/usr/lib/array/arrayd.auth” file, and you should see “AUTHENTICATION NOREMOTE”. You’ll want to change this to “AUTHENTICATION NONE” if you want any machine to be able to connect to your array (not secure, but may not be an issue if you’re running behind a firewall), or “AUTHENTICATION SIMPLE” if you’d like to setup private keys for each machine. If you go the simple authentication route, you’ll need to specify “HOSTNAME machine1.domain.com KEY 0×3817382771948″ where machine1.domain.com is the address of the machine, and the hex string following KEY is the private key. You’ll need to specify this for all machines. These keys must match per machine on all machines in the array. Autostart Array Services - Now that you have your array services configured, you can tell IRIX to start it on bootup by enabling it in chkconfig. If you aren’t familiar with chkconfig, it is a special utility that configures whether various daemons should autostart at bootup or not. You can see a list of which daemons are currently set to start by running “chkconfig” with no arguments. To autostart array services, type “chkconfig array on”. Rinse, Repeat - You’ll need perform the above steps on every machine participating in the array. Assuming you have NFS starting before your array, I don’t see any reason why these configurations couldn’t be symbolically linked to one central configuration, but if you ever wanted machine specific settings you’d be out of luck. I simply edited the files on both my SGI workstations. At this point, you can run an “array who” to see all people logged into your array and “array ps” to see all processes running on all your clustered machines. The fun really begins for programmers at this point, as you can make use of the MPI libraries to share tasks across multiple nodes in your cluster for parallel processing.

While IRIX is on its way out and the chances of needing to setup a cluster is slim, it can be a fun little project if you have a few SGIs kicking around. It would also be interesting to see the compatibility of passing MPI messages between IRIX and non-IRIX clusters.

See instead

For the main part see the MPI topic.



Share your opinion