Globus: An ESB for Supercomputers


Recently, a system administrator complained to me about an issue with Globus, a platform used in academia to transfer research data efficiently, securely and reliably. Globus is not online storage; it acts as a mediator between storage systems. Nevertheless, Globus seems to solve a widespread problem in academia.

Enterprise Service Buses (ESBs) have been widely used in industry to connect large-scale systems across different organizations or even within the same organization. I could immediately see a similarity between ESBs and Globus, so I decided to drill down and explore. Having 1.5+ years of experience with enterprise service buses, I was convinced after reading the Globus documentation that it was a simulacrum of an ESB for supercomputers (well, one serving only storage requests).

The Problem

The interesting problem posed by the system administrator was the restricted access Globus provides to different partitions on a Linux system. Let me elaborate. Globus Online lets you choose a source and a destination. The supercomputing cluster runs an instance of Globus Connect Server, which allows a user to view files on the system, even files on scratch. Scratch is a partition on the global parallel file system, typically employed on a supercomputing cluster to provide petabytes of storage.

Although the server side allows flexible access, a user (maybe on a local server, a simple laptop or a desktop) has to install an instance of Globus Connect Personal. On Linux, installation is easy: the user logs in to Globus and creates a setup key, which lets Globus identify the machine as an endpoint on its web interface. Here’s where the problem becomes conspicuous: Globus, by default, exposes only the user's home directory. Therefore, a user with an external storage system mounted under /media cannot transfer directly from it to the scratch.

The Solution

Having worked on enterprise-scale systems, my first hunch was that, provided a prescient software engineer had developed the system, there had to be a configuration file that restricts the directories on the client side. To explore further, I installed Globus Connect Personal and started configuring it on a local Linux server.

Installing Globus Connect Personal

1. Download and Unpack the Installer

cd /tmp/
wget https://downloads.globus.org/globus-connect-personal/linux/stable/globusconnectpersonal-latest.tgz
sudo mv globusconnectpersonal-latest.tgz /opt/
cd /opt/
tar -zxvf globusconnectpersonal-latest.tgz
cd globusconnectpersonal-x.y.z

2. Install Globus CLI

pip install --upgrade --user globus-cli

3. Generate the Setup Key for Endpoint

globus endpoint create --personal neu-proxy

You will have to log in to Globus using the globus login command before you proceed.

4. Complete the Installation

Replace enter-your-setup-key-here with the setup key which was generated when you created the endpoint.

./globusconnectpersonal -setup enter-your-setup-key-here

5. View your Endpoints

globus endpoint search --filter-scope my-endpoints

Running Globus Connect

./globusconnectpersonal -start &
./globusconnectpersonal -stop

-status and -trace will provide the status and more detailed information.

Configuring Accessible Directories on Globus Connect Personal

As I assumed, I was able to find the configuration file. The file (~/.globusonline/lta/config-paths) specifies which directories are accessible to Globus Connect Personal.

Each line of the file contains a directory path, followed by the sharing flag, followed by the read/write flag, separated by commas.

<path>,<sharing flag>,<R/W flag>

By default, the file looks like this.

~/,0,1
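
To make the three fields easier to read, here is a minimal sketch of a shell function that prints each entry in plain words. The function name describe_config_paths is hypothetical and not part of Globus. It assumes the comma-separated layout shown above with no spaces around the commas, and it treats 1 as "sharing allowed" and "read/write", which is my reading of the flags and the opposite of the zeros used in this post.

```shell
# Hypothetical helper (not part of Globus): describe each config-paths entry.
# Assumes lines of the form <path>,<sharing flag>,<R/W flag> with no spaces.
describe_config_paths() {
    config_file="$1"
    # The "|| [ -n ... ]" part also handles a last line without a trailing newline.
    while IFS=, read -r path sharing rw || [ -n "$path" ]; do
        # Skip blank lines.
        [ -z "$path" ] && continue
        case "$sharing" in
            1) share_text="sharing allowed" ;;
            *) share_text="no sharing" ;;
        esac
        case "$rw" in
            1) rw_text="read/write" ;;
            *) rw_text="read-only" ;;
        esac
        printf '%s: %s, %s\n' "$path" "$share_text" "$rw_text"
    done < "$config_file"
}

# Example call:
#   describe_config_paths ~/.globusonline/lta/config-paths
```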

For example, if you need to provide access to /media/hdd1 through Globus, modify the file as below. Here I’m using zeros for both flags, meaning no sharing and read-only access.

~/,0,1
/media/hdd1,0,0
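
If you would rather script the change than edit the file by hand, the following is a small sketch of one way to do it. The function add_globus_path is a hypothetical helper of my own, not a Globus command. It appends an entry only when no entry for that path exists yet, and defaults both flags to 0 as in the example above.

```shell
# Hypothetical helper (not part of Globus): append "<path>,<sharing>,<rw>"
# to a config-paths file unless an entry for that path is already present.
add_globus_path() {
    config_file="$1"
    new_path="$2"
    sharing="${3:-0}"   # 0 = no sharing (default)
    rw="${4:-0}"        # 0 = read-only (default)

    # Compare the first comma-separated field of every line with the new path.
    if [ -f "$config_file" ] &&
       awk -F, -v p="$new_path" '$1 == p { found = 1 } END { exit !found }' "$config_file"; then
        echo "Entry for $new_path already present" >&2
        return 0
    fi

    # If the file ends without a newline, add one so entries do not run together.
    if [ -s "$config_file" ] && [ -n "$(tail -c1 "$config_file")" ]; then
        echo >> "$config_file"
    fi

    printf '%s,%s,%s\n' "$new_path" "$sharing" "$rw" >> "$config_file"
}

# Example call:
#   add_globus_path ~/.globusonline/lta/config-paths /media/hdd1 0 0
```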

That’s it. Now we have to restart the services.

./globusconnectpersonal -stop
./globusconnectpersonal -start &

Now, if you log on to the Globus web interface, you will see your partition and can initiate transfers from it. Hope the post saved your day!
