Mascot cluster installation security on Linux
Mascot Server has a built-in cluster mode, where the database search can be executed in parallel on a networked cluster of PCs. This requires no special hardware or operating software. The cluster can consist of ‘commodity’ PCs running Windows or Linux. Cluster mode is usually the most practical option for licences of 5 CPU or more, as discussed in Mascot Server cluster mode. One server or PC acts as the master node and the others as compute or worker nodes.
On Linux, the default is to run Mascot as root. However, the root account is often disabled in recent Linux distributions, and there may be other security concerns. We’ve collected some tips on how to run a Mascot cluster as a non-root user.
Default cluster operation
The master node runs ms-monitor.exe, which controls the worker nodes using load_node.pl. The script uses SSH for starting and stopping ms-mascotnode.exe on the worker nodes. load_node.pl also uses scp for copying sequence database files and configuration files to the workers.
The service start-up in the default setup is:
- ms-monitor.exe calls load_node.pl as root, which opens an SSH connection to root@worker-pc
- On master-pc, SSH looks for the private key in master-pc:/root/.ssh
- On worker-pc, SSH looks for the public key in worker-pc:/root/.ssh
- On worker-pc, ms-mascotnode.exe is started as root
When ms-mascotnode.exe starts, it opens a TCP port (by default 5001). This is used by nph-mascot.exe for transferring search input, progress and results. On the worker, nph-mascot.exe inherits the credentials of ms-mascotnode.exe. The relevant credentials are:
- nph-mascot.exe runs as a CGI user on master-pc, typically apache or www or wwwrun
- nph-mascot.exe connects to port 5001 on worker-pc
- On worker-pc, ms-mascotnode.exe receives search data on port 5001 and launches nph-mascot.exe (as root)
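To see these credentials on a running system, you can list the owner of each Mascot process. The following is a minimal sketch and not part of Mascot: the function name mascot_procs is our own, and it assumes the procps version of ps found on most Linux distributions.

```shell
# Keep only Mascot node and search processes from "USER COMMAND..." lines
# read on standard input, and print the owner and the executable.
mascot_procs() {
    awk '$2 ~ /(ms-mascotnode|nph-mascot)\.exe$/ { print $1, $2 }'
}

# Typical use on master-pc or worker-pc (user:32 widens the column so that
# long account names are not abbreviated):
#   ps -eo user:32,args | mascot_procs
```

In the default setup, the lines printed on a worker node should all start with root.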
load_node.pl requires a passwordless SSH key, because ms-monitor.exe runs as a daemon or system service without access to a console. Consequently, anyone who can log in as or 'su' to root on master-pc has unlimited root access to the worker nodes.
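For illustration, this is roughly how a passwordless key pair is created with OpenSSH. It is a sketch under assumptions: the function name make_service_key, the key type, the comment string and the example path are our choices, not something Mascot prescribes. Older OpenSSH versions may need an RSA key instead.

```shell
# Create a key pair with an empty passphrase (-N ''), so that a daemon
# without a console can use it. The key path is passed as the argument.
make_service_key() {
    keyfile=$1
    ssh-keygen -q -t ed25519 -N '' -C 'mascot-cluster' -f "$keyfile"
}

# Example (hypothetical path; ssh-keygen asks before overwriting a file):
#   make_service_key /root/.ssh/id_ed25519
#
# BatchMode makes ssh fail instead of prompting, which mimics a daemon:
#   ssh -o BatchMode=yes root@worker-pc true
```

If the BatchMode command prompts or fails, load_node.pl is unlikely to be able to connect either.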
Running as unprivileged user
The default setup assumes the nodes are connected to a private Ethernet switch, where the subnet is only accessible to the master node. If this is the case, there is no real security risk: an intruder has to compromise the root account on the master node before reaching the workers, and by then you have bigger problems.
On the other hand, you might not have the luxury of a private subnet. It's common for the master and worker nodes to be connected through a trusted network, such as a university campus network. A firewall protects the trusted network from the public Internet, but anyone inside the network could reach the nodes. Another consideration: if a user finds a way to run arbitrary commands through ms-monitor.exe, they could easily gain root access, whether by accident or with ill intent.
The blog article Improved security for Mascot Installations under Linux has instructions for running ms-monitor.exe as an unprivileged user. These apply to the master node. For example, if you run ms-monitor.exe as user mascot, the service start-up changes to:
- ms-monitor.exe calls load_node.pl as mascot, which opens an SSH connection to root@worker-pc
- On the master node, SSH looks for the private key in master-pc:/home/mascot/.ssh
- On the worker node, SSH looks for the public key in root's home and starts ms-mascotnode.exe as root
Although ms-monitor.exe now runs with fewer privileges, the situation is a bit worse than the default! Anyone who can log in as mascot has unlimited root access to the worker nodes, and non-privileged accounts often have lower security requirements than the root account.
Suppose you want to run ms-mascotnode.exe as mascot-worker. On worker-pc, make the node directory specified in nodelist.txt writable by mascot-worker. By default, it's /usr/local/mascotnode. Add the master's SSH public key to the authorized_keys file in worker-pc:/home/mascot-worker/.ssh. Then, edit load_node.pl and change all occurrences of "root" to "mascot-worker". Now the node start-up is:
- ms-monitor.exe calls load_node.pl as mascot, which opens an SSH connection to mascot-worker@worker-pc
- On the worker node, SSH looks for the public key in worker-pc:/home/mascot-worker/.ssh
- On the worker node, ms-mascotnode.exe is started as mascot-worker
Since nph-mascot.exe inherits the credentials of ms-mascotnode.exe, database searches on the node are also run as mascot-worker.
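The worker-side preparation can be sketched as a small script. This is an outline under assumptions: the helper names run and prepare_worker are our own, the mascot-worker account is assumed to exist already with its home in /home, and chown is only one way of making the node directory writable.

```shell
# Run a command, or only print it when DRY_RUN is set, so the steps can be
# reviewed before anything is changed.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Give the worker account ownership of the node directory and create a
# private ~/.ssh directory for it. Run as root on worker-pc.
prepare_worker() {
    user=$1      # e.g. mascot-worker
    nodedir=$2   # node directory from nodelist.txt
    run chown -R "$user" "$nodedir"
    run install -d -m 700 -o "$user" "/home/$user/.ssh"
}

# Example:
#   DRY_RUN=1 prepare_worker mascot-worker /usr/local/mascotnode
```

On master-pc, running grep -n root load_node.pl lists the lines to review before editing the script by hand. A blind search-and-replace could also change strings that merely contain "root".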
Restricting access by IP address
The master and worker nodes must have static IP addresses, which are specified in nodelist.txt. That being the case, it's a good idea to restrict SSH access by source IP address. For example, add a "from" option to authorized_keys on worker-pc, so that logging in with the passwordless key is only allowed if the source IP address or hostname matches master-pc. See the sshd(8) manual for details.
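As a sketch, the restricted entry can be built from the master's public key like this. The function name restrict_key, the file name master_key.pub and the address 192.0.2.10 are placeholders of ours; substitute the real address of master-pc.

```shell
# Print an authorized_keys entry that sshd only honours when the connection
# comes from the given source address or pattern.
restrict_key() {
    source_pattern=$1   # IP address (or hostname pattern) of master-pc
    pubkey_line=$2      # contents of the master's .pub file
    printf 'from="%s" %s\n' "$source_pattern" "$pubkey_line"
}

# Example, run as the login account on worker-pc:
#   restrict_key 192.0.2.10 "$(cat master_key.pub)" >> ~/.ssh/authorized_keys
```

An IP address is the safer choice here. With the default sshd setting of UseDNS no, hostnames in a "from" option are not matched.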
Securing port 5001
We don't recommend running Mascot cluster nodes on the public Internet. Although SSH is reasonably secure, Mascot uses a custom TCP protocol on port 5001, and the messages are transmitted unencrypted. It wouldn't be hard to devise malicious messages that crash ms-mascotnode.exe or allow privilege escalation.
If the Mascot cluster is being set up in a public network such as a public cloud, it's essential to run as a non-privileged user, disable root access and restrict logins by IP address. But you also need to consider traffic through port 5001.
You should set up a virtual private network (VPN) between the master and worker nodes. Most cloud providers offer private networks, sometimes called virtual private cloud (VPC). Alternatively, you can configure it at operating system level using IPsec or OpenVPN. Then, configure nodelist.txt to only use the IP addresses of the VPN adapter. On the worker node, use the operating system's firewall to prevent incoming traffic on port 5001 unless it's through the VPN adapter.
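A firewall rule set along these lines is one way to do it with iptables. Treat it as a sketch: the helper names run and restrict_node_port are ours, tun0 is an assumed name for the VPN adapter, and systems managed through nftables or firewalld need the equivalent rules in their own syntax.

```shell
# Run a command, or only print it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Accept traffic to the node port only when it arrives on the VPN
# interface, and drop it on every other interface. Run as root on worker-pc.
restrict_node_port() {
    vpn_if=$1   # VPN adapter name, e.g. tun0 (assumption)
    port=$2     # ms-mascotnode.exe port, 5001 by default
    run iptables -A INPUT -i "$vpn_if" -p tcp --dport "$port" -j ACCEPT
    run iptables -A INPUT -p tcp --dport "$port" -j DROP
}

# Example:
#   DRY_RUN=1 restrict_node_port tun0 5001
```

The rules are appended to the end of the INPUT chain, so check that no earlier rule already accepts port 5001. They also do not survive a reboot unless you save them with your distribution's tools.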
Mascot Security versus web server security
The above tips concern low-level shell access. You should also consider web security. The example Apache configuration shipped with Mascot uses HTTP (port 80), which is unencrypted. See the Apache documentation for how to enable HTTPS and configure client authentication.
Additionally, Mascot includes a role-based access mechanism called Mascot Security, although it is not intended to be a strong authentication system. Have a look at Single sign-on (SSO) and Mascot for tips on how to integrate web server authentication with Mascot Security.