xCAT Installation

Configuration

  • Public IP for node1: 10.5.16.210
  • Private (internal) IP for node1:  192.168.0.1 
  • root :: TeamVenusSC14
  • Cluster nodes password: ??????????? cluster ????????????
  • Site (public) network:  Ethernet using eno1 on node1 only. DHCP.
  • Management (private) network:  Ethernet using eth0 on all nodes.  Static IPs.   192.168.0.0/16
  • Base: CentOS 7.4, minimal install
  • xCAT - 2.13.1 (released Jan 2017)
  • Storage server
    • http://192.168.10.100:8080
    • admin :: TeamVenusSC14
  • Switch:
    • http://192.168.11.124
    • admin :: TeamVenusSC14

 


Install CentOS on node1

Install CentOS 7.4 on node1, minimal install, over the entire disk

  • Hostname: node1

Set up networking temporarily so the installation can proceed:

dhclient eno1 -v

Edit /etc/yum.conf and change keepcache from 0 to 1 (so that copies of updated RPM files are saved in /var/cache/yum/)

....
keepcache=1
.... 

Update system: 

yum update

Automatically install security updates on node1

yum install yum-cron
vi /etc/yum/yum-cron.conf
# Change update_cmd from "default" to "security"
# Change apply_updates from "no" to "yes"
# yum-cron applies the updates daily (driven by a cron.daily job)
systemctl start yum-cron
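Optionally, also enable the service at boot so the daily security updates survive a reboot (an extra step beyond the notes above):

systemctl enable yum-cron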

Install extra packages for xCAT:

yum install nano wget dhcp bind httpd nfs-utils perl-XML-Parser emacs xauth ntp ntpdate ntp-doc system-config-firewall net-tools

Install extra packages for general HPC environment:

yum install cmake ibutils net-snmp opensm opensm-libs swig smartmontools \
gcc gcc-gfortran gcc-c++ byacc bc tcl-devel \
atlas atlas-devel atlas-sse3 atlas-static \
blas blas-devel blas-static blas64 blas64-devel blas64-static \
lapack lapack64 lapack-static lapack-devel

 

XXX OLD NOT USING INFINIBAND:   Install Infiniband packages:

XXXXX OLD:  yum install libibverbs libibverbs-utils libibverbs-devel libibmad libibmad-devel infiniband-diags ibutils ibutils-libs libibumad libibmad

XXX OLD NOT USING INFINIBAND:   Note that for compute nodes, you only need:  rdma libibverbs libipathverbs

Install extra packages:

yum install firefox firewall-config rsync pciutils beesu \
libX11-devel xorg-x11-apps xorg-x11-fonts-100dpi xorg-x11-fonts-75dpi.noarch dejavu-sans-fonts dejavu-serif-fonts wine-fonts \
gtk2-devel \
java-1.7.0-openjdk java-1.8.0-openjdk

Install the EPEL repository so we can get extra packages from it:

wget https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -Uvh epel-release-latest-7.noarch.rpm
yum install meld # For example, Meld is not included in RHEL 7 

Install the RPM Fusion repositories so we can get extra packages from them:

yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-7.noarch.rpm
yum install ffmpeg

Install Google Chrome (much better for QNAP NAS web interface):

Create a file called /etc/yum.repos.d/google-chrome.repo and add the following lines of code to it.

[google-chrome]
name=google-chrome
baseurl=http://dl.google.com/linux/chrome/rpm/stable/$basearch
enabled=1
gpgcheck=1
gpgkey=https://dl-ssl.google.com/linux/linux_signing_key.pub
# Install Chrome:
yum install google-chrome-stable

 

Configure /etc/hosts so that node1 resolves correctly. (Tip: the last line below is missing from the default file; add it.)
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
127.0.0.1 node1 node1.cluster node1.ecs-cluster.serv.pacific.edu
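A quick optional check that the new entry resolves locally:

getent hosts node1
# Should print the 127.0.0.1 line added above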

 

Configure networks using the nmtui text-mode utility:

nmtui      # To set options interactively
nmcli -p # Show current config
nmcli -p networking
nmcli -p connection
nmcli -p device
nmcli -p connection show Campus
nmcli -p connection show Private
  • "Campus" Network
    • Name=Campus
    • IP=10.5.16.99/24
    • Gateway=10.5.16.254
    • DNS=10.10.4.226, 10.10.4.227
    • Device=eno1 (NOT SPECIFIED IN CONFIG FILE)
      (can't set via GUI, use this command instead:  nmcli connection modify Campus connection.interface-name eno1)
    • MAC address=00:1e:67:44:0c:cb
    • Zone=Public
      (firewall default, nothing to set)
  • "Private" Network
    • Name=Private
    • IP=192.168.0.1/16
    • MAC address=00:1e:67:44:0c:cc
    • Device=eno2
      (can't set via GUI, use this command instead:  nmcli connection modify Private connection.interface-name eno2)
      (A rule in /usr/lib/udev/rules.d/60-net.rules instructs the udev helper utility, /lib/udev/rename_device, to look into all /etc/sysconfig/network-scripts/ifcfg-suffix files. If it finds an ifcfg file with a HWADDR entry matching the MAC address of an interface it renames the interface to the name given in the ifcfg file by the DEVICE directive.)
    • Zone=Trusted (for firewall)
      (can't set via GUI, use this command instead:   nmcli connection modify Private connection.zone trusted)
  • "mlnx_40gbe" Network
    • Name=mlnx_40gbe
    • IP=172.16.0.1/24
    • Device=ens2 (NOT SPECIFIED IN CONFIG FILE)
    • MAC address=00:02:C9:9F:7A:00
    • Zone=Trusted (for firewall)
      (can't set via GUI, use this command instead:   nmcli connection modify mlnx_40gbe connection.zone trusted)

 

Configure Mellanox InfiniBand/Ethernet NICs to operate in ETHERNET mode:

  • Interactive option:  Use connectx_port_config utility (Won't stick after reboot)
  • Method for compute nodes (applied in file sync later)  

    echo "eth" > /sys/bus/pci/devices/0000\:02\:00.0/mlx4_port1
    echo "eth" > /sys/bus/pci/devices/0000\:02\:00.0/mlx4_port2

  • Method for node 1 - Will persist after reboot
    Find the right PCI device:
    ls -ls /sys/bus/pci/drivers/mlx4_core/
    Edit /etc/rdma/mlx4.conf to set that PCI device to ethernet mode. Add this line at the end:
    0000:02:00.0 eth eth
    Unload and then re-load the module (and its dependencies) to verify that it worked, or just restart:
    modprobe -r mlx4_en
    modprobe -r mlx4_ib
    modprobe mlx4_en
    modprobe mlx4_ib
    ifconfig -a

 

Install Mellanox Firmware Tools (MFT) - http://www.mellanox.com/page/management_tools

# Download latest firmware
# http://www.mellanox.com/page/firmware_table_ConnectX3IB
wget http://www.mellanox.com/downloads/firmware/fw-ConnectX3-rel-2_40_5030-MCX354A-FCB_A2-A5-FlexBoot-3.4.746.bin.zip
unzip ...

# Install
wget http://www.mellanox.com/downloads/MFT/mft-4.6.0-48-x86_64-rpm.tgz
tar -xzf mft-4.6.0-48-x86_64-rpm.tgz
cd mft-4.6.0-48-x86_64-rpm/
sudo ./install.sh

# Use
sudo mst start

# Check firmware version
sudo mst status
sudo mlxfwmanager

# Update firmware version from downloaded file
sudo mlxfwmanager -u -d <device> -i <filename>

# Update firmware version online from Mellanox site
sudo mlxfwmanager --online -u -d <device>
# (where device is PCI id, e.g. 0000:02:00.0)

# Export NIC settings
sudo flint -d <device> dc > nic_working_settings.ini
# (where device is PCI id, e.g. 0000:02:00.0)

###### Or, for the compute nodes:
/exports/home/jshafer/mellanox/mlxfwmanager --ssl-certificate=/exports/home/jshafer/mellanox/ca-bundle.crt --online -u -d 0000:02:00.0
####/exports/home/jshafer/mellanox/mlxfwmanager -u -d 0000:02:00.0 -i /exports/home/jshafer/mellanox/fw-ConnectX3-rel-2_40_5030-MCX354A-FCB_A2-A5-FlexBoot-3.4.746.bin

 

XXX OLD - Configure networks - Edit the file: /etc/sysconfig/network-scripts/ifcfg-eth0 - This will be the internal (private) cluster network over gigabit Ethernet

TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=no
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
####NAME=eth0
####UUID=xxxxxxxxxxxxxxx
HWADDR=00:1e:67:44:0c:cc
DEVICE=eth0
ONBOOT=yes
IPADDR=192.168.0.1
PREFIX=16
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_PRIVACY=no
ZONE=trusted 
XXXX OLD - Edit the file:  /etc/sysconfig/network-scripts/ifcfg-eno1 - This will be the public (external) network over gigabit Ethernet from the management node to the outside world.
TYPE="Ethernet"
BOOTPROTO="static"
DEFROUTE="yes"
PEERDNS="yes"
PEERROUTES="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_PEERDNS="yes"
IPV6_PEERROUTES="yes"
IPV6_FAILURE_FATAL="no"
NAME="eno1"
####UUID="xxxxxxxxxx"
HWADDR=00:1e:67:44:0c:cb
DEVICE="eno1"
ONBOOT="yes"
IPADDR=10.5.16.99
PREFIX=24
GATEWAY=10.5.16.254
DNS1=10.10.4.226
DNS2=10.10.4.227 
XXX OLD - Configure NTP client on node1.  

(NEW:  Use XCAT method instead) 

NO: chkconfig ntpd on # Turn NTP service on
NO: nano /etc/ntp.conf # Add "server ntp.pacific.edu iburst" as top server in list, above 0.centos.pool.ntp.org
NO: # Save and exit nano
NO: /etc/init.d/ntpd restart # Restart NTP 
NO: ntpq -np # Show status

XXX OLD - Configure Infiniband on node1

  • https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Networking_Guide/ch-Configure_InfiniBand_and_RDMA_Networks.html
  • https://docs.oracle.com/cd/E18476_01/doc.220/e18478/fabric.htm#ELMOG76113
systemctl enable opensm.service
systemctl status opensm
# Should be running

lspci -nn | grep "fini"
# Should show some device

lsmod | grep "ib_qib"
# Should show the driver active

ip -d link
ip link set ib0 up
# Should show the 'ib0' device

ibv_devices
# Should exist

ibv_devinfo
# Port should be ACTIVE

# Commands from Oracle Infiniband docs: https://docs.oracle.com/cd/E18476_01/doc.220/e18478/fabric.htm#ELMOG76113

ibhosts
# Should show cluster nodes

ibnetdiscover
# Should show cluster nodes

ibdiagnet -v -r
# Should run Infiniband network diagnostics

ibswitches
# Should show info on Infiniband switches

Relax memory limits for MPI/Infiniband
(Note that this won't fix everything.  SLURM runs under systemd, so we have to alter the slurmd.service config file for systemd to relax the limits there...) 

vi /etc/security/limits.d/95-openfabrics.conf
# ------------------------
# File contents should be:
# ------------------------

* soft memlock unlimited
* hard memlock unlimited 

 

Configure firewall for security + NAT

http://www.certdepot.net/rhel7-get-started-firewalld/
https://fedoraproject.org/wiki/FirewallD 

Public interface (eno1):  Block everything but ports 22, 80, and 443
Private/trusted interfaces (eno2, ens2): Wide open (all sorts of xCAT services here...)

service firewalld start
service firewalld status # Show current status
systemctl status firewalld # Show current status (newer command style)
firewall-cmd --state # Show current status

# Set the Private connection to be part of the TRUSTED zone
nmcli connection modify Private connection.zone trusted

# See which interfaces are assigned to which zones
firewall-cmd --get-active-zones
# Should show eno1 ("Campus") as PUBLIC
# Should show eno2 ("Private") and ens2 ("mlnx_40gbe") as TRUSTED
# (eno2 was added to the trusted zone via nmcli command)

# Show current configuration of public zone
firewall-cmd --zone=public --list-all

# Add SSH, HTTP, and HTTPS as allowed services for the public zone
firewall-cmd --permanent --zone=public --add-service=ssh # On by default
firewall-cmd --permanent --zone=public --add-service=http
firewall-cmd --permanent --zone=public --add-service=https
firewall-cmd --reload # Required for changes to take effect

# Show current configuration of public zone
firewall-cmd --zone=public --list-all

# Enable NAT masquerading between the internal/"trusted" zone and the external/"public" zone
firewall-cmd --permanent --zone=public --add-masquerade
firewall-cmd --reload # Required for changes to take effect

firewall-cmd --list-all-zones # Show all status for all zones

 

Install xCAT on node1

Useful xCAT commands for future reference (showing configuration): 

tabdump networks
tabdump passwd
tabdump site
tabdump mac
lsdef [node name/range] 
psh [node name/range] command
noderm [node name/range] # Remove node from all xCAT databases - useful for starting over
/opt/xcat/bin/lsxcatd -v # Show xCAT version 

 

Add the xCAT repositories to the YUM package manager (adjust the version in the xcat-core URL to match the release you want, e.g. 2.13.1 as noted above):

cd /etc/yum.repos.d
wget http://sourceforge.net/projects/xcat/files/yum/2.10/xcat-core/xCAT-core.repo
wget http://sourceforge.net/projects/xcat/files/yum/xcat-dep/rh7/x86_64/xCAT-dep.repo
cd # Go back to home directory 

 Install xCAT:

yum clean metadata
yum install xCAT

Test xCAT:

source /etc/profile.d/xcat.sh  # Add xCAT commands to path
tabdump site   # Check to see if database is initialized

The output should be similar to the following:

key,value,comments,disable
"xcatdport","3001",,
"xcatiport","3002",,
"tftpdir","/tftpboot",,
"installdir","/install",,
...

Configure xCAT on node 1

PASSWD TABLE: Configure xCAT-specific login accounts for system and IPMI information.  The 'tabedit' program manages database files using a VI-editor style interface. Use 'A' to enter append mode, enter your text, 'ESC' to get back to command mode, and then ":wq" to write the file and exit the editor.  "Fun times with VI".

tabedit passwd

When finished, the file should look something like this.

#key,username,password,cryptmethod,authdomain,comments,disable
"system","root","cluster",,,,
"ipmi","ipmi_username","ipmi_password",,,,

Be careful that you don't have any BLANK LINES after the last line, as that will cause an error upon saving.

DNS: Assuming that the network is currently configured, find out the current Pacific DNS servers:

cat /etc/resolv.conf

They are probably 10.10.4.226 and 10.10.4.227

Set site.forwarders to your site-wide DNS servers that can resolve site or public hostnames. The DNS on the management node will forward any requests it can't answer to these servers. Also, specify the IP address of your site-wide DNS (i.e. the management node) that will be distributed via DHCP to other cluster nodes.

chdef -t site forwarders=10.10.4.226,10.10.4.227
chdef -t site domain=ecs-cluster.serv.pacific.edu nameservers=192.168.0.1

Run makedns:

makedns -n

NTP:

Configure an external NTP server for all cluster nodes (we're small, no need to make node1 do this)

# Configure head node
chdef -t site ntpservers=ntp.pacific.edu
chdef -t site extntpservers=ntp.pacific.edu

# Set NTP to be configured automatically for nodes in compute group
# (This is one of the "postscripts" - tasks that run after a node boots)
chdef -p -t group -o compute postscripts=setupntp

# Update configuration files
makentp -a 
makedhcp -n # NTP settings can propagate via DHCP 

 

Dynamic Discovery: Declare a dynamic range of addresses for discovery.  If you want to run a discovery process, a dynamic range must be defined in the networks table. It's used for the nodes to get an IP address before xCAT knows their MAC addresses.

In this case, we'll designate 192.168.11.1 - 192.168.11.253 as a dynamic range. Note that the first argument (here 192_168_0_0-255_255_0_0) is taken from "tabdump networks" and must match an existing entry in that table.

Important note:  The dynamic range should be **DIFFERENT** from the range you wish to use for final node assignments.  In this case, our dynamic range is 192.168.11.x (for auto-discovery), but later we'll be pinning discovered nodes to the 192.168.10.x range.

chdef -t network 192_168_0_0-255_255_0_0 dynamicrange=192.168.11.1-192.168.11.253
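To confirm the change took effect, re-dump the networks table (a quick optional check):

tabdump networks
# The dynamicrange column of the 192_168_0_0-255_255_0_0 entry should now show 192.168.11.1-192.168.11.253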

 

Setup DHCP:  We don't want the DHCP server listening on the public (site) network, so set site.dhcpinterfaces to the MN's cluster-facing NICs.

chdef -t site dhcpinterfaces=eno2

Then this will get the network stanza part of the DHCP configuration (including the dynamic range) set:

makedhcp -n

The IP/MAC mappings for the nodes will be added to DHCP automatically as the nodes are discovered.

 Setup conserver:

makeconservercf

 

 

Sequential Node Discovery

(other options, like Profile Discovery, exist but are not used)

Initialize the discovery process (you might need to run this as root):

nodediscoverstart noderange=node[002-008] hostiprange=192.168.10.[2-8] bmciprange=192.168.10.[102-108] groups=compute,ipmi --skipbmcsetup 

Documentation: http://xcat.sourceforge.net/man1/nodediscoverstart.1.html

  • BMCSetup is not functional (out of the box) on Cray cluster - skip it

XXX Create a group name called compute?

Power on the nodes sequentially:  At this point you can physically power on the nodes one at a time, in the order you want them to receive their node names.

Display information about the discovery process

Verify the status of discovery:

nodediscoverstatus


Show the nodes that have been discovered so far:

nodediscoverls -t seq -l
tail -f /var/log/messages 


Stop the current sequential discovery process:

nodediscoverstop

 

A quick summary of what is happening during the discovery process is:

  • the nodes request a DHCP IP address and PXE boot instructions
  • the DHCP server on the MN responds with a dynamic IP address and the xCAT genesis boot kernel
  • the genesis boot kernel running on the node sends the MAC and MTMS to xcatd on the MN
  • xcatd uses specified node name pool to get the proper node entry
    • stores the node's MTMS in the db
    • puts the MAC/IP pair in the DHCP configuration
    • sends several of the node attributes to the genesis kernel on the node
  • the genesis kernel configures the BMC with the proper IP address, userid, and password, and then just drops into a shell

 

After a successful discovery process, the following attributes will be added to the database for each node. (You can verify this by running lsdef <node>, e.g. lsdef node002 ):

  • mac - the MAC address of the in-band NIC used to manage this node
  • mtm - the hardware type (machine-model)
  • serial - the hardware serial number
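To pull just these attributes for one node, lsdef's -i attribute filter is handy (a quick optional check):

lsdef node002 -i mac,mtm,serial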

If you cannot discover the nodes successfully, see the next section XCAT_iDataPlex_Cluster_Quick_Start#Manually_Discover_Nodes.

Check that the nodes all have IP addresses assigned to them. If not, manually fix it. For example:

lsdef node003

chdef -t node node002 ip=192.168.10.2
chdef -t node node003 ip=192.168.10.3
chdef -t node node004 ip=192.168.10.4
chdef -t node node005 ip=192.168.10.5
chdef -t node node006 ip=192.168.10.6
chdef -t node node007 ip=192.168.10.7
chdef -t node node008 ip=192.168.10.8

tabdump hosts 

For some reason, the new nodes are not added to the "ipmi" group, despite requesting it during discovery. Add it manually:

nodech node00[2-8] groups=compute,ipmi
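Verify the change (optional):

lsdef node002 -i groups
# Should now list compute,ipmi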

 

Tip: If at some later time you want to force a re-discover of a node, run:

noderm <noderange> # Delete all memory of node from xCAT tables
makedhcp -d <noderange> # Delete DHCP assignment for node

and then reboot the node(s).

 

Monitoring Node Discovery

When the bmcsetup process completes on each node (about 5-10 minutes), xCAT genesis will drop into a shell and wait indefinitely (and change the node's currstate attribute to "shell"). You can monitor the progress of the nodes using:

watch -d 'nodels ipmi chain.currstate|xcoll'

Before all nodes complete, you will see output like:

====================================  
        n1,n10,n11,n75,n76,n77,n78,n79,n8,n80,n81,n82,n83,n84,n85,n86,n87,n88,n89,n9,n90,n91
====================================

shell

====================================  

n31,n32,n33,n34,n35,n36,n37,n38,n39,n4,n40,n41,n42,n43,n44,n45,n46,n47,n48,n49,n5,n50,n51,n52,
 n53,n54,n55,n56,n57,n58,n59,n6,n60,n61,n62,n63,n64,n65,n66,n67,n68,n69,n7,n70,n71,n72,n73,n74
====================================   

    runcmd=bmcsetup

When all nodes have made it to the shell, xcoll will just show that the whole nodegroup "ipmi" has the output "shell":

====================================     
    ipmi
==================================== 

shell

 

When the nodes are in the xCAT genesis shell, you can ssh or psh to any of the nodes to check anything you want.

Verify HW Management Configuration

At this point, the BMCs should all be configured and ready for hardware management. To verify this:

XXX NOT WORKING:   rpower ipmi stat | xcoll
===================================     
    ipmi
===================================     

    on

 

Verify that remote console works on one node by running:

XXX NOT WORKING:   rcons <node>

Verify that you can see the genesis shell prompt (after hitting Enter). To exit rcons, type Ctrl-Shift-E (all together), then "c", then ".".
You are now ready to choose an operating system and deployment method for the nodes....

 

 

Stateless Node Configuration

Download the ISO file for CentOS (the same ISO used in installing the management node)

wget http://mirror.oss.ou.edu/centos/7/isos/x86_64/CentOS-7-x86_64-Minimal-1708.iso

 The copycds command copies the contents of the linux distro media to /install/<os>/<arch> so that it will be available to install nodes with or create diskless images.

copycds /root/CentOS-7-x86_64-Minimal-1708.iso -n centos7.4

[OPTIONAL]: Create a file /etc/yum.repos.d/CentOS-installer.repo so that yum can find installer-provided rpms if it needs them in the future:

[local-centos7-x86_64]
name=xCAT local CentOS 7
baseurl=file:/install/centos7.4/x86_64
enabled=1
gpgcheck=0

 

Create a copy of the OS image and give it a different (preferably simpler) name.

(Retrieve OS image info in stanza format | use a regular expression to rename the image to "centos74" | make an xCAT data object definition using the previous data as input)

lsdef -t osimage -z centos7.4-x86_64-netboot-compute | sed 's/^[^ ]\+:/centos74:/' | mkdef -z

Change the root image directory of "centos74" to "/install/netboot/centos7.4/x86_64/centos74"

chdef -t osimage -o centos74 rootimgdir=/install/netboot/centos7.4/x86_64/centos74

 

Tip: To see the variables changed in the upcoming set of commands, use lsdef -t osimage centos74

Set up pkglists.   You likely want to customize the main pkglist for the image. This is the list of rpms or groups that will be installed from the distro. (Other rpms that they depend on will be installed automatically.)

mkdir -p /install/custom/netboot/centos
cp -p /opt/xcat/share/xcat/netboot/centos/compute.centos7.pkglist /install/custom/netboot/centos
vi /install/custom/netboot/centos/compute.centos7.pkglist
chdef -t osimage centos74 pkglist=/install/custom/netboot/centos/compute.centos7.pkglist

The goal is to install the fewest number of rpms that still provides the applications that you need, because the resulting ramdisk will use real memory in your nodes.

Set up the exclude list.  This excludes all files and directories you do not want in the image. The exclude list enables you to trim the image after the rpms are installed into the image, so that you can make the image as small as possible.

cp /opt/xcat/share/xcat/netboot/centos/compute.exlist /install/custom/netboot/centos
vi /install/custom/netboot/centos/compute.exlist
chdef -t osimage centos74 exlist=/install/custom/netboot/centos/compute.exlist

Make sure nothing is excluded in the exclude list that you need on the node. For example, if you require perl on your nodes, remove the line "./usr/lib/perl5*".
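A quick optional grep to see whether something you need is being trimmed by the exclude list (perl used as the example here):

grep -i perl /install/custom/netboot/centos/compute.exlist
# Any matching lines will strip perl from the image; delete them if the nodes need perl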

Create a directory to hold updated RPMs that you get from your package manager.

yum install createrepo
mkdir -p /install/updates/centos7.4/x86_64

# Grab any/all updates, even if we don't need them
/bin/cp -f /var/cache/yum/x86_64/7/base/packages/*.rpm /install/updates/centos7.4/x86_64/
/bin/cp -f /var/cache/yum/x86_64/7/updates/packages/*.rpm /install/updates/centos7.4/x86_64/
/bin/cp -f /var/cache/yum/x86_64/7/epel/packages/*.rpm /install/updates/centos7.4/x86_64/

# Or, if you want to download something but not install it on the management node:
# yum install --downloadonly --downloaddir=/install/updates/centos7.4/x86_64/ <package>

# Update the Repo. Must be done EVERY TIME YOU ADD NEW RPMs.
createrepo /install/updates/centos7.4/x86_64

Set the image generator to look at the updates repo for any new RPM files. The -p option appends this path onto the previous path, which was the CD with installer files.

chdef -t osimage centos74 -p pkgdir=/install/updates/centos7.4/x86_64

Set up a postinstall script (optional)

Postinstall scripts for diskless images are analogous to postscripts for diskfull installation. The postinstall script is run by genimage near the end of its processing. You can use it to do anything to your image that you want done every time you generate this kind of image. In the script you can install rpms that need special flags, or tweak the image in some way. There are some examples shipped in /opt/xcat/share/xcat/netboot/<distro>. If you create a postinstall script to be used by genimage, then point to it in your osimage definition. 

cp /opt/xcat/share/xcat/netboot/centos/compute.centos7.postinstall /install/custom/netboot/centos
vi /install/custom/netboot/centos/compute.centos7.postinstall
chdef -t osimage centos74 postinstall=/install/custom/netboot/centos/compute.centos7.postinstall

Note 1:  Modify this file so that your /etc/fstab is correct

Note 2: Modify this file to create a new limits.d sub-file that releases all memlock limits. (That way this setting is present upon BOOT)

cat <<END >$installroot/etc/security/limits.d/95-openfabrics.conf
* soft memlock unlimited
* hard memlock unlimited
END 

 

Create a script of commands to run on the compute nodes after they are booted, e.g. /etc/xcat_executealways. Be sure to chmod +x it!

#!/bin/sh
/usr/bin/echo "eth" > /sys/bus/pci/devices/0000\:02\:00.0/mlx4_port1
/usr/bin/echo "eth" > /sys/bus/pci/devices/0000\:02\:00.0/mlx4_port2

ntpdate ntp.pacific.edu   # Set the clock BEFORE running these jobs
# (In particular, pip3 needs an accurate time for SSL certs)

/usr/bin/echo "/usr/bin/systemctl start exports.mount" | /bin/at now + 4 minute
/usr/bin/echo "/usr/bin/systemctl start munge && /usr/bin/systemctl start slurmd" | /bin/at now + 5 minute
yes | pip3 install --upgrade pip
yes | pip3 install -U matplotlib
yes | pip3 install keras
yes | pip3 install tensorflow
yes | pip3 install scoop
yes | pip3 install deap
yes | pip3 install psutil
yes | pip3 install apscheduler

 

Set up Files to be synchronized on the nodes

http://sourceforge.net/p/xcat/wiki/Sync-ing_Config_Files_to_Nodes/

Sync lists contain a list of files that should be sync'd from the management node to the image and to the running nodes. This allows you to have 1 copy of config files for a particular type of node and make sure that all those nodes are running with those config files. The sync list should contain a line for each file you want sync'd, specifying the path it has on the MN and the path it should be given on the node. For example:


/etc/munge/munge.key -> /etc/munge/munge.key
/etc/slurm/slurm.conf -> /etc/slurm/slurm.conf
/etc/default/slurmd -> /etc/default/slurmd
/usr/lib/systemd/system/slurmd.service -> /usr/lib/systemd/system/slurmd.service
/etc/xcat_executealways -> /etc/xcat_executealways
######/etc/rdma/mlx4.conf -> /etc/rdma/mlx4.conf

######/var/run/slurm/* -> /var/run/slurm/
######/etc/security/limits.d/95-openfabrics.conf -> /etc/security/limits.d/95-openfabrics.conf
APPEND:
/etc/hosts_slave -> /etc/hosts
MERGE:
/etc/passwd_slave -> /etc/passwd
/etc/shadow_slave -> /etc/shadow
/etc/group_slave -> /etc/group
EXECUTEALWAYS:
/etc/xcat_executealways
##/usr/bin/echo "eth" > /sys/bus/pci/devices/0000\:02\:00.0/mlx4_port1
##/usr/bin/echo "eth" > /sys/bus/pci/devices/0000\:02\:00.0/mlx4_port2
##/usr/bin/echo "/usr/bin/systemctl start exports.mount" | /bin/at now + 1 minute
##/usr/bin/echo "/usr/bin/systemctl start munge && /usr/bin/systemctl start slurmd" | /bin/at now + 2 minute
#####/usr/bin/systemctl start exports.mount
#####/usr/bin/systemctl start munge && /usr/bin/systemctl start slurmd
#####/bin/bash -c "/usr/sbin/modprobe -r -a mlx4_en mlx4_ib && /usr/sbin/modprobe -a mlx4_en mlx4_ib"
#####/usr/bin/sleep 5; /usr/bin/systemctl stop munge; /usr/bin/pkill -u munge munged; /usr/bin/systemctl start munge; /usr/bin/sleep 5; /usr/bin/systemctl restart slurmd
 

If you put the above contents in /install/custom/netboot/centos/compute.synclist, then:

chdef -t osimage centos74 synclists=/install/custom/netboot/centos/compute.synclist 

Tip: To manually re-sync files after a node is booted, do:  updatenode node00[2-8] -F --verbose

 

Configure the desired noderange to use this osimage. In this example, we define that the whole compute group should use the image:

chdef -t group compute provmethod=centos74

Now that you have associated an osimage with nodes, if you want to list a node's attributes, including the osimage attributes all in one command:

lsdef node002 --osimage

 Generate a stateless image: If the image you are building is for nodes that are the same OS and architecture as the management node (the most common case), then you can run genimage to generate the image based on the definition:

genimage centos74

You could edit the files of the image directly in /install/netboot/centos7.4/x86_64/centos74/rootimg, but such changes will be lost the next time you run 'genimage'. Better to add any changes to the postinstall script.

Now pack the image to create the ramdisk:

packimage centos74

(Make sure that all packages are found. If you're missing packages, go back and copy the RPMs into the repository you created earlier...) 
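A rough, optional check of the image size before booting nodes from it (path is the rootimgdir set earlier; packimage writes the packed ramdisk next to it):

du -sh /install/netboot/centos7.4/x86_64/centos74/rootimg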

Assign this image to your 'compute' group of nodes

nodeset compute osimage=centos74

(If you need to update your diskless image sometime later, change your osimage attributes and the files they point to accordingly, and then rerun genimage, packimage, nodeset, and boot the nodes.)
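A condensed sketch of that rebuild cycle (the same commands used above, in order):

genimage centos74
packimage centos74
nodeset compute osimage=centos74
# then reboot the nodes (power-cycle them, or rpower compute boot once IPMI control is working)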
Now boot your nodes...

XXX NOT WORKING:   rsetboot compute net
XXX NOT WORKING: rpower compute boot XXXXX WON'T WORK WITHOUT IPMI??? XXXXXXX

 

 

XXXX DEBUG INFO XXXXXX

Add Kernel boot options to the entire compute group of nodes:

tabedit bootparams    # Add it to the addkcmdline option

Or...

chdef -t group compute addkcmdline="rdshell"
nodeset compute osimage=centos74 
  • http://fedoraproject.org/wiki/How_to_debug_Dracut_problems#Using_the_dracut_shell

 

QNAP NAS

  • IP: 192.168.10.100 (Static)
  • Name: STORAGE
  • Login:  admin :: venus
  • Configuration:
    • RAID 5 disk configuration
    • NTP server: ntp.pacific.edu
    • Create NFS shared folder called "exports" and set host-based permission on that folder for allowed IP address "192.168.0.0/255.255.0.0"
    • Disable everything you can find except for NFS storage hosting

 

Add custom host entry in xcat for the storage server (so "storage-qnap" points to 192.168.10.100)

tabedit hosts
# Add a line like this: "storage-qnap","192.168.10.100",,,,,

Do a test mount of the qnap NFS share to verify configuration:

mkdir /exports-backup
mount storage-qnap:/exports  /exports-backup

Permanently mount the qnap NFS share on startup by adding to /etc/fstab:

## (Slower since client first tries NFSv4 before timeout...) storage:/exports /exports nfs defaults 0 0
storage-qnap:/exports /exports-backup nfs vers=3,proto=udp 0 0 

 

Note: That same line needs to be added to the postinstall script for the compute node images. (The postinstall script generates /etc/fstab) 

# JAS - Append to end of /etc/fstab
echo "storage-qnap:/exports /exports-backup nfs vers=3,proto=udp 0 0 hard memlock unlimited" >> $installroot/etc/fstab

 

FreeNAS NAS

  • IP: 172.16.0.101 / 192.168.10.101
  • Hostname: storage-freenas.local
  • Login:
  • Configuration:
    • ZFS pool with all disks
    • NTP server: ntp.pacific.edu  (System->General->Add NTP Servers, Prefer=True)
    • NFSv4: enabled (Services->NFS->Enable NFS v4 + NFS v3 ownership model for NFSv4) - Otherwise you'll have to have consistent UIDs between FreeNAS and the cluster
    • NFS Share (exports)
      • Comment: 'exports'
      • Maproot User to 'root'
      • Maproot Group to 'wheel'  (otherwise, you'll get errors when rsync tries to chown something as 'root')

 

Add custom host entry in xcat for the storage server (so "storage-freenas" points to 172.16.0.101)

tabedit hosts
# Add a line like this: "storage-freenas","172.16.0.101",,,,
# Add a line like this: "storage-freenas-1gbe","192.168.10.101",,,,

Do a test mount of the FreeNAS NFS share to verify configuration:

mkdir /exports
mount storage-freenas:/mnt/cluster_hss/exports  /exports

Permanently mount the FreeNAS NFS share on startup by adding to /etc/fstab:

storage-freenas:/mnt/cluster_hss/exports /exports nfs defaults 0 0

 

Note: That same line needs to be added to the postinstall script for the compute node images. (The postinstall script generates /etc/fstab) 

# JAS - Append to end of /etc/fstab
echo "storage-freenas:/mnt/cluster_hss/exports /exports nfs defaults 0 0" >> $installroot/etc/fstab

 

 

Configure secondary network

# http://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/manage_clusters/ppc64le/diskless/customize_image/cfg_second_adapter.html
# http://xcat-docs.readthedocs.io/en/stable/guides/admin-guides/references/man5/nics.5.html?highlight=nics

Define a network for the xCAT cluster (mgtifname is the interface name on node1)

chdef -t network -o mlnx_40gbe net=172.16.0.0 mask=255.255.255.0 mgtifname=ens2

Define configuration information for secondary adapters

chdef node002 nicips.ens2="172.16.0.2" nicnetworks.ens2=mlnx_40gbe nictypes.ens2="Ethernet"
chdef node003 nicips.ens2="172.16.0.3" nicnetworks.ens2=mlnx_40gbe nictypes.ens2="Ethernet"
chdef node004 nicips.ens2="172.16.0.4" nicnetworks.ens2=mlnx_40gbe nictypes.ens2="Ethernet"
chdef node005 nicips.ens2="172.16.0.5" nicnetworks.ens2=mlnx_40gbe nictypes.ens2="Ethernet"
chdef node006 nicips.ens2="172.16.0.6" nicnetworks.ens2=mlnx_40gbe nictypes.ens2="Ethernet"
chdef node007 nicips.ens2="172.16.0.7" nicnetworks.ens2=mlnx_40gbe nictypes.ens2="Ethernet"
chdef node008 nicips.ens2="172.16.0.8" nicnetworks.ens2=mlnx_40gbe nictypes.ens2="Ethernet"

Add confignics into node postscript list to configure after booting
 (-p is appending to existing postscripts like setupntp)

chdef -p -t group -o compute postscripts=confignics
lsdef node002 # See if confignics was added to a node

 

xCAT Upgrading

Show xCAT version:

/opt/xcat/bin/lsxcatd -v  # Show xCAT version 

Backup the tables:

dumpxCATdb -p /path/to/backup/dir

Change the .repo files to be the desired version of xCAT and Linux distribution (find the latest links on official website):

cd /etc/yum.repos.d
mv xCAT-dep.repo xCAT-dep.repo.old
mv xCAT-core.repo xCAT-core.repo.old
wget ....xCAT-dep.repo # Find the correct .repo files from website for desired target
wget ....xCAT-core.repo  # Find the correct .repo files from website for desired target

Clean the yum metadata and run the update:

yum clean metadata
yum update 

 


Modules

http://modules.sourceforge.net/
https://linuxcluster.wordpress.com/2012/11/08/installing-and-configuring-environment-modules-on-centos-6/ 

Build:

wget http://downloads.sourceforge.net/project/modules/Modules/modules-3.2.10/modules-3.2.10.tar.gz
tar -xzf modules-3.2.10.tar.gz
cd modules-3.2.10
./configure
make
sudo make install 

Configure:

Create the file /etc/modulefiles/mpi/openmpi-2.0.2 with the following contents:

#%Module 1.0
#
# OpenMPI module for use with 'environment-modules' package:
#
conflict mpi
prepend-path PATH /exports/opt/openmpi-2.0.2/bin
prepend-path LD_LIBRARY_PATH /exports/opt/openmpi-2.0.2/lib
prepend-path PYTHONPATH /usr/lib64/python2.7/site-packages/openmpi
prepend-path MANPATH /exports/opt/openmpi-2.0.2/share/man
setenv MPI_BIN /exports/opt/openmpi-2.0.2/bin
setenv MPI_SYSCONFIG /exports/opt/openmpi-2.0.2/etc
setenv MPI_FORTRAN_MOD_DIR /usr/lib64/gfortran/modules/openmpi-x86_64
setenv MPI_INCLUDE /exports/opt/openmpi-2.0.2/include
setenv MPI_LIB /exports/opt/openmpi-2.0.2/lib
setenv MPI_MAN /exports/opt/openmpi-2.0.2/share/man
setenv MPI_PYTHON_SITEARCH /usr/lib64/python2.7/site-packages/openmpi
setenv MPI_COMPILER openmpi-x86_64
setenv MPI_SUFFIX _openmpi
setenv MPI_HOME /exports/opt/openmpi-2.0.2
setenv OMPI_MCA_btl self,tcp,sm,vader

Note: The last item (OMPI_MCA_btl) specifically does NOT list the openib BTL since we don't have InfiniBand anymore. (Otherwise, you'd get a non-fatal error like the following for each process launched:)

--------------------------------------------------------------------------
No OpenFabrics connection schemes reported that they were able to be
used on a specific port.  As such, the openib BTL (OpenFabrics
support) will be disabled for this port.

  Local host:           node002
  Local device:         mlx4_0
  Local port:           1
  CPCs attempted:       rdmacm
--------------------------------------------------------------------------

 

Test:

module avail                         # Modules available
module list     # Modules currently loaded
module display mpi/openmpi-2.0.2   # Details about this module

 

OpenMPI

http://www.open-mpi.org/

Build:

Download the latest source.
Configuration:

  • Choose a shared location on the /exports drive (so that all cluster nodes can access the SAME binaries/libraries).
  • Choose a unique folder name so multiple/conflicting versions of MPI can coexist.
  • Enable SLURM support (on by default)
  • Enable PMI2 support (to interface with SLURM correctly) 
wget https://www.open-mpi.org/software/ompi/v2.0/downloads/openmpi-2.0.2.tar.gz
tar -xzf openmpi-2.0.2.tar.gz
cd openmpi-2.0.2

./configure --help
# Tons and tons and TONS of options here!

./configure --prefix=/exports/opt/openmpi-2.0.2 --with-slurm --enable-static --with-pmi=/usr/ --with-pmi-libdir=/usr/lib64 CFLAGS="-I/usr/include/slurm"
##OLD - FOR 1.8.8: ./configure --prefix=/exports/opt/openmpi-2.0.2 --with-slurm --with-pmi=/usr/include/slurm --with-pmi-libdir=/usr/lib64 CFLAGS="-I/usr/include/slurm"
make
sudo make install

Test:

Create a ~/machines file with a list of hostnames and core counts for the cluster:

172.16.0.2 slots=48
172.16.0.3 slots=48
172.16.0.4 slots=48
172.16.0.5 slots=48
172.16.0.6 slots=48
172.16.0.7 slots=48
172.16.0.8 slots=48

Can you compile things?

# Set environment vars (assuming you already configured module)
##export PATH=/exports/opt/openmpi-2.0.2/bin:$PATH
##export LD_LIBRARY_PATH=/exports/opt/openmpi-2.0.2/lib:$LD_LIBRARY_PATH
module load mpi/openmpi-2.0.2

which mpirun
# Should be /exports/opt/openmpi-2.0.2/bin/mpirun
which mpicc
# Should be /exports/opt/openmpi-2.0.2/bin/mpicc

# Demo code (not present in v2.0.2 for some reason)
cd ~/openmpi-1.8.8/ompi/contrib/vt/vt/examples/c/
mpicc hello.c -o hello_c
mpicc ring.c -o ring_c

Can you run on a single node?

mpirun -np 1 /usr/bin/hostname
mpirun -np 1 hello_c
mpirun -np 1 ring_c

Can you run on multiple nodes?

$MPI_BIN/mpirun -np 2 -machinefile ~/machines /usr/bin/hostname
$MPI_BIN/mpirun -np 4 -machinefile ~/machines /usr/bin/hostname
$MPI_BIN/mpirun -np 7 -machinefile ~/machines /usr/bin/hostname

$MPI_BIN/mpirun -np 2 -machinefile ~/machines hello_c
...
...

$MPI_BIN/mpirun -np 2 -machinefile ~/machines ring_c
...
... 

 

  

XXX - Failed method 1 (RPM builds, but conflicts with basic CentOS rpm 'filesystem'):

wget https://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8.8-1.src.rpm
rpmbuild --rebuild --define 'configure_options --with-devel-headers' openmpi-1.8.8-1.src.rpm
# Results are in rpmbuild folder

cd rpmbuild/RPMS/x86_64
sudo yum localinstall openmpi-1.8.8-1.x86_64.rpm 

XXX - Failed method 2  (RPM builds, but conflicts with basic CentOS rpm 'filesystem'):

wget http://www.open-mpi.org/software/ompi/v1.8/downloads/openmpi-1.8.8.tar.gz
wget https://github.com/open-mpi/ompi-release/raw/v1.8/contrib/dist/linux/buildrpm.sh
wget https://github.com/open-mpi/ompi-release/raw/v1.8/contrib/dist/linux/openmpi.spec
chmod +x buildrpm.sh

# Modify buildrpm.sh
# 1.) Change build_srpm to "no"
# 2.) Change build_multiple to "yes"
# 3.) At the top, add this line:
# export rpmtopdir='/exports/home/jshafer/openmpi-1.8.8'

mkdir -p ~/openmpi-1.8.8/SOURCES
buildrpm.sh openmpi-1.8.8.tar.gz
# (Had to repeat twice for some reason...)

# Install all the RPMS on Node1, and just the runtime RPM on the diskless nodes
cd openmpi-1.8.8/RPMS/x86_64/
# (Should be devel, docs, and runtime in here)

sudo yum localinstall openmpi-*.x86_64.rpm

 

OpenMP Validation Suite

http://web.cs.uh.edu/~openuh/download/

wget http://web.cs.uh.edu/~openuh/download/packages/OpenMP3.1_Validation.tar.gz
tar -xzf OpenMP3.1_Validation.tar.gz
cd OpenMP3.1_Validation
# Check the Makefile, uncommenting the CC/FC and CFLAGS/FFLAGS variables for the compiler of your choice.
# (GCC is the default, so everything is OK)
make ctest # Run C-language test
make ftest  # Run Fortran90 test

 

Munge + SLURM

Munge:  http://dun.github.io/munge/
SLURM: http://slurm.schedmd.com/quickstart_admin.html 

 

Build RPMs for Munge:

wget https://github.com/dun/munge/releases/download/munge-0.5.13/munge-0.5.13.tar.xz
yum install bzip2-devel openssl-devel zlib-devel
rpmbuild -tb --clean munge-0.5.13.tar.xz 

Install Munge on Node1:

cd rpmbuild/RPMS/x86_64
rpm -Uvh munge-0.5.13-1.el7.centos.x86_64.rpm munge-debuginfo-0.5.13-1.el7.centos.x86_64.rpm munge-devel-0.5.13-1.el7.centos.x86_64.rpm munge-libs-0.5.13-1.el7.centos.x86_64.rpm
# Used -ivh to do original install

Configure Munge on Node1:

# Generate secret key
dd if=/dev/random bs=1 count=1024 >/etc/munge/munge.key 
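# Munge refuses keys with loose permissions; locking the key down is an extra
# step beyond the original notes (owner/mode shown are the usual munge defaults):
chown munge:munge /etc/munge/munge.key
chmod 400 /etc/munge/munge.key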

# Enable / run at startup
systemctl enable munge.service
systemctl start munge
systemctl status munge

Test Munge on Node1:

# Generate a credential on stdout:
munge -n
# Check if a credential can be locally decoded:
munge -n | unmunge
# Check if a credential can be remotely decoded:
munge -n | ssh <somehost> unmunge
# Run a quick benchmark:
remunge

Install Munge on Nodes2-8:

  • All nodes need the same secret key (via synclist)
  • Nodes 2-8 need ONLY the munge and munge-libs RPMs added to the pkglist. (Copy the rpms to the /install directory)
    munge-0.5.13-1.el7.centos.x86_64.rpm munge-libs-0.5.13-1.el7.centos.x86_64.rpm
  • Re-run createrepo (since you added RPMs)
    Re-run genimage
    Re-run packimage
    (see the sketch after this list)
  • Configure the "munge" service to start via the EXECUTEALWAYS block of the synclist  (/usr/bin/systemctl start munge).  The service should only start AFTER the config file is synced.

 

Build RPMs for SLURM (note that gtk2 is required for sview):

wget https://download.schedmd.com/slurm/slurm-17.11.4.tar.bz2
yum install readline-devel pam-devel perl-Switch gtk2 gtk2-devel mariadb-devel
rpmbuild -ta --clean slurm-17.11.4.tar.bz2

Install SLURM on Node1:

cd rpmbuild/RPMS/x86_64
rpm -Uvh slurm-17.11.4-1.el7.centos.x86_64.rpm slurm-contribs-17.11.4-1.el7.centos.x86_64.rpm slurm-devel-17.11.4-1.el7.centos.x86_64.rpm slurm-example-configs-17.11.4-1.el7.centos.x86_64.rpm slurm-libpmi-17.11.4-1.el7.centos.x86_64.rpm slurm-openlava-17.11.4-1.el7.centos.x86_64.rpm slurm-pam_slurm-17.11.4-1.el7.centos.x86_64.rpm slurm-perlapi-17.11.4-1.el7.centos.x86_64.rpm slurm-slurmctld-17.11.4-1.el7.centos.x86_64.rpm slurm-slurmd-17.11.4-1.el7.centos.x86_64.rpm slurm-slurmdbd-17.11.4-1.el7.centos.x86_64.rpm slurm-torque-17.11.4-1.el7.centos.x86_64.rpm
# Used -ivh to do original install

Create slurm.conf file using web tool:  http://slurm.schedmd.com/configurator.html

***ANNOYING GLITCHES***:
Ensure that the PID file paths in /usr/lib/systemd/system/slurmd.service (and slurmctld.service) match the locations given in slurm.conf.
Ensure that the directory where the PID files are stored exists and is writable by the correct user.
Ensure that the (empty) systemd environment file exists.
Ensure that the slurmd.service file (/usr/lib/systemd/system/slurmd.service) contains the line LimitMEMLOCK=infinity in the [Service] block, and that this file is synchronized to the nodes (see the sketch below).
Bugs:  http://bugs.schedmd.com/show_bug.cgi?id=1664
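A minimal sketch of the relevant lines in /usr/lib/systemd/system/slurmd.service. The EnvironmentFile and PIDFile paths here are assumptions; they must match the files created below and the PidFile setting in your slurm.conf (the rest of the packaged unit file stays as shipped):

[Service]
# (fragment only - keep the existing Type/ExecStart lines from the packaged unit)
EnvironmentFile=-/etc/default/slurmd
PIDFile=/var/run/slurm/slurmd.pid
LimitMEMLOCK=infinity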

Configure SLURM on Node1 (which needs slurmctld):

useradd -r slurm   # Create service account for slurm user

# Create files owned by slurm user
# (To debug this on startup, do: slurmctld -v -v -v -v -D)
touch /etc/default/slurmctld
chown slurm:slurm /etc/default/slurmctld
touch /etc/default/slurmd
chown slurm:slurm /etc/default/slurmd
mkdir /var/run/slurm/
chown slurm:slurm /var/run/slurm

touch /var/spool/last_config_lite
chown slurm:slurm /var/spool/last_config_lite
touch /var/spool/last_config_lite.new
chown slurm:slurm /var/spool/last_config_lite.new
mkdir /var/spool/slurmd
chown slurm:slurm /var/spool/slurmd
mkdir /var/spool/slurm_state
chown slurm:slurm /var/spool/slurm_state
### (NOTE: state is saved in /var/spool by default, but that location is not writable by the slurm user)

# Enable / run at startup
systemctl enable slurmctld.service
systemctl start slurmctld
systemctl status slurmctld
# (Will fail until slurm.conf is created) 

Test SLURM on Node1:

sinfo 

Install SLURM on Nodes 2-8 (which needs slurmd):

  • The slurm user needs to exist on the nodes (add to passwd_slave)
  • The slurm group needs to exist on the nodes (add to group_slave)
  • The slurm.conf file needs to exist on the node
  • The file /etc/default/slurmd needs to exist on the node (empty)
  • The directory /var/run/slurm needs to exist on the node and owned by slurm:slurm (empty)
  • Nodes 2-8 ONLY need the following RPMs added to the pkglist (Copy the rpms to the /install directory)
    slurm
    slurm-slurmdbd 
    slurm-devel
    slurm-munge
    slurm-plugins
    slurm-sql 
    slurm-libpmi
  • Re-run createrepo (since you added RPMs)
    Re-run genimage
    Re-run packimage
    (see the sketch after this list)
  • Configure the "slurmd" service to start via the EXECUTEALWAYS block of the synclist (/usr/bin/systemctl start slurmd).  The service should only start AFTER the config file is synced.

 

SLURM testing:

# Activate all nodes
sudo scontrol update state=RESUME node=node00[2-8]

# Show partition and available node info
sinfo

# Run something using default partition
srun /usr/bin/hostname
srun --nodes=7 /usr/bin/hostname
# Should see 7 hostnames printed

# MPI testing
module load mpi/openmpi-1.8.8 
cd ~/openmpi-1.8.8/ompi/contrib/vt/vt/examples/c/

# Run tasks in 'gradclass' partition (7 nodes, 8 cores per node)
srun --partition=gradclass --nodes=7 hello_c

# Run tasks in default 'compute' partition (7 nodes, 40 cores per node)
srun --partition=compute --nodes=7 hello_c

 

iSCB (Chassis) Control

 

Via Serial Port

Use USB->Serial adaptor on node1.

Serial port should be 115200 8N1, no hardware flow control, no software flow control

sudo minicom -s

 

(Was unable to gain access via serial console - missing username/password)

Default network address is: 10.10.1.11

telnet 10.10.1.11 7000

Change default login:

iSCB> set passwd       #(set telnet password to admin)

Change default IP and netmask

(pending)

iSCB> nvram save

 

 

New User Setup

 ADMIN:  Add new user accounts, specifying their custom home directory location

# Add user account
sudo useradd -d /exports/home/<USERNAME> -c "<FULL NAME>" <USERNAME>
sudo passwd <USERNAME>

# For SUDO access (faculty and student admins only) 
sudo gpasswd -a <USERNAME> wheel

# To review which users already have SUDO access
sudo lid -g wheel

# To run MPI across the diskless compute nodes, you need to:
# 1.) Copy and paste desired entries into /etc/passwd_slave, /etc/shadow_slave, and /etc/group_slave!
# 2.) Re-sync (actually MERGE) those files onto the compute nodes via:
# sudo updatenode node00[2-8] -F 

 NEW USERS:   Create your SSH key to allow password-less authentication among cluster nodes. Add the public key to the authorized_keys file, which is shared (along with your home directory) among cluster nodes.

ssh-keygen -t rsa
# ENTER to accept default directory
# ENTER to accept NO passphrase
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys 

NEW GROUP (for shared collaborators)

sudo groupadd <GROUPNAME>
sudo gpasswd -a <USERNAME> <GROUPNAME>
.... (repeat as needed) ...

 

[****OLD NAT INSTRUCTIONS**** - USE NEW FIREWALLD METHOD INSTEAD]

From:  http://sumavi.com/sections/port-forwarding

 Edit /etc/sysctl.conf to allow ip forwarding: 

# Controls IP packet forwarding
net.ipv4.ip_forward = 1

[OLD INSTRUCTIONS] The default is 0. After changing sysctl.conf, reload it:

sysctl -p

Configure routing rules:

iptables -t nat -A POSTROUTING -o eno1 -j MASQUERADE
# where eno1 is the PUBLIC interface 
iptables -A FORWARD -i eth0 -j ACCEPT
iptables -A FORWARD -o eth0 -j ACCEPT
# (where eth0 is the PRIVATE interface) 

Save the rules to be persistent:

iptables-save > /etc/sysconfig/iptables   # iptables-save alone only prints the rules; redirect them to /etc/sysconfig/iptables (with the iptables-services package installed) to persist across reboots