Kail's Blog

Somewhat HPC-related blog

  • Running A Julia Kernel In Apptainer

    In a previous article, Singularity Jupyter Kernel, I went through setting up an Apptainer container to run IPython kernels in JupyterLab.

    Recently I started looking into Julia and wanted to do the same setup. It turned out not to be as straightforward for me, as I was not familiar with …

    Read more...

  • Running Ansible Against Chroot Directories

    Several cluster managers, including BCM, Warewulf, and Scyld, use chroots or chroot-like methods for provisioning. This method allows the system administrator to build a golden image with the requisite packages and configurations; the cluster manager then handles final configuration during provisioning, usually via its own methods.

    Some of …

    Read more...

  • Comparison of Provisioning/Cluster Managers in HPC

    We are primarily a Base Command Manager (BCM, formerly Bright Cluster Manager) shop, and some uncertainty has recently arisen around the future of BCM after its purchase by Nvidia last year. More specifically, how pricing will be handled and whether BCM will only be available with Nvidia SuperPODs and DGX systems …

    Read more...

  • Simple Nvidia PyTorch Container

    Another quick Apptainer example for getting PyTorch up and running quickly.

    The Definition File

    Bootstrap: docker
    From: nvidia/cuda:11.7.1-runtime-ubuntu22.04
    
    %post
        apt-get -y update
        apt-get install -y python3 python3-pip
        pip3 install torch torchvision torchaudio
        pip3 install ipykernel
    

    Build the Container

    apptainer build pytorch.sif pytorch.def …
    Read more...
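
    Once an image such as pytorch.sif exists, a quick sanity check is to ask PyTorch inside the container whether it can see a GPU. The following is a minimal sketch, not taken from the original post: the helper name build_exec_command is hypothetical, and it assumes the host has NVIDIA drivers installed so that Apptainer's --nv flag can expose them to the container.

```python
import shlex
from typing import List, Sequence


def build_exec_command(image: str, command: Sequence[str], use_gpu: bool = True) -> List[str]:
    """Return an `apptainer exec` argument list that runs `command` inside `image`.

    Hypothetical helper for illustration; it only builds the command line.
    """
    argv = ["apptainer", "exec"]
    if use_gpu:
        # --nv asks Apptainer to make the host's NVIDIA driver libraries
        # and devices available inside the container.
        argv.append("--nv")
    argv.append(image)
    argv.extend(command)
    return argv


# Assumed check: import torch in the container and report CUDA availability.
check = build_exec_command(
    "pytorch.sif",
    ["python3", "-c", "import torch; print(torch.cuda.is_available())"],
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(check))
```

    On a machine with Apptainer installed, the list can be passed directly to subprocess.run(check, check=True) instead of being printed.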

  • Running A Jupyter Kernel In Apptainer

    Just a quick example of running a Jupyter kernel inside an Apptainer (formerly Singularity) container.

    First, define a container with all the required applications, in particular ipykernel.

    Bootstrap: docker
    From: python:3.11-slim-buster
    
    %runscript
        echo "Hello... I am a new Singularity container"
    
    %labels
        AUTHOR andrew@kail.io
    
    %post
        apt-get update …
    Read more...
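
    For Jupyter to launch a kernel inside a container, it needs a kernel spec (a kernel.json file) whose argv wraps the usual ipykernel launcher in an apptainer exec call. The sketch below shows one way to generate such a file; it is an illustration under stated assumptions, not the configuration from the full post. The image path /path/to/jupyter.sif, the display name, and the helper names are all hypothetical, and the demo writes to a temporary directory rather than a real Jupyter kernels directory.

```python
import json
import tempfile
from pathlib import Path


def make_kernel_spec(image: str, display_name: str) -> dict:
    """Build a Jupyter kernel spec that starts ipykernel inside an Apptainer image."""
    return {
        "argv": [
            "apptainer", "exec", image,
            "python3", "-m", "ipykernel_launcher",
            # Jupyter substitutes the real connection file path at launch time.
            "-f", "{connection_file}",
        ],
        "display_name": display_name,
        "language": "python",
    }


def write_kernel_spec(kernel_dir: Path, spec: dict) -> Path:
    """Write `spec` as kernel.json inside `kernel_dir`, creating the directory if needed."""
    kernel_dir.mkdir(parents=True, exist_ok=True)
    path = kernel_dir / "kernel.json"
    path.write_text(json.dumps(spec, indent=2) + "\n")
    return path


# Demo: write into a throwaway directory (hypothetical image path and names).
demo_dir = Path(tempfile.mkdtemp()) / "apptainer-python"
spec_path = write_kernel_spec(
    demo_dir,
    make_kernel_spec("/path/to/jupyter.sif", "Python (Apptainer)"),
)
print(spec_path.read_text())
```

    For per-user use on Linux, Jupyter typically looks for kernel directories under ~/.local/share/jupyter/kernels/, so pointing kernel_dir at a subdirectory there should make the kernel appear in JupyterLab, provided the image contains ipykernel.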

  • Integrating Anaconda with Lmod

    In a previous post I looked into performance issues with Anaconda initialization. To get around users modifying their .bashrc and breaking their environment, I have instead written an Lmod modulefile that handles setting and unsetting the variables and functions that Anaconda needs. This works for both bash and csh.

    -- Point to …
    Read more...

  • Performance Impact of Anaconda Initialization

    One of the biggest complaints we get from users is a delayed login when they first get into the cluster. In most cases, we have narrowed it down to users either installing their own version of Anaconda or using an existing install without a customized modulefile.

    The …

    Read more...

  • Block IoT Network Outbound on OPNsense

    I recently set up an IoT VLAN on my OPNsense router to move some devices onto. Blocking outbound traffic took a little work to figure out, but I finally found the rule settings to get it working.

    In my example below I'll be blocking all outbound network traffic …

    Read more...

  • The Impact of RHEL Changes on an HPC MSP

    It's been nearly a week now, and after having stewed on it I think it's time to speak my piece on this debacle.

    The recent changes to how Red Hat distributes the source code for Red Hat Enterprise Linux (RHEL) have caused quite a stir in the Linux and HPC communities.

    While I …

    Read more...

  • PXE Booting on Proxmox

    In last week's post I installed netboot.xyz on an OPNsense firewall. Now it's time to test the PXE boot process on a VM on my Proxmox host. It's very straightforward, but there are a few caveats I found with netboot.

    First, start the VM creation process in Proxmox. You …

    Read more...