Setting up the Training Lab for Remote Learning

Several years ago I started volunteering to teach an introduction to cybersecurity. At the time it was easy to work with my students, as we had a small lab of 6 computers running CentOS and VirtualBox in a small schoolroom. Each computer could load a Kali VM and a Windows XP VM. We were allowed to use the lab once a month, so every month I’d have a group of 6 people who wanted to learn cybersecurity, and sometimes another volunteer to help me. Things were going pretty well, and then COVID happened and this became unsustainable. Actually, a lot of things happened at roughly the same time, which ended this volunteer initiative.

Part of our agreement with the school was that none of us would have root privileges on the CentOS host machines. They were understandably not thrilled with the idea that we might be using “hacking techniques” on their network, so we agreed that we would not attempt to administer the CentOS machines and that we would not have permission to reconfigure the virtual interfaces on the VMs (i.e., the VMs would be cut off from the internet and we couldn’t change that). This worked fine: I had no reason to mess with the CentOS machines or the networking on the VMs aside from having them talk to each other. Everything we needed went through a dedicated admin. Unfortunately, either the school decided not to renew this admin’s contract or he quit; I’m not entirely sure which.

The next issue that came up was accreditation. In the wake of increasingly common ransomware attacks, the school district decided that all computers needed to be accounted for and accredited. I’d normally be all for this, but the school had just lost its lab admin, and the rest of the team didn’t know what was going on with the lab and didn’t want to try to figure it out.

The lab was scheduled for decommissioning and no justification from me was enough to save it; however, the final nail in the coffin came when COVID-19 put the world into lockdown. With no admin, no accreditation, and now no in-person meetings, there was suddenly no reason to keep the lab, so the school shut it down. I wasn’t happy about this, as I firmly believe that cybersecurity is one of the most important issues we face today. It’s so important that the last three Presidents have included it in their National Defense Strategies. I wanted to keep teaching this to people who wanted to learn, so I needed to figure out a way to do it remotely.

Let’s first talk about the requirements and limitations I had.

I couldn’t spend tons of extra money:
- This one is pretty simple. My family is a single-income family with two kids. I didn’t have the money to buy cutting-edge blade servers, nor did I have the money to power a blade constantly. I didn’t have an advanced edge router with a next-gen firewall. I didn’t have TBs of storage on a NAS to hold VM disks and files. I didn’t have a Windows Server key, etc. In short, I was missing a lot of tech I would have liked to have.
I had to be able to teach remotely:
- This would actually become a larger issue than I had initially expected. When I first started trying to do this remotely, my idea was to use my G Suite account and Google Meet to teach. Everyone would need to download VirtualBox and set up a basic Kali VM. I made some videos on how to set up a VM and thought it would work. It didn’t. People would email me saying that they couldn’t figure out how to make the VM, that it wouldn’t start, that their computer wasn’t powerful enough for it, or (most commonly) that they didn’t have an actual computer because they did everything on their phone or iPad. I needed a better solution; more on that later.
Whatever I did needed to be safe:
- I am teaching cybersecurity after all, so the lab couldn’t put my home network, my students, or anyone else on the internet at risk.

With my limitations firmly in mind, I had to take a look at what I had available to me.

- I had a decent PC with 32GB of DDR4 RAM and 3TB of HDD storage.
- I had a Google Wi-Fi mesh network.
- I had G Suite.
- I had a website.
- I had a handful of Raspberry Pis (4 to be exact).

So, using that, I needed to build a safe and secure training environment. My initial thought was to use one of the RPIs as a VPN concentrator and have students SSH into two of the others for practice. I quickly dismissed that idea for a number of reasons.

- Students would need an SSH client.
- Students would need OpenVPN.
- Resetting the RPIs after each class would be a pain in the neck since I couldn’t snapshot them.
- The RPIs would have access to the internet.

I figured that since the students had difficulty setting up a VM, I wouldn’t be able to convince them to set up OpenVPN and install an SSH client on their iPad. In addition to that, while reflashing an RPI’s SD card isn’t hard, it is time-consuming to set everything up again. Finally, I was concerned that students would have unfettered access to the internet through my ISP while in class. While I hoped nothing bad would happen, you never know what people will download, and the fact that it would be happening on my network made me very uncomfortable. This idea was out of the question.

The next idea I had was to run some VMs off my desktop and give the students some way to access them remotely. This would work in theory, as VirtualBox supports port forwarding RDP and SSH into VMs. It would also give me the ability to reset the VMs after class, and I could cut off their internet access. This solved two of my issues but still left the problem of requiring students to figure out OpenVPN and SSH. Nevertheless, it solved enough of my problems that I decided to figure out how to make it work.
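As a rough sketch of what that looks like in practice, the snippet below builds the `VBoxManage` commands for forwarding a host port into a guest and for rolling a VM back to a snapshot after class. The VM name `kali-student1`, the snapshot name `clean`, the rule name, and the port numbers are all made up for illustration, and the forwarding rule assumes the VM’s first adapter is in NAT mode and that the VM is powered off when the rule is added. It only prints the commands; it doesn’t run them.

```python
import shlex


def port_forward_cmd(vm, rule_name, host_port, guest_port):
    """Build a VBoxManage command that forwards a host TCP port to a guest port.

    Assumes adapter 1 of the VM is attached to NAT and the VM is powered off.
    """
    # Rule format: name,protocol,host-ip,host-port,guest-ip,guest-port
    spec = f"{rule_name},tcp,,{host_port},,{guest_port}"
    return ["VBoxManage", "modifyvm", vm, "--natpf1", spec]


def reset_cmds(vm, snapshot):
    """Build the commands to power off a VM, restore a snapshot, and restart it."""
    return [
        ["VBoxManage", "controlvm", vm, "poweroff"],
        ["VBoxManage", "snapshot", vm, "restore", snapshot],
        ["VBoxManage", "startvm", vm, "--type", "headless"],
    ]


if __name__ == "__main__":
    # Hypothetical VM and snapshot names, for illustration only.
    commands = [port_forward_cmd("kali-student1", "ssh", 2222, 22)]
    commands += reset_cmds("kali-student1", "clean")
    for cmd in commands:
        print(shlex.join(cmd))
```

Building the commands as argument lists keeps them easy to review before handing them to something like `subprocess.run`.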

What I needed now was a way for students to access the VMs that was transparent to them. I started doing some research and came across Apache Guacamole, which allows SSH and RDP through a web browser. This sounded like exactly what I needed, and it had a Docker image that supported ARM.

I set up Docker on my RPI4 and added the Guacamole container on top of it. From there it was a fairly simple matter of setting up user accounts and credentials to allow the students to SSH and RDP into the VMs. This solved two of my three limitations, but it still wasn’t inherently safe. I would be running all of these VMs off my main desktop on the same network as my house. The Guacamole server would be publicly exposed to the internet, and since Google Wi-Fi is junk as a firewall, there was no way to properly secure remote access at the router. I needed a different way of controlling access. Enter Cloudflare Access.

Cloudflare Access is a product I’ve written about before when I talked about securing my Pi-Hole admin interface. It’s a fantastic product and I can’t recommend it highly enough. I set up an access rule for my Guacamole server, but I still needed to restrict access on the host to only allow Cloudflare IPs. This is also something I went through with Pi-Hole and it’s a major pain; however, I was committed to doing this. A short time later, this was set up.
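The idea behind that host restriction can be sketched in a few lines: keep a list of Cloudflare’s address ranges and refuse anything that doesn’t come from one of them. This is not the actual firewall configuration I used, just an illustration of the check. The ranges below are a small sample of what Cloudflare publishes and should be treated as an example only; the current list needs to come from Cloudflare itself, and the function names are my own.

```python
import ipaddress

# Illustrative subset of Cloudflare's published IPv4 ranges.
# Assumption: always fetch the current list from Cloudflare before relying on it.
CLOUDFLARE_RANGES = [
    "173.245.48.0/20",
    "103.21.244.0/22",
    "104.16.0.0/13",
    "172.64.0.0/13",
]


def build_allowlist(cidrs):
    """Parse CIDR strings into network objects once, up front."""
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_allowed(source_ip, allowlist):
    """Return True if source_ip falls inside any allowed network."""
    addr = ipaddress.ip_address(source_ip)
    return any(addr in net for net in allowlist)


if __name__ == "__main__":
    allowlist = build_allowlist(CLOUDFLARE_RANGES)
    for ip in ["104.16.1.1", "192.168.1.10"]:
        verdict = "allow" if is_allowed(ip, allowlist) else "drop"
        print(f"{ip}: {verdict}")
```

In a real setup the same list would be turned into firewall rules on the host or the router rather than checked in application code.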

When it was all set up the end result looked logically like this.

I taught a few lessons with this setup and was pleasantly surprised at how well it worked.

The biggest issue I ran into was the limitations of the RPI4 when it came to video encoding. Simply put, the RPI is not good at it, so RDP and VNC connections worked but were certainly not ideal. Colors were oversaturated, animations were blocky, and everything was just slow to load and render on screen. The second issue I ran into was a soft connection limit. I couldn’t find anything in the Guacamole documentation that said there was a connection limit, so I have to conclude that this was a hardware limit on the RPI4. I could open 6 RDP connections, but as soon as a 7th was opened everyone’s session would crash. Then, when everyone tried to reconnect, it would trigger fail2ban and ban the Cloudflare IP for 10 minutes. This happened a few times, as people would accidentally back out of the page, reload it, and attempt to make a new RDP session instead of loading the preexisting one.
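The reason one crash locked out the whole class is that, from the host’s point of view, every student arrived from the same Cloudflare address. The toy model below shows the effect. It is not fail2ban’s actual code: the `BanTracker` class is my own invention, the retry limit of 5 and the 10-minute counting window are assumptions, and it assumes each reconnect attempt was logged as a failure. Only the 10-minute ban comes from what I saw. The IP addresses are from the reserved documentation ranges.

```python
from collections import defaultdict, deque


class BanTracker:
    """Toy sliding-window ban logic keyed by source IP (not fail2ban itself)."""

    def __init__(self, max_retry=5, find_time=600, ban_time=600):
        self.max_retry = max_retry  # assumed failure limit
        self.find_time = find_time  # assumed counting window, in seconds
        self.ban_time = ban_time    # 10-minute ban, in seconds
        self.failures = defaultdict(deque)
        self.banned_until = {}

    def record_failure(self, ip, now):
        """Log one failure for ip at time now and ban it if the limit is hit."""
        window = self.failures[ip]
        window.append(now)
        # Forget failures that have aged out of the counting window.
        while window and now - window[0] > self.find_time:
            window.popleft()
        if len(window) >= self.max_retry:
            self.banned_until[ip] = now + self.ban_time
            window.clear()

    def is_banned(self, ip, now):
        return self.banned_until.get(ip, 0) > now


if __name__ == "__main__":
    proxy_ip = "198.51.100.7"  # stand-in for a single Cloudflare address

    shared = BanTracker()
    for second in range(6):  # six students, one failed reconnect each
        shared.record_failure(proxy_ip, now=second)
    print("behind one proxy IP, banned:", shared.is_banned(proxy_ip, now=10))

    direct = BanTracker()
    for student in range(6):  # same six failures, but from six addresses
        direct.record_failure(f"203.0.113.{student}", now=student)
    print("separate IPs, anyone banned:",
          any(direct.is_banned(f"203.0.113.{s}", now=10) for s in range(6)))
```

Six people failing once each looks, to a per-IP counter, exactly like one person failing six times.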

I continued to use this setup while I saved money for the better equipment I wanted. The first bit of equipment I bought was a pfSense router to act as an edge router, and the first thing I set up was a firewall rule to only allow Cloudflare IPs. I would later add in VLANs to separate the VMs from the rest of the network, but more on that later.

Some time later I had saved up enough to buy a refurbished HP Z440 workstation to use as a server. This machine had 128GB of RAM, 16TB of storage, and a Xeon processor. With it, I had more than enough RAM to give every student a Windows 7, Windows XP, and Kali VM at the same time. I loaded the Z440 with Proxmox and used its built-in noVNC viewer to access the VMs instead of Guacamole. I also went on to replace my Nginx reverse proxy with HAProxy running on pfSense.

I know that this sounds like I’m breaking my first limitation, but I’m not. These are things that had been on my wishlist for around 6 years, but I’d always had more pressing matters to spend money on. I was finally able to buy them. I have about 8 virtual machines always running off that server to provide things like a mail server, SDN controller, NextCloud, WordPress, and a SIEM. I then spin up malware research workstations, threat research workstations, my training lab, and others as necessary. The Z440 helps me learn more about cybersecurity. The router was also on the wishlist for years. I didn’t necessarily have pfSense in mind 6 years ago, but I knew I wanted an advanced router that I could use to separate IoT devices.

Since COVID-19 shut down the world I’ve used this setup to teach 19 lessons to over 300 people (not 300 unique people as sometimes people will attend more than one lesson, but I’m counting heads). I’ve spent around 100 hours teaching cybersecurity out of my basement to people who want to learn. I started down this path because I wanted to help people and in the process, I taught myself a ton about building a training lab.


