Setting up a Malware Analysis Lab

A few days ago, I published an updated look at my home network and made a post on Reddit about it. One of the frequent questions I get asked whenever I put out a new network map is “What does your Malware network look like?” I’ll usually take some time to explain how I do things, but I’ve never actually broken out the entire thing. Today I’m changing that; I’m going to walk through my malware network and give some generalized advice for setting up a malware analysis environment.

Getting started

Picking a Hypervisor

Before we begin, the most important thing is to make sure you’re being safe. Working with malware is dangerous, and we’re long past the age where malware just slowed down your PC for the lulz. Malware today is far more dangerous, with ransomware able to destroy files across an entire network. Your choice of hypervisor is a big part of staying safe, so we’re going to talk a bit about hypervisors first.

There are generally two types of hypervisors and they both have pros and cons. I’ll start with Type-2 hypervisors because they’re what most people are probably familiar with.

Type-2 Hypervisors

With a Type-2 hypervisor, the host computer runs a full desktop operating system and the hypervisor runs on top of it as an application, carving out a share of the host’s resources for guest computers. An example of this is running VirtualBox on your Windows 10 computer. VirtualBox then creates a guest Linux computer that runs as a process in Windows 10.

Common Type-2 hypervisors include:

VirtualBox
VMware Workstation Player

Both of these are good options when getting started, but I’d consider VirtualBox the better option for brand new networks. VMware Workstation is the better of the two hypervisors, but the cost of a license can be prohibitive for most people, especially if you’re not being paid for this work.

Advantages of Type-2 Hypervisors

The advantage of Type-2 Hypervisors is that they can split almost any environment into multiple VMs. Put simply, Type-2 hypervisors allow you to create and run virtual machines without needing dedicated hardware to run them. Type-2 hypervisors aren’t my preferred way of doing things, but they are still useful when you are somewhere without internet.

Disadvantages of Type-2 Hypervisors

Running the hypervisor on top of a full operating system means that the VMs will have limited virtual hardware to work with. Your computer may have 16GB of RAM, but since you’re running the hypervisor on top of Windows 10 you will be limited to about 12GB of RAM that you can provide to VMs (since Windows needs around 4GB for itself). Furthermore, if the processor on your host is older, the VMs may not run very efficiently.
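The RAM arithmetic above can be sketched in a few lines. This is only a rough illustration: the 4GB host reservation is the figure from the paragraph above, while the function name and the 4GB-per-guest size are made up for the example.

```python
def max_guests(total_gb: int, host_reserved_gb: int, per_guest_gb: int) -> int:
    """Return how many guests of a given size fit in the RAM left over for VMs."""
    available = total_gb - host_reserved_gb
    if available <= 0 or per_guest_gb <= 0:
        return 0
    return available // per_guest_gb


# 16GB host, 4GB kept back for Windows, 4GB per analysis VM (assumed size)
print(max_guests(16, 4, 4))  # prints 3
```

In practice you would leave a little extra headroom, since the hypervisor application itself also uses some memory.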

The real reason I don’t use Type-2 Hypervisors is the guest agents. To take full advantage of VirtualBox or VMware Workstation you will need to install their guest additions software. This software enables things like copy and paste between host and VM, shared folders, and higher resolutions. The problem is that a lot of malware is programmed specifically to look for these add-ons as part of its defense evasion techniques.
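To get a feel for what a sample might see, you can audit your own analysis VM for the same giveaways. The sketch below is a minimal illustration, not how any particular malware family works: it takes a list of running process names (for example, copied from Task Manager) and flags a few names that I believe are commonly associated with guest tools. The list is an assumption and far from complete, and the function name is made up for this example.

```python
# Process names commonly associated with guest tools. This list is
# illustrative and far from complete.
GUEST_TOOL_PROCESSES = {
    "vboxservice.exe": "VirtualBox Guest Additions",
    "vboxtray.exe": "VirtualBox Guest Additions",
    "vmtoolsd.exe": "VMware Tools",
    "qemu-ga.exe": "QEMU guest agent",
}


def find_guest_tools(process_names):
    """Return {process name: product} for every known guest-tool process seen."""
    hits = {}
    for name in process_names:
        key = name.strip().lower()
        if key in GUEST_TOOL_PROCESSES:
            hits[key] = GUEST_TOOL_PROCESSES[key]
    return hits


# Hypothetical process list from an analysis VM
running = ["explorer.exe", "VBoxService.exe", "winword.exe"]
for process, product in find_guest_tools(running).items():
    print(f"{process}: looks like {product}")
```

Real samples also check registry keys, drivers, and hardware identifiers, so an empty result here does not mean the VM is undetectable.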

Type-1 Hypervisors

Type-1 Hypervisors, sometimes referred to as “bare metal hypervisors”, are systems whose sole purpose is creating VMs. Type-1 Hypervisors usually have a very minimal interface, if any, on the physical machine they’re installed on, and you instead access them remotely, usually through some form of web interface.

Some examples of Type-1 Hypervisors include:

KVM
ESXi
XenServer

I notably didn’t mention Hyper-V, which is a Type-1 Hypervisor, because it’s different. Hyper-V running on Windows 10 Pro technically is a Type-1 Hypervisor because the hypervisor lies below the OS (which is now actually a VM); however, the way you typically work with these VMs is by RDPing from your Windows 10 to the VM. These are technically two separate VMs running parallel to each other instead of one on top of the host, but since you still must load a Windows 10 VM to RDP over to the guest, you aren’t getting maximum efficiency on the guest VM. Hyper-V is fantastic though, so I still highly recommend it.

Advantages of Type-1 Hypervisors

Type-1 Hypervisors are safer when working with malware since the VMs run on a machine that is physically separate from your main computer. A VM escape does not result in your main computer being compromised. Type-1 Hypervisors are usually also Linux based, so a VM escape is much less likely to cause serious issues, as most malware is written for Windows and won’t work on Linux without something like Wine.

Another major advantage of Type-1 Hypervisors when analyzing malware is that the QEMU guest agent is not as common a target. Malware does not frequently check for a QEMU agent installation.

Disadvantages of Type-1 Hypervisors

Except for Hyper-V, Type-1 Hypervisors need external dedicated hardware. This can be very cost prohibitive for people not doing this professionally. The server I’m working with cost $800 when I bought it, but thanks to cryptomining it’s now selling for over $1000.

My Network:

For my network I’m using a Type-1 Hypervisor with Proxmox (KVM) to provide the Virtual machines for analysis. I’ll start at the top.

I’ve spoken about my love of pfSense before and I’m about to again. This network would still technically be possible without my main pfSense router, but it would be significantly more difficult. This device is a physical pfSense that I have set up at the edge of my network. I’m not going to get into everything that this pfSense does for me in this post, but it’s a significant amount. What’s important to understand here is that pfSense passes two VLANs to my VM farm: VLAN 1 (the native VLAN) and VLAN 3 for my analysis lab. This router handles the VLANs and ensures that VLAN 3 is not able to talk to VLAN 1, while VLAN 1 can still reach VLAN 3.
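The one-way policy described above can be modeled in a few lines. This is a toy sketch of the intent, not pfSense configuration: the function and variable names are made up for illustration, and the real rules live in the router’s firewall.

```python
# One-way policy: (source VLAN, destination VLAN) pairs that may start a connection.
ALLOWED_FLOWS = {
    (1, 3),  # home network -> analysis lab
}


def may_initiate(src_vlan: int, dst_vlan: int) -> bool:
    """Return True if src_vlan may open a new connection to dst_vlan."""
    if src_vlan == dst_vlan:
        return True  # same-VLAN traffic is not filtered in this simplified model
    return (src_vlan, dst_vlan) in ALLOWED_FLOWS


print(may_initiate(1, 3))  # home network reaching the lab
print(may_initiate(3, 1))  # lab trying to reach the home network
```

Because pfSense is a stateful firewall, replies to a connection opened from VLAN 1 still flow back from VLAN 3. What the policy blocks is VLAN 3 starting a new connection into VLAN 1.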

Up next is my transfer network:

Transfer Network

This VM sits on the VLAN 3 subnet, separate from the actual analysis machines, which are isolated behind another pfSense (more on that later). The purpose of this transfer network is simply to go and collect the malware for later analysis.

For this job I use the Tails OS in a VM to retrieve malware. I use Tails for three main reasons.

First: Tails is a Linux based operating system. This inherently gives an advantage when dealing with malware. While there is malware that can affect Linux, the vast majority of malware is built for Windows. This means that if malware does auto execute on download, or is embedded in a page, etc., it’s very unlikely that it will impact this machine.

Second: Tails is built with Tor set up by default. When I’m researching malware I don’t generally want to be attributed. While I’m not nearly important enough for someone to care about me downloading their malware, I still don’t want my public IP hitting their servers.

Third: Tails auto-resets itself when shut down. This is a minor benefit for me, as I don’t have to spend time resetting the VM after every grab.

One important thing to understand with this VM is that it does have direct unrestricted access to the internet. I have a lot of security appliances set up on my network but this VM is intentionally permitted to bypass all of them. Because of this it was important to me to make sure that no analysis happens here and that there is no way for the malware to use this VM to reach back out to the internet.

Once I download the malware on the Tails machine, I use SCP to transfer the malware past the second pfSense and over to one of my analysis machines. Once the transfer completes successfully, I shut down Tails completely. It would be unlikely for malware to go backwards (especially since the lower pfSense blocks all outbound connections), but I want to be careful.
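The transfer step can be sketched as follows. This is a minimal illustration rather than my exact workflow: it hashes the sample first so you can confirm the same file arrived on the other side, then builds an scp command without running it. The user name, address, and destination folder are placeholders, and both function names are made up for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash the file in chunks so large samples do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_scp_command(sample: Path, user: str, host: str, dest_dir: str, port: int = 22):
    """Build (but do not run) the scp argument list for one sample."""
    return ["scp", "-P", str(port), str(sample), f"{user}@{host}:{dest_dir}"]


# Placeholder values: 192.0.2.10 is a documentation address, not a real lab host.
command = build_scp_command(Path("sample.bin"), "analyst", "192.0.2.10", "/samples/")
print(" ".join(command))
```

To actually send the file you could pass the list to subprocess.run(command, check=True). Recording the hash before the transfer also gives you something to search for on sites that index known samples.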

VM Router

Up next is my secondary pfSense. This device is virtualized by the Proxmox server and is used exclusively for the malware network. Before you ask, yes, this pfSense is technically redundant. Everything I use this router for could be done in the main router; however, using a dedicated pfSense is beneficial from an ease of management and analysis standpoint.

Using a dedicated router means that I can easily differentiate the malware’s traffic as it won’t be mixing with real world traffic. It also means alerts generated in the IDS, DNS blocker, and firewall logs are all going to be the result of the malware.

This also benefits me by making management easier. If I screw up a routing rule or firewall rule, it won’t result in a complete network crash.

I have this router set to block all outbound connections from the analysis network. The only time the VMs have a connection to the outside world is during maintenance updates on the analysis machines. I’ll disable the firewall rule, run my apt updates and cup all (Chocolatey), and then re-enable the firewall rule. While the rule is disabled, all of the traffic is automatically routed out of the network through a Private Internet Access VPN. Again, I’m not actually worried about my public IP getting logged, but I’d still rather be safe. Using a VPN also bypasses my edge router’s IPS and DNS based filtering (for this network only).
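The risk with a manual maintenance window is forgetting to turn the block rule back on. If you ever script this, one pattern that helps is a context manager that always restores the rule, even when an update fails. The sketch below is only a pattern: disable_block_rule and enable_block_rule are hypothetical callables standing in for however you toggle the rule (I do it by hand in the pfSense web interface).

```python
from contextlib import contextmanager


@contextmanager
def maintenance_window(disable_block_rule, enable_block_rule):
    """Open outbound access for updates and always close it again afterwards."""
    disable_block_rule()
    try:
        yield
    finally:
        # Runs even if an update command raises, so the lab is never left open.
        enable_block_rule()


# Hypothetical stand-ins that only print what they would do
with maintenance_window(lambda: print("block rule disabled"),
                        lambda: print("block rule re-enabled")):
    print("running updates")
```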

To access my VMs I need to set up two port forwarding rules.

The first rule forwards port 22 (SSH). This allows me to SCP malware files from my transfer machine to a receiving machine. The second rule forwards port 3389 (RDP) to my FlareVM. The FlareVM has most of the tools I need, so it’s where I spend the bulk of my time. Anything I don’t RDP into, I connect to through noVNC on Proxmox.
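Since only two ports should be reachable from outside the analysis network, it is worth checking now and then that nothing else has crept in. Here is a minimal sketch of such a check using plain TCP connection attempts. The function names, the candidate port list, and the address are assumptions for illustration, and the probe is passed in as a parameter so it can be swapped out.

```python
import socket

# Ports the lab router is expected to forward: SSH for SCP, RDP for FlareVM.
EXPECTED_PORTS = {22, 3389}


def tcp_probe(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def unexpected_open_ports(host, candidates, probe=tcp_probe):
    """Return the open ports among candidates that are not in EXPECTED_PORTS."""
    return sorted(p for p in candidates if p not in EXPECTED_PORTS and probe(host, p))


# Example with a fake probe so nothing is actually scanned here
fake_probe = lambda host, port: port in {22, 80, 3389}
print(unexpected_open_ports("192.0.2.1", [22, 80, 443, 3389], probe=fake_probe))
```

Run it with the default probe from the transfer side of the router, against the router’s WAN address, to test the real forwarding rules.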

On the router itself I have Suricata installed with the default Snort and Emerging Threats rule sets. Suricata is configured with the maximum detection profile set and IPS disabled. I have IPS disabled specifically because I want malware to be able to reach the wider internet at certain points. I also have pfBlockerNG-devel installed to log my DNS queries, and NTOP to visualize all of my network connections and provide easy PCAP. Finally, I have Zeek installed as well to act as another IDS focused on anomaly detection.
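Because every alert on this router comes from the malware, even a simple tally of signatures is a useful first look at a sample. The sketch below assumes Suricata’s EVE JSON output is enabled and that you have copied the log off the router; it reads one JSON object per line and counts alert signatures. The function name and the two sample lines are made up for illustration.

```python
import json
from collections import Counter


def count_alert_signatures(eve_lines):
    """Count alert signatures in an iterable of EVE JSON lines."""
    counts = Counter()
    for line in eve_lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partially written lines
        if event.get("event_type") == "alert":
            counts[event.get("alert", {}).get("signature", "unknown")] += 1
    return counts


# Made-up sample lines in the general shape of EVE records
sample = [
    '{"event_type": "alert", "alert": {"signature": "Example signature A"}}',
    '{"event_type": "dns", "dns": {"rrname": "example.com"}}',
]
for signature, hits in count_alert_signatures(sample).most_common():
    print(hits, signature)
```

With a real log you would pass an open file handle instead of the sample list, since a file object yields one line at a time.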

Analysis net

With the router configured to meet my specifications, it’s time to turn towards the actual analysis.

The VMs on my analysis network are relatively basic. Smarter people than me created these VMs and loaded them with tools. Flare is my main analysis VM, and the only tool I’ve added so far is Autopsy so I can work with E01 files (HDD images). All of these VMs are free, so I’m not going to spend that much time on them.

In addition to the standard VMs I also have a copy of Windows XP, Windows 7, and Windows 10. I’ve gone to some lengths to make these VMs look as realistic as possible, so for a few months I did actual work on them. Some malware looks for recently opened Office documents in order to tell if it’s in a VM. In addition to the standard programs I would use on a real network, I keep an ISO file loaded up in a read only state with a few tools including Regshot, Process Hacker, and FTK Imager. I don’t leave these files and programs loaded on the actual VM in case malware checks for them.

Conclusion

Malware analysis is a dangerous hobby. Before you begin, you really need to make sure you have a safe environment to conduct your analysis in. This article has outlined the way my network is set up in order to both facilitate easier analysis and provide security to my home network.


admin
