OPSEC for Blue Teams - Sandboxes & Secure Communications

This will be the last blog in this series on OPSEC for Blue Teams. I will share some of my thoughts on sandboxes, secure communications and sharing of info & data, when dealing with a targeted attack.

Sandboxes

Performing dynamic analysis by running a malware sample in a sandbox can provide you with valuable information. Again, keep in mind that certain OPSEC rules are less strict for non-targeted attacks, such as using sandboxes with internet connectivity.

VirusTotal

VirusTotal does much more than scanning your malware sample with multiple AV solutions. Behavioral information coming from sandboxes is added as well. However, do not upload malware samples to VirusTotal and similar services. An adversary knows the hash(es) of their malware and can easily find out if a sample has been uploaded. This kind of monitoring can often be automated using the APIs offered by these services or by creating a wrapper around the web GUI.
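To illustrate how little effort such monitoring takes, here is a minimal sketch of how an adversary could poll VirusTotal for a known sample hash (the v3 files endpoint returns 404 as long as nobody has uploaded the sample; the API key and hash are placeholders):

# Hypothetical sketch: periodically check whether a sample hash has appeared on VirusTotal.
import time
import requests

API_KEY = "<api key>"               # placeholder
SAMPLE_HASH = "<sha256 of sample>"  # placeholder

def sample_is_known():
    resp = requests.get(
        "https://www.virustotal.com/api/v3/files/" + SAMPLE_HASH,
        headers={"x-apikey": API_KEY},
    )
    return resp.status_code == 200  # 404 means the sample is still unknown

while True:
    if sample_is_known():
        print("Sample uploaded - someone is analyzing it.")
        break
    time.sleep(3600)  # check once per hour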

Simulated Internet

Having internet connectivity on your sandbox can be a risk; more on this later. A good option is to have simulated internet. For example, INetSim and FakeNet-NG can provide this when integrated with your sandbox setup (there are also third-party sandbox services that have these solutions integrated). INetSim and FakeNet-NG can simulate various internet services such as DNS, HTTP/S, IRC and many more. No traffic actually goes to the internet; requests are answered by these fake internet services, which also allow you to configure how to respond. Among other things, simulated internet can show you the first request the malware sample will send, such as the C2 domain or the URL from which the second-stage payload is downloaded. Subsequent requests may fail if the malware gets confused by an unexpected response.
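To give an idea, a few of the relevant options in INetSim's inetsim.conf look roughly like this (option names taken from INetSim's default configuration; verify them against the version you run, and the IP is just an example):

# Enable the fake services the malware is most likely to use
start_service dns
start_service http
start_service https

# Bind the fake services to the sandbox-facing interface
service_bind_address 10.10.10.1

# Let every DNS query resolve to the host running the fake services
dns_default_ip 10.10.10.1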

Internet connectivity

Be very careful with sandboxes that have an internet connection. This includes your own sandbox and third-party solutions/services (some allow you to disable internet access). I am aware that not having internet on a sandbox takes away much of its strength and functionality. But a sandbox with internet access can send out signals that can be picked up by the adversary. Some of these signals overlap with the ones already discussed for OSINT. But also think of a hostname that does not correspond with your company's naming convention while you are dealing with malware that sends the sandbox's hostname back to the C2 server.

When you really do need the malware to connect to the internet, create your own sandbox (or be very sure you are not dealing with a targeted attack). Having a sandbox over which you have complete control is always a good idea. The main point here is to have a sandbox that shares as many characteristics as possible with a real endpoint in your company. Make it look as normal as possible. For example, think about the following:

  • Use real hardware instead of a virtual machine.

  • Run the same operating system and version.

  • Keep the system up to date.

  • Make it look like the sandbox is actually used by having traces of user activity, but do not have any personal/production data on it.

  • The hostname of the sandbox should correspond with the company's naming convention.

  • The same applies for the username that is logged on to the sandbox.

  • When connecting to the internet, make sure its external IP looks normal.

  • Very important: make it technically impossible for your sandbox to communicate with your production IT infrastructure.

  • Do not connect to your sandbox from a production system, use different hardware for this (i.e. a system not connected in any way to your corporate network).

Secure communications and sharing of info & data

OPSEC on how to share information within the blue team and with stakeholders (board of directors, security officers, etc.) becomes extra important when dealing with a targeted attack. Imagine the adversary has access to all of the blue team's information on your current investigation: you have then lost your advantage as a defender.

If you are convinced a security incident is a targeted attack on your company, the attack has been going on for a while and you have not detected the adversary very early in the kill chain, before they had any time to move further up the kill chain (on the latter you should be very certain), then you should take extra precautionary measures on how you share information concerning the incident within the blue team and with stakeholders. Assume that things like Active Directory and mail servers are compromised and that the adversary is reading your communications.

Think about the following:

  • Having systems for performing parts of your investigation, communications and sharing of information that are not connected in any way to your corporate network. Obviously, you are still bound to use many of the tools you have within the blue team, running on the company's IT infrastructure, to be able to perform the investigation. For example, using Splunk to perform large parts of the analysis and an EDR solution to get additional information from systems.

  • Secure communication channels that are completely separate from your company's IT infrastructure. Think about instant messaging (e.g. Signal) and a secure platform to share documents, notes and data.

A topic I would like to explore further: what is the best way, taking feasibility into account, to make sure that all blue team systems and tools (client endpoints, data lake, security monitoring solutions, etc.) can be trusted, up to a certain level, within a compromised IT environment.

OPSEC for Blue Teams - Testing PassiveTotal & VirusTotal

This second blog in the series on OPSEC for Blue Teams is about testing tools used to get context and/or OSINT on domains and IPs. These tests also produced results that can be interesting for Red Teams.

Tool testing - PassiveTotal & VirusTotal

Remember: we want a tool that does not send any signals that can be picked up by an adversary. If we must send signals, they have to be something the adversary expects to see.

I often use PassiveTotal for getting context and some OSINT. I wondered how passive it actually is. I started by asking PassiveTotal about this, and received the following response:

In order to perform "true" passive request you would need to adjust some account settings, which is under "Sources" in your Account Settings:
1) Disable third-party pDNS sources (VirusTotal, Kaspersky) - both do an active lookup for a domain that does not appear in their repository
2) Disable the Pingly source as this is RiskIQ / PassiveTotal Active resolver

I followed up by performing some tests, for which I first created a test infrastructure:

  • I registered a new domain.

  • Created two VPS instances, each with a unique public IP address:

    • Installed the Apache webserver and let it listen on port 80.

    • Configured them as authoritative nameservers for my new domain, using Bind in a master/slave setup.

    • Enabled logging for Bind and added the HTTP Host header to the Apache logs.

  • I pointed the NS records for my new domain to my own nameservers.

  • I pointed the PTR records for both VPS instances to unique subdomains that are also resolved by my own nameservers (rev1.example.com and rev2.example.com).

  • For testing IP addresses I created a new VPS. I did this to be sure of reliable results, i.e. an IP that had no recent relation to any of the subdomain names from previous tests.

For every test performed I created a unique subdomain and pointed its A record to one of my VPS instances. Then I used this subdomain within PassiveTotal and monitored the Bind and Apache logs for any activity related to the subdomain.
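As an illustration of the monitoring step, a minimal sketch of watching the Bind query log for the test subdomain could look like this (the log path, and Bind query logging being enabled, are assumptions; adjust to your setup):

# Hypothetical sketch: tail the Bind query log and flag queries for the test subdomain.
import time

QUERY_LOG = "/var/log/named/query.log"  # assumed location of the Bind query log
TEST_SUBDOMAIN = "test42.example.com"   # the unique subdomain created for this test

with open(QUERY_LOG, "r") as log:
    log.seek(0, 2)  # jump to the end of the file, only watch new entries
    while True:
        line = log.readline()
        if not line:
            time.sleep(1)
            continue
        if TEST_SUBDOMAIN in line.lower():
            print("Query observed for test subdomain:", line.strip())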

I was very pleased to see no activity at all when I disabled VirusTotal, Kaspersky and Pingly. When enabling all of them I immediately saw activity within the Bind logs (for multiple types of DNS records) and none for Apache. Further testing showed that enabling VirusTotal did not result in any activity within my logs. I asked a follow-up question about this to PassiveTotal, to which they provided the following answer:

While they do not do an on demand active lookup - if they have a miss - i.e. they don't have data on the domain or IP in question, they will trigger a lookup to close their collection gap, so it could take 24hrs to propagate. This may have changed over the years though.

In my logs I did not see any DNS queries for the domains used in my tests (for which the VirusTotal source was enabled in PassiveTotal), not even after 24 hours or longer. This tells me the behavior of VirusTotal has changed. I decided to perform more testing on VirusTotal itself:

  • Of course, doing a scan on a domain or IP shows activity within the Bind logs (queries for the DNS A and AAAA records). Within the Apache logs it was pretty easy to attribute this to VirusTotal because "virustotalcloud" is included in the User-Agent.

  • Doing a search with the web GUI of VirusTotal showed activity within the Bind logs (only a query for the DNS A record).

  • When executing a domain report using the API, I saw no activity at all. The same holds for the IP address report. As expected, I saw no DNS queries for the subdomain by VirusTotal after 24 hours or longer.

Based on the test results I am pretty sure PassiveTotal uses the domain/IP report API calls for getting info on domains and IPs. I also noticed that when I first scan a domain using VirusTotal, new data shows up in PassiveTotal with VirusTotal as the source.
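For reference, a minimal sketch of requesting such a domain report yourself (shown here with the current v3 domains endpoint; which endpoint PassiveTotal uses internally is an assumption on my part):

# Hypothetical sketch: request a VirusTotal domain report, which only reads from
# VirusTotal's existing dataset.
import requests

API_KEY = "<api key>"       # placeholder
DOMAIN = "sub.example.com"  # placeholder domain

resp = requests.get(
    "https://www.virustotal.com/api/v3/domains/" + DOMAIN,
    headers={"x-apikey": API_KEY},
)
print(resp.json())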

Monitoring by an adversary

It would be interesting to perform more research on monitoring signals from the adversary's perspective. For example: which DNS resolution behavior do you see for domains that are not shared with any security vendor, like the ones I deliberately kept out of vendors' hands during my testing.

I observed a great number of DNS queries for the PTR records, coming from a very large group of different IPs and parties. It may therefore be less useful for an adversary to monitor these. Or could this be turned into useful monitoring by focusing on IPs from security vendors? For now I did not look into this.

Another thing I noticed is a great number of DNS queries for subdomains that do not exist, again from a large number of IPs and parties. It can therefore be of great help to an adversary to use a subdomain that looks legitimate but has a very low chance of being guessed. It would also be interesting to research whether there are good ways to filter out the noise (such as the domain name scanners that come along).

OPSEC for Blue Teams - Losing Defender's Advantage

This is a three-part blog about OPSEC for Blue Teams. This first part expresses some of my ideas about the risk of alerting the adversary and OPSEC for getting OSINT and context on domains and IPs. The second part is about testing tools (I performed tests on PassiveTotal and VirusTotal) which provide context and/or OSINT in relation to OPSEC. The last part will be on sandboxes, secure communications and sharing of info & data when dealing with a targeted attack.

When talking about adversaries in this series, I mean the ones that are targeting your company. So I am not discussing a threat actor executing a malware or phishing campaign against a large and diverse group of victims. You can be less strict in following certain OPSEC rules when you know you are dealing with a non-targeted attack. Still, following secure practices in both cases will make sure your default behavior is in line with good OPSEC.

The risk of alerting the adversary

You have an advantage as a defender when you have found a trace of a targeted attack. This is called the defender's advantage or the intruder's dilemma: "The attacker needs to hide all his traces, but the defender needs to find just one trace to unravel the intrusion." Use this advantage to get a complete picture of the attack and then take all necessary actions at once to flush out the adversary.

Among other things, OSINT and sandboxes make it possible to perform actions that produce signals telling the adversary the blue team is onto them. Obviously, that is not a good thing and you can lose the defender's advantage. Based on the type (one signal is stronger than the other) and/or number of signals the adversary picks up, they will take action to make sure they do not lose access to your IT infrastructure: changing the malware installed on endpoints to prevent detection, replacing the current C2 domain and IP with new ones, spreading to more systems, or deliberately going silent for a while or changing tactics to prevent being detected a second time. They may have made a mistake from which they have learned and will not make a second time.

OPSEC for getting context and OSINT

There are many cases in which you need more info about a domain name or IP address. Many tools and third-party services are available that can help you get context (whois data, passive DNS, autonomous system number, etc.) and/or OSINT. Make sure you know what your tools are doing in the background. For domains and IPs you do not want them to perform any of the following actions by default:

  • Setting up a connection to the domain or IP to perform an analysis.

  • Active DNS resolution on the domain.

These activities leave signals that can be picked up by the adversary. Imagine an adversary is using the following domain for their C2 server: apiv2.attacker.com. They would only expect their own malware to contact this domain. The adversary could, for example, monitor for the following related to apiv2.attacker.com:

  • DNS resolutions for this domain:

    • The source IP of the DNS request is within a subnet which has no close relation to the company you are attacking.

    • The source IP belongs to a security vendor such as VirusTotal.

  • Seeing HTTP requests that are not in line with what you expect to receive from your malware on this domain: abnormal User-Agent, unknown URL paths, signs of scanning activity, etc.

Regarding DNS, use passive DNS, for which there are many sources. When dealing with a targeted attack, third-party passive DNS sources will not be of any help; use what you already have, such as your own DNS logs. When you do need to perform a DNS lookup, perform the query from a company system. The reason you are doing this in the first place is that an IP can provide you with additional context and OSINT. Be aware, though, to only perform DNS queries on domains you have actually seen; do not start guessing domain names by querying, for example, apiv1.attacker.com through apiv5.attacker.com. That is a signal that can be picked up by the adversary.

The main message I want to convey here: be as passive as possible when using tools and performing manual actions. And when you do send out signals, they have to look as normal as possible and correspond with what the adversary expects to see.

Bypass client-side generated HTTP security headers

Every now and then, when doing a security test on a web application, I have to deal with client-side generated HTTP headers that are there for security reasons. These headers can cause problems during a security test. Fortunately, they can easily be bypassed using Burp Suite.

HTTP "security" header

Let me explain the concept of an HTTP security header using a header named Auth-code that is added by the web browser itself (think of JavaScript running in the browser). The header value is calculated over the contents of the HTTP request message. This can be as simple as calculating a hash over the body of an HTTP POST request message.

The web application will check the value of the Auth-code every time an HTTP request is received. If the received value does not match the expected result, the web application will deny the HTTP request and most likely send back an error message to the web browser, e.g. "Invalid input provided".
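As a concrete (made-up) example, if the Auth-code were simply an MD5 hash over the POST body, the browser-side JavaScript and the server would both compute something like:

# Hypothetical example: Auth-code as an MD5 hash over the raw request body.
import hashlib

body = b"UserID=1337&action=view"
auth_code = hashlib.md5(body).hexdigest()
# The client adds "Auth-code: <auth_code>" as a request header; the server
# recomputes the hash over the received body and compares the two values.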

False sense of security

This kind of security mechanism does not add any real security to the web application. An adversary has full control over what happens on the client (i.e. the web browser) and therefore over what will be sent to the web application. Because a header such as Auth-code is calculated and added by the client itself, an adversary also has the ability to make sure the Auth-code always has the correct value.

Security checks should always be performed on the web application server and should not fully depend on code that is running within the browser.

Example of exploitation

Imagine a web application that relies for an important part of its security on a custom header called Auth-code. The value for the header is calculated and added to the HTTP request by the client itself using JavaScript.

This web application has a webpage that shows a user's personal details such as name, address, email, etc. When accessing this webpage, a user ID is provided within the parameter UserID, contained in the body of the HTTP request. Manipulating the user ID (e.g. by means of a proxy) will result in an error at the web application; after all, the provided Auth-code will no longer match the expected value.

All we have to do to bypass the security check in the above case is make sure the Auth-code we send always matches the content of the body of the HTTP request. Every time we manipulate a parameter, this requires recalculating the value of the Auth-code. Of course we first need to know how this value is calculated and under which circumstances it is used. Knowing that, we can make sure to always include the correct value.

When doing a security test it can be quite cumbersome if recalculating the Auth-code value has to be done manually, or simply impossible when you want to take advantage of the active scanner within Burp. We need a way to automate this process. We can achieve this by writing our own extension for Burp Suite.

How to bypass using Burp Suite

Burp Suite allows you to expand its functionality with extensions, which can be written in Java, Python or Ruby. I chose to write mine in Python. This extension will make sure that for every HTTP request being sent, the custom security header (e.g. Auth-code) matches the expected value. It works for all requests sent through Burp Suite: Repeater, Intruder, Active scanner, etc.

Burp extension

On my GitHub page you will find a Burp extension that serves as a template for bypassing a custom security header. Within the Python code I have added comments that should help you customise the code to fit the web application you are testing.

https://github.com/marcusbakker/Burp-Suite-Extensions
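To give an idea of what such an extension looks like, below is a minimal sketch (this is not the template from the repository; it assumes, purely for illustration, that the Auth-code is an MD5 hash over the request body):

# Minimal Burp Suite (Jython) extension sketch: recalculate a client-side generated
# "Auth-code" header for every outgoing request. The hash scheme is hypothetical.
from burp import IBurpExtender, IHttpListener
import hashlib

class BurpExtender(IBurpExtender, IHttpListener):

    def registerExtenderCallbacks(self, callbacks):
        self._helpers = callbacks.getHelpers()
        callbacks.setExtensionName("Auth-code fixer (sketch)")
        callbacks.registerHttpListener(self)

    def processHttpMessage(self, toolFlag, messageIsRequest, messageInfo):
        if not messageIsRequest:
            return
        request = messageInfo.getRequest()
        info = self._helpers.analyzeRequest(request)
        headers = list(info.getHeaders())
        body = request[info.getBodyOffset():]

        # Recalculate the Auth-code over the (possibly modified) body.
        auth_code = hashlib.md5(self._helpers.bytesToString(body)).hexdigest()

        # Replace an existing Auth-code header, or add one if it is missing.
        headers = [h for h in headers if not h.lower().startswith("auth-code:")]
        headers.append("Auth-code: " + auth_code)

        messageInfo.setRequest(self._helpers.buildHttpMessage(headers, body))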

Hunting with JA3

In this blog post I will explain how JA3 can be used in Threat Hunting. I will discuss a relatively simple hunt on a possible way to identify malicious PowerShell using JA3, and a more advanced hunt that involves the use of Darktrace and JA3.

What is JA3?

Introduction

JA3 is a method to fingerprint an SSL/TLS client connection based on fields in the Client Hello message of the SSL/TLS handshake. The following fields within the Client Hello message are used: SSL/TLS version, accepted ciphers, list of extensions, elliptic curves, and elliptic curve formats. The end result is an MD5 hash that serves as the fingerprint. Because the SSL/TLS handshake is sent in clear text, we can use the information within the Client Hello message to fingerprint any client application.
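To make this concrete: the JA3 string is built from those five fields as decimal values, dash-separated within a field and comma-separated between fields, and the MD5 hash of that string is the fingerprint. A simplified sketch (ignoring details such as the removal of GREASE values):

# Simplified sketch of how a JA3 fingerprint is built from Client Hello fields.
import hashlib

tls_version = 771                  # e.g. 0x0303 = TLS 1.2
ciphers = [49195, 49199, 52393]    # offered cipher suites
extensions = [0, 10, 11, 13]       # extension types, in order
elliptic_curves = [29, 23, 24]     # supported groups
ec_point_formats = [0]             # elliptic curve point formats

ja3_string = ",".join([
    str(tls_version),
    "-".join(str(v) for v in ciphers),
    "-".join(str(v) for v in extensions),
    "-".join(str(v) for v in elliptic_curves),
    "-".join(str(v) for v in ec_point_formats),
])
ja3_hash = hashlib.md5(ja3_string.encode()).hexdigest()
print(ja3_string, "->", ja3_hash)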

At the moment JA3 is supported by:

  • Bro

  • Darktrace

  • MISP

  • Moloch

  • NGiNX

  • RedSocks

  • Trisul NSM

  • Python script that accepts a PCAP file (you can find this one on the GitHub page of JA3)

For more detailed info on JA3 see: https://github.com/salesforce/ja3

Uniqueness of the fingerprint

It is not uncommon to see that a particular JA3 hash is also used by another type of application. For example, applications written in Java tend to result in the same JA3 hash. You will also notice, depending on the Windows version, that the JA3 hash used by PowerShell is also used by the Windows Background Intelligent Transfer Service (BITS).

It is important to take such collisions into account when performing investigations based on JA3 (note that I am not talking about hash collisions here). Still, JA3 can be very powerful when used for Threat Hunting and Incident Response.

Hunting for malicious PowerShell using JA3

What is PowerShell being used for

Why hunt for PowerShell? PowerShell is quite popular among adversaries for performing malicious activities. It is also very popular with system admins, but of course with a different end goal in mind. Commonly, adversaries use PowerShell for:

  • Downloaders to facilitate the second stage of infection by downloading additional malicious code such as a backdoor.

  • Running backdoors that are written in PowerShell (e.g. PowerShell Empire).

  • Post-exploitation toolkits such as PowerSploit.

System administrators use it for:

  • Automating system administration activities.

  • Downloading files from the internet, though far less commonly than adversaries do.

Why hunt for PowerShell?

The above examples of adversary activities performed using PowerShell make it very interesting, from a security monitoring perspective, to know when PowerShell communicates with the internet.

Why use JA3 and what to take into account

Other methods exist besides Invoke-WebRequest for communicating over the internet using PowerShell. I will use this one as an example.

When using the PowerShell cmdlet Invoke-WebRequest to communicate over the internet, a User-Agent is sent that contains PowerShell: "Mozilla/5.0 (Windows NT; Windows NT 6.3; en-US) WindowsPowerShell/4.0". However, this can easily be changed by providing a custom User-Agent to make the traffic look more normal (-UserAgent "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"). That is why relying on the User-Agent is not good enough; modifying the User-Agent is actively done by malware.

A far more reliable way of identifying PowerShell that communicates with the internet is to look at its JA3 hash values. Yes, we have more than one JA3 hash for PowerShell:

  • The JA3 hash can differ between PowerShell versions. For example:

    • Windows 7 PowerShell 5.0: 05af1f5ca1b87cc9cc9b25185115607d

    • Windows 7 PowerShell 6.0: 36f7277af969a6947a61ae0b815907a1

  • Differences in Windows versions:

    • Windows Server 2016 PowerShell 5.1: 235a856727c14dba889ddee0a38dd2f2

    • Windows 10 PowerShell 5.1: 54328bd36c14bd82ddaa0c04b25ed9ad

  • There is a way of modifying the TLS version sent in the TLS Client Hello message and thereby getting a different JA3 hash (the -SslProtocol parameter of Invoke-WebRequest in PowerShell v6).

  • When no domain name is involved with setting up the TLS connection, the Server Name Indication (SNI) extension is missing, hence a different JA3 hash.

  • Other methods of communicating to the internet using PowerShell can result in another JA3 hash value (e.g. when Windows BITS is used it can differ depending on the Windows version).

As stated before, there can always be collisions with other client applications that have the same JA3 hash as PowerShell. All of this should be taken into account when doing proper Threat Hunting.

The hunt

Once you know which JA3 hashes can be seen in your environment, you can start hunting for interesting events. Expect to have collisions and therefore the need for a way of picking out the notable events. A tactic here is to use stacking and look at the bottom of the stacked domain names (i.e. the domain names that occur least frequently). From there, pivot to the associated URLs to spot possibly malicious traffic. It also helps to enrich the events with the domain registration date and start with the youngest domain names. And you can add the date of first occurrence of the domain name within your IT infrastructure to check the relatively new domain names first. Of course, all of this depends on the capabilities of your tools. Be creative here and see what works best within your environment.
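A minimal sketch of the stacking idea (the CSV export and its field names are made up here; in practice this would be a query in your data lake or SIEM):

# Hypothetical sketch: stack destination domains for the PowerShell JA3 hashes and
# look at the least frequent ones first.
from collections import Counter
import csv

POWERSHELL_JA3 = {
    "05af1f5ca1b87cc9cc9b25185115607d",  # Windows 7 PowerShell 5.0
    "36f7277af969a6947a61ae0b815907a1",  # Windows 7 PowerShell 6.0
}

stack = Counter()
with open("tls_events.csv") as f:  # assumed export with ja3,domain columns
    for row in csv.DictReader(f):
        if row["ja3"] in POWERSHELL_JA3:
            stack[row["domain"]] += 1

# The bottom of the stack (rarest domains) is the most interesting to pivot on.
for domain, count in sorted(stack.items(), key=lambda kv: kv[1])[:20]:
    print(count, domain)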

JA3 hunting with Darktrace

Darktrace allows us to perform more advanced hunting with JA3 by employing some very useful metrics. They can be simple, like the type of traffic (e.g. HTTPS), or more advanced, like the rarity of a domain name within your environment. Metrics are used within Darktrace to build models. A model has a set of conditions that need to be met before it triggers. In Darktrace terms a triggered model is called a model breach.

Hunting hypothesis -> Model

I will explain for one particular JA3 metric within Darktrace how this can be used with the following hunting hypothesis:

An adversary infects a victim's endpoint with a backdoor that starts communicating over HTTPS to a command and control (C2) server by sending beacons on regular or irregular intervals.

For this hypothesis the following characteristics are notable:

  • We hope, and can somewhat assume, that the backdoor will have a JA3 hash that is not frequently seen within our IT infrastructure.

  • Communications to the C2 server's destination are rare within your environment. This does not hold when the adversary uses domain fronting through a domain that is frequently accessed by other systems within your environment.

  • The backdoor sends beacons to stay in contact with its C2 server.

You can take the above characteristics and create a model. With this model we combine several weak indicators to increase our chances of detecting C2 channels. The important Darktrace metrics for this model are:

  • "Unusual JA3 hash": for example you can set this to 90% only to look at rare JA3 hashes within your whole environment.

  • "Rare external endpoint": you can do something similar for this metric by only taking into account the rare destinations (IP or domain) within your environment.

  • "Beaconing score": this metric also expects a percentage. The higher the percentage the more regular the beaconing is occurring.

Adversary backdoors often have a configurable jitter to prevent sending a beacon exactly every X minutes, for example by introducing a variation of 40%. This makes the traffic blend in with normal outgoing network traffic and thereby harder to detect. Within Darktrace this is one of the factors that results in a lower "Beaconing score", but the beaconing is still detectable when combined with other metrics.
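As a simple illustration of the effect of jitter (this is not how Darktrace calculates its score), the sketch below generates beacon intervals with 40% jitter; the more jitter, the larger the relative spread of the inter-arrival times and the less regular the beaconing looks:

# Illustration only: beacon intervals with a configurable jitter percentage.
import random
import statistics

def beacon_intervals(base_seconds=300, jitter=0.4, count=50):
    # Each interval is the base sleep time plus or minus up to `jitter` percent.
    return [base_seconds * (1 + random.uniform(-jitter, jitter)) for _ in range(count)]

intervals = beacon_intervals()
spread = statistics.stdev(intervals) / statistics.mean(intervals)
print("relative spread of the beacon intervals: %.2f" % spread)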

Happy hunting! 

Volatility: proxies and network traffic

When dealing with an incident, your starting point is often a suspicious IP, for example because the IP shows a suspicious beaconing traffic pattern (i.e. malware calling home to its C2 server for new instructions). One of the questions you will have is what is causing this traffic. It can really help your investigation when you know which process (or sometimes processes) is involved. However, answering this question is challenging when you have to deal with the following:

  • An IT infrastructure where a non-transparent proxy is used for all outgoing network traffic (this is the case in many enterprise networks).
  • No other sources, except a memory dump, are available to you where you could find this information.

In this blog post I will explain how you can solve this with Volatility and strings.

Lab Setup

I have created a small lab setup to simulate an infrastructure where communications to the internet need to go through a web proxy.

[Figure: lab setup (lab_setup.png)]

The host with IP 172.31.0.250 is the internal web proxy server. This server forwards the traffic to the gateway and sends the result back to the client. The endpoint with IP 172.31.16.50 is the one we will perform our memory forensics on.

Generating network traffic

To make it as realistic as possible, the endpoint 172.31.16.50 is used to set up multiple connections to hosts on the internet using different applications. One of these applications connects to IP 35.178.122.152. The intention is to identify which process is responsible for communicating with this IP.

The problem

The effect of a non-transparent proxy on network connections

The goal of this blog is not to explain in detail how a web proxy works. If you want more details on this topic, I can recommend reading the following blog post: https://parsiya.net/blog/2016-07-28-thick-client-proxying---part-6-how-https-proxies-work/.

When an endpoint within an IT infrastructure that requires all internet traffic to pass through a proxy wants to communicate with the internet, it will ask the internal web proxy server to connect to the external host and send back the results.

Applications communicating with the internet over a non-transparent proxy are not communicating directly with the host on the internet, but ask the proxy to do so. That is why you will not see direct connections to external hosts within your network connections.

Without a proxy, netstat output makes it pretty easy to identify which process set up a connection to which external IP and port (e.g. Chrome.exe communicating with the external IP 108.177.126.138 over port 443). However, when dealing with a non-transparent proxy you will see something completely different.

These are the same network connections using the same applications, but this time all external connections go through the proxy. Therefore, all external communications seem to go to the internal host 172.31.0.250 (the internal proxy server) over port 8080.

The Volatility plugin netscan shows similar output, from which it seems that all outgoing connections go to the internal host 172.31.0.250.

Solving the problem

Let's have a look at how to pinpoint a particular IP address to a process using Volatility and strings. Instead of strings you could also use another utility, as long as the output contains the decimal byte offset and the corresponding string.

Strings
You start off by locating all physical memory address locations of the IP 35.178.122.152 within your memory dump (remember that this was the IP related to the suspicious beaconing pattern). You can use Linux strings for this, with two passes to include both ASCII and (little-endian) Unicode strings (-el). You add the parameter -td to strings to include the byte offset in decimal format for every string; you will need this offset later for Volatility. The output of strings is written to the file out-linux-strings:

[Screenshot: linux-strings.png]
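For reference, the two strings passes could look roughly like this (GNU strings; memdump.raw is an example file name and the exact flags may differ per strings version):

strings -td memdump.raw > out-linux-strings
strings -td -el memdump.raw >> out-linux-strings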

With some luck, simply grepping for your suspicious IP within the strings output may already reveal the source without the need to use Volatility. Take note that this will not always be the case. Just grepping for the IP, or using Bulk Extractor to search through your memory dump, can be of much help to find out more.

From the file out-linux-strings you grep all the lines that match the IP 35.178.122.152 and write the output to another file, search-strings:

[Screenshot: grep_on_ip.png]

Volatility
You will use the file search-strings as input for the Volatility strings plugin. This plugin expects as input a file in the form <decimal_offset>:<string> or <decimal_offset> <string>. The plugin will output the corresponding process ID and virtual address where the string can be found within the memory dump. Write the output of Volatility to the file out-strings.
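Roughly, that step could look like this (memdump.raw and the Win8SP1x64 profile are examples from my lab; the strings plugin takes the string file via -s/--string-file):

volatility -f memdump.raw --profile=Win8SP1x64 strings -s search-strings > out-strings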

With some basic Linux tools you can create a list of the involved processes and count the number of occurrences. You can clearly see that the process with ID 3008 plays a major role, with 116 hits related to the communications with IP 35.178.122.152 (the entries marked FREE can be ignored as they relate to free memory that is no longer associated with a process; it can, however, still contain valuable information).

Using the Volatility plugin pslist you can also find the corresponding process name, taskhostex.exe, for PID 3008.

Mission accomplished!
Please tell me if you have another way to solve this problem that involves memory forensics.


Update

Andrew Case (Volatility Core Developer) replied with another way on how you can try to solve this using the Volatility plugin yarascan: https://twitter.com/attrc/status/975675012307935232