Malware Analysis and Reverse Engineering Workflow
I have noticed that there is a lot of material on individual parts of the malware analysis and reverse engineering process. However, that material rarely covers what someone’s workflow looks like from start to finish. This guide is an overview of my general workflow.
Prerequisites
- Ensure you have finished setting up your malware lab
- This guide is not how everyone does it. Everyone refines their own workflow over time until it works best for them.
Methodology
When working on malware, it is good to have a methodology to guide your operations. Without an overall methodology, your work will fall flat because it has no purpose. It is a cool emerging field and can make your company look awesome, but at the same time, reverse engineering is time-consuming and very expensive. If extra care is not given to an overall direction, you can become very ineffective. People who do not know about the technical challenges that malware analysis and reverse engineering pose often believe that you can send them a report on a fully reversed malware sample in a couple of hours. It is up to us to educate them that this is not the case; things are usually much more complex than they seem. Depending on the complexity of a malware sample, it could take minutes, hours, days, months, or even years. It really just depends!
Depending on where you work, there will often be different methodologies for how this process is strategized. Because of this, do not be surprised when the methodology described in this guide differs from what you see elsewhere.
Purpose and Output
There are many purposes you can have when performing analysis like this.
- Detection Signatures
- Malware Configuration Extractors (Intelligence)
- Automated Unpacking
- Tracking Threat Actors
- Writing Technical Reports
There are certainly other potential use cases for this kind of work.
However, these are the use cases for most people.
Behavior
When analyzing malware, it is important to focus on its behavior. This tells us exactly what operations we can expect the malware to perform once it is in a target environment.
Endpoint
Endpoint behavior includes but is not limited to the following properties.
- Process Creation
- File Create and Delete
- Registry Keys
- DNS
- TCP Communication
- Code Injection
Network
Although there are some points regarding networking in the Endpoint section, this section is about full packet inspection. This is how the malware communicates, not just to a domain or IP address but the contents of that communication and how it behaves.
Detection
One of the things I pride myself on is writing meaningful detection. Imagine going to the doctor because you are feeling sick. If the doctor checks you over and only confirms you are sick, or gives you a vague diagnosis without more details, it creates a poor customer experience. In the cybersecurity industry, we can compare this experience to looking at VirusTotal antivirus detection names. When I write detection, I create heuristic as well as classification signatures. This ensures we can catch more malware at a larger scale, but also catch the ones we already know about and provide the diagnosis people are looking for.
When executing this detection methodology, I focus on the Tactics, Techniques, and Procedures (TTPs). A great reference diagram for this is the Pyramid of Pain shown below.
Intelligence like IP addresses, hash values, domain names, and network/host artifacts is still important. However, its value lies less in detection and more in verifying behaviors observed by tier 1 SOC analysts, as well as in tracking threat actor infrastructure.
Process
When we work on malware analysis and reverse engineering, it is important to have a process that keeps our work organized. Without an overall process, it is easy to become lost or to redo work we have already completed.
It is also important that we are cognizant of how much time our process takes. If we reasonably reach our goal within the first stages of this pyramid, we must justify the expense of going further up the pyramid.
Time is money, and reverse engineering code takes a long time, but it can come with additional benefits. Again, without a clear methodology and direction, you could waste a lot of time and money.
Overview
The following graph provides an overview of the general process I follow.
The graph above can be described as follows.
- Triage
  - Identify Filetype
    - Determine which tools to use for further analysis
  - Malware Analysis
    - Static Analysis
      - Use static tools based on filetype
    - Dynamic Analysis
      - Execute the malware and observe its behavior
  - Malicious Intent
    - Determine if the sample has malicious intent
- Reversing
  - Triage
    - Determine the scope of work
  - Actual Reversing
    - Scoped Result
Malware Analysis
Malware analysis is the process of triaging a sample in the beginning stages. Its purpose is to ensure that we have a high-level overview of the sample that we can use later for reverse engineering. This saves us time in the process, as it provides reliable situational awareness.
Identify Filetype
We need to determine what kind of file we are working with, as it will aid us in determining the tooling we will use moving forward.
Suggested tools.
- Detect it Easy
- Linux file command
- Binwalk
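To illustrate what these tools do at the most basic level, here is a minimal Python sketch that classifies a file by its leading magic bytes. The function names and the short signature table are my own assumptions for illustration; real tools like Detect it Easy and the file command recognize far more formats and look much deeper than the first few bytes.

```python
# A few well-known magic numbers. This table is deliberately tiny and
# only meant to show the idea behind filetype identification.
MAGIC_SIGNATURES = [
    (b"MZ", "PE/DOS executable"),
    (b"\x7fELF", "ELF executable"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP archive (also DOCX, JAR, APK)"),
]


def identify_filetype(data: bytes) -> str:
    """Return a coarse file type based on the leading bytes."""
    for magic, name in MAGIC_SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read only the first bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return identify_filetype(handle.read(16))
```

A result such as "PE/DOS executable" would point you toward PE-focused tooling, while "unknown" is a hint to reach for something like Binwalk to look for embedded content.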
Analysis
For me, the malware analysis stage is the part of the process where we start investigating for malicious intent. It covers both static and dynamic analysis of the sample. However, it is a more surface-level approach than reverse engineering.
Static
Before we can perform static analysis we must be able to understand what it means.
Static program analysis is the analysis of computer software performed without executing any programs, in contrast with dynamic analysis, which is performed on programs during their execution. - https://en.wikipedia.org/wiki/Static_program_analysis
This means we perform analysis without executing the sample.
Suggested tools.
What we are looking for are indicators that clearly show malicious intent.
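As a rough illustration of static analysis, the sketch below collects three common static properties without ever executing the sample: a SHA-256 hash, printable ASCII strings, and Shannon entropy. The choice of these three properties and the helper names are my assumptions, not a fixed tool list; high entropy is only a hint that data may be packed or encrypted, not proof of malicious intent.

```python
import hashlib
import math
import re
from collections import Counter


def sha256_hex(data: bytes) -> str:
    """Hash the sample so it can be referenced and looked up later."""
    return hashlib.sha256(data).hexdigest()


def printable_strings(data: bytes, min_length: int = 4) -> list[str]:
    """Extract runs of printable ASCII, similar to the strings utility."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_length
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def shannon_entropy(data: bytes) -> float:
    """Return entropy in bits per byte, from 0.0 up to 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

Strings such as URLs, registry paths, or command lines found this way can be checked against what you later observe during dynamic analysis.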
Dynamic
Before we can perform dynamic analysis we must be able to understand its meaning first.
Dynamic program analysis is the analysis of computer software that is performed by executing programs on a real or virtual processor. For dynamic program analysis to be effective, the target program must be executed with sufficient test inputs to cover almost all possible outputs. https://en.wikipedia.org/wiki/Dynamic_program_analysis
There are two types of dynamic analysis I like to do. The first one is automated dynamic analysis.
For automated dynamic analysis, I will submit the sample to a sandbox service or to my own sandbox system. Let us first understand what sandboxing is at a high level with the following definition.
Sandboxing is used to test code or applications that could be malicious before serving it up to critical devices. In cybersecurity, sandboxing is used as a method to test software which would end up being categorized as “safe” or “unsafe” after the test. - Malwarebytes
Now that we understand what sandboxing means I recommend the following resources.
Once you execute the sample in a sandbox, you will want to hunt for malicious intent.
The other type of dynamic analysis is to perform it manually on your own lab machine. To monitor the activity you can use tools like Wireshark for network and Procmon for endpoint behaviors. This type of dynamic analysis can involve manipulating the malware by responding to it over the network or placing files or other artifacts in places it needs them to enable further execution.
Malicious Intent
After we finish our static and dynamic malware analysis, we need to indicate whether the sample has malicious intent. If we determine that it does, we have to decide which of the following tasks to perform.
- Refer to the Purpose and Output section and complete what is needed
- Escalate to Reverse Engineering if needed
Additional references for malicious intent.
Depending on your scope of work, standard operating procedures, and overall strategy, you will need to decide which of these tasks to complete.
NOTE: It is not always necessary to escalate to reverse engineering. Again, it depends on the scope of work you receive, the complexity of the malware, and the results you need. So the process can totally stop here, and that’s okay too!
Reverse Engineering
Reverse engineering can be defined as follows.
Reverse engineering is a process or method through which one attempts to understand through deductive reasoning how a previously made device, process, system, or piece of software accomplishes a task with very little insight into exactly how it does so. - https://en.wikipedia.org/wiki/Reverse_engineering
Triage Review
This kind of triage is very different to triage for malware analysis. The goal of triage with reverse engineering is to use the information to help you identify areas of key interest in the malicious binary. For example, if the malware analysis triage notes indicate that the malware uses ws2_32.dll for communication, we can look into cross references for these API calls in the binary.
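The idea of turning triage notes into reversing targets can be sketched in Python as follows. The mapping table, the function name, and the shape of the triage notes (a dictionary of DLL names to imported API names) are all hypothetical and only serve to illustrate the step; the actual cross references would still be looked up in your disassembler.

```python
# Hypothetical mapping from imported DLLs to the behavior they suggest.
DLL_AREAS = {
    "ws2_32.dll": "network communication (Winsock)",
    "wininet.dll": "HTTP communication (WinINet)",
    "advapi32.dll": "registry, services, and crypto APIs",
}


def reversing_targets(imports: dict[str, list[str]]) -> list[tuple[str, str, str]]:
    """Return (area, dll, api) tuples worth checking cross references for."""
    targets = []
    for dll, apis in imports.items():
        area = DLL_AREAS.get(dll.lower())
        if area is None:
            continue  # Not an area of interest in this sketch.
        for api in sorted(apis):
            targets.append((area, dll.lower(), api))
    return targets


# Example triage notes: imports observed during malware analysis.
notes = {"WS2_32.dll": ["send", "connect"], "kernel32.dll": ["Sleep"]}
for area, dll, api in reversing_targets(notes):
    print(f"{area}: find cross references to {dll}!{api}")
```

Each printed line becomes a concrete task, which helps keep the reversing effort scoped to the areas that matter for the result you need.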
Once we have mapped out what tasks we wish to perform, we can move to the next step.
Reversing
In this stage we start actually reverse engineering the malware. Typically, we will use tools like Ghidra, IDA Pro, and dnSpy to decompile the binary.
We will use these decompilers to create pseudo code.
In computer science, pseudocode is a plain language description of the steps in an algorithm or another system. Pseudocode often uses structural conventions of a normal programming language, but is intended for human reading rather than machine reading.
This code is an approximation of what the original code may look like.
It is up to us to clean this pseudo code up so we can make it more understandable for us humans to read.
This is a whole process of its own, and we will not cover these specific principles in this workflow guide. We will save this for the reverse engineering guide.
Once completed, refer to the Purpose and Output section and complete what is needed.
Tips and Tricks
Here are some basic tips and tricks to help you stay organized in this process.
Standardized Analysis Folder Structure
When I start a new task like this I create a folder that contains the following structure.
- docs - a folder containing documentation, from public articles to my own notes
- pcaps - packet capture files
- samples - samples from the analysis, which may include multiple stages
- scripts - scripts that help me automate the process
- projects - contains project files for Ghidra, IDA and more
With this folder structure I’m able to know where I am and where I’m going.
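If you want to automate this, a minimal Python sketch could look like the following. The function name and the example case path are my own placeholders; only the five folder names come from the structure above.

```python
from pathlib import Path

# Folder names taken from the structure described above.
ANALYSIS_FOLDERS = ("docs", "pcaps", "samples", "scripts", "projects")


def create_analysis_workspace(root: str) -> list[Path]:
    """Create the standard folders under root and return their paths."""
    base = Path(root)
    created = []
    for name in ANALYSIS_FOLDERS:
        folder = base / name
        # exist_ok makes the function safe to run on an existing workspace.
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling `create_analysis_workspace("cases/example-case")` would create all five folders under that hypothetical path, and running it again on the same path leaves existing files untouched.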
Standardized Decompiler Code Style
When working in Ghidra or IDA Pro, it’s important for me to standardize how I name functions and variables in the pseudo code. Personally, I keep this consistent with the standards set by the operating system’s programming guides. This ensures that the documentation and the style guidelines are easily accessible and understood by everyone.
Comments
When working in Ghidra, IDA Pro, or any other tool that allows you to make comments, write lots of comments. This will keep you organized so that you know exactly where you are. If there are note-taking features in the software you are using, take advantage of them.
Taking Notes
As I work through these processes, I keep notes, typically in markdown format, to keep myself organized. There are many different tools you can use to do this. However, I use Obsidian at the moment due to its wide variety of features.
Sharing your Work (TLP)
When working with malware, it is always a good idea to understand what the sharing rights of the samples you are working on are.
TLP | Usage | Sharing |
---|---|---|
TLP:RED | Sources may use TLP:RED when information cannot be effectively acted upon by additional parties, and could lead to impacts on a party’s privacy, reputation, or operations if misused. | Recipients may not share TLP:RED information with any parties outside of the specific exchange, meeting, or conversation in which it was originally disclosed. In the context of a meeting, for example, TLP:RED information is limited to those present at the meeting. In most circumstances, TLP:RED should be exchanged verbally or in person. |
TLP:AMBER | Sources may use TLP:AMBER when information requires support to be effectively acted upon, yet carries risks to privacy, reputation, or operations if shared outside of the organizations involved. | Recipients may only share TLP:AMBER information with members of their own organization, and with clients or customers who need to know the information to protect themselves or prevent further harm. Sources are at liberty to specify additional intended limits of the sharing: these must be adhered to. |
TLP:GREEN | Sources may use TLP:GREEN when information is useful for the awareness of all participating organizations as well as with peers within the broader community or sector. | Recipients may share TLP:GREEN information with peers and partner organizations within their sector or community, but not via publicly accessible channels. Information in this category can be circulated widely within a particular community. TLP:GREEN information may not be released outside of the community. |
TLP:WHITE | Sources may use TLP:WHITE when information carries minimal or no foreseeable risk of misuse, in accordance with applicable rules and procedures for public release. | Subject to standard copyright rules, TLP:WHITE information may be distributed without restriction. |
When we share our reports, signatures, and more, we will typically apply the Traffic Light Protocol (TLP) to the content. This way, other people in the community can understand how they are able to use and share the content.
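As a small illustration, TLP labels can be modeled in code so that reports and signatures are marked consistently. This is only a sketch under my own assumptions: the class and function names are hypothetical, and the publish check encodes nothing more than the rule from the table above that only TLP:WHITE may be distributed without restriction.

```python
from enum import Enum


class TLP(Enum):
    """The four TLP labels from the table above."""
    RED = "TLP:RED"
    AMBER = "TLP:AMBER"
    GREEN = "TLP:GREEN"
    WHITE = "TLP:WHITE"


def may_publish(label: TLP) -> bool:
    """Only TLP:WHITE content may be distributed without restriction."""
    return label is TLP.WHITE


def mark_report(title: str, label: TLP) -> str:
    """Prefix a report title with its TLP label."""
    return f"[{label.value}] {title}"
```

For example, `mark_report("Sample report", TLP.AMBER)` returns `[TLP:AMBER] Sample report`, and `may_publish(TLP.GREEN)` is `False` because TLP:GREEN may not be shared via publicly accessible channels.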
Conclusion
When starting out for the first time, it can be a daunting task even to begin understanding what someone’s workflow may look like from beginning to end. Hopefully this guide helps you create your own workflow that makes sense for you.