nt-list: Windows Linked Lists in idiomatic Rust

2022-07-28

On my quest to develop a ReactOS/Windows bootloader in Rust, I inevitably had to stumble upon the infamous LIST_ENTRY structure used in the LOADER_PARAMETER_BLOCK. This is what Windows, Windows drivers, and components influenced by Windows (e.g. UEFI) have been using for a long time to uniformly handle linked lists.
I did my first steps to implement LIST_ENTRY in Rust, but realized early on that a naive Rust implementation would involve raw pointer operations everywhere, and soon cover my entire code in unsafe blocks. Surely there must be a better way…

A short intro to `LIST_ENTRY`

Unlike most textbook implementations of linked lists, the Windows one comes in the form of a structure (LIST_ENTRY) that you embed into your element structure. That structure has the usual forward and backward links (called flink and blink).
However, they don’t directly point to other element structures, but to their LIST_ENTRY fields. Additionally, the last link of a LIST_ENTRY sequence also points back to the list header to mark the end of a list.
The following sketch should summarize it:

A sketch showing the differences between textbook linked lists and Windows Linked Lists

This design for doubly linked lists exhibits several unique properties:

A single element can be part of multiple lists (by having multiple LIST_ENTRY fields).
Such an element can be a pain when figuring out who is eventually responsible for deallocating it.
YOU are responsible for pushing only elements of the same type to a list.
Without any type safety, the C/C++ compiler cannot prevent you from adding differently typed elements to the same list.
You don’t get the corresponding element structure of a forward or backward link right away.
You need to use the CONTAINING_RECORD macro for some offset calculation, and no C/C++ compiler will check the parameters YOU passed to that macro.
The list header needs to stay on a stable address.
Otherwise, we won’t be able to detect the end of a list.

Particularly the last point has been a long-time blocker for a safe Rust implementation.
With stack variables being movable by default in Rust, and moved often, stable addresses could only be guaranteed via heap allocations in the past. Fortunately, this is no longer the case since Pinning was introduced in Rust 1.33.

nt-list enters the stage

My just released nt-list Rust crate introduces type safety for Windows Linked Lists, taking away some responsibility from the user and moving it to the compiler. Apart from the just discussed LIST_ENTRY doubly linked list, I have also covered its singly linked list cousin, SINGLE_LIST_ENTRY.
Additionally, my crate offers an idiomatic Rust interface similar to that of LinkedList and Vec.

Type safety is achieved by defining an empty enum for each list, and passing it as a type to the NtListEntry field when declaring your element structure. The Rust compiler will then make sure that only compatible elements are pushed to the list. Likewise, my crate can now infer what element structure to return to you, when it only has a pointer to an entry field.

Memory management is another aspect that the original LIST_ENTRY leaves to each user, making it prone to leaks, dangling pointers, and use-after-frees. To avoid these issues, my Rust crate provides a boxing variant of the list that heap-allocates each element and deallocates it on removal. With that assurance, all NtBoxingListHead functions are safe, unlike those of its non-boxing counterpart NtListHead.

If you can, you should actually use the boxing variant all the time. A notable exception is when a list element is part of more than one list via multiple entry fields. You then don’t want NtBoxingListHead to deallocate a removed element while it’s still part of another list. Two possible solutions come to mind here:

You use NtListHead for all lists of that element, and manage the required memory manually.
This is also an option when you are in a restricted no_std environment that supports no heap allocations through alloc::boxed::Box.
You define a primary list that is managed through NtBoxingListHead, and one or more secondary lists managed by NtListHead.
All elements then need to be added to the primary list first before they are added to any secondary lists. Even more important, an element needs to be removed from all secondary lists before it’s removed from the primary list (and subsequently deallocated).
Granted, this part is unsafe, but it’s as best as I can do for multiple lists.

Examples

Without further ado, we finally come to some example code.

All the fun begins by defining an empty enum and declaring it as a Windows-compatible doubly linked list (NtList). To hide the complexity of the involved traits, I’ve implemented a derive macro for this:

#[derive(NtList)]
enum MyList {}

Next you define your C-compatible element structure. By deriving NtListElement, the structure is scanned for NtListEntry fields and the proper traits are implemented automagically. The optional #[boxed] attribute also declares the single entry field where NtBoxingListHead will be usable.

#[derive(Default, NtListElement)]
#[repr(C)]
struct MyElement {
    value: i32,
    #[boxed]
    entry: NtListEntry<Self, MyList>,
}

That’s basically it! Now you can create a new list using NtBoxingListHead::new and use it almost like you would use a Vec.

I say “almost”, because we have to consider the stable address of the list head. Due to this requirement, all NtListHead and NtBoxingListHead functions take a pinned self reference. I use the moveit crate to simplify creating the list head as a pinned reference.

Taking this together, you create your list via

moveit! {
    let mut list = NtBoxingListHead::<MyElement, MyList>::new();
}

and then work with elements like this:

list.as_mut().push_back(MyElement {
    value: 42,
    ..Default::default()
});

for element in list.as_ref().iter() {
    println!("{}", element.value);
}

The Vec-like API of nt-list is more of a safety feature than you might think.
I have seen numerous examples of C/C++ code like

for (LIST_ENTRY* Current = List.Flink; Current != &List; Current = Current->Flink)
{
    MY_ELEMENT* Element = CONTAINING_RECORD(Current, MY_ELEMENT, Entry);

    if (Element->Value == 42)
    {
        RemoveEntryList(&Element->Entry);
        HeapFree(GetProcessHeap(), 0, Element);
    }
}

Now what is wrong with this? Apart from the non-trivial init statement, loop condition, and CONTAINING_RECORD macro call, this code suffers from a classic use-after-free bug. Every element with value 42 is removed from the list, but the Current pointer (which is nothing more than Element->Entry) is dereferenced in the iterator expression after Element has already been deallocated. In the worst case, this bug goes unnoticed until an application eventually crashes in production.

My NtBoxingListHead provides the retain function to cover this pattern, which is inspired from Rust’s recently stabilized Vec::retain_mut. The code above in safe and concise Rust would be as simple as

list.as_mut().retain(|element| element.value != 42);

Check out the tests for more examples and the module-level documentation of single_list for a singly linked list example.

nt-list in WinDbg

Thanks to the Rust compiler outputting PDB files on Windows, it is possible to verify the compatibility between nt-list and LIST_ENTRY using WinDbg. Which is exactly what I’m doing here:

Using WinDbg to traverse a Windows Linked List created by the nt-list Rust crate

I’m using the !list extension here to traverse all items of my doubly linked list.
In my example, the full command is:

!list -t _LIST_ENTRY.Flink -x "dt nt_list_test::MyElement @@(#CONTAINING_RECORD(@$extret, nt_list_test::MyElement, entry))" 0x000001d0`c551bff8

Your command will vary depending on your structure name, field name, and of course the current memory address of the first entry.

Conclusion

With Linked Lists being ubiquitous in the Windows kernel and influenced components, I hope that my nt-list crate will be useful even beyond my bootloader use case. Like my previous ntfs crate, I have dual-licensed it under the Apache 2.0 and MIT licenses to enable a wide usage.

Note that this is my first excursion into making inherently unsafe data structures safe, and also into proc-macro magic. The first release of nt-list is likely not perfect, and I’m more than welcome to your feedback!

Finally, it should be said that the Rust documentation is right about linked lists in general: If you can, use a Vec or a VecDeque over any linked list for better memory and CPU cache utilization.
However, if you need to maintain compatibility to an existing linked list of Windows, you now have a tool to do that in idiomatic and safe Rust.

‹ Previous Post

My FOSDEM talk on implementing NTFS in Rust

Next Post ›