Part 1: Introduction To Windows API and General Concepts
A comprehensive guide, covering everything you need to know about the Windows API from start to finish.
Windows API
The Windows application programming interface (API) is the user-mode system programming interface to the Windows OS family, as defined in Windows Internals. Prior to the introduction of 64-bit versions of Windows, the programming interface to the 32-bit versions of the Windows OS was called the Win32 API.
Nowadays, the term Windows API refers to both the 32-bit and 64-bit variants.
Before proceeding to a clearer definition, we need to understand what user mode and kernel mode mean.
User Mode: This is where regular applications (like your browser, text editor, etc.) run. They don't have direct access to the hardware or core OS resources, which ensures stability and security. If they need to perform actions that require system-level access (e.g., reading from disk), these applications go through the Windows API, which issues system calls (syscalls) that request the kernel to perform the action on their behalf.
Kernel Mode: This is where the core of the operating system, including hardware drivers and system-level processes, operates. It has full access to system resources.
It should be clear by now that user-mode programs must "ask" the kernel, through syscalls, to perform tasks that involve low-level system operations.
Now, What Actually Is the Windows API?
The Windows API is a set of tools that allows developers to create programs that can interact with the Windows OS. It works in user mode, meaning it lets programs access system resources like files, memory, and devices without interacting directly with the core parts of the OS.
The following are some of these tools:
- File Management: Tools to create, read, write, and delete files. For example, the CreateFile function allows programs to open or create files.
- Memory Management: Functions like VirtualAlloc to allocate memory for a program.
- Window Management: Tools to create and manage windows in a graphical user interface, like CreateWindowEx for creating a new window.
- Process and Thread Management: Functions like CreateProcess and CreateThread to run programs and manage tasks.
- Device Input: Tools to handle input from the keyboard or mouse, like GetAsyncKeyState to detect key presses.
Note that the fact that the Windows API operates in user mode doesn't mean it never interacts with kernel mode. It crosses into kernel mode, using syscalls, whenever it needs to access core OS services.
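As a small illustration of a user-mode program calling a Windows API function, here is a minimal sketch in Python using the standard ctypes module. It assumes the kernel32.dll export GetCurrentProcessId, which is not one of the functions listed above but is convenient because it takes no arguments. The helper name current_process_id is made up for this example, and on non-Windows systems the sketch falls back to os.getpid() so that it still runs.

```python
import ctypes
import os
import sys


def current_process_id() -> int:
    """Return this process's ID, via the Windows API when available."""
    if sys.platform == "win32":
        # Load kernel32.dll and describe the function's signature to ctypes.
        kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
        kernel32.GetCurrentProcessId.argtypes = []
        kernel32.GetCurrentProcessId.restype = ctypes.c_uint32  # DWORD
        return kernel32.GetCurrentProcessId()
    # Assumption for portability: not a Windows API call, just a fallback.
    return os.getpid()


print(current_process_id())
```

On Windows the value comes from the Windows API itself. The program never touches kernel structures directly.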
Some General Concepts
Windows Runtime
Windows 8 introduced a new API and supporting runtime called the Windows Runtime (WinRT) for building modern, touch-friendly, and visually appealing apps using multiple programming languages like C++, C#, VB.NET, and JavaScript. It is not to be confused with Windows RT, the discontinued ARM-based version of the Windows OS.
.NET Framework
The .NET Framework is a software development platform created by Microsoft that provides a runtime environment and libraries for building and running applications. It supports multiple programming languages like C#, VB.NET, and F#.
It allows developers to create desktop applications, web apps, and services, and it runs primarily on Windows operating systems.
The .NET Framework consists of two major components:
The Common Language Runtime (CLR): This is the run-time engine for .NET and includes a Just In Time (JIT) compiler that translates Common Intermediate Language (CIL) instructions to the underlying hardware CPU machine language, a garbage collector, type verification, code access security, and more.
The .NET Framework Class Library (FCL): This is a large collection of types that implement functionality typically needed by client and server applications, such as user interface services, networking, database access, and much more.
Services, functions, and routines
The word service can refer to a callable routine in the OS (specifically, a function or procedure that can be invoked during a program's execution), a device driver, or a server process. The following list describes what certain terms mean:
Windows API Functions: These are documented, callable subroutines in the Windows API (a subroutine is typically a smaller, more focused routine that is part of a larger program). Examples include CreateProcess, CreateFile, and GetMessage.
Native system services (or system calls): These are the undocumented, underlying services in the OS that are callable from user mode. For example, NtCreateFile is the internal system service the Windows CreateFileW function calls to open or create a file.
How the native system calls work
This is how it works:
1) Our application (e.g., Notepad.exe) calls the CreateFileW function, which is part of the Windows API and is exported by kernel32.dll (this DLL is used for interacting with other processes, memory, drives, or the file system). This function is used to open or create a file.
x64dbg: In main module, calling CreateFileW() whose stub resides in kernel32.dll
Here, you can clearly see a call instruction, which is essentially a function call. It is calling CreateFileW, which resides in kernel32.dll (the "." after kernel32 means that we are specifically targeting the CreateFileW function in kernel32.dll).
x64dbg: Entered into kernel32.dll memory space where we see a jump to CreateFileW user-mode implementation which resides in the kernelbase.dll
After following the previous call, we enter the kernel32.dll memory space (the region of the process's address space where the DLL is loaded). Here we see a small stub, or forwarder, containing a jmp to the actual user-mode implementation.
The jmp instruction is similar to a call. The only difference is that a call pushes the return address onto the stack, allowing a return to the instruction immediately after the call (e.g., mov rbx, rax in the previous image). In contrast, a jmp transfers control without saving a return address, so execution continues from the jump destination without returning to the original spot.
2) The actual user-mode implementation of CreateFileW resides in kernelbase.dll (in modern versions of Windows). This DLL contains the user-mode implementation of various Windows API functions. kernelbase.dll handles some parameter validation and preparation for the system call but does not directly interact with the kernel.
x64dbg: Entered into kernelbase.dll where the call to NtCreateFile resides
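To make the call versus jmp difference concrete, here is a toy interpreter. It assumes a made-up six-instruction program that mirrors the flow described so far: the application calls the kernel32.dll stub, the stub jumps to kernelbase.dll, and the ret there goes straight back to the application. The opcodes, the addresses, and the run function are invented for illustration and are not real x64 semantics.

```python
# Each entry is (opcode, operand); the list index plays the role of an address.
PROGRAM = [
    ("call", 3),                # 0: application: call kernel32!CreateFileW
    ("mov", "rbx, rax"),        # 1: application: instruction after the call
    ("halt", None),             # 2: application: end of the toy program
    ("jmp", 4),                 # 3: kernel32 stub: jmp to kernelbase
    ("work", "CreateFileW"),    # 4: kernelbase: the real implementation
    ("ret", None),              # 5: kernelbase: return to the caller
]


def run(program):
    """Execute the toy program and return the addresses visited, in order."""
    stack = []   # return addresses pushed by call
    trace = []
    pc = 0
    while True:
        op, arg = program[pc]
        trace.append(pc)
        if op == "call":
            stack.append(pc + 1)   # remember where to come back to
            pc = arg
        elif op == "jmp":
            pc = arg               # nothing is pushed: no way back to the stub
        elif op == "ret":
            pc = stack.pop()       # return address saved by the original call
        elif op == "halt":
            return trace
        else:
            pc += 1                # any other instruction just falls through


print(run(PROGRAM))
```

Because the stub uses jmp, the only return address on the stack is the one pushed by the application's call, so ret lands on the mov at address 1 and skips the stub entirely.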
3) CreateFileW in kernelbase.dll then calls a native API function in ntdll.dll (the Windows NT Layer DLL), which provides access to lower-level system services. Specifically, it calls NtCreateFile, which is the native API for file operations in Windows.
x64dbg: The NtCreateFile stub which performs a syscall
4) The native API NtCreateFile contains the syscall stub: a few instructions that load the system call number and then execute the syscall instruction, which triggers the temporary transition (CPU switch) from user mode (ring 3) to kernel mode (ring 0).
The syscall instruction is a CPU instruction that switches the CPU from user mode (ring 3) to kernel mode (ring 0). This is managed using the SYSCALL/SYSRET model.
The transition relies on:
- MSRs (Model-Specific Registers) like MSR_LSTAR, which hold the address of the KiSystemCall64 handler, the kernel-mode system call dispatcher.
- The GS segment, which is switched to point to kernel-mode memory structures through which the current thread's KTHREAD (the kernel's object for the calling thread) can be reached.
The current user-mode context (e.g., register values, instruction pointer) is saved before the switch, and the thread's kernel stack is activated.
This is what it looks like:
How transitioning using syscall looks like
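As a rough illustration of the stub described in step 4, the sketch below decodes the system call number from a byte sequence laid out like a typical x64 ntdll.dll stub (mov r10, rcx; mov eax, imm32; syscall; ret). The bytes are typed in by hand rather than read from a real ntdll.dll, the number 0x55 is assumed from the debugger view discussed in this article, and real stubs can contain additional instructions and different numbers depending on the Windows build. The names STUB, PROLOGUE, and syscall_number are made up for this example.

```python
import struct

# Hand-written bytes modelled on a typical x64 stub (an assumption, not a dump):
#   4C 8B D1         mov r10, rcx
#   B8 55 00 00 00   mov eax, 0x55    ; system call number
#   0F 05            syscall
#   C3               ret
STUB = bytes.fromhex("4c8bd1" "b855000000" "0f05" "c3")

# Bytes for "mov r10, rcx" followed by the opcode of "mov eax, imm32".
PROLOGUE = bytes.fromhex("4c8bd1b8")


def syscall_number(stub: bytes) -> int:
    """Read the 32-bit immediate that the stub loads into EAX."""
    if len(stub) < len(PROLOGUE) + 4 or not stub.startswith(PROLOGUE):
        raise ValueError("unexpected stub layout")
    (number,) = struct.unpack_from("<I", stub, len(PROLOGUE))
    return number


print(hex(syscall_number(STUB)))
```

The immediate is stored little-endian, which is why the format string is "<I".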
5) Once in kernel mode, the system service dispatcher KiSystemCall/KiSystemCall64 consults the System Service Descriptor Table (SSDT), which contains the addresses (or pointers) of the actual kernel functions to execute, based on the system call ID (the index number in the EAX register, here 55 in hexadecimal, i.e. 0x55, which is the system call ID for NtCreateFile). This table maps the system call number for NtCreateFile to its corresponding function in the kernel.
6) Based on the system call number, the SSDT directs the call to the appropriate kernel service, which in this case is the kernel implementation of NtCreateFile.
7) The kernel's file system drivers (such as the NTFS driver) perform the actual operation of opening or creating the file. They access the file system and hardware, which user-mode code can't directly interact with.
8) Once the file operation is completed by the kernel, the KiSystemCall64 dispatcher uses the sysret instruction to switch the CPU's privilege level back from ring 0 (kernel mode) to ring 3 (user mode), restoring the user-mode thread's context. The result (e.g., a file handle or error code) is passed back up the chain: from the kernel to ntdll.dll, then back to kernelbase.dll, and finally to your application (e.g., Notepad.exe).
Yes, there's more to it, but this is sufficient for us now.
Kernel support functions (or routines): These are the subroutines inside the Windows OS that can be called only from kernel mode. For example, ExAllocatePoolWithTag is the routine that device drivers call to allocate memory from the Windows system heaps (called pools).
Windows services: These are processes started by the Windows service control manager. For example, the Task Scheduler service runs in a user-mode process that supports the schtasks command (which is similar to the UNIX commands at and cron).
Dynamic link libraries (DLLs): These are callable subroutines linked together as a binary file that can be dynamically loaded by applications that use the subroutines. Examples include msvcrt.dll (the C run-time library) and kernel32.dll (the one we have already seen). Windows user-mode components and applications use DLLs extensively. The advantage DLLs provide over static libraries is that applications can share DLLs, and Windows ensures that there is only one in-memory copy of a DLL's code among the applications that are referencing it.
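Loading a DLL at run time and calling one of its subroutines can be sketched with Python's ctypes module. The example assumes msvcrt.dll on Windows, as mentioned above. On other platforms it falls back to the C library already loaded into the process, which is an assumption made only so the sketch runs everywhere. The helper name load_c_runtime is hypothetical.

```python
import ctypes
import sys


def load_c_runtime():
    """Dynamically load the C run-time library for this platform."""
    if sys.platform == "win32":
        return ctypes.CDLL("msvcrt")   # msvcrt.dll, the C run-time library
    # Non-Windows fallback: symbols of the already loaded C library.
    return ctypes.CDLL(None)


crt = load_c_runtime()

# Describe the C signature int abs(int) before calling it.
crt.abs.argtypes = [ctypes.c_int]
crt.abs.restype = ctypes.c_int

print(crt.abs(-5))
```

The function is resolved by name from the loaded library at run time, which is the dynamic part of dynamic linking.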
See you in the next part ❤️