WINDOWS SYSTEM PROGRAMMING WITH C/C++

Windows System Architecture

History

Windows was originally a 16-bit graphical layer for MS-DOS, written by Microsoft. As it grew, it gained the ability to handle 32-bit programs and eventually became fully 32-bit with Windows NT and Windows 2000. After Windows 95, Microsoft began to remove dependencies on DOS and finally completed the separation in Windows 2000. Windows has many advanced features as well as many platform-specific problems.

It possesses an Application Programming Interface (API) that consists of thousands of functions, many of them GUI-related and some undocumented, and it offers varying degrees of MS-DOS compatibility. With the advent of NT (New Technology), Windows relies completely on the NT kernel instead of an MS-DOS subsystem; the NT kernel is capable of emulating the necessary DOS functionality. In addition to the NT kernel, Microsoft has also introduced many API wrappers, such as MFC (Microsoft Foundation Classes), COM (Component Object Model), and the .NET technologies.
The most popular languages for use on Windows include Visual Basic/VB6 and C/C++.

Windows Kernels

Windows 1.0, 2.0, and 3.11 are considered to be an older generation of Windows systems that were built to be a simple graphical layer over the MS-DOS operating system. Windows 95, Windows 98, and Windows ME were designed to bypass MS-DOS (although DOS was still present), and were all based on the same code structure known as the "9x Kernel".

Windows NT 4.0, Windows 2000, Windows XP, Windows Vista, Windows 7, and Windows Server are all based on a collection of code known as the "NT Kernel".

System Architecture

The Windows NT kernel is divided into several sections; here we will briefly discuss how the Windows operating system is put together. At the most basic level is the file NTOSKRNL.EXE, the kernel of the Windows operating system and the most important file on your computer. If you are interested in seeing this for yourself, you can find it in the C:\Windows\System32 folder (which can also be reached through the path %systemroot%\system32) on your own Windows NT machines.
NTOSKRNL.EXE provides some of the basic functionality of Windows, but one file alone cannot make the whole system work. NTOSKRNL relies heavily on a Dynamic Link Library (DLL) known as HAL.DLL.

HAL stands for "Hardware Abstraction Layer", and is the portion of code that allows low-level mechanisms such as interrupts and BIOS communication to be handled independently of the underlying hardware.
If we consider Windows architecture as a layered architecture, with NTOSKRNL.EXE and HAL.DLL on the bottom layer, the next layer up contains two important files, NTDLL.DLL and WIN32K.SYS. NTDLL contains a number of user-mode functions such as system call stubs and the run-time library (RTL) code, collectively known as the (largely undocumented) "Native API". Much of the run-time library code is shared between NTOSKRNL and NTDLL. WIN32K.SYS is a kernel-mode driver that implements windowing and graphics, allowing for user interfaces to be created.

The next layer up contains a number of libraries that will be of primary interest to us. This layer comprises what is called the Win32 API, and it contains (almost) all the functions that a user will need in order to program in Windows. The Win32 API is divided into 4 component parts, each one a .DLL:

kernel32.DLL

This contains most of the system-related Win32 API functions. Most of these functions are just wrappers around the lower-level NTDLL functions, but some functionality, such as National Language Support (NLS) and console handling, is not available in NTDLL.

advapi32.DLL
This contains other system-related functions such as registry and service handling.

gdi32.DLL
This contains a number of basic functions for drawing. These functions are all relatively simple, and allow the user to draw shapes (circles, rectangles, etc.) on the screen, to display and manipulate bitmaps, etc.

user32.DLL
This contains a number of functions that implement the familiar user interface of Windows. Programs, message boxes, prompts, etc. are all implemented using the User32 functions. User32 performs its tasks by calling system calls implemented by WIN32K.SYS.

In addition to the 4 primary libraries in the Win32 API, there are a number of other important libraries that a Windows programmer should become familiar with:

MSVCRT.DLL
MSVCRT.DLL is the dynamic link library that contains the implementations of the C standard library (stdlib) functions that C programmers should be familiar with. These are the functions defined in the common header files stdio.h, string.h, stdlib.h, etc.

WS2_32.DLL

This is the Winsock2 library, which contains the standard Berkeley socket API for communicating on the internet.

User Mode vs Kernel Mode

In Windows (and most modern operating systems), there is a distinction between code that is running in "user mode" and code that is running in "kernel mode". This section points out some of the differences. Firstly, x86 CPUs have modes of operation called rings, which specify the type of instructions and memory available to the running code. There are four rings:

• Ring 0 (also known as kernel mode) has full access to every resource. It is the mode in which the Windows kernel runs.

• Rings 1 and 2 can be customized with levels of access but are generally unused unless there are virtual machines running.

• Ring 3 (also known as user mode) has restricted access to resources.

Access is restricted in user mode because, if all programs ran in kernel mode, they would be able to overwrite each other's memory and possibly bring down the entire system when they crashed.

Virtual Memory

When a program is started (e.g. a web browser or a word processor), it runs in its own process. A process contains its own "virtual" memory space and resources. Its memory is "virtual" because what the process sees as address 0x12345678 may actually reside at address 0x65f7a678 in physical memory. Similarly, two different processes may each have different data stored at what is, to them, address 0x00401000.

This is implemented by dividing memory into chunks called pages; on x86 systems one page is 4 kilobytes in size. Each page can have its own set of attributes, such as read-only/read-write.

The CPU has a transparent mechanism for translating virtual addresses to physical addresses through a page table which the operating system sets up.
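
The split of a virtual address into a page number and an offset within the page can be sketched in portable C. This is only a toy, single-level model: real x86 page tables have several levels and are walked by the hardware. The table contents and the names TOY_PAGE_SIZE, toy_page_table, and toy_translate are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE  4096u   /* 4 KB pages, as on x86 */
#define TOY_PAGE_SHIFT 12u     /* log2(TOY_PAGE_SIZE) */
#define TOY_PAGE_COUNT 4u      /* size of the toy address space, in pages */

/* Hypothetical page table: virtual page number -> physical page number. */
static const uint32_t toy_page_table[TOY_PAGE_COUNT] = { 7u, 3u, 9u, 1u };

/* Translate a virtual address to a physical one.
   Returns 0 on success, -1 if the page is not mapped. */
int toy_translate(uint32_t virt, uint32_t *phys)
{
    uint32_t page   = virt >> TOY_PAGE_SHIFT;        /* upper bits: page number */
    uint32_t offset = virt & (TOY_PAGE_SIZE - 1u);   /* lower 12 bits: offset   */

    if (phys == NULL || page >= TOY_PAGE_COUNT)
        return -1;

    /* The offset is unchanged; only the page number is replaced. */
    *phys = (toy_page_table[page] << TOY_PAGE_SHIFT) | offset;
    return 0;
}
```

Because only the page number is replaced, two processes with different tables can use the same virtual address and still reach different physical memory.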

Virtual memory is useful for many reasons:
1. A process cannot access another process's memory.
2. Each page can have different protection settings (read-only or read-write, kernel-mode-only).
3. Inactive memory regions of the process can be "paged out" (stored) to the pagefile and be retrieved by the operating system when needed. This is also done when the system is low on physical memory.

User Mode

Every process started by Windows (with the exception of the System "process") runs in user mode. In this mode, programs cannot modify paging directly and so have no way of accessing other programs' memory except through API functions. Programs in user mode also cannot interfere with interrupts and context switching.

Kernel Mode, Interrupts, and System Calls

When Windows is first loaded, the Windows kernel is started. It runs in kernel mode and sets up paging and virtual memory. It then creates some system processes and allows them to run in user mode. How does the CPU ever switch back to kernel mode then? This is not done automatically by the CPU. The CPU is often interrupted by certain events (timers, keyboard, hard disk I/O), and these are called interrupts. The kernel must first set up interrupt handlers to deal with these events. Then, whenever interrupts occur, the CPU stops executing the currently running program, immediately switches to kernel mode, and executes the interrupt handler for that event. The handler saves the state of the CPU, performs some processing relevant to that event, and restores the state of the CPU (possibly switching back to user mode) so the CPU can resume execution of the program.

When a program wants to call a Windows API function [1], it triggers an interrupt [2], which causes the CPU to switch to kernel mode and begin executing the desired API function. When the API function has finished processing, the CPU switches back to user mode and resumes execution of the program. The switch is needed because API functions like ReadProcessMemory cannot do their work in user mode, where a program can't access other programs' memory. In kernel mode, however, the API function can read any memory region without restriction.

[1] Actually, Windows API functions eventually call a different API: the Native API. This is the API used by the Windows NT family of kernels, and it is at this point that the CPU switches to kernel mode.

[2] Modern CPUs have special, faster instructions for system calls, such as sysenter and sysexit on x86. These instructions cause the CPU to switch to ring 0 and then begin executing a handler set up by the operating system.
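
What happens after the switch to kernel mode can be pictured as a table lookup: the program supplies a service number, and the kernel validates it and calls the matching routine. The sketch below is a toy model in plain C, with no real mode switch. The service numbers, the two services, and all toy_ names are made up for illustration.

```c
#include <stddef.h>

/* A service takes one argument and returns a result (toy signature). */
typedef long (*ToyService)(long arg);

static long toy_service_double(long arg) { return arg * 2; }
static long toy_service_negate(long arg) { return -arg; }

/* Hypothetical service table, indexed by service number. */
static const ToyService toy_service_table[] = {
    toy_service_double,   /* service 0 */
    toy_service_negate    /* service 1 */
};

/* What the "kernel" does once it has control: validate the number
   supplied by user-mode code, then call the matching service.
   *ok is set to 1 on success and 0 for an unknown service number. */
long toy_system_call(size_t number, long arg, int *ok)
{
    size_t count = sizeof toy_service_table / sizeof toy_service_table[0];

    if (number >= count) {
        if (ok != NULL)
            *ok = 0;
        return 0;
    }
    if (ok != NULL)
        *ok = 1;
    return toy_service_table[number](arg);
}
```

Validating the number before indexing matters: the caller is untrusted user-mode code, while the table lives on the kernel side.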

Context Switching
So, a program runs and calls API functions. How do other programs get a chance to run, then? Most of the time, programs simply allow the operating system to switch to another program because they are waiting for something (human input, hard disk). These programs are known as unrunnable programs, and since they make calls to the kernel to wait for something, the kernel knows to perform context switching to allow another program to run. This is done by:

1. Saving the current program's state (including registers),
2. Figuring out which program to run next,
3. and restoring a different program's state.

If a program (a thread or process, to be more accurate) runs for more than a certain period of time (the thread quantum, or the process's time slice), the operating system will context switch to another program. This idea is called preemption. Preemption is accomplished by setting a timer interrupt in the processor that will invoke context switching. The time slice that is used may be different for each process.
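
The three steps and the idea of a quantum can be sketched as a small round-robin simulation. This is an illustrative model only, not how the Windows scheduler is implemented: the ToyTask record, the fixed quantum, and the round-robin policy are all assumptions.

```c
#include <stddef.h>

/* Hypothetical per-program state that a context switch saves and restores. */
typedef struct {
    int remaining;   /* ticks of work left */
    int runnable;    /* 0 while waiting for input or I/O */
} ToyTask;

/* Step 2: figure out which program runs next (round-robin after `current`).
   Returns its index, or -1 if nothing can run. */
int toy_pick_next(const ToyTask *tasks, size_t count, int current)
{
    size_t start = (size_t)(current + 1);
    for (size_t step = 0; step < count; ++step) {
        size_t i = (start + step) % count;
        if (tasks[i].runnable && tasks[i].remaining > 0)
            return (int)i;
    }
    return -1;
}

/* Run all tasks with a fixed quantum; returns the number of context switches. */
int toy_run(ToyTask *tasks, size_t count, int quantum)
{
    int switches = 0;
    int current;

    if (quantum <= 0)
        return 0;

    current = toy_pick_next(tasks, count, -1);
    while (current >= 0) {
        /* Run until the task finishes or its quantum expires. */
        int slice = tasks[current].remaining < quantum
                  ? tasks[current].remaining : quantum;
        tasks[current].remaining -= slice;

        /* The "timer interrupt": choose the next task and switch to it. */
        int next = toy_pick_next(tasks, count, current);
        if (next >= 0 && next != current)
            ++switches;
        current = next;
    }
    return switches;
}
```

In this model a task whose runnable flag is 0 is simply skipped, which mirrors how a waiting program gives up the CPU without being preempted.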

WINDOWS DATA TYPES AND HANDLES

Hungarian Notation

First, let's make a quick note about the naming convention used for some data types and some variables. The Win32 API uses the so-called "Hungarian Notation" for naming variables. Hungarian Notation requires that a variable be prefixed with an abbreviation of its data type, so that when you are reading the code, you know exactly what type of variable it is. The Win32 API follows this practice because it has many different data types, making it difficult to keep them all straight. Also, a number of different data types are essentially defined the same way, so some compilers will not pick up on errors when they are used incorrectly. As we discuss each data type, we will also note the common prefixes for that data type.

Putting the letter "P" in front of a data type, or "p" in front of a variable, usually indicates that the variable is a pointer. The letters "LP" or the prefix "lp" stand for "Long Pointer", which is exactly the same as a regular pointer on 32-bit machines. LP data objects are simply legacy objects that were carried over from Windows 3.1 or earlier, when pointers and long pointers needed to be differentiated. On modern 32-bit and 64-bit systems, these prefixes can be used interchangeably.
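
As a brief illustration, here is how prefixed names might look in practice. The typedefs are stand-ins so that the sketch compiles anywhere (on Windows they come from <windows.h>), and the ToyFileInfo record and toy_make_info function are hypothetical.

```c
#include <stdint.h>

/* Stand-ins for the Windows types, assumed here for portability. */
typedef uint32_t DWORD;
typedef uint16_t WORD;
typedef uint8_t  BYTE;

/* Hypothetical record showing the usual prefixes. */
typedef struct {
    DWORD       dwFileSize;    /* dw   = DWORD */
    WORD        wVersion;      /* w    = WORD */
    BYTE        bFlags;        /* b    = BYTE */
    const char *lpszFileName;  /* lpsz = long pointer to a zero-terminated string */
    DWORD      *pdwChecksum;   /* p    = pointer (here, to a DWORD) */
} ToyFileInfo;

/* Fill in a record; each name tells the reader its type at a glance. */
ToyFileInfo toy_make_info(const char *lpszName, DWORD dwSize, DWORD *pdwSum)
{
    ToyFileInfo info;
    info.dwFileSize   = dwSize;
    info.wVersion     = 1;
    info.bFlags       = 0;
    info.lpszFileName = lpszName;
    info.pdwChecksum  = pdwSum;
    return info;
}
```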

LPVOID

LPVOID data types are defined as being a "pointer to a void object". This may seem strange to some people, but the ANSI-C standard allows for generic pointers to be defined as "void*" types. This means that LPVOID pointers can be used to point to any type of object, without creating a compiler error. However, the burden is on the programmer to keep track of what type of object is being pointed to.
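
The burden on the programmer can be made concrete with a small sketch. The LPVOID typedef below is a stand-in for the one in <windows.h>, and the ToyKind tag and toy_as_double function are hypothetical: the tag is one possible way to remember what a generic pointer really points to.

```c
#include <stddef.h>

typedef void *LPVOID;   /* stand-in for the <windows.h> definition */

/* Hypothetical tag telling the callee what lpData really points to. */
typedef enum { TOY_INT, TOY_DOUBLE } ToyKind;

/* Read the pointed-to value as a double. The cast is only correct if
   the caller passed the matching tag; the compiler cannot check this. */
double toy_as_double(LPVOID lpData, ToyKind kind)
{
    if (lpData == NULL)
        return 0.0;

    switch (kind) {
    case TOY_INT:    return (double)*(int *)lpData;
    case TOY_DOUBLE: return *(double *)lpData;
    }
    return 0.0;
}
```

Passing the address of an int together with TOY_DOUBLE would compile without complaint and then misread the memory, which is exactly the risk described above.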

Also, some Win32 API functions may have arguments labeled as "LPVOID lpReserved". These reserved data members should never be used in your program, because they either depend on functionality that hasn't yet been implemented by Microsoft, or else they are only used in certain applications. If you see a function with an "LPVOID lpReserved" argument, you must always pass a NULL value for that parameter - some functions will fail if you do not do so.

LPVOID objects frequently do not have prefixes, although it is relatively common to prefix an LPVOID variable with the letter "p", as it is a pointer.

DWORD, WORD, BYTE

These data types are defined to be a specific length, regardless of the target platform. There is a certain amount of additional complexity in the header files to achieve this, but the result is code that is very well standardized, and very portable to different hardware platforms and different compilers.

DWORDs (Double WORDs), the most commonly occurring of these data types, are defined always to be unsigned 32-bit quantities. On any machine, be it 16, 32, or 64 bits, a DWORD is always 32 bits long. Because of this strict definition, DWORDs are very common and popular on 32-bit machines, but are less common on 16-bit and 64-bit machines.

WORDs (Single WORDs) are defined strictly as unsigned 16-bit values, regardless of what machine you are programming on. BYTEs are defined strictly as being unsigned 8-bit values. QWORDs (Quad WORDs), although rare, are defined as being unsigned 64-bit quantities. Putting a "P" in front of any of these identifiers indicates that the variable is a pointer. Putting two "P"s in front indicates that it is a pointer to a pointer. These variables may be unprefixed, or they may use any of the prefixes common with DWORDs. Because of the differences in compilers, the definition of these data types may be different, but typically these definitions are used:

#include <stdint.h>

typedef uint8_t BYTE;
typedef uint16_t WORD;
typedef uint32_t DWORD;
typedef uint64_t QWORD;

Notice that these definitions are not the same in all compilers. GCC and the Microsoft C compiler do not always agree on the sizes of the long and short specifiers (on 64-bit Linux, for example, GCC's long is 64 bits wide, while the Microsoft compiler's long is 32 bits). For this reason, the Windows header files typically use conditional declarations for these data types, depending on the compiler being used. In this way, code can be more portable.
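
Because the widths are fixed, they can be checked at compile time, and a DWORD can be taken apart into WORDs with plain shifts and masks. The sketch below uses the stdint.h typedefs shown above as stand-ins. The toy_ helper names are hypothetical, though they are similar in spirit to the LOWORD and HIWORD macros in the Windows headers.

```c
#include <stdint.h>

typedef uint8_t  BYTE;
typedef uint16_t WORD;
typedef uint32_t DWORD;

/* Compile-time checks (C11): the build fails if a width is wrong. */
_Static_assert(sizeof(BYTE)  == 1, "BYTE must be 8 bits");
_Static_assert(sizeof(WORD)  == 2, "WORD must be 16 bits");
_Static_assert(sizeof(DWORD) == 4, "DWORD must be 32 bits");

/* Low and high 16-bit halves of a DWORD. */
WORD toy_low_word(DWORD dw)  { return (WORD)(dw & 0xFFFFu); }
WORD toy_high_word(DWORD dw) { return (WORD)(dw >> 16); }

/* Rebuild a DWORD from its two halves. */
DWORD toy_make_dword(WORD wLow, WORD wHigh)
{
    return ((DWORD)wHigh << 16) | (DWORD)wLow;
}
```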

As usual, we can define pointers to these types as:

#include <stdint.h>

typedef uint8_t * PBYTE;
typedef uint16_t * PWORD;
typedef uint32_t * PDWORD;
typedef uint64_t * PQWORD;

typedef uint8_t ** PPBYTE;
typedef uint16_t ** PPWORD;
typedef uint32_t ** PPDWORD;
typedef uint64_t ** PPQWORD;

DWORD variables are typically prefixed with "dw". Likewise, we have the following prefixes:

Data Type	Prefix
BYTE	"b"
WORD	"w"
DWORD	"dw"
QWORD	"qw"

LONG, INT, SHORT, CHAR

These types are not defined to a specific length. It is left to the compiler and the target platform to determine exactly how many bits each of these types has.

Types

typedef long LONG;
typedef unsigned long ULONG;
typedef int INT;
typedef unsigned int UINT;
typedef short SHORT;
typedef unsigned short USHORT;
typedef char CHAR;
typedef unsigned char UCHAR;

LONG notation: LONG variables are typically prefixed with an "l" (lower-case L).

UINT notation: UINT variables are typically prefixed with an "i" or a "ui" to indicate that they are integers, and that they are unsigned.

CHAR, UCHAR notation: These variables are usually prefixed with a "c" or a "uc" respectively.

If the size of the variable doesn't matter, you can use some of these integer types. However, if you want to exactly specify the size of a variable, so that it has a certain number of bits, use the BYTE, WORD, DWORD, or QWORD identifiers, because their lengths are platform-independent and never change.
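
To see what the current platform has chosen, the widths can simply be measured. The typedefs repeat the ones listed above, and the toy_bits_ helpers are hypothetical names used only for this sketch.

```c
#include <limits.h>
#include <stddef.h>

typedef long  LONG;
typedef int   INT;
typedef short SHORT;
typedef char  CHAR;

/* Storage size in bits of each type, on whatever platform compiles this. */
size_t toy_bits_char(void)  { return sizeof(CHAR)  * CHAR_BIT; }
size_t toy_bits_short(void) { return sizeof(SHORT) * CHAR_BIT; }
size_t toy_bits_int(void)   { return sizeof(INT)   * CHAR_BIT; }
size_t toy_bits_long(void)  { return sizeof(LONG)  * CHAR_BIT; }
```

The C standard only sets minimums (at least 16 bits for short and int, at least 32 for long), so the results differ between platforms: a long is 32 bits wide with the Microsoft compiler on 64-bit Windows, but 64 bits wide with GCC on 64-bit Linux.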

STR, LPSTR

STR data types are string data types, with storage already allocated. This data type is less common than the LPSTR. STR data types are used when the string is supposed to be treated as an immediate array, and not as a simple character pointer. The variable name prefix for a STR data type is "sz" because it's a zero-terminated string (ends with a null character).

Most programmers will not define a variable as a STR, opting instead to define it as a character array, because defining it as an array allows the size of the array to be set explicitly. Also, creating a large string on the stack can cause greatly undesirable stack-overflow problems.

LPSTR stands for "Long Pointer to a STR", and is essentially defined as such:

typedef CHAR *LPSTR;

LPSTR can be used exactly like other string objects, except that LPSTR is explicitly defined as being ASCII, not Unicode, and this definition will hold on all platforms. LPSTR variables will usually be prefixed with the letters "lpsz" to denote a "Long Pointer to a String that is Zero-terminated". The "sz" part of the prefix is important, because some strings in the Windows world (especially when talking about the DDK) are not zero-terminated. LPSTR data types, and variables prefixed with the "lpsz" prefix, can all be used seamlessly with the standard library <string.h> functions.
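
A short sketch of an "lpsz" string being handled with <string.h> follows. The LPSTR and LPCSTR typedefs are stand-ins for the <windows.h> definitions, and toy_copy is a hypothetical helper, not a Windows function.

```c
#include <stddef.h>
#include <string.h>

typedef char       *LPSTR;    /* stand-ins for the <windows.h> definitions */
typedef const char *LPCSTR;

/* Copy lpszSrc into lpszDest, whose capacity is cchDest characters
   including the terminator, truncating if needed.
   Returns the number of characters copied, not counting the terminator. */
size_t toy_copy(LPSTR lpszDest, size_t cchDest, LPCSTR lpszSrc)
{
    size_t n;

    if (lpszDest == NULL || lpszSrc == NULL || cchDest == 0)
        return 0;

    n = strlen(lpszSrc);          /* relies on the zero terminator */
    if (n >= cchDest)
        n = cchDest - 1;          /* leave room for the terminator */

    memcpy(lpszDest, lpszSrc, n);
    lpszDest[n] = '\0';           /* keep the result zero-terminated */
    return n;
}
```

Note that strlen only works here because the string is zero-terminated, which is what the "sz" in the prefix promises.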

TCHAR

TCHAR data types, as will be explained in the section on Unicode, are generic character data types. TCHAR can hold either standard 1-byte ASCII characters, or wide 2-byte Unicode characters. Because this data type is defined by a macro and is not set in stone, only character data should be used with this type. TCHAR is defined in a manner similar to the following (although it may be different for different compilers):

#ifdef UNICODE
typedef wchar_t TCHAR;   /* wide (2-byte on Windows) characters */
#else
typedef char TCHAR;      /* 1-byte characters */
#endif

TSTR, LPTSTR

Strings of TCHARs are typically referred to as TSTR data types. More commonly, they are defined as LPTSTR types as such:

typedef TCHAR *LPTSTR;

These strings can be either Unicode or ASCII, depending on whether the UNICODE macro is defined. LPTSTR data types are long pointers to generic strings, and their variables are also prefixed with the letters "lpsz".
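
The way one source file can serve both builds can be sketched as follows. The definitions are stand-ins for what the Windows headers provide (TCHAR, LPCTSTR, and the TEXT macro exist there), while toy_tcslen, toy_length, and toy_bytes are hypothetical names. Outside Windows, wchar_t is often 4 bytes rather than 2, so the byte counts depend on the platform.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Stand-ins for the Windows definitions, selected by the UNICODE macro. */
#ifdef UNICODE
typedef wchar_t TCHAR;
#define TEXT(s)    L##s       /* wide string literal */
#define toy_tcslen wcslen
#else
typedef char TCHAR;
#define TEXT(s)    s          /* narrow string literal */
#define toy_tcslen strlen
#endif

typedef TCHAR       *LPTSTR;
typedef const TCHAR *LPCTSTR;

/* Length in characters (not bytes) of a generic string. */
size_t toy_length(LPCTSTR lpszText)
{
    return lpszText != NULL ? toy_tcslen(lpszText) : 0;
}

/* Storage needed for the string in bytes, including the terminator. */
size_t toy_bytes(LPCTSTR lpszText)
{
    return (toy_length(lpszText) + 1) * sizeof(TCHAR);
}
```

The distinction between characters and bytes is the main thing to watch: the same call reports the same length in both builds, but the storage required differs.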

HANDLE

HANDLE data types are some of the most important data objects in Win32 programming, and also some of the hardest for new programmers to understand. Inside the kernel, Windows maintains a table of all the different objects that the kernel is responsible for. Windows, buttons, icons, mouse pointers, menus, and so on, all get an entry in the table, and each entry is assigned a unique address known as a HANDLE. If you want to pick a particular entry out of that table, you need to give Windows the HANDLE value, and Windows will return the corresponding table entry.

HANDLEs are defined as void pointers (void*). They are used as unique identifiers for each Windows object in our program, such as a button, a window, an icon, etc. Specifically, their definition is as follows: typedef void *PVOID; and typedef PVOID HANDLE; In other words, HANDLE = void*.

HANDLEs are generally prefixed with an "h". Although they are declared as pointers, handles are best treated as opaque values that Windows uses internally to keep track of objects in memory. Windows may move objects such as memory blocks to make room; if an object is moved in memory, the handle table is updated so that the handle stays valid.
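
The idea of a handle as an index into a table can be sketched in plain C. This is a toy model, not how Windows implements its object tables: the fixed-size table, the "slot + 1" encoding, and all toy_ names are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

typedef void *HANDLE;          /* stand-in for the <windows.h> definition */

#define TOY_MAX_OBJECTS 8

/* Hypothetical object table: slot i holds the current address of object i. */
static void *toy_object_table[TOY_MAX_OBJECTS];

/* Register an object and return a handle for it. The handle encodes
   slot + 1, so a NULL handle never names a valid object. */
HANDLE toy_open(void *object)
{
    if (object == NULL)
        return NULL;
    for (size_t i = 0; i < TOY_MAX_OBJECTS; ++i) {
        if (toy_object_table[i] == NULL) {
            toy_object_table[i] = object;
            return (HANDLE)(uintptr_t)(i + 1);
        }
    }
    return NULL;   /* table full */
}

/* Look up the object a handle refers to, or NULL if the handle is invalid. */
void *toy_lookup(HANDLE h)
{
    uintptr_t slot = (uintptr_t)h;
    if (slot == 0 || slot > TOY_MAX_OBJECTS)
        return NULL;
    return toy_object_table[slot - 1];
}

/* Move an object: only the table entry changes, the handle stays the same. */
void toy_move(HANDLE h, void *new_address)
{
    uintptr_t slot = (uintptr_t)h;
    if (new_address != NULL && slot != 0 && slot <= TOY_MAX_OBJECTS
            && toy_object_table[slot - 1] != NULL)
        toy_object_table[slot - 1] = new_address;
}
```

The handle in this sketch is never dereferenced, which is why a program should treat a HANDLE as an opaque value even though its type is a pointer.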

Below are a few special handles that are worth discussing:

HWND

HWND data types are "Handles to a Window", and are used to keep track of the various objects that appear on the screen. To communicate with a particular window, you need to have a copy of the window's handle. HWND variables are usually prefixed with the letters "hwnd", just so the programmer knows they are important.

Canonically, main windows are defined as:

HWND hwnd;

Child windows are defined as:

HWND hwndChild1, hwndChild2...

and Dialog Box handles are defined as:

HWND hDlg;

Although you are free to name these variables whatever you want in your own program, readability and compatibility suffer when an idiosyncratic naming scheme is chosen - or worse, no scheme at all.

HINSTANCE

HINSTANCE variables are handles to a program instance. Each program gets a single instance variable, and this is important so that the kernel can communicate with the program. If you want to create a new window, for instance, you need to pass your program's HINSTANCE variable to the kernel, so that the kernel knows what program instance the new window belongs to. If you want to communicate with another program, it is frequently very useful to have a copy of that program's instance handle. HINSTANCE variables are usually prefixed with an "h", and furthermore, since there is frequently only one HINSTANCE variable in a program, it is canonical to declare that variable as such:

HINSTANCE hInstance;

It is usually a benefit to make this HINSTANCE variable a global value, so that all your functions can access it when needed.

HMENU

If your program has a drop-down menu available (as most visual Windows programs do), that menu will have an HMENU handle associated with it. To display the menu, or to alter its contents, you need to have access to this HMENU handle. HMENU handles are frequently prefixed with simply an "h".

WPARAM, LPARAM

In the earlier days of Microsoft Windows, parameters were passed to a window in one of two formats: WORD-length (16-bit) parameters, and LONG-length (32-bit) parameters. These parameter types were defined as being WPARAM (16-bit) and LPARAM (32-bit). However, on modern 32-bit and 64-bit systems, both WPARAM and LPARAM are able to hold a pointer, so they are 32 and 64 bits long, respectively. The names, however, have not changed, for legacy reasons.

WPARAM and LPARAM variables are generic function parameters, and are frequently type-cast to other data types including pointers and DWORDs.
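
A minimal sketch of this kind of type-casting follows. The typedefs are pointer-sized stand-ins (on Windows, WPARAM is a UINT_PTR and LPARAM is a LONG_PTR). The ToyPoint payload, the meaning given to wParam, and toy_handle_message are hypothetical.

```c
#include <stdint.h>

/* Stand-ins: both types are wide enough to hold a pointer. */
typedef uintptr_t WPARAM;
typedef intptr_t  LPARAM;

/* Hypothetical payload that a sender passes by pointer. */
typedef struct { int x; int y; } ToyPoint;

/* Hypothetical handler: wParam carries a small integer code,
   lParam carries a pointer that was cast to an integer by the sender.
   Returns the selected coordinate, or -1 if no payload was supplied. */
int toy_handle_message(WPARAM wParam, LPARAM lParam)
{
    const ToyPoint *pt = (const ToyPoint *)lParam;   /* cast back to a pointer */

    if (pt == 0)
        return -1;
    return (wParam == 1u) ? pt->x : pt->y;
}
```

The sender and the receiver must agree on what each parameter means for a given message, since the types themselves carry no such information.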