Part 4: Breakthrough UART Config – Debug Like Never Before

Previously in Part 3, we gave our STM32F429ZI microcontroller a heartbeat using SysTick – a timer that ticks every millisecond. Now that SysTick_Init() has completed and returned to main(), our microcontroller knows how to keep time. But it still can’t communicate with the outside world. It’s like having a computer with no monitor or keyboard – it can think, but it can’t tell you what it’s thinking.

Where We Are in the Code

The SysTick_Init() function has just finished, and we’re back in main.c. The program counter (the processor’s bookmark that tracks which instruction to execute next) is pointing to the next line after SysTick_Init(). That line is:

UART_Init(115200);

What is UART and Why Do We Need It?

Before diving into the code, let’s understand what we’re about to initialize:

UART stands for Universal Asynchronous Receiver-Transmitter. It’s a hardware block that implements asynchronous serial communication, a scheme that’s been around since the 1960s. Think of it as a telephone line for your microcontroller – it allows the chip to send messages to your computer and receive commands back.

Why 115200? This number is the baud rate – the speed of communication measured in bits per second. At 115200 baud, we can transmit 115,200 individual 1s and 0s every second. This is fast enough for real-time debugging but slow enough to be reliable over simple wires.

How UART sends data: When you want to send the letter ‘A’, UART doesn’t just send the 8 bits that represent ‘A’. It adds extra bits:

A start bit (always 0) to say “data is coming”
The 8 data bits for ‘A’ (01000001 in binary)
A stop bit (always 1) to say “data is done”

So sending one character actually requires 10 bits. This is why at 115200 baud, we can only send 11,520 characters per second (115200 ÷ 10).
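The framing described above can be modelled in a few lines of C. This is an illustrative sketch that runs on a PC, not code from the project; the names uart_frame_8n1 and chars_per_second are made up for this example.

```c
#include <stdint.h>

/* Build the 10-bit 8N1 frame for one byte (hypothetical helper).
 * Bit 0 of the result is the first bit on the wire: the start bit,
 * followed by the data bits LSB first, then the stop bit in bit 9. */
uint16_t uart_frame_8n1(uint8_t data)
{
    uint16_t frame = 0;                  /* bit 0 = start bit, always 0 */
    frame |= (uint16_t)(data << 1);      /* bits 1..8 = data, LSB first */
    frame |= (uint16_t)(1u << 9);        /* bit 9 = stop bit, always 1  */
    return frame;
}

/* Characters per second at a given baud rate, 10 bits per character. */
uint32_t chars_per_second(uint32_t baud)
{
    return baud / 10u;
}
```

Reading the result of uart_frame_8n1('A') from bit 0 upward gives the order in which the bits leave the pin.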

The Journey Begins: Jumping to UART_Init

When the processor encounters UART_Init(115200), several things happen in sequence:

It loads the value 115200 into register R0, which is where the ARM calling convention expects the first argument
The branch-with-link (BL) instruction saves the return address in the link register (LR), so the processor knows where to resume in main()
It jumps to the memory address where the UART_Init function begins

This jump takes us from main.c to uart.c, where UART_Init is defined:

void UART_Init(uint32_t baud)
{
    /* Enable clocks and configure GPIO pins as before */

We’re now inside the UART_Init function. The parameter ‘baud’ contains 115200. The first line is just a comment; the compiler discards it, so it produces no instructions at all.

Understanding the Hardware We’re About to Configure

Inside your STM32F429ZI, there are eight serial hardware blocks: four USARTs (USART1, USART2, USART3 and USART6) and four UARTs (UART4, UART5, UART7 and UART8). We’re going to use USART3. But why USART3 specifically? On the STM32F429ZI development board, USART3 is conveniently connected to pins that are accessible and commonly used for debugging.

USART3 will use two pins:

PD8 for transmitting (TX) – data flows OUT of the microcontroller
PD9 for receiving (RX) – data flows INTO the microcontroller

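But right now, the clocks that drive these components are switched off, so they can’t do anything. It’s like having a telephone that’s not plugged into the wall – strictly speaking we enable their clocks, but it’s convenient to think of this as giving them power.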

Step 1: Powering Up the Hardware

Enabling the GPIO Port

The first real instruction that executes is:

RCC->AHB1ENR |= RCC_AHB1ENR_GPIODEN;

Let’s decode this completely:

RCC stands for Reset and Clock Control. It’s the power management center of your microcontroller. Think of it as the main electrical panel in a building – it controls which rooms (peripherals) get electricity.

GPIOD stands for General Purpose Input/Output port D. The STM32F429 family defines GPIO ports A through K, each with up to 16 pins (the 144-pin ZI package brings out only some of them). GPIOD contains pins PD0 through PD15. Right now, all these pins are dead, because the port’s clock is off.

What this line does: It flips the switch to send power to GPIOD. Here’s how:

RCC->AHB1ENR accesses the AHB1 Enable Register (AHB1 is one of the chip’s internal buses)
RCC_AHB1ENR_GPIODEN is a constant that equals 0x00000008 (bit 3)
The |= operation sets bit 3 to 1, which enables the clock for GPIOD

After this line executes, GPIOD is clocked and its registers respond to reads and writes, so all 16 pins can be configured and used. We need this because two of these pins (PD8 and PD9) will be our UART pins.
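To see why |= leaves every other peripheral’s enable bit alone, here is a small PC-side model of the operation. The register is an ordinary variable, and enable_clock and GPIODEN_MASK are names invented for this sketch; on the real chip the equivalent is the single RCC line shown earlier.

```c
#include <stdint.h>

/* Bit position taken from the text: GPIODEN is bit 3 of AHB1ENR. */
#define GPIODEN_MASK (1u << 3)

/* Return the register value with one clock-enable bit set.
 * Stand-in for `RCC->AHB1ENR |= mask` on real hardware. */
uint32_t enable_clock(uint32_t enr_value, uint32_t mask)
{
    return enr_value | mask;   /* OR can only turn bits on, never off */
}
```

If bit 0 (GPIOA) was already enabled, the result keeps it enabled and simply adds bit 3.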

Enabling the UART Peripheral

Now we need to power up USART3 itself:

RCC->APB1ENR |= RCC_APB1ENR_USART3EN;

USART3 is a dedicated piece of hardware inside your microcontroller. It’s like a specialized chip within the chip, designed specifically to handle serial communication. Without power, it’s completely inactive.

What this line does:

RCC->APB1ENR accesses the APB1 Enable Register (APB1 is another internal bus, typically running at a lower speed than AHB1)
RCC_APB1ENR_USART3EN equals 0x00040000 (bit 18)
This enables the clock signal to USART3

Think of the clock signal as the heartbeat that makes digital circuits work. Every digital circuit needs a regular pulse (clock) to synchronize its operations. No clock = no operation.

After this line, USART3 is powered up and its internal registers can be accessed. But it’s not configured yet – it’s like a telephone that’s plugged in but hasn’t been programmed with a phone number.

Step 2: Configuring the GPIO Pins

Now we need to tell pins PD8 and PD9 that they’re no longer general-purpose pins – they have a special job to do.

Understanding Pin Modes

Each GPIO pin can operate in one of four modes:

Input mode (00): Read external signals (like reading a button press)
Output mode (01): Software controls the pin directly (like blinking an LED)
Alternate Function mode (10): Internal hardware controls the pin (like UART)
Analog mode (11): For analog-to-digital or digital-to-analog conversion

We need Alternate Function mode because we want USART3 hardware to control these pins automatically.

Setting the Pin Modes

GPIOD->MODER &= ~(GPIO_MODER_MODER8_0 | GPIO_MODER_MODER9_0);

The MODER register is 32 bits wide. Each pin gets 2 bits:

Pins PD0-PD7 use bits [15:0]
Pin PD8 uses bits [17:16]
Pin PD9 uses bits [19:18]
Pins PD10-PD15 use bits [31:20]

This first line clears bit 16 and bit 18 (the lower bits of PD8 and PD9’s mode fields). We’re preparing to set these pins to mode “10”.

GPIOD->MODER |= (GPIO_MODER_MODER8_1 | GPIO_MODER_MODER9_1);

This second line sets bit 17 and bit 19 (the upper bits of PD8 and PD9’s mode fields).

Combined result:

PD8: bits [17:16] = 10 (Alternate Function mode)
PD9: bits [19:18] = 10 (Alternate Function mode)

Why two operations? We can’t just write “10” directly because we don’t want to disturb other pins’ settings. The two-step process (clear then set) ensures we only modify the bits we care about.
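The clear-then-set pattern generalizes to any pin and any mode. The helper below is a sketch under that assumption, operating on a plain variable so it can run on a PC; set_pin_mode and the MODE_ names are hypothetical, not part of the project or of the ST headers. Unlike the two lines above, it clears both bits of the field before writing, which makes it work for all four modes.

```c
#include <stdint.h>

/* The four 2-bit mode encodings listed in the text. */
enum { MODE_INPUT = 0, MODE_OUTPUT = 1, MODE_ALTERNATE = 2, MODE_ANALOG = 3 };

/* Return a MODER value with one pin's 2-bit field replaced. */
uint32_t set_pin_mode(uint32_t moder, unsigned pin, uint32_t mode)
{
    unsigned shift = pin * 2u;           /* each pin owns 2 bits       */
    moder &= ~(0x3u << shift);           /* clear the pin's mode field */
    moder |= (mode & 0x3u) << shift;     /* write the new mode         */
    return moder;
}
```

Calling it for pins 8 and 9 with MODE_ALTERNATE produces the same bits [19:16] as the two register writes in the article.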

The Magic of Alternate Function Mode

When a pin is in Alternate Function mode, something special happens inside the chip. There’s essentially a switch (multiplexer) that can connect each pin to different internal hardware:

In Output mode: Your software controls the pin by writing to registers
In Alternate Function mode: Internal hardware (like USART3) controls the pin automatically

Here’s the key insight: In Alternate Function mode, when you write data to USART3, the USART3 hardware automatically toggles the pin voltage with microsecond precision. Your software doesn’t need to worry about timing individual bits.

Selecting Which Alternate Function

Each pin that supports Alternate Function mode can connect to up to 16 different internal peripherals (AF0 through AF15). We need to tell the chip which one to use.

GPIOD->AFR[1] &= ~(0xF << 0);  // Clear PD8's alternate function
GPIOD->AFR[1] &= ~(0xF << 4);  // Clear PD9's alternate function

The AFR (Alternate Function Register) array has two elements:

AFR[0] controls pins 0-7
AFR[1] controls pins 8-15

Each pin gets 4 bits (allowing selection of AF0-AF15). We first clear these bits to 0000.

GPIOD->AFR[1] |= (7 << 0);     // Set PD8 to AF7
GPIOD->AFR[1] |= (7 << 4);     // Set PD9 to AF7

Now we set both pins to alternate function 7 (AF7). On the STM32F429ZI:

PD8 + AF7 = USART3_TX (transmit)
PD9 + AF7 = USART3_RX (receive)

The pins are now internally wired to USART3. When USART3 wants to transmit a bit, it will appear on PD8. When a bit arrives on PD9, USART3 will capture it.
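The index and shift arithmetic behind those four lines can be written once for any pin. This is a PC-side sketch with an ordinary two-element array standing in for GPIOD->AFR; set_alternate_function is a name made up for illustration.

```c
#include <stdint.h>

/* Select alternate function `af` (0-15) for `pin` (0-15). */
void set_alternate_function(uint32_t afr[2], unsigned pin, uint32_t af)
{
    unsigned index = pin / 8u;           /* AFR[0]: pins 0-7, AFR[1]: pins 8-15 */
    unsigned shift = (pin % 8u) * 4u;    /* each pin owns 4 bits                */
    afr[index] &= ~(0xFu << shift);      /* clear the pin's 4-bit field         */
    afr[index] |= (af & 0xFu) << shift;  /* write the new function number       */
}
```

For pin 8 this gives index 1 and shift 0, and for pin 9 index 1 and shift 4, matching the constants used above.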

Step 3: Configuring USART3

Disabling USART3 for Safe Configuration

USART3->CR1 &= ~USART_CR1_UE;

Before configuring any UART settings, we must disable it. This is a safety requirement – like turning off a machine before adjusting it. CR1 is the Control Register 1, and UE means USART Enable.

The Critical Baud Rate Calculation

Now comes the most important configuration – setting the communication speed. UART communication is all about precise timing. Both the transmitter and receiver must agree on exactly how long each bit lasts.

uint32_t divider = SystemCoreClock_Get() / baud;

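This line calls SystemCoreClock_Get(), which returns 16000000 (16 MHz – the speed of our system clock). USART3 is actually clocked from the APB1 bus, so using the system clock here is only correct while APB1 runs at the same 16 MHz, which is the case as long as the APB1 prescaler is left at its reset value of 1. We divide this by our desired baud rate: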

16,000,000 ÷ 115,200 = 138.888…

But we can only use whole numbers, so integer division gives us 138.

What does this mean? To transmit bits at 115,200 per second using a 16 MHz clock, we need to count 138.888… clock cycles for each bit. Since we can’t count partial cycles, we need a way to represent this fractional value.

Encoding the Fractional Divider

The STM32 solves this by splitting the divider into two parts:

uint32_t mantissa = divider / 16;  // 138 / 16 = 8
uint32_t fraction = divider % 16;  // 138 % 16 = 10

This represents our divider as: 8 + 10/16 = 8.625

This isn’t exactly 8.6805… (which would be perfect), but it’s close enough. The actual baud rate will be:

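16,000,000 ÷ 138 = 115,942 baud (about 0.64% fast)

This small error is acceptable. In theory a 10-bit frame sampled mid-bit survives close to 5% total mismatch between the two ends, but 2 to 3% is a safer practical budget.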
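The error figure is easy to check for any clock and divider. The two helpers below are a sketch for doing that on a PC; their names are invented for this example, and the error is reported in hundredths of a percent to stay in integer arithmetic.

```c
#include <stdint.h>

/* Baud rate actually produced by an integer divider. */
uint32_t actual_baud(uint32_t clock_hz, uint32_t divider)
{
    return clock_hz / divider;
}

/* Signed error in hundredths of a percent (positive means too fast). */
int32_t baud_error_centipercent(uint32_t clock_hz, uint32_t divider,
                                uint32_t target_baud)
{
    int64_t diff = (int64_t)actual_baud(clock_hz, divider) - (int64_t)target_baud;
    return (int32_t)(diff * 10000 / (int64_t)target_baud);
}
```

With a 16 MHz clock, a divider of 138 and a target of 115200, the second function returns 64, that is 0.64%.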

Checking for Rounding

uint32_t remainder = (SystemCoreClock_Get() % baud) * 16 / baud;
if (remainder >= 8) {
    // Rounding code would go here
}

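This checks if we should round up our fraction for better accuracy. In our case, 16,000,000 % 115,200 = 102,400, and 102,400 × 16 ÷ 115,200 = 14 with integer division, which is at least 8, so the condition is true. Because the body shown here is only a placeholder, nothing changes and the divider stays at 138. Rounding up to 139 would give 16,000,000 ÷ 139 ≈ 115,108 baud, which is only about 0.08% slow.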
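The article leaves the rounding body out, so the following is not the project’s code, just one common way to get a rounded divider in a single step: add half the divisor before dividing. The function name rounded_divider is an assumption of this sketch.

```c
#include <stdint.h>

/* Divide clock by baud and round to the nearest integer rather than
 * truncating. Adding baud/2 first turns truncation into rounding. */
uint32_t rounded_divider(uint32_t clock_hz, uint32_t baud)
{
    return (clock_hz + baud / 2u) / baud;
}
```

For 16 MHz and 115200 baud this yields 139 instead of 138, since the exact quotient 138.888… is closer to 139.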

Setting the Baud Rate

USART3->BRR = (mantissa << 4) | fraction;

The BRR (Baud Rate Register) stores our calculated divider:

Mantissa (8) goes in bits [15:4]: 8 << 4 = 0x80
Fraction (10) goes in bits [3:0]: 0x0A
Combined: 0x8A (138 in decimal)

USART3 will now spend 138 clock cycles on each bit it transmits.
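One detail worth noticing: splitting the divider into a mantissa and a 4-bit fraction and then recombining them gives back the original number. The sketch below shows that, plus a decoder that recovers the fractional value in thousandths; both function names are hypothetical and the code runs on a PC.

```c
#include <stdint.h>

/* Pack a divider into the BRR layout: mantissa in [15:4], fraction in [3:0]. */
uint32_t brr_pack(uint32_t divider)
{
    uint32_t mantissa = divider / 16u;
    uint32_t fraction = divider % 16u;
    return (mantissa << 4) | fraction;
}

/* Recover mantissa + fraction/16, scaled by 1000 to avoid floating point. */
uint32_t brr_to_usartdiv_milli(uint32_t brr)
{
    uint32_t mantissa = brr >> 4;
    uint32_t fraction = brr & 0xFu;
    return mantissa * 1000u + fraction * 1000u / 16u;
}
```

brr_pack(138) is 0x8A, which is 138 again, and decoding it gives 8625, the 8.625 from the text.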

Configuring Data Format

USART3->CR1 &= ~(USART_CR1_M | USART_CR1_PCE | USART_CR1_PS);

This sets up the data format by clearing three bits:

M bit: Word length (0 = 8 data bits)
PCE bit: Parity Control Enable (0 = no parity)
PS bit: Parity Selection (doesn’t matter since parity is disabled)

USART3->CR2 &= ~USART_CR2_STOP;

This clears the STOP bits field, which sets 1 stop bit.

Our configuration is now “8N1”:

8 data bits
No parity
1 stop bit

This is the most common UART configuration and what most terminal programs expect.

Enabling the UART

USART3->CR1 |= (USART_CR1_TE | USART_CR1_RE);

This enables both:

TE (Transmitter Enable): Allows USART3 to send data
RE (Receiver Enable): Allows USART3 to receive data

USART3->CR1 |= USART_CR1_UE;

This is the moment USART3 comes alive! When this line executes:

The transmit pin (PD8) immediately goes HIGH (idle state)
The receive pin (PD9) starts monitoring for incoming data
The baud rate generator starts counting
USART3 is fully operational

Returning to main()

The closing brace marks the end of UART_Init(). The function epilogue executes (restoring registers and stack), and execution returns to main(). But USART3 continues running independently in the background, ready to send or receive data.

The First Message

Back in main(), the very next line is:

UART_SendString("\r\n=== Task Scheduler and Power Management Demo ===\r\n");

This sends our first message through the newly initialized UART. Let’s trace through how this works.

Inside UART_SendString

void UART_SendString(const char* str) {
    while (*str) {
        while (!(USART3->SR & USART_SR_TXE));
        USART3->DR = *str++;
    }
    while (!(USART3->SR & USART_SR_TC));
}

For each character in the string:

Check if ready to transmit: The inner while loop checks the TXE (Transmit data register Empty) flag in the Status Register (SR). This flag is 1 when USART3 is ready for new data.
Send the character: Writing to the Data Register (DR) hands the character to USART3 hardware.
Hardware takes over: USART3 automatically:
- Shifts the character into its transmit shift register
- Adds a start bit (0) at the beginning
- Sends each of the 8 data bits
- Adds a stop bit (1) at the end
- Maintains precise timing (8.68 microseconds per bit)
Move to next character: The process repeats for each character.
Wait for completion: After all characters are sent, we wait for the TC (Transmission Complete) flag to ensure the last character fully transmitted.

Timing Analysis

Each character requires 10 bits (start + 8 data + stop). At 115200 baud:

Time per bit: 1 ÷ 115200 = 8.68 microseconds
Time per character: 10 × 8.68 = 86.8 microseconds
Our 52-character message: 52 × 86.8 = 4.5 milliseconds

During this 4.5ms transmission, SysTick continues running in the background. It will fire 4-5 times, incrementing systick_counter. This parallel operation is crucial for system timing.
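The same arithmetic works for any message length and baud rate. Here is a small sketch, with a made-up function name, that computes the on-wire time in microseconds using integer math only.

```c
#include <stdint.h>

/* Bits on the wire per character in 8N1: start + 8 data + stop. */
#define BITS_PER_CHAR 10u

/* Time to transmit `num_chars` characters, in whole microseconds. */
uint32_t tx_time_us(uint32_t num_chars, uint32_t baud)
{
    uint64_t bits = (uint64_t)num_chars * BITS_PER_CHAR;
    return (uint32_t)(bits * 1000000u / baud);
}
```

For the 52-character banner at 115200 baud this gives 4513 µs, so four full 1 ms SysTick periods elapse while it is being sent.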

UART and SysTick Working Together

While UART handles communication, SysTick provides timing services:

Timeout Protection: UART functions can use systick_counter to implement timeouts, preventing infinite loops if hardware fails. (The simple UART_SendString() shown above polls without a timeout.)
Transmission Delays: Before entering low-power modes, we use SysTick_Delay() to ensure UART finishes transmitting:

UART_SendString("Entering sleep mode...\r\n");
SysTick_Delay(10);  // Wait 10ms for transmission to complete
// Now safe to sleep

Independent Operation: Both peripherals run independently:
- SysTick fires every 1ms regardless of what UART is doing
- UART transmits/receives at 115200 baud regardless of SysTick

The Complete Picture

After UART_Init() completes, your microcontroller has:

A heartbeat: SysTick providing 1ms timing reference
A voice: UART enabling communication at 115200 baud
Independence: Both peripherals running autonomously in hardware

The microcontroller can now:

Send debug messages to your computer
Receive commands from terminal programs
Report system status in real-time
Maintain precise timing while communicating

Deep Dive: What Happens During Data Transmission

Let’s trace exactly what happens at the hardware level when we send a single character ‘A’ (ASCII 65, binary 01000001):

The Transmission Timeline

When your code writes ‘A’ to USART3->DR:

Time 0.00μs: Character ‘A’ written to data register

USART3 immediately copies ‘A’ from DR to its internal shift register
TXE flag clears on the write, then sets again almost at once, because DR is empty as soon as the byte has moved into the shift register
TX pin (PD8) transitions from HIGH to LOW for the start bit

Time 8.68μs: First data bit (LSB)

Bit 0 of ‘A’ (which is 1) appears on PD8
Pin goes HIGH

Time 17.36μs: Second data bit

Bit 1 of ‘A’ (which is 0) appears on PD8
Pin goes LOW

Time 26.04μs through 69.44μs: Remaining data bits

Bits 2 through 7 of ‘A’ (0, 0, 0, 0, 1, 0) appear on PD8 in sequence
Pin stays LOW for bits 2 to 5, goes HIGH for bit 6, and returns LOW for bit 7

Time 78.12μs: Stop bit

PD8 goes HIGH (stop bit is always 1)
Software may already have queued the next character by now, since TXE has been set ever since the byte left DR

Time 86.80μs: Transmission complete

PD8 remains HIGH (idle state)
TC flag sets if no more data pending
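The timeline can be turned into a tiny waveform model: given a time since the start bit began, work out which bit period that falls in and what level the pin has. This is a PC-side sketch with a made-up function name, and it uses exact bit periods rather than the rounded 8.68 µs.

```c
#include <stdint.h>

/* Level on the TX pin t_ns nanoseconds after the start bit begins.
 * Returns 1 for HIGH and 0 for LOW. */
int tx_level_at(uint8_t data, uint64_t t_ns, uint32_t baud)
{
    uint64_t slot = t_ns * baud / 1000000000u;   /* index of the bit period */

    if (slot == 0) {
        return 0;                                /* start bit              */
    }
    if (slot <= 8) {
        return (data >> (slot - 1)) & 1;         /* data bits, LSB first   */
    }
    return 1;                                    /* stop bit, then idle    */
}
```

Sampling ‘A’ at 9 µs, 18 µs and 61 µs lands in the periods for bit 0, bit 1 and bit 6 respectively.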

The Reception Process

While transmission is straightforward, reception is more complex because the UART must detect when data starts arriving:

Idle state: RX pin (PD9) is HIGH, UART is continuously sampling
Start bit detection: When PD9 goes LOW, UART detects potential start bit
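Validation: UART keeps sampling through the start bit period (16 samples per bit with the default oversampling) to confirm it is a real start bit and not a glitch
Data collection: For each of the eight data bits, UART takes its reading from samples near the middle of the bit period, where the signal is most stable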
Stop bit check: UART verifies the stop bit is HIGH
Data ready: RXNE flag sets, received byte available in DR register
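The receive steps can be imitated in software. The sketch below is a simplified model, not a description of the exact USART3 circuitry: it assumes 16 samples per bit, takes a majority vote over three samples around the middle of each bit, and reports a framing error if the start or stop bit is wrong. All names are invented for this example.

```c
#include <stdint.h>
#include <stddef.h>

#define OVERSAMPLE 16u   /* samples per bit period */
#define FRAME_BITS 10u   /* start + 8 data + stop  */

/* Fill `samples` with an ideal, noise-free 8N1 waveform for `data`. */
void make_samples(uint8_t data, uint8_t samples[FRAME_BITS * OVERSAMPLE])
{
    uint16_t frame = (uint16_t)((data << 1) | (1u << 9));
    for (size_t i = 0; i < FRAME_BITS * OVERSAMPLE; i++) {
        samples[i] = (uint8_t)((frame >> (i / OVERSAMPLE)) & 1u);
    }
}

/* Majority vote over the three samples around the middle of one bit. */
static int vote_bit(const uint8_t *samples, size_t bit_index)
{
    const uint8_t *mid = samples + bit_index * OVERSAMPLE + OVERSAMPLE / 2u - 1u;
    return (mid[0] + mid[1] + mid[2]) >= 2;
}

/* Decode one frame. Returns the byte, or -1 on a framing error. */
int decode_8n1(const uint8_t *samples)
{
    if (vote_bit(samples, 0) != 0) {
        return -1;                        /* start bit must be LOW */
    }
    int value = 0;
    for (size_t bit = 0; bit < 8; bit++) {
        value |= vote_bit(samples, 1 + bit) << bit;   /* LSB arrives first */
    }
    if (vote_bit(samples, 9) != 1) {
        return -1;                        /* stop bit must be HIGH */
    }
    return value;
}
```

Because of the vote, a single corrupted sample in the middle of a bit does not change the decoded byte.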

Error Detection

UART includes several error detection mechanisms:

Framing Error: If the stop bit isn’t HIGH when expected, UART sets the FE flag. This usually means the baud rates don’t match.

Overrun Error: If new data arrives before you read the previous byte, UART sets the ORE flag and data is lost.

Noise Error: If the UART detects inconsistent samples within a bit period, it sets the NE flag.

Parity Error: If parity is enabled and doesn’t match, UART sets the PE flag (we disabled parity, so this won’t occur).
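A receive routine would typically look at these flags before trusting the data. The sketch below classifies a status register value on a PC. The bit masks repeat the positions the CMSIS header for the F4 uses for these flags, redefined locally with an SR_ prefix so the example compiles on its own; classify_rx and the enum are hypothetical names, and the priority order is just one reasonable choice.

```c
#include <stdint.h>

/* Status register flag positions, redefined locally for this sketch. */
#define SR_PE   (1u << 0)   /* parity error        */
#define SR_FE   (1u << 1)   /* framing error       */
#define SR_NE   (1u << 2)   /* noise error         */
#define SR_ORE  (1u << 3)   /* overrun error       */
#define SR_RXNE (1u << 5)   /* received data ready */

typedef enum {
    RX_IDLE, RX_DATA_READY, RX_PARITY, RX_NOISE, RX_FRAMING, RX_OVERRUN
} rx_status_t;

/* Report the most serious condition present in a status word. */
rx_status_t classify_rx(uint32_t sr)
{
    if (sr & SR_ORE)  return RX_OVERRUN;
    if (sr & SR_FE)   return RX_FRAMING;
    if (sr & SR_NE)   return RX_NOISE;
    if (sr & SR_PE)   return RX_PARITY;
    if (sr & SR_RXNE) return RX_DATA_READY;
    return RX_IDLE;
}
```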

The Bigger Picture: How UART Fits into the System

The Initialization Sequence So Far

Looking at our main() function, we can see a carefully orchestrated initialization sequence:

Interrupt system check: Verify the vector table is correctly set up
Timing foundation: Initialize SysTick for system timing
Communication channel: Initialize UART for debug output (we are here)
Power management: Coming next
Task scheduler: Coming after that

Each initialization builds on the previous ones. UART uses SysTick for timeouts. Power management will use UART to report status. The task scheduler will use both for its operation.

Memory Map Perspective

From a memory perspective, we’ve now configured three major regions:

RCC registers (0x40023800): Enabled clocks for GPIOD and USART3
GPIOD registers (0x40020C00): Configured PD8/PD9 as alternate function
USART3 registers (0x40004800): Set baud rate, format, and enabled

Each write to these memory-mapped registers directly controls hardware behavior. There’s no operating system or driver layer – we’re directly manipulating the silicon.
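The three base addresses are not arbitrary: each is a bus base plus a per-peripheral offset, and each register sits at a fixed offset inside its peripheral. The sketch below rebuilds them on a PC. The macro names are chosen for this example rather than copied from the ST headers, and the register offsets are the ones I understand the F4 reference manual to specify.

```c
#include <stdint.h>

/* Bus bases inside the peripheral region. */
#define PERIPH_BASE  0x40000000u
#define APB1_BASE    (PERIPH_BASE + 0x00000000u)
#define AHB1_BASE    (PERIPH_BASE + 0x00020000u)

/* Peripheral bases quoted in the text. */
#define USART3_BASE  (APB1_BASE + 0x4800u)
#define GPIOD_BASE   (AHB1_BASE + 0x0C00u)
#define RCC_BASE     (AHB1_BASE + 0x3800u)

/* Register offsets within each peripheral. */
#define RCC_AHB1ENR_OFFSET  0x30u
#define RCC_APB1ENR_OFFSET  0x40u
#define GPIO_MODER_OFFSET   0x00u
#define GPIO_AFRH_OFFSET    0x24u   /* AFR[1] */
#define USART_BRR_OFFSET    0x08u
#define USART_CR1_OFFSET    0x0Cu

/* Absolute address of a register. */
uint32_t reg_address(uint32_t base, uint32_t offset)
{
    return base + offset;
}
```

So a write to USART3->BRR is, underneath, a 32-bit store to address 0x40004808.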

Real-Time Constraints

UART introduces our first real-time constraint to the system:

At 115200 baud, we must read received data within 87μs to avoid overrun
During transmission, each character takes 87μs regardless of CPU speed
Unlike SysTick which we control, UART timing is fixed by the baud rate

This is why the timeout mechanisms using SysTick are so important – they prevent UART operations from blocking the entire system.

Debug Output Architecture

With UART initialized, we’ve established a debug output architecture:

Immediate Output Functions

UART_SendString("Direct message\r\n");  // Blocks until complete
UART_SendByte('X');                     // Sends single character

Future Enhancement Possibilities

The current implementation is synchronous (blocking). Future enhancements could include:

Interrupt-driven transmission (non-blocking)
DMA-based transmission (offload to hardware)
Circular buffers for buffered I/O
Printf-style formatting
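As a taste of the buffered approach, here is a minimal circular buffer sketch. It is not part of the current codebase; the type and function names are hypothetical, and the size is assumed to be a power of two so that a bit mask can replace the modulo.

```c
#include <stdint.h>
#include <stdbool.h>

#define TX_BUF_SIZE 64u   /* must be a power of two for the mask below */

typedef struct {
    uint8_t  data[TX_BUF_SIZE];
    uint32_t head;   /* total bytes written so far */
    uint32_t tail;   /* total bytes read so far    */
} ring_buffer_t;

/* Queue one byte. Returns false if the buffer is full. */
bool ring_put(ring_buffer_t *rb, uint8_t byte)
{
    if (rb->head - rb->tail == TX_BUF_SIZE) {
        return false;
    }
    rb->data[rb->head & (TX_BUF_SIZE - 1u)] = byte;
    rb->head++;
    return true;
}

/* Take the oldest byte. Returns false if the buffer is empty. */
bool ring_get(ring_buffer_t *rb, uint8_t *byte)
{
    if (rb->head == rb->tail) {
        return false;
    }
    *byte = rb->data[rb->tail & (TX_BUF_SIZE - 1u)];
    rb->tail++;
    return true;
}
```

In an interrupt-driven design, the main code would call ring_put and the transmit interrupt would call ring_get; the indices would then need to be declared volatile, which this single-threaded sketch leaves out.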

Integration with System Features

Throughout the codebase, you’ll see UART used for:

Task scheduler status reports
Power state transitions
Error reporting
Performance metrics
Command processing

Each subsystem can now report its status in real-time, making debugging and monitoring much easier.

The Path Forward

With UART operational, the microcontroller has found its voice. Every subsequent initialization and operation can now be monitored and debugged. When power management initializes next, it will report each state transition. When tasks start running, they’ll announce their execution. When errors occur, they’ll be logged.

This is the power of having UART initialized early in the boot sequence – it transforms a silent, black-box system into one that can tell you exactly what it’s doing at every step.

What’s Next

The next initialization will be Power Management, which adds sophisticated control over the microcontroller’s power consumption. It will use both SysTick (for wake-up timing) and UART (for status reporting) to create an energy-efficient system that can sleep when idle and wake when needed.

Next: Part 5 – Power Management Initialization →

This is Part 4 of the STM32 IoT Framework series.