Part 9: Your Ultimate Guide to Critical System Task Functions

In previous parts, we built the infrastructure: initialization (Parts 1-5), task scheduler (Part 6), main loop (Part 7), and command interface (Part 8). Now let’s examine the actual tasks that bring our system to life. These six tasks demonstrate different patterns and techniques used in embedded systems.

Task Overview

Our system runs six tasks with different periods and priorities:

Task | Function           | Period | Priority   | Purpose
-----|--------------------|--------|------------|------------------
0    | Task_LED1_Blink    | 500ms  | HIGH (0)   | Green LED toggle
1    | Task_LED2_Blink    | 1000ms | NORMAL (1) | Blue LED toggle
2    | Task_LED3_Blink    | 2000ms | NORMAL (1) | Red LED toggle
3    | Task_UARTStatus    | 3000ms | LOW (2)    | Status messages
4    | Task_SystemMonitor | 5000ms | LOW (2)    | System health
5    | Task_PowerSaving   | 5000ms | NORMAL (1) | Power management

Understanding Priority Values:

  • HIGH = 0 (executes first)
  • NORMAL = 1 (executes second)
  • LOW = 2 (executes last)

Lower numbers mean higher priority. When multiple tasks are ready:

  1. All priority 0 tasks execute first (in array order)
  2. Then all priority 1 tasks
  3. Finally all priority 2 tasks

Why These Specific Periods?

  • 500ms: Fast enough to see blinking, slow enough to count
  • 1000ms: Standard 1-second heartbeat, easy to verify
  • 2000ms: Creates visual pattern with other LEDs
  • 3000ms: Frequent enough for monitoring, not overwhelming
  • 5000ms: Balances information freshness with UART traffic

These periods are all multiples of 500ms, making the pattern predictable and debuggable.

Task 0: LED1 Green Blink (500ms)

void Task_LED1_Blink(void) {
    static uint32_t counter = 0;
    counter++;
    
    /* Toggle green LED on PB0 */
    GPIOB->ODR ^= (1 << 0);
    
    /* Every 10 executions, print status */
    if (counter % 10 == 0) {
        UART_SendString("[LED1] Green LED task running\r\n");
    }
}

What does ‘static’ mean and why use it?

static uint32_t counter = 0;  // Persists between calls

Without static:

  • Variable created each time function called
  • Initialized to 0 every time
  • Destroyed when function returns
  • Counter would always be 0→1

With static:

  • Variable created once at program start
  • Initialized only once
  • Keeps its value between calls
  • Counter goes 0→1→2→3→…
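The difference is easy to check on a PC. The sketch below is not part of the firmware; the two function names are made up for illustration and differ only in the storage class of the counter.

```c
#include <stdint.h>

/* Counter with static storage: initialized once, value survives between calls. */
uint32_t count_with_static(void) {
    static uint32_t counter = 0;
    counter++;
    return counter;
}

/* Counter with automatic storage: re-created and reset to 0 on every call. */
uint32_t count_without_static(void) {
    uint32_t counter = 0;
    counter++;
    return counter;
}
```

Calling each function three times returns 1, 2, 3 for the static version and 1, 1, 1 for the non-static one.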

What is the ^= operator? The ^= is the XOR-assign operator:

GPIOB->ODR ^= (1 << 0);  // Same as: GPIOB->ODR = GPIOB->ODR ^ (1 << 0);

XOR truth table:

A | B | A^B
--|---|----
0 | 0 | 0
0 | 1 | 1
1 | 0 | 1
1 | 1 | 0

For toggling: Current bit XOR 1 = Flipped bit

What is (1 << 0)? This is bit shifting:

1 << 0 = 0000 0001 (binary) = 1 (decimal)  - Bit 0
1 << 7 = 1000 0000 (binary) = 128 (decimal) - Bit 7
1 << 14 = 0100 0000 0000 0000 (binary) = 16384 - Bit 14

Why toggle instead of explicitly on/off? Toggle approach:

GPIOB->ODR ^= (1 << 0);  // One line, always alternates

Explicit approach would need:

static bool led_on = false;
if (led_on) {
    GPIOB->ODR &= ~(1 << 0);  // Clear bit (LED off)
    led_on = false;
} else {
    GPIOB->ODR |= (1 << 0);   // Set bit (LED on)
    led_on = true;
}

Toggle is simpler and guarantees alternation.

Why print every 10 executions?

  • Every execution: 2 messages/second, floods terminal
  • Every 10: One message per 5 seconds, readable
  • Also demonstrates modulo operator usage

Execution Timeline:

Time(ms) | Counter | LED State | UART Output
---------|---------|-----------|-------------
0        | 0→1     | OFF→ON    | None
500      | 1→2     | ON→OFF    | None
1000     | 2→3     | OFF→ON    | None
...      | ...     | ...       | ...
4500     | 9→10    | ON→OFF    | "[LED1] Green LED task running"
5000     | 10→11   | OFF→ON    | None

Static Variable Memory Layout:

Where do static variables live?

RAM Memory Sections:
+------------------+ 0x20000000 (RAM start)
|                  |
|    .data         | ← Static variables initialized to non-zero
|                  | ← (initial values copied from Flash at startup)
+------------------+
|    .bss          | ← Static variables initialized to zero
|  (counter = 0)   | ← Our counters are here (zeroed by startup code)
+------------------+
|    Heap          | ← Dynamic memory (malloc)
|       ↓          |
+------------------+
|       ↑          |
|    Stack         | ← Local variables, function calls
+------------------+ 0x20030000 (RAM end)

.bss section detail (example addresses and values after some run time):
Address    | Variable           | Value
-----------|-------------------|-------
0x20000100 | Task_LED1_counter | 10
0x20000104 | Task_LED2_counter | 5
0x20000108 | Task_LED3_counter | 2

Each static counter:

  • Gets permanent address
  • Exists for entire program
  • Not on stack (survives function return)

What is GPIOB->ODR? GPIO registers control pins:

GPIOB: Base address 0x40020400
├── MODER:  Mode register (input/output/alternate/analog)
├── OTYPER: Output type (push-pull/open-drain)
├── OSPEEDR: Speed register
├── PUPDR:  Pull-up/down register
├── IDR:    Input data register (read pins)
├── ODR:    Output data register (write pins) ← We use this
├── BSRR:   Bit set/reset register
└── ... other registers

ODR = Output Data Register:

  • 32-bit register, of which only the lower 16 bits are used
  • Each of bits 0–15 controls one pin of the port
  • Bit 0 → Pin PB0
  • Bit 7 → Pin PB7
  • Bit 14 → Pin PB14

GPIO Toggle Deep Dive:

GPIOB->ODR ^= (1 << 0);

Step by step for first toggle:

1. Read current ODR value:     0x00000000 (all pins low)
2. Create mask (1 << 0):       0x00000001 (bit 0 set)
3. XOR operation:              0x00000000 ^ 0x00000001 = 0x00000001
4. Write back to ODR:          0x00000001 (PB0 high, LED on)

Next toggle:
1. Read current ODR value:     0x00000001 (PB0 high)
2. Create mask (1 << 0):       0x00000001 (bit 0 set)
3. XOR operation:              0x00000001 ^ 0x00000001 = 0x00000000
4. Write back to ODR:          0x00000000 (PB0 low, LED off)
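The XOR on ODR is a read-modify-write sequence. The register map above also lists BSRR, which on STM32F4 parts sets a pin when a 1 is written to bits 0..15 and resets it when a 1 is written to bits 16..31, leaving all other pins untouched. The following host-side sketch (function names are hypothetical, and the "register" is a plain variable) shows how a toggle could be expressed as a BSRR write:

```c
#include <stdint.h>

/* Value to write to BSRR so that `pin` ends up in the opposite state.
 * Bits 0..15 of BSRR set a pin, bits 16..31 reset it. */
uint32_t bsrr_toggle_word(uint32_t odr, unsigned pin) {
    uint32_t mask = (uint32_t)1 << pin;
    return (odr & mask) ? (mask << 16)  /* pin is high: request reset */
                        : mask;         /* pin is low: request set    */
}

/* Host-side model of what the hardware does with a BSRR write.
 * Resets are applied first so that a set request wins if both are given. */
uint32_t apply_bsrr(uint32_t odr, uint32_t bsrr) {
    odr &= ~(bsrr >> 16);        /* reset requests */
    odr |= (bsrr & 0xFFFFu);     /* set requests   */
    return odr & 0xFFFFu;
}
```

On the target this would presumably be used as `GPIOB->BSRR = bsrr_toggle_word(GPIOB->ODR, 0);`. In our cooperative system the plain XOR is fine; the BSRR form only matters once an interrupt handler drives other pins on the same port.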

UART Message Timing Analysis: Message: “[LED1] Green LED task running\r\n” (31 characters, counting \r and \n as one character each)

At 115200 baud:

  • 1 bit time = 1/115200 = 8.68μs
  • 1 character = 10 bits (1 start + 8 data + 1 stop)
  • 1 character time = 10 × 8.68μs = 86.8μs
  • 31 characters = 31 × 86.8μs = 2.69ms

This blocks other tasks for ~2.7ms, causing jitter.
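Counting characters by hand is error-prone, so it can help to let the compiler do it. This is a minimal host-side sketch (helper names are illustrative, 8N1 framing is assumed) that derives the blocking time from the string itself:

```c
#include <stdint.h>
#include <string.h>

/* Time in microseconds to shift out `chars` characters at `baud`,
 * assuming 8N1 framing (1 start + 8 data + 1 stop = 10 bits each).
 * The result is truncated to whole microseconds. */
uint32_t uart_tx_time_us(size_t chars, uint32_t baud) {
    uint64_t bits = (uint64_t)chars * 10u;
    return (uint32_t)(bits * 1000000u / baud);
}

/* Length of the LED1 status line; "\r\n" counts as two characters. */
size_t led1_message_length(void) {
    return strlen("[LED1] Green LED task running\r\n");
}
```

For the LED1 line this gives 31 characters and 2690μs at 115200 baud.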

Task 1: LED2 Blue Blink (1000ms)

void Task_LED2_Blink(void) {
    static uint32_t counter = 0;
    counter++;
    
    /* Toggle blue LED on PB7 */
    GPIOB->ODR ^= (1 << 7);
    
    /* Every 10 executions, print status */
    if (counter % 10 == 0) {
        UART_SendString("[LED2] Blue LED task running\r\n");
    }
}

Why use different GPIO pins (PB0, PB7, PB14)?

Physical reasons:

STM32F429 Discovery Board LED Layout:
     [USB]
+----------------+
|                |
| LED3 (Red)     | ← PB14
|                |
|   [LCD Screen] |
|                |
| LED2 (Blue)    | ← PB7
|                |
| LED1 (Green)   | ← PB0
+----------------+

Electrical reasons:

  • Each LED needs ~20mA current
  • GPIO pins spread load across port
  • Reduces electrical interference
  • Better heat dissipation

Software reasons:

  • Easy to identify in code (0, 7, 14)
  • No adjacent bits (prevents accidents)
  • Clear bit patterns in debugger

Understanding (1 << 7):

Bit position:  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
               ------------------------------------------------
1 << 0:         0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  1
1 << 7:         0  0  0  0  0  0  0  0  1  0  0  0  0  0  0  0
1 << 14:        0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Decimal values:
1 << 0  = 1
1 << 7  = 128
1 << 14 = 16384

Bit Manipulation Without Affecting Other Pins:

Initial state (multiple LEDs on):

GPIOB->ODR = 0x00004001  (bits 14 and 0 set)
Binary:      0100 0000 0000 0001
              │                │
              │                └─ Bit 0 (Green ON)
              └─ Bit 14 (Red ON)

Toggle bit 7 (Blue LED):

Current:     0100 0000 0000 0001  (0x4001)
XOR mask:    0000 0000 1000 0000  (0x0080)
             --------------------
Result:      0100 0000 1000 0001  (0x4081)
                       │
                       └─ Bit 7 toggled (Blue ON)

Other bits remain unchanged! This is why XOR is perfect for toggling single bits.

What if we used OR or AND instead?

// Using OR (always turns ON):
GPIOB->ODR |= (1 << 7);   // Can only set bit to 1

// Using AND (always turns OFF):
GPIOB->ODR &= ~(1 << 7);  // Can only clear bit to 0

// Using XOR (toggles):
GPIOB->ODR ^= (1 << 7);   // Flips current state
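The three operations can be tried out on an ordinary variable before touching hardware. The helpers below are illustrative only (they are not part of the project) and take the register value as a parameter instead of writing to GPIOB->ODR:

```c
#include <stdint.h>

/* OR with the mask: the bit becomes 1, whatever it was before. */
uint32_t bit_set(uint32_t reg, unsigned bit)    { return reg |  ((uint32_t)1 << bit); }

/* AND with the inverted mask: the bit becomes 0, whatever it was before. */
uint32_t bit_clear(uint32_t reg, unsigned bit)  { return reg & ~((uint32_t)1 << bit); }

/* XOR with the mask: the bit flips, all other bits keep their value. */
uint32_t bit_toggle(uint32_t reg, unsigned bit) { return reg ^  ((uint32_t)1 << bit); }
```

Starting from 0x4001, `bit_toggle(reg, 7)` gives 0x4081 and a second toggle returns to 0x4001, while `bit_set` and `bit_clear` always end in the same state no matter how often they are applied.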

Task Execution at 1000ms Intervals: This task runs half as often as Task 0:

  • Green LED: toggles twice per second (500ms period), so one full on/off cycle takes 1 second
  • Blue LED: toggles once per second (1000ms period), so one full on/off cycle takes 2 seconds
  • The different rates make the two LEDs easy to tell apart

Task 2: LED3 Red Blink (2000ms)

void Task_LED3_Blink(void) {
    static uint32_t counter = 0;
    counter++;
    
    /* Toggle red LED on PB14 */
    GPIOB->ODR ^= (1 << 14);
    
    /* Every 10 executions, print status */
    if (counter % 10 == 0) {
        UART_SendString("[LED3] Red LED task running\r\n");
    }
}

Visual Pattern Creation – Complete Cycle:

The three LEDs create a repeating pattern. Each task toggles its LED once per period, so an LED needs two periods to return to its starting state. Here’s the complete 4-second cycle, assuming all three tasks first run at 0ms:

Time   | Green (500ms) | Blue (1000ms) | Red (2000ms) | Visual Effect
-------|---------------|---------------|--------------|---------------
0ms    | ON  ●         | ON  ●         | ON  ●        | All ON (white)
500ms  | OFF ○         | ON  ●         | ON  ●        | Blue+Red (magenta)
1000ms | ON  ●         | OFF ○         | ON  ●        | Green+Red (yellow)
1500ms | OFF ○         | OFF ○         | ON  ●        | Red only
2000ms | ON  ●         | ON  ●         | OFF ○        | Green+Blue (cyan)
2500ms | OFF ○         | ON  ●         | OFF ○        | Blue only
3000ms | ON  ●         | OFF ○         | OFF ○        | Green only
3500ms | OFF ○         | OFF ○         | OFF ○        | All OFF
4000ms | ON  ●         | ON  ●         | ON  ●        | (Pattern repeats)

Why is this pattern useful?

  1. Visual Debugging: Can see if scheduler is working correctly
  2. Timing Verification: Each LED frequency is clearly visible
  3. No Synchronization: LEDs drift apart if timing is wrong
  4. Load Testing: All combinations of tasks running

How long until pattern repeats exactly? Start with the Least Common Multiple (LCM) of the periods:

  • LCM(500, 1000, 2000) = 2000ms: every 2 seconds all three tasks run in the same tick again
  • Each run is a toggle, so the LED states repeat after two such intervals: every 4 seconds
  • 8 distinct states in 4 seconds, each lasting 500ms
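The repeat interval can be double-checked with a small simulation. The sketch below runs on a PC, not on the board, and assumes that every LED starts OFF and that each task toggles its LED at t = 0 and then once per period; the function names are made up for this example.

```c
#include <stdint.h>

#define LED_GREEN 0x1u  /* toggled every 500 ms  */
#define LED_BLUE  0x2u  /* toggled every 1000 ms */
#define LED_RED   0x4u  /* toggled every 2000 ms */

/* An LED that starts OFF and is toggled at t = 0, period, 2*period, ...
 * is ON after an odd number of toggles. */
static int led_is_on(uint32_t t_ms, uint32_t period_ms) {
    uint32_t toggles = t_ms / period_ms + 1;
    return (toggles % 2u) == 1u;
}

/* Bitmask of the LEDs that are lit at time t_ms. */
unsigned led_state_at(uint32_t t_ms) {
    unsigned state = 0;
    if (led_is_on(t_ms, 500))  state |= LED_GREEN;
    if (led_is_on(t_ms, 1000)) state |= LED_BLUE;
    if (led_is_on(t_ms, 2000)) state |= LED_RED;
    return state;
}

/* Number of different LED combinations seen in [0, span_ms),
 * sampled once per 500 ms step. */
unsigned distinct_states(uint32_t span_ms) {
    unsigned seen = 0, count = 0;
    for (uint32_t t = 0; t < span_ms; t += 500) {
        unsigned bit = 1u << led_state_at(t);
        if (!(seen & bit)) { seen |= bit; count++; }
    }
    return count;
}
```

Under these assumptions the simulation sees 4 different combinations in the first 2000ms, all 8 within 4000ms, and the same state at t and t + 4000ms.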

Bit position for PB14:

Bit position: 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
1 << 14:       0  1  0  0  0  0  0  0  0  0  0  0  0  0  0  0
Hex value:     0x4000
Decimal:       16384

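Why bit 14 specifically? Because that is where the board wires the red LED:

  • PB14 connects to red LED through current-limiting resistor
  • Hardware designers chose this pin for PCB layout reasons
  • We must match hardware design in software
  • PB0/PB7/PB14 is the user-LED mapping of ST’s NUCLEO-F429ZI board; the STM32F429 Discovery kit wires its two user LEDs to PG13 and PG14 instead, so check your board’s schematic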

Counter and Message Timing:

Execution | Time(s) | Counter | Message?
----------|---------|---------|----------
1         | 2       | 1       | No
2         | 4       | 2       | No
...       | ...     | ...     | ...
10        | 20      | 10      | Yes - prints message
11        | 22      | 11      | No

Message appears every 20 seconds (10 × 2s period).

Task 3: UART Status Messages (3000ms)

void Task_UARTStatus(void) {
    static uint32_t counter = 0;
    char buffer[100];
    
    counter++;
    
    /* Get current system time in seconds */
    uint32_t seconds = systick_counter / 1000;
    uint32_t minutes = seconds / 60;
    seconds = seconds % 60;
    
    /* Format status message */
    snprintf(buffer, sizeof(buffer), 
             "\r\n[STATUS] Uptime: %02lu:%02lu | Task runs: %lu\r\n",
             minutes, seconds, counter);
    
    UART_SendString(buffer);
}

Why use a local buffer instead of direct UART_SendString?

// Without buffer (multiple UART calls):
UART_SendString("\r\n[STATUS] Uptime: ");
UART_SendNumber(minutes);  // Would need to implement
UART_SendString(":");
UART_SendNumber(seconds);  // Would need to implement
// ... more calls

// With buffer (single UART call):
snprintf(buffer, sizeof(buffer), format, ...);  // Build complete message
UART_SendString(buffer);                        // Send once

Benefits of buffer approach:

  • Single UART transmission (less overhead)
  • Atomic operation (no interleaving)
  • Standard C formatting with snprintf
  • Easier to calculate message length

Understanding Time Calculations:

Step-by-step example with systick_counter = 185000:

1. Total milliseconds:     185000
2. Total seconds:          185000 ÷ 1000 = 185
3. Minutes:                185 ÷ 60 = 3 (integer division)
4. Remaining seconds:      185 % 60 = 5 (modulo operation)
5. Display:                "03:05"

What is the % (modulo) operator?

185 ÷ 60 = 3 remainder 5
185 % 60 = 5 (the remainder)

More examples:
59 % 60 = 59   (less than divisor)
60 % 60 = 0    (exact multiple)
61 % 60 = 1    (one over)
125 % 60 = 5   (two minutes + 5 seconds)
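The same division and modulo steps can be wrapped in a small helper and tried on a PC. This is a sketch rather than the project's code: the function name is invented, and the casts to `unsigned long` are there only so that `%lu` is also correct on hosts where `uint32_t` is not `unsigned long`.

```c
#include <stdint.h>
#include <stdio.h>

/* Split a millisecond tick count into minutes and seconds and format it
 * as "MM:SS". Returns the value returned by snprintf. */
int format_uptime(uint32_t ticks_ms, char *out, size_t out_size) {
    uint32_t total_seconds = ticks_ms / 1000u;
    uint32_t minutes = total_seconds / 60u;   /* integer division drops the remainder */
    uint32_t seconds = total_seconds % 60u;   /* modulo keeps only the remainder      */
    return snprintf(out, out_size, "%02lu:%02lu",
                    (unsigned long)minutes, (unsigned long)seconds);
}
```

With `ticks_ms = 185000` the buffer receives "03:05", matching the step-by-step example.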

Understanding snprintf Format Specifiers:

snprintf(buffer, sizeof(buffer), 
         "\r\n[STATUS] Uptime: %02lu:%02lu | Task runs: %lu\r\n",
         minutes, seconds, counter);

Breaking down %02lu:

  • % – Start of format specifier
  • 0 – Pad with zeros (not spaces)
  • 2 – Minimum width of 2 characters
  • l – long integer (32 bits on this ARM target)
  • u – Unsigned decimal

Examples:

Value | %lu  | %02lu
------|------|-------
5     | "5"  | "05"
15    | "15" | "15"
125   | "125"| "125"

Buffer Size Calculation:

Message template: "\r\n[STATUS] Uptime: XX:XX | Task runs: XXXXXXXXXX\r\n"

Fixed text:        35 characters (including both \r\n pairs)
Time (XX:XX):       5 characters (more once uptime exceeds 99 minutes)
Counter (max 10):  10 characters
Terminating NUL:    1 character
Total maximum:     51 characters
Buffer size:      100 characters (49 extra for safety)

Why 100 bytes when we only need 51?

  • Future-proofing (might add more info)
  • Round number (easier to remember)
  • Stack has plenty of space (128KB)
  • Safety margin for modifications
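Instead of counting by hand, the required length can be measured. In C99, `snprintf` called with a NULL buffer and size 0 writes nothing and returns the number of characters the output would need (excluding the terminating NUL). The wrapper below is a hypothetical helper for experimenting on a PC:

```c
#include <stdio.h>

/* Number of characters (excluding the terminating NUL) that the status
 * line needs for the given values. Nothing is written anywhere. */
int status_line_length(unsigned long minutes, unsigned long seconds,
                       unsigned long runs) {
    return snprintf(NULL, 0,
                    "\r\n[STATUS] Uptime: %02lu:%02lu | Task runs: %lu\r\n",
                    minutes, seconds, runs);
}
```

For 03:45 and 75 runs this reports 42 characters. With a two-digit minute field and a ten-digit counter it reports 50, and with a five-digit minute value it reports 53, so the 100-byte buffer has room to spare in every case.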

Stack Frame When Task Executes:

Higher addresses
+------------------+
| Return address   | 4 bytes
+------------------+
| counter          | 0 bytes (static, in .bss)
+------------------+
| buffer[99]       |
| buffer[98]       |
| ...              | 100 bytes
| buffer[1]        |
| buffer[0]        |
+------------------+
| seconds          | 4 bytes
+------------------+
| minutes          | 4 bytes
+------------------+ ← Stack pointer
Lower addresses

Total stack usage: 112 bytes

UART Transmission Timing: Typical message: “\r\n[STATUS] Uptime: 03:45 | Task runs: 75\r\n” (42 chars)

  • Transmission time: 42 × 86.8μs = 3.65ms
  • During this time, no other tasks can execute
  • This is why it has LOW priority

Task 4: System Monitor (5000ms)

void Task_SystemMonitor(void) {
    static uint32_t execCount = 0;
    static uint32_t lastFreeHeap = 0;
    uint32_t currentFreeHeap;
    char buffer[200];
    
    execCount++;
    
    UART_SendString("\r\n========= System Monitor =========\r\n");
    
    /* Basic system info */
    snprintf(buffer, sizeof(buffer),
             "Execution count: %lu\r\n"
             "System clock: %lu MHz\r\n"
             "Flash size: 2048 KB\r\n"
             "RAM size: 256 KB\r\n",
             execCount,
             SystemCoreClock / 1000000);
    UART_SendString(buffer);
    
    /* Task scheduler summary */
    uint8_t ready = 0, suspended = 0, stopped = 0;
    
    for (uint8_t i = 0; i < MAX_TASKS; i++) {
        switch(TaskScheduler_GetTaskState(i)) {
            case TASK_STATE_READY:
            case TASK_STATE_RUNNING:
                ready++;
                break;
            case TASK_STATE_SUSPENDED:
                suspended++;
                break;
            case TASK_STATE_STOPPED:
                stopped++;
                break;
            default:
                /* Empty slot or unknown state: not counted */
                break;
        }
    }
    
    snprintf(buffer, sizeof(buffer),
             "Tasks - Ready: %u, Suspended: %u, Stopped: %u\r\n",
             ready, suspended, stopped);
    UART_SendString(buffer);
    
    UART_SendString("==================================\r\n");
}

Why two static variables?

static uint32_t execCount = 0;     // Counts task executions
static uint32_t lastFreeHeap = 0;  // For future heap monitoring

  • Both persist between calls
  • lastFreeHeap (and the local currentFreeHeap) are placeholders for a planned heap tracking feature; until it is implemented the compiler may warn that they are unused
  • Shows planning for future enhancements

SystemCoreClock Division:

SystemCoreClock / 1000000  // Convert Hz to MHz

Example: 16000000 / 1000000 = 16 MHz

Why divide by 1000000?

  • 1 Hz = 1 cycle/second
  • 1 MHz = 1,000,000 cycles/second
  • 16,000,000 Hz = 16 MHz

Understanding the Task State Counting:

for (uint8_t i = 0; i < MAX_TASKS; i++) {
    switch(TaskScheduler_GetTaskState(i)) {

This loop iterates through all task slots:

  • i = 0: Check task 0 state
  • i = 1: Check task 1 state
  • … up to i = MAX_TASKS − 1 (7, since MAX_TASKS is 8)

How the switch statement counts:

switch(TaskScheduler_GetTaskState(i)) {
    case TASK_STATE_READY:
    case TASK_STATE_RUNNING:
        ready++;
        break;

The fall-through pattern:

  • If state is READY: increment ready, then break
  • If state is RUNNING: increment ready, then break
  • Both states count as “ready” (active)

Example State Counting:

Current system state:
Task 0: READY      → ready++ (ready = 1)
Task 1: READY      → ready++ (ready = 2)
Task 2: SUSPENDED  → suspended++ (suspended = 1)
Task 3: READY      → ready++ (ready = 3)
Task 4: RUNNING    → ready++ (ready = 4)
Task 5: STOPPED    → stopped++ (stopped = 1)
Task 6: <empty>    → no state, skip
Task 7: <empty>    → no state, skip

Final: "Tasks - Ready: 4, Suspended: 1, Stopped: 1"
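The counting logic can be exercised on a PC with stand-in types. Everything in this sketch is an assumption made for illustration: the enum values, the `TASK_STATE_EMPTY` marker for unused slots, and the helper names are not taken from the real scheduler.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the scheduler's types. */
typedef enum {
    TASK_STATE_EMPTY,      /* assumed marker for an unused slot */
    TASK_STATE_READY,
    TASK_STATE_RUNNING,
    TASK_STATE_SUSPENDED,
    TASK_STATE_STOPPED
} TaskState_t;

typedef struct {
    uint8_t ready, suspended, stopped;
} TaskSummary;

/* Same switch as in Task_SystemMonitor, applied to an array of states. */
TaskSummary summarize_tasks(const TaskState_t *states, uint8_t count) {
    TaskSummary s = {0, 0, 0};
    for (uint8_t i = 0; i < count; i++) {
        switch (states[i]) {
            case TASK_STATE_READY:
            case TASK_STATE_RUNNING:   s.ready++;     break;
            case TASK_STATE_SUSPENDED: s.suspended++; break;
            case TASK_STATE_STOPPED:   s.stopped++;   break;
            default:                                  break; /* empty slot */
        }
    }
    return s;
}

/* The eight slots from the example above. */
TaskSummary example_summary(void) {
    const TaskState_t states[8] = {
        TASK_STATE_READY, TASK_STATE_READY, TASK_STATE_SUSPENDED,
        TASK_STATE_READY, TASK_STATE_RUNNING, TASK_STATE_STOPPED,
        TASK_STATE_EMPTY, TASK_STATE_EMPTY
    };
    return summarize_tasks(states, 8);
}
```

`example_summary()` yields ready = 4, suspended = 1, stopped = 1, the same numbers as the message shown above.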

Why count RUNNING as ready?

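  • RUNNING is temporary (microseconds for the LED tasks, milliseconds for the UART tasks)
  • Only during actual execution, so the one RUNNING task the monitor sees is normally itself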
  • By the time user sees message, task is READY again
  • Users care about “active” vs “inactive”

String Building Pattern: The function builds output in chunks:

  1. Fixed header
  2. System info (one snprintf)
  3. Task summary (one snprintf)
  4. Fixed footer

This minimizes buffer usage and UART calls.

Total Transmission Time:

Header:         37 chars × 86.8μs = 3.2ms
System info:    ~80 chars × 86.8μs = 6.9ms
Task summary:   ~50 chars × 86.8μs = 4.3ms
Footer:         36 chars × 86.8μs = 3.1ms
Total:          ~17.5ms blocking time

This is the longest continuous UART blocking operation in our system. Only the Stop mode entry in Task 5 holds the CPU for longer.

Task 5: Power Saving (5000ms)

void Task_PowerSaving(void) {
    static uint32_t counter = 0;
    static bool lowPowerEnabled = false;
    
    counter++;
    
    /* Every 10 executions (50 seconds), enter stop mode */
    if (counter % 10 == 0) {
        UART_SendString("\r\n[POWER] Entering Stop mode for 2 seconds...\r\n");
        UART_SendString("(RTC will wake us up)\r\n");
        
        /* Ensure UART transmission completes */
        SysTick_Delay(20);
        
        /* Set RTC alarm for 2 seconds */
        uint32_t current_rtc = RTC_GetCounter();
        RTC_SetAlarm(current_rtc + 2);
        
        /* Enable low power mode */
        lowPowerEnabled = true;
        
        /* Enter Stop mode */
        PowerMgmt_EnterLowPowerMode(POWER_MODE_STOP);
        
        /* We're back from Stop mode */
        lowPowerEnabled = false;
        
        UART_SendString("[POWER] Woke up from Stop mode\r\n");
    }
}

Understanding the Timing:

Task period: 5000ms (5 seconds)
Special action: Every 10 executions
Total: 10 × 5 = 50 seconds between stop modes

Timeline:
5s:   Task runs (counter = 1)
10s:  Task runs (counter = 2)
...
45s:  Task runs (counter = 9)
50s:  Task runs (counter = 10) → ENTERS STOP MODE
52s:  Wakes up from stop mode
57s:  Task runs (counter = 11)
...

What is the RTC? RTC = Real Time Clock

  • Separate 32.768 kHz oscillator
  • Runs during Stop mode (main clock stopped)
  • Uses very little power (~1μA)
  • Can wake system with alarm

RTC Counter vs SysTick Counter:

SysTick Counter:          RTC Counter:
- Counts milliseconds     - Counts seconds
- Stops in Stop mode      - Continues in Stop mode
- 16MHz source           - 32.768kHz source
- General timing         - Low power timing

Setting the RTC Alarm:

uint32_t current_rtc = RTC_GetCounter();  // Get current RTC seconds
RTC_SetAlarm(current_rtc + 2);           // Set alarm 2 seconds later

Example:

  • Current RTC: 1234 seconds since init
  • Alarm set: 1236 seconds
  • Wake occurs: When RTC reaches 1236

Why the 20ms delay?

SysTick_Delay(20);

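The two messages total about 70 characters:

  • Transmission time: 70 × 86.8μs ≈ 6.1ms, most of which has already passed when the blocking UART_SendString calls return
  • The STM32F4 USART has no transmit FIFO, but the last character can still be in the shift register if UART_SendString polls TXE rather than TC
  • 20ms is a generous margin that ensures everything is transmitted
  • Prevents a truncated or corrupted message when the UART clock stops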

What happens in Stop Mode?

Component     | Before Stop  | During Stop | After Wake
--------------|--------------|-------------|-------------
CPU           | Running      | Stopped     | Running
HSI (16MHz)   | On          | Off         | On (restarted)
SysTick       | Counting     | Stopped     | Resumes
UART Clock    | On          | Off         | On
GPIO States   | Maintained   | Maintained  | Maintained
RAM           | Powered      | Powered     | Preserved
RTC           | Running      | Running     | Running
Current       | ~15mA        | ~3μA        | ~15mA

Power Consumption Calculation:

50-second cycle:
- Normal operation: 48 seconds at 15mA
- Stop mode: 2 seconds at 0.003mA

Charge per cycle (proportional to energy at a fixed supply voltage):
Normal: 48s × 15mA = 720 mA·s
Stop:   2s × 0.003mA = 0.006 mA·s
Total:  720.006 mA·s

Average current: 720.006 / 50 = 14.4mA
Savings: (15 - 14.4) / 15 = 4%

Why only 4% savings?

  • Stop mode is only 2 seconds out of 50
  • Demo purposes (shows functionality)
  • Real applications might sleep 99% of time
  • Then savings would be dramatic
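The averaging above is a time-weighted mean, which makes it easy to try other duty cycles. A minimal sketch (hypothetical helper, plain floating-point arithmetic, meant for a PC rather than the target):

```c
/* Time-weighted average current in mA over one cycle made of an active
 * phase and a Stop-mode phase. */
double average_current_mA(double active_s, double active_mA,
                          double stop_s, double stop_mA) {
    double charge_mAs = active_s * active_mA + stop_s * stop_mA;
    return charge_mAs / (active_s + stop_s);
}
```

`average_current_mA(48, 15, 2, 0.003)` reproduces the 14.4mA figure. With the same currents but 1 second active and 99 seconds in Stop mode, the average drops to roughly 0.15mA, which illustrates the "dramatic" savings mentioned above.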

The bool variable lowPowerEnabled:

static bool lowPowerEnabled = false;

Currently unused but prepared for:

  • Disabling power save via command
  • Status reporting
  • Conditional behavior
  • Future enhancements

Impact on System Timing:

Enter Stop at systick_counter = 50000
2 seconds pass in real time
Wake at systick_counter = 50000 (unchanged!)

Result: System "loses" 2 seconds
- Tasks scheduled for 50500ms run late
- Time display becomes inaccurate
- Demonstrates Stop mode limitation

Task Interactions and Timing

Simultaneous Task Execution

When multiple tasks are ready at the same time, the scheduler must decide execution order.

Example: What happens at exactly 5000ms?

First, let’s see which tasks are due:

Task | Last Run | Period | Next Due | Due at 5000ms?
-----|----------|--------|----------|---------------
0    | 4500ms   | 500ms  | 5000ms   | YES ✓
1    | 4000ms   | 1000ms | 5000ms   | YES ✓
2    | 4000ms   | 2000ms | 6000ms   | NO ✗
3    | 3000ms   | 3000ms | 6000ms   | NO ✗
4    | 0ms      | 5000ms | 5000ms   | YES ✓
5    | 0ms      | 5000ms | 5000ms   | YES ✓

Four tasks are ready! Priority determines order:

Ready tasks sorted by priority:
1. Task 0 (LED1): Priority 0 (HIGH)     ← Executes first
2. Task 1 (LED2): Priority 1 (NORMAL)   ← Executes second
3. Task 5 (Power): Priority 1 (NORMAL)  ← Executes third
4. Task 4 (Monitor): Priority 2 (LOW)   ← Executes last

Why does Task 1 execute before Task 5? Both have priority 1, so the scheduler uses array order:

  • Task 1 is at index 1
  • Task 5 is at index 5
  • Lower index executes first
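The selection rule (lowest priority number first, array order as tie-breaker) fits in a few lines. The descriptor and function names below are assumptions for a host-side experiment and are not the scheduler's real data structures:

```c
#include <stdint.h>

/* Hypothetical task descriptor. */
typedef struct {
    uint32_t next_due_ms;  /* time at which the task becomes ready */
    uint8_t  priority;     /* 0 = HIGH, 1 = NORMAL, 2 = LOW */
} DemoTask;

/* Index of the ready task with the lowest priority number, or -1 if none.
 * The strict '<' keeps the earliest index among equal priorities. */
int pick_next_task(const DemoTask *tasks, int count, uint32_t now_ms) {
    int best = -1;
    for (int i = 0; i < count; i++) {
        if (tasks[i].next_due_ms > now_ms) continue;   /* not due yet */
        if (best < 0 || tasks[i].priority < tasks[best].priority) best = i;
    }
    return best;
}

/* Replays the 5000 ms example: writes the execution order into `order`
 * and returns how many tasks ran. */
int run_5000ms_example(int order[6]) {
    DemoTask tasks[6] = {
        {5000, 0}, {5000, 1}, {6000, 1}, {6000, 2}, {5000, 2}, {5000, 1}
    };
    const uint32_t periods[6] = {500, 1000, 2000, 3000, 5000, 5000};
    int ran = 0, next;
    while ((next = pick_next_task(tasks, 6, 5000)) >= 0) {
        order[ran++] = next;
        tasks[next].next_due_ms += periods[next];  /* reschedule */
    }
    return ran;
}
```

Replaying the table above, four tasks run, in the order 0, 1, 5, 4.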

Execution Timeline at 5000ms:

Time     | Event
---------|------------------------------------------
5000.0ms | Main loop iteration begins
5000.0ms | Dispatcher finds Task 0 ready (HIGH priority)
5000.0ms | Task 0 executes (~0.01ms)
5000.0ms | Dispatcher returns to main loop
5001.0ms | Next iteration, dispatcher finds Task 1
5001.0ms | Task 1 executes (~0.01ms)
5002.0ms | Next iteration, dispatcher finds Task 5
5002.0ms | Task 5 executes (~0.1ms, no stop mode)
5003.0ms | Next iteration, dispatcher finds Task 4
5003.0ms | Task 4 starts (long UART output)
5020.5ms | Task 4 completes
5021.5ms | Main loop continues normally

Total time to process all tasks: ~21ms

UART Contention

Multiple tasks use UART. Since we have cooperative scheduling (no preemption), messages never interleave.

Scenario: Multiple tasks want to print at 10000ms

Task 0 (LED1) wants to print (20th execution)
Task 1 (LED2) wants to print (10th execution)

Timeline:

10000.0ms: Task 0 (HIGH priority) executes first
10000.0ms: Starts sending "[LED1] Green LED task running\r\n"
10002.7ms: Completes transmission (31 chars × 86.8μs)
10002.7ms: Returns to scheduler
10003.7ms: Task 1 executes
10003.7ms: Starts sending "[LED2] Blue LED task running\r\n"  
10006.3ms: Completes transmission (30 chars × 86.8μs)
10006.3ms: Returns to scheduler

What prevents message corruption?

  1. Cooperative scheduling: Tasks run to completion
  2. Blocking UART: SendString waits for completion
  3. No task preemption: interrupts such as SysTick still fire during transmission, but no interrupt handler writes to the UART

In a preemptive system, you’d need:

  • Mutex/semaphore for UART access
  • Message queuing
  • Or risk corrupted output

Stack Usage Analysis

How much stack does our deepest call use?

Call chain for System Monitor (worst case):

main()                          8 bytes (locals)
└── while(1)                    0 bytes
    └── TaskScheduler_RunDispatcher()  16 bytes
        └── task->taskFunc()           4 bytes (pointer)
            └── Task_SystemMonitor()   220 bytes
                ├── snprintf()         48 bytes
                └── UART_SendString()  12 bytes

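Total: 308 bytes (a conservative bound: snprintf and UART_SendString are called one after the other, so their frames never exist at the same time and the true peak is 296 bytes)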

Stack memory visualization during deepest call:

High Address (0x20030000)
+------------------------+ ← Initial stack pointer
| main locals            |
| RunDispatcher vars     |
| taskFunc call frame    |
| Task_SystemMonitor vars|
| buffer[200]            | ← Largest stack user
| snprintf or SendString |
+------------------------+ ← Current stack pointer
|    (unused stack)      | ← ~127KB available
+------------------------+
| heap, .bss, .data      |
+------------------------+
Low Address (0x20000000)

Is 308 bytes safe?

  • Total stack: 128KB = 131,072 bytes
  • Used: 308 bytes
  • Usage: 0.23%
  • Very safe!

What would cause stack overflow?

  • Recursive functions without limit
  • Very large local arrays
  • Deeply nested function calls
  • Interrupt handlers with large locals

Performance Metrics

Task Execution Times

Understanding how long each task takes to execute:

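Task           | Typical Time | Worst Case | Cause of Worst Case
---------------|--------------|------------|------------------------
LED tasks      | <10μs        | <10μs      | Fixed operation
UART Status    | 4.5ms        | 4.5ms      | Fixed message length
System Monitor | 22ms         | 25ms       | Variable message length
Power Saving   | 100μs        | 2.02s      | Stop mode entry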

How do we estimate microseconds? At 16MHz clock:

  • 1 clock cycle = 1/16,000,000 = 62.5 nanoseconds
  • 1 microsecond = 16 clock cycles
  • LED toggle = ~5 instructions = ~0.3μs
  • Add function overhead = ~10μs total

Why such different execution times?

  • LED tasks: Just toggle one bit (very fast)
  • UART tasks: Limited by baud rate (86.8μs per character)
  • Power task: Mostly fast, except when entering stop mode

Jitter Analysis

Jitter = Difference between when task should run vs when it actually runs

Best Case – No Contention:

Task 1 scheduled for: 1000ms
No other tasks ready
Actual execution: 1000ms
Jitter: 0ms (perfect!)

Worst Case – Four Tasks Ready at 5000ms:

Task 4 (Monitor) scheduled for: 5000ms
The dispatcher runs one task per main loop iteration, so it must wait for:
- Task 0: ~0.01ms, then the 1ms loop delay
- Task 1: ~0.01ms, then the 1ms loop delay
- Task 5: ~0.1ms, then the 1ms loop delay

Actual execution: ~5003ms
Jitter: ~3ms

But if Task 0 prints message:
- Task 0 with UART: 2.7ms
Total wait: ~5.7ms
Jitter: ~5.7ms

Is 3–6ms jitter acceptable? Depends on application:

  • LED blinking: Yes (human eye won’t notice)
  • Motor control: Maybe (depends on precision needed)
  • Communication protocol: No (might miss deadlines)

CPU Utilization

Let’s estimate CPU usage from the execution times above:

Per-second breakdown:

Activity          | Frequency      | Time Each | Total/Second
------------------|----------------|-----------|-------------
LED1 toggle       | 2 per second   | 10μs      | 20μs
LED2 toggle       | 1 per second   | 10μs      | 10μs
LED3 toggle       | 0.5 per second | 10μs      | 5μs
UART status       | 0.33 per sec   | 4.5ms     | 1.5ms
System monitor    | 0.2 per second | 22ms      | 4.4ms
Power saving      | 0.2 per second | 100μs     | 20μs
Main loop checks  | 1000 per sec   | 30μs      | 30ms
----------------------------------------------------------------
Total active time per second:                    35.955ms

CPU Utilization = (Active Time / Total Time) × 100% = (35.955ms / 1000ms) × 100% = 3.6%
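The table is a sum of frequency × duration products, so it is easy to recompute when a period or execution time changes. The sketch below is a host-side calculation with invented type and function names; the figures are copied from the table.

```c
/* One row of the per-second breakdown: how often an activity runs and
 * how long one run takes. */
typedef struct {
    double runs_per_second;
    double duration_us;
} Activity;

/* Percentage of each second spent in the listed activities. */
double cpu_utilization_percent(const Activity *rows, int count) {
    double busy_us = 0.0;
    for (int i = 0; i < count; i++) {
        busy_us += rows[i].runs_per_second * rows[i].duration_us;
    }
    return busy_us / 1000000.0 * 100.0;
}

/* The figures from the table above. */
double table_utilization_percent(void) {
    const Activity rows[] = {
        {2.0, 10.0}, {1.0, 10.0}, {0.5, 10.0},   /* LED1, LED2, LED3 */
        {1.0 / 3.0, 4500.0},                      /* UART status      */
        {0.2, 22000.0},                           /* system monitor   */
        {0.2, 100.0},                             /* power saving     */
        {1000.0, 30.0}                            /* main loop checks */
    };
    return cpu_utilization_percent(rows, (int)(sizeof rows / sizeof rows[0]));
}
```

The result is just under 3.6%. The main loop checks dominate: they account for 30ms of the roughly 36ms of active time per second.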

Where does the other 96.4% go?

SysTick_Delay(1);  // CPU waits here

During delay:

  • CPU executes WFI (Wait For Interrupt)
  • Enters sleep mode
  • Wakes on SysTick interrupt
  • Low power consumption

Power consumption estimate:

Active (3.6%):  15mA × 0.036 = 0.54mA
Idle (96.4%):   5mA × 0.964 = 4.82mA
Average:                        5.36mA

Much better than 15mA continuous!

Design Patterns Demonstrated

1. Static Counters

Each task maintains execution count without dynamic allocation:

static uint32_t counter = 0;
counter++;

Why this pattern?

  • No malloc/free (deterministic)
  • No stack usage between calls
  • Each task has independent counter
  • Perfect for embedded systems

Alternative (bad for embedded):

// DON'T DO THIS in embedded:
uint32_t* counter = malloc(sizeof(uint32_t));  // Heap fragmentation
(*counter)++;  // Parentheses needed: *counter++ would advance the pointer instead

2. Modulo Scheduling

Tasks perform special actions periodically:

if (counter % 10 == 0) {
    // Every 10th execution
}

How modulo works for scheduling:

Counter | % 10 | Action?
--------|------|--------
1       | 1    | No
2       | 2    | No
...     | ...  | ...
9       | 9    | No
10      | 0    | Yes!
11      | 1    | No
...     | ...  | ...
20      | 0    | Yes!

Common modulo patterns:

if (counter % 2 == 0)    // Every even execution
if (counter % 5 == 0)    // Every 5th execution
if (counter % 60 == 0)   // Every minute (if called each second)

3. Safe String Building

Using snprintf with sizeof() prevents overflows:

char buffer[100];
snprintf(buffer, sizeof(buffer), "Value: %d", value);

Why sizeof() instead of hardcoding?

// BAD: Must update two places if size changes
char buffer[100];
snprintf(buffer, 100, ...);  // Easy to forget to update

// GOOD: Automatically correct
char buffer[100];
snprintf(buffer, sizeof(buffer), ...);  // Always matches

What happens on overflow?

char buffer[10];
snprintf(buffer, sizeof(buffer), "Hello, World!");
// Result: "Hello, Wo\0" (truncated, but safe!)
// Never writes past buffer end
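Truncation is safe but silent. Because `snprintf` returns the length the complete output would have had, a caller can detect that something was cut off. A small sketch with a hypothetical helper name:

```c
#include <stdio.h>

/* Copies `text` into `buf` and reports whether it had to be truncated.
 * A return value from snprintf that is >= size means characters were
 * dropped; a negative value signals an encoding error. */
int copy_checked(char *buf, size_t size, const char *text) {
    int needed = snprintf(buf, size, "%s", text);
    return needed < 0 || (size_t)needed >= size;   /* 1 = truncated or error */
}
```

With a 10-byte buffer, "Hello, World!" is reported as truncated and the buffer holds "Hello, Wo", while "Hi" fits and is reported as complete.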

4. State Inspection

System monitor reads scheduler state without modifying:

TaskScheduler_GetTaskState(i);  // Returns state, doesn't change it

Read-only pattern benefits:

  • Can’t accidentally modify state
  • Safe to call from any task
  • No side effects
  • Easier to debug

Implementation pattern:

// In scheduler:
TaskState_t TaskScheduler_GetTaskState(uint8_t taskId) {
    return tasks[taskId].state;  // Just return, don't modify
}

5. Hardware Abstraction

Direct register access wrapped in meaningful operations:

// Instead of:
*((volatile uint32_t*)0x40020414) ^= 0x00000001;  // What does this do??

// We write:
GPIOB->ODR ^= (1 << 0);  // Toggle LED on PB0 - self-documenting!

Benefits of abstraction:

  • Self-documenting code
  • Easier to port to different hardware
  • Compiler can still optimize
  • Type safety

6. Cooperative Timing

Tasks keep execution short:

void Task_LED1_Blink(void) {
    // Quick operations only
    GPIOB->ODR ^= (1 << 0);      // Fast
    if (counter % 10 == 0) {      // Fast
        UART_SendString(...);      // Slower but necessary
    }
    // Return quickly
}

What NOT to do in cooperative tasks:

// BAD: Blocks entire system
void Bad_Task(void) {
    for(int i = 0; i < 1000000; i++) {  // Long loop
        // Process data
    }
    while(!some_condition) {  // Waiting loop
        // Wait for something
    }
}
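One common way to keep a long job cooperative is to split it into slices and remember the progress in static variables, so each call does a little work and returns. The sketch below is illustrative only: the task name, the item counts, and the checksum "work" are placeholders, not part of the project.

```c
#include <stdint.h>
#include <stdbool.h>

#define TOTAL_ITEMS    1000u  /* hypothetical amount of work            */
#define ITEMS_PER_CALL  100u  /* small slice handled per task execution */

static uint32_t next_item = 0;   /* progress survives between calls */
static uint32_t checksum  = 0;   /* stand-in for "processed data"   */

/* Handles one slice and returns true once all items are done.
 * The scheduler would call this periodically like any other task. */
bool Task_ProcessData_Step(void) {
    uint32_t end = next_item + ITEMS_PER_CALL;
    if (end > TOTAL_ITEMS) end = TOTAL_ITEMS;
    for (; next_item < end; next_item++) {
        checksum += next_item;            /* placeholder for real work */
    }
    return next_item >= TOTAL_ITEMS;
}

/* Accessor so the result can be inspected from outside. */
uint32_t processed_checksum(void) { return checksum; }
```

With these numbers the job finishes after ten short calls instead of one long one, and the other tasks get to run in between.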

Common Issues and Solutions

Issue 1: UART Message Corruption

Symptom:

Expected: "[LED1] Green LED task running"
Actual:   "[LED1] Gre[LED2] Blue en LED task running"

Cause: Task preemption mid-message (can’t happen in our system)

Why we’re safe:

  • Cooperative scheduler = tasks run to completion
  • UART_SendString is blocking = waits until done
  • No task can interrupt another

In preemptive systems, you’d need:

// Mutex protection example
mutex_lock(&uart_mutex);
UART_SendString(message);
mutex_unlock(&uart_mutex);

Issue 2: Time Loss in Stop Mode

Symptom:

Before Stop: Uptime: 01:00
After 30s Stop: Uptime: 01:02 (should be 01:32!)

Cause: SysTick stops counting during Stop mode

Solutions:

  1. Use RTC for timekeeping:
uint32_t rtc_before = RTC_GetCounter();
Enter_Stop_Mode();
uint32_t rtc_after = RTC_GetCounter();
uint32_t stop_duration = rtc_after - rtc_before;
systick_counter += (stop_duration * 1000);  // Compensate
  2. Accept the limitation:
  • Document that time tracking is lost
  • Use for applications where exact time doesn’t matter
  3. Use Sleep mode instead:
  • SysTick keeps running
  • Higher power consumption
  • Time tracking maintained
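
The arithmetic behind the RTC compensation can be isolated into a small pure function, which makes it easy to check on a PC. This is a sketch, assuming the RTC counter ticks once per second (as the multiplication by 1000 in the snippet above implies); stop_compensation_ms is a hypothetical name.

```c
#include <stdint.h>

/* Hypothetical helper: milliseconds to add to the SysTick-based uptime
 * after waking, given RTC readings in whole seconds taken before and
 * after Stop mode. Unsigned subtraction stays correct across a single
 * rollover of the RTC counter. */
uint32_t stop_compensation_ms(uint32_t rtc_before_s, uint32_t rtc_after_s)
{
    uint32_t slept_s = rtc_after_s - rtc_before_s;
    return slept_s * 1000u;
}
```

For the symptom shown above, the uptime counter reads 62000 ms after waking, the RTC reports 30 elapsed seconds, and 62000 + 30000 gives the expected 01:32.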

Issue 3: Stack Overflow

Symptom: Random crashes, corrupted variables, HardFault

How to detect:

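// Stack canary method: write a known value at the lowest address of the
// stack region at startup, then check that it is still intact.
// Assumption: the linker script exports _sstack at the bottom of the stack.
#define STACK_CANARY 0xDEADBEEF
extern volatile uint32_t _sstack;

void init_stack_canary(void) {
    _sstack = STACK_CANARY;
}

void check_stack(void) {
    if (_sstack != STACK_CANARY) {
        // Stack has grown down into the canary!
        Error_Handler();
    }
}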

Prevention strategies:

  1. Minimize local arrays:
// BAD: Large stack usage
void bad_task(void) {
    char huge_buffer[10000];  // 10KB on stack!
}

// GOOD: Reasonable size
void good_task(void) {
    char buffer[100];  // 100 bytes
}
  2. Use static for large data:
// Moves from stack to the .bss section (.data if it has an initializer)
static char large_buffer[1000];
  3. Monitor stack usage:
// Check how many bytes of stack are in use right now
// (_estack is the top-of-stack symbol from the linker script)
extern uint32_t _estack;
uint32_t stack_used = (uint32_t)&_estack - __get_MSP();
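
Reading the stack pointer only shows usage at one instant. A complementary technique, often called stack painting, fills the stack region with a pattern at startup and later counts how much of the pattern has been overwritten, which gives the worst case seen so far. The sketch below runs on an ordinary array so it can be tried on a PC; on the target, the region would come from linker symbols, and all names here are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u  /* hypothetical fill pattern */

/* Fill a stack region with a known pattern (once, before it is used). */
void stack_paint(uint32_t *base, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        base[i] = STACK_FILL;
    }
}

/* Count the words that no longer hold the pattern. The stack grows
 * downward, so untouched words sit at the low end of the region. */
size_t stack_high_water_words(const uint32_t *base, size_t words)
{
    size_t untouched = 0;
    while (untouched < words && base[untouched] == STACK_FILL) {
        untouched++;
    }
    return words - untouched;
}
```

A task such as the system monitor could print this figure every few seconds, making it visible how close the system gets to the limit during normal operation.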

Issue 4: Priority Inversion

What is priority inversion? A high-priority task is blocked by a low-priority task that holds a resource it needs, while lower-priority work keeps running.

Classic scenario (CAN’T happen in our system):

1. Low priority task takes mutex
2. High priority task needs same mutex
3. High priority task blocks
4. Medium priority task preempts the low priority task, so the mutex stays locked and the high priority task keeps waiting!

Why we’re immune:

  • No mutexes or semaphores
  • No shared resources with locks
  • Tasks can’t block each other
  • Cooperative = no preemption during task

If you need resource sharing:

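// Simple flag method (works for cooperative)
static volatile bool uart_busy = false;

// Returns false if the UART is busy. Never spin on the flag here:
// in a cooperative system no other task could run to clear it.
bool send_message(char* msg) {
    if (uart_busy) {
        return false;  // Busy: give up now, retry on the next run
    }
    uart_busy = true;
    UART_SendString(msg);
    uart_busy = false;
    return true;
}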

Issue 5: Missed Task Executions

Symptom: LED should toggle 2x/second, only toggles 1x/second

Possible causes:

  1. Task taking too long:
void problematic_task(void) {
    // This blocks for ~600ms (10 x 50ms of delay, plus UART transmit time)!
    for(int i = 0; i < 10; i++) {
        UART_SendString("Long message...\r\n");
        SysTick_Delay(50);
    }
}
// 500ms LED task misses execution
  2. Wrong period calculation:
// Registering with wrong period
TaskScheduler_RegisterTask(..., 1000, ...);  // 1000ms not 500ms!
  3. Task suspended/stopped:
// Check task state
if (TaskScheduler_GetTaskState(0) != TASK_STATE_READY) {
    // Task not running!
}

Debugging approach:

  1. Add execution counter to task
  2. Print counter periodically
  3. Calculate actual rate
  4. Compare to expected rate
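
Steps 3 and 4 come down to two divisions. A minimal sketch, with hypothetical helper names and integer math so it also suits targets without an FPU:

```c
#include <stdint.h>

/* Hypothetical helper: how many times a task with the given period
 * should have run during an observation window. */
uint32_t expected_runs(uint32_t window_ms, uint32_t period_ms)
{
    return (period_ms == 0u) ? 0u : window_ms / period_ms;
}

/* Hypothetical helper: measured average period in ms, derived from the
 * task's execution counter (0 if the task never ran). */
uint32_t measured_period_ms(uint32_t window_ms, uint32_t run_count)
{
    return (run_count == 0u) ? 0u : window_ms / run_count;
}
```

Over a 10-second window the 500ms LED task should report 20 runs. A counter of 10 means a measured period of 1000ms, which points at either a wrong registration period or a task that blocks for too long.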

Summary

These six tasks demonstrate fundamental embedded systems concepts:

What We’ve Learned

1. Task Design Principles:

  • Keep tasks short and non-blocking
  • Use static variables for persistent state
  • Return quickly to allow other tasks to run
  • Print status periodically, not continuously

2. Timing Techniques:

  • Different periods create visual patterns
  • Modulo operator for periodic actions
  • Priority determines execution order
  • Jitter is inevitable but manageable

3. Resource Management:

  • UART is shared but not contested (cooperative)
  • Small local buffers keep stack usage low
  • Power modes trade functionality for efficiency
  • Static allocation avoids heap fragmentation

4. Real-World Patterns:

  • Status reporting (health monitoring)
  • Power management (battery devices)
  • Visual indicators (user feedback)
  • System diagnostics (debugging)

System Behavior Summary

Every Second:

  • 1000 main loop iterations
  • ~36ms active processing
  • ~964ms in low-power wait
  • 3.6% CPU utilization

Visual Output:

  • 3 LEDs blinking at different rates
  • Status message every 3 seconds
  • System monitor every 5 seconds
  • Stop mode demo every 50 seconds

Power Profile:

  • Normal operation: ~15mA
  • With delays: ~5.4mA average
  • During stop mode: ~3μA
  • Overall savings: >60%
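
The utilization and savings figures above follow directly from the other numbers in the lists. The sketch below reproduces the arithmetic with hypothetical helper names; currents are passed in microamps to stay in integer math.

```c
#include <stdint.h>

/* CPU utilization in tenths of a percent: active time over total time. */
uint32_t utilization_permille(uint32_t active_ms, uint32_t window_ms)
{
    return (window_ms == 0u) ? 0u : (active_ms * 1000u) / window_ms;
}

/* Current saved relative to the baseline, in whole percent. */
uint32_t savings_percent(uint32_t baseline_ua, uint32_t average_ua)
{
    if (baseline_ua == 0u || average_ua > baseline_ua) {
        return 0u;
    }
    return ((baseline_ua - average_ua) * 100u) / baseline_ua;
}
```

With 36ms active out of 1000ms, utilization_permille returns 36, i.e. 3.6%. With a 15mA baseline and a 5.4mA average, savings_percent(15000, 5400) returns 64, consistent with the ">60%" figure.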

Key Takeaways

  1. Cooperative scheduling is simple and predictable
    • No race conditions between tasks (data shared with interrupts still needs care)
    • No priority inversion
    • Easy to debug
    • Perfect for many embedded applications
  2. Tasks must cooperate
    • Short execution times
    • No blocking waits
    • Voluntary yielding
    • Shared responsibility
  3. Hardware and software work together
    • GPIO for LED control
    • UART for communication
    • RTC for low-power timing
    • SysTick for system timing
  4. Power efficiency requires planning
    • Use appropriate sleep modes
    • Minimize active time
    • Consider wake-up sources
    • Accept timing trade-offs

The complete system demonstrates how simple components – timers, tasks, and schedulers – combine to create a responsive, efficient embedded application. Each task serves a purpose while working harmoniously with others, creating a system greater than the sum of its parts.


Next: Part 10 – System Integration: The Complete Picture →

This is Part 9 of the STM32 IoT Framework series.
