STM32 Programming. Part 8: DMA

Posted: 16 Oct 2023, 04:15
by Oleg
Direct memory access (DMA) is used for fast data transfer between memory and a peripheral, between two memory areas, or between two peripherals without CPU involvement. The STM32F103C8 microcontroller has one DMA controller, DMA1, with 7 channels. DMA2 is present only in high-density and XL-density devices.

Contents
  • DMA functional description
  • DMA channel priorities
  • DMA channels
  • Alignment of data of different bit sizes
  • AHB/APB Peripheral Addressing Features
  • Circular mode DMA
  • Memory-to-memory mode
  • DMA interrupts
  • DMA transfer errors
  • DMA Channels and Peripherals
Functional Description of DMA
The DMA controller shares the system bus with the CPU core. If the CPU and the DMA access the same memory area or peripheral, the DMA can suspend CPU access to the system bus for a few bus cycles, but at least half of the bus bandwidth is reserved for the CPU. This means that even under intensive DMA traffic the CPU will not stall completely.

When a certain event occurs, the peripheral device sends a request signal to the DMA controller. This starts the data exchange process, which consists of 3 steps:
  1. Loading data from the peripheral's register (if the direction of transfer is from peripheral to memory) or from memory (if the direction of transfer is from memory to peripheral);
  2. Saving the loaded data to memory (if from peripheral to memory) or to the peripheral (if from memory to peripheral);
  3. Decreasing the value of the DMA_CNDTRx register by one. Once this register reaches zero, the data transfer is complete.
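
The three steps can be modelled on a PC with a few lines of C. This is only a sketch of the idea, not a description of the hardware: the `dma_model_t` structure and the `dma_model_serve_request` function are hypothetical names invented for illustration, and the model assumes that both the source and the destination address are incremented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of one DMA channel (not real registers). */
typedef struct {
    const uint8_t *src;   /* where the next item is loaded from */
    uint8_t       *dst;   /* where the next item is stored */
    uint16_t       cndtr; /* remaining transfers, like DMA_CNDTRx */
} dma_model_t;

/* Serve one request. Returns 1 if an item was moved,
   0 if the counter is already zero and the request is ignored. */
static int dma_model_serve_request(dma_model_t *ch)
{
    assert(ch != NULL);
    if (ch->cndtr == 0)
        return 0;
    uint8_t data = *ch->src++;  /* step 1: load from the source */
    *ch->dst++ = data;          /* step 2: store to the destination */
    ch->cndtr--;                /* step 3: decrement the transfer counter */
    return 1;
}
```

Calling the function once per request moves one item. After the counter reaches zero, further calls do nothing, just as a finished channel ignores further requests.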
DMA channel priorities
DMA1 in the STM32F103C8 has 7 channels, but at any given moment data can be transferred on only one of them. If several channels are active and their DMA requests arrive simultaneously, the transfer starts on the channel with the higher priority. Here is a small example. Suppose DMA1 channel 3 is configured to transmit a data array to SPI1 and channel 1 to receive data from ADC1, and the priority of channel 3 is set higher than that of channel 1. If requests from SPI1 and ADC1 arrive simultaneously, the request from SPI1 (channel 3) is served first, followed by the one from ADC1 (channel 1). In other words, several DMA channels can be enabled at once, but only one of them can transfer data at any given moment.

Priorities are set in software. There are 4 levels in total:
  • Very high priority
  • High priority
  • Medium priority
  • Low priority
If the priority level is the same (e.g. channel 1 and channel 3 are set to Very high priority), the channel with the lower number will have priority over the channel with the higher number (channel 1 will have higher priority).
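
The arbitration rule can be written down as a small function. This is a PC-side sketch of the rule described above, not the actual arbiter logic. The function name `dma_pick_channel` and its parameters are made up for illustration. The numeric priority values follow the encoding of the PL field in DMA_CCRx from the Reference manual (0 = low, 3 = very high).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { DMA_CHANNELS = 7 };

/* pending:     bit n set means channel n+1 has a pending request.
   priority[n]: priority of channel n+1 (0 = low ... 3 = very high).
   Returns the channel (1..7) to serve next, or 0 if nothing is pending. */
static int dma_pick_channel(uint8_t pending, const uint8_t priority[DMA_CHANNELS])
{
    assert(priority != NULL);
    int best = 0;
    for (int ch = 1; ch <= DMA_CHANNELS; ch++) {
        if (!(pending & (1u << (ch - 1))))
            continue;
        /* Strictly greater: on a tie the lower-numbered channel,
           which was found first, keeps its place. */
        if (best == 0 || priority[ch - 1] > priority[best - 1])
            best = ch;
    }
    return best;
}
```

With channel 1 at medium and channel 3 at very high priority, simultaneous requests on both give channel 3. With both at very high priority, channel 1 wins.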

DMA Channels
Each DMA channel has the following registers:
  • DMA_CCRx - the DMA channel configuration register. It contains the data transfer configuration bits, the channel interrupt enable bits, and the channel enable bit.
  • DMA_CNDTRx - the number of data frames to be transferred (0..65535). I intentionally used the word "frame" and not "byte": the number of bytes transferred per transaction is configurable (in the DMA_CCRx register), as discussed below.
  • DMA_CMARx - the memory address. If the transfer goes from peripheral to memory, data is written here; in the opposite direction, it is read from here.
  • DMA_CPARx - the address of the peripheral register. If the transfer goes from peripheral to memory, data is read from here; in the opposite direction, it is written here. In memory-to-memory mode a memory address is written here as well.
So, everything seems to be clear: we have 4 registers, which can be used to configure the sending of data back and forth.

Now it's time to talk about incrementing the memory and peripheral addresses. All examples will use SPI. Incrementing the peripheral address is usually meaningless, but incrementing the memory address is very useful. Let the DMA_CMARx register contain the address of element 0 of the array that we are going to send to SPI (remember C pointers). After each item is sent to SPI, the internal memory pointer of the DMA channel is advanced by one array element. One important point is worth noting here: the increment is applied to an internal pointer that software can neither read nor write, and the DMA_CMARx register itself does not change during the transfer.

With SPI it works like this. We write the address of element 0 of the array to be sent into DMA_CMARx, and the address of the SPI data register (DR) into DMA_CPARx. Into DMA_CNDTRx we write the number of bytes to be transferred. We enable memory address increment, enable the DMA request in the SPI module, and start the process by setting the EN bit in the DMA_CCRx register. Initially the SPI transmit buffer is empty and its "transmit buffer empty" flag is set, which triggers a DMA request. The DMA receives the request from SPI, reads a byte from the array and writes it to the SPI DR register, advances the internal memory pointer by one array element (1 byte), and decrements DMA_CNDTRx by one. Once the SPI has shifted out the byte, the process repeats. This continues until DMA_CNDTRx reaches zero. The DMA channel then completes the transfer and no longer responds to requests from the SPI.
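
The register writes from this sequence might look roughly as follows. Treat it as a sketch under assumptions rather than ready firmware: `dma_channel_regs_t` and `dma_start_mem_to_periph` are hypothetical names, and the structure only mirrors the register order of one channel so that the code can also be compiled and tried on a PC. The bit positions (EN = bit 0, DIR = bit 4, MINC = bit 7) are taken from the DMA_CCRx description in the Reference manual. On real hardware you would normally use the vendor's CMSIS device header instead, and you would also have to enable the DMA request on the SPI side, which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Registers of one DMA channel in the order CCR, CNDTR, CPAR, CMAR. */
typedef struct {
    volatile uint32_t CCR;
    volatile uint32_t CNDTR;
    volatile uint32_t CPAR;
    volatile uint32_t CMAR;
} dma_channel_regs_t;

#define CCR_EN   (1u << 0)  /* channel enable */
#define CCR_DIR  (1u << 4)  /* 1: read from memory, write to peripheral */
#define CCR_MINC (1u << 7)  /* memory address increment */

/* Hypothetical helper: send len bytes from buf to the peripheral
   data register located at periph_addr. */
static void dma_start_mem_to_periph(dma_channel_regs_t *ch, uint32_t periph_addr,
                                    const uint8_t *buf, uint16_t len)
{
    assert(ch != NULL && buf != NULL && len > 0);
    ch->CCR  &= ~CCR_EN;                   /* configure only while disabled */
    ch->CPAR  = periph_addr;               /* e.g. address of the SPI DR register */
    ch->CMAR  = (uint32_t)(uintptr_t)buf;  /* address of element 0 of the array */
    ch->CNDTR = len;                       /* number of transfers */
    ch->CCR   = CCR_DIR | CCR_MINC;        /* 8-bit sizes, memory increment only */
    ch->CCR  |= CCR_EN;                    /* start: the channel now waits for requests */
}
```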

But this covers only the case where data is sent to the peripheral one byte at a time. What if the array elements are 2 bytes wide and the peripheral also expects 2 bytes at a time?

For such cases, the DMA_CCRx register has configuration bits for the peripheral register bit size (PSIZE) and memory data bit size (MSIZE). They can take the following values:
  • 00: 8-bits
  • 01: 16-bits
  • 10: 32-bits
  • 11: Reserved (this combination is not used).
That is, if we set MSIZE = 16 bits (2 bytes), we will send 2 bytes at a time, and the internal memory pointer will be incremented by 2. But the DMA_CNDTRx register will still be decreased by one, because it holds not the number of bytes to be transferred but the number of transfers (transactions). So MSIZE tells the DMA by how many bytes to increment the internal memory pointer. This is correct, but MSIZE is used for one more thing.
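
The relation between the size field, the pointer step and DMA_CNDTRx fits into two tiny helper functions. The names `dma_step_bytes` and `dma_total_bytes` are hypothetical, and the code is only an arithmetic illustration that runs on a PC.

```c
#include <assert.h>
#include <stdint.h>

/* Values of the PSIZE/MSIZE fields: 0 = 8 bits, 1 = 16 bits, 2 = 32 bits. */
enum { DMA_SIZE_8 = 0, DMA_SIZE_16 = 1, DMA_SIZE_32 = 2 };

/* Bytes by which the internal pointer advances after one transfer. */
static uint32_t dma_step_bytes(unsigned size_field)
{
    assert(size_field <= 2u);  /* 3 (binary 11) is reserved */
    return 1u << size_field;
}

/* Bytes moved on the memory side for a given DMA_CNDTRx value. */
static uint32_t dma_total_bytes(uint16_t cndtr, unsigned msize_field)
{
    return (uint32_t)cndtr * dma_step_bytes(msize_field);
}
```

For example, with DMA_CNDTRx = 100 and MSIZE = 16 bits the channel performs 100 transfers and moves 200 bytes of memory.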

Alignment of data of different bit sizes

Very often the data width of the receiver does not match that of the source. For example, the DR data register of the SPI module is 16 bits wide (2 bytes, or a half-word). However, SPI can be configured to transmit 8 bits at a time, and the array we want to send has 8-bit elements. DMA allows the source and destination widths to be configured independently. As mentioned above, the MSIZE bits set the data width in memory. The PSIZE bits, in turn, specify the width of the peripheral register (8, 16 or 32 bits). If PSIZE is not equal to MSIZE, the DMA aligns the data automatically according to the following rules.

Suppose the source is 8 bits wide and the receiver is 16 bits wide. During the transfer the DMA pads the source data with 8 leading zero bits and writes the result to the receiver: for example, 0x13 is read from the source and 0x0013 is written to the receiver. If the source is wider than the receiver, the DMA discards the extra high-order bits and writes only the low-order bits: with a 32-bit source and an 8-bit receiver, the DMA reads a value such as 0xABCDEF12 from the source and the receiver gets 0x12. Basically, everything works the same way as assigning between unsigned integer variables of different widths in C.
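
Both rules reduce to masking, which is easy to check on a PC. The function `dma_convert` below is a hypothetical helper written for illustration only. It models the value that reaches the receiver and ignores how the data travels over the bus.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the lowest 8, 16 or 32 bits set. */
static uint32_t width_mask(unsigned bits)
{
    assert(bits == 8 || bits == 16 || bits == 32);
    return (bits == 32) ? 0xFFFFFFFFu : ((1u << bits) - 1u);
}

/* Value seen by a receiver of dst_bits when the source is src_bits wide:
   a narrower source is padded with zeros, a wider source loses its
   high-order bits. */
static uint32_t dma_convert(uint32_t value, unsigned src_bits, unsigned dst_bits)
{
    return value & width_mask(src_bits) & width_mask(dst_bits);
}
```

Here `dma_convert(0x13, 8, 16)` gives 0x0013 and `dma_convert(0xABCDEF12, 32, 8)` gives 0x12, matching the two examples in the text.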

In the Reference manual for STM32F1xxx microcontrollers in the section about DMA there is such a table:
Figure 1. DMA data alignment rules
The table in the Reference manual may seem rather complicated (and in fact it is). Let's work through one of its cases, where the data source is 32 bits wide and the receiver is 16 bits wide:
Figure 2. Conversion of 32-bit values to 16-bit values
Suppose we transfer data from one array in memory to another via DMA, and the arrays have different element widths. The source array has 32-bit elements and contains the following data:
  • element 0: B3B2B1B0, offset 0x00
  • element 1: B7B6B5B4, offset 0x04
  • element 2: BBBAB9B8, offset 0x08
  • element 3: BFBEBDBC, offset 0x0C
If you don't understand what an offset is and why it takes these values, you should read about data organization in microcontroller memory and C pointers.

As an array it will look like this:

uint32_t Sourse[4] = 
{
  0xB3B2B1B0,
  0xB7B6B5B4,
  0xBBBAB9B8,
  0xBFBEBDBC
};
And the data receiver:

uint16_t Destination[4];
In effect, the DMA will perform this operation:

for (int i = 0; i < 4; i++)
  Destination[i] = (uint16_t)Sourse[i];  /* only the low 16 bits are kept */

I think that people familiar with C will understand the result of this operation:

uint16_t Destination[4] =
{
  0xB1B0,
  0xB5B4,
  0xB9B8,
  0xBDBC
};
In principle, the same holds for transfers from memory to a peripheral register, except that peripheral address increment is not used in that case.

AHB/APB peripheral addressing features
STM32 microcontrollers have one very important and non-obvious architectural feature. The CPU in the STM32 is 32-bit, and writing 8, 16 or 32 bits to memory uses different instructions and different bus write requests. For RAM this is not a problem: it accepts 8-, 16- and 32-bit accesses. But the AHB/APB peripherals can only be accessed with 32-bit requests. What if we need to write to a register that is narrower than 32 bits? Let me explain using the same SPI example. The DR data register is 16 bits wide, and the upper 16 bits of the 32-bit bus are simply not used:
Figure 3. SPI register map
If we configure the DMA with peripheral size PSIZE = 16 bits and memory size MSIZE = 16 bits, the DMA duplicates the low 16 bits into the high 16 bits and issues a 32-bit request to the peripheral:

0xABCD -> 0xABCDABCD
That is, the DMA turns 0xABCD into 0xABCDABCD and sends this value to the peripheral. Since the upper 16 bits of the DR register are not used (reserved), they are simply ignored. You can also set PSIZE = 32 bits, in which case the value 0x0000ABCD is written to the DR register. But if PSIZE is set to 8 bits, the DMA performs the following conversion:

0xAB -> 0xABABABAB
Thus 0xABAB is written to DR, not 0x00AB as you might expect if you are unaware of this peculiarity. It is precisely because peripherals can only be accessed with 32-bit requests that all peripheral registers are aligned on a 32-bit boundary (see Figure 3).
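
The duplication is easy to reproduce on a PC. The sketch below is only a model of the behaviour described in this section. The names `dma_bus_word` and `reg16_latch` are hypothetical, and a 16-bit register is assumed to keep just the low half of whatever arrives on the 32-bit bus.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit word that the DMA puts on the bus for a given PSIZE. */
static uint32_t dma_bus_word(uint32_t value, unsigned psize_bits)
{
    switch (psize_bits) {
    case 8:
        return (value & 0xFFu) * 0x01010101u;    /* byte copied 4 times */
    case 16:
        return (value & 0xFFFFu) * 0x00010001u;  /* half-word copied twice */
    default:
        assert(psize_bits == 32);
        return value;                            /* word goes out unchanged */
    }
}

/* What a 16-bit register such as SPI DR keeps from a 32-bit write. */
static uint16_t reg16_latch(uint32_t bus_word)
{
    return (uint16_t)(bus_word & 0xFFFFu);
}
```

With these two functions, `reg16_latch(dma_bus_word(0xAB, 8))` evaluates to 0xABAB, which is the surprising value mentioned above.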

Circular mode DMA

I think everyone is familiar with the circular buffer. It is very convenient for continuous data reception or transmission. The STM32 DMA implements this mode in hardware; it is enabled by the CIRC bit in the DMA_CCRx configuration register. When this mode is active, once all the data has been transferred (DMA_CNDTRx reaches zero), DMA_CNDTRx is reloaded with its original value and the transfer continues from the start of the buffer.
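
A small PC-side model makes the reload visible. As before, this is only a sketch: `dma_circ_model_t` and `dma_circ_serve` are hypothetical names, and the `index` field stands in for the internal memory pointer, which software cannot read on real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint16_t cndtr;    /* remaining transfers, like DMA_CNDTRx */
    uint16_t reload;   /* value originally written to DMA_CNDTRx */
    uint16_t index;    /* offset of the internal memory pointer, in elements */
    int      circular; /* state of the CIRC bit */
} dma_circ_model_t;

/* Serve one request: copy the next element of buf into *out.
   Returns 0 if the channel has finished (non-circular mode only). */
static int dma_circ_serve(dma_circ_model_t *ch, const uint8_t *buf, uint8_t *out)
{
    assert(ch != NULL && buf != NULL && out != NULL);
    if (ch->cndtr == 0)
        return 0;
    *out = buf[ch->index++];
    if (--ch->cndtr == 0 && ch->circular) {
        ch->cndtr = ch->reload;  /* counter is reloaded automatically */
        ch->index = 0;           /* pointer goes back to the base address */
    }
    return 1;
}
```

With a 3-element buffer and the circular flag set, the fourth request delivers element 0 again. With the flag cleared, the fourth request is ignored.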

Memory-to-memory mode

In "normal" mode, the DMA channel waits for a transfer request from a peripheral module (SPI, ADC, timer, etc.). However, a DMA channel can also work without a peripheral request, i.e. the transfer begins immediately after the EN bit is set in the DMA_CCRx register. This mode can be used to copy one memory area to another. To do this, write the addresses of the source and destination arrays into the DMA_CPARx and DMA_CMARx registers and set the MEM2MEM bit in the DMA_CCRx register. In other words, both the peripheral address register and the memory address register hold addresses of arrays in memory. Any free DMA channel can be used for a MEM2MEM transfer. And how is the transfer direction selected? Exactly the same way as when exchanging data with peripherals: by the DIR bit of the DMA_CCRx register. An example of a memory-to-memory transfer will appear in one of the next articles, where we move on to practice. Note that MEM2MEM mode cannot be used together with Circular mode.
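
As a preview, here is how the DMA_CCRx value for such a copy could be assembled. This is a sketch, not the example promised for the later article. The helper names `dma_ccr_mem2mem` and `dma_ccr_mode_ok` are hypothetical, while the bit positions are taken from the DMA_CCRx description in the Reference manual. Both address increments are enabled here on the assumption that a whole array is being copied.

```c
#include <assert.h>
#include <stdint.h>

#define CCR_DIR     (1u << 4)   /* 1: read from CMAR, 0: read from CPAR */
#define CCR_CIRC    (1u << 5)   /* circular mode */
#define CCR_PINC    (1u << 6)   /* increment the address taken from CPAR */
#define CCR_MINC    (1u << 7)   /* increment the address taken from CMAR */
#define CCR_MEM2MEM (1u << 14)  /* memory-to-memory mode */

/* MEM2MEM must not be combined with circular mode. */
static int dma_ccr_mode_ok(uint32_t ccr)
{
    return !((ccr & CCR_MEM2MEM) && (ccr & CCR_CIRC));
}

/* DMA_CCRx value for a byte-wise array copy (EN is not set yet).
   read_from_cmar selects the direction through the DIR bit. */
static uint32_t dma_ccr_mem2mem(int read_from_cmar)
{
    uint32_t ccr = CCR_MEM2MEM | CCR_PINC | CCR_MINC;
    if (read_from_cmar)
        ccr |= CCR_DIR;
    assert(dma_ccr_mode_ok(ccr));
    return ccr;
}
```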

DMA interrupts

Each DMA channel has 3 interrupts:
  • Half-transfer - the DMA has transferred half of the data. This is convenient for streaming data in circular mode: while one half of the array is being transferred, the other half can be filled.
  • Transfer complete - interrupt on completion of data transfer.
  • Transfer error - transfer error interrupt.
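
The half-transfer idea can be expressed as a rule for which part of the buffer the CPU may touch after each event. The sketch below is not an interrupt handler, only the bookkeeping such a handler would need. The names `dma_event`, `buf_range_t` and `dma_safe_half` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum dma_event { DMA_EVT_HALF_TRANSFER, DMA_EVT_TRANSFER_COMPLETE };

typedef struct {
    size_t first;  /* index of the first element the CPU may use */
    size_t count;  /* number of elements in that half */
} buf_range_t;

/* For a circular buffer of len elements, return the half that the DMA
   has just finished with and the CPU may now process or refill. */
static buf_range_t dma_safe_half(enum dma_event evt, size_t len)
{
    assert(len >= 2 && len % 2 == 0);
    buf_range_t r;
    r.count = len / 2;
    /* After half-transfer the DMA is busy with the second half,
       after transfer complete it wraps around to the first half. */
    r.first = (evt == DMA_EVT_HALF_TRANSFER) ? 0 : len / 2;
    return r;
}
```

For a buffer of 8 elements, the half-transfer event frees elements 0..3 and the transfer complete event frees elements 4..7.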
Errors during data transfer via DMA

A DMA error can occur when reading from or writing to the reserved address space of the STM32 microcontroller. When an error occurs, the corresponding DMA channel is disabled (its EN bit is cleared) and a Transfer error interrupt is raised (if enabled).

DMA channels and peripherals

DMA1 in the STM32F103C8 has 7 data transfer channels, and each channel serves its own set of peripheral requests. Here is a table from the Reference manual to make this clearer:
Figure 4. DMA channels and corresponding requests from peripheral devices
For example, channel 1 can serve requests from ADC1, TIM2_CH3 and TIM4_CH1, and channel 2 can serve requests from SPI1_RX, USART3_TX, TIM1_CH1, TIM2_UP and TIM3_CH3. Note that the requests themselves must be enabled in the peripheral registers, and if you enable DMA requests from 2 sources on the same channel, a transfer will be triggered by either of them. I can't give an example where this would be useful; most likely such a configuration makes no sense.
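
In code, such a mapping is just a lookup table. The sketch below contains only the requests mentioned in this article (including SPI1_TX on channel 3 from the priority example), so it is deliberately incomplete. The full list is in the Reference manual table. The names `dma1_requests` and `dma1_channel_for` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *request;  /* request name as printed in the table */
    int         channel;  /* DMA1 channel that serves it */
} dma_req_map_t;

/* Partial table: only the requests named in this article. */
static const dma_req_map_t dma1_requests[] = {
    {"ADC1", 1},    {"TIM2_CH3", 1},  {"TIM4_CH1", 1},
    {"SPI1_RX", 2}, {"USART3_TX", 2}, {"TIM1_CH1", 2},
    {"TIM2_UP", 2}, {"TIM3_CH3", 2},
    {"SPI1_TX", 3},
};

/* Returns the channel number, or 0 if the request is not in the table. */
static int dma1_channel_for(const char *request)
{
    assert(request != NULL);
    for (size_t i = 0; i < sizeof dma1_requests / sizeof dma1_requests[0]; i++) {
        if (strcmp(dma1_requests[i].request, request) == 0)
            return dma1_requests[i].channel;
    }
    return 0;
}
```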

That's all for now. The next article will describe the DMA registers, and then we will move on to practice.