The Device memory attribute is defined for memory locations where an access to the location can cause side effects, or where the value returned for a load can vary depending on the number of loads performed. Memory-mapped peripherals and I/O locations are typical examples of areas of memory that you must mark as Device. The marking of a region of memory as Device is performed on a per-region basis in the MPU.
Accesses to memory-mapped locations that have side effects that apply to memory of type Normal might require memory barriers to ensure correct execution. An example where this might be an issue is the programming of the control registers of a memory controller while accesses are being made to the memories controlled by the controller.
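The memory-controller case can be sketched in C roughly as follows. This is only an illustration under stated assumptions: `reconfigure_then_access`, its parameters, and `barrier_stand_in` are hypothetical names, and the C11 fence merely stands in for whatever barrier operation the target actually requires, so that the sketch compiles on any host.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in: on real hardware this would be the barrier
 * operation required by the target, not a C11 fence. */
static inline void barrier_stand_in(void)
{
    atomic_thread_fence(memory_order_seq_cst);
}

/* Write a (hypothetical) controller register, then access the memory
 * that the controller manages. */
static void reconfigure_then_access(volatile uint32_t *ctrl_reg, uint32_t cfg,
                                    uint32_t *mem, uint32_t value)
{
    assert(((uintptr_t)ctrl_reg & 3u) == 0); /* word-aligned register assumed */
    *ctrl_reg = cfg;     /* Device access whose side effect applies to Normal memory */
    barrier_stand_in();  /* the new setting should take hold before the next access */
    *mem = value;        /* Normal-memory access that relies on the new setting */
}
```

The point of the ordering is that the access to the controlled memory is issued only after the barrier that follows the register write.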
Instruction fetches must not be performed to areas of memory containing read-sensitive devices, because there is no ordering requirement between instruction fetches and explicit accesses. As a result, instruction fetches from such devices can result in Unpredictable behavior. Up to 64 bytes can be prefetched sequentially ahead of the current instruction being executed. To allow for this, you must locate read-sensitive devices in memory in such a way that this prefetching can never reach them.
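A layout check for the 64-byte prefetch rule might look like the sketch below. The function name, its parameters, and the convention that `code_end` is the first address past the last executable byte are assumptions made for illustration. The check is deliberately conservative: it treats the whole 64 bytes after the code as reachable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PREFETCH_BYTES 64u  /* maximum sequential prefetch, from the text above */

/* Returns true if a read-sensitive device region overlaps the window that
 * sequential prefetch could reach past the end of the code.
 * code_end: first address past the last executable byte (assumed layout). */
static bool device_in_prefetch_window(uint64_t code_end,
                                      uint64_t dev_base, uint64_t dev_size)
{
    assert(dev_size > 0);
    uint64_t window_end = code_end + PREFETCH_BYTES;  /* exclusive bound */
    return dev_base < window_end && dev_base + dev_size > code_end;
}
```

For example, with code ending at 0x1000, a device register at 0x1020 is inside the window, while one at 0x1040 is just outside it.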
Explicit accesses from the processor to regions of memory marked as Device occur in the size and order defined by the instruction. The number of location accesses is specified by the program. Accesses to regions of memory marked as Device are not restartable. Repeat accesses to such locations when there is only one access in the program are not possible in the ARM1156T2-S processor. An example of where a repeat access might be required is before and after an interrupt, to enable the interrupt to abandon a slow access. Optimizations of this kind, which repeat or restart an access, are not performed on regions of memory marked as Device.
In addition, address locations marked as Device are not held in a cache.
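On the software side, the compiler must also be prevented from merging, dropping, or repeating accesses to such locations. In C this is usually done with `volatile`. The helpers below are a minimal sketch; `mmio_write32` and `mmio_read32` are hypothetical names, and the alignment check is an assumption about the device, not a requirement taken from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Perform exactly one 32-bit store to a device register.
 * volatile keeps the compiler from removing or duplicating the access. */
static inline void mmio_write32(volatile uint32_t *reg, uint32_t value)
{
    assert(((uintptr_t)reg & 3u) == 0);  /* word alignment assumed */
    *reg = value;
}

/* Perform exactly one 32-bit load from a device register. */
static inline uint32_t mmio_read32(volatile uint32_t *reg)
{
    assert(((uintptr_t)reg & 3u) == 0);  /* word alignment assumed */
    return *reg;
}
```

`volatile` only constrains the compiler. The Device attribute set in the MPU is what constrains the processor and the memory system.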
The point marked in red above is very important. The text above looks like an old ARM document from before the Cortex series was introduced.
Can Device memory now be cacheable as well?
Another important point:
The specification allows the combined use of memory types Device Non-buffered and Device Buffered to force transactions to reach their final destination.
A transaction that is marked as Device Buffered is required to reach its final destination in a timely manner, but there is no indication back to the issuing master when the transaction is visible to all other masters.
If a Device Buffered transaction or stream of transactions is followed by a Device Non-buffered transaction that uses the same AXI ID, it will force all of the Device Buffered transactions to reach the final destination before a response is given to the Device Non-buffered transaction.
A Device Non-buffered transaction can only guarantee the completion of Device Buffered transactions that are issued with the same ID and that are to the same slave device. The minimum address space occupied by a single slave device is 4 KB.
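The two conditions (same ID, same slave) can be modeled with a small check. This is a sketch, not part of the specification: the function name and parameters are hypothetical, and it assumes that slave address ranges are aligned to 4 KB boundaries, so two addresses in the same 4 KB block are certain to target the same slave. A false result means only that the guarantee cannot be concluded from the addresses alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SLAVE_MIN_SHIFT 12u    /* log2 of the 4 KB minimum slave span */
#define SLAVE_MIN_BYTES 4096u
static_assert(SLAVE_MIN_BYTES == (1u << SLAVE_MIN_SHIFT), "4 KB minimum slave span");

/* Returns true when a Device Non-buffered transaction is certain to force an
 * earlier Device Buffered transaction to its destination: both use the same
 * AXI ID and both fall in the same 4 KB block (assumed 4 KB-aligned slaves). */
static bool nonbuffered_covers_buffered(uint32_t buf_id, uint64_t buf_addr,
                                        uint32_t nb_id, uint64_t nb_addr)
{
    bool same_id = (buf_id == nb_id);
    bool same_block = (buf_addr >> SLAVE_MIN_SHIFT) == (nb_addr >> SLAVE_MIN_SHIFT);
    return same_id && same_block;
}
```

A slave may well occupy more than 4 KB, in which case addresses in different blocks can still reach the same slave. The block comparison is the only test that is safe without knowing the system's address map.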