How Msi Address Register Gets Set In Pcie Devices
Contents
- Introduction
- How Signals are Routed
- Configuration
- Enumerating the PCI Bus
- Enhanced Configuration Address Space (ECAM)
- Command Register
- Status Register
- Bridges
- Capabilities
- Interrupts
- INT#
- MSI
- MSI-X
- Base Address Registers (BAR)
- Assigning BARs
- PCI-Express Devices
- Recap
- Communication
- Drivers
Introduction
Peripheral Component Interconnect (PCI) and its serial cousin, PCI Express (PCIe), form a bus where components can be added to an existing system without too much headache.
In the older days of ISA and EISA busses, the wires were physically connected to certain places, such as the I/O bus and/or MMIO. Furthermore, the interrupt request vectors would be specified when the board was made. Some configuration could happen; for example, I could set a jumper or "dual-inline package" (DIP) switches to change the I/O address or IRQ vector.
With more and more boards being manufactured that do different things, this became untenable. The PCI bus solved a lot of these problems by moving much of the configuration to run time. This means that the operating system can enumerate the PCI bus and set up the addresses and other options. It also means that the I/O addresses are programmable by the operating system! This has given rise to the "plug-and-play" system.
This bus ran parallel wires at 33 MHz or 66 MHz depending on a run-time setting. The wires were physically connected from the device to the endpoint, such as the CPU or DMA controller, through a chipset "south bridge". As more and more devices were added, including those soldered directly to the motherboard, a better bus was needed. This is where PCI Express came into play.
PCI Express (PCIe) changed the parallel nature into a serial nature. It also changed the connections between devices and the host. Now, PCIe is more like a "star" network topology. Devices are connected together through a PCIe host, and older PCI cards can still be connected through a PCIe bridge. PCI Express also has different signaling wires depending on the application.
PCIe uses differential signalling, meaning two wires send out the same information 180 degrees out of phase. So, if one wire is sending a 1, the other is sending a 0, and vice-versa. The idea is that since the wires are so close together, both wires will be affected by interference in the same way. Then, you can separate the signal from the noise just by subtracting the values of the two wires.
The signals that come from PCIe come in one of the following packages: 1x, 2x, 4x, 8x, or 16x. There are options for more, but 16x is the fastest in consumer computers at present. Read these as 1x (one times) or 16x (sixteen times). The number multiplies the number of wire pairs. So, 16x has sixteen differential pairs, or 32 wires in total. Since the two wires of each pair carry the same data (just inverted), this allows 16 different signals all at once.

Signal Routing
PCI and PCIe are both "busses", which just means a central system to connect all of our devices. PCI Express can be thought of as the "Internet of devices". In other words, we have "computers" and "routers" in the Internet sense, which are called "type-0 devices" and "type-1 bridges" in the PCIe sense.
PCI Connections
A type-0 device is given its name by the "header" type, which you will see below when you go to configure the device. Every device connected to PCIe requires (1) PCI configuration access mechanism (CAM) registers and (2) device-specific registers.
Configuration Registers
We access the CAM through MMIO. This memory address is designed into the computer and system boards. For VirtIO, the CAM is connected to 0x3000_0000. Any memory address between 0x3000_0000 and 0x3FFF_FFFF will be sent to the PCI subsystem by the memory controller. This is programmed into the memory controller when the hardware is made (hard coded) or through some sort of non-volatile RAM, which is sometimes called firmware.
Address Selection and Signal Routing
When the memory controller sees the 0x3xxx_xxxx memory address, it forwards it to the PCI "root port", which is the PCI subsystem. The root port then dissects the memory address bits into two pieces: a bus number (8 bits, from 27 to 20) and a device number (5 bits, from 19 to 15). Think of the bus number as a "subnet address" on the Internet (like 192.168.1.xxx) and the device number as the host part of the IP address (like xxx.xxx.xxx.125). These two pieces of data form a PCI address.
Example PCI Address
A PCI address is formed by dissecting the MMIO address:
    Example address: 0x3076_b000
    In binary:       0011 0000 0111 0110 1011 0000 0000 0000
    Regroup the binary digits:
    0011 [00000111] [01101] [ 011  ] [  0000   ] [000000] 00
         [ Bus #  ] [Dev #] [Func #] [Ext Reg #] [Reg # ]
The memory address above points to bus #7, device #13. The function number (#3 above), extended register, and register fields select a specific register on a device. The last two bits are used for other purposes and are not used for identifying a device on the PCI bus.
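The regrouping above can be sketched as a small C helper. This is only an illustration under the bit layout shown in the example; the struct and function names (PciAddress, pci_decode_address) are made up here, and the extended register and register fields are folded into one 10-bit value.

```c
#include <stdint.h>

/* Fields of an ECAM address, following the bit layout in the example above.
 * The names are hypothetical and used only for this sketch. */
struct PciAddress {
    uint8_t  bus;       /* A[27:20] */
    uint8_t  device;    /* A[19:15] */
    uint8_t  function;  /* A[14:12] */
    uint16_t reg;       /* A[11:2]: extended register and register number */
};

/* Split an MMIO address inside the ECAM window into its PCI fields. */
struct PciAddress pci_decode_address(uint32_t mmio)
{
    struct PciAddress a;
    a.bus      = (uint8_t)((mmio >> 20) & 0xff);
    a.device   = (uint8_t)((mmio >> 15) & 0x1f);
    a.function = (uint8_t)((mmio >> 12) & 0x7);
    a.reg      = (uint16_t)((mmio >> 2) & 0x3ff);
    return a;
}
```

Passing 0x3076b000 to this helper yields bus 7, device 13, function 3, matching the hand-worked example.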
A device can have multiple "functions". For instance, consider a graphics card. It can have an audio output, HDMI output, or even a USB-C input/output. This would be three functions under the same graphics card device.
We will NOT be exploring any multifunction devices for this course, so all function numbers will be 0.
PCI Network Topology
So, at this point, the memory controller has forwarded a load/store to the PCI root port based on the memory address. The root port now receives this and separates out the bus number and device number. It then passes these values to all devices directly under it. I use the term directly because we have yet to touch the type-1 bridges. The bridges are directly connected to the root port, but we cannot see anything behind a bridge until we configure it. Think of the bridge as a router in your house. Unless you configure port forwarding or a DMZ, anyone on the outside cannot see the devices behind your router.

A type-1 device connected to PCI is called a bridge. A bridge's job is to forward signals from a primary bus to a secondary bus. The primary bus is the bus that the bridge is actually connected to. In the figure above, the primary bus for the bridge would be bus 0. We get to configure the secondary bus. As long as we pick a unique bus number, we can assign it whatever we want. However, the best thing to do is just increment the bus number for every bridge you see.
Until you configure the primary and secondary bus numbers for the bridge above, you will NOT be able to see the devices behind it (the two devices under bus 1). This is why we start at bus 0, device 0 and use a doubly-nested for-loop to go through all the devices on one bus before switching to the next. Since the bridge above is on bus 0, we will configure it before we actually start enumerating the devices on bus 1. Since the bridge is responsible for forwarding signals from bus 0 to bus 1, the bridge must be configured; otherwise, the signals won't be forwarded.
Bus Numbering for Bridges



What is a BAR?
A BAR is a base address register. This is a register in the PCI configuration access mechanism address space. Think of the BARs as memory pointers. The memory address a BAR points to connects the memory controller to a device's actual registers. Recall that devices have (1) PCI registers and (2) device-specific registers. We have yet to actually connect to #2. We can only do so by configuring the BARs.
One thing to note is that a BAR is NOT what we use to access a device. Instead, a BAR stores a memory address we can then use to access a device's registers. So, if we write 0x4041_0000 into a BAR, we can then load and store to the memory address 0x4041_0000 to access the device-specific registers.

We cannot access the device-specific registers until we give a memory address to a base address register located in the PCI configuration space.

We will take a look at the device-specific registers in the Virtual I/O lecture.
BAR Memory Pointer Routing
When we write a memory address into a BAR, the PCI subsystem blocks out a space in the memory controller. This is the power of the PCI system. We can write essentially any memory address in the BAR and then use the memory controller to access anything behind it, which will be the device-specific registers.
The device itself will tell you which BARs it uses and for which purposes. So, we have to know which kind of device we're driving (via vendor ID and device ID, see below).
Again, a BAR is a memory pointer. To access registers behind it, we load and store using the memory address the BAR points to and NOT the BAR itself.
The output below (from info pci) shows a block device. Every type-0 device can have up to six 32-bit BARs (memory pointers). However, many are not used. We can also have BARs that store 64-bit memory addresses. In that case we use BAR[x] to store the lower 32 bits and BAR[x+1] to store the upper 32 bits. For example, the output below shows BAR[4] is a 64-bit BAR, meaning that BAR[4] stores the lower 32 bits and BAR[5] stores the upper 32 bits of a 64-bit memory address.
    Bus 3, device 2, function 0:
      SCSI controller: PCI device 1af4:1042
        PCI subsystem 1af4:1100
        IRQ 0, pin A
        BAR1: 32 bit memory at 0x00000000 [0x00000fff].
        BAR4: 64 bit prefetchable memory at 0x40320000 [0x40323fff].
        id "blk3"
Recall that we can essentially put any memory address in the BAR. The QEMU virt machine reserves memory addresses 0x4000_0000 through 0x4FFF_FFFF for PCI device-specific registers. It is important to note that 0x4000_0000 is NOT a RAM address. Instead, it is an address the memory controller forwards to PCI so that it can read from (load) or write to (store) the device-specific registers on the hardware.
Above, I put the bus number starting at bit 20 and the device number starting at bit 16 (1 bit over from where it is in ECAM). The device above is on bus #3 and it is device #2. This is why the memory address pointed to by BAR4 is 0x4032_0000.
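As a rough sketch of the address-picking scheme just described, the helper below builds a BAR address from a bus and device number. The function and macro names are hypothetical. The 4-bit masks are assumptions made so the device field stays below the bus field and the result stays inside the 0x4000_0000 through 0x4FFF_FFFF range mentioned above. It also assumes each device fits in the 64 KiB this leaves it, as the 16 KiB region in the output above does.

```c
#include <stdint.h>

/* Start of the address range the text reserves for device-specific registers. */
#define PCIE_MMIO_BASE 0x40000000u

/* Hypothetical helper: bus number at bit 20, device number at bit 16.
 * Both are masked to 4 bits (an assumption of this sketch) so the device
 * number cannot spill into the bus field. */
uint32_t bar_address_for(uint8_t bus, uint8_t device)
{
    return PCIE_MMIO_BASE
         | ((uint32_t)(bus & 0xf) << 20)
         | ((uint32_t)(device & 0xf) << 16);
}
```

For bus 3, device 2 this produces 0x4032_0000, the address shown for BAR4 in the output above.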
Notice that the end memory address of BAR4 is 0x4032_3FFF. Each BAR is a pointer to a memory address, which in turn connects to the device-specific registers. These register regions can be many different sizes. We will discuss below the mechanism for determining how much memory space each BAR needs. It involves writing -1 into the BAR and seeing what comes out.
Configuration
The configuration of PCI is its power. The PCI bus has a configuration access mechanism (CAM), and PCIe extends this to a much larger address space (256 bytes to 4096 bytes) called the enhanced configuration access mechanism (ECAM).
The ECAM for the QEMU virt machine is at MMIO address 0x3000_0000. Each PCI host, bridge, and device has an ECAM, so to start configuring, we need to enumerate all of the devices attached to PCI.
To do so, there is some terminology we need to know. PCI devices are organized in a bus, slot (device), and function fashion. The bus identifies which host the device is connected to. Each host has a number of slots (aka devices) where the device actually connects. Then there is a function, which is an addressable unit of the device itself. Most PCI(e) devices have only one function (function 0). However, if bit 7 of the header_type field is set to 1, then it is a multifunction device. A multifunction device can have up to 8 functions (0 – 7), and they must be enumerated like the busses and devices.
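The multifunction test described above comes down to one bit of header_type. A minimal sketch, with hypothetical helper names, assuming the remaining seven bits carry the header layout (0 for a device, 1 for a bridge):

```c
#include <stdint.h>
#include <stdbool.h>

/* Bit 7 of header_type marks a multifunction device. */
bool pci_is_multifunction(uint8_t header_type)
{
    return (header_type & 0x80) != 0;
}

/* The low seven bits select the header layout (type 0 or type 1). */
uint8_t pci_header_layout(uint8_t header_type)
{
    return header_type & 0x7f;
}
```

An enumeration loop would only need to probe functions 1 through 7 when pci_is_multifunction() returns true for function 0.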
Enumerating the PCI Bus
To enumerate the bus, we start with bus 0, slot 0 and keep incrementing until the address space is exhausted. This is called the brute force method, but it's only done once at boot time for non-hot-plug devices. We will not be covering hot plug devices in this course. A hot plug device is a device that can be plugged in or taken out while the computer and OS are running.
For virt, we have up to 256 busses (up to 8 bits for the bus number), and each bus can have up to 32 devices (5 bits for the device number). The base address of each ECAM is given by the following diagram.

Recall that the ECAM starts at 0x3000_0000, so this would be bus 0, device 0, function 0. For ECAM, the bus number starts at bit 20, the device number starts at bit 15, the function number starts at bit 12, the extended register number starts at bit 8, and the register number starts at bit 2. So, we can calculate the address for a given bus and device using the following function.
    #include <stdint.h>

    #define MMIO_ECAM_BASE 0x30000000UL

    struct EcamHeader; // defined below

    static volatile struct EcamHeader *pcie_get_ecam(uint8_t bus,
                                                     uint8_t device,
                                                     uint8_t function,
                                                     uint16_t reg)
    {
        // Since we're shifting, we need to make sure we
        // have enough space to shift into.
        uint64_t bus64      = bus & 0xff;
        uint64_t device64   = device & 0x1f;
        uint64_t function64 = function & 0x7;
        uint64_t reg64      = reg & 0x3ff;

        // Finally, put the address together
        return (volatile struct EcamHeader *)(uintptr_t)
               (MMIO_ECAM_BASE |      // base 0x3000_0000
                (bus64 << 20) |       // bus number A[(20+n-1):20] (up to 8 bits)
                (device64 << 15) |    // device number A[19:15]
                (function64 << 12) |  // function number A[14:12]
                (reg64 << 2));        // register number A[11:2]
    }
Now, we can use our function to determine the memory address of the header.
    int bus;
    int device;
    // There are a MAXIMUM of 256 busses
    // although some implementations allow for fewer.
    // Minimum # of busses is 1
    for (bus = 0; bus < 256; bus++) {
        for (device = 0; device < 32; device++) {
            // EcamHeader is defined below
            volatile struct EcamHeader *ec = pcie_get_ecam(bus, device, 0, 0);
            // Vendor ID 0xffff means "invalid"
            if (ec->vendor_id == 0xffff) continue;
            // If we get here, we have a device.
            printf("Device at bus %d, device %d (MMIO @ 0x%08lx), class: 0x%04x\n",
                   bus, device, (unsigned long)ec, ec->class_code);
            printf("   Device ID    : 0x%04x, Vendor ID    : 0x%04x\n",
                   ec->device_id, ec->vendor_id);
        }
    }
Enhanced Configuration Address Space (ECAM)
The configuration layout is based on the header type, but the first 16 bytes are the same for all devices.

    struct EcamHeader {
        uint16_t vendor_id;
        uint16_t device_id;
        uint16_t command_reg;
        uint16_t status_reg;
        uint8_t revision_id;
        uint8_t prog_if;
        union {
            uint16_t class_code;
            struct {
                uint8_t class_subcode;
                uint8_t class_basecode;
            };
        };
        uint8_t cacheline_size;
        uint8_t latency_timer;
        uint8_t header_type;
        uint8_t bist;
        union {
            struct {
                uint32_t bar[6];
                uint32_t cardbus_cis_pointer;
                uint16_t sub_vendor_id;
                uint16_t sub_device_id;
                uint32_t expansion_rom_addr;
                uint8_t capes_pointer;
                uint8_t reserved0[3];
                uint32_t reserved1;
                uint8_t interrupt_line;
                uint8_t interrupt_pin;
                uint8_t min_gnt;
                uint8_t max_lat;
            } type0;
            struct {
                uint32_t bar[2];
                uint8_t primary_bus_no;
                uint8_t secondary_bus_no;
                uint8_t subordinate_bus_no;
                uint8_t secondary_latency_timer;
                uint8_t io_base;
                uint8_t io_limit;
                uint16_t secondary_status;
                uint16_t memory_base;
                uint16_t memory_limit;
                uint16_t prefetch_memory_base;
                uint16_t prefetch_memory_limit;
                uint32_t prefetch_base_upper;
                uint32_t prefetch_limit_upper;
                uint16_t io_base_upper;
                uint16_t io_limit_upper;
                uint8_t capes_pointer;
                uint8_t reserved0[3];
                uint32_t expansion_rom_addr;
                uint8_t interrupt_line;
                uint8_t interrupt_pin;
                uint16_t bridge_control;
            } type1;
            struct {
                uint32_t reserved0[9];
                uint8_t capes_pointer;
                uint8_t reserved1[3];
                uint32_t reserved2;
                uint8_t interrupt_line;
                uint8_t interrupt_pin;
                uint8_t reserved3[2];
            } common;
        };
    };
The PCI device is in little-endian format, so the first field is the Vendor ID, followed by the Device ID. For the QEMU virt machine, the host's vendor ID is 0x1b36, whereas each virtio device's vendor ID is 0x1af4. The device ID combined with the class code tells us what kind of device is connected.
The common parts of the header have the following meanings. There are a lot of fields, and many we will not use. The ones we will use are in bold.
- vendor_id – The vendor ID of the device. 0x0000 and 0xffff mean the device is NOT connected (and should be skipped).
- device_id – The device ID given to this device. This will identify which driver should configure the device.
- command_reg – The command register (detailed below).
- status_reg – The status register (detailed below).
- revision_id – Device-specific revision information (generally not used).
- prog_if – Programmable interface (generally not used).
- class_code – The class identifier. For example, base class (upper 8 bits) 0x09 is input, and subclass (lower 8 bits) 0x80 is "other".
- cacheline_size – The number of 32-bit words in cache (for bus master devices).
- latency_timer – The number of PCI bus clocks required for bus mastering (for bus master devices).
- header_type – The type of header (Type 0 – device, Type 1 – PCI-to-PCI bridge).
- bist – Built-in Self Test (BIST).
Type 0 headers contain the following fields.
- bar[6] – Base address registers. Programmable MMIO addresses to place up to six 32-bit or three 64-bit registers. The registers are specific to the device, including I/O and configuration. The OS will write the MMIO address to link these registers. For 64-bit registers, bar[n] is the low 32 bits of the address and bar[n+1] is the high 32 bits of the address.
- cardbus_cis_pointer – CardBus (PCMCIA) Card Information Structure (CIS) pointer.
- sub_vendor_id – The vendor ID of the attached subsystem. This is additional data to the vendor ID.
- sub_device_id – The device ID of the attached subsystem. This is additional data to the device ID.
- expansion_rom_addr – The address for the expansion ROM.
- capes_pointer – The capability pointer to the head of the capability linked list (described below).
- interrupt_line – The interrupt vector that the device is connected to. For virt, this is wired to 0.
- interrupt_pin – The interrupt pin that the device will trigger. PCI has four interrupt pins: INTA#, INTB#, INTC#, and INTD#. 0 means the interrupt is not connected, 1 = INTA#, 2 = INTB#, 3 = INTC#, and 4 = INTD#.
- min_gnt – Minimum "grant" time.
- max_lat – Maximum latency.
Command Register
The command register allows us to send some commands to the PCI device for configuration purposes (not I/O).

This is a read/write register, and it is per-device. If we use MMIO (which we will), we want to ensure Memory Space is 1 so that the PCI device can respond to MMIO reads/writes. We will not be using PIO, so the I/O Space bit should be set to 0.
Make sure that you set the command register BEFORE loading or storing to the address assigned to the BARs! All devices should have bit 1 set, and all bridges should have bits 1 and 2 set. NOTE: The "info pci" command in QEMU will not display the addresses you store into the BARs until the command register is set to accept memory space requests (bit index 1).
Keep reading further for additional considerations when writing to this register.
Status Register
The status register gives a response to a command and has the following structure.

The status register is a read/write register, but we only write 1 to the bits we want to reset. Writing a 0 into a bit will leave it unchanged.
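The write-1-to-reset behaviour above can be modelled in plain C to make the semantics concrete. This is a software model only, not a hardware access. The function name and the w1c_mask parameter (which bits behave this way) are assumptions for illustration.

```c
#include <stdint.h>

/* Model of a write-1-to-clear register: every bit written as 1 (and covered
 * by w1c_mask) is reset to 0; bits written as 0 keep their current value. */
uint16_t status_after_write(uint16_t current, uint16_t written, uint16_t w1c_mask)
{
    uint16_t cleared = (uint16_t)(written & w1c_mask);
    return (uint16_t)(current & (uint16_t)~cleared);
}
```

One consequence is that writing back the value just read clears every reported condition at once, while writing 0 is always harmless.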
PCI Bridges
PCI bridges are like a network switch. A PCI bridge uses a type 1 header, and it forwards communication to and from a separate, secondary bus. We will not be able to see any device behind a bridge until we set up the bridge.

Remember first to set the memory space and bus master bits (1 and 2) in the command register before setting these fields.
There are a few fields we have to concern ourselves with here. The first few fields are the (1) primary bus number, (2) secondary bus number, and (3) subordinate bus number. A bridge is attached to a primary bus, usually bus 0, and it forwards requests to and from the secondary bus, which we have to enumerate. Whatever value we assign to the secondary bus number will be the bus number for all devices behind the bridge. Finally, the subordinate bus number is the highest bus number that will be controlled by this bridge. If there are bridges behind other bridges, this is when we will need to set the subordinate bus number. The subordinate bus number must be >= the secondary bus number. For instance, if we are enumerating a bridge with three bridges behind it, the subordinate bus number would be secondary + 3. This implies that any nested bridges must have sequential bus numbers.
Finally, there are four fields we will need to set here: (1) the memory and prefetchable memory base and (2) the memory and prefetchable memory limit. In this case, the base is the lowest memory address that can be forwarded through the bridge, and the limit is the highest memory address that can be forwarded through the bridge.
The memory addresses we store in these fields are just the upper 16 bits of the memory address. So, if we want to allow the secondary bus to use the MMIO addresses 0x40000000 through 0x4fffffff, then we would set the memory base to 0x4000 (upper 16 bits) and the limit to 0x4fff (upper 16 bits).
Even though 16 bits of the memory address are stored in this register, only the upper 12 bits are used. This means that only what we set in address bits 20 and above will actually be recognized by the bridge.

However, if we choose our memory a little bit better, we can shift the bus number into the 20th bit, which is the first addressable bit on the bridge. For example, bus 1 would only need to forward memory transactions from 0x4010_0000 through 0x401F_FFFF. The following output of info pci shows the fourth PCI bridge connected to the root port.
    Bus 0, device 4, function 0:
      PCI bridge: PCI device 1b36:000c
        IRQ 0, pin A
        BUS 0.
        secondary bus 4.
        subordinate bus 4.
        IO range [0xf000, 0x0fff]
        memory range [0x40400000, 0x404fffff]
        prefetchable memory range [0x40400000, 0x404fffff]
        BAR0: 32 bit memory at 0x00000000 [0x00000fff].
        id "bridge4"
Notice that only memory addresses between 0x4040_0000 and 0x404F_FFFF will be forwarded to devices (and other bridges) behind this bridge. Therefore, if we set a BAR on a device behind this bridge to 0x4030_0000, the device will never hear memory transactions since the bridge is not configured to forward those addresses. Note that this bridge does have BAR0, but we are not required to configure BARs on bridges.
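To make the forwarding window concrete, here is a small sketch that decodes the 16-bit memory base and limit fields into full addresses and checks whether an address would be forwarded. The function names are hypothetical. The sketch relies on the point made above that only the upper 12 bits are used, and it treats the low 20 bits of the limit as all 1s, which matches the [0x40400000, 0x404fffff] range in the output.

```c
#include <stdint.h>
#include <stdbool.h>

/* Lowest address forwarded: the upper 12 bits of the field become A[31:20]. */
uint32_t bridge_window_base(uint16_t memory_base)
{
    return (uint32_t)(memory_base & 0xfff0) << 16;
}

/* Highest address forwarded: same decoding, with A[19:0] treated as all 1s. */
uint32_t bridge_window_limit(uint16_t memory_limit)
{
    return ((uint32_t)(memory_limit & 0xfff0) << 16) | 0x000fffffu;
}

/* Would a load/store to addr pass through a bridge configured this way? */
bool bridge_forwards(uint16_t memory_base, uint16_t memory_limit, uint32_t addr)
{
    return addr >= bridge_window_base(memory_base)
        && addr <= bridge_window_limit(memory_limit);
}
```

With base 0x4040 and limit 0x404f, an access to 0x4041_0000 is forwarded while 0x4030_0000 is not, which is the situation described above.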
Each BAR can be prefetchable or not, which is indicated by bit 3. We need to set the memory addresses for both the memory base and the prefetchable memory base.
    #define COMMAND_REG_MMIO      (1 << 1) // memory space enable (bit 1)
    #define COMMAND_REG_BUSMASTER (1 << 2) // bus master enable (bit 2)

    static void pcie_setup_bridge(volatile struct EcamHeader *ec, uint16_t bus)
    {
        // Next free bus number; bumped for every bridge we configure.
        static uint8_t subordinate = 1;
        // Give each bridge a 1 MiB window selected by its secondary bus number.
        uint64_t addrst = 0x40000000 | ((uint64_t)subordinate << 20);
        uint64_t addred = addrst + ((1 << 20) - 1);

        ec->command_reg = COMMAND_REG_MMIO | COMMAND_REG_BUSMASTER;
        ec->type1.memory_base           = addrst >> 16;
        ec->type1.memory_limit          = addred >> 16;
        ec->type1.prefetch_memory_base  = addrst >> 16;
        ec->type1.prefetch_memory_limit = addred >> 16;
        ec->type1.primary_bus_no        = bus;
        ec->type1.secondary_bus_no      = subordinate;
        ec->type1.subordinate_bus_no    = subordinate;
        subordinate += 1;
    }
We can figure out the bus through the ECAM address, but the uint16_t bus passed into this function allows us to set the bridge's primary bus. The primary bus of most bridges is the bus that it was found on. We set both memory and prefetch_memory. Which of these is accessed is controlled by bit 3 of each BAR.
The PCI bridges will hear all memory accesses from the root port. This is why we have to specify the memory base and memory limit. This is mainly for the bridges to forward the data we load or store at the memory addresses in the BARs, so with some careful planning, we can forward only the portion of the address space required.
Capabilities
The capes_pointer points to an offset, based at the top of the header, where a linked list of capabilities begins. Each capability has a unique identifier (ID), and the structure at the offset depends on the capability. The capabilities linked list lets us see what sort of things each device can do. An important capability ID is 0x09, which is the "Vendor-specific capability". We will be looking at these capabilities to determine which base address register (BAR) is connected to which part of the device.
The capabilities all share a common 2-byte sequence; however, each capability can have an expanded structure. PCI devices that have a capabilities linked list will have status_reg bit 4 (Capabilities List) set to 1. If this bit is 0, then there are no capabilities, and the capes_pointer should be considered invalid, although it will most likely be 0.

The first byte is the capability ID, and the next byte is the offset to the next capability. All offsets are based on the top of the ECAM (the address of the vendor ID field). The last capability will have the next capability set to 0, signaling there are no more capabilities.
Again, each capability has its own structure, which we will only know after reading the capability ID.
    struct Capability {
        uint8_t id;
        uint8_t next;
    };

    // ec points to the ECAM header of the device at (bus, slot, function 0).
    // Make sure there are capabilities (bit 4 of the status register).
    if (0 != (ec->status_reg & (1 << 4))) {
        unsigned char capes_next = ec->common.capes_pointer;
        while (capes_next != 0) {
            // Offsets are relative to the top of this device's ECAM.
            unsigned long cap_addr = (unsigned long)pcie_get_ecam(bus, slot, 0, 0) + capes_next;
            volatile struct Capability *cap = (volatile struct Capability *)cap_addr;
            switch (cap->id) {
                case 0x09: /* Vendor Specific */
                {
                    /* ... */
                }
                break;
                case 0x10: /* PCI-express */
                {
                }
                break;
                default:
                    printf("Unknown capability ID 0x%02x (next: 0x%02x)\n", cap->id, cap->next);
                break;
            }
            capes_next = cap->next;
        }
    }
Interrupts
For fast moving devices, such as those connected to PCIe x16, signaling an interrupt for every data transfer will get expensive and end up slowing the device. PCI and PCIe can function using message signaled interrupts (MSI) or "extended" message signaled interrupts (MSI-X).
An MSI or MSI-X is a place in memory where the PCI device signals a "message" that would normally cause an interrupt. The operating system can look at a field called the Pending Bit Array, or PBA. If an interrupt is pending, it can then handle the interrupt as normal.
MSI-X is exposed as a capability (Capability ID = 0x11), and MSI is exposed as a capability (Capability ID = 0x05). For typical cases, we use the more advanced MSI-X over MSI if it is available.
Note: MSI/MSI-X is NOT supported by RISC-V's PLIC. There is an AIA (advanced interrupt architecture) that is currently in development that will support MSI/MSI-X. However, currently, each PCIe device is assigned a PLIC interrupt between 32 and 35 based on its bus and slot. The following formula determines the interrupt number. Note that more than one PCIe device might be connected to the same interrupt!
\(\text{IRQ} = 32 + [(\text{bus} + \text{slot}) \bmod 4]\)
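The formula translates directly into C. The function name below is made up for this sketch. The constants 32 and 4 come straight from the formula.

```c
/* PLIC interrupt number for a PCIe device on the virt machine,
 * following IRQ = 32 + ((bus + slot) mod 4). */
unsigned pcie_plic_irq(unsigned bus, unsigned slot)
{
    return 32u + ((bus + slot) % 4u);
}
```

For example, the block device on bus 3, slot 2 lands on IRQ 33, and so does any other device whose bus + slot leaves a remainder of 1, which is why a handler has to ask each candidate device whether it raised the interrupt.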
INT#
Our emulating software hard-codes the interrupt of each PCIe device based on the bus and slot of the device. This means that the interrupt_pin field in the ECAM is not valid and will usually be 0.
Since multiple devices can trigger on the same interrupt pin, we have to ask each device on that interrupt pin if it caused the interrupt.
Each device has a special way of "interrupting" in this system. For most devices, those handlers will be in Virtio.
MSI

(message control bits 8 = 0, 7 = 0)


(message control bits 8 = 1, 7 = 0)

(message control bits 8 = 1, 7 = 1)
The message control register for MSI has the following bits:
Bits | R/W | Description |
---|---|---|
15:9 | RO | RESERVED |
8 | RO | Per-vector masking capable (0 = No, 1 = Yes) |
7 | RO | 64-bit address capable (0 = No, 1 = Yes) |
6:4 | RW | Multiple messages enable: 0b000 = 1 message; 0b001 = 2 messages; 0b010 = 4 messages; 0b011 = 8 messages; 0b100 = 16 messages; 0b101 = 32 messages; 0b110 and 0b111 = Reserved |
3:1 | RO | Multiple messages capable. Fields are the same as multiple messages enable. |
0 | RW | MSI Enable (set to 1 to enable MSI, set to 0 to disable MSI). |
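The table above can be turned into a small decoder. This is a sketch with made-up names (MsiControl, msi_decode_control). It reads the two message-count fields as powers of two, as listed in the table, and does not special-case the reserved encodings.

```c
#include <stdint.h>
#include <stdbool.h>

/* Decoded view of the 16-bit MSI message control register. */
struct MsiControl {
    bool     enabled;          /* bit 0 */
    unsigned msgs_capable;     /* bits 3:1, as a message count */
    unsigned msgs_enabled;     /* bits 6:4, as a message count */
    bool     addr64;           /* bit 7 */
    bool     per_vector_mask;  /* bit 8 */
};

struct MsiControl msi_decode_control(uint16_t mc)
{
    struct MsiControl c;
    c.enabled         = (mc & 1u) != 0;
    c.msgs_capable    = 1u << ((mc >> 1) & 0x7); /* 0b010 -> 4 messages */
    c.msgs_enabled    = 1u << ((mc >> 4) & 0x7); /* 0b000 -> 1 message  */
    c.addr64          = ((mc >> 7) & 1u) != 0;
    c.per_vector_mask = ((mc >> 8) & 1u) != 0;
    return c;
}
```

Bits 8 and 7 are the ones that select between the three capability layouts captioned above, so a driver would check them before deciding where the message data field sits.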
MSI-X
PCI/PCIe uses EITHER MSI or MSI-X, but NOT both. Both can be provided as capabilities; however, software (the OS) must choose one or none. If neither MSI nor MSI-X is chosen, the PCI infrastructure will use the legacy interrupt system. For RISC-V this means it will use the PLIC interrupts between 32 and 35.

The message control register allows us to configure MSI-X. The table offset contains the offset of the table, measured from the address read out of the BAR given by the BIR (Base Indicator Register) (i.e., ec->type0.bar[bir]).


The message control register is different for MSI-X as well. It is still 16 bits, but it contains the following fields.
Bits | R/W | Description |
---|---|---|
15 | RW | MSI-X enable (1 = enabled, 0 = disabled) |
14 | RW | Function mask (1 = all vectors masked, 0 = masking based on each vector's mask bit). |
13:11 | RO | Reserved |
10:0 | RO | Table size, encoded as N – 1 (the number of table entries is msg_control[10:0] + 1). |
We can set the mask bit (bit 0) in the vector control to 1 to mask (turn off) interrupts for that vector. However, if the bit is cleared (reset), then messages can be posted there. We give a 64-bit address in the message address fields, and a message in the message data field. When an interrupt is signaled, the device's function will write the data into the memory given by the message address. The message address upper field stores the upper 32 bits of a 64-bit address [63:32], and the message address field stores the lower 32 bits of a 64-bit address [31:0].
The PCI specification allows us to write the address in one store (doubleword) or in two separate stores (word). However, the vector must be masked before any changes to the message data field or to the message address field are made.
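The mask-then-update rule above can be sketched as follows. The struct and function names are hypothetical. The entry layout (address low, address high, data, vector control, with the mask in bit 0 of vector control) follows the PCI specification's MSI-X table format, and the table-size helper applies the N – 1 encoding from the message control register.

```c
#include <stdint.h>

/* One MSI-X table entry (16 bytes). */
struct MsixTableEntry {
    uint32_t msg_addr;        /* message address [31:0]  */
    uint32_t msg_upper_addr;  /* message address [63:32] */
    uint32_t msg_data;        /* value the device writes to signal */
    uint32_t vector_control;  /* bit 0 = mask */
};

/* Number of table entries encoded in message control bits 10:0. */
unsigned msix_table_entries(uint16_t msg_control)
{
    return (msg_control & 0x7ffu) + 1u;
}

/* Program one vector: mask it, change address and data, then unmask. */
void msix_set_vector(volatile struct MsixTableEntry *e, uint64_t addr, uint32_t data)
{
    e->vector_control |= 1u;                    /* mask before any change */
    e->msg_addr       = (uint32_t)addr;         /* two separate word stores */
    e->msg_upper_addr = (uint32_t)(addr >> 32);
    e->msg_data       = data;
    e->vector_control &= ~1u;                   /* allow messages again */
}
```

On real hardware the entry pointer would come from the BAR selected by the BIR plus the table offset. The sketch only shows the ordering of the stores.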
Base Address Registers (BARs)
Each base address register points to a place in memory where the registers on the device are mapped. Since we don't know the size beforehand, we have to ask the BAR what size it needs. This is done by writing all 1s for each bit in the BAR, then reading the value back out to see what it gives us. All bits of 1 are "necessary", whereas all bits of 0 are wildcards. We can determine the size that it asks for by masking the last four bits, inverting the bits, and then adding 1 (two's complement).
Remember that type 0 headers have 6 BAR register fields. Not all may be used. If a BAR is unused, we will get all 0s back when we write to it. However, it is the capabilities linked list that tells us what these BARs are used for. In other words, the BARs are for the device, not the PCI (or PCIe) bus.
There is space in the physical memory map of the QEMU virt system for us to map these base address registers, starting at 0x4000_0000 up to, but not including, 0x8000_0000. We can see this is the case by looking at the QEMU source code (hw/riscv/virt.c).
    [VIRT_PCIE_ECAM] = { 0x30000000, 0x10000000 },
    [VIRT_PCIE_MMIO] = { 0x40000000, 0x40000000 },
    [VIRT_DRAM]      = { 0x80000000, 0x0 },
Each BAR must be mapped using a certain size, which is different for each device. I recommend enumerating the device number by the hex digit represented by bits 23:20. For example, bus index 0 is at 0x400x_xxxx, bus index 1 is at 0x401x_xxxx, bus index 2 is at 0x402x_xxxx, and so forth. This allows for 16 devices at once, which is far more than you'll need. NOTE: DO NOT forget to map the addresses pointed to by the BARs in the MMU. Even though the BAR points to a physical address, when the OS accesses it, it will be a virtual address, so it must be mapped properly in a page table.
Some BARs are 64 bits and others are 32 bits. This can be determined by looking at BAR[2:1]. If this value is 0b00, it is a 32-bit BAR; if this value is 0b10, it is a 64-bit BAR. Bit index 0 (the least significant bit) determines if this is a memory-mapped BAR (0) or a PIO-mapped BAR (1). Our machine does not have a PIO bus, so we can only use MMIO BARs.
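The bit tests above are small enough to write as helpers. This is a minimal sketch that takes the raw 32-bit value read from a BAR; the function names are made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit 0: 0 = memory-mapped (MMIO) BAR, 1 = PIO-mapped BAR. */
static bool bar_is_pio(uint32_t bar)
{
    return (bar & 0x1u) != 0;
}

/* BAR[2:1] == 0b00 means a 32-bit memory BAR. */
static bool bar_is_32bit(uint32_t bar)
{
    return !bar_is_pio(bar) && ((bar >> 1) & 0x3u) == 0x0u;
}

/* BAR[2:1] == 0b10 means a 64-bit memory BAR. */
static bool bar_is_64bit(uint32_t bar)
{
    return !bar_is_pio(bar) && ((bar >> 1) & 0x3u) == 0x2u;
}
```

A 64-bit BAR occupies two consecutive BAR fields (the second holds address bits [63:32]), so a loop over the six fields should skip the next index after it sees one.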

Determining BAR Mapping Length
The base address registers only contain the starting address of the mapped memory. However, the amount of data at this memory address varies in size. We can see the size of each region if we look at the capability list.
We can determine the amount of address space and alignment a BAR needs by first disabling the BAR via the command register (make sure the I/O space and memory space bits are 0), and then by writing all 1s into the BAR. We can then read back the BAR field. Anything that is still a 1 is a valid portion of the BAR. However, anything that is 0 means that when we give the BAR a memory address, the address must be aligned by this amount.
We can use the alignment to figure out the size that the BAR maps to by taking the two's complement of the value after masking the last four bits, -(BAR & ~0xFUL). We mask off the last four bits of the BAR since those bits are used for a 1-bit "prefetchable" field, a 2-bit "size" field, and a 1-bit "memory-space identifier" field.
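As a worked example of the formula -(BAR & ~0xF): a read-back of 0xFFFF_C000 means the BAR wants 0x4000 bytes (16 KiB), aligned to 16 KiB. The sketch below assumes the all-1s write and the read-back have already happened and only does the arithmetic. The function names are made up, and the 32-bit and 64-bit cases are kept separate because the negation has to happen at the BAR's own width.

```c
#include <stdint.h>

/* Size requested by a 32-bit BAR, given the value read back after
 * writing all 1s to it. Returns 0 for an unused BAR (read-back of 0). */
static uint32_t bar32_size(uint32_t readback)
{
    uint32_t masked = readback & ~0xFu;   /* drop the four flag bits   */
    return (uint32_t)(~masked + 1u);      /* two's complement negation */
}

/* Same calculation for a 64-bit BAR, which spans two BAR fields:
 * lo holds bits [31:0] (including the flag bits), hi holds bits [63:32]. */
static uint64_t bar64_size(uint32_t lo, uint32_t hi)
{
    uint64_t readback = ((uint64_t)hi << 32) | lo;
    uint64_t masked = readback & ~0xFULL;
    return ~masked + 1ULL;
}
```

The result is both the size and the required alignment, so any address handed to the BAR should be a multiple of it.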
Assigning BARs
When you discover a type 0 device, you need to look at the BARs to assign them an MMIO space address. If a BAR has the value of 0 (all 0s), then it is an unused BAR.
Planning the bridges and using bits 27:20 as the bus number and bits 18:16 as the device number makes forwarding easy. We have all the space from 0x4000_0000 through 0x4FFF_FFFF.
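The address plan above (bus number in bits 27:20, device number in bits 18:16) can be written as one expression. This is a sketch of that plan only, not an allocator; `planned_mmio_base` is a made-up name, and the device number is truncated to the three bits the plan reserves for it.

```c
#include <stdint.h>

#define PCIE_MMIO_BASE 0x40000000u  /* VIRT_PCIE_MMIO base on QEMU virt */

/* Start of the MMIO chunk planned for a given bus/device pair. */
static uint32_t planned_mmio_base(uint32_t bus, uint32_t device)
{
    return PCIE_MMIO_BASE
         | ((bus    & 0xFFu) << 20)   /* bits 27:20: bus number    */
         | ((device & 0x07u) << 16);  /* bits 18:16: device number */
}
```

For example, bus 1, device 2 lands at 0x4012_0000. Note that this layout leaves only bits 15:0 (64 KiB) per device, so a BAR that asks for more than that would need a different scheme.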
PCI-Express Devices
Devices connected as PCIe will have the capability ID of 0x10. This will then expose a much larger configuration space. For the QEMU virt, we will not use many of these capabilities as they have to do with link negotiation and power management.
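Finding the 0x10 capability means walking the capabilities linked list. The sketch below assumes the standard PCI layout: status register at offset 0x06 with bit 4 indicating that a list exists, the capabilities pointer at offset 0x34, and each capability starting with an ID byte followed by a next-pointer byte. The function name is made up, and `cfg` is assumed to point at the first 256 bytes of one function's configuration space.

```c
#include <stdint.h>

#define PCI_STATUS_OFFSET   0x06
#define PCI_STATUS_CAP_LIST 0x10   /* status bit 4: capabilities list present */
#define PCI_CAP_POINTER     0x34
#define PCI_CAP_ID_PCIE     0x10

/* Returns the config-space offset of the first capability whose ID
 * matches wanted_id, or 0 if the device does not have it. */
static uint8_t pci_find_capability(const volatile uint8_t *cfg,
                                   uint8_t wanted_id)
{
    uint16_t status = (uint16_t)(cfg[PCI_STATUS_OFFSET]
                    | (cfg[PCI_STATUS_OFFSET + 1] << 8));
    if (!(status & PCI_STATUS_CAP_LIST))
        return 0;

    /* The low two bits of each pointer are reserved, so mask them off. */
    uint8_t off = cfg[PCI_CAP_POINTER] & 0xFC;

    /* At most 48 four-byte capabilities fit between 0x40 and 0xFF; the
     * guard keeps a malformed (circular) list from looping forever. */
    for (int guard = 0; off != 0 && guard < 48; guard++) {
        if (cfg[off] == wanted_id)
            return off;
        off = cfg[off + 1] & 0xFC;
    }
    return 0;
}
```

Calling `pci_find_capability(cfg, PCI_CAP_ID_PCIE)` then tells us whether the device is PCIe and where its PCI Express capability structure starts.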

The first field we see is the PCI Express Capabilities Register, which has the following structure.

The rest of the registers deal with actual hardware, and they don't make much sense for a virtual device. The only reason we care about the PCIe configuration is for the interrupt message number (for MSI/MSI-X).
Recap
As you can tell, there is a lot of information above. This is the tradeoff with making a bus like PCI generic and easy. It makes our lives a little bit harder. Here are the steps, boiled down, to eventually talk to a PCI device.
- Enumerate the bus from bus 0 through 255 at MMIO address 0x3000_0000.
- For each bus, enumerate the devices on the bus from device 0 through 31.
- Check the vendor ID. If it is 0xffff, it is not an attached device; skip it and go to the next device.
- Otherwise, enumerate the BARs by checking if they're 64 or 32 bits.
- You do not need to set up anything in a bridge's BARs.
- Also remember that BARs set to all 0s are not used, so they don't need an address.
- Enumerate the capabilities linked list.
- Check how much space is required for an address.
- Double check the BAR is really used by the device.
- Write all 1s (-1UL) to the BAR.
- Read back the address from the same BAR and clear bits 3:0.
- The value of -readback will be the amount of address space needed.
- Give the BARs in the capabilities list an empty chunk of memory starting with 0x4000_0000.
- We will shift the bus as part of the address offset at bit 20. So, for example, all devices behind bus 1 will be between 0x4010_0000 and 0x401F_FFFF. So, we will put in 0x4010 as the memory base and 0x401F as the memory limit for the bridge between 0 and 1.
- Communicate with the BAR's address as if it contains the device registers.
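The address arithmetic behind the first and last steps can be sketched as follows. This assumes the standard ECAM layout (bus in bits 27:20, device in bits 19:15, function in bits 14:12 of the offset from the ECAM base) and the one-megabyte-per-bus MMIO plan described in the list. All function names are made up for illustration.

```c
#include <stdint.h>

#define ECAM_BASE 0x30000000u  /* VIRT_PCIE_ECAM base on QEMU virt */
#define MMIO_BASE 0x40000000u  /* VIRT_PCIE_MMIO base on QEMU virt */

/* Address of the configuration header for bus/device/function. */
static uint32_t ecam_address(uint32_t bus, uint32_t device, uint32_t function)
{
    return ECAM_BASE | (bus << 20) | (device << 15) | (function << 12);
}

/* A vendor ID of 0xffff means nothing is attached at that slot. */
static int device_present(uint16_t vendor_id)
{
    return vendor_id != 0xFFFF;
}

/* Upper 16 bits of the first address behind the bridge leading to `bus`. */
static uint16_t bridge_memory_base(uint32_t bus)
{
    return (uint16_t)((MMIO_BASE | (bus << 20)) >> 16);
}

/* Upper 16 bits of the last address behind the bridge leading to `bus`. */
static uint16_t bridge_memory_limit(uint32_t bus)
{
    return (uint16_t)((MMIO_BASE | (bus << 20) | 0xFFFFFu) >> 16);
}
```

An enumeration loop would call `ecam_address(bus, device, 0)` for bus 0 through 255 and device 0 through 31, read the 16-bit vendor ID at that address, and skip the slot when `device_present` returns 0. For bus 1 the bridge helpers give 0x4010 and 0x401F, matching the example in the list.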
When checking and changing a BAR, make sure bit index 1 (memory space) of the command register is cleared. Only set this bit after all of the BARs are set up. If you set up a device and then the memory space bit is cleared, that device will no longer be "present", and you will have to reinitialize it.
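The ordering in that paragraph (clear the memory space bit, write every BAR, then set the bit once) might look like the sketch below. It is meant for first-time setup only and handles 32-bit BARs only; `assign_bars` and its parameters are made up, and `command` and `bars` are assumed to point at the device's command register and BAR fields in configuration space.

```c
#include <stddef.h>
#include <stdint.h>

#define PCI_COMMAND_MEMORY_SPACE 0x2u  /* command register bit index 1 */

/* Write the chosen addresses into the BARs with memory space disabled,
 * then enable memory space once at the end. An address of 0 means
 * "leave this BAR alone" (for example, because it is unused). */
static void assign_bars(volatile uint16_t *command, volatile uint32_t *bars,
                        const uint32_t *addresses, size_t count)
{
    *command &= (uint16_t)~PCI_COMMAND_MEMORY_SPACE;

    for (size_t i = 0; i < count; i++) {
        if (addresses[i] != 0)
            bars[i] = addresses[i];
    }

    *command |= PCI_COMMAND_MEMORY_SPACE;
}
```

Because the bit is set only after the loop, the device never decodes a half-programmed set of BARs.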
Communicating with PCI(e) Devices
Setup takes time, but luckily it only has to be done once. Now that we've set up the base address registers, and we know the capabilities and type of each device, we can forward this information to the correct driver. Recall that PCI relays three pieces of useful information in this regard: (1) vendor ID, (2) device ID, and (3) class ID. The class is the broadest, most general area where we can start forwarding the information to a device driver. Then, the device abstraction can determine which specific driver to use based on the device and vendor ID.
Device drivers should never touch the PCI configuration or BARs directly. Instead, they should go through the PCI subsystem in order to read or write or to more specifically configure a device.
The data at each BAR means something different for each device. You can see that PCI setup for all devices is the same until we get to the capabilities and finally the BARs.
Drivers
When we enumerate the PCI bus, we have to forward the device we found, based on the vendor ID and device ID, to a driver that knows how to talk to that device. We can create a list of vendor IDs and device IDs, each with a callback function that will be used to handle the device.
The command register's bit index 1 allows the PCI root or bridge to forward requests to a specific device. However, if we disable this, the device itself may reset, so only change the command register when changing BARs.
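One way to hold such a list is a static table of (vendor ID, device ID, callback) entries that the enumeration code searches. This is a minimal sketch: the IDs 0xABCD:0x0001 and 0xABCD:0x0002 are placeholders rather than real devices, and the probe functions stand in for real driver entry points.

```c
#include <stddef.h>
#include <stdint.h>

/* Callback invoked when a matching device is found at bus/device. */
typedef int (*pci_probe_fn)(uint8_t bus, uint8_t device);

struct pci_driver {
    uint16_t     vendor_id;
    uint16_t     device_id;
    pci_probe_fn probe;
};

/* Placeholder probe functions; a real driver would initialize the device. */
static int example_probe_a(uint8_t bus, uint8_t device)
{
    (void)bus; (void)device;
    return 1;
}

static int example_probe_b(uint8_t bus, uint8_t device)
{
    (void)bus; (void)device;
    return 2;
}

/* Placeholder IDs, not real hardware. */
static const struct pci_driver driver_table[] = {
    { 0xABCD, 0x0001, example_probe_a },
    { 0xABCD, 0x0002, example_probe_b },
};

/* Returns the callback registered for this vendor/device pair, or NULL. */
static pci_probe_fn pci_find_driver(uint16_t vendor_id, uint16_t device_id)
{
    size_t n = sizeof(driver_table) / sizeof(driver_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (driver_table[i].vendor_id == vendor_id &&
            driver_table[i].device_id == device_id)
            return driver_table[i].probe;
    }
    return NULL;
}
```

During enumeration, a non-NULL result from `pci_find_driver` would be called with the bus and device numbers; a NULL result means no driver claims the device and it can be left unconfigured.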
Source: https://marz.utk.edu/my-courses/cosc562/pcie/
Posted by: fayexameste.blogspot.com