IOMMU – DMAR fault – PTE Read access is not set

Technical environment

  • Debian Jessie 8.1 KVM host machine
  • Standard GNU/Linux Kernel 3.16.7
  • libvirt 1.2.9
  • qemu-kvm 2.1.2
  • Emulex OneConnect E820 10 Gigabit Dual Port

Issue

All we need to benefit from SR-IOV performance is hardware with an IOMMU enabled (VT-d on Intel platforms). On Intel-based hardware, enabling the IOMMU only requires the following kernel boot parameter:

intel_iommu=on
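To confirm that the parameter really reached the running kernel, the command line exposed in /proc/cmdline can be inspected. The following is only a minimal sketch: the function name kernel_params is made up for illustration, and the parser splits naively on whitespace, so it assumes no quoted values containing spaces.

```python
def kernel_params(cmdline: str) -> dict:
    """Parse a kernel command line into a {name: value} mapping.

    Hypothetical helper: flags without '=' (e.g. 'quiet') map to None.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Return the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read().strip()
```

On the KVM host, kernel_params(read_cmdline()).get("intel_iommu") should then return the value passed at boot, or None if the parameter is missing.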

Although this works on many systems, the boot process systematically fails on some servers:

DMAR:[fault reason 06] PTE Read access is not set
dmar: DRHD: handling fault status reg 2
dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr cd43c00
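When the message scrolls by in a loop, it helps to reduce the log to "which device, which kind of access, which reason". The sketch below does that with two regular expressions written against the three lines quoted above; the patterns and the function name summarize_dmar_faults are assumptions for illustration, not a kernel-defined format.

```python
import re
from collections import Counter

# Patterns modelled on the log lines quoted in this post (an assumption:
# other kernel versions may word these messages differently).
REQUEST_RE = re.compile(
    r"DMAR:\[DMA (?P<access>Read|Write)\] Request device "
    r"\[(?P<bdf>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\] fault addr (?P<addr>[0-9a-f]+)"
)
REASON_RE = re.compile(r"DMAR:\[fault reason (?P<code>\d+)\] (?P<text>.+)")


def summarize_dmar_faults(log: str):
    """Count faults per (device, access) and collect the fault reasons seen."""
    devices = Counter()
    reasons = {}
    for line in log.splitlines():
        match = REQUEST_RE.search(line)
        if match:
            devices[(match["bdf"], match["access"])] += 1
            continue
        match = REASON_RE.search(line)
        if match:
            reasons[int(match["code"])] = match["text"].strip()
    return devices, reasons
```

Fed with the output of dmesg or a saved console capture, the counter shows at a glance whether a single device (here 01:00.0) is responsible for all the faults.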

This happens, for instance, on Dell PowerEdge R420 servers running a stock Debian Jessie. There, the operating system cannot start because the DMAR error message repeats in a loop: every DMA read request from the PCI device [01:00.0] fails. That device is the RAID controller:

01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2008 [Falcon] (rev 03)

Is this a known bug in kernel 3.16 with the MegaRAID controller? Would an older or a more recent kernel solve it? Probably, but regardless, I want to keep the default kernel (3.16) shipped with the latest stable Debian release.

Solution

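A quick fix consists of replacing the kernel boot parameter ‘intel_iommu=on’ with ‘intel_iommu=pt’ “to set up pass through (PT) mode in context mapping entry”. DMAR is then disabled in the GNU/Linux kernel, but KVM still benefits from the IOMMU and interrupt remapping. Note that the kernel's parameter documentation lists pass-through as the generic ‘iommu=pt’ option rather than as a value of ‘intel_iommu=’, so it is worth checking the boot log to confirm which mode was actually selected. With the fix in place, the boot log shows a single DMA fault instead of a loop: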
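As an illustration of the parameter swap, here is a small sketch that rewrites the GRUB_CMDLINE_LINUX_DEFAULT line of a GRUB 2 defaults file. It assumes the host boots through GRUB 2 with its settings in /etc/default/grub (the Debian default, but not something stated above) and that the value is double-quoted; set_kernel_param is a hypothetical helper name.

```python
import re

# Matches the double-quoted default command line in a GRUB 2 defaults file.
CMDLINE_RE = re.compile(r'^(GRUB_CMDLINE_LINUX_DEFAULT=")([^"]*)(")', re.MULTILINE)


def set_kernel_param(grub_text: str, name: str, value: str) -> str:
    """Return grub_text with name=value set exactly once on the default cmdline."""

    def rewrite(match):
        # Drop any earlier setting of the same parameter, then append the new one.
        kept = [t for t in match.group(2).split() if t.partition("=")[0] != name]
        kept.append(f"{name}={value}")
        return match.group(1) + " ".join(kept) + match.group(3)

    new_text, count = CMDLINE_RE.subn(rewrite, grub_text)
    if count == 0:
        raise ValueError("GRUB_CMDLINE_LINUX_DEFAULT not found")
    return new_text
```

Calling set_kernel_param(text, "intel_iommu", "pt") on the file's content returns the updated text. It still has to be written back, followed by update-grub and a reboot, before the change takes effect.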

[ 0.043564] dmar: Host address width 46
[ 0.043566] dmar: DRHD base: 0x000000d4800000 flags: 0x0
[ 0.043572] dmar: IOMMU 0: reg_base_addr d4800000 ver 1:0 cap d2078c106f0466 ecap f020de
[ 0.043574] dmar: DRHD base: 0x000000df100000 flags: 0x1
[ 0.043578] dmar: IOMMU 1: reg_base_addr df100000 ver 1:0 cap d2078c106f0466 ecap f020de
[ 0.043579] dmar: RMRR base: 0x000000cf458000 end: 0x000000cf46ffff
[ 0.043580] dmar: RMRR base: 0x000000cf450000 end: 0x000000cf450fff
[ 0.043581] dmar: RMRR base: 0x000000cf452000 end: 0x000000cf452fff
[ 0.043582] dmar: ATSR flags: 0x0
[ 0.043703] IOAPIC id 2 under DRHD base 0xd4800000 IOMMU 0
[ 0.043704] IOAPIC id 0 under DRHD base 0xdf100000 IOMMU 1
[ 0.043705] IOAPIC id 1 under DRHD base 0xdf100000 IOMMU 1
[ 0.043706] HPET id 0 under DRHD base 0xdf100000
[ 0.043707] Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.043708] Your BIOS is broken and requested that x2apic be disabled.
This will slightly decrease performance.
Use 'intremap=no_x2apic_optout' to override BIOS request.
[ 0.043712] dmar: DRHD: handling fault status reg 102
[ 0.043806] dmar: DMAR:[DMA Read] Request device [01:00.0] fault addr cd43c000
DMAR:[fault reason 06] PTE Read access is not set
[ 0.044382] Enabled IRQ remapping in xapic mode
[ 0.044383] x2apic not enabled, IRQ remapping is in xapic mode
[ 0.044387] Switched APIC routing to physical flat.
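The three facts that matter in this log are that IRQ remapping is enabled, in which APIC mode, and how many DMA faults remain. A rough sketch for extracting them from a saved boot log follows; the function name summarize_boot_log is hypothetical, and the "x2apic" variant of the IRQ remapping message is assumed by analogy with the "xapic" line shown above.

```python
import re

IRQ_REMAP_RE = re.compile(r"Enabled IRQ remapping in (x2apic|xapic) mode")
DMA_FAULT_RE = re.compile(r"DMAR:\[DMA (?:Read|Write)\]")


def summarize_boot_log(log: str) -> dict:
    """Extract the IRQ remapping mode, DMA fault count and x2apic opt-out flag."""
    mode = IRQ_REMAP_RE.search(log)
    return {
        # 'xapic', 'x2apic', or None when IRQ remapping was not reported.
        "irq_remapping": mode.group(1) if mode else None,
        # Number of DMA fault lines; a looping fault shows up as a large count.
        "dma_faults": len(DMA_FAULT_RE.findall(log)),
        # True when the BIOS asked the kernel to keep x2apic disabled.
        "x2apic_opt_out": "requested that x2apic be disabled" in log,
    }
```

Applied to the log above, the result should report IRQ remapping in xapic mode, one DMA fault, and the BIOS opt-out from x2apic.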

Note that x2APIC is not supported on 12th-generation Dell PowerEdge servers such as the R420. Starting with the 13th generation, x2APIC mode has to be enabled in the BIOS.