Commit 3b5724d

drm/i915: Wait for writes through the GTT to land before reading back
If we quickly switch from writing through the GTT to a read of the physical page directly with the CPU (e.g. performing relocations through the GTT and then running the command parser), we can observe that the writes are not visible to the CPU. It is not a coherency problem, as extensive investigations with clflush have demonstrated, but a mere timing issue - we have to wait for the GTT to complete its write before we start our read from the CPU.

The issue can be illustrated in userspace with:

        gtt = gem_mmap__gtt(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
        cpu = gem_mmap__cpu(fd, handle, 0, OBJECT_SIZE, PROT_READ | PROT_WRITE);
        gem_set_domain(fd, handle, I915_GEM_DOMAIN_GTT, I915_GEM_DOMAIN_GTT);

        for (i = 0; i < OBJECT_SIZE / 64; i++) {
                int x = 16*i + (i%16);
                gtt[x] = i;
                clflush(&cpu[x], sizeof(cpu[x]));
                assert(cpu[x] == i);
        }

Experimenting with that shows that this behaviour is indeed limited to recent Atom-class hardware.

Testcase: igt/gem_exec_flush/basic-batch-default-cmd #byt
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Joonas Lahtinen <[email protected]>
Link: http://patchwork.freedesktop.org/patch/msgid/[email protected]
1 parent a314d5c commit 3b5724d
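
The clflush() call in the reproduction above is an igt-style helper rather than a standard C function. A minimal x86 sketch of what such a helper does, using the SSE2 intrinsics (an illustration only; the function name, signature and 64-byte cacheline assumption are mine, not taken from the igt sources), could look like:

        #include <emmintrin.h>  /* _mm_clflush, _mm_mfence */
        #include <stddef.h>
        #include <stdint.h>

        /* Flush every cacheline covering [addr, addr + len) out of the CPU
         * caches, fencing before and after so the flushes are ordered against
         * the surrounding accesses. Assumes 64-byte cachelines.
         */
        static void clflush(const void *addr, size_t len)
        {
                uintptr_t p = (uintptr_t)addr & ~(uintptr_t)63;
                uintptr_t end = (uintptr_t)addr + len;

                _mm_mfence();
                for (; p < end; p += 64)
                        _mm_clflush((const void *)p);
                _mm_mfence();
        }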

File tree

1 file changed: +11 -1 lines changed

drivers/gpu/drm/i915/i915_gem.c

Lines changed: 11 additions & 1 deletion
@@ -3173,20 +3173,30 @@ i915_gem_clflush_object(struct drm_i915_gem_object *obj,
 static void
 i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj)
 {
+        struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
         uint32_t old_write_domain;
 
         if (obj->base.write_domain != I915_GEM_DOMAIN_GTT)
                 return;
 
         /* No actual flushing is required for the GTT write domain. Writes
-         * to it immediately go to main memory as far as we know, so there's
+         * to it "immediately" go to main memory as far as we know, so there's
          * no chipset flush. It also doesn't land in render cache.
          *
         * However, we do have to enforce the order so that all writes through
         * the GTT land before any writes to the device, such as updates to
         * the GATT itself.
+         *
+         * We also have to wait a bit for the writes to land from the GTT.
+         * An uncached read (i.e. mmio) seems to be ideal for the round-trip
+         * timing. This issue has only been observed when switching quickly
+         * between GTT writes and CPU reads from inside the kernel on recent hw,
+         * and it appears to only affect discrete GTT blocks (i.e. on LLC
+         * system agents we cannot reproduce this behaviour).
         */
         wmb();
+        if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv))
+                POSTING_READ(RING_ACTHD(dev_priv->engine[RCS].mmio_base));
 
         old_write_domain = obj->base.write_domain;
         obj->base.write_domain = 0;
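
For readability, here is a sketch of how the flush routine reads with the hunk applied, reconstructed directly from the diff above; the long in-code comment is condensed, the helpers (POSTING_READ, RING_ACTHD, INTEL_GEN, HAS_LLC) are assumed from the i915 code of this period, and the remainder of the function lies outside this hunk:

        static void
        i915_gem_object_flush_gtt_write_domain(struct drm_i915_gem_object *obj)
        {
                struct drm_i915_private *dev_priv = to_i915(obj->base.dev);
                uint32_t old_write_domain;

                if (obj->base.write_domain != I915_GEM_DOMAIN_GTT)
                        return;

                /* GTT writes need no chipset flush, but must be ordered ahead of
                 * writes to the device, and on non-LLC gen6+ hardware they also
                 * need time to land before the CPU reads the backing pages; an
                 * uncached mmio read provides that round trip.
                 */
                wmb();
                if (INTEL_GEN(dev_priv) >= 6 && !HAS_LLC(dev_priv))
                        POSTING_READ(RING_ACTHD(dev_priv->engine[RCS].mmio_base));

                old_write_domain = obj->base.write_domain;
                obj->base.write_domain = 0;

                /* ... rest of the function unchanged and outside this hunk ... */
        }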
