The motivation behind this new interface is expose at runtime the
creation of new OA configs which can be used as part of the i915 perf
open interface. This will enable the kernel to learn new configs which
may be experimental, or otherwise not part of the core set currently
available through the i915 perf interface.
v2: Drop DRM_ERROR for userspace errors (Matthew)
Add padding to userspace structure (Matthew)
s/guid/uuid/ (Matthew)
v3: Use u32 instead of int to iterate through registers (Matthew)
v4: Lock access to dynamic config list (Lionel)
v5: by Matthew:
Fix uninitialized error values
Fix incorrect unwiding when opening perf stream
Use kmalloc_array() to store register
Use uuid_is_valid() to valid config uuids
Declare ioctls as write only
Check padding members are set to 0
by Lionel:
Return ENOENT rather than EINVAL when trying to remove non
existing config
v6: by Chris:
Use ref counts for OA configs
Store UUID in drm_i915_perf_oa_config rather then using pointer
Shuffle fields of drm_i915_perf_oa_config to avoid padding
v7: by Chris
Rename uapi pointers fields to end with '_ptr'
v8: by Andrzej, Marek, Sebastian
Update register whitelisting
by Lionel
Add more register names for documentation
Allow configuration programming in non-paranoid mode
Add support for value filter for a couple of registers already
programmed in other part of the kernel
v9: Documentation fix (Lionel)
Allow writing WAIT_FOR_RC6_EXIT only on Gen8+ (Andrzej)
v10: Perform read access_ok() on register pointers (Lionel)
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Andrzej Datczuk <andrzej.datczuk@intel.com>
Reviewed-by: Andrzej Datczuk <andrzej.datczuk@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170803165812.2373-2-lionel.g.landwerlin@intel.com
In the following commit we'll introduce loadable userspace
configs. This change reworks how configurations are handled in the
perf driver and retains only the test configurations in kernel space.
We now store the test config in dev_priv and resolve the id only once
when opening the perf stream. The OA config is then handled through a
pointer to the structure holding the configuration details.
v2: Rework how test configs are handled (Lionel)
v3: Use u32 to hold number of register (Matthew)
v4: Removed unused dev_priv->perf.oa.current_config variable (Matthew)
v5: Lock device when accessing exclusive_stream (Lionel)
v6: Ensure OACTXCONTROL is always reprogrammed (Lionel)
v7: Switch a couple of index variable from int to u32 (Matthew)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170803165812.2373-3-lionel.g.landwerlin@intel.com
The hdmi bits simply don't exist, so nerf them. I think audio doesn't
work on gen3 at all, and for the limited color range we should
probably use the colorimetry sdvo paramater instead of the bit in the
port.
But fixing sdvo isn't my goal, I just want to get the backtrace out of
the way, and this takes care of that.
Still, while at it fix the missing read-out of the gen4 audio bit,
maybe that part even works ...
v2: Instead of trying to plug the damage in ->compute_config() make
sure we never set intel_sdvo->is_hdmi, which stops the bad state at
the source. Suggested by Chris Wilson. Also make sure we don't break
this by accident by putting a WARN_ON in place.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170726193251.25393-1-daniel.vetter@ffwll.ch
lockdep complaints about a locking recursion for the i2c bus lock
because both the sdvo ddc proxy bus and the gmbus nested within use
the same locking class. It's not really a deadlock since we never nest
the other way round, but it's annoying.
Fix it by pulling the gmbus locking into the i2c lock_ops for both
i2c_adapater and making sure that the ddc_proxy_xfer function is
entirely lockless.
Re-layouting the extracted function resulted in some whitespace
cleanups, I figured we might as well keep them.
v2: Review from Chris:
- s/locked/unlocked/ since I got the naming backwards
- Use the vfuncs of the proxied adatper instead of re-rolling copies.
That's more consistent with the other proxying we're doing.
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170726132647.31833-1-daniel.vetter@ffwll.ch
Reduce acquisition of struct_mutex to the critical regions that must
hold it; for KMS, we need struct_mutex currently only for the purpose of
pinning/unpinning the framebuffer's VMA into the global GTT. This allows
us to avoid taking the struct_mutex when disabling the CRTC (i.e. NULL
framebuffer objects) before a reset. (Not yet achieving the full goal of
avoiding the strut_mutex nesting, but good enough to break the first
half of the reset deadlock.)
v2: Keep pages pinning inside struct_mutex for the moment.
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Reviewed-by: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170726160038.29487-1-chris@chris-wilson.co.uk
[danvet: Drop another case of grabbing dev->struct_mutex around
cleanup_planes, which popped up because I had to redo the drm-next
backmerge for entirely different reasons. Acked by Chris on irc.]
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
When reading the i915_energy_uJ debugfs file, it tries to fetch
MSR_RAPL_POWER_UNIT, which might not be available, like in a vm
environment, causing the exception shown below.
We can easily prevent it by doing a rdmsrl_safe read instead, which will
handle the exception, allowing us to abort the debugfs file read.
This was caught by the new igt@debugfs_test@read_all_entries testcase in
the CI.
unchecked MSR access error: RDMSR from 0x606 at rIP:0xffffffffa0078f66
(i915_energy_uJ+0x36/0xb0 [i915])
Call Trace:
seq_read+0xdc/0x3a0
full_proxy_read+0x4f/0x70
__vfs_read+0x23/0x120
? putname+0x4f/0x60
? rcu_read_lock_sched_held+0x75/0x80
? entry_SYSCALL_64_fastpath+0x5/0xb1
vfs_read+0xa0/0x150
SyS_read+0x44/0xb0
entry_SYSCALL_64_fastpath+0x1c/0xb1
RIP: 0033:0x7f1f5e9f4500
RSP: 002b:00007ffc77e65cf8 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: ffffffff8146e003 RCX: 00007f1f5e9f4500
RDX: 0000000000000200 RSI: 00007ffc77e65d10 RDI: 0000000000000006
RBP: ffffc900007abf88 R08: 0000000001eaff20 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000006 R14: 0000000000000005 R15: 0000000001eb94db
? __this_cpu_preempt_check+0x13/0x20
v2:
- Drop unsigned long long cast and improve calculation (Chris)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101901
Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.co.uk>
Link: https://patchwork.freedesktop.org/patch/msgid/87o9s7zrx3.fsf@dilma.collabora.co.uk
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The goal here was to minimise doing any thing or any check inside the
kernel that was not strictly required. For a userspace that assumes
complete control over the cache domains, the kernel is usually using
outdated information and may trigger clflushes where none were
required.
However, swapping is a situation where userspace has no knowledge of the
domain transfer, and will leave the object in the CPU cache. The kernel
must flush this out to the backing storage prior to use with the GPU. As
we use an asynchronous task tracked by an implicit fence for this, we
also need to cancel the ASYNC flag on the object so that the object will
wait for the clflush to complete before being executed. This also absolves
userspace of the responsibility imposed by commit 77ae995789 ("drm/i915:
Enable userspace to opt-out of implicit fencing") that its needed to ensure
that the object was out of the CPU cache prior to use on the GPU.
Fixes: 77ae995789 ("drm/i915: Enable userspace to opt-out of implicit fencing")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101571
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721145037.25105-5-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
I was being overly paranoid in not updating the execobject.offset after
performing the fallback copy where we set reloc.presumed_offset to -1.
The thinking was to ensure that a subsequent NORELOC execbuf would be
forced to process the invalid relocations. However this is overkill so
long as we *only* update the execobject.offset following a successful
update of the relocation value witin the batch. If we have to repeat the
execbuf due to a later interruption, then we may skip the relocations on
the second pass (honouring NORELOC) since the execobject.offset match
the actual offsets (even though reloc.presumed_offset is garbage).
Subsequent calls to execbuf with NORELOC should themselves ensure that
the reloc.presumed_offset have been corrected in case of future
migration.
Reporting back the actual execobject.offset, even when
reloc.presumed_offset is garbage, ensures that reuse of those objects
use the latest information to avoid relocations.
Fixes: 2889caa923 ("drm/i915: Eliminate lots of iterations over the execobjects array")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101635
Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170721145037.25105-4-chris@chris-wilson.co.uk
Reviewed-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To support ycbcr output, we need a pipe CSC block to do
RGB->YCBCR conversion.
Current Intel platforms have only one pipe CSC unit, so
we can either do color correction using it, or we can perform
RGB->YCBCR conversion.
This function adds a csc handler, which uses recommended bspec
values to perform RGB->YCBCR conversion (target color space BT709)
V2: Rebase
V3: Rebase
V4: Rebase
V5: Addressed review comments from Ander
- Remove extra line added in the patch
- Add the spec details in the commit message
- Combine two if(cond) while calling intel_crtc_compute_config
V6: Handle YCBCR420 outputs only (Ville)
V7: Addressed review comments from Ville:
- Add description about target colorspace
- Remove the comments from CSC function
- DRM_DEBUG->DEBUG_KMS for atomic failure due to CSC unit busy
- Remove unnecessary debug message about YCBCR420 possibe
V8: Addressed review comments from Ville:
- Remove extra comment, not required.
- Do not add extra variable for CTM, reuse pipe_config
Added r-b from Ville
V9: Remove extra whitespace (Imre)
V10: Added r-b from Imre
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1500650709-14447-5-git-send-email-shashank.sharma@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To get HDMI YCBCR420 output, the PIPEMISC register should be
programmed to:
- Generate YCBCR output (bit 11)
- In case of YCBCR420 outputs, it should be programmed in full
blend mode to use the scaler in 5x3 ratio (bits 26 and 27)
This patch:
- Adds definition of these bits.
- Programs PIPEMISC for YCBCR420 outputs.
- Adds readouts to compare HW and SW states.
V2: rebase
V3: rebase
V4: rebase
V5: added r-b from Ander
V6: Handle only YCBCR420 outputs (ville)
V7: rebase
V8: Addressed review comments from Ville
- Add readouts for state->ycbcr420 and 420 pixel_clock.
- Handle warning due to mismatch in clock for ycbcr420 clock.
- Rename PIPEMISC macros to match the Bspec.
- Add a debug print stating if YCBCR 4:2:0 output enabled.
Added r-b from Ville
V9: Addressed review comments from Imre:
- Add 420 mode clock adjustment in intel_hdmi_mode_valid to
prevent 420_only modes getting rejected for high clock.
- Add port clock adjustment for ycbcr420 modes in ddi_get_clock
- Rename macros as per Ville's suggestion.
- Remove unnecessary wl changes.
V10: Added r-b from Imre
V11: Fixed faulty dotclock handling, and addressed missing comment
from previous set of review comments (Imre)
V12: Fixed dotclock for 12bpc too, removed 420 check for GEN < 10
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1500904172-31717-1-git-send-email-shashank.sharma@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
To get a YCBCR420 output from intel platforms, we need one
scaler to scale down YCBCR444 samples to YCBCR420 samples.
This patch:
- Does scaler allocation for HDMI ycbcr420 outputs.
- Programs PIPE_MISC register for ycbcr420 output.
V2: rebase
V3: rebase
V4: rebase
V5: addressed review comments from Ander:
- No need to check both scaler_user && hdmi_output.
Check for scaler_user is enough.
V6: rebase
V7: Do not create a new scaler user, use existing pipe scaler user.
V8: rebase
V9: Addressed review comments from Ville:
- Remove leftover comment for HDMI scaler user.
- Remove unnecessary blank line.
- Make scaler alocation failure a DEBUG log instead of ERROR.
Added r-b from Ville
V10: Update commit message as per latest code (Imre)
Added r-b from Imre
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Ander Conselvan De Oliveira <conselvan2@gmail.com>
Cc: Imre Deak <imre.deak@intel.com>
Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1500650709-14447-3-git-send-email-shashank.sharma@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
This patch checks encoder level support for YCBCR420 outputs.
The logic goes as simple as this:
If the input mode is YCBCR420-only mode: prepare HDMI for
YCBCR420 output, else continue with RGB output mode.
It checks if the mode is YCBCR420 and source can support this
output then it marks the ycbcr_420 output indicator into crtc
state, for further staging in driver.
V2: Split the patch into two, kept helper functions in DRM layer.
V3: Changed the compute_config function based on new DRM API.
V4: Rebase
V5: Rebase
V6: Check and handle YCBCR420-only modes, discard the property
based approach (Ville)
V7: Addressed review comments from Ville
- add else case in 12BPC check.
- extract ycbcr420 state inside hdmi_12bpc_possible function.
V8: Addressed review comments from Ville
- Remove extra blank lines.
- Remove "HDMI" from the description of ycbcr420 state variable.
- Remove local variable, use crtc_state->ycbcr420 instead.
Added r-b from Ville.
V9: Rebase
V10: Added r-b from Imre
Cc: Ville Syrjala <ville.syrjala@linux.intel.com>
Cc: Daniel Vetter <daniel.vetter@intel.com>
Cc: Ander Conselvan de Oliveira <conselvan2@gmail.com>
Reviewed-by: Ville Syrjala <ville.syrjala@linux.intel.com>
Reviewed-by: Imre Deak <imre.deak@intel.com>
Signed-off-by: Shashank Sharma <shashank.sharma@intel.com>
Signed-off-by: Imre Deak <imre.deak@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/1500650709-14447-2-git-send-email-shashank.sharma@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
The pattern of a power well backing a set of fuses whose initialization
we need to wait for during power well enabling is common to all GEN9+
platforms. Adding support for this to the HSW power well enable helper
allows us to use the HSW/BDW power well code for GEN9+ as well in a
follow-up patch.
v2:
- Use an enum for power gates instead of raw numbers. (Ville)
Signed-off-by: Imre Deak <imre.deak@intel.com>
Reviewed-by: Arkadiusz Hiler <arkadiusz.hiler@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20170711204236.5618-6-imre.deak@intel.com
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>