[PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v4) (original) (raw)
Rob Clark robdclark at gmail.com
Mon Feb 17 10:24:17 PST 2014
- Previous message: [PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v4)
- Next message: [PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v4)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
On Mon, Feb 17, 2014 at 12:36 PM, Christian König <deathsimple at vodafone.de> wrote:
Am 17.02.2014 18:27, schrieb Rob Clark:
On Mon, Feb 17, 2014 at 11:56 AM, Christian König <deathsimple at vodafone.de> wrote:
Am 17.02.2014 16:56, schrieb Maarten Lankhorst:
This type of fence can be used with hardware synchronization for simple hardware that can block execution until the condition (dmabuf[offset] - value) >= 0 has been met.
Can't we make that just "dmabuf[offset] != 0" instead? As far as I know this way it would match the definition M$ uses in their WDDM specification and so make it much more likely that hardware supports it. well 'buf[offset] >= value' at least means the same slot can be used for multiple operations (with increasing values of 'value').. not sure if that is something people care about. =value seems to be possible with adreno and radeon. I'm not really sure about others (although I presume it as least supported for nv desktop stuff). For hw that cannot do >=value, we can either have a different fence implementation which uses the !=0 approach. Or change seqno-fence implementation later if needed. But if someone has hw that can do !=0 but not >=value, speak up now ;-) Here! Radeon can only do >=value on the DMA and 3D engine, but not with UVD or VCE. And for the 3D engine it means draining the pipe, which isn't really a good idea.
hmm, ok.. forgot you have a few extra rings compared to me. Is UVD re-ordering from decode-order to display-order for you in hw? If not, I guess you need sw intervention anyways when a frame is done for frame re-ordering, so maybe hw->hw sync doesn't really matter as much as compared to gpu/3d->display. For dma<->3d interactions, seems like you would care more about hw<->hw sync, but I guess you aren't likely to use GPU A to do a resolve blit for GPU B..
For 3D ring, I assume you probably want a CP_WAIT_FOR_IDLE before a CP_MEM_WRITE to update fence value in memory (for the one signalling the fence). But why would you need that before a CP_WAIT_REG_MEM (for the one waiting for the fence)? I don't exactly have documentation for adreno version of CP_WAIT_REG_{MEM,EQ,GTE}.. but PFP and ME appear to be same instruction set as r600, so I'm pretty sure they should have similar capabilities.. CP_WAIT_REG_MEM appears to be same but with 32bit gpu addresses vs 64b.
BR, -R
Christian.
Apart from that I still don't like the idea of leaking a drivers IRQ context outside of the driver, but without a proper GPU scheduler there probably isn't much alternative. I guess it will be not uncommon scenario for gpu device to just need to kick display device to write a few registers for a page flip.. probably best not to schedule a worker just for this (unless the signalled device otherwise needs to). I think it is better in this case to give the signalee some rope to hang themselves, and make it the responsibility of the callback to kick things off to a worker if needed. BR, -R Christian. A software fallback still has to be provided in case the fence is used with a device that doesn't support this mechanism. It is useful to expose this for graphics cards that have an op to support this.
Some cards like i915 can export those, but don't have an option to wait, so they need the software fallback. I extended the original patch by Rob Clark. v1: Original v2: Renamed from bikeshed to seqno, moved into dma-fence.c since not much was left of the file. Lots of documentation added. v3: Use fenceops instead of custom callbacks. Moved to own file to avoid circular dependency between dma-buf.h and fence.h v4: Add spinlock pointer to seqnofenceinit Signed-off-by: Maarten Lankhorst <maarten.lankhorst at canonical.com> --- Documentation/DocBook/device-drivers.tmpl | 1 drivers/base/fence.c | 50 +++++++++++++ include/linux/seqno-fence.h | 109 +++++++++++++++++++++++++++++ 3 files changed, 160 insertions(+) create mode 100644 include/linux/seqno-fence.h diff --git a/Documentation/DocBook/device-drivers.tmpl b/Documentation/DocBook/device-drivers.tmpl index 7a0c9ddb4818..8c85c20942c2 100644 --- a/Documentation/DocBook/device-drivers.tmpl +++ b/Documentation/DocBook/device-drivers.tmpl @@ -131,6 +131,7 @@ X!Edrivers/base/interface.c !Edrivers/base/dma-buf.c !Edrivers/base/fence.c !Iinclude/linux/fence.h +!Iinclude/linux/seqno-fence.h !Edrivers/base/reservation.c !Iinclude/linux/reservation.h !Edrivers/base/dma-coherent.c diff --git a/drivers/base/fence.c b/drivers/base/fence.c index 12df2bf62034..cd0937127a89 100644 --- a/drivers/base/fence.c +++ b/drivers/base/fence.c @@ -25,6 +25,7 @@ #include <linux/export.h> #include <linux/atomic.h> #include <linux/fence.h> +#include <linux/seqno-fence.h> #define CREATETRACEPOINTS #include <trace/events/fence.h> _@@ -413,3 +414,52 @@ fenceinit(struct fence *fence, const struct fenceops *ops, tracefenceinit(fence); } _EXPORTSYMBOL(fenceinit); + +static const char *seqnofencegetdrivername(struct fence *fence) { + struct seqnofence *seqnofence = toseqnofence(fence); + return seqnofence->ops->getdrivername(fence); +} + +static const char *seqnofencegettimelinename(struct fence *fence) { + struct seqnofence *seqnofence = toseqnofence(fence); + return seqnofence->ops->gettimelinename(fence); +} + +static bool seqnoenablesignaling(struct fence *fence) +{ + struct seqnofence *seqnofence = toseqnofence(fence); + return seqnofence->ops->enablesignaling(fence); +} + +static bool seqnosignaled(struct fence *fence) +{ + struct seqnofence *seqnofence = toseqnofence(fence); + return seqnofence->ops->signaled && seqnofence->ops->signaled(fence); +} + +static void seqnorelease(struct fence *fence) +{ + struct seqnofence *f = toseqnofence(fence); + + dmabufput(f->syncbuf); + if (f->ops->release) + f->ops->release(fence); + else + kfree(f); +} + +static long seqnowait(struct fence *fence, bool intr, signed long timeout) +{ + struct seqnofence *f = toseqnofence(fence); + return f->ops->wait(fence, intr, timeout); +} + +const struct fenceops seqnofenceops = { + .getdrivername = seqnofencegetdrivername, + .gettimelinename = seqnofencegettimelinename, + .enablesignaling = seqnoenablesignaling, + .signaled = seqnosignaled, + .wait = seqnowait, + .release = seqnorelease, +}; +EXPORTSYMBOL(seqnofenceops); diff --git a/include/linux/seqno-fence.h b/include/linux/seqno-fence.h new file mode 100644 index 000000000000..952f7909128c --- /dev/null +++ b/include/linux/seqno-fence.h @@ -0,0 +1,109 @@ +/* + * seqno-fence, using a dma-buf to synchronize fencing + * + * Copyright (C) 2012 Texas Instruments + * Copyright (C) 2012 Canonical Ltd + * Authors: + * Rob Clark <robdclark at gmail.com> + * Maarten Lankhorst <maarten.lankhorst at canonical.com> + * + * This program is free software; you can redistribute it and/or modify it + * under the terms of the GNU General Public License version 2 as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. + */ + _+#ifndef LINUXSEQNOFENCEH _+#define LINUXSEQNOFENCEH + +#include <linux/fence.h> +#include <linux/dma-buf.h> + +struct seqnofence { + struct fence base; + + const struct fenceops *ops; + struct dmabuf *syncbuf; + uint32t seqnoofs; +}; + +extern const struct fenceops seqnofenceops; + +/** + * toseqnofence - cast a fence to a seqnofence + * @fence: fence to cast to a seqnofence + * + * Returns NULL if the fence is not a seqnofence, + * or the seqnofence otherwise. + */ +static inline struct seqnofence * +toseqnofence(struct fence *fence) +{ + if (fence->ops != &seqnofenceops) + return NULL; + return containerof(fence, struct seqnofence, base); +} + +/** + * seqnofenceinit - initialize a seqno fence + * @fence: seqnofence to initialize + * @lock: pointer to spinlock to use for fence + * @syncbuf: buffer containing the memory location to signal on + * @context: the execution context this fence is a part of + * @seqnoofs: the offset within @syncbuf + * @seqno: the sequence # to signal on + * @ops: the fenceops for operations on this seqno fence + * + * This function initializes a struct seqnofence with passed parameters, + * and takes a reference on syncbuf which is released on fence destruction. + * + * A seqnofence is a dmafence which can complete in software when + * enablesignaling is called, but it also completes when + * (s32)((syncbuf)[seqnoofs] - seqno) >= 0 is true + * + * The seqnofence will take a refcount on the syncbuf until it's + * destroyed, but actual lifetime of syncbuf may be longer if one of the + * callers take a reference to it. + * + * Certain hardware have instructions to insert this type of wait condition + * in the command stream, so no intervention from software would be needed. + * This type of fence can be destroyed before completed, however a reference + * on the syncbuf dma-buf can be taken. It is encouraged to re-use the same + * dma-buf for syncbuf, since mapping or unmapping the syncbuf to the + * device's vm can be expensive. + * + * It is recommended for creators of seqnofence to call fencesignal + * before destruction. This will prevent possible issues from wraparound at + * time of issue vs time of check, since users can check fenceissignaled + * before submitting instructions for the hardware to wait on the fence. + * However, when ops.enablesignaling is not called, it doesn't have to be + * done as soon as possible, just before there's any real danger of seqno + * wraparound. + */ +static inline void +seqnofenceinit(struct seqnofence *fence, spinlockt *lock, + struct dmabuf *syncbuf, uint32t context, uint32t seqnoofs, + uint32t seqno, const struct fenceops *ops) +{ + BUGON(!fence || !syncbuf || !ops); + BUGON(!ops->wait || !ops->enablesignaling || !ops->getdrivername || !ops->gettimelinename); + + /* _+ * ops is used in fenceinit for getdrivername, so needs to be + * initialized first + */ + fence->ops = ops; _+ fenceinit(&fence->base, &seqnofenceops, lock, context, seqno); + getdmabuf(syncbuf); + fence->syncbuf = syncbuf; + fence->seqnoofs = seqnoofs; +} + _+#endif /* LINUXSEQNOFENCEH */
dri-devel mailing list dri-devel at lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
dri-devel mailing list dri-devel at lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
- Previous message: [PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v4)
- Next message: [PATCH 2/6] seqno-fence: Hardware dma-buf implementation of fencing (v4)
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]