[U-Boot] [PATCH 1/2] common.h: Introduce DEFINE_CACHE_ALIGN_BUFFER

This is the out-of-function-scope counterpart of ALLOC_CACHE_ALIGN_BUFFER.
Signed-off-by: Marek Vasut marex@denx.de Cc: Tom Rini trini@ti.com Cc: Ilya Yanok ilya.yanok@cogentembedded.com --- include/common.h | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/include/common.h b/include/common.h index 17c64b0..06d278f 100644 --- a/include/common.h +++ b/include/common.h @@ -965,6 +965,17 @@ int cpu_release(int nr, int argc, char * const argv[]); \ type *name = (type *) ALIGN((uintptr_t)__##name, ARCH_DMA_MINALIGN)
+/* + * DEFINE_CACHE_ALIGN_BUFFER() is similar to ALLOC_CACHE_ALIGN_BUFFER, but it's + * purpose is to allow allocating aligned buffers outside of function scope. + * Usage of this macro shall be avoided or used with extreme care! + */ +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size) \ + static char __##name[roundup(size * sizeof(type), ARCH_DMA_MINALIGN)] \ + __aligned(ARCH_DMA_MINALIGN); \ + \ + static type *name = (type *)__##name; + /* Pull in stuff for the build system */ #ifdef DO_DEPS_ONLY # include <environment.h>

From: Tom Rini trini@ti.com
The USB spec says that 32 bytes is the minimum required alignment. However on some platforms we have a larger minimum requirement for cache coherency. In those cases, use that value rather than the USB spec minimum. We add a cpp check to <usb.h> to define USB_DMA_MINALIGN and make use of it in ehci-hcd.c and musb_core.h. We cannot use MAX() here as we are not allowed to have tests inside of align(...).
Signed-off-by: Tom Rini trini@ti.com [ilya.yanok]: fix size alignment, drop (incorrect) rounding when invalidating the buffer. If we got unaligned buffer from the upper layer -- that's definetely a bug so it's good to buzz about it. But we have to align the buffer length -- upper layers should take care to reserve enough space. Signed-off-by: Ilya Yanok ilya.yanok@cogentembedded.com [marek.vasut]: introduce some crazy macro voodoo Signed-off-by: Marek Vasut marex@denx.de --- drivers/usb/host/ehci-hcd.c | 81 ++++++++++++++++++++++++------------------ drivers/usb/musb/musb_core.h | 2 +- include/usb.h | 10 ++++++ 3 files changed, 57 insertions(+), 36 deletions(-)
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c index 04300be..5199560 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -34,7 +34,10 @@ struct ehci_hccr *hccr; /* R/O registers, not need for volatile */ volatile struct ehci_hcor *hcor;
static uint16_t portreset; -static struct QH qh_list __attribute__((aligned(32))); +DEFINE_CACHE_ALIGN_BUFFER(struct QH, qh_list, 1); + +#define ALIGN_END_ADDR(type, ptr, size) \ + ((uint32_t)(ptr) + roundup((size) * sizeof(type), USB_DMA_MINALIGN))
static struct descriptor { struct usb_hub_descriptor hub; @@ -172,13 +175,13 @@ static int ehci_td_buffer(struct qTD *td, void *buf, size_t sz) { uint32_t delta, next; uint32_t addr = (uint32_t)buf; - size_t rsz = roundup(sz, 32); + size_t rsz = roundup(sz, USB_DMA_MINALIGN); int idx;
if (sz != rsz) debug("EHCI-HCD: Misaligned buffer size (%08x)\n", sz);
- if (addr & 31) + if (addr != ALIGN(addr, USB_DMA_MINALIGN)) debug("EHCI-HCD: Misaligned buffer address (%p)\n", buf);
idx = 0; @@ -207,8 +210,8 @@ static int ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, int length, struct devrequest *req) { - static struct QH qh __attribute__((aligned(32))); - static struct qTD qtd[3] __attribute__((aligned (32))); + ALLOC_CACHE_ALIGN_BUFFER(struct QH, qh, 1); + ALLOC_CACHE_ALIGN_BUFFER(struct qTD, qtd, 3); int qtd_counter = 0;
volatile struct qTD *vtd; @@ -229,8 +232,8 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, le16_to_cpu(req->value), le16_to_cpu(req->value), le16_to_cpu(req->index));
- memset(&qh, 0, sizeof(struct QH)); - memset(qtd, 0, sizeof(qtd)); + memset(qh, 0, sizeof(struct QH)); + memset(qtd, 0, 3 * sizeof(*qtd));
toggle = usb_gettoggle(dev, usb_pipeendpoint(pipe), usb_pipeout(pipe));
@@ -244,7 +247,7 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, * qh_overlay.qt_next ...... 13-10 H * - qh_overlay.qt_altnext */ - qh.qh_link = cpu_to_hc32((uint32_t)&qh_list | QH_LINK_TYPE_QH); + qh->qh_link = cpu_to_hc32((uint32_t)qh_list | QH_LINK_TYPE_QH); c = (usb_pipespeed(pipe) != USB_SPEED_HIGH && usb_pipeendpoint(pipe) == 0) ? 1 : 0; endpt = (8 << 28) | @@ -255,14 +258,14 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, (usb_pipespeed(pipe) << 12) | (usb_pipeendpoint(pipe) << 8) | (0 << 7) | (usb_pipedevice(pipe) << 0); - qh.qh_endpt1 = cpu_to_hc32(endpt); + qh->qh_endpt1 = cpu_to_hc32(endpt); endpt = (1 << 30) | (dev->portnr << 23) | (dev->parent->devnum << 16) | (0 << 8) | (0 << 0); - qh.qh_endpt2 = cpu_to_hc32(endpt); - qh.qh_overlay.qt_next = cpu_to_hc32(QT_NEXT_TERMINATE); + qh->qh_endpt2 = cpu_to_hc32(endpt); + qh->qh_overlay.qt_next = cpu_to_hc32(QT_NEXT_TERMINATE);
- tdp = &qh.qh_overlay.qt_next; + tdp = &qh->qh_overlay.qt_next;
if (req != NULL) { /* @@ -340,13 +343,13 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, tdp = &qtd[qtd_counter++].qt_next; }
- qh_list.qh_link = cpu_to_hc32((uint32_t)&qh | QH_LINK_TYPE_QH); + qh_list->qh_link = cpu_to_hc32((uint32_t)qh | QH_LINK_TYPE_QH);
/* Flush dcache */ - flush_dcache_range((uint32_t)&qh_list, - (uint32_t)&qh_list + sizeof(struct QH)); - flush_dcache_range((uint32_t)&qh, (uint32_t)&qh + sizeof(struct QH)); - flush_dcache_range((uint32_t)qtd, (uint32_t)qtd + sizeof(qtd)); + flush_dcache_range((uint32_t)qh_list, + ALIGN_END_ADDR(struct QH, qh_list, 1)); + flush_dcache_range((uint32_t)qh, ALIGN_END_ADDR(struct QH, qh, 1)); + flush_dcache_range((uint32_t)qtd, ALIGN_END_ADDR(struct qTD, qtd, 3));
usbsts = ehci_readl(&hcor->or_usbsts); ehci_writel(&hcor->or_usbsts, (usbsts & 0x3f)); @@ -369,12 +372,12 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, timeout = USB_TIMEOUT_MS(pipe); do { /* Invalidate dcache */ - invalidate_dcache_range((uint32_t)&qh_list, - (uint32_t)&qh_list + sizeof(struct QH)); - invalidate_dcache_range((uint32_t)&qh, - (uint32_t)&qh + sizeof(struct QH)); + invalidate_dcache_range((uint32_t)qh_list, + ALIGN_END_ADDR(struct QH, qh_list, 1)); + invalidate_dcache_range((uint32_t)qh, + ALIGN_END_ADDR(struct QH, qh, 1)); invalidate_dcache_range((uint32_t)qtd, - (uint32_t)qtd + sizeof(qtd)); + ALIGN_END_ADDR(struct qTD, qtd, 3));
token = hc32_to_cpu(vtd->qt_token); if (!(token & 0x80)) @@ -382,9 +385,17 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, WATCHDOG_RESET(); } while (get_timer(ts) < timeout);
- /* Invalidate the memory area occupied by buffer */ - invalidate_dcache_range(((uint32_t)buffer & ~31), - ((uint32_t)buffer & ~31) + roundup(length, 32)); + /* + * Invalidate the memory area occupied by buffer + * Don't try to fix the buffer alignment, if it isn't properly + * aligned it's upper layer's fault so let invalidate_dcache_range() + * vow about it. But we have to fix the length as it's actual + * transfer length and can be unaligned. This is potentially + * dangerous operation, it's responsibility of the calling + * code to make sure enough space is reserved. + */ + invalidate_dcache_range((uint32_t)buffer, + ALIGN_END_ADDR(u8, buffer, length));
/* Check that the TD processing happened */ if (token & 0x80) { @@ -403,9 +414,9 @@ ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, goto fail; }
- qh_list.qh_link = cpu_to_hc32((uint32_t)&qh_list | QH_LINK_TYPE_QH); + qh_list->qh_link = cpu_to_hc32((uint32_t)qh_list | QH_LINK_TYPE_QH);
- token = hc32_to_cpu(qh.qh_overlay.qt_token); + token = hc32_to_cpu(qh->qh_overlay.qt_token); if (!(token & 0x80)) { debug("TOKEN=%#x\n", token); switch (token & 0xfc) { @@ -733,16 +744,16 @@ int usb_lowlevel_init(void) #endif
/* Set head of reclaim list */ - memset(&qh_list, 0, sizeof(qh_list)); - qh_list.qh_link = cpu_to_hc32((uint32_t)&qh_list | QH_LINK_TYPE_QH); - qh_list.qh_endpt1 = cpu_to_hc32((1 << 15) | (USB_SPEED_HIGH << 12)); - qh_list.qh_curtd = cpu_to_hc32(QT_NEXT_TERMINATE); - qh_list.qh_overlay.qt_next = cpu_to_hc32(QT_NEXT_TERMINATE); - qh_list.qh_overlay.qt_altnext = cpu_to_hc32(QT_NEXT_TERMINATE); - qh_list.qh_overlay.qt_token = cpu_to_hc32(0x40); + memset(qh_list, 0, sizeof(*qh_list)); + qh_list->qh_link = cpu_to_hc32((uint32_t)qh_list | QH_LINK_TYPE_QH); + qh_list->qh_endpt1 = cpu_to_hc32((1 << 15) | (USB_SPEED_HIGH << 12)); + qh_list->qh_curtd = cpu_to_hc32(QT_NEXT_TERMINATE); + qh_list->qh_overlay.qt_next = cpu_to_hc32(QT_NEXT_TERMINATE); + qh_list->qh_overlay.qt_altnext = cpu_to_hc32(QT_NEXT_TERMINATE); + qh_list->qh_overlay.qt_token = cpu_to_hc32(0x40);
/* Set async. queue head pointer. */ - ehci_writel(&hcor->or_asynclistaddr, (uint32_t)&qh_list); + ehci_writel(&hcor->or_asynclistaddr, (uint32_t)qh_list);
reg = ehci_readl(&hccr->cr_hcsparams); descriptor.hub.bNbrPorts = HCS_N_PORTS(reg); diff --git a/drivers/usb/musb/musb_core.h b/drivers/usb/musb/musb_core.h index a8adcce..e914369 100644 --- a/drivers/usb/musb/musb_core.h +++ b/drivers/usb/musb/musb_core.h @@ -145,7 +145,7 @@ struct musb_regs { struct musb_epN_regs epN; } ep[16];
-} __attribute__((packed, aligned(32))); +} __attribute__((packed, aligned(USB_DMA_MINALIGN))); #endif
/* diff --git a/include/usb.h b/include/usb.h index 6da91e7..ba3d169 100644 --- a/include/usb.h +++ b/include/usb.h @@ -29,6 +29,16 @@ #include <usb_defs.h> #include <usbdescriptors.h>
+/* + * The EHCI spec says that we must align to at least 32 bytes. However, + * some platforms require larger alignment. + */ +#if ARCH_DMA_MINALIGN > 32 +#define USB_DMA_MINALIGN ARCH_DMA_MINALIGN +#else +#define USB_DMA_MINALIGN 32 +#endif + /* Everything is aribtrary */ #define USB_ALTSETTINGALLOC 4 #define USB_MAXALTSETTING 128 /* Hard limit */

Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
[...]
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 04300be..5199560 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -34,7 +34,10 @@ struct ehci_hccr *hccr; /* R/O registers, not need for volatile */ volatile struct ehci_hcor *hcor;
static uint16_t portreset; -static struct QH qh_list __attribute__((aligned(32))); +DEFINE_CACHE_ALIGN_BUFFER(struct QH, qh_list, 1);
This will align on ARCH_DMA_MINALIGN, not USB_DMA_MINALIGN. In case of ARCH_DMA_MINALIGN < 32 we will loose the 32-byte alignment required by USB spec.
The same with other buffers. Otherwise looks great.
Regards, Ilya.

Dear Ilya Yanok,
Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
[...]
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 04300be..5199560 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -34,7 +34,10 @@ struct ehci_hccr *hccr; /* R/O registers, not need for volatile */
volatile struct ehci_hcor *hcor;
static uint16_t portreset;
-static struct QH qh_list __attribute__((aligned(32))); +DEFINE_CACHE_ALIGN_BUFFER(struct QH, qh_list, 1);
This will align on ARCH_DMA_MINALIGN, not USB_DMA_MINALIGN. In case of ARCH_DMA_MINALIGN < 32 we will loose the 32-byte alignment required by USB spec.
That's true -- maybe we should create ALLOC_ALIGN_BUFFER() and ALLOC_CACHE_ALIGN_BUFFER() to be a special case of it ?
The same with other buffers. Otherwise looks great.
Regards, Ilya.
Best regards, Marek Vasut

On Sun, Jul 8, 2012 at 11:51 AM, Marek Vasut marex@denx.de wrote:
Dear Ilya Yanok,
Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
[...]
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 04300be..5199560 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -34,7 +34,10 @@ struct ehci_hccr *hccr; /* R/O registers, not need for volatile */
volatile struct ehci_hcor *hcor;
static uint16_t portreset;
-static struct QH qh_list __attribute__((aligned(32))); +DEFINE_CACHE_ALIGN_BUFFER(struct QH, qh_list, 1);
This will align on ARCH_DMA_MINALIGN, not USB_DMA_MINALIGN. In case of ARCH_DMA_MINALIGN < 32 we will loose the 32-byte alignment required by USB spec.
That's true -- maybe we should create ALLOC_ALIGN_BUFFER() and ALLOC_CACHE_ALIGN_BUFFER() to be a special case of it ?
Lets think. USB says "32byte min". Do we have other buses today that apply similar constraints? If so, we should probably go about abstracting into a __ALIGN_FOO(size, what) and do a general one and a USB one that does 32 or cachline. If it's just USB that might be overkill however...

Dear Tom Rini,
On Sun, Jul 8, 2012 at 11:51 AM, Marek Vasut marex@denx.de wrote:
Dear Ilya Yanok,
Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
[...]
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c
index 04300be..5199560 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -34,7 +34,10 @@ struct ehci_hccr *hccr; /* R/O registers, not need for volatile */
volatile struct ehci_hcor *hcor;
static uint16_t portreset;
-static struct QH qh_list __attribute__((aligned(32))); +DEFINE_CACHE_ALIGN_BUFFER(struct QH, qh_list, 1);
This will align on ARCH_DMA_MINALIGN, not USB_DMA_MINALIGN. In case of ARCH_DMA_MINALIGN < 32 we will loose the 32-byte alignment required by USB spec.
That's true -- maybe we should create ALLOC_ALIGN_BUFFER() and ALLOC_CACHE_ALIGN_BUFFER() to be a special case of it ?
Lets think. USB says "32byte min". Do we have other buses today that apply similar constraints? If so, we should probably go about abstracting into a __ALIGN_FOO(size, what) and do a general one and a USB one that does 32 or cachline. If it's just USB that might be overkill however...
Let's do this, we dunno what craziness can befall us in the future. And having the macro handy won't hurt anyway.
Best regards, Marek Vasut

Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
/* Invalidate the memory area occupied by buffer */
invalidate_dcache_range(((uint32_t)buffer & ~31),
((uint32_t)buffer & ~31) + roundup(length, 32));
/*
* Invalidate the memory area occupied by buffer
* Don't try to fix the buffer alignment, if it isn't properly
* aligned it's upper layer's fault so let
invalidate_dcache_range()
* vow about it. But we have to fix the length as it's actual
* transfer length and can be unaligned. This is potentially
* dangerous operation, it's responsibility of the calling
* code to make sure enough space is reserved.
*/
invalidate_dcache_range((uint32_t)buffer,
ALIGN_END_ADDR(u8, buffer, length));
We shouldn't use ALIGN_END_ADDR here. This can lead to strange results on systems with ARCH_DMA_MINALIGN < 32.
Regards, Ilya.

Dear Ilya Yanok,
Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
/* Invalidate the memory area occupied by buffer */
invalidate_dcache_range(((uint32_t)buffer & ~31),
((uint32_t)buffer & ~31) + roundup(length, 32));
/*
* Invalidate the memory area occupied by buffer
* Don't try to fix the buffer alignment, if it isn't properly
* aligned it's upper layer's fault so let
invalidate_dcache_range()
* vow about it. But we have to fix the length as it's actual
* transfer length and can be unaligned. This is potentially
* dangerous operation, it's responsibility of the calling
* code to make sure enough space is reserved.
*/
invalidate_dcache_range((uint32_t)buffer,
ALIGN_END_ADDR(u8, buffer, length));
We shouldn't use ALIGN_END_ADDR here. This can lead to strange results on systems with ARCH_DMA_MINALIGN < 32.
Good point, ALIGN_END_ADDR should align it to max(32, length) ?
Regards, Ilya.
Best regards, Marek Vasut

Dear Marek,
On Tue, Jul 10, 2012 at 5:55 AM, Marek Vasut marex@denx.de wrote:
invalidate_dcache_range((uint32_t)buffer,
ALIGN_END_ADDR(u8, buffer, length));
We shouldn't use ALIGN_END_ADDR here. This can lead to strange results on systems with ARCH_DMA_MINALIGN < 32.
Good point, ALIGN_END_ADDR should align it to max(32, length) ?
Hm.. no, actually we should align buffer length to ARCH_DMA_MINALIGN.
Regards, Ilya.

Hi,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
@@ -207,8 +210,8 @@ static int ehci_submit_async(struct usb_device *dev, unsigned long pipe, void *buffer, int length, struct devrequest *req) {
static struct QH qh __attribute__((aligned(32)));
static struct qTD qtd[3] __attribute__((aligned (32)));
ALLOC_CACHE_ALIGN_BUFFER(struct QH, qh, 1);
Somehow this doesn't work being allocated on stack... I assumed that this was declared as static just to use __attribute__((aligned))... but it seems that current code doesn't like qh address being changed (I'm getting "Timed out on TD" messages). Any ideas? Changing it to DEFINE_... seems to help but it looks like hiding another bug.
Regards, Ilya.

Dear Ilya Yanok,
Hi,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
@@ -207,8 +210,8 @@ static int
ehci_submit_async(struct usb_device *dev, unsigned long pipe, void
*buffer,
int length, struct devrequest *req)
{
static struct QH qh __attribute__((aligned(32)));
static struct qTD qtd[3] __attribute__((aligned (32)));
ALLOC_CACHE_ALIGN_BUFFER(struct QH, qh, 1);
Somehow this doesn't work being allocated on stack... I assumed that this was declared as static just to use __attribute__((aligned))...
Not really ... but then, we have indeed a problem here. It was probably declared static to preserve the contents of these variables across the the calls of this function. Sigh, the USB code is really bloody :-/
Ilya, can you try pulling these out of the function?
but it seems that current code doesn't like qh address being changed (I'm getting "Timed out on TD" messages). Any ideas? Changing it to DEFINE_... seems to help but it looks like hiding another bug.
Regards, Ilya.
Best regards, Marek Vasut

Dear Marek,
On Sun, Jul 15, 2012 at 12:07 PM, Marek Vasut marex@denx.de wrote:
@@ -207,8 +210,8 @@ static int
ehci_submit_async(struct usb_device *dev, unsigned long pipe, void
*buffer,
int length, struct devrequest *req)
{
static struct QH qh __attribute__((aligned(32)));
static struct qTD qtd[3] __attribute__((aligned (32)));
ALLOC_CACHE_ALIGN_BUFFER(struct QH, qh, 1);
Somehow this doesn't work being allocated on stack... I assumed that this was declared as static just to use __attribute__((aligned))...
Not really ... but then, we have indeed a problem here. It was probably declared static to preserve the contents of these variables across the the calls of this function. Sigh, the USB code is really bloody :-/
Hm.. no, I don't think it's about contents... Look, there are memset() calls straight in the beginning of the function. It looks like it's about qh address: I tried debugging it a bit and it seems to work until qh has the same address and break when the address eventually changes.
Ilya, can you try pulling these out of the function?
I can but what for? Are there any differences (except for scope) declaring statics inside or outside function? As I said declaring qh static (with DEFINE_CACHE_ALIGN_BUFFER) works but...
Regards, Ilya.

Dear Ilya Yanok,
Dear Marek,
On Sun, Jul 15, 2012 at 12:07 PM, Marek Vasut marex@denx.de wrote:
@@ -207,8 +210,8 @@ static int
ehci_submit_async(struct usb_device *dev, unsigned long pipe, void
*buffer,
int length, struct devrequest *req)
{
static struct QH qh __attribute__((aligned(32)));
static struct qTD qtd[3] __attribute__((aligned (32)));
ALLOC_CACHE_ALIGN_BUFFER(struct QH, qh, 1);
Somehow this doesn't work being allocated on stack... I assumed that this was declared as static just to use __attribute__((aligned))...
Not really ... but then, we have indeed a problem here. It was probably declared static to preserve the contents of these variables across the the calls of this function. Sigh, the USB code is really bloody :-/
Hm.. no, I don't think it's about contents... Look, there are memset() calls straight in the beginning of the function. It looks like it's about qh address: I tried debugging it a bit and it seems to work until qh has the same address and break when the address eventually changes.
I see, stupid me. So it's only the address? And I take it the controller doesn't use those after it leaves the function, right (maybe that might be it) ?
Ilya, can you try pulling these out of the function?
I can but what for? Are there any differences (except for scope) declaring statics inside or outside function? As I said declaring qh static (with DEFINE_CACHE_ALIGN_BUFFER) works but...
Regards, Ilya.
Best regards, Marek Vasut

Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
This is the out-of-function-scope counterpart of ALLOC_CACHE_ALIGN_BUFFER. +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size) \
static char __##name[roundup(size * sizeof(type),
ARCH_DMA_MINALIGN)] \
__aligned(ARCH_DMA_MINALIGN); \
We need linux/compiler.h (not included from common.h) for __aligned.
Regards, Ilya.

Dear Ilya Yanok,
Hi Marek,
On Sun, Jul 8, 2012 at 7:08 AM, Marek Vasut marex@denx.de wrote:
This is the out-of-function-scope counterpart of ALLOC_CACHE_ALIGN_BUFFER. +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size) \
static char __##name[roundup(size * sizeof(type),
ARCH_DMA_MINALIGN)] \
__aligned(ARCH_DMA_MINALIGN); \
We need linux/compiler.h (not included from common.h) for __aligned.
Correct, but is it a good idea to include it in common.h ? Ilya, can you re-do this patchset and repost?
Regards, Ilya.
Best regards, Marek Vasut

On Saturday 07 July 2012 23:08:14 Marek Vasut wrote:
+/*
- DEFINE_CACHE_ALIGN_BUFFER() is similar to ALLOC_CACHE_ALIGN_BUFFER, but it's
- purpose is to allow allocating aligned buffers outside of function scope.
- Usage of this macro shall be avoided or used with extreme care!
- */
+#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size) \
- static char __##name[roundup(size * sizeof(type), ARCH_DMA_MINALIGN)] \
__aligned(ARCH_DMA_MINALIGN); \
\
- static type *name = (type *)__##name;
how is this any different from doing: static __u8 foo[1234] __aligned(ARCH_DMA_MINALIGN); -mike

Dear Mike Frysinger,
On Saturday 07 July 2012 23:08:14 Marek Vasut wrote:
+/*
- DEFINE_CACHE_ALIGN_BUFFER() is similar to ALLOC_CACHE_ALIGN_BUFFER,
but it's + * purpose is to allow allocating aligned buffers outside of function scope. + * Usage of this macro shall be avoided or used with extreme care! + */ +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size)
\
- static char __##name[roundup(size * sizeof(type), ARCH_DMA_MINALIGN)] \
__aligned(ARCH_DMA_MINALIGN); \
\
- static type *name = (type *)__##name;
how is this any different from doing: static __u8 foo[1234] __aligned(ARCH_DMA_MINALIGN); -mike
Does __aligned() align both start of the buffer downwards and end of it upwards ?
Best regards, Marek Vasut

On Friday 20 July 2012 07:31:47 Marek Vasut wrote:
Dear Mike Frysinger,
On Saturday 07 July 2012 23:08:14 Marek Vasut wrote:
+/* DEFINE_CACHE_ALIGN_BUFFER() is similar to ALLOC_CACHE_ALIGN_BUFFER, but it's purpose is to allow allocating aligned buffers outside of function scope. Usage of this macro shall be avoided or used with extreme care! */ +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size)
- static char __##name[roundup(size * sizeof(type), ARCH_DMA_MINALIGN)]
__aligned(ARCH_DMA_MINALIGN);
- static type *name = (type *)__##name;
how is this any different from doing: static __u8 foo[1234] __aligned(ARCH_DMA_MINALIGN);
Does __aligned() align both start of the buffer downwards and end of it upwards ?
it guarantees the start is aligned. i don't believe it does any tail padding.
that said, you've added just 1 consumer, but it uses in function scope, so i don't see why you had to define a new helper in the first place. the existing one would work fine shouldn't it ?
you should probably add a const to the 2nd part so gcc is more likely to optimize away the storage: static type * const name = (type *)__##name; otherwise older gcc versions will create a pointer to go through rather than having things directly access the buffer.
w/out const: $ readelf -s a.out | grep foo 11: 00000000004005d0 8 OBJECT LOCAL DEFAULT 13 foo.1592 12: 0000000000402080 96 OBJECT LOCAL DEFAULT 25 __foo.1591 w/const: $ readelf -s a.out | grep foo 11: 0000000000402080 96 OBJECT LOCAL DEFAULT 25 __foo.1591 -mike

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1
On 07/20/2012 02:47 PM, Mike Frysinger wrote:
On Friday 20 July 2012 07:31:47 Marek Vasut wrote:
Dear Mike Frysinger,
On Saturday 07 July 2012 23:08:14 Marek Vasut wrote:
+/* DEFINE_CACHE_ALIGN_BUFFER() is similar to ALLOC_CACHE_ALIGN_BUFFER, but it's purpose is to allow allocating aligned buffers outside of function scope. Usage of this macro shall be avoided or used with extreme care! */ +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size) + static char __##name[roundup(size * sizeof(type), ARCH_DMA_MINALIGN)] + __aligned(ARCH_DMA_MINALIGN); + static type *name = (type *)__##name;
how is this any different from doing: static __u8 foo[1234] __aligned(ARCH_DMA_MINALIGN);
Does __aligned() align both start of the buffer downwards and end of it upwards ?
it guarantees the start is aligned. i don't believe it does any tail padding.
that said, you've added just 1 consumer, but it uses in function scope, so i don't see why you had to define a new helper in the first place. the existing one would work fine shouldn't it ?
The rough outline of the problems are: - - We need to have buffers that are multiple of cache size, for clearing. - - Today we know ehci-hcd had a problem. We also know other drivers / layers have problems, but they aren't as readily breakable.
That's why we put the macro in <common.h> rather than a USB header.
- -- Tom

On Friday 20 July 2012 17:50:33 Tom Rini wrote:
On 07/20/2012 02:47 PM, Mike Frysinger wrote:
On Friday 20 July 2012 07:31:47 Marek Vasut wrote:
Dear Mike Frysinger,
On Saturday 07 July 2012 23:08:14 Marek Vasut wrote:
+/* DEFINE_CACHE_ALIGN_BUFFER() is similar to ALLOC_CACHE_ALIGN_BUFFER, but it's purpose is to allow allocating aligned buffers outside of function scope. Usage of this macro shall be avoided or used with extreme care! */ +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size) + static char __##name[roundup(size * sizeof(type), ARCH_DMA_MINALIGN)] + __aligned(ARCH_DMA_MINALIGN); + static type *name = (type *)__##name;
how is this any different from doing: static __u8 foo[1234] __aligned(ARCH_DMA_MINALIGN);
Does __aligned() align both start of the buffer downwards and end of it upwards ?
it guarantees the start is aligned. i don't believe it does any tail padding.
that said, you've added just 1 consumer, but it uses in function scope, so i don't see why you had to define a new helper in the first place. the existing one would work fine shouldn't it ?
The rough outline of the problems are:
- We need to have buffers that are multiple of cache size, for clearing.
- Today we know ehci-hcd had a problem. We also know other drivers /
layers have problems, but they aren't as readily breakable.
That's why we put the macro in <common.h> rather than a USB header.
that wasn't the question. no one in the tree needs the new macro at all, regardless of what header it lives in. i guess the answer is that some code in the future (which hasn't been merged) might use it. -mike

On Sat, Jul 21, 2012 at 01:22:40PM -0400, Mike Frysinger wrote:
On Friday 20 July 2012 17:50:33 Tom Rini wrote:
On 07/20/2012 02:47 PM, Mike Frysinger wrote:
On Friday 20 July 2012 07:31:47 Marek Vasut wrote:
Dear Mike Frysinger,
On Saturday 07 July 2012 23:08:14 Marek Vasut wrote:
+/* DEFINE_CACHE_ALIGN_BUFFER() is similar to ALLOC_CACHE_ALIGN_BUFFER, but it's purpose is to allow allocating aligned buffers outside of function scope. Usage of this macro shall be avoided or used with extreme care! */ +#define DEFINE_CACHE_ALIGN_BUFFER(type, name, size) + static char __##name[roundup(size * sizeof(type), ARCH_DMA_MINALIGN)] + __aligned(ARCH_DMA_MINALIGN); + static type *name = (type *)__##name;
how is this any different from doing: static __u8 foo[1234] __aligned(ARCH_DMA_MINALIGN);
Does __aligned() align both start of the buffer downwards and end of it upwards ?
it guarantees the start is aligned. i don't believe it does any tail padding.
that said, you've added just 1 consumer, but it uses in function scope, so i don't see why you had to define a new helper in the first place. the existing one would work fine shouldn't it ?
The rough outline of the problems are:
- We need to have buffers that are multiple of cache size, for clearing.
- Today we know ehci-hcd had a problem. We also know other drivers /
layers have problems, but they aren't as readily breakable.
That's why we put the macro in <common.h> rather than a USB header.
that wasn't the question. no one in the tree needs the new macro at all, regardless of what header it lives in. i guess the answer is that some code in the future (which hasn't been merged) might use it.
Er, between drivers/usb/host/ehci-hcd.c and drivers/usb/eth/smsc95xx.c the three new macros are used today.
participants (4)
-
Ilya Yanok
-
Marek Vasut
-
Mike Frysinger
-
Tom Rini