
On Wed, 2013-10-02 at 17:16 +0300, Claudiu Manoil wrote:
On 10/1/2013 9:50 PM, Scott Wood wrote:
On Tue, 2013-10-01 at 14:38 +0300, Claudiu Manoil wrote:
On 10/1/2013 2:22 AM, Scott Wood wrote:
On Mon, 2013-09-30 at 12:44 +0300, Claudiu Manoil wrote:
+static RTXBD rtx __aligned(8); +#define RXBD(i) rtx.rxbd[i] +#define TXBD(i) rtx.txbd[i] +#define GET_BD_STAT(T, i) be16_to_cpu((__force __be16)T##BD(i).status) +#define SET_BD_STAT(T, i, v) T##BD(i).status = (__force __u16)cpu_to_be16(v) +#define GET_BD_BLEN(T, i) be16_to_cpu((__force __be16)T##BD(i).length) +#define SET_BD_BLEN(T, i, v) T##BD(i).length = (__force __u16)cpu_to_be16(v) +#define GET_BD_BPTR(T, i) be32_to_cpu((__force __be32)T##BD(i).bufptr) +#define SET_BD_BPTR(T, i, v) T##BD(i).bufptr = (__force __u32)cpu_to_be32(v)
Why the forcing? Can't you declare the data with __be16/__be32?
The __force annotation is obviously needed to suppress the sparse warnings due to BD data declaration type not being __bexx, but the generic uintxx_t, as you noticed as well. Now, why I didn't use __bexx instead? The main reason would be maintainability: the DMA descriptors may not be in big endian format exclusively. The eTSEC hw may actually have an option to interpret the DMA descriptors in little endian format.
May have or does have?
If it does have such a feature, do you plan to use it? Usually I have not seen such features used for (e.g.) little-endian PCI devices on big endian hosts.
I has that option, but I don't really plan to use it, clearly not for big endian hosts. The "may have" was for future little endian hosts. But I think this option is not really needed by the uboot driver so doing the byte swapping in software should be ok (i.e. performance wise).
If we don't plan to use it even on future little endian hosts, then it's no big deal to use __beNN.
What's wrong with:
for (t = 0; in_be16(&rtx.rxbd[rx_idx].status) & RXBD_EMPTY; t++) {
Or if you don't want to use I/O accessors on the DMA descriptors (Is synchronization ensured some other way? We had problems with this in the Linux driver before...):
Synchronization here is is insured by declaring the RTXBD structure type as "volatile" (see RTXBD declaration, a couple of lines above).
That does not achieve hardware synchronization, and even the effects on the compiler are questionable due to volatile's vague semantics.
The existing code has been working this way for quite a while on the mpc85xx platforms,
It was "working" for a while in Linux as well, until we encountered a workload where it didn't (though granted, there was no volatile in that case). See Linux commit 3b6330ce2a3e1f152f79a6203f73d23356e243a7
Good point, I guess it would be safer too use some memory barriers around accesses to BD fields in the uboot driver too. However some portable barriers would be needed, eieio() doesn't have an equivalent for ARM.
FWIW, I see some other places in U-Boot's TSEC driver that use out_be32() on the descriptors (e.g. startup_tsec after "Point to the buffer descriptors").
That's only to program the tbase and rbase registers with the physical address of the Tx/Rx BD rings' base.
Oops. :-P
Also, there doesn't seem to be other net drivers using I/O accessors for the BD rings.
I picked some random examples, and the first driver in Linux in which I could quickly find the BD rings uses I/O accessors (drivers/net/ethernet/realtek/8319too.c). I then checked its U-Boot eqivalent (drivers/net/rtl8139.c) and it also uses I/O accessors for the descriptors.
I actually meant accessing buffer descriptor fields (like status, length, dma address). As you can see in this example from linux 8319too.c's Rx routine, there's no I/O accessor involved:
/* read size+status of next frame from DMA ring buffer */ rx_status = le32_to_cpu (*(__le32 *) (rx_ring + ring_offset));
However the driver does use a rmb() just before this line.
Yeah, it doesn't help when both types of accesses show up when searching for the ring, and accessors exist with both possible argument orderings. Especially when a driver has custom accessors.
It's OK to use explicit synchronization rather than I/O accessors, if you're careful to ensure that it's correct.
It looks like drivers/net/fec_mxc.c in U-Boot is an example that uses I/O accessors on ring data, e.g.:
writew(length, &fec->tbd_base[fec->tbd_index].data_length);
-Scott