[U-Boot] Some notes and status on porting U-Boot to the Cavium 64-bit Octeon MIPS Processor

Hi all,
I have 2010.09 U-Boot working fairly well now on our Octeon 63XX processor and have a number of observations about U-Boot.
One of our challenges is the fact that the Octeon is a 64-bit multi-core MIPS processor that requires 64-bit memory access. In our case, various I/O subsystems are mapped to physical addresses outside of the 4GB address space. Due to this, we don't even support a 32-bit Linux kernel but require a 64-bit kernel. We also support loading applications linked with our SDK onto various cores of our processor, which are basically special ELF images.
Another challenge is the fact that we load U-Boot at the top of physical RAM which requires that we use TLB mapping due to the fact that it is quite common for us to support 4GB or more DRAM in our embedded environment. We need to do this since a number of customers have custom operating systems and applications that load in the lower 4GB and need all the space they can get. The TLB mapping is not nearly as invasive as one might think and it actually simplifies the case where we could be loading the same u-boot image from different addresses, such as NOR flash, failsafe, NAND and booting over PCIe/X. The only places where this does have an impact is in device drivers that perform DMA and/or PCI access since we need to make sure to use wrappers for mapping the virtual addresses to physical addresses. In our case, I am using bus_to_phys and use the 64-bit result to fill in the appropriate descriptors. So far I have made changes to EHCI and the e1000 drivers, but will likely be working on others like AHCI for SATA. We also don't know where we will load U-Boot into physical memory until run-time, even on a per-board basis since many boards support DIMMs rather than memory on the board.
I've run into a number of corner cases and bugs in U-Boot which I have fixed and hope to soon release patches which are not specific to Octeon:
1. 4GB and larger NAND flash doesn't work. I have a generic patch for this I will try and get out this week. Tested with Micron MT29F32G08CBABA.
2. Support for TI TMP42X temperature monitor.
3. PCI breaks if we assign the BAR address space to 0xf8000000-0xffffffff (our hardware actually maps this to 0x11b00f0000000 or 0x11c00f800000 depending on the PCIE bus number). I have a generic fix for this I also hope to release this week.
4. Support for Spansion S29GL064N which reports a manufacturer ID of 0x0000 and requires AMD fixups.
5. Fix for Micron NAND flash MT29F32G08CBABA which erroneously reports a 16-bit bus when it has an 8-bit bus.
6. Ability for software to change the u-boot prompt at run-time/boot time. We use this to indicate whether we are booting from RAM, FLASH, or failsafe with a single U-Boot binary.
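The message does not say why NAND parts of 4GB and larger fail. One common failure mode at exactly this size, offered here only as an assumption, is computing the device size in 32-bit arithmetic, where 4 GiB wraps to zero. The sketch below shows that effect; the helper names and the geometry used in the usage note are made up for illustration and are not taken from U-Boot or from the Micron datasheet.

```c
#include <stdint.h>

/*
 * Hypothetical helper: device size in bytes computed from geometry
 * using 32-bit arithmetic. The product is reduced modulo 2^32, so a
 * device of exactly 4 GiB comes out as 0.
 */
static inline uint32_t sketch_nand_size_32(uint32_t page_size,
                                           uint32_t pages_per_block,
                                           uint32_t blocks)
{
    return page_size * pages_per_block * blocks;
}

/*
 * Same computation widened to 64 bits before multiplying, so sizes
 * of 4 GiB and above are represented correctly.
 */
static inline uint64_t sketch_nand_size_64(uint32_t page_size,
                                           uint32_t pages_per_block,
                                           uint32_t blocks)
{
    return (uint64_t)page_size * pages_per_block * blocks;
}
```

With an illustrative geometry of 4096-byte pages, 256 pages per block and 4096 blocks, the 32-bit version returns 0 while the 64-bit version returns 4294967296.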
Our 64-bit platform is different enough in that it is almost a completely different processor compared to the other MIPS platforms. While we were able to make use of and expand mipsregs.h and many of the other include files, we had to replace almost all of the C code as well as start.S. Even lib/board.c could not be reused, though hopefully I can clean this up so it can be used in the future.
Due to the vast differences, I have tried to update the MIPS cpu tree to be like ARM and PowerPC such that different CPUs have different directories and a common mips32 directory.
Just for the CPU portion we have added around 29Kloc of code, not including the 440Kloc of code from our SDK which we link with U-Boot (LGPL). We make use of our SDK since it abstracts support for all of our chips and makes porting to new chips easier, such as our upcoming 68XX series of 32 core processors.
I have tried to minimize our changes to the common code as much as possible and have been largely successful. Where I have made changes, I have separated out all changes specific to our hardware using #ifdef CONFIG_OCTEON. Some of these changes may be useful for other platforms if other platforms choose to add 64-bit support.
Moving forward I hope I can get our changes included into the main U-Boot release, but this is going to take a while as I still have a lot of cleanup to do as well as add support for more of our processors and boards.
We also have a number of custom commands we have added for our platform. Some of these may be able to be made more generic, but we use our own commands for things like loading our Linux kernel and simple executive applications since they require some data structures to be configured that are tied to our SDK and are needed for backwards compatibility.
Most of the code for our platform is also under mips/cpu/octeon since it is common between many different boards. The amount of board specific code is actually quite small.
I still have a way to go before I can start submitting patches. I need to get our code committed to our main trunk in-house, then I'll have to clean up some of the formatting and break up some of our files (our DDR DRAM initialization code is over 300K!).
I don't think I can include our SDK as a series of patches on the mailing list since it is about 26MB with some of the hardware generated files being hundreds of kilobytes to 12MB for our register database file (which fortunately isn't needed by u-boot!) It's available under the LGPL but not easily accessible through our open-source web site without registration :(
I have also been trying to keep up with the GIT patches to u-boot and I don't think it will be all that difficult to move to the latest version since it sounds like most of the changes have affected PowerPC and ARM.
-Aaron Williams

Dear Aaron Williams,
In message 201102071402.37099.Aaron.Williams@caviumnetworks.com you wrote:
One of our challenges is the fact that the Octeon is a 64-bit multi-core MIPS processor that requires 64-bit memory access. In our case, various I/O subsystems are mapped to physical addresses outside of the 4GB address space. Due to this, we don't even support a 32-bit Linux kernel but require a 64-bit kernel. We also support loading applications linked with our SDK onto various cores of our processor, which are basically special ELF images.
There are many systems which have a physical address space that requires more than 32 bit. So far, we still run a 32 bit U-Boot on these. I understand that you are using a 64 bit port of U-Boot?
Another challenge is the fact that we load U-Boot at the top of physical RAM which requires that we use TLB mapping due to the fact that it is quite common for us to support 4GB or more DRAM in our embedded environment. We need to do this since a number of customers have custom operating systems and applications that load in the lower 4GB and need all the space they can get.
Existing U-Boot deals with this by mapping just the lower and the upper parts of available physical memory. See the CONFIG_VERY_BIG_RAM config option.
The TLB mapping is not nearly as invasive as one might think and it actually simplifies the case where we could be loading the same u-boot image from different addresses, such as NOR flash, failsafe, NAND and booting over PCIe/X. The only places where this does have an impact is in device drivers that perform DMA and/or PCI access since we need to make sure to use wrappers for mapping the virtual addresses to physical addresses. In our case, I am using bus_to_phys and use the 64-bit result to fill in the appropriate descriptors. So far I have made changes to EHCI and the e1000 drivers, but will likely be working on others like AHCI for SATA. We also don't know where we will load U-Boot into physical memory until run-time, even on a per-board basis since many boards support DIMMs rather than memory on the board.
This is a pretty standard situation - that is what the original design did, and what all PowerPC (and now all working ARM) boards are doing, too.
I've run into a number of corner cases and bugs in U-Boot which I have fixed and hope to soon release patches which are not specific to Octeon:
1. 4GB and larger NAND flash doesn't work. I have a generic patch for this I will try and get out this week. Tested with Micron MT29F32G08CBABA.
2. Support for TI TMP42X temperature monitor.
3. PCI breaks if we assign the BAR address space to 0xf8000000-0xffffffff (our hardware actually maps this to 0x11b00f0000000 or 0x11c00f800000 depending on the PCIE bus number). I have a generic fix for this I also hope to release this week.
4. Support for Spansion S29GL064N which reports a manufacturer ID of 0x0000 and requires AMD fixups.
5. Fix for Micron NAND flash MT29F32G08CBABA which erroneously reports a 16-bit bus when it has an 8-bit bus.
6. Ability for software to change the u-boot prompt at run-time/boot time. We use this to indicate whether we are booting from RAM, FLASH, or failsafe with a single U-Boot binary.
All these sound pretty harmless and independent of your Octeon support.
Our 64-bit platform is different enough in that it is almost a completely different processor compared to the other MIPS platforms. While we were able to make use of and expand mipsregs.h and many of the other include files, we had to replace almost all of the C code as well as start.S. Even lib/board.c could not be reused, though hopefully I can clean this up so it can be used in the future.
We are in the process of overcoming the long-term repercussions of the original ARM port which did not follow the original (PPC based) design. ARM is now fixed, or in the process of being cleaned up. Unfortunately MIPS followed the ARM example, and here this work still needs to be done. So if you are looking for reference implementations, you rather want to look at PPC code than ARM or MIPS code.
One part of the unification attempts is that we are trying to synchronize lib/board.c across architectures, with the goal to use a single common version soon. So again, please use PPC as reference for changes, or at least try not to introduce any additional, new incompatibilities.
Just for the CPU portion we have added around 29Kloc of code, not including the 440Kloc of code from our SDK which we link with U-Boot (LGPL). We make use of our SDK since it abstracts support for all of our chips and makes porting to new chips easier, such as our upcoming 68XX series of 32 core processors.
Do I understand correctly that this SDK (whatever that might be) will then have to be included with the U-Boot distribution?
What would be alternatives to static linking, such as to avoid adding all this code?
I have tried to minimize our changes to the common code as much as possible and have been largely successful. Where I have made changes, I have separated out all changes specific to our hardware using #ifdef CONFIG_OCTEON. Some of these changes may be useful for other platforms if other platforms choose to add 64-bit support.
Such general changes should then NOT use CONFIG_OCTEON, but some generic variable name.
We also have a number of custom commands we have added for our platform. Some of these may be able to be made more generic, but we use our own commands for things like loading our Linux kernel and simple executive applications since they require some data structures to be configured that are tied to our SDK and are needed for backwards compatibility.
Many boards use board specific commands; I see no problem with having SoC specific commands either.
I don't think I can include our SDK as a series of patches on the mailing list since it is about 26MB with some of the hardware generated files being hundreds of kilobytes to 12MB for our register database file (which fortunately isn't needed by u-boot!) It's available under the LGPL but not easily accessible through our open-source web site without registration :(
U-Boot is supposed to be self-sufficient, i. e. to contain all parts that are required to build a working U-Boot image. I see a potential area of conflicts here.
I have also been trying to keep up with the GIT patches to u-boot and I don't think it will be all that difficult to move to the latest version since it sounds like most of the changes have affected PowerPC and ARM.
Indeed.
Best regards,
Wolfgang Denk

On Monday, February 07, 2011 02:26:15 pm Wolfgang Denk wrote:
Dear Aaron Williams,
In message 201102071402.37099.Aaron.Williams@caviumnetworks.com you wrote:
One of our challenges is the fact that the Octeon is a 64-bit multi-core MIPS processor that requires 64-bit memory access. In our case, various I/O subsystems are mapped to physical addresses outside of the 4GB address space. Due to this, we don't even support a 32-bit Linux kernel but require a 64-bit kernel. We also support loading applications linked with our SDK onto various cores of our processor, which are basically special ELF images.
There are many systems which have a physical address space that requires more than 32 bit. So far, we still run a 32 bit U-Boot on these. I understand that you are using a 64 bit port of U-Boot?
No. We are using a 32-bit port since I think trying to make a 64-bit port of U-Boot would be far more involved. We do have support for loading and executing 64-bit ELF images, however. We use special memcpy/memset functions for 64-bit memory addressing in these cases.
Another challenge is the fact that we load U-Boot at the top of physical RAM which requires that we use TLB mapping due to the fact that it is quite common for us to support 4GB or more DRAM in our embedded environment. We need to do this since a number of customers have custom operating systems and applications that load in the lower 4GB and need all the space they can get.
Existing U-Boot deals with this by mapping just the lower and the upper parts of available physical memory. See the CONFIG_VERY_BIG_RAM config option.
I just looked at this and this. The only place I see this used is in arch/powerpc/lib/board.c and it looks like it just limits the effective memory size to CONFIG_MAX_MEM_MAPPED. This won't work for us. As I said, we need to move it out of the lower 4GB when there's more memory involved. We also don't want there being holes in the middle of the memory if we can help it, nor do we want to place u-boot at the lower end of memory. At least on MIPS the TLB support is quite simple and is performed very early on by adding a single TLB entry. We have customers that need to make use of the entire KUSEG, for example. Our SDK only reserves a very small portion of this.
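As a rough illustration of the clamping behaviour described above, here is a minimal sketch. It is not the code from arch/powerpc/lib/board.c; the function name and the 256 MiB limit are assumptions chosen for the example. It only shows why such a clamp keeps U-Boot inside low memory, which is the opposite of what is wanted on Octeon.

```c
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_MAX_MEM_MAPPED: 256 MiB. */
#define SKETCH_MAX_MEM_MAPPED ((uint64_t)256 << 20)

/*
 * Return the amount of RAM the boot loader treats as usable.
 * Anything above the mapped limit is ignored, so code that
 * relocates itself to the "top of RAM" ends up just below the
 * limit rather than at the true top of physical memory.
 */
static inline uint64_t sketch_effective_memsize(uint64_t ram_size)
{
    if (ram_size > SKETCH_MAX_MEM_MAPPED)
        return SKETCH_MAX_MEM_MAPPED;
    return ram_size;
}
```

With 8 GiB installed this sketch reports 256 MiB, so a relocation target derived from it would sit inside the region that customer operating systems and applications want to own.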
The TLB mapping is not nearly as invasive as one might think and it actually simplifies the case where we could be loading the same u-boot image from different addresses, such as NOR flash, failsafe, NAND and booting over PCIe/X. The only places where this does have an impact is in device drivers that perform DMA and/or PCI access since we need to make sure to use wrappers for mapping the virtual addresses to physical addresses. In our case, I am using bus_to_phys and use the 64-bit result to fill in the appropriate descriptors. So far I have made changes to EHCI and the e1000 drivers, but will likely be working on others like AHCI for SATA. We also don't know where we will load U-Boot into physical memory until run-time, even on a per-board basis since many boards support DIMMs rather than memory on the board.
This is a pretty standard situation - that is what the original design did, and what all PowerPC (and now all working ARM) boards are doing, too.
In our case we just map u-boot to always be at 0xC0000000 regardless of where it is located in physical memory. In my current setup it is located at physical address 0x10f800000. So far I've found that the e1000 and EHCI drivers assume that virtual addresses are physical addresses. Furthermore, the e1000 driver assumes that readl can be used to access the PCI BAR address space, which on our platform is not the case since that is located at 0x11b00f0000000 or 0x11c00f80000000 and needs the appropriate wrapper function. Alternatively I could add more TLB entries to map the PCI BAR address space to 0xf0000000 and 0xf8000000 respectively or modify readl/writel to make the appropriate translations.
Would using virt_to_bus to convert pointers to DMA addresses be appropriate instead of the current assumption that pointers can be used as DMA addresses directly? This seems like a portable solution since on platforms where the pointer and DMA address are the same the macro would just do nothing. Even if we didn't use virtual addresses and were using, say, KSEG1 the pointer and physical address don't match.
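To make the translation concrete, here is a minimal model of what a virt_to_bus-style wrapper has to do under the mapping described above: a fixed virtual base of 0xC0000000 backed by a physical base that is only known at run time. The struct and function names are hypothetical, and the virtual address is passed as a 32-bit integer rather than a pointer so the sketch can be compiled on any host. It does not check that the address actually lies inside the mapped window.

```c
#include <stdint.h>

/* Fixed virtual address U-Boot is mapped at (value from the message). */
#define SKETCH_UBOOT_VIRT_BASE 0xC0000000u

/* Hypothetical record of where U-Boot landed in physical memory. */
struct sketch_mapping {
    uint64_t phys_base; /* determined at run time, may be above 4GB */
};

/*
 * Translate a 32-bit virtual address inside the U-Boot mapping into
 * the 64-bit address a DMA engine must be given. On a platform where
 * virtual and bus addresses are identical the equivalent macro would
 * simply return its argument.
 */
static inline uint64_t sketch_virt_to_bus(const struct sketch_mapping *m,
                                          uint32_t vaddr)
{
    return m->phys_base + (uint64_t)(vaddr - SKETCH_UBOOT_VIRT_BASE);
}
```

Using the physical address quoted in the message, 0x10f800000, a buffer at virtual 0xC0001000 translates to 0x10f801000, which no longer fits in 32 bits. That is why drivers that store pointers directly into descriptors break.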
I've run into a number of corner cases and bugs in U-Boot which I have fixed and hope to soon release patches which are not specific to Octeon:
1. 4GB and larger NAND flash doesn't work. I have a generic patch for this I will try and get out this week. Tested with Micron MT29F32G08CBABA.
2. Support for TI TMP42X temperature monitor.
3. PCI breaks if we assign the BAR address space to 0xf8000000-0xffffffff (our hardware actually maps this to 0x11b00f0000000 or 0x11c00f800000 depending on the PCIE bus number). I have a generic fix for this I also hope to release this week.
4. Support for Spansion S29GL064N which reports a manufacturer ID of 0x0000 and requires AMD fixups.
5. Fix for Micron NAND flash MT29F32G08CBABA which erroneously reports a 16-bit bus when it has an 8-bit bus.
6. Ability for software to change the u-boot prompt at run-time/boot time. We use this to indicate whether we are booting from RAM, FLASH, or failsafe with a single U-Boot binary.
All these sound pretty harmless and independent of your Octeon support.
Our 64-bit platform is different enough in that it is almost a completely different processor compared to the other MIPS platforms. While we were able to make use of and expand mipsregs.h and many of the other include files, we had to replace almost all of the C code as well as start.S. Even lib/board.c could not be reused, though hopefully I can clean this up so it can be used in the future.
We are in the process of overcoming the long-term repercussions of the original ARM port which did not follow the original (PPC based) design. ARM is now fixed, or in the process of being cleaned up. Unfortunately MIPS followed the ARM example, and here this work still needs to be done. So if you are looking for reference implementations, you rather want to look at PPC code than ARM or MIPS code.
One part of the unification attempts is that we are trying to synchronize lib/board.c across architectures, with the goal to use a single common version soon. So again, please use PPC as reference for changes, or at least try not to introduce any additional, new incompatibilities.
That is what I did, though for board.c I basically took our old implementation and cleaned it up for the new u-boot. I need to make another pass and do a proper implementation once I get some time.
Just for the CPU portion we have added around 29Kloc of code, not including the 440Kloc of code from our SDK which we link with U-Boot (LGPL). We make use of our SDK since it abstracts support for all of our chips and makes porting to new chips easier, such as our upcoming 68XX series of 32 core processors.
Do I understand correctly that this SDK (whatever that might be) will then have to be included with the U-Boot distribution?
What would be alternatives to static linking, such as to avoid adding all this code?
It would have to be included only for the Octeon processors. It is statically linked and we don't want to get away from this. Also, some of our u-boot files are in turn statically linked against some of our utilities, such as our utility which loads and boots u-boot over the PCI(e) bus. It contains all of the register definitions that are used on our board as well as functionality to initialize and work with different I/O blocks. It provides an abstraction to make it easy to deal with all of our various chips and boards as well as errata. At the moment in our internal tree we just create symlinks to all of the files. The header files I'm placing under arch/mips/include/asm/arch-octeon and the .c files under arch/mips/cpu/mips/octeon/cvmx to separate them.
Also note that u-boot for Octeon can only be compiled with our toolchain, since there is some dependency on some of the include files from our GCC distribution as well, plus our toolchain distribution includes support for some of the extensions we make use of.
I have tried to minimize our changes to the common code as much as possible and have been largely successful. Where I have made changes, I have separated out all changes specific to our hardware using #ifdef CONFIG_OCTEON. Some of these changes may be useful for other platforms if other platforms choose to add 64-bit support.
Such general changes should then NOT use CONFIG_OCTEON, but some generic variable name.
I agree, though in many cases they are not general, such as some of the support for our compact flash in cmd_ide.c and a few other areas.
We also have a number of custom commands we have added for our platform. Some of these may be able to be made more generic, but we use our own commands for things like loading our Linux kernel and simple executive applications since they require some data structures to be configured that are tied to our SDK and are needed for backwards compatibility.
Many boards use board specific commands; I see no problem with having SoC specific commands either.
I am placing these commands under arch/mips/cpu/octeon/commands rather than clutter up the common code, unless you feel it's better to put all the commands under the common code.
I don't think I can include our SDK as a series of patches on the mailing list since it is about 26MB with some of the hardware generated files being hundreds of kilobytes to 12MB for our register database file (which fortunately isn't needed by u-boot!) It's available under the LGPL but not easily accessible through our open-source web site without registration :(
U-Boot is supposed to be self-sufficient, i. e. to contain all parts that are required to build a working U-Boot image. I see a potential area of conflicts here.
We don't have any problem including our SDK with U-Boot. I can work on trying to cut down on the files that are needed. One thing to note is that we make patches and changes to our SDK fairly regularly to fix errata and to support new chips.
The main parts I am using are the sections dealing with memory, PCI/PCIE, model identification, Ethernet, twsi (I2C), UART, TLB (for virt_to_bus) and a few other areas. The memory part we use to tell our SDK which blocks of memory are in use by u-boot. There are many areas that are not used by u-boot but this may grow. For example, in the future I may make use of the usb support for our earlier chips that have a proprietary USB interface or our zip support for faster unzip, md5, sha1, etc.
The biggest portions of our SDK are generated files for dealing with errors, a register database (12MB!) and include files which define access functions to all the various registers in all of the I/O units for all of our chips (which also may not be needed). I need to play with this and see what files I can remove from the SDK.
I have also been trying to keep up with the GIT patches to u-boot and I don't think it will be all that difficult to move to the latest version since it sounds like most of the changes have affected PowerPC and ARM.
Indeed.
Best regards,
Wolfgang Denk

Dear Aaron Williams,
In message 201102071524.17440.Aaron.Williams@caviumnetworks.com you wrote:
these. I understand that you are using a 64 bit port of U-Boot?
No. We are using a 32-bit port since I think trying to make a 64-bit port of U-Boot would be far more involved. We do have support for loading and executing 64-bit ELF images, however. We use special memcpy/memset functions for 64-bit memory addressing in these cases.
OK, let's discuss this when we see your code.
Existing U-Boot deals with this by mapping just the lower and the upper parts of available physical memory. See the CONFIG_VERY_BIG_RAM config option.
I just looked at this and this. The only place I see this used is in arch/powerpc/lib/board.c and it looks like it just limits the effective memory size to CONFIG_MAX_MEM_MAPPED. This won't work for us. As I said, we need to move it out of the lower 4GB when there's more memory involved. We also don't
Why?
want there being holes in the middle of the memory if we can help it, nor do we want to place u-boot at the lower end of memory. At least on MIPS the TLB
The question is what is the lesser of two evils: not mapping all of the available memory and sticking with a total of mapped memory that is addressable in a 32 bit address space, or adding the complexity to deal with 64 bit address spaces - which will require changes in _lots_ of places.
But again I suggest to defer the discussion until we see your code.
Would using virt_to_bus to convert pointers to DMA addresses be appropriate instead of the current assumption that pointers can be used as DMA addresses directly? This seems like a portable solution since on platforms where the pointer and DMA address are the same the macro would just do nothing. Even if we didn't use virtual addresses and were using, say, KSEG1 the pointer and physical address don't match.
Yes, I think this would indeed be a better approach.
Do I understand correctly that this SDK (whatever that might be) will then have to be included with the U-Boot distribution?
What would be alternatives to static linking, such as to avoid adding all this code?
It would have to be included only for the Octeon processors. It is statically
I don't understand what you mean. Either it is part of the U-Boot source tree, or it is not. You cannot add it "only for the Octeon processors" - either it's there, or it ain't.
Given that we are talking about code in the order of 30% of the total U-Boot code, probably not conforming to U-Boot coding standards, I am anything but happy about such an addition. Assume every SoC vendor comes up with similar ideas...
linked and we don't want to get away from this. Also, some of our u-boot files are in turn statically linked against some of our utilities, such as our
I understand that you do not _want_ to change this. My question is: what would you do if you _had_ to change it?
Also note that u-boot for Octeon can only be compiled with our toolchain, since there is some dependency on some of the include files from our GCC distribution as well, plus our toolchain distribution includes support for some of the extensions we make use of.
Can this be fixed? I mean, copying some header files should probably be a solvable problem. What about these "extensions" - are they absolutely needed in U-Boot? Usually such extensions are either performance optimizations which are not really needed in U-Boot, or other well-localized operations that can be handled with small assembler stubs.
Such general changes should then NOT use CONFIG_OCTEON, but some generic variable name.
I agree, though in many cases they are not general, such as some of the support for our compact flash in cmd_ide.c and a few other areas.
In general we do not want to see board or SoC specific changes to common code.
Many boards use board specific commands; I see no problem with having SoC specific commands either.
I am placing these commands under arch/mips/cpu/octeon/commands rather than clutter up the common code, unless you feel it's better to put all the commands under the common code.
This sounds OK with me.
I don't think I can include our SDK as a series of patches on the mailing list since it is about 26MB with some of the hardware generated files being hundreds of kilobytes to 12MB for our register database file (which fortunately isn't needed by u-boot!) It's available under the LGPL but not easily accessible through our open-source web site without registration :(
U-Boot is supposed to be self-sufficient, i. e. to contain all parts that are required to build a working U-Boot image. I see a potential area of conflicts here.
We don't have any problem including our SDK with U-Boot. I can work on trying
But we probably will have problems adding tons of such code which is of no use to anybody else.
The biggest portions of our SDK are generated files for dealing with errors, a register database (12MB!) and include files which define access functions to
I'd expect that only a tiny percentage of this is actually needed / used in U-Boot. We should restrict it to these actually used parts.
Best regards,
Wolfgang Denk

On Tuesday, February 08, 2011 12:53:27 am Wolfgang Denk wrote:
Dear Aaron Williams,
In message 201102071524.17440.Aaron.Williams@caviumnetworks.com you wrote:
these. I understand that you are using a 64 bit port of U-Boot?
No. We are using a 32-bit port since I think trying to make a 64-bit port of U-Boot would be far more involved. We do have support for loading and executing 64-bit ELF images, however. We use special memcpy/memset functions for 64-bit memory addressing in these cases.
OK, let's discuss this when we see your code.
Existing U-Boot deals with this by mapping just the lower and the upper parts of available physical memory. See the CONFIG_VERY_BIG_RAM config option.
I just looked at this and this. The only place I see this used is in arch/powerpc/lib/board.c and it looks like it just limits the effective memory size to CONFIG_MAX_MEM_MAPPED. This won't work for us. As I said, we need to move it out of the lower 4GB when there's more memory involved. We also don't
Why?
As I have said, we need all of the lower portion of memory and do not want to introduce holes in the memory space for u-boot for loading 64-bit applications. Also, some of our customers are loading their own proprietary operating systems that need all of the memory. The overhead from our SDK in the lower memory region is only a few kilobytes.
One other nice thing about the TLB support is all of the fixups go away.
want there being holes in the middle of the memory if we can help it, nor do we want to place u-boot at the lower end of memory. At least on MIPS the TLB
The question is what is the lesser of two evils: not mapping all of the available memory and sticking with a total of mapped memory that is addressable in a 32 bit address space, or adding the complexity to deal with 64 bit address spaces - which will require changes in _lots_ of places.
But again I suggest to defer the discussion until we see your code.
Would using virt_to_bus to convert pointers to DMA addresses be appropriate instead of the current assumption that pointers can be used as DMA addresses directly? This seems like a portable solution since on platforms where the pointer and DMA address are the same the macro would just do nothing. Even if we didn't use virtual addresses and were using, say, KSEG1 the pointer and physical address don't match.
Yes, I think this would indeed be a better approach.
This is what I am doing, though I'm having to make changes on a driver by driver basis to add this. I'm using dma_addr_t for the address result of this then using the same method used in Linux to fill in both the upper and lower 32-bit descriptor portions, i.e. foo->lower = dma_addr & 0xffffffff; foo->upper = ((dma_addr >> 16) >> 16); This is how it's done in the Linux kernel.
So far the drivers I've modified this way seem to work fine.
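The descriptor fill described above can be written out as a small helper. This is a sketch with a made-up descriptor layout and hypothetical names, not code from any of the modified drivers. The double shift by 16 is deliberate: if the DMA address type is only 32 bits wide on some configuration, a single shift by 32 would be undefined behaviour in C, while two shifts by 16 are well defined and yield 0.

```c
#include <stdint.h>

/* Hypothetical descriptor with a 64-bit buffer address split in two. */
struct sketch_desc {
    uint32_t lower; /* bits 31..0 of the DMA address */
    uint32_t upper; /* bits 63..32 of the DMA address */
};

/*
 * Store a DMA address into the two 32-bit halves of a descriptor.
 * dma_addr is taken as uint64_t here; the shift idiom also stays
 * valid if the address type is narrowed to 32 bits.
 */
static inline void sketch_desc_set_addr(struct sketch_desc *d,
                                        uint64_t dma_addr)
{
    d->lower = (uint32_t)(dma_addr & 0xffffffffu);
    d->upper = (uint32_t)((dma_addr >> 16) >> 16);
}
```

For the physical address 0x10f800000 mentioned earlier in the thread, this gives lower = 0x0f800000 and upper = 0x1.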
Do I understand correctly that this SDK (whatever that might be) will then have to be included with the U-Boot distribution?
What would be alternatives to static linking, such as to avoid adding all this code?
It would have to be included only for the Octeon processors. It is statically
I don't undertsand what you mean. Either it is part of the U-Boot source tree, or it is not. You cannot add it "only for the Octeon processors" - either it's there, or it ain't.
Given that we are talking about code in the order of 30% of the total U-Boot code, probably not conforming to U-Boot coding standards, I am anything but happy about such an addition. Assume ever SoC vendor comes up with similar ideas...
It can be included with u-boot, though I think first I should try and see how much I can strip it down. We cannot separate u-boot completely from the SDK, though. We use it extensively for things like Ethernet and will also likely make use of it for USB when I add the Synopsys USB support used in our older chips (which is not EHCI/OHCI compatible).
linked and we don't want to get away from this. Also, some of our u-boot files are in turn statically linked against some of our utilities, such as our
I understand that you do not _want_ to change this. My question is: what would you do if you _had_ to change it?
We might be able to make some changes, but it might be difficult since we have a lot of developers working on this around the world. I would prefer to not have to maintain two separate code bases. I will need to see how much I can strip out.
Also note that u-boot for Octeon can only be compiled with our toolchain, since there is some dependency on some of the include files from our GCC distribution as well, plus our toolchain distribution includes support for some of the extensions we make use of.
Can this be fixed? I mean, copying some header files should probably be a solvable problem. What about these "extensions" - are they absolutely needed in U-Boot? Usually such extensions are either performance optimizations which are not really needed in U-Boot, or other well-localized operations that can be handled with small assembler stubs.
Support for our Octeon2 processor is not in the mainline trunk yet. We also have some changes needed to work with our SDK and the Linux kernel with our newer chips.
Such general changes should then NOT use CONFIG_OCTEON, but some generic variable name.
I agree, though in many cases they are not general, such as some of the support for our compact flash in cmd_ide.c and a few other areas.
In general we do not want to see board or SoC specific changes to common code.
The common code is already chock full of them from what I've seen in a number of areas. The number of changes for our SoC is actually quite small. In some cases I've just changed some functions to use __attribute__((__weak__)) so we can define the functionality elsewhere. I count a total of 21 ifdefs in the common code, most of those are in cmd_ide.c, one in cmd_boot.c, one in cmd_elf.c, two in env_flash.c and some in main.c, which can be removed (those are for adding the support for dynamically changing the boot prompt) and one in miiphyutil.c. Right now cmd_ide.c can be cleaned up a bit. The problem is that cmd_ide.c includes the driver functionality in it, requiring us to do this. I can clean this up by making some of the functions weak. Is this reasonable?
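The weak-function approach mentioned above can be sketched like this. The names board_ide_setup() and ide_init_example() are hypothetical and are not taken from cmd_ide.c; the point is only that common code carries a weak default, and an SoC or board file may supply a strong definition of the same symbol that the linker picks instead, with no ifdef in the common file.

```c
/*
 * Common code: weak default implementation.
 * board_ide_setup() is a hypothetical hook name used for illustration.
 */
__attribute__((__weak__)) int board_ide_setup(void)
{
	return 0;	/* default: no SoC-specific setup needed */
}

/* Common code calls the hook without knowing who implements it. */
int ide_init_example(void)
{
	return board_ide_setup();
}

/*
 * An SoC-specific file could then contain a normal (strong) definition:
 *
 *	int board_ide_setup(void) { ...SoC-specific setup...; return 0; }
 *
 * When both objects are linked, the strong definition replaces the
 * weak default.
 */
```

With only the weak default linked in, ide_init_example() returns 0.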
One other change I had to make was for the command table. I have to force the alignment to 8 bytes in the compiler or very bad things happen.
Many boards use board specific commands; I see no problem with having SoC specific commands either.
I am placing these commands under arch/mips/cpu/octeon/commands rather than clutter up the common code, unless you feel it's better to put all the commands under the common code.
This sounds OK with me.
I don't think I can include our SDK as a series of patches on the mailing list since it is about 26MB with some of the hardware generated files being hundreds of kilobytes to 12MB for our register database file (which fortunately isn't needed by u-boot!) It's available under the LGPL but not easily accessible through our open-source web site without registration :(
U-Boot is supposed to be self-sufficient, i. e. to contain all parts that are required to build a working U-Boot image. I see a potential area of conflicts here.
We don't have any problem including our SDK with U-Boot. I can work on trying
But we probably will have problems adding tons of such code which is of no use to anybody else.
As I have said, I will try and cut down on what is included.
The biggest portions of our SDK are generated files for dealing with errors, a register database (12MB!) and include files which define access functions to
I'd expect that only a tiny percentage of this is actually needed / used in U-Boot. We should restrict it to these actually used parts.
Agreed.
Best regards,
Wolfgang Denk

Dear Aaron Williams,
In message 201102081927.36497.Aaron.Williams@caviumnetworks.com you wrote:
...
memory size to CONFIG_MAX_MEM_MAPPED. This won't work for us. As I said, we need to move it out of the lower 4GB when there's more memory involved. We also don't
Why?
As I have said, we need all of the lower portion of memory and do not want to introduce holes in the memory space for u-boot for loading 64-bit
When I ask "why?" then I do so to understand the reasons for your design decisions (and eventually to be able to recommend alternative solutions).
In such a situation phrases as "we need" or "we [do not] want" are completely useless and unhelpful, unless they are accompanied with a "because ..." part that explains _why_ you [think you] "need" or "want" these things.
I understand that you do not _want_ to change this. My question is: what would you do if you _had_ to change it?
We might be able to make some changes, but it might be difficult since we have a lot of developers working on this around the world. I would prefer to not have to maintain two separate code bases. I will need to see how much I can strip out.
Well, frankly, this is your problem, not ours. If Cavium had decided to discuss the design early we might have come up with a solution that fits without too many problems. Instead, Cavium went on and kept their stuff closed for years - now they say that it's difficult to change things. Please understand that our focus is on the overall quality and maintainability of the code, so such arguments do not carry much weight.
since there is some dependency on some of the include files from our GCC distribution as well, plus our toolchain distribution includes support for some of the extensions we make use of.
Can this be fixed? I mean, copying some header files should probably be a solvable problem. What about these "extensions" - are they absolutely needed in U-Boot? Usually such extensions are either performance optimizations which are not really needed in U-Boot, or other well-localized operations that can be handled with small assembler stubs.
Support for our Octeon2 processor is not in the mainline trunk yet. We also have some changes needed to work with our SDK and the Linux kernel with our newer chips.
You reply, but you do not actually answer my questions.
In general we do not want to see board or SoC specific changes to common code.
The common code is already chock full of them from what I've seen in a number
Yes, and we've learned that lesson, and now we strive hard not to add to the mess any more.
of areas. The number of changes for our SoC is actually quite small. In some cases I've just changed some functions to use __attribute__((__weak__)) so we can define the functionality elsewhere. I count a total of 21 ifdefs in the common code, most of those are in cmd_ide.c, one in cmd_boot.c, one in cmd_elf.c, two in env_flash.c and some in main.c, which can be removed (those are for adding the support for dynamically changing the boot prompt) and one in miiphyutil.c. Right now cmd_ide.c can be cleaned up a bit. The problem is that cmd_ide.c includes the driver functionality in it, requiring us to do this. I can clean this up by making some of the functions weak. Is this reasonable?
I don't know. I haven't seen any of the code yet, so I really cannot tell.
One other change I had to make was for the command table. I have to force the alignment to 8 bytes in the compiler or very bad things happen.
Why?
Best regards,
Wolfgang Denk

On Tuesday, February 08, 2011 11:39:58 pm Wolfgang Denk wrote:
Dear Aaron Williams,
In message 201102081927.36497.Aaron.Williams@caviumnetworks.com you wrote:
...
memory size to CONFIG_MAX_MEM_MAPPED. This won't work for us. As I said, we need to move it out of the lower 4GB when there's more memory involved. We also don't
Why?
As I have said, we need all of the lower portion of memory and do not want to introduce holes in the memory space for u-boot for loading 64-bit
When I ask "why?" then I do so to understand the reasons for your design decisions (and eventually to be able to recommend alternative solutions).
In such a situation phrases as "we need" or "we [do not] want" are completely useless and unhelpful, unless they are accompanied with a "because ..." part that explains _why_ you [think you] "need" or "want" these things.
We cannot have u-boot consuming space in KUSEG or in KSEG0. Placing U-Boot at the bottom of memory causes problems for some of our customers which load proprietary operating systems that don't handle this well. They also need all the memory they can get. By using a single TLB entry we map U-Boot to the top of physical memory and map it to KSEG2. This makes the entire KUSEG and KSEG0 available for customer applications and operating systems. We don't have the uncached KSEG1 since the Octeon is a cache coherent design. We also can't load U-Boot at the top of KUSEG or KSEG0 because that would create a hole in the middle of memory for loading 64-bit applications.
The other alternative would be to make U-Boot fully 64-bit. That, however, would be a far larger challenge since there are assumptions everywhere about 32-bit pointers.
By utilizing the TLB we map U-Boot to the top of physical memory which leaves almost all of the memory available for loading large 64-bit applications and operating systems without creating a hole in KSEG0/KUSEG (64-bit) where U-Boot was loaded.
As a side benefit, the relocation code is no longer needed with the exception of gd and the environment.
The drawback is that drivers cannot assume a pointer is always to a physical address, but they shouldn't make that assumption anyway. For example, if u-boot is running out of KSEG0 then the pointer and physical address will not match on our platform.
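A minimal sketch of the kind of translation wrapper this implies, assuming a single TLB window that maps a KSEG2 virtual range onto the top of physical RAM. Everything here is illustrative: struct tlb_window and example_virt_to_phys() are hypothetical names, the window base and size are made-up values, and the fall-through case is a simplification rather than a description of the Octeon code.

```c
#include <stdint.h>

#define KSEG0_BASE 0x80000000u	/* unmapped, cached segment (32-bit MIPS map) */
#define KSEG1_BASE 0xa0000000u	/* first address past KSEG0 */

/* Hypothetical description of the one TLB entry covering U-Boot. */
struct tlb_window {
	uint32_t virt_base;	/* virtual start, e.g. in KSEG2 */
	uint64_t phys_base;	/* physical start, may be above 4GB */
	uint32_t size;		/* size of the mapped window in bytes */
};

/* Translate a 32-bit pointer value into a 64-bit physical address. */
static uint64_t example_virt_to_phys(const struct tlb_window *w, uint32_t vaddr)
{
	/* Inside the TLB window: unsigned wrap-around rejects vaddr < virt_base. */
	if ((uint32_t)(vaddr - w->virt_base) < w->size)
		return w->phys_base + (uint32_t)(vaddr - w->virt_base);

	/* KSEG0 is a fixed mapping: strip the segment base. */
	if (vaddr >= KSEG0_BASE && vaddr < KSEG1_BASE)
		return vaddr - KSEG0_BASE;

	/* Simplification for this sketch: treat anything else as identity. */
	return vaddr;
}
```

The result is 64 bits wide on purpose, so a driver can split it into the upper and lower halves of a descriptor even when the physical address lies above 4GB.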
I understand that you do not _want_ to change this. My question is: what would you do if you _had_ to change it?
We might be able to make some changes, but it might be difficult since we have a lot of developers working on this around the world. I would prefer to not have to maintain two separate code bases. I will need to see how much I can strip out.
Well, frankly, this is your problem, not ours. If Cavium had decided to discuss the design early we might have come up with a solution that fits without too many problems. Instead, Cavium went on and kept their stuff closed for years - now they say that it's difficult to change things. Please understand that our focus is on the overall quality and maintainability of the code, so such arguments do not carry much weight.
I can't say much about that. That was before my time here.
since there is some dependency on some of the include files from our GCC distribution as well, plus our toolchain distribution includes support for some of the extensions we make use of.
Can this be fixed? I mean, copying some header files should probably be a solvable problem. What about these "extensions" - are they absolutely needed in U-Boot? Usually such extensions are either performance optimizations which are not really needed in U-Boot, or other well-localized operations that can be handled with small assembler stubs.
Support for our Octeon2 processor is not in the mainline trunk yet. We also have some changes needed to work with our SDK and the Linux kernel with our newer chips.
You reply, but you do not actually answer my questions.
The code uses features found in the Octeon II extensions that are not available in the binutils trunk yet. I know for a fact that the standard MIPS toolchain used to compile other MIPS boards in U-Boot will not work at all for the Octeon. It doesn't even support the n32 ABI which is absolutely required.
In general we do not want to see board or SoC specific changes to common code.
The common code is already chock full of them from what I've seen in a number
Yes, and we've learned that lesson, and now we strive hard not to add to the mess any more.
For the most part I have avoided them, and many that are there can be changed to be more general. The IDE code will take more work since it's already a mess.
of areas. The number of changes for our SoC is actually quite small. In some cases I've just changed some functions to use __attribute__((__weak__)) so we can define the functionality elsewhere. I count a total of 21 ifdefs in the common code, most of those are in cmd_ide.c, one in cmd_boot.c, one in cmd_elf.c, two in env_flash.c and some in main.c, which can be removed (those are for adding the support for dynamically changing the boot prompt) and one in miiphyutil.c. Right now cmd_ide.c can be cleaned up a bit. The problem is that cmd_ide.c includes the driver functionality in it, requiring us to do this. I can clean this up by making some of the functions weak. Is this reasonable?
I don't know. I haven't seen any of the code yet, so I really cannot tell.
One other change I had to make was for the command table. I have to force the alignment to 8 bytes in the compiler or very bad things happen.
Why?
Because otherwise the code generated by the compiler to enumerate the commands when looking for a match will crash because the compiler increments the pointer by a value that does not match what the linker does. The linker's minimum section alignment is 64-bits for n32, but the compiler generates code assuming 32-bit alignment. If the commands were just an array then there would be no problem. The other alternative is to add padding to force the data structure to be a multiple of 64-bits. I have not found any other workaround. I don't think this is an issue for a purely 32-bit ABI, such as o32, but it is an issue with n32 which supports 64-bit load/store operations and registers. Even telling the linker to force 32-bit alignment does not help, it still performs 64-bit alignment when using the n32 ABI. I haven't looked into the details for the ABI, but I wouldn't be surprised if the linker is doing the right thing and the problem is just that u-boot's assumptions break in this case.
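The mismatch described above can be illustrated with a small, host-independent sketch. The struct below is not U-Boot's real command table entry; it is a hypothetical five-word stand-in, with uint32_t modelling the 32-bit pointers of n32 so the sizes come out the same on any host. The align_up() helper models a linker that places each separately defined entry on an 8-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry with 32-bit fields only: 20 bytes, 4-byte alignment. */
struct cmd_entry_plain {
	uint32_t name;
	uint32_t maxargs;
	uint32_t repeatable;
	uint32_t cmd;
	uint32_t usage;
};

/* Same fields, but the type is forced to 8-byte alignment: sizeof is 24. */
struct cmd_entry_aligned8 {
	uint32_t name;
	uint32_t maxargs;
	uint32_t repeatable;
	uint32_t cmd;
	uint32_t usage;
} __attribute__((aligned(8)));

/* Round size up to the next multiple of align (align is a power of two). */
static size_t align_up(size_t size, size_t align)
{
	return (size + align - 1) & ~(align - 1);
}

/* Distance between entries when each one is placed on an 8-byte boundary. */
static size_t linker_stride(size_t entry_size)
{
	return align_up(entry_size, 8);
}
```

With the plain struct, pointer arithmetic steps by 20 bytes while entries placed on 8-byte boundaries sit 24 bytes apart, so the second entry is already misread. With the aligned variant both strides are 24 bytes, which is the effect of forcing the alignment or padding the structure.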
Best regards,
Wolfgang Denk

Hi Aaron
Am 07.02.2011 23:02, schrieb Aaron Williams:
- Fix for Micron NAND flash MT29F32G08CBABA which erroneously reports a 16-bit bus when it has an 8-bit bus.
Can you send that patch separately? I have an iMX25 board here with the 2GiB version of that chip which also reports a 16-bit bus width. I didn't have time to look into this and if you have a fix for this...
Thanks, Matthias
participants (3): Aaron Williams, Matthias Weißer, Wolfgang Denk