
On Thu, 2014-10-02 at 17:29 -0600, Stephen Warren wrote:
Please mention the commit description too, so people can still find the commit if it's been cherry-picked to a different branch with a different commit ID:
aeb3fcb35956 ARM: tegra: Use mem size from MC rather than ODMDATA
Will sure do.
I still think calling get_ram_size() is completely pointless given how Tegra works and that this code is already running from RAM. Equally, there doesn't appear to be any protection in the implementation against trashing the code that implements get_ram_size() with the RAM writes the code performs. I'd rather not call get_ram_size() at all.
I really don't think it is that pointless given most all other ports are currently using it (see doc/README.memory-test which I quoted) but I agree to a certain extend to the claim you are making. Let me add Wolfgang CC to the discussion as the original author thereof.
That's *extremely* unlikely. Perhaps it will catch a very tiny percentage of HW problems,
It would catch stuck address lines which is the most like manufacturing issue to ever happen on the RAM side of things. But agreed for that to happen is very unlikely. In the last 10 thousand Tegra modules we produced we probably only saw one or two such failures.
and some number of SW configuration problems (i.e. BCT EmemCfg field mismatch with the board's actual RAM size).
Exactly, which is BTW how we are currently still doing things on our Apalis T30 modules which exist in a 1 GB and a 2 GB flavour both running off the exact same BCT. Downsteam so far we used custom code to detect the mirroring happening on the 1 GB variant:
http://git.toradex.com/gitweb/u-boot-toradex.git/blob/refs/heads/colibri:/ar...
Now using get_ram_size() automatically takes care of this plus it also takes care of cases where customers for some reason flash the wrong BCT onto say some Colibri T20 modules which unfortunately may still boot even with the wrong BCT (see above given link some 22 source lines further up)!
get_ram_size() doesn't perform anything remotely like a complete RAM test; it just writes to each power-of-two address, which skips a lot of RAM cells!
Yes, it is NOT a complete test of the RAM part but that should also not be necessary as the vendor of that part should already have done that as part of his production testing.
Even then, I'm not sure that it's robust. One issue I immediately saw: it writes to *base then reads it back with only a CPU synchronization between and no bus clear operation (e.g. writing a different pattern to a different address). I've such tests pass that operation when there's nothing connected to the bus at all, due to capacitance in bus drivers/input buffers.
I was thinking about that as well but I guess one would probably not get that far on a Tegra in that case plus if really deemed required one could add a custom synchronisation primitive (see sync define in common/memsize.c) which so far is only done for PowerPCs.
Aside from that, this change looks reasonable at a quick glance, although I do hope your testing was very thorough.
Yes, as mentioned I tested it on a Colibri T20 256 MB V1.1B, V1.1B 2nd, V1.2A, Colibri T20 512 MB V1.1C, V1.2A, Apalis T30 1GB V1.0A, Apalis T30 2 GB V1.0B, V1.0C and Colibri T30 V1.1B, V1.1C, V1.1D and V1.1E plus on a NVIDIA Jetson TK1. Unfortunately I do not have any T114 which I could test it on (or do you happen to know how I could run mainline U-Boot on my NVIDIA Shield?).
I personally don't have time to go through and test every board we support with this change.
Understood. Would be good if somebody could give it a quick try on a Dalmore I believe you guys call your T114 dev board.
(As an aside, I learned something from this patch; I'd thought that our earlier SoCs didn't have a register in the EMC that indicated the RAM size, and so code *had* to use the ODMDATA to determine the RAM size. Evidently that wasn't the case, and we should have been using an EMC register all along).
Nope, we never ever used that ODMDATA crap even on T20 albeit downstream using a slightly different approach via EMC_ADR_CFG_BASE.
(As an aside, I am really honoured that after working quite intensively on that stuff for the last three and a half years I am able to pass some tricks on to the actual master of disaster (;-p).