[U-Boot] 460GT PCIe configuration

Stefan et al,
I am trying to troubleshoot a weird PCIe problem on a PPC460GT based target, and it is getting curiouser and curiouser.
There is a tlb overlap I mentioned in an earlier email; on top of that there are some things happening in cpu/ppc4xx/4xx_pcie.c which I also find hard to understand:
there is a static function pcie_get_base(), which returns a value as in
address = pcie_get_base(hose, devfn)
there are two instances of this, in both cases `address' is never used.
The CONFIG_SYS_PCIE0_XCFGBASE constant (and its counterparts for other PCIe ports) is defined and used in the code, and gets a TLB entry assigned, but I can't find a place where it is programmed into the CPU - how does it know where this section is?!
I have several different targets with different PCIe components, but all using the same base CPU subsystem design, and on some of them PCIe components misbehave, namely, PCIe memory read transactions fail with a machine check after a timeout, even though the PCIe side of things is fine (when looking with a protocol analyzer).
Any insight/explanations/suggestions would be highly appreciated, TIA, vadim
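A TLB overlap like the one mentioned above can be checked mechanically before looking at the PCIe core itself. The following is only a minimal sketch: `struct tlb_region`, `regions_overlap()` and `count_overlaps()` are hypothetical names that do not exist in U-Boot, and the table contents would have to be copied by hand from the board's TLB setup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* One statically mapped region: start address and size in bytes.
 * Hypothetical type for illustration, not a U-Boot structure. */
struct tlb_region {
    const char *name;
    uint64_t base;
    uint64_t size;
};

/* True when the two regions share at least one byte. */
static bool regions_overlap(const struct tlb_region *a,
                            const struct tlb_region *b)
{
    assert(a->size != 0 && b->size != 0);

    /* Compare against the last byte of each region so that a region
     * ending at the top of the address space does not overflow. */
    uint64_t a_last = a->base + (a->size - 1);
    uint64_t b_last = b->base + (b->size - 1);

    return a->base <= b_last && b->base <= a_last;
}

/* Number of overlapping pairs in a table of n regions. */
static size_t count_overlaps(const struct tlb_region *r, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (regions_overlap(&r[i], &r[j]))
                hits++;

    return hits;
}
```

Calling `count_overlaps()` on a table that mirrors the board's entries should return 0 for a clean memory map; any other value means at least one pair of entries needs a closer look.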

On Tuesday 13 January 2009, vb wrote:
I am trying to troubleshoot a weird PCIe problem on a PPC460GT based target, and it is getting curiouser and curiouser.
There is a tlb overlap I mentioned in an earlier email; on top of that there are some things happening in cpu/ppc4xx/4xx_pcie.c which I also find hard to understand:
there is a static function pcie_get_base(), which returns a value as in
address = pcie_get_base(hose, devfn)
there are two instances of this, in both cases `address' is never used.
Good catch. pcie_get_base() can be removed. This is probably a remnant from an older driver version.
The CONFIG_SYS_PCIE0_XCFGBASE constant (and its counterparts for other PCIe ports) is defined and used in the code, and gets a TLB entry assigned, but I can't find a place where it is programmed into the CPU - how does it know where this section is?!
Again you seem to be correct here. I can't find a place where this area is programmed. I don't have the time to dig into this right now, so it would be great if you could look into this a little deeper. I suggest looking at the Linux 4xx PCI driver (arch/powerpc/sysdev/ppc4xx_pci.c) as a reference.
I have several different targets with different PCIe components, but all using the same base CPU subsystem design, and on some of them PCIe components misbehave, namely, PCIe memory read transactions fail with a machine check after a timeout, even though the PCIe side of things is fine (when looking with a protocol analyzer).
Is this all 460EX? Or some other 4xx? What are the PCIe endpoints you are using? Do you see the same problems on Canyonlands as well?
Best regards, Stefan
===================================================================== DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany Phone: +49-8142-66989-0 Fax: +49-8142-66989-80 Email: office@denx.de =====================================================================

On Wed, Jan 14, 2009 at 7:03 AM, Stefan Roese sr@denx.de wrote:
On Tuesday 13 January 2009, vb wrote:
The CONFIG_SYS_PCIE0_XCFGBASE constant (and its counterparts for other PCIe ports) is defined and used in the code, and gets a TLB entry assigned, but I can't find a place where it is programmed into the CPU - how does it know where this section is?!
Again you seem to be correct here. I can't find a place where this area is programmed. I don't have the time to dig into this right now, so it would be great if you could work on this a little deeper. I suggest to look at the Linux 4xx PCI driver (arch/powerpc/sysdev/ppc4xx_pci.c) as reference.
Stefan,
thank you for confirming my suspicions and for your suggestion. I will compare notes with the Linux version (should have thought about this earlier). But I was also under the impression that Linux does not touch some parts of the PCI configuration, as the memory map is set by U-Boot and used by Linux. Or does Linux use the addresses from the device tree to reprogram the PCIe subsystem?
I have several different targets with different PCIe components, but all using the same base CPU subsystem design, and on some of them PCIe components misbehave, namely, PCIe memory read transactions fail with a machine check after a timeout, even though the PCIe side of things is fine (when looking with a protocol analyzer).
Is this all 460EX? Or some other 4xx? What are the PCIe endpoints you are using? Do you see the same problems on Canyonlands as well?
This is 460GT, so the eval board is glacier, not canyonlands.
The PCI endpoints which work are an Intel NIC (tried it with the glacier), and some Broadcom integrated ethernet switches (those work on our own design). The one which fails is based on the very similar 460GT based platform, but uses an Altera FPGA with a standard Altera PCIe interface implementation.
What happens is that config space transactions (both read and write) and memory writes work fine, but attempts to read Altera's memory-mapped space cause a machine check with very vague error reporting.
I will try submitting diffs for your review, but I need to get to the bottom of this first...
cheers, vadim

On Wednesday 14 January 2009, vb wrote:
thank you for confirming my suspicions and for your suggestion, I will compare notes with the Linux version (should have thought about this earlier). But I also was under impression that Linux does not touch some parts of PCI configuration, as the memory map is set by u-boot and used by Linux. Or does linux use the addresses from the device tree to reprogram the PCIe subsystem?
Correct. Linux (re-)configures the 4xx PCI(e) controller completely. Everything should be overwritten by Linux.
Best regards, Stefan

On Wednesday 14 January 2009, Stefan Roese wrote:
On Wednesday 14 January 2009, vb wrote:
thank you for confirming my suspicions and for your suggestion, I will compare notes with the Linux version (should have thought about this earlier). But I also was under impression that Linux does not touch some parts of PCI configuration, as the memory map is set by u-boot and used by Linux. Or does linux use the addresses from the device tree to reprogram the PCIe subsystem?
Correct. Linux (re-)configures the 4xx PCI(e) controller completely. Everything should be overwritten by Linux.
BTW: Do you see the same problems (PCIe memory read timeout) under Linux?
If PCIe works on Glacier and fails on your custom board it may be a hardware related problem on your board (either board routing or endpoint etc). Are you sure that your FPGA based PCIe endpoint is working correctly? Can you "plug" a standard PCIe endpoint in your custom hardware?
Best regards, Stefan

Hello Vadim, Stefan,
On Wed, Jan 14, 2009 at 5:42 PM, vb vb@vsbe.com wrote:
On Wed, Jan 14, 2009 at 7:03 AM, Stefan Roese sr@denx.de wrote:
On Tuesday 13 January 2009, vb wrote:
I have several different targets with different PCIe components, but all using the same base CPU subsystem design, and on some of them PCIe components misbehave, namely, PCIe memory read transactions fail with a machine check after a timeout, even though the PCIe side of things is fine (when looking with a protocol analyzer).
on our own design). The one which fails is based on the very similar 460GT based platform, but uses an Altera FPGA with a standard Altera PCIe interface implementation.
What happens is that config space transactions (both read and write) and memory writes work fine, but attempts to read Altera's memory-mapped space cause a machine check with very vague error reporting.
Vadim, I am currently designing an Altera FPGA PCIe application and testing on the AMCC 460EX as one of the targets, so I may be able to provide some hints on where to look. I debugged *without* a protocol analyser and also hit some machine check exceptions when the FPGA logic was still misbehaving.
Let me just throw some hints and references your way, hope they might be useful:
Do you use a reference design? Does the FPGA application respond to the reads? Do you map the BARs correctly? Do you use 32-bit read/writes, 32-bit alignment?
In the linux-next GIT tree, my driver for the Altera PCIe Chaining DMA design is included: drivers/staging/altpciechdma/altpciechdma.c
Best regards, Leon

On Tue, Mar 10, 2009 at 2:23 PM, Leon Woestenberg leon.woestenberg@gmail.com wrote:
Hello Vadim, Stefan,
Hi Leon, thank you for your interest. The problem I was dealing with was fixed long ago.
What happened was that in the Read transaction response the FPGA was setting one of the header attributes to a value different from what the transaction originator (460GT root complex) requested. The analyzer was not even highlighting that as a potential issue (even though it was showing the discrepancy between request and response headers).
It turned out that the 460GT PCIe core was much more sensitive (some would say closer to the spec and thus more correct) than some other CPUs, and was just ignoring the PCIe responses with incorrect header contents, eventually generating PCIe timeouts which caused and were reported as PLB timeouts (because this is where the CPU was waiting).
Very poor error reporting mechanism, but we have to do with what we have to do :-)
cheers, /vb
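The kind of mismatch described above can be caught offline by comparing the raw header bytes of a captured request and its completion. The thread does not say which field the FPGA got wrong, so the sketch below simply checks the four fields that, per the PCIe base specification, a completion is expected to echo back from the request: Traffic Class, Attr, Requester ID and Tag. The names `tlp_echo_fields`, `from_request()`, `from_completion()` and `completion_matches()` are hypothetical and are not taken from U-Boot, Linux, or any analyzer tool. Header bytes are assumed to be in wire order, byte 0 first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fields that a completion is expected to copy from its request. */
struct tlp_echo_fields {
    uint8_t  tc;           /* Traffic Class, 3 bits */
    uint8_t  attr;         /* Attr[1:0]: relaxed ordering, no snoop */
    uint16_t requester_id; /* bus/device/function of the requester */
    uint8_t  tag;
};

/* Extract the fields from a memory read request header.
 * Only the first 8 bytes are used, so this works for both
 * 3DW and 4DW request headers. */
static struct tlp_echo_fields from_request(const uint8_t *h)
{
    struct tlp_echo_fields f;

    assert(h != NULL);
    f.tc = (h[1] >> 4) & 0x7;
    f.attr = (h[2] >> 4) & 0x3;
    f.requester_id = (uint16_t)((h[4] << 8) | h[5]);
    f.tag = h[6];
    return f;
}

/* Extract the same fields from a 3DW (12-byte) completion header,
 * where Requester ID and Tag live in the third DW. */
static struct tlp_echo_fields from_completion(const uint8_t *h)
{
    struct tlp_echo_fields f;

    assert(h != NULL);
    f.tc = (h[1] >> 4) & 0x7;
    f.attr = (h[2] >> 4) & 0x3;
    f.requester_id = (uint16_t)((h[8] << 8) | h[9]);
    f.tag = h[10];
    return f;
}

/* True when the completion echoes all four fields of the request. */
static bool completion_matches(const uint8_t *req, const uint8_t *cpl)
{
    struct tlp_echo_fields a = from_request(req);
    struct tlp_echo_fields b = from_completion(cpl);

    return a.tc == b.tc && a.attr == b.attr &&
           a.requester_id == b.requester_id && a.tag == b.tag;
}
```

Feeding it the request and completion headers exported from an analyzer trace would flag a discrepancy that the analyzer displays but does not mark as an error.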
participants (3): Leon Woestenberg, Stefan Roese, vb