[U-Boot] mx28 spl power cpu clock configuration

Hi Marek/Fabio,
I think there's an error in the code setting up the CPU clock in the SPL for the i.MX28. When instruction stepping through mx28_power_clock2pll in spl_power_init.c, the processor drops dead right after PLL bypass has been disabled.
The i.MX28 reference manual (page 116) states that the JTAG clock is derived from the CPU clock and that the JTAG TAP will stop working if the CPU clock is stalled, too low or disabled. I figured that disabling PLL bypass would temporarily cause an irregular clock, throwing off the probe, even in adaptive clocking mode. Three different probes later, I know that is not the cause.
Freescale tech support said it may be related to a shortcut in the clock tree that deviates from the ARM specification, but that's not causing the problem either.
I succeeded in reproducing the problem using only my JTAG probe (Abatron BDI3000) with adaptive clocking. Below is the transcript of the probe's command line interface:
    926E>reset
    ...
    926E>md 0x80040000 1
    80040000 : 0x00000000 0 ....
    926E>mm 0x80040000 0x00020000
    926E>md 0x80040000 1
    80040000 : 0x00020000 131072 ....
    926E>
    926E>md 0x800401d0 1
    800401d0 : 0x000441ff 279039 .A..
    926E>mm 0x800401d0 0x000041ff
    926E>md 0x800401d0 1
After the last command, both the mx28evk board and the JTAG probe hang. The last JTAG transaction, caused by the last 'mm' command, is shown in attachment 'TCK-RTCK, Adaptive, No Frac0.png'. This picture shows that during the transaction the probe is raising TCK, but the target is no longer following it, as it's supposed to do.
When configuring the probe for a fixed clock of 1MHz, the same sequence no longer hangs up the probe, but just hangs up the target:
    926E>reset
    ...
    926E>md 0x80040000 1
    80040000 : 0x00000000 0 ....
    926E>mm 0x80040000 0x00020000
    926E>md 0x80040000 1
    80040000 : 0x00020000 131072 ....
    926E>
    926E>md 0x800401d0 1
    800401d0 : 0x000441ff 279039 .A..
    926E>mm 0x800401d0 0x000041ff
    926E>md 0x80040000 1
    80040000 : 0xffffffff -1 ....
The last read-back is obviously a bogus value. The last JTAG transaction, caused by the last 'mm' command, is shown in attachment 'TCK-RTCK, 1MHz, No Frac0.png'. This picture shows that half-way through the transaction the target stops outputting RTCK, while the probe continues on its fixed clock.
I think the cause of this problem is that PLL bypass is disabled - using PLL0 as the CPU clock source instead of its reference xtal - while CPU clock gating on PLL0 is still enabled. I don't fully understand why this problem only occurs when instruction-stepping the code, and not under normal operating conditions. It may be related to delay, because some time later mx28_mem_init in spl_mem_init.c does disable CPU clock gating on PLL0.
I have tested this by modifying the sequence above, inserting commands to disable CPU clock gating:
    926E>reset
    ...
    926E>md 0x80040000 1
    80040000 : 0x00000000 0 ....
    926E>mm 0x80040000 0x00020000
    926E>md 0x80040000 1
    80040000 : 0x00020000 131072 ....
    926E>
    926E>md 0x800401b0 1
    800401b0 : 0x92929292 -1835887982 ....
    926E>mm 0x800401b0 0x12525513
    926E>md 0x800401b0 1
    800401b0 : 0x52521513 1381111059 ..RR
    926E>
    926E>md 0x800401d0 1
    800401d0 : 0x000441ff 279039 .A..
    926E>mm 0x800401d0 0x000041ff
    926E>md 0x800401d0 1
    800401d0 : 0x000041ff 16895 .A..
After this sequence, both probe and board are still fully responsive. Even the written value can be read back successfully. Attachment 'TCK-RTCK, Adaptive, Frac0.png' shows the JTAG transaction caused by the last 'mm' command. The zoomed section at the bottom shows how the clock frequency increases half-way through the command.
The sequence above changes more in the clkctrl_frac0 register than just disabling CPU clock gating, but I have repeated this sequence writing a value of 0x92929212 (over a power-up default of 0x92929292) and that works just the same.
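To see which bits each of the two writes actually changes in clkctrl_frac0, the before and after values from the transcript can be XORed. The following is a minimal C sketch; the helper names are hypothetical, and it only does arithmetic on the values quoted above without touching any hardware:
```c
#include <stdint.h>

/* Bits that differ between two register snapshots. */
static uint32_t changed_bits(uint32_t before, uint32_t after)
{
	return before ^ after;
}

/*
 * Bitmask of byte lanes (bit 0 = least significant byte) in which
 * at least one bit differs between the two snapshots.
 */
static unsigned changed_byte_lanes(uint32_t before, uint32_t after)
{
	uint32_t diff = before ^ after;
	unsigned lanes = 0;
	unsigned i;

	for (i = 0; i < 4; i++)
		if ((diff >> (8 * i)) & 0xffu)
			lanes |= 1u << i;
	return lanes;
}
```
Applied to the thread's values, changed_bits(0x92929292, 0x92929212) gives 0x00000080, a single bit in the lowest byte, while changed_bits(0x92929292, 0x52521513) gives 0xC0C08781, which touches all four byte lanes.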
Shouldn't we configure clkctrl_frac0 - or at least disable CPU clock gating - before disabling PLL bypass?
Cheers,
Robert.
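The ordering proposed in the message above can be sketched as follows. This is only a model operating on plain variables, not the actual U-Boot code: the struct, the function names and the macro names are hypothetical, and the bit positions are inferred from the register values in the transcripts (0x92929292 -> 0x92929212 for clkctrl_frac0 at 0x800401b0, 0x000441ff -> 0x000041ff for the bypass register at 0x800401d0).
```c
#include <stdint.h>

/* Bit positions inferred from the transcript values; names are illustrative. */
#define FRAC0_CLKGATECPU   (1u << 7)   /* 0x92929292 -> 0x92929212 */
#define CLKSEQ_BYPASS_CPU  (1u << 18)  /* 0x000441ff -> 0x000041ff */

/* Software model of the two registers involved; not a hardware mapping. */
struct clkctrl_model {
	uint32_t frac0;   /* models the register at 0x800401b0 */
	uint32_t clkseq;  /* models the register at 0x800401d0 */
};

/* Step 1: open the CPU clock gate on the PLL0 fractional divider. */
static void ungate_cpu_clock(struct clkctrl_model *c)
{
	c->frac0 &= ~FRAC0_CLKGATECPU;
}

/* Step 2: only then switch the CPU from the xtal bypass to PLL0. */
static void disable_cpu_bypass(struct clkctrl_model *c)
{
	c->clkseq &= ~CLKSEQ_BYPASS_CPU;
}

/* Proposed order: ungate first, remove bypass second. */
static void switch_cpu_to_pll(struct clkctrl_model *c)
{
	ungate_cpu_clock(c);
	disable_cpu_bypass(c);
}
```
Starting from the power-up values seen in the transcript, the model ends with frac0 at 0x92929212 and the bypass register at 0x000041ff, which matches the sequence that left both probe and board responsive. Whether this order is sufficient on real hardware is exactly what the thread is trying to establish.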

Hi Marek/Fabio,
I think there's an error in the code setting up the CPU clock in the SPL for the i.MX28. When instruction stepping through mx28_power_clock2pll in spl_power_init.c, the processor drops dead right after PLL bypass has been disabled.
Stepping through the code is not recommended; that's why I couldn't debug certain parts of the power init code either. But I don't think it's a bug, I suspect it's expected behaviour during this transition.
Shouldn't we configure clkctrl_frac0 - or at least disable CPU clock gating - before disabling PLL bypass?
This seems reasonable. Fabio, can you comment?
M
Cheers,
Robert.

Hi Robert,
On 1/25/12, Marek Vasut marek.vasut@gmail.com wrote:
Shouldn't we configure clkctrl_frac0 - or at least disable CPU clock gating - before disabling PLL bypass?
This seems reasonable. Fabio, can you comment?
Could you please post a patch with your proposed change so that we can test it?
Regards,
Fabio Estevam

Hi Fabio,
Could you please post a patch with your proposed change so that we can test it?
I was hoping for a suggestion from you, as you know this SoC far better than me. Currently I am trying different solutions. Even though they prevent the system from hanging up, they still don't enable me to step through the code. And since there's no problem with normal operation, I think they aren't worth anything unless they fix instruction stepping.
But as Marek says, instruction stepping through this section can fail for another reason.
Cheers,
Robert.

Hi Fabio,
Could you please post a patch with your proposed change so that we can test it?
I was hoping for a suggestion from you, as you know this SoC far better than me. Currently I am trying different solutions. Even though they prevent the system from hanging up, they still don't enable me to step through the code.
From your previous email, it looked like that was the proper solution. You can still send a patch so we can test and proceed further in sync.
And since there's no problem with normal operation, I think they aren't worth anything unless they fix instruction stepping.
What do you mean?
But as Marek says, instruction stepping through this section can fail for another reason.
I never said such a sentence.
M
Cheers,
Robert.

From your previous email, it looked like that was the proper solution. You can still send a patch so we can test and proceed further in sync.
I will send it in, as soon as I have a solution that enables instruction stepping through this code.
What do you mean?
If my 'solution' doesn't enable instruction stepping, it's not really solving any problems and hence not worth anybody's time. Before I send anything in, I want to understand why it works, and preferably why it didn't work without it.
I never said such a sentence.
You said instruction stepping this area isn't recommended, so I concluded that many things can go wrong there.
Cheers,
Robert.

Robert ... do you really want to cooperate and help fix stuff in mainline, or do you want to keep everyone in the dark, make them guess/help you and, when you fix something, never come back and keep the fix only for yourself?
M

Robert ... do you really want to cooperate and help fix stuff in mainline, or do you want to keep everyone in the dark, make them guess/help you and, when you fix something, never come back and keep the fix only for yourself?
Well, given my very verbose thread start, including oscillograms, it is not my intention to leave everyone in the dark.
My 'fix' as it is now doesn't fix any real problem. It's not finished yet. As it looks now, it makes the JTAG connection unreliable. Data is getting corrupted when it's read or written. However, the system no longer hangs.
I will post a patch when I've found a working solution. It's in my best interest to have it tested and reviewed by you guys, as you understand the clock tree of this SoC a lot better than I. Besides that, I'd like to have it in the mainline as well, so we don't have to maintain our patches.

Robert ... do you really want to cooperate and help fix stuff in mainline, or do you want to keep everyone in the dark, make them guess/help you and, when you fix something, never come back and keep the fix only for yourself?
Well, given my very verbose thread start, including oscillograms, it is not my intention to leave everyone in the dark.
That's indeed a good start!
My 'fix' as it is now doesn't fix any real problem. It's not finished yet. As it looks now, it makes the JTAG connection unreliable. Data is getting corrupted when it's read or written. However, the system no longer hangs.
That IS progress.
I will post a patch when I've found a working solution. It's in my best interest to have it tested and reviewed by you guys, as you understand the clock tree of this SoC a lot better than I. Besides that, I'd like to have it in the mainline as well, so we don't have to maintain our patches.
But we can also test and review the current solution ;-)

My 'fix' as it is now doesn't fix any real problem. It's not finished yet. As it looks now, it makes the JTAG connection unreliable. Data is getting corrupted when it's read or written. However, the system no longer hangs.
That IS progress.
Progress is made, but no completely working solution is found yet.
I've just found out that registers hw_clkctrl_frac0 and hw_clkctrl_frac1 should be accessed as bytes only. It's in the manual (page 886 and 887), but I completely missed that the first 10 times. We access hw_clkctrl_frac0 as a word, though I doubt if this has any serious consequences.
I will post a patch when I've found a working solution. It's in my best interest to have it tested and reviewed by you guys, as you understand the clock tree of this SoC a lot better than I. Besides that, I'd like to have it in the mainline as well, so we don't have to maintain our patches.
But we can also test and review the current solution ;-)
I hope I can offer a patch up for review later today, but definitely before the end of the week.
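The byte-only access mentioned above for hw_clkctrl_frac0 can be illustrated with a small model. This is a sketch under assumptions, not the U-Boot implementation: the register is modelled as four separate bytes, the function name is hypothetical, and the CPU field is assumed to be the lowest byte (offset 0 on the little-endian i.MX28), as suggested by the 0x92929292 -> 0x92929212 write earlier in the thread.
```c
#include <stdint.h>

/* Gate bit within the CPU byte, inferred from 0x92 -> 0x12. */
#define FRAC0_CLKGATECPU 0x80u

/*
 * Clear the CPU clock gate using a byte-wide read-modify-write.
 * Only byte 0 (the assumed CPU field) is accessed; the three other
 * fields of the register are neither read nor written.
 */
static void frac0_ungate_cpu_bytewise(volatile uint8_t frac0[4])
{
	frac0[0] &= (uint8_t)~FRAC0_CLKGATECPU;
}
```
On real hardware the same idea would use a byte-wide accessor on the register address instead of a word access, so that the neighbouring fields are left alone.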

Hi Robert,
On Wed, Jan 25, 2012 at 2:36 PM, Robert Deliën robert@delien.nl wrote:
Hi Fabio,
Could you please post a patch with your proposed change so that we can test it?
I was hoping for a suggestion from you, as you know this SoC far better than me. Currently I am trying different solutions. Even though they prevent the system from hanging up, they still don't enable me to step through the code. And since there's no problem with normal operation, I think they aren't worth anything unless they fix instruction stepping.
But as Marek says, instruction stepping through this section can fail for another reason.
I talked to some folks who reported that RV-ICE JTAG works fine on mx28. I will send you their init script offline, so that you can compare it with your BDI 3000 script.
Regards,
Fabio Estevam

Hi Fabio, did they also test stepping through the power_init() ?
M
Regards,
Fabio Estevam

On Mon, Jan 30, 2012 at 7:53 PM, Marek Vasut marek.vasut@gmail.com wrote:
Hi Fabio, did they also test stepping through the power_init() ?
They tested stepping using their standalone code, not U-Boot.
Regards,
Fabio Estevam

On Mon, Jan 30, 2012 at 7:53 PM, Marek Vasut marek.vasut@gmail.com wrote:
Hi Fabio, did they also test stepping through the power_init() ?
They tested stepping using their standalone code, not U-Boot.
Regards,
Fabio Estevam
Can you jab me on G? I'm on the problem and I have a possible suspect.

Hi Fabio,
Could you please post a patch with your proposed change so that we can test it?
Still working on it. Got delayed by an incompatibility between the SoC and an SD-Card controller. I'm the only software developer currently on this project, so I switch back and forth all the time. Sorry, but you'll get it when I have it.
I talked to some folks who reported that RV-ICE JTAG works fine on mx28. I will send you their init script offline, so that you can compare it with your BDI 3000 script.
That would be great! But I'm pretty sure they won't be able to step through the pll bypass disabling in spl_power, because that is stalling the CPU clock and disables JTAG.
But I'll give it a good look; perhaps they've come up with something I didn't notice. Thanks a lot!
Cheers,
Robert.

Hi Robert,
On 1/25/12, Marek Vasut marek.vasut@gmail.com wrote:
Shouldn't we configure clkctrl_frac0 - or at least disable CPU clock gating - before disabling PLL bypass?
This seems reasonable. Fabio, can you comment?
Could you please post a patch with your proposed change so that we can test it?
Hi Fabio,
I bought a really crappy custom board a few days ago (some china-made crap) sporting an mx287, and apparently I'm hitting an issue similar to the one you have here.
When I swap power_init and mem_init though, the board boots fine, otherwise it hangs.
M

On Thu, Jan 26, 2012 at 4:32 PM, Marek Vasut marek.vasut@gmail.com wrote:
Hi Fabio,
I bought a really crappy custom board a few days ago (some china-made crap) sporting an mx287, and apparently I'm hitting an issue similar to the one you have here.
When I swap power_init and mem_init though, the board boots fine, otherwise it hangs.
Ok, so it looks like this is not a compiler-related issue then.
Do you know if the m28 board can reboot fine or not?
Regards,
Fabio Estevam

On Thu, Jan 26, 2012 at 4:32 PM, Marek Vasut marek.vasut@gmail.com wrote:
Hi Fabio,
I bought a really crappy custom board a few days ago (some china-made crap) sporting an mx287, and apparently I'm hitting an issue similar to the one you have here.
When I swap power_init and mem_init though, the board boots fine, otherwise it hangs.
Ok, so it looks like this is not a compiler-related issue then.
Do you know if the m28 board can reboot fine or not?
Yes, M28 is ok.

When I swap power_init and mem_init though, the board boots fine, otherwise it hangs.
Ok, so it looks like this is not a compiler-related issue then.
I will try this too. I've noticed that mem_init also touches PLL bypass, etc. I'm starting to wonder if we need to touch PLL bypass in power_init at all.
Do you know if the m28 board can reboot fine or not?
I have noticed this behaviour on the mx28evk too, but only under more obscure circumstances. I'll try to reproduce it, so I can be a bit more specific.
Cheers,
Robert.
participants (3)
- Fabio Estevam
- Marek Vasut
- Robert Deliën