Re: [PATCH 2/6] clk: actions: Add SD/MMC clocks

On 13/12/2020 09:44, Amit Singh Tomar wrote:
Hi,
This commit adds SD/MMC clocks, and provides .set/get_rate callbacks for SD/MMC device present on Actions OWL S700 SoCs.
Signed-off-by: Amit Singh Tomar amittomer25@gmail.com
drivers/clk/owl/clk_owl.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++ drivers/clk/owl/clk_owl.h | 2 ++ 2 files changed, 68 insertions(+)
diff --git a/drivers/clk/owl/clk_owl.c b/drivers/clk/owl/clk_owl.c index c9bc5c2..49a492c 100644 --- a/drivers/clk/owl/clk_owl.c +++ b/drivers/clk/owl/clk_owl.c @@ -92,6 +92,9 @@ int owl_clk_enable(struct clk *clk) setbits_le32(priv->base + CMU_DEVCLKEN1, CMU_DEVCLKEN1_ETH); setbits_le32(priv->base + CMU_ETHERNETPLL, 5); break;
- case CLK_SD0:
setbits_le32(priv->base + CMU_DEVCLKEN0, CMU_DEVCLKEN0_SD0);
default: return -EINVAL; }break;
@@ -121,6 +124,9 @@ int owl_clk_disable(struct clk *clk) case CLK_ETHERNET: clrbits_le32(priv->base + CMU_DEVCLKEN1, CMU_DEVCLKEN1_ETH); break;
- case CLK_SD0:
clrbits_le32(priv->base + CMU_DEVCLKEN0, CMU_DEVCLKEN0_SD0);
default: return -EINVAL; }break;
@@ -128,12 +134,69 @@ int owl_clk_disable(struct clk *clk) return 0; }
+static ulong owl_get_sd_clk_rate(struct owl_clk_priv *priv, int sd_index) +{
- ulong parent_rate;
- uint div = 1;
- /* Clock output of DEV_PLL
* Range: 48M ~ 756M
* Frequency= DEVPLLCLK * 6
*/
- parent_rate = readl(priv->base + CMU_DEVPLL) & 0x7f;
According to the datasheet the clock source could also be NAND_PLL, depending on bit 9. Both PLLs use the same rate calculation, so it's just matter of the PLL address offset to use for covering both.
- parent_rate *= 6000000;
- div += readl(priv->base + (CMU_SD0CLK + sd_index*0x4)) & 0x1f;
- return (parent_rate / div);
+}
+static ulong owl_set_sd_clk_rate(struct owl_clk_priv *priv, ulong rate,
int sd_index)
+{
- ulong parent_rate;
- uint div = 1, val;
Unneeded initialisation, you overwrite this below.
- /* Clock output of DEV_PLL
* Range: 48M ~ 756M
* Frequency= DEVPLLCLK * 6
*/
- parent_rate = readl(priv->base + CMU_DEVPLL) & 0x7f;
- parent_rate *= 6000000;
- rate *= 2;
Where does this come from?
- div = (parent_rate / rate) - 1;
Please check rate being not zero before you dividing by it. And I would keep the real divider value for now, so drop the -1 here.
- if (div >= 128)
div |= 0x100;
This does not seem right. You would need to divide the value by 128, once you decided to use the "divide-by-128" bit (which is bit 8). What about moving this below, after you know the register value? Then you can divide by 128 and set bit 8.
- val = readl(priv->base + (CMU_SD0CLK + sd_index*0x4));
- /* Bits 4..0 is used to program div value */
- val &= ~0x1f;
- val |= div;
Please mask div before applying.
- /* As per Manuals Bits 31..10 are reserved but Bits 11 and
* 10 needed to be set for proper operation
*/
- val |= 0xc00;
Where does this come from? I don't see the Linux driver doing that?
- /* Bit 9 and 8 must be cleared */
Bit 9 is the parent clock selection, which you hard code to DEV_CLK. Bit 8 is the divide-by-128 bit. Please split those two up and explain it in a comment, maybe.
Cheers, Andre
- if (div < 128)
val &= ~0x300;
- else
val &= ~0x200;
- writel(val, priv->base + (CMU_SD0CLK + sd_index*0x4));
- return owl_get_sd_clk_rate(priv, 0);
+}
static ulong owl_clk_get_rate(struct clk *clk) { struct owl_clk_priv *priv = dev_get_priv(clk->dev); ulong rate;
switch (clk->id) {
- case CLK_SD0:
rate = owl_get_sd_clk_rate(priv, 0);
default: return -ENOENT; }break;
@@ -147,6 +210,9 @@ static ulong owl_clk_set_rate(struct clk *clk, ulong rate) ulong new_rate;
switch (clk->id) {
- case CLK_SD0:
new_rate = owl_set_sd_clk_rate(priv, rate, 0);
default: return -ENOENT; }break;
diff --git a/drivers/clk/owl/clk_owl.h b/drivers/clk/owl/clk_owl.h index a01f81a..ee5eba4 100644 --- a/drivers/clk/owl/clk_owl.h +++ b/drivers/clk/owl/clk_owl.h @@ -62,4 +62,6 @@ struct owl_clk_priv { #define CMU_DEVCLKEN1_UART5 BIT(21) #define CMU_DEVCLKEN1_UART3 BIT(11)
+#define CMU_DEVCLKEN0_SD0 BIT(22)
#endif

Hi,
Thanks for having the detailed look and providing comments:
According to the datasheet the clock source could also be NAND_PLL, depending on bit 9. Both PLLs use the same rate calculation, so it's just matter of the PLL address offset to use for covering both.
Ok, should I change the comment as Clock output of DEV/NAND_PLL ?
/* Clock output of DEV_PLL
* Range: 48M ~ 756M
* Frequency= DEVPLLCLK * 6
*/
parent_rate = readl(priv->base + CMU_DEVPLL) & 0x7f;
parent_rate *= 6000000;
rate *= 2;
Where does this come from?
While experimenting about it, I found that doubling input rate doubles the speed , and BSP code does the same.
Not sure if it is a good enough explanation or maybe its side effect of not programming divider value well when its >=128. Would check it again.
div = (parent_rate / rate) - 1;
Please check rate being not zero before you dividing by it. And I would keep the real divider value for now, so drop the -1 here.
Ok. I would make the check.
if (div >= 128)
div |= 0x100;
This does not seem right. You would need to divide the value by 128, once you decided to use the "divide-by-128" bit (which is bit 8). What about moving this below, after you know the register value? Then you can divide by 128 and set bit 8.
Sure, Thanks for explaining it in detail. It helped understanding ACTIONS not so usual clock design.
val = readl(priv->base + (CMU_SD0CLK + sd_index*0x4));
/* Bits 4..0 is used to program div value */
val &= ~0x1f;
val |= div;
Please mask div before applying.
But I am masking the first 4.0 bit above , ~0x1f;
/* As per Manuals Bits 31..10 are reserved but Bits 11 and
* 10 needed to be set for proper operation
*/
val |= 0xc00;
Where does this come from? I don't see the Linux driver doing that?
Actually if I don't set 11th and 10th bit , I see SD speed of 2KiB/s and unstable behaviour while loading files from the card. BSP U-boot driver chooses a value 0xfffffce0 for it.
Thanks Amit

On 14/12/2020 14:12, Amit Tomer wrote:
Hi,
Thanks for having the detailed look and providing comments:
According to the datasheet the clock source could also be NAND_PLL, depending on bit 9. Both PLLs use the same rate calculation, so it's just matter of the PLL address offset to use for covering both.
Ok, should I change the comment as Clock output of DEV/NAND_PLL ?
I meant you should read the parent rate either from the NAND or the DEV PLL, depending on this bit. Since you force the parent to be DEV that's not strictly needed for this current use case, but it's much cleaner to do so in the *clock* driver, and as I said, is not hard to do.
/* Clock output of DEV_PLL
* Range: 48M ~ 756M
* Frequency= DEVPLLCLK * 6
*/
parent_rate = readl(priv->base + CMU_DEVPLL) & 0x7f;
parent_rate *= 6000000;
rate *= 2;
Where does this come from?
While experimenting about it, I found that doubling input rate doubles the speed , and BSP code does the same.
Sure doubling the input frequency improves the performance, but is also wrong. If the card can do twice as much *reliably*, then it would advertise more advanced speed modes, and we could use them. Otherwise it's just overclocking. You should check what the card reports (SDR50?), then check which clock value you set, and what the performance is (for reading a bigger number of sectors). If that matches up (SDR50 => 50 MHz => ~25MB/s), then it's correct. Everything else should be investigated.
Not sure if it is a good enough explanation or maybe its side effect of not programming divider value well when its >=128. Would check it again.
This explanation sounds very dodgy. There might be a hidden divider somewhere, but this should be discoverable by the test I described above. And since the "divide-by-128" is only needed for the initial 400KHz communication, I don't think it's related here.
div = (parent_rate / rate) - 1;
Please check rate being not zero before you dividing by it. And I would keep the real divider value for now, so drop the -1 here.
Ok. I would make the check.
if (div >= 128)
div |= 0x100;
This does not seem right. You would need to divide the value by 128, once you decided to use the "divide-by-128" bit (which is bit 8). What about moving this below, after you know the register value? Then you can divide by 128 and set bit 8.
Sure, Thanks for explaining it in detail. It helped understanding ACTIONS not so usual clock design.
val = readl(priv->base + (CMU_SD0CLK + sd_index*0x4));
/* Bits 4..0 is used to program div value */
val &= ~0x1f;
val |= div;
Please mask div before applying.
But I am masking the first 4.0 bit above , ~0x1f;
No, you are masking val, not div. If div is bigger than 31, you set other bits in this register.
/* As per Manuals Bits 31..10 are reserved but Bits 11 and
* 10 needed to be set for proper operation
*/
val |= 0xc00;
Where does this come from? I don't see the Linux driver doing that?
Actually if I don't set 11th and 10th bit , I see SD speed of 2KiB/s and unstable behaviour while loading files from the card. BSP U-boot driver chooses a value 0xfffffce0 for it.
I wouldn't rely on downstream BSP U-Boot for *anything*. I guess you see normal speed in (almost) mainline Linux? Check what we do with the clock register there (can dump it once the driver has initialised). I would rather mimic the mainline kernel here than anything else.
Cheers, Andre
participants (2)
-
Amit Tomer
-
André Przywara