[U-Boot] [PATCH v1] mmc: dwmmc: Enable small delay before returning error

From: "Ang, Chee Hong" chee.hong.ang@intel.com
'SET_BLOCKLEN' may occasionally fail on first attempt. This patch enable a small delay in dwmci_send_cmd() on busy, I/O or CRC error to allow the MMC controller recovers from the failure/error on subsequent retries.
Signed-off-by: Ang, Chee Hong chee.hong.ang@intel.com --- drivers/mmc/dw_mmc.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/mmc/dw_mmc.c b/drivers/mmc/dw_mmc.c index 7544b84..8dcc518 100644 --- a/drivers/mmc/dw_mmc.c +++ b/drivers/mmc/dw_mmc.c @@ -266,8 +266,11 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, if (data) flags = dwmci_set_transfer_mode(host, data);
- if ((cmd->resp_type & MMC_RSP_136) && (cmd->resp_type & MMC_RSP_BUSY)) - return -1; + if ((cmd->resp_type & MMC_RSP_136) && + (cmd->resp_type & MMC_RSP_BUSY)) { + ret = -1; + goto delay_ret; + }
if (cmd->cmdidx == MMC_CMD_STOP_TRANSMISSION) flags |= DWMCI_CMD_ABORT_STOP; @@ -316,11 +319,13 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, return -ETIMEDOUT; } else if (mask & DWMCI_INTMSK_RE) { debug("%s: Response Error.\n", __func__); - return -EIO; + ret = -EIO; + goto delay_ret; } else if ((cmd->resp_type & MMC_RSP_CRC) && (mask & DWMCI_INTMSK_RCRC)) { debug("%s: Response CRC Error.\n", __func__); - return -EIO; + ret = -EIO; + goto delay_ret; }
@@ -347,6 +352,7 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } }
+delay_ret: udelay(100);
return ret;

On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote:
From: "Ang, Chee Hong" chee.hong.ang@intel.com
'SET_BLOCKLEN' may occasionally fail on first attempt.
Why ?
This patch enable a small delay in dwmci_send_cmd() on busy, I/O or CRC error to allow the MMC controller recovers from the failure/error on subsequent retries.
Why 100 uS ?
Can we rather detect whether the controller recovered by polling some bit?
Signed-off-by: Ang, Chee Hong chee.hong.ang@intel.com
drivers/mmc/dw_mmc.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/mmc/dw_mmc.c b/drivers/mmc/dw_mmc.c index 7544b84..8dcc518 100644 --- a/drivers/mmc/dw_mmc.c +++ b/drivers/mmc/dw_mmc.c @@ -266,8 +266,11 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, if (data) flags = dwmci_set_transfer_mode(host, data);
- if ((cmd->resp_type & MMC_RSP_136) && (cmd->resp_type & MMC_RSP_BUSY))
return -1;
if ((cmd->resp_type & MMC_RSP_136) &&
(cmd->resp_type & MMC_RSP_BUSY)) {
ret = -1;
goto delay_ret;
}
if (cmd->cmdidx == MMC_CMD_STOP_TRANSMISSION) flags |= DWMCI_CMD_ABORT_STOP;
@@ -316,11 +319,13 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, return -ETIMEDOUT; } else if (mask & DWMCI_INTMSK_RE) { debug("%s: Response Error.\n", __func__);
return -EIO;
ret = -EIO;
} else if ((cmd->resp_type & MMC_RSP_CRC) && (mask & DWMCI_INTMSK_RCRC)) { debug("%s: Response CRC Error.\n", __func__);goto delay_ret;
return -EIO;
ret = -EIO;
}goto delay_ret;
@@ -347,6 +352,7 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } }
+delay_ret: udelay(100);
return ret;

On Mon, 2019-02-18 at 12:57 +0100, Marek Vasut wrote:
On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote:
From: "Ang, Chee Hong" chee.hong.ang@intel.com
'SET_BLOCKLEN' may occasionally fail on first attempt.
Why ?
This is part of the workaround of mmc driver which is enabled with 'CONFIG_MMC_QUIRKS' config. https://github.com/u-boot/u-boot/blob/dbe70c7d4e3d5c705a98d82952e05a591 efd0683/drivers/mmc/mmc.c#L272
This patch enable a small delay in dwmci_send_cmd() on busy, I/O or CRC error to allow the MMC controller recovers from the failure/error on subsequent retries.
Why 100 uS ?
This is a good question. Perhaps the driver's authors can explain this. The driver delay 100us after dwmci_send_cmd() success with the command sent. But it never delay 100us whenever mmc controller response to the command sent with I/O or CRC errors. MMC drivers expect the first 'SET_BLOCKEN' command issued by dwmci_send_cmd() may fail intermittently so it will retry up to 4 times before it gave up and return error. Without this 100us delay after error response, 'SET_BLOCKEN' may sometime fail in 4 attempts in a row. Therefore, we encountered intermittent failure in loading u-boot (SSBL) from MMC.
Can we rather detect whether the controller recovered by polling some bit?
Hmmm...I can take a look at the databook of this controller and dig further. Probably this is the limitation of the controller itself. The mmc driver code which introduce 100us delay after every command sent in dwmci_send_cmd() looks iffy.
Signed-off-by: Ang, Chee Hong chee.hong.ang@intel.com
drivers/mmc/dw_mmc.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/mmc/dw_mmc.c b/drivers/mmc/dw_mmc.c index 7544b84..8dcc518 100644 --- a/drivers/mmc/dw_mmc.c +++ b/drivers/mmc/dw_mmc.c @@ -266,8 +266,11 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, if (data) flags = dwmci_set_transfer_mode(host, data);
- if ((cmd->resp_type & MMC_RSP_136) && (cmd->resp_type &
MMC_RSP_BUSY))
return -1;
- if ((cmd->resp_type & MMC_RSP_136) &&
- (cmd->resp_type & MMC_RSP_BUSY)) {
ret = -1;
goto delay_ret;
- }
if (cmd->cmdidx == MMC_CMD_STOP_TRANSMISSION) flags |= DWMCI_CMD_ABORT_STOP; @@ -316,11 +319,13 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, return -ETIMEDOUT; } else if (mask & DWMCI_INTMSK_RE) { debug("%s: Response Error.\n", __func__);
return -EIO;
ret = -EIO;
goto delay_ret;
} else if ((cmd->resp_type & MMC_RSP_CRC) && (mask & DWMCI_INTMSK_RCRC)) { debug("%s: Response CRC Error.\n", __func__);
return -EIO;
ret = -EIO;
goto delay_ret;
} @@ -347,6 +352,7 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } } +delay_ret: udelay(100); return ret;

On 2/18/19 3:51 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 12:57 +0100, Marek Vasut wrote:
On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote:
From: "Ang, Chee Hong" chee.hong.ang@intel.com
'SET_BLOCKLEN' may occasionally fail on first attempt.
Why ?
This is part of the workaround of mmc driver which is enabled with 'CONFIG_MMC_QUIRKS' config. https://github.com/u-boot/u-boot/blob/dbe70c7d4e3d5c705a98d82952e05a591 efd0683/drivers/mmc/mmc.c#L272
OK, let's take a step back. What problem do you observe, that you're trying to fix ?
This patch enable a small delay in dwmci_send_cmd() on busy, I/O or CRC error to allow the MMC controller recovers from the failure/error on subsequent retries.
Why 100 uS ?
This is a good question. Perhaps the driver's authors can explain this. The driver delay 100us after dwmci_send_cmd() success with the command sent. But it never delay 100us whenever mmc controller response to the command sent with I/O or CRC errors. MMC drivers expect the first 'SET_BLOCKEN' command issued by dwmci_send_cmd() may fail intermittently so it will retry up to 4 times before it gave up and return error. Without this 100us delay after error response, 'SET_BLOCKEN' may sometime fail in 4 attempts in a row. Therefore, we encountered intermittent failure in loading u-boot (SSBL) from MMC.
Can you be more specific about the failure you saw ?
Can we rather detect whether the controller recovered by polling some bit?
Hmmm...I can take a look at the databook of this controller and dig further. Probably this is the limitation of the controller itself. The mmc driver code which introduce 100us delay after every command sent in dwmci_send_cmd() looks iffy.
If you have access to it, please do.
btw do you have the same problem if you disable caches ?
Signed-off-by: Ang, Chee Hong chee.hong.ang@intel.com
drivers/mmc/dw_mmc.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/mmc/dw_mmc.c b/drivers/mmc/dw_mmc.c index 7544b84..8dcc518 100644 --- a/drivers/mmc/dw_mmc.c +++ b/drivers/mmc/dw_mmc.c @@ -266,8 +266,11 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, if (data) flags = dwmci_set_transfer_mode(host, data);
- if ((cmd->resp_type & MMC_RSP_136) && (cmd->resp_type &
MMC_RSP_BUSY))
return -1;
- if ((cmd->resp_type & MMC_RSP_136) &&
- (cmd->resp_type & MMC_RSP_BUSY)) {
ret = -1;
goto delay_ret;
- }
if (cmd->cmdidx == MMC_CMD_STOP_TRANSMISSION) flags |= DWMCI_CMD_ABORT_STOP; @@ -316,11 +319,13 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, return -ETIMEDOUT; } else if (mask & DWMCI_INTMSK_RE) { debug("%s: Response Error.\n", __func__);
return -EIO;
ret = -EIO;
goto delay_ret;
} else if ((cmd->resp_type & MMC_RSP_CRC) && (mask & DWMCI_INTMSK_RCRC)) { debug("%s: Response CRC Error.\n", __func__);
return -EIO;
ret = -EIO;
goto delay_ret;
} @@ -347,6 +352,7 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } } +delay_ret: udelay(100); return ret;

On Mon, 2019-02-18 at 21:38 +0100, Marek Vasut wrote:
On 2/18/19 3:51 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 12:57 +0100, Marek Vasut wrote:
On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote:
From: "Ang, Chee Hong" chee.hong.ang@intel.com
'SET_BLOCKLEN' may occasionally fail on first attempt.
Why ?
This is part of the workaround of mmc driver which is enabled with 'CONFIG_MMC_QUIRKS' config. https://github.com/u-boot/u-boot/blob/dbe70c7d4e3d5c705a98d82952e05 a591 efd0683/drivers/mmc/mmc.c#L272
OK, let's take a step back. What problem do you observe, that you're trying to fix ?
See my detail explanation below.
This patch enable a small delay in dwmci_send_cmd() on busy, I/O or CRC error to allow the MMC controller recovers from the failure/error on subsequent retries.
Why 100 uS ?
This is a good question. Perhaps the driver's authors can explain this. The driver delay 100us after dwmci_send_cmd() success with the command sent. But it never delay 100us whenever mmc controller response to the command sent with I/O or CRC errors. MMC drivers expect the first 'SET_BLOCKEN' command issued by dwmci_send_cmd() may fail intermittently so it will retry up to 4 times before it gave up and return error. Without this 100us delay after error response, 'SET_BLOCKEN' may sometime fail in 4 attempts in a row. Therefore, we encountered intermittent failure in loading u-boot (SSBL) from MMC.
Can you be more specific about the failure you saw ?
Here are the steps for booting u-boot (SSBL) from MMC:
1) SPL (FSBL) get loaded to OCRAM 2) SPL try to load uboot as SSBL from MMC 3) SPL issue 'SET_BLOCKEN' to MMC controller 4) If MMC controller response with error continue to step 5 else go to step 7 5) If is less than 4 attempts go to step 3 else continue to step 6 6) Indicate to user there was an error loading uboot as SSBL from MMC and stop here 7) Send command to read blocks of data from MMC 8) uboot successfully loaded to DDR memory 9) SPL jumps to uboot and continue the boot flow.
So this patch actually introduce a small delay at step 4. Without this small delay, step 4 might sometime fail in all 4 attempts to issue the 'SET_BLOCKEN' command to MMC controller. Note that this failure is intermittent. If it fails at step 3 and there is no small delay in step 4, it will most likely fail in subsequent attempts.
Can we rather detect whether the controller recovered by polling some bit?
Hmmm...I can take a look at the databook of this controller and dig further. Probably this is the limitation of the controller itself. The mmc driver code which introduce 100us delay after every command sent in dwmci_send_cmd() looks iffy.
If you have access to it, please do.
Unfortunately, I can't find any registers I can poll for the status of the readiness of the controller from DesignWare storage controller databook.
btw do you have the same problem if you disable caches ?
The error happen while SPL trying to load uboot as SSBL from MMC into DDR memory.At this point, the caches are disabled.
Signed-off-by: Ang, Chee Hong chee.hong.ang@intel.com
drivers/mmc/dw_mmc.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/mmc/dw_mmc.c b/drivers/mmc/dw_mmc.c index 7544b84..8dcc518 100644 --- a/drivers/mmc/dw_mmc.c +++ b/drivers/mmc/dw_mmc.c @@ -266,8 +266,11 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, if (data) flags = dwmci_set_transfer_mode(host, data);
- if ((cmd->resp_type & MMC_RSP_136) && (cmd->resp_type
& MMC_RSP_BUSY))
return -1;
- if ((cmd->resp_type & MMC_RSP_136) &&
- (cmd->resp_type & MMC_RSP_BUSY)) {
ret = -1;
goto delay_ret;
- }
if (cmd->cmdidx == MMC_CMD_STOP_TRANSMISSION) flags |= DWMCI_CMD_ABORT_STOP; @@ -316,11 +319,13 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, return -ETIMEDOUT; } else if (mask & DWMCI_INTMSK_RE) { debug("%s: Response Error.\n", __func__);
return -EIO;
ret = -EIO;
goto delay_ret;
} else if ((cmd->resp_type & MMC_RSP_CRC) && (mask & DWMCI_INTMSK_RCRC)) { debug("%s: Response CRC Error.\n", __func__);
return -EIO;
ret = -EIO;
goto delay_ret;
} @@ -347,6 +352,7 @@ static int dwmci_send_cmd(struct mmc *mmc, struct mmc_cmd *cmd, } } +delay_ret: udelay(100); return ret;

On 2/20/19 2:57 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 21:38 +0100, Marek Vasut wrote:
On 2/18/19 3:51 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 12:57 +0100, Marek Vasut wrote:
On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote:
From: "Ang, Chee Hong" chee.hong.ang@intel.com
'SET_BLOCKLEN' may occasionally fail on first attempt.
Why ?
This is part of the workaround of mmc driver which is enabled with 'CONFIG_MMC_QUIRKS' config. https://github.com/u-boot/u-boot/blob/dbe70c7d4e3d5c705a98d82952e05 a591 efd0683/drivers/mmc/mmc.c#L272
OK, let's take a step back. What problem do you observe, that you're trying to fix ?
See my detail explanation below.
This patch enable a small delay in dwmci_send_cmd() on busy, I/O or CRC error to allow the MMC controller recovers from the failure/error on subsequent retries.
Why 100 uS ?
This is a good question. Perhaps the driver's authors can explain this. The driver delay 100us after dwmci_send_cmd() success with the command sent. But it never delay 100us whenever mmc controller response to the command sent with I/O or CRC errors. MMC drivers expect the first 'SET_BLOCKEN' command issued by dwmci_send_cmd() may fail intermittently so it will retry up to 4 times before it gave up and return error. Without this 100us delay after error response, 'SET_BLOCKEN' may sometime fail in 4 attempts in a row. Therefore, we encountered intermittent failure in loading u-boot (SSBL) from MMC.
Can you be more specific about the failure you saw ?
Here are the steps for booting u-boot (SSBL) from MMC:
- SPL (FSBL) get loaded to OCRAM
- SPL try to load uboot as SSBL from MMC
- SPL issue 'SET_BLOCKEN' to MMC controller
s/SPL/MMC framework via DWMMC driver/
- If MMC controller response with error continue to step 5 else go to
step 7 5) If is less than 4 attempts go to step 3 else continue to step 6 6) Indicate to user there was an error loading uboot as SSBL from MMC and stop here 7) Send command to read blocks of data from MMC 8) uboot successfully loaded to DDR memory 9) SPL jumps to uboot and continue the boot flow.
So this patch actually introduce a small delay at step 4. Without this small delay, step 4 might sometime fail in all 4 attempts to issue the 'SET_BLOCKEN' command to MMC controller.
Isn't what you observe here some sort of timeout which you need to respect when the card rejects a command ? If so, such timeout should be described in either of the SD or MMC specification and it should be in the MMC framework core instead of a driver.
Note that this failure is intermittent. If it fails at step 3 and there is no small delay in step 4, it will most likely fail in subsequent attempts.
Can we rather detect whether the controller recovered by polling some bit?
Hmmm...I can take a look at the databook of this controller and dig further. Probably this is the limitation of the controller itself. The mmc driver code which introduce 100us delay after every command sent in dwmci_send_cmd() looks iffy.
If you have access to it, please do.
Unfortunately, I can't find any registers I can poll for the status of the readiness of the controller from DesignWare storage controller databook.
btw do you have the same problem if you disable caches ?
The error happen while SPL trying to load uboot as SSBL from MMC into DDR memory.At this point, the caches are disabled.
I-Cache is enabled in SPL, D-cache _might_ be too.
Do you get this error after cold boot too ?

On Thu, 2019-02-21 at 11:06 +0100, Marek Vasut wrote:
On 2/20/19 2:57 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 21:38 +0100, Marek Vasut wrote:
On 2/18/19 3:51 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 12:57 +0100, Marek Vasut wrote:
On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote:
From: "Ang, Chee Hong" chee.hong.ang@intel.com
'SET_BLOCKLEN' may occasionally fail on first attempt.
Why ?
This is part of the workaround of mmc driver which is enabled with 'CONFIG_MMC_QUIRKS' config. https://github.com/u-boot/u-boot/blob/dbe70c7d4e3d5c705a98d8295 2e05 a591 efd0683/drivers/mmc/mmc.c#L272
OK, let's take a step back. What problem do you observe, that you're trying to fix ?
See my detail explanation below.
This patch enable a small delay in dwmci_send_cmd() on busy, I/O or CRC error to allow the MMC controller recovers from the failure/error on subsequent retries.
Why 100 uS ?
This is a good question. Perhaps the driver's authors can explain this. The driver delay 100us after dwmci_send_cmd() success with the command sent. But it never delay 100us whenever mmc controller response to the command sent with I/O or CRC errors. MMC drivers expect the first 'SET_BLOCKEN' command issued by dwmci_send_cmd() may fail intermittently so it will retry up to 4 times before it gave up and return error. Without this 100us delay after error response, 'SET_BLOCKEN' may sometime fail in 4 attempts in a row. Therefore, we encountered intermittent failure in loading u-boot (SSBL) from MMC.
Can you be more specific about the failure you saw ?
Here are the steps for booting u-boot (SSBL) from MMC:
- SPL (FSBL) get loaded to OCRAM
- SPL try to load uboot as SSBL from MMC
- SPL issue 'SET_BLOCKEN' to MMC controller
s/SPL/MMC framework via DWMMC driver/
Yes. It is done in drivers/mmc/mmc.c (Line 276)
- If MMC controller response with error continue to step 5 else go
to step 7 5) If is less than 4 attempts go to step 3 else continue to step 6 6) Indicate to user there was an error loading uboot as SSBL from MMC and stop here 7) Send command to read blocks of data from MMC 8) uboot successfully loaded to DDR memory 9) SPL jumps to uboot and continue the boot flow.
So this patch actually introduce a small delay at step 4. Without this small delay, step 4 might sometime fail in all 4 attempts to issue the 'SET_BLOCKEN' command to MMC controller.
Isn't what you observe here some sort of timeout which you need to respect when the card rejects a command ? If so, such timeout should be described in either of the SD or MMC specification and it should be in the MMC framework core instead of a driver.
I don't have chance to test other MMC drivers and I am not sure other MMC drivers need such small delay for next retry if the card reject this command. This DW MMC driver already enforce a small delay even the command is success, so this makes me think that this 'time out' issue is only specific to this DW MMC driver. Is it necessary to 'enforce' this small delay in common MMC framework (affecting all other MMC drivers as well) as shown below:
drivers/mmc/mmc.c
#ifdef CONFIG_MMC_QUIRKS if (err && (mmc->quirks & MMC_QUIRK_RETRY_SET_BLOCKLEN)) { int retries = 4; /* * It has been seen that SET_BLOCKLEN may fail on the first * attempt, let's try a few more time */ do { err = mmc_send_cmd(mmc, &cmd, NULL); if (!err) break; + /* Enforce small delay before retry */ + udelay(100); } while (retries--); } #endif
Note that this failure is intermittent. If it fails at step 3 and there is no small delay in step 4, it will most likely fail in subsequent attempts.
Can we rather detect whether the controller recovered by polling some bit?
Hmmm...I can take a look at the databook of this controller and dig further. Probably this is the limitation of the controller itself. The mmc driver code which introduce 100us delay after every command sent in dwmci_send_cmd() looks iffy.
If you have access to it, please do.
Unfortunately, I can't find any registers I can poll for the status of the readiness of the controller from DesignWare storage controller databook.
btw do you have the same problem if you disable caches ?
The error happen while SPL trying to load uboot as SSBL from MMC into DDR memory.At this point, the caches are disabled.
I-Cache is enabled in SPL, D-cache _might_ be too.
Do you get this error after cold boot too ?
Yes. It happened after cold boot too.

On 2/22/19 4:19 PM, Ang, Chee Hong wrote:
On Thu, 2019-02-21 at 11:06 +0100, Marek Vasut wrote:
On 2/20/19 2:57 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 21:38 +0100, Marek Vasut wrote:
On 2/18/19 3:51 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 12:57 +0100, Marek Vasut wrote:
On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote: > > > > From: "Ang, Chee Hong" chee.hong.ang@intel.com > > 'SET_BLOCKLEN' may occasionally fail on first attempt. Why ?
This is part of the workaround of mmc driver which is enabled with 'CONFIG_MMC_QUIRKS' config. https://github.com/u-boot/u-boot/blob/dbe70c7d4e3d5c705a98d8295 2e05 a591 efd0683/drivers/mmc/mmc.c#L272
OK, let's take a step back. What problem do you observe, that you're trying to fix ?
See my detail explanation below.
> > > This patch enable a small delay in dwmci_send_cmd() on > busy, I/O or CRC error to allow the MMC controller recovers > from the failure/error on subsequent retries. Why 100 uS ?
This is a good question. Perhaps the driver's authors can explain this. The driver delay 100us after dwmci_send_cmd() success with the command sent. But it never delay 100us whenever mmc controller response to the command sent with I/O or CRC errors. MMC drivers expect the first 'SET_BLOCKEN' command issued by dwmci_send_cmd() may fail intermittently so it will retry up to 4 times before it gave up and return error. Without this 100us delay after error response, 'SET_BLOCKEN' may sometime fail in 4 attempts in a row. Therefore, we encountered intermittent failure in loading u-boot (SSBL) from MMC.
Can you be more specific about the failure you saw ?
Here are the steps for booting u-boot (SSBL) from MMC:
- SPL (FSBL) get loaded to OCRAM
- SPL try to load uboot as SSBL from MMC
- SPL issue 'SET_BLOCKEN' to MMC controller
s/SPL/MMC framework via DWMMC driver/
Yes. It is done in drivers/mmc/mmc.c (Line 276)
- If MMC controller response with error continue to step 5 else go
to step 7 5) If is less than 4 attempts go to step 3 else continue to step 6 6) Indicate to user there was an error loading uboot as SSBL from MMC and stop here 7) Send command to read blocks of data from MMC 8) uboot successfully loaded to DDR memory 9) SPL jumps to uboot and continue the boot flow.
So this patch actually introduce a small delay at step 4. Without this small delay, step 4 might sometime fail in all 4 attempts to issue the 'SET_BLOCKEN' command to MMC controller.
Isn't what you observe here some sort of timeout which you need to respect when the card rejects a command ? If so, such timeout should be described in either of the SD or MMC specification and it should be in the MMC framework core instead of a driver.
I don't have chance to test other MMC drivers and I am not sure other MMC drivers need such small delay for next retry if the card reject this command.
Does the SD or MMC JEDEC spec say something about such delay?
This DW MMC driver already enforce a small delay even the command is success, so this makes me think that this 'time out' issue is only specific to this DW MMC driver.
So why is there that 100 uS delay ? Does the Linux DWMMC driver have such a delay too?
Is it necessary to 'enforce' this small delay in common MMC framework (affecting all other MMC drivers as well) as shown below:
drivers/mmc/mmc.c
#ifdef CONFIG_MMC_QUIRKS if (err && (mmc->quirks & MMC_QUIRK_RETRY_SET_BLOCKLEN)) { int retries = 4; /* * It has been seen that SET_BLOCKLEN may fail on the first * attempt, let's try a few more time */ do { err = mmc_send_cmd(mmc, &cmd, NULL); if (!err) break;
/* Enforce small delay before retry */
} while (retries--); }udelay(100);
#endif
Expanding the CC to other DWMMC users, maybe their datasheets contain something about the 100 uS delay.
Note that this failure is intermittent. If it fails at step 3 and there is no small delay in step 4, it will most likely fail in subsequent attempts.
Can we rather detect whether the controller recovered by polling some bit?
Hmmm...I can take a look at the databook of this controller and dig further. Probably this is the limitation of the controller itself. The mmc driver code which introduce 100us delay after every command sent in dwmci_send_cmd() looks iffy.
If you have access to it, please do.
Unfortunately, I can't find any registers I can poll for the status of the readiness of the controller from DesignWare storage controller databook.
btw do you have the same problem if you disable caches ?
The error happen while SPL trying to load uboot as SSBL from MMC into DDR memory.At this point, the caches are disabled.
I-Cache is enabled in SPL, D-cache _might_ be too.
Do you get this error after cold boot too ?
Yes. It happened after cold boot too.
Hmmmm, try applying
[PATCH V2] ARM: socfpga: Clear PL310 early in SPL
I wonder if this might be what you're hitting.

On Fri, 2019-02-22 at 17:02 +0100, Marek Vasut wrote:
On 2/22/19 4:19 PM, Ang, Chee Hong wrote:
On Thu, 2019-02-21 at 11:06 +0100, Marek Vasut wrote:
On 2/20/19 2:57 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 21:38 +0100, Marek Vasut wrote:
On 2/18/19 3:51 PM, Ang, Chee Hong wrote:
On Mon, 2019-02-18 at 12:57 +0100, Marek Vasut wrote: > > > > On 2/18/19 5:16 AM, chee.hong.ang@intel.com wrote: > > > > > > > > > > From: "Ang, Chee Hong" chee.hong.ang@intel.com > > > > 'SET_BLOCKLEN' may occasionally fail on first attempt. > Why ? This is part of the workaround of mmc driver which is enabled with 'CONFIG_MMC_QUIRKS' config. https://github.com/u-boot/u-boot/blob/dbe70c7d4e3d5c705a98d 8295 2e05 a591 efd0683/drivers/mmc/mmc.c#L272
OK, let's take a step back. What problem do you observe, that you're trying to fix ?
See my detail explanation below.
> > > > > > > > > > > This patch enable a small delay in dwmci_send_cmd() on > > busy, I/O or CRC error to allow the MMC controller > > recovers > > from the failure/error on subsequent retries. > Why 100 uS ? This is a good question. Perhaps the driver's authors can explain this. The driver delay 100us after dwmci_send_cmd() success with the command sent. But it never delay 100us whenever mmc controller response to the command sent with I/O or CRC errors. MMC drivers expect the first 'SET_BLOCKEN' command issued by dwmci_send_cmd() may fail intermittently so it will retry up to 4 times before it gave up and return error. Without this 100us delay after error response, 'SET_BLOCKEN' may sometime fail in 4 attempts in a row. Therefore, we encountered intermittent failure in loading u-boot (SSBL) from MMC.
Can you be more specific about the failure you saw ?
Here are the steps for booting u-boot (SSBL) from MMC:
- SPL (FSBL) get loaded to OCRAM
- SPL try to load uboot as SSBL from MMC
- SPL issue 'SET_BLOCKEN' to MMC controller
s/SPL/MMC framework via DWMMC driver/
Yes. It is done in drivers/mmc/mmc.c (Line 276)
- If MMC controller response with error continue to step 5
else go to step 7 5) If is less than 4 attempts go to step 3 else continue to step 6 6) Indicate to user there was an error loading uboot as SSBL from MMC and stop here 7) Send command to read blocks of data from MMC 8) uboot successfully loaded to DDR memory 9) SPL jumps to uboot and continue the boot flow.
So this patch actually introduce a small delay at step 4. Without this small delay, step 4 might sometime fail in all 4 attempts to issue the 'SET_BLOCKEN' command to MMC controller.
Isn't what you observe here some sort of timeout which you need to respect when the card rejects a command ? If so, such timeout should be described in either of the SD or MMC specification and it should be in the MMC framework core instead of a driver.
I don't have chance to test other MMC drivers and I am not sure other MMC drivers need such small delay for next retry if the card reject this command.
Does the SD or MMC JEDEC spec say something about such delay?
I checked the SD/MMC JEDEC. Nothing about this delay is mentioned in the spec.
This DW MMC driver already enforce a small delay even the command is success, so this makes me think that this 'time out' issue is only specific to this DW MMC driver.
So why is there that 100 uS delay ? Does the Linux DWMMC driver have such a delay too?
Linux DWMMC has only the delay for card detection and I think is quite common because this is needed to make sure the memory card is properly inserted before any operations can be started.
Is it necessary to 'enforce' this small delay in common MMC framework (affecting all other MMC drivers as well) as shown below:
drivers/mmc/mmc.c
#ifdef CONFIG_MMC_QUIRKS if (err && (mmc->quirks & MMC_QUIRK_RETRY_SET_BLOCKLEN)) { int retries = 4; /* * It has been seen that SET_BLOCKLEN may fail on the first * attempt, let's try a few more time */ do { err = mmc_send_cmd(mmc, &cmd, NULL); if (!err) break;
/* Enforce small delay before retry */
} while (retries--); }udelay(100);
#endif
Expanding the CC to other DWMMC users, maybe their datasheets contain something about the 100 uS delay.
Note that this failure is intermittent. If it fails at step 3 and there is no small delay in step 4, it will most likely fail in subsequent attempts.
> > > > Can we rather detect whether the controller recovered by > polling > some > bit? Hmmm...I can take a look at the databook of this controller and dig further. Probably this is the limitation of the controller itself. The mmc driver code which introduce 100us delay after every command sent in dwmci_send_cmd() looks iffy.
If you have access to it, please do.
Unfortunately, I can't find any registers I can poll for the status of the readiness of the controller from DesignWare storage controller databook.
btw do you have the same problem if you disable caches ?
The error happen while SPL trying to load uboot as SSBL from MMC into DDR memory.At this point, the caches are disabled.
I-Cache is enabled in SPL, D-cache _might_ be too.
Do you get this error after cold boot too ?
Yes. It happened after cold boot too.
Hmmmm, try applying
[PATCH V2] ARM: socfpga: Clear PL310 early in SPL
I wonder if this might be what you're hitting.
Nope. This patch only apply to Aaria 10 and this problem is found on Stratix 10 platform.

On 3/5/19 8:10 AM, Ang, Chee Hong wrote: [...]
- If MMC controller response with error continue to step 5
else go to step 7 5) If is less than 4 attempts go to step 3 else continue to step 6 6) Indicate to user there was an error loading uboot as SSBL from MMC and stop here 7) Send command to read blocks of data from MMC 8) uboot successfully loaded to DDR memory 9) SPL jumps to uboot and continue the boot flow.
So this patch actually introduce a small delay at step 4. Without this small delay, step 4 might sometime fail in all 4 attempts to issue the 'SET_BLOCKEN' command to MMC controller.
Isn't what you observe here some sort of timeout which you need to respect when the card rejects a command ? If so, such timeout should be described in either of the SD or MMC specification and it should be in the MMC framework core instead of a driver.
I don't have chance to test other MMC drivers and I am not sure other MMC drivers need such small delay for next retry if the card reject this command.
Does the SD or MMC JEDEC spec say something about such delay?
I checked the SD/MMC JEDEC. Nothing about this delay is mentioned in the spec.
OK, neither of the specs say anything, hmmmm
This DW MMC driver already enforce a small delay even the command is success, so this makes me think that this 'time out' issue is only specific to this DW MMC driver.
So why is there that 100 uS delay ? Does the Linux DWMMC driver have such a delay too?
Linux DWMMC has only the delay for card detection and I think is quite common because this is needed to make sure the memory card is properly inserted before any operations can be started.
So why doesn't Linux need this special 100 uS delay , but U-Boot does ?
Is it necessary to 'enforce' this small delay in common MMC framework (affecting all other MMC drivers as well) as shown below:
drivers/mmc/mmc.c
#ifdef CONFIG_MMC_QUIRKS if (err && (mmc->quirks & MMC_QUIRK_RETRY_SET_BLOCKLEN)) { int retries = 4; /* * It has been seen that SET_BLOCKLEN may fail on the first * attempt, let's try a few more time */ do { err = mmc_send_cmd(mmc, &cmd, NULL); if (!err) break;
/* Enforce small delay before retry */
} while (retries--); }udelay(100);
#endif
Expanding the CC to other DWMMC users, maybe their datasheets contain something about the 100 uS delay.
Note that this failure is intermittent. If it fails at step 3 and there is no small delay in step 4, it will most likely fail in subsequent attempts.
> > > >> >> >> >> Can we rather detect whether the controller recovered by >> polling >> some >> bit? > Hmmm...I can take a look at the databook of this controller > and > dig > further. Probably this is the limitation of the controller > itself. > The > mmc driver code which introduce 100us delay after every > command > sent in > dwmci_send_cmd() looks iffy. If you have access to it, please do.
Unfortunately, I can't find any registers I can poll for the status of the readiness of the controller from DesignWare storage controller databook.
btw do you have the same problem if you disable caches ?
The error happen while SPL trying to load uboot as SSBL from MMC into DDR memory.At this point, the caches are disabled.
I-Cache is enabled in SPL, D-cache _might_ be too.
Do you get this error after cold boot too ?
Yes. It happened after cold boot too.
Hmmmm, try applying
[PATCH V2] ARM: socfpga: Clear PL310 early in SPL
I wonder if this might be what you're hitting.
Nope. This patch only apply to Aaria 10 and this problem is found on Stratix 10 platform.
The patch I referenced applies to Gen5, but OK.
participants (3)
-
Ang, Chee Hong
-
chee.hong.ang@intel.com
-
Marek Vasut