[PATCH 0/6] Miscellaneous FWU fixes

The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)

The version 2 of the FWU metadata has a top level structure, followed by optional information on the updatable images. Perform some sanity checks on some of the fields in the top level structure to determine if the rest of the structure has to be read.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org --- lib/fwu_updates/fwu_v2.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index 108bc9bb4a..d0d8a25929 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -58,24 +58,6 @@ static int fwu_mdata_sanity_checks(void) struct fwu_data *data = fwu_get_data(); struct fwu_mdata *mdata = data->fwu_mdata;
- if (mdata->version != FWU_MDATA_VERSION) { - log_err("FWU metadata version %u. Expected value of %u\n", - mdata->version, FWU_MDATA_VERSION); - return -EINVAL; - } - - if (!mdata->desc_offset) { - log_err("No image information provided with the Metadata. "); - log_err("Image information expected in the metadata\n"); - return -EINVAL; - } - - if (mdata->desc_offset != 0x20) { - log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n", - mdata->desc_offset); - return -EINVAL; - } - num_banks = fwu_get_fw_desc(mdata)->num_banks; num_images = fwu_get_fw_desc(mdata)->num_images;
@@ -238,6 +220,24 @@ int fwu_init(void) return ret; }
+ if (mdata.version != FWU_MDATA_VERSION) { + log_err("FWU metadata version %u. Expected value of %u\n", + mdata.version, FWU_MDATA_VERSION); + return -EINVAL; + } + + if (!mdata.desc_offset) { + log_err("No image information provided with the Metadata. "); + log_err("Image information expected in the metadata\n"); + return -EINVAL; + } + + if (mdata.desc_offset != 0x20) { + log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n", + mdata.desc_offset); + return -EINVAL; + } + ret = fwu_mdata_copies_allocate(mdata.metadata_size); if (ret) return ret;

On 8/30/24 13:40, Sughosh Ganu wrote:
The version 2 of the FWU metadata has a top level structure, followed by optional information on the updatable images. Perform some sanity checks on some of the fields in the top level structure to determine if the rest of the structure has to be read.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/fwu_v2.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index 108bc9bb4a..d0d8a25929 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -58,24 +58,6 @@ static int fwu_mdata_sanity_checks(void) struct fwu_data *data = fwu_get_data(); struct fwu_mdata *mdata = data->fwu_mdata;
- if (mdata->version != FWU_MDATA_VERSION) {
log_err("FWU metadata version %u. Expected value of %u\n",
mdata->version, FWU_MDATA_VERSION);
return -EINVAL;
- }
- if (!mdata->desc_offset) {
log_err("No image information provided with the Metadata. ");
log_err("Image information expected in the metadata\n");
return -EINVAL;
- }
- if (mdata->desc_offset != 0x20) {
log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n",
mdata->desc_offset);
return -EINVAL;
- }
- num_banks = fwu_get_fw_desc(mdata)->num_banks; num_images = fwu_get_fw_desc(mdata)->num_images;
@@ -238,6 +220,24 @@ int fwu_init(void) return ret; }
- if (mdata.version != FWU_MDATA_VERSION) {
log_err("FWU metadata version %u. Expected value of %u\n",
mdata.version, FWU_MDATA_VERSION);
return -EINVAL;
- }
- if (!mdata.desc_offset) {
log_err("No image information provided with the Metadata. ");
log_err("Image information expected in the metadata\n");
return -EINVAL;
- }
- if (mdata.desc_offset != 0x20) {
In 2/6 you are introducing this macro +#define FWU_IMG_DESC_OFFSET 0x20U
I think you can do it directly here and just use it in 2/6.
log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n",
mdata.desc_offset);
return -EINVAL;
- }
- ret = fwu_mdata_copies_allocate(mdata.metadata_size); if (ret) return ret;
M

On Tue, 3 Sept 2024 at 19:22, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
The version 2 of the FWU metadata has a top level structure, followed by optional information on the updatable images. Perform some sanity checks on some of the fields in the top level structure to determine if the rest of the structure has to be read.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/fwu_v2.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index 108bc9bb4a..d0d8a25929 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -58,24 +58,6 @@ static int fwu_mdata_sanity_checks(void) struct fwu_data *data = fwu_get_data(); struct fwu_mdata *mdata = data->fwu_mdata;
if (mdata->version != FWU_MDATA_VERSION) {
log_err("FWU metadata version %u. Expected value of %u\n",
mdata->version, FWU_MDATA_VERSION);
return -EINVAL;
}
if (!mdata->desc_offset) {
log_err("No image information provided with the Metadata. ");
log_err("Image information expected in the metadata\n");
return -EINVAL;
}
if (mdata->desc_offset != 0x20) {
log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n",
mdata->desc_offset);
return -EINVAL;
}
num_banks = fwu_get_fw_desc(mdata)->num_banks; num_images = fwu_get_fw_desc(mdata)->num_images;
@@ -238,6 +220,24 @@ int fwu_init(void) return ret; }
if (mdata.version != FWU_MDATA_VERSION) {
log_err("FWU metadata version %u. Expected value of %u\n",
mdata.version, FWU_MDATA_VERSION);
return -EINVAL;
}
if (!mdata.desc_offset) {
log_err("No image information provided with the Metadata. ");
log_err("Image information expected in the metadata\n");
return -EINVAL;
}
if (mdata.desc_offset != 0x20) {
In 2/6 you are introducing this macro +#define FWU_IMG_DESC_OFFSET 0x20U
I think you can do it directly here and just use it in 2/6.
Will do. Thanks.
-sughosh
log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n",
mdata.desc_offset);
return -EINVAL;
}
ret = fwu_mdata_copies_allocate(mdata.metadata_size); if (ret) return ret;
M

On 8/30/24 13:40, Sughosh Ganu wrote:
The version 2 of the FWU metadata has a top level structure, followed by optional information on the updatable images. Perform some sanity checks on some of the fields in the top level structure to determine if the rest of the structure has to be read.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/fwu_v2.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index 108bc9bb4a..d0d8a25929 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -58,24 +58,6 @@ static int fwu_mdata_sanity_checks(void) struct fwu_data *data = fwu_get_data(); struct fwu_mdata *mdata = data->fwu_mdata;
- if (mdata->version != FWU_MDATA_VERSION) {
log_err("FWU metadata version %u. Expected value of %u\n",
mdata->version, FWU_MDATA_VERSION);
return -EINVAL;
- }
- if (!mdata->desc_offset) {
log_err("No image information provided with the Metadata. ");
log_err("Image information expected in the metadata\n");
return -EINVAL;
- }
- if (mdata->desc_offset != 0x20) {
log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n",
mdata->desc_offset);
return -EINVAL;
- }
- num_banks = fwu_get_fw_desc(mdata)->num_banks; num_images = fwu_get_fw_desc(mdata)->num_images;
@@ -238,6 +220,24 @@ int fwu_init(void) return ret; }
- if (mdata.version != FWU_MDATA_VERSION) {
log_err("FWU metadata version %u. Expected value of %u\n",
mdata.version, FWU_MDATA_VERSION);
return -EINVAL;
- }
- if (!mdata.desc_offset) {
log_err("No image information provided with the Metadata. ");
log_err("Image information expected in the metadata\n");
return -EINVAL;
- }
nit: One more thing here. You are doing this just to report that desc_offset is not 0 and below it is equal to 0x20. I think you can check directly that it is 0x20.
M

In the version 2 of the FWU metadata, the metadata is broken into two parts, a top-level structure, which provides information on the total size of the structure among other things. Try reading the primary partition first, and if that fails, try reading the secondary partition. This will help in the scenario where the primary metadata partition has been corrupted, but the secondary partition is intact.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org --- lib/fwu_updates/fwu_v2.c | 77 +++++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 36 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index d0d8a25929..69306282aa 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -10,6 +10,12 @@ #include <linux/types.h>
#define FWU_MDATA_VERSION 0x2U +#define FWU_IMG_DESC_OFFSET 0x20U + +static struct fwu_mdata g_mdata; + +#define PRI_PART 0x1U +#define SEC_PART 0x2U
static inline struct fwu_fw_store_desc *fwu_get_fw_desc(struct fwu_mdata *mdata) { @@ -109,6 +115,31 @@ static int fwu_trial_state_start(uint update_index) return 0; }
+static bool fwu_get_mdata_mandatory(uint part) +{ + int ret = 0; + struct udevice *fwu_dev = fwu_get_dev(); + + memset(&g_mdata, 0, sizeof(struct fwu_mdata)); + + ret = fwu_read_mdata(fwu_dev, &g_mdata, + part == PRI_PART ? true : false, + sizeof(struct fwu_mdata)); + if (ret) + return false; + + if (g_mdata.version != FWU_MDATA_VERSION) + return false; + + if (!g_mdata.desc_offset) + return false; + + if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) + return false; + + return true; +} + /** * fwu_populate_mdata_image_info() - Populate the image information * of the metadata @@ -169,24 +200,14 @@ int fwu_state_machine_updates(bool trial_state, uint32_t update_index) */ int fwu_get_mdata_size(uint32_t *mdata_size) { - int ret = 0; - struct fwu_mdata mdata = { 0 }; struct fwu_data *data = fwu_get_data(); - struct udevice *fwu_dev = fwu_get_dev();
if (data->metadata_size) { *mdata_size = data->metadata_size; return 0; }
- ret = fwu_read_mdata(fwu_dev, &mdata, 1, - sizeof(struct fwu_mdata)); - if (ret) { - log_err("FWU metadata read failed\n"); - return ret; - } - - *mdata_size = mdata.metadata_size; + *mdata_size = g_mdata.metadata_size; if (!*mdata_size) return -EINVAL;
@@ -206,39 +227,23 @@ int fwu_get_mdata_size(uint32_t *mdata_size) int fwu_init(void) { int ret; - struct fwu_mdata mdata = { 0 }; - struct udevice *fwu_dev = fwu_get_dev();
/* * First we read only the top level structure * and get the size of the complete structure. + * Try reading the first partition first, if + * that does not work, try the secondary + * partition. The idea is, if one of the + * partitions is corrupted, it should be restored + * from the intact partition. */ - ret = fwu_read_mdata(fwu_dev, &mdata, 1, - sizeof(struct fwu_mdata)); - if (ret) { + if (!fwu_get_mdata_mandatory(PRI_PART) && + !fwu_get_mdata_mandatory(SEC_PART)) { log_err("FWU metadata read failed\n"); - return ret; - } - - if (mdata.version != FWU_MDATA_VERSION) { - log_err("FWU metadata version %u. Expected value of %u\n", - mdata.version, FWU_MDATA_VERSION); - return -EINVAL; - } - - if (!mdata.desc_offset) { - log_err("No image information provided with the Metadata. "); - log_err("Image information expected in the metadata\n"); - return -EINVAL; - } - - if (mdata.desc_offset != 0x20) { - log_err("Descriptor Offset(0x%x) in the FWU Metadata not equal to 0x20\n", - mdata.desc_offset); - return -EINVAL; + return -1; }
- ret = fwu_mdata_copies_allocate(mdata.metadata_size); + ret = fwu_mdata_copies_allocate(g_mdata.metadata_size); if (ret) return ret;

On 8/30/24 13:40, Sughosh Ganu wrote:
In the version 2 of the FWU metadata, the metadata is broken into two parts, a top-level structure, which provides information on the total size of the structure among other things. Try reading the primary partition first, and if that fails, try reading the secondary partition. This will help in the scenario where the primary metadata partition has been corrupted, but the secondary partition is intact.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/fwu_v2.c | 77 +++++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 36 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index d0d8a25929..69306282aa 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -10,6 +10,12 @@ #include <linux/types.h>
#define FWU_MDATA_VERSION 0x2U +#define FWU_IMG_DESC_OFFSET 0x20U
+static struct fwu_mdata g_mdata;
+#define PRI_PART 0x1U +#define SEC_PART 0x2U
static inline struct fwu_fw_store_desc *fwu_get_fw_desc(struct fwu_mdata *mdata) { @@ -109,6 +115,31 @@ static int fwu_trial_state_start(uint update_index) return 0; }
+static bool fwu_get_mdata_mandatory(uint part) +{
- int ret = 0;
- struct udevice *fwu_dev = fwu_get_dev();
- memset(&g_mdata, 0, sizeof(struct fwu_mdata));
- ret = fwu_read_mdata(fwu_dev, &g_mdata,
part == PRI_PART ? true : false,
sizeof(struct fwu_mdata));
- if (ret)
return false;
- if (g_mdata.version != FWU_MDATA_VERSION)
return false;
Any reason to remove error logs which were here?
Thanks, Michal

On Tue, 3 Sept 2024 at 22:37, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
In the version 2 of the FWU metadata, the metadata is broken into two parts, a top-level structure, which provides information on the total size of the structure among other things. Try reading the primary partition first, and if that fails, try reading the secondary partition. This will help in the scenario where the primary metadata partition has been corrupted, but the secondary partition is intact.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/fwu_v2.c | 77 +++++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 36 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index d0d8a25929..69306282aa 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -10,6 +10,12 @@ #include <linux/types.h>
#define FWU_MDATA_VERSION 0x2U +#define FWU_IMG_DESC_OFFSET 0x20U
+static struct fwu_mdata g_mdata;
+#define PRI_PART 0x1U +#define SEC_PART 0x2U
static inline struct fwu_fw_store_desc *fwu_get_fw_desc(struct fwu_mdata *mdata) { @@ -109,6 +115,31 @@ static int fwu_trial_state_start(uint update_index) return 0; }
+static bool fwu_get_mdata_mandatory(uint part) +{
int ret = 0;
struct udevice *fwu_dev = fwu_get_dev();
memset(&g_mdata, 0, sizeof(struct fwu_mdata));
ret = fwu_read_mdata(fwu_dev, &g_mdata,
part == PRI_PART ? true : false,
sizeof(struct fwu_mdata));
if (ret)
return false;
if (g_mdata.version != FWU_MDATA_VERSION)
return false;
Any reason to remove error logs which were here?
They just got dropped during the code refactor. Will put them back. Thanks.
-sughosh

On 8/30/24 13:40, Sughosh Ganu wrote:
In the version 2 of the FWU metadata, the metadata is broken into two parts, a top-level structure, which provides information on the total size of the structure among other things. Try reading the primary partition first, and if that fails, try reading the secondary partition. This will help in the scenario where the primary metadata partition has been corrupted, but the secondary partition is intact.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/fwu_v2.c | 77 +++++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 36 deletions(-)
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index d0d8a25929..69306282aa 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -10,6 +10,12 @@ #include <linux/types.h>
#define FWU_MDATA_VERSION 0x2U +#define FWU_IMG_DESC_OFFSET 0x20U
And this is value which is sizeof(struct fwu_mdata) right?
+static struct fwu_mdata g_mdata;
+#define PRI_PART 0x1U +#define SEC_PART 0x2U
You already have them in include/fwu.h
88 enum { 89 PRIMARY_PART = 1, 90 SECONDARY_PART, 91 BOTH_PARTS, 92 }; 93
M

Do a sanity check that the version of the FWU metadata that has been read aligns with the version enabled in the image. This allows to indicate an early failure as part of the FWU module initialisation.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org --- lib/fwu_updates/fwu_v1.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/lib/fwu_updates/fwu_v1.c b/lib/fwu_updates/fwu_v1.c index efb8d51500..023e43728d 100644 --- a/lib/fwu_updates/fwu_v1.c +++ b/lib/fwu_updates/fwu_v1.c @@ -146,6 +146,7 @@ int fwu_init(void) { int ret; uint32_t mdata_size; + struct fwu_mdata mdata = {0};
fwu_get_mdata_size(&mdata_size);
@@ -157,10 +158,16 @@ int fwu_init(void) * Now read the entire structure, both copies, and * validate that the copies. */ - ret = fwu_get_mdata(NULL); + ret = fwu_get_mdata(&mdata); if (ret) return ret;
+ if (mdata.version != 0x1) { + log_err("FWU metadata version %u. Expected value of %u\n", + mdata.version, FWU_MDATA_VERSION); + return -EINVAL; + } + fwu_data_init();
return 0;

The platform transitions out of Trial State into the Regular State only when all the images in the update bank have been accepted. Check for this condition before transitioning out of Trial State.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org --- include/fwu.h | 11 +++++++++++ lib/fwu_updates/fwu.c | 25 +++++++++++++++++++++++++ lib/fwu_updates/fwu_v1.c | 9 ++++++--- lib/fwu_updates/fwu_v2.c | 3 +++ 4 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/include/fwu.h b/include/fwu.h index 77ec65e618..70a5166f5e 100644 --- a/include/fwu.h +++ b/include/fwu.h @@ -417,4 +417,15 @@ int fwu_state_machine_updates(bool trial_state, uint32_t update_index); */ int fwu_init(void);
+/** + * fwu_bank_accepted() - Has the bank been accepted + * @data: Version agnostic FWU metadata information + * @bank: Update bank to check + * + * Check in the given bank if all the images have been accepted. + * + * Return: true if all images accepted, false otherwise + */ +bool fwu_bank_accepted(struct fwu_data *data, uint32_t bank); + #endif /* _FWU_H_ */ diff --git a/lib/fwu_updates/fwu.c b/lib/fwu_updates/fwu.c index c7fc8987be..a94a769a2b 100644 --- a/lib/fwu_updates/fwu.c +++ b/lib/fwu_updates/fwu.c @@ -28,6 +28,31 @@ enum { IMAGE_ACCEPT_CLEAR, };
+/** + * fwu_bank_accepted() - Has the bank been accepted + * @data: Version agnostic FWU metadata information + * @bank: Update bank to check + * + * Check in the given bank if all the images have been accepted. + * + * Return: true if all images accepted, false otherwise + */ +bool fwu_bank_accepted(struct fwu_data *data, uint32_t bank) +{ + u32 i; + struct fwu_image_entry *img_entry; + struct fwu_image_bank_info *img_bank_info; + + img_entry = &data->fwu_images[0]; + for (i = 0; i < CONFIG_FWU_NUM_IMAGES_PER_BANK; i++) { + img_bank_info = &img_entry[i].img_bank_info[bank]; + if (!img_bank_info->accepted) + return false; + } + + return true; +} + static int trial_counter_update(u16 *trial_state_ctr) { bool delete; diff --git a/lib/fwu_updates/fwu_v1.c b/lib/fwu_updates/fwu_v1.c index 023e43728d..c311a8857a 100644 --- a/lib/fwu_updates/fwu_v1.c +++ b/lib/fwu_updates/fwu_v1.c @@ -52,11 +52,14 @@ static void fwu_data_init(void) memcpy(dst_img_info, src_img_info, image_info_size); }
-static int fwu_trial_state_update(bool trial_state) +static int fwu_trial_state_update(bool trial_state, uint32_t bank) { int ret; struct fwu_data *data = fwu_get_data();
+ if (!trial_state && !fwu_bank_accepted(data, bank)) + return 0; + if (trial_state) { ret = fwu_trial_state_ctr_start(); if (ret) @@ -112,9 +115,9 @@ void fwu_populate_mdata_image_info(struct fwu_data *data) * Return: 0 if OK, -ve on error */ int fwu_state_machine_updates(bool trial_state, - __maybe_unused uint32_t update_index) + uint32_t update_index) { - return fwu_trial_state_update(trial_state); + return fwu_trial_state_update(trial_state, update_index); }
/** diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index 69306282aa..9c696952ed 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -88,6 +88,9 @@ static int fwu_bank_state_update(bool trial_state, uint32_t bank) struct fwu_data *data = fwu_get_data(); struct fwu_mdata *mdata = data->fwu_mdata;
+ if (!trial_state && !fwu_bank_accepted(data, bank)) + return 0; + mdata->bank_state[bank] = data->bank_state[bank] = trial_state ? FWU_BANK_VALID : FWU_BANK_ACCEPTED;

On 8/30/24 13:40, Sughosh Ganu wrote:
The platform transitions out of Trial State into the Regular State only when all the images in the update bank have been accepted. Check for this condition before transitioning out of Trial State.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
include/fwu.h | 11 +++++++++++ lib/fwu_updates/fwu.c | 25 +++++++++++++++++++++++++ lib/fwu_updates/fwu_v1.c | 9 ++++++--- lib/fwu_updates/fwu_v2.c | 3 +++ 4 files changed, 45 insertions(+), 3 deletions(-)
diff --git a/include/fwu.h b/include/fwu.h index 77ec65e618..70a5166f5e 100644 --- a/include/fwu.h +++ b/include/fwu.h @@ -417,4 +417,15 @@ int fwu_state_machine_updates(bool trial_state, uint32_t update_index); */ int fwu_init(void);
+/**
- fwu_bank_accepted() - Has the bank been accepted
- @data: Version agnostic FWU metadata information
- @bank: Update bank to check
- Check in the given bank if all the images have been accepted.
- Return: true if all images accepted, false otherwise
- */
+bool fwu_bank_accepted(struct fwu_data *data, uint32_t bank);
- #endif /* _FWU_H_ */
diff --git a/lib/fwu_updates/fwu.c b/lib/fwu_updates/fwu.c index c7fc8987be..a94a769a2b 100644 --- a/lib/fwu_updates/fwu.c +++ b/lib/fwu_updates/fwu.c @@ -28,6 +28,31 @@ enum { IMAGE_ACCEPT_CLEAR, };
+/**
- fwu_bank_accepted() - Has the bank been accepted
- @data: Version agnostic FWU metadata information
- @bank: Update bank to check
- Check in the given bank if all the images have been accepted.
- Return: true if all images accepted, false otherwise
- */
+bool fwu_bank_accepted(struct fwu_data *data, uint32_t bank) +{
- u32 i;
- struct fwu_image_entry *img_entry;
- struct fwu_image_bank_info *img_bank_info;
- img_entry = &data->fwu_images[0];
- for (i = 0; i < CONFIG_FWU_NUM_IMAGES_PER_BANK; i++) {
img_bank_info = &img_entry[i].img_bank_info[bank];
if (!img_bank_info->accepted)
return false;
- }
- return true;
+}
- static int trial_counter_update(u16 *trial_state_ctr) { bool delete;
diff --git a/lib/fwu_updates/fwu_v1.c b/lib/fwu_updates/fwu_v1.c index 023e43728d..c311a8857a 100644 --- a/lib/fwu_updates/fwu_v1.c +++ b/lib/fwu_updates/fwu_v1.c @@ -52,11 +52,14 @@ static void fwu_data_init(void) memcpy(dst_img_info, src_img_info, image_info_size); }
-static int fwu_trial_state_update(bool trial_state) +static int fwu_trial_state_update(bool trial_state, uint32_t bank) { int ret; struct fwu_data *data = fwu_get_data();
- if (!trial_state && !fwu_bank_accepted(data, bank))
return 0;
- if (trial_state) { ret = fwu_trial_state_ctr_start(); if (ret)
@@ -112,9 +115,9 @@ void fwu_populate_mdata_image_info(struct fwu_data *data)
- Return: 0 if OK, -ve on error
*/ int fwu_state_machine_updates(bool trial_state,
__maybe_unused uint32_t update_index)
{uint32_t update_index)
- return fwu_trial_state_update(trial_state);
return fwu_trial_state_update(trial_state, update_index); }
/**
diff --git a/lib/fwu_updates/fwu_v2.c b/lib/fwu_updates/fwu_v2.c index 69306282aa..9c696952ed 100644 --- a/lib/fwu_updates/fwu_v2.c +++ b/lib/fwu_updates/fwu_v2.c @@ -88,6 +88,9 @@ static int fwu_bank_state_update(bool trial_state, uint32_t bank) struct fwu_data *data = fwu_get_data(); struct fwu_mdata *mdata = data->fwu_mdata;
- if (!trial_state && !fwu_bank_accepted(data, bank))
return 0;
- mdata->bank_state[bank] = data->bank_state[bank] = trial_state ? FWU_BANK_VALID : FWU_BANK_ACCEPTED;
Tested-by: Michal Simek michal.simek@amd.com
Thanks, Michal

The FWU code supports both versions of the FWU metadata, i.e. v1 and v2. A platform can then select one of the two versions through a config symbol. Put a dependency in the FWU metadata version selection config symbol to ensure that both versions of the metadata cannot be enabled.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org --- lib/fwu_updates/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/lib/fwu_updates/Kconfig b/lib/fwu_updates/Kconfig index 51b7fbbefd..a722107c12 100644 --- a/lib/fwu_updates/Kconfig +++ b/lib/fwu_updates/Kconfig @@ -40,6 +40,7 @@ config FWU_MDATA_V1
config FWU_MDATA_V2 bool "Enable support FWU Metadata version 2" + depends on !FWU_MDATA_V1 help The FWU specification supports two versions of the metadata structure. This option enables support for FWU

On 8/30/24 13:40, Sughosh Ganu wrote:
The FWU code supports both versions of the FWU metadata, i.e. v1 and v2. A platform can then select one of the two versions through a config symbol. Put a dependency in the FWU metadata version selection config symbol to ensure that both versions of the metadata cannot be enabled.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/lib/fwu_updates/Kconfig b/lib/fwu_updates/Kconfig index 51b7fbbefd..a722107c12 100644 --- a/lib/fwu_updates/Kconfig +++ b/lib/fwu_updates/Kconfig @@ -40,6 +40,7 @@ config FWU_MDATA_V1
config FWU_MDATA_V2 bool "Enable support FWU Metadata version 2"
- depends on !FWU_MDATA_V1 help The FWU specification supports two versions of the metadata structure. This option enables support for FWU
Reviewed-by: Michal Simek michal.simek@amd.com
Thanks, Michal

When in Trial State, the platform keeps a count of the number of times it has booted in the Trial State. Once the threshold of the maximum allowed count exceeds, the platform reverts to boot from a different bank on subsequent boot, thus coming out of the Trial State. It is expected that all the updated images would be accepted or rejected while the platform is in Trial State. Put in checks so that it is not possible to apply an empty capsule once the max Trial Count exceeds.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org --- lib/fwu_updates/fwu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/lib/fwu_updates/fwu.c b/lib/fwu_updates/fwu.c index a94a769a2b..7f085a0211 100644 --- a/lib/fwu_updates/fwu.c +++ b/lib/fwu_updates/fwu.c @@ -113,6 +113,8 @@ static int fwu_trial_count_update(void) ret = fwu_revert_boot_index(); if (ret) log_err("Unable to revert active_index\n"); + + trial_counter_update(NULL); ret = 1; } else { log_info("Trial State count: attempt %d out of %d\n", @@ -762,8 +764,8 @@ static int fwu_boottime_checks(void) return 0;
in_trial = in_trial_state(); - if (!in_trial || (ret = fwu_trial_count_update()) > 0) - ret = trial_counter_update(NULL); + + ret = in_trial ? fwu_trial_count_update() : trial_counter_update(NULL);
if (!ret) boottime_check = 1;

On 8/30/24 13:40, Sughosh Ganu wrote:
When in Trial State, the platform keeps a count of the number of times it has booted in the Trial State. Once the threshold of the maximum allowed count exceeds, the platform reverts to boot from a different bank on subsequent boot, thus coming out of the Trial State. It is expected that all the updated images would be accepted or rejected while the platform is in Trial State. Put in checks so that it is not possible to apply an empty capsule once the max Trial Count exceeds.
Signed-off-by: Sughosh Ganu sughosh.ganu@linaro.org
lib/fwu_updates/fwu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/lib/fwu_updates/fwu.c b/lib/fwu_updates/fwu.c index a94a769a2b..7f085a0211 100644 --- a/lib/fwu_updates/fwu.c +++ b/lib/fwu_updates/fwu.c @@ -113,6 +113,8 @@ static int fwu_trial_count_update(void) ret = fwu_revert_boot_index(); if (ret) log_err("Unable to revert active_index\n");
ret = 1; } else { log_info("Trial State count: attempt %d out of %d\n",trial_counter_update(NULL);
@@ -762,8 +764,8 @@ static int fwu_boottime_checks(void) return 0;
in_trial = in_trial_state();
- if (!in_trial || (ret = fwu_trial_count_update()) > 0)
ret = trial_counter_update(NULL);
ret = in_trial ? fwu_trial_count_update() : trial_counter_update(NULL);
if (!ret) boottime_check = 1;
Tested-by: Michal Simek michal.simek@amd.com
Thanks, Michal

On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I think there is still issue with returning values. Take a look at my log. I am in trial state - I accepted both images already that's why fwu has everything accepted. But I can still apply empty capsules which are passing. 1. if they are accepted status should be reflected and visible via fwu 2. if they are not accepted error from command should be returned.
Thanks, Michal
ZynqMP> fwu FWU Metadata crc32: 0x12fd554e version: 0x2 size: 0xda active_index: 0x0 previous_active_index: 0x1 bank_state[0]: 0xfc bank_state[1]: 0xfc bank_state[2]: 0xff bank_state[3]: 0xff Image Info
Image Type Guid: DE6066E8-0256-4FAD-8238-E406E274C4CF Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: F64A0548-2CCE-ED11-8F66-7BC4531CFE6B Image Acceptance: yes Image Guid: 3E9C814B-2CCE-ED11-BEC8-23DE4C6D2CF2 Image Acceptance: yes
Image Type Guid: CF9ECFD4-938B-41C5-8551-1F883AB7DC18 Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: 52DA04FB-9D0E-EE11-A57F-637805837C3F Image Acceptance: yes Image Guid: 46926007-9E0E-EE11-A23A-A38980B779A1 Image Acceptance: yes Custom fields covered by CRC 0x12 CUSTOM 00000000: 31 32 33 34 35 36 37 38 64 65 61 64 62 65 65 66 12345678deadbeef CUSTOM 00000010: 0a 0b .. 0xffd80050 = 0xb002c001 0xffd80054 = 0xb0020ff0 0xffd80058 = 0x1d000048 0xffd8005c = 0x0 ZynqMP> tftpboot 0x100000 192.168.0.105:capsule1-revert.bin && efidebug capsule update -v 0x100000 Set rate id 48/125000000 Using ethernet@ff0e0000 device TFTP from server 192.168.0.105; our IP address is 192.168.0.155 Filename 'capsule1-revert.bin'. Load address: 0x100000 Loading: # 3.9 KiB/s done Bytes transferred = 28 (1c hex) Capsule guid: acd58b4b-c0e8-475f-99b5-6b3f7e07aaf0 Capsule flags: 0x0 Capsule header size: 0x1c Capsule image size: 0x1c ZynqMP> echo $? 0 ZynqMP> fwu FWU Metadata crc32: 0x4aef4913 version: 0x2 size: 0xda active_index: 0x1 previous_active_index: 0x0 bank_state[0]: 0xfc bank_state[1]: 0xfc bank_state[2]: 0xff bank_state[3]: 0xff Image Info
Image Type Guid: DE6066E8-0256-4FAD-8238-E406E274C4CF Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: F64A0548-2CCE-ED11-8F66-7BC4531CFE6B Image Acceptance: yes Image Guid: 3E9C814B-2CCE-ED11-BEC8-23DE4C6D2CF2 Image Acceptance: yes
Image Type Guid: CF9ECFD4-938B-41C5-8551-1F883AB7DC18 Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: 52DA04FB-9D0E-EE11-A57F-637805837C3F Image Acceptance: yes Image Guid: 46926007-9E0E-EE11-A23A-A38980B779A1 Image Acceptance: yes Custom fields covered by CRC 0x12 CUSTOM 00000000: 31 32 33 34 35 36 37 38 64 65 61 64 62 65 65 66 12345678deadbeef CUSTOM 00000010: 0a 0b .. 0xffd80050 = 0xb002c001 0xffd80054 = 0xb0020ff0 0xffd80058 = 0x1d000048 0xffd8005c = 0x0 ZynqMP> tftpboot 0x100000 192.168.0.105:capsule2-accept.bin && efidebug capsule update -v 0x100000 Set rate id 48/125000000 Using ethernet@ff0e0000 device TFTP from server 192.168.0.105; our IP address is 192.168.0.155 Filename 'capsule2-accept.bin'. Load address: 0x100000 Loading: # 5.9 KiB/s done Bytes transferred = 44 (2c hex) Capsule guid: 0c996046-bcc0-4d04-85ec-e1fcedf1c6f8 Capsule flags: 0x0 Capsule header size: 0x1c Capsule image size: 0x2c ZynqMP> echo $? 0 ZynqMP> fwu FWU Metadata crc32: 0x4aef4913 version: 0x2 size: 0xda active_index: 0x1 previous_active_index: 0x0 bank_state[0]: 0xfc bank_state[1]: 0xfc bank_state[2]: 0xff bank_state[3]: 0xff Image Info
Image Type Guid: DE6066E8-0256-4FAD-8238-E406E274C4CF Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: F64A0548-2CCE-ED11-8F66-7BC4531CFE6B Image Acceptance: yes Image Guid: 3E9C814B-2CCE-ED11-BEC8-23DE4C6D2CF2 Image Acceptance: yes
Image Type Guid: CF9ECFD4-938B-41C5-8551-1F883AB7DC18 Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: 52DA04FB-9D0E-EE11-A57F-637805837C3F Image Acceptance: yes Image Guid: 46926007-9E0E-EE11-A23A-A38980B779A1 Image Acceptance: yes Custom fields covered by CRC 0x12 CUSTOM 00000000: 31 32 33 34 35 36 37 38 64 65 61 64 62 65 65 66 12345678deadbeef CUSTOM 00000010: 0a 0b .. 0xffd80050 = 0xb002c001 0xffd80054 = 0xb0020ff0 0xffd80058 = 0x1d000048 0xffd8005c = 0x0 ZynqMP>

On Wed, 4 Sept 2024 at 13:45, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I think there is still issue with returning values. Take a look at my log. I am in trial state - I accepted both images already that's why fwu has everything accepted. But I can still apply empty capsules which are passing.
- if they are accepted status should be reflected and visible via fwu
- if they are not accepted error from command should be returned.
The code currently, does not behave incorrectly, but we can indeed return an error status if the empty capsule checks fail, like is done for normal capsules. I will add this change in the next version. Thanks.
-sughosh
Thanks, Michal
ZynqMP> fwu FWU Metadata crc32: 0x12fd554e version: 0x2 size: 0xda active_index: 0x0 previous_active_index: 0x1 bank_state[0]: 0xfc bank_state[1]: 0xfc bank_state[2]: 0xff bank_state[3]: 0xff Image Info
Image Type Guid: DE6066E8-0256-4FAD-8238-E406E274C4CF Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: F64A0548-2CCE-ED11-8F66-7BC4531CFE6B Image Acceptance: yes Image Guid: 3E9C814B-2CCE-ED11-BEC8-23DE4C6D2CF2 Image Acceptance: yes
Image Type Guid: CF9ECFD4-938B-41C5-8551-1F883AB7DC18 Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: 52DA04FB-9D0E-EE11-A57F-637805837C3F Image Acceptance: yes Image Guid: 46926007-9E0E-EE11-A23A-A38980B779A1 Image Acceptance: yes Custom fields covered by CRC 0x12 CUSTOM 00000000: 31 32 33 34 35 36 37 38 64 65 61 64 62 65 65 66 12345678deadbeef CUSTOM 00000010: 0a 0b .. 0xffd80050 = 0xb002c001 0xffd80054 = 0xb0020ff0 0xffd80058 = 0x1d000048 0xffd8005c = 0x0 ZynqMP> tftpboot 0x100000 192.168.0.105:capsule1-revert.bin && efidebug capsule update -v 0x100000 Set rate id 48/125000000 Using ethernet@ff0e0000 device TFTP from server 192.168.0.105; our IP address is 192.168.0.155 Filename 'capsule1-revert.bin'. Load address: 0x100000 Loading: # 3.9 KiB/s done Bytes transferred = 28 (1c hex) Capsule guid: acd58b4b-c0e8-475f-99b5-6b3f7e07aaf0 Capsule flags: 0x0 Capsule header size: 0x1c Capsule image size: 0x1c ZynqMP> echo $? 0 ZynqMP> fwu FWU Metadata crc32: 0x4aef4913 version: 0x2 size: 0xda active_index: 0x1 previous_active_index: 0x0 bank_state[0]: 0xfc bank_state[1]: 0xfc bank_state[2]: 0xff bank_state[3]: 0xff Image Info
Image Type Guid: DE6066E8-0256-4FAD-8238-E406E274C4CF Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: F64A0548-2CCE-ED11-8F66-7BC4531CFE6B Image Acceptance: yes Image Guid: 3E9C814B-2CCE-ED11-BEC8-23DE4C6D2CF2 Image Acceptance: yes
Image Type Guid: CF9ECFD4-938B-41C5-8551-1F883AB7DC18 Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: 52DA04FB-9D0E-EE11-A57F-637805837C3F Image Acceptance: yes Image Guid: 46926007-9E0E-EE11-A23A-A38980B779A1 Image Acceptance: yes Custom fields covered by CRC 0x12 CUSTOM 00000000: 31 32 33 34 35 36 37 38 64 65 61 64 62 65 65 66 12345678deadbeef CUSTOM 00000010: 0a 0b .. 0xffd80050 = 0xb002c001 0xffd80054 = 0xb0020ff0 0xffd80058 = 0x1d000048 0xffd8005c = 0x0 ZynqMP> tftpboot 0x100000 192.168.0.105:capsule2-accept.bin && efidebug capsule update -v 0x100000 Set rate id 48/125000000 Using ethernet@ff0e0000 device TFTP from server 192.168.0.105; our IP address is 192.168.0.155 Filename 'capsule2-accept.bin'. Load address: 0x100000 Loading: # 5.9 KiB/s done Bytes transferred = 44 (2c hex) Capsule guid: 0c996046-bcc0-4d04-85ec-e1fcedf1c6f8 Capsule flags: 0x0 Capsule header size: 0x1c Capsule image size: 0x2c ZynqMP> echo $? 0 ZynqMP> fwu FWU Metadata crc32: 0x4aef4913 version: 0x2 size: 0xda active_index: 0x1 previous_active_index: 0x0 bank_state[0]: 0xfc bank_state[1]: 0xfc bank_state[2]: 0xff bank_state[3]: 0xff Image Info
Image Type Guid: DE6066E8-0256-4FAD-8238-E406E274C4CF Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: F64A0548-2CCE-ED11-8F66-7BC4531CFE6B Image Acceptance: yes Image Guid: 3E9C814B-2CCE-ED11-BEC8-23DE4C6D2CF2 Image Acceptance: yes
Image Type Guid: CF9ECFD4-938B-41C5-8551-1F883AB7DC18 Location Guid: D7CE8A58-CE2C-11ED-81CD-D324E93AC223 Image Guid: 52DA04FB-9D0E-EE11-A57F-637805837C3F Image Acceptance: yes Image Guid: 46926007-9E0E-EE11-A23A-A38980B779A1 Image Acceptance: yes Custom fields covered by CRC 0x12 CUSTOM 00000000: 31 32 33 34 35 36 37 38 64 65 61 64 62 65 65 66 12345678deadbeef CUSTOM 00000010: 0a 0b .. 0xffd80050 = 0xb002c001 0xffd80054 = 0xb0020ff0 0xffd80058 = 0x1d000048 0xffd8005c = 0x0 ZynqMP>

On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I found one more thing.
I did this change
diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c index fbc2067bc12d..dab9530e499c 100644 --- a/tools/mkfwumdata.c +++ b/tools/mkfwumdata.c @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct fwu_mdata_object *mobj) struct fwu_mdata *mdata = mobj->mdata;
mdata->metadata_size = mobj->size; - mdata->desc_offset = sizeof(struct fwu_mdata); + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata);
for (i = 0; i < MAX_BANKS_V2; i++) mdata->bank_state[i] = i < mobj->banks ?
to break desc_offset location. I generated mdata structure and write it to primary mdata partition.
This has been spotted by (my debug) which is the reason why I did it.
if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { printf("mdata offset is not 0x20\n"); return false; }
But I got one more message below which
mdata offset is not 0x20 Both FWU metadata copies are valid but do not match. Restoring the secondary partition from the primary
But this is wrong. 0x21 there is wrong as been detected but sync code detects it as correct one and by this break also backup copy.
Thanks, Michal

On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I found one more thing.
I did this change
diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c index fbc2067bc12d..dab9530e499c 100644 --- a/tools/mkfwumdata.c +++ b/tools/mkfwumdata.c @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct fwu_mdata_object *mobj) struct fwu_mdata *mdata = mobj->mdata;
mdata->metadata_size = mobj->size;
mdata->desc_offset = sizeof(struct fwu_mdata);
mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); for (i = 0; i < MAX_BANKS_V2; i++) mdata->bank_state[i] = i < mobj->banks ?
to break desc_offset location. I generated mdata structure and write it to primary mdata partition.
This has been spotted by (my debug) which is the reason why I did it.
if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { printf("mdata offset is not 0x20\n"); return false; }
But I got one more message below which
mdata offset is not 0x20 Both FWU metadata copies are valid but do not match. Restoring the secondary partition from the primary
The reason why this logic is in place is to handle the scenario whereby, during the metadata update, the primary copy gets updated correctly, but due to some reason (like a power cut), the secondary copy does not -- this is because the primary copy gets updated first. In such a scenario, the secondary copy of metadata can be then restored from the primary copy. In the scenario where the primary copy is corrupted, it will show up in the crc check, and will get restored from the secondary copy.
-sughosh
But this is wrong. 0x21 there is wrong as been detected but sync code detects it as correct one and by this break also backup copy.
Thanks, Michal

On 9/5/24 08:24, Sughosh Ganu wrote:
On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I found one more thing.
I did this change
diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c index fbc2067bc12d..dab9530e499c 100644 --- a/tools/mkfwumdata.c +++ b/tools/mkfwumdata.c @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct fwu_mdata_object *mobj) struct fwu_mdata *mdata = mobj->mdata;
mdata->metadata_size = mobj->size;
mdata->desc_offset = sizeof(struct fwu_mdata);
mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); for (i = 0; i < MAX_BANKS_V2; i++) mdata->bank_state[i] = i < mobj->banks ?
to break desc_offset location. I generated mdata structure and write it to primary mdata partition.
This has been spotted by (my debug) which is the reason why I did it.
if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { printf("mdata offset is not 0x20\n"); return false; }
But I got one more message below which
mdata offset is not 0x20 Both FWU metadata copies are valid but do not match. Restoring the secondary partition from the primary
The reason why this logic is in place is to handle the scenario whereby, during the metadata update, the primary copy gets updated correctly, but due to some reason (like a power cut), the secondary copy does not -- this is because the primary copy gets updated first. In such a scenario, the secondary copy of metadata can be then restored from the primary copy. In the scenario where the primary copy is corrupted, it will show up in the crc check, and will get restored from the secondary copy.
But in this case - CRC of primary is correct but data inside is wrong because offset is at 0x21 not at 0x20. That likely means that U-Boot is not actually using this field to find where data is but I think that's fine to say not supported mdata structure but u-boot should never copy this "primary broken" description to secondary one as syncup.
M

On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 08:24, Sughosh Ganu wrote:
On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I found one more thing.
I did this change
diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c index fbc2067bc12d..dab9530e499c 100644 --- a/tools/mkfwumdata.c +++ b/tools/mkfwumdata.c @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct fwu_mdata_object *mobj) struct fwu_mdata *mdata = mobj->mdata;
mdata->metadata_size = mobj->size;
mdata->desc_offset = sizeof(struct fwu_mdata);
mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); for (i = 0; i < MAX_BANKS_V2; i++) mdata->bank_state[i] = i < mobj->banks ?
to break desc_offset location. I generated mdata structure and write it to primary mdata partition.
This has been spotted by (my debug) which is the reason why I did it.
if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { printf("mdata offset is not 0x20\n"); return false; }
But I got one more message below which
mdata offset is not 0x20 Both FWU metadata copies are valid but do not match. Restoring the secondary partition from the primary
The reason why this logic is in place is to handle the scenario whereby, during the metadata update, the primary copy gets updated correctly, but due to some reason (like a power cut), the secondary copy does not -- this is because the primary copy gets updated first. In such a scenario, the secondary copy of metadata can be then restored from the primary copy. In the scenario where the primary copy is corrupted, it will show up in the crc check, and will get restored from the secondary copy.
But in this case - CRC of primary is correct but data inside is wrong because offset is at 0x21 not at 0x20.
How will this event play out ? Am I missing something ? When the metadata is getting updated, the offset will be written as 0x20, and not 0x21. And in case the metadata gets corrupted, the CRC check would fail and the primary partition would get restored from the secondary (assuming that the secondary copy is correct).
-sughosh
That likely means that U-Boot is not actually using this field to find where data is but I think that's fine to say not supported mdata structure but u-boot should never copy this "primary broken" description to secondary one as syncup.
M

On 9/5/24 09:38, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 08:24, Sughosh Ganu wrote:
On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I found one more thing.
I did this change
diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c index fbc2067bc12d..dab9530e499c 100644 --- a/tools/mkfwumdata.c +++ b/tools/mkfwumdata.c @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct fwu_mdata_object *mobj) struct fwu_mdata *mdata = mobj->mdata;
mdata->metadata_size = mobj->size;
mdata->desc_offset = sizeof(struct fwu_mdata);
mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); for (i = 0; i < MAX_BANKS_V2; i++) mdata->bank_state[i] = i < mobj->banks ?
to break desc_offset location. I generated mdata structure and write it to primary mdata partition.
This has been spotted by (my debug) which is the reason why I did it.
if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { printf("mdata offset is not 0x20\n"); return false; }
But I got one more message below which
mdata offset is not 0x20 Both FWU metadata copies are valid but do not match. Restoring the secondary partition from the primary
The reason why this logic is in place is to handle the scenario whereby, during the metadata update, the primary copy gets updated correctly, but due to some reason (like a power cut), the secondary copy does not -- this is because the primary copy gets updated first. In such a scenario, the secondary copy of metadata can be then restored from the primary copy. In the scenario where the primary copy is corrupted, it will show up in the crc check, and will get restored from the secondary copy.
But in this case - CRC of primary is correct but data inside is wrong because offset is at 0x21 not at 0x20.
How will this event play out ? Am I missing something ? When the metadata is getting updated, the offset will be written as 0x20, and not 0x21. And in case the metadata gets corrupted, the CRC check would fail and the primary partition would get restored from the secondary (assuming that the secondary copy is correct).
It is not after update. It is initial state with incorrect mdata structure.
M

On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:38, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 08:24, Sughosh Ganu wrote:
On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote:
The following set of patches are miscellaneous fixes and some hardening of the FWU update logic.
Sughosh Ganu (6): fwu: v2: perform some checks before reading metadata fwu: v2: try reading both copies of metadata fwu: v1: do a version check for the metadata fwu: check all images for transitioning out of Trial State fwu: add dependency checks for selecting FWU metadata version fwu: do not allow capsule processing on exceeding Trial Counter threshold
include/fwu.h | 11 ++++++ lib/fwu_updates/Kconfig | 1 + lib/fwu_updates/fwu.c | 31 +++++++++++++++- lib/fwu_updates/fwu_v1.c | 18 +++++++-- lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ 5 files changed, 99 insertions(+), 42 deletions(-)
I found one more thing.
I did this change
diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c index fbc2067bc12d..dab9530e499c 100644 --- a/tools/mkfwumdata.c +++ b/tools/mkfwumdata.c @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct fwu_mdata_object *mobj) struct fwu_mdata *mdata = mobj->mdata;
mdata->metadata_size = mobj->size;
mdata->desc_offset = sizeof(struct fwu_mdata);
mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); for (i = 0; i < MAX_BANKS_V2; i++) mdata->bank_state[i] = i < mobj->banks ?
to break desc_offset location. I generated mdata structure and write it to primary mdata partition.
This has been spotted by (my debug) which is the reason why I did it.
if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { printf("mdata offset is not 0x20\n"); return false; }
But I got one more message below which
mdata offset is not 0x20 Both FWU metadata copies are valid but do not match. Restoring the secondary partition from the primary
The reason why this logic is in place is to handle the scenario whereby, during the metadata update, the primary copy gets updated correctly, but due to some reason (like a power cut), the secondary copy does not -- this is because the primary copy gets updated first. In such a scenario, the secondary copy of metadata can be then restored from the primary copy. In the scenario where the primary copy is corrupted, it will show up in the crc check, and will get restored from the secondary copy.
But in this case - CRC of primary is correct but data inside is wrong because offset is at 0x21 not at 0x20.
How will this event play out ? Am I missing something ? When the metadata is getting updated, the offset will be written as 0x20, and not 0x21. And in case the metadata gets corrupted, the CRC check would fail and the primary partition would get restored from the secondary (assuming that the secondary copy is correct).
It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
-sughosh

On 9/5/24 09:43, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:38, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 08:24, Sughosh Ganu wrote:
On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote:
On 8/30/24 13:40, Sughosh Ganu wrote: > > The following set of patches are miscellaneous fixes and some > hardening of the FWU update logic. > > Sughosh Ganu (6): > fwu: v2: perform some checks before reading metadata > fwu: v2: try reading both copies of metadata > fwu: v1: do a version check for the metadata > fwu: check all images for transitioning out of Trial State > fwu: add dependency checks for selecting FWU metadata version > fwu: do not allow capsule processing on exceeding Trial Counter > threshold > > include/fwu.h | 11 ++++++ > lib/fwu_updates/Kconfig | 1 + > lib/fwu_updates/fwu.c | 31 +++++++++++++++- > lib/fwu_updates/fwu_v1.c | 18 +++++++-- > lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ > 5 files changed, 99 insertions(+), 42 deletions(-) >
I found one more thing.
I did this change
diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c index fbc2067bc12d..dab9530e499c 100644 --- a/tools/mkfwumdata.c +++ b/tools/mkfwumdata.c @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct fwu_mdata_object *mobj) struct fwu_mdata *mdata = mobj->mdata;
mdata->metadata_size = mobj->size;
mdata->desc_offset = sizeof(struct fwu_mdata);
mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); for (i = 0; i < MAX_BANKS_V2; i++) mdata->bank_state[i] = i < mobj->banks ?
to break desc_offset location. I generated mdata structure and write it to primary mdata partition.
This has been spotted by (my debug) which is the reason why I did it.
if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { printf("mdata offset is not 0x20\n"); return false; }
But I got one more message below which
mdata offset is not 0x20 Both FWU metadata copies are valid but do not match. Restoring the secondary partition from the primary
The reason why this logic is in place is to handle the scenario whereby, during the metadata update, the primary copy gets updated correctly, but due to some reason (like a power cut), the secondary copy does not -- this is because the primary copy gets updated first. In such a scenario, the secondary copy of metadata can be then restored from the primary copy. In the scenario where the primary copy is corrupted, it will show up in the crc check, and will get restored from the secondary copy.
But in this case - CRC of primary is correct but data inside is wrong because offset is at 0x21 not at 0x20.
How will this event play out ? Am I missing something ? When the metadata is getting updated, the offset will be written as 0x20, and not 0x21. And in case the metadata gets corrupted, the CRC check would fail and the primary partition would get restored from the secondary (assuming that the secondary copy is correct).
It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
Thanks, Michal

On Thu, 5 Sept 2024 at 13:20, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:43, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:38, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 08:24, Sughosh Ganu wrote:
On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote: > > > > On 8/30/24 13:40, Sughosh Ganu wrote: >> >> The following set of patches are miscellaneous fixes and some >> hardening of the FWU update logic. >> >> Sughosh Ganu (6): >> fwu: v2: perform some checks before reading metadata >> fwu: v2: try reading both copies of metadata >> fwu: v1: do a version check for the metadata >> fwu: check all images for transitioning out of Trial State >> fwu: add dependency checks for selecting FWU metadata version >> fwu: do not allow capsule processing on exceeding Trial Counter >> threshold >> >> include/fwu.h | 11 ++++++ >> lib/fwu_updates/Kconfig | 1 + >> lib/fwu_updates/fwu.c | 31 +++++++++++++++- >> lib/fwu_updates/fwu_v1.c | 18 +++++++-- >> lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ >> 5 files changed, 99 insertions(+), 42 deletions(-) >> > > I found one more thing. > > I did this change > > diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c > index fbc2067bc12d..dab9530e499c 100644 > --- a/tools/mkfwumdata.c > +++ b/tools/mkfwumdata.c > @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct > fwu_mdata_object *mobj) > struct fwu_mdata *mdata = mobj->mdata; > > mdata->metadata_size = mobj->size; > - mdata->desc_offset = sizeof(struct fwu_mdata); > + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); > > for (i = 0; i < MAX_BANKS_V2; i++) > mdata->bank_state[i] = i < mobj->banks ? > > > to break desc_offset location. I generated mdata structure and write it to > primary mdata partition. > > This has been spotted by (my debug) which is the reason why I did it. > > if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { > printf("mdata offset is not 0x20\n"); > return false; > } > > But I got one more message below which > > mdata offset is not 0x20 > Both FWU metadata copies are valid but do not match. Restoring the secondary > partition from the primary
The reason why this logic is in place is to handle the scenario whereby, during the metadata update, the primary copy gets updated correctly, but due to some reason (like a power cut), the secondary copy does not -- this is because the primary copy gets updated first. In such a scenario, the secondary copy of metadata can be then restored from the primary copy. In the scenario where the primary copy is corrupted, it will show up in the crc check, and will get restored from the secondary copy.
But in this case - CRC of primary is correct but data inside is wrong because offset is at 0x21 not at 0x20.
How will this event play out ? Am I missing something ? When the metadata is getting updated, the offset will be written as 0x20, and not 0x21. And in case the metadata gets corrupted, the CRC check would fail and the primary partition would get restored from the secondary (assuming that the secondary copy is correct).
It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
The scenario of "primary is wrong" will only play out in case of corruption in the metadata. The scenario that you highlight, that of the metadata being wrong, but the CRC check succeeds will only happen when the metadata has been maliciously written -- if not, then the secondary partition should also be written with the wrong metadata, and the checks would then fail. In the case of the primary metadata getting corrupted, it does get restored from the secondary partition, assuming that the secondary partition is intact.
-sughosh
Thanks, Michal

On 9/5/24 10:12, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:20, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:43, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:38, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 08:24, Sughosh Ganu wrote: > On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote: >> >> >> >> On 8/30/24 13:40, Sughosh Ganu wrote: >>> >>> The following set of patches are miscellaneous fixes and some >>> hardening of the FWU update logic. >>> >>> Sughosh Ganu (6): >>> fwu: v2: perform some checks before reading metadata >>> fwu: v2: try reading both copies of metadata >>> fwu: v1: do a version check for the metadata >>> fwu: check all images for transitioning out of Trial State >>> fwu: add dependency checks for selecting FWU metadata version >>> fwu: do not allow capsule processing on exceeding Trial Counter >>> threshold >>> >>> include/fwu.h | 11 ++++++ >>> lib/fwu_updates/Kconfig | 1 + >>> lib/fwu_updates/fwu.c | 31 +++++++++++++++- >>> lib/fwu_updates/fwu_v1.c | 18 +++++++-- >>> lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ >>> 5 files changed, 99 insertions(+), 42 deletions(-) >>> >> >> I found one more thing. >> >> I did this change >> >> diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c >> index fbc2067bc12d..dab9530e499c 100644 >> --- a/tools/mkfwumdata.c >> +++ b/tools/mkfwumdata.c >> @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct >> fwu_mdata_object *mobj) >> struct fwu_mdata *mdata = mobj->mdata; >> >> mdata->metadata_size = mobj->size; >> - mdata->desc_offset = sizeof(struct fwu_mdata); >> + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); >> >> for (i = 0; i < MAX_BANKS_V2; i++) >> mdata->bank_state[i] = i < mobj->banks ? >> >> >> to break desc_offset location. I generated mdata structure and write it to >> primary mdata partition. >> >> This has been spotted by (my debug) which is the reason why I did it. >> >> if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { >> printf("mdata offset is not 0x20\n"); >> return false; >> } >> >> But I got one more message below which >> >> mdata offset is not 0x20 >> Both FWU metadata copies are valid but do not match. Restoring the secondary >> partition from the primary > > The reason why this logic is in place is to handle the scenario > whereby, during the metadata update, the primary copy gets updated > correctly, but due to some reason (like a power cut), the secondary > copy does not -- this is because the primary copy gets updated first. > In such a scenario, the secondary copy of metadata can be then > restored from the primary copy. In the scenario where the primary copy > is corrupted, it will show up in the crc check, and will get restored > from the secondary copy.
But in this case - CRC of primary is correct but data inside is wrong because offset is at 0x21 not at 0x20.
How will this event play out ? Am I missing something ? When the metadata is getting updated, the offset will be written as 0x20, and not 0x21. And in case the metadata gets corrupted, the CRC check would fail and the primary partition would get restored from the secondary (assuming that the secondary copy is correct).
It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
The scenario of "primary is wrong" will only play out in case of corruption in the metadata. The scenario that you highlight, that of the metadata being wrong, but the CRC check succeeds will only happen when the metadata has been maliciously written --
yes. And I did it on purpose to check your code and all error condition you are checking to make sure that code traps them.
if not, then the secondary partition should also be written with the wrong metadata, and the checks would then fail. In the case of the primary metadata getting corrupted, it does get restored from the secondary partition, assuming that the secondary partition is intact.
When primary is cleared/has incorrect CRC there is no problem and recovery happens but if CRC is right but data inside wrong code doesn't check it.
Thanks, Michal

On Thu, 5 Sept 2024 at 14:07, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 10:12, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:20, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:43, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:38, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote: > > > > On 9/5/24 08:24, Sughosh Ganu wrote: >> On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote: >>> >>> >>> >>> On 8/30/24 13:40, Sughosh Ganu wrote: >>>> >>>> The following set of patches are miscellaneous fixes and some >>>> hardening of the FWU update logic. >>>> >>>> Sughosh Ganu (6): >>>> fwu: v2: perform some checks before reading metadata >>>> fwu: v2: try reading both copies of metadata >>>> fwu: v1: do a version check for the metadata >>>> fwu: check all images for transitioning out of Trial State >>>> fwu: add dependency checks for selecting FWU metadata version >>>> fwu: do not allow capsule processing on exceeding Trial Counter >>>> threshold >>>> >>>> include/fwu.h | 11 ++++++ >>>> lib/fwu_updates/Kconfig | 1 + >>>> lib/fwu_updates/fwu.c | 31 +++++++++++++++- >>>> lib/fwu_updates/fwu_v1.c | 18 +++++++-- >>>> lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ >>>> 5 files changed, 99 insertions(+), 42 deletions(-) >>>> >>> >>> I found one more thing. >>> >>> I did this change >>> >>> diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c >>> index fbc2067bc12d..dab9530e499c 100644 >>> --- a/tools/mkfwumdata.c >>> +++ b/tools/mkfwumdata.c >>> @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct >>> fwu_mdata_object *mobj) >>> struct fwu_mdata *mdata = mobj->mdata; >>> >>> mdata->metadata_size = mobj->size; >>> - mdata->desc_offset = sizeof(struct fwu_mdata); >>> + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); >>> >>> for (i = 0; i < MAX_BANKS_V2; i++) >>> mdata->bank_state[i] = i < mobj->banks ? >>> >>> >>> to break desc_offset location. I generated mdata structure and write it to >>> primary mdata partition. >>> >>> This has been spotted by (my debug) which is the reason why I did it. >>> >>> if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { >>> printf("mdata offset is not 0x20\n"); >>> return false; >>> } >>> >>> But I got one more message below which >>> >>> mdata offset is not 0x20 >>> Both FWU metadata copies are valid but do not match. Restoring the secondary >>> partition from the primary >> >> The reason why this logic is in place is to handle the scenario >> whereby, during the metadata update, the primary copy gets updated >> correctly, but due to some reason (like a power cut), the secondary >> copy does not -- this is because the primary copy gets updated first. >> In such a scenario, the secondary copy of metadata can be then >> restored from the primary copy. In the scenario where the primary copy >> is corrupted, it will show up in the crc check, and will get restored >> from the secondary copy. > > But in this case - CRC of primary is correct but data inside is wrong because > offset is at 0x21 not at 0x20.
How will this event play out ? Am I missing something ? When the metadata is getting updated, the offset will be written as 0x20, and not 0x21. And in case the metadata gets corrupted, the CRC check would fail and the primary partition would get restored from the secondary (assuming that the secondary copy is correct).
It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
The scenario of "primary is wrong" will only play out in case of corruption in the metadata. The scenario that you highlight, that of the metadata being wrong, but the CRC check succeeds will only happen when the metadata has been maliciously written --
yes. And I did it on purpose to check your code and all error condition you are checking to make sure that code traps them.
The code should detect and fix any genuine error conditions that might come up in the field. Even the check for the case where the two metadata copies are valid but different is a scenario that can actually happen when updating the metadata copies, and there is a reason why we are assuming that the primary is the correct copy and should be used to restore the secondary. If there are any valid scenarios that are not being handled, they should be identified and fixed.
if not, then the secondary partition should also be written with the wrong metadata, and the checks would then fail. In the case of the primary metadata getting corrupted, it does get restored from the secondary partition, assuming that the secondary partition is intact.
When primary is cleared/has incorrect CRC there is no problem and recovery happens but if CRC is right but data inside wrong code doesn't check it.
This is where I think that this is not a genuine "error" case. So, if someone has written a metadata copy with wrong data but right CRC, and only written this bad copy to the primary partition, how do we trust that the secondary copy is actually correct ? One can generate a secondary metadata copy where the primary checks pass, but the image information in the metadata is incorrect.
-sughosh
Thanks, Michal

On 9/5/24 11:17, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 14:07, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 10:12, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:20, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:43, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:38, Sughosh Ganu wrote: > On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote: >> >> >> >> On 9/5/24 08:24, Sughosh Ganu wrote: >>> On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote: >>>> >>>> >>>> >>>> On 8/30/24 13:40, Sughosh Ganu wrote: >>>>> >>>>> The following set of patches are miscellaneous fixes and some >>>>> hardening of the FWU update logic. >>>>> >>>>> Sughosh Ganu (6): >>>>> fwu: v2: perform some checks before reading metadata >>>>> fwu: v2: try reading both copies of metadata >>>>> fwu: v1: do a version check for the metadata >>>>> fwu: check all images for transitioning out of Trial State >>>>> fwu: add dependency checks for selecting FWU metadata version >>>>> fwu: do not allow capsule processing on exceeding Trial Counter >>>>> threshold >>>>> >>>>> include/fwu.h | 11 ++++++ >>>>> lib/fwu_updates/Kconfig | 1 + >>>>> lib/fwu_updates/fwu.c | 31 +++++++++++++++- >>>>> lib/fwu_updates/fwu_v1.c | 18 +++++++-- >>>>> lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ >>>>> 5 files changed, 99 insertions(+), 42 deletions(-) >>>>> >>>> >>>> I found one more thing. >>>> >>>> I did this change >>>> >>>> diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c >>>> index fbc2067bc12d..dab9530e499c 100644 >>>> --- a/tools/mkfwumdata.c >>>> +++ b/tools/mkfwumdata.c >>>> @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct >>>> fwu_mdata_object *mobj) >>>> struct fwu_mdata *mdata = mobj->mdata; >>>> >>>> mdata->metadata_size = mobj->size; >>>> - mdata->desc_offset = sizeof(struct fwu_mdata); >>>> + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); >>>> >>>> for (i = 0; i < MAX_BANKS_V2; i++) >>>> mdata->bank_state[i] = i < mobj->banks ? >>>> >>>> >>>> to break desc_offset location. I generated mdata structure and write it to >>>> primary mdata partition. >>>> >>>> This has been spotted by (my debug) which is the reason why I did it. >>>> >>>> if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { >>>> printf("mdata offset is not 0x20\n"); >>>> return false; >>>> } >>>> >>>> But I got one more message below which >>>> >>>> mdata offset is not 0x20 >>>> Both FWU metadata copies are valid but do not match. Restoring the secondary >>>> partition from the primary >>> >>> The reason why this logic is in place is to handle the scenario >>> whereby, during the metadata update, the primary copy gets updated >>> correctly, but due to some reason (like a power cut), the secondary >>> copy does not -- this is because the primary copy gets updated first. >>> In such a scenario, the secondary copy of metadata can be then >>> restored from the primary copy. In the scenario where the primary copy >>> is corrupted, it will show up in the crc check, and will get restored >>> from the secondary copy. >> >> But in this case - CRC of primary is correct but data inside is wrong because >> offset is at 0x21 not at 0x20. > > How will this event play out ? Am I missing something ? When the > metadata is getting updated, the offset will be written as 0x20, and > not 0x21. And in case the metadata gets corrupted, the CRC check would > fail and the primary partition would get restored from the secondary > (assuming that the secondary copy is correct).
It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
The scenario of "primary is wrong" will only play out in case of corruption in the metadata. The scenario that you highlight, that of the metadata being wrong, but the CRC check succeeds will only happen when the metadata has been maliciously written --
yes. And I did it on purpose to check your code and all error condition you are checking to make sure that code traps them.
The code should detect and fix any genuine error conditions that might come up in the field. Even the check for the case where the two metadata copies are valid but different is a scenario that can actually happen when updating the metadata copies, and there is a reason why we are assuming that the primary is the correct copy and should be used to restore the secondary. If there are any valid scenarios that are not being handled, they should be identified and fixed.
if not, then the secondary partition should also be written with the wrong metadata, and the checks would then fail. In the case of the primary metadata getting corrupted, it does get restored from the secondary partition, assuming that the secondary partition is intact.
When primary is cleared/has incorrect CRC there is no problem and recovery happens but if CRC is right but data inside wrong code doesn't check it.
This is where I think that this is not a genuine "error" case. So, if someone has written a metadata copy with wrong data but right CRC, and only written this bad copy to the primary partition, how do we trust that the secondary copy is actually correct ? One can generate a secondary metadata copy where the primary checks pass, but the image information in the metadata is incorrect.
If U-Boot knows that data is not correct, based on it's check (in this case one incorrect field), you should never use it. If second also won't pass this check U-Boot shouldn't use it too.
If secondary copy pass all your tests you trust that it is fine. But if not, then you don't trust it.
And that's exactly what it is happening here. Code knows that primary data is wrong and properly detect it but another code blindly not checking it and just copy it incorrect data over correct one.
Thanks, Michal

On 9/6/24 08:35, Michal Simek wrote:
On 9/5/24 11:17, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 14:07, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 10:12, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:20, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:43, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote: > > > > On 9/5/24 09:38, Sughosh Ganu wrote: >> On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote: >>> >>> >>> >>> On 9/5/24 08:24, Sughosh Ganu wrote: >>>> On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote: >>>>> >>>>> >>>>> >>>>> On 8/30/24 13:40, Sughosh Ganu wrote: >>>>>> >>>>>> The following set of patches are miscellaneous fixes and some >>>>>> hardening of the FWU update logic. >>>>>> >>>>>> Sughosh Ganu (6): >>>>>> fwu: v2: perform some checks before reading metadata >>>>>> fwu: v2: try reading both copies of metadata >>>>>> fwu: v1: do a version check for the metadata >>>>>> fwu: check all images for transitioning out of Trial State >>>>>> fwu: add dependency checks for selecting FWU metadata version >>>>>> fwu: do not allow capsule processing on exceeding Trial Counter >>>>>> threshold >>>>>> >>>>>> include/fwu.h | 11 ++++++ >>>>>> lib/fwu_updates/Kconfig | 1 + >>>>>> lib/fwu_updates/fwu.c | 31 +++++++++++++++- >>>>>> lib/fwu_updates/fwu_v1.c | 18 +++++++-- >>>>>> lib/fwu_updates/fwu_v2.c | 80 >>>>>> ++++++++++++++++++++++------------------ >>>>>> 5 files changed, 99 insertions(+), 42 deletions(-) >>>>>> >>>>> >>>>> I found one more thing. >>>>> >>>>> I did this change >>>>> >>>>> diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c >>>>> index fbc2067bc12d..dab9530e499c 100644 >>>>> --- a/tools/mkfwumdata.c >>>>> +++ b/tools/mkfwumdata.c >>>>> @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct >>>>> fwu_mdata_object *mobj) >>>>> struct fwu_mdata *mdata = mobj->mdata; >>>>> >>>>> mdata->metadata_size = mobj->size; >>>>> - mdata->desc_offset = sizeof(struct fwu_mdata); >>>>> + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); >>>>> >>>>> for (i = 0; i < MAX_BANKS_V2; i++) >>>>> mdata->bank_state[i] = i < mobj->banks ? >>>>> >>>>> >>>>> to break desc_offset location. I generated mdata structure and write >>>>> it to >>>>> primary mdata partition. >>>>> >>>>> This has been spotted by (my debug) which is the reason why I did it. >>>>> >>>>> if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { >>>>> printf("mdata offset is not 0x20\n"); >>>>> return false; >>>>> } >>>>> >>>>> But I got one more message below which >>>>> >>>>> mdata offset is not 0x20 >>>>> Both FWU metadata copies are valid but do not match. Restoring the >>>>> secondary >>>>> partition from the primary >>>> >>>> The reason why this logic is in place is to handle the scenario >>>> whereby, during the metadata update, the primary copy gets updated >>>> correctly, but due to some reason (like a power cut), the secondary >>>> copy does not -- this is because the primary copy gets updated first. >>>> In such a scenario, the secondary copy of metadata can be then >>>> restored from the primary copy. In the scenario where the primary copy >>>> is corrupted, it will show up in the crc check, and will get restored >>>> from the secondary copy. >>> >>> But in this case - CRC of primary is correct but data inside is wrong >>> because >>> offset is at 0x21 not at 0x20. >> >> How will this event play out ? Am I missing something ? When the >> metadata is getting updated, the offset will be written as 0x20, and >> not 0x21. And in case the metadata gets corrupted, the CRC check would >> fail and the primary partition would get restored from the secondary >> (assuming that the secondary copy is correct). > > It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
The scenario of "primary is wrong" will only play out in case of corruption in the metadata. The scenario that you highlight, that of the metadata being wrong, but the CRC check succeeds will only happen when the metadata has been maliciously written --
yes. And I did it on purpose to check your code and all error condition you are checking to make sure that code traps them.
The code should detect and fix any genuine error conditions that might come up in the field. Even the check for the case where the two metadata copies are valid but different is a scenario that can actually happen when updating the metadata copies, and there is a reason why we are assuming that the primary is the correct copy and should be used to restore the secondary. If there are any valid scenarios that are not being handled, they should be identified and fixed.
if not, then the secondary partition should also be written with the wrong metadata, and the checks would then fail. In the case of the primary metadata getting corrupted, it does get restored from the secondary partition, assuming that the secondary partition is intact.
When primary is cleared/has incorrect CRC there is no problem and recovery happens but if CRC is right but data inside wrong code doesn't check it.
This is where I think that this is not a genuine "error" case. So, if someone has written a metadata copy with wrong data but right CRC, and only written this bad copy to the primary partition, how do we trust that the secondary copy is actually correct ? One can generate a secondary metadata copy where the primary checks pass, but the image information in the metadata is incorrect.
If U-Boot knows that data is not correct, based on it's check (in this case one incorrect field), you should never use it. If second also won't pass this check U-Boot shouldn't use it too.
If secondary copy pass all your tests you trust that it is fine. But if not, then you don't trust it.
And that's exactly what it is happening here. Code knows that primary data is wrong and properly detect it but another code blindly not checking it and just copy it incorrect data over correct one.
here is where the problem is. Code reads mdata structure and checks only CRC over it and that's it. There should be additional checking perform before you can say that data is correct.
diff --git a/lib/fwu_updates/fwu.c b/lib/fwu_updates/fwu.c index 983f232bd179..9ec750b51da0 100644 --- a/lib/fwu_updates/fwu.c +++ b/lib/fwu_updates/fwu.c @@ -313,10 +313,13 @@ int fwu_get_mdata(struct fwu_mdata *mdata) err = fwu_read_mdata(g_dev, parts_mdata[i], !i, mdata_size); if (!err) { err = mdata_crc_check(parts_mdata[i]); - if (!err) - parts_ok[i] = true; - else - log_debug("mdata : %s crc32 failed\n", i ? "secondary" : "primary"); + if (!err) { + if (check that format is corrrect) + parts_ok[i] = true; + + continue; + } + log_debug("mdata : %s crc32 failed\n", i ? "secondary" : "primary"); } }
Because issue is that CRC check only cause that part_ok[0] = true but that's wrong.
Thanks, Michal

On Fri, 6 Sept 2024 at 12:05, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 11:17, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 14:07, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 10:12, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:20, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:43, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote: > > > > On 9/5/24 09:38, Sughosh Ganu wrote: >> On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote: >>> >>> >>> >>> On 9/5/24 08:24, Sughosh Ganu wrote: >>>> On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote: >>>>> >>>>> >>>>> >>>>> On 8/30/24 13:40, Sughosh Ganu wrote: >>>>>> >>>>>> The following set of patches are miscellaneous fixes and some >>>>>> hardening of the FWU update logic. >>>>>> >>>>>> Sughosh Ganu (6): >>>>>> fwu: v2: perform some checks before reading metadata >>>>>> fwu: v2: try reading both copies of metadata >>>>>> fwu: v1: do a version check for the metadata >>>>>> fwu: check all images for transitioning out of Trial State >>>>>> fwu: add dependency checks for selecting FWU metadata version >>>>>> fwu: do not allow capsule processing on exceeding Trial Counter >>>>>> threshold >>>>>> >>>>>> include/fwu.h | 11 ++++++ >>>>>> lib/fwu_updates/Kconfig | 1 + >>>>>> lib/fwu_updates/fwu.c | 31 +++++++++++++++- >>>>>> lib/fwu_updates/fwu_v1.c | 18 +++++++-- >>>>>> lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ >>>>>> 5 files changed, 99 insertions(+), 42 deletions(-) >>>>>> >>>>> >>>>> I found one more thing. >>>>> >>>>> I did this change >>>>> >>>>> diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c >>>>> index fbc2067bc12d..dab9530e499c 100644 >>>>> --- a/tools/mkfwumdata.c >>>>> +++ b/tools/mkfwumdata.c >>>>> @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct >>>>> fwu_mdata_object *mobj) >>>>> struct fwu_mdata *mdata = mobj->mdata; >>>>> >>>>> mdata->metadata_size = mobj->size; >>>>> - mdata->desc_offset = sizeof(struct fwu_mdata); >>>>> + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); >>>>> >>>>> for (i = 0; i < MAX_BANKS_V2; i++) >>>>> mdata->bank_state[i] = i < mobj->banks ? >>>>> >>>>> >>>>> to break desc_offset location. I generated mdata structure and write it to >>>>> primary mdata partition. >>>>> >>>>> This has been spotted by (my debug) which is the reason why I did it. >>>>> >>>>> if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { >>>>> printf("mdata offset is not 0x20\n"); >>>>> return false; >>>>> } >>>>> >>>>> But I got one more message below which >>>>> >>>>> mdata offset is not 0x20 >>>>> Both FWU metadata copies are valid but do not match. Restoring the secondary >>>>> partition from the primary >>>> >>>> The reason why this logic is in place is to handle the scenario >>>> whereby, during the metadata update, the primary copy gets updated >>>> correctly, but due to some reason (like a power cut), the secondary >>>> copy does not -- this is because the primary copy gets updated first. >>>> In such a scenario, the secondary copy of metadata can be then >>>> restored from the primary copy. In the scenario where the primary copy >>>> is corrupted, it will show up in the crc check, and will get restored >>>> from the secondary copy. >>> >>> But in this case - CRC of primary is correct but data inside is wrong because >>> offset is at 0x21 not at 0x20. >> >> How will this event play out ? Am I missing something ? When the >> metadata is getting updated, the offset will be written as 0x20, and >> not 0x21. And in case the metadata gets corrupted, the CRC check would >> fail and the primary partition would get restored from the secondary >> (assuming that the secondary copy is correct). > > It is not after update. It is initial state with incorrect mdata structure.
Well, if the metadata partition is being written with incorrect data, not sure how we combat that (or even if we should). After all, the spec clearly states that the metadata cannot be protected against malicious writes. The logic that you pointed out earlier is to handle the scenario where the primary partition got updated, but the secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
The scenario of "primary is wrong" will only play out in case of corruption in the metadata. The scenario that you highlight, that of the metadata being wrong, but the CRC check succeeds will only happen when the metadata has been maliciously written --
yes. And I did it on purpose to check your code and all error condition you are checking to make sure that code traps them.
The code should detect and fix any genuine error conditions that might come up in the field. Even the check for the case where the two metadata copies are valid but different is a scenario that can actually happen when updating the metadata copies, and there is a reason why we are assuming that the primary is the correct copy and should be used to restore the secondary. If there are any valid scenarios that are not being handled, they should be identified and fixed.
if not, then the secondary partition should also be written with the wrong metadata, and the checks would then fail. In the case of the primary metadata getting corrupted, it does get restored from the secondary partition, assuming that the secondary partition is intact.
When primary is cleared/has incorrect CRC there is no problem and recovery happens but if CRC is right but data inside wrong code doesn't check it.
This is where I think that this is not a genuine "error" case. So, if someone has written a metadata copy with wrong data but right CRC, and only written this bad copy to the primary partition, how do we trust that the secondary copy is actually correct ? One can generate a secondary metadata copy where the primary checks pass, but the image information in the metadata is incorrect.
If U-Boot knows that data is not correct, based on it's check (in this case one incorrect field), you should never use it. If second also won't pass this check U-Boot shouldn't use it too.
The fact that we are trying to counter incorrect provisioning of metadata, which can well be a malicious act of writing incorrect data/correct CRC copy only to one partition seems like an anomaly which should not be handled. I would still be okay if this were a fool-proof way of fixing the board, but it isn't. Like I mentioned earlier, we can still have a secondary copy of the metadata with wrong data/correct CRC, and which does not get detected in the initial checks, as the top level structure fields are correct, but the image information is not.
If secondary copy pass all your tests you trust that it is fine. But if not, then you don't trust it.
But we cannot detect that the secondary copy is passing all the tests, as the issue could well be with the image description, and that would be detected only when an update is attempted, or the user identifies it (possibly) using the fwu_mdata_read command. And before that, the primary copy would be restored with the incorrect secondary copy, as part of the metadata read function.
What I think can be done is, as a middle ground, we can instead have this as a platform policy. So we add a function pointer for this, and when such a scenario is detected, the platform can have a callback which then restores the primary copy from the secondary. What are your thoughts on this ?
-sughosh
And that's exactly what it is happening here. Code knows that primary data is wrong and properly detect it but another code blindly not checking it and just copy it incorrect data over correct one.
Thanks, Michal

On 9/6/24 11:47, Sughosh Ganu wrote:
On Fri, 6 Sept 2024 at 12:05, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 11:17, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 14:07, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 10:12, Sughosh Ganu wrote:
On Thu, 5 Sept 2024 at 13:20, Michal Simek michal.simek@amd.com wrote:
On 9/5/24 09:43, Sughosh Ganu wrote: > On Thu, 5 Sept 2024 at 13:09, Michal Simek michal.simek@amd.com wrote: >> >> >> >> On 9/5/24 09:38, Sughosh Ganu wrote: >>> On Thu, 5 Sept 2024 at 13:03, Michal Simek michal.simek@amd.com wrote: >>>> >>>> >>>> >>>> On 9/5/24 08:24, Sughosh Ganu wrote: >>>>> On Wed, 4 Sept 2024 at 17:30, Michal Simek michal.simek@amd.com wrote: >>>>>> >>>>>> >>>>>> >>>>>> On 8/30/24 13:40, Sughosh Ganu wrote: >>>>>>> >>>>>>> The following set of patches are miscellaneous fixes and some >>>>>>> hardening of the FWU update logic. >>>>>>> >>>>>>> Sughosh Ganu (6): >>>>>>> fwu: v2: perform some checks before reading metadata >>>>>>> fwu: v2: try reading both copies of metadata >>>>>>> fwu: v1: do a version check for the metadata >>>>>>> fwu: check all images for transitioning out of Trial State >>>>>>> fwu: add dependency checks for selecting FWU metadata version >>>>>>> fwu: do not allow capsule processing on exceeding Trial Counter >>>>>>> threshold >>>>>>> >>>>>>> include/fwu.h | 11 ++++++ >>>>>>> lib/fwu_updates/Kconfig | 1 + >>>>>>> lib/fwu_updates/fwu.c | 31 +++++++++++++++- >>>>>>> lib/fwu_updates/fwu_v1.c | 18 +++++++-- >>>>>>> lib/fwu_updates/fwu_v2.c | 80 ++++++++++++++++++++++------------------ >>>>>>> 5 files changed, 99 insertions(+), 42 deletions(-) >>>>>>> >>>>>> >>>>>> I found one more thing. >>>>>> >>>>>> I did this change >>>>>> >>>>>> diff --git a/tools/mkfwumdata.c b/tools/mkfwumdata.c >>>>>> index fbc2067bc12d..dab9530e499c 100644 >>>>>> --- a/tools/mkfwumdata.c >>>>>> +++ b/tools/mkfwumdata.c >>>>>> @@ -293,7 +293,7 @@ static void fwu_fill_version_specific_mdata(struct >>>>>> fwu_mdata_object *mobj) >>>>>> struct fwu_mdata *mdata = mobj->mdata; >>>>>> >>>>>> mdata->metadata_size = mobj->size; >>>>>> - mdata->desc_offset = sizeof(struct fwu_mdata); >>>>>> + mdata->desc_offset = 0x21; sizeof(struct fwu_mdata); >>>>>> >>>>>> for (i = 0; i < MAX_BANKS_V2; i++) >>>>>> mdata->bank_state[i] = i < mobj->banks ? >>>>>> >>>>>> >>>>>> to break desc_offset location. I generated mdata structure and write it to >>>>>> primary mdata partition. >>>>>> >>>>>> This has been spotted by (my debug) which is the reason why I did it. >>>>>> >>>>>> if (g_mdata.desc_offset != FWU_IMG_DESC_OFFSET) { >>>>>> printf("mdata offset is not 0x20\n"); >>>>>> return false; >>>>>> } >>>>>> >>>>>> But I got one more message below which >>>>>> >>>>>> mdata offset is not 0x20 >>>>>> Both FWU metadata copies are valid but do not match. Restoring the secondary >>>>>> partition from the primary >>>>> >>>>> The reason why this logic is in place is to handle the scenario >>>>> whereby, during the metadata update, the primary copy gets updated >>>>> correctly, but due to some reason (like a power cut), the secondary >>>>> copy does not -- this is because the primary copy gets updated first. >>>>> In such a scenario, the secondary copy of metadata can be then >>>>> restored from the primary copy. In the scenario where the primary copy >>>>> is corrupted, it will show up in the crc check, and will get restored >>>>> from the secondary copy. >>>> >>>> But in this case - CRC of primary is correct but data inside is wrong because >>>> offset is at 0x21 not at 0x20. >>> >>> How will this event play out ? Am I missing something ? When the >>> metadata is getting updated, the offset will be written as 0x20, and >>> not 0x21. And in case the metadata gets corrupted, the CRC check would >>> fail and the primary partition would get restored from the secondary >>> (assuming that the secondary copy is correct). >> >> It is not after update. It is initial state with incorrect mdata structure. > > Well, if the metadata partition is being written with incorrect data, > not sure how we combat that (or even if we should). After all, the > spec clearly states that the metadata cannot be protected against > malicious writes. The logic that you pointed out earlier is to handle > the scenario where the primary partition got updated, but the > secondary did not.
I don't disagree with you.
Scenario here is that primary partition is updated with incorrect data content (but still correct CRC). This is correctly detected that data is not right. It means primary partition is wrong and shouldn't be used and it is not used.
But because it has correct CRC syncup code logic is doing syncup.
That's why I think we have two codes which have pretty much independent logic what to do. If primary is wrong sync should be from secondary to primary but that's not what code is doing.
The scenario of "primary is wrong" will only play out in case of corruption in the metadata. The scenario that you highlight, that of the metadata being wrong, but the CRC check succeeds will only happen when the metadata has been maliciously written --
yes. And I did it on purpose to check your code and all error condition you are checking to make sure that code traps them.
The code should detect and fix any genuine error conditions that might come up in the field. Even the check for the case where the two metadata copies are valid but different is a scenario that can actually happen when updating the metadata copies, and there is a reason why we are assuming that the primary is the correct copy and should be used to restore the secondary. If there are any valid scenarios that are not being handled, they should be identified and fixed.
if not, then the secondary partition should also be written with the wrong metadata, and the checks would then fail. In the case of the primary metadata getting corrupted, it does get restored from the secondary partition, assuming that the secondary partition is intact.
When primary is cleared/has incorrect CRC there is no problem and recovery happens but if CRC is right but data inside wrong code doesn't check it.
This is where I think that this is not a genuine "error" case. So, if someone has written a metadata copy with wrong data but right CRC, and only written this bad copy to the primary partition, how do we trust that the secondary copy is actually correct ? One can generate a secondary metadata copy where the primary checks pass, but the image information in the metadata is incorrect.
If U-Boot knows that data is not correct, based on it's check (in this case one incorrect field), you should never use it. If second also won't pass this check U-Boot shouldn't use it too.
The fact that we are trying to counter incorrect provisioning of metadata, which can well be a malicious act of writing incorrect data/correct CRC copy only to one partition seems like an anomaly which should not be handled. I would still be okay if this were a fool-proof way of fixing the board, but it isn't. Like I mentioned earlier, we can still have a secondary copy of the metadata with wrong data/correct CRC, and which does not get detected in the initial checks, as the top level structure fields are correct, but the image information is not.
Partially agree. You are checking in fwu_mdata_sanity_checks and even data is matching u-boot configuration. You are checking now num_banks, num_images. guid should be also known to u-boot and I can't see the reason why this functions can't check it too.
And more and more I am looking at the code more things that current fwu_mdata_sanity_checks should be called before you mark parts_ok[] = true.
If secondary copy pass all your tests you trust that it is fine. But if not, then you don't trust it.
But we cannot detect that the secondary copy is passing all the tests, as the issue could well be with the image description, and that would be detected only when an update is attempted, or the user identifies it (possibly) using the fwu_mdata_read command. And before that, the primary copy would be restored with the incorrect secondary copy, as part of the metadata read function.
What I think can be done is, as a middle ground, we can instead have this as a platform policy. So we add a function pointer for this, and when such a scenario is detected, the platform can have a callback which then restores the primary copy from the secondary. What are your thoughts on this ?
I have to think about this more (not on Friday afternoon).
Cheers, Michal
participants (2)
-
Michal Simek
-
Sughosh Ganu