
Hello Gaurav,
Cc: Michael Walle
-----Original Message----- From: U-Boot u-boot-bounces@lists.denx.de On Behalf Of Gaurav Jain Sent: Monday, January 10, 2022 1:27 PM To: u-boot@lists.denx.de Cc: Stefano Babic sbabic@denx.de; Fabio Estevam festevam@gmail.com; Peng Fan peng.fan@nxp.com; Simon Glass sjg@chromium.org; Priyanka Jain priyanka.jain@nxp.com; Ye Li ye.li@nxp.com; Horia Geanta horia.geanta@nxp.com; Ji Luo ji.luo@nxp.com; Franck Lenormand franck.lenormand@nxp.com; Silvano Di Ninno silvano.dininno@nxp.com; Sahil malhotra sahil.malhotra@nxp.com; Pankaj Gupta pankaj.gupta@nxp.com; Varun Sethi V.Sethi@nxp.com; NXP i . MX U-Boot Team uboot-imx@nxp.com; Shengzhou Liu Shengzhou.Liu@nxp.com; Mingkai Hu mingkai.hu@nxp.com; Rajesh Bhagat rajesh.bhagat@nxp.com; Meenakshi Aggarwal meenakshi.aggarwal@nxp.com; Wasim Khan wasim.khan@nxp.com; Alison Wang alison.wang@nxp.com; Pramod Kumar pramod.kumar_1@nxp.com; Tang Yuantian andy.tang@nxp.com; Adrian Alonso adrian.alonso@nxp.com; Vladimir Oltean olteanv@gmail.com Subject: [PATCH v8 10/15] crypto/fsl: Improve hwrng performance in kernel
From: Ye Li ye.li@nxp.com
RNG parameters are reconfigured.
- For TRNG to generate 256 bits of entropy, RNG TRNG Seed Control register is configured to have reduced SAMP_SIZE from default 2500 to 512. it is number of entropy samples that will be taken during Entropy generation.
- self-test registers(Monobit Limit, Poker Range, Run Length Limit) are synchronized with new RTSDCTL[SAMP_SIZE] of 512.
TRNG time is caluculated based on sample size.
Typo: caluculated -> calculated
time required to generate entropy is reduced and hwrng performance improved from 0.3 kB/s to 1.3 kB/s.
Is there any degradation in passed/failed FIPS 140-2 test count? Can you provide some results from at least rngtest run?
Signed-off-by: Ye Li ye.li@nxp.com Acked-by: Gaurav Jain gaurav.jain@nxp.com>
drivers/crypto/fsl/jr.c | 102 +++++++++++++++++++++++++++++++++------- include/fsl_sec.h | 1 + 2 files changed, 87 insertions(+), 16 deletions(-)
diff --git a/drivers/crypto/fsl/jr.c b/drivers/crypto/fsl/jr.c index a84440ab10..e5346a84a4 100644 --- a/drivers/crypto/fsl/jr.c +++ b/drivers/crypto/fsl/jr.c @@ -603,30 +603,100 @@ static u8 get_rng_vid(ccsr_sec_t *sec) */ static void kick_trng(int ent_delay, ccsr_sec_t *sec) {
- u32 samples = 512; /* number of bits to generate and test */
- u32 mono_min = 195;
- u32 mono_max = 317;
- u32 mono_range = mono_max - mono_min;
- u32 poker_min = 1031;
- u32 poker_max = 1600;
- u32 poker_range = poker_max - poker_min + 1;
- u32 retries = 2;
- u32 lrun_max = 32;
- s32 run_1_min = 27;
- s32 run_1_max = 107;
- s32 run_1_range = run_1_max - run_1_min;
- s32 run_2_min = 7;
- s32 run_2_max = 62;
- s32 run_2_range = run_2_max - run_2_min;
- s32 run_3_min = 0;
- s32 run_3_max = 39;
- s32 run_3_range = run_3_max - run_3_min;
- s32 run_4_min = -1;
- s32 run_4_max = 26;
- s32 run_4_range = run_4_max - run_4_min;
- s32 run_5_min = -1;
- s32 run_5_max = 18;
- s32 run_5_range = run_5_max - run_5_min;
- s32 run_6_min = -1;
- s32 run_6_max = 17;
- s32 run_6_range = run_6_max - run_6_min;
I have a feeling that this whole block of local variables can be simplified. I'm not sure it is required to list this so detailed.
You can attempt to define those values in header file and use macros to compute bound conditions, rather than allocating this on the stack here.
- u32 val;
- struct rng4tst __iomem *rng = (struct rng4tst __iomem *)&sec->rng;
u32 val;
/* put RNG4 into program mode */
sec_setbits32(&rng->rtmctl, RTMCTL_PRGM);
/* rtsdctl bits 0-15 contain "Entropy Delay, which defines the
* length (in system clocks) of each Entropy sample taken
* */
- /* Put RNG in program mode */
- /* Setting both RTMCTL:PRGM and RTMCTL:TRNG_ACC causes TRNG to
* properly invalidate the entropy in the entropy register and
* force re-generation.
*/
- sec_setbits32(&rng->rtmctl, RTMCTL_PRGM | RTMCTL_ACC);
- /* Configure the RNG Entropy Delay
* Performance-wise, it does not make sense to
* set the delay to a value that is lower
* than the last one that worked (i.e. the state handles
* were instantiated properly. Thus, instead of wasting
* time trying to set the values controlling the sample
* frequency, the function simply returns.
val = sec_in32(&rng->rtsdctl);*/
- val = (val & ~RTSDCTL_ENT_DLY_MASK) |
(ent_delay << RTSDCTL_ENT_DLY_SHIFT);
- val &= RTSDCTL_ENT_DLY_MASK;
- val >>= RTSDCTL_ENT_DLY_SHIFT;
- if (ent_delay < val) {
/* Put RNG4 into run mode */
sec_clrbits32(&rng->rtmctl, RTMCTL_PRGM | RTMCTL_ACC);
return;
- }
- val = (ent_delay << RTSDCTL_ENT_DLY_SHIFT) | samples; sec_out32(&rng->rtsdctl, val);
- /* min. freq. count, equal to 1/4 of the entropy sample length */
- sec_out32(&rng->rtfreqmin, ent_delay >> 2);
- /* disable maximum frequency count */
- sec_out32(&rng->rtfreqmax, RTFRQMAX_DISABLE);
- /*
* Recommended margins (min,max) for freq. count:
* freq_mul = RO_freq / TRNG_clk_freq
* rtfrqmin = (ent_delay x freq_mul) >> 1;
* rtfrqmax = (ent_delay x freq_mul) << 3;
* Given current deployments of CAAM in i.MX SoCs, and to simplify
* the configuration, we consider [1,16] to be a safe interval
Statement "... we consider ..." should be perhaps extended with justification on how this conclusion was made. Some test results (either here in the comment or in commit message) would be beneficial to see.
* for the freq_mul and the limits of the interval are used to compute
* rtfrqmin, rtfrqmax
*/
- sec_out32(&rng->rtfreqmin, ent_delay >> 1);
- sec_out32(&rng->rtfreqmax, ent_delay << 7);
- sec_out32(&rng->rtscmisc, (retries << 16) | lrun_max);
- sec_out32(&rng->rtpkrmax, poker_max);
- sec_out32(&rng->rtpkrrng, poker_range);
- sec_out32(&rng->rsvd1[0], (mono_range << 16) | mono_max);
- sec_out32(&rng->rsvd1[1], (run_1_range << 16) | run_1_max);
- sec_out32(&rng->rsvd1[2], (run_2_range << 16) | run_2_max);
- sec_out32(&rng->rsvd1[3], (run_3_range << 16) | run_3_max);
- sec_out32(&rng->rsvd1[4], (run_4_range << 16) | run_4_max);
- sec_out32(&rng->rsvd1[5], (run_5_range << 16) | run_5_max);
- sec_out32(&rng->rsvd1[6], (run_6_range << 16) | run_6_max);
Writing in reserved area is not a good idea. Those registers are defined in HW, can you please define them also in the header file?
- val = sec_in32(&rng->rtmctl); /*
* select raw sampling in both entropy shifter
* Select raw sampling in both entropy shifter
This change is not needed, otherwise please adjust all comments in original file, as for example comment below also starts with small letter...
* and statistical checker */
- sec_setbits32(&rng->rtmctl, RTMCTL_SAMP_MODE_RAW_ES_SC);
The only usage of RTMCTL_SAMP_MODE_RAW_ES_SC is dropped here, please remove it from include/fsl_sec.h as well.
- /* put RNG4 into run mode */
- sec_clrbits32(&rng->rtmctl, RTMCTL_PRGM);
- val &= ~RTMCTL_SAMP_MODE_INVALID;
- val |= RTMCTL_SAMP_MODE_RAW_ES_SC;
- /* Put RNG4 into run mode */
- val &= ~(RTMCTL_PRGM | RTMCTL_ACC);
BIT() macro for both new and existing defines?
- /*test with sample mode only */
- sec_out32(&rng->rtmctl, val);
}
static int rng_init(uint8_t sec_idx, ccsr_sec_t *sec) diff --git a/include/fsl_sec.h b/include/fsl_sec.h index 7b6e3e2c20..2b3239414a 100644 --- a/include/fsl_sec.h +++ b/include/fsl_sec.h @@ -34,6 +34,7 @@ #if CONFIG_SYS_FSL_SEC_COMPAT >= 4 /* RNG4 TRNG test registers */ struct rng4tst { +#define RTMCTL_ACC 0x20 #define RTMCTL_PRGM 0x00010000 /* 1 -> program mode, 0 -> run mode */ #define RTMCTL_SAMP_MODE_VON_NEUMANN_ES_SC 0 /* use von Neumann data in both entropy shifter and -- 2.17.1
-- andrey