
8 Oct 2009, 11:30 p.m.
On Thursday 08 October 2009 07:29:51 Alessandro Rubini wrote:
> Similarly, I'm not interested in "4 bytes at a time, then 1 at a time" as it's quite a corner case. If such optimizations are really useful, then we'd better have hand-crafted assembly for each arch, possibly lifted from glibc.
why? it's trivial to implement with little code impact. have your code run while the len is at least 4 (sizeof-whatever), then fall through to the byte loop that runs while the len is larger than 0 instead of immediately returning.
-mike
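
For illustration, the fall-through described above might look roughly like the sketch below. It is not the patch under discussion: the name sketch_memcpy is hypothetical, unsigned long stands in for "sizeof-whatever", and the alignment check is an added assumption so the word loop is only taken when both pointers are word-aligned.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name; a sketch, not the function from the patch. */
void *sketch_memcpy(void *dest, const void *src, size_t len)
{
	unsigned char *d = dest;
	const unsigned char *s = src;
	const size_t word = sizeof(unsigned long);

	/*
	 * Assumption: only copy a word at a time when both pointers are
	 * word-aligned. The casts also assume a build that tolerates this
	 * kind of type punning (e.g. -fno-strict-aliasing).
	 */
	if ((((uintptr_t)d | (uintptr_t)s) & (word - 1)) == 0) {
		while (len >= word) {
			*(unsigned long *)d = *(const unsigned long *)s;
			d += word;
			s += word;
			len -= word;
		}
	}

	/*
	 * Fall through instead of returning: the byte loop copies the
	 * tail, or the whole buffer if the word loop was skipped.
	 */
	while (len > 0) {
		*d++ = *s++;
		len--;
	}

	return dest;
}
```

The only difference from a "words or bytes" version is that the word loop does not return when it finishes, so a length such as 11 is handled as whole words followed by the remaining bytes, at the cost of a few extra lines.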