Condition-code bits, as in ARM and x86, force serialization of arithmetic because every instruction carries a read-write dependency on those bits. There are tricks around it, but the bits still need tracking. For wide superscalar or out-of-order processors this gets annoying.
Yes, the old, old way of having a single condition code register or the like (which dates back 40+ years) doesn't work well these days.
I like the Mill CPU approach, where every "register" (it doesn't actually have named registers) carries a full set of status bits, and not just for overflow. These include things like "not a result" (NaR), which can mark the result of a failed speculative load (because the process doesn't have permission to read that page, say).
> I thought that compilers couldn't really use this effectively…
The status bits part in general, or the speculative load stuff?
They allegedly have all of this working, privately. They haven't released any development tools or the like to the public.
I've often toyed with the idea of writing an instruction-level simulator (as opposed to the RTL sim or whatever they have internally). But even sticking to the public information, I'd likely be infringing on their patents.
No. Control bits (status bits, flags, ...) get renamed just as registers get renamed.
Basically, if there's a bottleneck in x86 code, Intel has run into it, profiled it, and generally optimized around it, both in their microarchitectures and in their C compiler.
That's one of the tricks. But it doesn't solve the problem of flag clobbers, which is why Intel ended up introducing new variants of ADD and MUL (ADCX/ADOX, which each touch only a single flag, and MULX, which writes none). Named predicate registers make it all much easier for everyone.
I think what you’re saying is basically true, but it’s a trade against code density.
If overflow checks were rare, that would be a very good point indeed. The key thing is just how frequent this stuff is in modern languages.