The 80186 already added dedicated logic for address computation, as well as multiply/divide and repeated shift/rotate (no barrel shifter, but only one cycle for each shifted bit instead of several as on the 8086). It also had the extended instructions, except for those related to protected mode.

However, the microcode format remained essentially the same[1], so I don't think there was a fundamental redesign of either the EU or the BIU.

The '286 is different, and not just in the BIU (which now has to enforce segment limits etc.). From what I've pieced together from die shots and US patent 4442484:

There are three 6-bit fields to select registers for each micro-instruction. ALU operations can apparently take any register as an operand; only immediate values have to be loaded into a temporary register first[2]. The microcode is also organized more like a conventional ROM instead of being addressed directly by opcodes.
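
To make the "three 6-bit fields" concrete, here is a small sketch of how such selectors could be packed into a micro-instruction word. The bit positions and the names `field_a`, `field_b`, `field_c` are my own assumptions for illustration; I don't know the real layout or which field plays which role.

```python
FIELD_BITS = 6
FIELD_MASK = (1 << FIELD_BITS) - 1  # 0x3F: each field can select one of 64 registers


def pack_reg_fields(field_a, field_b, field_c):
    """Pack three 6-bit register selectors into 18 bits (hypothetical layout)."""
    for value in (field_a, field_b, field_c):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("register selector must fit in 6 bits")
    return (field_a << 12) | (field_b << 6) | field_c


def unpack_reg_fields(word):
    """Split the 18-bit value back into its three 6-bit selectors."""
    return (
        (word >> 12) & FIELD_MASK,
        (word >> 6) & FIELD_MASK,
        word & FIELD_MASK,
    )
```

Six bits per field gives room for 64 selectable registers, far more than the eight architectural general-purpose registers, which fits with segment registers and internal temporaries being addressable the same way.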

Bytes from the prefetch queue first go through a separate decoding stage. An "entry point PLA" translates the opcode (with additional inputs for 0Fh and REP prefixes, real/protected mode, and "modr/m extended opcodes") into a microcode address. That address, any operands (including a 16-bit immediate and a 17(?)-bit displacement field), and other flags are placed into a "decoded instruction queue" holding up to three instructions.
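
As a rough software model of that decode stage (not how the hardware is actually built): the PLA becomes a lookup table and the decoded instruction queue a three-entry FIFO. The class and field names are hypothetical, the microcode addresses are made-up placeholders, and I've left out the modr/m extended-opcode input and the "other flags" to keep it short.

```python
from collections import deque
from dataclasses import dataclass

QUEUE_DEPTH = 3  # the decoded instruction queue holds up to three instructions


@dataclass
class DecodedInstruction:
    microcode_address: int
    immediate: int = 0
    displacement: int = 0


# Stand-in for the entry point PLA.
# Key: (opcode, has_0f_prefix, has_rep_prefix, protected_mode).
# Values are placeholder microcode addresses, not real ones.
ENTRY_POINTS = {
    (0x90, False, False, False): 0x010,  # NOP
    (0xA4, False, True, False): 0x020,   # REP MOVSB
}


class DecodeStage:
    def __init__(self):
        self.queue = deque()

    def decode(self, opcode, has_0f=False, has_rep=False, protected=False,
               immediate=0, displacement=0):
        """Try to decode one instruction; return False if the queue is full."""
        if len(self.queue) >= QUEUE_DEPTH:
            return False  # decoder stalls until the execution unit drains an entry
        address = ENTRY_POINTS[(opcode, has_0f, has_rep, protected)]
        self.queue.append(DecodedInstruction(address, immediate, displacement))
        return True

    def next_for_execution(self):
        """Hand the oldest decoded instruction to the execution unit, if any."""
        return self.queue.popleft() if self.queue else None
```

The point of the queue is that decoding runs ahead of execution: the microcode sequencer picks up a ready-made entry address and operands instead of waiting on the prefetch queue byte by byte.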

From what I've read, the 386 was actually very similar despite adding 32-bit registers and paging. The next major changes to the microarchitecture came with the 486 and Pentium.

[1] https://news.ycombinator.com/item?id=34334799

[2] https://rep-lodsb.mataroa.blog/blog/the-286s-internal-regist...