It's not impossible to run 16-bit code on a 64-bit CPU, just not possible to run it directly while the CPU is operating in 64-bit long mode. In 32-bit protected mode you can still thunk down and run 16-bit instructions, and in 64-bit long mode you can do the same and run 32-bit instructions, but you cannot thunk from 64-bit long mode down to 16-bit. Because of that, supporting 16-bit software on a 64-bit OS would require a full-blown x86 emulated environment, and in the case of Windows that apparently wasn't considered a worthwhile engineering investment. Fact is, you can download any number of emulation or virtualization programs and run that software, just not directly. I have some PCem setups that attempt to accurately emulate the original IBM PC as well as the IBM XT, so perhaps the thought was that users could download Bochs, PCem, VMware, VirtualBox, etc. if they needed to run that sort of software.
The 68K had the early benefit of being designed as a 32-bit architecture from the start. Since there was no legacy to be backwards compatible with, the entire instruction set could be 32-bit, which simplified everything: no mode or context switches. Even the 6888x co-processors were better designed, again largely because there was no particular need for backwards compatibility. Intel learned that lesson twice, first with the iAPX 432 and later with the Itanium: as much as they might want to, they cannot realistically come out with a new CPU design that drops backwards compatibility, because no matter how revolutionary or amazing its design, nobody will want it unless they can use it with existing software. The 68K also got a bit of a jump-start by being the processor in the Macintosh, which helped cement its position.
Even the co-processor design was better. The original 8087 was sort of a mess: a completely different, stack-based register model bolted onto the 8086, and thanks to backwards compatibility x87 is still stack-based today. Rather goofy design. The 6888x co-processors were generally "better" and simpler to interface with. Their main downfall was the transcendental trig functions, which could actually be executed faster by emulating floating point on the main 68K, largely thanks to some cache performance differences.
The 68K's big disadvantage, and why Apple eventually dumped it in favour of the PowerPC, was that it wasn't really forward-capable: it couldn't evolve quickly enough to compete with other processors. Even the 80286 was giving it a run for its money in '87, and the 386 was more or less the finishing move. Apple managed to move on to the PowerPC by 1994. PowerPC itself being another beast altogether, too.
PowerPC is pretty interesting as well. There are some Linux distributions that will run on it (Yellow Dog, for example). Like the 68K, PPC eventually got supplanted because Intel simply had more manufacturing scale available to overcome engineering problems.
Not to mention the PowerPC's issues regarding power draw. A fully loaded Power Mac G5 could draw a full kilowatt! Later models even had to use a different IEC connector (C20) for the higher amperage. Mine is a "boring" single-core, single-CPU system.
The 64-bit transition with Intel is interesting as well. I mentioned the Itanium: the IA-64 architecture was Intel's first foray into 64-bit, and for the most part they sunset x86 support altogether; legacy x86 code had to run through a slow compatibility layer. In some sense Intel was going "Alright, time to get rid of all this legacy crap so we can make a nice speedy chip!" but... well, nobody really liked it. Software vendors needed to provide native IA-64 versions of their software or it would run terribly through emulation, vendors would only commit if the userbase was large enough, and users wouldn't buy the chips unless the software they used ran on them, etc.
AMD came out with AMD64, defining a 64-bit long mode that let the chip remain fully backwards compatible while supporting 64-bit operation. More or less, AMD did the same thing Intel had done when moving to 32-bit. Intel eventually gave up on IA-64 and, through a series of legal battles with AMD, eventually agreed to license AMD64.
The most interesting part to me is where "x64" came from. In "x86", after all, the x stands in for the 2, 3, 4, 5 (Pentium), etc. of the 80x86 line, but what does the x in x64 stand for? I recently saw some rants about how "x64" is gibberish and nobody should say it because it's wrong.
That caused me to look into where the term came from. I doubted it was "a stupid naming decision made by idiots at Microsoft with Windows XP x64 Edition" and was sure there was reasoning behind it.
And it turns out there was. AMD and Intel do actually implement slightly different instruction sets: Intel's is called "Intel 64" and AMD's is called "AMD64", so "x64" refers to the common instruction set between them; the x "stands in" for AMD and Intel.