The performance of microkernels was a hotly debated topic back in the '90s (and still comes up today); microkernels generally had lower performance than their monolithic counterparts. There was a famous argument comparing Linux (monolithic) to MINIX (micro) [0]. Wikipedia has simple explanations of the differences between hybrid, monolithic, and micro kernels that serve as a decent primer [1], plus a visual that I find helpful [2].
[0] https://en.wikipedia.org/wiki/Tanenbaum%E2%80%93Torvalds_deb...
[1] https://en.wikipedia.org/wiki/Kernel_(operating_system)
[2] https://en.wikipedia.org/wiki/Hybrid_kernel#/media/File:OS-s...
My opinion, summed up ridiculously concisely, is that the reasons are largely accidents of timing (i.e. there was room in the market for new entrants), mixed with trivial-at-the-time but high-impact choices. Had Tanenbaum been permitted by his publisher to give away copies of MINIX freely, Linux might never have been started at all. Had Tanenbaum chosen raw performance over teachability (the kernel was a teaching tool, after all), it may have made a difference. (Warning: this part is controversial.) There's also a lot of question about whether the GPL licensing kept Linux relevant, or whether it was largely inconsequential. My opinion is that the GPL made a big difference. Had the licensing of MINIX been GPL-flavored, it probably wouldn't have been used by companies like Intel for the Management Engine, but there might have been enough "forced sharing" to keep it relevant. Impossible to say for sure, but that's my two cents.
https://en.wikipedia.org/wiki/Tanenbaum–Torvalds_debate
Which, in a nutshell, comes down to Linus's view that producing a system that met current needs (for example by offering a simple API) and was reasonably performant -- even if it ran on but a single architecture such as the i386 -- was more promising than a design offering "theoretical" benefits (such as portability) but which either didn't deliver the goods at the time or just wasn't reasonably performant.
This is a vast simplification of a much broader set of issues, but if one digs into the history of why subsequent projects (e.g. Windows NT) moved away from microkernel designs, or into the roughly analogous microservices debate of the past decade, one often finds echoes of the same basic considerations.
Windows and the Unix OSes just have a lot more market penetration, strong network effects, and a big first-mover advantage. That means that when you are buying your OS or your hardware or your software, you're probably stuck with a monokernel.
Some kinds of code are inherently in the core failure domain, and trying to use restartable processes for them is impossible. For example, if the code configuring your PCI controller crashes halfway through, well, you need to reboot. You can't just restart the driver, because you have no idea what state the hardware is in. A reboot gets you back into a known state; a server restart doesn't. Likewise, if your root filesystem server segfaults, you can't restart it, because your process-start code is going to involve sending RPCs to that same filesystem server, so it'd just deadlock.
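To make the root-filesystem case concrete, here's a toy sketch (the server and RPC names are made up for illustration, not taken from any real microkernel): the restart path itself depends on the very server that just died.

    /* Toy illustration (all names hypothetical): restarting the root
     * filesystem server requires exec'ing its binary, but exec'ing a
     * binary requires talking to the filesystem server. */
    #include <stdio.h>

    /* Pretend RPC to the root filesystem server. With the server dead,
     * a real microkernel IPC call here would block forever. */
    static int fs_rpc_open(const char *path) {
        fprintf(stderr, "fs_rpc_open(%s): no filesystem server to answer\n", path);
        return -1;
    }

    /* Restarting any server means loading its binary from disk... */
    static int restart_server(const char *binary_path) {
        int fd = fs_rpc_open(binary_path);  /* ...which needs the fs server */
        if (fd < 0)
            return -1;                      /* circular dependency: deadlock */
        /* load_and_exec(fd); */
        return 0;
    }

    int main(void) {
        return restart_server("/sbin/rootfs-server");
    }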
Finally, it's a spectrum anyway. Microkernel vs. monolithic is not a hard divide, and code moves in and out of those kernels as time and hardware change. The graphics stack on Windows NT started outside the kernel, moved into the kernel, then moved back out again. In all OSes the bulk of the code resides in userspace libraries and in servers connected via IPC.
Also, the model of a Unix kernel with services (daemons) in user space is kind of similar to the microkernel model. The graphical desktop is not in the kernel, the mail server isn't in the kernel, crond isn't in the kernel, the web server isn't in the kernel, the database server isn't in the kernel... and maybe that's as close as we need to get to the microkernel.
Speaking of web servers in user space: at least for serving a static page, it's faster for that to happen inside the monolithic kernel, where it can access the protocol stack and filesystem code without crossing a protection boundary. Over the history of Linux, we have seen experimentation both with moving kernel things into user space (like with FUSE) and with moving user things into the kernel (the TUX server, now evidently unmaintained).
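For a sense of what the user-space side of that trade-off looks like, here's a deliberately naive static-file responder; every request costs several user/kernel crossings (accept, open, fstat, write, sendfile), which is exactly the overhead an in-kernel server like TUX was avoiding. The port and filename are arbitrary, and error handling is omitted.

    /* A deliberately naive user-space static file server. Each request
     * crosses the user/kernel protection boundary several times
     * (accept, open, fstat, write, sendfile) -- the overhead an
     * in-kernel server avoids. Port 8080 and index.html are arbitrary;
     * error handling is omitted. */
    #include <fcntl.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <sys/sendfile.h>
    #include <sys/socket.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void) {
        int srv = socket(AF_INET, SOCK_STREAM, 0);
        int one = 1;
        setsockopt(srv, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

        struct sockaddr_in addr = {0};
        addr.sin_family = AF_INET;
        addr.sin_port = htons(8080);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        bind(srv, (struct sockaddr *)&addr, sizeof(addr));
        listen(srv, 16);

        for (;;) {
            int conn = accept(srv, NULL, NULL);     /* syscall: boundary crossing */
            int fd = open("index.html", O_RDONLY);  /* syscall: boundary crossing */
            struct stat st;
            fstat(fd, &st);                         /* syscall: boundary crossing */

            char hdr[128];
            int n = snprintf(hdr, sizeof(hdr),
                             "HTTP/1.0 200 OK\r\nContent-Length: %lld\r\n\r\n",
                             (long long)st.st_size);
            write(conn, hdr, n);                    /* syscall: boundary crossing */
            sendfile(conn, fd, NULL, st.st_size);   /* avoids a copy, still a syscall */

            close(fd);
            close(conn);
        }
    }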
- the people required to work on it
- in a timely fashion
- in a coordinated fashion
- the resources required
- bugs, updates, maintenance
2. The competition for aforementioned talent
3. The competition for aforementioned resources
4. Profit motive
You can write a rocket OS as a "proper" microkernel, but there's no way you could write a full-blown Windows OS as a "proper" microkernel.
That said, Windows or Linux can take pages from the microkernel playbook and apply them judiciously: KVM's architecture is arguably an example of this, where the in-kernel component does only the minimum necessary and everything else is pushed to the user-mode component (QEMU, Firecracker, etc.).
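To make the KVM split concrete: the kernel exposes a handful of ioctls on /dev/kvm, and device emulation, firmware, and policy all live in the user-space VMM. A stripped-down sketch (error handling omitted; the "guest" is a single HLT instruction):

    /* Sketch of the KVM split: the kernel exposes a small ioctl surface
     * on /dev/kvm, and everything else (devices, firmware, policy) is
     * the user-space VMM's job. The "guest" is one HLT instruction
     * (0xf4) executed in real mode. Error handling is omitted. */
    #include <fcntl.h>
    #include <linux/kvm.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/ioctl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void) {
        int kvm = open("/dev/kvm", O_RDWR);
        int vm  = ioctl(kvm, KVM_CREATE_VM, 0);

        /* One page of "guest RAM" containing a single HLT. */
        unsigned char *mem = mmap(NULL, 0x1000, PROT_READ | PROT_WRITE,
                                  MAP_SHARED | MAP_ANONYMOUS, -1, 0);
        mem[0] = 0xf4;

        struct kvm_userspace_memory_region region = {
            .slot = 0,
            .guest_phys_addr = 0x1000,
            .memory_size = 0x1000,
            .userspace_addr = (unsigned long)mem,
        };
        ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

        int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);
        int run_sz = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
        struct kvm_run *run = mmap(NULL, run_sz, PROT_READ | PROT_WRITE,
                                   MAP_SHARED, vcpu, 0);

        /* Point the vCPU at the HLT, flat real mode. */
        struct kvm_sregs sregs;
        ioctl(vcpu, KVM_GET_SREGS, &sregs);
        sregs.cs.base = 0;
        sregs.cs.selector = 0;
        ioctl(vcpu, KVM_SET_SREGS, &sregs);

        struct kvm_regs regs;
        memset(&regs, 0, sizeof(regs));
        regs.rip = 0x1000;
        regs.rflags = 0x2;
        ioctl(vcpu, KVM_SET_REGS, &regs);

        ioctl(vcpu, KVM_RUN, 0);
        printf("exit reason %d (KVM_EXIT_HLT = %d)\n",
               run->exit_reason, KVM_EXIT_HLT);
        return 0;
    }

All the interesting decisions -- what devices exist, how guest memory is laid out, what to do on each exit -- stay in the user-space process; the kernel module essentially just runs vCPUs and forwards exits, which is about as microkernel-ish as a monolithic kernel gets.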
Now the question becomes: why are these components built in the kernel and not in userspace? The answer is clear for each individual component. See how the in-tree NTFS driver has been gradually replacing NTFS-3g. Basically, when an in-tree solution exists, it just gets preferred over userspace solutions because of performance, reliability, and so on.
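For contrast, the NTFS-3g side of that story is just a FUSE filesystem, i.e. an ordinary process answering VFS requests the kernel forwards to it. Here's a minimal toy in the same style (written against the libfuse 2 API; the filename hellofs.c, the file /hello, and its contents are made up for illustration):

    /* A toy FUSE filesystem in the NTFS-3g mold: an ordinary process that
     * answers VFS requests forwarded by the kernel. It exposes a single
     * read-only file, /hello. Written against the libfuse 2 API; compile
     * with: gcc hellofs.c $(pkg-config --cflags --libs fuse) */
    #define FUSE_USE_VERSION 26
    #include <fuse.h>
    #include <errno.h>
    #include <string.h>
    #include <sys/stat.h>

    static const char *contents = "hello from user space\n";

    static int fs_getattr(const char *path, struct stat *st) {
        memset(st, 0, sizeof(*st));
        if (strcmp(path, "/") == 0) {
            st->st_mode = S_IFDIR | 0755;
            st->st_nlink = 2;
        } else if (strcmp(path, "/hello") == 0) {
            st->st_mode = S_IFREG | 0444;
            st->st_nlink = 1;
            st->st_size = strlen(contents);
        } else {
            return -ENOENT;
        }
        return 0;
    }

    static int fs_readdir(const char *path, void *buf, fuse_fill_dir_t filler,
                          off_t off, struct fuse_file_info *fi) {
        (void)off; (void)fi;
        if (strcmp(path, "/") != 0)
            return -ENOENT;
        filler(buf, ".", NULL, 0);
        filler(buf, "..", NULL, 0);
        filler(buf, "hello", NULL, 0);
        return 0;
    }

    static int fs_read(const char *path, char *buf, size_t size, off_t off,
                       struct fuse_file_info *fi) {
        (void)fi;
        if (strcmp(path, "/hello") != 0)
            return -ENOENT;
        size_t len = strlen(contents);
        if ((size_t)off >= len)
            return 0;
        if (off + size > len)
            size = len - off;
        memcpy(buf, contents + off, size);
        return (int)size;
    }

    static struct fuse_operations ops = {
        .getattr = fs_getattr,
        .readdir = fs_readdir,
        .read    = fs_read,
    };

    int main(int argc, char *argv[]) {
        /* fuse_main mounts at the given mountpoint and loops,
         * answering kernel upcalls with the handlers above. */
        return fuse_main(argc, argv, &ops, NULL);
    }

Run it as ./hellofs /some/mountpoint, and every stat(), readdir(), or read() under that mountpoint becomes a round trip into this process -- which is both the appeal (isolation, crash containment, easy development) and the overhead that pushes heavily used filesystems back in-tree.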