TCP makes NAT traversal harder, not easier. WebRTC can be peer-to-peer thanks to its use of UDP, not in spite of it. The complications of STUN come from NAT, not from UDP. The complications of TURN come from firewalls (or overly restrictive, i.e. bad, NATs), not from UDP.
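To make the STUN point concrete, here is a minimal sketch of the wire format a client uses to learn its NAT-mapped address. The constants are the ones defined in RFC 5389; the function names are my own, and it only handles the IPv4 XOR-MAPPED-ADDRESS case, so treat it as an illustration rather than a usable client.

```python
import struct

MAGIC_COOKIE = 0x2112A442      # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001       # STUN message type: Binding Request
BINDING_SUCCESS = 0x0101       # STUN message type: Binding Success Response
XOR_MAPPED_ADDRESS = 0x0020    # attribute carrying the address the server saw


def build_binding_request(transaction_id: bytes) -> bytes:
    """Return a 20-byte STUN Binding Request with no attributes."""
    if len(transaction_id) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Header: type (2 bytes), attribute length (2), magic cookie (4), then ID (12).
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + transaction_id


def parse_xor_mapped_address(message: bytes):
    """Return (ip, port) from an IPv4 XOR-MAPPED-ADDRESS attribute, or None."""
    if len(message) < 20:
        return None
    msg_type, length, cookie = struct.unpack("!HHI", message[:8])
    if msg_type != BINDING_SUCCESS or cookie != MAGIC_COOKIE:
        return None
    offset, end = 20, 20 + length
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", message[offset:offset + 4])
        value = message[offset + 4:offset + 4 + attr_len]
        # value[1] is the address family; 0x01 means IPv4.
        if attr_type == XOR_MAPPED_ADDRESS and attr_len == 8 and value[1] == 0x01:
            xport, xaddr = struct.unpack("!HI", value[2:8])
            port = xport ^ (MAGIC_COOKIE >> 16)   # port is XORed with the cookie's top 16 bits
            addr = xaddr ^ MAGIC_COOKIE           # address is XORed with the whole cookie
            ip = ".".join(str((addr >> shift) & 0xFF) for shift in (24, 16, 8, 0))
            return ip, port
        offset += 4 + attr_len + (-attr_len % 4)  # attributes are padded to 4 bytes
    return None
```

In practice you would send `build_binding_request(os.urandom(12))` over a UDP socket to a STUN server and feed the reply to `parse_xor_mapped_address`. Nothing in the exchange is specific to working around UDP: the whole purpose is to discover what the NAT did to your address and port.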
Except they wanted to build something usable? If I could snap my fingers and make IPv6 deployment and usability a reality, I would, and I'm sure the WebRTC folks would too. This entire stance is even harder to square with the real world given how long WebRTC has been in development and use.
Well, the idea would be to move to a clean IPv6/TCP protocol stack at the same pace as IPv6 deployment.
The best compromise would have been a dual stack: the clean, pure p2p IPv6/TCP one, and the brain f*cked IPv4/NAT/UPnP/TURN/STUN/etc. one, with a clean and brutal split between the two.
In my country, people would mostly run the clean IPv6/TCP protocol stack. And I am thinking REALLY simple: no transcoding in the protocol (only the node decides whether it needs to transcode), one stream, buffered (it's not a video call). Yeah... excruciatingly simple, and therefore with many alternative implementations, aka sane.
But it is kind of a dream, I know. CDNs and streaming services won't let that happen in any way. I would not be surprised to see some DDoS-ing "from" them to keep people enslaved to their big tech "centralized" services.
I think that will be unavoidable unless HTTP/3 (QUIC) or IPv6 gains widespread support across all kinds of network infrastructure.