That's like saying “The ‘Web’ can't display images, the ‘Web’ needs plug-ins to do so. Plug-ins like libjpeg or giflib.”
Right now, you can put a <video> or <audio> tag on a page and it will work in something like 95% of the browsers in use. That's already better coverage than Flash, and it's going up as IE8 users upgrade to newer versions of IE or install Chrome/Firefox.
Sure, you can't rely on users not recompiling their browser to disable it, but you also can't rely on 100% support for anything on the web – users disable image loading, plug-ins, stylesheets or JavaScript, install incredibly overzealous ad-blockers, use ISPs which tamper with page contents, etc.
The 'Web' can't play video or audio, the 'Web' needs plug-ins to do so. Plug-ins like H264 decoders and Flash.