
Yet another cherry-picked attempt at music gen, with a "demo" page that only contains the outputs that happen to not sound like incoherent noise. There's a reason ChatGPT and Midjourney are so popular but no music-gen tool has come close: you can actually create stuff with them that is useful and/or enjoyable. Good music gen is much harder, and the reasons for this (still unclear) are pretty important to the future of AI, imo.


That's quite a harsh critique that doesn't seem warranted. There are quite a lot of different examples with varying quality. Granted, it's likely that these are to some extent cherry picked, but it's unlikely that these are rare and that most output sounds like incoherent noise, as you seem to insinuate.


Their current demos seem worse than the riffusion results I get on average. Music gen is hard because music is inherently composed of many different instruments, each with a unique sound and function. Simply training end-to-end will almost always end badly.


I'm sorry, I like riffusion as much as the next guy, but in what world is any riffusion example better than the literal first example on this demo page?


Might be personal preference, but I don't think those examples are good at all. The only things I'm impressed with are the story mode and the conditioning. I typically use riffusion to generate swing/electropop music, so maybe I'm biased.


> Music gen is hard because music is inherently composed of many different instruments, each with a unique sound and function. Simply training end-to-end will almost always end badly.

Yes! That's what I figured when I started my project to create an AI assistant for melodies (http://melodies.ai/). I'm quite sure that splitting it up is the way to go, and that the first AI hit song will not be created as a full song but by combining instruments/vocals.


FYI, they don't actually train end-to-end: they have three separate models, each trained independently via self-supervised learning (I think; I haven't read the paper very carefully).
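To illustrate the distinction being made, here's a minimal sketch of a cascaded pipeline where each stage is trained on its own objective and the stages are only chained at inference time. All names (`Stage`, `"semantic"`, `"coarse"`, `"fine"`) are hypothetical placeholders, not from the paper:

```python
# Sketch: independently trained stages, composed only at inference.
# No gradients flow between stages, unlike end-to-end training.
class Stage:
    def __init__(self, name):
        self.name = name
        self.trained = False

    def train(self, data):
        # Stand-in for each stage's own self-supervised objective.
        self.trained = True
        return self

    def __call__(self, x):
        assert self.trained, f"stage {self.name} must be trained first"
        return f"{self.name}({x})"

# Each stage is trained separately on (here, dummy) data.
stages = [Stage(n).train(data="raw_audio")
          for n in ("semantic", "coarse", "fine")]

def generate(prompt):
    # Chain the frozen stages only at generation time.
    out = prompt
    for stage in stages:
        out = stage(out)
    return out

print(generate("text_prompt"))  # fine(coarse(semantic(text_prompt)))
```

The point is just that "not end-to-end" means the composition happens after training, not during it.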


Makes one wonder if there are models that rely on loops.


Well... not all releases have to be toys for the average Twitter user to play with. Most of them are academic work meant to show a team's progress.


I'm not saying this about this one, but some releases are akin to Nikola rolling their truck down a hill. Like the Google phone assistant that would call and make appointments for you: very likely cherry-picked at the time. Or Microsoft's Milo, which was almost entirely staged.





