Google researchers have developed an AI tool, MusicLM, that can generate high-quality music from text descriptions.
MusicLM generates music at 24 kHz that remains consistent over several minutes. It outperforms previous systems both in audio quality and adherence to the text descriptions, Google claimed in a research paper on Thursday.
The model can even transform a whistled or hummed melody according to the style described in a text caption, Google added.
However, the company does not plan to release the model at this time, acknowledging several risks associated with it and the use case it tackles.
“We strongly emphasize the need for more future work in tackling these risks associated to music generation — we have no plans to release models at this point,” Google said in the research paper.
Generated samples will reflect biases present in the training data, raising questions about the appropriateness of music generation for cultures underrepresented in that data, as well as concerns about cultural appropriation, Google noted.
Future work may focus on lyrics generation, improved text conditioning and vocal quality, modeling of high-level song structure such as introduction, verse and chorus, and modeling music at a higher sample rate, according to the research paper.