Meta has introduced SAM Audio, impressing the community with its ability to separate individual sounds from audio or video using a single click or a text command. Users can now easily isolate, for example, a dog's bark amid city noise without spending hours on complex editing. The model uses an approach similar to visual segmentation, but applies it to sound.
To interact with SAM Audio, you simply select the target sound with a click, a text query, or a timestamp. The interface is simple and intuitive, which makes the tool accessible even to beginners. Meta has also released the code and model weights, so anyone can experiment with the technology on their own.
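To make the text-prompt workflow concrete, here is a minimal sketch of what prompt-based separation could look like in Python. The `separate` function and its signature are hypothetical placeholders, not SAM Audio's actual API; only the `torchaudio` loading and saving calls are real, and the exact interface should be taken from Meta's released code.

```python
# Hedged sketch of text-prompted source separation.
# `separate` is a hypothetical placeholder, NOT the real SAM Audio API.
import torch
import torchaudio


def separate(waveform: torch.Tensor, sample_rate: int, prompt: str) -> torch.Tensor:
    """Placeholder for the released model: given a mixed signal and a text
    prompt such as "dog barking", return only the matching source."""
    raise NotImplementedError("Swap in the actual SAM Audio inference call here.")


def isolate_source(in_path: str, out_path: str, prompt: str) -> None:
    # Load the mixture (e.g. a street recording with a barking dog).
    waveform, sample_rate = torchaudio.load(in_path)
    # Ask the model for only the prompted source.
    isolated = separate(waveform, sample_rate, prompt)
    # Save the separated track for further editing.
    torchaudio.save(out_path, isolated, sample_rate)


if __name__ == "__main__":
    isolate_source("street.wav", "dog_only.wav", prompt="dog barking")
```

The same pattern would apply to click or timestamp prompts: only the prompt argument changes, while the load-separate-save flow stays the same.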
This approach opens up new possibilities for video and audio editing: removing unwanted noise or keeping only the voice has become much easier. The model works quickly and reliably and requires no special expertise. Such a tool is especially valuable for content creators, journalists, and anyone else working with sound.
Meta has made SAM Audio open to everyone, which immediately sparked a wave of discussion in the AI community. Participants are already sharing their first impressions and ideas for using the technology in their own projects. Thanks to its simplicity and precision, SAM Audio promises to become a new standard in sound editing.

