Stability AI Introduces Offline Audio Generation for Smartphones

Stability AI and Arm have optimized the Stable-Audio-Open model to run on smartphone CPUs, significantly reducing the time required to generate sound.

Stable-Audio-Open can now create sound effects and audio clips in seconds without an internet connection. To achieve this, Stability AI and Arm adapted the AI model for mobile processors using Arm’s KleidiAI library alongside Stability AI’s proprietary technology.

Thanks to this collaboration, audio generation time fell from 240 seconds to 8 seconds, a 30-fold speedup, in one test case on an Armv9 CPU. According to a joint press release, the companies aim to eliminate the need for high-performance hardware in this type of application.

According to company representatives, this makes Stable-Audio-Open the first application capable of generating audio on a smartphone without relying on cloud-based processing. Most comparable solutions still depend on cloud computing. However, the model is not yet available for public use, and it remains unclear when or in what form Stability AI and Arm will release it to users.

Stability AI has also emphasized that this is only the beginning of its collaboration with Arm. The partnership is expected to play a key role in enhancing media creation capabilities on mobile devices, extending beyond audio to include image, video, and even 3D content generation.

The open-source Stable-Audio-Open model was originally introduced in June 2024. It allows users to generate audio clips up to 47 seconds long from text prompts.