Stability AI launches Stable Audio
On September 13th, Stability AI launched Stable Audio: Fast Timing-Conditioned Latent Audio Diffusion
In the fast-evolving landscape of generative AI, diffusion-based generative models have been a game-changer, enhancing the creation of images, videos, and audio. “Latent diffusion models” have accelerated this process significantly, a development spearheaded by Stability AI.
However, a standing challenge in audio generation has been the fixed output size, which makes it hard to produce audio of varying lengths. Stability AI’s “Stable Audio” architecture conditions on text metadata along with the audio file’s duration and start time, letting users generate audio of a chosen length, up to the maximum set during training.
Using recent diffusion sampling techniques, the Stable Audio model can render 95 seconds of stereo audio at a 44.1 kHz sample rate in less than one second on an NVIDIA A100 GPU. The system comprises a Variational Autoencoder (VAE), a text encoder, and a conditioned diffusion model based on a U-Net.
A CLAP model encodes the text, giving a richer textual representation, and these text features condition the model through cross-attention layers within the diffusion U-Net. Stability AI also computes timing cues and embeds them as conditioning, which gives control over the length of the output audio.
Built on a 907M-parameter U-Net inspired by Moûsai, Stable Audio denoises its input conditioned on the text and timing details, an approach suited to generating long audio sequences.
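The timing conditioning can be illustrated with a small sketch. It assumes, based only on the description in this article, that each training chunk is tagged with the second at which it starts and the total length of the file it came from, and that a requested length at generation time is clipped to the training maximum. The names `TimingConditioning`, `seconds_start`, `seconds_total`, `timing_for_chunk`, and `timing_for_request` are illustrative names chosen for this sketch, and the numbers in the usage lines are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingConditioning:
    """Timing values that accompany the text prompt (illustrative structure)."""
    seconds_start: float  # where the chunk begins within the source file
    seconds_total: float  # total length of the source file, or the requested length


def timing_for_chunk(chunk_start_sample: int, file_num_samples: int,
                     sample_rate: int) -> TimingConditioning:
    """Derive timing values for a training chunk cut from a longer file."""
    if sample_rate <= 0:
        raise ValueError("sample_rate must be positive")
    if not 0 <= chunk_start_sample <= file_num_samples:
        raise ValueError("chunk must start inside the file")
    return TimingConditioning(
        seconds_start=chunk_start_sample / sample_rate,
        seconds_total=file_num_samples / sample_rate,
    )


def timing_for_request(desired_seconds: float,
                       max_seconds: float) -> TimingConditioning:
    """Build timing values for generation, clipped to the training maximum."""
    if desired_seconds <= 0 or max_seconds <= 0:
        raise ValueError("durations must be positive")
    return TimingConditioning(
        seconds_start=0.0,
        seconds_total=min(desired_seconds, max_seconds),
    )


if __name__ == "__main__":
    # Hypothetical chunk starting 14 s into an 80 s file sampled at 44.1 kHz.
    print(timing_for_chunk(14 * 44_100, 80 * 44_100, 44_100))
    # Hypothetical request for 120 s when the assumed training maximum is 95 s.
    print(timing_for_request(120.0, 95.0))
```

In a real system these two numbers would be turned into embeddings and fed to the diffusion model together with the text features; that step is omitted here because the article does not describe it in detail.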
To train the flagship Stable Audio model, Stability AI used a dataset provided by AudioSparx. The dataset pairs a wide range of audio files with textual metadata and totals over 19,500 hours of audio.
Key Features:
- Text-conditioned audio diffusion model: Stable Audio is a latent diffusion model that can generate audio conditioned on text metadata. This allows users to control the content of the generated audio by providing text prompts.
- Duration and start time control: Stable Audio also allows users to control the duration and start time of the generated audio. This is particularly useful for generating music, as it allows users to generate songs of specific lengths.
- Fast inference: Stable Audio uses a heavily downsampled latent representation of audio, which allows for much faster inference times compared to working with raw audio.
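To see why a downsampled latent speeds up inference, it helps to count how many positions the diffusion model has to process. The sketch below compares the raw sample count of a clip with its latent frame count. The downsampling factor of 1024, the 44.1 kHz sample rate, and the 10-second clip are assumptions chosen for illustration; this article does not state the actual factor, and the function names are hypothetical.

```python
def raw_samples(duration_seconds: float, sample_rate: int) -> int:
    """Number of audio samples per channel for a clip of the given length."""
    if duration_seconds < 0 or sample_rate <= 0:
        raise ValueError("duration must be non-negative and sample_rate positive")
    return round(duration_seconds * sample_rate)


def latent_frames(duration_seconds: float, sample_rate: int,
                  downsample_factor: int) -> int:
    """Number of latent frames after downsampling, rounded up."""
    if downsample_factor <= 0:
        raise ValueError("downsample_factor must be positive")
    n = raw_samples(duration_seconds, sample_rate)
    # Ceiling division: a partial final block still needs one latent frame.
    return -(-n // downsample_factor)


if __name__ == "__main__":
    # Assumed values for illustration only.
    seconds, rate, factor = 10.0, 44_100, 1024
    print("raw samples:", raw_samples(seconds, rate))
    print("latent frames:", latent_frames(seconds, rate, factor))
```

Under these assumed values the diffusion model operates on a sequence roughly a thousand times shorter than the raw waveform, which is where the speedup comes from.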
Benefits:
- Improved control: Stable Audio allows users to have more control over the generated audio, both in terms of content and duration. This makes it a more powerful tool for creative applications.
- Faster generation: Stable Audio’s fast inference times allow users to generate audio more quickly. This can be useful for a variety of applications, such as music production and sound design.
- New possibilities: Stable Audio opens up new possibilities for audio generation. For example, it can be used to generate music that is tailored to a specific genre or style or to generate sound effects for video games and movies.
Other AI News
- Google Approaches Launch of Gemini AI Software, The Information reports
Google is reportedly providing a select group of companies with early access to Gemini, its nascent conversational AI software, in a bid to compete with OpenAI’s GPT-4 model, according to The Information. The tool, expected to power features ranging from chatbots to content summarization and generation, is part of Google’s move to strengthen its position in the generative AI sector, a field currently dominated by the Microsoft-backed OpenAI, which made waves with its ChatGPT launch last year.
Gemini is also expected to help software engineers write code and to generate original images from user prompts. It will be available through Google Cloud’s Vertex AI service, although the company is currently releasing only a limited version, with a fuller version meant to match GPT-4’s capabilities expected later. The news follows Google’s recent initiatives in generative AI, including the integration of AI-powered features in its Search tool for markets in India and Japan and the extension of similar services to business clients at a subscription rate of $30 per user per month. Google has yet to comment on the developments.
- Tech Giants Convene with US Legislators; Musk Advocates for AI “Referee”
During a recent high-profile meeting on Capitol Hill with lawmakers and tech leaders including Meta Platforms CEO Mark Zuckerberg and Alphabet CEO Sundar Pichai, Tesla CEO Elon Musk urged the creation of a U.S. “referee” for artificial intelligence (AI) regulation, likening it to a sports overseer ensuring safety and public interest. Musk labelled the meeting a potential historic milestone for civilization and highlighted the dualistic nature of AI technology.
Zuckerberg encouraged a collaborative approach between government and American firms in defining guidelines for emergent technology, highlighting the government’s duty to balance progress and safeguards.
Convened by Democratic Senate Majority Leader Chuck Schumer and attended by over 60 senators, the meeting produced unanimous agreement on the urgent need for AI regulation, with a particular focus on curbing the dangers of deepfakes as the 2024 U.S. general election approaches.
Despite the consensus on urgency, lawmakers acknowledged the long road ahead, with Republican Senator Mike Rounds noting that they were not yet ready to draft legislation.
The meeting comes at a time when global regulators are grappling to establish rules governing AI, and companies are increasingly committing to voluntary AI guidelines championed by President Joe Biden to prevent misuse of the powerful technology.
- Tencent Set to Release “Hunyuan” AI Model Following Chinese Regulatory Approval
Tencent Holdings said on Friday that it will release its “Hunyuan” AI model to the public following Chinese regulatory clearance.
- Meta Platforms is developing a powerful artificial intelligence (AI) system expected to rival OpenAI’s most advanced model, according to a Wall Street Journal report on Sunday that cited informed sources.
The new system, targeted for completion next year, is meant to supersede Meta’s existing Llama 2 model and offer much-enhanced capabilities.
Llama 2, an open-source AI language model unveiled in July, is distributed through Microsoft’s Azure cloud services and competes with products from OpenAI and Google. The new system is intended to help firms build services that produce sophisticated text and analysis.
Meta plans to begin training the large language model in early 2024, although specifics may change as development progresses. Meta has yet to respond to inquiries from Reuters, but the project indicates the tech giant’s effort to capture a share of the flourishing generative AI sector, an area increasingly sought after by businesses. Apple is also reportedly working on similar developments, with a framework dubbed ‘Ajax’ and a prospective chatbot.
- Adobe Rolls Out Firefly, Generative Fill AI Tools in Major Creative Cloud Update
Adobe is revolutionizing its Creative Cloud software by deeply integrating artificial intelligence (AI) technology, spearheaded by the introduction of its new AI engine, Adobe Firefly. Firefly facilitates the creation and modification of media content through user-friendly text prompts, potentially altering the landscape of digital art and media, which has been dominated by Adobe’s platforms for decades. Firefly now broadens even further the spectrum of digital creation.
As Firefly moves from beta to general availability, Adobe also introduced standalone apps, Adobe Express Premium and Firefly, which serve as a playground for AI-generated content and simplify the creation of social media and marketing content.
However, as AI integration in creative fields expands, it brings along concerns of potential plagiarism and deception, pressing the need for robust legal frameworks. Adobe accentuates user responsibility in navigating the legal landscapes while employing AI-generated content.
On the legal side, Adobe’s Vice President Dana Rao called in a blog post for a Federal Anti-Impersonation Right (FAIR) Act to protect artists from the misuse of AI tools to mimic their work or style.
As Adobe steers into this new direction, it prompts crucial questions about the future of AI in art, initiating discussions on authenticity and the ethical dimensions surrounding AI-assisted creativity.
- AI startup Anthropic, recognized for its Claude 2 large language model assistant which rivals OpenAI’s ChatGPT, has entered into a collaboration with global management consultancy Boston Consulting Group (BCG) to facilitate direct access to Anthropic’s AI technologies, including Claude 2, for BCG’s clients.
The partnership aims to foster innovation and enhance productivity in various strategic solutions.
Anthropic, led by Dario Amodei, continues to carve out its space in the booming enterprise AI sector, following substantial investment and significant agreements, including a recent $100 million deal with South Korean telecom heavyweight SKT.
BCG plans to leverage Anthropic’s tech, notable for its adherence to strict moral and ethical guidelines, in a range of applications such as synthesizing research documents, accelerating fraud detection, and automating writing-related tasks in enterprise resource planning transformations. The collaboration underscores Anthropic’s commitment to safe AI, with Claude being designed to be “helpful, honest, and harmless.” Sylvain Duranton, the global leader of BCG’s tech build and design unit, stressed the critical balance between ethical AI application and potent financial outcomes.
The two firms, already having cooperatively hosted a UN workshop focusing on responsible AI utilization, anticipate the collaborative effort to guide businesses in ethically leveraging AI’s transformative potential. Anthropic’s head of strategic partnerships, Neerav Kingsland, expressed enthusiasm for the collaboration, highlighting BCG’s reputation for steering enterprise companies through tech transitions responsibly.
This venture marks a significant expansion for Anthropic, which is valued at approximately $5 billion, and follows a trend of AI firms promoting their products through strategic alliances, as seen with recent initiatives from other AI giants including OpenAI. Industry experts forecast that the burgeoning generative AI sector could unlock trillions in global corporate profits and broaden access to AI technologies.
- Professional services firm EY is launching an AI-integrated platform following a substantial $1.4 billion investment in artificial intelligence technology.
This strategic move, aligning with rivals such as KPMG and Accenture, is indicative of the rising reliance on AI to spur business growth.
The newly unveiled EY.ai platform incorporates AI functionalities into popular EY offerings, including the widely-utilized data management solution, EY Fabric. “AI’s moment is now,” emphasized EY Global Chairman and CEO Carmine Di Sibio, signalling the pervasive role of AI in shaping the future business landscape.
A notable component of this venture is the development of a large language model named EY.ai EYQ. EY’s approach is supported by collaborations with influential players like Microsoft, which give it early access to resources such as Azure OpenAI capabilities.
By fostering partnerships with tech stalwarts including Dell, SAP, and Thomson Reuters, EY demonstrates its commitment to leveraging AI technology to its fullest potential, ensuring a prepared and informed approach to AI integration in business operations.
- Alibaba has announced plans to grant public access to its artificial intelligence (AI) model, Tongyi Qianwen, a decision that reflects the regulatory green light from Chinese authorities to commercialize the model amid heightened focus on AI development in the ongoing tech rivalry between China and the US.
According to a statement from the Alibaba Cloud Intelligence Division on WeChat, collaborations have been initiated with several organizations, including OPPO and Zhejiang University, to leverage Tongyi Qianwen for training large language models and creating language model applications. The division revealed that a free open-source variant for commercial utilization is slated for release soon.
The unveiling comes in the wake of internal leadership shifts at Alibaba, including the appointment of Eddie Wu as the new Group CEO. Wu, underscoring the pivotal role of AI in shaping the future trajectory of Alibaba, emphasized in a recent communique to employees the imperative to keep abreast of AI advancements to avoid obsolescence. Alibaba initially introduced Tongyi Qianwen in April, with plans for comprehensive integration into business apps, marking a substantial stride in the firm’s AI-driven strategy.
- Nvidia’s stronghold in AI chip manufacturing has significantly reduced venture capital deals for potential competitors, with U.S. deals plummeting 80% this quarter compared to the previous year.
The Santa Clara-based firm’s prowess in language data chips has discouraged investors from heavily backing startups entering this sphere. Greg Reichow of Eclipse Ventures points out the difficulty in penetrating this market due to Nvidia’s dominance.
By August’s end, U.S. chip startups secured $881.4 million in funding, a sharp decline from the $1.79 billion raised in the first three quarters of 2022. Deal numbers also dwindled from 23 to just four.
Despite Nvidia’s marked presence, competitors like AMD are gearing up to challenge with new launches. Intel has also joined the fray through strategic acquisitions, presenting potential long-term alternatives to Nvidia’s offerings. Another promising arena is data-intensive computing chips for prediction algorithms, where Nvidia hasn’t established dominance, indicating investment opportunities.
- Silicon Valley startup Enfabrica has raised $125 million in a Series B funding round to advance its development of networking chips for AI data centres, with Nvidia coming in as a strategic investor.
The startup, established by former executives of Broadcom and Google, aims to remedy a significant issue faced by Nvidia’s graphics processing unit (GPU) chips: sitting idle due to insufficient data feed from connected networks.
Enfabrica has crafted a chip that facilitates enhanced connectivity in data centres, preventing delays by allowing Nvidia’s GPUs to access data from various sources simultaneously. This innovation promises more efficient GPU usage, potentially halving the number of chips required for the same computing workload, thereby reducing costs.
“The key is enabling better GPU utilization,” remarked Enfabrica CEO Rochan Sankar, underscoring the need to lower costs to foster widespread AI computing adoption. The latest funding round, steered by Atreides Management, welcomed participation from several new and existing investors, including IAG Capital Partners and Sutter Hill Ventures.
- Shares of Arm Holdings, the British chip designer, fell 4.5% to close at $60.75 on Friday, following a remarkable Nasdaq debut in which the company reached a valuation of $65 billion.
The decline was attributed to broader market trends, with major U.S. stock indexes and an index of semiconductors witnessing dips amid concerns about weakened consumer demand. The fall was also partially ascribed to profit-taking following a significant leap in the initial session.
Despite the volatile session, there remains optimism surrounding Arm’s potential growth in the cloud computing market, a sector anticipated to grow annually at 17% until 2025, largely driven by advancements in AI. While Arm currently holds a 10% share in this market, analysts see an opportunity for the firm to leverage its specialization in energy-efficient central processing units (CPUs) to further benefit from the AI surge, potentially in partnership with giants like Nvidia.
The market anticipates increased trading volatility for Arm, with a limited number of public shares and SoftBank retaining a 90% stake. Arm’s potential inclusion in indexes such as the Nasdaq 100 fuels hope for a revival in the IPO market, buoyed by the company’s high-growth prospects despite a marginal dip in full-year sales reported ahead of the IPO.
Experts advise a cautious approach, with brokerage Needham initiating stock coverage with a ‘hold’ rating as they await a more opportune entry point, citing the IPO valuation as too steep despite Arm’s promising avenue for growth through a larger capture of smartphone value. The coming days will see the launch of options contracts on Arm Holdings on Nasdaq exchanges, offering investors a fresh avenue to speculate on the trajectory of the year’s most substantial initial public offering.
About The Author
Bogdan Iancu
Bogdan Iancu is a seasoned entrepreneur and strategic leader with over 25 years of experience in diverse industrial and commercial fields. His passion for AI, Machine Learning, and Generative AI is underpinned by a deep understanding of advanced calculus, enabling him to leverage these technologies to drive innovation and growth. As a Non-Executive Director, Bogdan brings a wealth of experience and a unique perspective to the boardroom, contributing to robust strategic decisions. With a proven track record of assisting clients worldwide, Bogdan is committed to harnessing the power of AI to transform businesses and create sustainable growth in the digital age.