SMAUG-72B: Multi-Modal Transformers Design using DALL-E 3 & Canva

Introducing Mighty “Smaug-72B” – The New King of Open-Source AI

Introduction

  • “Smaug-72B,” a groundbreaking open-source language model, has been released by Abacus AI, setting a new benchmark in the AI community. This model, derived from “Qwen-72B” developed by the Qwen team at Alibaba Group, has quickly ascended to the top of the Hugging Face Open LLM leaderboard, surpassing notable models such as GPT-3.5 and Mistral Medium.

Features

  • Origin and Development: Smaug-72B is a fine-tuned iteration of Qwen-72B, incorporating advanced techniques to enhance its performance.
  • Open-Source Accessibility: As an open-source model, Smaug-72B is freely available for download, use, and modification, fostering innovation and collaboration within the AI community.
  • Comprehensive Training: Leveraging a dataset that spans a wide array of knowledge domains, Smaug-72B has been trained to understand and generate natural language with remarkable accuracy.

Benefits

  • Superior Benchmark Performance: Smaug-72B has outperformed other leading models on several key benchmarks, making it the first open-source model to achieve an average score above 80 on the Hugging Face Open LLM leaderboard.
  • Enhanced Reasoning and Math Skills: Specialized fine-tuning techniques have significantly improved the model’s performance in reasoning and math tasks, areas where many large language models traditionally struggle.
  • Promotion of AI Democratization: By being open-source, Smaug-72B contributes to the democratization of AI technology, allowing a broader range of developers and researchers to access state-of-the-art tools.
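To make the "average score above 80" concrete, here is a minimal sketch of how a leaderboard-style average is computed. It assumes the leaderboard reports an unweighted mean over six benchmarks (ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, GSM8K), which this article does not state. The scores below are placeholders chosen for illustration, not Smaug-72B's published results.

```python
from statistics import mean

# Benchmark names are an assumption about the leaderboard's task suite;
# the scores are placeholders for illustration only, not real results.
placeholder_scores = {
    "ARC": 75.0,
    "HellaSwag": 90.0,
    "MMLU": 78.0,
    "TruthfulQA": 76.0,
    "Winogrande": 85.0,
    "GSM8K": 79.0,
}


def leaderboard_average(scores: dict[str, float]) -> float:
    """Return the unweighted mean of the per-benchmark scores."""
    return mean(scores.values())


average = leaderboard_average(placeholder_scores)
clears_80 = average > 80  # the threshold mentioned in the article
print(f"average = {average:.2f}, above 80: {clears_80}")
```

Because the mean is unweighted, a weak result on one benchmark has to be offset by strong results elsewhere for the average to stay above 80.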

Other Technical Details

  • Fine-Tuning Techniques: The model’s success in reasoning and math is attributed to specific fine-tuning processes that target these competencies, details of which will be elaborated in an upcoming research paper by Abacus AI.
  • Future Applications: Abacus AI intends to apply the successful techniques used in Smaug-72B to enhance other models, including “miqu,” indicating a strategic approach to leveraging these advancements across various models.

Conclusion

  • The release of Smaug-72B marks a significant milestone in the field of artificial intelligence, particularly within the open-source community. Its superior performance and open accessibility not only challenge the dominance of proprietary models but also pave the way for a new era of innovation and collaboration in AI. As the first open-source model to reach an average score above 80 on the Hugging Face Open LLM leaderboard, Smaug-72B embodies the potential of open-source AI to rival and even surpass the capabilities of models developed by tech giants, heralding a future where AI technology is more accessible and equitable.

Other AI News

  • Google DeepMind proposes ‘self-discover’ framework for LLMs, improves GPT-4 performance

Google DeepMind, in collaboration with the University of Southern California, has proposed a novel ‘self-discover’ prompting framework aimed at enhancing the reasoning capabilities of large language models (LLMs), including OpenAI’s GPT-4 and Google’s PaLM 2. This new approach, detailed in a paper published on arXiv and Hugging Face, significantly improves LLM performance on complex reasoning benchmarks by as much as 32% compared to existing techniques like Chain of Thought (CoT). The self-discover framework enables LLMs to autonomously identify and utilize task-specific reasoning structures, thereby optimizing problem-solving efficiency and effectiveness.

The self-discover framework operates by guiding LLMs to generate a coherent reasoning structure intrinsic to a given task, which is then used to solve instances of the task. This method not only outperforms traditional prompting techniques across a variety of reasoning tasks but also requires significantly fewer computational resources, making it a more efficient option for enterprises. The researchers’ findings demonstrate notable performance improvements in GPT-4 and PaLM 2’s accuracy across several reasoning tasks, highlighting the potential of self-discover to advance the problem-solving capabilities of LLMs. This development represents a significant step towards achieving general intelligence in AI, with implications for enhancing Human-AI collaboration in complex reasoning and problem-solving.
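The two-stage idea described above (derive a reasoning structure once per task, then reuse it for every instance) can be sketched roughly as follows. This is an illustrative outline, not the researchers' implementation: the prompt wording, the function names, and the `CountingStubLLM` stand-in are all hypothetical, and a real setup would replace the stub with a call to an actual model.

```python
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a completion.
LLM = Callable[[str], str]


def discover_structure(llm: LLM, task_description: str) -> str:
    """Stage 1: ask the model once for a task-level reasoning structure."""
    prompt = (
        "Write a step-by-step reasoning structure for solving tasks of this kind.\n"
        f"Task: {task_description}"
    )
    return llm(prompt)


def solve_instance(llm: LLM, structure: str, instance: str) -> str:
    """Stage 2: reuse the discovered structure to solve a single instance."""
    prompt = (
        f"Follow this reasoning structure:\n{structure}\n\n"
        f"Problem: {instance}\nAnswer:"
    )
    return llm(prompt)


def self_discover(llm: LLM, task_description: str, instances: List[str]) -> List[str]:
    structure = discover_structure(llm, task_description)  # one call per task
    return [solve_instance(llm, structure, x) for x in instances]  # one call per instance


class CountingStubLLM:
    """Stand-in for a real model: records prompts and returns canned replies."""

    def __init__(self) -> None:
        self.prompts: List[str] = []

    def __call__(self, prompt: str) -> str:
        self.prompts.append(prompt)
        return f"stub-reply-{len(self.prompts)}"


stub = CountingStubLLM()
answers = self_discover(stub, "multi-step arithmetic word problems", ["q1", "q2", "q3"])
print(len(stub.prompts), "model calls for", len(answers), "instances")
```

In this sketch the structure is generated once and shared across all instances, so the extra cost is a fixed overhead per task rather than per question, which is one plausible reading of the efficiency claim.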

  • Microsoft brings AI image editing to Copilot, unveils Deucalion

Microsoft has announced significant updates to its Copilot AI search and chatbot experience, introducing new AI image creation and editing functionalities and unveiling a new AI model named Deucalion. This announcement coincides with Microsoft’s release of a new video advertisement set to air during the upcoming NFL Super Bowl, highlighting Copilot’s capabilities beyond traditional search functions, including content and software creation. The redesign of Copilot’s web landing page aims to provide a more visually engaging user experience with a visual carousel showcasing AI-generated images and sample prompts. These updates are part of Microsoft’s broader strategy to promote Copilot as a versatile tool for creative projects and everyday tasks, emphasizing its potential in the entertainment industry despite ongoing debates about AI’s impact on creative professions.

The introduction of AI image generation and editing directly into Copilot, powered by Microsoft’s Designer AI art generator and leveraging DALL-E 3 technology, marks a significant expansion of Copilot’s functionalities. This move reflects Microsoft’s commitment to making AI image generation more accessible to users, despite recent controversies surrounding AI-generated content. Additionally, the introduction of the Deucalion model, named after a figure in Greek mythology, aims to enhance Copilot’s performance by offering richer and faster responses in its “Balanced” mode. This development underscores Microsoft’s ongoing efforts to refine and improve Copilot’s AI capabilities, positioning it as a comprehensive tool for a wide range of applications, from entertainment to software development.

  • Cimba.AI emerges from stealth with $1.25M pre-seed to help enterprises build AI agents

Cimba.AI, a generative AI startup focused on enabling enterprises to build custom AI agents, has emerged from stealth with $1.25 million in pre-seed funding. The funding round was led by Ripple Ventures, with contributions from SeaChange, PackVC, and angel investors including Chad Sanderson and Chris Riccomini. Founded by former Airbnb and AWS veterans, Cimba.AI offers a web-based platform that allows enterprises to create AI agents designed to assist employees in their workflows, pull data into useful formats like dashboards and graphs, and streamline recurring business operations using the enterprise’s unique data.

The Seattle-based startup aims to empower businesses to build their own “Jarvis” for larger operations, providing insights and taking actions based on contextual and proprietary data. Cimba.AI’s platform supports the creation of self-learning AI agents without requiring extensive technical skills, significantly reducing the workload for data scientists. The platform is similar to OpenAI’s GPT Builder and Hugging Face’s Hugging Chat Assistants builder, offering a user-friendly interface for specifying the functionalities of AI agents and their data sources. Cimba.AI’s approach to AI-driven business operations has garnered support from notable investors and design partners, highlighting its potential to automate repetitive data analysis tasks and enhance productivity across various enterprise departments.

  • Stability AI launches SVD 1.1, a diffusion model for more consistent AI videos

Stability AI has announced the launch of SVD 1.1, an upgraded version of its Stable Video Diffusion (SVD) model, designed to generate short AI videos with improved motion and consistency. This new model, a fine-tuned iteration of the original SVD and its extension SVD-XT, aims to address previous limitations by delivering more photorealistic and dynamically consistent video outputs. Available for public use, SVD 1.1 can be accessed via Hugging Face and is part of Stability’s subscription memberships, which cater to both individual and enterprise users with various tiers, including a free option and others starting at $20 per month for commercial use.

SVD 1.1 enhances the generation of four-second videos with 25 frames at a resolution of 1024×576, given a context frame of the same size. This model promises to overcome challenges such as the lack of photorealism, motionless or slow-panning videos, and inaccurately generated faces and people that were evident in its predecessors. While Stability AI claims improvements with SVD 1.1, the actual performance and consistency of AI-generated videos remain to be seen in practice. The model is also available through the Stability AI developer platform via an API, facilitating the integration of advanced video generation into various products. This release underscores Stability AI’s continued push to advance generative AI technology, competing with other platforms like Runway and Pika, which also offer innovative video generation and customization features.
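For a sense of scale, the figures quoted above (25 frames, four seconds, 1024×576) imply a fairly low frame rate. The short calculation below only restates those numbers; the variable names are illustrative.

```python
# Figures taken from the paragraph above.
frames = 25
duration_s = 4
width, height = 1024, 576

fps = frames / duration_s                    # implied frames per second
pixels_per_frame = width * height            # pixels generated per frame
pixels_per_clip = frames * pixels_per_frame  # pixels generated per clip

print(f"{fps} fps, {pixels_per_frame:,} px/frame, {pixels_per_clip:,} px/clip")
```

At roughly six frames per second, these clips are well below standard video frame rates, which helps explain why motion consistency is a focus of the update.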

  • Meta will label AI-generated content on Facebook, Instagram, and Threads

Meta has announced its initiative to label AI-generated content on its platforms, including Facebook, Instagram, and Threads, acknowledging the challenge of identifying all AI-generated content accurately. This move follows the viral spread of pornographic AI-generated deepfakes of singer Taylor Swift on Twitter, sparking widespread condemnation and raising concerns about the misuse of AI in creating deceptive content. As the 2024 US elections approach, Meta faces increased pressure to address the proliferation of AI-generated images and doctored videos that could mislead the public.

The company is collaborating with industry organizations like the Partnership on AI (PAI) to develop common standards for identifying AI-generated content, utilizing IPTC metadata and invisible watermarks in line with PAI’s best practices. Meta aims to label images as “Imagined with AI” when industry-standard indicators suggest they are AI-generated, striving to develop classifiers to automatically detect such content. Despite the challenges, Meta’s effort represents a significant step towards transparency and accountability in the digital content ecosystem, highlighting the ongoing debate about the ethical use and identification of synthetic media.

  • OpenAI joins Meta in labeling AI-generated images

OpenAI has announced an update to its ChatGPT app and the integrated AI image generator model, DALL-E 3, to include new metadata tagging for identifying imagery as AI-generated. This initiative follows Meta’s similar announcement to label AI images on its platforms. OpenAI’s update involves metadata using C2PA (Coalition for Content Provenance and Authenticity) specifications, enabling identification of AI-generated images across web and mobile platforms. The company has also provided a link to the Content Credentials website, where users can verify if an image is AI-generated. This move is part of OpenAI’s efforts to combat disinformation, especially ahead of the 2024 global elections, by embedding an electronic “signature” in the AI image file’s code. However, OpenAI acknowledges that metadata like C2PA is not foolproof, as it can be removed intentionally or accidentally, and most social media platforms today remove metadata from uploaded images.

The C2PA initiative, founded in February 2021, aims to develop technical standards for certifying the source and history of media content to address disinformation and content fraud. OpenAI’s adoption of C2PA standards for its AI-generated images is a step towards ensuring content provenance and authenticity. Despite the potential for metadata removal, this approach represents a significant effort to provide transparency and accountability in AI-generated content. Meta’s platform-wide AI labeling scheme, also relying on C2PA and the IPTC Photo Metadata Standard, indicates a growing industry commitment to addressing the challenges of AI-generated content and its implications for misinformation and copyright issues.

  • AMD unveils Embedded+ architecture for edge AI hardware

Advanced Micro Devices (AMD) has introduced its Embedded+ architecture, combining AMD Ryzen Embedded processors with Versal adaptive systems-on-chip (SoCs) on a single integrated board. The architecture is designed to help hardware companies, particularly original design manufacturers (ODMs), build scalable, power-efficient AI applications at the edge. Embedded+ represents a strategic move by AMD, following its acquisition of Xilinx for $50 billion in 2022, to integrate AI engines that accelerate edge AI processing in computing products.

The Embedded+ platform is particularly aimed at industrial, medical, smart city infrastructure, and automotive embedded systems, offering a streamlined path for data acquisition, processing in real time, and visualization. AMD’s approach reduces ODM qualification and build times, accelerating innovation and time-to-market for edge AI applications. The architecture combines AMD’s x86 compute capabilities with integrated graphics and programmable hardware, making it suitable for AI inferencing, sensor fusion, and industrial networking. Sapphire Technology has already launched the first ODM solution based on this architecture, showcasing the potential of Embedded+ to revolutionize embedded processing and AI capabilities at the edge.

  • Google Bard rebrands as ‘Gemini’ with new Android app and Advanced model

Google Bard, Google’s AI chatbot competitor to OpenAI’s ChatGPT, has been rebranded as Gemini. Alongside the name change, Google introduced an “Advanced” mode and launched a dedicated Android app, expanding its accessibility and functionality. Alphabet CEO Sundar Pichai announced the rebranding, indicating that the bard.google.com URL would now redirect to gemini.google.com, with Bard being phased out in favor of the new Gemini branding. This rebranding aligns with Google’s strategy to integrate Gemini models into widely used products and services, including Workspace and Google Cloud, enhancing AI capabilities for both individual and enterprise users.

Gemini’s Advanced mode, previously referred to as “Ultra,” represents Google’s effort to offer a more powerful version of its AI chatbot, challenging the performance of OpenAI’s GPT-4. To access the Advanced mode, users must subscribe to the Google One AI Premium plan, priced similarly to OpenAI’s ChatGPT Plus subscription. Google claims that Gemini Advanced outperforms human experts on the massive multitask language understanding (MMLU) benchmark, which tests knowledge and problem-solving abilities across a wide range of subjects. The introduction of Gemini, especially its Advanced mode, signifies Google’s ambition to lead in AI technology by providing a versatile and powerful tool for a broad spectrum of tasks, from coding and logical reasoning to creative collaborations.

  • Perplexity partners with Vercel, opening AI search to developer apps

Perplexity AI, a California-based startup aiming to rival Google and Microsoft in AI search and discovery, has announced a partnership with Vercel, a platform for web development. This collaboration enables developers on Vercel to integrate Perplexity’s large language models (LLMs) into their applications, leveraging them as a knowledge support system. This move is part of Perplexity’s broader strategy to expand its presence in the AI domain and establish itself as a leader in knowledge discovery. The partnership with Vercel, known for helping developers build, deploy, and host web applications, allows for the integration of Perplexity’s API for online LLMs, offering developers access to the latest information without any knowledge cut-off, making it ideal for time-sensitive queries.

Vercel’s AI SDK, which already supported models from companies like OpenAI and Mistral, now includes Perplexity, giving teams more options for AI integration. This collaboration is expected to enhance the development of AI-powered applications, providing real-time, precise answers and enabling a wide range of AI use cases. Perplexity’s online LLMs, which use knowledge from the internet without any cut-off date, are designed to prioritize high-quality, non-SEOed sites in their responses. This partnership marks a significant step for Perplexity in pushing its knowledge discovery AI across various products and platforms, following its recent collaborations with AI hardware companies and browsers.

  • Apple releases ‘MGIE’, a revolutionary AI model for instruction-based image editing

Apple has unveiled a groundbreaking open-source AI model named “MGIE” (MLLM-Guided Image Editing), designed to revolutionize instruction-based image editing. Developed in collaboration with researchers from the University of California, Santa Barbara, MGIE leverages multimodal large language models (MLLMs) to interpret user commands for pixel-level image manipulations. This model can perform a wide range of editing tasks, from Photoshop-style modifications and global photo optimization to local editing, based on natural language instructions. MGIE’s introduction marks a significant advancement in the field of AI-driven image editing, demonstrating the potential of MLLMs in enhancing creative tasks.

MGIE operates by using MLLMs to generate expressive instructions from user input, which guide the editing process, and to create a visual imagination, a latent representation of the desired edit. This innovative approach allows MGIE to handle complex editing scenarios, offering features like expressive instruction-based editing, Photoshop-style modifications, global photo optimization, and local editing. Available as an open-source project on GitHub, MGIE is accessible for various editing tasks, with a demo notebook and a web demo hosted on Hugging Face Spaces. MGIE represents a major step forward in instruction-based image editing, showcasing Apple’s growing expertise in AI research and development and opening new possibilities for cross-modal interaction and creativity.

  • Bumble’s new AI tool identifies and blocks scam accounts, fake profiles

Bumble has introduced a new AI-powered feature named Deception Detector, aimed at combating spam, scams, and fake profiles on its platform. This innovative tool has shown remarkable efficiency in preliminary tests, automatically blocking 95% of accounts flagged as spam or scams. Within just two months of testing, reports of spam, scams, and fake profiles from users dropped by 45%. The Deception Detector works in tandem with Bumble’s human moderation team, enhancing the app’s security measures and user trust. This development is particularly significant given Bumble’s internal research, which highlighted fake profiles and scam risks as major concerns among users, especially women, who expressed anxiety over the authenticity of their online matches.

The introduction of the Deception Detector aligns with Bumble’s mission to foster equitable relationships and empower women to initiate contact, underscoring the company’s commitment to ensuring genuine connections on its platform. This move is timely, considering the Federal Trade Commission’s report last year that romance scams cost victims $1.3 billion in 2022. Bumble’s proactive approach to leveraging AI for safety, including the earlier introduction of the Private Detector feature for identifying and blurring nude images, and AI-powered icebreaker suggestions in Bumble For Friends, demonstrates the company’s ongoing efforts to make its app safer and more user-friendly. Both the Deception Detector and the Private Detector are used across Bumble and Bumble For Friends, showcasing Bumble’s comprehensive strategy to enhance user experience and security through AI.

  • Jua raises $16M to build a foundational AI model for the natural world, starting with the weather

Jua, a Swiss startup, has secured $16 million in seed funding to develop a groundbreaking “physics” AI model aimed at understanding the natural world, beginning with weather and climate patterns. The round was co-led by 468 Capital and the Green Generation Fund, with additional support from Promus Ventures, Kadmos Capital, and others. Jua seeks to provide precise modeling and forecasting for various sectors including energy, agriculture, insurance, transportation, and government. Its initiative comes at a critical time when climate change and geopolitical volatility demand more accurate forecasting tools. The startup’s ambition is to create a foundational model that not only predicts weather but also addresses broader physical phenomena, laying the groundwork for advancements in artificial general intelligence.

Jua’s approach differentiates itself by integrating vast amounts of data, claiming to ingest information at a scale 20 times larger than similar projects like Google DeepMind’s GraphCast. The startup’s model aims to incorporate “noisy data” such as recent satellite imagery and topography, alongside traditional weather station data, into a unified system for enhanced predictive accuracy. This method promises to use significantly less compute power than legacy systems, potentially lowering operational costs. Jua’s efforts to build a foundational AI model for the physical world reflect a growing trend in AI development, where foundational models become crucial platforms for innovation. With its focus on efficiency and a comprehensive understanding of the natural world, Jua is poised to offer valuable insights across a wide range of industries affected by natural phenomena.

  • Confirmed: Entrust is buying AI-based ID verification startup Onfido, sources say for more than $400M

Onfido, a pioneering identity verification startup utilizing computer vision, machine learning, and other AI technologies, is being acquired by Entrust, a company known for its certification and verification services. The acquisition, valued at “well above” $400 million, is still pending regulatory approvals. Entrust plans to integrate Onfido’s AI tools into its extensive technology stack, enhancing its position in identity lifecycle management. This move comes as Entrust seeks to bolster its offerings with AI-based tools, notably Onfido’s Atlas AI, to address the increasing concerns around security and identity verification in various sectors, including government and financial services.

Onfido’s journey reflects significant tech trends, from the AI boom of the 2010s to the heightened demand for digital identity verification during the COVID-19 pandemic. Despite facing challenges in the post-pandemic economy, Onfido’s acquisition by Entrust signals a strategic move to enhance security measures amid growing data breaches and regulatory demands for data protection. The acquisition not only aims to provide more advanced and secure digital identity verification solutions worldwide but also highlights the ongoing consolidation in the tech industry, where comprehensive platforms are integrating specialized solutions to offer end-to-end services.

  • UK AI startup Greyparrot bags strategic tie-up with recycling giant Bollegraaf

Dutch recycling giant Bollegraaf Group has made a strategic investment in UK-based AI startup Greyparrot, which specializes in using computer vision for waste analytics. This partnership marks a significant move in the recycling industry, aiming to enhance the efficiency and intelligence of waste management processes. Greyparrot, founded in 2019 and a TechCrunch Disrupt battlefield alum, has developed AI technology that provides valuable data on discarded materials, aiding in the improvement of recycling quality and the recovery of recyclable materials from mixed or contaminated waste streams. This collaboration is set to propel the shift towards more automated and data-driven recycling facilities, with Greyparrot’s technology expected to play a central role in future waste recovery operations.

The strategic partnership includes the transfer of Bollegraaf’s AI vision business to Greyparrot, incorporating a team of six and marking Greyparrot’s expansion into mainland Europe with a new office in the Netherlands. Bollegraaf’s investment in Greyparrot, valued at $12.8 million, secures a non-controlling, non-majority stake in the startup. This investment will accelerate the integration of Greyparrot’s AI technology into Bollegraaf’s recycling systems, aiming to build smarter Material Recovery Facilities (MRFs) that are fully adaptive and automated. The collaboration between Greyparrot and Bollegraaf represents a significant step towards digitizing the waste sector, promising increased efficiency, recovery rates, and higher quality recycled materials.

  • Sam Altman Pursues Multi-Trillion Dollar Fundraising to Revolutionize Global Chip Industry and AI Expansion

Sam Altman, CEO of OpenAI, is embarking on a monumental endeavor to transform the global semiconductor industry by raising trillions of dollars. In collaboration with investors, including the government of the United Arab Emirates, Altman aims to significantly enhance the world’s chip manufacturing capacity and expand its capabilities to support AI development, among other objectives. This ambitious tech initiative, potentially costing between $5 trillion and $7 trillion, seeks to address the limitations faced by OpenAI, particularly the shortage of advanced AI chips necessary for training large language models like those behind ChatGPT. Altman’s vision involves not only increasing the availability of these critical graphics-processing units (GPUs) but also addressing the immense energy consumption of AI facilities.

The scale of investment discussed would vastly exceed the current global semiconductor industry’s size, which had sales of $527 billion last year, with expectations to reach $1 trillion annually by 2030. The proposed funding amounts are unprecedented in corporate fundraising, surpassing the national debts of major economies and the market capitalization of tech giants like Microsoft and Apple. Altman’s plan includes forming partnerships between OpenAI, investors, chip manufacturers, and power providers to establish chip foundries operated by existing chipmakers, with OpenAI committing to being a major customer. This initiative, still in early discussions, faces numerous challenges, including securing the approval of the U.S. government, given the strategic importance of the semiconductor industry.
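To put "vastly exceed" in numbers, the quick calculation below compares the reported $5 trillion to $7 trillion range with the $527 billion in annual industry sales and the projected $1 trillion by 2030. It uses only the figures quoted in this section; the variable names are illustrative.

```python
# All figures in US dollars, as quoted in the text above.
plan_low = 5_000_000_000_000
plan_high = 7_000_000_000_000
industry_sales_last_year = 527_000_000_000
industry_sales_2030_projection = 1_000_000_000_000

# How many years of current industry sales the plan would represent.
low_vs_current = plan_low / industry_sales_last_year
high_vs_current = plan_high / industry_sales_last_year

# The same comparison against the projected 2030 annual sales.
low_vs_2030 = plan_low / industry_sales_2030_projection
high_vs_2030 = plan_high / industry_sales_2030_projection

print(f"{low_vs_current:.1f}x to {high_vs_current:.1f}x last year's sales")
print(f"{low_vs_2030:.0f}x to {high_vs_2030:.0f}x projected 2030 sales")
```

On these figures, the proposal amounts to roughly 9.5 to 13.3 times the industry's most recent annual sales, and five to seven times its projected 2030 sales.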

About The Author

Bogdan Iancu

Bogdan Iancu is a seasoned entrepreneur and strategic leader with over 25 years of experience in diverse industrial and commercial fields. His passion for AI, Machine Learning, and Generative AI is underpinned by a deep understanding of advanced calculus, enabling him to leverage these technologies to drive innovation and growth. As a Non-Executive Director, Bogdan brings a wealth of experience and a unique perspective to the boardroom, contributing to robust strategic decisions. With a proven track record of assisting clients worldwide, Bogdan is committed to harnessing the power of AI to transform businesses and create sustainable growth in the digital age.