
An Introduction to Generative AI
This article will clarify the concept of generative AI: how it works, the types of models, and their applications.
Generative AI, a subset of artificial intelligence, is a technological marvel capable of creating diverse content, including text, images, audio, and synthetic (new) data.
Before we delve into generative AI, let’s set the stage with a brief overview of artificial intelligence and its relationship with machine learning. Imagine AI as a discipline, akin to physics, a branch of computer science dedicated to creating intelligent agents capable of reasoning, learning, and acting autonomously. Within this discipline, we find machine learning, a subfield that trains a model from input data, enabling computers to learn without explicit programming.
Machine learning models are often categorized as supervised or unsupervised. The key distinction lies in the data labels. Labelled data carries a tag, such as a name, type, or number, while unlabelled data comes without a tag.
Consider this example: you’re a meteorologist with historical data on weather patterns and storm occurrences. A supervised model could use this data to predict future weather events, while an unsupervised model might group or cluster regions based on their climate characteristics to identify patterns.
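The meteorologist example can be sketched in a few lines of plain Python. The weather figures below are made up for illustration, and the "models" are deliberately tiny stand-ins: a least-squares line fit plays the supervised role, and a two-group split plays the unsupervised one.

```python
# Supervised learning: labelled data pairs each input with a known answer.
humidity = [30.0, 45.0, 60.0, 75.0, 90.0]   # inputs (historical humidity %)
rainfall = [2.0, 10.0, 22.0, 35.0, 48.0]    # labels (mm of rain observed)

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

slope, intercept = fit_line(humidity, rainfall)

def predict_rainfall(h):
    """A supervised model predicts a label for a new, unseen input."""
    return slope * h + intercept

# Unsupervised learning: unlabelled data is grouped by similarity alone.
temps = [5.0, 7.0, 6.0, 28.0, 30.0, 29.0]   # no labels attached

def cluster_two(values):
    """Toy clustering: assign each point to the nearer of the two extremes."""
    lo, hi = min(values), max(values)
    return ["cold" if abs(v - lo) < abs(v - hi) else "hot" for v in values]

print(predict_rainfall(80.0))   # supervised: a predicted value for new input
print(cluster_two(temps))      # unsupervised: groups found without any labels
```

The supervised model needed the rainfall labels to learn its mapping; the clustering function never saw a label at all, which is exactly the distinction drawn above.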
Deep learning, another subset of machine learning, uses artificial neural networks to process more complex patterns. These networks, inspired by the human brain, consist of interconnected nodes or neurons that learn tasks by processing data and making predictions. Deep learning models can handle both labelled and unlabelled data, a method known as semi-supervised learning.
Generative AI, a subset of deep learning, uses artificial neural networks and can process both labelled and unlabelled data using supervised, unsupervised, and semi-supervised methods. Machine learning models can be divided into two types: generative and discriminative.
Discriminative models classify or predict labels for data points, while generative models generate new data instances based on a learned probability distribution of existing data.
In summary, generative models create new data instances, while discriminative models differentiate between different kinds of data instances. Generative AI models learn patterns in content to generate new content. If the output is a number or a class, it’s not generative AI. If the output is natural language, an image, or audio, it is generative AI.
To visualize this mathematically, consider the equation y = f(x), where y is the model output, f represents the function used in the calculation, and x represents the input or inputs used for the formula.
If y is a number, like a predicted temperature, it’s not generative AI. If y is a sentence, such as an answer to “describe the weather,” it is generative, since the question elicits a text response based on the large body of data the model was trained on.
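The y = f(x) distinction can be made concrete with two toy functions. Both are hypothetical stand-ins rather than trained models; the point is only the type of y each one returns.

```python
def discriminative_f(x):
    """Discriminative: y is a number or class, e.g. a predicted temperature."""
    humidity, pressure = x
    return 0.3 * humidity + 0.01 * pressure  # illustrative coefficients

def generative_f(x):
    """Generative: y is natural language composed from learned patterns."""
    learned = {"describe the weather": "It is warm and sunny with a light breeze."}
    return learned.get(x, "I don't know.")

print(discriminative_f((50, 1000)))          # a number  -> not generative AI
print(generative_f("describe the weather"))  # a sentence -> generative AI
```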
In conclusion, the traditional learning process, both supervised and unsupervised, takes training code and labelled data to build a model. Generative AI, however, takes a step further by creating new, unique content.
What is Generative AI? What types of Gen AI are there?
Depending on the use case or problem, the model can give you a prediction. It can classify something or cluster something. We use this example to show how much more robust the gen AI process is. The gen AI process can take training code, labelled data, and unlabelled data of all data types and build a foundation model. The foundation model can then generate new content: for example, text, code, images, audio, and video.
We’ve come a long way from traditional programming to neural networks to generative models. In traditional programming, we had to hard-code the rules for recognizing a cat: type, animal; legs, four; ears, two; fur, yes; likes yarn and catnip. In the wave of neural networks, we could give the network pictures of cats and dogs, ask whether an image showed a cat, and it would predict a cat.
In the generative wave, we as users can generate our own content, whether it be text, images, audio, or video. For example, models like GPT-3 or Codex ingest very large amounts of data from multiple sources across the internet and build foundation language models we can use simply by asking a question, whether by typing it into a prompt or speaking it aloud. So, when you ask what a cat is, the model can give you everything it has learned about cats.
Now we come to our formal definition.
What is generative AI? Gen AI is a type of artificial intelligence that creates new content based on what it has learned from existing content. The process of learning from existing content is called training, and it results in the creation of a statistical model. When given a prompt, the AI uses this model to predict what an expected response might be, and this generates new content. Essentially, it learns the underlying structure of the data and can then generate new samples that are similar to the data it was trained on.
As previously mentioned, a generative language model can take what it has learned from the examples it’s been shown and create something entirely new based on that information. Large language models are one type of generative AI since they generate novel combinations of text in the form of natural-sounding language. A generative image model takes an image as input and can output text, another image, or video.
For example, under the output text, you can get visual question answering; under the output image, an image completion is generated; and under the output video, an animation is generated. A generative language model takes text as input and can output more text, an image, audio, or decisions. For example, under the output text, question answering is generated, and under the output image, an image is generated.
We’ve stated that generative language models learn about patterns and language through training data, and then, given some text, they predict what comes next. Thus generative language models are pattern-matching systems. They learn about patterns based on the data you provide.
Here is an example. Based on things it has learned from its training data, the model offers predictions for how to complete the sentence “I’m making a sandwich with peanut butter and…”, with “jelly” as a likely completion.
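The pattern-matching idea can be sketched with a toy bigram model: count which word follows each word in a tiny training corpus, then predict the most frequent continuation. Real language models use vastly richer statistics over far more data, but the principle is the same. The three-sentence corpus below is invented for illustration.

```python
from collections import Counter, defaultdict

corpus = (
    "i am making a sandwich with peanut butter and jelly . "
    "she likes peanut butter and jelly on toast . "
    "he ate peanut butter and banana ."
).split()

# Learn patterns: for each word, count the words that follow it in training.
follows = defaultdict(Counter)
for word, nxt in zip(corpus, corpus[1:]):
    follows[word][nxt] += 1

def predict_next(word):
    """Return the continuation seen most often after this word in training."""
    return follows[word].most_common(1)[0][0]

print(predict_next("and"))  # "and jelly" appeared twice, "and banana" once
```

Because “and jelly” occurs more often than “and banana” in the training text, the model completes the sandwich sentence with “jelly”, purely from learned frequencies.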
Here is the same example using GPT-3, which is trained on a massive amount of text data and is able to communicate and generate humanlike text in response to a wide range of prompts and questions. Here is another example: given the prompt “The meaning of life is…”, GPT-3 gives you a contextual answer and then shows the highest-probability response.
The power of generative AI comes from the use of transformers. Transformers, introduced in 2017, produced a revolution in natural language processing. At a high level, a transformer model consists of an encoder and a decoder. The encoder encodes the input sequence and passes it to the decoder, which learns how to decode the representation for a relevant task.
In transformers, hallucinations are outputs generated by the model that are nonsensical, grammatically incorrect, or factually wrong. Hallucinations can be caused by a number of factors: the model is not trained on enough data, the model is trained on noisy or dirty data, the model is not given enough context, or the model is not given enough constraints. Hallucinations are a problem for transformers because they can make the output text difficult to understand and make the model more likely to generate incorrect or misleading information.
A prompt is a short piece of text that is given to the large language model as input. And it can be used to control the output of the model in a variety of ways. Prompt design is the process of creating a prompt that will generate the desired output from a large language model. As previously mentioned, gen AI depends a lot on the training data that you have fed into it. And it analyses the patterns and structures of the input data and thus learns. But with access to a browser-based prompt, you, the user, can generate your own content.
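In practice, prompt design often amounts to wrapping the user’s input in instructions that steer the model toward the desired output. A minimal sketch, where the template wording is purely illustrative:

```python
def build_prompt(task, user_input):
    """Compose a prompt that constrains what the model should produce."""
    templates = {
        "summarize": "Summarize the following text in one sentence:\n{u}\nSummary:",
        "sentiment": ("Classify the sentiment of this review as "
                      "Positive or Negative:\n{u}\nSentiment:"),
    }
    return templates[task].format(u=user_input)

prompt = build_prompt("sentiment", "The battery dies far too quickly.")
print(prompt)
```

The same user text produces very different model behaviour depending on which template wraps it; that is the essence of controlling the output through the prompt.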
We’ve shown illustrations of the types of input based on data. Here are the associated model types.
Text-to-text. Text-to-text models take a natural language input and produce text output. These models are trained to learn the mapping between a pair of texts, for example, translation from one language to another.
Text-to-image. Text-to-image models are trained on a large set of images, each captioned with a short text description. Diffusion is one method used to achieve this.
Text-to-video and text-to-3D. Text-to-video models aim to generate a video representation from text input. The input text can be anything from a single sentence to a full script, and the output is a video that corresponds to the input text. Similarly, text-to-3D models generate three-dimensional objects that correspond to a user’s text description. For example, this can be used in games or other 3D worlds.
Text-to-task. Text-to-task models are trained to perform a defined task or action based on text input. This task can be a wide range of actions such as answering a question, performing a search, making a prediction, or taking some sort of action. For example, a text-to-task model could be trained to navigate a web UI or make changes to a doc through the GUI.
Foundation Models and Examples
A foundation model is a large AI model pre-trained on a vast quantity of data designed to be adapted or fine-tuned to a wide range of downstream tasks, such as sentiment analysis, image captioning, and object recognition. Foundation models have the potential to revolutionize many industries, including healthcare, finance, and customer service. They can be used to detect fraud and provide personalized customer support.
OpenAI offers a variety of foundation models. The language foundation models include GPT-3 (and now GPT-4) for chat and text. The vision foundation models include DALL-E, which has been shown to be effective at generating high-quality images from text descriptions.
Let’s say you have a use case where you need to gather sentiments about how your customers feel about your product or service. You can use a sentiment analysis classification task model for just that purpose. And what if you needed to perform occupancy analytics? There is a task model for that use case too.
1. Sentiment Analysis with GPT-3:
To analyze sentiments about your product or service using GPT-3, you can use the OpenAI API to classify text based on sentiment.
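A minimal sketch of such a script, using the legacy `openai` Python client. The engine name, prompt wording, and label set here are illustrative assumptions, and the OpenAI interface has changed substantially since the GPT-3 era, so check the current API reference before using this.

```python
def build_sentiment_prompt(review):
    """Build a classification prompt for a single customer review."""
    return (
        "Classify the sentiment of the following customer review "
        "as Positive, Negative, or Neutral.\n\n"
        f"Review: {review}\nSentiment:"
    )

def classify_sentiment(review, api_key="your-api-key"):
    """Ask GPT-3 to return a one-word sentiment label for the review."""
    import openai  # legacy client (pip install openai)
    openai.api_key = api_key
    response = openai.Completion.create(
        engine="text-davinci-003",  # illustrative engine name
        prompt=build_sentiment_prompt(review),
        max_tokens=1,
        temperature=0,  # deterministic: always pick the most likely label
    )
    return response.choices[0].text.strip()
```

Calling `classify_sentiment("The support team resolved my issue in minutes!")` with a valid key would return a label such as “Positive”.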
The above script sends a prompt to the classification model and gets a sentiment label in return.
2. Occupancy Analytics:
For occupancy analytics, you would typically use a dataset and run some sort of machine learning model on it. OpenAI doesn’t have a specific model for this, but you can use GPT-3 to generate code for the task. You would need to provide a prompt describing the task, and GPT-3 could generate Python code using libraries like Pandas or NumPy.
3. Code Generation:
You can use GPT-3 to generate code for converting a pandas DataFrame to a JSON file. Here’s how:
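A sketch of that request, again with the legacy `openai` client. The Codex engine name `code-davinci-002` is an assumption from that era (Codex has since been deprecated in favour of newer GPT models), and the prompt wording is illustrative.

```python
# Prompt describing the code we want Codex to write for us.
CODE_PROMPT = (
    "# Python\n"
    "# Write code that converts a pandas DataFrame named df "
    "to a JSON file called output.json.\n"
)

def generate_conversion_code(api_key="your-api-key"):
    """Ask the Codex engine to generate the DataFrame-to-JSON snippet."""
    import openai  # legacy client (pip install openai)
    openai.api_key = api_key
    response = openai.Completion.create(
        engine="code-davinci-002",  # illustrative Codex engine name
        prompt=CODE_PROMPT,
        max_tokens=100,
        temperature=0,
    )
    return response.choices[0].text

# The generated code will typically amount to something like:
# df.to_json("output.json")
```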
This code sends a prompt to the Codex engine, which should return Python code for converting a DataFrame to a JSON file.
4. Using the Generated Code:
After you get the code from GPT-3, you can copy it and paste it into your Jupyter notebook. From there, you can run the code and check if it works as expected.
Remember to replace 'your-api-key' with your actual OpenAI API key.
Please note that this is a simplified explanation, and actual usage may require handling more complex scenarios and errors.
To summarize, GPT-3’s code generation can help you debug your source code, explain your code to you line by line, craft SQL queries for your database, translate code from one language to another, and generate documentation and tutorials for source code.
OpenAI’s API lets you quickly explore and customize gen AI models that you can leverage in your applications. This API helps developers create and deploy Gen AI models by providing a variety of tools and resources that make it easy to get started.
Developers can create gen AI apps without having to write much code by using OpenAI’s robust API and integrating it with their own tools and interfaces. They can create their own digital assistants, custom search engines, knowledge bases, training applications, and much more.
OpenAI’s API lets you test and experiment with large language models and gen AI tools. To make prototyping quicker and more accessible, developers can integrate this API with their own suite of tools and access it through a graphical user interface. The suite includes a number of different tools, such as a model training tool, a model deployment tool, and a model monitoring tool. The model training tool helps developers train ML models on their data using different algorithms. The model deployment tool helps developers deploy ML models to production with a number of different deployment options.
The model monitoring tool helps developers monitor the performance of their ML models in production using a dashboard and a number of different metrics.
About The Author

Bogdan Iancu
Bogdan Iancu is a seasoned entrepreneur and strategic leader with over 25 years of experience in diverse industrial and commercial fields. His passion for AI, Machine Learning, and Generative AI is underpinned by a deep understanding of advanced calculus, enabling him to leverage these technologies to drive innovation and growth. As a Non-Executive Director, Bogdan brings a wealth of experience and a unique perspective to the boardroom, contributing to robust strategic decisions. With a proven track record of assisting clients worldwide, Bogdan is committed to harnessing the power of AI to transform businesses and create sustainable growth in the digital age.