Jacinth Paul

GenAI Overview | What are LLMs?

Updated: Feb 9

What are LLMs?

Large Language Models (LLMs) are advanced AI models designed to understand, interpret, generate, and respond to human language on a large scale.


LLMs are a prominent example of Generative AI (GenAI) in the field of natural language processing (NLP). They utilize the principles of machine learning and deep learning to generate human-like text, making them a significant subset of AI technologies.


Autoregressive Models (e.g., GPT-3):

  • Application: Content generation, language translation, creative writing.

  • Example: OpenAI's GPT-3.


Autoencoding Models (e.g., BERT):

  • Application: Text summarization, question answering, information extraction.

  • Example: Google's BERT.


Seq2Seq Models (e.g., T5):

  • Application: Machine translation, summarization, question answering.

  • Example: Google’s T5 (Text-to-Text Transfer Transformer).
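To make the difference between the first two families concrete, here is a minimal sketch using word counts over a tiny, made-up corpus. It is not how GPT-3 or BERT are implemented; the corpus, `train_counts`, `predict_next`, and `fill_mask` are hypothetical names chosen for illustration. The autoregressive view predicts a word from its left context only, while the autoencoding view fills in a hidden word using both neighbours. Seq2Seq models are not covered by this sketch.

```python
from collections import Counter, defaultdict

# Tiny hypothetical corpus; real models train on vastly more text.
CORPUS = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat lay on the rug",
]

def train_counts(corpus):
    """Collect two kinds of statistics from the corpus."""
    next_word = defaultdict(Counter)  # left context only (autoregressive view)
    between = defaultdict(Counter)    # left and right context (autoencoding view)
    for sentence in corpus:
        words = sentence.split()
        for left, right in zip(words, words[1:]):
            next_word[left][right] += 1
        for left, mid, right in zip(words, words[1:], words[2:]):
            between[(left, right)][mid] += 1
    return next_word, between

def predict_next(next_word, left):
    """Autoregressive style: most frequent word seen after `left`."""
    counts = next_word.get(left)
    return counts.most_common(1)[0][0] if counts else None

def fill_mask(between, left, right):
    """Autoencoding style: most frequent word seen between `left` and `right`."""
    counts = between.get((left, right))
    return counts.most_common(1)[0][0] if counts else None

next_word, between = train_counts(CORPUS)
print(predict_next(next_word, "sat"))      # on
print(fill_mask(between, "on", "rug"))     # the
```

The only point of the sketch is the direction of the context: `predict_next` never looks to the right of the position it predicts, while `fill_mask` uses both sides.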

 

What are Foundation Models?

Foundation models are versatile deep learning models pre-trained on extensive, diverse datasets. They provide a base that can be fine-tuned for various specific tasks, including but not limited to language, vision, and decision-making processes. Examples include:


GPT-3 (OpenAI):

  • An autoregressive language model known for its ability to generate coherent and contextually relevant text.

  • Applications: Content creation, conversational agents, language translation, and more.

BERT (Bidirectional Encoder Representations from Transformers - Google):

  • A model designed to understand the context of a word in a sentence by examining the words around it.

  • Applications: Text classification, search ranking, sentiment analysis, question answering.

T5 (Text-To-Text Transfer Transformer - Google):

  • Frames all NLP tasks as text-to-text problems, converting every NLP problem into a text generation task.

  • Applications: Machine translation, summarization, question answering, text classification.


RoBERTa (Robustly Optimized BERT Pretraining Approach - Facebook AI):

  • An optimized version of BERT, pre-trained on a larger dataset and for a longer time.

  • Applications: Enhanced versions of BERT’s applications like natural language inference, sentiment analysis.

LaMDA (Language Model for Dialogue Applications - Google):

  • Specialized in dialogue, this model is designed to engage in free-flowing conversations on a seemingly endless number of topics.

  • Applications: Conversational AI, chatbots, virtual assistants.


LLaMA (Large Language Model Meta AI - Meta/Facebook AI):

  • A model emphasizing accessibility and efficiency, suitable for a range of hardware environments.

  • Applications: Flexible deployment in various environments, including those with limited computational resources.


PaLM (Pathways Language Model - Google):

  • A model focused on multi-tasking and understanding a vast range of languages and tasks.

  • Applications: Multilingual language understanding, complex task handling, and multi-modal applications. 


This sequence illustrates the process through which a large language model (LLM) operates.

How does the LLM process information?

The sequence below illustrates how a language model operates, from receiving an input to returning an output. The process relies on machine learning algorithms and neural networks, and requires substantial computing power.


Input Processing

  • Step 1: User inputs a query or statement.

  • Step 2: The input is preprocessed to convert it into a format understandable by the model. This typically involves tokenization, where the input text is broken down into smaller pieces, often called tokens.
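The tokenization in Step 2 can be sketched roughly as follows. This is a simplified word-level tokenizer written for illustration; production LLMs generally use subword schemes such as byte-pair encoding instead. The function names and the `<unk>` placeholder for unknown tokens are assumptions of this sketch, not part of any particular library.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens and single punctuation tokens."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def build_vocab(texts):
    """Assign an integer id to every distinct token; id 0 is reserved for unknowns."""
    vocab = {"<unk>": 0}
    for text in texts:
        for token in tokenize(text):
            vocab.setdefault(token, len(vocab))
    return vocab

def encode(text, vocab):
    """Convert text into the list of token ids the model would receive."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokenize(text)]

vocab = build_vocab(["What are LLMs?"])
print(tokenize("What are LLMs?"))          # ['what', 'are', 'llms', '?']
print(encode("What are tokens?", vocab))   # [1, 2, 0, 4]
```

The word "tokens" was not seen when the vocabulary was built, so it maps to the unknown id 0. Subword tokenizers largely avoid this by splitting rare words into smaller known pieces.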


Model Interaction

  • Step 3: The preprocessed input is fed into the language model.

  • Step 4: The model, which consists of multiple layers of neural networks, processes the input. Each layer performs specific transformations and feature extractions.
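As a rough picture of "layers performing transformations" in Step 4, the sketch below passes a vector through a small stack of dense layers with a ReLU activation. This is a generic feed-forward stack chosen for brevity, not the transformer architecture with attention that LLMs actually use; the weights and the function names are made up for the example.

```python
def linear(vector, weights, bias):
    """One dense layer: output[j] = sum_i vector[i] * weights[i][j] + bias[j]."""
    return [
        sum(x * row[j] for x, row in zip(vector, weights)) + bias[j]
        for j in range(len(bias))
    ]

def relu(vector):
    """Element-wise non-linearity: negative values become zero."""
    return [max(0.0, x) for x in vector]

def forward(vector, layers):
    """Feed the vector through each (weights, bias) layer in order."""
    for weights, bias in layers:
        vector = relu(linear(vector, weights, bias))
    return vector

# Two hypothetical layers: 2 inputs -> 2 hidden units -> 1 output.
LAYERS = [
    ([[1.0, -1.0], [0.5, 2.0]], [0.0, 0.0]),
    ([[1.0], [1.0]], [-1.0]),
]
print(forward([2.0, 1.0], LAYERS))  # [1.5]
```

Each layer takes the previous layer's output as its input, which is the sense in which the input is transformed step by step as it moves through the model.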


Context Understanding and Response Generation

  • Step 5: The model analyzes the input considering the context, patterns, and training it has received.

  • Step 6: Based on this analysis, the model generates a response. This involves repeatedly predicting the next token (a word or word piece) that best continues the text so far, based on patterns learned from its training data.
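The generation in Step 6 can be illustrated with a toy loop that repeatedly picks the most likely next token until an end marker appears. The probability table below is invented for the example, and always picking the top candidate (greedy decoding) is only one of several strategies; real models compute these probabilities with a neural network and often sample from them instead.

```python
# Hypothetical next-token probabilities, keyed by the current token.
NEXT_TOKEN_PROBS = {
    "<start>": {"large": 0.6, "small": 0.4},
    "large": {"language": 0.9, "models": 0.1},
    "small": {"models": 1.0},
    "language": {"models": 0.8, "<end>": 0.2},
    "models": {"<end>": 0.7, "generate": 0.3},
    "generate": {"text": 1.0},
    "text": {"<end>": 1.0},
}

def generate(probs, start="<start>", max_tokens=10):
    """Greedy decoding: always choose the most probable next token."""
    tokens = []
    current = start
    for _ in range(max_tokens):
        candidates = probs.get(current)
        if not candidates:
            break  # no known continuation for this token
        current = max(candidates, key=candidates.get)
        if current == "<end>":
            break  # the model signals that the response is complete
        tokens.append(current)
    return tokens

print(generate(NEXT_TOKEN_PROBS))  # ['large', 'language', 'models']
```

The `max_tokens` limit plays the role of the length cap that text generation systems apply so that a response always terminates.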


Output Processing

  • Step 7: The generated response is post-processed, if necessary, so that it is in a human-readable format.

  • Step 8: The response is delivered to the user.
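One simple form the post-processing in Step 7 might take is turning a list of generated tokens back into a readable sentence. The sketch below assumes word-level tokens and hypothetical special markers such as `<start>` and `<end>`; actual systems rely on the decoder that matches their tokenizer.

```python
SPECIAL_TOKENS = {"<start>", "<end>", "<pad>"}
PUNCTUATION = {".", ",", "!", "?", ";", ":"}

def detokenize(tokens):
    """Join tokens into text: drop special markers, attach punctuation, capitalize."""
    text = ""
    for token in tokens:
        if token in SPECIAL_TOKENS:
            continue  # markers are for the model, not for the reader
        if token in PUNCTUATION or not text:
            text += token        # no space before punctuation or at the start
        else:
            text += " " + token
    return text[:1].upper() + text[1:]

print(detokenize(["<start>", "hello", ",", "world", "!", "<end>"]))  # Hello, world!
```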


Feedback Loop (Optional)

  • Step 9: In some implementations, feedback from the user on the quality of the response is used for continuous learning and model improvement.



4 GenAI Models - Source (GenAI Framework report, UK)

There is a distinction between a service and a model in the context of cloud computing. A service, as provided by cloud service providers, is a comprehensive offering that typically combines models, data, and additional components. A model, on the other hand, is the core element within a service and is often built on a foundation model such as a Large Language Model (LLM).


Services are generally tuned for production environments, with a focus on user-friendliness, often through a graphical user interface. It's important to note that these services are often associated with a cost and may require a subscription or another form of payment. One example of a service is Azure Machine Learning, a cloud service designed for data scientists and ML engineers to manage the whole ML lifecycle (train, test, deploy, and handle MLOps) on a single platform.
