spring-ai: AI Concepts (Coggle Diagram)
AI Concepts
Models
AI models are algorithms that mimic human thinking by processing and generating information. They learn from large datasets to create various outputs like predictions, text, and images that can be used across different industries.
There are diverse AI models for different purposes. While ChatGPT became famous for text generation, other models offer different capabilities. Before ChatGPT, people were particularly interested in models like Midjourney and Stable Diffusion that could create images from text descriptions.
Spring AI supports three main types of model interactions:
- language processing
- image generation
- audio processing
It also supports text embeddings, which convert text into numerical representations (vectors) that AI models can understand and use for advanced applications.
GPT (Generative Pre-trained Transformer) models are special because they come pre-trained. This pre-training means developers can use these AI tools without needing deep expertise in machine learning or having to train the models themselves.
Prompt
Prompts are not just simple text inputs to AI models - they are sophisticated instructions that guide AI outputs. While they might appear as simple text in a chat box, prompts are actually more complex structures that vary depending on the AI model.
In ChatGPT's API, prompts have different roles. The "system" role sets the behavior and context for the AI, while the "user" role contains the actual user input. This creates a structured way of communicating with the model.
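The role structure above can be sketched as plain Java. This is an illustrative model, not Spring AI's actual message classes: a prompt is a list of role-tagged messages, where the "system" message sets behavior and the "user" message carries the actual input.

```java
import java.util.List;

// Illustrative sketch (not Spring AI's real API): a prompt as a list of
// role-tagged messages, mirroring the "system" / "user" roles described above.
public class RolePromptSketch {

    // Each message pairs a role with its text content.
    public record Message(String role, String content) {}

    // Build a two-message prompt: the system message sets behavior,
    // the user message carries the actual question.
    public static List<Message> buildPrompt(String systemText, String userText) {
        return List.of(
            new Message("system", systemText),
            new Message("user", userText));
    }

    public static void main(String[] args) {
        List<Message> prompt = buildPrompt(
            "You are a concise assistant.",
            "Explain embeddings in one sentence.");
        for (Message m : prompt) {
            System.out.println(m.role() + ": " + m.content());
        }
    }
}
```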
Creating effective prompts requires skill and understanding. Unlike structured query languages like SQL, communicating with ChatGPT is more like having a human conversation, requiring a different approach to formulating questions and instructions.
"Prompt Engineering" has emerged as a specialized field focused on creating effective AI prompts. Various techniques have been developed to improve prompt effectiveness, and investing time in prompt crafting can significantly improve results.
Prompt creation has become a collaborative effort with active research. Surprisingly effective prompts can be counterintuitive - like starting with "Take a deep breath and work on this step by step." This shows how much we still have to learn about effectively using AI language models, both current and future versions.
Prompt Templates
- Effective prompts require two main elements: establishing context for the request and allowing for variable substitution based on user input.
- Spring AI uses the StringTemplate library as its template engine to create and manage prompts, following traditional text-based template approaches.
- A basic template example shows how placeholders {adjective} and {content} can be filled with specific values to create a customized joke prompt.
- In Spring AI's architecture, prompt templates function similarly to Views in Spring MVC. They use a Map object to fill template placeholders, and the resulting string becomes the prompt sent to the AI model.
- Prompt formats have evolved from simple strings to more complex structures. Modern prompts often contain multiple messages, with each string having a specific role that guides the AI model's behavior.
// Reconstructed from the garbled snippet above; Spring AI package paths
// for ChatClient, Prompt, and PromptTemplate vary across versions.
import java.util.HashMap;
import java.util.Map;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

@Service
public class PromptService {

    @Autowired
    private ChatClient chatClient;

    public String generateJoke(String adjective, String content) {
        // Fill the template placeholders from a Map, as in Spring MVC Views.
        Map<String, Object> model = new HashMap<>();
        model.put("adjective", adjective);
        model.put("content", content);
        PromptTemplate template = new PromptTemplate("Tell me a {adjective} joke about {content}.");
        Prompt prompt = template.create(model);
        return chatClient.call(prompt).getResult().getOutput().getContent();
    }
}
Embeddings
Embeddings are numerical representations of text, images, or videos that capture relationships between inputs.
Embeddings work by converting text, image, and video into arrays of floating point numbers, called vectors.
These vectors are designed to capture the meaning of the text, images, and videos.
The length of the embedding array is called the vector’s dimensionality.
By calculating the numerical distance between the vector representations of two pieces of text, an application can determine the similarity between the objects used to generate the embedding vectors.
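The distance computation described above is commonly done with cosine similarity. Here is a minimal sketch; the 3-dimensional vectors are made up for illustration, while real embeddings have hundreds or thousands of dimensions.

```java
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// Values near 1 mean the two embeddings point in similar directions.
public class EmbeddingSimilarity {

    public static double cosineSimilarity(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    public static void main(String[] args) {
        // Toy 3-dimensional "embeddings" with invented values.
        double[] cat = {0.9, 0.1, 0.0};
        double[] kitten = {0.8, 0.2, 0.1};
        double[] car = {0.0, 0.2, 0.9};
        System.out.printf("cat~kitten: %.3f%n", cosineSimilarity(cat, kitten));
        System.out.printf("cat~car:    %.3f%n", cosineSimilarity(cat, car));
    }
}
```

Semantically related inputs ("cat", "kitten") score close to 1, while unrelated ones score lower.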
Java developers integrating AI into their applications don't need to deeply understand the complex mathematics behind AI - they just need to grasp the basic concepts of how AI components work.
Embeddings are vector representations used in RAG (Retrieval Augmented Generation) that convert data into points in a high-dimensional space.
Similar to how points on a 2D plane can be close or far apart, embeddings place similar concepts closer together in this space, making them useful for tasks like semantic search and recommendations
A simple way to understand this concept: each point in the semantic space is identified by a vector, a mathematical way to represent a position in this multi-dimensional space.
semantic space
A semantic space is a way to organize and represent the meanings of words or concepts in a structured manner, allowing for the analysis and comparison of semantic relationships.
Tokens
Tokens serve as the building blocks of how an AI model works.
On input, models convert words to tokens.
On output, they convert tokens back to words.
In English, one token roughly corresponds to 75% of a word.
For reference, Shakespeare’s complete works, totaling around 900,000 words, translate to approximately 1.2 million tokens.
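The arithmetic checks out: if one token covers roughly 75% of a word, the token count is the word count divided by 0.75.

```java
// Rough token estimate using the 1 token ≈ 0.75 words heuristic from above.
public class TokenEstimate {

    public static long estimateTokens(long words) {
        // tokens ≈ words / 0.75 (about 4/3 tokens per word)
        return Math.round(words / 0.75);
    }

    public static void main(String[] args) {
        // Shakespeare's complete works: ~900,000 words.
        System.out.println(estimateTokens(900_000)); // prints 1200000
    }
}
```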
Perhaps more important is that Tokens = Money.
In the context of hosted AI models, your charges are determined by the number of tokens used.
Both input and output contribute to the overall token count.
Also, models are subject to token limits, which restrict the amount of text processed in a single API call.
This threshold is often referred to as the "context window". The model does not process any text that exceeds this limit.
For instance, GPT-3.5 has a 4K token limit, while GPT-4 offers varying options, such as 8K, 16K, and 32K.
Anthropic’s Claude AI model features a 100K token limit, and Meta’s recent research yielded a 1M token limit model.
To summarize the collected works of Shakespeare with GPT4, you need to devise software engineering strategies to chop up the data and present the data within the model’s context window limits. The Spring AI project helps you with this task.
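The "chop up the data" strategy can be sketched as follows. Spring AI ships real document-splitting support; this illustrative version just splits text into word-based chunks that fit a token budget, using the rough 1 token ≈ 0.75 words heuristic.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: split a long text into chunks that fit within a
// model's context window, approximating the token budget by word count.
public class ContextWindowChunker {

    public static List<String> chunkByTokenBudget(String text, int maxTokens) {
        // Convert the token budget into an approximate word budget.
        int maxWords = (int) (maxTokens * 0.75);
        String[] words = text.split("\\s+");
        List<String> chunks = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int count = 0;
        for (String word : words) {
            if (count == maxWords) {
                chunks.add(current.toString());
                current.setLength(0);
                count = 0;
            }
            if (count > 0) current.append(' ');
            current.append(word);
            count++;
        }
        if (count > 0) chunks.add(current.toString());
        return chunks;
    }

    public static void main(String[] args) {
        String text = "one two three four five six seven eight";
        // Budget of 4 tokens ≈ 3 words per chunk.
        for (String chunk : chunkByTokenBudget(text, 4)) {
            System.out.println(chunk);
        }
    }
}
```

Each chunk could then be summarized separately and the summaries combined, a common map-reduce style workaround for context-window limits.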
Structured Output
The output of AI models traditionally arrives as a java.lang.String, even if you ask for the reply in JSON. The reply may be correct JSON, but it is not a JSON data structure; it is just a string.
Structured output conversion employs meticulously crafted prompts, often necessitating multiple interactions with the model to achieve the desired formatting.
See: The Spring AI Structured Output Converters.
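The two halves of the conversion can be sketched in plain Java. This is an illustrative stand-in, not Spring AI's actual converter classes: (1) format instructions appended to the prompt, and (2) parsing the model's String reply into a real data structure.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch of structured-output conversion (not Spring AI's API).
public class StructuredOutputSketch {

    // (1) Instructions asking the model to reply with exactly these fields.
    public static String formatInstructions(String... fields) {
        return "Respond only with a JSON object containing the fields: "
            + String.join(", ", fields);
    }

    // (2) Naive parse of a flat JSON object with string values into a Map.
    // A real converter would use a proper JSON library instead.
    public static Map<String, String> parseFlatJson(String json) {
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = Pattern.compile("\"(\\w+)\"\\s*:\\s*\"([^\"]*)\"").matcher(json);
        while (m.find()) {
            result.put(m.group(1), m.group(2));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(formatInstructions("setup", "punchline"));
        // Pretend this String came back from the model:
        String reply = "{\"setup\": \"Why did the JVM stop?\", \"punchline\": \"It ran out of heap.\"}";
        Map<String, String> joke = parseFlatJson(reply);
        System.out.println(joke.get("punchline"));
    }
}
```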
Function Calling
Large Language Models (LLMs) are frozen after training, leading to stale knowledge, and they are unable to access or modify external data.
The Function Calling mechanism addresses these shortcomings.
It allows you to register your own functions to connect the large language models to the APIs of external systems. These systems can provide LLMs with real-time data and perform data processing actions on their behalf.
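The registration-and-dispatch loop can be sketched as follows. This is an illustrative model, not Spring AI's actual registration API: the application registers named functions, and when the model asks for one by name, the framework invokes it and feeds the result back to the model.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch of the function-calling mechanism described above.
public class FunctionCallingSketch {

    private final Map<String, Function<String, String>> registry = new HashMap<>();

    // The application registers a function the model may call.
    public void register(String name, Function<String, String> fn) {
        registry.put(name, fn);
    }

    // Dispatch a function call requested by the model.
    public String dispatch(String name, String argument) {
        Function<String, String> fn = registry.get(name);
        if (fn == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        return fn.apply(argument);
    }

    public static void main(String[] args) {
        FunctionCallingSketch framework = new FunctionCallingSketch();
        // A stub "external API" providing real-time data the model lacks.
        framework.register("currentWeather", city -> "22°C and sunny in " + city);
        // The model's reply would name the function and its argument:
        System.out.println(framework.dispatch("currentWeather", "Paris"));
    }
}
```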
Evaluating AI responses
AI output evaluation is crucial for ensuring accuracy and quality, with pre-trained models offering built-in evaluation capabilities.
Evaluation examines if responses match user intent and query context, using metrics like relevance, coherence, and factual accuracy.
One evaluation method involves having the AI model assess whether its own response aligns with the user's request.
Vector database information can be used as reference data to improve evaluation and determine response relevance.
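As a crude, illustrative stand-in for reference-based evaluation, one can score how much of the reference data's vocabulary the response actually covers. Real evaluators typically ask a model to judge relevance instead; this just makes the idea concrete.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy relevance score: fraction of reference words present in the response.
public class ResponseEvaluationSketch {

    // Returns a value in [0, 1]; higher means more reference terms covered.
    public static double coverage(String response, String reference) {
        Set<String> responseWords = new HashSet<>(
            Arrays.asList(response.toLowerCase().split("\\W+")));
        String[] refWords = reference.toLowerCase().split("\\W+");
        long hits = Arrays.stream(refWords).filter(responseWords::contains).count();
        return refWords.length == 0 ? 0.0 : (double) hits / refWords.length;
    }

    public static void main(String[] args) {
        String reference = "Spring AI supports prompt templates";
        String good = "Yes, prompt templates are supported by Spring AI.";
        String bad = "The weather is nice today.";
        System.out.printf("good: %.2f%n", coverage(good, reference));
        System.out.printf("bad:  %.2f%n", coverage(bad, reference));
    }
}
```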