ISTQB Certified Tester Specialist Level Testing with Generative AI (CT-GenAI)
Introduction to Generative AI for Software Testing
1.1 Generative AI Key Concepts and Foundations
Focuses on understanding the underlying technology of Generative AI, Large Language Models (LLMs), and their key terminology.
1.1.1 AI Spectrum: Symbolic AI, Classic ML, Deep Learning, and Generative AI
Symbolic AI: Uses rule-based systems to imitate human decision-making
Classic ML: Data-driven approach for tasks such as defect categorization
Deep Learning: Uses neural networks to automatically learn features from complex data
Generative AI (GenAI): Uses deep learning techniques to create new content (text, images, code) by learning patterns
1.1.2 Generative AI and LLMs Foundations
Based on the 'generative pre-trained transformer' (GPT) model, trained on vast, diverse datasets
SLMs (Small Language Models): Compact models with fewer parameters, designed for lightweight and focused GenAI solutions
Key LLM processing concepts:
Tokenization: Process of splitting text into smaller units called 'tokens'
Embeddings: Numerical representations (vectors) of tokens that encode their semantic relationships
Transformer Model: Neural network architecture that processes context of long text sequences
Non-deterministic behavior: LLMs may vary their output for the same input due to probabilistic nature
Context Window: The maximum amount of text (measured in tokens) the model can consider at once
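The embedding concept above can be illustrated with a toy sketch: semantically related tokens map to nearby vectors, and closeness is typically measured with cosine similarity. The three-dimensional vectors here are invented for illustration; real models use hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": 'bug' and 'defect' point in similar directions, 'banana' does not.
bug = [0.9, 0.1, 0.2]
defect = [0.85, 0.15, 0.25]
banana = [0.1, 0.9, 0.7]
```

With these vectors, `cosine_similarity(bug, defect)` is higher than `cosine_similarity(bug, banana)`, which is how embeddings encode semantic relationships numerically.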
1.1.3 Foundation, Instruction-Tuned, and Reasoning LLMs
Foundation LLMs (Foundation Models): General-purpose models trained on broad data; typically require adaptation for specific tasks
Instruction-tuned LLMs: Fine-tuned with datasets pairing 'prompts' with 'expected responses'
Reasoning LLMs: Emphasize structured cognitive skills (e.g., logical inference, "chain-of-thought")
1.1.4 Multimodal LLMs and Vision-Language Models
Multimodal LLMs: Process multiple data modalities (text, images, sound, video)
Vision-Language Models (VLM): Specifically integrate visual (images) and textual information
Application in Testing: Can analyze visual elements (screenshots) alongside textual descriptions
1.2 Leveraging Generative AI in Software Testing: Key Principles
Explores LLM capabilities to automate and enhance testing tasks, and the different ways to integrate this technology (chatbots vs. applications).
1.2.1 Key LLM Capabilities for Testing Tasks
Requirements analysis and improvement: Identify ambiguities and inconsistencies
Test case creation support: Generate test cases and suggest test objectives
Test oracle generation: Help generate expected results
Test data generation: Generate synthetic datasets and set boundary values
Test automation support: Help generate and enhance test scripts
Test results analysis: Help analyze results and classify anomalies
Testware creation: Help create documents like test plans and defect reports
1.2.2 AI Chatbots and LLM-Powered Testing Applications
AI Chatbots: Provide a conversational interface (natural language) for direct LLM interaction
LLM-Powered Testing Applications: Integrate LLM capabilities into existing testing tools or frameworks
Prompt Engineering for Effective Software Testing
2.1 Effective Prompt Development
Covers the essential elements for designing clear and structured instructions (prompts) that guide Generative AI models to produce high-quality testing outputs.
2.1.1 Prompt Structure for Generative AI in Software Testing
6 typical components:
Role: Defines the perspective the GenAI should take
Context: Background information the model needs
Instruction: Clear and concise directives about the task
Input data: Information necessary for the task
Constraints: Limitations the LLM must follow
Output format: Specifications on how the response should look
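The six components above can be assembled programmatically. This is a minimal sketch; the helper name and the example field values are illustrative assumptions, not part of the syllabus.

```python
def build_prompt(role, context, instruction, input_data, constraints, output_format):
    """Assemble the six typical prompt components into a single prompt string."""
    return "\n".join([
        f"Role: {role}",
        f"Context: {context}",
        f"Instruction: {instruction}",
        f"Input data: {input_data}",
        f"Constraints: {constraints}",
        f"Output format: {output_format}",
    ])

prompt = build_prompt(
    role="You are an experienced software tester.",
    context="We are testing a login form with username and password fields.",
    instruction="Generate boundary-value test cases for the username field.",
    input_data="Username length: 3 to 20 characters.",
    constraints="Focus on input validation only; do not include security tests.",
    output_format="A numbered list, one test case per line.",
)
```

Keeping each component on its own labeled line makes prompts easier to review, version, and refine iteratively.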
2.1.2 Key Prompting Techniques for Software Testing
Prompt Chaining: Decompose a complex task into a series of intermediate steps
Few-shot Prompting: Provide the LLM with a few examples within the prompt
Zero-shot Prompting (Comparison): No examples are provided
One-shot Prompting (Comparison): Only one example is provided
Meta Prompting: Leverage the AI's ability to generate or refine its own prompts
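Prompt chaining, the first technique above, can be sketched as a pipeline in which each step's output feeds the next prompt. The `ask_llm` function is a stub standing in for any real LLM API call; it is an assumption for illustration only.

```python
def ask_llm(prompt):
    # Placeholder for a real LLM API call; returns a canned response for illustration.
    return f"[model response to: {prompt[:40]}...]"

def chained_test_design(requirement):
    """Prompt chaining: decompose test design into intermediate steps,
    feeding each step's output into the next prompt."""
    conditions = ask_llm(f"List test conditions for this requirement:\n{requirement}")
    cases = ask_llm(f"Derive test cases from these test conditions:\n{conditions}")
    data = ask_llm(f"Generate test data for these test cases:\n{cases}")
    return data
```

Breaking the task into steps lets a human verify each intermediate result, which is why the syllabus recommends chaining for complex tasks requiring precision.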
2.1.3 System Prompt and User Prompt
System Prompt: Defines the general personality, tone, operating rules, and constraints
User Prompt: The user's actual input or question
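In chat-style LLM APIs, the system/user distinction above is commonly expressed as a list of role-tagged messages. The structure below follows the widely used chat-message convention; the exact payload shape varies by provider.

```python
# The system prompt sets the global rules once; the user prompt carries the actual request.
messages = [
    {"role": "system",
     "content": "You are a test analyst. Use ISTQB terminology and never "
                "invent requirements that were not provided."},
    {"role": "user",
     "content": "Review this acceptance criterion for ambiguity: "
                "'The system should respond quickly.'"},
]
```

Separating the two lets an organization centrally control the system prompt (tone, constraints, guardrails) while testers vary only the user prompt.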
2.2 Applying Prompt Engineering Techniques to Testing Tasks
Details how to use prompt engineering in the main phases of the testing process: analysis, design, implementation, and control.
2.2.1 Test Analysis with Generative AI
Typical tasks: Identify potential defects in the test basis, generate test conditions, prioritize test conditions, suggest relevant testing techniques
2.2.2 Test Design and Implementation with Generative AI
Typical tasks: Test case generation, synthetic test data synthesis, automated test script generation
2.2.3 Automated Regression Testing with Generative AI
Typical tasks: Automated script implementation, impact analysis and test optimization, self-healing tests, automated test reports
2.2.4 Test Monitoring and Control with Generative AI
Typical tasks: Test monitoring and metric analysis, test control (reprioritization), test completion insights, enhanced visualization
2.2.5 Choosing Prompting Techniques for Software Testing
Prompt Chaining: Best for complex tasks requiring precision and human verification
Few-shot Prompting: Best for repetitive tasks or those requiring a specific output format
Meta Prompting: Best for flexible and dynamic tasks or for creating new prompts
2.3 Evaluating Generative AI Outputs and Refining Prompts
Focuses on how to measure the quality and usefulness of AI-generated testware and the iterative process for improving instructions using testing metrics.
2.3.1 Metrics for Evaluating Generative AI Outputs in Testing Tasks
Accuracy: Overall correctness of the output
Precision: Correctness of the output with respect to a specific objective
Recall: Ability to identify all relevant instances
Relevance and Contextual Fit: Whether the output is applicable and appropriate for the context
Diversity: Ensuring a wide range of inputs and scenarios are covered
Execution Success Rate: Proportion of generated scripts that can execute successfully
Time Efficiency: Time saved compared to manual effort
2.3.2 Techniques for Iteratively Evaluating and Refining Prompts
Iterative prompt modification: Start with a base prompt and gradually modify it
A/B testing of prompts: Create multiple prompt versions and evaluate which yields better results
Output analysis: Examine the output looking for inaccuracies
Integrate user feedback: Collect opinions from testers
Adjust prompt length and specificity: Experiment with different levels of detail
Managing Generative AI Risks in Software Testing
3.1 Hallucinations, Reasoning Errors, and Biases
Analyzes internal risks related to the generation of incorrect, illogical, or discriminatory content by the AI, and how to detect and mitigate them.
3.1.1 Hallucinations, Reasoning Errors, and Biases in Generative AI
Hallucinations: AI generates output that appears plausible but is factually incorrect or invented
Reasoning Errors: LLMs misinterpret logical structures
Biases: Stem from the training data
3.1.2 Identifying Hallucinations, Reasoning Errors, and Biases in LLM Output
Hallucination Detection: Cross-verification, domain expertise consultation, consistency checks
Reasoning Error Detection: Logical validation, output testing (executing generated tests)
Bias Detection: Reviewing whether generated testware precisely represents the test strategy
3.1.3 Mitigation Techniques for Hallucinations, Reasoning Errors, and Biases
Provide complete context, divide prompts into manageable segments (prompt chaining), use clear data formats, compare results across models
3.1.4 Mitigating Non-Deterministic Behavior of LLMs
Adjust 'Temperature' parameter: Lowering temperature (e.g., to 0) reduces randomness
Set random seeds: Improves reproducibility of outputs
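The temperature effect above can be demonstrated with the softmax function LLMs use to turn raw token scores into sampling probabilities; this is a simplified sketch of the mechanism, not a real model.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert raw model scores into probabilities; lower temperature
    sharpens the distribution toward the most likely token."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
hot = softmax_with_temperature(logits, 1.0)   # noticeable randomness
cold = softmax_with_temperature(logits, 0.1)  # top token dominates
```

At low temperature the top token's probability approaches 1, which is why setting temperature near 0 makes LLM outputs nearly deterministic.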
3.2 Data Privacy and Security Risks of Generative AI
Addresses threats to confidentiality, integrity, and availability of information when using AI systems, including "Shadow AI."
3.2.1 Key Data Privacy and Security Risks Associated with Generative AI
Data Privacy Concerns: Unintentional data exposure, lack of control over data usage, compliance risks
Security Risks: Vulnerable infrastructure, exploitation of LLM vulnerabilities, malicious input
3.2.2 Data Privacy and Vulnerabilities in GenAI Testing Processes and Tools
Attack Vectors: Data exfiltration, request manipulation, data poisoning, malicious code generation
3.2.3 Mitigation Strategies to Protect Privacy and Enhance Security
Data Protection Measures: Data minimization, anonymization and pseudonymization, secure storage, staff training
Additional Strategies: Systematic review of generated output, choice of a secure operating environment
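Pseudonymization, listed above, can be sketched as replacing identifying fields with stable hashed tokens before test data leaves the organization. The field names and record are illustrative; real schemes also need salting and key management.

```python
import hashlib

def pseudonymize(record, sensitive_fields):
    """Replace sensitive values with stable pseudonyms before sending
    test data to an external LLM (data minimization in practice)."""
    cleaned = dict(record)
    for field in sensitive_fields:
        if field in cleaned:
            digest = hashlib.sha256(str(cleaned[field]).encode()).hexdigest()[:8]
            cleaned[field] = f"{field}_{digest}"
    return cleaned

user = {"name": "Alice Smith", "email": "alice@example.com", "order_total": 99.90}
safe = pseudonymize(user, ["name", "email"])
```

The non-identifying field (`order_total`) stays usable for test generation, while the same input always maps to the same pseudonym, preserving referential consistency across records.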
3.3 Energy Consumption and Environmental Impact
Discusses the computational cost and carbon footprint associated with LLM training and inference, and the need for optimization.
3.3.1 Impact of GenAI Use on Energy Consumption and CO2 Emissions
Training and processing LLMs (inference) requires intensive use of computational resources, resulting in a substantial environmental load
3.4 AI Regulations, Standards, and Best Practice Frameworks
Presents the regulatory landscape and risk management frameworks to ensure ethical and compliant use of Generative AI in testing.
3.4.1 Relevant Regulations, Standards, and Frameworks for GenAI in Testing
ISO/IEC 42001:2023: Standard for managing AI systems
ISO/IEC 23053:2022: Framework for AI Systems using ML
EU AI Act: Regulation classifying applications by risk level
NIST AI Risk Management Framework (RMF): Guides for managing AI risks
LLM-Powered Testing Infrastructure
4.1 Architectural Approaches for LLM-Powered Testing Infrastructure
Describes the necessary architecture to integrate LLMs into the testing environment, including the use of Vector Databases and the RAG approach.
4.1.1 Key Architectural Concepts and Components
Definition: Infrastructure that integrates an LLM into the software testing process
Typical Architecture: Front-end, Back-end, LLM
Integration of multiple data sources: Relational databases and Vector databases
4.1.2 Retrieval-Augmented Generation (RAG)
RAG Definition: Technique that enhances LLMs by incorporating additional external data sources
Runtime process: 1. Retrieval of relevant data 'chunks'. 2. Generation of 'grounded' response
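The two-step runtime process above can be sketched end to end. Retrieval here uses naive keyword overlap for simplicity; real RAG systems retrieve chunks from a vector database by embedding similarity. The chunks and query are invented examples.

```python
# Step 1: retrieve the most relevant documentation chunks for the query.
chunks = [
    "Login requires a username of 3-20 characters.",
    "Passwords expire after 90 days.",
    "The reporting module exports CSV and PDF.",
]

def retrieve(query, chunks, top_k=2):
    """Rank chunks by word overlap with the query (stand-in for vector search)."""
    query_words = set(query.lower().split())
    def overlap(chunk):
        return len(query_words & set(chunk.lower().split()))
    return sorted(chunks, key=overlap, reverse=True)[:top_k]

# Step 2: ground the prompt in the retrieved chunks so the answer stays factual.
def grounded_prompt(query):
    context = "\n".join(retrieve(query, chunks))
    return f"Using only this context:\n{context}\n\nAnswer: {query}"

prompt = grounded_prompt("username rules for the login form")
```

Because the model is instructed to answer only from retrieved context, RAG reduces hallucinations and lets the LLM use project documentation it was never trained on.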
4.1.3 The Role of LLM-Powered Agents in Test Automation
LLM-Powered Agents: Specialized GenAI applications for semi-autonomous or autonomous task processing
Key difference from chatbots: Agents can "act" by invoking functions or "tools"
Autonomy Levels: Autonomous and Semi-autonomous Agents
Multi-agent Architectures: Collaborative system where multiple agents coordinate
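The key difference noted above, that agents can "act" by invoking tools, can be sketched as a dispatch loop over a tool registry. The tool names, stub functions, and routing logic are illustrative assumptions, not a real framework.

```python
# Stub tools standing in for real integrations (test runner, file access).
def run_pytest(path):
    return f"executed tests in {path}: 12 passed"

def read_log(path):
    return f"last 10 lines of {path}"

TOOLS = {"run_tests": run_pytest, "read_log": read_log}

def agent_step(action, argument):
    """Dispatch a model-chosen action to the matching tool.
    In a real agent, 'action' and 'argument' come from the LLM's output."""
    tool = TOOLS.get(action)
    if tool is None:
        return f"unknown tool: {action}"
    return tool(argument)

result = agent_step("run_tests", "tests/login")
```

A semi-autonomous agent would pause for human approval before each dispatch; an autonomous one would loop, feeding each tool result back to the LLM to decide the next action.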
4.2 Fine-Tuning and LLMOps: Operationalizing Generative AI
Explains advanced techniques (like Fine-Tuning) to adapt LLMs to specific domains and LLMOps practices for managing their lifecycle in production.
4.2.1 Fine-Tuning LLMs for Testing Tasks
Fine-tuning: Adapts a pre-trained Language Model to perform specific tasks or adjust to particular domains
Involves additional supervised training on a focused, labeled dataset
Challenges: Requires high-quality datasets, mitigating 'overfitting', managing 'opacity'
4.2.2 LLMOps in Deploying and Managing LLMs for Testing
LLMOps (Large Language Model Operations): Practices, tools, and processes to streamline the development, deployment, and maintenance of LLMs
Implementation Approaches: Using an AI chatbot, using an integrated GenAI testing tool, or In-house development
Deploying and Integrating Generative AI in Testing Organizations
5.1 Roadmap for Generative AI Adoption in Testing
Establishes a strategic plan, including mitigating "Shadow AI" risks and criteria for model selection, across the adoption phases.
5.1.1 Risks of "Shadow AI"
Shadow AI: Use of GenAI tools or systems without formal approval
Risks: Data security and privacy weaknesses, regulatory compliance issues, unclear intellectual property ownership
5.1.2 Key Aspects of a Generative AI Strategy in Testing
Define measurable objectives, select the correct LLMs/SLMs, ensure input data quality, establish training programs, collect metrics, and establish governance guidelines
5.1.3 Selection of LLMs/SLMs for Testing Tasks
Key criteria: Model performance, Fine-tuning potential, recurring cost, community and support
5.1.4 Phases in Generative AI Adoption in Testing
Phase 1: Discovery: Awareness, training, and initial experimentation
Phase 2: Initiation and use definition: Identify and prioritize use cases, evaluate infrastructure
Phase 3: Utilization and iteration: Full integration, continuous monitoring and measurement
5.2 Managing Change When Adopting Generative AI
Details the evolution of roles (Tester and Test Manager), essential skills required, and capability building within testing teams.
5.2.1 Essential Skills and Knowledge for Testing with GenAI
Key skills: Mastering prompt engineering techniques, understanding the model, developing test review methods, knowledge of risks, data security implications
5.2.2 Building GenAI Capabilities in Testing Teams
Practical approach with various LLMs, structured learning paths, internal Communities of Practice (CoP), sharing prompt libraries
5.2.3 Evolution of Testing Processes in AI-Enabled Organizations
Tester Evolution: Shifts to an AI-assisted testing specialist. Tasks: AI output review, prompt refinement
Test Manager Evolution: Updated responsibilities include developing an AI-based testing strategy, AI-based risk management, and leading hybrid teams