LLM & Text Generation - Coggle Diagram
LLM & Text Generation
LLMs
transformer architecture
encoder/decoder
the encoder takes the input and creates a representation of it; the decoder uses this representation to generate the output
Generation params
Temperature
controls the degree of randomness in token selection. Higher temperatures flatten the probability distribution, so more tokens become plausible candidates
Top-P
selects tokens from the most probable downward until their cumulative probability exceeds the threshold P; tokens beyond that cutoff are excluded as candidates
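A minimal numeric sketch of how temperature reshapes a token distribution (toy logits, not a real model):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into a probability distribution.
    Lower temperature sharpens it; higher temperature flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # near-greedy: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform
```

With the cold setting the top token takes almost all the probability mass; with the hot setting the same token keeps less than half of it.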
Prompting
Techniques :star:
Chain of Thought (CoT)
uses a simple 'greedy decoding' strategy, which means it typically produces a single reasoning path based on the highest probability tokens at each step
For CoT prompting, set the temperature to 0.
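A sketch of what a CoT prompt with greedy settings could look like; the question and the config dictionary shape are illustrative, not a specific API:

```python
# Hypothetical request pieces; a real client library would wrap these.
cot_prompt = (
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have?\n"
    "A: Let's think step by step."
)

generation_config = {
    "temperature": 0.0,  # greedy decoding: one reproducible reasoning path
    "top_p": 1.0,
}
```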
ReAct
enabling LLMs to solve complex tasks using natural language reasoning combined with external tools (search, code interpreter etc.)
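A toy ReAct loop under assumed conventions; the `Action:`/`Answer:` line format, the scripted stand-in model, and the `search` tool are all illustrative:

```python
def react_agent(llm, tools, question, max_steps=5):
    """Minimal ReAct loop: the model alternates thoughts and actions;
    tool results are fed back into the transcript as Observations."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)  # expected: "Action: tool[input]" or "Answer: ..."
        transcript += step + "\n"
        if step.startswith("Answer:"):
            return step.removeprefix("Answer:").strip()
        if step.startswith("Action:"):
            tool, arg = step.removeprefix("Action:").strip().split("[", 1)
            observation = tools[tool.strip()](arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return None

# Scripted stand-in for a real LLM, for illustration only
steps = iter(["Action: search[capital of France]", "Answer: Paris"])
answer = react_agent(
    lambda t: next(steps),
    {"search": lambda q: "Paris is the capital of France."},
    "What is the capital of France?",
)
# answer == "Paris"
```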
Tree of Thoughts (ToT)
it allows LLMs to explore multiple different reasoning paths simultaneously, rather than just following a single linear chain of thought.
System, contextual and role prompting
Self-consistency
Providing the same CoT prompt multiple times, often with a higher temperature setting to encourage the generation of diverse reasoning paths
use software or an API to send the exact same prompt to the model several times, then pick the most common final answer
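A minimal sketch of the voting step; the model call is a stand-in lambda, not a real API:

```python
from collections import Counter

def self_consistency(ask_model, prompt, n=5):
    """Send the same CoT prompt n times (sampling at a higher temperature
    yields diverse reasoning paths) and return the most common final answer."""
    answers = [ask_model(prompt, temperature=0.7) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

# Stand-in for a real LLM call: returns a scripted sequence of answers.
fake_answers = iter(["42", "41", "42", "42", "40"])
result = self_consistency(lambda p, temperature: next(fake_answers), "Q: ...", n=5)
# result == "42" (majority vote over the five sampled answers)
```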
Code prompting :<3:
for generating code, configuration and scripts
multimodal prompting
a technique where you use multiple input formats to guide an LLM, e.g. text, images, audio
LLM output configuration
Sampling :recycle:
Temperature
low temperature
temperature = 0
the model always picks the most probable token, without randomness
nucleus sampling
Top-P
higher P allows more candidate tokens
order: topK, topP, temperature
topK=1, or topP close to 0, overrides the temperature and always forces the single most probable token
starting points
temperature 0.2, topP 0.95, topK 30
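A toy sampler applying top-K, then top-P, then temperature in that stated order, over raw logits; real APIs do this internally, so the implementation details here are assumptions:

```python
import math
import random

def sample_token(logits, top_k=30, top_p=0.95, temperature=0.2, rng=random):
    """Apply top-K, then top-P, then temperature, then sample one token index."""
    # 1) top-K: keep only the K highest-logit tokens
    ranked = sorted(enumerate(logits), key=lambda x: x[1], reverse=True)[:top_k]
    # 2) top-P: keep the smallest prefix whose cumulative probability >= top_p
    m = max(l for _, l in ranked)
    exps = [(i, math.exp(l - m)) for i, l in ranked]
    total = sum(e for _, e in exps)
    kept, cum = [], 0.0
    for i, e in exps:
        p = e / total
        kept.append((i, p))
        cum += p
        if cum >= top_p:
            break
    # 3) temperature: p ** (1/T) is equivalent to softmax(logit / T)
    weights = [p ** (1.0 / temperature) for _, p in kept]
    z = sum(weights)
    r, acc = rng.random() * z, 0.0
    for (i, _), w in zip(kept, weights):
        acc += w
        if acc >= r:
            return i
    return kept[-1][0]
```

With `top_k=1` (or `top_p` near 0) the candidate list shrinks to the single most probable token, so temperature no longer matters, matching the note above.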
Training :silhouette:
reinforcement learning from human feedback (RLHF)
Evaluation
Pointwise evaluation
a mathematical concept meaning evaluating a function at specific points of its domain, "point by point"; in LLM evaluation, each model output is scored on its own rather than compared against another output
General :checkered_flag:
LLMs struggle with math and often give wrong answers even on simple tasks such as multiplying, because they are trained on large volumes of text, and math may require a different approach.
Embeddings
RAG
prompt expansion
the model retrieves relevant information from the database and augments the original prompt with it.
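A minimal sketch of prompt expansion with a stand-in retriever; a real system would run a similarity search against a vector database:

```python
def expand_prompt(question, retrieve):
    """RAG-style prompt expansion: retrieve relevant passages and
    prepend them to the user's question as context."""
    passages = retrieve(question)  # stand-in for a vector-database lookup
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )

# Illustrative retriever returning a canned passage
prompt = expand_prompt(
    "When was the library founded?",
    lambda q: ["The library was founded in 1911."],
)
```

The augmented prompt now carries the retrieved fact, so the model can answer from context instead of from its training data alone.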
vector databases
use cases
similarity search for different types of data (text, videos, etc.)
semantic search (based on the meaning, context)
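A toy cosine-similarity search over made-up 3-dimensional embedding vectors (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def nearest(query_vec, index):
    """Return the key whose embedding is most similar to the query vector."""
    return max(index, key=lambda k: cosine_similarity(query_vec, index[k]))

index = {
    "cat": [0.9, 0.1, 0.0],
    "car": [0.1, 0.9, 0.2],
}
# A made-up "kitten" embedding points in nearly the same direction as "cat":
# nearest([0.85, 0.15, 0.05], index) -> "cat"
```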
types
text embeddings
tokenization
text to tokens; each token gets a unique ID
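A whitespace tokenizer sketch; real LLMs use subword tokenizers (e.g. BPE or SentencePiece), but the token-to-ID mapping idea is the same:

```python
def build_vocab(texts):
    """Assign each unique whitespace-separated token a stable integer ID."""
    vocab = {}
    for text in texts:
        for tok in text.split():
            if tok not in vocab:
                vocab[tok] = len(vocab)
    return vocab

def encode(text, vocab):
    """Map a text to the list of IDs of its tokens."""
    return [vocab[tok] for tok in text.split()]

vocab = build_vocab(["the cat sat", "the dog sat"])
ids = encode("the dog sat", vocab)
# ids == [0, 3, 2]: "the" -> 0, "dog" -> 3, "sat" -> 2
```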
LLM limitations
they only "know" the information that they were trained on