AI Industry Map, GPT-4 still "king", scaling experiments likely…
AI Industry Map
Coatue report
AI Stack
AI Ops
Hugging Face
AI Cloud
Lambda
Models
OpenAI
improve models
Multi-modality
Longer contexts
local models
New architectures
Better model eval
AI Semis
SambaNova
Apps
Jasper.ai
Scaling AI model performance has been a focus of AI researchers
Larger datasets: How much data is optimal for training models?
More compute: Can we reduce the compute costs of training & inference?
More parameters: Will scaling parameters continue improving performance?
Longer training: Can training for more epochs improve performance?
AI Avatar
The core determinants of AI digital human interactive performance are four driving abilities:
Text: Whether the content spoken by the AI digital human is appropriate.
Text Generation Capability (out of 100 points):
40 points: Rigid machine level (Last century's NLP technology)
60 points: Relatively intelligent (ChatGPT level + knowledge base + light tuning)
80 points: Indistinguishable from humans (ChatGPT level + knowledge base + heavy tuning)
90 points: ChatGPT App level of voice content generation
100 points: Completely natural human level
Voice: Whether the voice of the AI digital human is pleasant to hear.
40 points: Rigid machine level (Last century's TTS technology)
60 points: Relatively intelligent, still with a machine feel (Open-source framework average level / second-tier cloud service provider level)
80 points: Indistinguishable from humans (Open-source framework + fine-tuning top-tier / first-tier cloud service provider's premium sound)
90 points: ChatGPT App level of voice content generation
100 points: Completely natural human level
Expression: Whether the expressions of the AI digital human are rich.
Facial Expression Generation Capability:
40 points: Rigid machine level (Oculus LipSync level)
60 points: Relatively intelligent, still with a machine feel (SadTalker/VideoReTalking/Wav2Lip)
70 points: 3D (NVIDIA Audio2Face level)
80 points: Indistinguishable from humans (HeyGen/Top 2D digital human manufacturers)
100 points: Completely natural human level
Motion: Whether the movements of the AI digital human are lively.
Limb Movement Generation Capability:
40 points: Rigid machine level (Traditional motion library level)
60 points: Relatively intelligent, still with a machine feel (NVIDIA Audio2Gesture)
70 points: MoFang Technology digital human【Mirror】level
80 points: Indistinguishable from humans (Latest academic research level)
100 points: Completely natural human level
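The four rubrics above can be read as threshold tables. The following is a minimal Python sketch of that reading, not part of the original map. It assumes a score belongs to the highest tier whose threshold it reaches. The names `RUBRIC` and `tier` and the "Below rubric floor" label are hypothetical; tier labels are shortened from the list above.

```python
from bisect import bisect_right

# Thresholds and labels transcribed (shortened) from the rubric above.
RUBRIC = {
    "text": [
        (40, "Rigid machine level"),
        (60, "Relatively intelligent"),
        (80, "Indistinguishable from humans"),
        (90, "ChatGPT App level of voice content generation"),
        (100, "Completely natural human level"),
    ],
    "voice": [
        (40, "Rigid machine level"),
        (60, "Relatively intelligent, still with a machine feel"),
        (80, "Indistinguishable from humans"),
        (90, "ChatGPT App level of voice content generation"),
        (100, "Completely natural human level"),
    ],
    "expression": [
        (40, "Rigid machine level"),
        (60, "Relatively intelligent, still with a machine feel"),
        (70, "3D (NVIDIA Audio2Face level)"),
        (80, "Indistinguishable from humans"),
        (100, "Completely natural human level"),
    ],
    "motion": [
        (40, "Rigid machine level"),
        (60, "Relatively intelligent, still with a machine feel"),
        (70, "MoFang Technology digital human [Mirror] level"),
        (80, "Indistinguishable from humans"),
        (100, "Completely natural human level"),
    ],
}


def tier(ability: str, score: float) -> str:
    """Return the highest tier whose threshold the score reaches.

    Assumption: scores between two listed thresholds fall back to the
    lower tier; the source does not say how to treat them.
    """
    levels = RUBRIC[ability]
    thresholds = [t for t, _ in levels]
    i = bisect_right(thresholds, score)
    if i == 0:
        return "Below rubric floor"  # hypothetical label, not in the source
    return levels[i - 1][1]


if __name__ == "__main__":
    # Illustrative scores only; not measurements of any real system.
    for ability, score in {"text": 85, "voice": 60, "expression": 75, "motion": 39}.items():
        print(f"{ability}: {score} -> {tier(ability, score)}")
```

Keeping each ability as its own table reflects that the map scores the four abilities separately and gives no formula for combining them.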
LLM NVIDIA
ChatGPT by Stephen Wolfram
Global Map
Data
GPT-4 still "king", scaling experiments likely continuing
Larger models need more data; extending the data runway & optimising quality are crucial
Training & inference are being optimised; AI at the edge is emerging
Still an open question