
Microsoft Research's Lens: Detailed Captions Beat Raw Scale for Efficient Image Generation
Microsoft Research unveils Lens, a 3.8B parameter text-to-image model that matches 80B rivals using one-fifth the compute, thanks to 800M detailed captions and smart architecture.
Key Takeaways
- Lens is a 3.8B-parameter text-to-image model that uses one-fifth the compute of comparable models.
- It uses 800M image-text pairs with detailed GPT-4.1 captions (averaging 100 words).
DT Editorial Team · via the-decoder.com