
ByteDance Study Finds Long-Document AI Learns Better From Questions Than From Transcribing Text
Researchers from ByteDance Seed and HKUST report that question-answer training improved long-document performance in multimodal models, while pure text-recognition training made results worse.
Key Takeaways
- Researchers compared OCR-style training with question-answer supervision for long documents.
- The study reports that pure text-recognition training worsened performance.
DT Editorial Team · via the-decoder.com