Llama 3, the third version of Meta's open language model, has proven remarkably strong, competitive even with some frontier models. In this blog post, we dive into some of the technical details behind Llama 3. We also provide an in-depth comparison of the open language model against an array of other models, and unpack the rationale behind its immense training dataset and the implications this is likely to have for scaling.
DeepSeek R1's release has triggered a seismic shift in the AI landscape. Within just 15 days, its AI assistant topped app stores across 140 markets, surpassing 30 million daily active users and breaking ChatGPT's previous adoption records. This China-born AI lab, despite maintaining a low profile, has fundamentally challenged how we think about the path to AGI. Even Sam Altman acknowledged that closed-source might have been "on the wrong side of history", with OpenAI subsequently releasing o3-mini.

OpenAI's Sora model has amazed the world with its ability to generate extremely realistic videos across a wide variety of scenes. In this blog post, we dive into some of the technical details behind Sora. We also share our current thinking on the implications of these video models. Finally, we discuss the compute used to train models like Sora and present projections for how that training compute compares to inference, which has meaningful implications for future GPU demand.