FlashAttention 2: making Transformers 800% faster w/o approximation - with Tri Dao of Together AI

From 🇺🇸 Latent Space: The AI Engineer Podcast, published 2023-07-26 16:46
