• 0 Posts
  • 1 Comment
Joined 1Y ago
Cake day: Jun 24, 2023


I can’t believe I’ll get excited about phone specs again 🙌🏻✨. Do you think it would be possible to parallelize computation across several phones to run inference on transformer models? I assume it’s not worth it, since you’d need to transfer a ton of data among devices to run attention at every layer, but the llama people have pulled off so many tricks at this point…
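For what it’s worth, the communication cost depends a lot on *how* you split the model. Tensor-parallel attention needs an all-reduce every layer, but pipeline parallelism (each phone holds a contiguous slice of layers) only ships the hidden state across each device boundary once per generated token. A back-of-envelope sketch, assuming Llama-7B-like dimensions and a hypothetical four-phone setup:

```python
# Rough estimate of inter-device traffic for pipeline-parallel inference.
# Figures below are assumptions: a Llama-7B-like model split over 4 phones.
hidden_dim = 4096      # Llama-7B hidden size
bytes_per_act = 2      # fp16 activations
n_devices = 4          # hypothetical number of phones

# Each generated token's hidden state crosses (n_devices - 1) boundaries.
per_token_bytes = hidden_dim * bytes_per_act * (n_devices - 1)
print(f"{per_token_bytes / 1024:.1f} KiB per generated token")  # 24.0 KiB
```

~24 KiB per token is trivial even over Wi-Fi, so bandwidth isn’t the killer for a pipeline split; the per-hop latency added to every token is the bigger issue.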