Apple’s latest machine learning research could make creating models for Apple Intelligence faster, by coming up with a technique to almost triple the rate of generating tokens when using Nvidia GPUs.
One of the problems in creating large language models (LLMs) for tools and apps that offer AI-based functionality, such as Apple Intelligence, is inefficiencies in producing the LLMs in the first place. Training models for machine learning is a resource-intensive and slow process, which is often countered by buying more hardware and taking on increased energy costs.
Earlier in 2024, Apple published and open-sourced Recurrent Drafter, known as ReDrafter, a method of speculative decoding to improve performance in training. It used an RNN (Recurrent Neural Network) draft model combining beam search with dynamic tree attention for predicting and verifying draft tokens from multiple paths.
Continue Reading on AppleInsider | Discuss on our Forums
Source: AppleInsider News