Google launches Gemini 1.5 Flash and significant updates to other models and features


By Storyboard18 | May 15, 2024, 9:53 am
1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more. (Representative Image: Lauren Edvalson via Unsplash)

Google has announced a slew of new artificial intelligence features and updates to its AI models and tools.

In December last year, Google launched its first natively multimodal model, Gemini 1.0, in three sizes: Ultra, Pro and Nano. Just a few months later, it released 1.5 Pro, with enhanced performance and a breakthrough long context window of 1 million tokens.

According to the company, developers and enterprise customers have found 1.5 Pro's long context window, multimodal reasoning capabilities and strong overall performance useful across a range of applications.

Today, Google has introduced Gemini 1.5 Flash: a model that’s lighter-weight than 1.5 Pro, and designed to be fast and efficient to serve at scale, the company said in a blog post.

Both 1.5 Pro and 1.5 Flash are available in public preview with a 1 million token context window in Google AI Studio and Vertex AI. 1.5 Pro is also available with a 2 million token context window via waitlist to developers using the API and to Google Cloud customers, the blog read.
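As an illustration of how a developer might call 1.5 Flash through the Gemini API from Google AI Studio, here is a minimal sketch using the google-generativeai Python SDK. The model identifier "gemini-1.5-flash", the file name and the prompt are assumptions for illustration, not details confirmed in the announcement.

```python
# Minimal sketch: calling Gemini 1.5 Flash through the Gemini API.
# Assumes the google-generativeai Python SDK and an API key from Google AI
# Studio; the model name "gemini-1.5-flash" is an assumed identifier.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key obtained from Google AI Studio

# Instantiate the lighter-weight Flash model.
model = genai.GenerativeModel("gemini-1.5-flash")

# A long-document summarization request: the kind of task Flash is
# positioned for, at lower latency and cost than 1.5 Pro.
with open("long_report.txt") as f:
    document = f.read()

response = model.generate_content(
    ["Summarize the key findings of this report in five bullet points:", document]
)
print(response.text)
```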

Additionally, Google announced updates across the Gemini family of models, introduced the next generation of open models, Gemma 2, and shared progress on the future of AI assistants with Project Astra.

1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more.

The reason behind this is that the model was trained by 1.5 Pro through a process called "distillation," in which the most essential knowledge and skills from a larger model are transferred to a smaller, more efficient model, the blog said.
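Distillation itself is a standard machine-learning technique. The sketch below is a generic, illustrative training loss (not Google's actual pipeline), assuming PyTorch and a "teacher"/"student" setup in which the smaller student model is trained to match the larger teacher's softened output distribution alongside the ground-truth labels.

```python
# Generic knowledge-distillation loss (illustrative only; not Google's
# training code). The student matches the teacher's softened outputs
# in addition to learning from the ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    # Soft targets: teacher probabilities at a higher temperature.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    soft_student = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between student and teacher distributions,
    # scaled by T^2 as is conventional.
    kd_loss = F.kl_div(soft_student, soft_targets,
                       reduction="batchmean") * temperature ** 2
    # Standard cross-entropy against the ground-truth labels.
    ce_loss = F.cross_entropy(student_logits, labels)
    return alpha * kd_loss + (1 - alpha) * ce_loss
```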

The blog also stated that 1.5 Pro can now follow increasingly complex and nuanced instructions, including ones that specify product-level behavior involving role, format and style.

Google has improved control over the model's responses for specific use cases, such as crafting the persona and response style of a chat agent or automating workflows through multiple function calls. Users can now steer model behavior by setting system instructions.
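As a hedged sketch of what setting a system instruction might look like in the google-generativeai Python SDK: the persona text, prompts and the model identifier "gemini-1.5-pro" are illustrative assumptions.

```python
# Sketch: steering model behavior with a system instruction.
# Assumes the google-generativeai SDK; persona text is illustrative.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

model = genai.GenerativeModel(
    "gemini-1.5-pro",
    system_instruction=(
        "You are a concise customer-support agent for a travel site. "
        "Answer in a friendly tone and never exceed three sentences."
    ),
)

chat = model.start_chat()
reply = chat.send_message("Can I change the date on my booking?")
print(reply.text)
```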

Google has also added audio understanding in the Gemini API and Google AI Studio, through which 1.5 Pro can reason across image and audio for videos uploaded in Google AI Studio. 1.5 Pro will also be integrated into Google products, including Gemini Advanced and Workspace apps.
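A sketch of how audio might be passed to 1.5 Pro through the Gemini API's file-upload support, again assuming the google-generativeai SDK; the file name, prompt and model identifier are illustrative.

```python
# Sketch: audio understanding with 1.5 Pro via the Gemini API.
# Assumes the google-generativeai SDK's file-upload support; the file
# name and prompt are illustrative.
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")

# Upload an audio file and ask the model to reason over it.
audio_file = genai.upload_file("earnings_call.mp3")

model = genai.GenerativeModel("gemini-1.5-pro")
response = model.generate_content(
    [audio_file, "List the three main topics discussed in this recording."]
)
print(response.text)
```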
