News Arena

Technology

Google launches audio-to-text feature in latest Gemini Pro update

The update, known as Gemini 1.5 Pro, is now available as a public preview on Google's Vertex AI development platform.

- California - UPDATED: April 10, 2024, 07:25 PM - 2 min read


Google has added audio-to-text functionality to Gemini Pro in its latest update, allowing its chatbot to process audio files and extract text from them. The change extends the chatbot's abilities beyond visual comprehension to audio understanding.

The update, known as Gemini 1.5 Pro, is now available as a public preview on Google's Vertex AI development platform. First released to a limited group in February, the preview lets enterprise users experiment with the model's audio processing capabilities.

At its Cloud Next conference in Las Vegas, Google described Gemini 1.5 Pro as its most advanced generative model yet. According to the company, this version needs far less tweaking than its predecessor, which makes it more efficient at learning from data.

Gemini 1.5 Pro is equipped with multimodal capabilities, enabling it to transcribe various audio sources, including TV shows, movies, radio broadcasts, and conference calls. Additionally, it supports multiple languages, further broadening its utility.

However, TechCrunch highlighted potential limitations in the quality of audio-to-text transcription, particularly for video content. Separately, Google measures the data the model processes in tokens: a million tokens correspond to approximately 700,000 words or 30,000 lines of code.
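
To give a feel for the token figures quoted above, here is a minimal Python sketch that turns them into a rough estimator. The ratios come straight from the article's numbers (a million tokens for about 700,000 words or 30,000 lines of code); the helper names `estimate_tokens` and `fits_in_budget` are hypothetical, and real token counts depend on the tokenizer and the text, so treat the output as a ballpark only.

```python
# Ratios derived from the article's figures: 1,000,000 tokens is roughly
# 700,000 words or 30,000 lines of code. These are coarse averages, not
# values produced by Google's tokenizer.
TOKENS_PER_WORD = 1_000_000 / 700_000       # about 1.43 tokens per word
TOKENS_PER_CODE_LINE = 1_000_000 / 30_000   # about 33.3 tokens per line


def estimate_tokens(words: int = 0, code_lines: int = 0) -> int:
    """Rough token estimate for a mix of prose and code (hypothetical helper)."""
    return round(words * TOKENS_PER_WORD + code_lines * TOKENS_PER_CODE_LINE)


def fits_in_budget(words: int = 0, code_lines: int = 0,
                   budget: int = 1_000_000) -> bool:
    """Check whether the estimated token count stays within a token budget."""
    return estimate_tokens(words, code_lines) <= budget


if __name__ == "__main__":
    # A 7,000-word transcript comes out at about 10,000 tokens under these ratios.
    print(estimate_tokens(words=7_000))
    # 800,000 words would exceed a one-million-token budget.
    print(fits_in_budget(words=800_000))
```

Under these assumed ratios, a transcript of a typical one-hour conference call (several thousand words) would use only a small fraction of a million tokens.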

Early previews of Gemini 1.5 Pro demonstrated its ability to pinpoint specific moments in video transcripts. For instance, AI enthusiast Rowan Cheung showcased a demo where the chatbot accurately identified and summarised key events in a sports contest.

Notable early adopters of Gemini 1.5 Pro include United Wholesale Mortgage, TBS, and Replit, each exploring enterprise-focused applications such as mortgage underwriting, metadata tagging automation, and code generation and explanation.

Google's latest update marks a significant advancement in AI capabilities, paving the way for enhanced audio processing and expanding the potential applications of chatbots in enterprise settings.


2024 News Arena India Pvt Ltd | All rights reserved | The Ideaz Factory