
Content provided by TWIML and Sam Charrington. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by TWIML and Sam Charrington or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here: https://ko.player.fm/legal.

Inside Nano Banana 🍌 and the Future of Vision-Language Models with Oliver Wang - #748

1:03:39

Today, we're joined by Oliver Wang, principal scientist at Google DeepMind and tech lead for Gemini 2.5 Flash Image, better known by its code name, "Nano Banana." We dive into the development and capabilities of this newly released frontier vision-language model, beginning with the broader shift from specialized image generators to general-purpose multimodal agents that can use both visual and textual data for a variety of tasks. Oliver explains how Nano Banana can generate and iteratively edit images while maintaining consistency, and how its integration with Gemini's world knowledge expands creative and practical use cases. We discuss the tension between aesthetics and accuracy, the relative maturity of image models compared to text-based LLMs, and scaling as a driver of progress. Oliver also shares surprising emergent behaviors, the challenges of evaluating vision-language models, and the risks of training on AI-generated data. Finally, we look ahead to interactive world models and VLMs that may one day "think" and "reason" in images.

The complete show notes for this episode can be found at https://twimlai.com/go/748.


769 episodes

