The Fourth vLLM Meetup x AGI Builders Meetup · Luma

Looking forward to seeing everyone! If you are on the waitlist, feel free to still join us!

👋 We're thrilled to invite you to the 4th vLLM meetup, as part of the monthly AGI Builders meetup on June 11th.

❤️ It's a gathering where AI builders, researchers, and enthusiasts share ideas, inspire peers, and transform the future. In particular, this event will connect vLLM users and developers to share and learn together.

💡 At this event, engineers from the BentoML and vLLM teams will share recent updates.

🍕 Light refreshments will be available.


About the talks:

Talk 1: Scaling LLMs like you mean it

Speaker: Sean Sheng, Head of Engineering, BentoML

Abstract: While vLLM significantly enhances the efficiency of open-source LLMs, deploying these models in production environments still presents considerable scaling challenges. Although serverless architecture promises flexible resource allocation and cost efficiency, deploying LLMs on serverless GPUs faces specific hurdles such as cold starts, elastic scaling, and inference orchestration. This talk will explore these challenges and discuss the solutions we have implemented at BentoML to build a robust AI model inference platform.

Talk 2: vLLM Project Update

Speakers: Zhuohan Li, Woosuk Kwon, Simon Mo; vLLM maintainers, UC Berkeley

Abstract: In this talk, the vLLM maintainers will share updates about the project, dive into recent feature additions, and unveil the upcoming project roadmap.

About the hosts:

Cloudflare helps organizations make employees, applications, and networks faster & more secure.

BentoML empowers developers to run any AI model in the cloud and scale with confidence.

vLLM is a high-throughput, memory-efficient inference and serving engine for LLMs. It is an open-source project built by many contributors and adopted across the industry.
