Exploring the frontiers of GenAI with a special focus on Infra & LLMs

Micro-summit series

April 15th

In-person & Virtual

Join us for the New York City in-person & Virtual summit dedicated exclusively to GenAI Infra & LLMs.

The micro-summit includes:

  • 3 hands-on workshops
  • Post-summit networking 
Gain an understanding of the best practices, methodologies, principles, and lessons for deploying machine learning models into production environments.

Speakers

Aman Khan

Group Product Manager, Arize AI

Workshop: Tracing and Evaluating a LlamaIndex Application

Rod Rivera

Growth Lead, TitanML

Workshop: Orchestrating AI Models for Powerful Applications

Niv Hertz

Director of AI, Aporia

Workshop: Building A RAG Chatbot from Scratch with Minimum Hallucinations

Sponsors

Tickets

NYC Micro-summit on GenAI

In-person & Virtual

Subject to minor changes.

Agenda

April 15th, 2024

9:45 AM EDT

Registration

10:05 AM - 11:35 AM

"Orchestrating AI Models for Powerful Applications"

Rod Rivera, Growth Lead
TitanML

11:40 AM - 1:10 PM

"Tracing and Evaluating a LlamaIndex Application"

Aman Khan, Group Product Manager
Arize AI

1:15 PM - 2:45 PM

"Building A RAG Chatbot from Scratch with Minimum Hallucinations"

Niv Hertz, Director of AI
Aporia

Join Our Community

Our goal is to provide an open, inclusive community of ML practitioners who can share projects, best practices, and case studies. Join our open group, meet our community, and share your work with practitioners from around the world.
Workshop: Tracing and Evaluating a LlamaIndex Application

Presenter:
Aman Khan, Group Product Manager, Arize AI

About the Speaker:
Aman is a Group Product Manager at Arize AI, where he works on scaling ML observability solutions so that data scientists can monitor and improve ML models in production. Prior to Arize, Aman was the PM for the Jukebox Feature Store on Spotify's ML Platform team, serving ~50 data science teams. He was also PM for ML evaluation frameworks across data science and engineering teams for self-driving cars at Cruise, which helped launch the first self-driving car service in an urban environment. Aman studied Mechanical Engineering at UC Berkeley and lived in the SF Bay Area for 9 years before moving to NYC. When he's not working on making models more observable, he enjoys working with founders as an angel investor in early-stage startups, as well as running, cycling, and skiing.

Technical Level: 5

Talk Abstract:
As more and more companies use LLMs to build their own chatbots or search systems, poor retrieval is a common issue plaguing teams. So how can you efficiently build and analyze the LLM that powers your search system, and how do you know where to improve it? If retrieval pulls in too little relevant context, the prompt won't contain enough information to answer the question. You'll learn how to identify whether there is decent overlap between queries and context, locate where there is a density of queries without enough context, and the next steps you can take to fine-tune your model.

What You’ll Learn:
A hands-on demonstration focused on building and analyzing a context-retrieval use case. Participants will have the opportunity to investigate the model in Colab; once the application is built with LlamaIndex, they'll use Phoenix to visualize the query and context density of the model.
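The query-context overlap analysis described above can be sketched with plain cosine similarity: for each query embedding, find its best-matching context chunk and flag queries whose best match is weak. This is a minimal illustration with toy vectors, not Phoenix's actual API; the function names and threshold are assumptions.

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def flag_low_overlap(query_embs, context_embs, threshold=0.5):
    """For each query, find its best-matching context chunk and flag
    queries whose best match falls below the threshold -- these are the
    'queries without enough context' the workshop targets."""
    flagged = []
    for i, q in enumerate(query_embs):
        best = max(cosine_sim(q, c) for c in context_embs)
        if best < threshold:
            flagged.append((i, best))
    return flagged

# Toy embeddings: query 0 overlaps context 0; query 1 matches nothing well.
contexts = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])]
queries  = [np.array([0.9, 0.1, 0.0]), np.array([0.0, 0.0, 1.0])]
print(flag_low_overlap(queries, contexts))  # only query 1 is flagged
```

In practice the embeddings would come from the same model that embeds your corpus, and the flagged queries point at gaps in the knowledge base rather than at prompt wording.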

Workshop: Orchestrating AI Models for Powerful Applications

Presenter:
Rod Rivera, Growth Lead, TitanML

About the Speaker:
Rod Rivera is the Growth Lead at TitanML. He is a technical leader with experience in enterprise machine learning at Philip Morris, Samsung, and Rocket Internet, and has contributed to innovative AI projects at Alibaba and Huawei. As a professor of AI at ITAM in Mexico and a Ph.D. researcher specializing in generative AI, Rod focuses on vector embeddings for multimodal data such as time series and event sequences in finance and supply chain applications. Rod compiles the largest collaborative database of AI tools at AI Product Engineer, an open community.

Technical Level: 6

Talk Abstract:
Combining diverse AI models is crucial for building advanced applications. This workshop will teach you to integrate cutting-edge open-source models with leading commercial offerings to create modern AI solutions.

Discover how to combine local language models like Mistral and LLaMa2 with GPT-4, unlocking capabilities such as secure data interaction and enterprise search enrichment for RAG and agent applications. Leverage TitanML’s Takeoff Inference Server to streamline model orchestration.

Python proficiency is the only prerequisite. After the workshop, you’ll understand the possibilities of multi-model AI applications for the enterprise and have practical examples to apply in your organization.

What You’ll Learn:

  • Combining open-source AI models with commercial cloud models such as GPT.
  • Orchestrating open and closed models to build RAG (chat with your data) applications.
  • Combining free and commercial models to create autonomous agents.
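The local-plus-commercial pattern from the abstract can be sketched as a simple router: prompts that look sensitive stay on a local model, everything else goes to a hosted one. The two callables below are hypothetical stand-ins for real clients (e.g. a locally served Mistral and a GPT-4 API client); they are not TitanML's Takeoff API.

```python
# Stand-ins for real model clients -- hypothetical, for illustration only.
def local_model(prompt: str) -> str:
    return f"[local] {prompt}"

def cloud_model(prompt: str) -> str:
    return f"[cloud] {prompt}"

# Assumed markers for data that must not leave the organization.
SENSITIVE_MARKERS = ("internal", "confidential", "customer record")

def route(prompt: str) -> str:
    # Keep anything that looks sensitive on-premises; send the rest
    # to the commercial model for stronger general capability.
    if any(marker in prompt.lower() for marker in SENSITIVE_MARKERS):
        return local_model(prompt)
    return cloud_model(prompt)

print(route("Summarize this confidential memo"))  # handled locally
print(route("What is RAG?"))                      # sent to the cloud model
```

A production router would classify prompts with a policy model rather than keyword markers, but the control flow is the same.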
Workshop: Building A RAG Chatbot from Scratch with Minimum Hallucinations

Presenter:
Niv Hertz, Director of AI, Aporia

Talk Abstract:
We’ll start from the beginning – but go very technical, very fast. By the end of this lecture, you’ll have all the resources you need to start your own RAG chatbot in your company.

Join me to learn more about:

  • Why RAGs? And real-world examples of how they can provide value in different industries.
  • Breaking down the RAG architecture and all relevant concepts (Vector DB, knowledge base, etc.)
  • Comparing different frameworks, such as OpenAI assistants, Langchain, and LlamaIndex.
  • Best practices related to real-time data ingestion from any data source and chunking.
  • Different methodologies to improve retrieval and minimize hallucinations (re-ranking, knowledge graphs).
  • Adding guardrails to significantly mitigate hallucinations and prevent prompt injection attacks, jailbreaks, PII data leakage, and other risks to your generative AI apps.
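The retrieval core of a RAG chatbot, as outlined in the bullets above, can be sketched with a tiny in-memory vector store. A real system would use a sentence-embedding model and a vector DB; the bag-of-words "embedding" here is a deliberately simplified stand-in.

```python
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" -- a real system would use a
    # sentence-embedding model instead.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TinyVectorStore:
    def __init__(self):
        self.chunks = []  # (original text, embedding) pairs

    def add(self, text: str):
        self.chunks.append((text, embed(text)))

    def retrieve(self, query: str, k: int = 2):
        # Rank chunks by similarity to the query; the top-k chunks
        # would be injected into the prompt to ground the answer.
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

store = TinyVectorStore()
store.add("Aporia adds guardrails to generative AI apps")
store.add("RAG grounds LLM answers in retrieved documents")
store.add("Vector databases store chunk embeddings")

print(store.retrieve("how does RAG ground answers", k=1))
```

Swapping the toy `embed` for a real embedding model and the list scan for an approximate-nearest-neighbor index is essentially the jump from this sketch to a production knowledge base.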
