Exploring the frontiers of GenAI with a special focus on Infra & LLMs

Micro-summit series

March 27th

In-person & Virtual
Free to attend

Join us for the San Francisco in-person & virtual summit dedicated exclusively to GenAI Infra & LLMs.

The micro-summit includes:
  • 2 Talks
  • 4 Workshops
  • Talks for beginner, intermediate & advanced audiences

Gain an understanding of the best practices, methodologies, principles, and lessons learned around deploying machine learning models into production environments.

Speakers

Dre Olgiati

Distinguished Engineer, LinkedIn

Talk: MLOps at LinkedIn

Kumaran Ponnambalam

Principal Engineer - AI, Outshift by Cisco

Talk: Building Secure and Trustworthy Applications with Generative AI

Workshops

Niv Hertz

Director of AI, Aporia

Workshop: Building A RAG Chatbot from Scratch with Minimum Hallucinations

Francisco Castillo

Software Engineer/Data Scientist, Arize AI

Workshop: Tracing and Evaluating a LlamaIndex Application

Anupam Datta

Co-Founder, President & Chief Scientist, TruEra

Workshop: Evaluating and Monitoring LLM Applications

Daniel Huang

Senior Software Engineer, TruEra

Workshop: Evaluating and Monitoring LLM Applications

Alex Sherstinsky

Developer Relations Engineer, Predibase

Workshop: Fine-Tune Your Own LLM to Rival GPT-4

Arnav Garg

Machine Learning Engineer, Predibase

Workshop: Fine-Tune Your Own LLM to Rival GPT-4

Abhay Malik

Product Manager, Predibase

Workshop: Fine-Tune Your Own LLM to Rival GPT-4

Sponsors

Tickets

Micro-summit on GenAI

In-person & Virtual

Subject to minor changes.

Agenda

March 27th, 2024

10:00 AM PST

Registration

10:15 AM - 10:45 AM

"MLOps at LinkedIn"

Dre Olgiati, Distinguished Engineer
LinkedIn

10:50 AM - 12:20 PM

"Tracing and Evaluating a LlamaIndex Application"

Francisco Castillo, Software Engineer/Data Scientist
Arize AI

12:55 PM - 1:55 PM

"Building A RAG Chatbot from Scratch with Minimum Hallucinations"

Niv Hertz, Director of AI
Aporia

2:00 PM - 2:30 PM

"Building Secure and Trustworthy Applications with Generative AI"

Kumaran Ponnambalam, Principal Engineer - AI
Outshift by Cisco

2:35 PM - 4:05 PM

"Fine-Tune Your Own LLM to Rival GPT-4"

Alex Sherstinsky, Developer Relations Engineer
Arnav Garg, Machine Learning Engineer
Abhay Malik, Product Manager
Predibase

4:10 PM - 5:40 PM

"Evaluating and Monitoring LLM Applications"

Anupam Datta, Co-Founder, President & Chief Scientist
Daniel Huang, Senior Software Engineer
TruEra

Join Our Community

Our goal is to provide an open, inclusive community of ML practitioners who can share projects, best practices, and case studies. Join our open group, meet our community, and share your work with practitioners from around the world.

Talk: MLOps at LinkedIn

Presenter:
Dre Olgiati, Distinguished Engineer, LinkedIn

About the Speaker:
Dre is a Distinguished Engineer at LinkedIn, where he drives efforts across all of LinkedIn's ML platforms. Previously, he spent 9 years at AWS, where he led the SageMaker ML platform.

Technical Level: Basic understanding of MLOps

Talk Abstract:
An overview of MLOps activity at LinkedIn.

What You’ll Learn:
MLOps for complex recommendation systems

Talk: Building Secure and Trustworthy Applications with Generative AI

Presenter:
Kumaran Ponnambalam, Principal Engineer – AI, Outshift by Cisco

About the Speaker:
Kumaran Ponnambalam is a technology leader with 20+ years of experience in AI, Big Data, and data processing & analytics. His focus is on creating robust, scalable AI models and services to drive effective business solutions. He currently leads AI initiatives in the Emerging Technologies & Incubation group at Cisco, where he is focused on building MLOps and observability services to enable ML. In previous roles, he built data pipelines, analytics, integrations, and conversational bots around customer engagement. He has also authored several AI and Big Data courses on the LinkedIn Learning platform.

Technical Level: 4

Talk Abstract:
As the possibilities of using Generative AI to improve businesses have exploded over the past few months, so have the concerns around security and Responsible AI in these applications. As enterprises work to bring Generative AI applications to production, they need additional capabilities to ensure that these applications are secure, privacy-preserving, fair, explainable, compliant, and accountable.
What are the additional services and processes that enterprises need to build these capabilities?
What challenges exist in bringing in these capabilities into the enterprise?
How can these capabilities be implemented and integrated in an efficient manner?
How are these capabilities monitored to ensure their continued performance?
In this session, we will discuss solutions to these questions that enterprises can leverage to safely and quickly deliver Generative AI capabilities.

Prerequisites:
Familiarity with Generative AI and machine learning

Workshop: Building A RAG Chatbot from Scratch with Minimum Hallucinations

Presenter:
Niv Hertz, Director of AI, Aporia

Talk Abstract:
We’ll start from the beginning – but go very technical, very fast. By the end of this lecture, you’ll have all the resources you need to start your own RAG chatbot in your company.

What You’ll Learn:

  • Why RAG? Real-world examples of how it provides value in different industries.
  • Breaking down the RAG architecture and all relevant concepts (vector DB, knowledge base, etc.); see the sketch after this list.
  • Comparing frameworks such as OpenAI Assistants, LangChain, and LlamaIndex.
  • Best practices for real-time data ingestion from any data source, and for chunking.
  • Methodologies to improve retrieval and minimize hallucinations (re-ranking, knowledge graphs).
  • Adding guardrails to significantly mitigate hallucinations and prevent prompt injection attacks, jailbreaks, PII data leakage, and other risks to your generative AI apps.
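
Not part of the workshop materials, but to make the architecture breakdown concrete: below is a minimal, toy sketch of the retrieve-then-generate pattern. The sample chunks, the bag-of-words "embedding", and the build_prompt helper are illustrative stand-ins for a real embedding model, vector DB, and LLM call.

```python
from collections import Counter
import math

# Toy knowledge base; in practice these chunks come from an ingestion
# pipeline and live as embedded vectors in a vector DB.
CHUNKS = [
    "Refunds are processed within 5 business days of approval.",
    "Premium-plan customers get 24/7 phone support.",
    "Passwords can be reset from the account settings page.",
]

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: bag-of-words term counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    # Rank chunks by similarity to the query and keep the top k.
    q = embed(query)
    ranked = sorted(CHUNKS, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    # Grounding the model in retrieved context is the core hallucination
    # mitigation in RAG; guardrails would wrap the final LLM call.
    context = "\n".join(retrieve(query))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long do refunds take?"))
```

A production system would swap in real embeddings, a vector store, re-ranking, and guardrails around the final LLM call, which is exactly what the session covers.
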
Workshop: Tracing and Evaluating a LlamaIndex Application

Presenter:
Francisco Castillo, Software Engineer/Data Scientist, Arize AI

About the Speaker:
Kiko is a software engineer and data scientist at Arize AI, a leader in ML observability. He holds an MA in Applied Math from ASU and previously did research at the von Karman Institute for Fluid Dynamics.

Technical Level: 4

Talk Abstract:
As more and more companies use LLMs to build their own chatbots and search systems, poor retrieval is a common issue plaguing teams. So how can you efficiently build and analyze the LLM application that powers your search system, and how do you know where to improve it? If retrieval doesn’t pull in the relevant documents, the prompt doesn’t have enough context to answer the question.
You’ll learn how to identify whether there is decent overlap between queries and context, locate dense regions of queries without enough context, and determine the next steps you can take to fine-tune your model.

Learning Objectives:

  • Hands-on demonstration focused on building and analyzing a context retrieval use case.
  • Workshop participants will have the opportunity to investigate the model in Colab.
  • Once the app is built with LlamaIndex, use Phoenix to visualize the query and context density of the model.

What You’ll Learn:

  • Build a simple query engine using LlamaIndex that uses retrieval-augmented generation to answer questions over the Arize documentation.
  • Record trace data in the OpenInference tracing format using the global arize_phoenix handler (see the sketch after this list).
  • Inspect the traces and spans of your application to identify sources of latency and cost.
  • Export your trace data as a pandas dataframe and run LLM Evals to measure the precision@k of the query engine’s retrieval step.
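
As a preview of the setup above, here is a minimal sketch based on the public LlamaIndex and Arize Phoenix quickstarts. It is an assumption-laden outline, not official workshop code: package layouts and the handler name vary across versions (the arize_phoenix handler may require an extra integration package), and ./docs is a hypothetical data directory.

```python
import phoenix as px
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, set_global_handler

# Launch the local Phoenix app to collect and visualize traces.
px.launch_app()

# Route all LlamaIndex callbacks to Phoenix via the global handler.
set_global_handler("arize_phoenix")

# Build a simple RAG query engine over local documents (./docs is hypothetical).
documents = SimpleDirectoryReader("./docs").load_data()
index = VectorStoreIndex.from_documents(documents)
query_engine = index.as_query_engine()

# Each query now emits OpenInference spans (retrieval, LLM call, latency,
# token counts) that you can inspect in the Phoenix UI.
response = query_engine.query("How do I configure monitoring?")
print(response)
```
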
Workshop: Evaluating and Monitoring LLM Applications

Presenters:
Anupam Datta, Co-Founder, President & Chief Scientist | Daniel Huang, Senior Software Engineer, TruEra

About the Speakers:
Anupam Datta is Co-Founder, President, and Chief Scientist at TruEra. Prior to founding TruEra, Anupam spent over a decade as a Professor of Electrical & Computer Engineering and Computer Science at Carnegie Mellon University. Anupam and his team have worked with transformer models since they were invented in 2017 and have recently developed TruLens, an innovative open-source project for evaluating and tracking LLM apps.

Daniel Huang is a Senior Software Engineer at TruEra, where he focuses on machine learning research and developing the TruEra core platform. Prior to TruEra, Daniel worked in software engineering at Amazon, LotusFlare, and Reflect, and earned an MS in Computer Science at Carnegie Mellon University.

Technical Level: 4

Talk Abstract:
Large language models (LLMs) have demonstrated impressive capabilities in natural language processing. However, these models also have a tendency to hallucinate, generating convincing but false information. We will discuss the factors that fundamentally contribute to hallucinations in LLMs. To address these issues, better evaluation methods are needed that systematically probe for hallucinations across diverse settings, even in applications that use Retrieval-Augmented Generation (RAG). Simply measuring overlap with human-annotated references falls short. We will propose evaluation methodologies based on the RAG Triad (context relevance, groundedness, and answer relevance) that can surface flaws in LLM apps and guide iteration to improve them. We will also show how this can be done practically using our open-source framework, TruLens, leveraging both LLMs-as-a-judge and smaller, custom models that scale better to production monitoring workloads. Developing more robust models, evaluation, and monitoring is a critical part of the tech stack for deploying LLMs responsibly.

What You’ll Learn:
In this session, you will learn:
  • How to uncover flaws in LLM apps
  • How to iterate to improve LLM apps
  • How to scale up to production monitoring workloads (see the sketch after this list)
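
For a rough idea of how the RAG Triad translates into code, here is a sketch modeled on the public TruLens quickstart for LlamaIndex. Module paths and selectors are trulens_eval API names as of early 2024 and may differ by version; query_engine is assumed to be an existing LlamaIndex query engine (e.g., from the tracing sketch above).

```python
import numpy as np
from trulens_eval import Feedback, Tru, TruLlama
from trulens_eval.feedback import Groundedness
from trulens_eval.feedback.provider.openai import OpenAI

provider = OpenAI()  # LLM-as-a-judge feedback provider

# query_engine: an existing LlamaIndex query engine (see the tracing sketch above).
context = TruLlama.select_context(query_engine)  # retrieved source-node text

# RAG Triad, part 1: is the retrieved context relevant to the question?
f_context_relevance = (
    Feedback(provider.qs_relevance, name="Context Relevance")
    .on_input()
    .on(context)
    .aggregate(np.mean)
)

# Part 2: is the answer grounded in the retrieved context?
grounded = Groundedness(groundedness_provider=provider)
f_groundedness = (
    Feedback(grounded.groundedness_measure_with_cot_reasons, name="Groundedness")
    .on(context.collect())
    .on_output()
    .aggregate(grounded.grounded_statements_aggregator)
)

# Part 3: does the answer actually address the question?
f_answer_relevance = Feedback(provider.relevance, name="Answer Relevance").on_input_output()

# Wrap the app so every call is recorded with these feedbacks.
recorder = TruLlama(
    query_engine,
    app_id="rag_app_v1",
    feedbacks=[f_context_relevance, f_groundedness, f_answer_relevance],
)
with recorder:
    query_engine.query("What does the summit cover?")

# Compare feedback scores across app versions to guide iteration.
print(Tru().get_leaderboard(app_ids=["rag_app_v1"]))
```
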
Workshop: Fine-Tune Your Own LLM to Rival GPT-4

Presenters:
Alex Sherstinsky, Developer Relations Engineer | Arnav Garg, Machine Learning Engineer | Abhay Malik, Product Manager, Predibase

About the Speakers:
Alex Sherstinsky is the Developer Relations Engineer at Predibase, with cross-functional responsibilities ranging from machine learning and data products engineering to developer community growth marketing. Most recently, Alex was a Staff Machine Learning and Data Products Engineer at GreatExpectations, Inc., providers of the leading open-source data quality framework. Alex earned a Ph.D. in machine learning from MIT and holds 7 U.S. patents.

Arnav is a Machine Learning Engineer at Predibase, where he works on scalable ML infrastructure and LLM capabilities like fine-tuning, and maintains and develops the open-source www.Ludwig.ai, a low-code deep learning framework. You can find his open-source contributions at github.com/arnavgarg1. Prior to Predibase, Arnav worked as a Machine Learning Scientist at Atlassian, focused on building ML-powered smart features for Confluence and Trello, notably recommendation systems.

Abhay is the product lead at Predibase, the first engineering platform for building with open-source AI, where he is responsible for helping set the strategic product direction and roadmap. Previously, he was a product lead at Aisera, an AI Service Management (AISM) platform, and the Co-Founder and CTO of BlitzIQ, an AI assistant for sales teams that was part of Y Combinator W19.

Technical Level: 6

Talk Abstract:
Last month, we launched LoRA Land, a collection of 25+ fine-tuned Mistral-7B models that outperform GPT-4 on a set of task-specific applications. Now you can build your own LoRA Land with the popular open-source LLM frameworks Ludwig and LoRAX.

In this workshop, we’ll provide an intro to fine-tuning, including use cases, best practices, and techniques for efficient fine-tuning with LoRA. Then you’ll get hands-on with Ludwig, the declarative framework for building custom AI models, to easily and efficiently fine-tune your own set of small task-specific models that rival GPT-4. Along the way, we will also show you how to use LoRAX to serve your base model and then dynamically serve many fine-tuned adapters on top in real time, all on a single GPU.

Workshop topics:

  • Intro to efficient fine-tuning techniques, best practices, and use cases
  • How we fine-tuned 25+ models that rival GPT-4 for less than $8 each
  • Hands-on Session Part 1: Fine-tune multiple adapters using Ludwig
  • Hands-on Session Part 2: Serve and prompt your base and fine-tuned LLMs with LoRAX
  • Wrap-up

By the end of the workshop, you will be able to use this framework to cost-effectively build your own production LLM applications.
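
As a preview of the two hands-on sessions, here is a rough sketch of a Ludwig LoRA fine-tuning config and a LoRAX request, based on the public Ludwig and LoRAX docs rather than the actual workshop notebooks. The dataset file, endpoint URL, and adapter_id are hypothetical placeholders, and exact config keys may vary by version.

```python
from ludwig.api import LudwigModel

# Declarative LoRA fine-tuning config for a Mistral-7B base model.
config = {
    "model_type": "llm",
    "base_model": "mistralai/Mistral-7B-v0.1",
    "adapter": {"type": "lora"},       # parameter-efficient fine-tuning
    "quantization": {"bits": 4},       # fit training on a single GPU
    "input_features": [{"name": "prompt", "type": "text"}],
    "output_features": [{"name": "response", "type": "text"}],
    "trainer": {"type": "finetune", "epochs": 3, "learning_rate": 0.0002},
}

model = LudwigModel(config=config)
model.train(dataset="task_data.csv")  # hypothetical prompt/response CSV

# LoRAX serves one base model and hot-swaps fine-tuned adapters per request.
# Hypothetical local endpoint; adapter_id points at your trained adapter.
import requests

resp = requests.post(
    "http://localhost:8080/generate",
    json={
        "inputs": "Summarize: LoRA Land fine-tunes small task-specific models.",
        "parameters": {"adapter_id": "my-org/task-adapter", "max_new_tokens": 64},
    },
)
print(resp.json()["generated_text"])
```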

What You’ll Learn:

  • Understand the benefits, use cases, and techniques for fine-tuning LLMs
  • How to efficiently fine-tune and serve your own open-source LLMs that rival GPT-4
  • How to use the popular open-source LLM frameworks Ludwig and LoRAX