Machine Learning REPA Week 2021

Free Online Conference on Machine Learning Engineering, MLOps and Management practices

05 - 11 April 2021

Our goal is

to share good practices and solutions for Machine Learning engineering and process automation, learn the best open-source tools, and share team collaboration and product management insights from across various industries and applications.

Main topics

How to organize your project and code?
How to strengthen your team collaboration?
How to manage project growth?

How to build and automate pipelines?
Version control for your code, data and pipelines
Tools and practices in machine learning applications
Reproducibility of machine learning pipelines

How to manage ML experiments and metrics tracking?
What tools to use?
Model Lifecycle and Development process

How to build a production ready solution with a model/pipeline you've developed?
What is MLOps and how to make it work
Build CI/CD for Machine Learning

Testing in Machine Learning
Deploying your solution is not the end of the story!
How to monitor that your model works appropriately?
Tools and integrations for monitoring deployed models

Program

Every day we have two parallel tracks. We start at 7:00 pm Moscow time (9:00 am Los Angeles time).

April 5

Track 1: Machine Learning Product and Team Management

7:00–7:20 pm

ML REPA Week 2021

Welcome Talk from organizers

7:20–8:00 pm

What is the Maturity Model in Data science?

No doubt you want to establish a strong, stable, automated, repeatable data science process. You may be wondering why the team cannot build ML models that anyone can reproduce from anywhere, instead of demonstrating a model on a single Data Scientist's computer.

Yuliya Rubtsova, PhD, Solution architect @ Datamonsters

Track 2: ML Pipelines Automation, Engineering & MLOps

7:00–7:20 pm

ML REPA Week 2021

Welcome Talk from organizers

7:20–8:00 pm

DVC: data versioning and ML experiments on top of Git

With the open-source tool DVC, hundreds of experiments can be tracked automatically. DVC tracks data, code, and metrics together, keeping the code and metadata in a Git repository while caching the data anywhere the user chooses.

Dmitry Petrov, Creator of DVC, Co-founder & CEO @ Iterative.ai

8:00–8:40 pm

Eliminate technical debt with iterative ML pipelines

Let's face it, the genesis of most ML models is usually with iterative Jupyter notebook based experimentation on a single dev machine, which slowly transitions towards production pipelines as it escalates in value. Which begs the question: Why not write these pipelines from day 0? To learn about why this does not happen, and perhaps how it might change in the future, tune into this talk.

Hamza Tahir, ZenML, Co-creator

8:40–9:20 pm

How to create your MLOps environment following best practices

This talk will help you understand Machine Learning Operations and how to build an MLOps architecture based on your needs and environment. Some tools will also be discussed during the session.

Mohamed Sabri, Data Science and MLOps specialist

April 6

Track 1: Machine Learning Product and Team Management

7:00–7:40 pm

AI development process: Common mistakes

In this talk I will cover the full development cycle of building ML models: data preparation and data version control, code version control, metrics tracking, and hyperparameter tuning.

Kseniia Melnikova, Product Owner (Data/AI), SoftwareOne

7:40–8:20 pm

The 3 components your "Agile AI" product development stack should include

What tools and techniques should I have in my toolbox when it comes to delivering AI products?

Ashley Beattie, Agile By Design - Head of DevOps Transformation

8:20–9:00 pm

Structuring machine learning projects

In this talk, we will cover a set of tools and principles that the speaker found useful with practical case breakdowns.

Yerzat Marat, Knowtions Research, Project manager

Track 2: ML Pipelines Automation, Engineering & MLOps

7:00–7:40 pm

Moving from prototype to production

This talk will be helpful for managers who work with machine learning and data science teams. It can give you some ideas on forming an effective ML engineering team and estimating its performance, and on how engineering practices can help you with data science projects.

Alexander Mokryak, ML Engineer @ Exness

7:40–8:20 pm

Reducing the distance between Prototyping and Production - Why obsessing over experimentation and iteration compounds ROIs

Case study: creating an ML system that personalizes ads and offers for users in real time

Soumanta Das, Yugen.ai, Co-Founder

8:20–9:00 pm

Building ML Pipelines with Dagster: The role of the orchestrator in machine learning

Dagster is an orchestrator that puts data at the center. While orchestrators typically focus on sequencing computations in production, Dagster brings orchestration to the entire ML development lifecycle

Sandy Ryza, Software Engineer @ Elementl, working on Dagster

9:00–9:40 pm

Workflow & MLOps for batch scoring applications with DVC, MLflow and Airflow

How to organize team workflow, automate pipelines and integrate tools? Let's discuss some approaches to Machine Learning experiment management and metrics tracking with DVC and MLflow. As the next step, we will apply Airflow to run models in production for batch scoring.

Mikhail Rozhkov, Co-Creator @ ML REPA, Solution Engineer @ Iterative.ai

April 7

Track 1: Machine Learning Product and Team Management

7:00–7:40 pm

Lean Data Science: agile practices for Data Science projects

How to come up with product hypotheses, prioritise them, and effectively manage work using Kanban and Scrum approaches

Askhat Urazbaev, Agile coach, Founder of LeanDS

7:40–8:20 pm

Creating a perfect backlog for ML project

When a team gets a product hypothesis for validation, a number of questions arise. How can the hypothesis be decomposed so that it can be worked on in parallel? Which tasks should the team do itself, and which should be assigned to another team? In this talk, we will consider an approach to decomposition that helps to answer these questions.

Alexey Mogilnikov, Senior Data Scientist at SBDA Group
Chief Methodologist at LeanDS

8:20–9:00 pm

Cool as ICE: adapting backlog prioritization for data science and ML products

I will share an adapted ICE prioritization technique for implementing a transparent DS hypothesis testing process, which allowed us to free up 50% of our team's resources and ensure that most initiatives lead to business value.

Vasilia Gainulina, Senior Product Manager, Beeline Big Data

Track 2: ML Pipelines Automation, Engineering & MLOps

7:00–7:40 pm

VCS repository structure for ML projects

This talk is about an essential task for IT projects: setting up a VCS repository. Although version control for ML projects is known to be difficult, the topic is rarely discussed.

Timur Dzhumakaev, MegaFon, Senior DS

7:40–8:20 pm

Automating Machine Learning with GitHub Actions & GitLab CI

How continuous integration systems like GitHub Actions and GitLab CI can be used to automate machine learning model training, testing, and reporting.

Elle O'Brien, Lecturer @ University of Michigan School of Information
Data Scientist @ Iterative, Inc

8:20–9:00 pm

MLOps and AutoML in Cloud-Native Way with Kubeflow and Katib

This talk will demonstrate how to use the main Kubeflow components.
In addition to that, this talk will explore the Kubeflow Katib component. Katib is a Kubernetes native system for AutoML and is agnostic to ML frameworks and programming languages.

Andrey Velichkevich, Senior Software Engineer at Cisco,
Contributor to Kubeflow

9:00–9:40 pm

Unifying MLOps with a Feature Store and Model Deployment

In this talk, we will cover four current problems in Machine Learning: reusing features and training sets, building time-consistent training sets for modeling, serving features to models in real time, and tracking data and model lineage.

Ben Epstein, Machine Learning Lead @ Splice Machine

April 8

Track 1: Machine Learning Product and Team Management

7:00–7:40 pm

5 Principles of LeanML

I will introduce LeanML in 5 key takeaways. LeanML is a framework that enables Data Science teams to build business-oriented data products through a deliberate process.

Laszlo Sragner, Hypergolic, Founder (quant research, mobile gaming, fintech NLP startup)

7:40–8:20 pm

Wrong but useful: turning ML models into ML products

In this talk, I will share a set of questions to think through when working on enterprise ML solutions to make sure your models get to production

Elena Samuylova, CEO & Co-founder Evidently AI

8:20–9:00 pm

How to become a good (team) manager in Machine Learning?

This talk focuses on simple tips and tricks for newcomers to the management path. We will not cover common practices. Instead, the talk will focus on the worst mistakes that new managers make and on simple things that help anyone become a good manager in a short time.

Alexander Moiseev, Head of Product Analytics, Capital Markets, Raiffeisenbank

Track 2: ML Pipelines Automation, Engineering & MLOps

7:00–7:40 pm

Evolutionary AutoML for complex pipelines using FEDOT Framework

I plan to talk about the AutoML solutions for classification, regression, clustering and time series forecasting implemented in FEDOT Framework (open source)

Nikolay Nikitin, Senior Research Fellow @ National Center for Cognitive Technologies, ITMO University

7:40–8:20 pm

Security in Machine Learning: Taxonomy and Applied Counterstrategies

This talk will discuss some adversarial attacks, their risks and chain of consequences and some simple countermeasures to avoid those attacks.

Flavio Clesio, Machine Learning Engineer in Berlin

8:20–9:00 pm

Writing reusable training pipelines for deep learning

Writing reusable training pipelines for deep learning: why and how?
In this talk, I'll show a working training pipeline for deep learning based on PyTorch Lightning as a wrapper over PyTorch code and Hydra for managing configuration files.

Andrey Lukyanenko, Data Scientist @ MTS AI
Kaggle Competition Master, Notebook 1st rank

April 9

Track 1: Machine Learning Product and Team Management

7:00–7:40 pm

Architecture of Machine Learning systems

Once an ML model leaves the notebook, the data scientist faces the question of how to integrate it: there are usually many possibilities, many decisions have to be made, and it is often unclear how to approach them. Software architecture is the discipline that is responsible for this.
The talk is less about technology and more about processes, strategies and people.

Michael Perlin, Machine Learning Engineer at Volkswagen

7:40–8:20 pm

Your AI Project is a QUEST. Don't get eaten by DRAGONS!

You're in big trouble when your team can't deliver a working AI solution on schedule.
There's a reason. Research projects may take 10 times longer than scheduled.
Join this session to upgrade your AI management skills to mythic levels, and let your team have fun and be productive at the same time.

Artemy Malkov, Data Monsters, CEO, Lecturer at MIPT, PhD

Track 2: ML Pipelines Automation, Engineering & MLOps (tutorials)

7:00–7:40 pm

Better Code Quality for Data Science

In this tutorial, I'll show the full pipeline of setting up code quality checks and tests in your repository. I'll focus on quick wins and solutions for common difficulties. These instruments seem simple, but they can help you not only improve your own skills but also work better as a team of Data Scientists.

Julia Antokhina, Data Scientist @ Mobile TeleSystems

7:40–8:40 pm

Reproducibility of ML solutions in seismic interpretation project

We will show which open-source tools our team has developed and how we organized the full cycle of work on the project, using the example of fault detection on seismic data.

Alexey Kozhevin, Data Scientist @ Gazprom Neft

8:40–9:40 pm

MLflow: creating experiments and logging metrics in Databricks and MLflow Tracking Servers

MLflow is a great open-source tool that lets you create and track ML experiments by logging all the necessary metrics from your code. I will show you the UI capabilities for ML management in both Databricks and MLflow Tracking servers. We will also discuss how an MLflow Tracking server is organised and how it can be deployed.

Elena Vilkova, ML Engineer @ ABN AMRO

April 10

Track 2: ML Pipelines Automation, Engineering & MLOps (tutorials)

7:00–8:00 pm

Evolutionary automation of ML pipelines with FEDOT Framework

This is a tutorial on how to build AutoML solutions for classification, regression, clustering and time series forecasting with the open-source FEDOT Framework.

Nikolay Nikitin, Senior Research Fellow @ National Center for Cognitive Technologies, ITMO University

8:00–9:00 pm

Develop End To End Scalable ML Pipeline With Kubeflow

85% of ML models are not used in production settings. In industry, if you do not test something in production settings, it does not generate value. Kubeflow is a platform that makes prototyping ML models easier and more scalable. We will learn the very basics of this platform.

Ritaban Chowdhury, Machine Learning Engineer @ RiiidLabs

9:00–10:00 pm

Flyte: Accelerate your ML and Data Workflows to production

In this tutorial/workshop we will provide a demonstration and walkthrough of

Getting started with Flyte
Leveraging a GitOps style workflow for developing ML/Data pipelines
Best practices to develop fast and productionize code effectively
Typical use cases and benefits of using pipelines
Reasons to consider Flyte as the orchestration tool for your Data and ML needs.
If time permits, we will also look at how to extend Flyte and how extending Flyte could benefit a user's workflow.

Yee Tong, Backend engineer from Lyft and Climate Corporation
Katrina Rogan, Backend engineer previously at Lyft and Google

April 11

Track 2: ML Pipelines Automation, Engineering & MLOps (tutorials)

7:00–8:00 pm

Kubeflow pipelines for Object detection models on the edge

This workshop will start with a small presentation of object detection models and which ones are most suitable for running on edge devices with real time inference speed.

Imad BEKKOUCH, Data Scientist @ Provectus

8:00–9:00 pm

From Jupyter Notebooks to Reproducible and Automated experiments with DVC in just 4 steps

This is a step-by-step guide on how to boost your ML projects and bring them closer to production. Start with good engineering practices and reproducible Machine Learning pipelines in 4 simple steps.

Mikhail Rozhkov, Co-Creator @ ML REPA
Solution Engineer @ Iterative.ai

Speakers

Track 1: Machine Learning Product and Team Management

Lean Data Science: agile practices for Data Science projects

Askhat Urazbaev, Agile coach, Founder @ LeanDS

In this talk, we will explore collaborative techniques that guide data science teams in their agile adoption. We will discuss how to come up with clear product hypotheses, how to prioritise them using the ICE/RICE method, how to decompose huge AI Epics into small, easy-to-validate data science hypotheses, and how to effectively manage work using Kanban and Scrum approaches.

Wrong but useful: turning ML models into ML products

Elena Samuylova, CEO & Co-founder Evidently AI

When working on machine learning projects, we often focus on technical challenges and building accurate models. However, a model itself is not a product. To solve the business problem at hand, we need to consider a wider set of requirements. In this talk, I will share a set of questions to think through when working on enterprise ML solutions to make sure your models get to production

How to become a good (team) manager in Machine Learning?

Alexander Moiseev, Head of Product Analytics, Capital Markets @ Raiffeisenbank

The key goal of a good manager is to empower the team to reach new heights.
Main tools: encourage expertise and knowledge sharing within the team, improve team members' visibility, and develop a culture of continuous learning.

This talk focuses on simple tips and tricks for newcomers to the management path. We will not cover common practices. Instead, the talk will focus on the worst mistakes that new managers make and on simple things that help anyone become a good manager in a short time.

Architecture of Machine Learning systems

Michael Perlin, Machine Learning Engineer @ Volkswagen

A happy moment: the ML model leaves the notebook to start benefiting the business. The data scientist faces the question of how to integrate it: there are usually many possibilities, many decisions have to be made, and it is often unclear how to approach them.
Software architecture is the discipline that is responsible for this. What does it involve? What skills and qualities does it require? Can a data scientist master it? Who to call for help?
This talk is less about technology and more about processes, strategies and people.

AI development process: Common mistakes

Kseniia Melnikova, Product Owner (Data/AI) @ SoftwareOne

Let's talk about the main problems and common mistakes of the AI dev process!

In this talk I will cover the full development cycle of building ML models: data preparation and data version control, code version control, metrics tracking, and hyperparameter tuning. We will discuss the weak points of each component and the mistakes that you and your team are probably making. I will provide solutions and a list of AI tools that will help you cover your process fully and in a more sophisticated way.

Structuring machine learning projects

Yerzat Marat, Project manager @ Knowtions Research

Yerzat has 4+ years of experience delivering AI products and projects across various industries. His main interest lies in AI product management and in how to deliver value most effectively using ML/DS, from concept through to production.

In this talk, we will cover a set of tools and principles that the speaker found useful with practical case breakdowns

The 3 components your "Agile AI" product development stack should include

Ashley Beattie, Head of DevOps Transformation @ Agile By Design

What tools and techniques should I have in my toolbox when it comes to delivering AI products?
1. What are the goals of an AI Product development system?
2. Creating Valuable AI Products: the techniques, tools and approaches to understand, design, develop and test an AI opportunity
3. Rapidly testing AI Product hypotheses: how to slice your opportunity to define an MVP that matters

5 Principles of LeanML

Laszlo Sragner, Founder @ Hypergolic

In this talk, I will introduce LeanML in 5 key takeaways. Lean Machine Learning (LeanML) is a framework that enables Data Science teams to build business-oriented data products through a deliberate process. LeanML was created to address the difficulties of dealing with data-centric workflows and outcomes. It is inspired by techniques and knowhow from disciplines in quant trading, business intelligence, agile software engineering and strategic consulting.

Your AI Project is a QUEST. Don't get eaten by DRAGONS!

Artemy Malkov, PhD
Data Monsters, CEO / Lecturer at MIPT / NVIDIA Elite Service Delivery Partner

You're in big trouble when your team can't deliver a working AI solution on schedule.
There's a reason. Research projects may take 10 times longer than scheduled.
Learn about QUEST, MISSION, Decision-To-Be-Made, 3D-MAPping, RAIDs, CRAFT, LABYRINTH, DRAGONS and TREASURE - new lean data science tools and frameworks.
Join this session to upgrade your AI management skills to mythic levels and let your team have fun and be productive.

Track 2: ML pipelines automation. Code and Data version control. Reproducibility. MLOps

DVC: data versioning and ML experiments on top of Git

Dmitry Petrov, Creator of DVC - Data Version Control - Git for machine learning. Now co-founder & CEO of Iterative.ai. Ex-Data Scientist at Microsoft. PhD in Computer Science.

ML practitioners rapidly experiment to optimize for the best results or analyze
different subsets of data. Experiments need to be reproducible, both to recover
and tweak experiments and to instill confidence in the final results.
Reproducibility across experiments becomes more difficult as the data size and
project complexity increase. The data and code to generate each experiment must
be tracked, and running the entire pipeline from scratch may be infeasible.

With the open-source tool DVC, hundreds of experiments can be tracked
automatically. DVC tracks data, code, and metrics together, keeping the code and
metadata in a Git repository while caching the data anywhere the user chooses.
This approach scales with large data and complex projects, ensuring fully
reproducible results that make experimentation efficient and easy.

Building ML Pipelines with Dagster:
The role of the orchestrator in machine learning

Sandy Ryza, Software Engineer at Elementl, working on Dagster.

There would be no machine learning models without features, and there would be no features without data pipelines. Orchestrators help data scientists and ML engineers assemble durable pipelines out of the data transformations that define their features. Dagster is an orchestrator that puts data at the center. While orchestrators typically focus on sequencing computations in production, Dagster brings orchestration to the entire ML development lifecycle. It helps engineers and data scientists answer questions like:

How will a change to how I model my data affect the performance of my ML model?
What data and code were used to train this model?
How can I test my ML pipelines?
How can I try out changes to my feature without messing up my production data?

Evolutionary automation of ML pipelines with FEDOT Framework

Nikolay Nikitin, Senior Research Fellow
@ National Center for Cognitive Technologies, ITMO University

I plan to talk about the AutoML solutions for classification, regression, clustering, and time series forecasting implemented in the open-source FEDOT Framework (https://github.com/nccr-itmo/FEDOT).
The framework allows building modeling pipelines with a heterogeneous structure that can consist of blocks of different types (for example, ML models, equation-based models, NLP models, neural networks, data preprocessing blocks, and even atomized pipelines) and can have a multiscale or multimodal nature (for example, a model predicting different components of a time series separately can be built automatically for a time series forecasting task). The framework also makes it possible to "export" the obtained model and data in order to improve the reproducibility of AutoML-based experiments.

Automating Machine Learning with GitHub Actions & GitLab CI

Elle O'Brien, Lecturer @ University of Michigan School of Information
Data Scientist @ Iterative, Inc

Machine learning is maturing as a discipline: now that it's trivially easy to create and train models, it's never been more challenging to manage the complexity of experiments, changing datasets, and the demands of a full-stack project. In this talk, we'll examine why one of the staples of DevOps, continuous integration, has been so challenging to implement in ML projects so far and how it can be done using open-source tools like Git, GitHub Actions, and DVC (Data Version Control).
We'll also discuss a new open source project (Continuous Machine Learning) created to adapt popular continuous integration systems like GitHub Actions and GitLab CI to data science projects. We'll cover example use cases, including automated model testing in a standardized environment, getting detailed reporting on model behavior in a pull request, and training models on cloud GPUs.

Develop End To End Scalable ML Pipeline With Kubeflow

Ritaban Chowdhury, Machine Learning Engineer @ RiiidLabs

I am going to talk about how to productionize an ML product using Kubeflow.

85% of ML models are not used in production settings. In industry, if you do not test something in production settings, it does not generate value. Kubeflow is a platform that makes prototyping ML models easier and more scalable. We will learn the very basics of this platform.

Better Code Quality for Data Science

Julia Antokhina, Data Scientist at Mobile TeleSystems

What you'll learn?
In this tutorial, I'll show the full pipeline of setting up code quality checks and tests in your repository. I'll focus on quick wins and solutions for common difficulties. These instruments seem simple, but they can help you not only improve your own skills but also work better as a team of Data Scientists.

Reproducibility of ML solutions in seismic interpretation project

Alexey Kozhevin, Data Scientist at Gazprom-Neft.
Solving seismic interpretation tasks with neural networks

Reproducibility of ML solutions in seismic interpretation is important at all stages of work: from data loading and deploying a production environment to model training and metrics evaluation.

We will show which open-source tools our team has developed and how we organized the full cycle of work on the project, using the example of fault detection on seismic data.

Kubeflow pipelines for Object detection models on the edge

Imad Eddine Ibrahim BEKKOUCH
Provectus, Data Scientist
PhD at Sorbonne University Paris (attending)

This workshop will start with a short presentation of object detection models and which ones are most suitable for running on edge devices with real-time inference speed. The next step is to configure and run a Kubeflow pipeline locally and configure the hyperparameters used for model training. The last step is to review the results of several experiments and compare the best models.

Why you should start writing ML pipelines from training day 0

Hamza Tahir, ZenML, Co-creator

What you'll learn?
We will learn why ML pipelines are important, why they are hard to create, and why it's valuable to have them written as early as possible while developing ML models.
The intended audience is data scientists and ML engineers who are interested in bridging the experimentation and production phases of the machine learning development lifecycle.

What is unique about this talk?
Bridging the gap between the experimentation and the productionalization phase of the machine learning workflow

MLflow: creating experiments and logging metrics in Databricks and MLflow Tracking Servers

Elena Vilkova, ABN AMRO, ML Engineer

I'm working as a Platform and ML Engineer in the Dutch bank ABN AMRO to enable Advanced Analytics projects. Our purpose is to build a robust and standard platform that DSs, DAs and MLEs can use for a fast full ML lifecycle from exploration to production

What you'll learn?
Each ML model should be fully controlled! After the tutorial you will understand how MLflow works and how to use MLflow Tracking for logging the parameters and metrics of your ML model. We will look at the difference between models, experiments and runs. I will show you the UI capabilities for ML management in both Databricks and MLflow Tracking servers. We will also discuss how an MLflow Tracking server is organised and how it can be deployed.

Writing reusable training pipelines for deep learning

Andrey Lukyanenko, MTS AI, Data Scientist
4 years as ERP-consultant, 4 years as Data Scientist
Kaggle Competition Master, Notebook 1st rank

What you'll learn?
Attendees will learn when writing a custom training pipeline is worth the effort and what important functionality it should have, and will see an example of such a pipeline.
This talk may interest those who have already started training deep learning models and want to make training more systematic.

What is unique about this talk?
In this talk, I'll show a working training pipeline for deep learning based on PyTorch Lightning as a wrapper over PyTorch code and Hydra for managing configuration files.

Personal website: https://andlukyane.com/activities
Twitter: https://twitter.com/AndLukyane

VCS repository structure for ML projects

Timur Dzhumakaev, MegaFon, Senior DS
Experience: nearly 5 years in the field in various roles, including backend developer, ML engineer and data scientist

What you'll learn?
Data scientists can learn about the benefits of VCS, which can store results of many experiments and enhance communication both within and outside the team.
ML engineers and developers can learn to formulate their requirements for data scientists better; with the right requirements, they can focus more on performance optimization and less on refactoring.

What is unique about this talk?
This talk is about an essential task for IT projects: setting up a VCS repository. Although version control for ML projects is known to be difficult, the topic is rarely discussed.

Reducing the distance between Prototyping and Production - Why obsessing over experimentation and iteration compounds ROIs

Soumanta Das, Yugen.ai, Co-Founder
7+ years (Data Science Consulting, Product, ML Systems)

A case study on how setting Minimum Achievable Goals and continuous improvement can help realize value and establish a strong ML engineering culture.

What you'll learn?

Bridging the gap from Prototype to Production by keeping small, achievable impact goals
How frequent iteration and continuous experimentation can increase ROIs and set the path for a strong ML engineering culture
How to balance model development, deployments, architecture improvements and monitoring keeping in mind business goals

Who should attend your talk/tutorial?

Data Scientists and ML Engineers looking to learn how to prioritize their efforts and workflows
Hopefully anyone working at a startup trying to build an ML team can gain from our experiences

From Jupyter Notebook to Reproducible and Automated experiments & MLOps for batch scoring applications

Mikhail Rozhkov, Co-Creator @ ML REPA, Solution Engineer @ Iterative.ai

I'm a Co-creator of the Machine Learning REPA project and ML REPA School, and the author of online courses on ML experiment automation and MLOps with DVC. I have over 6 years of hands-on experience in Machine Learning & Data Science, lead projects, and help teams implement good tools and engineering practices. Recently I joined the Iterative.ai team as a Solution Engineer.

What you'll learn?

How to organize a team workflow
How to work with DVC, MLflow, and Airflow together
How to organize a basic configuration file for ML projects
How to automate ML experiments with DVC

Who should attend your talk/tutorial?

Data Scientists and ML Engineers
Team Leads and ML Project Managers

Flyte: Accelerate your ML and Data Workflows to production

Katrina Rogan, Backend engineer previously at Lyft and Google

Experience working on data pipelines for mapping, travel search and ad performance reporting

What you'll learn?
Come to learn about Flyte (https://flyte.org), a modern approach to cloud-native orchestration that enables and accelerates reproducible, consistent and scalable pipelines from local to production.

Attend if:

You run production pipelines
You want to use a cloud-native pipeline orchestration engine that has been used in production at scale at large companies like Lyft, Spotify, Freenome, etc.
You want to learn about https://flyte.org and the benefits of using a specification-based pipeline engine
You want to run executions independently within an organization and not worry about resource scaling, etc.
You want to use Kubernetes and Docker in a simple, easy-to-use way, with no YAML files or complicated APIs to write

Flyte: Accelerate your ML and Data Workflows to production

Yee Tong, Backend engineer from Lyft and Climate Corporation.

Passion for functional programming and orchestration. Deep experience with US mortgage market. Seattle native and involved with Flyte since inception.

MLOps and AutoML in Cloud-Native Way with Kubeflow and Katib

Andrey Velichkevich, Senior Software Engineer at Cisco

Andrey Velichkevich is a Senior Software Engineer at Cisco and is one of the major contributors to the Kubeflow open-source project.

He is a co-chair of the AutoML working group and co-lead of the Training working group. Andrey hosts Kubeflow community meetings for the AutoML and Training working groups, organises community webinars and writes blog posts. In addition, Andrey helps the community drive the CI/CD infrastructure and contributes to the ML benchmark system for Kubeflow.

Registration

We are going to use the ML REPA School platform to host our conference online. Please register and book your place at Machine Learning REPA Week 2021!

Organizers

Our partners

email: info@ml-repa.ru
telegram: t.me/mlrepa

See you at ML REPA Week 2021!