How to Machine Learn-ify any Product

This talk is a walkthrough of using machine learning to replace a rule-based system for consumers. We will discuss when it is okay to use ML, how to build these models with intelligent data, how to evaluate them offline, and finally how to validate that evaluation to land the models in production systems. Furthermore, we will illustrate various self-learning and interactive-learning strategies that production systems can use to automate how models teach themselves to become better.

This talk was presented at ML conf EU 2020.

FAQ

A product is considered 'ready for ML' if simple rules cannot adequately solve its problems and if the solution needs to be generalized to a large scale beyond handling just a few cases.

To determine if a problem can be solved with simple rules, consider if the problem can be addressed by a straightforward decision or threshold. If the problem requires more complex decision-making or data interpretation, ML might be necessary.

The ML model development cycle includes data collection, deciding on the ML model to use, training the model, evaluating it through both offline and online metrics, and ongoing maintenance such as active learning.

Facebook used ML in their Portal product to enhance the calling feature, enabling the device to predict and understand who the user intends to call based on voice commands, even distinguishing between multiple contacts with the same name.

Gradient Boosted Decision Trees (GBDT) are an ensemble of regression trees used for classification tasks. Facebook chose GBDT for its reliability and effectiveness in handling complex decision-making processes with categorical and discrete features.

In Facebook's application, particularly for calling features, high precision is crucial to ensure that user commands are executed correctly and to avoid errors like calling the wrong person, which can lead to user dissatisfaction and privacy concerns.

Working with ML at scale involves ensuring data privacy, handling large volumes of data, maintaining model performance across diverse user interactions, and continuously updating models to adapt to new data and features.

Shivani Poddar
33 min
02 Jul, 2021

Video Summary and Transcription

The video explains how to implement machine learning (ML) in products, focusing on Facebook Portal's calling feature. It discusses the importance of determining whether a problem can be solved with simple rules or requires ML, especially for large-scale applications. The ML model development cycle includes data collection, feature and label setup, and training. Gradient Boosted Decision Trees (GBDT) were used for their reliability. The video highlights the importance of precision in ML models to avoid errors like calling the wrong person. It also covers the challenges of working with ML at scale, including data privacy and continuous model retraining. Online evaluation and A/B testing are essential for ensuring model performance. The talk also touches on how to handle label delay through data augmentation and self-learning.

1. Introduction to ML and Product Readiness

Short description:

Hi everyone. I'm Shivani, an ML engineer at Facebook. In this talk, I'll guide you on when to use ML and share a successful use case from Facebook. To determine if your product is ready for ML, consider two questions: Can your problem be solved with simple rules? What is the scale of your problem? For example, classifying apples from oranges may only require a color filter for a small user base. But if you need to classify different types of oranges and apples, you'll need more than just color. If both criteria are met, ML is needed.

Hi everyone. I'm Shivani. I work as an ML engineer at Facebook, and today I'm going to share with you how to MLify almost any product. This is going to be a more practical talk, where I'll walk you through when it is okay to use ML. We are going to discuss a use case in which I used ML at Facebook, and did so successfully, and I'm going to walk you through the cycle of an ML model's development.

Cool, so the first question that we have to answer is: is your product really ready for ML? One of the biggest mistakes people make is thinking that ML can be plugged into anything and that any problem can be solved with ML. I think there are really two questions that you want to answer. The first one is: can your problem be solved with simple rules? Can you just think of a threshold, or is it a binary decision that a simple rule can make? The second one is: what is the scale of your problem? Do you need to generalize your solution to a lot more people than a few hundred?

One example: say I want to classify apples versus oranges, and my whole product is just about telling apples from oranges. Having a small filter that says orange is an orange and red is an apple would be a reasonable approach if I have 20 users using my product. That does not yet justify needing ML. However, if I were to classify different types of oranges, which could also be reddish, and different types of apples, which could also be orange, I would need more than just color as a rule. I would probably need the shape. I would probably want to employ some computer vision techniques, and so on. And so if you can answer both in the affirmative, that you do need more than just simple rules and your problem is ready to scale, you need ML for your product.
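
To make the "simple rule" case concrete, here is a minimal Python sketch of a color filter of the kind described above. The `Fruit` type, the RGB representation, and the threshold of 100 on the green channel are all assumptions chosen for illustration; nothing here comes from the talk itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fruit:
    # Hypothetical representation: dominant color as an RGB triple (0-255).
    red: int
    green: int
    blue: int


def classify_by_color(fruit: Fruit) -> str:
    """Single-threshold rule: little green means apple, otherwise orange."""
    # Assumed threshold: orange hues carry a substantial green component,
    # while red hues do not.
    return "apple" if fruit.green < 100 else "orange"


red_apple = Fruit(200, 30, 40)
navel_orange = Fruit(255, 165, 0)
reddish_orange = Fruit(200, 60, 30)  # a reddish orange variety

print(classify_by_color(red_apple))       # apple
print(classify_by_color(navel_orange))    # orange
print(classify_by_color(reddish_orange))  # apple (the rule gets this one wrong)
```

The last call shows where a single color rule stops being enough: a reddish orange falls on the wrong side of the threshold, which is the point at which more features than color are needed.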

2. Using ML for Facebook Portal Calls

Short description:

Let's walk through a real-life scenario of using ML for Facebook Portal. The goal was to make precise calls by predicting the intended recipient. Initially, rule-based selection was used, but it was cumbersome. ML was leveraged to learn from the data distribution and overcome the limitations of rules. The ML model development lifecycle involves data collection, setting up feature and label sets, and using organic data from pre-existing product interactions. Features for organic data collection include ASR confidence scores.

Let's now walk through a real-life scenario of how we used this for Facebook Portal. I was working on the Facebook Portal team, and one of our hero features was calling. The user would come and say, "Hey Portal, call John," and the idea would be for the device to understand who John is in your friend list. If there are multiple Johns, it should disambiguate who the right John is, and then place a call to that person. Note here that the cost of failing is high, because you end up calling the wrong person and leaving a missed call. So really the only option here is to be very precise.

When we started out, the flow was that the user would initiate this command, Portal would work out who the most likely John is, and this was simply rule-based. We would pick the top contact that we got, and then we would issue a confirmation prompt. If the user said, "Yes, I confirm, call them," we would call them; otherwise we wouldn't. But this was a very tacky, long process, right? The user had to come in, select and confirm who they were calling, and oftentimes engage by touch in the UI. That would just take away from the experience of the user interacting hands-free with this smart device. And so we had to tackle this problem of: given "John", how can I predict who the actual John is, so that the user does not have to come in and do all of this work themselves?
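
The rule-based flow described above (pick the top contact, then ask for confirmation) can be sketched roughly as follows. This is an illustration only: the function names, the use of Python's `difflib.SequenceMatcher` as the name-matching rule, and the `confirm` callback are assumptions, not the actual Portal implementation.

```python
from difflib import SequenceMatcher
from typing import Callable, Optional, Sequence


def name_similarity(spoken: str, contact: str) -> float:
    """Similarity in [0, 1] between the recognized name and a contact name."""
    return SequenceMatcher(None, spoken.lower(), contact.lower()).ratio()


def pick_top_contact(spoken: str, contacts: Sequence[str]) -> Optional[str]:
    """The single rule: take the contact whose name matches best."""
    if not contacts:
        return None
    return max(contacts, key=lambda c: name_similarity(spoken, c))


def handle_call_command(
    spoken: str,
    contacts: Sequence[str],
    confirm: Callable[[str], bool],
) -> Optional[str]:
    """Return the contact to call, or None if there is no confirmed match."""
    candidate = pick_top_contact(spoken, contacts)
    if candidate is None:
        return None
    # The extra confirmation step is what made the original flow slow.
    return candidate if confirm(candidate) else None


contacts = ["John Smith", "Jane Doe", "Joan Rivers"]
print(handle_call_command("John", contacts, confirm=lambda name: True))
```

Every call in this sketch goes through `confirm`, which mirrors the friction described above; each answer to that prompt is also a natural label for whether the top contact was the right one.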

So there are a lot of rules that we could have used. We could look at the similarity score, whether the recognized name matched the name of the contact. We could use the ASR confidence score (ASR is the speech recognition system), that is, whether it understood "John" correctly. We could also use the relationship of the user to the person they were calling. Naturally, for some users, a family member would be the more likely person to call, and for other users, if they were calling a John they messaged or called frequently, it would be more likely that they're calling that same John. It could also vary with the time of day, and with whether they actually gave the command or not. A lot of users would be talking to somebody else, and sometimes ASR incorrectly picks that up as the user trying to call this person. So what is the probability of noise? What is the frequency with which they talk to this one person? What are the scores of all of the upstream modules that essentially translate speech to text?

All of these were a whole set of rules, right, and they could not be abstracted into one single rule for our purposes. Since this is not just one rule, and we needed the model to learn not from a single flip-flop rule but from a data distribution, we decided to leverage ML for this problem. And so this is the problem I'm going to walk you through, through the life cycle of ML model development.

This is what it looks like. You've answered this question for your product and decided that you do need ML, so what does model development look like? It starts with data collection. Data collection is setting up the right feature and label sets for your model training. This data can be organic, or it can be artificial. In our case, for the example I just gave you, since we were already using confirmation prompts from the users to decide whether or not the contact was the right contact, we already had organic labels collected for our features to solve this ML problem. So one can either use the pre-ML era data, anonymized from pre-existing user interactions, to train the model, or, in other scenarios, one can create tools to collect data for the problem at hand.

For the purposes of this talk, we are going to focus on organic data collection: how you can get data from an already existing product and how you can then use that data to inject ML into your product. And so this is what our features look like. For our organic data collection, our features would include the ASR confidence score: if a user says something, our ASR or speech recognition engine converts that into text and generates a confidence score with it.
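
As a rough illustration of organic data collection, the sketch below turns logged confirmation prompts into feature rows and labels. The record layout, the field names (`asr_confidence`, `name_similarity`, `is_family`, `calls_last_30_days`, `hour_of_day`), and the convention that an unanswered prompt is stored as `None` are hypothetical; they are loosely based on the signals mentioned in the talk, not on the actual Portal feature set.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class CallFeatures:
    # Hypothetical feature set, one row per candidate contact.
    asr_confidence: float      # speech recognition confidence score
    name_similarity: float     # how well the recognized name matches the contact
    is_family: bool            # relationship of the user to the contact
    calls_last_30_days: int    # how often the user talks to this contact
    hour_of_day: int           # when the command was given (0-23)


@dataclass(frozen=True)
class PromptLog:
    features: CallFeatures
    user_confirmed: Optional[bool]  # None means the prompt was never answered


def build_training_set(
    logs: Sequence[PromptLog],
) -> Tuple[List[List[float]], List[int]]:
    """Convert answered confirmation prompts into (features, labels)."""
    rows: List[List[float]] = []
    labels: List[int] = []
    for log in logs:
        if log.user_confirmed is None:
            continue  # no organic label available for this interaction
        f = log.features
        rows.append([
            f.asr_confidence,
            f.name_similarity,
            float(f.is_family),
            float(f.calls_last_30_days),
            float(f.hour_of_day),
        ])
        labels.append(1 if log.user_confirmed else 0)
    return rows, labels


logs = [
    PromptLog(CallFeatures(0.9, 1.0, True, 12, 19), True),
    PromptLog(CallFeatures(0.4, 0.5, False, 0, 3), False),
    PromptLog(CallFeatures(0.7, 0.8, False, 2, 10), None),
]
X, y = build_training_set(logs)
print(len(X), y)
```

The resulting `X` and `y` are plain lists that could be handed to a model trainer of choice. In a real system the logs would first be anonymized, as the talk notes for pre-existing user interaction data.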

QnA