Like many companies, online grocery delivery service Instacart has spent the past few months overhauling its machine-learning models because the coronavirus pandemic has drastically changed how customers behave.
Starting in mid-March, Instacart’s all-important technology for predicting whether certain products would be available at specific stores became increasingly inaccurate. Instead of being correct 93% of the time, it dropped to 61%. This is a problem because customers get annoyed when they are told that an item they want is available and it turns out not to be, which means the product is never delivered. “A shock to the system” is how Instacart’s machine learning director Sharath Rao described the problem to Fortune.
The reason is that the data Instacart fed into its machine learning models about shopping habits failed to take into account the new coronavirus reality. Normally, Instacart customers would buy products like toilet paper only occasionally. But then, almost overnight, it was like they were preparing for a months-long camping trip and wiped out store supplies. Many customers stockpiled bathroom tissues, wipes, and hand sanitizer gel, as well as staple foods like eggs and cheese.
Rao explained several things Instacart did to fix the problem. They offer lessons for other businesses that use machine learning.
Instead of training a machine-learning model based on several weeks of data (in this case, the items that delivery people mark as “found” or “not found” at stores), Instacart now uses up to ten days of data. While weeks of data may provide insights about long-term trends, sifting through data only from recent days provides more accurate results because people’s shopping habits are in flux. As Rao explained, Instacart had to make a tradeoff between the volume of data used to train its model and the “freshness” of that data.
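The tradeoff between volume and freshness can be illustrated with a small sketch. It is not Instacart's model: it stands in a plain found-rate average for the real prediction system, and the function names, the record layout, and the sample history are all hypothetical. Only the ten-day window comes from the reporting above.

```python
from datetime import date, timedelta


def recent_window(observations, today, max_age_days=10):
    """Keep only observations from the last max_age_days days (inclusive)."""
    cutoff = today - timedelta(days=max_age_days)
    return [obs for obs in observations if obs["day"] >= cutoff]


def found_rate(observations):
    """Fraction of observations marked as found, or None if there are none."""
    if not observations:
        return None
    return sum(obs["found"] for obs in observations) / len(observations)


today = date(2020, 6, 5)

# Hypothetical history for one item at one store: reliably found three to
# four weeks ago, mostly missing over the last ten days.
observations = (
    [{"day": today - timedelta(days=d), "found": True} for d in range(20, 30)]
    + [{"day": today - timedelta(days=d), "found": d % 4 == 0} for d in range(0, 10)]
)

long_rate = found_rate(observations)                         # all the data
fresh_rate = found_rate(recent_window(observations, today))  # last ten days only

print(f"all data: {long_rate:.2f}, last ten days: {fresh_rate:.2f}")
```

In this made-up history the longer window still reports the item as usually available, while the shorter window reflects the recent shortage. The cost is that the shorter window has half as many observations to learn from.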
In the last week alone, the nationwide protests over the death of black Minneapolis resident George Floyd while in police custody had a big impact on shopping patterns, whether grocery stores were open, and whether they were fully stocked.
“The world is changing so fast,” Rao said. “Every day looks like a Monday for Instacart,” he added, referring to how every day has become as busy as the start of the week.
Instacart also increased how often it “scores” its model for predicting the likelihood that a certain product will be in stock. Before, Instacart would typically score its model (based on hundreds of millions of items) every three hours, but it now does so every hour to better take into account the fast-changing world. A lower “score” for an item, such as a case of soda, indicates a lower chance that it will be at the store when the delivery person arrives, prompting Instacart to suggest that users mark a potential replacement item.
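How a low score might turn into a replacement prompt can be sketched in a few lines. The 0.5 threshold, the function name, and the sample scores are assumptions made for illustration; the article does not say what cutoff Instacart uses or how its scores are scaled.

```python
def needs_replacement_prompt(score, threshold=0.5):
    """Return True when the predicted in-stock score falls below the threshold.

    The threshold is a hypothetical value, not one reported by Instacart.
    """
    return score < threshold


# Hypothetical scores from the most recent hourly scoring run.
scores = {"case of soda": 0.35, "eggs": 0.90}

to_prompt = [item for item, score in scores.items() if needs_replacement_prompt(score)]
print(to_prompt)
```

Scoring every hour rather than every three hours means the values fed into a check like this are at most an hour old.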
Instacart also performed a wonky task known as “hyper-parameter optimization,” which required its machine learning engineers to adjust certain settings of the model that influence the accuracy of its predictions. Although this task requires machine learning expertise, it can be likened to someone “pushing buttons” to get a device to work properly, Rao explained. Think of an airplane pilot who knows how to “press the right buttons” in a complex aircraft control system to ensure a smooth landing during a sudden storm.
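The “pushing buttons” idea can be shown with a toy grid search. This is a minimal sketch, not Instacart's procedure: the predictor is a simple exponentially weighted average, its smoothing factor `alpha` plays the role of the hyperparameter, and the candidate values and the sample history are invented. Each candidate is judged by how well it predicts the next observation from earlier ones only.

```python
def one_step_predictions(history, alpha):
    """Predict each outcome from an exponentially weighted average of earlier ones."""
    estimate = history[0]
    predictions = []
    for outcome in history[1:]:
        predictions.append(estimate)  # predict before seeing the outcome
        estimate = alpha * outcome + (1 - alpha) * estimate
    return predictions


def mean_squared_error(history, alpha):
    """Average squared gap between one-step-ahead predictions and outcomes."""
    predictions = one_step_predictions(history, alpha)
    actual = history[1:]
    return sum((p - a) ** 2 for p, a in zip(predictions, actual)) / len(actual)


def grid_search(history, candidates):
    """Try every candidate setting and keep the one with the lowest error."""
    return min(candidates, key=lambda alpha: mean_squared_error(history, alpha))


# Hypothetical history: 1 = found, 0 = not found. The item is in stock,
# then abruptly disappears from shelves.
history = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]

best_alpha = grid_search(history, [0.1, 0.5, 0.9])
print(best_alpha)
```

On this abrupt shift the search favors the largest smoothing factor, the setting that reacts fastest to new data. A setting tuned on calm, pre-pandemic data would likely have been slower to react, which is one reason retuning matters when behavior changes suddenly.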
For all of its efforts, Instacart’s technology still isn’t as accurate as before the pandemic. It’s now correct about 85% of the time, according to an Instacart graph, underscoring the challenge of fine-tuning a machine learning model amid uncertainty.