Raga Kolli, Giselle Lewars, Emily Shaw, and I were brainstorming one Fall evening around dinner time about how to combine datasets in the wild to reach new and interesting insights. As we got hungrier, the conversation drifted towards the topic of food. Suddenly, we were struck with inspiration.
What if the weather affected what foods people were eating? What if we could help restaurant owners understand how to staff and stock their business based purely on next week’s weather forecast? So we dug in.
Collect Data
We collected a full calendar year of daily data so that seasonal cycles were accounted for. We pulled from the following sources and combined the results into a single data set:
- Weather - National Centers for Environmental Information
- Demand - Google Trends
- Cuisines - NYC Department of Health
We were able to collect the weather and cuisine data manually through web interfaces, but we wrote some custom code to pull from the Google Trends API.
Clean Data
How do you compare 1 degree of temperature change with 1 inch of precipitation? You normalize! How do you create a decision tree with a range of values from 0-100? You convert into categories! How do you control for seasonality? You create dummy variables! After spending some time cleaning our data, we were ready to move on to the analysis.
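A minimal sketch of those three cleaning steps in Python, for illustration only. The z-score normalization, the 33/66 category cutoffs, and the month-level dummies are assumptions; the post does not say which normalization, bins, or seasonal buckets were actually used.

```python
from statistics import mean, pstdev


def z_score(values):
    """Rescale a column to mean 0 and standard deviation 1 (assumed method)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no signal; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def to_category(score, cutoffs=(33, 66)):
    """Bucket a 0-100 score into three labels (cutoffs are hypothetical)."""
    low, high = cutoffs
    if score <= low:
        return "low"
    if score <= high:
        return "medium"
    return "high"


def month_dummies(month):
    """Return 11 indicator values for February..December.

    January is the baseline, so it is encoded as all zeros.
    """
    return [1 if month == m else 0 for m in range(2, 13)]
```

After normalizing, a one-unit change in temperature and a one-unit change in precipitation both mean "one standard deviation", which makes their coefficients comparable.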
Caveat: Google Trends reports demand relative to the time frame queried and will not provide daily granularity for a query longer than 6 months. Since we wanted an entire year, we had to stitch two separate 6-month data sets together. Because each half is scaled against its own peak, the two halves are not directly comparable. We controlled for this with a dummy variable, but it still skewed our results.
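One way to picture the stitching is sketched below. The function name, the `(day, score)` tuple layout, and the `second_window` column are all hypothetical; the only detail taken from the project is that a dummy variable marks which 6-month query a row came from.

```python
def stitch_halves(first_half, second_half):
    """Combine two 6-month Trends pulls into one list of rows.

    Each input is a list of (day, score) tuples. The second_window flag
    is the dummy variable: 0 for the first query, 1 for the second.
    """
    rows = []
    for day, score in first_half:
        rows.append({"day": day, "trend": score, "second_window": 0})
    for day, score in second_half:
        rows.append({"day": day, "trend": score, "second_window": 1})
    return rows
```

The dummy lets a model absorb a constant offset between the two windows, but it cannot fix a difference in scale, which is one reason the results can still be skewed.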
Develop Models
We used two primary models to interpret our data:
- Linear Regression per Cuisine
- Decision Tree per Cuisine
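To make "linear regression per cuisine" concrete, here is a small sketch that fits one least-squares line per cuisine against a single weather variable. It is a simplification: the real models presumably used several predictors at once, and the function names and data shapes here are assumptions made for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx  # assumes x is not constant
    intercept = mean_y - slope * mean_x
    return slope, intercept


def fit_per_cuisine(weather, demand_by_cuisine):
    """Fit a separate line for each cuisine's demand series."""
    return {
        cuisine: fit_line(weather, demand)
        for cuisine, demand in demand_by_cuisine.items()
    }
```

A slope near zero for a cuisine would suggest that its demand barely moves with that weather variable.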
Insights
We found that demand for certain foods appears to be seasonal and correlated with the weather (e.g. Mexican), while demand for others shows no correlation (e.g. Pizza).
For example, we found that 1 inch of precipitation increases demand for Soup by 5%. We also found that 1 degree of temperature change shifts Mexican food demand by 1%! As you may have predicted, we found that people always want to eat Pizza in NYC. How cool is that?!
However, our highest R-squared was about 0.5, meaning that even our best model explained only around half of the variation in demand, so it is difficult to draw strong conclusions from these results.
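For readers who have not met R-squared before, this short sketch shows how it is computed from actual and predicted values. It is the standard definition, not code from the project.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    # Error left over after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Error left over if we always guessed the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot
```

A value of 1 means the predictions match perfectly, and a value of 0 means the model does no better than always guessing the average.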
Productionalize
We imagined what an app might do if we were to move forward with this concept and provide a service to restaurant owners. It would likely take weather predictions for the next week and send notifications to the owner telling them how to scale their operations.
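A rough sketch of that notification logic might look like the following. Everything here is hypothetical: the function names, the forecast layout, the `precip_in` key, and the 5% alert threshold are invented for illustration. Only the "5% per inch of precipitation" figure for Soup, used in the example underneath, comes from our findings.

```python
def demand_change_pct(effects, forecast, baseline):
    """Estimated % change in demand versus a baseline day.

    `effects` maps a weather variable to % change in demand per unit.
    """
    return sum(effects[k] * (forecast[k] - baseline[k]) for k in effects)


def weekly_notifications(cuisine, effects, forecasts, baseline, threshold_pct=5.0):
    """Build one message per day whose expected change crosses the threshold."""
    notes = []
    for day, forecast in forecasts.items():
        change = demand_change_pct(effects, forecast, baseline)
        if abs(change) >= threshold_pct:
            direction = "up" if change > 0 else "down"
            notes.append(
                f"{day}: expect {cuisine} demand {direction} about {abs(change):.0f}%"
            )
    return notes
```

For example, with `effects={"precip_in": 5.0}`, a dry baseline, and 2 inches of rain forecast for Monday, the owner of a soup restaurant would get a note that demand is expected to be up about 10% that day.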
If you want to try out a prototype for yourself, click on your favorite cuisine below, enter your expected weather conditions, and see how the demand for that cuisine will change!
Conclusion
This was our first foray into combining these data sources to draw conclusions. If we were to do it again, we’d look for more reliable signals of food demand (Yelp, POS systems, etc.). We’d also try to pull larger data sets to help offset outliers.
Let us know if you have any questions in the comments!