Detailed analysis and code are on my Kaggle page: https://www.kaggle.com/anudeepvurity/beer-consumption-analysis
Overview of the data:
The dataset describes beer consumption in the city of Sao Paulo, Brazil. It has 7 columns and 365 records and is stored in CSV format. Our main objective is to find how beer consumption varies under different circumstances, such as the minimum or maximum temperature in a day, or weekdays versus weekends. We shall also look at how precipitation affects beer consumption. To make the project even more interesting, let us add the list of Sao Paulo state holidays in 2015, so that we can also see whether holidays affect consumption.
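As a rough illustration of how weekend and holiday flags could be attached to each day, here is a minimal sketch using only the Python standard library. It assumes the 365 records cover the calendar year 2015; the `HOLIDAYS_2015` set and the `day_flags` helper are hypothetical names, and the two dates in the set are placeholders rather than the actual holiday list used in the analysis.

```python
from datetime import date, timedelta

# Placeholder holiday dates for illustration only; the real analysis
# uses its own list of Sao Paulo state holidays for 2015.
HOLIDAYS_2015 = {date(2015, 1, 1), date(2015, 4, 21)}


def day_flags(day: date) -> dict:
    """Return weekend and holiday flags for one calendar day."""
    # date.weekday() counts Monday as 0, so 5 and 6 are Saturday and Sunday.
    is_weekend = day.weekday() >= 5
    is_holiday = day in HOLIDAYS_2015
    return {"date": day, "weekend": is_weekend, "holiday": is_holiday}


# Build one row of flags per day of 2015 (assumed to match the 365 records).
start = date(2015, 1, 1)
calendar = [day_flags(start + timedelta(days=i)) for i in range(365)]

weekend_days = sum(row["weekend"] for row in calendar)
holiday_days = sum(row["holiday"] for row in calendar)
print(weekend_days, holiday_days)
```

In a real notebook these flags would be joined to the consumption records by date, so that each record carries its own weekend and holiday indicator.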
Questions to analyze on this dataset:
• Does beer consumption vary when weather conditions change, for example in bad weather?
• Do weekends and weekdays play a role in beer consumption?
• Which months of the year have the highest beer consumption?
• Do national and state holidays play any role in beer consumption?
By answering the above questions we can understand the depth of the dataset.
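Most of these questions come down to comparing average consumption across groups of days. The sketch below shows one way to do that with the standard library; the `records` values are made up for illustration and are not taken from the dataset, and `mean_by` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Made-up (date, consumption) pairs; not values from the dataset.
records = [
    (date(2015, 1, 3), 30.0),  # Saturday
    (date(2015, 1, 5), 22.0),  # Monday
    (date(2015, 7, 4), 24.0),  # Saturday
    (date(2015, 7, 6), 18.0),  # Monday
]


def mean_by(rows, key):
    """Group consumption values by key(day) and return the mean per group."""
    groups = defaultdict(list)
    for day, consumption in rows:
        groups[key(day)].append(consumption)
    return {group: mean(values) for group, values in groups.items()}


# Weekend versus weekday, then month by month.
by_day_type = mean_by(records, lambda d: "weekend" if d.weekday() >= 5 else "weekday")
by_month = mean_by(records, lambda d: d.month)
print(by_day_type)
print(by_month)
```

The same helper could group by a holiday flag or a rainy-day flag by passing a different `key` function.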
Conclusion:
Beer consumption was highest in January and lowest in July. It is also clear that weekends (Saturdays and Sundays) played a crucial role in beer intake, whereas State_holidays did not play any significant role unless the holidays fell on weekends. Temperature played a vital role: as the temperature increases, beer consumption increases. Precipitation did not play a major role, even though there is a slight decline in beer consumption when precipitation increases. We used only Temperature median as the predictor (independent variable), as it has the higher correlation. Both Linear Regression and Regression Trees reached 96% on the Sao Paulo dataset. This kind of analysis could help restaurant businesses in Sao Paulo flourish. Please check the code on my Kaggle page.
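To make the temperature finding concrete, here is a minimal sketch of a one-variable least-squares fit with its R² score, written with the standard library only. The `temps` and `consumption` lists are made-up numbers, and `fit_line` and `r_squared` are hypothetical helpers, so the score printed here says nothing about the 96% figure from the actual analysis.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    my = mean(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up median temperatures and consumption values for illustration.
temps = [18.0, 21.0, 24.0, 27.0, 30.0]
consumption = [20.0, 22.5, 24.0, 27.5, 29.0]

slope, intercept = fit_line(temps, consumption)
score = r_squared(temps, consumption, slope, intercept)
print(round(slope, 3), round(intercept, 3), round(score, 3))
```

A positive slope corresponds to consumption rising with temperature. A library such as scikit-learn would normally be used for this, together with a held-out test set.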