Restaurant Survey Exploratory Data Analysis

Restaurant Survey Exploratory Data Analysis in Python

This analysis aims to:

Find out the average customer age and which age group gave us the best ratings.
Discover any relationships that exist between customer habits and customer ratings.
Assess the effect of customer marital status, age and habits on their budget.

KPIs

Overall Rating
Service Rating
Food Rating

Insights

These are the averages for all ratings attributes:

df_all_average_ratings=df[['Overall Rating','Service Rating','Food Rating']].mean()
print(df_all_average_ratings)

Average overall rating 3.225
Average service rating 3.230
Average food rating 3.220

The average customer age was 40.17. Most customers surveyed were in the ‘Twenties’ age group. The distribution is shown below.
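A minimal sketch of how these two figures can be computed. The small sample DataFrame is hypothetical and stands in for the survey data; it assumes the 'Age' and 'Age Group' columns created by the cleaning function.

```python
import pandas as pd

# Hypothetical sample rows; the real survey data has many more.
sample = pd.DataFrame({
    "Age": [19, 23, 27, 35, 52],
    "Age Group": ["Teens", "Twenties", "Twenties", "Thirties", "Fifties"],
})

# Mean age across all respondents.
average_age = sample["Age"].mean()

# Number of respondents in each age group, largest group first.
age_group_counts = sample["Age Group"].value_counts()

print(round(average_age, 2))
print(age_group_counts)
```

On the real data, the same two calls would be run on the DataFrame returned by the cleaning function.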

Customers in the Teens and Fifties age groups consistently gave the highest ratings of all age groups, while customers in the Thirties age group consistently gave lower ratings.
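A comparison like this can be made by grouping the ratings by age group. This is a sketch on hypothetical sample rows, not the survey data itself; the column names follow the ones used elsewhere in this post.

```python
import pandas as pd

rating_columns = ["Overall Rating", "Service Rating", "Food Rating"]

# Hypothetical sample rows for two age groups.
sample = pd.DataFrame({
    "Age Group": ["Teens", "Teens", "Thirties", "Thirties"],
    "Overall Rating": [5, 4, 2, 3],
    "Service Rating": [4, 4, 3, 2],
    "Food Rating": [5, 5, 2, 2],
})

# Mean of each rating attribute within each age group.
ratings_by_age_group = sample.groupby("Age Group")[rating_columns].mean()

print(ratings_by_age_group)
```

If 'Age Group' is a categorical column, as pd.cut produces, groupby may also list groups with no respondents; passing observed=True limits the result to groups that appear in the data.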

Customers who said that they smoke often gave considerably higher ratings than other customers. Their overall rating was 18% higher on average.
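One way to arrive at a percentage like this is to compare the mean rating of frequent smokers with the mean rating of everyone else. In the sketch below, the column name 'Smoker', the value 'Often' and the sample rows are all assumptions made for illustration; the baseline used in the analysis may differ.

```python
import pandas as pd

# Hypothetical sample rows; 'Smoker' and its values are assumed names.
sample = pd.DataFrame({
    "Smoker": ["Often", "Often", "Never", "Socially"],
    "Overall Rating": [4, 5, 3, 3],
})

# Split respondents into frequent smokers and everyone else.
is_frequent_smoker = sample["Smoker"] == "Often"
smoker_mean = sample.loc[is_frequent_smoker, "Overall Rating"].mean()
other_mean = sample.loc[~is_frequent_smoker, "Overall Rating"].mean()

# Relative difference of the smoker mean against the mean of other customers.
percent_difference = (smoker_mean - other_mean) / other_mean * 100

print(f"{percent_difference:.1f}%")
```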

Divorced customers had a 28% lower budget on average.
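A figure like this depends on the baseline it is measured against. The sketch below compares each marital status group with the overall mean budget, which is only one possible choice. The column names 'Marital Status' and 'Budget' and the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample rows; column names are assumed.
sample = pd.DataFrame({
    "Marital Status": ["Single", "Married", "Divorced", "Divorced"],
    "Budget": [5, 5, 3, 3],
})

# Mean budget across all respondents, used here as the baseline.
overall_mean_budget = sample["Budget"].mean()

# Mean budget within each marital status group.
budget_by_status = sample.groupby("Marital Status")["Budget"].mean()

# Each group's mean budget relative to the overall mean, in percent.
percent_vs_overall = (budget_by_status - overall_mean_budget) / overall_mean_budget * 100

print(percent_vs_overall)
```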

Recommendations

Improve ratings among core customer age groups such as the Twenties and Thirties, which should push the average ratings up.
Investigate the reason for high ratings among the smoker demographic.

Code Snippets

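import pandas as pd

def read_n_clean(url):
    """Read the survey CSV and return a cleaned DataFrame with Age and Age Group columns."""
    df = pd.read_csv(url)
    # Strip trailing whitespace from headers (e.g. 'Alcohol ' becomes 'Alcohol').
    print("Removing whitespace from headers...")
    df = df.rename(columns=lambda x: x.rstrip())
    # Fix inconsistent spellings in the Location column.
    df['Location'] = df['Location'].replace({
        'Central Park,ny': 'Central Park,NY',
        'Market City, MY': 'Market City, NY',
    })
    print("Creating new columns...")
    # Age is derived from year of birth, using 2025 as the reference year.
    df['Age'] = 2025 - df['YOB']
    age_labels = ['Teens', 'Twenties', 'Thirties', 'Forties', 'Fifties', 'Sixties']
    age_bin_edges = [15, 20, 30, 40, 50, 60, 70]
    # right=False makes each bin include its lower edge, so age 20 falls in 'Twenties', not 'Teens'.
    df['Age Group'] = pd.cut(df['Age'], bins=age_bin_edges, labels=age_labels, right=False)
    return df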

The function above reads and cleans the data. It resolves inconsistencies such as whitespace in column headers and inconsistent location spellings, and it adds columns for age and age group.
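The age binning step is worth checking in isolation, because pd.cut closes each interval on the right by default. A small sketch with a few hand-picked boundary ages (the ages are illustrative, not taken from the survey) shows how the two settings differ:

```python
import pandas as pd

labels = ["Teens", "Twenties", "Thirties", "Forties", "Fifties", "Sixties"]
edges = [15, 20, 30, 40, 50, 60, 70]

# Illustrative ages, chosen to sit on or near bin edges.
ages = pd.Series([19, 20, 30, 69])

# Default right=True: intervals are (15, 20], (20, 30], ...
right_closed = pd.cut(ages, bins=edges, labels=labels)

# right=False: intervals are [15, 20), [20, 30), ...
left_closed = pd.cut(ages, bins=edges, labels=labels, right=False)

print(list(right_closed))  # ['Teens', 'Teens', 'Twenties', 'Sixties']
print(list(left_closed))   # ['Teens', 'Twenties', 'Thirties', 'Sixties']
```

With the default setting, a 20-year-old is labelled 'Teens' and a 30-year-old 'Twenties'; with right=False, each label matches the decade of the age.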

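import io

def validate_data(df):
    # Collect summary information about the dataset.
    # df.info() prints to stdout and returns None, so capture its text in a buffer.
    info_buffer = io.StringIO()
    df.info(buf=info_buffer)
    dataframe_info = {
        "info": info_buffer.getvalue(),
        "shape": df.shape,
        "describe": df.describe(),
        "null_count": df.isnull().sum(),
    }
    return dataframe_info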

This function collects summary information about the DataFrame and stores it in a dictionary.
