COVID-19 US Public Transit Analysis

Descriptive Analysis on the Impact of COVID-19 on Public Transit in the United States
Project Overview
Data from the US Census Bureau and the National Transit Database (NTD) was cleaned and merged to compare commute patterns before and during the COVID-19 pandemic. The merged data was then imported into Tableau to present the trends in an interactive dashboard.
Tools and Datasets
The software tools included:
    - Python using Jupyter Notebook
    - Tableau

The datasets included:
    - adjusted monthly ridership data grouped by transit agency from the NTD
    - 2018-2022 yearly commuter means of transportation separated by year and grouped by urbanized area from the US Census Bureau
    - 2020 general urban area statistics from the US Census Bureau
Project Details
Monthly ridership data was gathered from the National Transit Database. It covered unlinked passenger trips across transit agencies and modes of transit, and was current through August 2023. The data came as an Excel workbook, which was saved as a CSV for import into Python. Yearly commuter data was gathered from the US Census Bureau as one CSV per year from 2018 to 2022.
To match the yearly commuter datasets, the monthly ridership data had to be aggregated into yearly totals. In addition, because ridership was broken down by transit agency and by type of transit, such as bus, rail or ferry, it also had to be aggregated by urban area to capture the full ridership of each urban area.
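The aggregation described above can be sketched roughly as follows. This is a minimal illustration using pandas, not the project's actual notebook code: the column names (`urban_area`, `agency`, `mode`, `month`, `trips`) and every number are invented for the example and do not match the real NTD headers.

```python
import pandas as pd

# Hypothetical miniature of the monthly ridership table: one row per
# agency, mode and month. Names and numbers are made up for illustration.
monthly = pd.DataFrame({
    "urban_area": ["Area A", "Area A", "Area A", "Area B", "Area B"],
    "agency": ["Agency 1", "Agency 1", "Agency 2", "Agency 3", "Agency 3"],
    "mode": ["bus", "rail", "ferry", "bus", "bus"],
    "month": ["2019-01", "2019-02", "2019-01", "2019-01", "2020-01"],
    "trips": [100, 120, 30, 80, 40],
})

# Derive the calendar year from the month label.
monthly["year"] = pd.to_datetime(monthly["month"], format="%Y-%m").dt.year

# Sum trips over every agency and mode within each urban area and year.
yearly = monthly.groupby(["urban_area", "year"], as_index=False)["trips"].sum()
print(yearly)
```

Grouping on urban area and year together collapses the agency and mode breakdown in a single step, leaving one row per urban area per year that can be joined to the yearly commuter data.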
The monthly ridership and commuter datasets were imported into Python to be cleaned and profiled. During cleaning, it was found that the US Census Bureau had changed its wording for working at home from "worked at home" to "work from home" between 2021 and 2022. Because this affected the commuter datasets, the "worked at home" columns were renamed to use the current wording, "work from home". In the ridership data, the summary statistic rows at the bottom of the dataset were dropped because they were extraneous.
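A minimal sketch of these two cleaning steps is shown below, assuming pandas. The tables, column labels and numbers are hypothetical, and the rule used to spot summary rows (a missing `agency_id`) is an assumption for the example; the real files may need a different test.

```python
import pandas as pd

# Hypothetical one-row commuter tables; real Census column labels differ.
commute_2021 = pd.DataFrame({"urban_area": ["Area A"], "worked at home": [1200]})
commute_2022 = pd.DataFrame({"urban_area": ["Area A"], "work from home": [1500]})

def harmonize(df: pd.DataFrame) -> pd.DataFrame:
    """Rename columns that use the old wording to the current wording."""
    renamed = {
        col: col.replace("worked at home", "work from home")
        for col in df.columns
        if "worked at home" in col
    }
    return df.rename(columns=renamed)

# Stack the yearly tables once their column names agree.
commuters = pd.concat(
    [harmonize(commute_2021).assign(year=2021),
     harmonize(commute_2022).assign(year=2022)],
    ignore_index=True,
)

# Hypothetical ridership table whose last row is a summary total.
# Assumption: summary rows carry no agency identifier.
ridership = pd.DataFrame({"agency_id": [1.0, 2.0, None], "trips": [100, 50, 150]})
ridership = ridership.dropna(subset=["agency_id"])
```

Renaming before concatenating matters: if the wording differed, the stacked table would hold two half-empty columns instead of one complete column.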

For the analysis, the first step was to explore the data using a correlation matrix and scatterplots. It was found that mean commute time rose as public transportation was used more. Another interesting relationship was that as commuting by car decreased, commuting by public transit rose. However, adding year to the plot showed that car and public transit use were not direct inverses of each other. Instead, during the COVID-19 pandemic years of 2020, 2021, and 2022, both public transit usage and car usage decreased, as seen in the scatterplot with purple hues.
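As an illustration of the correlation-matrix step, here is a small sketch assuming pandas. The column names and values are invented to mimic the direction of the relationships described above; they are not the project's data.

```python
import pandas as pd

# Made-up urban-area statistics, one row per urban area.
stats = pd.DataFrame({
    "mean_commute_minutes": [22.0, 25.0, 28.0, 35.0],
    "pct_public_transit": [2.0, 5.0, 10.0, 20.0],
    "pct_car": [90.0, 85.0, 78.0, 65.0],
})

# Pairwise Pearson correlations between all numeric columns.
corr = stats.corr()
print(corr.round(2))
```

A positive entry for commute time against public transit share, and a negative entry for car share against public transit share, would correspond to the two relationships noted in the text. A correlation matrix pools all years together, which is why the year-colored scatterplot was needed to see that both modes fell during the pandemic.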

Next steps included creating geospatial visualizations and using time series to analyze monthly trips taken from 2010 to August 2023. The time series of monthly transit trips across the United States shows a sharp drop in ridership at the start of the pandemic.
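One way to quantify such a drop is a year-over-year comparison, which sets each month against the same month a year earlier so that seasonal swings cancel out. The sketch below assumes pandas and uses a synthetic series; the source does not say that this particular calculation was part of the project, and the numbers are invented.

```python
import pandas as pd

# Synthetic national monthly totals: 100 trips per month until March 2020,
# 30 per month from April 2020 onward. Values are made up.
months = pd.date_range("2019-01-01", "2020-12-01", freq="MS")
values = [100 if m < pd.Timestamp("2020-04-01") else 30 for m in months]
national = pd.Series(values, index=months, name="trips")

# Fractional change relative to the same month one year earlier.
yoy = national.pct_change(periods=12)
print(yoy.dropna())
```

The first twelve months have no earlier year to compare against, so they come out as missing values and are dropped before printing.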

Lastly, the cleaned and merged data was imported into Tableau to create a dashboard showcasing public transit, work from home, and urban area statistics. One of the charts, shown below, highlights the drop in ridership at the start of the COVID-19 pandemic.
