Transit Data Mapping Basics

TransitCenter - June 16, 2017

Install R

Install RStudio

Download Zip Files

Why is Open Transit Data important?

What you will learn today:

Where to find transit data

Strategies on wrangling transit data

Ways of determining reliability

Mapping the reliability of a transit system

Getting Data

MBTA Back on Track

MBTA Developer Portal

MTA Developer Portal

Kinds of Transit Data

Historical Archives (e.g. CSV)

General Transit Feed Specification (GTFS), Static and Realtime


How do you measure performance?

It's complicated.


MBTA Subway reliability


NYC Bus Performance

Bus Turnaround

Let's get started.

Open TDashboardData_reliability_20160301-20160331.csv

Or download the data.

MBTA On Time Performance

Open mbta_station.csv

This file contains coordinates of the stations.

Source: the MBTA Rapid Transit Station shapefile, with some manual cleanup.

The MBTA GTFS Static Feed provides similar data, but it also requires some manual cleanup.
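
As a rough illustration of pulling station coordinates from a GTFS static feed, the sketch below reads a tiny inline stand-in for a `stops.txt` file. The field names (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`) come from the GTFS specification; the station names and coordinates are made up for the example and are not MBTA data.

```r
# Tiny inline stand-in for a GTFS stops.txt file (values are made up).
stops_txt <- "stop_id,stop_name,stop_lat,stop_lon
s1,Example Station A,42.3500,-71.0600
s2,Example Station B,42.3600,-71.0700"

# read.csv accepts a `text` argument, so no file on disk is needed here.
stops <- read.csv(text = stops_txt, stringsAsFactors = FALSE)

# Keep only the fields needed to place stations on a map.
station_coords <- stops[, c("stop_name", "stop_lat", "stop_lon")]
print(station_coords)
```

With a real feed, the same call would take the path to the unzipped `stops.txt` in place of the `text` argument.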

Launch RStudio

RStudio Overview

File -> Open File -> rely_analysis.R

Set the working directory, then "Run" through each step.
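
The workshop script itself is not reproduced here. The sketch below only illustrates the general idea of joining reliability records to station coordinates, using small made-up data frames. All column names (`service_date`, `stop`, `numerator`, `denominator`, `lat`, `lon`) are hypothetical, and treating `rely` as numerator divided by denominator is an assumption, not necessarily what rely_analysis.R does.

```r
# Hypothetical reliability records; the real CSV's columns may differ.
reliability <- data.frame(
  service_date = c("2016-03-01", "2016-03-01", "2016-03-02"),
  stop         = c("Station A", "Station B", "Station A"),
  numerator    = c(90, 70, 80),
  denominator  = c(100, 100, 100),
  stringsAsFactors = FALSE
)

# Hypothetical station coordinates, standing in for mbta_station.csv.
stations <- data.frame(
  stop = c("Station A", "Station B"),
  lat  = c(42.35, 42.36),
  lon  = c(-71.06, -71.07),
  stringsAsFactors = FALSE
)

# Assumed reliability measure: a simple ratio of the two counts.
reliability$rely <- reliability$numerator / reliability$denominator

# Attach coordinates to every reliability record by station name.
joindata <- merge(reliability, stations, by = "stop")
print(joindata)

# To save the result for mapping, one could then run:
# write.csv(joindata, "joindata.csv", row.names = FALSE)
```

With real files, `read.csv()` on the two CSVs would replace the inline data frames, after `setwd()` points R at the folder that holds them.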

Open Carto

Load New dataset: joindata.csv

Change Type to Date

Create A Map

Make A Map

Select avg(rely) and Choose Color
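
For reference, the avg(rely) aggregation can be approximated in R before uploading. This is a minimal sketch on made-up values, assuming one row per station per day, with a hypothetical `stop` column identifying the station.

```r
# Made-up joined data: one row per station per service date.
joindata <- data.frame(
  stop = c("Station A", "Station B", "Station A"),
  rely = c(0.9, 0.7, 0.8),
  stringsAsFactors = FALSE
)

# Mean reliability per station, analogous to avg(rely) grouped by station.
avg_rely <- aggregate(rely ~ stop, data = joindata, FUN = mean)
print(avg_rely)
```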

What do you see?

Where are people waiting the longest?

Make Animated Map

Under Fill: Choose rely, Quantile, and Color

Under Column: Choose Service Date
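
To get a feel for what a quantile classification does to the colors, the sketch below splits made-up `rely` values into four equal-count bins with base R. Carto's own quantile method may compute its breaks differently, so this is only an approximation of the idea.

```r
# Made-up reliability values for eight stations.
rely <- c(0.55, 0.62, 0.70, 0.78, 0.81, 0.88, 0.93, 0.97)

# Break points at the 0%, 25%, 50%, 75%, and 100% quantiles.
breaks <- quantile(rely, probs = seq(0, 1, by = 0.25))

# Assign each value to a bin from 1 (lowest) to 4 (highest).
bins <- cut(rely, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# Each bin holds the same number of stations, so each color is used equally.
print(table(bins))
```

Because every bin gets the same number of stations, a quantile map shows relative rank rather than absolute differences in reliability.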

What additional data would be useful?

How would you use this data and map responsibly?

Thanks to:

You, TransitCenter, Nature of Cities, Carto

Stay in touch: Open Transit Data Toolkit Email List

Resources: Transit Data

MBTA Back On Track

MBTA-realtime Developer Portal

MTA Bus Time

MTA Developer Resources

The Open Bus


Other Resources:

GTFS Data Best Practices

NYC MTA Developers Google Group

MassDOT Developers Google Group

Data Wrangling with Python

Google, Stack Overflow, Stack Exchange

Resources: Performance

MBTA Dashboard Data Dictionary

NYC Bus Performance

Bus Turnaround