ITSSData

class itssutils.itssdata.RawITSSData

Human-readable wrappers around raw ITSS data manipulations

raw_data_df

raw dataframe

Type: pd.DataFrame

get_agencies()

Return a list of all reporting agencies

get_collected_data()

Get a list of all the categories of data collected and processed

get_raw_dataframe()

Return the underlying dataframe

load_multiple_years(year_file_list, fast=True, save=False)

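Load multiple years' worth of raw data into a single object

Parameters:
  • year_file_list (list) – List of tuples of the format (year, filename)
  • fast (bool) – Whether to load from pre-processed pickle files
  • save (bool) – Whether to save to pickle files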

Example

>>> yf_list = [(2012, '2012_ITSS_Data.txt'), (2013, '2013_ITSS_Data.txt')]
>>> rid.load_multiple_years(yf_list)
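
When many years are loaded at once, the list of (year, filename) tuples can be generated rather than typed out. The sketch below assumes every file follows the `<year>_ITSS_Data.txt` naming seen in the example above; `build_year_file_list` is a hypothetical helper, not part of itssutils.

```python
def build_year_file_list(first_year, last_year, pattern="{year}_ITSS_Data.txt"):
    """Hypothetical helper: build (year, filename) tuples for an inclusive year range.

    The default pattern mirrors the filenames used in the example above and is
    an assumption about how the raw files are named.
    """
    return [
        (year, pattern.format(year=year))
        for year in range(first_year, last_year + 1)
    ]


yf_list = build_year_file_list(2012, 2016)
```

The resulting `yf_list` has the same shape as the hand-written list in the example and could be passed to `load_multiple_years` in the same way.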

load_single_year(year, filename, fast=True, save=False)

Load a single year of raw data

Parameters:
  • year (int) – The year of interest
  • filename (str) – The filename containing raw ITSS data
  • fast (bool) – Whether to load from pre-processed pickle file
  • save (bool) – Whether to save to a pickle file
Returns:

None

Example

>>> rid.load_single_year(2016, '2016_ITSS_Data.txt')
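
The `fast` and `save` flags describe a common cache pattern: parse the raw file once, store the result as a pickle, and read the pickle on later runs. How itssutils names or locates its pickle files is not documented here, so the following is only a sketch of the general idea. `load_with_cache`, `parse_lines`, and the `.pkl` path next to the raw file are all assumptions made for illustration.

```python
import pickle
import tempfile
from pathlib import Path


def load_with_cache(filename, parse, fast=True, save=False):
    """Hypothetical helper: parse `filename`, optionally using a pickle cache."""
    cache_path = Path(filename).with_suffix(".pkl")  # assumed cache location
    if fast and cache_path.exists():
        # Fast path: reuse the previously parsed result
        with cache_path.open("rb") as fh:
            return pickle.load(fh)
    data = parse(filename)
    if save:
        # Store the parsed result for later fast loads
        with cache_path.open("wb") as fh:
            pickle.dump(data, fh)
    return data


def parse_lines(filename):
    """Stand-in parser: return the non-empty lines of a text file."""
    with open(filename) as fh:
        return [line.strip() for line in fh if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "2016_ITSS_Data.txt"
    raw.write_text("stop 1\nstop 2\n")
    first = load_with_cache(raw, parse_lines, fast=True, save=True)  # parses, writes cache
    raw.write_text("changed\n")
    second = load_with_cache(raw, parse_lines, fast=True)  # served from the cache
    third = load_with_cache(raw, parse_lines, fast=False)  # re-parses the raw file
```

In this sketch a stale cache is returned whenever `fast=True`, so the raw file must be reloaded with `fast=False` after it changes. Pickle files should only be loaded when they come from a trusted source.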

plot_timeseries(frequency='1W', agency=None, filter_cols=None, filter_values=True, group=None, title='All Agencies', savename=None, savecsv=None)

Plot a time series of the counts of raw traffic stop data.

Parameters:
  • frequency (str) – The pandas-style sampling frequency; default '1W'
  • agency (str) – The agency to filter by; default None
  • filter_cols (str or list) – The column(s) to filter by; default None
  • filter_values (str or int or list) – The selected value(s); default True
  • group (str or list of str) – The column(s) to group by; default None
  • title (str) – Plot title
  • savename (str or path) – Path to save figure
  • savecsv (str or path) – Path to save csv of data used to create figure

Examples

>>> # Find the daily number of stops by the Chicago Police
>>> rid.plot_timeseries(frequency='1D', agency='Chicago Police')
>>> # Find the weekly number of citations issued across all departments
>>> rid.plot_timeseries(filter_cols='ResultOfStop', filter_values='Citation')
>>> # Find the monthly number of stops by race
>>> rid.plot_timeseries(frequency='1M', group='DriverRace')
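
The counts behind such a plot can be reproduced in plain pandas, which may help when checking a figure or the CSV written via `savecsv`. The internals of `plot_timeseries` are not shown in this reference, so the code below is a sketch of an equivalent computation and not the library's implementation. `count_stops` is a hypothetical helper, `DateOfStop` is an assumed name for the timestamp column, and the race labels are placeholders.

```python
import pandas as pd

# Toy stand-in for the raw data. "DateOfStop" is a hypothetical column name;
# "ResultOfStop" and "DriverRace" are taken from the examples above.
stops = pd.DataFrame({
    "DateOfStop": pd.to_datetime(
        ["2016-01-04", "2016-01-05", "2016-01-05", "2016-01-12", "2016-01-13"]
    ),
    "ResultOfStop": ["Citation", "Warning", "Citation", "Citation", "Warning"],
    "DriverRace": ["A", "B", "A", "B", "A"],  # placeholder labels
})


def count_stops(df, frequency="1W", filter_col=None, filter_value=None, group=None):
    """Hypothetical helper: count rows per time bin, optionally filtered and grouped."""
    if filter_col is not None:
        df = df[df[filter_col] == filter_value]
    time_bins = pd.Grouper(key="DateOfStop", freq=frequency)
    if group is None:
        return df.groupby(time_bins).size()
    # One column per group value, zero where a group has no stops in a bin
    return df.groupby([time_bins, group]).size().unstack(fill_value=0)


weekly = count_stops(stops)
weekly_citations = count_stops(stops, filter_col="ResultOfStop", filter_value="Citation")
daily_by_race = count_stops(stops, frequency="1D", group="DriverRace")
```

Each result is indexed by time bin, so calling `.plot()` on it (with matplotlib installed) draws one line per column.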