ITSSMetrics

class itssutils.itssdata.ITSSMetrics(itss_data=None)

Class to wrap ITSS metrics dataframe

raw_df

The dataframe of raw ITSS data

Type:	pd.DataFrame

metrics

The dataframe of calculated metrics

Type:	pd.DataFrame

grouping

The grouping of calculated metrics

Type:	list of str

calculate_metrics(grouping, population_csv=None)

Calculate the metrics, grouping by different items

Parameters:

grouping (str or list of str) – Columns by which to group the data
population_csv (str or path) – Filename of population demographic csv

Examples

>>> # Calculate the metrics for each racial group across all traffic stops
>>> met.calculate_metrics('DriverRace')

>>> # Calculate yearly metrics by driver sex for each agency
>>> met.calculate_metrics(['AgencyName', 'Year', 'DriverSex'])
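To illustrate what grouping by several columns produces, the sketch below builds a small multi-level table with plain pandas. It is not the library's implementation: the toy data, the SearchConducted column, and the SearchRate formula (searches divided by stops) are assumptions made for this example only.

```python
import pandas as pd

# Toy stand-in for raw stop data; the values and the SearchConducted
# column are hypothetical.
raw = pd.DataFrame({
    "AgencyName": ["A", "A", "A", "B", "B", "B"],
    "DriverRace": ["Black", "White", "White", "Black", "Black", "White"],
    "SearchConducted": [1, 0, 1, 0, 1, 0],
})

grouping = ["AgencyName", "DriverRace"]

# Count stops and searches per group; the result is indexed by the
# grouping columns, giving one index level per column.
toy_metrics = raw.groupby(grouping).agg(
    StopCount=("SearchConducted", "size"),
    SearchCount=("SearchConducted", "sum"),
)
# Assumed definition of a rate: events divided by observations.
toy_metrics["SearchRate"] = toy_metrics["SearchCount"] / toy_metrics["StopCount"]
print(toy_metrics)
```

Passing a single column name instead of a list would give a single-level index, which is why the plotting methods below ask for a grouping with at least two columns.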

get_grouping(): Return the grouping used to calculate the metrics

get_metrics(): Return a list of all the calculated metrics

get_metrics_df(): Return the raw metrics dataframe

load(filename): Load a metrics object from a pickle file. The pickled object is a (grouping, metrics_df) tuple.

plot_bars(target_top_row, target_column, only_include_rows=None, title=None, savename=None, savecsv=False, xax_label=None)

Make a bar plot of a single metric. Requires that the metrics were calculated with a multi-level grouping.

Parameters:	target_top_row (str) –

Examples

>>> met.calculate_metrics(['AgencyName', 'DriverRace'])
>>> met.plot_bars('Chicago Police', 'SearchRate')
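To make the multi-level requirement concrete, here is a minimal pandas sketch of picking one top-level entry from a two-level index and reading a single column, which is presumably the kind of slice a bar plot would draw. The toy values are invented, and the selection logic is an assumption rather than the method's actual code.

```python
import pandas as pd

# Hypothetical metrics table grouped by ['AgencyName', 'DriverRace'].
index = pd.MultiIndex.from_product(
    [["Chicago Police", "Other Agency"], ["Black", "White"]],
    names=["AgencyName", "DriverRace"],
)
toy_metrics = pd.DataFrame({"SearchRate": [0.08, 0.03, 0.05, 0.04]}, index=index)

# Selecting the top-level key leaves a Series indexed by the remaining
# level, i.e. one value (one bar) per DriverRace.
bars = toy_metrics.loc["Chicago Police", "SearchRate"]
print(bars)
```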

plot_scatter(y_index, x_index, metric, size, population_col=None, logscale=False, limits=None, scale_factor=None, z_threshold=5, z_opacity='binary', as_ratio=False, title=None, savename=None, savecsv=False)

Scatter plot of all agencies

Parameters:

y_index (str or tuple) – the top-level index to use for the y-axis data (i.e. all levels except agency name)
x_index (str or tuple) – the top-level index to use for the x-axis data
metric (str) – the name of the calculated rate to plot, e.g. SearchRate
size (str) – the name of the metric to use to size the points, e.g. SearchCount
logscale (bool) – Plot on a loglog scale
limits (list or tuple) – the limits on the x and y axes
scale_factor (float) – Scaling factor for size of points
z_threshold (float) – Cutoff threshold to consider something “statistically significant”
z_opacity (str) – Type of shading to use (‘binary’, ‘gradient’, ‘filter’)
as_ratio (bool) – Make a ratio plot
title (str) – Title of the plot
savename (str or path) – Where to save the figure
savecsv (str or path) – Where to save a csv of data used to make the figure

Examples

>>> # Compare search rates for black and white drivers
>>> met.plot_scatter('Black', 'White', 'SearchRate', 'SearchCount', population_col='StopCount')
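The y_index and x_index arguments each name one entry of the non-agency index level. As a rough sketch of how such a pairing can be lined up per agency, the example below unstacks a toy two-level table so that each agency becomes one point with an x and a y value. The data are invented and this is only an assumed approach, not the method's actual code.

```python
import pandas as pd

# Hypothetical metrics table grouped by ['AgencyName', 'DriverRace'].
index = pd.MultiIndex.from_product(
    [["Agency A", "Agency B", "Agency C"], ["Black", "White"]],
    names=["AgencyName", "DriverRace"],
)
toy_metrics = pd.DataFrame(
    {"SearchRate": [0.10, 0.05, 0.04, 0.04, 0.09, 0.03]}, index=index
)

# Move the DriverRace level into columns: one row per agency,
# one column per group.
wide = toy_metrics["SearchRate"].unstack("DriverRace")

# One (x, y) pair per agency, matching y_index='Black', x_index='White'.
y, x = wide["Black"], wide["White"]
print(wide)
```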

plot_timeseries(target_column, only_include_rows=None, only_include_entries=None, title=None, ylabel=None, savename=None, savecsv=None)

Make a timeseries plot

Parameters:

target_column (str) – The column you want to make the timeseries for
only_include_rows (str or tuple or list) – Rows of index to include
only_include_entries (str or tuple or list) – Filter criteria - only include matching entries from target
title (str) – Plot title
ylabel (str) – Plot y-axis label
savename (str or path) – Path to save the plot
savecsv (str or path) – Path to save a csv of data used to make the plot

Examples

>>> met.plot_timeseries('SearchRate', only_include_rows='Chicago Police', only_include_entries=['Black', 'Hispanic/Latino', 'Asian', 'White'], title='Search Rate 2012-2017')

plot_zhist(target_item, reference_item, event_col, total_obs_col, title=None)

Z-score histogram for a given event/observation count pairing, e.g. SearchCount/StopCount. ‘AgencyName’ must be included in the grouping, and the grouping must contain at least two categories.

Parameters:

target_item – index of target item, e.g. ‘Black’
reference_item – index of reference item, e.g. ‘White’
event_col – column name for event counts, e.g. SearchCount
total_obs_col – column name for total observations, e.g. StopCount

Examples

>>> # Compare the deviation of black driver search hit rate relative to white driver search hit rate
>>> met.plot_zhist('Black', 'White', 'SearchHitCount', 'SearchCount')
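This page does not state which z-score formula plot_zhist uses. As one common choice, the sketch below computes a pooled two-proportion z-score from event and observation counts. The function name two_proportion_z and the sample counts are hypothetical and not part of itssutils.

```python
import math


def two_proportion_z(target_events, target_total, ref_events, ref_total):
    """Pooled two-proportion z-score of the target rate relative to the reference rate."""
    p_target = target_events / target_total
    p_ref = ref_events / ref_total
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (target_events + ref_events) / (target_total + ref_total)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / target_total + 1 / ref_total))
    return (p_target - p_ref) / std_err


# Hypothetical counts for one agency: SearchHitCount out of SearchCount
# for the target group and for the reference group.
z = two_proportion_z(target_events=20, target_total=100, ref_events=30, ref_total=100)
print(round(z, 2))
```

Computing one such value per agency and plotting the collection as a histogram would show how far target-group rates sit from reference-group rates across agencies, under the assumed formula.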

save(filename): Pickle a metrics object as a (grouping, metrics_df) tuple
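The on-disk format described for save and load is a plain (grouping, metrics_df) tuple, so a round trip can be sketched with the standard pickle module. The toy dataframe and the temporary file path are assumptions for illustration; the class may do more than this internally.

```python
import os
import pickle
import tempfile

import pandas as pd

# Hypothetical grouping and metrics table.
grouping = ["AgencyName", "DriverRace"]
metrics_df = pd.DataFrame({"SearchRate": [0.08, 0.03]}, index=["Black", "White"])

path = os.path.join(tempfile.mkdtemp(), "metrics.pkl")

# Write the (grouping, metrics_df) tuple to disk.
with open(path, "wb") as fh:
    pickle.dump((grouping, metrics_df), fh)

# Read it back and unpack the two parts.
with open(path, "rb") as fh:
    loaded_grouping, loaded_df = pickle.load(fh)

print(loaded_grouping, loaded_df.equals(metrics_df))
```

As with any pickle file, only load files from a source you trust, since unpickling can execute arbitrary code.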

save_csv(filename): Save the current metrics as a csv file