Sampling

class sampling.BigIncidentSampler(incidents, deployments, start_time, end_time, min_ts=3, vehicles=['TS'], types=['Binnenbrand', 'Buitenbrand', 'Hulpverlening algemeen'])

Class that simulates big incidents at random times. Mostly useful as a starting point for simulating more (regular) incidents and evaluating response times in extreme cases.

Parameters:

incidents (pd.DataFrame) – The incident data.
deployments (pd.DataFrame) – The deployment data.
min_ts (int, default=3) – The minimum number of TS deployments for an incident to be considered ‘big’. Only such incidents will be sampled.
vehicles (str or list of strings) – The vehicle types to take into account.
types (list of strings) – The incident types to use. If None, use all in the data.
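
The selection rule behind min_ts can be sketched in a few lines. This is only an illustration under assumptions: the function name big_incident_ids and the (incident_id, vehicle_type) record layout are hypothetical, and the class itself works on DataFrames.

```python
from collections import Counter


def big_incident_ids(deployments, min_ts=3, vehicle="TS"):
    """Return the sorted IDs of incidents with at least `min_ts` deployments of `vehicle`.

    `deployments` is an iterable of (incident_id, vehicle_type) pairs; this
    layout is an assumption made for the example.
    """
    # Count only deployments of the requested vehicle type, per incident.
    counts = Counter(inc_id for inc_id, vtype in deployments if vtype == vehicle)
    return sorted(inc_id for inc_id, n in counts.items() if n >= min_ts)


deployments = [
    ("A", "TS"), ("A", "TS"), ("A", "TS"), ("A", "RV"),
    ("B", "TS"), ("B", "HV"),
    ("C", "TS"), ("C", "TS"), ("C", "TS"), ("C", "TS"),
]
print(big_incident_ids(deployments))  # ['A', 'C']
```

Only incidents that pass this filter would be candidates for sampling; incident B has a single TS deployment and is therefore not considered big.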

sample_big_incident(): Sample a big incident at some random time and place.

class sampling.IncidentSampler(incidents, deployments, vehicle_types, location_ids, start_time=None, end_time=None, predictor='basic', fc_dir='/data', verbose=True)

Samples timing and details of incidents from distributions in the data.

Parameters:

incidents (pd.DataFrame) – The incident data to obtain distributions from.
deployments (pd.DataFrame) – The deployments to obtain vehicle requirement distributions from.
vehicle_types (array-like of strings) – The vehicle types to include in the sampling.
start_time (datetime or str (convertible to datetime) or None) – The start of the period that should be simulated. If None, starts from the end of the data. Defaults to None.
end_time (datetime or str (convertible to datetime) or None) – The end of the period that should be simulated. If None, ends one year after the end of the data. Defaults to None.
predictor (str, one of ['prophet']) – What incident rate forecast method to use. Currently only supports “prophet” (based on Facebook’s Prophet package).
verbose (boolean) – Whether to print progress.

Example

>>> sampler = IncidentSampler(df_incidents, df_deployments, vehicle_types, location_ids)
>>> # sample 10 incidents and print details
>>> t = 0
>>> for _ in range(10):
...     t, type_, loc, prio, vehicles, func = sampler.sample_next_incident(t)
...     print("time: {}, type: {}, loc: {}, prio: {}, vehicles: {}, object function: {}."
...           .format(t, type_, loc, prio, vehicles, func))

reset_time(): Reset the incident time generator to start from t=0.

sample_next_incident()

Sample a random time and type for the next incident.

Parameters:	t (float) – The current time in minutes (from an arbitrary random start time)
Returns:	The new time t of the next incident and incident details (t, incident_type, location, priority, vehicles, building function)
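
How a next incident time can be drawn from an hourly rate forecast is sketched below. This is not the sampler's actual implementation: it assumes a piecewise-constant Poisson process, and the function name sample_next_time and the hourly_rates list are hypothetical. Time is measured in minutes, as for t above.

```python
import random


def sample_next_time(t, hourly_rates, rng):
    """Return the time (in minutes) of the next incident after time t.

    `hourly_rates[h]` is the expected number of incidents in hour h; the
    list is cycled when t runs past its end.
    """
    if not any(rate > 0 for rate in hourly_rates):
        raise ValueError("at least one hourly rate must be positive")
    while True:
        hour = int(t // 60)
        rate_per_min = hourly_rates[hour % len(hourly_rates)] / 60.0
        hour_end = (hour + 1) * 60.0
        if rate_per_min > 0:
            # Exponential inter-arrival time at the rate of the current hour.
            candidate = t + rng.expovariate(rate_per_min)
            if candidate < hour_end:
                return candidate
        # No incident in the rest of this hour. Because the exponential
        # distribution is memoryless, we can restart from the next hour.
        t = hour_end


rng = random.Random(42)
t = 0.0
times = []
for _ in range(5):
    t = sample_next_time(t, [2.0, 0.0, 4.0], rng)
    times.append(t)
print(times)
```

Feeding the returned time back in as the new t mirrors the loop in the class example above; hours with a rate of zero never produce an incident.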

set_custom_forecast(forecast, start_time=None, end_time=None)

Manually provide a forecast.

Parameters:

forecast (pd.DataFrame) – Must have the same shape and columns as the output of self.predictor.get_forecast(). No assertions are made on this input.
start_time, end_time – The start and end time of the new sampling dictionary that will be created from the provided forecast. If None, uses the entire forecast.

set_location_probs(loc, equal_to=None, value_dict=None, types=None)

Set the probability that an incident of a certain type occurs in a given location.

The probabilities are first set to the given values, then they are normalized again to form a proper probability distribution. Note that this function should normally not be called directly. Use fdsim.simulation.Simulator.set_location_incident_rates instead.

Parameters:

loc (str) – The location to set the probabilities for.
equal_to (str or list(str)) – The location from which to copy the probabilities. Ignored if value_dict is provided.
value_dict (dict) – Dictionary specifying the incident types to change and the probability to set each of them to, like {type: prob}.
types (list(str)) – The incident types to change the probabilities of. If None, uses all. Ignored if a value_dict is provided.
Returns:	sum_probs – The sum of the probabilities for each incident type. Can be used to adjust the overall incident rates per type so that the incident rate in other locations is not changed.
Return type:	dict
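
The set-then-normalize behaviour and the meaning of sum_probs can be illustrated with plain dictionaries. This is a sketch under assumptions, not the method's code: override_and_normalize and the nested {type: {location: prob}} layout are hypothetical.

```python
def override_and_normalize(probs, loc, value_dict):
    """Set probs[type][loc] to the given values, then renormalize per type.

    `probs` maps incident type -> {location: probability}. Returns the sum of
    the probabilities per type as it was just before normalizing.
    """
    sum_probs = {}
    for inc_type, value in value_dict.items():
        dist = probs[inc_type]
        dist[loc] = value                 # first set the requested value
        total = sum(dist.values())        # no longer sums to 1 in general
        for key in dist:
            dist[key] /= total            # normalize to a proper distribution
        sum_probs[inc_type] = total
    return sum_probs


probs = {"Binnenbrand": {"loc_a": 0.5, "loc_b": 0.5}}
sums = override_and_normalize(probs, "loc_a", {"Binnenbrand": 1.5})
print(probs)  # {'Binnenbrand': {'loc_a': 0.75, 'loc_b': 0.25}}
print(sums)   # {'Binnenbrand': 2.0}
```

In this example, multiplying the overall Binnenbrand rate by the returned 2.0 leaves the rate in loc_b unchanged (2.0 × 0.25 = 0.5, its original share), which is the adjustment the return value is meant to support.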

set_time(time, num_periods=None)

Set time so that incidents are sampled from this point forward.

After setting time, sample_next_incident will sample the next incident in the hour(s) after the set time rather than starting at the start of the forecast / sampling dict or resuming from its current position.

Parameters:

time (pd.Timestamp or datetime64) – The time from which to sample the next incident.
num_periods (int, default=100) – The number of periods to simulate from the set time. Incident times will start from time again after num_periods are simulated. This can be used when only short periods need to be considered to speed up calculations, e.g., when simulating major incidents and investigating only simultaneous incidents.
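
The wrap-around that num_periods describes can be pictured as follows. This sketch assumes hourly periods, and period_starts is a hypothetical helper rather than part of the class.

```python
from datetime import datetime, timedelta


def period_starts(time, num_periods, n_steps, period=timedelta(hours=1)):
    """List the start of each simulated period, restarting at `time`
    after every `num_periods` periods."""
    return [time + (i % num_periods) * period for i in range(n_steps)]


starts = period_starts(datetime(2020, 1, 1, 8), num_periods=3, n_steps=5)
print([s.hour for s in starts])  # [8, 9, 10, 8, 9]
```

With a small num_periods, only a short window after the set time has to be prepared, and that window is reused once it is exhausted.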

class sampling.ResponseTimeSampler(load_data=True, data_dir='/data', verbose=True)

Class that samples response times for deployed vehicles.

Parameters:

load_data (boolean) – Whether to load data from disk (True) or pre-process from scratch (False).
data_dir (str) – The path to the directory where data is stored. Used if load_data==True or when data is saved after preparation.
verbose (boolean) – Whether to print progress updates.

add_station(station_name, location)

Add a new station at the given location.

Parameters:

station_name (str) – The name of the new station.
location (str or tuple(float, float)) – The location of the new station. If a string is passed, it is interpreted as the identifier of the demand location at which to place the station. If a tuple of floats is passed, it is interpreted as the coordinates in decimal (long, lat).

fit(incidents=None, deployments=None, stations=None, loc_coords=None, vehicle_types=['TS', 'RV', 'HV', 'WO'], osrm_host='http://192.168.56.101:5000', save_prepared_data=False, location_col='hub_vak_bk', volunteer_stations=['DRIEMOND', 'DUIVENDRECHT', 'AMSTELVEEN VRIJWILLIG'])

Fit random variables related to response time.

Parameters:

incidents (pd.DataFrame) – The incident data. Only required when no prepared data is loaded.
deployments (pd.DataFrame (optional)) – The deployment data. Only required when no prepared data is loaded.
stations (pd.DataFrame (optional)) – The station information including coordinates and station names. Only required when no prepared data is loaded.
vehicle_types (array-like of strings) – The types of vehicles to use. Defaults to [“TS”, “RV”, “HV”, “WO”].
osrm_host (str) – The url to the OSRM API, required when object is initialized with load_data=False or when no prepared data was found.
save_prepared_data (boolean) – Whether to write the preprocessed data to a csv file so that it can be loaded the next time. Defaults to False.
location_col (str) – The name of the column that specifies the demand locations, defaults to “hub_vak_bk”.
volunteer_stations (array-like of str, optional (default: None)) – The names of the stations that are run by volunteers. Turn-out times are fitted separately for these stations, since volunteers have to travel to the station first.

Notes

Performs the following steps:

Prepares data (merges and adds OSRM distance and duration per deployment)
Fits lognormal random variables to dispatch times per incident type.
Fits Gamma random variables to turnout time per station and type.
Models the travel time as \(\alpha + \beta \cdot \gamma(\theta, k) \cdot \hat{t}\), per vehicle type. Here \(\hat{t}\) represents the OSRM estimate of the travel time and \(\gamma\) is a random noise factor.
Saves the station and demand location coordinates in dictionaries.
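
The travel time model from the steps above can be sketched with the standard library alone. The parameter values below are made up for illustration, the noise factor is assumed to be Gamma distributed with shape k and scale θ, and the helper name sample_travel_time_sketch is hypothetical; the fitted values and the actual implementation live in the class.

```python
import random


def sample_travel_time_sketch(estimated_time, alpha, beta, k, theta, rng):
    """Sample alpha + beta * noise * estimated_time,
    where noise ~ Gamma(shape=k, scale=theta)."""
    noise = rng.gammavariate(k, theta)  # random.gammavariate(shape, scale)
    return alpha + beta * noise * estimated_time


rng = random.Random(0)
# Illustrative parameters: k * theta = 1, so the noise factor has mean 1 and
# the expected travel time is alpha + beta * estimated_time = 350 seconds.
samples = [
    sample_travel_time_sketch(300.0, alpha=20.0, beta=1.1, k=25.0, theta=0.04, rng=rng)
    for _ in range(10000)
]
mean_travel_time = sum(samples) / len(samples)
print(round(mean_travel_time))
```

Because the noise multiplies the OSRM estimate, the spread of the sampled travel times grows with the length of the trip, while α acts as a fixed offset.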

move_station(station_name, new_location, new_name)

Move the location of a single station.

Parameters:

station_name (str) – The name of the station to move.
new_location (str or tuple(float, float)) – The new location of the station. If a string is passed, it is interpreted as the identifier of the demand location to move the station to. If a tuple of floats is passed, it is interpreted as the new coordinates in decimal (long, lat).
new_name (str) – The new name of the station.
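
The two accepted forms of a location argument can be resolved as sketched here. The helper resolve_location, the loc_coords dictionary, and the example identifier and coordinates are hypothetical and only illustrate the str versus tuple rule described above.

```python
def resolve_location(location, loc_coords):
    """Return (long, lat) for a demand location ID or a coordinate tuple.

    `loc_coords` maps demand location IDs to (long, lat) tuples.
    """
    if isinstance(location, str):
        # A string is treated as the identifier of a demand location.
        return loc_coords[location]
    # Anything else is expected to be a (long, lat) pair of numbers.
    long, lat = location
    return float(long), float(lat)


loc_coords = {"loc_1": (4.90, 52.37)}
print(resolve_location("loc_1", loc_coords))        # (4.9, 52.37)
print(resolve_location((4.85, 52.35), loc_coords))  # (4.85, 52.35)
```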

reset_stations(): Reset the station locations and names to those obtained from the data.

sample_dispatch_time(incident_type)

Sample a random dispatch time, given the incident type.

Parameters:	incident_type (str) – The type of incident to sample dispatch times for.
Returns:	The random dispatch time in seconds.
Return type:	int

sample_response_time(incident_type, location_id, station_name, vehicle_type, appointment, prio, estimated_time=None, osrm_host='http://192.168.56.101:5000')

Sample a random response time based on deployment characteristics.

Parameters:

incident_type (str) – The type of the incident to sample turn-out time for.
location_id (str) – The ID of the demand location where the incident takes place.
station_name (str) – The name of the station that the deployment is executed from.
vehicle_type (str) – The vehicle type (code) to sample travel time for.
estimated_time (float, int, optional) – The estimated travel time according to OSRM. Defaults to None. If None, the estimate is requested from OSRM at call time, which is far less efficient.
osrm_host (str, optional) – The URL to the OSRM API. Required when no ‘estimated_time’ is provided.

Returns:

Tuple of (turn-out time, travel time, on-scene time). Note that the dispatch
time is sampled separately, since that is only done once per incident, while
this function is called per deployment.
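
The split between per-incident and per-deployment sampling can be sketched as follows. The stand-in samplers and their distribution parameters are invented for the example, and the sketch assumes that a deployment's response time is the sum of dispatch, turn-out and travel time; none of this is the class's actual code.

```python
import random

rng = random.Random(1)


def fake_dispatch_time():
    """Stand-in for sample_dispatch_time: one lognormal draw, in seconds."""
    return rng.lognormvariate(4.0, 0.3)


def fake_response_parts():
    """Stand-in for sample_response_time: (turn-out, travel, on-scene) in seconds."""
    turnout = rng.gammavariate(9.0, 10.0)
    travel = rng.gammavariate(16.0, 20.0)
    on_scene = rng.gammavariate(4.0, 600.0)
    return turnout, travel, on_scene


def response_times(n_deployments):
    """Sample the dispatch time once, then one response time per deployment."""
    dispatch = fake_dispatch_time()  # shared by all deployments of the incident
    results = []
    for _ in range(n_deployments):
        turnout, travel, _on_scene = fake_response_parts()
        results.append(dispatch + turnout + travel)
    return dispatch, results


dispatch, results = response_times(3)
print(dispatch, results)
```

All three deployments share the same dispatch time, so differences between their response times come only from turn-out and travel.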

sample_travel_time(estimated_time, vehicle, osrm_host='http://192.168.56.101:5000')

Sample a random travel time.

Parameters:

estimated_time (float) – The travel time in seconds according to OSRM.
vehicle (str) – The vehicle type (code) to sample travel time for.

Returns:	The random travel time in seconds.
Return type:	float

set_custom_stations(station_locations, station_names, location_col='hub_vak_bk')

Change the locations of stations to custom demand locations.

Parameters:

station_locations (array-like of strings) – Location IDs of the custom stations. Must match values in location_col of the object's data.
location_col (str, optional) – Name of the column to use as a location identifier for incidents. Defaults to “hub_vak_bk”.