xdem.spatialstats.interp_nd_binning

xdem.spatialstats.interp_nd_binning(df, list_var_names, statistic=<function nmad>, interpolate_method='linear', min_count=100)

Estimate an interpolant function for an N-dimensional binning, preferably based on the output of nd_binning. For more details on the input dataframe and the associated list of variable names and statistic, see nd_binning.

First, interpolates nodata values of the irregular N-D binning grid with scipy.interpolate.griddata. Then, extrapolates the remaining nodata values on the N-D binning grid with scipy.interpolate.griddata using “nearest neighbour” (necessary to get a regular grid without NaNs). Finally, creates an interpolant function (linear by default) to interpolate between points of the grid using scipy.interpolate.RegularGridInterpolator. Extrapolation is fixed to nearest neighbour by duplicating edge bins along each dimension (linear extrapolation between two equal bins is equivalent to nearest neighbour).
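The three steps above can be sketched with SciPy alone. This is a minimal illustration under stated assumptions, not the xdem implementation: the 3x3 grid, the bin centres `x` and `y`, the padding distance `far`, and the names `filled`, `padded`, and `fun` are all hypothetical.

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator, griddata

# Hypothetical bin centres along two explanatory variables.
x = np.array([1.0, 2.0, 3.0])
y = np.array([1.0, 2.5, 4.0])
X, Y = np.meshgrid(x, y, indexing="ij")

# Hypothetical binned statistic with two empty bins (NaN).
values = X + 10.0 * Y
values[1, 1] = np.nan  # inside the convex hull of the valid bins
values[0, 0] = np.nan  # corner bin, outside the convex hull of the valid bins

# Step 1: fill empty bins by linear interpolation from the valid bins.
# Bins outside the convex hull of valid bins remain NaN.
filled = values.copy()
valid = np.isfinite(filled)
filled[~valid] = griddata(
    (X[valid], Y[valid]), filled[valid], (X[~valid], Y[~valid]), method="linear"
)

# Step 2: fill the remaining empty bins with their nearest valid neighbour.
valid = np.isfinite(filled)
filled[~valid] = griddata(
    (X[valid], Y[valid]), filled[valid], (X[~valid], Y[~valid]), method="nearest"
)

# Step 3: duplicate the edge bins along each dimension, so that linear
# interpolation beyond the original grid returns the value of the edge bin.
far = 1e3  # arbitrary padding distance (assumption of this sketch)
x_ext = np.concatenate(([x[0] - far], x, [x[-1] + far]))
y_ext = np.concatenate(([y[0] - far], y, [y[-1] + far]))
padded = np.pad(filled, 1, mode="edge")
fun = RegularGridInterpolator((x_ext, y_ext), padded, method="linear")

inside = fun(np.array([[2.0, 2.5]]))[0]    # at the bin filled in step 1
outside = fun(np.array([[-5.0, 1.0]]))[0]  # beyond the original grid
```

In this sketch, queries are only valid within the padded range; points further than `far` from the grid would raise an error with the default `bounds_error=True`.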

If the variable pd.Series contains intervals (as in the output of nd_binning), uses the middle of each interval. Otherwise, uses the variable values as such.
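As an illustration of the midpoint rule, the helper below converts a series of pandas intervals to bin centres and leaves plain numeric series unchanged. The function name `bin_centers` is hypothetical and not part of xdem; it only assumes that bins are stored as `pd.Interval` objects.

```python
import numpy as np
import pandas as pd


def bin_centers(series: pd.Series) -> np.ndarray:
    """Return interval midpoints if the series holds intervals, else its values."""
    if len(series) > 0 and all(isinstance(v, pd.Interval) for v in series):
        return np.array([v.mid for v in series], dtype=float)
    return series.to_numpy(dtype=float)


# A series of intervals, such as a binning into [0, 10], (10, 20], (20, 30].
intervals = pd.Series(pd.IntervalIndex.from_breaks([0, 10, 20, 30]))
centers_from_intervals = bin_centers(intervals)

# A series of plain values is used as such.
centers_from_values = bin_centers(pd.Series([1, 2, 3]))
```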

Parameters:
  • df (DataFrame) – Dataframe with statistic of binned values according to explanatory variables.

  • list_var_names (str | list[str]) – Explanatory variable data series to select from the dataframe.

  • statistic (Union[str, Callable[[ndarray[Any, dtype[floating[Any]]]], floating[Any]]]) – Statistic to interpolate, stored as a data series in the dataframe.

  • interpolate_method (Union[Literal['nearest'], Literal['linear']]) – Method to interpolate inside of edge bins, “nearest”, “linear” (default).

  • min_count (int | None) – Minimum number of samples for a binned statistic to be considered valid (bins with fewer samples are replaced by nodata).
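The effect of min_count can be pictured as a mask applied before interpolation. The sketch below assumes the binned dataframe carries a sample-count column named `count` next to the statistic column; both column names and the values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical binned dataframe: one row per bin.
df_binned = pd.DataFrame(
    {"var1": [1, 2, 3], "count": [500, 20, 150], "nmad": [0.5, 3.0, 0.7]}
)

min_count = 100

# Keep the statistic only where the bin has enough samples; other bins become NaN
# and would then be filled by the interpolation steps.
masked_stat = df_binned["nmad"].where(df_binned["count"] >= min_count)
```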

Return type:

Callable[[tuple[Union[Buffer, _SupportsArray[dtype[Any]], _NestedSequence[_SupportsArray[dtype[Any]]], bool, int, float, complex, str, bytes, _NestedSequence[Union[bool, int, float, complex, str, bytes]]], ...]], ndarray[Any, dtype[floating[Any]]]]

Returns:

N-dimensional interpolant function.

Examples:

# Using a dataframe created from scratch
>>> df = pd.DataFrame({"var1": [1, 2, 3, 1, 2, 3, 1, 2, 3], "var2": [1, 1, 1, 2, 2, 2, 3, 3, 3],
... "statistic": [1, 2, 3, 4, 5, 6, 7, 8, 9]})

# In 2 dimensions, the statistic array looks like this
# array([
#     [1, 2, 3],
#     [4, 5, 6],
#     [7, 8, 9]
# ])
>>> fun = interp_nd_binning(df, list_var_names=["var1", "var2"], statistic="statistic", min_count=None)

# Right on point.
>>> fun((2, 2))
array(5.)

# Interpolated linearly inside the 2D frame.
>>> fun((1.5, 1.5))
array(3.)

# Extrapolated linearly outside the 2D frame: nearest neighbour.
>>> fun((-1, 1))
array(1.)