mibiscreen.analysis
API reference
mibiscreen module for data analysis.
reduction
mibiscreen module for data analysis reducing sample data.
ordination
Routines for performing ordination statistics on sample data.
@author: Alraune Zech, Jorrit Bakker
cca(data_frame, independent_variables, dependent_variables, n_comp=2, verbose=False)
Function that performs Canonical Correspondence Analysis.
Function makes use of skbio.stats.ordination.CCA on the input data and gives the site scores and loadings.
Input
data_frame : pd.dataframe
Tabular data containing variables to be evaluated with standard
column names and rows of sample data.
independent_variables : list of strings
list with column names data to be the independent variables (=environment)
dependent_variables : list of strings
list with column names data to be the dependent variables (=species)
n_comp : int, default is 2
number of dimensions to return
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
Output
results : Dictionary
* method: name of ordination method (str)
* loadings_independent: loadings of independent variables (np.ndarray)
* loadings_dependent: loadings of dependent variables (np.ndarray)
* names_independent: names of independent variables (list of str)
* names_dependent: names of dependent variables (list of str)
* scores: scores (np.ndarray)
* sample_index: names of samples (list of str)
Source code in mibiscreen/analysis/reduction/ordination.py
constrained_ordination(data_frame, independent_variables, dependent_variables, method='cca', n_comp=2)
Function that performs constrained ordination.
Function makes use of skbio.stats.ordination on the input data and gives the scores and loadings.
Input
data_frame : pd.DataFrame
Tabular data containing variables to be evaluated with standard
column names and rows of sample data.
independent_variables : list of strings
list with column names data to be the independent variables (=environment)
dependent_variables : list of strings
list with column names data to be the dependent variables (=species)
method : string, default is cca
specification of ordination method of choice. Options 'cca' & 'rda'
n_comp : int, default is 2
number of dimensions to return
Output
results : Dictionary
* method: name of ordination method (str)
* loadings_independent: loadings of independent variables (np.ndarray)
* loadings_dependent: loadings of dependent variables (np.ndarray)
* names_independent: names of independent variables (list of str)
* names_dependent: names of dependent variables (list of str)
* scores: scores (np.ndarray)
* sample_index: names of samples (list of str)
Source code in mibiscreen/analysis/reduction/ordination.py
extract_variables(columns, variables, name_variables='variables')
Checking overlap of two given lists.
Function is used for checking whether a list of variables is present in the column names of a given dataframe (of quantities for data analysis).
Input
columns: list of strings
given extensive list (usually column names of a pd.DataFrame)
variables: list of strings
list of names to extract/check overlap with strings in list 'columns'
name_variables: str, default is 'variables'
name of type of variables given in list 'variables'
Output
intersection: list
list of strings present in both lists 'columns' and 'variables'
Source code in mibiscreen/analysis/reduction/ordination.py
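The overlap check described above can be sketched in plain Python. This is a minimal illustration of the idea, not the library's implementation: the helper name `intersect_names`, its second return value and the sample column names are hypothetical.

```python
def intersect_names(columns, variables):
    """Return names from `variables` found in `columns`, plus the missing ones.

    Hypothetical helper for illustration; the order of `variables` is kept.
    """
    available = set(columns)
    found = [name for name in variables if name in available]
    missing = [name for name in variables if name not in available]
    return found, missing


# Made-up column names of a sample table.
columns = ["sample_nr", "benzene", "toluene", "oxygen"]
found, missing = intersect_names(columns, ["benzene", "nitrate", "oxygen"])

print(found)    # ['benzene', 'oxygen']
print(missing)  # ['nitrate']
```

Reporting the missing names separately makes it easy to warn the user about variables that were requested but are not available in the data.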
pca(data_frame, independent_variables=False, dependent_variables=False, n_comp=2, verbose=False)
Function that performs Principal Component Analysis.
Makes use of routine sklearn.decomposition.PCA on the input data and gives the site scores and loadings.
Principal component analysis (PCA) is a linear dimensionality reduction technique with applications in exploratory data analysis, visualization and data preprocessing. The data is linearly transformed onto a new coordinate system such that the directions (principal components) capturing the largest variation in the data can be easily identified.
Input
data_frame : pd.dataframe
Tabular data containing variables to be evaluated with standard
column names and rows of sample data.
independent_variables : Boolean or list of strings; default False
list with column names to select from data_frame
being characterized as independent variables (= environment)
dependent_variables : Boolean or list of strings; default is False
list with column names to select from data_frame
being characterized as dependent variables (= species)
n_comp : int, default is 2
Number of components to report
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
Output
results : Dictionary
containing the scores and loadings of the PCA,
the percentage of the variation explained by the first principal components,
the correlation coefficient between the first two PCs,
names of columns (same length as loadings)
names of indices (same length as scores)
Source code in mibiscreen/analysis/reduction/ordination.py
rda(data_frame, independent_variables, dependent_variables, n_comp=2, verbose=False)
Function that performs Redundancy Analysis.
Function makes use of skbio.stats.ordination.RDA on the input data and gives the site scores and loadings.
Input
data_frame : pd.dataframe
Tabular data containing variables to be evaluated with standard
column names and rows of sample data.
independent_variables : list of strings
list with column names data to be the independent variables (=environment)
dependent_variables : list of strings
list with column names data to be the dependent variables (=species)
n_comp : int, default is 2
number of dimensions to return
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
Output
results : Dictionary
* method: name of ordination method (str)
* loadings_independent: loadings of independent variables (np.ndarray)
* loadings_dependent: loadings of dependent variables (np.ndarray)
* names_independent: names of independent variables (list of str)
* names_dependent: names of dependent variables (list of str)
* scores: scores (np.ndarray)
* sample_index: names of samples (list of str)
Source code in mibiscreen/analysis/reduction/ordination.py
stable_isotope_regression
Routines for performing linear regression on isotope data.
@author: Alraune Zech
Keeling_regression(concentration, delta_mix=None, relative_abundance=None, validate_indices=True, verbose=False, **kwargs)
Performing a linear regression linked to the Keeling plot.
A Keeling fit/plot is an approach to identify the isotopic composition of a contaminating source from measured concentrations and isotopic composition (delta) of a target species in the mix of the source and a pool.
It is based on the linear relationship of the given quantities (concentration) and delta-values (or alternatively the relative abundance x) which are measured over time or across a spatial interval according to
delta_mix = delta_source + m * 1/c_mix
where m is the slope relating the isotopic quantities of the pool (which mixes with the source) by m = (delta_pool - delta_source)*c_pool.
The analysis is based on a linear regression of the inverse concentration data against the delta (or x)-values. The parameter of interest, the delta (or relative_abundance, respectively) of the source quantity, is the intercept of the linear fit with the y-axis, or in other words, the constant term of the linear fit function.
A plot of the results with data and linear trendline can be generated with the method Keeling_plot() [in the module visualize].
Note that the approach is only applicable if (i) the isotopic composition of the unknown source is constant and (ii) the concentration and isotopic composition of the target compound are constant (over time or across space) in the absence of contamination from the unknown source.
Input
concentration : np.array, pd.dataframe
total molecular mass/molar concentration of target substance
at different locations (at a time) or at different times (at one location)
delta_mix : np.array, pd.dataframe (same length as concentration), default None
relative isotope ratio (delta-value) of target substance
relative_abundance : None or np.array, pd.dataframe (same length as concentration), default None
if not None it replaces delta_mix in the inverse estimation and plotting
relative abundance of target substance
validate_indices: boolean, default True
flag to run index validation (i.e. removal of nan and infinity values)
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
**kwargs : dict
keyword arguments dictionary, e.g. for passing forward keywords to
valid_indices()
Returns
results : dict
results of fitting, including:
* coefficients : array/list of length 2, where coefficients[0]
is the slope of the linear fit and coefficients[1] is the
intercept of the linear fit with the y-axis, reflecting delta
(or relative_abundance, respectively) of the source quantity
* delta_C: np.array with isotope used for fitting - all samples
where non-zero values are available for delta_C and delta_H
* delta_H: np.array with isotope used for fitting - all samples
where non-zero values are available for delta_C and delta_H
Source code in mibiscreen/analysis/reduction/stable_isotope_regression.py
Lambda_regression(delta_C, delta_H, validate_indices=True, verbose=False, **kwargs)
Performing linear regression to achieve Lambda value.
The Lambda value relates the δ13C versus δ2H signatures of a chemical compound. Relative changes in the ratio can indicate the occurrence of specific enzymatic degradation reactions.
The analysis is based on a linear regression of the hydrogen versus carbon isotope signatures. The parameter of interest, the Lambda value, is the slope of the linear trend line.
A plot of the results with data and linear trendline can be generated with the method Lambda_plot() [in the module visualize].
Input
delta_C : np.array, pd.series
relative isotope ratio (delta-value) of carbon of target molecule
delta_H : np.array, pd.series (same length as delta_C)
relative isotope ratio (delta-value) of hydrogen of target molecule
validate_indices: boolean, default True
flag to run index validation (i.e. removal of nan and infinity values)
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
**kwargs : dict
keyword arguments dictionary, e.g. for passing forward keywords to
valid_indices()
Returns
results : dict
results of fitting, including:
* coefficients : array/list of length 2, where coefficients[0]
is the slope of the linear fit, reflecting the Lambda value,
and coefficients[1] is the constant term of the linear function
* delta_C: np.array with isotope used for fitting - all samples
where non-zero values are available for delta_C and delta_H
* delta_H: np.array with isotope used for fitting - all samples
where non-zero values are available for delta_C and delta_H
Source code in mibiscreen/analysis/reduction/stable_isotope_regression.py
Rayleigh_fractionation(concentration, delta, validate_indices=True, verbose=False, **kwargs)
Performing Rayleigh fractionation analysis.
Rayleigh fractionation is a common application to characterize the removal of a substance from a finite pool using stable isotopes. It is based on the change in the isotopic composition of the pool due to different kinetics of the change in lighter and heavier isotopes.
We follow the most simple approach assuming that the substance removal follows first-order kinetics, where the rate coefficients for the lighter and heavier isotopes of the substance differ due to kinetic isotope fractionation effects. The isotopic composition of the remaining substance in the pool will change over time, leading to the so-called Rayleigh fractionation.
The analysis is based on a linear regression of the log-transformed concentration data against the delta-values. The parameter of interest, the kinetic fractionation factor (epsilon or alpha -1) of the removal process, is the slope of the linear trend line.
A plot of the results with data and linear trendline can be generated with the method Rayleigh_fractionation_plot() [in the module visualize].
Input
concentration : np.array, pd.dataframe
total molecular mass/molar concentration of target substance
at different locations (at a time) or at different times (at one location)
delta : np.array, pd.dataframe (same length as concentration)
relative isotope ratio (delta-value) of target substance
validate_indices: boolean, default True
flag to run index validation (i.e. removal of nan and infinity values)
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
**kwargs : dict
keyword arguments dictionary, e.g. for passing forward keywords to
valid_indices()
Returns
results : dict
results of fitting, including:
* coefficients : array/list of length 2, where coefficients[0]
is the slope of the linear fit, reflecting the kinetic
fractionation factor (epsilon or alpha -1) of the removal process
and coefficients[1] is the constant term of the linear function
* delta_C: np.array with isotope used for fitting - all samples
where non-zero values are available for delta_C and delta_H
* delta_H: np.array with isotope used for fitting - all samples
where non-zero values are available for delta_C and delta_H
Source code in mibiscreen/analysis/reduction/stable_isotope_regression.py
extract_isotope_data(df, molecule, name_13C='delta_13C', name_2H='delta_2H')
Extracts isotope data from standardised input-dataframe.
Parameters
df : pd.dataframe
numeric (observational) data
molecule : str
name of contaminant molecule to extract isotope data for
name_13C : str, default 'delta_13C' (standard name)
name of C13 isotope to extract data for
name_2H : str, default 'delta_2H' (standard name)
name of deuterium isotope to extract data for
Returns
C_data : np.array
numeric isotope data
H_data : np.array
numeric isotope data
Source code in mibiscreen/analysis/reduction/stable_isotope_regression.py
valid_indices(data1, data2, remove_nan=True, remove_infinity=True, remove_zero=False, **kwargs)
Identifies valid indices in two equally long arrays and compresses both.
Optional numerical values to remove from the arrays are: nan, infinity and zero values.
Parameters
data1 : np.array or pd.series
numeric data
data2 : np.array or pd.series (same len/shape as data1)
numeric data
remove_nan : boolean, default True
flag to remove nan-values
remove_infinity : boolean, default True
flag to remove infinity values
remove_zero : boolean, default False
flag to remove zero values
**kwargs : dict
keyword arguments dictionary
Returns
data1 : np.array or pd.series
numeric data of reduced length, containing only the data at valid indices
data2 : np.array or pd.series
numeric data of reduced length, containing only the data at valid indices
Source code in mibiscreen/analysis/reduction/stable_isotope_regression.py
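The pairwise compression can be sketched in plain Python as follows. This is an illustration of the idea on lists, not the library's array-based implementation, and the helper name `compress_valid` is hypothetical.

```python
import math


def compress_valid(data1, data2, remove_nan=True, remove_infinity=True,
                   remove_zero=False):
    """Keep only positions where both values pass the selected checks.

    Hypothetical helper mirroring the flags described above.
    """
    keep1, keep2 = [], []
    for a, b in zip(data1, data2):
        pair = (a, b)
        if remove_nan and any(math.isnan(v) for v in pair):
            continue
        if remove_infinity and any(math.isinf(v) for v in pair):
            continue
        if remove_zero and any(v == 0 for v in pair):
            continue
        keep1.append(a)
        keep2.append(b)
    return keep1, keep2


x = [1.0, float("nan"), 3.0, 4.0, 0.0]
y = [10.0, 20.0, float("inf"), 40.0, 50.0]

print(compress_valid(x, y))                    # ([1.0, 4.0, 0.0], [10.0, 40.0, 50.0])
print(compress_valid(x, y, remove_zero=True))  # ([1.0, 4.0], [10.0, 40.0])
```

A position is dropped from both sequences as soon as either value is invalid, so the two outputs always stay aligned and equally long.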
transformation
Routines for performing ordination statistics on sample data.
@author: Alraune Zech, Jorrit Bakker
filter_values(data_frame, replace_NaN='remove', drop_rows=[], inplace=False, verbose=False)
Filtering values of dataframes for ordination to assure all are numeric.
Ordination methods require all cells to be filled. This method checks the provided data frame if values are missing/NaN or not numeric and handles missing/NaN values accordingly.
It then removes select rows and mutates the cells containing NULL values based on the input parameters.
Input
data_frame : pd.dataframe
Tabular data containing variables to be evaluated with standard
column names and rows of sample data.
replace_NaN : string or float, default "remove"
Keyword specifying how to handle missing/NaN/non-numeric values, options:
- remove: remove rows with missing values
- zero: replace values with 0.0
- average: replace the missing values with the average of the variable
(using all other available samples)
- median: replace the missing values with the median of the variable
(using all other available samples)
- float-value: replace all empty cells with that numeric value
drop_rows : List, default [] (empty list)
List of rows that should be removed from dataframe.
inplace: bool, default False
If False, return a copy. Otherwise, do operation in place.
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
Output
data_filtered : pd.dataframe
Tabular data containing filtered data.
Source code in mibiscreen/analysis/reduction/transformation.py
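The replace_NaN options listed above can be illustrated for a single variable in plain Python. This sketch works on one list with `None` marking a missing value, whereas the library operates on whole data frames (where 'remove' drops entire rows); the helper name `fill_missing` is hypothetical.

```python
from statistics import mean, median


def fill_missing(values, replace_nan="remove"):
    """Handle missing entries (None) of one variable according to the keyword.

    Hypothetical single-column helper for illustration.
    """
    present = [v for v in values if v is not None]
    if replace_nan == "remove":
        return present
    if replace_nan == "zero":
        fill = 0.0
    elif replace_nan == "average":
        fill = mean(present)
    elif replace_nan == "median":
        fill = median(present)
    else:
        # Any other input is interpreted as the numeric fill value itself.
        fill = float(replace_nan)
    return [fill if v is None else v for v in values]


values = [1.0, None, 3.0, 8.0]
print(fill_missing(values, "remove"))   # [1.0, 3.0, 8.0]
print(fill_missing(values, "average"))  # [1.0, 4.0, 3.0, 8.0]
print(fill_missing(values, "median"))   # [1.0, 3.0, 3.0, 8.0]
print(fill_missing(values, 9.9))        # [1.0, 9.9, 3.0, 8.0]
```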
transform_values(data_frame, name_list='all', how='log_scale', log_scale_A=1, log_scale_B=1, inplace=False, verbose=False)
Extracting data from dataframe for specified variables.
data_frame: pandas.DataFrame
dataframe with the measurements
name_list: string or list of strings, default 'all'
list of quantities (column names) to perform transformation on
how: string, default 'log_scale'
Type of transformation:
* standardize
* log_scale
* center
log_scale_A : Integer or float, default 1
Log transformation parameter A: log10(Ax+B).
log_scale_B : Integer or float, default 1
Log transformation parameter B: log10(Ax+B).
inplace: bool, default False
If False, return a copy. Otherwise, do operation in place and return None.
verbose : Boolean, The default is False.
Set to True to get messages in the Console about the status of the run code.
data: pd.DataFrame
dataframe with the measurements
Raises:
None (yet).
Example:
To be added.
Source code in mibiscreen/analysis/reduction/transformation.py
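The three transformation types can be sketched for a single variable with the standard library. This is an illustration only: the function names are hypothetical, and using the population standard deviation in `standardize` is an assumption that may differ from the library's choice.

```python
import math
from statistics import mean, pstdev


def log_scale(values, a=1, b=1):
    """Apply log10(A*x + B) to every value."""
    return [math.log10(a * v + b) for v in values]


def center(values):
    """Subtract the mean so the variable has zero mean."""
    mu = mean(values)
    return [v - mu for v in values]


def standardize(values):
    """Center and divide by the (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


print(log_scale([0.0, 9.0, 99.0]))  # [0.0, 1.0, 2.0]
print(center([1.0, 2.0, 3.0]))      # [-1.0, 0.0, 1.0]
```

With the defaults A = 1 and B = 1, a concentration of zero maps to log10(1) = 0, which avoids taking the logarithm of zero.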
sample
mibiscreen module for data analysis performed on each sample.
concentrations
Routines for calculating total concentrations and counts for samples.
@author: Alraune Zech
total_concentration(data_frame, name_list='all', verbose=False, include=False, **kwargs)
Calculate total concentration of given list of quantities.
Input
data_frame: pd.DataFrame
Contaminant concentrations in [ug/l], i.e. microgram per liter
name_list: str or list, default is 'all'
either short name for group of quantities to use, such as:
- 'all' (all quantities given in data frame except settings)
- 'BTEX' (for benzene, toluene, ethylbenzene, xylene)
- 'BTEXIIN' (for benzene, toluene, ethylbenzene, xylene,
indene, indane and naphthaline)
or list of strings with names of quantities to use
verbose: Boolean
verbose flag (default False)
include: bool, default False
whether to include calculated values to DataFrame
Output
tot_conc: pd.Series
Total concentration of contaminants in [ug/l]
Source code in mibiscreen/analysis/sample/concentrations.py
total_count(data_frame, name_list='all', threshold=0.0, verbose=False, include=False, **kwargs)
Calculate total number of quantities with concentration exceeding threshold value.
Input
data_frame: pd.DataFrame
Contaminant concentrations in [ug/l], i.e. microgram per liter
name_list: str or list, default is 'all'
either short name for group of quantities to use, such as:
- 'all' (all quantities given in data frame except settings)
- 'BTEX' (for benzene, toluene, ethylbenzene, xylene)
- 'BTEXIIN' (for benzene, toluene, ethylbenzene, xylene,
indene, indane and naphthaline)
or list of strings with names of quantities to use
threshold: float, default 0
threshold concentration value in [ug/l] to test on exceedance
verbose: Boolean
verbose flag (default False)
include: bool, default False
whether to include calculated values to DataFrame
Output
tot_count: pd.Series
Total number of quantities with concentration exceeding threshold value
Source code in mibiscreen/analysis/sample/concentrations.py
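For a single sample, the total concentration and the count of quantities above a threshold reduce to a sum and a conditional count. The sketch below uses a plain dictionary instead of a data frame; the concentrations are made up, and treating "exceeding" as a strict inequality is an assumption.

```python
# Made-up concentrations of one sample in ug/l.
sample = {"benzene": 120.0, "toluene": 0.0, "ethylbenzene": 15.5, "xylene": 4.5}
names = ["benzene", "toluene", "ethylbenzene", "xylene"]


def total_of(sample, names):
    """Sum the concentrations of the selected quantities."""
    return sum(sample[name] for name in names)


def count_above(sample, names, threshold=0.0):
    """Count the selected quantities whose concentration exceeds the threshold."""
    return sum(1 for name in names if sample[name] > threshold)


print(total_of(sample, names))                     # 140.0
print(count_above(sample, names))                  # 3
print(count_above(sample, names, threshold=10.0))  # 2
```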
properties
Properties for Natural Attenuation Screening.
File containing name specifications of quantities and parameters measured in groundwater samples useful for biodegradation and bioremediation analysis.
@author: A. Zech
screening_NA
Routines for calculating natural attenuation potential.
@author: Alraune Zech
NA_traffic(data, inplace=False, verbose=False, **kwargs)
Function evaluating if natural attenuation (NA) is ongoing.
Function to calculate electron balance, based on electron availability calculated from concentrations of contaminant and electron acceptors.
Input
data: pd.DataFrame
Ratio of electron availability
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean
verbose flag (default False)
Output
traffic : pd.Series
Traffic light (decision) based on ratio of electron availability
Source code in mibiscreen/analysis/sample/screening_NA.py
available_NP(data, inplace=False, verbose=False, **kwargs)
Function calculating available nutrients.
Approximating the amount of hydrocarbons that can be degraded based on the amount of nutrients (nitrogen and phosphate) available.
Input
data: pd.DataFrame
nitrate, nitrite and phosphate concentrations in [mg/l]
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean
verbose flag (default False)
Output
NP_avail: pd.Series
The amount of nutrients for degrading contaminants
Source code in mibiscreen/analysis/sample/screening_NA.py
check_data(data)
Checking data on correct format.
Input
data: pd.DataFrame
concentration values of quantities
Output
cols: list
List of column names
Source code in mibiscreen/analysis/sample/screening_NA.py
electron_balance(data, inplace=False, verbose=False, **kwargs)
Decision if natural attenuation is taking place.
Function to calculate electron balance, based on electron availability calculated from concentrations of contaminant and electron acceptors
Input
data: pd.DataFrame
tabular data containing "total_reductors" and "total_oxidators"
-total amount of electrons available for reduction [mmol e-/l]
-total amount of electrons needed for oxidation [mmol e-/l]
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean
verbose flag (default False)
Output
e_bal : pd.Series
Ratio of electron availability: electrons available for reduction
divided by electrons needed for oxidation
Source code in mibiscreen/analysis/sample/screening_NA.py
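The electron balance is a per-sample ratio, which the following sketch computes on plain lists. The values are made up, and the helper name `electron_ratio` is hypothetical; it is not the library's data-frame based routine.

```python
def electron_ratio(total_reductors, total_oxidators):
    """Ratio of electrons available for reduction to electrons needed for oxidation.

    Both inputs are per-sample values in mmol e-/l.
    """
    return [r / o for r, o in zip(total_reductors, total_oxidators)]


# Two made-up samples.
total_reductors = [1.2, 0.4]
total_oxidators = [0.6, 0.8]

print(electron_ratio(total_reductors, total_oxidators))  # [2.0, 0.5]
```

A ratio above 1 means more electrons are available for reduction than are needed for oxidation of the contaminants, while a ratio below 1 indicates a shortage.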
oxidators(data, contaminant_group='BTEXIIN', nutrient=False, inplace=False, verbose=False, **kwargs)
Calculate the amount of electron oxidators [mmol e-/l].
Calculate the amount of electron oxidators in [mmol e-/l] based on concentrations of contaminants, stoichiometric ratios of reactions, contaminant properties (e.g. molecular masses in [mg/mmol])
alternatively: based on nitrogen and phosphate availability
Input
data: pd.DataFrame
Contaminant concentrations in [ug/l], i.e. microgram per liter
if nutrient is True, data also needs to contain concentrations
of Nitrate, Nitrite and Phosphate
contaminant_group: str
Short name for group of contaminants to use
default is 'BTEXIIN' (for benzene, toluene, ethylbenzene, xylene,
indene, indane and naphthaline)
nutrient: Boolean
flag to include oxidator availability based on nutrient supply
calls internally routine "available_NP()" with data
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean
verbose flag (default False)
Output
tot_oxi: pd.Series
Total amount of electrons oxidators in [mmol e-/l]
Source code in mibiscreen/analysis/sample/screening_NA.py
reductors(data, ea_group='ONS', inplace=False, verbose=False, **kwargs)
Calculate the amount of electron reductors [mmol e-/l].
making use of imported molecular mass values for quantities in [mg/mmol]
Input
data: pd.DataFrame
concentration values of electron acceptors in [mg/l]
ea_group: str
Short name for group of electron acceptors to use
default is 'ONS' (for oxygen, nitrate, and sulfate)
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean
verbose flag (default False)
Output
tot_reduct: pd.Series
Total amount of electrons needed for reduction in [mmol e-/l]
Source code in mibiscreen/analysis/sample/screening_NA.py
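The unit conversion behind this calculation can be sketched as: concentration [mg/l] divided by molar mass [mg/mmol], multiplied by the number of electrons accepted per molecule. The molar masses below are rounded, and the electron numbers follow the standard half-reactions (O2 to H2O: 4, nitrate to N2: 5, sulfate to sulfide: 8). Both are assumptions of this sketch and may differ from the values stored in the library.

```python
# Assumed properties for illustration: approximate molar mass in mg/mmol and
# electrons accepted per molecule (standard half-reaction stoichiometry).
ACCEPTORS = {
    "oxygen": {"molar_mass": 32.0, "electrons": 4},
    "nitrate": {"molar_mass": 62.0, "electrons": 5},
    "sulfate": {"molar_mass": 96.1, "electrons": 8},
}


def electron_capacity(sample_mg_per_l):
    """Total electron accepting capacity of one sample in mmol e-/l."""
    total = 0.0
    for name, props in ACCEPTORS.items():
        mmol_per_l = sample_mg_per_l.get(name, 0.0) / props["molar_mass"]
        total += mmol_per_l * props["electrons"]
    return total


# Made-up concentrations in mg/l.
sample = {"oxygen": 8.0, "nitrate": 12.4, "sulfate": 48.05}
print(round(electron_capacity(sample), 3))  # 6.0
```

In this example oxygen and nitrate each contribute about 1 mmol e-/l and sulfate about 4 mmol e-/l.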
screening_NA(data, ea_group='ONS', contaminant_group='BTEXIIN', nutrient=False, inplace=False, verbose=False, **kwargs)
Calculate the amount of electron reductors [mmol e-/l].
making use of imported molecular mass values for quantities in [mg/mmol]
Input
data: pd.DataFrame
Concentration values of
- electron acceptors in [mg/l]
- contaminants in [ug/l]
- nutrients (Nitrate, Nitrite and Phosphate) if nutrient is True
ea_group: str, default 'ONS'
Short name for group of electron acceptors to use
'ONS' stands for oxygen, nitrate, sulfate and ironII
contaminant_group: str, default 'BTEXIIN'
Short name for group of contaminants to use
'BTEXIIN' stands for benzene, toluene, ethylbenzene, xylene,
indene, indane and naphthaline
nutrient: Boolean, default False
flag to include oxidator availability based on nutrient supply
calls internally routine "available_NP()" with data
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean, default False
verbose flag
Output
na_data: pd.DataFrame
Tabular data with all quantities of NA screening listed per sample
Source code in mibiscreen/analysis/sample/screening_NA.py
thresholds_for_intervention(data, contaminant_group='BTEXIIN', inplace=False, verbose=False, **kwargs)
Function to evaluate intervention threshold exceedance.
Determines which contaminants exceed concentration thresholds set by
the Dutch government for intervention.
Input
data: pd.DataFrame
Contaminant concentrations in [ug/l], i.e. microgram per liter
contaminant_group: str
Short name for group of contaminants to use
default is 'BTEXIIN' (for benzene, toluene, ethylbenzene, xylene,
indene, indane and naphthaline)
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean, default False
verbose flag
Output
intervention: pd.DataFrame
DataFrame of similar format as input data with well specification and
three columns on intervention threshold exceedance analysis:
- traffic light if well requires intervention
- number of contaminants exceeding the intervention value
- list of contaminants above the threshold of intervention
Source code in mibiscreen/analysis/sample/screening_NA.py
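The three output quantities (traffic light, number and list of exceeding contaminants) can be sketched for one well as follows. The threshold numbers are illustrative placeholders and not the regulatory intervention values; the dictionary keys, the red/green encoding and the helper name `check_intervention` are hypothetical as well.

```python
# Illustrative placeholder limits in ug/l, NOT the regulatory values.
placeholder_limits = {"benzene": 30.0, "toluene": 1000.0, "naphthalene": 70.0}


def check_intervention(sample, limits):
    """Compare one well's concentrations [ug/l] against the given limits."""
    exceeding = [name for name, limit in limits.items()
                 if sample.get(name, 0.0) > limit]
    return {
        "traffic": "red" if exceeding else "green",
        "number": len(exceeding),
        "contaminants": exceeding,
    }


# Made-up concentrations for two wells.
well_1 = {"benzene": 45.0, "toluene": 12.0, "naphthalene": 95.0}
well_2 = {"benzene": 5.0, "toluene": 12.0, "naphthalene": 1.0}

print(check_intervention(well_1, placeholder_limits))
# {'traffic': 'red', 'number': 2, 'contaminants': ['benzene', 'naphthalene']}
print(check_intervention(well_2, placeholder_limits))
# {'traffic': 'green', 'number': 0, 'contaminants': []}
```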
total_contaminant_concentration(data, contaminant_group='BTEXIIN', inplace=False, verbose=False, **kwargs)
Function to calculate total concentration of contaminants.
Input
data: pd.DataFrame
Contaminant concentrations in [ug/l], i.e. microgram per liter
contaminant_group: str
Short name for group of contaminants to use
default is 'BTEXIIN' (for benzene, toluene, ethylbenzene, xylene,
indene, indane and naphthaline)
inplace: bool, default False
Whether to modify the DataFrame rather than creating a new one.
verbose: Boolean
verbose flag (default False)
Output
tot_conc: pd.Series
Total concentration of contaminants in [ug/l]