Functions for Positive Least Squares (PLS) fitting of respeciate profiles
rsp_pls_x builds PLS models for supplied profile(s) using
the nls function, the 'port' algorithm and a lower
limit of zero for all model outputs to enforce the positive fits. The
modeled profiles typically come from an external source, e.g. a
measurement campaign, and are fit as a linear additive series of reference
profiles, here typically from respeciate. This provides a measure of
source apportionment, on the assumption that the profiles in the
reference set are representative of the mix that makes up the modeled
sample. The pls_ functions work with rsp_pls_x
outputs, and are intended to be used when refining and analyzing
these PLS models. See also pls_plots for PLS model plots.
rsp_pls_x(x, m, power = 1, ...)
pls_report(pls)
pls_test(pls)
pls_fit_species(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
pls_refit_species(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
pls_rebuild(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
x: A respeciate object, a data.frame of
profiles in standard long form, intended for PLS modelling.
m: A respeciate object, a data.frame of
profiles also in standard long form, used as the set of candidate
source profiles when fitting x.
power: A numeric, an exponent applied to the
weightings when fitting the PLS model. This is applied in the form
weight^power, and increasing it increases the relative
weighting of the more heavily weighted measurements. Values in the
range 1 - 2.5 are sometimes helpful.
...: Additional arguments, typically ignored or passed on to
nls.
pls: A rsp_pls_x output, intended for use with
pls_ functions.
species: For pls_fit_species, a data.frame of
measurements of an additional species to be fitted to an existing
PLS model, or for pls_refit_species a character vector of the
names of species already included in the model to be refit. Both are
multiple-species wrappers for pls_rebuild, a general-purpose
PLS fitter that only handles a single species.
refit.profile: (for pls_fit_species, pls_refit_species
and pls_rebuild) logical. When fitting a new species (or
refitting an existing species), all other species in the reference
profiles are held 'as is' and the added species is fit to the source
contribution time-series of the previous PLS model. By default, the full PLS
model is then refit using the revised m source profile to generate
a PLS model based on the revised source profiles (i.e., m + new species
or m + refit species). However, this second step can be omitted using
refit.profile=FALSE if you want to use the supplied species
as an indicator rather than a standard member of the apportionment model.
as.marker: for pls_rebuild, pls_fit_species and
pls_refit_species, logical, default FALSE, when
fitting (or refitting) a species, treat it as a source marker.
drop.missing: for pls_rebuild, pls_fit_species and
pls_refit_species, logical, default FALSE, when
building or rebuilding a PLS model, discard cases where species
is missing.
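The effect of the power argument on weight^power can be sketched in base R. This is only an illustration: the base weights that rsp_pls_x uses internally are not described here, so the vector w and the helper relative_weight below are hypothetical.

```r
# Hypothetical base weights for three measurements (not taken from respeciate).
w <- c(1, 2, 4)

# Hypothetical helper: raise the weights to 'power' and rescale to sum to 1,
# so the relative influence of each measurement is easy to compare.
relative_weight <- function(w, power = 1) {
  wp <- w^power
  wp / sum(wp)
}

# With power = 1 the weights keep their original proportions.
rel_1 <- relative_weight(w, power = 1)

# With power = 2 the most heavily weighted measurement takes a larger share.
rel_2 <- relative_weight(w, power = 2)

print(rbind(power_1 = rel_1, power_2 = rel_2))
```

Under these assumptions, raising power shifts relative weight towards the measurements that were already weighted most heavily, which is the behaviour described for the power argument.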
rsp_pls_x returns a list of nls models, one per
profile/measurement set in x. The pls_ functions work with
these outputs. pls_report generates a data.frame of
model outputs, and is used by several of the other pls_
functions. pls_fit_species, pls_refit_species and
pls_fit_parent return the supplied rsp_pls_profile output,
updated on the basis of the pls_ function action.
pls_plots (documented separately) produce various plots
commonly used in source apportionment studies.
This implementation of PLS applies the following modeling constraints:
1. It generates a model of x that is a positively constrained linear
combination of the profiles in m, so outputs can only be
zero or more. Although the model is generated using nls,
which fits Nonlinear Least Squares (NLS) models, the fitting term applied
in this case is linear.
2. The model is fit in the form:
\(X_{i,j} = \sum\limits_{k=1}^{K}{N_{i,k} \cdot M_{k,j}} + e_{i,j}\)
Where X is the data set of measurements, input x in rsp_pls_x,
M (m) is the data set of reference profiles, and N is the data set of
source contributions, the source apportionment solution, to be solved by
minimising e, the error terms.
3. The number of species in x must be more than the number of
profiles in m to reduce the likelihood of over-fitting.
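The constraints above can be illustrated with a minimal base R sketch that fits one synthetic measured profile as a non-negative linear combination of two reference profiles, using nls with the 'port' algorithm and lower bounds of zero. It is not the rsp_pls_x implementation: the profiles m1 and m2, the error vector and the parameter names n1 and n2 are all made up for illustration, and no weighting is applied.

```r
# Two hypothetical reference profiles (M) over six species.
m1 <- c(0.40, 0.30, 0.15, 0.10, 0.05, 0.00)
m2 <- c(0.05, 0.10, 0.10, 0.25, 0.30, 0.20)

# One synthetic measured profile (X): 2 * m1 + 3 * m2 plus a small fixed error.
# Six species and two profiles, so species outnumber profiles (constraint 3).
e <- c(0.01, -0.01, 0.005, -0.005, 0.01, -0.01)
d <- data.frame(x = 2 * m1 + 3 * m2 + e, m1 = m1, m2 = m2)

# Linear model term fitted with nls(); the 'port' algorithm is needed for
# bounds, and lower = 0 keeps the contributions n1 and n2 non-negative.
fit <- nls(
  x ~ n1 * m1 + n2 * m2,
  data = d,
  start = list(n1 = 1, n2 = 1),
  algorithm = "port",
  lower = c(0, 0)
)

# Estimated source contributions (N) and the residual error terms (e).
n <- coef(fit)
res <- residuals(fit)
print(n)
```

In this sketch the estimated contributions should land close to the values used to build the synthetic profile (2 and 3), and neither can fall below zero because of the lower bounds.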