Functions for Positive Least Squares (PLS) fitting of respeciate profiles
rsp_pls_x builds PLS models for supplied profile(s) using the nls
function, the 'port' algorithm and a lower limit of zero for all model
outputs to enforce the positive fits. The modeled profiles are typically
from an external source, e.g. a measurement campaign, and are fit as a
linear additive series of reference profiles, here typically from
respeciate, to provide a measure of source apportionment based on the
assumption that the profiles in the reference set are representative of
the mix that makes up the modeled sample. The pls_ functions work with
rsp_pls_x outputs, and are intended to be used when refining and
analyzing these PLS models. See also pls_plots for PLS model plots.
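The fitting approach described above can be sketched with base R alone. This is a minimal illustration, not the rsp_pls_x source: the profiles m1 and m2, the sample x and the coefficient names n1 and n2 are all hypothetical, and only nls, its 'port' algorithm and its lower argument are taken from the description.

```r
# Hypothetical reference profiles: 6 species, 2 sources
set.seed(1)
m1 <- c(0.40, 0.30, 0.15, 0.10, 0.04, 0.01)
m2 <- c(0.05, 0.10, 0.20, 0.25, 0.25, 0.15)

# Synthetic sample: 3 parts source 1 plus 1 part source 2, with small noise
x <- 3 * m1 + 1 * m2 + rnorm(6, sd = 0.01)
d <- data.frame(x = x, m1 = m1, m2 = m2)

# Linear additive model, fit with nls; lower = 0 keeps contributions positive
fit <- nls(x ~ n1 * m1 + n2 * m2,
           data = d,
           start = list(n1 = 1, n2 = 1),
           lower = c(0, 0),
           algorithm = "port")

coef(fit)  # estimated source contributions, expected near 3 and 1
```

The lower bound is what makes this a positive fit: without it, noisy data could return a negative contribution for a source that is absent from the sample.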
rsp_pls_x(x, m, power = 1, ...)
pls_report(pls)
pls_test(pls)
pls_fit_species(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
pls_refit_species(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
pls_rebuild(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
A respeciate object, a data.frame of profiles in standard long form,
intended for PLS modelling.
A respeciate object, a data.frame of profiles also in standard long form,
used as the set of candidate source profiles when fitting x.
A numeric, an additional factor applied to the weightings when fitting
the PLS model. This is applied in the form weight^power, and increasing it
increases the relative weighting of the more heavily weighted
measurements. Values in the range 1 - 2.5 are sometimes helpful.
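The effect of the weight^power form can be seen with a little arithmetic. The weights below are hypothetical (how rsp_pls_x derives its weights is not stated here), and rel is a helper defined only for this illustration.

```r
# Hypothetical weights for three measurements
w <- c(1, 2, 4)

# Share of the total weighting each measurement receives for a given power
rel <- function(w, power) w^power / sum(w^power)

round(rel(w, 1), 3)  # 0.143 0.286 0.571
round(rel(w, 2), 3)  # 0.048 0.190 0.762
```

Raising power from 1 to 2 moves the share of the most heavily weighted measurement from about 57% to about 76%.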
Additional arguments, typically ignored or passed on to nls.
An rsp_pls_x output, intended for use with pls_ functions.
For pls_fit_species, a data.frame of measurements of an additional
species to be fitted to an existing PLS model, or for pls_refit_species
a character vector of the names of species already included in the model
to be refit. Both are multiple-species wrappers for pls_rebuild, a
general-purpose PLS fitter that only handles a single species.
(for pls_fit_species, pls_refit_species and pls_rebuild) logical. When
fitting a new species (or refitting an existing species), all other
species in the reference profiles are held 'as is' and the added species
is fit to the source contribution time-series of the previous PLS model.
By default, the full PLS model is then refit using the revised m source
profile to generate a PLS model based on the revised source profiles
(i.e., m + new species or m + refit species). However, this second step
can be omitted using refit.profile=FALSE if you want to use the supplied
species as an indicator rather than a standard member of the
apportionment model.
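The first step described above, fitting a single species to the source contributions of an earlier model, might look something like the following in base R. This is a sketch of the general idea under stated assumptions, not the pls_rebuild internals: the contribution series n1 and n2, the species series s and the coefficients b1 and b2 are synthetic and hypothetical.

```r
set.seed(2)

# Hypothetical source contribution time-series from an earlier PLS fit:
# 20 samples, 2 sources
n1 <- runif(20, 0, 5)
n2 <- runif(20, 0, 5)

# Synthetic measurements of a new species, emitted mostly by source 1
s <- 0.08 * n1 + 0.02 * n2 + rnorm(20, sd = 0.005)
d <- data.frame(s = s, n1 = n1, n2 = n2)

# Contributions are held fixed; only the species' loading in each
# source profile (b1, b2) is estimated, constrained to be >= 0
fit <- nls(s ~ b1 * n1 + b2 * n2,
           data = d,
           start = list(b1 = 0.01, b2 = 0.01),
           lower = c(0, 0),
           algorithm = "port")

coef(fit)  # estimated loadings, expected near 0.08 and 0.02
```

In this sketch the estimated loadings would be the new species' entries in the revised source profiles; refitting the full model with those revised profiles corresponds to the optional second step.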
for pls_rebuild, pls_fit_species and pls_refit_species, logical, default
FALSE, when fitting (or refitting) a species, treat it as a source marker.
for pls_rebuild, pls_fit_species and pls_refit_species, logical, default
FALSE, when building or rebuilding a PLS model, discard cases where
species is missing.
rsp_pls_x returns a list of nls models, one per profile/measurement set
in x. The pls_ functions work with these outputs. pls_report generates a
data.frame of model outputs, and is used by several of the other pls_
functions. pls_fit_species, pls_refit_species and pls_fit_parent return
the supplied rsp_pls_profile output, updated on the basis of the pls_
function action.
pls_plots (documented separately) produce various plots commonly used in
source apportionment studies.
This implementation of PLS applies the following modeling constraints:
1. It generates a model of x that is a positively constrained linear
product of the profiles in m, so outputs can only be zero or more.
Although the model is generated using nls, which is a Nonlinear Least
Squares (NLS) model, the fitting term applied in this case is linear.
2. The model is fit in the form:
\(X_{i,j} = \sum\limits_{k=1}^{K}{N_{i,k} * M_{k,j}} + e_{i,j}\)
Where X is the data set of measurements, input x in rsp_pls_x, M (m) is
the data set of reference profiles, and N is the data set of source
contributions, the source apportionment solution, to be solved by
minimising e, the error terms.
3. The number of species in x must be more than the number of profiles
in m to reduce the likelihood of over-fitting.
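The three constraints above can be put together in a small matrix sketch. It is illustrative only: the sizes I, J and K, the matrices M, N and X, and the per-sample loop are assumptions made for this example, with X built from known contributions so the recovered values can be compared with the truth.

```r
set.seed(3)
K <- 2; J <- 6; I <- 4   # sources, species, samples (hypothetical sizes)

# M: K x J reference profiles; N: I x K true source contributions
M <- rbind(c(0.40, 0.30, 0.15, 0.10, 0.04, 0.01),
           c(0.05, 0.10, 0.20, 0.25, 0.25, 0.15))
N <- matrix(runif(I * K, 0.5, 5), nrow = I)

# X = N M + e: I x J synthetic measurements with small error terms
X <- N %*% M + matrix(rnorm(I * J, sd = 0.01), nrow = I)

# Constraint 3: more species than profiles
stopifnot(ncol(X) > nrow(M))

# One positively constrained fit per sample (row of X)
N_hat <- t(sapply(seq_len(I), function(i) {
  d <- data.frame(x = X[i, ], m1 = M[1, ], m2 = M[2, ])
  coef(nls(x ~ n1 * m1 + n2 * m2, data = d,
           start = list(n1 = 1, n2 = 1),
           lower = c(0, 0), algorithm = "port"))
}))

max(abs(N_hat - N))  # largest error in the recovered contributions
```

Each row of N_hat is the source apportionment for one sample, matching the one-model-per-measurement-set structure of the rsp_pls_x output.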