Functions for Positive Least Squares (PLS) fitting of (re)SPECIATE profiles
rsp_pls_profile builds PLS models for supplied profile(s) using the nls
function, the 'port' algorithm and a lower
limit of zero for all model outputs to enforce the positive fits. The
modeled profiles are typically from an external source, e.g. a
measurement campaign, and are fit as a linear additive series of reference
profiles, here typically from (re)SPECIATE, to provide a measure of
source apportionment based on the assumption that the profiles in the
reference set are representative of the mix that makes up the modeled
sample. The pls_ functions work with rsp_pls_profile
outputs, and are intended to be used when refining and analyzing
these PLS models. See also pls_plots for PLS model plots.
rsp_pls_profile(rsp, ref, power = 1, ...)
pls_report(pls)
pls_test(pls)
pls_fit_species(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
pls_refit_species(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
pls_rebuild(
pls,
species,
power = 1,
refit.profile = TRUE,
as.marker = FALSE,
drop.missing = FALSE,
...
)
A respeciate object, a data.frame of
profiles in standard long form, intended for PLS modelling.
A respeciate object, a data.frame of
profiles also in standard long form, used as the set of candidate
source profiles when fitting rsp.
A numeric, an additional factor applied to the
weightings when fitting the PLS model. This is applied in the form
weight^power, so increasing it increases the relative
weighting of the more heavily weighted measurements. Values in the
range 1 - 2.5 are sometimes helpful.
additional arguments, typically ignored or passed on to nls.
A rsp_pls_profile output, intended for use with pls_ functions.
for pls_fit_species, a data.frame of
measurements of an additional species to be fitted to an existing
PLS model, or for pls_refit_species a character vector of the
names of species already included in the model to be refit. Both are
multiple-species wrappers for pls_rebuild, a general-purpose
PLS fitter that only handles single species.
(for pls_fit_species, pls_refit_species and pls_rebuild) logical.
When fitting a new species (or refitting an existing species), all
other species in the reference profiles are held 'as is' and the added
species is fit to the source contribution time-series of the previous PLS
model. By default, the full PLS model is then refit using the revised ref
source profile to generate a PLS model based on the revised source
profiles (i.e., ref + new species or ref + refit species). However, this
second step can be omitted using refit.profile=FALSE if you want to use
the supplied species as an indicator rather than a standard member of the
apportionment model.
for pls_rebuild, pls_fit_species and pls_refit_species, logical,
default FALSE, when fitting (or refitting) a species, treat it as a
source marker.
for pls_rebuild, pls_fit_species and pls_refit_species, logical,
default FALSE, when building or rebuilding a PLS model, discard cases
where species is missing.
rsp_pls_profile returns a list of nls models, one per
profile/measurement set in rsp. The pls_ functions work with
these outputs. pls_report generates a data.frame of
model outputs, and is used by several of the other pls_
functions. pls_fit_species, pls_refit_species and
pls_fit_parent return the supplied rsp_pls_profile output,
updated on the basis of the pls_ function action.
pls_plots (documented separately) produce various plots
commonly used in source apportionment studies.
This implementation of PLS applies the following modeling constraints:
1. It generates a model of rsp that is a positively constrained linear
product of the profiles in ref, so outputs can only be
zero or more. Although the model is generated using nls,
which is a Nonlinear Least Squares (NLS) model, the fitting term applied
in this case is linear.
2. The model is fit in the form:
\(X_{i,j} = \sum\limits_{k=1}^{K}{N_{i,k} * M_{k,j}} + e_{i,j}\)
Where X is the data set of measurements, rsp, M is the data set of
reference profiles, ref, N is the data set of source contributions,
the source apportionment solution, to be solved by minimising e, the error terms.
3. The number of species in rsp must be more than the number of
profiles in ref to reduce the likelihood of over-fitting.
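The constraints above can be illustrated with a minimal sketch that calls nls directly. This is not the rsp_pls_profile implementation: the two reference profiles (m1, m2), the simulated sample (x), the unit weights (w) and the coefficient names (n1, n2) are all hypothetical, chosen only to show a linear term fitted with the 'port' algorithm, a lower limit of zero, and weights applied as weight^power.

```r
# Minimal sketch of a positively constrained linear fit with nls().
# All data and names here are hypothetical, not taken from respeciate.
set.seed(1)

# two hypothetical reference profiles (M), six species each,
# so species (6) outnumber profiles (2), as in constraint 3
m1 <- c(0.40, 0.25, 0.15, 0.10, 0.07, 0.03)
m2 <- c(0.05, 0.10, 0.15, 0.20, 0.20, 0.30)

# simulated sample (X): 2 parts m1 plus 3 parts m2, with small noise (e)
x <- 2 * m1 + 3 * m2 + rnorm(6, sd = 0.005)
d <- data.frame(x = x, m1 = m1, m2 = m2)

# assumed unit weights, raised to 'power' in the form weight^power
power <- 1
d$w <- rep(1, nrow(d))^power

# linear model term, fitted with 'port' and a lower limit of zero
mod <- nls(x ~ n1 * m1 + n2 * m2, data = d,
           start = list(n1 = 1, n2 = 1),
           algorithm = "port", lower = c(0, 0),
           weights = w)

# estimated source contributions (N); neither can be negative
coef(mod)
```

The lower bounds are only available with algorithm = "port", which is why the positive constraint and the algorithm choice go together in this kind of fit.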