An Introduction to iNEXT.beta3D via Examples

The Latest Update in Nov. 2024: In earlier versions, diversity decomposition (alpha, beta, gamma, and dissimilarity) was performed only for all assemblages of datasets. In the updated version, we have added a logical argument “by_pair” in the main function “iNEXTbeta3D” to specify whether diversity decomposition will be performed for pairs of assemblages or not. If “by_pair = TRUE”, alpha/beta/gamma diversity or dissimilarity will be computed for all pairs of assemblages in the input data; if “by_pair = FALSE”, alpha/beta/gamma diversity or dissimilarity will be computed for K assemblages (i.e., K can be greater than two) when data for K assemblages are provided in the input data. Default is “by_pair = FALSE”.

The package iNEXT.beta3D (iNterpolation and EXTrapolation with beta diversity for three dimensions of biodiversity) is a sequel to iNEXT. The three dimensions (3D) of biodiversity include taxonomic diversity (TD), phylogenetic diversity (PD) and functional diversity (FD). This document provides an introduction demonstrating how to run iNEXT.beta3D. An online version, iNEXT.beta3D Online, is also available for users without an R background.

A unified framework based on Hill numbers and their generalizations is adopted to quantify TD, PD and FD. TD quantifies the effective number of species, mean-PD (PD divided by tree depth) quantifies the effective number of lineages, and FD quantifies the effective number of virtual functional groups (or functional “species”). Thus, TD, mean-PD, and FD are all in the same units of species/lineage equivalents and can be meaningfully compared; see Chao et al. (2021) for a review of the unified framework.
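As a concrete illustration of the "effective number of species" idea, the following minimal base-R sketch computes Hill numbers of orders q = 0, 1 and 2 from a vector of species abundances. The function name hill_number and the toy abundances are hypothetical and are not part of iNEXT.beta3D; the sketch returns observed (unstandardized) values only.

```r
# Hill number (effective number of species) of order q for an abundance vector.
hill_number <- function(x, q) {
  p <- x[x > 0] / sum(x)            # relative abundances of detected species
  if (abs(q - 1) < 1e-8) {
    exp(-sum(p * log(p)))           # q = 1: exponential of Shannon entropy
  } else {
    sum(p^q)^(1 / (1 - q))          # q = 0: richness; q = 2: inverse Simpson
  }
}

even_assemblage   <- c(10, 10, 10, 10)   # hypothetical: four equally common species
uneven_assemblage <- c(37, 1, 1, 1)      # hypothetical: one dominant species

sapply(c(0, 1, 2), function(q) hill_number(even_assemblage, q))
sapply(c(0, 1, 2), function(q) hill_number(uneven_assemblage, q))
```

For the perfectly even assemblage all three orders equal the species count (4), whereas for the uneven assemblage the values decrease as q increases because higher orders give more weight to abundant species.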

For each of the three dimensions, iNEXT.beta3D focuses on the multiplicative diversity decomposition (alpha, beta and gamma) of orders q = 0, 1 and 2 based on sampling data. Beta diversity quantifies the extent of among-assemblage differentiation, or the changes in species/lineages/functional-groups composition and abundance among assemblages. iNEXT.beta3D features standardized 3D estimates with a common sample size (for alpha and gamma diversity) or sample coverage (for alpha, beta and gamma diversity). iNEXT.beta3D also features coverage-based standardized estimates of four classes of dissimilarity measures.
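The multiplicative decomposition can be sketched in base R for observed data as follows. This is only an illustration under our reading of the Hill-number framework (gamma from the pooled assemblage, alpha from the joint assemblage divided by the number of assemblages N, and beta = gamma/alpha); the helper names hill and decompose and the toy matrices are hypothetical, and no rarefaction or extrapolation is performed, so the values are not the standardized estimates returned by iNEXTbeta3D.

```r
hill <- function(p, q) {
  p <- p[p > 0]
  if (abs(q - 1) < 1e-8) exp(-sum(p * log(p))) else sum(p^q)^(1 / (1 - q))
}

# Observed gamma, alpha and beta for a species-by-assemblage abundance matrix.
decompose <- function(mat, q) {
  N <- ncol(mat)
  z <- mat / sum(mat)                   # each cell as a share of the grand total
  gamma <- hill(rowSums(z), q)          # pooled assemblage
  alpha <- hill(as.vector(z), q) / N    # joint assemblage, on a per-assemblage basis
  c(gamma = gamma, alpha = alpha, beta = gamma / alpha)
}

# Hypothetical extremes: no shared species versus identical composition.
distinct  <- cbind(Edge = c(5, 3, 0, 0), Interior = c(0, 0, 4, 4))
identical_comp <- cbind(Edge = c(5, 3), Interior = c(5, 3))

sapply(0:2, function(q) decompose(distinct, q))
sapply(0:2, function(q) decompose(identical_comp, q))
```

In these two equal-sized toy cases beta attains its bounds: it equals N = 2 when the assemblages share no species and 1 when they are identical, for every order q.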

Based on the rarefaction and extrapolation (R/E) method for Hill numbers (TD) of orders q = 0, 1 and 2, Chao et al. (2023b) developed the pertinent R/E theory for taxonomic beta diversity with applications to real-world spatial, temporal and spatio-temporal data. An application to Gentry’s global forest data along with a concise description of the theory is provided in Chao et al. (2023a). The extension to phylogenetic and functional beta diversity is generally parallel.

The iNEXT.beta3D package features two types of R/E sampling curves:

Sample-size-based (or size-based) R/E sampling curves: This type of sampling curve plots standardized 3D gamma and alpha diversity with respect to sample size. Note that the size-based beta diversity is not a statistically valid measure (Chao et al. 2023b) and thus the corresponding sampling curve is not provided.
Sample-coverage-based (or coverage-based) R/E sampling curves: This type of sampling curve plots standardized 3D gamma, alpha, and beta diversity as well as four classes of dissimilarity measures with respect to sample coverage (an objective measure of sample completeness).
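To make the notion of sample coverage more tangible, here is a minimal base-R sketch of a commonly used coverage estimator for an abundance reference sample, based on the numbers of singletons (f1) and doubletons (f2). The function name sample_coverage and the toy counts are hypothetical; the sketch assumes a sample size n > 1, and the package's own implementation may treat edge cases differently.

```r
# Estimated sample coverage of an abundance reference sample.
sample_coverage <- function(x) {
  x  <- x[x > 0]
  n  <- sum(x)          # reference sample size (assumed > 1)
  f1 <- sum(x == 1)     # number of singletons
  f2 <- sum(x == 2)     # number of doubletons
  if (f1 == 0) return(1)
  1 - (f1 / n) * ((n - 1) * f1 / ((n - 1) * f1 + 2 * f2))
}

toy_counts <- c(5, 3, 2, 1, 1)   # hypothetical abundances; n = 12, f1 = 2, f2 = 1
sample_coverage(toy_counts)
```

Intuitively, many singletons relative to the sample size signal that a sizeable fraction of individuals in the assemblage belongs to species not yet detected, which lowers the coverage estimate.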

Sufficient data are needed to run iNEXT.beta3D. If your data comprise only a few species and their abundances/phylogenies/traits, it is probable that the data lack sufficient information to run iNEXT.beta3D.

HOW TO CITE iNEXT.beta3D

If you publish your work based on results from iNEXT.beta3D, you should make reference to at least one of the following methodology papers (Chao et al. 2023a, b) and also cite the iNEXT.beta3D package:

Chao, A., Chiu, C.-H., Hu, K.-H., and Zeleny, D. (2023a). Revisiting Alwyn H. Gentry’s forest transect data: a statistical sampling-model-based approach. Japanese Journal of Statistics and Data Science, 6, 861-884. (https://doi.org/10.1007/s42081-023-00214-1)
Chao, A., Thorn, S., Chiu, C.-H., Moyes, F., Hu, K.-H., Chazdon, R. L., Wu, J., Magnago, L. F. S., Dornelas, M., Zeleny, D., Colwell, R. K., and Magurran, A. E. (2023b). Rarefaction and extrapolation with beta diversity under a framework of Hill numbers: the iNEXT.beta3D standardization. Ecological Monographs, e1588. (https://doi.org/10.1002/ecm.1588)
Chao, A. and Hu, K.-H. (2023). The iNEXT.beta3D package: interpolation and extrapolation with beta diversity for three dimensions of biodiversity. R package available from CRAN.

SOFTWARE NEEDED TO RUN iNEXT.beta3D IN R

Required: R
Suggested: RStudio IDE

HOW TO RUN iNEXT.beta3D:

The iNEXT.beta3D package is available from CRAN; the latest version can also be downloaded from Anne Chao’s GitHub (iNEXT.beta3D_github). Both options are shown in the following commands. For a first-time installation, the visualization package ggplot2 and the related package iNEXT.3D (both from CRAN) must also be installed and loaded.

## install iNEXT.beta3D package from CRAN
install.packages("iNEXT.beta3D")

## install the latest version from github
install.packages('devtools')
library(devtools)
install_github('AnneChao/iNEXT.beta3D')

## import packages
library(iNEXT.beta3D)

There are three main functions in this package:

iNEXTbeta3D: computes standardized 3D estimates with a common sample size (for alpha and gamma diversity) or sample coverage (for alpha, beta and gamma diversity) for default sample sizes or coverage values. This function also computes coverage-based standardized 3D estimates of four classes of dissimilarity measures for default coverage values. In addition, this function also computes standardized 3D estimates with a particular vector of user-specified sample sizes or coverage values.
ggiNEXTbeta3D: Visualizes the output from the function iNEXTbeta3D.
DataInfobeta3D: Provides basic data information for (1) the reference sample in each assemblage, (2) the gamma reference sample in the pooled assemblage, and (3) the alpha reference sample in the joint assemblage.

DATA INPUT FORMAT

To assess beta diversity among assemblages, information on shared/unique species and their abundances is required. Thus, species identity (or any unique identification code) and assemblage affiliation must be provided in the data. In any input dataset, set the row names to be species names (or identification codes) and the column names to be assemblage names. Two types of species abundance/incidence data are supported:

Individual-based abundance data (datatype = "abundance"): Input data for a single dataset with N assemblages consist of a species-by-assemblage abundance matrix/data.frame. Users can input several datasets, which may represent data collected from various localities, regions, plots, time periods, etc. Input data for multiple datasets then consist of a list of matrices; each matrix represents a species-by-assemblage abundance matrix for one of the datasets. Different datasets can have different numbers of assemblages. iNEXTbeta3D computes beta diversity and dissimilarity among assemblages within each dataset.
Sampling-unit-based incidence raw data (datatype = "incidence_raw"): Input data for a dataset with N assemblages consist of a list of matrices/data.frames, with each matrix representing a species-by-sampling-unit incidence raw matrix for one of the N assemblages; each element in the incidence raw matrix is 1 for a detection, and 0 for a non-detection. Users can input several datasets. Input data then consist of multiple lists with each list comprising a list of species-by-sampling-unit incidence matrices; see an example below. The number of sampling units can vary with datasets (but within a dataset, the number of sampling units in each assemblage must be the same). iNEXTbeta3D computes beta diversity and dissimilarity among assemblages within each dataset based on incidence-based frequency counts obtained from all sampling units.
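The two input layouts described above can be sketched with toy data in base R. All species, assemblage, and dataset names below are hypothetical and chosen only to show the required nesting and the row/column naming.

```r
# (1) datatype = "abundance": one species-by-assemblage matrix per dataset.
abun_one_dataset <- matrix(c(5, 0,
                             2, 3,
                             0, 1),
                           ncol = 2, byrow = TRUE,
                           dimnames = list(c("Sp_A", "Sp_B", "Sp_C"),
                                           c("Edge", "Interior")))
# Several datasets: a named list of such matrices.
abun_multi <- list(Site_1 = abun_one_dataset, Site_2 = abun_one_dataset)

# (2) datatype = "incidence_raw": per dataset, a list holding one
#     species-by-sampling-unit 0/1 matrix for each assemblage.
make_inc <- function(values) {
  matrix(values, nrow = 3, byrow = TRUE,
         dimnames = list(c("Sp_A", "Sp_B", "Sp_C"),
                         paste0("Subplot_", 1:4)))
}
inc_one_dataset <- list(
  Year_1 = make_inc(c(1, 0, 0, 1,
                      0, 1, 0, 0,
                      0, 0, 0, 1)),
  Year_2 = make_inc(c(1, 1, 0, 1,
                      0, 0, 0, 0,
                      1, 0, 0, 1))
)
# Several datasets: a list of such lists.
inc_multi <- list(Forest_1 = inc_one_dataset)

str(abun_multi, max.level = 1)
str(inc_multi, max.level = 2)
```

Within the incidence dataset both assemblages have the same number of sampling units (four subplots), as required above.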

Species abundance data format

We use the tree species abundance data collected from three rainforest fragments/localities in Brazil to assess beta diversity between Edge and Interior assemblages/habitats within each fragment; see Chao et al. (2023b) for analysis details. The data (named "Brazil_rainforests") consist of a list of three matrices (for three fragments named “Marim”, “Rebio2”, and “Rochedo”, respectively); each matrix represents a species-by-assemblage abundance matrix, and there are two assemblages (“Edge” and “Interior”) in each fragment. The demo data are slightly different from those analyzed in Chao et al. (2023b) because seven species are removed from the original pooled data due to lack of phylogenetic information. Run the following code to view the data: (Here we only show the first 15 rows for each matrix.)

data(Brazil_rainforests)
Brazil_rainforests

#> $Marim
#>                             Edge Interior
#> Acosmium_lentiscifolium        1        0
#> Actinostemon_estrellensis      0        0
#> Albizia_polycephala            0        0
#> Allophylus_petiolulatus        5        0
#> Alseis_involuta                2        0
#> Amaioua_intermedia             0        0
#> Ampelocera_glabra              1        0
#> Anaxagorea_silvatica           0        0
#> Andira_legalis                 0        1
#> Andira_ormosioides             0        1
#> Annona_dolabripetala           0        0
#> Apuleia_leiocarpa              1        0
#> Aspidosperma_cylindrocarpon    0        0
#> Aspidosperma_illustre          0        3
#> Aspidosperma_parvifolium       0        0
#> 
#> $Rebio2
#>                             Edge Interior
#> Acosmium_lentiscifolium        0        0
#> Actinostemon_estrellensis      0        0
#> Albizia_polycephala            1        0
#> Allophylus_petiolulatus        3        3
#> Alseis_involuta                1        0
#> Amaioua_intermedia             0        1
#> Ampelocera_glabra              0        3
#> Anaxagorea_silvatica           0        6
#> Andira_legalis                 0        0
#> Andira_ormosioides             0        0
#> Annona_dolabripetala           1        0
#> Apuleia_leiocarpa              0        0
#> Aspidosperma_cylindrocarpon    2        0
#> Aspidosperma_illustre          0        0
#> Aspidosperma_parvifolium       0        0
#> 
#> $Rochedo
#>                             Edge Interior
#> Acosmium_lentiscifolium        0        1
#> Actinostemon_estrellensis     23       27
#> Albizia_polycephala            3        0
#> Allophylus_petiolulatus        5        0
#> Alseis_involuta                1        0
#> Amaioua_intermedia             0        0
#> Ampelocera_glabra              0        0
#> Anaxagorea_silvatica           0        0
#> Andira_legalis                 0        0
#> Andira_ormosioides             0        0
#> Annona_dolabripetala           0        0
#> Apuleia_leiocarpa              0        2
#> Aspidosperma_cylindrocarpon    0        0
#> Aspidosperma_illustre          0        0
#> Aspidosperma_parvifolium       1        2

Species incidence raw data format

We use tree species data collected from two second-growth rainforests, namely Cuatro Rios (CR) and Juan Enriquez (JE) in Costa Rica, as demo data to assess temporal beta diversity among three years (2005, 2011, and 2017) within each forest. Each year is designated as an assemblage. The data in each forest were collected from a 1-ha (50 m x 200 m) forest plot. Because individual trees of some species may exhibit intra-specific aggregation within a 1-ha area, they may not be suitable for modelling as independent sampling units. In this case, it is statistically preferable to first convert species abundance records in each forest to occurrence or incidence (detection/non-detection) data in subplots/quadrats; see Chao et al. (2023b) for analysis details.

Each 1-ha forest was divided into 100 subplots (each with 0.01 ha) and only species’ incidence records in each subplot were used to compute the incidence frequency for a species (i.e., the number of subplots in which that species occurred). By treating the incidence frequency of each species among subplots as a “proxy” for its abundance, the iNEXT.beta3D standardization can be adapted to deal with spatially aggregated data and to avoid the effect of intra-specific aggregation.
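The conversion from stem counts to incidence data described above can be sketched in base R. The count matrix below is hypothetical (three species, four subplots) and is used only to show the two steps: recording detection/non-detection per subplot, and counting the subplots in which each species occurred.

```r
# Hypothetical species-by-subplot counts of individual stems.
stem_counts <- matrix(c(12, 0, 0, 3,
                         0, 1, 0, 0,
                         2, 2, 5, 1),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(c("Sp_A", "Sp_B", "Sp_C"),
                                      paste0("Subplot_", 1:4)))

# Incidence raw matrix: 1 if the species was detected in the subplot, else 0.
incidence_raw <- (stem_counts > 0) * 1

# Incidence frequency: number of subplots in which each species occurred.
incidence_freq <- rowSums(incidence_raw)
incidence_freq
```

Note that Sp_A has 15 stems but an incidence frequency of only 2, which is how aggregation of many stems within a few subplots is prevented from inflating its apparent abundance.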

The data (named "Second_growth_forests") consist of two lists (for two forests named “CR 2005 vs. 2011 vs. 2017” and “JE 2005 vs. 2011 vs. 2017”, respectively). Each list consists of three matrices; the first matrix represents the species-by-subplot incidence data in 2005, the second matrix represents the species-by-subplot incidence data in 2011, and the third matrix represents the species-by-subplot incidence data in 2017. Run the following code to view the incidence raw data: (Here we only show the first ten rows and six columns for each matrix; there are 100 columns/subplots in each forest and each year.)

data(Second_growth_forests)
Second_growth_forests

#> $`CR 2005 vs. 2011 vs. 2017`
#> $`CR 2005 vs. 2011 vs. 2017`$Year_2005
#>        Subplot_1 Subplot_2 Subplot_3 Subplot_4 Subplot_5 Subplot_6
#> Abaade         0         0         0         0         0         0
#> Alcflo         0         0         0         0         0         0
#> Alclat         0         1         0         0         0         0
#> Aliatl         0         0         0         0         0         0
#> Ampmac         0         0         0         0         0         0
#> Anacra         0         1         0         0         0         1
#> Annama         0         1         0         0         0         0
#> Annpap         0         0         0         0         0         0
#> Apemem         0         0         0         0         0         0
#> Ardfim         0         0         0         0         0         0
#> 
#> $`CR 2005 vs. 2011 vs. 2017`$Year_2011
#>        Subplot_1 Subplot_2 Subplot_3 Subplot_4 Subplot_5 Subplot_6
#> Abaade         0         0         0         0         0         0
#> Alcflo         0         0         0         0         0         0
#> Alclat         0         1         0         0         0         0
#> Aliatl         0         0         0         0         0         0
#> Ampmac         0         0         0         0         0         0
#> Anacra         0         1         0         0         0         1
#> Annama         0         0         0         0         0         0
#> Annpap         0         0         0         0         0         0
#> Apemem         0         0         0         0         0         0
#> Ardfim         0         0         0         0         0         0
#> 
#> $`CR 2005 vs. 2011 vs. 2017`$Year_2017
#>        Subplot_1 Subplot_2 Subplot_3 Subplot_4 Subplot_5 Subplot_6
#> Abaade         0         0         0         0         0         0
#> Alcflo         0         0         0         0         0         0
#> Alclat         0         1         0         0         0         0
#> Aliatl         0         0         0         0         0         0
#> Ampmac         0         0         0         0         0         0
#> Anacra         0         1         1         0         1         1
#> Annama         0         0         0         0         0         0
#> Annpap         0         0         0         0         0         0
#> Apemem         0         0         0         0         0         0
#> Ardfim         0         0         0         0         0         0
#> 
#> 
#> $`JE 2005 vs. 2011 vs. 2017`
#> $`JE 2005 vs. 2011 vs. 2017`$Year_2005
#>        Subplot_1 Subplot_2 Subplot_3 Subplot_4 Subplot_5 Subplot_6
#> Alccos         0         0         0         0         0         0
#> Alcflo         0         0         0         0         0         0
#> Alclat         0         0         0         0         0         0
#> Annpap         0         0         0         0         0         0
#> Apemem         0         0         0         0         0         0
#> Astcon         0         0         0         0         0         0
#> Bacgas         0         0         0         0         0         0
#> Brogui         0         0         0         0         0         0
#> Brolac         0         0         0         0         0         0
#> Byrcra         0         0         0         0         1         0
#> 
#> $`JE 2005 vs. 2011 vs. 2017`$Year_2011
#>        Subplot_1 Subplot_2 Subplot_3 Subplot_4 Subplot_5 Subplot_6
#> Alccos         0         0         0         0         0         0
#> Alcflo         0         0         0         0         0         0
#> Alclat         0         0         0         0         0         0
#> Annpap         0         0         0         0         0         0
#> Apemem         0         0         0         0         0         0
#> Astcon         0         0         0         0         0         0
#> Bacgas         0         0         0         0         0         0
#> Brogui         0         0         0         0         0         0
#> Brolac         0         0         0         0         0         0
#> Byrcra         0         0         0         0         1         0
#> 
#> $`JE 2005 vs. 2011 vs. 2017`$Year_2017
#>        Subplot_1 Subplot_2 Subplot_3 Subplot_4 Subplot_5 Subplot_6
#> Alccos         0         0         0         0         0         0
#> Alcflo         0         0         0         0         0         0
#> Alclat         0         0         0         0         0         0
#> Annpap         0         0         0         0         0         0
#> Apemem         0         0         0         0         0         0
#> Astcon         0         0         0         0         0         0
#> Bacgas         0         0         0         0         0         0
#> Brogui         0         0         0         0         0         0
#> Brolac         0         0         0         0         0         0
#> Byrcra         0         0         0         0         0         0

Phylogenetic tree format for PD

To perform PD analysis, the phylogenetic tree (in Newick format) spanned by species observed in all datasets must be stored in a data file. For example, the phylogenetic tree for all observed species (including species in the “Marim”, “Rebio2”, and “Rochedo” fragments) is stored in a data file named "Brazil_tree" for demonstration purposes. A partial list of the tip labels and node labels is shown below.

data(Brazil_tree)
Brazil_tree
#> 
#> Phylogenetic tree with 243 tips and 140 internal nodes.
#> 
#> Tip labels:
#>   Carpotroche_brasiliensis, Casearia_ulmifolia, Casearia_sp4, Casearia_sylvestris, Casearia_sp2, Casearia_oblongifolia, ...
#> Node labels:
#>   magnoliales_to_asterales, poales_to_asterales, , , , , ...
#> 
#> Rooted; includes branch lengths.
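For users preparing their own tree, the following sketch reads a small Newick string with the ape package and checks that every species in the data appears among the tip labels. The Newick string and species names are hypothetical, and the use of ape is our assumption for illustration (the printed "Brazil_tree" summary above resembles an ape "phylo" object); the block is skipped if ape is not installed.

```r
species <- c("Sp_A", "Sp_B", "Sp_C")          # hypothetical species names in the data
newick  <- "((Sp_A:1,Sp_B:1):1,Sp_C:2);"      # hypothetical rooted tree with branch lengths

if (requireNamespace("ape", quietly = TRUE)) {
  toy_tree <- ape::read.tree(text = newick)
  # Species present in the data but absent from the tree (should be empty).
  missing_tips <- setdiff(species, toy_tree$tip.label)
  print(missing_tips)
  print(ape::is.rooted(toy_tree))
}
```

An empty result from setdiff indicates that the tree spans all species in the data, which is the requirement stated above.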

Species pairwise distance matrix format for FD

To perform FD analysis, the species-pairwise distance matrix for species observed in all datasets must be stored in a matrix/data.frame format. Typically, the distance between any two species is computed from species traits using the Gower distance. In our demo data, the distance matrix for all species (including species in the “Marim”, “Rebio2”, and “Rochedo” fragments) is stored in a data file named "Brazil_distM" for demonstration purposes. Here we only show the first three rows and three columns of the distance matrix.

data(Brazil_distM)
Brazil_distM

#>                          Carpotroche_brasiliensis Astronium_concinnum Astronium_graveolens
#> Carpotroche_brasiliensis                    0.000               0.522                0.522
#> Astronium_concinnum                         0.522               0.000                0.000
#> Astronium_graveolens                        0.522               0.000                0.000
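A distance matrix of this form can be sketched in base R for purely numeric traits: each trait is scaled by its range and the absolute differences are averaged over traits, which is the Gower distance in the numeric-only case. The trait table below is hypothetical; for mixed numeric and categorical traits, a dedicated implementation such as cluster::daisy(metric = "gower") would be a more appropriate choice.

```r
# Hypothetical numeric traits for three species.
traits <- data.frame(height   = c(10, 20, 30),
                     seedmass = c(1.0, 1.0, 3.0),
                     row.names = c("Sp_A", "Sp_B", "Sp_C"))

# Range-scale each trait, then average absolute differences over traits.
ranges <- sapply(traits, function(v) diff(range(v)))
scaled <- sweep(as.matrix(traits), 2, ranges, "/")
distM  <- as.matrix(dist(scaled, method = "manhattan")) / ncol(traits)

round(distM, 3)
```

The result is a symmetric matrix with zeros on the diagonal and identical row and column names, matching the layout of "Brazil_distM".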

MAIN FUNCTION: iNEXTbeta3D()

We first describe the main function iNEXTbeta3D() with default arguments:

iNEXTbeta3D(data, diversity = "TD", q = c(0, 1, 2), datatype = "abundance",
            base = "coverage", level = NULL, nboot = 10, conf = 0.95,
            PDtree = NULL, PDreftime = NULL, PDtype = "meanPD",
            FDdistM = NULL, FDtype = "AUC", FDtau = NULL, FDcut_number = 30,
            by_pair = FALSE)

The arguments of this function are briefly described below, and will be explained in more detail by illustrative examples in later text. By default (with the standardization base = "coverage"), this function computes coverage-based standardized 3D gamma, alpha, beta diversity, and four dissimilarity indices for coverage up to one (for q = 1, 2) or up to the coverage of double the reference sample size (for q = 0). If users set the standardization base to base = "size", this function computes size-based standardized 3D gamma and alpha diversity estimates up to double the reference sample size in each dataset. In addition, this function also computes standardized 3D estimates with a particular vector of user-specified sample sizes or coverage values.

Argument	Description
`data`	For `datatype = "abundance"`, species abundance data for a single dataset can be input as a `matrix/data.frame` (species-by-assemblage); data for multiple datasets can be input as a `list` of `matrices/data.frames`, with each matrix representing a species-by-assemblage abundance matrix for one of the datasets. For `datatype = "incidence_raw"`, data for a single dataset with N assemblages can be input as a `list` of `matrices/data.frames`, with each matrix representing a species-by-sampling-unit incidence matrix for one of the assemblages; data for multiple datasets can be input as multiple lists.
`diversity`	selection of diversity type: `diversity = "TD"` = Taxonomic diversity, `diversity = "PD"` = Phylogenetic diversity, and `diversity = "FD"` = Functional diversity.
`q`	a numerical vector specifying the diversity orders. Default is `c(0, 1, 2)`.
`datatype`	data type of input data: individual-based abundance data (`datatype = "abundance"`) or species-by-sampling-unit incidence matrix (`datatype = "incidence_raw"`) with all entries being 0 (non-detection) or 1 (detection).
`base`	standardization base: coverage-based rarefaction and extrapolation for gamma, alpha, beta diversity, and four classes of dissimilarity indices (`base = "coverage"`), or size-based rarefaction and extrapolation for gamma and alpha diversity (`base = "size"`). Default is `base = "coverage"`.
`level`	a numerical vector specifying the particular values of sample coverage (between 0 and 1 when `base = "coverage"`) or sample sizes (`base = "size"`) that will be used to compute standardized diversity/dissimilarity. The asymptotic diversity estimate can be obtained by setting `level = 1` (i.e., complete coverage for `base = "coverage"`). By default (with `base = "coverage"`), this function computes coverage-based standardized 3D gamma, alpha, beta diversity, and four dissimilarity indices for coverage from 0.5 up to one (for `q = 1, 2`) or up to the coverage of double the reference sample size (for `q = 0`), in increments of 0.025. The extrapolation limit for beta diversity is defined as that for alpha diversity. If users set `base = "size"`, this function computes size-based standardized 3D gamma and alpha diversity estimates based on 40 equally-spaced sample sizes/knots from sample size 1 up to double the reference sample size.
`nboot`	a non-negative integer specifying the number of bootstrap replications when assessing sampling uncertainty and constructing confidence intervals. Bootstrap replications are generally time consuming. Set `nboot = 0` to skip the bootstrap procedures. Default is `nboot = 10`. If more accurate results are required, set `nboot = 100` (or `nboot = 200`).
`conf`	a positive number < 1 specifying the confidence level of the confidence interval. Default is `conf = 0.95`.
`PDtree`	(required argument for `diversity = "PD"`), a phylogenetic tree in Newick format for all observed species in the pooled assemblage.
`PDreftime`	(argument only for `diversity = "PD"`), a numerical value specifying the reference time for PD. Default is `PDreftime = NULL` (i.e., the age of the root of `PDtree`).
`PDtype`	(argument only for `diversity = "PD"`), select PD type: `PDtype = "PD"` (effective total branch length) or `PDtype = "meanPD"` (effective number of equally divergent lineages). Default is `PDtype = "meanPD"`, where `meanPD` = PD/tree depth.
`FDdistM`	(required argument for `diversity = "FD"`), a species pairwise distance matrix for all species in the pooled assemblage.
`FDtype`	(argument only for `diversity = "FD"`), select FD type: `FDtype = "tau_value"` for FD under a specified threshold value, or `FDtype = "AUC"` (area under the curve of tau-profile) for an overall FD which integrates all threshold values between zero and one. Default is `FDtype = "AUC"`.
`FDtau`	(argument only for `diversity = "FD"` and `FDtype = "tau_value"`), a numerical value between 0 and 1 specifying the tau value (threshold level) that will be used to compute FD. If `FDtau = NULL` (default), then the threshold level is set to be the mean distance between any two individuals randomly selected from the pooled dataset (i.e., quadratic entropy).
`FDcut_number`	(argument only for `diversity = "FD"` and `FDtype = "AUC"`), a number specifying how many equally spaced sub-intervals the [0, 1] interval is cut into when the AUC value is obtained by integrating the tau-profile. Equivalently, the number of tau values that will be considered to compute the integrated AUC value. Default is `FDcut_number = 30`. A larger value can be set to obtain a more accurate AUC value.
`by_pair`	a logical variable specifying whether to perform diversity decomposition for all pairs of assemblages or not. If `by_pair = TRUE`, alpha/beta/gamma diversity will be computed for all pairs of assemblages in the input data; if `by_pair = FALSE`, alpha/beta/gamma diversity will be computed jointly for all assemblages in the input data (the number of assemblages can be greater than two). Default is `FALSE`.

This function returns an "iNEXTbeta3D" object which can be further used to make plots using the function ggiNEXTbeta3D() to be described below. Note that ggiNEXTbeta3D() accepts only output obtained from iNEXTbeta3D under by_pair = FALSE.
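The following sketch illustrates three non-default calls using only the arguments documented above and the demo datasets shipped with the package: user-specified coverage values via level, the size-based option, and the pairwise option. The object names and the coverage values 0.6 and 0.7 are hypothetical choices for illustration, nboot = 0 is used only to keep the run short, and by_pair requires the Nov. 2024 update or later. The block is skipped if the package is not installed.

```r
if (requireNamespace("iNEXT.beta3D", quietly = TRUE)) {
  library(iNEXT.beta3D)
  data(Brazil_rainforests)
  data(Second_growth_forests)

  # Coverage-based TD at two user-specified coverage values (no bootstrap).
  out_level <- iNEXTbeta3D(data = Brazil_rainforests, diversity = "TD",
                           datatype = "abundance", base = "coverage",
                           level = c(0.6, 0.7), nboot = 0)

  # Size-based TD: only gamma and alpha diversity are computed.
  out_size <- iNEXTbeta3D(data = Brazil_rainforests, diversity = "TD",
                          datatype = "abundance", base = "size", nboot = 0)

  # Pairwise decomposition among the three years of the first forest.
  out_pair <- iNEXTbeta3D(data = Second_growth_forests[1], diversity = "TD",
                          datatype = "incidence_raw", base = "coverage",
                          nboot = 0, by_pair = TRUE)
}
```

In practice a positive nboot (for example, the default of 10 or more) is needed to obtain standard errors and confidence limits.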

Output of the main function iNEXTbeta3D()

By default (with base = 'coverage'), the iNEXTbeta3D() function for each of the three dimensions (TD, PD, and FD) returns the "iNEXTbeta3D" object including seven data frames for each dataset:

gamma (standardized gamma diversity)
alpha (standardized alpha diversity)
beta (standardized beta diversity)
1-C (standardized Sorensen-type non-overlap index)
1-U (standardized Jaccard-type non-overlap index)
1-V (standardized Sorensen-type turnover index)
1-S (standardized Jaccard-type turnover index)

When users set base = 'size', the iNEXTbeta3D() function for each of the three dimensions (TD, PD, and FD) returns the "iNEXTbeta3D" object including two data frames for each dataset:

gamma (size-based standardized gamma diversity)
alpha (size-based standardized alpha diversity)

Size-based beta diversity and dissimilarity indices are not statistically valid measures and thus are not provided.
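The relationship between beta diversity and the four dissimilarity classes can be sketched in base R. The formulas below are monotone transformations of beta, as we understand them from the Hill-number literature underlying the package; they are not quoted from this document, and the function name dissimilarity is hypothetical. Here N is the number of assemblages, so beta lies between 1 and N.

```r
# Map a beta value (between 1 and N) to the four dissimilarity classes.
dissimilarity <- function(beta, q, N) {
  if (abs(q - 1) < 1e-8) {
    one_minus_C <- one_minus_U <- log(beta) / log(N)      # limit as q tends to 1
  } else {
    one_minus_C <- (beta^(1 - q) - 1) / (N^(1 - q) - 1)   # Sorensen-type non-overlap
    one_minus_U <- (beta^(q - 1) - 1) / (N^(q - 1) - 1)   # Jaccard-type non-overlap
  }
  c("1-C" = one_minus_C,
    "1-U" = one_minus_U,
    "1-V" = (beta - 1) / (N - 1),                         # Sorensen-type turnover
    "1-S" = (1 / beta - 1) / (1 / N - 1))                 # Jaccard-type turnover
}

dissimilarity(beta = 1.5, q = 0, N = 2)
dissimilarity(beta = 1.5, q = 2, N = 2)
```

Each measure equals 0 when beta = 1 (identical assemblages) and 1 when beta = N (no shared species), which places all four classes on a common [0, 1] scale.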

GRAPHIC DISPLAYS: FUNCTION ggiNEXTbeta3D()

The function ggiNEXTbeta3D() with default arguments is described as follows. It accepts only output obtained from iNEXTbeta3D under by_pair = FALSE.

ggiNEXTbeta3D(output, type = "B")

Argument	Description
`output`	output from the function `iNEXTbeta3D`.
`type`	(argument only for `base = "coverage"`), `type = "B"` for plotting the rarefaction and extrapolation sampling curves for gamma, alpha, and beta diversity; `type = "D"` for plotting the rarefaction and extrapolation sampling curves for four dissimilarity indices. Skip the argument for plotting size-based rarefaction and extrapolation sampling curves for gamma and alpha diversity.

The ggiNEXTbeta3D() function is a wrapper around the ggplot2 package to create an R/E curve using a single line of code. The resulting object is of class "ggplot", so it can be manipulated using the ggplot2 tools. Users can visualize the displays of coverage-based R/E sampling curves of gamma, alpha and beta diversity as well as four classes of dissimilarity indices by setting the parameter type.
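Because the returned object is a "ggplot" object, standard ggplot2 layers can be added to it. The sketch below is only an illustration: the object names and the title text are hypothetical, the theme and title are arbitrary choices, and the block is skipped unless both packages are installed.

```r
if (requireNamespace("iNEXT.beta3D", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  library(iNEXT.beta3D)
  data(Brazil_rainforests)

  # Coverage-based TD for the first fragment only, with the default nboot = 10.
  out_marim <- iNEXTbeta3D(data = Brazil_rainforests[1], diversity = "TD",
                           datatype = "abundance", base = "coverage", nboot = 10)

  # Gamma, alpha and beta curves, then ordinary ggplot2 customization.
  p <- ggiNEXTbeta3D(out_marim, type = "B")
  p_custom <- p + ggplot2::theme_bw() +
    ggplot2::labs(title = "Marim: Edge vs. Interior")
  print(p_custom)
}
```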

TAXONOMIC DIVERSITY (TD): RAREFACTION/EXTRAPOLATION VIA EXAMPLES

EXAMPLE 1: Abundance data with default sample sizes or coverage values (not by pairs)

First, we run the iNEXTbeta3D() function with Brazil_rainforests abundance data to compute coverage-based taxonomic gamma, alpha, beta diversity, and four dissimilarity indices under base = 'coverage' by running the following code:

## Coverage-based R/E Analysis with taxonomic diversity for abundance data (not by pairs)
data(Brazil_rainforests)

output_TDc_abun = iNEXTbeta3D(data = Brazil_rainforests[1:2], diversity = 'TD', 
                              datatype = 'abundance', base = "coverage", nboot = 10)
output_TDc_abun

The output contains seven data frames: gamma, alpha, beta, 1-C, 1-U, 1-V, 1-S. Each data frame includes the name of the dataset (Dataset), the combination of assemblage pairs (Pair; this column is absent when the decomposition is not performed by pairs), the diversity order q (Order.q), the target standardized coverage value (SC), the corresponding sample size (Size), the diversity/dissimilarity estimate (Alpha/Beta/Gamma/Dissimilarity), Method (Rarefaction, Observed, or Extrapolation, depending on whether the target coverage is less than, equal to, or greater than the coverage of the reference sample), the standard error of the standardized estimate (s.e.), and the bootstrap lower and upper confidence limits for the diversity/dissimilarity with a default confidence level of 0.95 (LCL, UCL). These estimates with confidence intervals in the output are then used for plotting rarefaction and extrapolation curves.
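As a reading aid for the s.e., LCL and UCL columns, the sketch below takes three rows from the Marim beta-diversity output displayed in this example (q = 0) and compares the displayed limits with a normal-approximation interval, estimate plus or minus z times s.e. That the limits are constructed this way is our assumption, not a statement from this document; the displayed values are rounded, so only approximate agreement can be expected.

```r
# Three rows copied from the displayed Marim output for beta diversity (q = 0).
shown <- data.frame(Beta = c(1.11, 1.09, 1.09),
                    se   = c(0.070, 0.050, 0.053),
                    LCL  = c(0.974, 0.989, 0.990),
                    UCL  = c(1.25, 1.18, 1.20))

conf <- 0.95
z <- qnorm(1 - (1 - conf) / 2)          # about 1.96 for conf = 0.95

approx_LCL <- shown$Beta - z * shown$se
approx_UCL <- shown$Beta + z * shown$se

# Largest discrepancy between displayed and approximated limits.
max_diff <- max(abs(c(approx_LCL - shown$LCL, approx_UCL - shown$UCL)))
max_diff
```

The discrepancies are of the order of the rounding in the displayed estimates, which is consistent with (but does not prove) the assumed construction.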

Our diversity/dissimilarity estimates and related statistics in the default output are displayed for the standardized coverage value from 0.5 to the coverage value of twice the reference sample size (for q = 0), or from 0.5 to 1.0 (for q = 1 and 2), in increments of 0.025. In addition, the results for the following four coverage values are also added: SC(n, alpha), SC(2n, alpha), SC(n, gamma) and SC(2n, gamma) if these values are in the above-specified range. Here SC(n, alpha) and SC(2n, alpha) represent, respectively, the coverage estimate for the alpha reference sample size n and the extrapolated sample with size 2n in the joint assemblage. These values can be found as SC(n) and SC(2n) for "Joint assemblage (for alpha)" in the column “Assemblage” from the output of the function DataInfobeta3D; see later text. Similar definitions pertain to SC(n, gamma) and SC(2n, gamma) for the gamma reference sample; these two values can also be found as SC(n) and SC(2n) for "Pooled assemblage (for gamma)" in the column “Assemblage” from the output of the function DataInfobeta3D. For beta diversity and dissimilarity, the observed sample coverage and extrapolation limit are defined the same as for alpha diversity. The corresponding coverage values for incidence data are denoted as, respectively, SC(T, alpha), SC(2T, alpha), SC(T, gamma) and SC(2T, gamma) in the output.
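The default coverage grid for q = 0 can be reproduced in base R for the Marim fragment. The three special coverage values below are read from the Method column of the Marim beta-diversity output displayed in this example (rounded to three decimals); the variable names are hypothetical, and this is a sketch of the rule described above rather than the package's internal code.

```r
grid <- seq(0.5, 1, by = 0.025)     # default grid in increments of 0.025

sc_n_alpha  <- 0.696                # Observed_SC(n, alpha) in the displayed output
sc_n_gamma  <- 0.855                # Observed_SC(n, gamma) in the displayed output
sc_2n_alpha <- 0.876                # Extrap_SC(2n, alpha): upper limit for q = 0

# Grid values up to the limit, plus the special values that fall within range.
special   <- c(sc_n_alpha, sc_n_gamma, sc_2n_alpha)
levels_q0 <- sort(unique(c(grid[grid <= sc_2n_alpha],
                           special[special <= sc_2n_alpha])))
levels_q0
length(levels_q0)
```

The resulting 19 coverage values correspond to the 19 rows shown for q = 0 in the Marim output.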

Because all diversity/dissimilarity estimates are computed for standardized coverage values starting from 0.5, the default setting level = NULL does not work if the observed sample coverage of the alpha/gamma reference sample is less than 50%. In this case, users should specify sample coverage values through the argument level instead of using level = NULL. The suggested maximum coverage value that users can specify is SC(2n, alpha); beyond this limit, beta diversity and dissimilarity estimates may be subject to some bias. Below we show the output for taxonomic beta diversity between the “Edge” and “Interior” habitats in the “Marim” fragment.

#>    Dataset Order.q    SC Size Beta                Method  s.e.   LCL  UCL
#> 1    Marim       0 0.500  148 1.11           Rarefaction 0.070 0.974 1.25
#> 2    Marim       0 0.525  162 1.11           Rarefaction 0.069 0.973 1.24
#> 3    Marim       0 0.550  178 1.10           Rarefaction 0.068 0.973 1.24
#> 4    Marim       0 0.575  195 1.10           Rarefaction 0.066 0.973 1.23
#> 5    Marim       0 0.600  213 1.10           Rarefaction 0.063 0.974 1.22
#> 6    Marim       0 0.625  233 1.09           Rarefaction 0.060 0.977 1.21
#> 7    Marim       0 0.650  255 1.09           Rarefaction 0.057 0.980 1.20
#> 8    Marim       0 0.675  279 1.09           Rarefaction 0.053 0.985 1.19
#> 9    Marim       0 0.696  302 1.09 Observed_SC(n, alpha) 0.050 0.989 1.18
#> 10   Marim       0 0.700  306 1.09         Extrapolation 0.049 0.990 1.18
#> 11   Marim       0 0.725  336 1.08         Extrapolation 0.045 0.996 1.17
#> 12   Marim       0 0.750  368 1.08         Extrapolation 0.043 0.999 1.17
#> 13   Marim       0 0.775  403 1.08         Extrapolation 0.044 0.999 1.17
#> 14   Marim       0 0.800  443 1.09         Extrapolation 0.047 0.996 1.18
#> 15   Marim       0 0.825  488 1.09         Extrapolation 0.047 0.997 1.18
#> 16   Marim       0 0.850  541 1.09         Extrapolation 0.050 0.995 1.19
#> 17   Marim       0 0.855  552 1.09 Observed_SC(n, gamma) 0.050 0.994 1.19
#> 18   Marim       0 0.875  602 1.09         Extrapolation 0.053 0.990 1.20
#> 19   Marim       0 0.876  604 1.09  Extrap_SC(2n, alpha) 0.053 0.990 1.20
#> 20   Marim       1 0.500  148 1.11           Rarefaction 0.062 0.988 1.23
#> 21   Marim       1 0.525  162 1.11           Rarefaction 0.061 0.988 1.23
#> 22   Marim       1 0.550  178 1.11           Rarefaction 0.060 0.988 1.22
#> 23   Marim       1 0.575  195 1.10           Rarefaction 0.058 0.990 1.22
#> 24   Marim       1 0.600  213 1.10           Rarefaction 0.056 0.991 1.21
#> 25   Marim       1 0.625  233 1.10           Rarefaction 0.054 0.994 1.20
#> 26   Marim       1 0.650  255 1.10           Rarefaction 0.051 0.998 1.20
#> 27   Marim       1 0.675  279 1.09           Rarefaction 0.047 1.003 1.19
#> 28   Marim       1 0.696  302 1.09 Observed_SC(n, alpha) 0.044 1.008 1.18
#> 29   Marim       1 0.700  306 1.09         Extrapolation 0.043 1.009 1.18
#> 30   Marim       1 0.725  336 1.09         Extrapolation 0.039 1.015 1.17
#> 31   Marim       1 0.750  368 1.09         Extrapolation 0.035 1.021 1.16
#> 32   Marim       1 0.775  403 1.09         Extrapolation 0.033 1.024 1.15
#> 33   Marim       1 0.800  443 1.08         Extrapolation 0.031 1.023 1.15
#> 34   Marim       1 0.825  488 1.08         Extrapolation 0.030 1.021 1.14
#> 35   Marim       1 0.850  541 1.07         Extrapolation 0.031 1.014 1.14
#> 36   Marim       1 0.855  552 1.07 Observed_SC(n, gamma) 0.031 1.012 1.13
#> 37   Marim       1 0.875  602 1.07         Extrapolation 0.033 1.005 1.13
#> 38   Marim       1 0.876  604 1.07  Extrap_SC(2n, alpha) 0.033 1.005 1.13
#> 39   Marim       1 0.900  678 1.06         Extrapolation 0.035 0.997 1.13
#> 40   Marim       1 0.925  775 1.06         Extrapolation 0.037 0.990 1.13
#> 41   Marim       1 0.950  912 1.06         Extrapolation 0.038 0.988 1.14
#> 42   Marim       1 0.969 1075 1.07  Extrap_SC(2n, gamma) 0.039 0.990 1.14
#> 43   Marim       1 0.975 1147 1.07         Extrapolation 0.039 0.992 1.14
#> 44   Marim       1 1.000  Inf 1.10         Extrapolation 0.035 1.034 1.17
#> 45   Marim       2 0.500  148 1.10           Rarefaction 0.051 1.002 1.20
#> 46   Marim       2 0.525  162 1.10           Rarefaction 0.050 1.001 1.20
#> 47   Marim       2 0.550  178 1.10           Rarefaction 0.049 1.001 1.19
#> 48   Marim       2 0.575  195 1.09           Rarefaction 0.048 1.000 1.19
#> 49   Marim       2 0.600  213 1.09           Rarefaction 0.047 1.000 1.18
#> 50   Marim       2 0.625  233 1.09           Rarefaction 0.046 1.000 1.18
#> 51   Marim       2 0.650  255 1.09           Rarefaction 0.044 1.001 1.18
#> 52   Marim       2 0.675  279 1.09           Rarefaction 0.043 1.002 1.17
#> 53   Marim       2 0.696  302 1.08 Observed_SC(n, alpha) 0.042 1.003 1.17
#> 54   Marim       2 0.700  306 1.08         Extrapolation 0.042 1.003 1.17
#> 55   Marim       2 0.725  336 1.08         Extrapolation 0.041 1.005 1.17
#> 56   Marim       2 0.750  368 1.08         Extrapolation 0.040 1.008 1.16
#> 57   Marim       2 0.775  403 1.09         Extrapolation 0.039 1.010 1.16
#> 58   Marim       2 0.800  443 1.09         Extrapolation 0.040 1.009 1.17
#> 59   Marim       2 0.825  488 1.09         Extrapolation 0.043 1.004 1.17
#> 60   Marim       2 0.850  541 1.09         Extrapolation 0.047 0.999 1.18
#> 61   Marim       2 0.855  552 1.09 Observed_SC(n, gamma) 0.047 0.999 1.18
#> 62   Marim       2 0.875  602 1.09         Extrapolation 0.049 0.996 1.19
#> 63   Marim       2 0.876  604 1.09  Extrap_SC(2n, alpha) 0.049 0.996 1.19
#> 64   Marim       2 0.900  678 1.09         Extrapolation 0.051 0.992 1.19
#> 65   Marim       2 0.925  775 1.09         Extrapolation 0.053 0.990 1.20
#> 66   Marim       2 0.950  912 1.09         Extrapolation 0.054 0.988 1.20
#> 67   Marim       2 0.969 1075 1.09  Extrap_SC(2n, gamma) 0.055 0.986 1.20
#> 68   Marim       2 0.975 1147 1.09         Extrapolation 0.055 0.986 1.20
#> 69   Marim       2 1.000  Inf 1.09         Extrapolation 0.057 0.976 1.20
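
In the table above the observed coverage SC(n, alpha) = 0.696 exceeds 0.5, so the default level = NULL applies. When the observed coverage is below 0.5, a vector of coverage levels has to be supplied by hand. The following is a minimal base-R sketch, assuming hypothetical values SC(n, alpha) = 0.42 and SC(2n, alpha) = 0.61 (in practice these would be read from the DataInfobeta3D output); the helper name make_levels and the starting point 0.30 are illustrative choices only.

```r
# Hypothetical helper: evenly spaced coverage levels that end at the
# suggested maximum SC(2n, alpha).
make_levels <- function(from, sc_2n_alpha, by = 0.025) {
  stopifnot(from > 0, from < sc_2n_alpha, sc_2n_alpha <= 1)
  lv <- seq(from, sc_2n_alpha, by = by)
  # Append the upper limit itself if the regular grid stops short of it.
  if (max(lv) < sc_2n_alpha) lv <- c(lv, sc_2n_alpha)
  lv
}

sc_n_alpha  <- 0.42   # assumed observed coverage, below 0.5
sc_2n_alpha <- 0.61   # assumed coverage at twice the reference sample size

# The default grid starts at 0.5, so it is unusable when this is TRUE.
needs_custom_level <- sc_n_alpha < 0.5

lv <- make_levels(from = 0.30, sc_2n_alpha = sc_2n_alpha)
lv
```

The resulting vector can then be passed as level = lv together with base = "coverage" in the call to iNEXTbeta3D.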

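Run the following code to display the two types of coverage-based curves (type = 'B' for gamma, alpha and beta diversity; type = 'D' for the four dissimilarity indices):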

## Coverage-based R/E curves for taxonomic gamma, alpha and beta diversity 
ggiNEXTbeta3D(output_TDc_abun, type = 'B')

## Coverage-based R/E curves for four taxonomic dissimilarity indices
ggiNEXTbeta3D(output_TDc_abun, type = 'D')
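
The four dissimilarity indices plotted with type = 'D' are monotone transformations of beta diversity. The sketch below assumes the standard transformations from the Hill-number framework (Chao et al.) for N assemblages; these formulas are not spelled out in this section, so the function dissim_from_beta should be read as a hypothetical illustration rather than as the package's implementation.

```r
# Hypothetical helper: turn a beta diversity of order q for N assemblages
# into the four dissimilarity measures, each scaled to the range [0, 1].
dissim_from_beta <- function(beta, q, N = 2) {
  if (q == 1) {
    # Limit of the q-dependent formulas as q tends to 1.
    one_minus_C <- log(beta) / log(N)
    one_minus_U <- one_minus_C
  } else {
    one_minus_C <- (beta^(1 - q) - 1) / (N^(1 - q) - 1)  # Sorensen-type non-overlap
    one_minus_U <- (beta^(q - 1) - 1) / (N^(q - 1) - 1)  # Jaccard-type non-overlap
  }
  c("1-C" = one_minus_C,
    "1-U" = one_minus_U,
    "1-V" = (beta - 1) / (N - 1),          # Sorensen-type turnover
    "1-S" = (1 / beta - 1) / (1 / N - 1))  # Jaccard-type turnover
}

# Beta = 1.09 is the q = 0 estimate for "Marim" at the observed coverage.
round(dissim_from_beta(beta = 1.09, q = 0, N = 2), 3)
```

Each measure is 0 when beta = 1 (identical assemblages) and 1 when beta = N (no shared species), which is what makes the four indices comparable across orders q.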

The following commands return the size-based R/E sampling curves for gamma and alpha taxonomic diversity:

## Size-based R/E curves with taxonomic gamma and alpha diversity (not by pairs)
data(Brazil_rainforests)
output_TDs_abun = iNEXTbeta3D(data = Brazil_rainforests[1:2], diversity = 'TD', 
                              datatype = 'abundance', base = "size", nboot = 10)

ggiNEXTbeta3D(output_TDs_abun)

EXAMPLE 2: Abundance data for all pairs of assemblages with user-specified sample sizes or coverage values

In addition to the default sample sizes or coverage values, iNEXTbeta3D can compute standardized 3D estimates for all pairs of assemblages at a user-specified vector of sample sizes or coverage values. The following commands return the TD estimates at two user-specified levels of sample coverage (e.g., 85% and 90%). For each dataset, only the output for gamma, alpha and beta is shown below; the output for 1-C, 1-U, 1-V and 1-S is omitted.

## Coverage-based R/E Analysis for all pairs of assemblages with taxonomic diversity for abundance data
data(Brazil_rainforests)

data = list("Edge"     = sapply(Brazil_rainforests, function(x) x[,1]),
            "Interior" = sapply(Brazil_rainforests, function(x) x[,2]))
output_TDc_abun_byuser = iNEXTbeta3D(data = data, diversity = 'TD', 
                                     datatype = 'abundance', base = "coverage", nboot = 10,
                                     level = c(0.85, 0.9), by_pair = TRUE)
output_TDc_abun_byuser

#> $Edge
#> $Edge$gamma
#>    Dataset               Pair     Order.q   SC    Size   Gamma        Method   s.e.     LCL     UCL
#> 1                             Order q = 0                                                          
#> 2     Edge   Marim vs. Rebio2           0 0.85 488.874 153.065 Extrapolation  11.86  129.82  176.31
#> 3     Edge   Marim vs. Rebio2           0  0.9  695.41 178.559 Extrapolation 17.897 143.481 213.637
#> 4     Edge  Marim vs. Rochedo           0 0.85  461.62 153.759 Extrapolation 13.932 126.453 181.065
#> 5     Edge  Marim vs. Rochedo           0  0.9 633.379 174.964 Extrapolation 17.389 140.882 209.046
#> 6     Edge Rebio2 vs. Rochedo           0 0.85 380.037 128.239 Extrapolation  7.749 113.051 143.428
#> 7     Edge Rebio2 vs. Rochedo           0  0.9 509.204 144.192 Extrapolation 10.063  124.47 163.915
#> 8                             Order q = 1                                                          
#> 9     Edge   Marim vs. Rebio2           1 0.85 488.874  91.613 Extrapolation   5.79  80.265 102.961
#> 10    Edge   Marim vs. Rebio2           1  0.9  695.41 100.769 Extrapolation  6.479   88.07 113.468
#> 11    Edge  Marim vs. Rochedo           1 0.85  461.62  92.843 Extrapolation  9.006  75.192 110.494
#> 12    Edge  Marim vs. Rochedo           1  0.9 633.379 101.354 Extrapolation 10.245  81.274 121.434
#> 13    Edge Rebio2 vs. Rochedo           1 0.85 380.037   77.18 Extrapolation  5.639  66.127  88.233
#> 14    Edge Rebio2 vs. Rochedo           1  0.9 509.204  83.346 Extrapolation  5.831  71.917  94.774
#> 15                            Order q = 2                                                          
#> 16    Edge   Marim vs. Rebio2           2 0.85 488.874  56.978 Extrapolation  4.052  49.036   64.92
#> 17    Edge   Marim vs. Rebio2           2  0.9  695.41  58.988 Extrapolation  4.298  50.564  67.412
#> 18    Edge  Marim vs. Rochedo           2 0.85  461.62  51.726 Extrapolation  9.558  32.992   70.46
#> 19    Edge  Marim vs. Rochedo           2  0.9 633.379  53.318 Extrapolation 10.292  33.145  73.491
#> 20    Edge Rebio2 vs. Rochedo           2 0.85 380.037  46.058 Extrapolation  5.015   36.23  55.887
#> 21    Edge Rebio2 vs. Rochedo           2  0.9 509.204   47.49 Extrapolation  5.319  37.065  57.916
#> 
#> $Edge$alpha
#>    Dataset               Pair     Order.q   SC    Size   Alpha        Method   s.e.     LCL     UCL
#> 1                             Order q = 0                                                          
#> 2     Edge   Marim vs. Rebio2           0 0.85 569.732 102.908 Extrapolation  8.067  87.096 118.719
#> 3     Edge   Marim vs. Rebio2           0  0.9 734.385 113.072 Extrapolation 10.196  93.089 133.056
#> 4     Edge  Marim vs. Rochedo           0 0.85 608.656 111.722 Extrapolation  7.552  96.919 126.524
#> 5     Edge  Marim vs. Rochedo           0  0.9 778.281 122.193 Extrapolation  9.125 104.307 140.078
#> 6     Edge Rebio2 vs. Rochedo           0 0.85 553.874  97.197 Extrapolation  6.005  85.426 108.967
#> 7     Edge Rebio2 vs. Rochedo           0  0.9 713.702 107.064 Extrapolation  7.103  93.141 120.986
#> 8                             Order q = 1                                                          
#> 9     Edge   Marim vs. Rebio2           1 0.85 569.732  67.935 Extrapolation  4.521  59.074  76.795
#> 10    Edge   Marim vs. Rebio2           1  0.9 734.385  73.494 Extrapolation  5.079  63.539   83.45
#> 11    Edge  Marim vs. Rochedo           1 0.85 608.656  71.795 Extrapolation   4.84  62.308  81.281
#> 12    Edge  Marim vs. Rochedo           1  0.9 778.281  77.566 Extrapolation  5.284   67.21  87.922
#> 13    Edge Rebio2 vs. Rochedo           1 0.85 553.874  56.793 Extrapolation  3.699  49.542  64.044
#> 14    Edge Rebio2 vs. Rochedo           1  0.9 713.702  61.294 Extrapolation   4.11  53.237   69.35
#> 15                            Order q = 2                                                          
#> 16    Edge   Marim vs. Rebio2           2 0.85 569.732  40.725 Extrapolation  3.616  33.637  47.813
#> 17    Edge   Marim vs. Rebio2           2  0.9 734.385  42.059 Extrapolation  3.839  34.534  49.584
#> 18    Edge  Marim vs. Rochedo           2 0.85 608.656  36.067 Extrapolation  5.144  25.985  46.148
#> 19    Edge  Marim vs. Rochedo           2  0.9 778.281  37.011 Extrapolation   5.44  26.348  47.674
#> 20    Edge Rebio2 vs. Rochedo           2 0.85 553.874  28.926 Extrapolation  2.704  23.626  34.226
#> 21    Edge Rebio2 vs. Rochedo           2  0.9 713.702  29.608 Extrapolation  2.835  24.051  35.165
#> 
#> $Edge$beta
#>    Dataset               Pair     Order.q   SC    Size  Beta        Method  s.e.   LCL   UCL
#> 1                             Order q = 0                                                   
#> 2     Edge   Marim vs. Rebio2           0 0.85 569.732 1.487 Extrapolation 0.145 1.202 1.772
#> 3     Edge   Marim vs. Rebio2           0  0.9 734.385 1.579 Extrapolation 0.172 1.242 1.917
#> 4     Edge  Marim vs. Rochedo           0 0.85 608.656 1.376 Extrapolation 0.101 1.178 1.575
#> 5     Edge  Marim vs. Rochedo           0  0.9 778.281 1.432 Extrapolation 0.117 1.204  1.66
#> 6     Edge Rebio2 vs. Rochedo           0 0.85 553.874 1.319 Extrapolation 0.102  1.12 1.519
#> 7     Edge Rebio2 vs. Rochedo           0  0.9 713.702 1.347 Extrapolation 0.107 1.138 1.556
#> 8                             Order q = 1                                                   
#> 9     Edge   Marim vs. Rebio2           1 0.85 569.732 1.349 Extrapolation 0.094 1.165 1.532
#> 10    Edge   Marim vs. Rebio2           1  0.9 734.385 1.371 Extrapolation 0.103 1.169 1.573
#> 11    Edge  Marim vs. Rochedo           1 0.85 608.656 1.293 Extrapolation  0.07 1.156  1.43
#> 12    Edge  Marim vs. Rochedo           1  0.9 778.281 1.307 Extrapolation 0.075  1.16 1.453
#> 13    Edge Rebio2 vs. Rochedo           1 0.85 553.874 1.359 Extrapolation 0.078 1.207 1.511
#> 14    Edge Rebio2 vs. Rochedo           1  0.9 713.702  1.36 Extrapolation 0.076  1.21  1.51
#> 15                            Order q = 2                                                   
#> 16    Edge   Marim vs. Rebio2           2 0.85 569.732 1.399 Extrapolation 0.097 1.208  1.59
#> 17    Edge   Marim vs. Rebio2           2  0.9 734.385 1.403 Extrapolation 0.101 1.205   1.6
#> 18    Edge  Marim vs. Rochedo           2 0.85 608.656 1.434 Extrapolation 0.051 1.334 1.534
#> 19    Edge  Marim vs. Rochedo           2  0.9 778.281 1.441 Extrapolation 0.051  1.34 1.541
#> 20    Edge Rebio2 vs. Rochedo           2 0.85 553.874 1.592 Extrapolation 0.075 1.446 1.739
#> 21    Edge Rebio2 vs. Rochedo           2  0.9 713.702 1.604 Extrapolation 0.075 1.458  1.75
#> 
#> 
#> $Interior
#> $Interior$gamma
#>     Dataset               Pair     Order.q   SC    Size   Gamma        Method   s.e.     LCL     UCL
#> 1                              Order q = 0                                                          
#> 2  Interior   Marim vs. Rebio2           0 0.85 377.835 137.014 Extrapolation 16.357 104.955 169.073
#> 3  Interior   Marim vs. Rebio2           0  0.9 504.138 152.614 Extrapolation 20.559  112.32 192.909
#> 4  Interior  Marim vs. Rochedo           0 0.85 443.568 160.619 Extrapolation  9.803 141.405 179.833
#> 5  Interior  Marim vs. Rochedo           0  0.9 571.802 176.457 Extrapolation 13.308 150.374  202.54
#> 6  Interior Rebio2 vs. Rochedo           0 0.85 468.714  157.16 Extrapolation  16.28 125.252 189.067
#> 7  Interior Rebio2 vs. Rochedo           0  0.9 631.749 177.289 Extrapolation 19.461 139.147 215.431
#> 8                              Order q = 1                                                          
#> 9  Interior   Marim vs. Rebio2           1 0.85 377.835  98.615 Extrapolation  6.092  86.674 110.555
#> 10 Interior   Marim vs. Rebio2           1  0.9 504.138 107.077 Extrapolation  7.126   93.11 121.044
#> 11 Interior  Marim vs. Rochedo           1 0.85 443.568  93.885 Extrapolation  3.301  87.414 100.356
#> 12 Interior  Marim vs. Rochedo           1  0.9 571.802 101.185 Extrapolation  3.867  93.605 108.765
#> 13 Interior Rebio2 vs. Rochedo           1 0.85 468.714  90.808 Extrapolation  7.239  76.621 104.996
#> 14 Interior Rebio2 vs. Rochedo           1  0.9 631.749  98.578 Extrapolation  8.378  82.158 114.998
#> 15                             Order q = 2                                                          
#> 16 Interior   Marim vs. Rebio2           2 0.85 377.835  71.987 Extrapolation  4.648  62.876  81.097
#> 17 Interior   Marim vs. Rebio2           2  0.9 504.138  75.552 Extrapolation  5.178  65.404  85.701
#> 18 Interior  Marim vs. Rochedo           2 0.85 443.568  43.062 Extrapolation  4.021   35.18  50.943
#> 19 Interior  Marim vs. Rochedo           2  0.9 571.802  43.999 Extrapolation  4.266  35.637  52.361
#> 20 Interior Rebio2 vs. Rochedo           2 0.85 468.714  46.915 Extrapolation  5.364  36.401  57.428
#> 21 Interior Rebio2 vs. Rochedo           2  0.9 631.749  48.134 Extrapolation  5.834    36.7  59.568
#> 
#> $Interior$alpha
#>     Dataset               Pair     Order.q   SC    Size   Alpha        Method   s.e.    LCL     UCL
#> 1                              Order q = 0                                                         
#> 2  Interior   Marim vs. Rebio2           0 0.85   516.7  97.723 Extrapolation  7.066 83.874 111.573
#> 3  Interior   Marim vs. Rebio2           0  0.9 662.267 106.711 Extrapolation  7.802  91.42 122.003
#> 4  Interior  Marim vs. Rochedo           0 0.85 500.755  95.957 Extrapolation  6.047 84.104  107.81
#> 5  Interior  Marim vs. Rochedo           0  0.9 626.341 103.713 Extrapolation  6.979 90.035 117.391
#> 6  Interior Rebio2 vs. Rochedo           0 0.85 533.375  92.892 Extrapolation 10.615 72.087 113.697
#> 7  Interior Rebio2 vs. Rochedo           0  0.9 698.207 103.068 Extrapolation 13.673 76.269 129.867
#> 8                              Order q = 1                                                         
#> 9  Interior   Marim vs. Rebio2           1 0.85   516.7  72.058 Extrapolation  4.944 62.368  81.749
#> 10 Interior   Marim vs. Rebio2           1  0.9 662.267  77.989 Extrapolation  5.313 67.576  88.403
#> 11 Interior  Marim vs. Rochedo           1 0.85 500.755  58.908 Extrapolation  4.897  49.31  68.505
#> 12 Interior  Marim vs. Rochedo           1  0.9 626.341  63.269 Extrapolation   5.37 52.744  73.794
#> 13 Interior Rebio2 vs. Rochedo           1 0.85 533.375   53.97 Extrapolation  3.424 47.259   60.68
#> 14 Interior Rebio2 vs. Rochedo           1  0.9 698.207  58.407 Extrapolation  3.967 50.633  66.182
#> 15                             Order q = 2                                                         
#> 16 Interior   Marim vs. Rebio2           2 0.85   516.7  50.209 Extrapolation  2.984 44.361  56.057
#> 17 Interior   Marim vs. Rebio2           2  0.9 662.267  52.431 Extrapolation  3.109 46.338  58.523
#> 18 Interior  Marim vs. Rochedo           2 0.85 500.755   25.89 Extrapolation  3.876 18.293  33.487
#> 19 Interior  Marim vs. Rochedo           2  0.9 626.341  26.429 Extrapolation  4.047 18.497   34.36
#> 20 Interior Rebio2 vs. Rochedo           2 0.85 533.375  25.846 Extrapolation  2.285 21.368  30.325
#> 21 Interior Rebio2 vs. Rochedo           2  0.9 698.207  26.441 Extrapolation  2.387 21.762   31.12
#> 
#> $Interior$beta
#>     Dataset               Pair     Order.q   SC    Size  Beta        Method  s.e.   LCL   UCL
#> 1                              Order q = 0                                                   
#> 2  Interior   Marim vs. Rebio2           0 0.85   516.7 1.402 Extrapolation 0.082 1.242 1.562
#> 3  Interior   Marim vs. Rebio2           0  0.9 662.267  1.43 Extrapolation 0.102  1.23  1.63
#> 4  Interior  Marim vs. Rochedo           0 0.85 500.755 1.674 Extrapolation 0.062 1.552 1.796
#> 5  Interior  Marim vs. Rochedo           0  0.9 626.341 1.701 Extrapolation 0.058 1.587 1.816
#> 6  Interior Rebio2 vs. Rochedo           0 0.85 533.375 1.692 Extrapolation 0.135 1.428 1.956
#> 7  Interior Rebio2 vs. Rochedo           0  0.9 698.207  1.72 Extrapolation 0.145 1.436 2.004
#> 8                              Order q = 1                                                   
#> 9  Interior   Marim vs. Rebio2           1 0.85   516.7 1.369 Extrapolation 0.067 1.238   1.5
#> 10 Interior   Marim vs. Rebio2           1  0.9 662.267 1.373 Extrapolation 0.077 1.222 1.524
#> 11 Interior  Marim vs. Rochedo           1 0.85 500.755 1.594 Extrapolation 0.058  1.48 1.707
#> 12 Interior  Marim vs. Rochedo           1  0.9 626.341 1.599 Extrapolation 0.055 1.492 1.707
#> 13 Interior Rebio2 vs. Rochedo           1 0.85 533.375 1.683 Extrapolation 0.096 1.494 1.871
#> 14 Interior Rebio2 vs. Rochedo           1  0.9 698.207 1.688 Extrapolation 0.098 1.496  1.88
#> 15                             Order q = 2                                                   
#> 16 Interior   Marim vs. Rebio2           2 0.85   516.7 1.434 Extrapolation 0.061 1.314 1.553
#> 17 Interior   Marim vs. Rebio2           2  0.9 662.267 1.441 Extrapolation 0.065 1.314 1.568
#> 18 Interior  Marim vs. Rochedo           2 0.85 500.755 1.663 Extrapolation 0.055 1.556  1.77
#> 19 Interior  Marim vs. Rochedo           2  0.9 626.341 1.665 Extrapolation 0.056 1.556 1.774
#> 20 Interior Rebio2 vs. Rochedo           2 0.85 533.375 1.815 Extrapolation 0.065 1.688 1.942
#> 21 Interior Rebio2 vs. Rochedo           2  0.9 698.207  1.82 Extrapolation 0.064 1.696 1.945
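
Because the decomposition is multiplicative, each beta estimate in the output above equals the gamma estimate divided by the alpha estimate at the same coverage. The following base-R check re-types three rows of the Edge output ("Marim vs. Rebio2", SC = 0.85); the data frame is built by hand for illustration and is not extracted from the returned object.

```r
# Gamma, alpha and beta for Edge, "Marim vs. Rebio2", SC = 0.85,
# copied from the printed output.
est <- data.frame(
  Order.q = c(0, 1, 2),
  Gamma   = c(153.065, 91.613, 56.978),
  Alpha   = c(102.908, 67.935, 40.725),
  Beta    = c(1.487, 1.349, 1.399)
)

# Multiplicative decomposition: beta = gamma / alpha.
est$Beta_check <- round(est$Gamma / est$Alpha, 3)
est
```

Small differences in the last digit can occur because the printed gamma and alpha values are themselves rounded.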

The following commands return the TD estimates for all pairs of assemblages at two user-specified sample sizes (e.g., 300 and 500).

## Size-based R/E for all pairs of assemblages with taxonomic gamma and alpha diversity
data(Brazil_rainforests)

data = list("Edge"     = sapply(Brazil_rainforests, function(x) x[,1]),
            "Interior" = sapply(Brazil_rainforests, function(x) x[,2]))
output_TDs_abun_byuser = iNEXTbeta3D(data = data, diversity = 'TD', 
                                     datatype = 'abundance', base = "size", nboot = 10,
                                     level = c(300, 500), by_pair = TRUE)
output_TDs_abun_byuser

#> $Edge
#> $Edge$gamma
#>    Dataset               Pair     Order.q Size    SC   Gamma        Method   s.e.     LCL     UCL
#> 1                             Order q = 0                                                        
#> 2     Edge   Marim vs. Rebio2           0  300 0.783 118.733   Rarefaction  6.096 106.784 130.681
#> 3     Edge   Marim vs. Rebio2           0  500 0.853 154.717 Extrapolation  12.24 130.728 178.706
#> 4     Edge  Marim vs. Rochedo           0  300  0.78 124.199   Rarefaction  7.855 108.803 139.596
#> 5     Edge  Marim vs. Rochedo           0  500 0.863 159.269 Extrapolation 11.442 136.843 181.696
#> 6     Edge Rebio2 vs. Rochedo           0  300 0.807  114.57   Rarefaction  2.711 109.256 119.884
#> 7     Edge Rebio2 vs. Rochedo           0  500 0.897 143.257 Extrapolation  4.562 134.315 152.199
#> 8                             Order q = 1                                                        
#> 9     Edge   Marim vs. Rebio2           1  300 0.783  78.969   Rarefaction  4.521  70.108   87.83
#> 10    Edge   Marim vs. Rebio2           1  500 0.853  92.213 Extrapolation  6.084  80.289 104.137
#> 11    Edge  Marim vs. Rochedo           1  300  0.78  81.284   Rarefaction  6.397  68.747  93.821
#> 12    Edge  Marim vs. Rochedo           1  500 0.863  95.027 Extrapolation  8.272  78.813  111.24
#> 13    Edge Rebio2 vs. Rochedo           1  300 0.807  72.189   Rarefaction   2.14  67.995  76.383
#> 14    Edge Rebio2 vs. Rochedo           1  500 0.897  82.966 Extrapolation  2.567  77.936  87.997
#> 15                            Order q = 2                                                        
#> 16    Edge   Marim vs. Rebio2           2  300 0.783   53.14   Rarefaction  4.016  45.269   61.01
#> 17    Edge   Marim vs. Rebio2           2  500 0.853  57.124 Extrapolation  4.559  48.188   66.06
#> 18    Edge  Marim vs. Rochedo           2  300  0.78  48.829   Rarefaction  5.336   38.37  59.288
#> 19    Edge  Marim vs. Rochedo           2  500 0.863  52.167 Extrapolation  6.078  40.255  64.079
#> 20    Edge Rebio2 vs. Rochedo           2  300 0.807  44.643   Rarefaction  2.119   40.49  48.795
#> 21    Edge Rebio2 vs. Rochedo           2  500 0.897  47.411 Extrapolation  2.362  42.781   52.04
#> 
#> $Edge$alpha
#>    Dataset               Pair     Order.q Size    SC   Alpha        Method  s.e.    LCL     UCL
#> 1                             Order q = 0                                                      
#> 2     Edge   Marim vs. Rebio2           0  300 0.708  74.152   Rarefaction 2.036 70.162  78.142
#> 3     Edge   Marim vs. Rebio2           0  500 0.822  97.195 Extrapolation 3.204 90.916 103.474
#> 4     Edge  Marim vs. Rochedo           0  300 0.686  77.438   Rarefaction 2.031 73.457   81.42
#> 5     Edge  Marim vs. Rochedo           0  500 0.806 102.405 Extrapolation 3.749 95.057 109.753
#> 6     Edge Rebio2 vs. Rochedo           0  300 0.715  70.435   Rarefaction 1.682 67.137  73.732
#> 7     Edge Rebio2 vs. Rochedo           0  500 0.828  92.861 Extrapolation 3.001 86.979  98.743
#> 8                             Order q = 1                                                      
#> 9     Edge   Marim vs. Rebio2           1  300 0.708  53.388   Rarefaction 2.199 49.079  57.698
#> 10    Edge   Marim vs. Rebio2           1  500 0.822  64.932 Extrapolation 2.715  59.61  70.253
#> 11    Edge  Marim vs. Rochedo           1  300 0.686  54.582   Rarefaction 2.016  50.63  58.533
#> 12    Edge  Marim vs. Rochedo           1  500 0.806  66.923 Extrapolation 2.569 61.887  71.959
#> 13    Edge Rebio2 vs. Rochedo           1  300 0.715  45.657   Rarefaction 2.081 41.578  49.736
#> 14    Edge Rebio2 vs. Rochedo           1  500 0.828  54.906 Extrapolation 2.654 49.705  60.107
#> 15                            Order q = 2                                                      
#> 16    Edge   Marim vs. Rebio2           2  300 0.708   36.13   Rarefaction 2.519 31.192  41.067
#> 17    Edge   Marim vs. Rebio2           2  500 0.822  39.937 Extrapolation 2.976 34.104   45.77
#> 18    Edge  Marim vs. Rochedo           2  300 0.686   32.19   Rarefaction 3.039 26.233  38.147
#> 19    Edge  Marim vs. Rochedo           2  500 0.806  35.172 Extrapolation  3.57 28.175  42.169
#> 20    Edge Rebio2 vs. Rochedo           2  300 0.715   26.61   Rarefaction 2.408 21.891   31.33
#> 21    Edge Rebio2 vs. Rochedo           2  500 0.828  28.609 Extrapolation 2.756 23.208   34.01
#> 
#> 
#> $Interior
#> $Interior$gamma
#>     Dataset               Pair     Order.q Size    SC   Gamma        Method  s.e.     LCL     UCL
#> 1                              Order q = 0                                                       
#> 2  Interior   Marim vs. Rebio2           0  300 0.807 123.729   Rarefaction  6.05 111.872 135.586
#> 3  Interior   Marim vs. Rebio2           0  500 0.899 152.197 Extrapolation 9.704 133.177 171.217
#> 4  Interior  Marim vs. Rochedo           0  300 0.763 133.315   Rarefaction 4.633 124.234 142.396
#> 5  Interior  Marim vs. Rochedo           0  500 0.875 168.384 Extrapolation 6.496 155.652 181.115
#> 6  Interior Rebio2 vs. Rochedo           0  300 0.771 125.664   Rarefaction 4.423 116.995 134.332
#> 7  Interior Rebio2 vs. Rochedo           0  500 0.861  161.68 Extrapolation 7.179  147.61  175.75
#> 8                              Order q = 1                                                       
#> 9  Interior   Marim vs. Rebio2           1  300 0.807  91.871   Rarefaction  5.49   81.11 102.633
#> 10 Interior   Marim vs. Rebio2           1  500 0.899  106.84 Extrapolation 7.093  92.938 120.743
#> 11 Interior  Marim vs. Rochedo           1  300 0.763  82.544   Rarefaction 5.879  71.022  94.066
#> 12 Interior  Marim vs. Rochedo           1  500 0.875  97.367 Extrapolation   7.2  83.255  111.48
#> 13 Interior Rebio2 vs. Rochedo           1  300 0.771  79.175   Rarefaction 4.949  69.475  88.875
#> 14 Interior Rebio2 vs. Rochedo           1  500 0.861  92.512 Extrapolation 6.088   80.58 104.443
#> 15                             Order q = 2                                                       
#> 16 Interior   Marim vs. Rebio2           2  300 0.807  68.632   Rarefaction  4.89  59.047  78.217
#> 17 Interior   Marim vs. Rebio2           2  500 0.899   75.46 Extrapolation 5.791   64.11  86.809
#> 18 Interior  Marim vs. Rochedo           2  300 0.763  41.188   Rarefaction 5.102  31.189  51.188
#> 19 Interior  Marim vs. Rochedo           2  500 0.875  43.528 Extrapolation 5.657   32.44  54.617
#> 20 Interior Rebio2 vs. Rochedo           2  300 0.771   44.46   Rarefaction 5.293  34.087  54.833
#> 21 Interior Rebio2 vs. Rochedo           2  500 0.861  47.205 Extrapolation 5.962  35.519   58.89
#> 
#> $Interior$alpha
#>     Dataset               Pair     Order.q Size    SC  Alpha        Method  s.e.    LCL     UCL
#> 1                              Order q = 0                                                     
#> 2  Interior   Marim vs. Rebio2           0  300 0.726 75.379   Rarefaction 2.164 71.138   79.62
#> 3  Interior   Marim vs. Rebio2           0  500 0.843  96.44 Extrapolation 2.428  91.68 101.199
#> 4  Interior  Marim vs. Rochedo           0  300 0.714 74.739   Rarefaction 1.962 70.893  78.585
#> 5  Interior  Marim vs. Rochedo           0  500  0.85   95.9 Extrapolation 3.115 89.794 102.006
#> 6  Interior Rebio2 vs. Rochedo           0  300 0.733 69.209   Rarefaction 1.246 66.767  71.651
#> 7  Interior Rebio2 vs. Rochedo           0  500 0.837  90.28 Extrapolation 2.079 86.205  94.355
#> 8                              Order q = 1                                                     
#> 9  Interior   Marim vs. Rebio2           1  300 0.726  58.68   Rarefaction 2.555 53.672  63.688
#> 10 Interior   Marim vs. Rebio2           1  500 0.843 71.244 Extrapolation  3.25 64.875  77.614
#> 11 Interior  Marim vs. Rochedo           1  300 0.714 48.698   Rarefaction 3.064 42.692  54.705
#> 12 Interior  Marim vs. Rochedo           1  500  0.85 58.877 Extrapolation  3.68 51.664  66.091
#> 13 Interior Rebio2 vs. Rochedo           1  300 0.733 44.372   Rarefaction 2.365 39.737  49.008
#> 14 Interior Rebio2 vs. Rochedo           1  500 0.837 52.877 Extrapolation 2.796 47.397  58.357
#> 15                             Order q = 2                                                     
#> 16 Interior   Marim vs. Rebio2           2  300 0.726 44.072   Rarefaction 2.976 38.239  49.905
#> 17 Interior   Marim vs. Rebio2           2  500 0.843 49.888 Extrapolation 3.708 42.619  57.156
#> 18 Interior  Marim vs. Rochedo           2  300 0.714 24.242   Rarefaction  3.73 16.931  31.553
#> 19 Interior  Marim vs. Rochedo           2  500  0.85 25.886 Extrapolation 4.198 17.658  34.115
#> 20 Interior Rebio2 vs. Rochedo           2  300 0.733 24.064   Rarefaction 3.035 18.115  30.013
#> 21 Interior Rebio2 vs. Rochedo           2  500 0.837 25.683 Extrapolation 3.472 18.879  32.487
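
The LCL and UCL columns in the tables above are consistent with a normal-approximation 95% confidence interval, that is, estimate ± 1.96 × s.e. This is inferred from the printed numbers rather than stated in this section, so the sketch below is only a check on three hand-copied q = 0 rows of $Edge$gamma from the size-based output.

```r
# Three q = 0 rows of $Edge$gamma (size-based), copied from the printed output.
ci <- data.frame(
  Gamma = c(118.733, 154.717, 124.199),
  se    = c(6.096, 12.24, 7.855),
  LCL   = c(106.784, 130.728, 108.803),
  UCL   = c(130.681, 178.706, 139.596)
)

# Assumed interval: estimate -/+ z * s.e. with z the 97.5% normal quantile.
z <- qnorm(0.975)
ci$LCL_check <- round(ci$Gamma - z * ci$se, 3)
ci$UCL_check <- round(ci$Gamma + z * ci$se, 3)
ci
```

The recomputed limits agree with the printed ones up to rounding of the displayed estimate and s.e.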

EXAMPLE 3: Incidence data with default sample sizes or coverage values (not by pairs)

We can also use incidence raw data (Second_growth_forests) to compute coverage-based standardized gamma, alpha, beta diversity, and four dissimilarities under base = 'coverage', and also size-based standardized gamma and alpha diversity. Run the following code to perform incidence data analysis. The output data frame is similar to that based on abundance data and thus is omitted.

## Coverage-based R/E Analysis with taxonomic diversity for incidence raw data (not by pairs)
data(Second_growth_forests)

data = list("CR 2005 vs. 2017" = Second_growth_forests[[1]][c(1,3)],
            "JE 2005 vs. 2017" = Second_growth_forests[[2]][c(1,3)])
output_TDc_inci = iNEXTbeta3D(data = data, diversity = 'TD', 
                              datatype = 'incidence_raw', base = "coverage", nboot = 10)
output_TDc_inci

The same procedures can be applied to incidence data. Based on the demo dataset, we display below the coverage-based R/E curves for comparing temporal beta diversity between 2005 and 2017 in two second-growth forests (CR and JE) by running the following code:

## Coverage-based R/E curves with taxonomic gamma, alpha and beta diversity 
ggiNEXTbeta3D(output_TDc_inci, type = 'B')

The following commands return the size-based R/E sampling curves for gamma and alpha taxonomic diversity:

## Size-based R/E curves with taxonomic gamma and alpha diversity (not by pairs)
data(Second_growth_forests)

data = list("CR 2005 vs. 2017" = Second_growth_forests[[1]][c(1,3)],
            "JE 2005 vs. 2017" = Second_growth_forests[[2]][c(1,3)])
output_TDs_inci = iNEXTbeta3D(data = data, diversity = 'TD', 
                              datatype = 'incidence_raw', base = "size", nboot = 10)

ggiNEXTbeta3D(output_TDs_inci)

EXAMPLE 4: Incidence data for all pairs of assemblages with user-specified sample sizes or coverage values

As with abundance data, users can also specify sample sizes (i.e., numbers of sampling units) or coverage values to obtain the pertinent output for all pairs of assemblages. The example code below uses two user-specified levels of sample coverage (90% and 95%); the output is omitted.

## Coverage-based R/E Analysis for all pairs of assemblages with taxonomic diversity for incidence data
data(Second_growth_forests)

output_TDc_inci_byuser = iNEXTbeta3D(data = Second_growth_forests, diversity = 'TD', 
                                     datatype = 'incidence_raw', base = "coverage", 
                                     nboot = 10, level = c(0.9, 0.95), by_pair = TRUE)
output_TDc_inci_byuser
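
With by_pair = TRUE, every pair of assemblages within a dataset is analysed, so K assemblages give choose(K, 2) pairs. The base-R sketch below only enumerates the pairs and builds labels in the "A vs. B" style seen in the DataInfobeta3D output. The assemblage names are taken from that output, and the labelling code is our own illustration, not the package's internal code.

```r
# Assemblage names within one dataset (as shown in the DataInfobeta3D output)
assemblages <- c("Year_2005", "Year_2011", "Year_2017")

# All unordered pairs, labelled "A vs. B"
pair_labels <- combn(assemblages, 2,
                     FUN = function(p) paste(p, collapse = " vs. "))
pair_labels

# Number of pairs for K assemblages
length(pair_labels) == choose(length(assemblages), 2)
```

For three assemblages this gives three pairs per dataset; the count grows quadratically with K, which is worth keeping in mind when choosing nboot.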

The following commands return the TD estimates for all pairs of assemblages at two user-specified sample sizes (100 and 200 sampling units).

## Size-based R/E for all pairs of assemblages with taxonomic gamma and alpha diversity
data(Second_growth_forests)

output_TDs_inci_byuser = iNEXTbeta3D(data = Second_growth_forests, diversity = 'TD', 
                                     datatype = 'incidence_raw', base = "size", 
                                     nboot = 10, level = c(100, 200), by_pair = TRUE)
output_TDs_inci_byuser
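
Whether a user-specified size is reached by rarefaction or by extrapolation depends on how it compares with the reference sample size. The helper below is a hypothetical illustration of that comparison, not a function of the package. It assumes that a level below the reference size is labelled "Rarefaction", a level above it "Extrapolation", and a level equal to it "Observed".

```r
# Hypothetical helper: label each requested size relative to the reference size
classify_level <- function(level, n_ref) {
  ifelse(level < n_ref, "Rarefaction",
         ifelse(level > n_ref, "Extrapolation", "Observed"))
}

# Incidence example: the reference samples have T = 100 sampling units
classify_level(c(100, 200), n_ref = 100)

# Abundance example: Interior, Rebio2 vs. Rochedo has n = 363 individuals
classify_level(c(300, 500), n_ref = 363)
```

The second call agrees with the Method column shown earlier for that pair, where size 300 is a rarefied estimate and size 500 an extrapolated one.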

PHYLOGENETIC DIVERSITY (PD): RAREFACTION/EXTRAPOLATION VIA EXAMPLES

EXAMPLE 5: Abundance data with default sample sizes or coverage values (not by pairs)

As with taxonomic diversity, iNEXT.beta3D computes coverage-based standardized phylogenetic gamma, alpha, beta diversity as well as four classes of phylogenetic dissimilarity indices; it also computes size-based standardized phylogenetic gamma and alpha diversity. The species names (or identification codes) in the phylogenetic tree must exactly match those in the corresponding species abundance/incidence data. Two types of phylogenetic rarefaction and extrapolation curves (coverage- and size-based sampling curves) are also provided.

The required argument for performing PD analysis is PDtree. For example, the phylogenetic tree for all observed species (including species in “Marim”, “Rebio2”, and “Rochedo” fragments) is stored in a data file named "Brazil_tree", so we enter the argument PDtree = Brazil_tree. Two optional arguments are PDtype and PDreftime. There are two options for PDtype: "PD" (effective total branch length) or "meanPD" (effective number of equally divergent lineages, meanPD = PD/tree depth). Default is PDtype = "meanPD". PDreftime is a numerical value specifying a reference time for computing phylogenetic diversity. By default (PDreftime = NULL), the reference time is set to the tree depth, i.e., the age of the root of the phylogenetic tree. Run the following code to perform PD analysis. The output data frame is similar to that shown for taxonomic diversity and thus is omitted.

## Coverage-based R/E Analysis with phylogenetic diversity for abundance data (not by pairs)
data(Brazil_rainforests)
data(Brazil_tree)

output_PDc_abun = iNEXTbeta3D(data = Brazil_rainforests[1:2], diversity = 'PD', 
                              datatype = 'abundance', base = "coverage", nboot = 10, 
                              PDtree = Brazil_tree, PDreftime = NULL, PDtype = 'meanPD')
output_PDc_abun

Run the following code to display the R/E curves for phylogenetic gamma, alpha, and beta diversity:

## Coverage-based R/E sampling curves for phylogenetic gamma, alpha and beta diversity
ggiNEXTbeta3D(output_PDc_abun, type = 'B')

The following commands return the size-based R/E sampling curves for gamma and alpha phylogenetic diversity:

## Size-based R/E curves for phylogenetic gamma and alpha diversity (not by pairs)
data(Brazil_rainforests)
data(Brazil_tree)

output_PDs_abun = iNEXTbeta3D(data = Brazil_rainforests[1:2], diversity = 'PD', 
                              datatype = 'abundance', base = "size", nboot = 10, 
                              PDtree = Brazil_tree, PDreftime = NULL, PDtype = 'meanPD')
ggiNEXTbeta3D(output_PDs_abun)

FUNCTIONAL DIVERSITY (FD): RAREFACTION/EXTRAPOLATION VIA EXAMPLES

EXAMPLE 6: Abundance data with default sample sizes or coverage values (not by pairs)

As with taxonomic and phylogenetic diversity, iNEXT.beta3D computes coverage-based standardized functional gamma, alpha, beta diversity as well as four classes of functional dissimilarity indices; it also computes size-based standardized functional gamma and alpha diversity. The species names (or identification codes) in the distance matrix must exactly match those in the corresponding species abundance/incidence data. Two types of functional rarefaction and extrapolation curves (coverage- and size-based sampling curves) are also provided.

The required argument for performing FD analysis is FDdistM. For example, the distance matrix for all species (including species in “Marim”, “Rebio2”, and “Rochedo” fragments) is stored in a data file named "Brazil_distM", so we enter the argument FDdistM = Brazil_distM. Three optional arguments are:

(1) FDtype: FDtype = "AUC" means FD is computed from the area under the curve of a tau-profile by integrating all plausible threshold values between zero and one; FDtype = "tau_value" means FD is computed under a specific threshold value to be specified in the argument FDtau.
(2) FDtau: a numerical value specifying the tau value (threshold level) that will be used to compute FD. If FDtype = "tau_value" and FDtau = NULL, then the threshold level is set to the mean distance between any two individuals randomly selected from the pooled data over all datasets (i.e., quadratic entropy).
(3) FDcut_number: a number specifying how finely the [0, 1] interval is cut into equally spaced sub-intervals to obtain the AUC value. Default is FDcut_number = 30. If more accurate integration is desired, use a larger integer.

Run the following code to perform FD analysis. The output data frame is similar to that shown for taxonomic diversity and thus is omitted; see the graphical display of the output below.

## Coverage-based R/E Analysis with functional diversity for abundance data - FDtype = 'AUC' (area 
## under curve) by considering all threshold values between zero and one (not by pairs)
data(Brazil_rainforests)
data(Brazil_distM)

output_FDc_abun = iNEXTbeta3D(data = Brazil_rainforests[1:2], diversity = 'FD', 
                              datatype = 'abundance', base = "coverage", nboot = 10, 
                              FDdistM = Brazil_distM, FDtype = 'AUC', FDcut_number = 30)
output_FDc_abun
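
To make the idea behind FDtype = 'AUC' concrete, the base-R sketch below builds a tau-profile for toy data and integrates it over [0, 1]. It is an illustration under stated assumptions, not the package's implementation: the toy abundances and distances are hypothetical, the function fd0_at_tau is our own helper for order q = 0 in a threshold-based (attribute-diversity) formulation, and the midpoint rule with n_cut sub-intervals is only one possible way to approximate the area.

```r
# Toy data (hypothetical): relative abundances and pairwise distances in [0, 1]
p <- c(A = 0.5, B = 0.3, C = 0.2)
d <- matrix(c(0.0, 0.2, 0.6,
              0.2, 0.0, 0.4,
              0.6, 0.4, 0.0), nrow = 3, byrow = TRUE,
            dimnames = list(names(p), names(p)))

# Assumed formulation for q = 0: distances are truncated at tau, a_i is the
# total abundance functionally indistinct from species i, FD_0 = sum_i p_i / a_i
fd0_at_tau <- function(p, d, tau) {
  a <- as.vector((1 - pmin(d, tau) / tau) %*% p)
  sum(p / a)
}

# Midpoints of n_cut equal-width sub-intervals of [0, 1]
# (n_cut plays the role of FDcut_number in this sketch)
n_cut    <- 30
tau_grid <- (seq_len(n_cut) - 0.5) / n_cut
profile  <- sapply(tau_grid, function(tau) fd0_at_tau(p, d, tau))

# Midpoint-rule approximation of the area under the tau-profile
auc <- mean(profile)
auc
```

In this toy example the profile starts at the species richness (3) for very small tau, where all species are functionally distinct, and decreases as tau grows; the area under it summarizes FD over all thresholds in a single number.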

Run the following code to display the R/E curves for functional gamma, alpha, and beta diversity:

## Coverage-based R/E sampling curves for functional gamma, alpha and beta diversity
ggiNEXTbeta3D(output_FDc_abun, type = 'B')

The following commands return the size-based R/E sampling curves for gamma and alpha functional diversity:

## Size-based R/E curves for functional gamma and alpha diversity (not by pairs)
data(Brazil_rainforests)
data(Brazil_distM)

output_FDs_abun = iNEXTbeta3D(data = Brazil_rainforests[1:2], diversity = 'FD', 
                              datatype = 'abundance', base = "size", nboot = 10, 
                              FDdistM = Brazil_distM, FDtype = 'AUC', FDcut_number = 30)
ggiNEXTbeta3D(output_FDs_abun)

DATA INFORMATION: FUNCTION DataInfobeta3D()

The function DataInfobeta3D() provides basic data information for (1) the reference sample in each individual assemblage, (2) the gamma reference sample in the pooled assemblage, and (3) the alpha reference sample in the joint assemblage. The function DataInfobeta3D() with default arguments is shown below:

DataInfobeta3D(data, diversity = "TD", datatype = "abundance",
               PDtree = NULL, PDreftime = NULL, FDdistM = NULL, FDtype = "AUC", FDtau = NULL,
               by_pair = FALSE)

All arguments in the above function are the same as those for the main function iNEXTbeta3D. Running the DataInfobeta3D() function returns basic data information including sample size, observed species richness, two sample coverage estimates (SC(n) and SC(2n)) as well as other relevant information in each of the three dimensions of diversity. We use Brazil_rainforests data to demo the function for each dimension.

## Data information for taxonomic diversity (not by pairs)
data(Brazil_rainforests)
DataInfobeta3D(data = Brazil_rainforests[1:2], diversity = 'TD', datatype = 'abundance')

#>   Dataset        Assemblage   n S.obs SC(n) SC(2n) f1 f2 f3 f4 f5
#> 1   Marim              Edge 158    84 0.691  0.852 49 18  8  4  1
#> 2   Marim          Interior 144    80 0.704  0.899 43 23  7  5  0
#> 3   Marim Pooled assemblage 302   119 0.855  0.969 44 34 17  9  7
#> 4   Marim  Joint assemblage 302   164 0.696  0.876 92 41 15  9  1
#> 5  Rebio2              Edge 162    70 0.754  0.895 40 17  4  2  0
#> 6  Rebio2          Interior 168    74 0.763  0.877 40 13  8  4  4
#> 7  Rebio2 Pooled assemblage 330   118 0.819  0.901 60 18 15  5  3
#> 8  Rebio2  Joint assemblage 330   144 0.758  0.886 80 30 12  6  4

Output description:

Dataset = the input datasets.
Pair = the pair of assemblages being compared (this column appears only when by_pair = TRUE).
Assemblage = Individual assemblages, 'Pooled assemblage' (for gamma) or 'Joint assemblage' (for alpha).
n = number of observed individuals in the reference sample (sample size).
S.obs = number of observed species in the reference sample.
SC(n) = sample coverage estimate of the reference sample.
SC(2n) = sample coverage estimate of twice the reference sample size.
f1-f5 = the first five species abundance frequency counts in the reference sample.
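
The difference between the 'Pooled assemblage' and 'Joint assemblage' rows can be seen with a toy example. The sketch below reflects our reading of the table above, in which the joint rows equal the sums of the individual rows (e.g., S.obs = 84 + 80 = 164 for Marim): pooling merges the counts of the same species across assemblages, whereas the joint assemblage keeps each species-by-assemblage count separate. The data and the helper summarise_counts are hypothetical.

```r
# Toy abundance data (hypothetical): 5 species in two assemblages
x <- data.frame(Edge     = c(3, 1, 0, 2, 1),
                Interior = c(0, 1, 4, 2, 0),
                row.names = paste0("sp", 1:5))

pooled <- rowSums(x)                     # gamma: same species merged
joint  <- unlist(x, use.names = FALSE)   # alpha: every cell kept separate

# Sample size, observed richness, singletons and doubletons
summarise_counts <- function(counts) {
  counts <- counts[counts > 0]
  c(n = sum(counts), S.obs = length(counts),
    f1 = sum(counts == 1), f2 = sum(counts == 2))
}

info <- rbind(Edge     = summarise_counts(x$Edge),
              Interior = summarise_counts(x$Interior),
              Pooled   = summarise_counts(pooled),
              Joint    = summarise_counts(joint))
info
```

Both pooled and joint samples have the same size n, but the joint assemblage has S.obs, f1 and f2 equal to the sums over the individual assemblages, while the pooled assemblage counts a shared species only once.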

## Data information for taxonomic diversity for all pairs of assemblages
data(Brazil_rainforests)

data = list("Edge"     = sapply(Brazil_rainforests, function(x) x[,1]),
            "Interior" = sapply(Brazil_rainforests, function(x) x[,2]))
DataInfobeta3D(data = data, diversity = 'TD', datatype = 'abundance', by_pair = TRUE)

#>     Dataset               Pair        Assemblage   n S.obs SC(n) SC(2n) f1 f2 f3 f4
#> 1      Edge                                Marim 158    84 0.691  0.852 49 18  8  4
#> 2      Edge                               Rebio2 162    70 0.754  0.895 40 17  4  2
#> 3      Edge                              Rochedo 179    82 0.733  0.889 48 21  3  3
#> 4      Edge   Marim vs. Rebio2 Pooled assemblage 320   123 0.791  0.889 67 21  8 11
#> 5      Edge   Marim vs. Rebio2  Joint assemblage 320   154 0.723  0.874 89 35 12  6
#> 6      Edge  Marim vs. Rochedo Pooled assemblage 337   132 0.799  0.909 68 27 13  9
#> 7      Edge  Marim vs. Rochedo  Joint assemblage 337   166 0.713  0.872 97 39 11  7
#> 8      Edge Rebio2 vs. Rochedo Pooled assemblage 341   122 0.830  0.942 58 31 10  9
#> 9      Edge Rebio2 vs. Rochedo  Joint assemblage 341   152 0.743  0.892 88 38  7  5
#> 10 Interior                                Marim 144    80 0.704  0.899 43 23  7  5
#> 11 Interior                               Rebio2 168    74 0.763  0.877 40 13  8  4
#> 12 Interior                              Rochedo 195    80 0.781  0.928 43 24  6  3
#> 13 Interior   Marim vs. Rebio2 Pooled assemblage 312   126 0.815  0.932 58 29 14  9
#> 14 Interior   Marim vs. Rebio2  Joint assemblage 312   154 0.735  0.889 83 36 15  9
#> 15 Interior  Marim vs. Rochedo Pooled assemblage 339   142 0.791  0.929 71 38 17  7
#> 16 Interior  Marim vs. Rochedo  Joint assemblage 339   160 0.747  0.915 86 47 13  8
#> 17 Interior Rebio2 vs. Rochedo Pooled assemblage 363   139 0.805  0.921 71 32 11  8
#> 18 Interior Rebio2 vs. Rochedo  Joint assemblage 363   154 0.772  0.907 83 37 14  7

Output description: definitions are the same as before and thus are omitted.
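
The SC(n) and SC(2n) columns in the tables above can be approximately reproduced from n, f1 and f2 alone. The sketch below assumes the commonly used coverage estimator based on singletons and doubletons (Chao and Jost's estimator and its extrapolation to n + m individuals); the function name coverage_abundance is our own, and the code assumes f1 > 0 and f2 > 0.

```r
# Estimated sample coverage for a sample of size n + m, given the reference
# sample size n, the number of singletons f1 and the number of doubletons f2
coverage_abundance <- function(n, f1, f2, m = 0) {
  A <- (n - 1) * f1 / ((n - 1) * f1 + 2 * f2)
  1 - (f1 / n) * A^(m + 1)
}

# Edge assemblage of Marim: n = 158, f1 = 49, f2 = 18
sc_n  <- coverage_abundance(158, 49, 18)            # compare with SC(n)  = 0.691
sc_2n <- coverage_abundance(158, 49, 18, m = 158)   # compare with SC(2n) = 0.852
round(c(sc_n, sc_2n), 3)
```

The same call with n = 144, f1 = 43 and f2 = 23 can be compared with the Interior row of Marim in the same way.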

## Data information for phylogenetic diversity (not by pairs)
data(Brazil_rainforests)
data(Brazil_tree)
DataInfobeta3D(data = Brazil_rainforests[1:2], diversity = 'PD', datatype = 'abundance', 
               PDtree = Brazil_tree, PDreftime = NULL)

#>   Dataset        Assemblage   n S.obs SC(n) SC(2n) PD.obs f1* f2*   g1   g2 Reftime
#> 1   Marim              Edge 158    84 0.691  0.852   8805  49  26 3278 2188     400
#> 2   Marim          Interior 144    80 0.704  0.899   8436  43  28 2974 1935     400
#> 3   Marim Pooled assemblage 302   119 0.855  0.969  11842  44  39 3172 2995     400
#> 4   Marim  Joint assemblage 302   164 0.696  0.876  17241  92  54 6252 4123     400
#> 5  Rebio2              Edge 162    70 0.754  0.895   7874  40  23 3648 1717     400
#> 6  Rebio2          Interior 168    74 0.763  0.877   8360  40  17 3365 1954     400
#> 7  Rebio2 Pooled assemblage 330   118 0.819  0.901  11979  60  23 5063 1637     400
#> 8  Rebio2  Joint assemblage 330   144 0.758  0.886  16234  80  40 7013 3671     400

Information description:

Dataset, Pair, Assemblage, n, S.obs, SC(n) and SC(2n): definitions are the same as in the TD output.
PD.obs = the observed total branch length in the phylogenetic tree spanned by all observed species.
f1*,f2* = the number of singletons and doubletons in the node/branch abundance set.
g1,g2 = the total branch length of those singletons/doubletons in the node/branch abundance set.
Reftime = reference time for phylogenetic diversity (by default, the age of the root of the phylogenetic tree).
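
As a worked example of the relation meanPD = PD/tree depth, the sketch below divides the PD.obs values of the two Marim assemblages in the table above by the reference time. It concerns only the observed values of order q = 0 (total branch length), so the standardized estimates returned by iNEXTbeta3D will differ; the table values are rounded, so the results are approximate.

```r
# PD.obs and Reftime for the Marim fragment (values from the table above)
pd_obs  <- c(Edge = 8805, Interior = 8436)
reftime <- 400

# Observed mean-PD: effective number of equally divergent lineages (q = 0)
mean_pd_obs <- pd_obs / reftime
mean_pd_obs

# For comparison, the observed species richness of the same assemblages
s_obs <- c(Edge = 84, Interior = 80)
mean_pd_obs / s_obs
```

The observed mean-PD is in units of lineage equivalents, which is why it can be compared with species richness on the same scale.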

## Data information for functional diversity (under a specified threshold level, FDtype = 'tau_value', 
## and not by pairs)
data(Brazil_rainforests)
data(Brazil_distM)
DataInfobeta3D(data = Brazil_rainforests[1:2], diversity = 'FD', datatype = 'abundance', 
               FDdistM = Brazil_distM, FDtype = 'tau_value', FDtau = NULL)

#>   Dataset        Assemblage   n S.obs SC(n) SC(2n) a1* a2* h1 h2   Tau
#> 1   Marim              Edge 158    84 0.691  0.852   0   0  0  0 0.343
#> 2   Marim          Interior 144    80 0.704  0.899   0   0  0  0 0.343
#> 3   Marim Pooled assemblage 302   119 0.855  0.969   0   0  0  0 0.343
#> 4   Marim  Joint assemblage 302   164 0.696  0.876   0   0  0  0 0.343
#> 5  Rebio2              Edge 162    70 0.754  0.895   0   0  0  0 0.343
#> 6  Rebio2          Interior 168    74 0.763  0.877   0   0  0  0 0.343
#> 7  Rebio2 Pooled assemblage 330   118 0.819  0.901   0   0  0  0 0.343
#> 8  Rebio2  Joint assemblage 330   144 0.758  0.886   0   0  0  0 0.343

Information description:

Dataset, Pair, Assemblage, n, S.obs, SC(n) and SC(2n): definitions are the same as in the TD output.
a1*,a2* = the number of singletons (a1*) and of doubletons (a2*) among the functionally indistinct set at the specified threshold level 'Tau'.
h1,h2 = the total contribution of singletons (h1) and of doubletons (h2) at the specified threshold level 'Tau'.
Tau = the specified threshold level of distinctiveness. Default is dmean (the mean distance between any two individuals randomly selected from the pooled data over all datasets).

## Data information for functional diversity (FDtype = 'AUC' and not by pairs)
data(Brazil_rainforests)
data(Brazil_distM)
DataInfobeta3D(data = Brazil_rainforests[1:2], diversity = 'FD', datatype = 'abundance', 
               FDdistM = Brazil_distM, FDtype = 'AUC')

#>   Dataset        Assemblage   n S.obs SC(n) SC(2n) dmin dmean  dmax
#> 1   Marim              Edge 158    84 0.691  0.852    0 0.329 0.755
#> 2   Marim          Interior 144    80 0.704  0.899    0 0.313 0.663
#> 3   Marim Pooled assemblage 302   119 0.855  0.969    0 0.323 0.776
#> 4   Marim  Joint assemblage 302   164 0.696  0.876    0 0.323 0.776
#> 5  Rebio2              Edge 162    70 0.754  0.895    0 0.376 0.659
#> 6  Rebio2          Interior 168    74 0.763  0.877    0 0.310 0.660
#> 7  Rebio2 Pooled assemblage 330   118 0.819  0.901    0 0.355 0.776
#> 8  Rebio2  Joint assemblage 330   144 0.758  0.886    0 0.355 0.776

Information description:

Dataset, Pair, Assemblage, n, S.obs, SC(n) and SC(2n): definitions are the same as in the TD output.
dmin = the minimum distance among all non-diagonal elements in the distance matrix.
dmean = the mean distance between any two individuals randomly selected from each assemblage.
dmax = the maximum distance among all elements in the distance matrix.
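
The three distance summaries can be computed by hand for a small example. The base-R sketch below uses hypothetical abundances and distances, and takes dmean to be the quadratic entropy, i.e., the expected distance between two individuals drawn at random (with replacement) from the assemblage.

```r
# Toy data (hypothetical): abundances of three species and their distances
counts <- c(A = 5, B = 3, C = 2)
d <- matrix(c(0.0, 0.2, 0.6,
              0.2, 0.0, 0.4,
              0.6, 0.4, 0.0), nrow = 3, byrow = TRUE,
            dimnames = list(names(counts), names(counts)))

p <- counts / sum(counts)                 # relative abundances

dmin  <- min(d[row(d) != col(d)])         # minimum over non-diagonal elements
dmean <- as.numeric(t(p) %*% d %*% p)     # sum_i sum_j d_ij * p_i * p_j
dmax  <- max(d)                           # maximum over all elements

c(dmin = dmin, dmean = dmean, dmax = dmax)
```

Here dmean = 2 * (0.5*0.3*0.2 + 0.5*0.2*0.6 + 0.3*0.2*0.4) = 0.228, which lies between dmin and dmax because the zero diagonal (an individual paired with a conspecific) is included in the average.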

Below, we use the demo dataset (Second_growth_forests) to show the output of the function DataInfobeta3D for incidence data:

## Data information for taxonomic diversity with incidence data (not by pairs)
data(Second_growth_forests)

data = list("CR 2005 vs. 2017" = Second_growth_forests[[1]][c(1,3)],
            "JE 2005 vs. 2017" = Second_growth_forests[[2]][c(1,3)])
DataInfobeta3D(data = data, diversity = 'TD', datatype = 'incidence_raw')

#>            Dataset        Assemblage   T    U S.obs SC(T) SC(2T)  Q1 Q2 Q3 Q4 Q5
#> 1 CR 2005 vs. 2017         Year_2005 100  787   135 0.919  0.953  64 17 16  6  4
#> 2 CR 2005 vs. 2017         Year_2017 100  768   134 0.917  0.956  64 20 11  8  3
#> 3 CR 2005 vs. 2017 Pooled assemblage 100  923   151 0.925  0.959  70 21 14  6  6
#> 4 CR 2005 vs. 2017  Joint assemblage 100 1555   269 0.918  0.954 128 37 27 14  7
#> 5 JE 2005 vs. 2017         Year_2005 100  503    71 0.955  0.979  23  9  8  4  0
#> 6 JE 2005 vs. 2017         Year_2017 100  659    91 0.953  0.979  31 12  8  3  5
#> 7 JE 2005 vs. 2017 Pooled assemblage 100  864   107 0.963  0.987  32 17  9  4  8
#> 8 JE 2005 vs. 2017  Joint assemblage 100 1162   162 0.954  0.979  54 21 16  7  5

Information description:

Dataset = the input datasets.
Pair = the pair of assemblages being compared (this column appears only when by_pair = TRUE).
Assemblage = Individual assemblages, 'Pooled assemblage' (for gamma) or 'Joint assemblage' (for alpha).
T = number of sampling units in the reference sample (sample size for incidence data).
U = total number of incidences in the reference sample.
S.obs = number of observed species in the reference sample.
SC(T) = sample coverage estimate of the reference sample.
SC(2T) = sample coverage estimate of twice the reference sample size.
Q1-Q5 = the first five species incidence frequency counts in the reference sample.
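
The quantities T, U, S.obs and Q1-Q5 defined above can be obtained directly from a raw incidence matrix. The base-R sketch below does this for a hypothetical species-by-sampling-unit matrix of a single assemblage; it is an illustration of the definitions only and does not call the package.

```r
# Toy incidence-raw matrix (hypothetical): 5 species by 4 sampling units
inc <- matrix(c(1, 0, 1, 1,
                0, 0, 1, 0,
                1, 1, 0, 0,
                0, 0, 0, 0,
                0, 1, 0, 0), nrow = 5, byrow = TRUE,
              dimnames = list(paste0("sp", 1:5), paste0("unit", 1:4)))

freq    <- rowSums(inc)     # number of sampling units in which each species occurs
T_units <- ncol(inc)        # T: number of sampling units
U       <- sum(freq)        # U: total number of incidences
S_obs   <- sum(freq > 0)    # S.obs: species detected in at least one unit

# Q_k: number of species detected in exactly k sampling units, k = 1, ..., 5
Q <- sapply(1:5, function(k) sum(freq == k))
names(Q) <- paste0("Q", 1:5)

list(T = T_units, U = U, S.obs = S_obs, Q = Q)
```

In this toy matrix sp4 is never detected, so it contributes neither to S.obs nor to any Q_k.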

## Data information for taxonomic diversity for all pairs of assemblages (incidence data)
data(Second_growth_forests)

data = Second_growth_forests
names(data) = c("CR", "JE")
DataInfobeta3D(data = data, diversity = 'TD', datatype = 'incidence_raw', 
               by_pair = TRUE)

#>    Dataset                    Pair        Assemblage   T    U S.obs SC(T) SC(2T)  Q1 Q2 Q3 Q4
#> 1       CR                                 Year_2005 100  787   135 0.919  0.953  64 17 16  6
#> 2       CR                                 Year_2011 100  768   135 0.916  0.952  65 18 12  7
#> 3       CR                                 Year_2017 100  768   134 0.917  0.956  64 20 11  8
#> 4       CR Year_2005 vs. Year_2011 Pooled assemblage 100  860   145 0.920  0.954  69 19 13  5
#> 5       CR Year_2005 vs. Year_2011  Joint assemblage 100 1555   270 0.917  0.952 129 35 28 13
#> 6       CR Year_2005 vs. Year_2017 Pooled assemblage 100  923   151 0.925  0.959  70 21 14  6
#> 7       CR Year_2005 vs. Year_2017  Joint assemblage 100 1555   269 0.918  0.954 128 37 27 14
#> 8       CR Year_2011 vs. Year_2017 Pooled assemblage 100  837   142 0.923  0.958  65 20 15  7
#> 9       CR Year_2011 vs. Year_2017  Joint assemblage 100 1536   269 0.917  0.954 129 38 23 15
#> 10      JE                                 Year_2005 100  503    71 0.955  0.979  23  9  8  4
#> 11      JE                                 Year_2011 100  631    88 0.942  0.962  37  8  4  6
#> 12      JE                                 Year_2017 100  659    91 0.953  0.979  31 12  8  3
#> 13      JE Year_2005 vs. Year_2011 Pooled assemblage 100  757    96 0.951  0.969  37  8  6  7
#> 14      JE Year_2005 vs. Year_2011  Joint assemblage 100 1134   159 0.947  0.970  60 17 12 10
#> 15      JE Year_2005 vs. Year_2017 Pooled assemblage 100  864   107 0.963  0.987  32 17  9  4
#> 16      JE Year_2005 vs. Year_2017  Joint assemblage 100 1162   162 0.954  0.979  54 21 16  7
#> 17      JE Year_2011 vs. Year_2017 Pooled assemblage 100  788   100 0.958  0.981  33 13  8  6
#> 18      JE Year_2011 vs. Year_2017  Joint assemblage 100 1290   179 0.948  0.971  68 20 12  9

Output description: definitions are the same as before and thus are omitted.

License and feedback

The iNEXT.beta3D package is licensed under the GPLv3. To help refine iNEXT.beta3D, users’ comments and feedback are welcome; please send them to Anne Chao or report an issue on the iNEXT.beta3D GitHub repository (iNEXT.beta3D_github).

References

Chao, A., Chiu, C.-H., Hu, K.-H., and Zeleny, D. (2023a). Revisiting Alwyn H. Gentry’s forest transect data: a statistical sampling-model-based approach. Japanese Journal of Statistics and Data Science, 6, 861-884. (https://doi.org/10.1007/s42081-023-00214-1)
Chao, A., Henderson, P. A., Chiu, C.-H., Moyes, F., Hu, K.-H., Dornelas, M. and Magurran, A. E. (2021). Measuring temporal change in alpha diversity: a framework integrating taxonomic, phylogenetic and functional diversity and the iNEXT.3D standardization. Methods in Ecology and Evolution, 12, 1926-1940.
Chao, A., Thorn, S., Chiu, C.-H., Moyes, F., Hu, K.-H., Chazdon, R. L., Wu, J., Magnago, L. F. S., Dornelas, M., Zeleny, D., Colwell, R. K., and Magurran, A. E. (2023b). Rarefaction and extrapolation with beta diversity under a framework of Hill numbers: the iNEXT.beta3D standardization. Ecological Monographs, e1588.