Comprehensive Guide to Factor Analysis in Ecological Studies Using R: Identifying Key Environmental Gradients

by Dr. Mohan Arthanari • October 05, 2024

Introduction

Factor analysis is a statistical technique that explains the correlations among observed variables in terms of a smaller number of latent factors, thereby reducing the dimensionality of the data. In R, it is available through functions such as fa() from the psych package, which extracts latent factors that account for the variance shared between variables. Factor analysis is particularly useful in ecological studies, where many environmental and biological variables are interrelated: it simplifies interpretation and highlights key ecological gradients. A rotation such as varimax (orthogonal) or promax (oblique) can then be applied to make the solution easier to interpret. Rotation does not increase the total variance explained; it redistributes that variance so that each variable loads strongly on as few factors as possible. Used this way, factor analysis can reveal hidden structure in the data and separate shared variance from variable-specific noise.
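To make the effect of rotation concrete, the sketch below applies base R's varimax() to a small, made-up loading matrix. The loading values and the names v1 to v4, F1, and F2 are illustrative assumptions, not results from this study. It checks that an orthogonal rotation leaves each variable's communality unchanged while shifting variance between the factors.

```r
# Hypothetical unrotated loadings for four variables on two factors.
L <- matrix(c(0.7, 0.6,  0.8,  0.5,
              0.5, 0.6, -0.4, -0.5),
            ncol = 2,
            dimnames = list(paste0("v", 1:4), c("F1", "F2")))

rot     <- varimax(L)             # orthogonal rotation from the stats package
rotated <- unclass(rot$loadings)  # plain matrix of rotated loadings

# Communality = row sum of squared loadings; an orthogonal rotation keeps it fixed.
h2_before <- rowSums(L^2)
h2_after  <- rowSums(rotated^2)
same_h2   <- isTRUE(all.equal(unname(h2_before), unname(h2_after)))

# Variance per factor = column sum of squared loadings; rotation redistributes
# this between factors while the total stays the same.
ss_before <- colSums(L^2)
ss_after  <- colSums(rotated^2)

print(same_h2)
print(rbind(before = ss_before, after = ss_after))
```

An oblique rotation such as promax allows the factors to correlate, so the per-factor variance figures are no longer simply additive.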

Interpretation of Factor Analysis Results

The factor analysis was conducted using the Minimum Residual (minres) method with Varimax rotation. Two factors (MR1 and MR2) were extracted, which together explain about 90% of the variance in the dataset.
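A model of this kind might be fitted as in the following minimal sketch. The dataset behind this post is not shown, so the data frame eco is simulated here, and its column names, sample size, and the numeric coding of Site are all assumptions made only so that the example runs. The fa() arguments nfactors, fm, and rotate are those of the psych package.

```r
library(psych)  # provides fa()

# Simulated stand-in data (assumption): one soil/nutrient gradient plus site codes.
set.seed(1)
n    <- 100
soil <- rnorm(n)                        # hypothetical latent soil gradient
site <- sample(1:5, n, replace = TRUE)  # hypothetical numeric site codes 1-5

eco <- data.frame(
  Site             = site,
  Soil_pH          =  0.9 * soil + rnorm(n, sd = 0.3),
  Soil_Moisture    = -0.9 * soil + rnorm(n, sd = 0.3),
  Nitrogen         =  0.8 * soil + rnorm(n, sd = 0.5),
  Phosphorus       =  0.9 * soil + rnorm(n, sd = 0.3),
  Light_Intensity  =  0.5 * site + rnorm(n, sd = 1),
  Plant_Biomass    =  0.9 * soil + rnorm(n, sd = 0.3),
  Species_Richness =  0.9 * soil + rnorm(n, sd = 0.3)
)

# Two factors, minimum residual extraction, varimax rotation.
fit <- fa(eco, nfactors = 2, fm = "minres", rotate = "varimax")

print(fit$loadings, cutoff = 0.3)  # hide loadings below 0.3 in absolute value
print(round(fit$communality, 2))   # h2 for each variable
print(round(fit$uniquenesses, 2))  # u2 for each variable
```

Because the data are simulated, the numbers this prints will not match the results discussed in this post.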

Factor Analysis Table: Factor Loadings

Factor Loadings:

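Each factor loading measures the strength and direction of the relationship between a variable and an extracted factor. With an orthogonal rotation such as Varimax, a loading can be read as the correlation between the variable and the factor: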

Factor MR1 shows high loadings (in absolute value) for most of the environmental variables:

Soil pH (0.98),
Soil Moisture (-0.99),
Nitrogen Content (0.89),
Phosphorus Content (0.99),
Plant Biomass (0.97),
Species Richness (0.95).

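This suggests that Factor MR1 largely captures variability related to soil composition, nutrient availability, and the associated plant responses. Soil Moisture loads negatively, so it decreases along this gradient as the other variables increase.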

Factor MR2 mainly loads on Site (0.85) and has a weak loading on Light Intensity (0.29), just below the 0.30 level often used as a rule of thumb for a salient loading. This indicates that Factor MR2 chiefly reflects variability in the spatial characteristics of the sites.
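The loadings quoted above can be screened programmatically, as in this rough sketch. Only the loadings reported in the text are entered; values the post does not report are left as NA rather than guessed. The 0.30 cutoff is an assumed rule of thumb, not a setting taken from the analysis.

```r
# Loadings as reported in the text; NA marks values the post does not report.
reported <- matrix(
  c( 0.98,   NA,
    -0.99,   NA,
     0.89,   NA,
     0.99,   NA,
     0.97,   NA,
     0.95,   NA,
       NA, 0.85,
       NA, 0.29),
  ncol = 2, byrow = TRUE,
  dimnames = list(
    c("Soil pH", "Soil Moisture", "Nitrogen Content", "Phosphorus Content",
      "Plant Biomass", "Species Richness", "Site", "Light Intensity"),
    c("MR1", "MR2")))

cutoff <- 0.30  # assumed rule of thumb for a salient loading

# Strength is judged on the absolute value; the sign only gives the direction.
salient   <- !is.na(reported) & abs(reported) >= cutoff
direction <- ifelse(reported < 0, "negative", "positive")

summary_tbl <- data.frame(
  variable    = rownames(reported)[row(reported)[salient]],
  factor_name = colnames(reported)[col(reported)[salient]],
  loading     = reported[salient],
  direction   = direction[salient]
)
print(summary_tbl)
```

With this cutoff, Soil Moisture is flagged as a strong negative loading on MR1, and Light Intensity's MR2 loading of 0.29 falls just short of being flagged.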

Communality (h²) and Uniqueness (u²):

Communality (h²) represents the proportion of each variable’s variance that is explained by the extracted factors. Higher communality values (close to 1) suggest that the factors account for most of the variability in that variable.
Uniqueness (u²) represents the proportion of each variable's variance that is not explained by the factors, i.e., variance specific to that variable plus measurement error. It is the complement of communality (u² = 1 − h²). Variables with low uniqueness values (close to 0) are well explained by the factors, while those with high uniqueness values retain a lot of unexplained variance.

Details on Communality (h²) and Uniqueness (u²):

Soil pH: The communality of 0.97 means that 97% of the variance in soil pH is explained by the two factors together, almost all of it by MR1, leaving only 3% unexplained (u² = 0.03).

Soil Moisture: A communality of 0.97 means that 97% of the variance in soil moisture is accounted for by the factors, again mainly MR1 (u² = 0.03).

Nitrogen Content: A communality of 0.79 means that 79% of its variance is explained by the factors, with 21% unique variance (u² = 0.21).

Phosphorus Content: A very high communality of 0.98 indicates that this variable is almost entirely explained by the factors, chiefly MR1, with only 2% unexplained variance (u² = 0.02).

Light Intensity: With a communality of 0.84, the factors explain 84% of the variability in light intensity, leaving 16% unexplained (u² = 0.16).

Plant Biomass and Species Richness have high communalities of 0.96 and 0.97, respectively, showing that nearly all of their variance is explained by the factors, chiefly MR1, and that they have very little uniqueness.
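Under an orthogonal rotation such as Varimax, a variable's communality equals the sum of its squared loadings across the factors, and u² = 1 − h². The short check below applies that identity to the rounded figures quoted above, so its results are approximate. The implied MR1 loading for Light Intensity is a back-calculation, not a value reported in this post, and its sign cannot be recovered this way.

```r
# Values quoted in the text (rounded to two decimals in the source).
h2_soil_ph  <- 0.97
mr1_soil_ph <- 0.98
h2_light    <- 0.84
mr2_light   <- 0.29

# Part of the Soil pH communality attributable to MR1 alone.
mr1_part_soil_ph <- mr1_soil_ph^2   # 0.9604

# Uniqueness is the complement of communality.
u2_soil_ph <- 1 - h2_soil_ph        # 0.03
u2_light   <- 1 - h2_light          # 0.16

# MR2 contributes little to Light Intensity's communality, so under an
# orthogonal rotation the remainder must come from MR1.
mr2_part_light    <- mr2_light^2                       # 0.0841
implied_mr1_light <- sqrt(h2_light - mr2_part_light)   # magnitude only

print(round(c(mr1_part_soil_ph  = mr1_part_soil_ph,
              mr2_part_light    = mr2_part_light,
              implied_mr1_light = implied_mr1_light), 2))
```

If the rounded inputs are representative, most of Light Intensity's communality would come from MR1 rather than MR2, which is worth checking against the full loading table.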

Variance Explained:

Factor MR1 explains 79% of the total variance in the dataset, so it captures most of the important variability.
Factor MR2 explains an additional 11%, bringing the cumulative variance explained by the two factors to 90%.
Expressed as a share of the explained variance rather than of the total, MR1 accounts for 87% and MR2 for the remaining 13%.
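The relationship between these three sets of figures can be reproduced with a few lines of arithmetic. The sketch below starts from the rounded proportions quoted above, so it yields roughly 88% and 12% instead of the reported 87% and 13%; the reported shares were presumably computed from unrounded sums of squared loadings.

```r
# Proportion of total variance per factor, as quoted in the text (rounded).
prop_var <- c(MR1 = 0.79, MR2 = 0.11)

# Cumulative proportion of total variance.
cum_var <- cumsum(prop_var)

# Share of the explained variance contributed by each factor.
prop_explained <- prop_var / sum(prop_var)

print(round(cum_var, 2))
print(round(prop_explained, 2))
```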

Factor Analysis Table: Variance Explained

Model Fit and Reliability:

The Root Mean Square Residual (RMSR) is 0.01, which indicates a good fit, as smaller RMSR values suggest better model fit.
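The Tucker Lewis Index (TLI) is 1.212. The psych output labels this the index of "factoring reliability", but it is a measure of model fit, and values above about 0.95 are conventionally taken as good. A value above 1 arises when the model chi-square is smaller than its degrees of freedom, which tends to happen in small samples, so it is best read as showing no evidence of misfit, not as exceptional fit.
The RMSEA is 0, which is consistent with a close fit, but its confidence interval runs from 0 to 0.211. The upper bound is far above the commonly used 0.08 threshold, so the estimate is imprecise and a poorer fit cannot be ruled out.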
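The sketch below shows how these three indices are commonly defined, which helps explain why a TLI above 1 and an RMSEA of exactly 0 tend to appear together. These are textbook formulas, and psych's internal calculations may differ in detail. All inputs are hypothetical: the chi-square values and sample size are invented, and the degrees of freedom assume 8 variables and 2 factors.

```r
# Root mean square of the off-diagonal residual correlations.
rmsr <- function(observed, implied) {
  resid <- observed - implied
  sqrt(mean(resid[lower.tri(resid)]^2))
}

# Tucker-Lewis Index from the model and null (independence) chi-squares.
tli <- function(chisq_model, df_model, chisq_null, df_null) {
  ratio_null  <- chisq_null / df_null
  ratio_model <- chisq_model / df_model
  (ratio_null - ratio_model) / (ratio_null - 1)
}

# RMSEA point estimate, truncated at 0 when the chi-square is below its df.
# Some software divides by N instead of N - 1.
rmsea <- function(chisq_model, df_model, n_obs) {
  sqrt(max(chisq_model - df_model, 0) / (df_model * (n_obs - 1)))
}

# Hypothetical one-factor toy example for RMSR.
observed <- matrix(c(1.0, 0.5, 0.4,
                     0.5, 1.0, 0.3,
                     0.4, 0.3, 1.0), nrow = 3)
lambda   <- c(0.8, 0.6, 0.5)   # hypothetical loadings
implied  <- tcrossprod(lambda) # model-implied correlations (off-diagonal)
rmsr_value <- rmsr(observed, implied)

# Hypothetical chi-squares: the model chi-square (5) is below its df (13).
tli_value   <- tli(chisq_model = 5, df_model = 13, chisq_null = 280, df_null = 28)
rmsea_value <- rmsea(chisq_model = 5, df_model = 13, n_obs = 30)

print(round(c(RMSR = rmsr_value, TLI = tli_value, RMSEA = rmsea_value), 3))
```

Whenever the model chi-square falls below its degrees of freedom, the TLI exceeds 1 and the RMSEA is truncated to 0, so the two results are not independent confirmations of fit.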

Factor Score Adequacy:

The correlation of the regression-based factor scores with the factors is essentially perfect for MR1 (1.00) and high for MR2 (0.93), meaning that the factor scores estimated from the variables track the underlying factors closely.
The Multiple R-square of scores with factors, the squared form of the same quantity, is 1.00 for MR1 and 0.87 for MR2. Both values indicate that the factor scores are well determined, with MR2 estimated somewhat less precisely than MR1.
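The two adequacy measures are linked by squaring, which can be checked directly. The inputs below are the rounded correlations quoted above, so the result for MR2 comes out at 0.86, not the reported 0.87; the small gap is a rounding effect.

```r
# Correlations of regression scores with factors, as quoted in the text.
r_scores <- c(MR1 = 1.00, MR2 = 0.93)

# Multiple R-square is the square of that correlation.
r2_scores <- r_scores^2

print(round(r2_scores, 2))
```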

Interpretation of Factor Analysis Diagram

The factor analysis diagram provides a visual representation of the relationships between observed variables and the extracted factors (MR1 and MR2). This diagram helps to interpret how different ecological variables load onto the factors, indicating the underlying structure and relationships in the dataset.
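A diagram of this kind can be drawn with fa.diagram() from the psych package. The following is a minimal sketch on simulated data: the data frame toy, its column names, and the loading pattern are assumptions for illustration, so the resulting picture will not match the diagram discussed here.

```r
library(psych)  # provides fa() and fa.diagram()

# Simulated stand-in data (assumption): two independent latent gradients,
# each measured by three hypothetical variables.
set.seed(2)
n  <- 100
f1 <- rnorm(n)
f2 <- rnorm(n)
toy <- data.frame(
  a1 =  0.9 * f1 + rnorm(n, sd = 0.4),
  a2 =  0.8 * f1 + rnorm(n, sd = 0.5),
  a3 = -0.8 * f1 + rnorm(n, sd = 0.5),
  b1 =  0.8 * f2 + rnorm(n, sd = 0.5),
  b2 =  0.7 * f2 + rnorm(n, sd = 0.6),
  b3 =  0.7 * f2 + rnorm(n, sd = 0.6)
)

fit <- fa(toy, nfactors = 2, fm = "minres", rotate = "varimax")

# cut hides loadings below 0.3 in absolute value; simple = TRUE draws only
# the largest loading for each variable.
fa.diagram(fit, cut = 0.3, simple = TRUE, digits = 2,
           main = "Toy two-factor solution")
```

Setting simple = FALSE draws every loading above the cut, which is useful for spotting variables that load on both factors.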

Factor MR1 (Primary Factor)

MR1 has strong loadings on the majority of the observed variables, suggesting it represents a major underlying ecological process, likely related to soil properties and plant characteristics. Variables strongly associated with MR1 include:

Phosphorus Content: This variable has a high loading on MR1, indicating a strong association with the primary factor. This suggests phosphorus content is a key component of the soil's nutrient structure.
Soil pH: Soil pH also shows a strong loading on MR1, reflecting its role in the environmental conditions being captured by this factor.
Plant Biomass: The close association with MR1 indicates that biomass production is strongly influenced by the underlying ecological conditions this factor represents.
Species Richness: High loading on MR1 implies that species diversity is highly correlated with the environmental factors captured by MR1.
Nitrogen Content: With a strong loading on MR1, nitrogen content, another key nutrient, is also a significant contributor to this ecological gradient.

These variables collectively describe an environmental or nutrient gradient influencing soil and plant-related characteristics, with MR1 explaining the majority of their variance.

Factor MR2 (Secondary Factor)

MR2 has fewer salient loadings and is primarily associated with Site and, more weakly, with Light Intensity:

Site: The variable Site has a strong loading on MR2, indicating that MR2 captures the spatial variability between different locations in the study. This suggests that geographic location plays a role independent of the soil and plant-related factors.

Light Intensity: Although it has a smaller loading on MR2, light intensity is related to the spatial factor, suggesting that variability in light conditions across sites may also contribute to the differentiation of MR2.

Overall Interpretation of the Factors

MR1 appears to represent a soil nutrient and plant growth factor, as it is associated with multiple variables directly linked to soil chemistry (pH, nutrient content) and biological responses (plant biomass, species richness). The high loadings indicate that this factor captures a major part of the ecological variability in the dataset.
MR2 represents a more geographic or spatial factor, primarily capturing site-based differences and, to a lesser extent, light intensity variability. This suggests that MR2 may account for variation due to location-specific factors not directly related to soil nutrient composition.

Importance of the Diagram in Understanding Factor Relationships

The diagram is a useful tool for understanding the structure of the dataset and the contributions of different variables to the underlying factors. The strength of the loadings, depicted by the thickness of the arrows, provides insight into the relative importance of each variable for each factor. This allows for a more nuanced interpretation of the ecological processes influencing the dataset, providing a clearer understanding of how different environmental variables interrelate.

Conclusion

The factor analysis diagram illustrates the two extracted factors—MR1 (soil and nutrient-related factor) and MR2 (spatial or site-related factor)—and their relationships with the observed ecological variables. MR1 dominates the dataset, explaining most of the variance in soil properties, nutrient content, and plant characteristics, while MR2 captures site-based differences. The clear separation of variables between these two factors enhances the interpretation of underlying ecological gradients in the study system.

Tags: Bio Statistics Data Analysis
