Ordinal Logistic Regression in R: Step-by-Step Guide with Blood Pressure Analysis

by Dr. Mohan Arthanari • January 20, 2025

Ordinal Logistic Regression (OLR) is a statistical technique used when the dependent variable is ordinal, meaning it has categories with a meaningful order but the intervals between the categories are not necessarily equal. OLR models the relationship between one or more independent variables (predictors) and an ordinal outcome variable.

Key Characteristics of Ordinal Logistic Regression

Ordinal Dependent Variable: Examples include Blood Pressure responses (e.g., Normal, Pre-Hypertension, and Hypertension).
Independent Variables: These can be continuous, categorical, or a mix of both.
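
For reference, the proportional odds model that polr() from the MASS package fits can be written as below, for an outcome Y with J ordered categories and a predictor vector x. The symbols are notation chosen here for illustration (zeta_j for the cutpoints between adjacent categories, beta for the coefficients), following the parameterization described in the polr() documentation:

```latex
\operatorname{logit} P(Y \le j \mid x) = \zeta_j - \beta^{\top} x, \qquad j = 1, \dots, J - 1
```

With three blood pressure categories there are two cutpoints, and a positive coefficient shifts probability toward the higher categories.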

Ordinal Logistic Regression in R

Below is an example of how to perform ordinal logistic regression in R, using a small sample dataset of patient ages, genders, and blood pressure categories:

1. Load the necessary libraries:

# Install MASS only if it is missing (it ships with most R installations), then load it
if (!requireNamespace("MASS", quietly = TRUE)) install.packages("MASS")
library(MASS)

2. Prepare your data:

For this example, let's assume the dataset is named patient_data. You'll need to ensure the 'Blood_Pressure' column is treated as an ordered factor.

# Sample dataset (Blood_Pressure starts out as plain text)
patient_data <- data.frame(
  Patient_ID = 1:30,
  Age = c(60, 45, 50, 70, 65, 40, 37, 55, 30, 43, 58, 63, 35, 43, 53, 67, 41, 72, 54, 48, 51, 32, 62, 49, 51, 41, 68, 59, 45, 65),
  Gender = factor(c('F', 'F', 'M', 'F', 'M', 'M', 'F', 'F', 'F', 'F', 'M', 'M', 'M', 'M', 'F', 'F', 'F', 'F', 'M', 'M', 'M', 'M', 'F', 'F', 'M', 'M', 'M', 'F', 'M', 'F')),
  Blood_Pressure = c('Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension', 'Normal', 'Pre-Hypertension', 'Hypertension')
)

# Convert 'Blood_Pressure' to an ordered factor with the levels in clinical order
patient_data$Blood_Pressure <- factor(patient_data$Blood_Pressure,
                                      levels = c('Normal', 'Pre-Hypertension', 'Hypertension'),
                                      ordered = TRUE)
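
To see why the explicit levels matter, here is a minimal sketch using a short, hypothetical character vector (bp_raw is made up for illustration and is not part of patient_data). Without the levels argument, factor() sorts the categories alphabetically, which would put Hypertension first:

```r
# Hypothetical raw values, as they might arrive from a CSV file
bp_raw <- c("Normal", "Hypertension", "Pre-Hypertension", "Normal")

# Without explicit levels, factor() orders the categories alphabetically
levels(factor(bp_raw))

# Declare the clinical order explicitly
bp <- factor(bp_raw,
             levels = c("Normal", "Pre-Hypertension", "Hypertension"),
             ordered = TRUE)

is.ordered(bp)        # TRUE
bp < "Hypertension"   # comparisons now respect the declared order
table(bp)             # counts are listed in the declared order
```

Running levels() and table() on your own outcome column before fitting is a quick way to confirm the order is the one you intend.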

3. Fit the Ordinal Logistic Regression Model:

Now you can fit the model using the polr() function from the MASS package.

# Fit the ordinal logistic regression model
# Hess = TRUE stores the Hessian so that summary() can report standard errors without refitting
model <- polr(Blood_Pressure ~ Age + Gender, data = patient_data, Hess = TRUE)

# Display the summary of the model
summary(model)

4. Interpret the results:

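The summary(model) output lists each coefficient with its standard error and t value, followed by the intercepts (cutpoints) that separate adjacent categories. Note that polr() does not print p-values. Each coefficient is a log odds ratio: a positive value means that higher values of the predictor increase the odds of being in a higher category of blood pressure (i.e., moving from Normal toward Pre-Hypertension or Hypertension), and exponentiating the coefficient gives the corresponding odds ratio.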
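
As a minimal sketch of the odds-ratio interpretation, the example below fits polr() to simulated data and exponentiates the coefficients. Everything about the data is an assumption made for illustration: the object names (sim, fit), the sample size, and the simulated effects of Age and Gender are hypothetical and do not come from patient_data. The confidence intervals are simple Wald intervals (estimate ± 1.96 standard errors), not profile-likelihood intervals.

```r
library(MASS)

# Simulated data (hypothetical, for illustration only)
set.seed(42)
n <- 300
sim <- data.frame(
  Age    = round(runif(n, 30, 80)),
  Gender = factor(sample(c("F", "M"), n, replace = TRUE))
)
latent <- 0.08 * sim$Age + 0.5 * (sim$Gender == "M") + rlogis(n)
sim$Blood_Pressure <- cut(latent,
                          breaks = c(-Inf, 4, 6, Inf),
                          labels = c("Normal", "Pre-Hypertension", "Hypertension"),
                          ordered_result = TRUE)

fit <- polr(Blood_Pressure ~ Age + Gender, data = sim, Hess = TRUE)

# Coefficients are log odds ratios; exponentiate to get odds ratios
odds_ratios <- exp(coef(fit))

# Wald 95% intervals on the odds-ratio scale
# (the slope coefficients come first in vcov(), before the cutpoints)
se <- sqrt(diag(vcov(fit)))[seq_along(coef(fit))]
ci <- exp(cbind(lower = coef(fit) - 1.96 * se,
                upper = coef(fit) + 1.96 * se))

round(cbind(OR = odds_ratios, ci), 3)
```

An odds ratio above 1 for Age would mean that each additional year multiplies the odds of being in a higher blood pressure category by that factor, holding Gender fixed.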

5. Check model significance:

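summary(model) reports t values but no p-values. To test each coefficient, you can compute approximate (Wald) p-values from the coefficient table returned by the coef() and summary() functions:

# Coefficient table: estimates, standard errors, and t values
ctable <- coef(summary(model))

# Two-sided p-values from the normal approximation
p_values <- 2 * pnorm(abs(ctable[, "t value"]), lower.tail = FALSE)

# Append the p-values to the table
cbind(ctable, "p value" = p_values)

This will give you a p-value for each predictor and for each intercept (cutpoint). A p-value less than 0.05 is conventionally taken to indicate that the predictor is statistically significant.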
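
The p-values above test one coefficient at a time. To test the model as a whole, one common option is a likelihood ratio test against an intercept-only model. The sketch below is one way to do that, again on simulated data; the names sim, full_fit, and null_fit and the simulated effects are hypothetical, chosen only for illustration.

```r
library(MASS)

# Simulated data (hypothetical, for illustration only)
set.seed(42)
n <- 300
age    <- round(runif(n, 30, 80))
gender <- factor(sample(c("F", "M"), n, replace = TRUE))
latent <- 0.08 * age + 0.5 * (gender == "M") + rlogis(n)
bp <- cut(latent, breaks = c(-Inf, 4, 6, Inf),
          labels = c("Normal", "Pre-Hypertension", "Hypertension"),
          ordered_result = TRUE)
sim <- data.frame(Age = age, Gender = gender, Blood_Pressure = bp)

# Full model and intercept-only (null) model
full_fit <- polr(Blood_Pressure ~ Age + Gender, data = sim, Hess = TRUE)
null_fit <- polr(Blood_Pressure ~ 1, data = sim)

# Likelihood ratio statistic: drop in deviance from the null to the full model
lr_stat <- deviance(null_fit) - deviance(full_fit)
lr_df   <- full_fit$edf - null_fit$edf
lr_p    <- pchisq(lr_stat, df = lr_df, lower.tail = FALSE)

c(LR = lr_stat, df = lr_df, p_value = lr_p)
```

A small p-value here suggests that the predictors, taken together, improve on a model with cutpoints only.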

There are several ways to visualize the results of an ordinal logistic regression in R. Here are some common methods:

1. Predictive Probabilities Plot

You can plot the predicted probabilities of each level of the outcome variable (in your case, "Blood_Pressure") for different values of a predictor (e.g., Age or Gender). Here's how to create such a plot:

# Create new data for prediction (one row per Age/Gender combination)
new_data <- expand.grid(Age = seq(30, 80, by = 1),
                        Gender = factor(c("F", "M"), levels = c("F", "M")))

# Predict the probabilities
pred_probs <- predict(model, new_data, type = "probs")

# Plot the probabilities for each blood pressure category
library(ggplot2)

# Convert to a data frame for ggplot
# (the rows of pred_probs are already aligned with the rows of new_data)
plot_data <- data.frame(Age = new_data$Age,
                        Gender = new_data$Gender,
                        Normal = pred_probs[, 1],
                        Pre_Hypertension = pred_probs[, 2],
                        Hypertension = pred_probs[, 3])

# Plot (linewidth requires ggplot2 3.4.0 or later; older versions use size)
ggplot(plot_data, aes(x = Age)) +
  geom_line(aes(y = Normal, color = "Normal"), linewidth = 1) +
  geom_line(aes(y = Pre_Hypertension, color = "Pre-Hypertension"), linewidth = 1) +
  geom_line(aes(y = Hypertension, color = "Hypertension"), linewidth = 1) +
  facet_wrap(~ Gender) +
  labs(title = "Predicted Probabilities of Blood Pressure Categories",
       x = "Age", y = "Predicted Probability") +
  scale_color_manual(values = c("Normal" = "blue", "Pre-Hypertension" = "orange", "Hypertension" = "red")) +
  theme_minimal()

This plot will show the predicted probabilities of each blood pressure category (Normal, Pre-Hypertension, and Hypertension) for different values of Age, with a separate panel for each gender.
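
It can help to see where these predicted probabilities come from. The sketch below reproduces predict(type = "probs") by hand for a single hypothetical patient: it computes the cumulative probabilities from the fitted cutpoints and coefficients, then takes successive differences. The data are simulated and the names (sim, fit, new_patient) are assumptions made for illustration.

```r
library(MASS)

# Simulated data (hypothetical, for illustration only)
set.seed(42)
n <- 300
sim <- data.frame(
  Age    = round(runif(n, 30, 80)),
  Gender = factor(sample(c("F", "M"), n, replace = TRUE))
)
latent <- 0.08 * sim$Age + 0.5 * (sim$Gender == "M") + rlogis(n)
sim$Blood_Pressure <- cut(latent, breaks = c(-Inf, 4, 6, Inf),
                          labels = c("Normal", "Pre-Hypertension", "Hypertension"),
                          ordered_result = TRUE)

fit <- polr(Blood_Pressure ~ Age + Gender, data = sim, Hess = TRUE)

# A hypothetical 60-year-old male patient
new_patient <- data.frame(Age = 60, Gender = factor("M", levels = c("F", "M")))

# Linear predictor (polr has no separate intercept; the cutpoints play that role)
eta <- coef(fit)["Age"] * 60 + coef(fit)["GenderM"] * 1

# Cumulative probabilities: P(Y <= j) = plogis(zeta_j - eta)
cum_probs <- plogis(fit$zeta - eta)

# Category probabilities are successive differences of the cumulative ones
by_hand <- diff(c(0, cum_probs, 1))

from_predict <- predict(fit, new_patient, type = "probs")

all.equal(unname(by_hand), unname(from_predict))
```

Because the three category probabilities are differences of one cumulative curve, they always sum to 1 at every value of Age.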

2. Coefficient Plot

To visualize the relationship between the predictors and the outcome, you can create a plot of the model coefficients.

# Coefficient plot: one row per predictor, with its estimated coefficient
coef_data <- data.frame(Variable = names(coef(model)),
                        Estimate = unname(coef(model)))

# Plot the coefficients
ggplot(coef_data, aes(x = Variable, y = Estimate)) +
  geom_col(fill = "lightblue") +
  coord_flip() +
  labs(title = "Model Coefficients for Ordinal Logistic Regression",
       x = "Variable", y = "Coefficient Estimate") +
  theme_minimal()

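This bar chart will show the estimated coefficients (log odds ratios) for each predictor in the ordinal logistic regression model. Keep in mind that the predictors are on different scales: the Age coefficient is per year of age, while the Gender coefficient compares males with females, so the bar lengths are not directly comparable.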

3. Predicted vs. Observed Plot

This type of plot helps you compare the observed and predicted categories. You can check how well the model performs in predicting the actual categories.

# Predict the categories
predicted_classes <- predict(model, type = "class")

# Create confusion matrix (caret expects predictions in rows and observed values in columns)
conf_matrix <- table(Predicted = predicted_classes, Observed = patient_data$Blood_Pressure)

# Summarize the confusion matrix
library(caret)
confusionMatrix(conf_matrix)

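This provides a table of predicted vs. observed values, and confusionMatrix() will give you additional details like overall accuracy and per-category sensitivity and specificity. Because the predictions are made on the same data used to fit the model, these figures tend to be optimistic.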
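
If caret is not available, the headline number can be computed directly from the table. The sketch below does this on simulated data and compares the accuracy with a simple baseline (always predicting the most common category). The names sim, fit, and tab and the simulated effects are hypothetical, used only for illustration.

```r
library(MASS)

# Simulated data (hypothetical, for illustration only)
set.seed(42)
n <- 300
sim <- data.frame(
  Age    = round(runif(n, 30, 80)),
  Gender = factor(sample(c("F", "M"), n, replace = TRUE))
)
latent <- 0.08 * sim$Age + 0.5 * (sim$Gender == "M") + rlogis(n)
sim$Blood_Pressure <- cut(latent, breaks = c(-Inf, 4, 6, Inf),
                          labels = c("Normal", "Pre-Hypertension", "Hypertension"),
                          ordered_result = TRUE)

fit <- polr(Blood_Pressure ~ Age + Gender, data = sim, Hess = TRUE)

# Cross-tabulate predicted against observed categories
predicted <- predict(fit, type = "class")
tab <- table(Predicted = predicted, Observed = sim$Blood_Pressure)

# Accuracy: share of cases on the diagonal
accuracy <- sum(diag(tab)) / sum(tab)

# Baseline: always predicting the most common observed category
baseline <- max(table(sim$Blood_Pressure)) / nrow(sim)

c(accuracy = accuracy, baseline = baseline)
```

An accuracy close to the baseline suggests the predictors add little to classification, even if some coefficients are statistically significant.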

4. Plot of Predicted Probabilities for Specific Groups

You can also plot the predicted probabilities for specific groups in your data (e.g., females vs males or different age groups).

# Predict probabilities for females only
female_data <- subset(patient_data, Gender == "F")
pred_female <- predict(model, female_data, type = "probs")

# Combine predicted probabilities with the original data
female_data$Normal_Prob <- pred_female[, 1]
female_data$Pre_Hypertension_Prob <- pred_female[, 2]
female_data$Hypertension_Prob <- pred_female[, 3]

# Plot probabilities (linewidth requires ggplot2 3.4.0 or later; older versions use size)
ggplot(female_data, aes(x = Age)) +
  geom_line(aes(y = Normal_Prob, color = "Normal"), linewidth = 1) +
  geom_line(aes(y = Pre_Hypertension_Prob, color = "Pre-Hypertension"), linewidth = 1) +
  geom_line(aes(y = Hypertension_Prob, color = "Hypertension"), linewidth = 1) +
  labs(title = "Predicted Probabilities for Females",
       x = "Age", y = "Predicted Probability") +
  scale_color_manual(values = c("Normal" = "blue", "Pre-Hypertension" = "orange", "Hypertension" = "red")) +
  theme_minimal()

5. Effect of a Predictor (Age) on Blood Pressure Categories

To visualize how age affects the probability of being in each blood pressure category, you can plot the effect of age using the effects package.

# Install the 'effects' package if not already installed
if (!requireNamespace("effects", quietly = TRUE)) install.packages("effects")
library(effects)

# Plot the effect of Age on Blood Pressure
effect_plot <- effect("Age", model)
plot(effect_plot, main = "Effect of Age on Blood Pressure Categories")

This will plot the predicted probability of each blood pressure category across the range of Age, with the other predictor (Gender) held at a typical value.

Conclusion:

Ordinal Logistic Regression is a powerful statistical technique for analyzing ordinal outcome variables, providing meaningful insights into the relationship between predictors and ordered categories. By following this guide, you can effectively implement OLR in R, interpret the results, and visualize key findings through predictive probability plots, coefficient charts, and observed vs. predicted comparisons. Whether analyzing health data like blood pressure levels or other ordinal outcomes, OLR equips researchers with the tools to make data-driven decisions and enhance their understanding of complex relationships.

Tags: Bio Statistics R Studio
