METHODS

Data collection

I first searched for open-access datasets using the keywords “biochar application” and “Agriculture,” “Forest,” or “Grassland” on Zenodo (https://zenodo.org/), DataCite (https://datacite.org/), and Scientific Data (https://www.nature.com/sdata/). Because few open-access datasets were available, I extended the search to Web of Science (WOS, http://apps.webofknowledge.com/) and Google Scholar (https://scholar.google.com/) using the same keywords. GetData Graph Digitizer 2.20 (https://getdata-graph-digitizer.software.informer.com/) was used to extract data from the publications.

The collected data included means, the number of replicates, and standard deviation (SD) or standard error (SE) values. When neither SD nor SE was reported, I estimated the SD as 10% of the reported mean. When an error value was reported without stating whether it was an SD or an SE, I treated it as an SE. When only SE was available, SD was calculated as SD = SE × √n, where n is the sample size.
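The rules above for recovering an SD can be written as a short function. This is a minimal Python sketch for illustration only; the function name `resolve_sd` and its arguments are hypothetical, and taking the absolute value of the mean in the 10% rule is an added assumption so that the SD stays positive.

```python
import math

def resolve_sd(mean, n, sd=None, se=None, unlabeled=None):
    """Return an SD for one treatment mean (hypothetical helper).

    sd        -- reported standard deviation, if any
    se        -- reported standard error, if any
    unlabeled -- error value reported without saying SD or SE
    """
    if sd is not None:
        return sd
    # An unlabeled error value is treated as an SE by default.
    if se is None and unlabeled is not None:
        se = unlabeled
    if se is not None:
        return se * math.sqrt(n)  # SD = SE × √n
    # Neither SD nor SE reported: use 10% of the mean.
    # abs() is an assumption added here, not stated in the text.
    return 0.10 * abs(mean)
```

For example, a mean of 10 with SE = 0.5 and n = 4 gives SD = 0.5 × 2 = 1.0.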

Due to time constraints, I focused on data from 22 studies: nine on agriculture, seven on forestry, and six on grassland ecosystems. This collection includes one dataset from my own ongoing research on urban forests. For each study site, I also compiled background data: geographic coordinates (latitude and longitude), mean annual temperature (MAT), mean annual precipitation (MAP), biochar application rate (Rate), and duration of application (Duration).

In addition to background data, I gathered key soil physicochemical properties: soil pH, soil organic carbon (SOC), total nitrogen (TN), ammonium (NH₄⁺), nitrate (NO₃⁻), microbial biomass carbon (MBC), and microbial biomass nitrogen (MBN). For productivity, crop yield was used for agricultural and grassland systems, while forest productivity was represented by the product of squared tree diameter at breast height (DBH) and tree height (DBH² × height). Finally, I extracted greenhouse gas (GHG) emission data, including CO₂, CH₄, and N₂O measurements.

Figure 1 shows the spatial and ecological coverage of the studies included in this meta-analysis. Panel (A) maps the global distribution of the study areas. Panel (B) classifies the same sites according to Whittaker’s biome scheme, which spans tropical and subtropical forests, temperate woodlands and grasslands, deserts, and boreal regions, and shows the range of terrestrial ecosystems represented in my dataset.

Figure 1. (A) Geographical distribution of the literature study areas. (B) Site distributions under Whittaker biomes.

Effect Size Calculation

For each study, I assessed the impact of biochar application on soil properties, productivity, and greenhouse gas (GHG) emissions across land systems. The effect size was calculated as the natural-log response ratio (RR):

RR = ln(Xbiochar / Xcontrol)

where Xbiochar and Xcontrol are the mean values of a given variable under biochar application and the control treatment, respectively.

The variance (Vi) associated with the response ratio was calculated as:

Vi = SDbiochar² / (nbiochar × Xbiochar²) + SDcontrol² / (ncontrol × Xcontrol²)

where SDbiochar and SDcontrol are the standard deviations of the biochar-treated and control groups, and nbiochar and ncontrol are the corresponding sample sizes.
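As a worked illustration, the sketch below computes an effect size and its variance for one treatment and control pair. It assumes the natural-log response ratio and its usual large-sample variance; the function names and example numbers are hypothetical and are not taken from the dataset.

```python
import math

def log_response_ratio(mean_biochar, mean_control):
    """RR = ln(Xbiochar / Xcontrol); both means must be positive."""
    return math.log(mean_biochar / mean_control)

def lnrr_variance(sd_b, n_b, mean_b, sd_c, n_c, mean_c):
    """Vi = SDb^2 / (nb * Xb^2) + SDc^2 / (nc * Xc^2)."""
    return sd_b ** 2 / (n_b * mean_b ** 2) + sd_c ** 2 / (n_c * mean_c ** 2)

# Hypothetical example: yield of 6.0 with biochar vs 5.0 in the control,
# SDs of 0.6 and 0.5, four replicates each.
rr = log_response_ratio(6.0, 5.0)                 # ln(1.2)
vi = lnrr_variance(0.6, 4, 6.0, 0.5, 4, 5.0)      # 0.0025 + 0.0025
print(round(rr, 4), round(vi, 4))
```

A positive RR indicates a higher value under biochar than in the control, and RR = 0 indicates no difference.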

Calculation of the Overall Response Ratio

To estimate the overall effect size, I fitted a multilevel meta-analytic model using the rma.mv() function in the metafor package in R. The model accounted for heterogeneity between studies by specifying StudyID as a random effect (random = ~ 1 | StudyID). The random intercept lets effect sizes from the same study share a study-level deviation, so differences between studies are modelled as random variation around the overall effect. Model parameters were estimated by restricted maximum likelihood (REML).

To calculate confidence intervals (CIs) for the response ratio, I used the bootstrap method with 1,000 resamples, generating bias-corrected and accelerated (BCa) bootstrap CIs with the boot package in R.
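To show what a BCa interval involves, here is a minimal Python sketch using only the standard library. It is not the boot package workflow used in the analysis: the statistic is assumed to be a plain mean of response ratios as a stand-in, and the function name `bca_interval` and the input values are hypothetical.

```python
import random
from statistics import NormalDist, mean

def bca_interval(values, stat=mean, n_boot=1000, alpha=0.05, seed=1):
    """Bias-corrected and accelerated bootstrap CI for stat(values)."""
    rng = random.Random(seed)
    n = len(values)
    theta_hat = stat(values)
    # Resample with replacement and sort the bootstrap estimates.
    boots = sorted(stat(rng.choices(values, k=n)) for _ in range(n_boot))

    norm = NormalDist()
    # Bias correction: share of bootstrap estimates below the observed one.
    prop = sum(b < theta_hat for b in boots) / n_boot
    prop = min(max(prop, 1 / n_boot), 1 - 1 / n_boot)  # keep inv_cdf finite
    z0 = norm.inv_cdf(prop)

    # Acceleration from leave-one-out (jackknife) estimates.
    jack = [stat(values[:i] + values[i + 1:]) for i in range(n)]
    jbar = mean(jack)
    num = sum((jbar - j) ** 3 for j in jack)
    den = 6 * sum((jbar - j) ** 2 for j in jack) ** 1.5
    a = num / den if den else 0.0

    def adjusted(q):
        # Shift the nominal quantile q by the bias and acceleration terms.
        z = norm.inv_cdf(q)
        return norm.cdf(z0 + (z0 + z) / (1 - a * (z0 + z)))

    lo_idx = min(n_boot - 1, max(0, int(adjusted(alpha / 2) * n_boot)))
    hi_idx = min(n_boot - 1, max(0, int(adjusted(1 - alpha / 2) * n_boot)))
    return boots[lo_idx], boots[hi_idx]

# Hypothetical response ratios, for illustration only.
rr_values = [0.12, 0.25, 0.05, 0.31, 0.18, 0.22, 0.09, 0.27, 0.15, 0.20]
print(bca_interval(rr_values))
```

Unlike a simple percentile interval, the BCa interval shifts the quantiles it reads from the bootstrap distribution to correct for bias and skewness in that distribution.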

Subgroup Analysis

To explore differences across subgroups, pairwise comparisons were performed using the estimated coefficients and their variance-covariance matrix. The difference in response ratios was computed for each pair of subgroups, and statistical significance was determined using Z-tests. P-values from pairwise comparisons were corrected for multiple testing, and the multcompView package in R was used to denote significance with letters (e.g., a, b, c) above the subgroups in the figures. Subgroups sharing the same letter were not significantly different, while those with different letters differed significantly (P < 0.05). The effect of biochar application was considered significant when the 95% confidence interval did not include zero.
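The pairwise Z-test described above can be sketched as follows. This is an illustrative Python version, assuming the difference between two coefficients has variance Var(a) + Var(b) − 2 Cov(a, b). The function name, subgroup estimates, and covariance values are hypothetical, and no multiple-testing correction or letter display is applied here.

```python
import math
from itertools import combinations
from statistics import NormalDist

def pairwise_z_tests(coefs, vcov):
    """Two-sided Z-tests for all pairs of subgroup estimates.

    coefs -- {subgroup: estimate}
    vcov  -- {(subgroup_a, subgroup_b): covariance}; missing
             off-diagonal entries are treated as zero.
    """
    results = []
    for a, b in combinations(coefs, 2):
        cov_ab = vcov.get((a, b), vcov.get((b, a), 0.0))
        se_diff = math.sqrt(vcov[(a, a)] + vcov[(b, b)] - 2 * cov_ab)
        z = (coefs[a] - coefs[b]) / se_diff
        p = 2 * (1 - NormalDist().cdf(abs(z)))
        results.append((a, b, z, p))
    return results

# Hypothetical subgroup estimates and variances, for illustration only.
coefs = {"Agriculture": 0.30, "Forest": 0.10, "Grassland": 0.28}
vcov = {(k, k): 0.0025 for k in coefs}
for a, b, z, p in pairwise_z_tests(coefs, vcov):
    print(f"{a} vs {b}: z = {z:.2f}, p = {p:.4f}")
```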

Linear Regression Analysis

Subsequent analyses included linear regression models for each land system type to examine the relationships between environmental factors (MAT, MAP, Duration, and Rate) and various ecological indicators. Pearson correlation coefficients and their significance (α = 0.05) were calculated, with statistical results automatically annotated on the plots using the ggpubr::stat_cor() function in R.
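For reference, the quantities behind such a plot annotation can be computed by hand. The Python sketch below is an illustration rather than the ggpubr workflow: it returns the Pearson correlation coefficient, the least-squares slope and intercept, and the t statistic for the correlation (with n − 2 degrees of freedom). The function name and data are hypothetical, and the P-value lookup from the t distribution is left to a statistics library.

```python
import math

def pearson_and_fit(x, y):
    """Return (r, slope, intercept, t_stat) for paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = my - slope * mx
    # t statistic for H0: r = 0; undefined when |r| equals 1.
    t_stat = r * math.sqrt((n - 2) / (1 - r ** 2))
    return r, slope, intercept, t_stat

# Hypothetical data: application rate vs response ratio.
rate = [10, 20, 30, 40, 50]
rr = [0.05, 0.12, 0.14, 0.22, 0.24]
r, slope, intercept, t_stat = pearson_and_fit(rate, rr)
print(f"r = {r:.3f}, slope = {slope:.4f}, intercept = {intercept:.3f}")
```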

Random-Forest Analysis

To further assess the influence of multiple environmental factors on productivity and soil respiration, I applied random forest analysis using the rfPermute package in R. Because of missing values and small sample sizes, multiple imputation by chained equations (MICE) and bootstrap resampling were used so that each land system group contained at least 30 samples. Random forest models were constructed within each land system group, comprising 500 trees and 50 repetitions. Variable importance scores (%IncMSE) and their significance (p-values) were extracted from the model output and presented in bar plots to show the relative influence of each environmental factor on each response variable.

All analyses and figure generation were conducted using R version 4.3.3.