Datasets
AIBECS.jl ships a handful of small pedagogical circulations inside the package (built from scratch in pure Julia, no download needed) and downloads larger data products on demand: ocean circulation matrices, dust deposition fields, topography, and a few others. This page catalogs everything the package can build or fetch.
Why downloads are deferred to first use
The downloaded datasets are large (the OCIM2 transport matrices are ~29 MB each, OCIM2_48L is ~554 MB, ETOPO is ~400 MB compressed), and bundling them would bloat every install for users who only need a subset. AIBECS instead uses DataDeps.jl to declare each dataset as a data dependency: a named record with a URL, a checksum, and a citation.
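To make the idea of a "named record" concrete, here is a minimal sketch of how a data dependency can be declared with DataDeps.jl. The dataset name, URL, citation text, and checksum are all hypothetical placeholders and do not correspond to any dataset AIBECS registers. Registering a record does not download anything.

```julia
using DataDeps

# Everything below is a hypothetical placeholder, not a real AIBECS record.
dep = DataDep(
    "Example-Circulation",                    # name, also used as the cache sub-directory
    """
    Example transport matrix (hypothetical dataset).
    If you use it, please cite: Author et al. (Year).
    """,                                      # message shown before the download prompt
    "https://example.org/circulation.jld2",   # remote path (placeholder URL)
    "0"^64,                                   # placeholder SHA-256 checksum (hex string)
)

# Make the record known to DataDeps; the download only happens on first access.
register(dep)
```

Packages usually perform this registration at load time, so that the record exists before any load function tries to resolve it.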
The first time you call e.g. OCIM2.load(), DataDeps:
1. Checks the local DataDeps cache (~/.julia/datadeps/AIBECS-OCIM2_CTL_He/ by default) for a copy.
2. If absent, prompts you to accept the licence and citation (auto-accepted on CI when ENV["DATADEPS_ALWAYS_ACCEPT"] = true), then downloads the file from the source listed below.
3. Verifies the recorded checksum, refusing to use a corrupted file.
4. Caches the file so subsequent loads are instant and offline-friendly.
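The steps above can be triggered from a script as sketched below. This assumes that OCIM2.load() returns a grid and a transport matrix, as in the AIBECS tutorials, and that JLD2 is installed. The first run needs network access.

```julia
# Accept the licence/citation prompt non-interactively (useful in scripts and CI).
# Set this before the first load; ENV stores the value as a string.
ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"

using AIBECS
using JLD2   # required to read the OCIM files

# First call: check cache, download, verify checksum, cache.
# Later calls: read straight from the local cache.
grd, T = OCIM2.load()
```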
This pattern (described in White et al., 2019) is the same mechanism used by MLDatasets.jl, WordNet.jl, and other data-heavy Julia packages. The benefit for AIBECS users: the package stays small and pure-code, the data lives at a stable URL with a citation, and any change to the upstream file is caught by the checksum mismatch instead of silently changing model output.
The toy / pedagogical circulations are different: they are built in memory from a few constants and the helpers in CirculationGeneration.jl, so they add no install-time cost and need no network access.
Ocean circulations
Built from scratch (bundled in AIBECS)
These pedagogical circulations are constructed each time Module.load() is called, using OceanGrids plus the T_advection / T_diffusion helpers in src/CirculationGeneration.jl. Nothing is downloaded or cached.
| Module | Layout | Grid size | Citation | Source | Size (MB) |
|---|---|---|---|---|---|
| TwoBoxModel | surface + deep | 1×1×2 | Sarmiento & Gruber (2006) | bundled | 0 |
| Archer_etal_2000 | 3-box (HL surface, LL surface, deep) on a 6-cell grid | 2×1×3 | Archer et al. (2000) | bundled | 0 |
| Primeau_2x2x2 | shoebox (5 wet boxes, 3 dry) | 2×2×2 | Primeau, Intro2TransportOperators | bundled | 0 |
| Haine_and_Hall_2025 | 9-box (3 latitudes × 3 depths) | 3×1×3 | Haine & Hall (2002); Haine et al. (2025) | bundled | 0 |
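As a quick illustration, a bundled circulation can be built without any download or extra package. The sketch below assumes that Primeau_2x2x2.load() returns a grid and a transport matrix with one row and one column per wet box.

```julia
using AIBECS

# Built in memory from a few constants; no network access, nothing cached.
grd, T = Primeau_2x2x2.load()

# T only covers wet boxes, so it should be square
# (the table above lists 5 wet boxes for this model).
nwet = size(T, 1)
```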
Downloaded (data-based circulations)
Five families of transport matrices and grids are available. All files are JLD2 (or .tar.gz for OCIM2_48L) generated from the upstream MATLAB distributions by briochemc/OceanCirculations.
| Module | Variant | Grid size | Citation | Source | Size (MB) |
|---|---|---|---|---|---|
| OCIM0 | (single) | 180×90×24 | DeVries & Primeau (2011); Primeau et al. (2013) | link | 11 |
| OCIM1 | CTL | 180×91×24 | DeVries (2014) | link | 28 |
| OCIM2 | CTL_He (default) | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | CTL_noHe | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | KiHIGH_He | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | KiHIGH_noHe | 180×91×24 | DeVries & Holzer (2019) | link | 28 |
| OCIM2 | KiLOW_He | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | KiLOW_noHe | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | KvHIGH_He | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | KvHIGH_KiHIGH_noHe | 180×91×24 | DeVries & Holzer (2019) | link | 28 |
| OCIM2 | KvHIGH_KiLOW_He | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | KvHIGH_KiLOW_noHe | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2 | KvHIGH_noHe | 180×91×24 | DeVries & Holzer (2019) | link | 29 |
| OCIM2_48L | base | 180×91×48 | Holzer, DeVries & de Lavergne (2021) | link | 554 |
| OCCA | (single) | 180×80×10 | Forget (2010) | link | 5 |
Loading any OCIM matrix requires using JLD2 (which activates the AIBECSJLD2Ext extension); loading OCIM2_48L additionally requires using MAT, NCDatasets.
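A non-default OCIM2 variant might be selected as sketched below. The version keyword is an assumption about the current OCIM2.load signature, so check the docstring (?OCIM2.load) if it is rejected. The variant name comes from the table above.

```julia
using AIBECS
using JLD2   # activates the JLD2 extension needed to read the file

# Assumption: the variant is chosen with a `version` keyword argument.
grd, T = OCIM2.load(version="KiHIGH_He")
```

Each variant is a separate data dependency, so switching variants triggers its own one-time download.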
Other datasets
These datasets cover aeolian deposition, topography, river discharge, groundwater discharge, and the AWESOME-OCIM toolbox. They are hosted by their original maintainers (except Chien, which lives on Zenodo).
| Module | Variant | What it is | Citation | Source | Size (MB) |
|---|---|---|---|---|---|
| AeolianSources | Chien (default) | 2°×2° seasonal aerosol deposition (fires, biofuels, dust, sea salt, biogenics, volcanoes, fossil fuels) | Chien et al. (2016) | link | 1 |
| AeolianSources | Kok | annual dust deposition by source region (DustCOMM) | Kok et al. (2021) | link | 6 |
| ETOPO | bedrock | 1-arc-min global relief, bedrock surface (compressed) | Amante & Eakins (2009) | link | 402 |
| ETOPO | ice | 1-arc-min global relief, ice-surface (compressed) | Amante & Eakins (2009) | link | 395 |
| GroundWaters | (single) | coastal fresh-groundwater discharge shapefile | Luijendijk et al. (2020); PANGAEA dataset | link | 32 |
| AO | master | AWESOME-OCIM MATLAB toolbox (source archive) | John et al. (2020) | link | 123 |
Loading these typically requires the matching extension dependencies: AeolianSources needs NCDatasets; ETOPO needs Distances and NCDatasets; GroundWaters needs Shapefile and DataFrames.
Cache layout
DataDeps stores files under ~/.julia/datadeps/<DataDepName>/ by default. You can override the location by setting ENV["DATADEPS_LOAD_PATH"] before loading AIBECS. The DataDep names this package registers are:
- AIBECS-OCIM0.1, AIBECS-OCIM1_CTL, AIBECS-OCIM2_<variant> (one per OCIM2 variant), AIBECS-OCIM2_48L, AIBECS-OCCA
- AIBECS-Chien_etal_2016, AIBECS-Kok_etal_2021
- ETOPO_bedrock, ETOPO_ice
- groundwater_discharge
- AWESOME-OCIM
To force a re-download (e.g. after a checksum mismatch or to test a URL change), delete the corresponding directory and call Module.load() again.
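Deleting a cached dataset can be scripted with plain Julia, as in the sketch below. The two helper functions are hypothetical and are not part of AIBECS or DataDeps. They assume the default cache location under the first Julia depot, so they will not find files stored elsewhere via DATADEPS_LOAD_PATH.

```julia
# Hypothetical helpers, not part of AIBECS or DataDeps.
# Assumes the default cache location: <first depot>/datadeps/<DataDepName>/.
datadep_dir(name; depot=first(DEPOT_PATH)) = joinpath(depot, "datadeps", name)

function clear_datadep(name; depot=first(DEPOT_PATH))
    dir = datadep_dir(name; depot)
    isdir(dir) || return false   # nothing cached, nothing to delete
    rm(dir; recursive=true)      # remove the cached files and the directory
    return true
end

# Remove the default OCIM2 variant from the cache;
# the next OCIM2.load() call will then download it again.
clear_datadep("AIBECS-OCIM2_CTL_He")
```

The function returns true when a directory was removed and false when there was nothing to remove, which makes it safe to call unconditionally.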