fre.cmor.cmor_helpers module
fre.cmor helper functions
This module provides helper functions for the CMORization workflow in the FRE (FMS Runtime Environment) CLI, specifically for use in the cmor_mixer submodule. The utilities here support a variety of common tasks including:
Logging and min/max value inspection for masked arrays.
Extraction and manipulation of variables from netCDF4 datasets.
File path and directory utilities tailored to FRE conventions.
Construction of boundary arrays for vertical levels.
Extraction and filtering of ISO datetime ranges from filenames.
Detection of ocean grid conventions in datasets.
Determination of vertical dimension names in datasets.
Creation of temporary output directories for CMOR products.
Reading and updating experiment configuration JSON files.
Functions
print_data_minmax(ds_variable, desc)
from_dis_gimme_dis(from_dis, gimme_dis)
find_statics_file(bronx_file_path)
create_lev_bnds(bound_these, with_these)
get_iso_datetime_ranges(var_filenames, iso_daterange_arr, start, stop)
check_dataset_for_ocean_grid(ds)
get_vertical_dimension(ds, target_var)
create_tmp_dir(outdir, json_exp_config)
get_json_file_data(json_file_path)
update_grid_and_label(json_file_path, new_grid_label, new_grid, new_nom_res, output_file_path)
update_calendar_type(json_file_path, new_calendar_type, output_file_path)
check_path_existence(some_path)
iso_to_bronx_chunk(cmor_chunk_in)
conv_mip_to_bronx_freq(cmor_table_freq)
get_bronx_freq_from_mip_table(json_table_config)
filter_brands(brands, target_var, mip_var_cfgs, has_time_bnds, input_vert_dim)
Notes
These functions aim to encapsulate frequently repeated logic in the CMOR workflow, improving code readability, maintainability, and robustness.
- fre.cmor.cmor_helpers.check_dataset_for_ocean_grid(ds: netCDF4.Dataset) → bool
Check if a netCDF4.Dataset uses an ocean grid (i.e., contains ‘xh’ or ‘yh’ variables).
- Parameters:
ds (netCDF4.Dataset) – Dataset to be checked.
- Returns:
True if ocean grid variables are present, otherwise False.
- Return type:
bool
Note
Logs a warning if an ocean grid is detected.
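As a rough illustration of the check described above, the sketch below tests for 'xh' or 'yh' among a dataset's variable names. The function name sketch_has_ocean_grid and the SimpleNamespace stand-in for a netCDF4.Dataset are assumptions made so the example runs without a data file; this is not the module's own code.

```python
import logging
from types import SimpleNamespace

log = logging.getLogger(__name__)


def sketch_has_ocean_grid(ds) -> bool:
    """Return True if 'xh' or 'yh' is among ds.variables (hypothetical helper)."""
    names = ds.variables.keys()
    uses_ocean_grid = "xh" in names or "yh" in names
    if uses_ocean_grid:
        log.warning("dataset appears to use an ocean grid ('xh'/'yh' present)")
    return uses_ocean_grid


# Stand-ins exposing only the .variables mapping that the check needs.
ocean_like = SimpleNamespace(variables={"xh": None, "yh": None, "sos": None})
atmos_like = SimpleNamespace(variables={"lat": None, "lon": None, "tas": None})

ocean_result = sketch_has_ocean_grid(ocean_like)
atmos_result = sketch_has_ocean_grid(atmos_like)
```

A real netCDF4.Dataset exposes the same .variables mapping, so the membership test would carry over unchanged.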
- fre.cmor.cmor_helpers.check_path_existence(some_path: str)
Check if the given path exists, raising FileNotFoundError if not.
- Parameters:
some_path (str) – A string representing a filesystem path (relative or absolute).
- Raises:
FileNotFoundError – If the path does not exist.
- fre.cmor.cmor_helpers.conv_mip_to_bronx_freq(cmor_table_freq: str) → str | None
Convert a MIP table frequency string to its FRE-bronx equivalent using a lookup table.
- Parameters:
cmor_table_freq (str) – Frequency string as found in a MIP table (e.g., ‘mon’, ‘day’, ‘yr’).
- Raises:
KeyError – If the frequency string is not recognized as valid.
- Returns:
FRE-bronx frequency string, or None if not mappable.
- Return type:
str or None
- fre.cmor.cmor_helpers.create_lev_bnds(bound_these: netCDF4.Variable = None, with_these: netCDF4.Variable = None) → numpy.ndarray
Create a vertical level bounds array for a set of levels.
- Parameters:
bound_these (netCDF4.Variable) – netCDF4 Variable with a numpy array representing vertical levels
with_these (netCDF4.Variable) – netCDF4 Variable with a numpy array representing level bounds, one longer than bound_these
- Raises:
ValueError – If the length of with_these is not len(bound_these) + 1.
- Returns:
Array of shape (len(bound_these), 2), where each row gives the bounds for a level.
- Return type:
np.ndarray
Note
Logs debug information about the input and output arrays.
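The relationship between the two inputs and the returned array can be sketched with plain NumPy arrays. The helper name sketch_lev_bnds is hypothetical, and the pairing rule (level i is bounded by entries i and i + 1 of the bounds input) is an assumption consistent with the length requirement stated above rather than a statement about the module's implementation.

```python
import numpy as np


def sketch_lev_bnds(levels, interfaces) -> np.ndarray:
    """Pair consecutive interface values into an (n_levels, 2) bounds array."""
    levels = np.asarray(levels)
    interfaces = np.asarray(interfaces)
    if len(interfaces) != len(levels) + 1:
        raise ValueError("expected len(interfaces) == len(levels) + 1")
    # Row i holds (lower, upper) = (interfaces[i], interfaces[i + 1]).
    return np.column_stack((interfaces[:-1], interfaces[1:]))


# Three levels centred between four interfaces.
bnds = sketch_lev_bnds([5.0, 15.0, 25.0], [0.0, 10.0, 20.0, 30.0])
print(bnds.shape)
```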
- fre.cmor.cmor_helpers.create_tmp_dir(outdir: str, json_exp_config: str | None = None) → str
Create a temporary directory for output, possibly informed by a JSON experiment config.
- Parameters:
outdir (str) – Base output directory.
json_exp_config (str, optional) – Path to a JSON config file with an “outpath” key.
- Raises:
OSError – If the temporary directory cannot be created.
- Returns:
Path to the created temporary directory.
- Return type:
str
Note
If json_exp_config is provided and contains “outpath”, a subdirectory is also created.
- fre.cmor.cmor_helpers.filter_brands(brands: list, target_var: str, mip_var_cfgs: dict, has_time_bnds: bool, input_vert_dim: str | int) → str
Disambiguate multiple CMIP7 variable brands by comparing input data properties against each candidate brand’s MIP dimension list.
Two filters are applied in sequence:
Time type: The presence or absence of time bounds in the input data is compared to whether the brand’s MIP dimensions contain time (time-mean, has bounds) or time1 (instantaneous, no bounds).
Vertical coordinate: The input data’s vertical dimension name is mapped to the corresponding MIP dimension name (via INPUT_TO_MIP_VERT_DIM) and brands whose MIP dimensions do not include it are excluded.
- Parameters:
brands (list[str]) – List of candidate brand strings to filter.
target_var (str) – The base variable name (before the brand suffix).
mip_var_cfgs (dict) – The full MIP table config dict (must contain "variable_entry").
has_time_bnds (bool) – Whether the input dataset contains time_bnds.
input_vert_dim (str or int) – The vertical dimension name from the input dataset, or 0 if no vertical dimension is present.
- Raises:
ValueError – If zero or more than one brand survives filtering.
- Returns:
The single brand string that survived disambiguation.
- Return type:
str
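The two-stage filtering described for filter_brands could look roughly like the sketch below. Several details are assumptions for illustration only: the entry key format "<target_var>_<brand>", the example brand strings, the contents of SKETCH_VERT_DIM_MAP (the real INPUT_TO_MIP_VERT_DIM is not shown in this document), and the choice to skip the vertical filter when input_vert_dim is 0.

```python
# Hypothetical stand-in for INPUT_TO_MIP_VERT_DIM; the real mapping is not shown here.
SKETCH_VERT_DIM_MAP = {"z_l": "olevel"}


def sketch_filter_brands(brands, target_var, mip_var_cfgs, has_time_bnds, input_vert_dim):
    """Keep the single brand whose MIP dimensions match the input data (sketch)."""
    entries = mip_var_cfgs["variable_entry"]
    wanted_time = "time" if has_time_bnds else "time1"
    survivors = []
    for brand in brands:
        # Assumed key format: "<target_var>_<brand>".
        dims = entries[f"{target_var}_{brand}"]["dimensions"]
        if isinstance(dims, str):
            dims = dims.split()
        # Filter 1: time-mean (time) versus instantaneous (time1).
        time_dims = {d for d in dims if d in ("time", "time1")}
        if time_dims and wanted_time not in time_dims:
            continue
        # Filter 2: the vertical dimension, translated to its MIP name.
        if input_vert_dim != 0:
            mip_vert = SKETCH_VERT_DIM_MAP.get(input_vert_dim, input_vert_dim)
            if mip_vert not in dims:
                continue
        survivors.append(brand)
    if len(survivors) != 1:
        raise ValueError(f"expected exactly one brand, got {survivors}")
    return survivors[0]


# Illustrative table fragment with two candidate brands for one variable.
mip_cfg = {
    "variable_entry": {
        "sos_tavg-u-hxy-sea": {"dimensions": ["longitude", "latitude", "time"]},
        "sos_tpt-u-hxy-sea": {"dimensions": ["longitude", "latitude", "time1"]},
    }
}
brands = ["tavg-u-hxy-sea", "tpt-u-hxy-sea"]
chosen = sketch_filter_brands(brands, "sos", mip_cfg, has_time_bnds=True, input_vert_dim=0)
```

With time bounds present, only the brand whose dimensions list time (rather than time1) survives the first filter.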
- fre.cmor.cmor_helpers.find_gold_ocean_statics_file(put_copy_here: str | None = None) → str | None
Locate (and if necessary copy) the gold-standard OM5_025 ocean_static.nc file from the GFDL archive into a user-writable directory.
- Parameters:
put_copy_here (str or None) – Directory root under which a mirror of the archive sub-path will be created and the file copied into.
- Returns:
Absolute path to the local working copy of ocean_static.nc, or None if the file could not be obtained.
- Return type:
str or None
Note
The archive path is hard-coded to the OM5_025 dataset on GFDL systems.
- fre.cmor.cmor_helpers.find_statics_file(bronx_file_path: str) → str | None
Attempt to find the statics file corresponding to a FRE-bronx output file. The code assumes the output file sits in a FRE-bronx directory structure when it looks for the statics file. That structure is mocked in this package under fre/tests/test_files/ascii_files/mock_archive. Running the tree command from that directory shows the mocked layout, which looks something like:
<STEM>/<EXP_NAME>/<PLATFORM>-<TARGET>/
└── pp
    ├── component
        ├── realm_frequency.static.nc
        └── ts
            └── frequency
                └── chunk_size
                    └── component.YYYYMM-YYYYMM.var.nc
- Parameters:
bronx_file_path (str) – File path to use as a reference for statics file location.
- Returns:
Path to the statics file if found, else None.
- Return type:
str or None
Note
The function searches upward in the directory structure until it finds a ‘pp’ directory, then globs for ‘static.nc’ files.
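The upward search described in the note can be approximated as follows. This is a minimal sketch, not the module's code: the name sketch_find_statics_file, the '*static*.nc' glob pattern, and the assumption that the statics file lives in the component directory directly beneath pp are all illustrative. A throwaway directory tree stands in for a real archive.

```python
import tempfile
from pathlib import Path
from typing import Optional


def sketch_find_statics_file(bronx_file_path: str) -> Optional[str]:
    """Walk up from the file until a directory directly under 'pp' is reached."""
    for ancestor in Path(bronx_file_path).parents:
        if ancestor.parent.name == "pp":
            # ancestor is the component directory; look for a statics file in it.
            matches = sorted(ancestor.glob("*static*.nc"))
            return str(matches[0]) if matches else None
    return None


with tempfile.TemporaryDirectory() as root:
    # Build a small mock of the pp/component/ts/frequency/chunk_size layout.
    comp = Path(root) / "exp" / "pp" / "ocean_monthly"
    ts_dir = comp / "ts" / "monthly" / "5yr"
    ts_dir.mkdir(parents=True)
    (comp / "ocean_monthly.static.nc").touch()
    data = ts_dir / "ocean_monthly.000101-000512.sos.nc"
    data.touch()

    found = sketch_find_statics_file(str(data))
    found_name = Path(found).name
    # A path with no 'pp' ancestor yields None.
    missing = sketch_find_statics_file(str(Path(root) / "elsewhere" / "file.nc"))
```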
- fre.cmor.cmor_helpers.from_dis_gimme_dis(from_dis: netCDF4.Dataset, gimme_dis: str) → numpy.ndarray | None
Retrieve and return a copy of a variable from a netCDF4.Dataset-like object.
- Parameters:
from_dis (netCDF4.Dataset) – The source dataset object.
gimme_dis (str) – The variable name to extract from the dataset.
- Returns:
A copy of the requested variable’s data, or None if not found.
- Return type:
np.ndarray or None
Note
Logs a warning if the variable is not found. The name comes from a hypothetical pronunciation of ‘ds’, the common moniker for a netCDF4.Dataset object.
- fre.cmor.cmor_helpers.get_bronx_freq_from_mip_table(json_table_config: str) → str
Extract the frequency of data from a CMIP MIP table (JSON), returning its FRE-bronx equivalent.
- Parameters:
json_table_config (str) – Path to a JSON MIP table file with ‘variable_entry’ metadata.
- Raises:
KeyError – If the frequency cannot be found or mapped.
- Returns:
FRE-bronx frequency string.
- Return type:
str
- fre.cmor.cmor_helpers.get_iso_datetime_ranges(var_filenames: List[str], iso_daterange_arr: List[str] | None = None, start: str | None = None, stop: str | None = None) → None
Extract and append ISO datetime ranges from filenames, filtered by start/stop years if specified.
- Parameters:
var_filenames (list of str) – Filenames, some of which contain ISO datetime ranges (e.g. ‘YYYYMMDD-YYYYMMDD’).
iso_daterange_arr (list of str) – List to append found datetime ranges to; modified in-place.
start (str, optional) – Start year in ‘YYYY’ format; only ranges within/after this year are included.
stop (str, optional) – Stop year in ‘YYYY’ format; only ranges within/before this year are included.
- Raises:
ValueError – If iso_daterange_arr is not provided or if no datetime ranges are found.
- Returns:
None
- Return type:
None
Note
This function modifies iso_daterange_arr in-place.
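The in-place append and year filtering can be illustrated with a small regular-expression sketch. The function name, the exact pattern, the filenames, and the filtering rule (keep a range when its first year is not before start and its last year is not after stop) are assumptions for illustration and may differ from the module's behavior.

```python
import re
from typing import List, Optional

# Matches YYYYMMDD-YYYYMMDD, YYYYMM-YYYYMM, or YYYY-YYYY (assumed formats).
_RANGE = re.compile(r"\d{8}-\d{8}|\d{6}-\d{6}|\d{4}-\d{4}")


def sketch_get_iso_datetime_ranges(
    var_filenames: List[str],
    iso_daterange_arr: Optional[List[str]] = None,
    start: Optional[str] = None,
    stop: Optional[str] = None,
) -> None:
    """Append date ranges found in filenames to iso_daterange_arr, in place."""
    if iso_daterange_arr is None:
        raise ValueError("iso_daterange_arr must be a list to append to")
    for name in var_filenames:
        match = _RANGE.search(name)
        if match is None:
            continue  # e.g. a statics file with no date range in its name
        daterange = match.group(0)
        first, last = daterange.split("-")
        if start is not None and int(first[:4]) < int(start):
            continue
        if stop is not None and int(last[:4]) > int(stop):
            continue
        if daterange not in iso_daterange_arr:
            iso_daterange_arr.append(daterange)
    iso_daterange_arr.sort()
    if not iso_daterange_arr:
        raise ValueError("no ISO datetime ranges found")


files = [
    "atmos.00010101-00051231.tas.nc",
    "atmos.00060101-00101231.tas.nc",
    "atmos.static.nc",
]
ranges: List[str] = []
sketch_get_iso_datetime_ranges(files, ranges, start="0006")
```

Because the list is modified in place, the caller reads the result from ranges rather than from a return value.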
- fre.cmor.cmor_helpers.get_json_file_data(json_file_path: str | None = None) → dict
Load and return the contents of a JSON file.
- Parameters:
json_file_path (str) – Path to the JSON file.
- Raises:
FileNotFoundError – If the file cannot be opened.
- Returns:
Parsed data from the JSON file.
- Return type:
dict
- fre.cmor.cmor_helpers.get_vertical_dimension(ds: netCDF4.Dataset, target_var: str) → str | int
Determine the vertical dimension for a variable in a netCDF4.Dataset.
- Parameters:
ds (netCDF4.Dataset) – Dataset containing variables.
target_var (str) – Name of the variable to inspect.
- Returns:
Name of the vertical dimension if found, otherwise 0.
- Return type:
str or int
Note
Returns 0 if no vertical dimension is detected.
- fre.cmor.cmor_helpers.iso_to_bronx_chunk(cmor_chunk_in: str) → str
Convert an ISO8601 duration string (e.g., ‘P5Y’) to FRE-bronx-style chunk string (e.g., ‘5yr’).
- Parameters:
cmor_chunk_in (str) – ISO8601 formatted string representing a time interval (must start with ‘P’ and end with ‘Y’).
- Raises:
ValueError – If the input does not follow the expected ISO format.
- Returns:
FRE-bronx chunk string.
- Return type:
str
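The conversion from ‘P5Y’ to ‘5yr’ amounts to stripping the leading ‘P’ and trailing ‘Y’ and appending ‘yr’. The sketch below (hypothetical name sketch_iso_to_bronx_chunk) additionally insists on digits between the two letters, which is an assumption that may be stricter than the module's own check.

```python
import re


def sketch_iso_to_bronx_chunk(cmor_chunk_in: str) -> str:
    """Convert a year-based ISO8601 duration such as 'P5Y' to '5yr' (sketch)."""
    match = re.fullmatch(r"P(\d+)Y", cmor_chunk_in)
    if match is None:
        raise ValueError(f"expected a duration like 'P5Y', got {cmor_chunk_in!r}")
    return f"{match.group(1)}yr"


chunk = sketch_iso_to_bronx_chunk("P5Y")

# Durations that are not expressed in whole years are rejected.
try:
    sketch_iso_to_bronx_chunk("PT6H")
    rejected = False
except ValueError:
    rejected = True
```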
- fre.cmor.cmor_helpers.print_data_minmax(ds_variable: numpy.ma.core.MaskedArray | None = None, desc: str | None = None) → None
Log the minimum and maximum values of a numpy MaskedArray along with a description.
- Parameters:
ds_variable (numpy.ma.core.MaskedArray, optional) – The data array whose min/max is to be logged.
desc (str, optional) – Description of the data.
- Returns:
None
- Return type:
None
Note
If the data cannot be logged, a warning is issued.
- fre.cmor.cmor_helpers.update_calendar_type(json_file_path: str, new_calendar_type: str, output_file_path: str | None = None) → None
Update the “calendar” field in a JSON experiment config file.
- Parameters:
json_file_path (str) – Path to the input JSON file.
new_calendar_type (str) – New value for the “calendar” field.
output_file_path (str, optional) – Path to save the updated JSON file. If None, overwrites the original file.
- Raises:
FileNotFoundError – If the input JSON file does not exist.
KeyError – If the “calendar” field is not found in the JSON file.
ValueError – If new_calendar_type is None.
json.JSONDecodeError – If the JSON file cannot be decoded.
- Returns:
None
- Return type:
None
Note
The function logs before and after values, and overwrites the input file unless an output path is given.
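The read, check, modify, write cycle described above can be sketched with the standard json module. The helper name sketch_update_calendar_type, the config contents, and the calendar values are illustrative assumptions; a temporary file stands in for a real experiment config.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def sketch_update_calendar_type(
    json_file_path: str,
    new_calendar_type: str,
    output_file_path: Optional[str] = None,
) -> None:
    """Rewrite the 'calendar' field of a JSON config (sketch)."""
    if new_calendar_type is None:
        raise ValueError("new_calendar_type must not be None")
    # FileNotFoundError and json.JSONDecodeError propagate from these calls.
    with open(json_file_path, encoding="utf-8") as handle:
        data = json.load(handle)
    if "calendar" not in data:
        raise KeyError("calendar")
    data["calendar"] = new_calendar_type
    # Overwrite the input file unless a separate output path was given.
    target = output_file_path or json_file_path
    with open(target, "w", encoding="utf-8") as handle:
        json.dump(data, handle, indent=4)


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "exp_config.json"
    cfg_path.write_text(
        json.dumps({"calendar": "julian", "grid_label": "gn"}), encoding="utf-8"
    )
    sketch_update_calendar_type(str(cfg_path), "noleap")
    updated = json.loads(cfg_path.read_text(encoding="utf-8"))
```

Only the targeted field changes; other keys in the config are written back untouched.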
- fre.cmor.cmor_helpers.update_grid_and_label(json_file_path: str, new_grid_label: str, new_grid: str, new_nom_res: str, output_file_path: str | None = None) → None
Update the “grid_label”, “grid”, and “nominal_resolution” fields in a JSON experiment config.
- Parameters:
json_file_path (str) – Path to the input JSON file.
new_grid_label (str) – New value for the “grid_label” field.
new_grid (str) – New value for the “grid” field.
new_nom_res (str) – New value for the “nominal_resolution” field.
output_file_path (str, optional) – Path to save the updated JSON file. If None, overwrites the original file.
- Raises:
FileNotFoundError – If the input JSON file does not exist.
KeyError – If a required field is not found in the JSON file.
ValueError – If any input value is None.
json.JSONDecodeError – If the JSON file cannot be decoded.
- Returns:
None
- Return type:
None
Note
The function logs before and after values, and overwrites the input file unless an output path is given.