fre.cmor.cmor_helpers module

fre.cmor helper functions

This module provides helper functions for the CMORization workflow in the FRE (FMS Runtime Environment) CLI, specifically for use in the cmor_mixer submodule. The utilities support common tasks, including:

  • Logging and min/max value inspection for masked arrays.

  • Extraction and manipulation of variables from netCDF4 datasets.

  • File path and directory utilities tailored to FRE conventions.

  • Construction of boundary arrays for vertical levels.

  • Extraction and filtering of ISO datetime ranges from filenames.

  • Detection of ocean grid conventions in datasets.

  • Determination of vertical dimension names in datasets.

  • Creation of temporary output directories for CMOR products.

  • Reading and updating experiment configuration JSON files.

Functions

  • print_data_minmax(ds_variable, desc)

  • from_dis_gimme_dis(from_dis, gimme_dis)

  • find_statics_file(bronx_file_path)

  • find_gold_ocean_statics_file(put_copy_here)

  • create_lev_bnds(bound_these, with_these)

  • get_iso_datetime_ranges(var_filenames, iso_daterange_arr, start, stop)

  • check_dataset_for_ocean_grid(ds)

  • get_vertical_dimension(ds, target_var)

  • create_tmp_dir(outdir, json_exp_config)

  • get_json_file_data(json_file_path)

  • update_grid_and_label(json_file_path, new_grid_label, new_grid, new_nom_res, output_file_path)

  • update_calendar_type(json_file_path, new_calendar_type, output_file_path)

  • check_path_existence(some_path)

  • iso_to_bronx_chunk(cmor_chunk_in)

  • conv_mip_to_bronx_freq(cmor_table_freq)

  • get_bronx_freq_from_mip_table(json_table_config)

  • filter_brands(brands, target_var, mip_var_cfgs, has_time_bnds, input_vert_dim)

Notes

These functions aim to encapsulate frequently repeated logic in the CMOR workflow, improving code readability, maintainability, and robustness.

fre.cmor.cmor_helpers.check_dataset_for_ocean_grid(ds: netCDF4.Dataset) → bool

Check if a netCDF4.Dataset uses an ocean grid (i.e., contains ‘xh’ or ‘yh’ variables).

Parameters:

ds (netCDF4.Dataset) – Dataset to be checked.

Returns:

True if ocean grid variables are present, otherwise False.

Return type:

bool

Note

Logs a warning if an ocean grid is detected.

fre.cmor.cmor_helpers.check_path_existence(some_path: str)

Check if the given path exists, raising FileNotFoundError if not.

Parameters:

some_path (str) – A string representing a filesystem path (relative or absolute).

Raises:

FileNotFoundError – If the path does not exist.

fre.cmor.cmor_helpers.conv_mip_to_bronx_freq(cmor_table_freq: str) → str | None

Convert a MIP table frequency string to its FRE-bronx equivalent using a lookup table.

Parameters:

cmor_table_freq (str) – Frequency string as found in a MIP table (e.g., ‘mon’, ‘day’, ‘yr’, etc.).

Raises:

KeyError – If the frequency string is not recognized as valid.

Returns:

FRE-bronx frequency string, or None if the frequency is recognized but has no FRE-bronx equivalent.

Return type:

str or None
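
Example

The difference between the KeyError and the None return can be sketched as below. This is a minimal illustration, not the module's code: the function name, the table name, and every table entry are assumptions made for the example, since the real lookup table is not reproduced in this document.

```python
from typing import Optional

# Illustrative subset only. These entries are assumptions for the sketch,
# not a copy of the lookup table used by fre.cmor.cmor_helpers.
MIP_TO_BRONX_SKETCH = {
    "mon": "monthly",
    "day": "daily",
    "yr": "annual",
    "fx": None,  # recognized frequency with no FRE-bronx equivalent
}


def conv_mip_to_bronx_freq_sketch(cmor_table_freq: str) -> Optional[str]:
    """Hypothetical stand-in for a table-driven frequency conversion."""
    if cmor_table_freq not in MIP_TO_BRONX_SKETCH:
        # Unrecognized frequency strings are an error, not a None return.
        raise KeyError(f"unrecognized MIP frequency: {cmor_table_freq!r}")
    return MIP_TO_BRONX_SKETCH[cmor_table_freq]


print(conv_mip_to_bronx_freq_sketch("mon"))  # monthly
print(conv_mip_to_bronx_freq_sketch("fx"))   # None
```

Callers that need a usable frequency should therefore check for None as well as catching KeyError.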

fre.cmor.cmor_helpers.create_lev_bnds(bound_these: netCDF4.Variable = None, with_these: netCDF4.Variable = None) → numpy.ndarray

Create a vertical level bounds array for a set of levels.

Parameters:
  • bound_these (netCDF4.Variable) – netCDF4 Variable with a numpy array representing vertical levels

  • with_these (netCDF4.Variable) – netCDF4 Variable with a numpy array representing level bounds, one longer than bound_these

Raises:

ValueError – If the length of with_these is not len(bound_these) + 1.

Returns:

Array of shape (len(bound_these), 2), where each row gives the bounds for a level.

Return type:

np.ndarray

Note

Logs debug information about the input and output arrays.
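
Example

The shape contract above can be sketched with plain Python lists. This assumes that row i pairs the i-th and (i+1)-th entries of with_these, which the description implies but does not state outright. The function name is hypothetical, and the documented function operates on netCDF4 variables and returns a numpy array rather than nested lists.

```python
def create_lev_bnds_sketch(bound_these, with_these):
    """Hypothetical stand-in: pair consecutive edges into per-level bounds."""
    if len(with_these) != len(bound_these) + 1:
        raise ValueError(
            "with_these must have exactly one more entry than bound_these"
        )
    # Assumption: level i is bounded below by edge i and above by edge i + 1.
    return [[with_these[i], with_these[i + 1]] for i in range(len(bound_these))]


levels = [5.0, 15.0, 25.0]
edges = [0.0, 10.0, 20.0, 30.0]
print(create_lev_bnds_sketch(levels, edges))
# [[0.0, 10.0], [10.0, 20.0], [20.0, 30.0]]
```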

fre.cmor.cmor_helpers.create_tmp_dir(outdir: str, json_exp_config: str | None = None) → str

Create a temporary directory for output under outdir, optionally adding a subdirectory taken from the “outpath” key of a JSON experiment config.

Parameters:
  • outdir (str) – Base output directory.

  • json_exp_config (str, optional) – Path to a JSON config file with an “outpath” key.

Raises:

OSError – If the temporary directory cannot be created.

Returns:

Path to the created temporary directory.

Return type:

str

Note

If json_exp_config is provided and contains “outpath”, a subdirectory is also created.

fre.cmor.cmor_helpers.filter_brands(brands: list, target_var: str, mip_var_cfgs: dict, has_time_bnds: bool, input_vert_dim: str | int) → str

Disambiguate multiple CMIP7 variable brands by comparing input data properties against each candidate brand’s MIP dimension list.

Two filters are applied in sequence:

  1. Time type: The presence or absence of time bounds in the input data is compared to whether the brand’s MIP dimensions contain time (time-mean, has bounds) or time1 (instantaneous, no bounds).

  2. Vertical coordinate: The input data’s vertical dimension name is mapped to the corresponding MIP dimension name (via INPUT_TO_MIP_VERT_DIM) and brands whose MIP dimensions do not include it are excluded.

Parameters:
  • brands (list[str]) – List of candidate brand strings to filter.

  • target_var (str) – The base variable name (before the brand suffix).

  • mip_var_cfgs (dict) – The full MIP table config dict (must contain "variable_entry").

  • has_time_bnds (bool) – Whether the input dataset contains time_bnds.

  • input_vert_dim (str or int) – The vertical dimension name from the input dataset, or 0 if no vertical dimension is present.

Raises:

ValueError – If zero or more than one brand survives filtering.

Returns:

The single brand string that survived disambiguation.

Return type:

str
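
Example

The two-stage filter described above might look roughly like the sketch below. Several details are assumptions made for illustration and are not confirmed by this document: that entries under "variable_entry" are keyed as "<target_var>_<brand>", that each entry carries a "dimensions" list, the contents of the vertical-dimension mapping, and that the vertical filter is skipped when the input has no vertical dimension. The brand names are made up.

```python
# Hypothetical mapping; the real INPUT_TO_MIP_VERT_DIM contents are not shown here.
INPUT_TO_MIP_VERT_DIM_SKETCH = {"z_l": "olevel", "pfull": "alevel"}


def filter_brands_sketch(brands, target_var, mip_var_cfgs, has_time_bnds, input_vert_dim):
    """Hypothetical stand-in for the time-type and vertical-coordinate filters."""
    # Filter 1: time bounds imply a time-mean ("time"); none imply "time1".
    wanted_time = "time" if has_time_bnds else "time1"
    # Filter 2: map the input vertical dimension to its MIP name, if there is one.
    wanted_vert = None
    if input_vert_dim != 0:
        wanted_vert = INPUT_TO_MIP_VERT_DIM_SKETCH.get(input_vert_dim)

    survivors = []
    for brand in brands:
        # Assumption: "dimensions" is a list, so membership tests are exact
        # ("time" does not match "time1").
        entry = mip_var_cfgs["variable_entry"][f"{target_var}_{brand}"]
        dims = entry["dimensions"]
        if wanted_time not in dims:
            continue
        if wanted_vert is not None and wanted_vert not in dims:
            continue
        survivors.append(brand)

    if len(survivors) != 1:
        raise ValueError(f"expected exactly one brand, got {survivors}")
    return survivors[0]


cfg = {
    "variable_entry": {
        "thetao_brand-mean": {"dimensions": ["longitude", "latitude", "olevel", "time"]},
        "thetao_brand-inst": {"dimensions": ["longitude", "latitude", "olevel", "time1"]},
    }
}
print(filter_brands_sketch(["brand-mean", "brand-inst"], "thetao", cfg, True, "z_l"))
# brand-mean
```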

fre.cmor.cmor_helpers.find_gold_ocean_statics_file(put_copy_here: str | None = None) → str | None

Locate (and if necessary copy) the gold-standard OM5_025 ocean_static.nc file from the GFDL archive into a user-writable directory.

Parameters:

put_copy_here (str or None) – Root directory under which the archive sub-path is mirrored and into which the file is copied.

Returns:

Absolute path to the local working copy of ocean_static.nc, or None if the file could not be obtained.

Return type:

str or None

Note

The archive path is hard-coded to the OM5_025 dataset on GFDL systems.

fre.cmor.cmor_helpers.find_statics_file(bronx_file_path: str) → str | None

Attempt to find the statics file that corresponds to a FRE-bronx output file, given the path to that output file. The function assumes the output file sits inside a FRE-bronx directory structure. That structure is mocked in this package under fre/tests/test_files/ascii_files/mock_archive; running the tree command in that directory shows a layout like:

<STEM>/<EXP_NAME>/<PLATFORM>-<TARGET>/

└── pp
    ├── component
        ├── realm_frequency.static.nc
        └── ts
            └── frequency
                └── chunk_size
                    └── component.YYYYMM-YYYYMM.var.nc

Parameters:

bronx_file_path (str) – File path to use as a reference for statics file location.

Returns:

Path to the statics file if found, else None.

Return type:

str or None

Note

The function searches upward in the directory structure until it finds a ‘pp’ directory, then globs for ‘static.nc’ files.
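
Example

The upward search in the note can be sketched with pathlib. This is an approximation under stated assumptions: the function name is hypothetical, the climb stops at the directory directly below ‘pp’ (the component directory in the layout above), and the glob pattern "*static*.nc" is a guess at what "globs for ‘static.nc’ files" means.

```python
from pathlib import Path
from typing import Optional


def find_statics_file_sketch(bronx_file_path: str) -> Optional[str]:
    """Hypothetical stand-in: climb to the directory below 'pp', then glob."""
    current = Path(bronx_file_path)
    # Climb until the parent directory is named 'pp'.
    while current.parent.name != "pp":
        if current.parent == current:
            # Reached the top of the path without ever seeing 'pp'.
            return None
        current = current.parent
    # Assumption: any file matching *static*.nc in this directory qualifies.
    matches = sorted(current.glob("*static*.nc"))
    return str(matches[0]) if matches else None
```

Given a path such as <STEM>/<EXP_NAME>/<PLATFORM>-<TARGET>/pp/component/ts/frequency/chunk_size/component.YYYYMM-YYYYMM.var.nc, the sketch climbs to pp/component and returns the first static file found there, or None if there is none.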

fre.cmor.cmor_helpers.from_dis_gimme_dis(from_dis: netCDF4.Dataset, gimme_dis: str) → numpy.ndarray | None

Retrieve and return a copy of a variable from a netCDF4.Dataset-like object.

Parameters:
  • from_dis (netCDF4.Dataset) – The source dataset object.

  • gimme_dis (str) – The variable name to extract from the dataset.

Returns:

A copy of the requested variable’s data, or None if not found.

Return type:

np.ndarray or None

Note

Logs a warning if the variable is not found. The name comes from a hypothetical pronunciation of ‘ds’, the common moniker for a netCDF4.Dataset object.

fre.cmor.cmor_helpers.get_bronx_freq_from_mip_table(json_table_config: str) → str

Extract the frequency of data from a CMIP MIP table (JSON), returning its FRE-bronx equivalent.

Parameters:

json_table_config (str) – Path to a JSON MIP table file with ‘variable_entry’ metadata.

Raises:

KeyError – If the frequency cannot be found or mapped.

Returns:

FRE-bronx frequency string.

Return type:

str

fre.cmor.cmor_helpers.get_iso_datetime_ranges(var_filenames: List[str], iso_daterange_arr: List[str] | None = None, start: str | None = None, stop: str | None = None) → None

Extract and append ISO datetime ranges from filenames, filtered by start/stop years if specified.

Parameters:
  • var_filenames (list of str) – Filenames, some of which contain ISO datetime ranges (e.g. ‘YYYYMMDD-YYYYMMDD’).

  • iso_daterange_arr (list of str) – List to append found datetime ranges to; modified in-place.

  • start (str, optional) – Start year in ‘YYYY’ format; only ranges within/after this year are included.

  • stop (str, optional) – Stop year in ‘YYYY’ format; only ranges within/before this year are included.

Raises:

ValueError – If iso_daterange_arr is not provided or if no datetime ranges are found.

Returns:

None

Return type:

None

Note

This function modifies iso_daterange_arr in-place.
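
Example

The in-place, year-filtered collection described above can be sketched with a regular expression. The function name, the pattern, the duplicate check, and the exact comparison rules (range start year not before start, range end year not after stop) are assumptions made for the example.

```python
import re

# Assumption: a range is two digit runs joined by '-', each starting with a 4-digit year.
RANGE_PATTERN_SKETCH = re.compile(r"(\d{4})\d*-(\d{4})\d*")


def get_iso_datetime_ranges_sketch(var_filenames, iso_daterange_arr=None,
                                   start=None, stop=None):
    """Hypothetical stand-in: append matching ranges to iso_daterange_arr."""
    if iso_daterange_arr is None:
        raise ValueError("iso_daterange_arr must be a list to append to")
    for filename in var_filenames:
        match = RANGE_PATTERN_SKETCH.search(filename)
        if match is None:
            continue  # e.g. a statics file with no date range in its name
        first_year, last_year = int(match.group(1)), int(match.group(2))
        if start is not None and first_year < int(start):
            continue
        if stop is not None and last_year > int(stop):
            continue
        if match.group(0) not in iso_daterange_arr:
            iso_daterange_arr.append(match.group(0))
    if not iso_daterange_arr:
        raise ValueError("no datetime ranges found")


names = [
    "ocean.19500101-19541231.so.nc",
    "ocean.19550101-19591231.so.nc",
    "ocean.static.nc",
]
ranges = []
get_iso_datetime_ranges_sketch(names, ranges, start="1955")
print(ranges)  # ['19550101-19591231']
```

Because the function returns None, the caller keeps its own reference to the list and reads the results from it afterwards.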

fre.cmor.cmor_helpers.get_json_file_data(json_file_path: str | None = None) → dict

Load and return the contents of a JSON file.

Parameters:

json_file_path (str) – Path to the JSON file.

Raises:

FileNotFoundError – If the file cannot be opened.

Returns:

Parsed data from the JSON file.

Return type:

dict

fre.cmor.cmor_helpers.get_vertical_dimension(ds: netCDF4.Dataset, target_var: str) → str | int

Determine the vertical dimension for a variable in a netCDF4.Dataset.

Parameters:
  • ds (netCDF4.Dataset) – Dataset containing variables.

  • target_var (str) – Name of the variable to inspect.

Returns:

Name of the vertical dimension if found, otherwise 0.

Return type:

str or int

Note

Returns 0 if no vertical dimension is detected.

fre.cmor.cmor_helpers.iso_to_bronx_chunk(cmor_chunk_in: str) → str

Convert an ISO8601 duration string (e.g., ‘P5Y’) to FRE-bronx-style chunk string (e.g., ‘5yr’).

Parameters:

cmor_chunk_in (str) – ISO8601 formatted string representing a time interval (must start with ‘P’ and end with ‘Y’).

Raises:

ValueError – If the input does not follow the expected ISO format.

Returns:

FRE-bronx chunk string.

Return type:

str
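
Example

The conversion is small enough to sketch in a few lines. This is an illustration under the assumption that only whole-year durations of the form P<digits>Y are accepted; the function name is hypothetical and the validation here may be stricter or looser than the module's.

```python
import re


def iso_to_bronx_chunk_sketch(cmor_chunk_in: str) -> str:
    """Hypothetical stand-in: convert 'P<N>Y' to '<N>yr'."""
    # Assumption: the duration is 'P', one or more digits, then 'Y'.
    match = re.fullmatch(r"P(\d+)Y", cmor_chunk_in)
    if match is None:
        raise ValueError(
            f"expected an ISO8601 duration like 'P5Y', got {cmor_chunk_in!r}"
        )
    return f"{match.group(1)}yr"


print(iso_to_bronx_chunk_sketch("P5Y"))  # 5yr
```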

fre.cmor.cmor_helpers.print_data_minmax(ds_variable: numpy.ma.core.MaskedArray | None = None, desc: str | None = None) → None

Log the minimum and maximum values of a numpy MaskedArray along with a description.

Parameters:
  • ds_variable (numpy.ma.core.MaskedArray, optional) – The data array whose min/max is to be logged.

  • desc (str, optional) – Description of the data.

Returns:

None

Return type:

None

Note

If the data cannot be logged, a warning is issued.

fre.cmor.cmor_helpers.update_calendar_type(json_file_path: str, new_calendar_type: str, output_file_path: str | None = None) → None

Update the “calendar” field in a JSON experiment config file.

Parameters:
  • json_file_path (str) – Path to the input JSON file.

  • new_calendar_type (str) – New value for the “calendar” field.

  • output_file_path (str, optional) – Path to save the updated JSON file. If None, overwrites the original file.

Raises:
  • FileNotFoundError – If the input JSON file does not exist.

  • KeyError – If the “calendar” field is not found in the JSON file.

  • ValueError – If new_calendar_type is None.

  • json.JSONDecodeError – If the JSON file cannot be decoded.

Returns:

None

Return type:

None

Note

The function logs before and after values, and overwrites the input file unless an output path is given.
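
Example

The read, validate, update, and write sequence implied by the parameters and exceptions above can be sketched with the standard json module. The function name is hypothetical, logging is omitted, and the order of the checks and the output indentation are assumptions.

```python
import json


def update_calendar_type_sketch(json_file_path, new_calendar_type, output_file_path=None):
    """Hypothetical stand-in: rewrite the "calendar" field of a JSON config."""
    if new_calendar_type is None:
        raise ValueError("new_calendar_type must not be None")
    # open() raises FileNotFoundError for a missing file;
    # json.load() raises json.JSONDecodeError for malformed content.
    with open(json_file_path, "r", encoding="utf-8") as infile:
        data = json.load(infile)
    if "calendar" not in data:
        raise KeyError("calendar")
    data["calendar"] = new_calendar_type
    # Overwrite the input file unless a separate output path is given.
    target = output_file_path if output_file_path is not None else json_file_path
    with open(target, "w", encoding="utf-8") as outfile:
        json.dump(data, outfile, indent=4)
```

Passing output_file_path leaves the original config untouched, which is the safer choice when the same experiment config is shared between runs.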

fre.cmor.cmor_helpers.update_grid_and_label(json_file_path: str, new_grid_label: str, new_grid: str, new_nom_res: str, output_file_path: str | None = None) → None

Update the “grid_label”, “grid”, and “nominal_resolution” fields in a JSON experiment config.

Parameters:
  • json_file_path (str) – Path to the input JSON file.

  • new_grid_label (str) – New value for the “grid_label” field.

  • new_grid (str) – New value for the “grid” field.

  • new_nom_res (str) – New value for the “nominal_resolution” field.

  • output_file_path (str, optional) – Path to save the updated JSON file. If None, overwrites the original file.

Raises:
  • FileNotFoundError – If the input JSON file does not exist.

  • KeyError – If a required field is not found in the JSON file.

  • ValueError – If any input value is None.

  • json.JSONDecodeError – If the JSON file cannot be decoded.

Returns:

None

Return type:

None

Note

The function logs before and after values, and overwrites the input file unless an output path is given.