fre.pp.split_netcdf_script module

fre.pp.split_netcdf_script.fre_outfile_name(infile, varname)

Builds split-variable filenames the way that fre expects them (and in a way that should work for any .nc file).

This is expected to work with files named in the following way:

  • Fre Input format: date.component(.tileX).nc

  • Fre Output format: date.component.var(.tileX).nc

but it should also work on any file named filename.nc.

Parameters:
  • infile (string) – name of a file with a . somewhere in the filename

  • varname (string) – string to add to the infile

Returns:

new filename

Return type:

string
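
The naming rule above can be illustrated with a small sketch. This is not the module's implementation: `sketch_outfile_name` is a hypothetical helper, and it assumes the optional tile field always looks like `tileN` and sits directly before the `.nc` extension.

```python
import re


def sketch_outfile_name(infile, varname):
    """Insert varname before the optional .tileX field, or before .nc otherwise.

    Hypothetical illustration of the documented naming pattern, not fre's code.
    """
    fields = infile.split(".")
    if len(fields) < 2:
        raise ValueError("expected a filename containing at least one '.'")
    # Assumption: a 'tileN' field just before the extension marks a tiled file.
    if len(fields) >= 3 and re.fullmatch(r"tile\d+", fields[-2]):
        insert_at = len(fields) - 2
    else:
        insert_at = len(fields) - 1
    return ".".join(fields[:insert_at] + [varname] + fields[insert_at:])


print(sketch_outfile_name("19790101.atmos_tracer.tile6.nc", "tas"))
print(sketch_outfile_name("filename.nc", "tas"))
```

Under these assumptions the tiled name becomes `19790101.atmos_tracer.tas.tile6.nc` and the plain name becomes `filename.tas.nc`.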

fre.pp.split_netcdf_script.get_max_ndims(dataset)

Gets the maximum number of dimensions of a single var in an xarray Dataset object. Excludes coord vars, which should be single-dim anyway.

Parameters:

dataset (xarray Dataset) – xarray Dataset you want to query

Returns:

The maximum number of dimensions of any single var in the Dataset

Return type:

int
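
As a rough sketch of the idea (not the module's code), the same quantity can be computed from `Dataset.data_vars`, which leaves out coordinate variables. The helper name `sketch_max_ndims` and the toy dataset are hypothetical, and returning 0 for a Dataset with no data variables is an assumption of this sketch.

```python
import numpy as np
import xarray as xr


def sketch_max_ndims(dataset):
    """Largest number of dimensions among the data variables (coords excluded)."""
    # Dataset.data_vars omits coordinate variables such as time, lat, lon.
    # Assumption: return 0 when the Dataset has no data variables at all.
    return max((dataset[name].ndim for name in dataset.data_vars), default=0)


# Toy dataset: one 3-D variable, one 1-D variable, three 1-D coordinates.
ds = xr.Dataset(
    data_vars={
        "tas": (("time", "lat", "lon"), np.zeros((2, 3, 4))),
        "ps": (("time",), np.zeros(2)),
    },
    coords={"time": [0, 1], "lat": [0.0, 1.0, 2.0], "lon": [0.0, 1.0, 2.0, 3.0]},
)
print(sketch_max_ndims(ds))
```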

fre.pp.split_netcdf_script.set_coord_encoding(dset, vcoords)

Gets the encoding settings needed for xarray to write out the coordinates as expected. We need the list of all vars (varnames) because that’s how you get coords for the metadata vars (e.g. nv or bnds for time_bnds).

Parameters:
  • dset (xarray Dataset object) – xarray Dataset object to query for info

  • vcoords (list of strings) – list of coordinate variables to write to file

Returns:

A dictionary keyed by coordinate name. Each value is a dictionary holding the encoding information from that coordinate variable in the Dataset, plus the units (if present).

Return type:

dict

Note

This code removes _FillValue from coordinates. CF-compliant files do not have _FillValue on coordinates, and xarray does not have a good way to get _FillValue from coordinates. Letting xarray set _FillValue for coordinates when coordinates have a _FillValue gets you wrong metadata, and bad metadata is worse than no metadata. Dropping the attribute if it’s present seems to be the lesser of two evils.
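
One way to get this effect in xarray is to set `_FillValue` to `None` in a variable's encoding, which tells xarray not to write the attribute. The sketch below illustrates that approach only; it is not necessarily how this function does it. `sketch_coord_encoding` and the toy dataset are hypothetical, and the units handling described above is left out.

```python
import numpy as np
import xarray as xr


def sketch_coord_encoding(dset, vcoords):
    """Build a per-coordinate encoding dict that suppresses _FillValue."""
    encoding = {}
    for name in vcoords:
        if name not in dset.variables:
            continue  # assumption: silently skip names missing from the Dataset
        # Start from whatever encoding xarray recorded when the file was read.
        settings = dict(dset[name].encoding)
        # _FillValue=None asks xarray not to write a _FillValue attribute.
        settings["_FillValue"] = None
        encoding[name] = settings
    return encoding


ds = xr.Dataset(
    data_vars={"tas": (("lat",), np.zeros(3))},
    coords={"lat": [0.0, 1.0, 2.0]},
)
coord_encoding = sketch_coord_encoding(ds, ["lat", "not_a_coord"])
print(coord_encoding)
```

A dictionary of this shape can be passed as the `encoding` argument of `Dataset.to_netcdf`.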

fre.pp.split_netcdf_script.set_var_encoding(dset, varnames)

Gets the encoding settings needed for xarray to write out the variables as expected.

Mostly addresses time_bnds, because xarray can drop the units attribute.

Parameters:
  • dset (xarray Dataset object) – xarray Dataset object to query for info

  • varnames (list of strings) – list of variables that will be written to file

Returns:

A dictionary of the form {var1: {encodekey1: encodeval1, encodekey2: encodeval2, …}}

Return type:

dict

fre.pp.split_netcdf_script.split_file_xarray(infile, outfiledir, var_list='all')

Given a netcdf infile containing one or more data variables, writes out a separate file for each data variable in the file, including the variable name in the filename. If var_list is specified, only the vars in var_list are written to file; if no vars in the file match the vars in var_list, no files are written.

Parameters:
  • infile (string) – input netcdf file

  • outfiledir (string) – writeable directory to which to write netcdf files

  • var_list (list of strings) – python list of string variable names or a string “all”
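
The var_list behavior described above can be sketched in a few lines. This only illustrates the selection rule, not the module's code: `sketch_select_vars` is a hypothetical helper that takes the names of the data variables found in the file and returns the ones that would be written.

```python
def sketch_select_vars(available, var_list="all"):
    """Return the data variable names that would each get their own file."""
    if var_list == "all":
        return list(available)
    wanted = set(var_list)
    # Keep the file's own ordering; names absent from the file are ignored.
    return [name for name in available if name in wanted]


in_file = ["tas", "ps", "pr"]
print(sketch_select_vars(in_file))                   # every data variable
print(sketch_select_vars(in_file, ["pr", "huss"]))   # only the overlap
print(sketch_select_vars(in_file, ["huss"]))         # no match, so no files
```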

fre.pp.split_netcdf_script.split_netcdf(inputDir, outputDir, component, history_source, use_subdirs, yamlfile, split_all_vars=False)

Given a directory of netcdf files, splits those netcdf files into separate files for each data variable and copies the data variable files of interest to the output directory

Intended to work with data structured for fre-workflows and with fre-workflows file naming conventions. Sample infile name: “19790101.atmos_tracer.tile6.nc”

Parameters:
  • inputDir (string) – directory containing netcdf files

  • outputDir (string) – directory to which to write netcdf files

  • component (string) – the ‘component’ element we are currently working with in the yaml

  • history_source (string) – a history_file under a ‘source’ under the ‘component’ that we are working with. Is used to identify the files in inputDir.

  • use_subdirs (boolean) – whether to recursively search through inputDir under the subdirectories. Used when regridding.

  • yamlfile (string) – a .yml config file for fre postprocessing

  • split_all_vars (boolean) – Whether to skip parsing the yamlfile and split all available vars in the file. Defaults to False.
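
To make the roles of history_source and use_subdirs more concrete, here is a minimal sketch of locating candidate input files. It is not the module's logic: `find_history_files` is a hypothetical helper, and it assumes names follow the sample convention above, with the history source as the second dot-separated field.

```python
from pathlib import Path


def find_history_files(input_dir, history_source, use_subdirs=False):
    """Return sorted .nc paths whose second dot-separated field is history_source."""
    root = Path(input_dir)
    # rglob descends into subdirectories; glob stays at the top level.
    candidates = root.rglob("*.nc") if use_subdirs else root.glob("*.nc")
    matches = []
    for path in candidates:
        fields = path.name.split(".")
        # Assumption: names look like date.component(.tileX).nc
        if len(fields) >= 3 and fields[1] == history_source:
            matches.append(path)
    return sorted(matches)
```

With use_subdirs=False only files directly inside inputDir are considered; with use_subdirs=True the search also covers nested directories.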