v0.0.1_alpha01

CONUS404 Regridding (Curvilinear => Rectilinear)#

Create a rectilinear grid (1D lon/lat coordinates) for a specific region, then extract a spatial and temporal subset of the regridded data to a NetCDF file. (Extraction to NetCDF may also be done on the original curvilinear grid.)

%%time
import xarray as xr
import xesmf as xe
import numpy as np
import fsspec
import hvplot.xarray
import geoviews as gv
from matplotlib import path 
import intake
import os
CPU times: user 4.73 s, sys: 524 ms, total: 5.25 s
Wall time: 6.71 s

Open dataset from Intake Catalog#

  • Select on-prem dataset from /caldera if running on prem (Denali/Tallgrass)

  • Select cloud/osn object store data if running elsewhere (a minimal selection sketch follows this list)
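A minimal sketch of how one might pick between the on-prem and OSN catalog entries automatically. The /caldera path check is only an assumption; adjust it to your environment, and leave the lines commented if you prefer to set the dataset name by hand as done below.

# hypothetical environment check: use the on-prem entry if /caldera is mounted
# on_prem = os.path.exists('/caldera')
# dataset = 'conus404-hourly-onprem' if on_prem else 'conus404-hourly-osn'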

# open the hytest data intake catalog
hytest_cat = intake.open_catalog("https://raw.githubusercontent.com/hytest-org/hytest/main/dataset_catalog/hytest_intake_catalog.yml")
list(hytest_cat)
['conus404-catalog',
 'conus404-drb-eval-tutorial-catalog',
 'nhm-v1.0-daymet-catalog',
 'nhm-v1.1-c404-bc-catalog',
 'nhm-v1.1-gridmet-catalog',
 'trends-and-drivers-catalog',
 'nhm-prms-v1.1-gridmet-format-testing-catalog',
 'nwis-streamflow-usgs-gages-onprem',
 'nwis-streamflow-usgs-gages-osn',
 'nwm21-streamflow-usgs-gages-onprem',
 'nwm21-streamflow-usgs-gages-osn',
 'nwm21-streamflow-cloud',
 'geofabric_v1_1-zip-osn',
 'geofabric_v1_1_POIs_v1_1-osn',
 'geofabric_v1_1_TBtoGFv1_POIs-osn',
 'geofabric_v1_1_nhru_v1_1-osn',
 'geofabric_v1_1_nhru_v1_1_simp-osn',
 'geofabric_v1_1_nsegment_v1_1-osn',
 'gages2_nndar-osn',
 'wbd-zip-osn',
 'huc12-geoparquet-osn',
 'huc12-gpkg-osn',
 'nwm21-scores',
 'lcmap-cloud',
 'rechunking-tutorial-osn',
 'pointsample-tutorial-sites-osn',
 'pointsample-tutorial-output-osn']
# open the conus404 sub-catalog
cat = hytest_cat['conus404-catalog']
list(cat)
['conus404-hourly-onprem',
 'conus404-hourly-cloud',
 'conus404-hourly-osn',
 'conus404-daily-diagnostic-onprem',
 'conus404-daily-diagnostic-cloud',
 'conus404-daily-diagnostic-osn',
 'conus404-daily-onprem',
 'conus404-daily-cloud',
 'conus404-daily-osn',
 'conus404-monthly-onprem',
 'conus404-monthly-cloud',
 'conus404-monthly-osn',
 'conus404-hourly-ba-osn',
 'conus404-daily-ba-osn']
## Select the dataset you want to read into your notebook and preview its metadata
dataset = 'conus404-hourly-osn' 
cat[dataset]
conus404-hourly-osn:
  args:
    consolidated: true
    storage_options:
      anon: true
      client_kwargs:
        endpoint_url: https://usgs.osn.mghpcc.org/
      requester_pays: false
    urlpath: s3://hytest/conus404/conus404_hourly.zarr
  description: 'CONUS404 Hydro Variable subset, 40 years of hourly values. These files
    were created wrfout model output files (see ScienceBase data release for more
    details: https://www.sciencebase.gov/catalog/item/6372cd09d34ed907bf6c6ab1). You
    can work with this data for free in any environment (there are no egress fees).'
  driver: intake_xarray.xzarr.ZarrSource
  metadata:
    catalog_dir: https://raw.githubusercontent.com/hytest-org/hytest/main/dataset_catalog/subcatalogs

Set Up AWS Credentials (Optional)#

This notebook reads data from the OSN pod by default; that object store sits on a high-speed internet connection and is free to access from any environment. If you change this notebook to use one of the CONUS404 datasets stored on S3 (options ending in -cloud), you will be pulling data from a requester-pays S3 bucket. This means you must set up your AWS credentials, or you will not be able to load the data. Please note that reading the -cloud data from S3 may incur charges if you are reading data outside of the us-west-2 region or running the notebook outside of the cloud altogether. If you would like to access one of the -cloud options, uncomment and run the following code snippet to set up your AWS credentials. You can find more info about this AWS helper function here.

# uncomment the lines below to read in your AWS credentials if you want to access data from a requester-pays bucket (-cloud)
# os.environ['AWS_PROFILE'] = 'default'
# %run ../environment_set_up/Help_AWS_Credentials.ipynb

Parallelize with Dask#

Some of the steps we will take can make use of a parallel compute cluster via Dask. We start a cluster now so that later steps can take advantage of it.

This is an optional step, but it speeds up data loading significantly, especially when accessing data from the cloud.

We have documentation on how to start a Dask Cluster in different computing environments here.
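If none of the helper notebooks referenced in the next cell fits your environment, a minimal local cluster can be started directly with dask.distributed. This is only a sketch of roughly what those helpers do; the worker count and memory limit below are assumptions to tune for your machine, and the lines are left commented so they do not conflict with the helper that is actually run.

# from dask.distributed import Client, LocalCluster
# cluster = LocalCluster(n_workers=4, threads_per_worker=2, memory_limit='4GiB')
# client = Client(cluster)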

%run ../environment_set_up/Start_Dask_Cluster_Nebari.ipynb
## If this notebook is not being run on Nebari/ESIP, replace the above 
## path name with a helper appropriate to your compute environment.  Examples:
# %run ../environment_set_up/Start_Dask_Cluster_Denali.ipynb
# %run ../environment_set_up/Start_Dask_Cluster_Tallgrass.ipynb
# %run ../environment_set_up/Start_Dask_Cluster_Desktop.ipynb
# %run ../environment_set_up/Start_Dask_Cluster_PangeoCHS.ipynb
The 'cluster' object can be used to adjust cluster behavior.  i.e. 'cluster.adapt(minimum=10)'
The 'client' object can be used to directly interact with the cluster.  i.e. 'client.submit(func)' 
The link to view the client dashboard is:
>  https://nebari.dev-wma.chs.usgs.gov/gateway/clusters/dev.d9894f590aa84ed5addca9fa34bf0875/status
ds = cat[dataset].to_dask()
ds
<xarray.Dataset> Size: 222TB
Dimensions:         (time: 376945, y: 1015, x: 1367, bottom_top_stag: 51,
                     bottom_top: 50, soil_layers_stag: 4, x_stag: 1368,
                     y_stag: 1016, snow_layers_stag: 3, snso_layers_stag: 7)
Coordinates:
    lat             (y, x) float32 6MB dask.array<chunksize=(175, 175), meta=np.ndarray>
    lat_u           (y, x_stag) float32 6MB dask.array<chunksize=(175, 175), meta=np.ndarray>
    lat_v           (y_stag, x) float32 6MB dask.array<chunksize=(175, 175), meta=np.ndarray>
    lon             (y, x) float32 6MB dask.array<chunksize=(175, 175), meta=np.ndarray>
    lon_u           (y, x_stag) float32 6MB dask.array<chunksize=(175, 175), meta=np.ndarray>
    lon_v           (y_stag, x) float32 6MB dask.array<chunksize=(175, 175), meta=np.ndarray>
  * time            (time) datetime64[ns] 3MB 1979-10-01 ... 2022-10-01
  * x               (x) float64 11kB -2.732e+06 -2.728e+06 ... 2.732e+06
  * y               (y) float64 8kB -2.028e+06 -2.024e+06 ... 2.028e+06
Dimensions without coordinates: bottom_top_stag, bottom_top, soil_layers_stag,
                                x_stag, y_stag, snow_layers_stag,
                                snso_layers_stag
Data variables: (12/153)
    ACDEWC          (time, y, x) float32 2TB dask.array<chunksize=(144, 175, 175), meta=np.ndarray>
    ACDRIPR         (time, y, x) float32 2TB dask.array<chunksize=(144, 175, 175), meta=np.ndarray>
    ACDRIPS         (time, y, x) float32 2TB dask.array<chunksize=(144, 175, 175), meta=np.ndarray>
    ACECAN          (time, y, x) float32 2TB dask.array<chunksize=(144, 175, 175), meta=np.ndarray>
    ACEDIR          (time, y, x) float32 2TB dask.array<chunksize=(144, 175, 175), meta=np.ndarray>
    ACETLSM         (time, y, x) float32 2TB dask.array<chunksize=(144, 175, 175), meta=np.ndarray>
    ...              ...
    ZNU             (bottom_top) float32 200B dask.array<chunksize=(50,), meta=np.ndarray>
    ZNW             (bottom_top_stag) float32 204B dask.array<chunksize=(51,), meta=np.ndarray>
    ZS              (soil_layers_stag) float32 16B dask.array<chunksize=(4,), meta=np.ndarray>
    ZSNSO           (time, snso_layers_stag, y, x) float32 15TB dask.array<chunksize=(144, 7, 175, 175), meta=np.ndarray>
    ZWT             (time, y, x) float32 2TB dask.array<chunksize=(144, 175, 175), meta=np.ndarray>
    crs             int64 8B ...
Attributes: (12/148)
    AER_ANGEXP_OPT:                  1
    AER_ANGEXP_VAL:                  1.2999999523162842
    AER_AOD550_OPT:                  1
    AER_AOD550_VAL:                  0.11999999731779099
    AER_ASY_OPT:                     1
    AER_ASY_VAL:                     0.8999999761581421
    ...                              ...
    WEST-EAST_PATCH_START_STAG:      1
    WEST-EAST_PATCH_START_UNSTAG:    1
    W_DAMPING:                       1
    YSU_TOPDOWN_PBLMIX:              0
    history:                         Tue Mar 29 16:35:22 2022: ncrcat -A -vW ...
    history_of_appended_files:       Tue Mar 29 16:35:22 2022: Appended file ...
nc_outfile = 'CONUS404_DRB_rectilinear.nc'   # output NetCDF file name
bbox = [-75.9, -74.45, 38.7, 42.55]          # [lon_min, lon_max, lat_min, lat_max] for the Delaware River Basin region
dx = dy = 3./111.                            # ~3 km grid spacing in degrees
vars_out = ['T2', 'SNOW']                    # variables to regrid and extract
start = '2017-04-01 00:00'                   # time range to extract
stop  = '2017-05-01 00:00'

Use xESMF to regrid#

xESMF is an xarray-enabled interface to the ESMF regridding tools from NCAR. ESMF can regrid between curvilinear, rectilinear, and unstructured grids, offers conservative regridding options, and much more.
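As a quick illustration of the xESMF pattern used below, here is a small self-contained sketch on synthetic grids (grid sizes, variable name, and values are made up): a source dataset with 2D lat/lon coordinates is regridded onto a destination grid defined by 1D lat/lon coordinates.

# synthetic source grid with 2D lat/lon (stand-in for a curvilinear grid)
src_lon, src_lat = np.meshgrid(np.linspace(-76, -74, 20), np.linspace(38, 43, 25))
ds_src = xr.Dataset(
    {'T2_demo': (('y', 'x'), 280 + 5 * np.random.rand(25, 20))},
    coords={'lat': (('y', 'x'), src_lat), 'lon': (('y', 'x'), src_lon)})

# destination rectilinear grid with 1D lat/lon
ds_dst = xr.Dataset(coords={'lon': np.linspace(-76, -74, 40),
                            'lat': np.linspace(38, 43, 50)})

demo_regridder = xe.Regridder(ds_src, ds_dst, 'bilinear')
T2_demo_rect = demo_regridder(ds_src['T2_demo'])   # result has dims ('lat', 'lon')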

def bbox2ij(lon, lat, bbox=[-160., -155., 18., 23.]):
    """Return indices i0,i1,j0,j1 that completely cover the specified bounding box.

    i0, i1, j0, j1 = bbox2ij(lon, lat, bbox)
    lon, lat = 2D arrays that are the target of the subset
    bbox = list containing the bounding box: [lon_min, lon_max, lat_min, lat_max]

    Example
    -------
    >>> i0, i1, j0, j1 = bbox2ij(lon_rho, lat_rho, [-71., -63., 39., 46.])
    >>> h_subset = nc.variables['h'][j0:j1, i0:i1]
    """
    bbox = np.array(bbox)
    mypath = np.array([bbox[[0, 1, 1, 0]], bbox[[2, 2, 3, 3]]]).T
    p = path.Path(mypath)
    points = np.vstack((lon.ravel(), lat.ravel())).T
    n, m = np.shape(lon)
    inside = p.contains_points(points).reshape((n, m))
    ii, jj = np.meshgrid(range(m), range(n))
    return min(ii[inside]), max(ii[inside]), min(jj[inside]), max(jj[inside])

Before we regrid to rectilinear, let’s subset a region that covers our area of interest. Because lon and lat are 2D arrays, we can’t simply slice these coordinate variables with xarray. Instead, we use a routine that finds the i,j index ranges covering a specified bounding box, and then we slice on those indices.
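For reference, a pure-xarray alternative (a sketch only, not used below) is to build a boolean mask from the 2D lon/lat arrays and call .where(..., drop=True), which keeps the bounding rectangle of grid cells inside the box; this is typically slower and more memory-hungry than the index-based routine for a dataset this large, so the lines are left commented.

# lon_min, lon_max, lat_min, lat_max = bbox
# in_box = ((ds['lon'] >= lon_min) & (ds['lon'] <= lon_max) &
#           (ds['lat'] >= lat_min) & (ds['lat'] <= lat_max)).compute()
# ds_box = ds.where(in_box, drop=True)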

i0,i1,j0,j1 = bbox2ij(ds['lon'].values, ds['lat'].values, bbox=bbox)
print(i0,i1,j0,j1)
1123 1178 555 663
ds_subset = ds.isel(x=slice(i0-1,i1+1), y=slice(j0-1,j1+1))
ds_subset = ds_subset.sel(time=slice(start,stop))
ds_subset
<xarray.Dataset> Size: 2GB
Dimensions:         (time: 721, y: 110, x: 57, bottom_top_stag: 51,
                     bottom_top: 50, soil_layers_stag: 4, x_stag: 1368,
                     y_stag: 1016, snow_layers_stag: 3, snso_layers_stag: 7)
Coordinates:
    lat             (y, x) float32 25kB dask.array<chunksize=(110, 57), meta=np.ndarray>
    lat_u           (y, x_stag) float32 602kB dask.array<chunksize=(110, 175), meta=np.ndarray>
    lat_v           (y_stag, x) float32 232kB dask.array<chunksize=(175, 57), meta=np.ndarray>
    lon             (y, x) float32 25kB dask.array<chunksize=(110, 57), meta=np.ndarray>
    lon_u           (y, x_stag) float32 602kB dask.array<chunksize=(110, 175), meta=np.ndarray>
    lon_v           (y_stag, x) float32 232kB dask.array<chunksize=(175, 57), meta=np.ndarray>
  * time            (time) datetime64[ns] 6kB 2017-04-01 ... 2017-05-01
  * x               (x) float64 456B 1.756e+06 1.76e+06 ... 1.976e+06 1.98e+06
  * y               (y) float64 880B 1.88e+05 1.92e+05 ... 6.2e+05 6.24e+05
Dimensions without coordinates: bottom_top_stag, bottom_top, soil_layers_stag,
                                x_stag, y_stag, snow_layers_stag,
                                snso_layers_stag
Data variables: (12/153)
    ACDEWC          (time, y, x) float32 18MB dask.array<chunksize=(24, 110, 57), meta=np.ndarray>
    ACDRIPR         (time, y, x) float32 18MB dask.array<chunksize=(24, 110, 57), meta=np.ndarray>
    ACDRIPS         (time, y, x) float32 18MB dask.array<chunksize=(24, 110, 57), meta=np.ndarray>
    ACECAN          (time, y, x) float32 18MB dask.array<chunksize=(24, 110, 57), meta=np.ndarray>
    ACEDIR          (time, y, x) float32 18MB dask.array<chunksize=(24, 110, 57), meta=np.ndarray>
    ACETLSM         (time, y, x) float32 18MB dask.array<chunksize=(24, 110, 57), meta=np.ndarray>
    ...              ...
    ZNU             (bottom_top) float32 200B dask.array<chunksize=(50,), meta=np.ndarray>
    ZNW             (bottom_top_stag) float32 204B dask.array<chunksize=(51,), meta=np.ndarray>
    ZS              (soil_layers_stag) float32 16B dask.array<chunksize=(4,), meta=np.ndarray>
    ZSNSO           (time, snso_layers_stag, y, x) float32 127MB dask.array<chunksize=(24, 7, 110, 57), meta=np.ndarray>
    ZWT             (time, y, x) float32 18MB dask.array<chunksize=(24, 110, 57), meta=np.ndarray>
    crs             int64 8B ...
Attributes: (12/148)
    AER_ANGEXP_OPT:                  1
    AER_ANGEXP_VAL:                  1.2999999523162842
    AER_AOD550_OPT:                  1
    AER_AOD550_VAL:                  0.11999999731779099
    AER_ASY_OPT:                     1
    AER_ASY_VAL:                     0.8999999761581421
    ...                              ...
    WEST-EAST_PATCH_START_STAG:      1
    WEST-EAST_PATCH_START_UNSTAG:    1
    W_DAMPING:                       1
    YSU_TOPDOWN_PBLMIX:              0
    history:                         Tue Mar 29 16:35:22 2022: ncrcat -A -vW ...
    history_of_appended_files:       Tue Mar 29 16:35:22 2022: Appended file ...
ds_subset.nbytes/1e9
2.489121492
da = ds_subset.T2.sel(time='2017-04-25 00:00', method='nearest')
viz = da.hvplot.quadmesh(x='lon', y='lat', geo=True, rasterize=True, cmap='turbo')
base = gv.tile_sources.OSM
base * viz.opts(alpha=0.5)
%%time
# consolidate x and y into single chunks (generally required for xESMF regridding) and chunk time into 24-hour blocks
ds_subset = ds_subset.chunk({'x':-1, 'y':-1, 'time':24})
CPU times: user 338 ms, sys: 0 ns, total: 338 ms
Wall time: 335 ms
%%time
# define the destination rectilinear grid: 1D lon/lat spanning the bounding box at ~3 km spacing
ds_out = xr.Dataset({'lon': (['lon'], np.arange(bbox[0], bbox[1], dx)),
                     'lat': (['lat'], np.arange(bbox[2], bbox[3], dy))})

regridder = xe.Regridder(ds_subset, ds_out, 'bilinear')
regridder
CPU times: user 145 ms, sys: 8.46 ms, total: 154 ms
Wall time: 278 ms
xESMF Regridder 
Regridding algorithm:       bilinear 
Weight filename:            bilinear_110x57_143x54.nc 
Reuse pre-computed weights? False 
Input grid shape:           (110, 57) 
Output grid shape:          (143, 54) 
Periodic in longitude?      False
%%time
ds_out = regridder(ds_subset[vars_out])
print(ds_out)
<xarray.Dataset> Size: 45MB
Dimensions:  (time: 721, lat: 143, lon: 54)
Coordinates:
  * time     (time) datetime64[ns] 6kB 2017-04-01 ... 2017-05-01
  * lon      (lon) float64 432B -75.9 -75.87 -75.85 ... -74.52 -74.49 -74.47
  * lat      (lat) float64 1kB 38.7 38.73 38.75 38.78 ... 42.48 42.51 42.54
Data variables:
    T2       (time, lat, lon) float32 22MB dask.array<chunksize=(24, 143, 54), meta=np.ndarray>
    SNOW     (time, lat, lon) float32 22MB dask.array<chunksize=(24, 143, 54), meta=np.ndarray>
Attributes:
    regrid_method:  bilinear
CPU times: user 2.07 s, sys: 116 ms, total: 2.18 s
Wall time: 2.2 s
ds_out['SNOW']
<xarray.DataArray 'SNOW' (time: 721, lat: 143, lon: 54)> Size: 22MB
dask.array<astype, shape=(721, 143, 54), dtype=float32, chunksize=(24, 143, 54), chunktype=numpy.ndarray>
Coordinates:
  * time     (time) datetime64[ns] 6kB 2017-04-01 ... 2017-05-01
  * lon      (lon) float64 432B -75.9 -75.87 -75.85 ... -74.52 -74.49 -74.47
  * lat      (lat) float64 1kB 38.7 38.73 38.75 38.78 ... 42.48 42.51 42.54
list(ds_out.variables)
['T2', 'SNOW', 'time', 'lon', 'lat']
list(ds_out.data_vars)
['T2', 'SNOW']
ds_out['T2'].encoding
{}
ds_out.time
<xarray.DataArray 'time' (time: 721)> Size: 6kB
array(['2017-04-01T00:00:00.000000000', '2017-04-01T01:00:00.000000000',
       '2017-04-01T02:00:00.000000000', ..., '2017-04-30T22:00:00.000000000',
       '2017-04-30T23:00:00.000000000', '2017-05-01T00:00:00.000000000'],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 6kB 2017-04-01 ... 2017-05-01
encoding = {}
for var in ds_out.variables:
    # compress each variable (zlib, level 2) and turn off _FillValue
    encoding[var] = dict(zlib=True, complevel=2, 
                         fletcher32=False, shuffle=True,
                         _FillValue=None
                        )
# you will need to update the filepaths and uncomment the following line to save out your data.
# ds_out.load().to_netcdf(nc_outfile, encoding=encoding, mode='w')
# the read-back below will only succeed once the file above has been written
ds_nc = xr.open_dataset(nc_outfile)
ds_nc
(ds_nc['T2']-273.15).hvplot(x='lon',y='lat', geo=True,
                rasterize=True, cmap='turbo', 
                tiles='OSM', clim=(2,15))
ds_outcl = ds_subset[vars_out]
list(ds_outcl.data_vars)
encoding = {}
for var in ds_outcl.variables:
    # same compression settings as the rectilinear output
    encoding[var] = dict(zlib=True, complevel=2, 
                         fletcher32=False, shuffle=True,
                         _FillValue=None
                        )
# you will need to update the filepaths and uncomment the following line to save out your data.
# ds_outcl.load().to_netcdf('CONUS404_DRB_curvilinear.nc', encoding=encoding, mode='w')
client.close(); cluster.shutdown()