Scaling a Subset of Input Data in NetCDF Format

2021-02-10·
Ka Ming FUNG
Ka Ming FUNG
· 12 min read

Reference: For general python handling of nc files using Xarray: http://xarray.pydata.org/en/stable/index.html

I was working on a set of input data of Ocean DMS concentration. I suspected the reported values over the Southern Ocean (vaguely defined as ocean south to 30ºS) were a bit high. I wanted to reduce those values by 50% to see if my model would match observations better. 🌊

Surprisingly, I couldn’t find a quick way to do it. Here I present a workaround. (Please let me know if you have a smarter way! 💡)

import wget                        # for downloading data via URL links
import xarray as xr                   # the major tool to work with NetCDF data!
from matplotlib import pyplot as plt  # for plotting

Open the monthly Ocean DMS nc file

The marine DMS data used in this example can be downloaded from the Dalhousie University server. Or, we can download it from Python.

wget.download('http://rain.ucis.dal.ca/ctm/HEMCO/DMS/v2015-07/DMS_lana.geos.1x1.nc')
'DMS_lana.geos.1x1.nc'

We will use the function xr.open_dataset from the xarray package.

ds = xr.open_dataset('DMS_lana.geos.1x1.nc')

ds # display the dataset
<xarray.Dataset>
Dimensions: (lat: 181, lon: 360, time: 12)
Coordinates:

  • lon (lon) float32 -180.0 -179.0 -178.0 -177.0 … 177.0 178.0 179.0
  • lat (lat) float32 -89.75 -89.0 -88.0 -87.0 … 87.0 88.0 89.0 89.75
  • time (time) datetime64[ns] 1985-01-01 1985-02-01 … 1985-12-01 Data variables: DMS_OCEAN (time, lat, lon) float32 … Attributes: CDI: Climate Data Interface version 1.6.4 (http://code.zmaw.de/p… Conventions: COARDS history: Mon Jul 20 16:29:35 2015: ncatted -O -a units,DMS_OCEAN,o,c… Title: COARDS/netCDF file created by BPCH2COARDS (GAMAP v2-17+) Format: NetCDF-3 Model: GEOS5_47L Delta_Lon: 1.0 Delta_Lat: 1.0 NLayers: 47 Start_Date: 19850101 Start_Time: 0 End_Date: 19850201 End_Time: 0 Delta_Time: 744 CDO: Climate Data Operators version 1.6.4 (http://code.zmaw.de/p

We can also visualize annual mean of the data for a quick check:

ds['DMS_OCEAN'].mean(dim=['time']).plot()
<matplotlib.collections.QuadMesh at 0x7fd61bbd01f0>

png

Scaling down the Southern Ocean DMS

As mentioned, we want to scale down the ocean DMS below 30ºS.

# Step 1: make two copies of the data set, one for grid cells with lat > 30ºS, one for <30ºS

dms_above30S = ds['DMS_OCEAN'].sel(lat=slice(-30,90))

dms_below30S = ds['DMS_OCEAN'].sel(lat=slice(-90,-30-1e-9)) # we add a small value here to avoid the grid cells with lat = 30ºS to appear in both data sets.

# Step 2: reduce the DMS in the dataset of interest by 50%

dms_below30S = dms_below30S * 0.5

# Step 3: merge the untouched and scaled data
ds_partly_scaled = xr.merge([dms_below30S, dms_above30S], compat="no_conflicts")

To ensure our exercise is correct, let’s make some plots to compare the DMS concentrations in ds_partly_scaled and the original dataset, ds.

(ds_partly_scaled['DMS_OCEAN'] / ds['DMS_OCEAN']).plot(col='time', col_wrap=3)
<xarray.plot.facetgrid.FacetGrid at 0x7fd61bd3ba00>

png

Yellow = 1 (untouched); Purple = 0.5(scaled). Mission accomplished?

ds_partly_scaled # missing attributes from the original dataset.
<xarray.Dataset>
Dimensions: (lat: 181, lon: 360, time: 12)
Coordinates:

  • lat (lat) float64 -89.75 -89.0 -88.0 -87.0 … 87.0 88.0 89.0 89.75

  • lon (lon) float32 -180.0 -179.0 -178.0 -177.0 … 177.0 178.0 179.0

  • time (time) datetime64[ns] 1985-01-01 1985-02-01 … 1985-12-01 Data variables: DMS_OCEAN (time, lat, lon) float32 0.0 0.0 … 1.4103641e-07 1.4103641e-07