This vignette provides an introduction to the c4h_get()
function, used for downloading climate data from the Copernicus Climate
Data Store (CDS). It
also covers the helper function c4h_get_help(), which
allows you to explore the available datasets and their required
parameters before downloading.
c4h_get() takes the following arguments:
pat: a string containing your Personal Access Token for the CDS (details on how to obtain this below).
dataset: the name of the dataset you want to download. Available datasets can be found via c4h_get_help() or c4h_get_help("datasets").
product_type: the name of the product type you want to download. Available product types can be found via c4h_get_help("dataset", "product_type"). For example, if downloading monthly mean data, selecting product_type = "monthly_averaged_reanalysis" or product_type = "monthly_averaged_reanalysis_by_hour_of_day" tells the CDS API whether to download the full monthly mean, or the monthly mean of a specified hour (e.g. the monthly mean of temperature at 18:00).
variable: a string, or a vector of strings, containing the long name(s) of the variable(s) to be downloaded.
year: a vector of years to download.
month: a vector of months to download. The default is all months (i.e. 1-12).
day: a vector of days to download. The default will download days 1-31 if the data is hourly or daily.
time: a vector of hours to download. The default will download hour 0 (midnight) if the data is hourly.
bbox: a vector of coordinates specifying the bounding box to download, in the order c(lat_max, lon_min, lat_min, lon_max) (N, W, S, E).
leadtime_month: a vector of leadtimes to download for seasonal forecast data.
originating_centre: the forecasting centre responsible for producing the hindcast/forecast data.
system: the forecast model system for a given originating centre.
outname: the file stem used to save the file. The data year and month are appended to it. The default is "cds".
outpath: the path to save the file. The default is the current working directory.
The key idea is to specify:
There are many types of climate data available on the CDS. To know
what data is available to download from the CDS using
clim4health, you can use the function
c4h_get_help(). If the function is run without any input
parameters, it will return the datasets that are currently available to
download within clim4health.
Once you have decided the type of data you wish to download, you can
further explore the required parameters for that dataset by running
c4h_get_help() with the dataset name as an input parameter.
For example, to explore the parameters required to download reanalysis
data from ERA5-Land, we would run:
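A minimal sketch of that call, assuming the package is installed under the name clim4health and that c4h_get_help() accepts the dataset name through its `dataset` argument (the argument name used elsewhere in this vignette):

```r
# Dataset name as used in the c4h_get() examples of this vignette
era5_land <- "reanalysis-era5-land"

# Assumption: clim4health is installed and exports c4h_get_help().
# The guard lets the snippet run (and do nothing) when it is not.
if (requireNamespace("clim4health", quietly = TRUE)) {
  clim4health::c4h_get_help(dataset = era5_land)
}
```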
This tells us the required parameters to use in
c4h_get() to download this particular dataset. We can
explore options for some of these parameters using
c4h_get_help() with the dataset name and parameter name as
input parameters. For example, to explore the available variables for
ERA5-Land, we would run:
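A minimal sketch of that call, assuming the package is installed under the name clim4health and that the `dataset` and `parameter` argument names match the usage shown elsewhere in this vignette:

```r
# Dataset and parameter to look up
era5_land <- "reanalysis-era5-land"
param     <- "variable"

# Assumption: clim4health is installed and exports c4h_get_help().
# The guard lets the snippet run (and do nothing) when it is not.
if (requireNamespace("clim4health", quietly = TRUE)) {
  clim4health::c4h_get_help(dataset = era5_land, parameter = param)
}
```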
Examples in this section of the vignette will fail if they are run
because additional parameters need to be specified. They are purely for
illustrative purposes to show how to specify key arguments in
c4h_get().
There are multiple arguments to specify the time range you would like
to download. Parameters year, month,
day, and time take vector inputs to determine
the years, months, days and times of day of interest. For example, to
download hourly reanalysis data from 2011 to 2025 in February and
March:
c4h_get(dataset = "reanalysis-era5-land", # data type
        year = 2011:2025,                 # load multiple years
        month = c(2, 3),                  # load February and March
        time = 0:23)                      # load all hours of the day
💡 Note: In this case (for hourly reanalysis data from ERA5-Land), the default is to download data from midnight (time = 0). You could choose to download data at midday by selecting time = 12, or load data from all hours of the day by specifying time = 0:23. Selecting only the hours you need can be useful to reduce the amount of data you have to process.
Additionally, the argument leadtime_month specifies the
forecast months to download if relevant. For example, to download
leadtime months of February and March for a seasonal forecast
initialised in February 2025, we would specify:
c4h_get(dataset = "seasonal-monthly-single-levels", # data type
        year = c(2025),                             # year of initialisation
        month = c(2),                               # month of initialisation
        leadtime_month = c(1, 2))                   # leadtime months
💡 Note: There is no default for leadtime_month, so you must specify the leadtime months you wish to download. Specifying a single value downloads exactly that one leadtime month, not all leadtimes up to that month. For example, leadtime_month = 3 downloads only the third leadtime, not leadtimes 1-3 (which could be specified using leadtime_month = 1:3).
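To see which calendar months a set of leadtimes corresponds to, a small helper can do the arithmetic. This is a sketch in base R, not part of clim4health: leadtime_to_month() is a hypothetical name, and it assumes leadtime 1 is the initialisation month itself, as in the February example above.

```r
# Convert leadtime months to calendar months (1-12).
# Assumption: leadtime 1 corresponds to the initialisation month.
leadtime_to_month <- function(init_month, leadtime_month) {
  stopifnot(init_month %in% 1:12, all(leadtime_month >= 1))
  # Shift to 0-based months, add the offset, wrap around after December
  ((init_month - 1) + (leadtime_month - 1)) %% 12 + 1
}

leadtime_to_month(init_month = 2, leadtime_month = c(1, 2))  # 2 3 (Feb, Mar)
leadtime_to_month(init_month = 11, leadtime_month = 1:3)     # 11 12 1
month.name[leadtime_to_month(2, 3)]                          # "April"
```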
You can explore how to specify the spatial extent (bounding box) for
each dataset using c4h_get_help():
We can select a box in latitude and longitude coordinates using the
parameter bbox within c4h_get(). For example,
to select latitudes between -4 and 4, and longitudes between -73 and -70,
we specify the box in order of its North, West, South, East
boundaries.
c4h_get(dataset = "reanalysis-era5-land", # data type
        bbox = c(4, -73, -4, -70))        # bounding box in order N, W, S, E
💡 Note: There is no default for bbox. You must always specify the region you wish to download.
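Because the N, W, S, E order is easy to get wrong, it can help to build the vector from latitude and longitude ranges. The helper below is a sketch in base R; make_bbox() is a hypothetical name and not a clim4health function, and it assumes the box does not cross the antimeridian.

```r
# Build a bounding box in the order c(lat_max, lon_min, lat_min, lon_max)
# from two latitudes and two longitudes given in any order.
# Assumption: the box does not cross the antimeridian (180 degrees).
make_bbox <- function(lat, lon) {
  stopifnot(length(lat) == 2, length(lon) == 2, all(abs(lat) <= 90))
  c(max(lat), min(lon), min(lat), max(lon))
}

make_bbox(lat = c(-4, 4), lon = c(-73, -70))  # 4 -73 -4 -70
```

The result can be passed directly as the bbox argument of c4h_get().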
You can check the available variables for a given dataset using
c4h_get_help(dataset = example_dataset, parameter = "variable"),
as above. The variable is then specified in c4h_get() using
the variable parameter. For example, to download
temperature at 2m from ERA5-Land, the key parameters we would specify
are:
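As a minimal sketch, the two key arguments can be collected in a named list, using the dataset and variable names that appear in the other examples of this vignette. The call itself is left commented out because, like the other snippets in this section, it needs further arguments (such as pat, year, and bbox) before it will run.

```r
# Key arguments for 2m temperature from ERA5-Land
key_args <- list(
  dataset  = "reanalysis-era5-land",  # data type
  variable = "2m_temperature"         # variable name
)
str(key_args)

# Not run: illustrative only, further arguments are required
# do.call(c4h_get, key_args)
```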
By default, c4h_get() will download data to your current
working directory. However, you may wish to specify a different
directory using outpath, and change the name of the file
you are downloading. For example:
c4h_get(dataset = "reanalysis-era5-land", # data type
        variable = "2m_temperature",      # variable name
        year = 2010:2012,                 # years to download
        outpath = "/path/to/dir/",        # directory to save data
        outname = "era5land_t2m")         # file name stem
💡 Note: It is recommended to save different datasets in different sub-folders, which will allow the c4h_load function to easily parse the relevant datasets. For example, you could have two folders called /path/to/dir/reanalysis/ and /path/to/dir/forecast/.
In the final name of the downloaded file, the parameter
outname will be followed by the relevant year and month of
the downloaded data. For example, in this case, our first output file
would be stored as: /path/to/dir/era5land_t2m_201001.nc.
When downloading data, c4h_get() will print the name of the
file where the data is saved.
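The file names to expect can be worked out ahead of the download. The sketch below is base R only; expected_files() is a hypothetical helper, and the pattern outname_YYYYMM.nc with one file per year and month is inferred from the example file name above rather than taken from the clim4health source.

```r
# List the file paths a download is expected to produce.
# Assumption: one file per year-month, named <outname>_<YYYYMM>.nc
expected_files <- function(outpath, outname, year, month = 1:12) {
  grid <- expand.grid(month = month, year = year)  # month varies fastest
  stem <- sprintf("%s_%04d%02d.nc", outname,
                  as.integer(grid$year), as.integer(grid$month))
  # Drop any trailing slash so file.path() does not double it
  file.path(sub("/+$", "", outpath), stem)
}

files <- expected_files("/path/to/dir/", "era5land_t2m", year = 2010:2012)
head(files, 2)
length(files)  # 36 (3 years x 12 months)
```

Comparing this list against file.exists() after a download is one way to spot missing months.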
To download data from the CDS, you will need a (free) account with the ECMWF. If you don’t already have an ECMWF account, you can go to the ECMWF page and create a new account by clicking “Log in” in the top right corner of the webpage.
With these credentials you can now log into the CDS.
Click on your user icon (top right corner of the webpage). Under the “Your profile” tab, you can find your API key (PAT). Copy and paste to save it in your R session, as you will need it later when downloading the data.
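Rather than pasting the token into every script, one option is to keep it in an environment variable and read it at the start of the session. The sketch below assumes a hypothetical variable name, CDS_PAT, which is not something clim4health defines; read_pat() is likewise an illustrative helper.

```r
# Read the Personal Access Token from an environment variable.
# Assumption: the token was stored under the (hypothetical) name CDS_PAT,
# for example with a line CDS_PAT=... in ~/.Renviron
read_pat <- function(env_var = "CDS_PAT") {
  pat <- Sys.getenv(env_var, unset = "")
  if (!nzchar(pat)) {
    stop("Environment variable ", env_var, " is not set", call. = FALSE)
  }
  pat
}

# Usage, once the variable has been set:
# pat_api <- read_pat()
```

Keeping the token out of scripts also avoids committing it to version control by accident.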
To download the data in this tutorial, you will also need to accept additional licenses. Under the “Licences” tab in your profile, please check “Additional licence to use non European contributions” and “Licence to use Copernicus Products”. You can click on each of the licences to find out more information before accepting.
Once you know your PAT, the dataset you would like to download, and
the required parameters, you can call c4h_get() as below.
The three examples download reanalysis data from ERA5-Land, seasonal
monthly forecasts over the hindcast period, and a corresponding
forecast dataset.
💡 Note: It can take some time to download all the data for your analysis, depending on the volume of data you have requested. For example, to download hourly global data from 1940-present will take much longer than to download monthly data for a specific region from 1990-2020. Therefore it is recommended to double check you are requesting exactly the data you need for your analysis before beginning the download.
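A rough way to compare two requests before downloading is to count the time steps each one asks for. The sketch below is base R only; request_size() is a hypothetical helper, and the count is an upper bound (not every month has 31 days) that ignores the bounding box and grid resolution.

```r
# Upper bound on the number of fields (time steps x variables) requested
request_size <- function(year, month = 1:12, day = 1:31, time = 0,
                         variable = "2m_temperature") {
  length(year) * length(month) * length(day) * length(time) * length(variable)
}

all_hours <- request_size(2011:2025, month = 2:3, time = 0:23)
midnight  <- request_size(2011:2025, month = 2:3, time = 0)
all_hours / midnight  # 24
```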
This example downloads monthly mean 2m temperature from the ERA5-Land
reanalysis for April and May of 2010, 2011, and 2012. The data is
downloaded for the box defined by latitudes 33 and -23, and longitudes
-93 and -17. The files are saved in the current working directory with
the name “era5land” followed by the year and month of the data
(e.g. era5land_201004.nc). The names of the files where the
data is saved will be printed in the console.
# Download reanalysis data
c4h_get(pat = pat_api,
        dataset = "reanalysis-era5-land-monthly-means",
        product_type = "monthly_averaged_reanalysis",
        variable = "2m_temperature",
        year = c(2010, 2011, 2012),
        month = c(4, 5),
        bbox = c(33, -93, -23, -17),
        outname = "era5land")
💡 Note: In this case (for monthly-mean reanalysis data from ERA5-Land), you can download monthly-mean daily-mean data by selecting product_type = "monthly_averaged_reanalysis". The option to download monthly-mean data from a specific hour of day is not yet implemented (for example, monthly-mean midday temperature).
This example downloads monthly mean 2m temperature, 2m dewpoint
temperature, and total precipitation from the seasonal monthly single
levels dataset for the ECMWF prediction system 51. The data will
be downloaded for the forecasts issued in April 2010, 2011, and 2012,
for leadtime months of April and May. The data will be downloaded for
the same bounding box as in the reanalysis example. The files will be saved in the specified
directory with the name “hindcast” followed by the year and month of
the data (e.g. /path/to/dir/hindcast_201004.nc). The names
of the files where the data is saved will be printed in the console.
# Download hindcast data
c4h_get(pat = pat_api,
        dataset = "seasonal-monthly-single-levels",
        originating_centre = c("ecmwf"),
        system = c("51"),
        variable = c("2m_temperature",
                     "2m_dewpoint_temperature",
                     "total_precipitation"),
        product_type = c("monthly_mean"),
        year = c(2010, 2011, 2012),
        month = c(4),
        leadtime_month = c(1, 2),
        bbox = c(33, -93, -23, -17),
        outpath = "/path/to/dir/",
        outname = "hindcast")
This example downloads the data in the same format as above, except
this time we only request one variable (precipitation), and for the
forecast issued in April 2025. The file is saved in
/path/to/dir/forecast_202504.nc.
# Download forecast data
c4h_get(pat = pat_api,
        dataset = "seasonal-monthly-single-levels",
        originating_centre = c("ecmwf"),
        system = c("51"),
        variable = c("total_precipitation"),
        product_type = c("monthly_mean"),
        year = c(2025),
        month = c(4),
        leadtime_month = c(1, 2),
        bbox = c(33, -93, -23, -17),
        outpath = "/path/to/dir/",
        outname = "forecast")