GEOGloWS Training
This website contains training materials, presentations, videos, workshops, code notebooks, and links to other resources for learning to use the GEOGloWS Toolbox to access relevant global water information. The principal tool has been the GEOGloWS ECMWF Streamflow Services, but additional tools are being added and improved for discovering in situ observations, for multi-dimensional gridded time series data such as the output of numerical weather models, and for groundwater.
Overview
The GEOGloWS initiative consolidates elements of freshwater activities in GEO. It ensures that strong coordination and commitment are in place for links among data, information, knowledge, applications, and policy. From research to implementation, GEOGloWS provides the demonstration grounds for user-driven solutions to address water issues.

Click here to view a presentation that introduces GEOGloWS.
GEOGloWS Toolbox with Tethys
The GEOGloWS Toolbox is a collection of water resources Earth Observation decision support web apps. It builds on what is being organized in GEOGloWS as an App Warehouse, which you can think of as similar to the Apple or Android app stores. These foundational applications are drawn from the Warehouse, which will grow over time, and can be downloaded and combined on customized portals with custom apps to meet the specific needs of different stakeholders.

You can navigate to the GEOGloWS Toolbox hosted by Aquaveo and explore any or all of the apps here: http://apps.geoglows.org/. Note that the Tethys App Warehouse is only functional when logged in as an administrator and is used to add, update, and delete apps available in the Warehouse.
- GEOGloWS HydroViewer
The GEOGloWS ECMWF Streamflow Services provides access to a 40-year simulation and daily 15-day ensemble forecast on nearly 1,000,000 river reaches globally.
- Water Data Explorer
This web app is used by the WMO as a catalog, data access, and visualization tool for observational and other time series hydrological data stored at point locations.
- Met Data Explorer
Similar to the Water Data Explorer, this app is used to process multi-dimensional gridded data typical of meteorological forecasting systems, which produce output as a time series stack of rasters formatted as NetCDF, GRIB, or GeoTIFF. The data could be of any type, but in GEOGloWS it is most often precipitation.
- GRACE Groundwater
The GRACE satellite data have become useful for understanding groundwater anomalies. This application (still being updated) provides visualization of and access to this information.
- HydroStats
This application is used to validate the GEOGloWS historical simulation and forecasts, as you will see in those sections, but it is a general-purpose tool for computing goodness-of-fit statistics between observed and modeled (or any two) time series.
GEOGloWS ECMWF Streamflow Model
The GEOGloWS ECMWF Streamflow Model is a hydrologic model that provides forecasted and historically simulated river discharge. The model is built so that it can be accessed through web services, an approach we call Hydrologic Modeling as a Service (HMaaS). This approach centralizes the cyber-infrastructure, human capacity, and other components of hydrological modeling, using the best meteorology and expertise from ECMWF along with the latest advances in Information and Communication Technology (ICT). The result is a reliable hydrological forecast information service, a disruptive technology that removes the need to replicate the modeling process, duplicate the underlying geographic and meteorological data, and provide computing, human, and financial resources locally.
The tools available for accessing the model include a web app, a data service (REST API), a mapping service (Esri Living Atlas), and a Python package. We provide several example apps, code notebooks, presentations, workshops, and videos to help you learn how to access model data, visualize results, download hydrological information for your location, and develop custom workflows and apps for your decision-making needs. Information about how to validate and improve the results based on local observations is also given.
To open the app, please visit: https://apps.geoglows.org/apps/geoglows-hydroviewer/
About the GEOGloWS ECMWF Streamflow Service
GEOGloWS helps to organize the international community engaged in the hydrologic sciences, observations, and forecasting. It provides a forum for government-to-government collaboration, and engagement with the academic and private sectors to achieve the delivery of actionable water information. Since the formal creation of the initiative in 2017, the most significant element of GEOGloWS has been the application of Earth Observations (EO) to create a system that forecasts flow on every river of the world while also providing a 40+ year simulated historical flow. This system is called the GEOGloWS ECMWF Streamflow Services.

Daily 15-day ensemble forecast and 40-year historical simulation for every river in the world
The GEOGloWS ECMWF Streamflow Service uses a Hydrologic Modeling as a Service (HMaaS) approach, which centralizes the cyberinfrastructure, human capacity, and other components of hydrologic modeling. It uses the best forecasts and expertise available, along with the latest advances in Information and Communication Technology. We can now deliver reliable forecast information as a service, instead of delivering the underlying data, which then must be synthesized and computed locally.
In the past, millions of dollars have been invested by international and local agencies to develop hydrologic models from global data sources. This approach requires every agency to be able to download input data (e.g., terrain information, land use, meteorological data, and other inputs) to create models, and then have the computational power, software, and human capacity necessary to run and calibrate those models. Having to replicate this resource everywhere is expensive in terms of the cyberinfrastructure required. In addition, these systems frequently have short useful life spans because the organizations who use them lack the resources to continue maintaining and operating them when the external funding source is depleted.
The GEOGloWS global streamflow forecasting service allows local stakeholders to focus on solving water management problems such as flooding, drought, and water/food security issues by providing the water intelligence they need to make decisions. It also benefits the global economy by providing water intelligence to sectors that need to make high-risk investment decisions such as the insurance and reinsurance industries.

Hydrologic Modeling as a Service (HMaaS)
The streamflow services are powered by the cyberinfrastructure at ECMWF, which runs its ensemble meteorological forecast through the HTESSEL land surface model to produce global runoff on a 16x16 kilometer grid. This output is then mapped to the GEOGloWS watersheds and routed through the river network using Muskingum routing, as implemented in the RAPID software, to produce a 15-day forecast on every river. The ERA5 retrospective historical data are run over the same domain to produce the 40-year record of streamflow that is used to derive return periods and put the current forecast in context. Each of the components of the model and many of their results have been studied individually and published in academic journals; refer to the Publications page for more information on each component.
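For intuition about the routing step, here is a minimal single-reach Muskingum routing sketch in Python. RAPID solves the same relationship simultaneously over the whole river network using a matrix formulation, so this illustrates the concept rather than the production implementation; the K, X, and hydrograph values below are made up.

```python
def muskingum_route(inflow, K, X, dt):
    """Route an inflow hydrograph (m^3/s) through one reach.

    K: storage time constant (same units as dt)
    X: weighting factor (typically 0 to 0.5)
    dt: time step
    """
    denom = 2 * K * (1 - X) + dt
    c0 = (dt - 2 * K * X) / denom
    c1 = (dt + 2 * K * X) / denom
    c2 = (2 * K * (1 - X) - dt) / denom
    outflow = [inflow[0]]  # assume initial outflow equals initial inflow
    for t in range(1, len(inflow)):
        outflow.append(c0 * inflow[t] + c1 * inflow[t - 1] + c2 * outflow[t - 1])
    return outflow

# example: route a simple triangular flood wave (hypothetical numbers)
hydrograph = [10, 50, 120, 90, 60, 40, 25, 15, 10]
routed = muskingum_route(hydrograph, K=12.0, X=0.2, dt=6.0)
```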

Model Workflow and Components
The resulting streamflow forecasts, along with a web mapping service produced and hosted by Esri, are delivered through an API so that custom web and other applications can be created from the HMaaS.

Data Service (REST API)
The REST API allows you to query and download information for any stream by forming a query with parameters as part of the URL. Documentation for the possible queries and how to form them is found on the main GEOGloWS Streamflow service page: https://geoglows.ecmwf.int/. The following workshop provides a brief overview of how the services are used.
REST API Documentation
1. Go to the website where you can find the documentation for the REST API: https://geoglows.ecmwf.int
2. Click the tab on the top called "REST API Documentation".
3. Click on the blue bar that says GET and the name of the streamflow API method.
4. Press the white "Try it out" button located beneath the blue bar on the right side.
5. Provide the necessary parameters. Usually, the only argument is either a reach_id or both a latitude and a longitude.
6. Press the blue "Execute" bar.
7. The website will then generate the appropriate curl command and URL to access the data you chose with the parameters you provided.
8. After retrieving the streamflow information from the REST API, it will be presented as a preview under code 200 (a common response code for a successful query). A download button is found on the bottom right of that box.
9. You can copy and paste the URL from step 7 into a new tab of your web browser to retrieve the same result without needing to use the documentation's interactive tool.
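The same query can also be made outside the browser. The sketch below uses the Python requests package to call the ForecastStats method with the reach ID used in other examples in these materials; the output filename is arbitrary.

```python
import requests

# ForecastStats endpoint documented at https://geoglows.ecmwf.int
url = "https://geoglows.ecmwf.int/api/ForecastStats/"
params = {"reach_id": 9004355, "return_format": "csv"}

response = requests.get(url, params=params)
response.raise_for_status()  # raise an error if the query failed

# save the CSV response to disk
with open("forecast_stats_9004355.csv", "w") as f:
    f.write(response.text)
```

This is the same URL that the interactive documentation builds for you in step 7.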
Forecasted Datasets
Each day, a new 15-day weather prediction is made by ECMWF. The weather forecast is composed of 52 ensemble members. From that weather prediction, a surface runoff estimation is made using the precipitation forecast and a land surface model, HTESSEL. Each of the 52 ensemble members is used to drive the GEOGloWS ECMWF hydrologic model producing 52 streamflow predictions called ensembles. The results of these streamflow predictions are available through the following methods.
ForecastStats: Summarizes the 52 ensembles across each time step by reporting the minimum flow, 25th percentile flow, average flow, 75th percentile flow, and maximum flow. Returns a time series of values for each of the 5 statistical values.
ForecastEnsembles: Returns a time series of flows for each of the 52 ensemble members.
ForecastWarnings: Returns a CSV that summarizes when streams are expected to reach 2-, 5-, 10-, 25-, 50-, and 100-year return period level flows.
ForecastRecords: Each day, the average of the predicted flows from 52 forecast ensemble members is recorded and can be retrieved to see a longer running record of streamflow predictions.
Historically Simulated Datasets
ECMWF provides the ERA5 historically simulated runoff dataset. This dataset is also used to drive the GEOGloWS ECMWF model and produce a historical streamflow simulation. The simulation covers January 1, 1979 to the present, with a lag of only a few months. The historical streamflow and products derived from it are available through the following methods:
HistoricSimulation: Returns a time series of daily average streamflow from 1979 through the near present.
DailyAverages: Returns a time series 366 steps long representing the average flow for each day of the year including leap day. This is roughly equivalent to what an average year of streamflow looks like at the reach of interest.
MonthlyAverages: Returns a time series of 12 steps representing the average flow for each of the 12 months of the year based on the historical simulation. Most useful in comparative analyses and validation metrics.
ReturnPeriods: Based on the historical simulation and the Gumbel distribution, returns an estimate of the 2-, 5-, 10-, 25-, 50-, and 100-year return period flows for the stream reach.
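To illustrate how return periods can be derived from a historical simulation, the sketch below fits a Gumbel distribution to annual maximum flows using the method of moments. This is a simplified illustration of the concept; the service's exact fitting procedure may differ.

```python
import numpy as np
import pandas as pd

def gumbel_return_periods(daily_flow: pd.Series, periods=(2, 5, 10, 25, 50, 100)):
    """Estimate return period flows from a daily flow series with a datetime index."""
    annual_max = daily_flow.groupby(daily_flow.index.year).max()
    mean, std = annual_max.mean(), annual_max.std()
    estimates = {}
    for t in periods:
        # Gumbel frequency factor for return period t (method of moments)
        k = -(np.sqrt(6) / np.pi) * (0.5772 + np.log(np.log(t / (t - 1))))
        estimates[t] = mean + k * std
    return estimates
```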
Mapping Service (Esri Living Atlas)
The GEOGloWS ECMWF Streamflow Model is easily investigated using a mapping service developed with collaborators at Esri.
GEOGloWS Hydroviewer (Web App)
This tutorial will show how to use the GEOGloWS ECMWF Streamflow Explorer App. Features include a forecast hydrograph for each stream, historically simulated streamflows, and the ability to download time series.
To open the app, please click here: http://apps.geoglows.org/apps/geoglows-hydroviewer/
View the Animated Forecast
1. Zoom in on an area that looks interesting and explore the buttons on the left-hand side of the Hydroviewer.
2. While examining the area, press the triangle "play" button on the left-hand side and notice how the thickness (discharge magnitude) and color (extreme event) of the streams change. After examining for a few minutes, push the square "stop" button.
3. Next, use the two arrow buttons to step through the forecast and observe the changes.

Locate a Stream by its ID
You can zoom in and select any stream you want (feel free to explore), but to match other examples later, follow these steps to locate a specific reach ID in Colombia.
1. On the left panel, under the animation control options, enter 9004355 in the box for "Search for a Reach ID".
2. Then click the "Find a Reach ID" button.
3. Now click on the stream nearest the pin (you may have to zoom in for better accuracy).
The current 15-day ensemble forecast is displayed in the plot window for the selected stream segment.

Visualize and Obtain Data
Choose a stream and click on it in order to pull up the data. On the top bar, there are five tabs that allow you to examine the forecast and simulated historical data for the selected stream.
Note
The Average Flows and Flow-Duration tabs will not be visible until you get the historical data from the second tab. This will be explained below.

Forecasts
The forecast (as shown above) comes from 51 different simulations. The graph includes the average, the 25th to 75th percentile flows, the maximum and minimum flows, and a single higher-resolution forecast (black line - HRES).
The legend can be seen on the right, and the different layers can be turned on and off by double clicking on them in the legend. Experiment with turning on/off the display of each layer.
The actual streamflow value for each time period can be displayed by hovering the cursor over the graph.
The forecast also includes the return periods which are toggled on by default when the forecast exceeds a threshold (as seen below) but are off by default when they do not (shown in the figure above). The return period threshold values are displayed when hovering over them on the right edge of the graph.

Historical
This is a graph of the 40-year simulated historical flow.
The different colors in the graph represent the different return periods which are computed from the 40-year historical simulation and Gumbel Method.
A table displaying the threshold values is included below the graph.

Daily/Monthly Average
Daily and Monthly Average Streamflow are calculated from the historical simulation.
These tabs will pop up on the top after you click “Get Historical Data” on the Historical tab.

Flow-Duration
This plot shows the probability that the streamflow will be greater than any given value.

geoglows Python package
The same functions used to access forecasts through the REST API are available in a Python package called geoglows, documented at this site: https://geoglows.readthedocs.io/en/latest/. In addition to the functions that access the streamflow services, the package includes functions for manipulating and plotting the data. The following Google Colab Notebook provides example code for implementing this package.
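A minimal sketch of the package in use is shown below. The function names follow the geoglows package documentation (geoglows.streamflow for data access, geoglows.plots for plotting); check the documentation linked above for the current API before relying on them.

```python
import geoglows

reach_id = 9004355  # the Colombian reach used in other examples in these materials

# retrieve data from the streamflow services as pandas DataFrames
stats = geoglows.streamflow.forecast_stats(reach_id)
hist = geoglows.streamflow.historic_simulation(reach_id)

# build an interactive plot of the forecast statistics
figure = geoglows.plots.forecast_stats(stats)
figure.show()
```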
Historical Validation
Introduction
These historical validation workshops will showcase some of the validation work we have done in different pilot regions. More importantly, these workshops will guide you on how you can use your own local observations to evaluate the performance of the model for your rivers.
Historical Validation Studies and Methods Presentation
How to Perform Historical Validation Using Your own Observation Data (Google Colab)
The ERA5 reanalysis precipitation data, bias corrected with the Global Precipitation Climatology Project (GPCP), are converted into runoff using the HTESSEL model. The runoff is then resampled using an area-weighted grid-to-vector downscaling. The GEOGloWS ECMWF Streamflow Services (GESS) computes a cumulative runoff volume at each time step as an incremental contribution for each sub-basin. GESS then uses the Routing Application for Parallel computation of Discharge (RAPID) model to route these inputs through the stream network (Qiao et al., 2019; Snow et al., 2016). GESS uses the historical simulation to define return periods and uses these return periods as thresholds for flood alerts (Sanchez Lozano et al., 2021).
Decision-makers are worried about the accuracy and uncertainty of hydrologic model outcomes. Results do not need to be perfect, but they need to be reliable and accurate enough to give decision-makers the confidence to use them. The model accuracy is typically evaluated by comparing simulation results to observed data.
Hydrostats is an open-source software package designed to support hydrologic model evaluation. It supports both visual analysis and error metric calculation, and contains tools for preprocessing data, visualizing data, calculating error metrics on observed and predicted time series, and validating forecasts. The visual analysis options include hydrographs, daily and monthly seasonality plots, scatter plots, histogram comparisons, and quantile-quantile plots. For error metric calculation, it contains over 70 error metrics, with many metrics specific to the field of hydrology (Roberts et al., 2018).


In this tutorial, we will show how to validate the GEOGloWS ECMWF Streamflow Service historical simulation using the HydroStats Tethys app.
Obtain Data
To perform the historical validation, you first need to identify the stream of interest. You will need historical observed data and simulated streamflow data for this stream. For this tutorial, we will provide demo data for the Reach ID 9004355.
1. Use this link to download the demo historical observed data: https://www.hydroshare.org/resource/d222676fbd984a81911761ca1ba936bf/data/contents/Discharge_Data/23187280.csv
If you are performing the validation using your own observed data, your data must have two columns with column headers in the first row. The first column should be titled 'datetime' and contain dates in a standard format. The other may have any title but must contain streamflow values in cubic meters per second (m^3/s). The observed data csv should look like this:

2. To get the historical simulation data, go to this URL, which will access the API and download the historical simulation: https://geoglows.ecmwf.int/api/HistoricSimulation/?reach_id=9004355&return_format=csv
This may take a few minutes. If you are performing the validation for a different Reach ID, you may edit the Reach ID in the URL above, or use the GEOGloWS website to access the API. To use the interactive website, go to this link: https://geoglows.ecmwf.int/documentation and click Get Historic Simulation. Click "Try it out," enter the Reach ID, and click "Execute." This will then give you the option to download the historical simulation. The simulated data csv should look like this:
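If you prefer to retrieve both files with a script, this sketch reads the demo observed data and the simulated data directly into pandas, assuming the two-column layout described above:

```python
import pandas as pd

obs_url = ("https://www.hydroshare.org/resource/d222676fbd984a81911761ca1ba936bf/"
           "data/contents/Discharge_Data/23187280.csv")
sim_url = ("https://geoglows.ecmwf.int/api/HistoricSimulation/"
           "?reach_id=9004355&return_format=csv")

# the first column (datetime) becomes the index of each DataFrame
observed = pd.read_csv(obs_url, index_col=0, parse_dates=True)
simulated = pd.read_csv(sim_url, index_col=0, parse_dates=True)

print(observed.head())
print(simulated.head())
```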

Open the Statistics Calculator App
Once we have the historic simulated data and the historic observed data, we can run the historical validation.
Open the Hydrostats App


Preprocessing
First, we will plot the Historical Simulation data.
Click on “Process a Time Series” on the left menu.
Upload the historical simulation csv.
Click “Plot and Analyze Raw Data”
Notice that the historical simulation has no gaps and an even time-step.
Next, we will plot the Observed Data.
Refresh the page “Process a Time Series Dataset”
Upload the observed data file.
Note
If there are timesteps with empty values, this part will not work. You will need to remove the empty timesteps. The csv provided has empty values; you may skip this step if you don’t need to analyze the observed timeseries.
Click “Plot and Analyze Raw Data”
Notice that this timeseries has gaps. A summary is given showing the length and number of the gaps.
If desired, you can interpolate the missing data. For this example, we won’t interpolate.
Click on “Merge Two Time Series” on the left menu.
Upload the historic observed data and the historical simulated data downloaded for this tutorial.
Click on “Plot Merged Data” to see the plot for observed and simulated data.
Note
Notice that the merged data only covers the time-steps that contain both the simulated and the observed data.
Click on Download Merged Data to save a csv file with the merged data.
The critical thing for validating two datasets is to have a single .csv with both simulated and observed data merged. There should be a one-to-one relationship so that every time step has a value for both observed and simulated in order for the metrics to work correctly. There are some options to do this in the HydroStats App, but you may have to do some of this work on your own. Once you have a merged data .csv file, you can perform the validation with metrics from HydroStats.
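Outside the app, the same one-to-one merge can be produced with pandas. This sketch assumes both CSVs use the layout described earlier (a datetime index plus one flow column); the filenames are hypothetical.

```python
import pandas as pd

observed = pd.read_csv("observed.csv", index_col=0, parse_dates=True)
simulated = pd.read_csv("simulated.csv", index_col=0, parse_dates=True)

# keep only the time steps present in both series, then drop any empty rows
merged = observed.join(simulated, how="inner",
                       lsuffix="_observed", rsuffix="_simulated").dropna()
merged.to_csv("merged.csv")
```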
Visualization
1. Click on "Validate Historical Data" on the left menu. This tab allows us to validate the historical simulation.
2. Upload the Merged File that you downloaded in the previous step.
3. Click on each of the following:
- Create Hydrograph
- Create Hydrograph of Daily Averages
- Create Scatter Plot
- Create Scatter Plot with Log-Log Scale
Analysis
Scroll down a little more on the "Validate Historical Data" page. You will see a "Table" section; right below it, you can select the metrics of interest for comparing the streamflow prediction tool's historical simulation with the observed data.
In this case we are going to select:
Mean Absolute Error, Root Mean Square Error, Nash-Sutcliffe Efficiency, Kling-Gupta Efficiency (2012).
Note
Leave all of the Kling-Gupta Efficiency (2012) parameters at the default setting.
Finally, click on “Make Table” to see the report.
Make a new table, with metrics of your choice.
See this full list of metrics.
If you click on "Compare Volume," you can compare the simulated hydrograph and the observed hydrograph volumes to get a rough estimate of water balance.
Forecast Skill Evaluation
Introduction
The GEOGloWS ECMWF Streamflow Services (GESS) is a Global Streamflow Prediction Service that uses the ECMWF ensemble meteorological forecasting system, which consists of:
1 high-resolution member with spatial resolution of 9 km for 10 days. The temporal resolution for this high-resolution member is 1 hour for the first 4 days (days 1 – 4), 3 hours for the next 2 days (days 5-6), and 6 hours for the final 4 days (days 7-10).
51 members with a spatial resolution of 18 km for the first 10 days (days 1 – 10), and 36 km for the next 5 days (days 11 – 15). The temporal resolution for these members is 3 hours for the first 6 days (days 1 – 6) and 6 hours for the next 9 days (days 7 – 15).
This ensemble is converted to surface runoff using the Hydrology Tiled ECMWF Scheme for Surface Exchanges over Land (HTESSEL) model. After that, an area-weighted grid-to-vector downscaling is performed for the runoff. GESS computes this cumulative runoff volume at each time step as an incremental contribution for each sub-basin. The Routing Application for Parallel computation of Discharge (RAPID) model is then used to route these inputs through the stream network (Qiao et al., 2019; Snow et al., 2016).
By comparing observed data with simulated forecasts for any flood event where the GESS model is available, it is possible to determine how accurately the forecast predicted a high flow event and how many days in advance the model captured the event. Previous results have shown that the GESS forecast can capture high flood events at least 5 days in advance. Unfortunately, the model cannot capture flow regulation by hydraulic structures, or high floods caused by local, high-intensity precipitation events.

If you can get observed streamflow within a relatively short period of time after the flood event, you can follow the methods and workflow for evaluating forecast skill from this presentation and workshop, which include some Python workflows.
Validating Flood Events from GEOGloWS ECMWF Streamflow Services
This part of the workshop will show you how to validate the GEOGloWS ECMWF Streamflow Services short-term forecast in any flood event in any place in the world. You will compare observed data with simulated forecasts for a flood event to see how well the forecast performs.
Example data is provided to complete this workshop. If you would like to perform the validation for a different flood event, you will need:
The set of reach_ids which you want to use for the analysis.
The observed streamflow during the flood event corresponding to the set of reach_ids.
The record of forecasts starting at least 15 days before the flood event (the length of one forecast) to at least a couple of days after the event was finished.
Optional: The historical observed streamflow corresponding to the set of reach_ids.
Step 1: Get Forecast Record
The first step is getting the record of forecasts after a flood event has occurred.
This Google Colab notebook will allow you to download the forecast data: https://colab.research.google.com/drive/1y2eVRJpfcdISB25U0lCBZ7z6up14wswg
After running this notebook, you will find a folder in your Google Drive called ‘Forecast_Validation’ containing a folder for each of the reach_ids that you used. Within these folders is a file with the forecast for each day that a forecast was available.


For the example followed in this tutorial, you will need to use the forecast data available here:
Step 2: Compare Observed Data with Forecasts
The second step in validating the performance of the GESS forecast during flood events is comparing the observed values with the original forecast datasets. First, you will need to save the observed streamflow data for the high flow event of interest to your Google Drive in the correct format. All of the observed data should be in the folder on your Google Drive called ‘Forecast_Validation/Country/’ (for this example the country is Honduras, so it would be ‘Forecast_Validation/Honduras/’). The observed data file for each station should be named in this format: ‘{station name}_RT_Q_orig.csv’
The observed data files should be in the same format as the example file below. The left column should contain the datetime (format: yyyy-mm-dd hh:mm:ss) with the column header “Datetime.” The right column should contain streamflow with units of m3/s and the column header “Streamflow (m3/s).”
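If your raw export uses different column names, a short pandas script can rewrite it into the expected format; the input filename here is hypothetical.

```python
import pandas as pd

# raw station export with arbitrary column names in the first row (hypothetical file)
raw = pd.read_csv("HDRPV-Jicaro_raw.csv", parse_dates=[0])

# rename to the headers the validation notebooks expect and save to the required folder
raw.columns = ["Datetime", "Streamflow (m3/s)"]
raw.to_csv("Forecast_Validation/Honduras/HDRPV-Jicaro_RT_Q_orig.csv", index=False)
```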
In this example we are following the GEOGloWS ECMWF Streamflow Services (GESS) Forecast Validation for the Eta and Iota Hurricanes in the stations HDRPV-Jicaro (951603) and HDRPV-Maragua (951795) in Honduras. The data needed for the example is available here:
The following Colab notebook will help you plot the original forecast issued each day together with the observed data for the same dates. The Colab notebook is available here: https://colab.research.google.com/drive/1VMs50wKE55TBn8tWTimc69s1rNaom8SI
Step 3: Reorganizing the Forecast Data
The third step is reorganizing the forecast as a function of days-in-advance. This will help us understand how far in advance the GEOGloWS ECMWF Streamflow Services forecasts accurately predict flow. Tables 1 and 2 illustrate how the data will be reorganized. After reorganizing the data, we can create visuals and compute metrics that show how accurate the forecasts are 1 day in advance, 2 days in advance, 3 days in advance, etc.
Table 1. Original forecast schema.

Table 2. Reorganized forecast data schema

You can do this by following this Google Colab notebook: https://colab.research.google.com/drive/1CDcKFNHyuZ2ropLVZBl8tU2GZCBB7WLA After running this Colab notebook, there will be a new folder inside the folder for each reach_id with the reorganized data.
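For readers who want to see the idea outside the notebook, a rough pandas sketch of the reorganization follows. The file layout and the 'flow' column name are assumptions; the Colab notebook above is the authoritative workflow.

```python
import pandas as pd
from pathlib import Path

frames = []
# assumed layout: one CSV per forecast issue date, named YYYY-MM-DD.csv
for csv in sorted(Path("Forecast_Validation/9004355").glob("*.csv")):
    issued = pd.to_datetime(csv.stem)
    fc = pd.read_csv(csv, index_col=0, parse_dates=True)
    fc["days_in_advance"] = (fc.index.normalize() - issued).days
    frames.append(fc)

forecasts = pd.concat(frames)
# rows become target dates, columns become days in advance
by_lead = forecasts.pivot_table(index=forecasts.index.normalize(),
                                columns="days_in_advance",
                                values="flow")
```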


Step 4: Comparing Days-in-Advance Forecasts with Observed Data
The fourth step in validating flood events is to do a visual analysis comparing each days-in-advance forecast with the observed data. The idea is to understand how many days in advance the GESS forecast was able to warn of the high flow event. This step requires the reorganized data from the previous step. This analysis can be done by following the Google Colab notebook below, which evaluates 1-day to 15-day forecasts: https://colab.research.google.com/drive/1bRpO-cf3EOoSs_4oB0rZvQk8Jc6Avxo8
Bias Correction
Introduction
Global model results often show biases at local scales or specific locations. While the timing and other general parameters of a flood event may be correct, the magnitude of the event may be consistently higher or lower than the actual flows. This bias prevents use of the predictions at a local scale because it can significantly affect the accuracy of a simulated flood event and, if incorrect, can cause decision-makers to lose confidence in models.
A bias correction method derived from the one presented by Farmer et al. (2018), which is based on the flow duration curve, has been proposed to correct the bias in the GEOGloWS ECMWF simulated streamflow. First, a flow duration curve is calculated from the historical simulated time series and the observed streamflow time series for each month. A flow duration curve shows the cumulative percent of time that any given discharge was exceeded during a given period. In the graphic below, the flow duration curves are shown on the right. Yellow is the original simulated data, blue is observed, and red is the simulated data after bias correction.
Using the flow duration curve, we can estimate the non-exceedance probability of every simulated value for each month. This is shown in the graphic as the top horizontal line, connecting the simulated data to the simulated flow duration curve. The vertical line shows that same non-exceedance probability on the observed flow duration curve. We can then estimate the observed streamflow value that corresponds to that non-exceedance probability. Finally, we correct the simulated value by replacing it with the observed streamflow value corresponding to the same non-exceedance probability, shown by the bottom horizontal line.
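A rough sketch of this month-by-month quantile mapping is shown below. It is a simplified illustration of the flow duration curve idea, not the exact implementation used by the service; the input filenames are hypothetical.

```python
import numpy as np
import pandas as pd

def correct_month(sim: pd.Series, obs: pd.Series) -> pd.Series:
    """Map one month of simulated flow onto the observed flow duration curve."""
    # empirical non-exceedance probability of each simulated value
    prob = sim.rank(method="average") / (len(sim) + 1)
    # observed flow at the same non-exceedance probability
    return pd.Series(np.quantile(obs.dropna().to_numpy(), prob.to_numpy()),
                     index=sim.index)

# daily series with datetime indexes covering the same period (hypothetical files)
sim = pd.read_csv("simulated.csv", index_col=0, parse_dates=True).iloc[:, 0]
obs = pd.read_csv("observed.csv", index_col=0, parse_dates=True).iloc[:, 0]

# apply the correction month by month, as the method builds one curve per month
corrected = pd.concat(
    [correct_month(sim[sim.index.month == m], obs[obs.index.month == m])
     for m in range(1, 13)]
).sort_index()
```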


You can use historical observed data for sites you are interested in to adjust any bias in the historical simulation and forecast at that point (we are working on extending bias correction to ungauged areas). The methods and results from some pilot studies are given in the presentation.
The workshop below will show you how the bias correction process can be done in python using the geoglows package.
Finally, this tutorial will show you how you can perform bias correction in the GEOGloWS Hydroviewer app. If you have observed streamflow data, you can add it to the bias correction tool through a csv file and view corrections for the simulated data within the app.
Tutorial
The GEOGloWS ECMWF Streamflow Services generates a historical simulation from 1979-present based on weather data records. However, the model often has bias, commonly overpredicting streamflow. When observed streamflow data is available on a stream, we can use that data to improve the historical simulation and forecast model. This process is called bias correction.
This tutorial will show you how to perform a bias correction on streamflow data using the Global Hydroviewer Web App.
Obtain Data
1. We will need observed streamflow data. You may use your own observed data if you wish, or the demo data available here: https://www.hydroshare.org/resource/d222676fbd984a81911761ca1ba936bf/data/contents/Discharge_Data/23187280.csv.
2. If you are using your own observed data, it should be saved as a .csv file that has 2 columns and both should have column labels in the first row. The first column should be titled ‘datetime’ and contain dates in a standard format. The second column may have any title but must contain streamflow values in cubic meters per second (m^3/s).

Inputting Data
1. Go to the GEOGloWS ECMWF Streamflow Hydroviewer: http://apps.geoglows.org/apps/.
2. After opening the Hydroviewer app, find the river segment you would like to do the bias correction on. You can do this either by searching for a Reach ID or latitude/longitude coordinates using the fields on the left or by zooming to the river. If you are using the demo data, use the Reach ID 9004355.

3. Once you have found the river, click on it to pull up the forecast. This may take a few minutes to load. Then go to the Bias Correction tab at the top of the window.

4. Now you can upload your observed data csv file by clicking on the blue "Upload New Observation" button and selecting the data you want to upload. Once you have a file uploaded, click "Start Bias Correction."

5. Running the bias correction generates a plot of cumulative volume and a scatter plot to show how the bias correction improved the Historical Simulation. You can turn the different lines and datasets on and off by clicking their label in the legend. A table of error metrics is also generated. Each error metric describes a different aspect of how correlated the datasets are; you can read more about the error metrics here: https://hydroerr.readthedocs.io/en/stable/list_of_metrics.html



6. After running the bias correction, you can also go to the Historical tab, where a plot of the original simulated data, observed data, and corrected simulated data is generated.

7. Finally, you can go to the Forecasts tab, where a plot of the bias corrected forecast is generated.

WHOS Water Data Explorer
In recent years, there has been a growing need for standardized ways of sharing water data on the web. In response, the Consortium of Universities for the Advancement of Hydrologic Science (CUAHSI) has developed the Hydrologic Information System (HIS) along with the standardized WaterOneFlow web services and WaterML data exchange format. Tools to access data shared through WaterOneFlow services and WaterML already exist, such as the Microsoft Windows HydroDesktop software, the WaterML R package, and the web-based CUAHSI HydroClient, which serves as an access point to the CUAHSI HIS database.
The Tethys Water Data Explorer (WDE) is a newly developed web-based tool that allows a broad range of users to discover, access, visualize, and download data from any information system that makes water data available in WaterML format through WaterOneFlow services. WDE is designed in a way that allows users to customize it for local or regional web portals.
WHOS Water Data Explorer Presentation
Functionalities Demonstration
This demo will explain how to use the WDE functionalities as a regular user, without admin permissions. We will be using the “Test” Catalog, with views “Los Nevados” and “ParaN.” These views should be preloaded in the tethys staging portal.
When it comes to sorting through data, you can use the filter icon; this will open the Filter of Views menu.

The Filter of Views menu allows users to filter by country and/or variable.

Select Puerto Rico from the Filter by Country tab and then click the green Filter Views button. This will limit the list to views in Puerto Rico only; you can reset the filters by hitting the Reset button.
You will notice that next to the different views is a checkbox and four icons. These icons all have different functions.

The checkbox selects the views being displayed on the map. Click the checkbox next to the ParaN view and notice that some stations appear on the map in Puerto Rico.
The refresh icon refreshes the view it corresponds with; refreshing will add or remove points that have been added to or removed from the data since the last refresh. Go ahead and hit this icon; you should notice the WDE loading and then displaying a message indicating how many sites were changed.
Click the zoom icon next to the ParaN view and notice that the map zooms to where the stations are; this button zooms up close to whichever view it corresponds with.
The variables icon displays a list of the available variables of the view. When you click on this icon next to the ParaN view, it should display three variables: Discharge, Velocity, and Average Water Depth.

The info icon displays the information about a view, including the description, endpoint, list of stations, and the available analysis tools.
Select a station from the ParaN view; this will bring up a data tab below the map containing information about the station, including the variables included in the data.

Hit the green graphing button and notice that the tab transitions to an empty graph. The Water Data Explorer will plot the time series for you. Click the blue Select Variable dropdown menu and select one of the variables. Below the Select Variable menu are plotting options; you can choose between "Scatter" and "Whisker and Box" plots.

Once you choose a variable and plot type, hit the green Plot Time Series button and the WDE will then plot the time series in the graph below.

Below the Plot Time Series button is a download menu that allows users to download the data to their local computer. When you click on it, you will notice the different downloadable file types. For this time-series dataset you can only download CSV, WaterML 1.2, and WaterML 2.0. NetCDF is not available for any of the CUAHSI HIS Central datasets.
Met Data Explorer (MDE)
Introduction
In recent years, there has been a growing recognition of the need for standardized ways of sharing meteorological gridded data on the web. In response to this need, Unidata, a division of the University Corporation for Atmospheric Research (UCAR) developed the THREDDS Data Server (TDS). TDS is a web server that provides metadata and data access for scientific datasets, using OPeNDAP, OGC WMS and WCS, HTTP, and other remote data access protocols.
To extract data from the TDS, many tools have been developed, such as the grids Python package. The grids package allows for extracting time series subsets from n-dimensional arrays in NetCDF, GRIB, HDF, and GeoTIFF formats. The Met Data Explorer (MDE) is a newly developed, web-based tool allowing a broad range of users to discover, access, visualize, and download (using the grids package) data from any TDS that stores meteorological data. MDE was also designed in a way that allows users to customize it for local or regional web portals.

MDE is an open-source web application providing users with data discovery, data access, and data visualization functionality for meteorological gridded data. It can be installed by any organization and requires minimal server space. Utilizing a TDS to serve the data, the application allows you to organize and save data files with the specific variables and dimensions that you need, visualize the data in a Leaflet-based map viewer, animate the data across a time series, and extract a time series over a specified area. The area for extraction can be specified as a marker, bounding box, or polygon. To extract a time series, the application utilizes the grids Python package to remotely access and extract the data over the given area. The time series extraction feature in the MDE will work on most data, provided that the data conform to OGC standards and the TDS is properly configured.
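For users working outside the app, a point time series can be extracted from a THREDDS Data Server in a few lines with xarray over OPeNDAP. The catalog URL, variable name, and coordinates below are placeholders.

```python
import xarray as xr

# hypothetical OPeNDAP endpoint on a THREDDS Data Server
url = "https://thredds.example.org/thredds/dodsC/forecasts/precipitation.nc"

ds = xr.open_dataset(url)  # requires the netCDF4 or pydap backend
series = (ds["precipitation"]
          .sel(lat=4.6, lon=-74.1, method="nearest")  # nearest grid cell to the point
          .to_series())
series.to_csv("precipitation_point.csv")
```

The MDE itself uses the grids package for this step, which also supports extraction over bounding boxes, polygons, and shapefile masks.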
Functionalities Demonstration
This tutorial is divided into three parts. The first part of this tutorial introduces the user interface. The second part explains how to add and organize data in the MDE. Adding data to the MDE requires admin permissions. The third part of this tutorial explains how to visualize the data in the map interface, define an area over which to extract the data, plot the extracted time-series, and download the data in a variety of formats. All the steps explained in the final part of this tutorial can be completed as a regular user, without admin permissions. If you do not have admin permissions to complete the second part, the preloaded data on the tethys staging portal can be used to complete the final part of this tutorial.
The Interface
The interface for the MDE is divided into three main sections: the Map Window, the side Navigation Panel, and the Graph Window (see Figure 1). The Map Window is used for visualizing and animating the data and for defining the area over which to extract the data. The side navigation panel lists the files that have been loaded into the MDE. The Graph Window contains all other data-user interactions — including plotting the data, specifying a variable, and viewing file and variable metadata. The Graph/Map Slider can be used to show or hide the map and graph windows. The Navigation Panel Toggle can be used to show or hide the navigation panel.

Adding Data to the Met Data Explorer
To add data to the MDE, make sure you are logged in and that your account has the necessary permissions.
Groups, or catalogs, are created to organize the files. Select the Add Group Button (see Figure 2).
The Add Catalog of Thredds Servers dialog will appear (shown in Figure 3).


Give the group a name and a description (see Figure 3) and click the Create Group button.
The dialog will close and the group will be added to the navigation panel (see Figure 4).

To add a file to a group, select the Add File Button located on the header of the created group (see Figure 5).
The Add a Thredds Server File dialog will appear (shown in Figure 6).


Enter a name and a description for the file. If the file requires user credentials (i.e., username and password) to access, skip down and complete the section labeled Enter User Credentials for File, then return and continue from this point. Enter a URL for the THREDDS Catalog where the file is accessible. Click the Access Catalog Button to connect to the THREDDS Catalog.
A separate dialog will appear listing the files and folders contained in the catalog at the specified URL (see Figure 7). Select a file or folder. If a folder is selected, the contents of the folder will be displayed in the dialog. If a file is selected, the variables, dimensions, and metadata for the file will be retrieved and loaded into the Add a Thredds Server File dialog (see Figure 8).


All the variables with two or more dimensions will be listed. Select the variables that you want included in the app and click the Add Thredds File button. The file will be added to the navigation panel under the group to which it was assigned (see Figure 9).

Enter User Credentials for File
Many datasets require a username and password to access the THREDDS Server. This feature was specifically added to the app to allow access to data stored on the GES DISC data portal, but it should be compatible with any server requiring authentication. While the Add a Thredds Server File dialog is open, click the Link Authentication Credentials button. The Authentication dialog will appear (see Figure 10). If authentication has already been added to the app, click the radio button next to the authentication you want to be associated with the app. To add authentication, fill in the blanks in the Machine, User, and Password columns and press the add button. Click the radio button next to the newly added authentication and click save.

Data Discovery
To visualize the data on the map, select a file from the Navigation Panel (see Figure 11). The file will appear on the map and the Graph Window will open (see Figure 12).


The first variable listed in the file will be selected by default. The selected variable can be changed using the Variable dropdown. The dimensions associated with the variable will be listed along with the range of values spanned by each dimension. If the dimension is not a temporospatial dimension, the value associated with the dimension can be specified using the appropriate dropdown.
How the data are displayed on the map can be modified by changing the display settings located at the bottom of the Graph Window. Set Data Bounds specifies the data values over which the color range on the map spans. The color style can be specified using the Set Color Style dropdown. The opacity of the data on the map can be set using the Set Layer Opacity slider. Once the display settings are set to your liking, click the Update Display Settings button.

Data can be extracted at a point or over a user-defined polygon. To extract the data at a point, create a point on the map using the Create Marker tool located on the drawing menu in the map window. The Create Rectangle or Create Polygon tools can be used to define a polygon over which to extract the data. To use a shapefile to define a polygon, change the Mask Data With dropdown to Use A Shapefile. The Select a Shapefile dialog will open (shown in Figure 14). If the shapefile has previously been uploaded to the map, check the radio button next to the desired shapefile and click the Use Shapefile button. To upload a new shapefile, click the Upload Shapefile button, follow the prompts to upload the file, click the radio button next to the uploaded file, and click the Use Shapefile button.

Once a location over which to extract the data has been specified, click the Plot Time Series button to extract and graph the data. It may take several minutes to retrieve the data, depending on the current network speeds.
The time series will be plotted in the graph window (see Figure 16).

The time series can be downloaded as a CSV or JSON file. Open the Download Data dropdown and select the desired format. An HTML file can also be downloaded which contains a web map showing the same data that is displayed in the map window. The last download option is a Python notebook with code to extract the time series for the file and variable currently selected in the MDE.

There are two more tabs in the graph window, which can be examined by clicking the Move Right arrow located to the right of the graph window. The first tab shows the metadata contained in the file (see Figure 18). The second tab shows all the variables in the file with the associated dimensions (see Figure 19). The metadata for each variable can be seen by clicking the Metadata Info button. A dialog will open showing the variable metadata (see Figure 20).



NetCDF Formatting Requirements
NetCDF files are one of the most popular formats for storing and distributing meteorological or earth observational data. They have several advantages over other common file formats. The netCDF format is notable for its ease of use, portability, simple data model, and strong user support. The netCDF format is made to be highly flexible, allowing users to define and organize the data as they see best while still allowing the data to be shared across machines and be self-describing, i.e. the data is human readable without reference to an external source.
Within the Met Data Explorer, the data displayed and analyzed in the app are retrieved from netCDF files that are read from a THREDDS Data Server. To be compatible with the THREDDS Data Server and the services it provides that the Met Data Explorer uses, the netCDF files on the THREDDS Data Server must be CF compliant (the Climate and Forecast (CF) conventions are recommendations and standards for netCDF files) and must adhere to several additional guidelines.
This document outlines the CF conventions and additional guidelines to make netCDF files compatible with the Met Data Explorer.

Coordinate Variables
Every dimension in the netCDF file that contains values must have a corresponding variable with the exact same name as the dimension to which it corresponds. If there is a dimension named x then there must be a variable named x; if there is a dimension named time then there must be a variable named time, etc. The dimension defines the shape (number of values) and the variable lists the values, attributes, and other information for the dimension. The arrays contained within the dimensional variable should be one-dimensional and monotonically increasing or decreasing. Each dimensional variable should contain the following attributes:
- long_name - a descriptive name for the dimension that is human readable.
- standard_name - a standardized name for the dimension (e.g., if using EPSG:4326 the standard_name should be longitude for the x dimension and latitude for the y dimension).
- units - the units used for the dimension (if latitude and longitude are used, the units should be degrees_north and degrees_east respectively).
- calendar - the calendar on which the time dimension is based (only needed for the time dimension).

The four spatiotemporal dimensions time, latitude, longitude, and height should all contain the axis attribute with the identifying values T, Y, X, and Z respectively.

Data Variables
All data variables should have the following attributes:
- long_name - a descriptive name for the variable that is human readable.
- standard_name - a standardized name for the variable, taken from the CF Conventions Standard Name Table.
- units - the units used for the variable, equivalent to the canonical units in the standard name table.
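The sketch below uses the netCDF4 Python package to build a minimal file that follows the coordinate and data variable guidelines above; the grid size, time steps, and the precipitation variable are illustrative.

```python
from netCDF4 import Dataset
import numpy as np

nc = Dataset("example.nc", "w")

# dimensions and matching coordinate variables with the same names
nc.createDimension("time", None)
nc.createDimension("lat", 180)
nc.createDimension("lon", 360)

time = nc.createVariable("time", "f8", ("time",))
time.long_name = "time"
time.standard_name = "time"
time.units = "hours since 2000-01-01 00:00:00"
time.calendar = "standard"
time.axis = "T"
time[:] = np.arange(24)

lat = nc.createVariable("lat", "f4", ("lat",))
lat.long_name = "latitude"
lat.standard_name = "latitude"
lat.units = "degrees_north"
lat.axis = "Y"
lat[:] = np.linspace(-89.5, 89.5, 180)  # monotonic, spanning -90 to 90

lon = nc.createVariable("lon", "f4", ("lon",))
lon.long_name = "longitude"
lon.standard_name = "longitude"
lon.units = "degrees_east"
lon.axis = "X"
lon[:] = np.linspace(-179.5, 179.5, 360)  # -180 to 180, not 0 to 360

# a data variable with CF attributes; standard_name and canonical
# units come from the CF Standard Name Table
precip = nc.createVariable("precipitation_amount", "f4", ("time", "lat", "lon"))
precip.long_name = "precipitation amount"
precip.standard_name = "precipitation_amount"
precip.units = "kg m-2"
precip[:] = np.zeros((24, 180, 360), dtype="f4")  # placeholder values

nc.close()
```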

Coordinate Reference Systems
All georeferenced data must be defined by a standard coordinate reference system (crs). If the data do not conform to a standard crs then they cannot be transformed to be used with shapefiles or other data. For data that use latitude and longitude, latitude values must span from -90° to 90° (not 0° to 180°) and longitude values must span from -180° to 180° (not 0° to 360°). The data may not be extracted correctly if values outside these ranges are used. If an alternate crs is used, the dataset must contain a grid mapping variable. The grid mapping variable must have the grid_mapping_name attribute with a value that specifies the crs used. Refer to the CF Conventions to determine the attributes that must be included in the grid mapping variable.

NCML Files
NetCDF Markup Language (ncml) is an XML file type specifically designed for modifying, reformatting, and aggregating netCDF files. The easiest way to reconfigure netCDF files is often to create an ncml file. Below are some useful elements for creating an ncml file.
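As an example, the minimal ncml file below renames a variable, adds an attribute, and aggregates a folder of files along the time dimension; the variable names and paths are illustrative.

```xml
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2">
  <!-- rename a variable and supply a CF-style units attribute -->
  <variable name="precipitation" orgName="PRECIP">
    <attribute name="units" value="kg m-2"/>
  </variable>
  <!-- join many files into one virtual dataset along the time dimension -->
  <aggregation dimName="time" type="joinExisting">
    <scan location="/data/precipitation/" suffix=".nc"/>
  </aggregation>
</netcdf>
```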

CFBUILD Python Package
The cfbuild Python package is a useful tool for updating existing netCDF datasets, or building new ones, to be compatible with the Met Data Explorer.
