How to find your rubber duck: Using machine learning to understand a changing sea
By Dr. Matias Duran Matute
This blog is part of our blog series: The Small-Scale Initiative on Machine Learning, how did it go?, where groups who were invited to participate in a project with eScience Center Research Software Engineers write about their projects and their experience. This week, the guest author is Dr. Matias Duran Matute from TU Eindhoven.
When we think of our lives, it might be obvious that every day is different. For other systems, like bodies of water, it might be more difficult to see how days can differ. Until not too long ago, researchers believed that movements of water and sediment in the Wadden Sea were dominated by the repetitive, cyclical movement of the tide, reaching peaks and troughs about every 12 hours. More recently, they realized that this could not be further from the truth: the wind, particularly during storms, plays a disproportionately large role in the movement of sea currents and the net transport of sediment.
Of all the types of forcing in the Wadden Sea (wind, tides, freshwater discharge, etc.), wind is actually the dominant one in determining net currents. The other key players vary as well: the amount of freshwater discharged by rivers, the atmospheric pressure, and even the tide, which doesn’t behave as predictably as we used to think. Indeed, tides are not even perfectly periodic.
On top of the fact that days vary, there are also big differences from year to year. One year can have more stormy days than another, and in some years easterly winds are more important than the usually dominant southwesterly winds. This variation in the winds results in overall differences in average transport between years. The variability at time scales from daily to yearly can have a large impact on the ecological functioning of the Wadden Sea, for example through differences in the transport of freshwater from the rivers or in the transport of larvae. The loss of several cargo containers by a ship in the North Sea (close to the Wadden Sea) on April 7, 2021, is another practical reason to highlight the importance of understanding how every day is different. Where will all the cargo end up?
The ENW/NWO project “LOCO-EX: The Dutch Wadden Sea as an event-driven system: long-term consequences for exchange” is studying how variability of the Dutch Wadden Sea is affected by forcing. This study is led by Dr. Theo Gerkema (NIOZ) and Dr. Matias Duran Matute (TU/e), in collaboration with Dr. Ulf Graewe (IOW, Germany) and with the participation of Jeancarlo Fajardo Urbina (PhD candidate at TU/e). It uses realistic numerical simulations of the currents, salinity, and temperature in the Dutch Wadden Sea over 36 years. In addition, these simulations have been coupled to simulations of the transport of passive particles in the water, producing 300 million particle trajectories over the 36 years.
How does machine learning come in?
From the conception of the LOCO-EX project, it was clear that machine learning could help answer some of the group’s questions. There were two main reasons why they saw potential for incorporating machine learning. First, the forcing mechanisms interact in a non-linear, complex way. Second, they had a lot of data: the state of the system during approximately 13 000 different days (or 26 000 tidal periods) in about 30 000 locations, and on top of that, 300 million particle trajectories.
To get a quick start, they applied for the eScience Center Small-Scale Initiative (SSI), and were awarded consultancy on their project titled “Machine Learning for the complex response of the Wadden Sea”. A first goal for the group was to determine what machine learning can actually do and which machine learning tools the group would need to make this happen. After all, machine learning is not magic. The group found great discussion partners in the eScience Center engineers.
They decided to focus on two main questions. First, can machine learning predict the daily averaged state of the Wadden Sea if the forcing is known? Second, can machine learning predict the trajectories of particles in the Wadden Sea?
The most suitable tool to answer these questions was the long short-term memory (LSTM) network, a type of artificial recurrent neural network (RNN). Using an LSTM is essential because the current state of the Wadden Sea depends not only on the current forcing (e.g., the wind) but also on the history of the system. The engineers at the eScience Center helped to set up the first models.
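To make the role of history concrete, here is a minimal sketch of how daily forcing could be arranged into sequences before being fed to a recurrent model such as an LSTM. It is an illustration only, not the project's code: the function name make_windows, the three-day history length, and the wind and transport numbers are all hypothetical.

```python
def make_windows(forcing, target, history):
    """Pair each day's target with the forcing of that day and the days before it.

    forcing, target: lists of daily values of equal length (hypothetical data).
    history: number of consecutive days of forcing in each input sequence.
    """
    if len(forcing) != len(target):
        raise ValueError("forcing and target must cover the same days")
    if history < 1:
        raise ValueError("history must be at least one day")

    inputs, outputs = [], []
    # The first usable day is the one with a full history window behind it.
    for day in range(history - 1, len(forcing)):
        inputs.append(forcing[day - history + 1 : day + 1])
        outputs.append(target[day])
    return inputs, outputs


# Made-up daily wind speeds (m/s) and a made-up daily averaged transport.
wind = [3.0, 5.0, 12.0, 20.0, 8.0, 4.0]
transport = [0.1, 0.2, 0.9, 2.5, 1.1, 0.3]

X, y = make_windows(wind, transport, history=3)
for sequence, value in zip(X, y):
    print(sequence, "->", value)
```

Each input is a short sequence rather than a single value, so a model trained on these pairs sees the storm that passed two days ago as well as today's wind.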
One additional complication: particle trajectories are chaotic
It is well known that particle trajectories in the ocean are chaotic. This means that two particles starting close together (either in time or space) will eventually have very different trajectories. In fact, one of the first observations of this phenomenon was due to cargo falling off a ship (just as happened in 2021 in the North Sea). In 1992, a container tumbled off a cargo ship into the North Pacific, dumping 28,000 rubber ducks and other bath toys. These rubber ducks ended up on beaches all around the world, for example in Hawaii, Alaska, Chile, and Ireland.
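The sensitivity to starting position can be seen in a toy example. The sketch below uses the logistic map, a standard textbook chaotic system, as a stand-in; it is not a model of the Wadden Sea or of any ocean current, and the starting values and the 0.1 threshold are arbitrary choices for illustration.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), starting from x0."""
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(r * x * (1.0 - x))
    return orbit


# Two "particles" whose starting points differ by one part in a billion.
duck_a = logistic_orbit(0.2, 60)
duck_b = logistic_orbit(0.2 + 1e-9, 60)

# Distance between the two orbits at every step.
separation = [abs(a - b) for a, b in zip(duck_a, duck_b)]

# First step at which the two orbits are clearly apart (arbitrary threshold).
first_large = next((n for n, s in enumerate(separation) if s > 0.1), None)

print("initial separation:", separation[0])
print("largest separation:", max(separation))
print("first step with separation above 0.1:", first_large)
```

The two orbits stay practically indistinguishable for the first few steps and then drift apart until they are unrelated, which is the same qualitative behaviour that scatters ducks from one container across several coastlines.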
One of the open questions that the group is currently exploring is to what extent machine learning can be used to capture chaotic particle trajectories.
We are excited to see future outcomes of this project and would like to thank Dr. Matias Duran Matute and his colleagues for this contribution to our blog.