Industry: Education
Location: United Kingdom

STFC: Using Kubernetes to better understand climate change

Highlights

  • Supports STFC’s unique needs.
  • Supports the unique needs of the global research community.
  • Easy to use, with hands-on support.
  • Interoperable: flexibility to build, scale and transform on demand.
  • Enhances the ability to deploy, manage and scale services JASMIN offers to the user community.

STFC: a world-leading multi-disciplinary science organization enabling research and innovation

On behalf of the UK scientific community, STFC is tasked with running national laboratories and conducting research into major science projects with large-scale infrastructure and resource requirements.

STFC has a broad remit. It provides bespoke technology and resources for hundreds of research projects, covering a wide range of science areas, and its UK-based teams are involved in high-profile projects. JASMIN is run in partnership between CEDA, part of STFC’s Rutherford Appleton Laboratory (RAL) Space department, and STFC’s Scientific Computing Department. Whatever the variety and scale of each project, JASMIN has a clear mission: to provide scalable compute environments that help researchers find meaningful answers to the big scientific questions in the environmental sciences domain.

At-a-Glance

Advances in high-volume data processing and analysis are helping to transform our understanding of the most complex scientific problems. Part of the Science and Technology Facilities Council’s (STFC) remit is to provide scalable compute power for researchers and academic institutions in the UK and their collaborators all over the world. One example is JASMIN, a data-intensive supercomputer that the Centre for Environmental Data Analysis (CEDA) and STFC operate on behalf of the UK Natural Environment Research Council (NERC) to help academics gain insights into climate change and a range of other environmental research areas.

In recent years, Jupyter Notebooks have become a popular tool for researchers as a means to analyze their data and share their work with collaborators. STFC’s technology team launched its own Jupyter Notebook service to join the suite of other services that its infrastructure provides. To operate this service, Rancher Prime has been deployed to provide academics with a scalable, highly available and secure compute environment.

Advancing data science for the global good

Amongst the host of research projects STFC’s compute resources support, a growing number of teams are focused on pushing boundaries in climate science. The JASMIN Notebooks Service has become a useful tool to support data analytics in environmental sciences research. Later, we’ll explore an example from a Ph.D. student working with STFC’s JASMIN platform to better understand rising inner-city temperatures.

Over the years, Jupyter Notebooks have become a familiar way for academics to consolidate and analyze high-volume data. A Jupyter Notebook is an interactive document containing live code and visualizations, viewed and modified in a web browser. The JASMIN Notebook Service, based on the open source JupyterHub, provides the ability to run multiple Jupyter Notebook sessions via one easily accessible platform served over the web.

Part supercomputer, part cloud, STFC’s JASMIN platform was the backbone of a recent COP26 hackathon, which saw over 150 attendees exploring topics ranging from climate change to oceanography and biogeochemistry. JASMIN gave attendees, even those with no prior coding experience, easy access to Jupyter Notebooks via a web browser, where they could interact with terabytes of data.

Ease of use is particularly critical when managing the complexities of climate data at scale. STFC’s JASMIN platform allows the processing and analysis of massive data sets, from multiple origins and in myriad formats. By co-locating these data alongside services like Jupyter, JASMIN gives users more control: it removes issues around data movement and data wrangling so that they can reach insights more quickly. While STFC’s JASMIN platform is the secret to potentially world-changing research projects, Kubernetes, underpinned by Rancher Prime, provides the underlying infrastructure for the Notebook Service.

Kubernetes to the core

STFC’s JASMIN Notebook Service was built on Kubernetes containers from the ground up. JASMIN currently serves more than 1500 researchers, exploring a vast range of topics. With climate change high on the agenda for governments everywhere, climate-related projects have never been more important. From oceanography to air pollution, earthquake deformation and analysis of wildlife populations, JASMIN is being put to clever use in environmental science.

To fulfil the need to reduce latency and enable real-time analysis, STFC opted for a bare-metal cluster deployment from Rancher Prime. Once configured with STFC’s storage capabilities, this allowed researchers access to petabytes of data in real time — impossible in a standard desktop environment.

Overcoming operational challenges

Scale, flexibility and open interoperability

In recognition of the growing demand for hyperscale compute resources, STFC wanted to build greater scale and agility into its Kubernetes estate. Rancher Prime allows technology teams to flex compute resources faster, making the service resilient for the long term, no matter how complex or varied projects may be.

Another key factor for STFC was interoperability. Not only does the organization combine its bare-metal setup with data hosted in the cloud, but users are also turning to the JASMIN Notebook Service for a widening range of use cases. Kubernetes and Rancher Prime offer those users the freedom to choose what best suits their diverse needs, and the flexibility to build, scale and transform on demand.

Sheng Liang, president of engineering and innovation at SUSE, says: “The team at STFC were looking for a vendor-backed solution to help manage its Kubernetes estate. Working with Rancher Prime, the Kubernetes architecture was easy to deploy, manage and scale.”

Case Study: Sarah Berk Ph.D.: Exploring the Heat Island Effect in Inner Cities

Sarah Berk is a Ph.D. student at the University of East Anglia (UEA) who uses STFC’s JASMIN Notebook Service to analyze terabytes of data measuring heat from cities all over the world. She’s exploring the urban heat island effect, a phenomenon in which metropolitan areas are significantly warmer than surrounding rural areas due to human activity and properties of the urban environment.

“It’s a really important area for two key reasons,” says Berk. “The first is migration. Now, over half the world's population live in cities, and that will grow to 68% by 2050. The second factor is this backdrop of a changing climate, particularly increasing global temperatures and an increasing frequency of heatwaves.”

Berk believes this research will help us understand the relationship between increasing city temperatures and global shifts in climate. Eventually, she hopes it will provide a framework for change in metropolitan planning and design — whether that’s the selection of building materials or the creation of new green spaces.

Berk, who learnt a new programming language for this project, says that JASMIN enabled her to hit the ground running without having to learn a lot of new skills. It allows her to visualize and analyze huge amounts of data at a speed that otherwise wouldn’t be possible. She’s currently working with two datasets — land surface temperature and land use data.

Berk explains: “Because I am using 15 years' worth of data spanning the entire globe in 300-meter resolutions, the sheer volume of data processing is immense. It’s not possible to use my laptop to analyze this, which is why I use STFC’s JASMIN Notebook Service.

“At UEA there is a High-Performance Computing Environment for supporting intensive computing, but the data I’m working with is too big to store on that, so I’m storing it on an external hard drive,” Berk adds. “As such, it’s a lot easier to have that data already stored on JASMIN and not have to worry about it — nor have to go through the time-consuming task of downloading it.”
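
To give a rough sense of the scale Berk describes, the back-of-envelope sketch below estimates the uncompressed size of a global, land-only grid at 300-meter resolution over 15 years. Only the resolution and the time span come from the text. The land area figure, the one-snapshot-per-day cadence and the four bytes per value are illustrative assumptions, not details of Berk's actual datasets.

```python
# Back-of-envelope estimate of raw data volume for a global gridded dataset.
# Only RESOLUTION_M and YEARS come from the text; the rest are assumptions.

LAND_AREA_KM2 = 149e6       # assumption: approximate land area of Earth (km^2)
RESOLUTION_M = 300          # pixel edge length in meters, from the text
YEARS = 15                  # time span in years, from the text
SNAPSHOTS_PER_YEAR = 365    # assumption: one global snapshot per day
BYTES_PER_VALUE = 4         # assumption: one 32-bit value per pixel


def estimate_volume_bytes(area_km2, resolution_m, years,
                          snapshots_per_year, bytes_per_value):
    """Return the uncompressed size in bytes of the full time series."""
    pixel_area_km2 = (resolution_m / 1000) ** 2       # area covered by one pixel
    pixels_per_snapshot = area_km2 / pixel_area_km2   # pixels in one global grid
    snapshots = years * snapshots_per_year            # number of grids in the series
    return pixels_per_snapshot * snapshots * bytes_per_value


total_bytes = estimate_volume_bytes(
    LAND_AREA_KM2, RESOLUTION_M, YEARS, SNAPSHOTS_PER_YEAR, BYTES_PER_VALUE
)
total_tb = total_bytes / 1e12  # decimal terabytes

print(f"Estimated uncompressed volume: {total_tb:.1f} TB")
```

Under these assumptions the estimate lands in the tens of terabytes for a single variable, which is consistent with the "terabytes of data" mentioned above and well beyond what a laptop can comfortably hold or process.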

What’s next?

Since JASMIN was first launched in early 2012, it has grown significantly, not only in scale and complexity but also in the variety of projects it serves. This growth is likely to continue as the world increasingly turns to the research community to find innovative ways to combat the climate crisis.