SeaView: north star of interoperability on the rough seas of science data

The EarthCube-funded SeaView project is approaching an exciting milestone in a few short months: the tool will graduate from the developers and masterminds who conceived it and realize its true purpose in the hands of Southern Ocean scientists. As one might guess from the name, SeaView’s ultimate purpose is to serve EarthCube’s considerable oceanography community. The project was driven by three clear goals:

  1. create a useful product for scientists;
  2. strengthen the data repositories involved; and
  3. use the knowledge gained to inform the way EarthCube continues to grow.

To design SeaView for specific groups of scientists (those working in deep water hydrography and in marine community-environment interactions), the team built upon their already formidable knowledge of the ocean sciences community, got to know a few of its smaller communities on a much deeper level, and did everything necessary to meet their needs.

A diverse set of skills to meet a wide range of needs

The SeaView team draws on the diverse skills and vast experience of its members: Karen Stocks, Director of the Geological Data Center at Scripps Institution of Oceanography; Steve Diggs, Technical Director of the CLIVAR and Carbon Hydrographic Data Office (CCHDO); Bob Arko, then-Technical Director of Rolling Deck to Repository (R2R); Danie Kinkade, Director of the Biological and Chemical Oceanography Data Management Office (BCO-DMO); and Adam Shepherd, Technical Director of BCO-DMO. Collectively, they have extensive expertise in data science, various aspects of oceanography, data curation, and systems design and development, which gave them a comprehensive foundation for understanding the data needs of ocean scientists.

The team began with the relatively simple objective of making linked sets of data from multiple repositories that could be themed specifically for oceanographers. They wanted to make sure that deep water hydrographers and marine ecologists had access to datasets that would be genuinely useful to them. Oceanographers needed a tool created with them in mind, and SeaView would be that tool.

Leveraging existing tools and going the extra mile

Driven by their three defined goals, the SeaView team combined data from R2R, BCO-DMO, CCHDO, the Ocean Biogeographic Information System (OBIS), and the Ocean Observatories Initiative (OOI). In creating its products, the team was able to leverage existing EarthCube technologies from CINERGI’s Data Discovery Hub (now the Data Discovery Studio) and GeoLink, which Stocks considers a significant success in its own right. In the multifaceted cyberinfrastructure world, Stocks says she thinks of SeaView as working on the last mile, “taking the amazing resources already available, and doing that bit more that was needed to really make it usable for our community.”

Many EarthCube projects address the ‘how’ of data organization, and when it comes to ‘what’ data, we try to get as close to ‘all of it’ as possible. SeaView took a different angle, focusing on the ‘what’: by concentrating its efforts on the unique edge where a specific science domain interacts with a tool, the team can bring the two together smoothly.

Help us to help you

SeaView got its start two years ago with a workshop at the Ocean Sciences 2016 conference in New Orleans, to which the team invited ocean scientists who they thought could benefit from the idea. It was a diverse group, hailing from varying types of oceanography, with varying levels of expertise and of technological skill. The multifarious character of the group posed a useful challenge for the SeaView team, providing important input on usability for people of many backgrounds. That input would inform the customizability the team believed necessary for more interoperable data. Stocks explains the session by saying, “we essentially asked them, ‘what do you want us to do?’”

Because SeaView’s first goal was to make something truly useful to scientists, the team was exhaustive during the requirements-gathering phase. They talked with the scientists about their workflows: Where were the bottlenecks? What needed more order? Where could we make this better? What is a priority to you?

Stocks and Diggs took that discussion above and beyond during the workshop, creating activities in which they worked with participants to sketch out their current workflows, planned workflows, and “dream” workflows. The aim was to visualize potential use cases so that the team could truly understand its users. SeaView’s process makes data lightweight in a way that actually supports several of the FAIR goals, specifically interoperability and reusability.

Datasets cooked to order

Getting the input of scientists was crucial, and the way the team truly listened to that feedback was key. For example, they had begun with the intent to put everything in the formats of Ocean Data View (ODV), a useful software package for analyzing, exploring, and visualizing oceanographic data. They had picked ODV because it has incredible versatility, exciting implications for advanced analysis, and over 50,000 registered users. But as they discussed ODV with participants, they found that while some researchers loved it, others were invested in other tools and formats. For those scientists, netCDF was the most flexible and usable format. The team’s dedication to making the right tool led them to adopt netCDF without hesitation. Ultimately, they committed to both netCDF and ODV to reach the widest possible audience.

The workshop was highly interactive, which made a significant impact on the trajectory of SeaView. Karen Stocks and Steve Diggs took input and feedback while, as Diggs describes, they ‘cooked the dataset to order’ right in front of the scientists. Researchers made comments, caveats, and encouragements as they watched the team stumble, fail, succeed, and progress. After several hours, the SeaView team had forged a custom dataset that overlaid chlorophyll data and physical data. Diggs reports that participants received the datasets with a general sentiment of “yes! This is where we see things going.”

After this inaugural workshop at the Ocean Sciences 2016 conference, the SeaView team had a foundation, as well as affirmation from their potential end-users that they were on the right track. The message was clear: scientists need data that is relevant to them, they need it in one place, and they need it organized in formats they will use. The SeaView team moved forward with the construction of their large, integrated, themed data collections in a way that followed these guidelines. They would go on to build a Hawai’i Ocean Time-series collection, a collection centered around the OOI Pioneer array, and one supporting the Bermuda Ocean Time Series.

Heading south for the big catch

After the successful creation of several collections, SeaView took on the theme of a Southern Ocean time-series. It became a highly international project for EarthCube, partnering with the Southern Ocean Observing System (SOOS) global scientific network, and drawing data from beyond the core SeaView partner repositories for the first time. The collection covered data from the entire Southern Ocean. The team would now attempt to do with this extremely large dataset what they had done with the others: ‘harmonize’ all of the data in time and space through their metadata. No matter what the data was, if it was south of 30° South, it would be made part of a final, coherent product.
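As a rough illustration of what harmonizing records "in time and space through their metadata" might involve, the sketch below maps records from different sources onto one common form (UTC timestamps, longitudes in a single range) and keeps only those south of 30° South. It is not SeaView's actual code: the `Record` class, the input keys `time`, `lat`, and `lon`, and the rule that timestamps without a timezone are treated as UTC are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

# "South of 30 South", expressed in decimal degrees (negative = south).
SOUTHERN_OCEAN_MAX_LAT = -30.0


@dataclass(frozen=True)
class Record:
    """Hypothetical common form for one observation's metadata."""
    source: str
    time: datetime   # always timezone-aware, in UTC
    lat: float       # decimal degrees, negative = south
    lon: float       # decimal degrees, -180 <= lon < 180


def normalize_lon(lon):
    """Map any longitude (e.g. 0..360) into the range [-180, 180)."""
    return (lon + 180.0) % 360.0 - 180.0


def harmonize(source, raw):
    """Convert a raw metadata dict into the common Record form.

    `raw` is assumed to carry an ISO 8601 'time' string plus 'lat'
    and 'lon' values; these key names are illustrative only.
    """
    t = datetime.fromisoformat(raw["time"])
    if t.tzinfo is None:
        # Assumption for this sketch: timestamps without a zone are UTC.
        t = t.replace(tzinfo=timezone.utc)
    return Record(
        source=source,
        time=t.astimezone(timezone.utc),
        lat=float(raw["lat"]),
        lon=normalize_lon(float(raw["lon"])),
    )


def in_southern_ocean(rec):
    """True when the record lies south of 30 degrees South."""
    return rec.lat < SOUTHERN_OCEAN_MAX_LAT
```

Once every source is expressed in the same terms, a single filter such as `[r for r in records if in_southern_ocean(r)]` can be applied regardless of where the data originated.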

They began with CTD and bottle data from CCHDO, added zooplankton data from Long Term Ecological Research (LTER) and microbial data from MicrOBIS, and then integrated sensor data from the ARGO floats. Because the ARGO datasets are so large as to be unmanageable, and SeaView’s objective is to make data workable for scientists, the team provided custom subsets of the ARGO data matched to the spatial and temporal locations of the other datasets. Then they put it all in netCDF and ODV formats.
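One way to picture a subset "matched to the spatial and temporal locations of the other datasets" is a simple proximity filter: keep a float profile only if it falls within some distance and some time window of at least one station from the other datasets. The sketch below uses that approach. The article does not describe how SeaView performed the matching, so the function names, the dictionary keys, and the 100 km and 5 day thresholds are all hypothetical.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def subset_profiles(profiles, stations,
                    max_km=100.0, max_dt=timedelta(days=5)):
    """Keep profiles close in space AND time to at least one station.

    Both inputs are lists of dicts with 'time' (datetime), 'lat' and
    'lon' (degrees). The thresholds are illustrative defaults only.
    """
    kept = []
    for p in profiles:
        for s in stations:
            close_in_time = abs(p["time"] - s["time"]) <= max_dt
            if close_in_time and haversine_km(
                    p["lat"], p["lon"], s["lat"], s["lon"]) <= max_km:
                kept.append(p)
                break  # one matching station is enough
    return kept
```

The nested loop is quadratic in the number of profiles and stations, which is fine for a sketch; a real float archive would likely call for a spatial index or pre-binning by region and date.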

A data collection on tour

SeaView came into its own in an intensive, hands-on second workshop, run by Steve Diggs as a pre-workshop for Polar18 in Davos, Switzerland, in June of 2018. He gathered data scientists from disparate communities in an international effort, with representation from the British Antarctic Survey, the University of Tasmania, the Standing Committee on Antarctic Data Management (SCADM), the Commonwealth Scientific and Industrial Research Organisation (CSIRO), and various US institutions, including NOAA’s National Centers for Environmental Information (NCEI) and Scripps Institution of Oceanography/UCSD. The Southern Ocean Time-Series proved helpful: as participants worked with SeaView, they discovered the data were not only useful but also simple to use.

These scientists have chosen to take the SeaView Southern Ocean Collection with them to their next workshop, in Korea next year. That workshop is separate from EarthCube and is connected to SeaView only because the participants have decided the data collection is useful in exactly the way they need. The team hopes that the SeaView datasets will be used for scientific papers and new collaborations, catalyzing exciting new science. SeaView has already proved it can bring people together across oceanographic subdomains to build, improve upon, and add to the collections, just as it brought its creators together from distinct domains.

Where could it go next?

Like many of the projects in EarthCube’s portfolio that seek to align with at least some aspect of the emerging FAIR Data Principles, SeaView’s tools solidly address the “I” and “R” in FAIR by making data more interoperable and reusable; in fact, Steve Diggs describes the project as “really getting down to the byte level” with interoperability. Part of what led to SeaView’s success is that it was never a project about theory; it was always focused on making a product intended to be used.

From the beginning, the team wanted to strengthen the repositories involved in their creation. As Stocks describes, “there are increasing expectations on data repositories – oceanographic ones and other ones – around interoperability, meeting community standards and fostering multidisciplinary science, but those things are hard to do and it’s not completely clear how to do them.”

So SeaView, as a collaboration between team members of diverse data backgrounds, became a project in which multiple repositories had representatives. It created a small community where each of these repositories could have a voice and a helpful role in identifying the most important topics and how to tackle them. This resulted in resources that would inherently be optimized for repositories, since the repositories played a part in creating them too.

Special ops teams supporting data interoperability one science domain at a time

SeaView itself took a team of repository experts who knew the data inside and out, and developers with advanced skills, to create a useful data collection at that first conference. Karen Stocks says, “that’s where the service is.” It may sometimes be harder than we expect to identify why some data are difficult to make interoperable, but with dedicated, skilled, diverse teams that know how to listen, a great dataset can be made so convenient that researcher communities pick it up and take it with them.

Creating data interoperability is no easy task, and repositories must have support. SeaView can serve as a north star of interoperability, born of dedication to a usable product and to listening to users. Its success comes from a team that is both dynamic and humble about its skillset, taking on challenges as they emerged because the team was goal-oriented. They made functional resources that can click oceanography and usable data into place, and they brought a community together, making the seas of oceanographic science just a little bit smoother.

Want a closer view of SeaView?

The SeaView team welcomes ocean scientists of all stripes to take a closer look at their datasets, and invites data repositories from all domains that are curious about the SeaView approach to contact them for wisdom shared and lessons learned. Bon voyage!

 
