Our team at Element 84 isn’t new to diving headfirst into innovations in geospatial technology. Since solving some of the Earth’s biggest problems remains our primary goal, we’ve been excited by the potential of LLMs and LLM-related solutions to provide new avenues for uncovering crucial answers for the people who need them most.

Access to data remains a primary barrier to entry in the geospatial field – especially when it comes to truly utilizing data to its fullest potential. Through our work on Earth Search and our involvement and leadership in community initiatives like STAC, we’re putting our time and energy into increasing user access to data, and improving the ways users interact with that data.

In a recent partnership with NOAA to explore emerging concepts, our team used user experience interviews to home in on key user problems, in an effort to prioritize solving real issues faced by real users. Initially, NOAA put out a Broad Agency Announcement (BAA) to look for solutions that would take advantage of the knowledge mesh they are building. The knowledge mesh contains a large amount of data, as it represents a combination of knowledge graphs and links different collections of otherwise unrelated data. NOAA’s BAA is exploratory only, with no plans yet to go beyond demonstration, but we see this as an opportunity to set the stage for new user-facing applications in the future.

Our team’s proposed response to this request was simple: we will build a natural language solution that allows users (including marine scientists, climate scientists, and more) to ask questions that can be answered by NOAA data. To engage with the solution, users would simply express their question in their own words – in natural language – and receive immediate answers.

We are excited by the potential for AI and machine learning to bring our ideas for this solution to life, but we don’t want to pretend that AI is a “magic bullet” for solving all user problems. In fact, it often feels that solutions incorporating AI lack a connection to potential users, or miss the opportunity to solve users’ problems altogether.

Clearly, the most challenging aspect of developing this solution is ensuring that we are prioritizing REAL user problems. We want to work toward solving real, critical problems that actually need to be solved – not just “problems” fabricated by individuals who will never directly work on the problem. In order to do this we’re engaging in another series of user interviews. This work is still in progress, so if you are personally interested in trying out our solution as it is developed or providing feedback please get in touch with us.

An example of our Geospatial Discovery & Design Process at Element 84

Why is there a disconnect between available earth observation data and the users that seek to use the data?

This dilemma reflects a persistent disconnect between boots-on-the-ground users and the data/technology that is available to them. Even though narrowing this gap and increasing access to data more broadly has been a focus for over a decade in the earth observation sphere, the problem persists for several reasons.

Finding datasets is hard

One of the clearest takeaways we’ve gleaned from the interviews we’ve conducted up to this point is that users have trouble finding all of the available datasets that meet their needs.

These responses are particularly interesting because a variety of widely available tools could provide these answers, including sites like Earthdata Search (NASA), OneStop (NOAA), Earth Search (Element 84), and CLASS (NOAA). Naturally, we are left wondering: why aren’t these options working? Or do users just not know about them? These unanswered questions are a key motivator behind our quest for user input and feedback.

We’ve learned thus far that researchers struggle to find high-quality, relevant datasets for their work. Whether it’s ecosystem mapping or climate change studies, using the right dataset is crucial—but finding it is the challenge.

In remote sensing there’s no shortage of data, but it’s scattered across various platforms, authored by different organizations, and difficult to access. As researchers search for datasets, the criteria can vary widely, from specific locations to particular species. Existing tools are limited, often only hosting datasets from certain sources and supporting narrow search criteria. Some tools support geo-location searches but not taxonomic searches, while others lack natural language search capabilities.

When these tools fall short, researchers turn to Google for a broader search. However, Google’s wide-ranging results include many irrelevant hits, and Google doesn’t support spatial, temporal, and other relevant filters. Researchers are forced to sift through countless pages, which makes the search process time-consuming and inefficient.

Users lack a background in code generation

Despite advances in data accessibility, coding skills are required to fully engage with the latest in Earth Observation data technology. There is an expectation that scientists and researchers will write code – especially when it comes to determining why something is not working within their data. This expectation leads to some sticky situations, however, since these users are not software engineers by trade. Because they are engaging with complicated problems and large amounts of data, they are often met with equally complicated code, which limits the tools they feel comfortable using to solve their problems.

Data visualization is critical

When it comes to applying complicated datasets to solve real-world problems, access to rapid data visualizations is a key part of communicating findings to others. As important as this is, search tools frequently do not provide an avenue for easy visualization. If there were a system like ChatGPT that could fully understand all NOAA data, including graphs, users would likely ask it to preview their data as a more accurate and straightforward way to plot it.

Although this scenario could hypothetically become reality, scientists and researchers hold mixed opinions about the use of LLMs. Many people are very excited about the potential to put ChatGPT in front of their data, and they want to use a tool like this for visualization as we described. At the same time, there are people who are very concerned about using this type of implementation for scientific problems. Scientists want to be the ones “doing” the science – not an LLM. While there isn’t a completely uniform view here, we aren’t interested in replacing scientists or trying to do their work for them. We want to create tools that help experts working on marine data in the same way that Excel helps accountants. The scientists would still be doing the real work.

Dataset quality leads to compatibility issues

As we’ve mentioned before, scientists and researchers work with a variety of complex datasets. When it comes to using the latest and greatest in earth observation technology to process these datasets, discrepancies in how data is cataloged or labeled can lead to a variety of challenges.

When naming conventions are inconsistent, the same variables and parameters can appear under different names, such as “lat” and “latitude”. If data comes from two different years or sensors, even small labeling discrepancies between the two can create a lot of extra coding work for the people who use the data. Although this is a known problem that can be solved by utilizing modern file types, many older data formats still exist and are used in important work.
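To make the “lat” versus “latitude” problem concrete, here is a minimal Python sketch of the kind of glue code people end up writing. The alias table, the `normalize_record` helper, the `sst` variable, and all sample values are hypothetical and made up for illustration; real datasets would need a much longer and more carefully reviewed mapping.

```python
# Hypothetical alias table: maps variant spellings to one canonical name.
CANONICAL_NAMES = {
    "lat": "latitude",
    "lats": "latitude",
    "latitude": "latitude",
    "lon": "longitude",
    "long": "longitude",
    "longitude": "longitude",
}


def normalize_record(record):
    """Return a copy of `record` with known variant keys renamed."""
    normalized = {}
    for key, value in record.items():
        # Unknown keys (for example "sst") are passed through unchanged.
        canonical = CANONICAL_NAMES.get(key.strip().lower(), key)
        if canonical in normalized:
            raise ValueError(f"duplicate variable after renaming: {canonical}")
        normalized[canonical] = value
    return normalized


# Two made-up records from different years with different labels.
record_2019 = {"lat": 68.9, "lon": -134.6, "sst": 4.2}
record_2021 = {"Latitude": 69.1, "Longitude": -135.0, "sst": 4.6}

merged = [normalize_record(record_2019), normalize_record(record_2021)]
print(merged)
```

Even this small case needs a lookup table and a decision about what to do with collisions, which hints at how quickly the effort grows across many variables and sources.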

“The future is here, it’s just not evenly distributed” – William Gibson

For many of the above dilemmas, solutions do exist. That said, the existence of a solution to a given problem means very little if it is not widely available or adopted by those impacted by the problem.

Although significant work is in progress to advance the field, transformative change is very difficult to achieve if these advancements do not take the needs of users into account. This challenge can be understood through an illustrative concept we call “The Hill.”

The Hill is a metaphor, both for the work required to connect technology solutions to user problems and the challenge to completely understand user problems and the technologies that would help solve them. Organizations want transformative solutions that will help users, but they are often unable to truly achieve their goal due to a fundamental lack of understanding as to what their desired users actually need in the first place. At the same time, users are unable to climb “the hill” on their side due to a lack of access to and understanding of the data and technology that is potentially available to them.

Shrinking the hill through intentional use of user experience processes

Coming from the technology side, it’s our goal to implement a variety of strategies to improve our understanding of user problems. To make this happen, our team is conducting user interviews with candidates selected based on their alignment with our target persona groups, their experience and familiarity working with the various data entities, and their experience conducting scientific research and answering research questions informed by the collected data. Once we have identified candidates, we work to build an unbiased understanding of their work and any challenges they may face. By combining our findings across several interviews, we are able to distill common themes and problems into clear issues that we want to solve.

Once we have a potential solution in mind we are able to go back to the user for feedback and further development. This process necessitates iteration, and we have found it difficult to identify users who have the time and capacity to engage in this work. It’s no surprise to us that everyone working to solve crucial problems with complex data is very busy, but that is why we are so excited about the potential to aid in streamlining the work.

Is implementing an AI-based solution the right approach?

We think AI is a promising technology because of its ability to understand both users and technology. While there are great existing tools for finding and exploring data, like STAC APIs, Jupyter notebooks, and Dask clusters, users need to know how to use those tools and how to translate their needs into code and tool operations. LLMs can serve as a bridge, translating user questions into the language of technology: database searches, processing steps, and visualization tools.

We’re still brainstorming, but here are a few ways we’re thinking about implementing AI as a means to shrink the Hill for all parties involved:

Many existing “solutions” for scientists and researchers still require Python. If AI can serve as a go-between for users to access their data, that represents a large step forward.
AI can translate a user question into a STAC search. Someone can use natural language to access the data they need to understand thawing permafrost in Canada: “What data is available for monitoring active layer thickness changes in the Mackenzie Delta region between May and September?”
If individuals have difficulty finding specific elements of data within a larger dataset, it makes sense to use AI to make the data search catalog better.
AI has the potential to make it easier to ask questions and pull answers directly from already vetted and reliable data sources.
When time is of the essence, AI lowers the bar for visualization and helps users to interpret data. This includes visualizations on maps, graphs, and beyond.
LLMs can understand users – no matter what area of expertise a user may have, the LLM’s ability to understand their question will meet them where they are.
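
As a rough illustration of the “user question to STAC search” idea, the Python sketch below shows the kind of structured search body an LLM would need to produce for the permafrost question above. It does not call an LLM or any API. The `bbox`, `datetime`, `collections`, and `limit` fields follow the STAC API item search parameters, but the bounding box is only a rough approximation of the Mackenzie Delta, the year is assumed because the question does not give one, and the collection ID is a hypothetical placeholder.

```python
import json
from datetime import date


def build_stac_search(bbox, start, end, collections, limit=10):
    """Assemble a STAC API item search body from already-extracted parameters."""
    west, south, east, north = bbox
    if south > north:
        raise ValueError("bbox must be [west, south, east, north]")
    return {
        "collections": list(collections),
        "bbox": [west, south, east, north],
        # STAC expects an RFC 3339 interval written as "start/end".
        "datetime": f"{start.isoformat()}T00:00:00Z/{end.isoformat()}T23:59:59Z",
        "limit": limit,
    }


# Parameters an LLM might extract from the question (all values assumed).
body = build_stac_search(
    bbox=[-137.0, 67.5, -133.0, 69.7],  # rough Mackenzie Delta extent
    start=date(2023, 5, 1),             # "between May and September", year assumed
    end=date(2023, 9, 30),
    collections=["hypothetical-permafrost-collection"],
)
print(json.dumps(body, indent=2))
```

The hard part, which this sketch skips, is the extraction itself: deciding which collections are relevant to “active layer thickness” and turning a place name into coordinates.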

Limitations and uncertainty around AI

With all of this said, we understand that AI is not the answer to every problem. Equally, we understand that there are a variety of reasons why researchers and scientists may be wary about implementing AI into their work. Among other things, users might be wondering about the background of particular AI tools, the potential for AI to present opinion as fact, and the need for independent validation.

As the field continues to evolve, it is certain that new limitations will continue to be revealed, and new challenges will arise. While we recognize that AI presents its own set of unique challenges, we remain optimistic that it can be an asset to users if employed thoughtfully and with intentionality, and we look forward to continuing to experiment with possible advancements in this area.

Where to go from here

It is not lost on us that the roadblocks to data access that we’ve discussed in this blog post highlight the genuine resiliency scientists demonstrate in the face of adversity. When we connect with researchers and scientists making incredibly impressive progress on impactful projects, more often than not they don’t have any type of formal background in machine learning or computer science. Instead, these people are climate change scientists who are teaching themselves machine learning in order to be able to access the technologies and techniques they need to complete their research.

Although this is incredibly impressive, we feel that this heroic effort should not be necessary for every scientist. We don’t claim to have all of the answers, but we are hopeful that by integrating aspects of AI and user experience techniques to aid researchers and scientists in their work we will be able to continue to make tools and data more accessible – and to therefore allow them to use their skillsets for even bigger and better pursuits.

If you’re interested in learning more about how our team might be able to increase your access to relevant data and earth observation technologies, we’d love to connect with you – just send us a message!

Generating truly useful solutions to real problems with the help of user experience and AI