Powered by emerging large language models (LLMs), autonomous geographic information systems (GIS) agents have the potential to accomplish spatial analyses and cartographic tasks. However, a research gap remains in supporting fully autonomous GIS agents: how to enable agents to discover and download the data needed for geospatial analyses. This study proposes an autonomous GIS agent framework capable of retrieving required geospatial data by generating, executing, and debugging programs. The framework uses the LLM as the decision-maker, selects the appropriate data source(s) from a pre-defined source list, and fetches the data from the chosen source. Each data source has a handbook that records the metadata and technical details for data retrieval. The proposed framework is designed in a plug-and-play style to ensure flexibility and extensibility. Human users or autonomous data crawlers can add new data sources by adding new handbooks. We developed a prototype agent based on the framework, which was released as a QGIS plugin (GeoData Retrieve Agent) and a Python program. Experiment results demonstrate its capability of retrieving data from various sources, including OpenStreetMap, administrative boundaries and demographic data from the US Census Bureau, satellite basemaps from ESRI World Imagery, a global digital elevation model (DEM) from OpenTopography, weather data from a commercial provider, and COVID-19 cases from the NYTimes GitHub repository. Our study is among the first attempts to develop an autonomous geospatial data retrieval agent.
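To make the plug-and-play idea concrete, here is a minimal Python sketch of a handbook registry. It is an illustration under stated assumptions, not the LLM-Find implementation: the `Handbook` and `SourceRegistry` classes, their fields, the source identifiers, and the example note are all hypothetical. It only shows how adding a handbook could extend the source list that the LLM chooses from.

```python
from dataclasses import dataclass, field


@dataclass
class Handbook:
    """Hypothetical handbook: metadata plus retrieval notes for one data source."""
    source_id: str
    description: str
    technical_notes: list[str] = field(default_factory=list)


class SourceRegistry:
    """Hypothetical plug-and-play registry: adding a handbook adds a data source."""

    def __init__(self) -> None:
        self._handbooks: dict[str, Handbook] = {}

    def register(self, handbook: Handbook) -> None:
        # Registering a handbook is the only step needed to expose a new source.
        self._handbooks[handbook.source_id] = handbook

    def source_list_prompt(self) -> str:
        """Render the pre-defined source list as text an LLM could choose from."""
        return "\n".join(
            f"- {hb.source_id}: {hb.description}" for hb in self._handbooks.values()
        )

    def get(self, source_id: str) -> Handbook:
        return self._handbooks[source_id]


# Illustrative entries only; the identifiers and notes are assumptions.
registry = SourceRegistry()
registry.register(
    Handbook(
        "OpenStreetMap",
        "Vector features such as roads and buildings",
        ["Illustrative note: query by place name or bounding box."],
    )
)
registry.register(Handbook("US_Census_boundary", "US administrative boundaries"))
print(registry.source_list_prompt())
```

In this sketch the source list shown to the LLM is derived entirely from the registered handbooks, so supporting a new source requires no change to the selection logic.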
We tested a variety of data requests expressed in natural language; the agent retrieved the correct data for most of them, with a success rate of about 80% to 90%. We are excited by this result because the success of such data-fetching agents indicates that data-intensive GIS research, or broader scientific research, can be executed by agents. Autonomous research agents can collect the necessary online or local data and then conduct analyses in parallel while adjusting methods or strategies for better results. LLM-Find will play a foundational role in this vision.
The source code is available on GitHub: https://github.com/gladcolor/LLM-Find
QGIS users can install the QGIS plugin (AutonomousGIS-GeodataRetrieveAgent) to download data via natural language in a GIS environment. Note that some data sources require you to apply for an API key. The source code of the QGIS plugin is here. QGIS Plugin page: https://plugins.qgis.org/plugins/AutonomousGIS_GeodataRetrieverAgent/
Please watch the demonstrations on our YouTube channel.
For more details, please refer to our paper: Ning, Huan, Zhenlong Li, Temitope Akinboyewa, and M. Naser Lessani. 2024. “LLM-Find: An Autonomous GIS Agent Framework for Geospatial Data Retrieval.” arXiv. https://doi.org/10.48550/arXiv.2407.21024.
Further reading: Autonomous GIS: the next-generation AI-powered GIS. Recommended citation format: Li Z., Ning H., 2023. Autonomous GIS: the next-generation AI-powered GIS. International Journal of Digital Earth. https://doi.org/10.1080/17538947.2023.2278895. GitHub repository: github.com/gladcolor/LLM-Geo
Note: LLM-Find is under active development, and the ideas presented in the paper may change due to the rapid development of AI. We hope LLM-Find can inspire the geospatial community to investigate autonomous GIS further.
LLM-Find framework
LLM-Find Agent workflow
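The generate, execute, and debug cycle described in the abstract can be sketched as a small retry loop. This is a hedged illustration, not the agent's actual code: `retrieve_with_debugging`, its parameters, the prompt wording, and the stub that stands in for the LLM are all assumptions made for this example.

```python
import traceback
from typing import Callable


def retrieve_with_debugging(
    task: str,
    handbook_text: str,
    ask_llm: Callable[[str], str],
    max_attempts: int = 3,
) -> dict:
    """Generate code with the LLM, execute it, and feed errors back for repair."""
    prompt = f"Task: {task}\nHandbook:\n{handbook_text}\nReturn Python code only."
    for attempt in range(1, max_attempts + 1):
        code = ask_llm(prompt)
        namespace: dict = {}
        try:
            # NOTE: exec runs untrusted generated code; sandbox it in real use.
            exec(code, namespace)
            return {"ok": True, "attempts": attempt, "namespace": namespace}
        except Exception:
            # Send the failing code and its traceback back to the LLM.
            error = traceback.format_exc()
            prompt = (
                f"The code below failed.\nCode:\n{code}\nError:\n{error}\n"
                "Return corrected Python code only."
            )
    return {"ok": False, "attempts": max_attempts, "namespace": {}}


# Stub standing in for a real LLM: the first answer is buggy, the second is fixed.
_answers = iter(["result = 1 / 0", "result = 'downloaded'"])
outcome = retrieve_with_debugging(
    "demo request", "demo handbook", lambda prompt: next(_answers)
)
print(outcome["ok"], outcome["attempts"], outcome["namespace"]["result"])
```

With the stub above, the first generated program raises a `ZeroDivisionError`, the error is passed back in the next prompt, and the second attempt succeeds. A real deployment would replace the stub with an LLM call and isolate the execution step.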