On May 10, 2023, we introduced Autonomous GIS as an AI-powered, next-generation GIS that leverages the general abilities of large language models (LLMs) in natural language understanding, reasoning, and coding to address spatial problems through automatic spatial data collection, analysis, and visualization. We envisioned that autonomous GIS will be capable of searching and retrieving the spatial data it needs, either from extensive existing online geospatial data catalogs or by collecting new data from sensors, and then using existing spatial algorithms, models, or tools (or developing new ones) to process the gathered data and generate the final results, e.g. maps, charts, or reports. The LLM can be considered the ‘brain’ of autonomous GIS, or the ‘head’ if equipped with environmental sensors, while executable programs (e.g. Python) can be considered its digital ‘hands’. Similar to other autonomous agents, we propose that autonomous GIS requires five critical modules: decision-making (with the LLM as the core), data collecting, data operating, operation logging, and history retrieval. These modules enable GIS to achieve five autonomous goals: self-generating, self-organizing, self-verifying, self-executing, and self-growing (Table 1).
Full article: Li, Z., & Ning, H. (2023). Autonomous GIS: the next-generation AI-powered GIS. International Journal of Digital Earth, 16(2), 4668-4686.
Some selected discussion points from the article
We envisioned a copilot in existing GIS platforms such as ArcGIS and QGIS: In addition to the standalone form, autonomous GIS can serve as a copilot for traditional GIS software, using natural language to communicate with users and automate spatial data processing and analysis tasks. This setup is similar to Microsoft’s Copilot for its Office family, which automates office tasks such as report writing and slide creation (Microsoft 2023). For example, an autonomous GIS panel alongside the map view can be integrated into ArcGIS and QGIS, displaying the chatbox, the generated solution graph, and the code, as illustrated in Figure 7(b). The result is shown in the built-in map view. Users can modify the workflow by editing the solution graph or operation parameters. If an operation is implemented by generated code, users can locate and edit that code by clicking the corresponding solution graph node. Overall, the solution graph is similar to ModelBuilder in ArcGIS or the Model Designer in QGIS. Implementing autonomous GIS on top of existing GIS platforms may be the most practical and efficient approach at present, as mature GIS platforms (e.g. ArcGIS) already offer a rich set of operations (e.g. the tools in the ArcGIS toolbox) along with well-established documentation that an LLM can quickly learn and use to generate the solution graph.
Use a divide-and-conquer approach to solve complex spatial problems with geoprocessing workflows: One might reasonably question why we do not opt for a more direct approach: asking LLMs like GPT-4 for solutions without generating a solution graph and operations. Indeed, our tests have shown that with extra guidance embedded in the prompt, GPT-4 is capable of generating correct code for relatively simple spatial analysis tasks. However, this direct method may limit the LLM’s capacity to tackle more complex tasks, such as identifying the most promising fishing site in a specific sea area or assessing the accessibility of autism intervention services at a national level using smartphone mobility data APIs and web crawling. These tasks are akin to writing a long novel, and it is unlikely that LLMs would generate comprehensive, one-step solutions that involve extensive programs. Such tasks are challenging not only for humans but also for models (Maeda 2023). Human problem-solving often employs a divide-and-conquer strategy when facing complex tasks. LLM-Geo adopts this strategy in its design: it breaks a complex spatial analysis problem into smaller, more manageable sub-problems that LLMs can handle (i.e. operations), addresses these sub-problems one by one (i.e. develops a function for each sub-problem), and ultimately combines these functions (i.e. assembles a program) to yield the final results. This divide-and-conquer methodology has been employed in other digital autonomous agents, such as AutoGPT (Richards 2023) and AgentGPT (Reworkd 2023). We plan to further explore the feasibility of this approach with tasks that are more complex than those demonstrated thus far. Additionally, the divide-and-conquer strategy aids in the production of verified operation nodes (e.g. code snippets). These validated sub-solutions (operations) become valuable assets for autonomous GIS as they can be reused for future tasks, bolstering the autonomous goal of self-growing.
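To make the divide-and-conquer idea concrete, the following is a minimal sketch, not the LLM-Geo implementation. It assumes a solution graph can be represented as a dictionary of operation nodes with their upstream dependencies, and it uses small hand-written functions with toy data as stand-ins for the functions an LLM would generate. All names (SOLUTION_GRAPH, run_solution_graph, the operations, and the data values) are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical operations standing in for LLM-generated functions.
def load_tracts():
    # Toy stand-in for reading a tract layer: (tract_id, population)
    return [("A", 1200), ("B", 800), ("C", 450)]

def load_hazard_sites():
    # Toy stand-in for a spatial layer: ids of tracts that contain a site
    return {"A", "C"}

def select_affected(tracts, sites):
    # Sub-problem: keep only the tracts that contain a site
    return [t for t in tracts if t[0] in sites]

def sum_population(affected):
    # Sub-problem: total population living in the affected tracts
    return sum(pop for _, pop in affected)

# Assumed solution graph: node name -> (function, upstream node names)
SOLUTION_GRAPH = {
    "tracts": (load_tracts, []),
    "sites": (load_hazard_sites, []),
    "affected": (select_affected, ["tracts", "sites"]),
    "total_pop": (sum_population, ["affected"]),
}

def run_solution_graph(graph):
    """Run each operation after its upstream nodes and collect the results."""
    dependencies = {node: deps for node, (_, deps) in graph.items()}
    results = {}
    for node in TopologicalSorter(dependencies).static_order():
        func, deps = graph[node]
        results[node] = func(*(results[d] for d in deps))
    return results

results = run_solution_graph(SOLUTION_GRAPH)
print(results["total_pop"])  # 1650 (tracts A and C)
```

Because each node is an ordinary function with explicit inputs, a single operation can be checked on its own and reused in another graph, which is the property the self-growing goal relies on.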
We suggested the need for AI-powered online geospatial data discovery and filtering: Our case studies provided data for LLM-Geo, such as data file paths, URLs, or data access APIs, as well as data descriptions. However, an autonomous GIS should be capable of independently collecting the data required to finish the task. Most LLM-based autonomous agents are equipped with search engines (using APIs) to search and retrieve data from the internet. Fortunately, numerous geospatial datasets, standard geospatial web services (e.g. OGC WMS, WFS, WCS, WPS), and REST APIs for geospatial data access have been established in recent decades. For example, US population data can be extracted from the US Census Bureau (2023) via API, as well as from OpenStreetMap (OpenStreetMap 2023). Large online geospatial data catalogs have also been created by various national or international organizations and government agencies, such as the Google Earth Engine Data Catalog, NOAA and NASA Data Catalogs, EarthCube, and the European Environment Agency (EEA) geospatial data catalog. The challenge lies in finding and selecting suitable, high-quality data layers in terms of spatiotemporal resolution, spatiotemporal coverage, and accuracy. Autonomous GIS developers need to establish practical strategies for LLMs to discover, filter, and utilize the most relevant and accurate geospatial datasets (including those that require an approved account, password, or token to access) for a given spatial analysis task.
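As one illustration of what a filtering step might look like, the sketch below screens candidate layers by spatial coverage, temporal coverage, and resolution, then ranks the survivors from finest to coarsest. It is only a rule-based sketch under stated assumptions: the DatasetRecord fields, the filter_candidates function, and every catalog entry are invented for illustration and do not correspond to any real catalog's metadata schema.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DatasetRecord:
    # Hypothetical metadata fields; real catalogs use their own schemas.
    name: str
    bbox: tuple          # (min_lon, min_lat, max_lon, max_lat)
    start_year: int
    end_year: int
    resolution_m: float  # nominal spatial resolution in meters

def covers(outer, inner):
    """Return True if bounding box `outer` fully contains `inner`."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def filter_candidates(records, study_bbox, year, max_resolution_m):
    """Keep layers that cover the study area and year at a fine enough
    resolution, ordered from finest to coarsest."""
    suitable = [
        r for r in records
        if covers(r.bbox, study_bbox)
        and r.start_year <= year <= r.end_year
        and r.resolution_m <= max_resolution_m
    ]
    return sorted(suitable, key=lambda r: r.resolution_m)

# Invented catalog entries, for illustration only.
catalog = [
    DatasetRecord("global_landcover_coarse", (-180, -90, 180, 90), 2000, 2022, 500),
    DatasetRecord("national_landcover", (-125, 24, -66, 50), 2001, 2021, 30),
    DatasetRecord("regional_landcover_10m", (-90, 25, -75, 40), 2017, 2023, 10),
    DatasetRecord("state_lidar_landcover", (-83.5, 32.0, -78.5, 35.3), 2015, 2018, 1),
]

selected = filter_candidates(
    catalog, study_bbox=(-82, 33, -80, 34.5), year=2020, max_resolution_m=100
)
print([r.name for r in selected])
# ['regional_landcover_10m', 'national_landcover']
```

In practice an LLM would likely have to weigh softer criteria as well, such as documented accuracy or access restrictions, which a simple rule-based filter like this one does not capture.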
Answer ‘why’ questions: LLM-Geo, powered by GPT-4, demonstrates that an LLM can know ‘how’ to perform spatial analysis and automate a multitude of routine geospatial tasks. The next challenge lies in the ability of autonomous GIS to address ‘why’ questions, which often require a more profound understanding and investigation of geospatial knowledge. Questions such as ‘According to the smartphone mobility data, why did my customer numbers drop by 20% this month?’ or ‘Why did these migratory birds alter their path in the past decade?’ extend beyond basic spatial analysis and delve into the domain of hypothesis generation, data selection, and experimental design. To answer these questions, autonomous GIS needs the ability to design research by formulating informed hypotheses based on the posed question, available data, and context.
Build a Large Spatial Model (LSM): LLMs, trained on extensive text corpora, have developed language skills, knowledge, and reasoning abilities, but their spatial awareness remains limited due to the scarcity of spatial samples within these corpora. Many resources in GIScience, such as abundant historical remote sensing images, global vector data, detailed records of infrastructure and properties, and vast amounts of other geospatial big data sources, have yet to be fully incorporated into the training of large models. Consider the potential of a Large Spatial Model (LSM), trained on all available spatial data (e.g. all raster data pixels), mirroring the way LLMs are trained on extensive text corpora. Such a model could potentially possess a detailed understanding of the Earth’s surface, accurately describe any location, and comprehend the dynamics of ecosystems and geospheres. This would not only enrich the spatial awareness of future artificial general intelligence, but could also transform the field of GIScience, empowering autonomous GIS to answer ‘why’ questions. We advocate for further research and efforts towards the training of Large Spatial Models that can more accurately represent the Earth’s surface and human society.