We prototyped a cartography agent named LLM-Cat to make simple maps. It utilized the vision capability of GPT-4o to demonstrate the possibility of vision-based autonomous cartography. LLM-Cat accepted map-making requests in natural language and generated maps using Python code. Because cartography is a vision-based and iterative process, it requires map cognition and an understanding of the causal relationship between the code (or GIS operations) and the resulting map. Trained cartographers regularly make multiple attempts and modifications before the map fits the context and requirements. We argue that generative AI with a vision model can make maps by mimicking cartographers’ behaviors; therefore, LLM-Cat is designed to work iteratively: 1) it generates an initial map based on the given data and requirements; 2) it reviews the map and points out one issue, such as a missing legend; 3) it revises the code to solve the issue and generates a new map; and 4) it repeats steps 2) and 3) until it reaches the maximum number of iterations or determines there are no more issues.
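The iterative workflow above can be sketched as a generate-review-revise loop. The sketch below is illustrative only: `generate_code`, `review_map`, and `revise_code` are hypothetical placeholders for calls to the backend vision model (e.g., GPT-4o) and are not part of any released API.

```python
# Hypothetical sketch of LLM-Cat's generate-review-revise loop.
# The callables passed in stand in for backend vision-model calls;
# their names and signatures are assumptions, not a published API.

def run_agent(request, data_paths, max_iterations=10, *,
              generate_code, render_map, review_map, revise_code):
    """Iteratively generate, review, and revise map-making code."""
    code = generate_code(request, data_paths)   # step 1: initial map code
    map_image = render_map(code)
    for _ in range(max_iterations):
        issue = review_map(map_image, request)  # step 2: point out one issue
        if issue is None:                       # no more issues found
            break
        code = revise_code(code, issue)         # step 3: fix the issue
        map_image = render_map(code)            # regenerate the map
    return code, map_image
```

With stub functions in place of the model calls, the loop stops either when the reviewer reports no issue or after `max_iterations` rounds, mirroring the four-step procedure described above.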
Box 1 shows the input for a use case of LLM-Cat; the agent is asked to create a map showing the hospital locations in Pennsylvania, USA. The map requirements, such as the title, north arrow, and basemap, are detailed in natural language. Data locations are also provided. Figure 1 shows the maps generated in four rounds (Rounds 0, 3, 7, and 10). After 10 rounds, LLM-Cat achieved multiple improvements: 1) enlarging the title font; 2) re-positioning the “Designer” annotation; 3) adjusting the height of the north arrow; 4) changing the hospital point color; 5) increasing the transparency of the basemap. These improvements make the map look better than the Round 0 (initial) map. We noticed that the backend AI model (GPT-4o) has relatively weak map-aesthetic and cartographic skills, so the final map does not look very appealing, although it meets all the requirements. This is among the first attempts to use AI for autonomous cartography, and the result is promising. Future research can incorporate the target audience’s background, for example, asking the agent to make maps for children, color-blind viewers, or a specific domain, such as the health science community.
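To make the revisions concrete, the minimal matplotlib sketch below illustrates the kinds of styling decisions LLM-Cat revised across rounds (title font size, annotation placement, marker color, legend). The coordinates, labels, and function name are hypothetical and are not the actual code or Pennsylvania hospital data from the case study.

```python
# Illustrative only: not LLM-Cat's actual generated code.
import matplotlib
matplotlib.use("Agg")  # render off-screen, no display needed
import matplotlib.pyplot as plt

def draw_map(lons, lats):
    """Plot hypothetical hospital points with styling akin to the case study."""
    fig, ax = plt.subplots(figsize=(8, 6))
    # A later round enlarged the title font from its initial size.
    ax.set_title("Hospitals in Pennsylvania, USA", fontsize=18)
    # A later round changed the hospital point color (red is an assumption).
    ax.scatter(lons, lats, color="red", marker="+", label="Hospital")
    # A later round re-positioned the "Designer" annotation.
    ax.annotate("Designer: LLM-Cat", xy=(0.99, 0.01),
                xycoords="axes fraction", ha="right", va="bottom")
    # An early review step might flag a missing legend, prompting this line.
    ax.legend(loc="upper right")
    # A basemap image, if added, could be faded via imshow(..., alpha=0.5).
    return fig, ax
```

Each commented line corresponds to one category of revision; in the actual agent, the vision model inspects the rendered image and edits lines like these between rounds.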
Box 1. Input to LLM-Cat for a map-making task: creating a map showing the hospital locations in Pennsylvania, USA. The input contains two parts, task requirements and data locations, both in natural language.


Figure 1. The maps from multiple rounds (Rounds 0, 3, 7, and 10) of the cartography agent LLM-Cat. Note that the scale bar unit should be “km”. The backend GPT-4o model noticed this error and set the unit parameter to “km” as documented for the Python scale bar package (matplotlib_scalebar), but the displayed unit is “Mm”, which may be caused by a bug in the matplotlib_scalebar package.
For more information about the implementation of LLM-Cat, the source code and more case studies are available at https://github.com/gladcolor/LLM-Cat.