
Understanding tourist movement patterns with big data

A graph-based approach to detect the tourist movement pattern using social media data

Understanding the characteristics of tourist movement is essential to the study of tourist behaviour, since these characteristics bear on the whole suite of tourism industry management, from attraction planning to commercial product development. This research introduces a graph-based method to detect tourist movement patterns from Twitter data. First, the collected geotagged tweets are cleaned to filter out those not published by tourists. Second, a DBSCAN-based clustering method is adapted to construct the tourist graph, which consists of tourist attraction vertices and the edges between them. Lastly, network analytics methods (e.g. betweenness centrality and the Markov clustering algorithm, MCL) are applied to detect tourist movement patterns such as popular attractions, centric attractions, and popular tour routes. New York City in the United States is selected to demonstrate the proposed approach. The detected tourist movement patterns can inform tour product development as well as transportation, shopping center, and accommodation development.
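
The clustering step can be illustrated with a minimal sketch. It uses plain DBSCAN with a haversine distance, not the adapted variant from the paper, and the coordinates, the 100 m radius (eps_m), and the minimum cluster size (min_pts) are all assumptions chosen for illustration only. Each resulting cluster centroid would play the role of an attraction vertex.

```python
import math
from collections import deque


def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))


def dbscan(points, eps_m, min_pts):
    """Plain DBSCAN. Returns one label per point: 0, 1, ... or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        # Brute-force neighbourhood query (includes the point itself).
        return [j for j in range(len(points))
                if haversine_m(points[i], points[j]) <= eps_m]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # j is a core point: keep expanding
    return labels


def centroids(points, labels):
    """Mean (lat, lon) of each cluster, ignoring noise points."""
    sums = {}
    for (lat, lon), label in zip(points, labels):
        if label == -1:
            continue
        total = sums.setdefault(label, [0.0, 0.0, 0])
        total[0] += lat
        total[1] += lon
        total[2] += 1
    return {label: (t[0] / t[2], t[1] / t[2]) for label, t in sums.items()}


# Made-up tweet locations: two tight groups in Manhattan and one stray point.
points = [
    (40.7580, -73.9855), (40.7581, -73.9856), (40.7579, -73.9854),
    (40.7829, -73.9654), (40.7830, -73.9655), (40.7828, -73.9653),
    (40.7000, -74.0500),
]
labels = dbscan(points, eps_m=100, min_pts=3)
vertices = centroids(points, labels)
```

The brute-force neighbour search is quadratic in the number of tweets, so a real dataset would need a spatial index; the sketch only shows how tweet locations collapse into attraction vertices.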

The figure below illustrates the tourist graph construction: each hollow circle represents the original location of a tweet published by a tourist; each larger solid circle represents the centroid coordinate of a cluster, which is a vertex in the tourist graph; the grey dashed line shows a tourist trajectory tracked through that tourist's tweets; the solid blue line represents the aggregated edge between two attractions (vertices). The figure below (right) shows the popular point-to-point routes identified by the weighted degree of each node. Three top routes emerge: 1) from Top of the Rock Observation Deck to Times Square; 2) from Times Square to World Trade Center; and 3) from Central Park to Times Square.
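
The edge aggregation described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the trajectories are invented, and each tourist's visits are assumed to be already mapped to attraction vertices and sorted by time.

```python
from collections import Counter


def build_tourist_graph(trajectories):
    """Aggregate per-tourist trajectories into weighted directed edges.

    trajectories maps a user id to that user's attractions in time order.
    The weight of edge (src, dst) is the number of observed moves from
    src directly to dst, summed over all tourists.
    """
    edges = Counter()
    for visits in trajectories.values():
        for src, dst in zip(visits, visits[1:]):
            if src != dst:  # consecutive tweets at one attraction are not a move
                edges[(src, dst)] += 1
    return edges


def weighted_degree(edges):
    """Sum of incoming and outgoing edge weights for each attraction."""
    degree = Counter()
    for (src, dst), weight in edges.items():
        degree[src] += weight
        degree[dst] += weight
    return degree


# Invented trajectories, for illustration only.
trajectories = {
    "user_a": ["Top of the Rock", "Times Square", "World Trade Center"],
    "user_b": ["Top of the Rock", "Times Square", "Times Square"],
    "user_c": ["Central Park", "Times Square", "World Trade Center"],
}
edges = build_tourist_graph(trajectories)
degree = weighted_degree(edges)
top_routes = edges.most_common(3)  # heaviest point-to-point moves first
```

With these made-up inputs, Times Square has the highest weighted degree because every trajectory passes through it, which mirrors how heavily connected vertices surface as popular attractions.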

The two figures below show the geographic distribution of the clustered attractions and the recommended routes for tourists based on the Markov clustering result: (A) the first cluster; (B) the second cluster. The red dashed line represents the route of the Big Bus downtown tour; the green dashed line represents the route of the Big Bus uptown tour; the purple dashed line represents the route of the Big Bus midtown tour.

Publication:
Hu F., Li Z., Yang C., Jiang Y. (2018) A graph-based approach to detect the tourist movement pattern using social media data. Cartography and Geographic Information Science, doi: 10.1080/15230406.2018.1496036

Twitter Census: Mining and mapping geotagged tweets to reveal human dynamics

The study of population stocks and human movements has historically been severely limited by the absence of reliable data or by the temporal sparsity of the data that are available. With geospatial digital trace data, population movements can be measured much more precisely and dynamically. Our research seeks to develop a near real-time Twitter census that gives a more temporally granular picture of the local and non-local population at the county level.

Publication:
Martin Y., Li Z., Ge Y., Towards real-time population estimates: introducing Twitter daily estimates of residents and non-residents at the county level, Applied Geography (under review)

Rapid Estimation of Visitation Activities in U.S. National Parks by Mining Big Social Media Data

Understanding the visitation activities of National Parks, which is critical for park planning, resource allocation, and effective management, depends on consistent, reliable, high-quality information about visitor use. Agency managers therefore devote a significant amount of staff time and funding to managing and monitoring the use of parks by visitors. Some primary concerns are how many people visit a park, what they do while they visit, how long they stay, and the characteristics of the ‘typical’ visitor.

This research aims to develop and evaluate a novel approach for rapidly estimating the characteristics associated with park visitation by mining billions of geotagged tweets. The five most visited National Parks in 2018 (Great Smoky Mountains NP, Grand Canyon NP, Rocky Mountain NP, Zion NP, and Yellowstone NP) will be used as case studies in the proposed project. The methodology developed in this research will provide additional information that helps park managers and NPS administrators understand visitation at NPS units more effectively. More broadly, we envision that the procedures and tools developed as part of this research will advance the state of the art in human mobility research by providing an innovative, cost-effective, and scalable approach to better understanding the movement of tourists, which can help promote sustainable tourism and regional economic development while facilitating environmental sustainability.

A preliminary tool for exploring park visitation. The map shows the potential origins of visitors to the Yellowstone National Park area in June 2017: green dots indicate visitors and red dots indicate the potential origins of those visitors. Each line connects a visitor to that visitor's potential origin.

Worldwide visitors to Yellowstone National Park

A zoom-in view

Project: Li Z., Kupfer J., Rapid Estimation of Visitation Activities in U.S. National Parks by Mining Big Social Media Data