Objective: The aim of this research was to quantitatively explore the relationship between building density and building completeness in OSM, and to verify whether the building density indicator can serve as a proxy for quantitatively estimating the completeness of OSM building data in urban areas.
Background: OpenStreetMap (OSM), the most representative example of volunteered geographic information, is a free geospatial dataset collected by volunteers worldwide[2]. One significant concern in using the data is its quality, because the majority of OSM data were collected by ‘non-specialists’ and ‘amateur geographers’[3]. Many studies have assessed various quality elements of OSM data (e.g. positional accuracy, completeness, thematic accuracy, logical consistency, temporal accuracy and usability), but these assessments often relied on a reference dataset for comparison[4]. However, a reference dataset may not always be available because of high cost, scale and use limitations. It is therefore necessary to be able to estimate the quality of an OSM dataset without any reference dataset[5].
Data: Datasets from three study areas in different countries and regions were used for testing. The first study area was the city of San Jose in the United States. The second and third study areas were two administrative regions in New Zealand (Waikato and Hawke’s Bay).
Methodology: The validation method comprised the following steps. First, delineate the urban areas of a study area; in this research, urban areas were delineated from the land use features available in the OSM dataset. Second, create a regular grid (e.g. 1×1 km) and overlay it on the delineated urban areas; the urban area falling within each grid cell was treated as the smallest (geographic) unit of analysis. Third, for the smallest unit in each grid cell, calculate the total building area in the OSM dataset and in the reference dataset, and from these calculate the OSM building density (i.e. the percentage of the area of a given geographic unit that is covered by OSM buildings) and the OSM building completeness (i.e. the ratio of the total area of OSM buildings to the total area of reference buildings in that geographic unit). Finally, plot and analyze the relationship between OSM building density and OSM building completeness.
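The two per-unit indicators defined in the methodology can be sketched in a few lines of Python. This is a minimal illustration rather than the code used in the study: the `CellStats` record, its field names, and the sample areas are hypothetical, and the sketch assumes the building and urban areas per grid cell have already been measured (e.g. in a GIS).

```python
from dataclasses import dataclass


@dataclass
class CellStats:
    """Hypothetical per-cell record; all areas in square metres."""
    urban_area_m2: float    # urban area falling inside the grid cell
    osm_building_m2: float  # total OSM building area within that urban area
    ref_building_m2: float  # total reference building area within it


def building_density(cell: CellStats) -> float:
    """OSM building density: percentage of the unit covered by OSM buildings."""
    return 100.0 * cell.osm_building_m2 / cell.urban_area_m2


def building_completeness(cell: CellStats) -> float:
    """OSM building completeness: OSM building area as a percentage of the
    reference building area (assumes the reference area is non-zero)."""
    return 100.0 * cell.osm_building_m2 / cell.ref_building_m2


# Made-up sample values for two analysis units (not data from the study).
cells = [
    CellStats(urban_area_m2=1_000_000.0, osm_building_m2=200_000.0, ref_building_m2=250_000.0),
    CellStats(urban_area_m2=400_000.0, osm_building_m2=20_000.0, ref_building_m2=100_000.0),
]

# One (density %, completeness %) pair per unit, ready for plotting.
metrics = [(building_density(c), building_completeness(c)) for c in cells]
```

Each pair in `metrics` corresponds to one point in the density versus completeness plot described in the final step.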
Results and Discussion: We found that: (1) the OSM building density and the OSM building completeness had an approximately linear relationship. More precisely, the OSM building completeness was approximately 3.4–4 times the OSM building density; (2) the absolute residuals (i.e. the absolute difference between the estimated completeness and the actual completeness) were smaller than 10% for approximately 70–80% of grid cells, and smaller than 20% for approximately 80–90% of grid cells. In addition, we found that the analysis unit should not be too small (e.g. smaller than 0.3 or 0.4 km²); otherwise, the corresponding absolute residuals may be larger.
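The linear estimate and the residual check described above can be illustrated as follows. This is a sketch under stated assumptions: the factor 3.7 is simply a value picked from the reported 3.4–4 range, the function names are hypothetical, and the sample densities and completeness values are made up for illustration.

```python
def estimate_completeness(density_pct: float, factor: float = 3.7) -> float:
    """Estimate completeness (%) as factor * density (%).
    The default factor is an assumed value within the reported 3.4-4 range."""
    return factor * density_pct


def share_within(densities, actual, threshold_pct, factor=3.7):
    """Fraction of units whose absolute residual
    |estimated completeness - actual completeness| is below threshold_pct."""
    residuals = [
        abs(estimate_completeness(d, factor) - a)
        for d, a in zip(densities, actual)
    ]
    return sum(r < threshold_pct for r in residuals) / len(residuals)


# Made-up sample values (percentages), one entry per analysis unit.
densities = [5.0, 10.0, 20.0, 25.0]
actual_completeness = [20.0, 30.0, 90.0, 60.0]

share_under_10 = share_within(densities, actual_completeness, 10.0)
share_under_20 = share_within(densities, actual_completeness, 20.0)
```

With real data, `share_under_10` and `share_under_20` would be compared against the 70–80% and 80–90% proportions reported above.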
Conclusion and Future Work: From the above results, we concluded that the building density indicator is a potential proxy for quantitatively estimating the completeness of OSM building data in urban areas. However, as the actual building density often varies among regions and among the smallest analysis units, there is still a need for an approach that adaptively establishes the mathematical relationship between OSM building density and OSM building completeness. More importantly, the building density indicator is only suitable for estimating the completeness of OSM buildings in urban areas; in rural areas, a low building density often corresponds to a high completeness. Therefore, indicators for estimating the completeness of OSM building data in rural areas also need to be developed.
ACKNOWLEDGMENT
The project was supported by the National Natural Science Foundation of China (No. 41771428) and the Fundamental Research Funds for the Central Universities, China University of Geosciences (Wuhan) (No. G1323541711), and was also funded by the Beijing Key Laboratory of Urban Spatial Information Engineering (No. 2017213). The author would like to express special thanks to all the anonymous reviewers and the editor for their valuable comments.
REFERENCES
[1] Q. Zhou, “Exploring the relationship between density and completeness of urban building data in OpenStreetMap for quality estimation,” International Journal of Geographical Information Science, vol. 32, pp. 257–281, 2018.
[2] M. Haklay and P. Weber, “OpenStreetMap: user-generated street maps,” IEEE Pervasive Computing, vol. 7, pp. 12–18, 2008.
[3] M. F. Goodchild, “Assertion and authority: the science of user-generated geographic content,” In: Proceedings of the colloquium for Andrew U. Frank’s 60th birthday. Austria: Vienna University of Technology. 2008.
[4] M. Haklay, “How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets,” Environment and Planning B: Planning and Design, vol. 37, pp. 682–703, 2010.
[5] V. Antoniou and A. Skopeliti, “Measures and indicators of VGI quality: an overview,” ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, vol. 2, pp. 345, 2015.