Towards Large Scale Façade Parsing: A Deep Learning Pipeline Using Mask R-CNN
Abstract
This thesis seeks a methodology for creating a working pipeline for building
facade parsing, one that can access large-scale panorama imagery from Google
Street View (GSV) and apply deep learning models to it. We propose a
semi-automated pipeline that integrates multiple systems for large-scale building
facade parsing.
One aim of the pipeline is to alleviate the limitations of using street-level panorama
images in deep learning applications by applying rectilinear projection. The rectilinear
projection transforms high-resolution 360-degree street-level panorama
imagery into a series of normal perspective images. The other aim of the pipeline is to
automatically retrieve large-scale panorama imagery from GSV and apply the
Mask R-CNN deep learning model to building facade parsing.
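The geometry behind such a projection can be illustrated with a minimal sketch. It assumes the panorama is stored in equirectangular format and that the perspective view is described by a horizontal field of view, a yaw (heading), and a pitch. The function name, parameter names, and axis conventions are illustrative assumptions, not the thesis's implementation.

```python
import math

def perspective_to_panorama(u, v, out_w, out_h, fov_deg, yaw_deg, pitch_deg,
                            pano_w, pano_h):
    """Map pixel (u, v) of a perspective view to equirectangular coordinates.

    Assumptions (hypothetical, for illustration only): the panorama is
    equirectangular, yaw 0 points at the panorama's horizontal center,
    and a positive pitch tilts the virtual camera upward.
    """
    # Focal length in pixels, derived from the horizontal field of view.
    f = (out_w / 2) / math.tan(math.radians(fov_deg) / 2)

    # Ray through the pixel in camera coordinates: x right, y down, z forward.
    x, y, z = u - out_w / 2, v - out_h / 2, f

    # Rotate the ray about the x axis by the pitch angle.
    p = math.radians(pitch_deg)
    y, z = y * math.cos(p) - z * math.sin(p), y * math.sin(p) + z * math.cos(p)

    # Rotate the ray about the vertical axis by the yaw angle.
    t = math.radians(yaw_deg)
    x, z = x * math.cos(t) + z * math.sin(t), -x * math.sin(t) + z * math.cos(t)

    # Convert the ray direction to longitude and latitude.
    lon = math.atan2(x, z)                  # range [-pi, pi]
    lat = math.atan2(-y, math.hypot(x, z))  # positive means above the horizon

    # Convert the angles to panorama pixel coordinates.
    px = (lon / (2 * math.pi) + 0.5) * pano_w
    py = (0.5 - lat / math.pi) * pano_h
    return px % pano_w, min(max(py, 0.0), pano_h - 1)
```

Evaluating this mapping for every pixel of the output image, and sampling the panorama at the returned coordinates, yields one perspective image; repeating it for several yaw values yields the series of images described above.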
The pipeline comprises the following steps: i) extracting street-level panorama imagery
from GSV and Mapillary, ii) applying rectilinear projection to the panorama imagery to
convert it into a series of images, iii) retrieving building footprints and identifying
building facade images on the map, iv) image pre-processing, v) implementing and
evaluating the Mask R-CNN model to detect building facade classes, vi) generating
inferences, such as the window-to-wall ratio of the detected classes, and vii) testing
with other external procedural building systems and depicting the results.
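The window-to-wall ratio mentioned in step vi) can be sketched as follows. This assumes the ratio is the number of window pixels divided by the number of facade pixels, and that the model's output has been converted to binary masks stored as nested lists. Both the definition and the function name are assumptions made for illustration; the thesis may compute the ratio differently.

```python
def window_to_wall_ratio(facade_mask, window_masks):
    """Return window pixel area divided by facade pixel area.

    Assumptions (hypothetical, for illustration only): every mask is a
    nested list of 0/1 values with the same shape, overlapping window
    masks are counted once, and window pixels outside the facade are ignored.
    """
    facade_area = sum(1 for row in facade_mask for px in row if px)
    if facade_area == 0:
        raise ValueError("facade mask is empty")

    window_area = 0
    for r, row in enumerate(facade_mask):
        for c, px in enumerate(row):
            # Count a pixel if it lies on the facade and in any window mask.
            if px and any(mask[r][c] for mask in window_masks):
                window_area += 1
    return window_area / facade_area


# Usage: a 4x4 facade with one 2x2 window covers a quarter of the wall.
facade = [[1, 1, 1, 1] for _ in range(4)]
window = [[0, 0, 0, 0],
          [0, 1, 1, 0],
          [0, 1, 1, 0],
          [0, 0, 0, 0]]
ratio = window_to_wall_ratio(facade, [window])
```

Taking the union of the window masks avoids double counting when two detections overlap, which is a design choice of this sketch rather than something stated in the abstract.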
In this thesis, we use different frameworks and tools along with
deep learning models for large-scale building facade parsing. The thesis discusses
the methodology and techniques, implementation, and experiments involved in developing
the pipeline based on Google Street View panorama imagery and Mask R-CNN.
As a result, we designed a semi-automated pipeline that comprises the above-mentioned
steps and processes. While developing the pipeline, we explored a wide
range of topics and integrated a variety of tools, frameworks, and algorithms. Using
the proposed semi-automated approach, we were able to generate datasets and train
a deep learning model on them, resulting in significant inferences that will serve as
the foundation for future development in the domain.
Degree
Student essay
Date
2022-04-27
Author
AYENEW, MOLA
Keywords
pipeline
semi-automated
building facade
parsing
panorama imagery
Google Street View
Mapillary
Mask R-CNN
deep learning
inference
rectilinear projection
Language
eng