Deeplayout
ACM Reference Format:
Wenming Wu, Xiao-Ming Fu, Rui Tang, Yuhan Wang, Yu-Hao Qi, and Ligang Liu. 2019. Data-driven Interior Plan Generation for Residential Buildings. ACM Trans. Graph. 38, 6, Article 234 (November 2019), 12 pages. https://[Link]/10.1145/3355089.3356556

∗ The corresponding authors

Authors’ addresses: Wenming Wu, University of Science and Technology of China, China, wwming@[Link]; Xiao-Ming Fu, University of Science and Technology of China, China, fuxm@[Link]; Rui Tang, Kujiale, China, ati@[Link]; Yuhan Wang, Kujiale, China, daishu@[Link]; Yu-Hao Qi, University of Science and Technology of China, China, qiyuhao7@[Link]; Ligang Liu, University of Science and Technology of China, China, lgliu@[Link].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@[Link].
© 2019 Association for Computing Machinery.
0730-0301/2019/11-ART234 $15.00
[Link]

1 INTRODUCTION
Designing a floor plan is an essential part of building a dwelling, as this plan indicates room connections, room types, room sizes, and wall lengths. Generally, floor plan design is an iterative, time-consuming trial-and-error process between interior designers and home users, which requires significant expertise and experience.
In this paper, we consider a different problem of automatically designing floor plans for residential buildings, given the boundary as input only. In computer games and networked virtual environments, efficiently and automatically generating floor plans from a given boundary is practical and in demand. In interior design, providing an initial design or candidate option for house renovation with the existing boundary is very useful.
Some techniques have been proposed for automatically generating floor plans [Hua 2016; Liu et al. 2013; Merrell et al. 2010; Wu et al. 2018]. Most of them have considered floor plan generation as constraint-based systems with high-level constraints, such as room sizes, positions, and adjacencies. However, these constraints are highly dependent on the knowledge given by the individual designers and they may even be inconsistent in some cases. The work of [Merrell et al. 2010] cannot maintain the input boundary.
In order to avoid enumerating all high-level constraints and improve the plausibility of the generated floor plans, we aim to develop a learning-based method for this task. That is, given a building boundary, its floor plan is automatically learned from a dataset of real floor plans without specifying any constraints. However, the challenges of achieving the goal are twofold. First, a large-scale dataset consisting of real floor plans with complete room annotations is needed.
ACM Trans. Graph., Vol. 38, No. 6, Article 234. Publication date: November 2019.
234:2 • W. Wu et al.
There is no such dataset yet. Second, it is non-trivial to design a highly individualized learning approach that can quickly generate intuitive floor plans comparable to those designed by humans, given only the boundary as input.
To this end, we propose a novel data-driven method to automatically and efficiently generate floor plans for residential buildings with the given boundary. First, we have built RPLAN, a large-scale dataset containing more than 80K real floor plans from residential buildings. Each floor plan is represented as vector graphics composed of labeled rooms and walls. Then we develop a two-stage approach to learn the floor plan based on the observation that professionals design floor plans in two phases [Rengel 2011]: (i) determining room connections and positions and (ii) computing room sizes and wall positions. To predict the room locations, we use the iterative learning method of [Ritchie et al. 2019; Wang et al. 2018] with one novel modification: a living room first strategy to improve the plausibility of our generated floor plans. Then, given the computed locations of the rooms, we train an encoder-decoder network to predict the wall positions, and use customized rules to transform a pixelated representation into a vector representation.
In numerous experiments, our method is able to learn design rules from the dataset and achieves better plausibility than existing methods. User studies have shown that it achieves floor plan results comparable to those created by humans. Moreover, our approach is efficient, generating one floor plan within an average of four seconds.
Our main contributions are as follows:
• We propose a novel learning method to automatically and efficiently generate floor plans for residential buildings given the building boundary as input only.
• To effectively train our networks, RPLAN, a large-scale dataset containing more than 80K real floor plans from residential buildings, is constructed with dense annotations.
To the best of our knowledge, this is the first time floor plan design has been automated using only the boundary as constraint without any pre-defined high-level constraints on rooms and walls. The constructed large-scale dataset RPLAN has considerable potential to inspire more research.

2 RELATED WORK
Layout synthesis in computer graphics. Synthesizing layouts is an essential topic in computer graphics and plays an important role in many applications, such as architecture, computer games, and virtual/augmented reality [Liggett 2000; Smelik et al. 2014]. There are many types of layouts, such as urban layouts [Aliaga et al. 2008; Chen et al. 2008; Peng et al. 2016, 2014; Yang et al. 2013], game level layouts [Hendrikx et al. 2013; Ma et al. 2014], architectural layouts [Bao et al. 2013; Harada et al. 1995; Müller et al. 2006], interior layouts [Feng et al. 2016; Merrell et al. 2010; Wu et al. 2018], indoor scenes [Fisher et al. 2012; Merrell et al. 2011; Wang et al. 2018; Yu et al. 2011], and page layouts [Harada et al. 1995; Li et al. 2019; O’Donovan et al. 2014]. In this paper, we focus on floor plan generation for residential buildings.

Floor plan generation. Floor plan generation may be defined as the process of determining the position and size of several rooms. Due to the combinatorial complexity, evolutionary algorithms and data-driven methods have been developed. For rectangular single-story dwellings, Michalek et al. [2002] propose an optimization model that is solved by combining gradient-based algorithms with evolutionary algorithms. An enhanced hybrid evolutionary algorithm is developed to generate a set of floor plans in the early design stages of architectural practice [Rodrigues et al. 2013a,b,c]. Bahrehmand et al. [2017] use an evolutionary approach to design an interactive layout solver. Given a small dataset with 120 architectural programs, Merrell et al. [2010] use a Bayesian network to learn attributes of rooms and synthesize the layout using a stochastic method without fixing the boundary. A data-driven method is proposed to estimate room dimensions and orientations [Rosser et al. 2017]. Some other methods have been proposed, such as a physically based space modeling method [Arvin and House 2002] and a mixed integer quadratic programming (MIQP) based method [Wu et al. 2018]. Wu et al. [2018] adopt high-level constraints as inputs and generate building interiors based on a MIQP formulation. An interactive system is presented to generate floor plans subject to design, user, and manufacturing constraints [Liu et al. 2013]. A lazy generation method is presented to generate grid-like interior layouts [Hahn et al. 2006]. Floor plans with irregular rooms are automatically created [Hua 2016]. These methods do not generate floor plans from scratch using only the given boundary as input.

Deep layout generation. Deep learning architecture holds some promise in addressing the layout generation problem. Deep convolutional neural networks [Ritchie et al. 2019; Wang et al. 2018] are used for indoor scene synthesis. LayoutGAN [Li et al. 2019] is a novel Generative Adversarial Network that synthesizes layouts for graphic design. A room layout is estimated from a single RGB panorama using a deep learning framework [Yang et al. 2019; Zou et al. 2018]. A floor plan is reconstructed from a rasterized floor plan image [Liu et al. 2017]. RGBD streams are used to automatically reconstruct a floor plan using a novel neural architecture, called FloorNet [Liu et al. 2018]. We also use the deep learning approach to generate floor plans for residential buildings, but only with the boundary as input. Aydemir et al. [2012] provide a dataset which contains 940 floors. There are only 870 floor plan images in [Liu et al. 2017] and 5,000 samples in [Kalervo et al. 2019]. Large-scale floor plans can be extracted from SUNCG [Song et al. 2017] consisting of large-scale synthetic indoor scenes; however, they are synthetic and cannot model the complexity of real floor plans or replace real floor plans. To this end, we propose a large-scale dataset with more than 80K real floor plans from residential buildings.

3 OVERVIEW
Problem. Our method receives as input a building’s outer boundary defined as the geometry of the exterior walls with an entrance (Fig. 2 (a)). Our goal is to generate a desired floor plan, as a layout of rooms and walls with room annotations (Fig. 2 (c) and (d)).

Challenges. However, there are two challenges. First, since only the boundary is given, it is non-trivial to design a highly individualized learning approach to automatically and efficiently generate intuitive floor plans that are comparable to the human-created floor
Data-driven Interior Plan Generation for Residential Buildings • 234:3
[Fig. 3 panels: (a) Statistics of our dataset; partially recoverable room-type counts from (a1): Wall-in 1043, GuestRoom 860, DiningRoom 1312, ChildRoom 3928, Entrance 292; x-axis of (a4): the total area of the floor plan, 60 to 120 m². (b) Floor plan. (c) Ground truth (wall mask, room mask).]
Fig. 3. Our dataset RPLAN. (a) Statistics on the occurrence of each room type (a1), the number of rooms per floor plan (a2), the proportion of the area of the living room (a3), and the number of floor plans according to the total area (a4). (b) One typical floor plan in our collected dataset. (c) For each floor plan, we abstract the necessary information used in our method, including the boundary mask (the entrance is shown in red), inside mask, (interior) wall mask, and room mask.
plans. The second challenge is that the data-driven method requires a large amount of training data. However, there is no such large-scale database of floor plans from real residential buildings.

Methodology. To solve the data problem, we propose RPLAN: a large-scale dataset with more than 80K floor plans from real residential buildings with labeled rooms and walls (Section 4). Inspired by human artists’ creative processes, we propose a two-stage approach to first locate rooms and then walls (Section 5). Fig. 2 shows the workflow of our proposed two-stage method. Given the building boundary (Fig. 2 (a)), the first stage is to predict room types and locations with an iterative learning scheme, shown as the red dots in Fig. 2 (b). The second stage is to locate walls using an encoder-decoder network and some dedicated rules (Fig. 2 (c) and (d)).

4 DATASET OF RPLAN
We have built RPLAN, a large-scale dataset of floor plans from residential buildings with semantic annotations at the pixel level (Fig. 3). To inspire more research, the dataset will be published later.

Data collection. We have collected more than 120K floor plans from real-world residential buildings in the estate market in Asia. The dataset is collected at our own expense, with private user and floor plan information removed. Hence, the floor plans in the dataset have no copyright issues. Each floor plan has a vector-graphics representation within a square region of 18m × 18m, including the geometric and semantic information as shown in Fig. 3 (b). To apply our learning scheme, we convert each floor plan into a 256 × 256 image.

Filtering. Real-world residential buildings often have some small areas for the flue, elevator and equipment platform. These are not our target since these areas are too small and may be randomly set. To avoid the interference of these factors and enhance the reliability of the dataset, we filter out some non-standard data for training and testing. We first remove floor plans that contain undefined room types or rooms with very low frequency. In our dataset, we have 13 kinds of rooms after filtering. Then, we only keep floor plans that satisfy all the following requirements:
(1) The total area of the floor plan is larger than 60 square meters and less than 120 square meters.
(2) The number of rooms in the floor plan is larger than 3 and less than 9.
(3) The floor plan has a living room.
(4) The proportion of the area of the living room to the total area of the floor plan is larger than 0.25 and less than 0.55.
(5) The average area of each room is larger than 10 square meters and less than 20 square meters.

Fig. 4. An iterative prediction model to predict room locations. Our prediction model consists of three deep networks. Given the boundary as input, our model first adopts a regression network to choose a location for the living room. Based on this, a location network and a continuing network are iteratively used to predict the types and locations of other rooms in a step-by-step manner. The iterative process stops when the continuing network decides not to add rooms.
After filtering, we are left with more than 80K floor plans in our dataset, whose statistics are shown in Fig. 3 (a). 75K of them are used for network training; half of the remaining data is used as the test set, and the other half as the validation set.
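
The filtering requirements (1) to (5) and the subsequent split can be sketched as follows. The `FloorPlan` record, its field names, and the `split_dataset` helper are hypothetical and only encode the stated thresholds; the random shuffle before splitting is likewise an assumption, since the paper does not describe how the split is drawn.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class FloorPlan:
    total_area: float        # square meters
    room_areas: List[float]  # one entry per room, square meters
    living_room_area: float  # 0.0 if the plan has no living room


def keep_floor_plan(plan: FloorPlan) -> bool:
    """Return True if the plan satisfies requirements (1)-(5)."""
    n_rooms = len(plan.room_areas)
    if not 60 < plan.total_area < 120:  # (1) total area
        return False
    if not 3 < n_rooms < 9:  # (2) number of rooms
        return False
    if plan.living_room_area <= 0:  # (3) has a living room
        return False
    if not 0.25 < plan.living_room_area / plan.total_area < 0.55:  # (4)
        return False
    average_area = sum(plan.room_areas) / n_rooms
    return 10 < average_area < 20  # (5) average room area


def split_dataset(plans, n_train, seed=0):
    """Shuffle, keep n_train plans for training, halve the rest into test/validation."""
    shuffled = list(plans)
    random.Random(seed).shuffle(shuffled)
    train, rest = shuffled[:n_train], shuffled[n_train:]
    half = len(rest) // 2
    return train, rest[:half], rest[half:]


good = FloorPlan(90.0, [30.0, 15.0, 12.0, 8.0, 6.0, 5.0], 30.0)
too_small = FloorPlan(55.0, [20.0, 12.0, 10.0, 8.0], 20.0)
print(keep_floor_plan(good), keep_floor_plan(too_small))
```

With the paper's numbers, `n_train` would be 75,000 and the remaining plans would be divided evenly between the test and validation sets.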
channel, and the semantics of rooms and walls in the third channel. We use specific integers (e.g., 0 for the living room) to represent different masks. In the fourth channel, we store extra information, using different integers to distinguish between different rooms with the same label.
5 TWO-STAGE APPROACH
5.1 Locating rooms
5.1.1 Living room first strategy.
Key observations. The living room is an indispensable part of the modern residence and has two key features: (1) it is usually located in the central area of the floor plan and (2) it is connected to most other rooms. Based on these observations, we develop a living room first strategy, which predicts the location of the living room first (Fig. 4). Once the living room is determined, the connectivity can be obtained by detecting the adjacencies between the living room and other rooms. A separate prediction model for the living room helps to improve the predictive accuracy and the overall rationality of the floor plan, as shown in Fig. 5.

Fig. 5. Illustration of the importance of our living room first strategy. Top row: examples generated by our method without the living room regression network. Bottom row: examples generated by our method. The living room is shown as a red block. (a1) lacks a living room which is necessary for a residential building. (b1) chooses a location for another room that is very close to the centroid of the living room. In (c1), the connection between the living room and the entrance is blocked by another room.

Our iterative strategy. Inspired by [Ritchie et al. 2019; Wang et al. 2018], we propose the following iterative strategy (Fig. 4):
(1) Computing the location of the living room.
(2) Deciding what type of room to add and where.
(3) Deciding whether to add another room. If yes, go to Step (2); otherwise stop the algorithm.

5.1.2 Living room regression. We determine the location of the living room through a regression network.

Training dataset. We build a training dataset for the regression network with a multi-channel image as the input and the location of the living room as the regression target. Since the shape of the living room is a polygon, we represent the location of the living room as the centroid of the polygon. The multi-channel input includes the following information at each pixel, which defaults to 0:
• Inside mask: taking a value of 1 for the interior.
• Boundary mask: taking a value of 1 for the exterior walls and 0.5 for the front door.
• Entrance mask: taking a value of 1 for the front door.

Network architecture. We use a modified ResNet-34 [He et al. 2016] to extract the spatial features from the multi-channel input. The ResNet-34 architecture is modified to use 256 × 256 multi-channel images as inputs. We drop the last average pooling layer and fully connected (FC) layer, and append two convolution layers. We then use batch normalization (BN) and leaky rectified linear unit (leaky ReLU) between the two convolution layers. We add an average pooling layer at the end of the network to obtain two-dimensional coordinates. We train the regression network using the robust smooth L1 loss [Girshick 2015], which is less sensitive to outliers than the L2 loss.
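
To make the regression target and the loss concrete, here is a minimal pure-Python sketch. It computes the centroid of a living-room polygon with the standard shoelace-based centroid formula and evaluates a smooth L1 loss in the form given by Girshick [2015] (quadratic for errors below 1, linear otherwise). The function names, the example polygon, the unit threshold, and the averaging over the two coordinates are assumptions of this sketch; the paper does not specify these details.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_centroid(vertices: List[Point]) -> Point:
    """Area centroid of a simple polygon (vertices in order, first not repeated)."""
    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    # C = sum((p_i + p_{i+1}) * cross_i) / (6 * A), with A = twice_area / 2.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)


def smooth_l1(pred: Point, target: Point) -> float:
    """Smooth L1 loss, averaged over the x and y coordinates."""
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d if d < 1.0 else d - 0.5
    return total / 2.0


# An L-shaped living room; its centroid is the regression target.
living_room = [(0.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0), (2.0, 4.0), (0.0, 4.0)]
target = polygon_centroid(living_room)
print(target, smooth_l1((2.0, 2.0), target))
```

Because the loss grows only linearly for large errors, a single badly placed prediction contributes less to the gradient than it would under an L2 loss, which is the robustness property the text refers to.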
Sampling. Our prediction map is generated with noise, which is very common for large generative models. To reduce the impact of the noise, we adopt a more direct sampling method to obtain the type and location of the new room at the same time, as shown in Fig. 7. Our sampling process contains the following steps:
(1) For each pixel p with a room label in the prediction map, we compute the number of pixels (denoted as N_p) that have the same label as p and are in the 18 × 18 neighborhood of p.
(2) Choose the location of the pixel that has the greatest N_p as the center of the newly added room.

5.2 Locating walls
Prediction-based locating strategy. The next step is to locate walls to allocate space for each room. Previous constraint-based methods can be directly applied by formulating room locations into constraints. However, to generate a plausible floor plan, additional constraints, such as geometric constraints and topology constraints, should be provided. On the one hand, these inputs complicate the design process for users, since constraint design requires increased consideration. On the other hand, a system that includes too many constraints may have no solutions due to the contradictions between
constraints. Therefore, we propose a prediction-based locating strategy, as shown in Fig. 8. Specifically, we first use an encoder-decoder network to predict walls in discrete pixels given the boundary and room locations. Then, a post-processing step is applied to convert the predicted walls into a vector representation.

Fig. 8. Given the boundary and predicted rooms, the encoder-decoder network predicts walls on the pixel level. We use a post-processing step to convert the predicted wall map into the vector representation.

5.2.1 Encoder-decoder prediction. After locating rooms, we have a series of room types and locations. The next step is to build walls based on this information. Given the boundary and predicted rooms, we use an encoder-decoder network to predict the locations of walls.
To build a training dataset for our wall locating network, as we did before, we first adopt a representation simplification for each floor plan in our dataset. The input for the network is the same as the room locating network. Our training target is a labeled image that is the same size as the input image. We have three kinds of labels for each pixel: WALL (i.e., a pixel belonging to walls), NOTHING (i.e., a pixel belonging to the interior but not part of any walls) and OUTSIDE (i.e., a pixel belonging to the exterior). We treat room doors as parts of walls, though we could predict locations for room doors at the same time. We will pursue this further in future work.
We use the same network architecture as in the room locating network due to the similar pixel-classification tasks. We then use averaged pixel-wise cross entropy loss to train our network.

5.2.2 Vectorization. Our network generates a wall map with discrete pixels representing walls. Although we obtain the approximate outlines of walls from the wall map, there are still a few issues remaining. We use a post-processing step to convert the wall map into a vector representation. We implement the vectorization using four steps, as shown in Fig. 9.

Fig. 9. Illustration of the process of vectorization. (a) Input noisy wall map. (b) Decomposed wall blocks with a fixed width. (c) Complete walls. (d) Final result with room geometry, doors, and windows.

Step (1). Decompose and fit the noisy predicted walls into rectangular parts. For a noisy wall map (Fig. 9 (a)), we first perform the morphological closing operation. Then, the walls are decomposed into vertical and horizontal wall blocks, which are represented as their bounding boxes. Finally, these wall blocks are transformed with a fixed wall width, as shown in Fig. 9 (b).

Step (2). Connect and align the wall blocks to recover the complete walls (Fig. 9 (c)). Separated blocks are connected by computing the intersection of horizontal and vertical wall blocks. For further optimization, we adjust the wall blocks locally: (1) wall blocks that are closer than a certain threshold are merged together; and (2) wall blocks are moved to align with other wall blocks or the exterior walls.

Step (3). Obtain the label for each pixel based on the predicted rooms and recovered walls. To derive room geometry from the normalized wall map, the predicted room locations are used to generate pixel-wise semantics according to the connection between pixels. We also compute the semantics of exterior walls, entrance, and interior walls. In Fig. 9 (d), we show the semantics.

Step (4). Set doors and windows. The connectivity is determined based on the key observation that most doors are connected to the living room. We use this prior information to add passageways between the living room and other rooms. In addition, a passageway between any other connected rooms is added. Two empirical rules are proposed to place the passageways:
(1) Open-walls are placed in public rooms (i.e., kitchen and balcony); otherwise, we place ordinary doors.
(2) We place the door in the wall that minimizes the distance from the door to the front door.
The windows are placed based on two empirical rules:
(1) We set French windows for the living room and small windows for the bathroom for the sake of privacy; otherwise, we set ordinary windows.
(2) Except for the bathroom, windows are laid out along the longest exterior wall segments within each room. We set at most one window at the center of the wall segments.
These heuristics are simple but workable. Fig. 9 (d) shows the final result of the floor plan. A learning-based method is expected to improve this process.

6 EXPERIMENTS
6.1 Implementation details
We use PyTorch to implement and train our networks. All models are trained and tested on an NVIDIA GeForce GTX 980 GPU. It takes around two days to train the regression network for the living room as well as the continuing network. Training takes around five days for both the room locating network and the wall locating network. The details of network architectures and training are available in the supplementary material.
Our regression network performs well with an average error of 0.82 meters (i.e., the actual distance from the predicted location to the ground truth) in the validation dataset. The continuing network reaches around 99% validation accuracy. In the pixel-classification task, we have a class imbalance problem, since most of the target labels are auxiliary labels (i.e., NOTHING and OUTSIDE). To fix this issue and improve the accuracy, we use weighted cross entropy loss
with a weight of 1.25 for the room class or the wall class and 1 for auxiliary classes.
At synthesis time, it takes around four seconds to generate a vectorized floor plan given the building boundary as input.

6.2 User study
We perform comparisons to other methods or human-created floor plans through user studies.

Table 1. Statistics for user studies. We report the number of participants (“#part”) who pass the vigilance tests, their age range, the average age a_avg, the standard deviation of their ages a_dev, the number of males and females, and the number of participants who practice interior design as a profession (“#prof”). For these professional designers, the time (in years) spent in this profession is recorded, and we report the average (y_avg) and the standard deviation (y_dev). We also record the time (in minutes) spent by participants in completing the study, and we report the average (t_avg) and the standard deviation (t_dev).
Competitor      #part  Age Range  a_avg/a_dev  Male, Female  #prof  y_avg/y_dev  t_avg/t_dev
ISSNet+MIQP     71     [18, 50]   25.71/5.37   48, 23        24     3.33/2.58    3.30/1.58
Stage1+MIQP     60     [18, 49]   24.90/4.50   41, 19        26     2.04/1.34    2.88/1.78
ISSNet+Stage2   58     [18, 50]   25.34/5.94   37, 21        29     2.41/2.08    3.23/1.58
Human           71     [20, 50]   26.62/5.60   45, 26        33     3.38/2.60    4.97/2.56

Competitors. We select state-of-the-art methods as the competitors. The networks for indoor scene synthesis [Wang et al. 2018] can be used to locate rooms. We denote the network of [Wang et al. 2018] as ISSNet. Given the room locations, the MIQP-based method [Wu et al. 2018] can be used to determine the wall positions. The method that first uses ISSNet to locate rooms and then uses MIQP to locate walls is the first competitor (denoted as ISSNet+MIQP). Replacing the ISSNet of ISSNet+MIQP with our first stage approach for locating rooms gives the second competitor (denoted as Stage1+MIQP). We substitute the MIQP of ISSNet+MIQP with our second stage approach for locating walls to define the third competitor (denoted as ISSNet+Stage2). The human-created floor plans are the fourth competitor (denoted as Human). We provide more details for ISSNet and MIQP in the supplementary material.

User studies. Given a pair of floor plans that share the same boundary, a forced-choice comparison task is designed, similar to [Wang et al. 2018]. In each task, each participant should choose the floor plan that they think is more plausible. To be fair, we randomly choose the examples used for user studies from our generated results. For each pair, the order of floor plans is randomized. We provide a questionnaire used for comparison to human-created floor plans in the supplementary material.
We use the same user study design for all four competitors. Each user study includes 30 forced-choice comparison tasks. In one out of every 15 tasks, we perform a “vigilance test”, in which an obviously wrong answer (specifically, one floor plan with a randomized, jumbled arrangement of random rooms) is displayed.
For each user study, the number of participants enrolled is 86, 81, 85 and 99, respectively. The participants are classified into general users and designers who practice interior design as a profession. If a participant does not achieve 100% accuracy on the vigilance tests, we discard this response. The statistics of the participants in each comparison study are shown in Table 1. For each participant, we record the number (denoted as N) of floor plans that are generated by our method and preferred by that participant.

Results. The average and standard deviation for all the general users or designers in one user study are denoted as N_avg and N_std, respectively. The histograms in Fig. 10 show the distributions of N. Note that in Table 1 and Fig. 10, we only record the data from participants who pass the vigilance tests in each user study.

Fig. 10. Distributions of N. From the top, each successive row represents the results of comparisons of ISSNet+MIQP, Stage1+MIQP, ISSNet+Stage2, and Human. Values shown in the panels (N_avg / N_std):
Competitor      General Users   Designers
ISSNet+MIQP     20.55 / 4.29    18.88 / 2.94
Stage1+MIQP     20.11 / 2.79    19.35 / 4.29
ISSNet+Stage2   20.76 / 3.25    18.72 / 3.67
Human           13.39 / 2.34    11.70 / 2.49

A score of around 14 (28 non-vigilance choices in total) indicates that the two methods are comparable. From the distributions in Fig. 10, our method is comparable to the competitor Human and outperforms the other three competitors.
Some participants failed the vigilance tests. We found that about 7% of participants spent less than 1 minute and 21% of participants spent less than 2 minutes finishing the comparison tasks. So quite a few participants were perfunctory and were dropped by the vigilance test.
We also observe that the average score of general users is higher than that of designers. For example, in the comparison to Human, N_avg is 13.39 for general users and 11.70 for designers. Designers usually focus on the details (e.g., orientations and relative sizes between rooms) of the floor plans, while the general users often judge the plausibility with personal preference. Our method may not learn those details very well. So designers favored the human-designed layouts more than general users.

6.3 Comparisons
Comparison to ISSNet+MIQP. ISSNet computes category probabilities on a single pixel, which operates like an image-to-pixel function, and has problems with global consistency and noise suppression
Bathroom Bathroom
Living room Bathroom
Bathroom Second room Second room
Living room
Living room Living room
Living room Master room
Master room Bathroom Kitchen
Kitchen
Kitchen
Bathroom
Second room
Second room Second room Balcony
Master room Kitchen Master room
Bathroom
Study room Kitchen Living room Living room
Balcony Living room Second room
Living room
Bathroom
Living room Second room Master room
Master room Kitchen Bathroom
Master room Kitchen
Balcony Bathroom
when sampling. MIQP is a hierarchical optimization with geometric constraints and topology constraints. To this end, room locations predicted by ISSNet are formulated into position constraints for MIQP. We do not use any connection constraints between two rooms since the connections are unclear. Moreover, too many constraints for MIQP may result in contradictions and lead to no solutions.
In the user study, an average of 20.55 floor plans generated by our method were preferred by general users and an average of 18.88 by designers, which indicates that participants preferred floor plans generated by our method to those generated by ISSNet+MIQP. Fig. 11 shows some representative results generated by ISSNet+MIQP and our method. ISSNet+MIQP generates floor plans with some necessary rooms missing, which leads to sparse space allocation (columns (a) and (b)), and some rooms with unreasonable geometric sizes and shapes (columns (c) and (d)). In contrast, our method performs better in terms of global consistency and achieves better plausibility.

Comparison to Stage1+MIQP. Stage1+MIQP uses our first stage approach to locate rooms and then uses MIQP to generate walls. Similar to ISSNet+MIQP, the room locations predicted by our network serve as the location constraints for MIQP. We also do not add any connection constraints between two rooms to avoid the issue of constraint contradiction.
In the user study, an average of 20.11 floor plans generated by our method were preferred by general users and an average of 19.35 by designers, which indicates that participants preferred results generated by our method over those generated by Stage1+MIQP. The room types and locations predicted by our network become the initializations for MIQP. Although our network performs well at predicting room types and locations, MIQP fails in many examples with a few issues. The first issue is accessibility. In Fig. 12, columns (a) and (b) show that the master rooms generated by Stage1+MIQP are blocked by other rooms and cannot be entered. In column (c), the entrance is incorrectly connected to the bathroom. Geometric dimensions are another problem. In columns (d) and (e), MIQP generates some rooms with abnormal sizes, which are not suitable for residential buildings.

Comparison to ISSNet+Stage2. While ISSNet serves as an image-to-pixel function, our room locating network can be treated as an image-to-image function, which predicts the possible locations of all types of rooms in an input image. Our network has greater integrity and consistency in global layouts compared to ISSNet. We then compare our method to ISSNet+Stage2, which uses ISSNet for locating rooms and our wall locating network to generate walls.
In the user study, an average of 20.76 floor plans generated by our method were preferred by general users and an average of 18.72 by designers, which indicates that participants preferred results generated by our method. ISSNet has the global consistency problem and may introduce noise due to its image-to-pixel learning process, which causes necessary rooms to be omitted in many cases, as shown in Fig. 13 (a) and (b). Column (c) shows that ISSNet+Stage2 generates a floor plan with only four rooms while our method generates a six-room floor plan. Although a four-room floor plan is acceptable, participants preferred the more intensively populated floor plan generated by our method. In column (e), ISSNet+Stage2 synthesizes an abnormal floor plan where the balcony is connected to the bathroom, which violates the privacy of the bathroom. Benefiting from the image-to-image learning process, our method performs better in terms of room distribution and thus achieves better global integrity and consistency compared to ISSNet+Stage2.
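To make the three comparisons above easier to read side by side, the following sketch converts the reported average scores into preference rates and margins over the parity score of 14 (out of 28 non-vigilance choices). The scores are quoted from the text; the names `avg_preferred`, `preference_rate`, and `summarize` are illustrative assumptions and not part of the original evaluation code.

```python
# Hypothetical helper for reading the user-study scores quoted in the text.
# Each score is the average number of comparisons (out of 28 non-vigilance
# choices) in which participants preferred the floor plan from our method.
TOTAL_CHOICES = 28
PARITY = TOTAL_CHOICES / 2  # a score of 14 means the two methods are comparable

avg_preferred = {
    "ISSNet+MIQP": {"general users": 20.55, "designers": 18.88},
    "Stage1+MIQP": {"general users": 20.11, "designers": 19.35},
    "ISSNet+Stage2": {"general users": 20.76, "designers": 18.72},
}


def preference_rate(score, total=TOTAL_CHOICES):
    """Fraction of comparisons won by our method."""
    return score / total


def summarize(scores):
    """Return (competitor, group, rate in percent, margin over parity) rows."""
    rows = []
    for competitor, by_group in scores.items():
        for group, score in by_group.items():
            rate = round(100 * preference_rate(score), 1)
            margin = round(score - PARITY, 2)
            rows.append((competitor, group, rate, margin))
    return rows


if __name__ == "__main__":
    for competitor, group, rate, margin in summarize(avg_preferred):
        print(f"{competitor:14s} {group:14s} {rate:5.1f}%  margin {margin:+.2f}")
```

Every margin is positive, which matches the statement that participants preferred our results over each of the three competitors, with designers giving consistently smaller margins than general users.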
Comparison to Human. We finally compare the floor plans generated by our method with original floor plans from our dataset. These floor plans are designed manually using a combination of intuition, prior experience, and professional knowledge.
In the user study, an average of 13.39 floor plans generated by our method were preferred by general users and an average of 11.70 by designers. Participants showed a slight preference for the human-designed floor plans over those generated by our method. In Fig. 14, columns (d) and (e) illustrate that our method generates floor plans similar to those created by humans with only a few differences, which supports the validity of our method. For columns (a), (b), and (c), our method provides different design options compared to the original human-created floor plans. Note that the human-created results of Fig. 11, Fig. 12, and Fig. 13 are shown in the supplementary material.

6.4 Evaluation and discussion
Room constraints. Our method can support the location constraint that specifies the locations of some rooms. According to personal preferences and requirements, users first choose the locations for several specific rooms inside the given boundary. Then our method generates the other rooms while respecting the user's design intent. Fig. 15 shows several examples generated by our method, where one or more room locations are specified by users.

Fig. 15. Examples of floor plans synthesized by our method, given one or two room locations specified by users (red dots). From left to right: no specified room, specified master room, and specified kitchen and bathroom.

Nearest neighbors. We examine whether our method generates floor plans beyond its training dataset rather than merely memorizing it. Given a boundary, we generate our result and use the boundary to search for the nearest neighbor in the training dataset. All of our results are different from the nearest neighbors. Fig. 16 shows three pairs of them. In addition, given the same boundary as input, our method can also generate results that differ from the human-created results, as shown in the user study. This indicates the variety of the results and the generalization ability of our method.

Non axis-aligned input. Although RPLAN only contains floor plans with axis-aligned walls, our method still works well when the input boundary is non axis-aligned. We show six examples in Fig. 17
and more examples in the supplementary material. This indicates that our model generates floor plans beyond its training dataset.

Fig. 16. Comparison to the nearest neighbors in the training dataset. Top row: the nearest neighbors. Bottom row: our synthesized results.

Fig. 17. Floor plan generation for non axis-aligned boundaries. In the bottom row, the examples contain curved walls.

Fig. 18. Synthesizing multiple floor plans given the same boundary as input.

Multiple floor plans. Since we sample a new room based on the maximum number of coverage points in the prediction map, only one solution is generated by our sampling method. To generate multiple floor plans given the same boundary as input, we instead draw a sample from the probability distribution of predicted rooms. Fig. 18 shows one example, where our method takes the same boundary as input and generates multiple floor plans.

Failure cases. While our method generates plausible floor plans, it does fail in some cases, as shown in Fig. 19. Our method may generate some rooms with inappropriate arrangements. As shown in (a), the door to the master room is too small for users to enter. Another failure case is caused by poor wall predictions. In (b), our network predicts walls with a lot of noise, which is difficult to deal with in post-processing, so the vectorization result is problematic. Although our post-processing works well in general, it cannot handle cases where necessary wall pixels are missing. The broken walls are hard for our post-processing to handle, which leads to incorrect space allocations in (c). We consider floor plans with incorrect walls as the problematic layouts (Fig. 19). We test the frequency of problematic floor plans using 100 generated examples. Of these, 94 are plausible, so problematic layouts occur in about 6% of cases.

7 CONCLUSION
Our method provides a novel data-driven technique for automatically and efficiently generating floor plans for residential buildings with a fixed boundary. By imitating the human design process, we propose a two-stage approach to generate floor plans. To effectively train our networks, a large-scale dataset containing more than 80K floor plans from real residential buildings is presented. By comparing the plausibility of floor plans through user studies, we show that our method outperforms state-of-the-art methods, and in some cases our floor plans are comparable to human-created ones.

Constraints in real life. In general, we propose an automatic algorithm for generating floor plans from the given boundaries. However, in actual design work, design with additional constraints (e.g., constrained square footage, support walls, and house orientation) seems to be more meaningful and challenging. One simple and
straightforward solution is to introduce more generative models for these additional constraints. Or, we could turn these unfamiliar constraints into room constraints which our method can deal with. We would like to explore more in the future.

More types of buildings. Currently, our method only designs floor plans for one-story residential buildings. For multi-story homes, stairs are necessary to connect two consecutive floors. Our method can be applied by conceptualizing the stairs as a type of room. We generate the first floor plan containing the stairs, and then the second floor plan is generated based on the first. The stairs generated in the first floor should serve as a constraint for the generation of the second floor. However, in the real world, stairs have special shapes and location considerations, and it may be hard for our deep networks to control the generation of stairs. Our two-stage approach, even without the living-room-first-strategy, can also be extended to other types of buildings, such as office buildings, shopping malls, and supermarkets. However, the prerequisite for all these considerations is that we must have relevant data.

ACKNOWLEDGMENTS
We would like to thank Kai Wang for providing their implementation of [Wang et al. 2018], user study participants for evaluating our results, and the anonymous reviewers for their constructive suggestions and comments. This work is supported by the National Natural Science Foundation of China (61802359, 61672482, 11626253) and the Fundamental Research Funds for the Central Universities (WK0010460006, WK0010450004).

REFERENCES
Daniel G. Aliaga, Carlos A. Vanegas, and Bedrich Benes. 2008. Interactive Example-based Urban Layout Synthesis. ACM Trans. Graph. 27, 5 (2008), 160:1–160:10.
Scott A Arvin and Donald H House. 2002. Modeling architectural design objectives in physically based space planning. Automation in Construction 11, 2 (2002), 213–225.
Alper Aydemir, Patric Jensfelt, and John Folkesson. 2012. What can we learn from 38,000 rooms? Reasoning about unexplored space in indoor environments. In 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 4675–4682.
Arash Bahrehmand, Thomas Batard, Ricardo Marques, Alun Evans, and Josep Blat. 2017. Optimizing layout using spatial quality metrics and user preferences. Graphical Models 93 (2017), 25–38.
Fan Bao, Dong-Ming Yan, Niloy J. Mitra, and Peter Wonka. 2013. Generating and Exploring Good Building Layouts. ACM Trans. Graph. 32, 4 (2013), 122:1–122:10.
Guoning Chen, Gregory Esch, Peter Wonka, Pascal Müller, and Eugene Zhang. 2008. Interactive Procedural Street Modeling. ACM Trans. Graph. 27, 3 (2008), 103:1–103:10.
Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L Yuille. 2018. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs. IEEE transactions on pattern analysis and machine intelligence 40, 4 (2018), 834–848.
Tian Feng, Lap-Fai Yu, Sai-Kit Yeung, KangKang Yin, and Kun Zhou. 2016. Crowd-driven Mid-scale Layout Design. ACM Trans. Graph. 35, 4 (2016), 132:1–132:14.
Matthew Fisher, Daniel Ritchie, Manolis Savva, Thomas Funkhouser, and Pat Hanrahan. 2012. Example-based Synthesis of 3D Object Arrangements. ACM Trans. Graph. 31, 6 (2012), 135:1–135:11.
Ross Girshick. 2015. Fast r-cnn. In Proceedings of the IEEE international conference on computer vision. 1440–1448.
Evan Hahn, Prosenjit Bose, and Anthony Whitehead. 2006. Persistent Realtime Building Interior Generation. In Proceedings of the 2006 ACM SIGGRAPH Symposium on Videogames. 179–186.
Mikako Harada, Andrew Witkin, and David Baraff. 1995. Interactive Physically-based Manipulation of Discrete/Continuous Models. In Proc. SIGGRAPH. 199–208.
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770–778.
Mark Hendrikx, Sebastiaan Meijer, Joeri Van Der Velden, and Alexandru Iosup. 2013. Procedural Content Generation for Games: A Survey. ACM Trans. Multimedia Comput. Commun. Appl. 9, 1 (2013), 1:1–1:22.
Hao Hua. 2016. Irregular architectural layout synthesis with graphical inputs. Automation in Construction 72 (2016), 388–396.
Ahti Kalervo, Juha Ylioinas, Markus Häikiö, Antti Karhu, and Juho Kannala. 2019. CubiCasa5K: A Dataset and an Improved Multi-Task Model for Floorplan Image Analysis. In Scandinavian Conference on Image Analysis. Springer, 28–40.
Jianan Li, Tingfa Xu, Jianming Zhang, Aaron Hertzmann, and Jimei Yang. 2019. LayoutGAN: Generating Graphic Layouts with Wireframe Discriminator. In International Conference on Learning Representations.
Robin S Liggett. 2000. Automated facilities layout: past, present and future. Automation in Construction 9, 2 (2000), 197–215.
Chen Liu, Jiaye Wu, and Yasutaka Furukawa. 2018. FloorNet: A Unified Framework for Floorplan Reconstruction from 3D Scans. In ECCV 2018. 203–219.
Chen Liu, Jiajun Wu, Pushmeet Kohli, and Yasutaka Furukawa. 2017. Raster-to-Vector: Revisiting Floorplan Transformation. In ICCV 2017. 2214–2222.
Han Liu, Yong-Liang Yang, Sawsan AlHalawani, and Niloy J. Mitra. 2013. Constraint-aware interior layout exploration for pre-cast concrete-based buildings. The Visual Computer 29, 6 (2013), 663–673.
Chongyang Ma, Nicholas Vining, Sylvain Lefebvre, and Alla Sheffer. 2014. Game Level Layout from Design Specification. Comput. Graph. Forum (EG) 33, 2 (2014), 95–104.
Paul Merrell, Eric Schkufza, and Vladlen Koltun. 2010. Computer-generated Residential Building Layouts. ACM Trans. Graph. 29, 6 (2010), 181:1–181:12.
Paul Merrell, Eric Schkufza, Zeyang Li, Maneesh Agrawala, and Vladlen Koltun. 2011. Interactive Furniture Layout Using Interior Design Guidelines. ACM Trans. Graph. 30, 4 (2011), 87:1–87:10.
Jeremy Michalek, Ruchi Choudhary, and Panos Papalambros. 2002. Architectural layout design optimization. Engineering optimization 34, 5 (2002), 461–484.
Pascal Müller, Peter Wonka, Simon Haegler, Andreas Ulmer, and Luc Van Gool. 2006. Procedural Modeling of Buildings. ACM Trans. Graph. 25, 3 (2006), 614–623.
Peter O’Donovan, Aseem Agarwala, and Aaron Hertzmann. 2014. Learning Layouts for Single-Page Graphic Designs. IEEE T. Vis. Comput. Gr. 20, 8 (2014), 1200–1213.
Chi-Han Peng, Yong-Liang Yang, Fan Bao, Daniel Fink, Dong-Ming Yan, Peter Wonka, and Niloy J. Mitra. 2016. Computational Network Design from Functional Specifications. ACM Trans. Graph. 35, 4 (2016), 131:1–131:12.
Chi-Han Peng, Yong-Liang Yang, and Peter Wonka. 2014. Computing Layouts with Deformable Templates. ACM Trans. Graph. 33, 4 (2014), 99:1–99:11.
Roberto J. Rengel. 2011. The Interior Plan: Concepts and Exercises. Fairchild Books.
Daniel Ritchie, Kai Wang, and Yu-an Lin. 2019. Fast and Flexible Indoor Scene Synthesis via Deep Convolutional Generative Models. In CVPR 2019.
Eugénio Rodrigues, Adélio Rodrigues Gaspar, and Álvaro Gomes. 2013a. An approach to the multi-level space allocation problem in architecture using a hybrid evolutionary technique. Automation in Construction 35 (2013), 482–498.
Eugénio Rodrigues, Adélio Rodrigues Gaspar, and Álvaro Gomes. 2013b. An evolutionary strategy enhanced with a local search technique for the space allocation problem in architecture, Part 1: Methodology. Computer-Aided Design 45, 5 (2013), 887–897.
Eugénio Rodrigues, Adélio Rodrigues Gaspar, and Álvaro Gomes. 2013c. An evolutionary strategy enhanced with a local search technique for the space allocation problem in architecture, Part 2: Validation and performance tests. Computer-Aided Design 45, 5 (2013), 898–910.
Julian F. Rosser, Gavin Smith, and Jeremy G. Morley. 2017. Data-driven estimation of building interior plans. International Journal of Geographical Information Science 31, 8 (2017), 1652–1674.
Ruben M. Smelik, Tim Tutenel, Rafael Bidarra, and Bedrich Benes. 2014. A Survey on Procedural Modelling for Virtual Worlds. Comput. Graph. Forum 33, 6 (2014), 31–50.
Shuran Song, Fisher Yu, Andy Zeng, Angel X Chang, Manolis Savva, and Thomas Funkhouser. 2017. Semantic scene completion from a single depth image. In CVPR. 1746–1754.
Kai Wang, Manolis Savva, Angel X. Chang, and Daniel Ritchie. 2018. Deep Convolutional Priors for Indoor Scene Synthesis. ACM Trans. Graph. 37, 4 (2018), 70:1–70:14.
Wenming Wu, Lubin Fan, Ligang Liu, and Peter Wonka. 2018. MIQP-based Layout Design for Building Interiors. Comput. Graph. Forum (EG) 37, 2 (2018), 511–521.
Shang-Ta Yang, Fu-En Wang, Chi-Han Peng, Peter Wonka, Min Sun, and Hung-Kuo Chu. 2019. DuLa-Net: A Dual-Projection Network for Estimating Room Layouts from a Single RGB Panorama. In CVPR 2019.
Yong-Liang Yang, Jun Wang, Etienne Vouga, and Peter Wonka. 2013. Urban Pattern: Layout Design by Hierarchical Domain Splitting. ACM Trans. Graph. 32, 6 (2013), 181:1–181:12.
Lap-Fai Yu, Sai-Kit Yeung, Chi-Keung Tang, Demetri Terzopoulos, Tony F. Chan, and Stanley J. Osher. 2011. Make It Home: Automatic Optimization of Furniture Arrangement. ACM Trans. Graph. 30, 4 (2011), 86:1–86:12.
Chuhang Zou, Alex Colburn, Qi Shan, and Derek Hoiem. 2018. Layoutnet: Reconstructing the 3d room layout from a single rgb image. In CVPR 2018. 2051–2059.
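As a supplementary illustration of the "Multiple floor plans" paragraph in Section 6.4, the sketch below contrasts a deterministic selection rule, which always yields the same room location, with drawing the location from the predicted probability distribution, which yields varied results for the same boundary. It is a simplified stand-in and not the implementation used in the paper: the deterministic rule here is a plain per-cell maximum rather than the coverage-point count described in the text, and the names `pick_location_max`, `pick_location_sampled`, and the toy `prob_map` are assumptions made for this example.

```python
import random


def pick_location_max(prob_map):
    """Deterministic choice: the cell with the highest predicted score.

    Assumption: a per-cell maximum stands in for the paper's
    coverage-point criterion to keep the example short.
    """
    best_cell, best_p = None, float("-inf")
    for r, row in enumerate(prob_map):
        for c, p in enumerate(row):
            if p > best_p:
                best_cell, best_p = (r, c), p
    return best_cell


def pick_location_sampled(prob_map, rng):
    """Stochastic choice: draw a cell with probability proportional to its score."""
    cells = [(r, c) for r, row in enumerate(prob_map) for c in range(len(row))]
    weights = [p for row in prob_map for p in row]
    if sum(weights) <= 0:
        raise ValueError("prediction map has no positive entries")
    return rng.choices(cells, weights=weights, k=1)[0]


# Toy 3x3 prediction map for one room type (values are made up).
prob_map = [
    [0.0, 0.1, 0.0],
    [0.1, 0.5, 0.2],
    [0.0, 0.1, 0.0],
]

if __name__ == "__main__":
    rng = random.Random(0)
    print("deterministic:", pick_location_max(prob_map))
    draws = {pick_location_sampled(prob_map, rng) for _ in range(200)}
    print("distinct sampled locations:", sorted(draws))
```

Repeating the stochastic draw for each room in turn would produce different room arrangements from one boundary, which is the behavior Fig. 18 illustrates.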