Sketch2Pix @ CDRF

Machine-Augmented Architectural Sketching

Kyle Steinfeld for CDRF, June 2020

Abstract

Presented here is a technical account of the development of an augmented architectural drawing tool Sketch2Pix: an interactive application that supports architectural sketching augmented by automated image-to-image translation processes (Isola, 2016). In this paper, we account not only for the "front-end" experience of this tool, but also detail our approach to offering novice designers access to the "full stack" of processes and technologies required to craft and interact with their own augmented drawing assistant. This workflow includes: the establishment of a training data set; the training and validation of a neural network model; the deploying of this model in a graphic sketching environment; and the configuration of the sketching environment to facilitate a "conversation" with an AI partner. Our approach is novel both in the technical barriers it removes, and in the facilitation of access for creative design practitioners who may not be well-versed in the underlying technologies. The accessibility of our approach is demonstrated by a case study in the Spring of 2020. At this time, a group of novice undergraduate students of design at UC Berkeley adopted Sketch2Pix to produce a series of image-to-image translation models, or "brushes", each trained on a designer's own data, and for a designer's own use in sketching architectural forms. By empowering creative practitioners to more easily shape the configuration of machine-augmented tools, this project seeks to serve as a harbinger of novel modalities of authorship that we might expect in this still-emerging paradigm of computer-assisted design.

Sketch2Pix: An interactive application for architectural sketching augmented by automated image-to-image translation

Code related to this project is available at
github.com / ksteinfe / runway_sketch2pix

Hello.
Today I'd like to offer 
    a technical accounting 
    of the development of 
        an **augmented architectural drawing tool** 

Sketch2Pix
    is an interactive application 
    that supports architectural sketching 
    augmented by automated image-to-image translation

In the paper,
    I detail the technologies, 
        workflows, 
        and interfaces 
    that enable novice creative authors 
    to train and wield 
    custom-made image-to-image translation models 
    in support of machine-augmented sketching. 
    
Here, 
I'll cover this material 
    only in overview,
and from the perspective 
    of a user of this tool.
    
I'll begin with a brief discussion 
    of our motivation for the project, 
    
along with a description of 
    some basic terms 
    and the central mechanisms 
    of the tool that support creative sketching.

At the end of the talk,
    if there's time,
I'll attempt a live demo of the tool.

Introduction

Motivation & Related Work

'Image Generated by a Convolutional Network - r/MachineLearning.'
Reddit user swifty8883. June 16, 2015

The intersection 
    of recent advances in machine learning 
    and architectural design tools 
remains a largely undefined territory.

Delirious Facade
A "hybrid" facade combining the overall form of one facade selected from the city of Toronto with the fine features of another
LAMAS, 2016

While it is clear that AI-based design tools 
    present fresh challenges 
    to the way we understand and use software 
    in the service of designing and building, 
the nature of these challenges, 
    and the character of the related opportunities, 
are still uncertain.

Face to Pix2Pix Facade
Koki Ibukuro, 2019

This project represents one step 
toward demonstrating an opportunity 
that we have identified in bringing ML to bear 
on better supporting modes of reasoning 
central to creative design.

Suggestive Drawing among Human and Artificial Intelligences
Here, an ML model has been trained to understand the transformation from line drawings to a whole range of objects: from flowers to patterned dresses.

While, broadly speaking, 
existing paradigms for design tools 
have been shown to effectively support 
    deductive reasoning, 
    
we assert that 
a new paradigm 
based on machine learning 
will be uniquely suited 
to address heretofore unsupported modes 
of reasoning about design, 
    such as imagistic and inductive thinking.

Animation generated by Artbreeder
Kyle Steinfeld, 2020

This difference is clear 
from the intrinsic mechanisms 
of the tools involved: 

while 3d modeling proceeds 
    by composing geometric operations in sequence, 
    
and parametric modeling proceeds 
    by assembling elaborate chains of logic, 
    
machine learning models proceed 
    by matching patterns drawn 
    from prior data.

Hesse: Edges to Cats, 2017
Here, an ML model has been trained to understand the transformation from a line-drawing of a cat to a photographic image of a cat. Once training is complete, this model will attempt to create a cat-like image of any given line drawing.

Just as we would expect an author 
    to reason differently 
    when wielding a tool of induction 
    as opposed to a tool of deduction 
        
        a charcoal pencil as opposed to a calculator
        
we would similarly expect 
    entirely new cultures of design 
    to emerge in response 
    to paradigm shifts 
    in the nature of software.

Our project anticipates just this.

The tool described in this project is thoroughly inductive.

This is clear 
insofar as we first invite a designer 
to define an indexical relationship 
between a hand-drawn mark 
and the qualities of an image 
it is mapped to, 

and only then 
to find the "reason" 
for these images 
in the composition of a drawing.

Scott Eaton, 2019
Scott Eaton is a mechanical engineer and anatomical artist who uses custom-trained **transfer models** as a "creative collaborator" in his figurative drawings.

This is how the artist Scott Eaton works, 
as we see in this animation.

So, in this way, 
the project seeks to anticipate 
new cultures of drawing practice 
that will arise in relation 
to this tool 
    and to others that take a similar approach.
    
From this, we hope to uncover 
new modalities of authorship 
latent in this still-emerging paradigm 
of computer-assisted design.

Terms

"brush"

is a neural network

trained to perform image-to-image translations

ready for use in sketching.

In the parlance established by the project, 
a trained neural network 
that is ready for use in sketching 
    is termed a "brush".

The activation of a brush

to perform an image translation,

is alternatively termed

a "transaction"

or an "inference".

    
The activation of a brush 
    to perform an image translation 
    is alternatively termed 
        a "transaction" 
        or an "inference".

Central Mechanisms

The composition of a sketch using four distinct brushes, activating each multiple times.
Kyle Steinfeld, 2020

Embedded in a brush 
is the central mechanism 
of the augmented drawing activity. 

Envisioned as a conversation, 
this activity is expressed 
as a graphic "call and response". 

First, 
a human author composes a drawing 
that may or may not adhere 
to some anticipated graphic convention. 

Next, this "call" image is passed 
to a neural net 
trained to offer a "response" image, 
which is passed back to the sketching environment 
to be displayed to the author. 

If the results are not satisfactory, 
this process may be repeated 
by modifying the original drawing, 
or the author may choose 
to move on to a new portion of the sketch. 

Due to certain technical limitations, 
the current system is constrained 
to transacting with images 
smaller than the typical size 
of most architectural sketches. 

For this reason, 
each call-and-response transaction 
typically represents 
only a portion or segment 
of a larger drawing. 

To mitigate this limitation, 
mechanisms are provided 
to allow for the tiling and layering 
of multiple response images 
in a single drawing.
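
As a rough illustration of the tiling and layering just described, here is a minimal sketch in Python. It is not the Sketch2Pix implementation: the 256-pixel tile size and the names `tile_boxes` and `over` are assumptions made for this example. It shows how a larger drawing might be cut into transaction-sized segments, and how a "response" pixel with transparency can be layered over what is already on the canvas.

```python
from typing import Iterator, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels
RGBA = Tuple[float, float, float, float]  # each channel in the range 0.0 to 1.0


def tile_boxes(width: int, height: int, tile: int = 256) -> Iterator[Box]:
    """Yield boxes that cover a width x height drawing in tile-sized segments.

    The tile size of 256 is an assumed value, not the tool's actual limit.
    Boxes on the right and bottom edges are clipped to the drawing bounds.
    """
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            yield (left, top, min(left + tile, width), min(top + tile, height))


def over(top: RGBA, bottom: RGBA) -> RGBA:
    """Layer one RGBA pixel over another with the standard 'over' operator."""
    ta, ba = top[3], bottom[3]
    out_a = ta + ba * (1.0 - ta)
    if out_a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)  # both pixels fully transparent
    r, g, b = (
        (t * ta + b_ * ba * (1.0 - ta)) / out_a
        for t, b_ in zip(top[:3], bottom[:3])
    )
    return (r, g, b, out_a)


boxes = list(tile_boxes(600, 300, tile=256))
print(len(boxes))  # 6
print(boxes[-1])   # (512, 256, 600, 300)
```

Each box would correspond to one call-and-response transaction. A real brush would likely expect a fixed input size, so clipped edge tiles would need padding, which this sketch leaves out.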

Thank You to the Students at UC Berkeley

We acknowledge that this research is closely related to an upper-division undergraduate research studio offered at UC Berkeley in the Spring of 2020. This studio was instructed by the author, and attended by an inspiring group of sixteen students. We would like to thank the students of this course for their generous willingness to participate in the case study described here. This resilient group of young designers faced difficult circumstances with grace, and adopted unfamiliar methods with enthusiasm.

This research is closely related 
to an upper-division undergraduate research studio 
offered at UC Berkeley 
in the Spring of 2020.

Four Steps from a User's Perspective

I'll now offer a very quick overview
of the specific 
    tools, methods, and workflows 
that constitute the Sketch2Pix 
augmented sketching tool.

Since most of the technical details 
are covered in the paper, 
here I'll speak solely to the experience of the user.


So, 
assuming you're someone 
who would like to make their own 
augmented drawing partner, 
there are four steps required:

1. The establishment of a training data set. 
    For us, data sets are derived 
    from a combination of 3d scanning 
    and texture modeling, 
    a process supported 
    by a collection of user-friendly scripts.

2. The training and validation of a neural network model. 
    This is done through 
    a GPU cloud computing environment, 
    and is supported by 
    a collection of user-friendly Python scripts.

3. The deploying of a trained model 
to allow communication 
with a graphic sketching environment. 
    This is achieved through 
    a model hosting and distribution service 
    called "RunwayML".

4. The configuration and customization of the sketching environment 
to facilitate a "conversation" with an AI partner. 
    This is achieved 
    through a bespoke plugin 
    for Adobe Photoshop.

The most critical step 
in crafting a Sketch2Pix brush 
is the establishment 
of a set of data on which to train.

Establishment of Training Data

Here, 
the central operation of a brush is established, 
as expressed by the mapping 
between a graphic "call" provided by a human author 
and the desired image "response" of the neural network.

A Sketch2Pix "brush" trained on images of a bowling pin.


Because a data set of substantial size 
is required for training, 
pipelines are here developed 
for the establishing of training data 
using texture-mapped 3d models.
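
To make the idea of a "call" and "response" mapping concrete, here is a minimal sketch of how a folder of rendered images might be organized into training pairs, with some held back for validation. The file-naming scheme (`_call.png`, `_response.png`), the function names, and the split fraction are all hypothetical and chosen for this example; they are not taken from the project's scripts.

```python
import random
from typing import Dict, List, Tuple

Pair = Tuple[str, str]  # (call image filename, response image filename)


def pair_by_stem(filenames: List[str]) -> List[Pair]:
    """Match '<stem>_call.png' with '<stem>_response.png'; skip unmatched files."""
    calls: Dict[str, str] = {}
    responses: Dict[str, str] = {}
    for name in filenames:
        if name.endswith("_call.png"):
            calls[name[: -len("_call.png")]] = name
        elif name.endswith("_response.png"):
            responses[name[: -len("_response.png")]] = name
    return [(calls[s], responses[s]) for s in sorted(calls) if s in responses]


def split_pairs(
    pairs: List[Pair], val_fraction: float = 0.1, seed: int = 0
) -> Tuple[List[Pair], List[Pair]]:
    """Shuffle deterministically, then hold back a fraction for validation."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction)) if shuffled else 0
    return shuffled[n_val:], shuffled[:n_val]


files = [
    "pin_000_call.png", "pin_000_response.png",
    "pin_001_call.png", "pin_001_response.png",
    "pin_002_call.png",  # no matching response, so it is skipped
]
pairs = pair_by_stem(files)
train, val = split_pairs(pairs, val_fraction=0.5)
```

Keeping the two halves of each pair bound together is the point: the network only ever learns the mapping that these pairs express.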

With a training set established, 
the next step in crafting a Sketch2Pix brush
is the training and validation 
of a neural net.

Training and Validation of a Neural Network

Because most novice designers 
lack direct access to computing resources and expertise
necessary to complete this training 
in a reasonable time, 
we provide guidance and resources 
for completing this step 
in a cloud computing environment.

Having trained a valid Sketch2Pix model, 
the next step is the deploying of this model 
to a hosting environment 
that facilitates connection 
with a graphic sketching environment.

Deploying a Trained Model

While in the recent past, 
such a step would present considerable difficulty 
for a novice user, 
the recent introduction 
of model hosting and distribution services 
has significantly eased this burden 
and lowered barriers to entry. 

Sketch2Pix is registered with one such service: 
    RunwayML 
    
Given this service, 
hosting a model is a relatively trivial process 
from the user's perspective.
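
For a sense of what passes between a sketching environment and a hosted model, here is a minimal sketch of one common way to carry an image in a web request: as a base64 data URI inside a JSON body. This is an illustration only. The field name `image`, the helper names, and the choice of encoding are assumptions, and are not taken from the RunwayML documentation or from the Sketch2Pix plugin.

```python
import base64
import json

PNG_PREFIX = "data:image/png;base64,"


def encode_call(png_bytes: bytes, field: str = "image") -> str:
    """Wrap PNG bytes as a JSON request body carrying a base64 data URI.

    The field name is a hypothetical placeholder, not a documented API.
    """
    uri = PNG_PREFIX + base64.b64encode(png_bytes).decode("ascii")
    return json.dumps({field: uri})


def decode_response(body: str, field: str = "image") -> bytes:
    """Recover PNG bytes from a JSON body shaped like the request."""
    uri = json.loads(body)[field]
    if not uri.startswith(PNG_PREFIX):
        raise ValueError("expected a PNG data URI")
    return base64.b64decode(uri[len(PNG_PREFIX):])


# Stand-in bytes; a real call would carry the pixels of the drawn segment.
call_bytes = b"\x89PNG\r\n\x1a\n" + b"placeholder"
request_body = encode_call(call_bytes)
round_trip = decode_response(request_body)
```

From the user's side none of this is visible: the "call" leaves the sketching environment, and a "response" image comes back.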

With a model deployed 
to a hosting and distribution service, 
the final step in developing a Sketch2Pix brush 
involves the mechanisms 
of interacting with this model, 

which may now be regarded 
as a functional "drawing assistant".

Interactive Sketching

The details of this collaborative drawing activity 
are largely guided 
by the features of the interface developed 
for the given graphic sketching environment. 

A large amount of development time 
was spent noodling 
on how to get Adobe Photoshop 
    to interact with a hosted model
in a way that is supportive of architectural sketching.

I would just note here that 
    how to best structure 
    a successful interaction 
    with an AI assistant 
    is a current topic 
    in HCI research. 


From this perspective, 
    we strongly value the iterative development 
    of small-scale graphics 
    in the service of larger compositions. 

This is because, 
    in our view, 
ease of iteration is essential 
to facilitating the feeling 
of a "conversation" 
with an AI partner. 

- - - - - 

One may see this value reflected in the decision to integrate alpha-channel generation into the Sketch2Pix model itself, as transparency is necessary for effective layering. 

We also see it in the way we've handled the Photoshop layer structure, and in the protocols for returning "response" images. 

We encourage our users to, at the start of a collaborative sketching session, establish two layer groups in Photoshop: one that will contain the hand-drawn "call" images, and a second that will house those "response" images returned by the brush. 

Rather than replacing the user-drawn image with the computer-generated one at each transaction, our application returns the generated image to its own Photoshop layer, and links the source layer to the generated layer. 

Then, if the user modifies their hand-drawing, the generated image may be updated rather than duplicated. This pattern of user interaction is intended to encourage the sort of iterative development that is commonly practiced in design drawing.
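
The update-rather-than-duplicate pattern can be sketched as a small data model. This is plain Python and does not use the Photoshop scripting interface; the class name `SketchSession`, its fields, and the layer-naming scheme are hypothetical, chosen only to illustrate the two layer groups and the link between a "call" layer and its "response" layer.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SketchSession:
    """Tracks which 'response' layer belongs to which hand-drawn 'call' layer."""

    call_group: List[str] = field(default_factory=list)
    response_group: List[str] = field(default_factory=list)
    links: Dict[str, str] = field(default_factory=dict)  # call layer -> response layer
    pixels: Dict[str, bytes] = field(default_factory=dict)  # response layer -> image data

    def transact(self, call_layer: str, response_image: bytes) -> str:
        """Store a response for a call layer, updating in place if one exists."""
        if call_layer not in self.call_group:
            self.call_group.append(call_layer)
        response_layer = self.links.get(call_layer)
        if response_layer is None:
            # First transaction for this drawing: create and link a new layer.
            response_layer = f"{call_layer} (response)"
            self.links[call_layer] = response_layer
            self.response_group.append(response_layer)
        # Later transactions overwrite the linked layer rather than adding one.
        self.pixels[response_layer] = response_image
        return response_layer


session = SketchSession()
first = session.transact("roof", b"version 1")
second = session.transact("roof", b"version 2")  # the author revised the drawing
```

After the second transaction there is still only one response layer for "roof", now holding the newer image, so the layer stack does not grow with every revision.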

Live Demo?

Results and Reflection

In this talk 
I've presented the mechanisms 
of a tool that supports 
**machine-augmented architectural sketching**

an activity that we see 
as a special case 
of **inductive design thinking**.


I'd like to close with two reflections.

First is a reminder 
of the central role 
of iteration in design, 
and of the small discoveries we made
on how to instrumentalize iteration 
in the user interface of design tools. 

We expect that ease of iteration 
will become even more essential 
in facilitating the feeling 
of a "conversation" with an AI partner.


Second is an indication 
of the challenges that lie ahead 
in the shift from geometric / logical modeling activities 
to those required by training an ML model. 

We see such a challenge 
in the role of the 3d models 
produced in the service of training. 

Here, users must shift their thinking 
away from models playing a directly representational role, 
    where the model stands in 
    for the building we envision. 
    
Instead, these models play a loosely indexical role, 
where the model defines 
the relationships and patterns 
to be learned by a neural network, 
which prefers information 
such as colors and patterns 
over forms and organizations 

We expect to find many such differences 
between traditional forms of computer-aided design, 
and emerging forms based on ML.

Thank you.