Sketch2Pix @ CDRF

Machine-Augmented Architectural Sketching

Kyle Steinfeld for CDRF, June 2020

Abstract

Presented here is a technical account of the development of an augmented architectural drawing tool, Sketch2Pix: an interactive application that supports architectural sketching augmented by automated image-to-image translation processes (Isola et al., 2016). In this paper, we account not only for the "front-end" experience of this tool, but also detail our approach to offering novice designers access to the "full stack" of processes and technologies required to craft and interact with their own augmented drawing assistant. This workflow includes: the establishment of a training data set; the training and validation of a neural network model; the deploying of this model in a graphic sketching environment; and the configuration of the sketching environment to facilitate a "conversation" with an AI partner. Our approach is novel both in the technical barriers it removes, and in the facilitation of access for creative design practitioners who may not be well-versed in the underlying technologies. The accessibility of our approach is demonstrated by a case study conducted in the Spring of 2020, in which a group of novice undergraduate students of design at UC Berkeley adopted Sketch2Pix to produce a series of image-to-image translation models, or "brushes", each trained on a designer's own data, and for a designer's own use in sketching architectural forms. By empowering creative practitioners to more easily shape the configuration of machine-augmented tools, this project seeks to serve as a harbinger of novel modalities of authorship that we might expect in this still-emerging paradigm of computer-assisted design.

Sketch2Pix:
An interactive application for architectural sketching augmented by automated image-to-image translation
(left) AI Augmented Axonometric Sketch, Robert Carrasco, 2020
(right) AI Augmented Landscape Drawing, Sarah Dey, 2020

Code related to this project is available at
github.com/ksteinfe/runway_sketch2pix

Hello.
Today I'd like to offer 
    a technical accounting 
    of the development of 
        an **augmented architectural drawing tool** 

Sketch2Pix
    is an interactive application 
    that supports architectural sketching 
    augmented by automated image-to-image translation
    


Agenda

While in the paper,
    I detail the technologies, 
        workflows, 
        and interfaces 
    that enable novice creative authors 
    to train and wield 
    custom-made image-to-image translation models 
    in support of machine-augmented sketching. 
    
Here, 
I'll cover this material 
    only in overview,
and from the perspective 
    of a user of this tool.
    
I'll begin with a brief discussion 
    of our motivation for the project, 
    
along with a description of 
    some basic terms 
    and the central mechanisms 
    of the tool that support creative sketching

At the end of the talk,
    if there's time,
I'll attempt a live demo of the tool.


Introduction

Motivation & Related Work

'Image Generated by a Convolutional Network - r/MachineLearning.'
Reddit user swifty8883. June 16, 2015

The intersection 
    of recent advances in machine learning 
    and architectural design tools 
remains a largely undefined territory. 

Delirious Facade
A "hybrid" facade combining the overall form of one facade selected from the city of Toronto with the fine features of another
LAMAS, 2016

While it is clear that AI-based design tools 
    present fresh challenges 
    to the way we understand and use software 
    in the service of designing and building, 
the nature of these challenges, 
    and the character of the related opportunities, 
are still uncertain. 


Face to Pix2Pix Facade
Koki Ibukuro, 2019

This project represents one step 
toward demonstrating an opportunity 
that we have identified in bringing ML to bear 
on better supporting modes of reasoning 
central to creative design. 


Suggestive Drawing among Human and Artificial Intelligences
Here, an ML model has been trained to understand the transformation from line drawings to a whole range of objects: from flowers to patterned dresses.

While, broadly speaking, 
existing paradigms for design tools 
have been shown to effectively support 
    deductive reasoning, 
    
we assert that 
a new paradigm 
based on machine learning 
will be uniquely suited 
to address heretofore unsupported modes 
of reasoning about design, 
    such as imagistic and inductive thinking.

Animation generated by Artbreeder
Kyle Steinfeld, 2020

This difference is clear 
from the intrinsic mechanisms 
of the tools involved: 

while 3d modeling proceeds 
    by composing geometric operations in sequence, 
    
and parametric modeling proceeds 
    by assembling elaborate chains of logic, 
    
machine learning models proceed 
    by matching patterns drawn 
    from prior data. 

Hesse: Edges to Cats, 2017
Here, an ML model has been trained to understand the transformation from a line-drawing of a cat to a photographic image of a cat. Once training is complete, this model will attempt to create a cat-like image of any given line drawing.

Just as we would expect an author 
    to reason differently 
    when wielding a tool of induction 
    as opposed to a tool of deduction 
        
        (a charcoal pencil as opposed to a calculator)
        
we would similarly expect 
    entirely new cultures of design 
    to emerge in response 
    to paradigm shifts 
    in the nature of software. 


Proposition One
Nehal Jain, 2020
Our project anticipates just this.

The tool described in this project is thoroughly inductive. 


Work from each of the three propositions posted to Instagram
Nicholas Doerschlag, 2020
Rose Wang, 2020
This is clear 
insofar as we first invite a designer 
to define an indexical relationship 
between a hand-drawn mark 
and the qualities of an image 
it is mapped to, 

and only then 
to find the "reason" 
for these images 
in the composition of a drawing. 


Scott Eaton, 2019
Scott Eaton is a mechanical engineer and anatomical artist who uses custom-trained **transfer models** as a "creative collaborator" in his figurative drawings.

This is how the artist Scott Eaton works
as we see in this animation 

So, in this way, 
the project seeks to anticipate 
new cultures of drawing practice 
that will arise in relation 
to this tool 
    and to others that take a similar approach.
    
From this, we hope to uncover 
new modalities of authorship 
latent in this still-emerging paradigm 
of computer-assisted design.



Terms

    In the parlance established by the project, 
    a neural network 
    trained to perform image-to-image translations, 
    and ready for use in sketching, 
        is termed a "brush". 
    
    
    
    The activation of a brush 
        to perform an image translation 
        is alternatively termed 
            a "transaction" 
            or an "inference". 
    
    
    

    Central Mechanisms

    The composition of a sketch using four distinct brushes, activating each multiple times.
    Kyle Steinfeld, 2020

    Embedded in a brush 
    is the central mechanism 
    of the augmented drawing activity. 
    
    Envisioned as a conversation, 
    this activity is expressed 
    as a graphic "call and response". 
    
    First, 
    a human author composes a drawing 
    that may or may not adhere 
    to some anticipated graphic convention. 
    
    Next, this "call" image is passed 
    to a neural net 
    trained to offer a "response" image, 
    which is passed back to the sketching environment 
    to be displayed to the author. 
    
    If the results are not satisfactory, 
    this process may be repeated 
    by modifying the original drawing, 
    or the author may choose 
    to move on to a new portion of the sketch. 
    
    Due to certain technical limitations, 
    the current system is constrained 
    to transacting with images 
    smaller than the typical size 
    of most architectural sketches. 
    
    For this reason, 
    each call-and-response transaction 
    typically represents 
    only a portion or segment 
    of a larger drawing. 
    
    To mitigate this limitation, 
    mechanisms are provided 
    to allow for the tiling and layering 
    of multiple response images 
    in a single drawing. 
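
    To make this call-and-response cycle concrete, the sketch below outlines one way a single transaction, with tiling, might be structured. It is a minimal illustration only: run_brush() is a hypothetical stand-in for a trained brush, and the 256-pixel tile size is an assumption typical of pix2pix-style networks, not a documented constraint of the tool.

```python
# A minimal sketch of the call-and-response transaction with tiling.
# run_brush() is a hypothetical stand-in for a trained Sketch2Pix brush.
from PIL import Image

TILE = 256  # assumed model input size, typical for pix2pix-style networks


def run_brush(call_tile: Image.Image) -> Image.Image:
    """Hypothetical: pass a "call" tile to the brush, receive a "response" tile."""
    raise NotImplementedError


def transact(call_drawing: Image.Image) -> Image.Image:
    """Tile a larger hand drawing, run each tile through the brush,
    and composite the responses into a single transparent image for layering."""
    w, h = call_drawing.size
    response = Image.new("RGBA", (w, h), (0, 0, 0, 0))
    for y in range(0, h, TILE):
        for x in range(0, w, TILE):
            tile = call_drawing.crop((x, y, x + TILE, y + TILE))
            out = run_brush(tile).convert("RGBA")
            response.paste(out, (x, y), out)  # alpha allows layered compositing
    return response
```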
    
    
    

    Thank You to the Students at UC Berkeley

    We acknowledge that this research is closely related to an upper-division undergraduate research studio offered at UC Berkeley in the Spring of 2020. This studio was instructed by the author, and attended by an inspiring group of sixteen students. We would like to thank the students of this course for their generous willingness to participate in the case study described here. This resilient group of young designers faced difficult circumstances with grace, and adopted unfamiliar methods with enthusiasm.

    This research is closely related 
    to an upper-division undergraduate research studio 
    offered at UC Berkeley 
    in the Spring of 2020.
    
    

    Four Steps from a User's Perspective

    I'll now offer a very quick overview
    of the specific 
        tools, methods, and workflows 
    that constitute the Sketch2Pix 
    augmented sketching tool.
    
    Since most of the technical details 
    are covered in the paper, 
    here I'll speak solely to the experience of the user.
    
    

    Agenda

    
    So, 
    assuming you're someone 
    who would like to make their own 
    augmented drawing partner, 
    there are four steps required:
    
    1. The establishment of a training data set. 
        For us, data sets are derived 
        from a combination of 3d scanning 
        and texture modeling, 
        a process supported 
        by a collection of user-friendly scripts.
    
    2. The training and validation of a neural network model. 
        This is done through 
        a GPU cloud computing environment, 
        and is supported by 
        a collection of user-friendly python scripts.
    
    3. The deploying of a trained model 
    to allow communication 
    with a graphic sketching environment. 
        This is achieved through 
        a model hosting and distribution service 
        called "RunwayML".
    
    4. The configuration and customization of the sketching environment 
    to facilitate a "conversation" with an AI partner. 
        This is achieved 
        through a bespoke plugin 
        for Adobe Photoshop.
    
    

    Agenda

    The most critical step 
    in crafting a Sketch2Pix brush 
    is the establishment 
    of a set of data on which to train. 
    
    
    

    Establishment of Training Data

    Here, 
    the central operation of a brush is established, 
    as expressed by the mapping 
    between a graphic "call" provided by a human author 
    and the desired image "response" of the neural network. 
    
    

    A Sketch2Pix "brush" trained on images of a bowling pin.

    
    Because a data set of substantial size 
    is required for training, 
    pipelines are here developed 
    for establishing training data 
    using texture-mapped 3d models.
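
    The project's actual pipeline combines 3d scanning, texture modeling, and rendering scripts; the sketch below only illustrates the general idea of producing paired images, deriving a synthetic line-drawing "call" from each rendered "response" with a simple edge filter. The folder names, resolution, and side-by-side pairing format are assumptions for illustration, not the project's documented conventions.

```python
# A simplified sketch of producing paired training images from a folder of
# rendered views of a texture-mapped 3d model. Folder names, resolution, and
# the side-by-side pairing format are assumptions for illustration only.
from pathlib import Path
from PIL import Image, ImageFilter, ImageOps

SIZE = 256  # assumed training resolution


def make_pair(render_path: Path, out_dir: Path) -> None:
    # The rendered view serves as the desired "response" image.
    response = Image.open(render_path).convert("RGB").resize((SIZE, SIZE))
    # Derive a rough line-drawing "call" from the render with an edge filter.
    edges = response.convert("L").filter(ImageFilter.FIND_EDGES)
    call = ImageOps.invert(edges).convert("RGB")  # dark marks on a white ground
    # Many pix2pix implementations expect call and response side by side.
    pair = Image.new("RGB", (SIZE * 2, SIZE))
    pair.paste(call, (0, 0))
    pair.paste(response, (SIZE, 0))
    pair.save(out_dir / render_path.name)


if __name__ == "__main__":
    out = Path("pairs")
    out.mkdir(exist_ok=True)
    for render in sorted(Path("renders").glob("*.png")):
        make_pair(render, out)
```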
    
    
    

    With a training set established, 
    the next step in crafting a Sketch2Pix brush
    is the training and validation 
    of a neural net.
    
    
    

    Training and Validation of a Neural Network

    Because most novice designers 
    lack direct access to the computing resources and expertise
    necessary to complete this training 
    in a reasonable time, 
    we provide guidance and resources 
    for completing this step 
    in a cloud computing environment. 
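
    For readers curious what that training involves, the schematic below shows one pix2pix-style training step: an adversarial loss plus an L1 reconstruction term. The generator G, discriminator D, and their optimizers are assumed to be defined elsewhere (for example, a U-Net and a PatchGAN); this illustrates the objective only and is not the project's training script.

```python
# A schematic of one pix2pix-style training step. G, D, opt_G, and opt_D are
# assumed to be defined elsewhere; this illustrates the objective only.
import torch
import torch.nn.functional as F


def train_step(G, D, opt_G, opt_D, call, response, l1_weight=100.0):
    # --- discriminator: distinguish real pairs from generated pairs ---
    fake = G(call)
    d_real = D(torch.cat([call, response], dim=1))
    d_fake = D(torch.cat([call, fake.detach()], dim=1))
    loss_D = (F.binary_cross_entropy_with_logits(d_real, torch.ones_like(d_real))
              + F.binary_cross_entropy_with_logits(d_fake, torch.zeros_like(d_fake)))
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # --- generator: fool the discriminator while staying close to the target ---
    d_fake = D(torch.cat([call, fake], dim=1))
    loss_G = (F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
              + l1_weight * F.l1_loss(fake, response))
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```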
    
    
    

    Having trained a valid Sketch2Pix model, 
    the next step is the deploying of this model 
    to a hosting environment 
    that facilitates connection 
    with a graphic sketching environment.
    
    

    Deploying a Trained Model

    Screenshots of the RunwayML model hosting and distribution interface.
    While in the recent past 
    such a step would have presented considerable difficulty 
    for a novice user, 
    the introduction 
    of model hosting and distribution services 
    has significantly eased this burden 
    and lowered barriers to entry. 
    
    Sketch2Pix is registered with one such service: 
        RunwayML 
        
    Given this service, 
    hosting a model is a relatively trivial process 
    from the user's perspective.
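
    As a rough illustration of what the sketching environment does behind the scenes, the snippet below posts a "call" image to a hosted brush and decodes the "response". The URL, route, and payload fields are placeholders, not RunwayML's documented API; consult the hosting service's documentation for the actual interface.

```python
# A minimal sketch of querying a hosted brush over HTTP. The URL, route, and
# payload fields are placeholders, not a documented API.
import base64
import io
import json
import urllib.request

from PIL import Image


def query_brush(call_img: Image.Image,
                url: str = "http://localhost:8000/query") -> Image.Image:
    # Encode the "call" image as base64 PNG and post it as JSON.
    buf = io.BytesIO()
    call_img.save(buf, format="PNG")
    payload = {"image": base64.b64encode(buf.getvalue()).decode("ascii")}
    req = urllib.request.Request(url,
                                 data=json.dumps(payload).encode("utf-8"),
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        data = json.loads(resp.read())
    # Decode the base64-encoded "response" image returned by the service.
    return Image.open(io.BytesIO(base64.b64decode(data["image"])))
```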
    
    

    With a model deployed 
    to a hosting and distribution service, 
    the final step in developing a Sketch2Pix brush 
    involves the mechanisms 
    of interacting with this model, 
    
    which may now be regarded 
    as a functional "drawing assistant".
    
    

    Interactive Sketching

    The details of this collaborative drawing activity 
    are largely guided 
    by the features of the interface developed 
    for the given graphic sketching environment. 
    
    A large amount of development time 
    was spent noodling 
    on how to get Adobe Photoshop 
        to interact with a hosted model
    in a way that is supportive of architectural sketching.
    
    I would just note here that 
        how best to structure 
        a successful interaction 
        with an AI assistant 
        is a current topic 
        in HCI research. 
    
    
    From this perspective, 
        we strongly value the iterative development 
        of small-scale graphics 
        in the service of larger compositions. 
    
    This is because, 
        in our view, 
    ease of iteration is essential 
    to facilitating the feeling 
    of a "conversation" 
    with an AI partner. 
    
    - - - - - 
    
    One may see this value reflected in the decision to integrate alpha-channel generation into the Sketch2Pix model itself, as transparency is necessary for effective layering. 
    
    We also see it in the way we've handled the Photoshop layer structure, and in the protocols for returning "response" images. 
    
    We encourage our users to, at the start of a collaborative sketching session, establish two layer groups in Photoshop: one that will contain the hand-drawn "call" images, and a second that will house those "response" images returned by the brush. 
    
    Rather than replacing the user-drawn image with the computer-generated one at each transaction, our application returns the generated image to its own Photoshop layer, and links the source layer to the generated layer. 
    
    Then, if the user modifies their hand-drawing, the generated image may be updated rather than duplicated. This pattern of user interaction is intended to encourage the sort of iterative development that is commonly practiced in design drawing. 
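
    The toy example below restates that interaction pattern in code: plain dictionaries stand in for the Photoshop document, and query_brush() is a hypothetical placeholder for a transaction with a hosted brush. The point is only the linking logic, in which a repeated transaction updates the existing response layer rather than adding a new one.

```python
# A self-contained toy illustrating the update-rather-than-duplicate pattern
# described above. In the real tool the layers live in Photoshop; here plain
# dictionaries stand in for the document, and query_brush() is a hypothetical
# placeholder for a call/response transaction with a hosted brush.

document = {"call": {}, "response": {}}  # two layer groups, keyed by layer id
links = {}                               # call layer id -> response layer id


def query_brush(call_pixels):
    """Hypothetical stand-in for a transaction with a hosted brush."""
    return f"response derived from <{call_pixels}>"


def on_transaction(call_id):
    response_pixels = query_brush(document["call"][call_id])
    if call_id in links:
        # This call layer already has a linked response: update it in place.
        document["response"][links[call_id]] = response_pixels
    else:
        # First transaction for this call layer: create and link a response layer.
        response_id = f"resp_{call_id}"
        document["response"][response_id] = response_pixels
        links[call_id] = response_id


document["call"]["sketch_01"] = "hand-drawn marks"
on_transaction("sketch_01")  # creates a linked response layer
document["call"]["sketch_01"] = "revised hand-drawn marks"
on_transaction("sketch_01")  # updates the same response layer, no duplicate
```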
    
    
    

    Live Demo?

    Results and Reflection

    Figs. 1.9 - 1.11 Augmented drawings authored with the Sketch2Pix tool. From left to right: Daniel Barrio, Can Li, Nicholas Doeschlag, 2020.
    Figs. 1.12 - 1.14 Augmented drawings authored with the Sketch2Pix tool. From left to right: Tina Nguyen, Robert Carrasco, Leo Zhao, 2020.
    In this talk 
    I've presented the mechanisms 
    of a tool that supports 
    **machine-augmented architectural sketching**
    
    an activity that we see 
    as a special case 
    of **inductive design thinking**.
    
    
    I'd like to close with two reflections.
    
    First is a reminder 
    of the central role 
    of iteration in design, 
    and of the small discoveries we made 
    about how to instrumentalize iteration 
    in the user interface of design tools. 
    
    We expect that ease of iteration 
    will become even more essential 
    in facilitating the feeling 
    of a "conversation" with an AI partner.
    
    
    Second is an indication 
    of the challenges that lie ahead 
    in the shift from geometric / logical modeling activities 
    to those required by training an ML model. 
    
    We see such a challenge 
    in the role of the 3d models 
    produced in the service of training. 
    
    Here, users must shift their thinking 
    away from models playing a directly representational role, 
        where the model stands in 
        for the building we envision. 
        
    Instead, these models play a loosely indexical role, 
    where the model defines 
    the relationships and patterns 
    to be learned by a neural network, 
    which prefers information 
    such as colors and patterns 
    over forms and organizations. 
    
    We expect to find many such differences 
    between traditional forms of computer-aided design, 
    and emerging forms based on ML.
    
    Thank you.