AI's eyes

How The Grid was teaching its Artificial Intelligence to see and paint.

5 min read · Apr 9, 2021

Note: I wrote this post back when I worked as an AI Engineer at The Grid. It was a crazy ride, with lots of interesting people and challenges. I left The Grid years ago, but some of the ideas are still relevant to augmenting design with AI. The content of this post was originally published on The Grid itself, and since there are no mirrors around, I'm sharing it here. I hope you enjoy it.

Augmenting and automating the design of websites involves many steps. The areas I had my hands all over were Computer Vision (CV) and Generative Art. We used a mix of traditional CV, generative algorithms and Machine Learning (ML) models that, when applied to design, gave us the following features:

Saliency extraction: useful for cropping to a region and for text placement/overlay
Face detection: for emotion recognition and supporting detection of salient regions
Delaunay triangulation: to generate artistic backgrounds and picture styling
Image classification: to extract information/semantics from pictures
Sentiment analysis: to extract the general mood of a picture
Color extraction: to extract/quantize colors from pictures
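
To make the last item in the list concrete, here is a minimal sketch of colour quantisation. It is not The Grid's implementation: the function name `dominant_colors` and the parameters `levels` and `top` are hypothetical, and the approach (snapping each channel to a few evenly spaced levels and counting the results) is just one simple way to do it.

```python
from collections import Counter


def dominant_colors(pixels, levels=4, top=3):
    """Quantise (r, g, b) pixels to `levels` steps per channel and
    return the `top` most frequent quantised colours."""
    step = 256 // levels

    def quantize(pixel):
        # Snap each channel to the centre of its bucket.
        return tuple((value // step) * step + step // 2 for value in pixel)

    counts = Counter(quantize(pixel) for pixel in pixels)
    return [color for color, _ in counts.most_common(top)]


# Hypothetical input: mostly reddish pixels plus a couple of blue ones.
pixels = [(250, 10, 10)] * 5 + [(240, 5, 0)] * 3 + [(10, 10, 250)] * 2
palette = dominant_colors(pixels, top=2)
```

Both shades of red fall into the same bucket, so the palette ends up with one red and one blue entry.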

Let's take a look at some of them in more detail, shall we?

Saliency extraction

When we look at a picture we can naturally (and really fast!) determine which points grab our attention first. A human face, a specific object, a region with higher contrast: those are all examples of salient areas. We can easily recognise those areas, but computers can't. So we have to teach them to.

The Grid used a saliency detection algorithm which analyses a given image looking for regions of high contrast. In summary, it divides the original image into small segments. Those segments are analysed and just the most contrasting ones are selected. We call them salient regions (as shown in the side figure).

Salient regions extracted from the image. Higher gray levels indicate more salient areas.
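
The idea of splitting an image into segments and keeping the most contrasting ones can be sketched roughly as follows. This is only an illustration under my own assumptions, not the algorithm The Grid used: the image is a plain list of rows of grayscale values, segments are square blocks, and "contrast" is measured as the distance between a block's mean brightness and the mean of the whole image. The names `block_saliency` and `most_salient` are hypothetical.

```python
def block_saliency(image, block=2):
    """Score each block x block segment by how far its mean brightness
    is from the mean brightness of the whole image."""
    height, width = len(image), len(image[0])
    global_mean = sum(map(sum, image)) / (height * width)
    scores = []
    for top in range(0, height, block):
        row = []
        for left in range(0, width, block):
            pixels = [image[y][x]
                      for y in range(top, min(top + block, height))
                      for x in range(left, min(left + block, width))]
            row.append(abs(sum(pixels) / len(pixels) - global_mean))
        scores.append(row)
    return scores


def most_salient(scores, keep=0.25):
    """Return (row, col) positions of the top `keep` fraction of segments."""
    flat = [(score, r, c)
            for r, row in enumerate(scores)
            for c, score in enumerate(row)]
    flat.sort(reverse=True)
    count = max(1, int(len(flat) * keep))
    return [(r, c) for _, r, c in flat[:count]]


# Hypothetical 4x4 image: dark background with a bright patch bottom-right.
image = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
scores = block_saliency(image, block=2)
salient = most_salient(scores, keep=0.25)
```

Here the bright patch is the only segment that differs strongly from the rest, so it is the one selected.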

Knowing which regions are most salient, we can crop the original image to show only those regions, for example to fit any screen size. Or we can place text on top of the image while avoiding the salient regions.
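
Both uses can be sketched on top of a saliency map, here assumed to be a list of rows of numeric scores. The helpers `salient_bounding_box`, `crop` and `calmest_window` are hypothetical names for illustration, and the threshold and window size are arbitrary choices rather than values The Grid used.

```python
def salient_bounding_box(saliency, threshold):
    """Return (left, top, right, bottom) enclosing every score >= threshold,
    or None when nothing is salient enough."""
    hits = [(x, y)
            for y, row in enumerate(saliency)
            for x, value in enumerate(row)
            if value >= threshold]
    if not hits:
        return None
    xs = [x for x, _ in hits]
    ys = [y for _, y in hits]
    return min(xs), min(ys), max(xs) + 1, max(ys) + 1


def crop(image, box):
    """Cut the box (left, top, right, bottom) out of a list-of-rows image."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def calmest_window(saliency, width, height):
    """Return (left, top) of the width x height window with the lowest
    total saliency, a candidate spot for overlaying text."""
    best, best_pos = None, None
    for top in range(len(saliency) - height + 1):
        for left in range(len(saliency[0]) - width + 1):
            total = sum(sum(row[left:left + width])
                        for row in saliency[top:top + height])
            if best is None or total < best:
                best, best_pos = total, (left, top)
    return best_pos


# Hypothetical saliency map with a salient blob in the middle.
saliency = [
    [0, 0, 0, 0],
    [0, 9, 8, 0],
    [0, 7, 9, 0],
    [1, 1, 1, 1],
]
box = salient_bounding_box(saliency, threshold=5)
text_spot = calmest_window(saliency, width=4, height=1)
```

The crop keeps just the blob, while the text window lands on the empty top row.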

Generative images and image processing

Besides analytics, we can use extracted information from text, images or videos to synthesise new content. We can generate an infinite number of unique images using combinations of different shapes, curves and colours.

We can also apply more complex algorithms like Delaunay Triangulation to obtain more interesting and unique meshes for favbanners or backgrounds.

Generative images by Delaunay Triangulation
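
For the curious, a mesh like this can be produced by scattering random points and triangulating them. The sketch below uses the Bowyer-Watson algorithm, which is one common way to compute a Delaunay triangulation; I'm not claiming it is what The Grid ran in production. The function names and the size of the helper "super-triangle" are my own assumptions.

```python
import random


def in_circumcircle(p, a, b, c):
    """True if point p lies strictly inside the circle through a, b and c."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:  # collinear vertices have no circumcircle
        return False
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    radius2 = (ax - ux) ** 2 + (ay - uy) ** 2
    return (p[0] - ux) ** 2 + (p[1] - uy) ** 2 < radius2


def delaunay(points):
    """Bowyer-Watson triangulation; returns triangles as index triples."""
    pts = list(points)
    n = len(pts)
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    span = 10 * max(max(xs) - min(xs), max(ys) - min(ys), 1.0)
    mx, my = (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2
    # A large helper triangle (assumed big enough) enclosing every point.
    pts += [(mx - 2 * span, my - span), (mx + 2 * span, my - span),
            (mx, my + 2 * span)]
    triangles = [(n, n + 1, n + 2)]
    for i in range(n):
        # Triangles whose circumcircle contains the new point are invalid.
        bad = [t for t in triangles
               if in_circumcircle(pts[i], pts[t[0]], pts[t[1]], pts[t[2]])]
        edge_count = {}
        for t in bad:
            for edge in ((t[0], t[1]), (t[1], t[2]), (t[2], t[0])):
                key = tuple(sorted(edge))
                edge_count[key] = edge_count.get(key, 0) + 1
        triangles = [t for t in triangles if t not in bad]
        # Edges used by exactly one bad triangle bound the hole; re-fill it.
        triangles += [(a, b, i) for (a, b), c in edge_count.items() if c == 1]
    # Drop everything still attached to the helper triangle.
    return [t for t in triangles if all(v < n for v in t)]


def random_mesh(count, width, height, seed=0):
    """Scatter `count` random points on a canvas and triangulate them."""
    rng = random.Random(seed)
    points = [(rng.uniform(0, width), rng.uniform(0, height))
              for _ in range(count)]
    return points, delaunay(points)


points, mesh = random_mesh(20, 100, 60, seed=1)
```

Each triangle in `mesh` can then be filled with a colour to obtain a low-poly background; changing the seed gives a different mesh every time.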

Or we can go beyond and use the same algorithm to style existing pictures.

Image styling using Delaunay Triangulation
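
One plausible way to get this look, sketched here under my own assumptions, is to paint every triangle of a mesh with the average colour of the pixels it covers. The image is assumed to be a list of rows of `(r, g, b)` tuples and the triangles are given as triples of `(x, y)` pixel coordinates; `point_in_triangle` and `stylize` are hypothetical names, and the original styling code may well have worked differently.

```python
def point_in_triangle(p, a, b, c):
    """True if p lies inside or on the border of triangle a, b, c."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_negative = d1 < 0 or d2 < 0 or d3 < 0
    has_positive = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_negative and has_positive)


def stylize(image, triangles):
    """Flat-fill each triangle with the mean colour of the pixels it covers."""
    height, width = len(image), len(image[0])
    out = [row[:] for row in image]
    for a, b, c in triangles:
        # Only scan the triangle's bounding box, clipped to the image.
        x0 = max(0, min(a[0], b[0], c[0]))
        x1 = min(width - 1, max(a[0], b[0], c[0]))
        y0 = max(0, min(a[1], b[1], c[1]))
        y1 = min(height - 1, max(a[1], b[1], c[1]))
        covered = [(x, y)
                   for y in range(y0, y1 + 1)
                   for x in range(x0, x1 + 1)
                   if point_in_triangle((x, y), a, b, c)]
        if not covered:
            continue
        mean = tuple(sum(image[y][x][k] for x, y in covered) // len(covered)
                     for k in range(3))
        for x, y in covered:
            out[y][x] = mean
    return out


# Hypothetical 2x2 image and a single triangle covering three of its pixels.
image = [
    [(0, 0, 0), (90, 90, 90)],
    [(90, 90, 90), (200, 200, 200)],
]
styled = stylize(image, [((0, 0), (1, 0), (0, 1))])
```

The three covered pixels take on their shared average, and the pixel outside the triangle is left untouched.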

Using imgflo, our open-source image processing pipeline, we were able to manipulate images. It was possible to create any image filter and apply it to enhance pictures.

Filters created on imgflo being applied to the same image. Original picture by Sharon Mollerus.
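
To give a feel for what a filter chain does, here is a toy version in plain Python. It does not use imgflo or its API: the filters `brightness`, `grayscale` and `invert` and the helper `apply_filters` are hypothetical, and the image is assumed to be a list of rows of `(r, g, b)` tuples.

```python
def brightness(delta):
    """Build a filter that shifts every channel by `delta`, clamped to 0..255."""
    def apply(pixel):
        return tuple(max(0, min(255, value + delta)) for value in pixel)
    return apply


def grayscale(pixel):
    """Replace a pixel with its luma, using the common Rec. 601 weights."""
    r, g, b = pixel
    luma = round(0.299 * r + 0.587 * g + 0.114 * b)
    return (luma, luma, luma)


def invert(pixel):
    """Produce the negative of a pixel."""
    return tuple(255 - value for value in pixel)


def apply_filters(image, filters):
    """Run each per-pixel filter over the whole image, in order."""
    out = image
    for pixel_filter in filters:
        out = [[pixel_filter(pixel) for pixel in row] for row in out]
    return out


# Hypothetical one-pixel image pushed through a two-step chain.
image = [[(250, 0, 0)]]
result = apply_filters(image, [brightness(10), invert])
```

Because each filter is just a function from pixel to pixel, new looks come from reordering or swapping the entries in the list.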

Imgflo is a runtime for Flowhub, a visual programming environment that made it simple to build image processing graphs.

A Flowhub graph for image processing on imgflo.

What's next?

It has been more than six years since I wrote this post. Deep Learning happened, and together with other ML methods it completely changed Computer Vision. However, the fundamental problems in design are still there, and we can leverage a lot from ML/CV to augment and automate graphic and web design.

If you want to know more about my latest work on blending AI/ML with design and creativity, follow me on Twitter!

Written by Vilson Vieira
