AI's eyes

How The Grid was teaching its Artificial Intelligence to see and paint.

Vilson Vieira

--

Note: I wrote this post back when I worked as an AI Engineer at The Grid. It was a crazy ride, with lots of interesting people and challenges. I left The Grid years ago, but some of the ideas are still relevant for augmenting design with AI. The content of this post was originally published on The Grid itself and, since there are no mirrors around, I'm sharing it here. I hope you enjoy it.

Augmenting and automating the design of websites involves many steps, and the areas I had my hands all over were Computer Vision (CV) and Generative Art. We used a mix of traditional CV, generative algorithms and Machine Learning (ML) models which, applied to design, gave us the following features:

  • Saliency extraction: good for cropping to a region and for text placement/overlay
  • Face detection: for emotion recognition and to support detection of salient regions
  • Delaunay triangulation: to generate artistic backgrounds and style pictures
  • Image classification: to extract information/semantics from pictures
  • Sentiment analysis: to extract the general mood of a picture
  • Color extraction: to extract/quantize colors from pictures (see the sketch after this list)
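
To give a taste of the last one: colour extraction can be as simple as reducing a photo to a small adaptive palette. Here is a minimal sketch using Pillow's median-cut quantiser; the file name and palette size are placeholder choices, not The Grid's actual pipeline:

```python
from PIL import Image

# Reduce a photo to a 5-colour adaptive palette (median cut is Pillow's
# default quantisation method for RGB input) and read the palette back.
# "photo.jpg" and the palette size are placeholder choices.
img = Image.open("photo.jpg").convert("RGB")
quantized = img.quantize(colors=5)
palette = quantized.getpalette()[: 5 * 3]  # flat [r, g, b, r, g, b, ...]
colors = [tuple(palette[i:i + 3]) for i in range(0, 15, 3)]
print(colors)
```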

Let's take a look at some of them in more detail, shall we?

Saliency extraction

When we look at a picture we can naturally (and really fast!) determine which points get our attention first. A human face, a specific object, a region of higher contrast: these are all examples of salient areas. We can easily recognise those areas, but computers can't, so we have to teach them to.
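
Faces are one of the strongest saliency cues, which is why the face detection listed earlier also fed the saliency step. As an illustration (not The Grid's actual detector), OpenCV's bundled Haar cascade finds frontal faces in a few lines; the image path is a placeholder:

```python
import cv2

# Detect frontal faces with OpenCV's bundled Haar cascade.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
gray = cv2.cvtColor(cv2.imread("photo.jpg"), cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    print(f"face at ({x}, {y}), size {w}x{h}")  # each face is a salient region
```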

Original image

The Grid used a saliency detection algorithm which analyses a given image looking for regions of high contrast. In summary, it divides the original image into small segments, analyses them, and selects just the most contrasting ones. We call them salient regions (as shown in the figure below).

Salient regions extracted from the image. Higher gray levels indicate more salient areas.

Knowing which regions are most salient, we can crop the original image to show only those regions (to fit any screen size, for example), or place text on top of the image while avoiding them.
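
The whole loop (segment, score by contrast, keep the winners, crop) fits in a short sketch. Below is a simplified version of the idea using fixed-size blocks and local standard deviation as the contrast score; The Grid's production algorithm was more elaborate, and the path, block size and threshold here are placeholders:

```python
import numpy as np
from PIL import Image

BLOCK = 32  # block size in pixels; a placeholder choice

# Score each block by the standard deviation of its grey levels,
# a crude stand-in for "contrast".
img = np.asarray(Image.open("photo.jpg").convert("L"), dtype=np.float32)
h, w = img.shape
rows, cols = h // BLOCK, w // BLOCK
scores = np.zeros((rows, cols))
for r in range(rows):
    for c in range(cols):
        block = img[r * BLOCK:(r + 1) * BLOCK, c * BLOCK:(c + 1) * BLOCK]
        scores[r, c] = block.std()

# Keep the top 20% most contrasting blocks and crop to their bounding box.
salient = scores >= np.quantile(scores, 0.8)
rs, cs = np.nonzero(salient)
top, left = rs.min() * BLOCK, cs.min() * BLOCK
bottom, right = (rs.max() + 1) * BLOCK, (cs.max() + 1) * BLOCK
Image.open("photo.jpg").crop((left, top, right, bottom)).save("crop.jpg")
```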

Generative images and image processing

Besides analysis, we can use the information extracted from text, images or videos to synthesise new content. We can generate an infinite number of unique images using combinations of different shapes, curves and colours.
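
As a toy example of what "an infinite number of unique images" means in practice, here is a sketch (my illustration, not The Grid's generator) that scatters translucent circles from a small palette, producing one reproducible, distinct image per random seed; sizes, counts and colours are arbitrary choices:

```python
import random
from PIL import Image, ImageColor, ImageDraw

def generate(seed, size=(800, 400),
             palette=("#2b2d42", "#8d99ae", "#ef233c")):
    """One reproducible, unique image per seed: translucent circles."""
    rng = random.Random(seed)
    img = Image.new("RGB", size, "#edf2f4")
    draw = ImageDraw.Draw(img, "RGBA")  # allow alpha-blended drawing
    for _ in range(40):
        x, y = rng.randrange(size[0]), rng.randrange(size[1])
        radius = rng.randrange(10, 80)
        r, g, b = ImageColor.getrgb(rng.choice(palette))
        draw.ellipse((x - radius, y - radius, x + radius, y + radius),
                     fill=(r, g, b, 102))  # ~40% opacity
    return img

generate(seed=42).save("background_42.png")
```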

We can also apply more complex algorithms like Delaunay Triangulation to obtain more interesting and unique meshes for favbanners or backgrounds.

Generative images by Delaunay Triangulation
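
A minimal sketch of that idea, assuming SciPy's Delaunay implementation (The Grid's own mesh generator may have differed): triangulate random points and fill each triangle with a random colour.

```python
import numpy as np
from scipy.spatial import Delaunay
from PIL import Image, ImageDraw

# Triangulate random points and fill each triangle with a random colour.
# Canvas size, point count and the seed are placeholder choices.
rng = np.random.default_rng(7)
w, h = 800, 400
points = rng.uniform((0, 0), (w, h), size=(60, 2))
# Pin the four corners so the mesh covers the whole canvas.
points = np.vstack([points, [(0, 0), (w, 0), (0, h), (w, h)]])

img = Image.new("RGB", (w, h))
draw = ImageDraw.Draw(img)
for simplex in Delaunay(points).simplices:
    verts = [tuple(points[i]) for i in simplex]
    draw.polygon(verts, fill=tuple(int(c) for c in rng.integers(0, 256, 3)))
img.save("mesh.png")
```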

Or we can go beyond and use the same algorithm to style existing pictures.

Image styling using Delaunay Triangulation
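
The styling variant reuses the same triangulation but samples colours from the photo instead of picking them at random. A sketch, again assuming SciPy and Pillow, that fills each triangle with the source colour at its centroid:

```python
import numpy as np
from scipy.spatial import Delaunay
from PIL import Image, ImageDraw

# Low-poly styling: triangulate random points over a photo and fill each
# triangle with the photo's colour at the triangle's centroid.
src = Image.open("photo.jpg").convert("RGB")
w, h = src.size
pixels = np.asarray(src)

rng = np.random.default_rng(7)
points = rng.uniform((0, 0), (w, h), size=(300, 2))
points = np.vstack([points, [(0, 0), (w, 0), (0, h), (w, h)]])

out = Image.new("RGB", (w, h))
draw = ImageDraw.Draw(out)
for simplex in Delaunay(points).simplices:
    verts = points[simplex]
    cx, cy = verts.mean(axis=0)  # sample the photo at the centroid
    color = pixels[min(int(cy), h - 1), min(int(cx), w - 1)]
    draw.polygon([tuple(v) for v in verts],
                 fill=tuple(int(c) for c in color))
out.save("styled.png")
```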

Using our open-source image processing pipeline imgflo, we were able to manipulate images: we could create arbitrary image filters and apply them to enhance pictures.

Filters created on imgflo being applied to the same image. Original picture by Sharon Mollerus.

Imgflo is a runtime for Flowhub, a visual programming environment that made it simple to build image processing graphs.

A Flowhub graph for image processing on imgflo.
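
To give a feel for the flow-based idea without reproducing imgflo's actual graph format (which I won't attempt from memory), here is a toy pipeline in plain Python where each node is an image-to-image function and a graph is just a chain of nodes:

```python
from PIL import Image, ImageEnhance, ImageFilter

# Each "node" is an image -> image function; a "graph" is an ordered
# chain of nodes. File names and parameters are placeholders.
def blur(img, radius=2):
    return img.filter(ImageFilter.GaussianBlur(radius))

def boost_contrast(img, factor=1.4):
    return ImageEnhance.Contrast(img).enhance(factor)

def run_graph(img, nodes):
    for node in nodes:
        img = node(img)
    return img

filtered = run_graph(Image.open("photo.jpg"), [blur, boost_contrast])
filtered.save("photo_filtered.jpg")
```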

What's next?

It's been more than six years since I wrote this post. Deep Learning happened, and Computer Vision was completely changed by it and by other ML methods. However, the fundamental problems in design are still there, and we can leverage a lot from ML/CV to augment and automate graphical and web design.

If you want to know more about my latest developments on blending AI/ML with design and creativity, follow me on Twitter!

--


Vilson Vieira

ML Engineer at Anything.World building the 3D AGI. Prev: SWE at Google, Mozilla. Passionate about AI. More at: https://void.cc & https://hackable.space