COMPSCI 194-26: Project 3

Image Morphing

Saurav Mittal | saurav@berkeley.edu



Project Report

Table of Contents

Image Morphing Setup
  1. Defining Correspondences
  2. Generating Morph Sequences
  3. Different Morphing Functions*
Fun with Face Morphing
  1. Population Analysis
  2. Transforming Image Characteristics*
  3. Class Face Morph!*
Morphing Applications
  1. Making Caricatures
  2. Transformations & Transitions*
  3. Music Visualizer*#
* additional bells & whistles
# best part!

Introduction

In this project we'll explore image morphing! A morph is a simultaneous warp of the image shape and a cross-dissolve of the image colors where the warp is controlled by defining a correspondence between two pictures. We'll take a look at different morphing techniques, analyze population face shapes, generate fun caricatures and more.


Image Morphing Setup

1. Defining Correspondences

First, we will need to define pairs of corresponding points on the two images by hand (the more points, the better the morph, generally). In order for the morph to work we will need a consistent labeling of the two faces. We will label faces A and B in a consistent manner using the same ordering of keypoints in the two faces. Once we have the points on the two faces, we'll need to provide a triangulation of these points that will be used for morphing. I chose to use the Delaunay triangulation on the average point set of the two images since it does not produce overly skinny triangles.

Note: For most of the project, we'll be working with the head shots of two awesome Berkeley EECS Professors, Professor Abbeel & Professor Malik. This is partly for consistency and mostly because manually defining correspondences is tedious and these were the pictures I went with when I started the assignment. You know people who would spend 242695 hours automating a 5 minute job instead of actually doing it? Yeah, I'm those people. I actually tried automating this process and failed so let's not even get into it. Wow we're just gonna enjoy the company of these professors and morph their faces even if it gets weird later (spoiler: it might 😉).


For these two images, I manually selected a shit ton of 84 points (including 4 corners) in both the images across different facial features and accessories (glasses) to set us up for a good morph. Once I got that, I used scipy library's Delaunay function to compute an abstract triangle representation on the mean point set of the two images which is shown above.

2. Generating Morph Sequences

Before we compute the whole morph sequence, let's compute the mid-way face of these two images. This involves the following:

  1. Computing the average shape (a.k.a the average of each keypoint location in the two faces)
  2. Warping both faces into that shape
  3. Averaging the colors together.
The main task in warping the faces into the average shape is implementing an affine warp for each triangle in the triangulation from the original images into this new shape. We iterated through all the triangles pairs and computed an affine transformation matrix between them. We then used this matrix along with some simple interpolation techniques to move the pixels between triangles and implemented an inverse warp of all the pixels.

We found the mid-way face for our professors by warping both the faces to their average face shape and then blending both the faces together using equal weights.

Pieter Abbeel
Pieter Malik
Jitendra Malik

Cool! It actually looks pretty realistic. Do note that the difference in quality is purely because I downsampled the images before processing to make the code run faster and to get lighter gif files for later.
<mini-rant> Why are gifs so big?? I know there are a lot of images but some quick math seems to show they are bigger than im_size*num_frames. All this overhead??? No wonder all gifs on the web look like trash smh </mini-rant>

A 'morph' is just a a sequence of images which starts at an image and ends on another with automatically generated images in the middle which help transition from the first image to the last. We slowly change the shape of the starting image until it reaches the shape of our final image through affine tranformations on the image triangles all while also blending the image colors as well. The image on the right sums up the general algorithm quite well.

Here is the result of using this algorithm to morph the images of Professor Abbeel & Professor Malik

Pieter Abbeel
Berkeley ML Morph
Jitendra Malik

3. Different Morphing Functions

We can play around with our base morph implementation a bit. There are two arguments we can tweak with our current algorithm, the warp factor & the morph factor. The warp factor tells the algorithm by how much to change the shape of our triangle in a given frame and the morph factor tells the algorithm how much we want to blend the two images for a frame. To understand this better, let's take a look at what changing one of these factors looks like while keeping the other constant at 1.

Fixed Morph Factor = 1
Fixed Warp Factor = 1

Custom Sigmoid Function
Both the factors lie in the range [0, 1] and go from zero to one. We have factor calculating functions which map the current frame number to the factor value. In our original implementation, we used a linear map so the images morph steadily & consistently. Intuitively, it made sense to me to speed up the middle frames where we would have weird artifacts from both the images. I tried the sigmoid function as it fit the bill.

I had to play around with the domain and range of the function and clip it to make it work for our use case. The exact parametric sigmoid function and the plot of our new factor function is shown here.

Linear Warp Sigmoid Warp
Linear Blend
Sigmoid Blend

Comparing the different results, I don't think there's a clear winner here. It's really up to what you prefer. For this pair of images I think the Linear Blend + Linear Warp works quite well because the overall morph is fairly smooth so we get to see neat middle frames for longer. For other images where the transition isn't as smooth the linear map would likely be percieved worse than the sigmoid map because we would be spend a lot of time on frames which are not pleasing to look at. I annotated another set of pictures to showcase just this.

Delauney Triangulations

BALJEET x Saurav: BALLrav Morph
Linear Map
Sigmoid Map

This morph isn't nearly as clean as the previous ones because the images are quite different from each other both in color and structure. The sigmoid morph looks better than the linear morph because it speeds through through the displeasing frames quicker.

I tried playing around with some other functions as well but they didn't really provide better results. A fun thing that I did try here was modeling the morph after a random walk.

Random Morphing


Why do this? No reason really, just seemed cool. Implementing the random morphs did give me a fun idea that I ended up working on a lot later so it was definitely worth :)


Fun with Face Morphing

1. Population Analysis

I analyzed a free data set of Danish faces meant for statistical models of shape available here for population facial feature analysis. We have 40 people in this dataset, we can start by finding the average face of this population. Computing the average face first finding the average face shape of the population, transforming all the faces into the average shape and then blending all the transformed faces together with equal weights.

Hover to see the difference :)
All Danes' Faces Transformed

Average Dane's Face
My Face Warped into Avg. Dane's
Avg. Dane's Face Warped into mine

2. Transforming Image Characteristics

Now I'll try to change some characteristics of my face using the features from the Danes dataset.

Changing Ethinicity
Warp Only
Blend Only
Complete Morph

Adding Smile (+ Ethnicity)
Warp Only
Blend Only
Complete Morph

3. Class Face Morph!

We got 15 students from the class in our class morph video! My friend Rina morphed her face to mine and I morphed my face to Amy's.

Delaunay Triangulation

Rina to Saurav
Saurav to Amy

List of students in order of appearance: Ritika Shrivastava, Megan Lee, Rina Lu, Saurav Mittal, Amy Hung, Cesar Plascencia Zuniga, Ron, Danji Liu, Saurav Shroff, Gurmehar Kaur, Avik Sethia, Devesh Agarwal, Yash Swarup Agarwal, Hannah Moore, and Matt Hallac.


Morphing Applications

1. Making Caricatures

Now we make caricatures of my face by extrapolating from the population mean that we calculated in the last section. We do this by finding the difference between the average face features & my face features and then adding it back to my face after scaling it.

Caricatures

We see that the caricatures become more 'extreme' the further alpha deviates from 1 (which would yield the original image).

2. Transformations & Transitions

Transformations

I wanted to see if it was possible to implement transformations using our morphing algorithm. For this, we have a new test subject – a pretty, symmetrical daisy :) As we aren't really morphing to any particular image for transformations, I used my warp-only morph which fixes the morph factor to 1. Here are some of the transformations I implemented.

Rotate Left
Rotate Right
3D Inward Rotate
Scale
Curl Up
Petal Fold

All the transformations shown just use a single labeled pointset on the daisy (shown above). We generate our final point set (end state) through clever alterations of the original point set which lead to the desired effect. When the points move from their original location to their final location it appears as the image is being transformed.

Transitions

The goal here was to see if would be possible to implement Powerpoint-like slide transitions using our morphing algorithm. The process to generate transitions is pretty similar to what we did for transformations, but instead of doing a warp-only morph we now blend the images as well. I found best results with using a Linear Warp with Sigmoid Blend. This makes sense because the quick blend reduces the weird choppy middle frames (slides are likely different from each other) and the linear warp gives us a smooth structural transition from start to finish. Here are the test slides I used along with some transitions I tried out.

Slide 1
Slide 2


Horizontal Flip
Vertical Flip
Pop In/Out
Triangle Dust (My Favorite!)

The first row of transitions just used the corners of the images for the transition sequences. 'Triangle Dust', on the other hand, has this very interesting effect as it used the Delaunay Triangulation on 256 randomly chosen points on both the images.

3. Music Visualizer

Now on to my favorite part of the project! I wanted to do something a bit different, something that just doesn't morph from Image A to Image B as I have already explored different morph functions which did that. The face morphs modeled by bounded random walks were quite interesting. I thought about adding more structure to them to give them some purpose. As you have already seen by the section title, I decided to control the face morph by syncing it with some music. This proved to be quite challenging purely because I essentially have no experience with signal processing. I have only really worked with signals in the undergrad EE courses at Berkeley, and if you know me you know I didn't do too hot there :)

I picked a music clip with different elements so it's interesting to analyze. Let's start by seeing the clip's waveform and its distribution of frequencies (using Fourier transform). Note: I believe audio file doesn't work on Safari, so consider using a superior browser like Chrome.

Sample Audio Clip

Time Domain
Frequency Domain

It's hard to figure out what's what just from these images but if we take a look at the Fourier transform of shorter second-long clips we start to get an idea about what frequencies correspond to what sounds in the audio clip.

I implemented a bandpass filter to separate out all the frequencies so that they can be used separately to individually control different aspects of the morph sequence.

Signal Components

Now that we have the dials, we need to figure out what we want to tweak. Just to keep things simple for now, I decided to work with a smoothened signal (signal convolved with a 1D Gaussian filter) instead of working with all the components at the same time. I wanted to use the signal to control what frame of the A->B morph we're on. This is why it was important to smoothen the signal as without it consecutive frames of the morph could be quite different from each other due to high frequencies which would lead to jerky transitions. This is what the morph looked like using the smooth waveform synced to Boney M's Daddy Cool*
* i said bet, had to do it; warned you it might get weird

This doesn't look great. This was expected because it's only really keeping track of the soundwave amplitudes while ignoring the individual elements we actually percieve while listening to the music. I started thinking about all the ways I could improve on this, distort the faces through the frames into caricatures using the bass, have the high frequencies change the image hue while mids control the flow of the morph and trebble controls the rotation. All this sounds good but doesn't work. I implement a part of this and realized that it looked quite forced and wasn't as seemless as I was hoping/expecting it would be. Just because I had all these knobs for a face morph doesn't mean using them would make the result more pleasing. It was time to pivot.

I quickly realized I had been constraining the idea by sticking to the face morphs when I could do so much more with the audio signal. I set out to implement a music visualizer, kinda like the ones that pop up in some Youtube videos. We want our final results to look something like this.

Sample Audio Visualization

Now, we'll be working with all the frequencies of the signal instead of just the ones filtered out by our bandpass filters. We want to visualize how the frequencies of the audio clip change over time (spectrogram-like visualization). The visualization needs to show all the frequencies at each time step & smoothly transition between the frames. On top of that, I wanted the visualization to pulsate with the bass. First, we need to isolate the bass using a binary signal which encodes the locations of the bass. Implementing this bass detector took a lot of trial and error, but I finally got something that works. I looked at the fourier transforms of second-long clips and found out that the loud thumps in the clip belong to the frequency range ~10-60Hz. I used my bandpass filter to attentuate all other frequencies and then applied a threshold to the absolute value of the filtered signal. This gave me decent results but it still looked choppy so I convolved this with a box filter and then clipped the results to get nice smooth beat detections (shown below).

Beat Detection

If you actually compare the postive regions to the audio clip above, you'll see this does a great job of picking the really loud thumps while ignoring other loud frequencies (Ex. 7-8s).

I found images for the base art and annotated them using automatically computed points along a circle. Each point along the circumference of the circle corresponds to a frequency and we scale the point according to the frequency's intensity at a given time step. Integrate this code with our image morphing code, throw in the beat detector and we have all the components for our visualizer. Time to see some results!

Watermelon Sugar
Sunflower
Instrumental - Blue Danube
Rick (Best one!)

This took a hot second to implement & run so I think I'm going to stop here. There's still a lot that can be played around with. Things from changing color / transforming image shape based on certain frequency ranges to tracking the frequencies on a log scale to make the visualization more interpretable. I'll leave that to others to explore!


Final Takeaway

This project was genuinely super fun! I went in thinking that I'll just learn a bit about affine transformations but in an unexpected turn of events I ended up learning more singal processing than I did in EE16A & EE16B (time to take EE120 now??). The end results were ~perfectly splendid~ so I am quite happy :)