Cheap Facial Motion Capture – Introduction


Hello everyone, and welcome to this tutorial, where we are going to use a regular digital camera to capture the performance of an actor’s face and later apply it to a character rig in Maya. I hope you will enjoy this presentation, and if all goes well, you should end up with a result quite similar to the one you are currently viewing. So with that out of the way, let’s get started.

This series consists of three parts. Part one focuses on the actual capture of the actor, part two is about tracking the motion, and part three covers the process of applying the tracked data to a 3D model. Feel free to skip to any part as you see fit.

Before we begin, I would like to take the opportunity to tell you a little bit about motion capture and the techniques presented in this tutorial. There are a variety of techniques to choose from when it comes to motion capture, and which one fits depends on the type of application we are working on. The price tags differ as well, but there are cheap alternatives, Microsoft Kinect being one of them.

The technique I’ve chosen for this application is called optical, meaning that we will rely on cameras to extract the motion from our actor. Optical is in turn divided into two subgroups, and we will be utilizing a process called passive. This means that light needs to be present for the cameras to see the actual markers.

To extract the 3D motion of a specific marker we need multiple cameras, and we need to determine where in space these cameras are located. This is crucial if we want to successfully pinpoint a marker’s 3D position. It is done by calibrating the cameras using a known geometric shape that is visible to all cameras. The software then compares the images from every camera, measuring perspective to work out where the cameras are. When the actual capture occurs, the same technique is used to acquire the position of a marker.
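
To make the multi-camera idea a little more concrete, here is a minimal sketch in Python. It is only an illustration and not how any particular capture software works: it assumes calibration has already given us each camera's position and the direction in which that camera sees the marker, and it estimates the marker as the midpoint of the shortest segment between the two viewing rays. The function name `triangulate` and the sample coordinates are made up for this example.

```python
# Sketch: estimate a marker's 3D position from two calibrated cameras.
# Assumption: each camera gives us a ray (its position plus the direction
# towards the marker). Real systems use more cameras and more robust solvers.

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate(origin_a, dir_a, origin_b, dir_b):
    """Return the point halfway between the closest points of two rays."""
    w = sub(origin_a, origin_b)
    a = dot(dir_a, dir_a)
    b = dot(dir_a, dir_b)
    c = dot(dir_b, dir_b)
    d = dot(dir_a, w)
    e = dot(dir_b, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        # Parallel rays carry no depth information.
        raise ValueError("rays are parallel; the marker cannot be located")
    t = (b * e - c * d) / denom  # distance parameter along ray A
    s = (a * e - b * d) / denom  # distance parameter along ray B
    point_a = add(origin_a, scale(dir_a, t))
    point_b = add(origin_b, scale(dir_b, s))
    return scale(add(point_a, point_b), 0.5)

# Two hypothetical cameras one unit left and right of the origin,
# both looking at a marker two units in front of them.
marker = triangulate((-1.0, 0.0, 0.0), (1.0, 0.0, 2.0),
                     (1.0, 0.0, 0.0), (-1.0, 0.0, 2.0))
print(marker)
```

With only one camera there is a single ray per marker, so there is nothing to intersect it with, which is why a one-camera setup cannot recover depth.
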
As long as the marker is visible to the majority of the cameras, the software is able to track and extract its position. The upside is that we get true 3D information for all the markers. Another neat feature is that the process is real-time: we can evaluate a motion as we record it, making it easy to direct our actor. The downside is that the system is quite expensive and technical, and you will also need specialized software to interpret all the image streams produced by the cameras.

With one camera we are able to capture movement in 2D, meaning that we only get data along the X and Y axes, leaving us with no depth. As you can see from the example, the front view looks good but the side view clearly shows a lack of depth information. The upside is that it’s cheap. You can use virtually any camera that can record video, and the preparation time is only a fraction of what a fully fledged motion capture rig requires. The downside is that the technique relies on semi-manual tracking and interpretation. It’s a process where the artist needs to identify every marker and track it throughout the recorded clip. This of course makes it a non-real-time solution.

In my case the one-camera solution was the most suitable, and the inspiration for it came from the movie Avatar. I noticed when viewing the behind-the-scenes material that the actors were wearing head-mounted cameras on stage. That’s a brilliant solution: the actors can move freely and still have their face in focus throughout the shoot. Since I don’t have access to a moulding kit to make a custom camera helmet, I needed to look for some really lightweight equipment.

I first thought of a web camera, since they tend to be lightweight and don’t cost a small fortune, making them affordable for individuals. The two problems associated with this type of camera are: 1) they usually use a cord, limiting the actor’s movement; 2) they have a hard time delivering a stable frame rate.
I tested multiple brands of 30 FPS cameras and they delivered anywhere between 12 and 23 frames per second.

The next prototype involved a tripod-mounted camera. This time around I used my Canon Ixus digital camera, which delivers a steady 30 FPS at 640 by 480. The problem with this setup is that the actor must keep his head completely still, making it really hard to act. The reason is that we only want to capture the motion of selected facial features and not the entire head, meaning that we need to zero out all head movement to prevent the facial features from “sliding” around. As you can see in the eyebrow example, the head movement makes the eyebrows appear to move around unnaturally.

A friend of mine suggested that I mount the camera on a motorbike helmet. This would give me a sturdy anchor point for the camera rig, as long as the helmet itself is light. So with the help of a piece of wood and a piano hinge, the camera was successfully attached. Sadly it turned out quite heavy, and the piano hinge wasn’t sturdy enough, so the rig bends with sudden movements. It’s uncomfortable and not very elegant in design, but it will give you video with the face in focus.

Start by placing the markers on the face, using something that is removable. In my case I used an eyeliner pen. Light the room where the shoot is going to take place adequately. Too dark and you will introduce grain to the footage, and grain might generate jittery movement down the line. Too bright and the markers might disappear in skin highlights.

Decide on the number of markers for your project. Many markers will enable you to extract a lot of information from the face, which is good for film or high-end productions. Fewer markers will be more efficient, making the solution good for games or real-time applications. The placement of the markers should mimic the bone setup of your 3D character, meaning that they should be in the same general area of the face.
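
The “sliding” problem described for the tripod setup can be illustrated with a small Python sketch. This is not the method used in this series, just one simple way to think about zeroing out head movement: assume one marker sits on a part of the face that does not move with expressions, and subtract its drift from every other marker. The function name `stabilize` and the marker names `nose_bridge` and `brow_left` are hypothetical, and the sketch only compensates for translation, not for head rotation or changes in distance to the camera.

```python
# Sketch: remove head translation from 2D marker tracks.
# Assumption: `reference` names a marker that stays put relative to the skull,
# so any movement it shows is head movement rather than facial movement.

def stabilize(tracks, reference):
    """tracks maps a marker name to a list of (x, y) positions, one per frame.

    Returns new tracks in which the reference marker stays at its
    first-frame position and all other markers are shifted accordingly.
    """
    ref = tracks[reference]
    ref_x0, ref_y0 = ref[0]
    stabilized = {}
    for name, points in tracks.items():
        stabilized[name] = [
            (x - (rx - ref_x0), y - (ry - ref_y0))
            for (x, y), (rx, ry) in zip(points, ref)
        ]
    return stabilized

# Hypothetical pixel data: the head drifts 5 px to the right per frame,
# and the eyebrow rises 3 px on the last frame.
tracks = {
    "nose_bridge": [(100, 200), (105, 200), (110, 200)],
    "brow_left": [(80, 150), (85, 150), (90, 147)],
}
print(stabilize(tracks, "nose_bridge")["brow_left"])
```

After stabilization the eyebrow track keeps only the 3 px raise, while the sideways drift caused by the head is gone. A head-mounted camera achieves the same effect physically, because the camera moves with the head.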

About the Author: Earl Hamill
