The world of non-commercial film and A-V
The Film and Video Institute
Exit Song (for Romeo and Juliet) by Ken Kasriel got a 4-Stars Award at BIAFF 2009.
I'm grateful for the invitation to write about my entry in this year's festival. I don't want to discuss what the film is about as - while I certainly had something in mind when I made it - ultimately this is up to the viewer. Rather, I'd like briefly to discuss the technical aspects of making it.
This was my first attempt at animation. It gave me a huge respect for animators, as it has easily been, minute-for-minute, the hardest thing I've attempted in my admittedly brief (four-year) time making short films as a hobby.
It started by chance, when on the web I stumbled across a software package called CrazyTalk for animating faces. Load in a digital photo of a face and a sound file, and with some tweaking, the face convincingly says whatever is on the sound file.
I downloaded the trial version even though I was initially sceptical, as the makers market it mainly as something for creating novelty clips - make your mother-in-law bark, or the Mona Lisa burp, or your boss's voice come out of a gorilla's mouth, etc. It is also fairly inexpensive, which led me to think it was a limited toy. Once I started playing with it, however, I found that it actually gives you a very fine degree of control over the range and intensity of a subject's expression - eye and head movements, mood, etc. If anything, there might be too many options - it takes some discipline to not get lost in the maze of choices.
The results I got as I moved up the learning curve (which was not too steep; the makers go to great lengths to make the software intuitive to use) were frankly spooky. They reminded me of something I read once about how people react to robots. They have no problem with robots which look nothing like a human, or with those which look very human; but the middle ground, where something clearly inorganic shows life-like behaviour, is eerie. The artistic applications seemed clear.
I also found, however, that, as good as the software is, it takes some time and effort to make your results move from "pretty good" to "very good". When you load in your photo, it guesses what the facial layout is, but this often has to be adjusted. It also guesses what is being said in the sound file, and moves the lips accordingly. As this is not like sophisticated dictation software, which requires training to understand a particular voice, the results of the audio guesses are understandably patchy.
Fixing the lip movements is possible but very painstaking. If the software has guessed wrong, you actually need to spell the words out, using not letters of the normal alphabet, but rather phonemes, or special phonetic characters, which take some getting used to.
In a timeline view, you zoom in to a very fine time resolution (measured in milliseconds rather than seconds) and set the position of the phonemes to match both the audio and video. You must not only align a phoneme to match the start of a syllable, but also must set the duration of the phoneme so that the mouth stays open for the appropriate time. Failure to do the latter makes the lips close almost instantly after the start of the syllable, which looks like complete crap. But with some tinkering, the result can be convincing.
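For readers who think in code, the two adjustments described above (aligning a phoneme with the start of a syllable, then holding it for the right duration) can be sketched roughly as follows. This is only an illustration: the `PhonemeCue` structure, the `fit_to_syllable` helper, the phoneme label and all the timings are hypothetical, and none of it reflects CrazyTalk's actual file format or interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhonemeCue:
    """One mouth shape on a lip-sync timeline (hypothetical structure)."""
    phoneme: str      # phonetic label, e.g. "OH"; the label set is illustrative
    start_ms: int     # timeline position where the mouth shape begins
    duration_ms: int  # how long the mouth holds the shape

    @property
    def end_ms(self) -> int:
        return self.start_ms + self.duration_ms


def fit_to_syllable(cue: PhonemeCue, syllable_start_ms: int, syllable_end_ms: int) -> PhonemeCue:
    """Align a cue with the start of a sung syllable and hold it until the syllable ends."""
    if syllable_end_ms <= syllable_start_ms:
        raise ValueError("syllable must end after it starts")
    return PhonemeCue(cue.phoneme, syllable_start_ms, syllable_end_ms - syllable_start_ms)


# A guessed cue that starts early and closes the mouth almost at once.
guessed = PhonemeCue("OH", start_ms=1200, duration_ms=40)
# Suppose (made-up figures) the syllable is actually sung from 1250 ms to 1900 ms.
fixed = fit_to_syllable(guessed, 1250, 1900)
print(fixed)
```

In practice the syllable boundaries come from listening and watching rather than from a function call, which is why the process is so painstaking.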
Being lazy, I opted for something with few syllables
Once I got comfortable with the software I decided to make a full music video with it, but - being both overworked and otherwise lazy - I chose a song with relatively few syllables!
I decided on a version of a song I'd recently recorded, singing to guitar played by my talented friend, Barry Gollop. The song is called Exit Music (For a Film), a haunting tune written by Thom Yorke, originally recorded by the band Radiohead, and used as the credits rolled in the 1996 version of the movie, William Shakespeare's Romeo + Juliet.
Deciding which images to use took, all told, around 10 hours; figuring out the best order for them, around 5 hours; and lip-syncing them all, probably another 50 hours.
The lip-syncing would have taken much less time but for a problem I was having with the software when working with longer face sequences.
The problem was that after getting such a sequence just as I wanted it in the timeline editor view, I would render it (i.e. mix it down to a short video file), only to find that somewhere in the process the timing slipped from what was intended by up to a few seconds - often ruining the result. Aargh. The only way to fix this was to figure out how to make the sequence so that it looked wrong when I made it in the editing view, but looked right when rendered - akin to aiming wrong with a rifle to compensate for its warped barrel. But as the rendering itself took time, this required a lot of trial, waiting, and error.
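The "aim wrong to hit right" workaround amounts to measuring how far an event slipped in the render and shifting it the opposite way in the editor. A minimal sketch, assuming the slip at a given point stays the same once the event is moved (which may well not hold, hence the repeated re-rendering); the function name and the figures are hypothetical:

```python
def compensated_position(intended_ms: int, editor_ms: int, rendered_ms: int) -> int:
    """Return where to place an event in the editor so it renders at intended_ms.

    Assumes the slip seen at this point (rendered_ms - editor_ms) stays the same
    after the event is moved; that is an assumption, so re-render and re-check.
    """
    drift_ms = rendered_ms - editor_ms
    # Shift against the drift, but never before the start of the timeline.
    return max(0, intended_ms - drift_ms)


# Hypothetical measurement: placed at 12000 ms in the editor, it rendered at 13500 ms.
new_editor_ms = compensated_position(intended_ms=12000, editor_ms=12000, rendered_ms=13500)
print(new_editor_ms)
```

Each pass still costs one full render, so the arithmetic is the easy part; the waiting is what eats the hours.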
In the end I fixed most of the lip-syncing timing lapses - though not all - which really bothers me, as I am a perfectionist.
But, as someone (I think Shelley) once said, no work of art is ever finished, only abandoned. That's my excuse, anyway!
- Ken Kasriel
[You can hear more music by Ken and the St. Joesph's Social Club on www.reverbnation.com. CrazyTalk is a product of Reallusion (www.reallusion.com/crazytalk) and you can download a free trial version from that website. The best price I have so far found is from Softonic at £17.61 (http://crazytalk.en.softonic.com) - Ed.]