Captioning Web Video

I’m no video expert. Yet I often find myself encoding, editing, and otherwise manipulating video for the web. Recently, I completed a video project that involved converting a DVD of a 40-minute presentation into a movie that could be viewed on a web page, as a whole or in chapters. The final product had to be captioned.

Converting the DVD into video for the web was easy. I used Handbrake to rip the DVD into MP4 format. Editing was equally easy. I used iMovie to add title screens and transitions, and to break the movie up into chapters. Adding the captioning, however, was tricky.

Why bother with captioning? Here are some good reasons: so that those who are deaf or hard of hearing can enjoy the video, so the text is indexed by search engines, and to aid those for whom English is a second language. And here’s another: the Twenty-First Century Communications and Video Accessibility Act of 2010.

If captioning is important, then why isn’t it a mainstream practice? I’m not qualified to answer that question, but my guess is that it’s in part due to the fact that captioning is time-consuming and difficult. For instance: with external captioning (where captions are contained in an external file and sync with the video), there are multiple formats and a lack of clear standards. And for embedded captioning (where captions are simply typed in an editor and then exported with the movie), it’s just plain tedious work.

For my recent video project, I considered three captioning options:

  1. Embed the captions. The first option is to place the captions directly into the movie itself using a tool such as Final Cut Pro, iMovie, or Adobe Premiere. I have Final Cut Pro, but I tend to use iMovie since most of the video work I do is short and simple. It’s the easiest tool for the job and the results look good. Here’s the thing about iMovie: while there are dozens of title/text effect options, none are designed for captioning (which is surprising given Apple’s robust accessibility options for the OS). Despite this shortcoming, I’ve discovered that I can ‘fake’ captions by adding lower thirds to each segment of video. Making a default lower third overlay in iMovie into something that resembles a caption is a matter of changing font sizes. You can see an example of this in a recent video podcast I produced. This works, but it isn’t a practical solution for a long movie. In truth, it’s really not an ideal solution for a movie of any length because the captions are permanently embedded in the video. Screen readers and search engines can’t see this text. People can’t choose to turn the captions on/off. So I didn’t choose this option for my project.
  2. Dump the text on the page. A second option is to dump the captioning for a video on a page, underneath the video as HTML text. This may technically meet accessibility requirements, but it’s a lousy solution. The text is unassociated with the video. One can read the text or watch the video. It’s not feasible to do both at the same time. Nix.
  3. Create an external caption file. This last choice is the best solution: create an external caption file that will appear in sync with the video. Captioning is then matched up with the video, it’s readable by screen readers, and it’s good for search engines. It can also be turned on or off at the user’s discretion.

So how do you create and deploy an external caption file? If you simply wish to place a video on YouTube, it’s easy. Once you upload your video to the free service, YouTube offers free auto-generated machine transcription. While you’ll find that video speech-to-text accuracy is hit-and-miss (more miss in my experience), the important part is that Google generates time codes that precisely match the audio in the video. So once you download the caption file from YouTube, it’s simply a matter of manually correcting the text so that what appears in the caption will match what is actually being said in the video.

If you don’t want to (or can’t because of workplace policy) solely use YouTube to present your video, it’s still a very useful tool. How? If you are embedding captions in a video using an editor such as iMovie, YouTube will do half of the work for you by delivering a fair approximation of a transcript. If you want to use an external caption file elsewhere with a different video player, you can still use this Google-generated file. You just need to convert it into the right format.

Here’s the process I used to generate a caption file for my video project:

  • I began by uploading the video to my YouTube channel.
  • I then requested that YouTube auto-generate a Subviewer caption file for this movie (Be patient. It may take hours to get this file back from Google because you’ll be in a queue with tons of other people).
  • I then downloaded this file and opened it up in a text editor.
  • The next step is tedious, but necessary: cleaning up the machine-generated text. I opened my movie in a QuickTime player window and, as it played, edited my caption text to correct errors and typos. It’s not too bad if you toggle between a text editor and QuickTime using Cmd-Tab.
  • Once I had my cleaned-up Subviewer text file, I copied and pasted it into a free online converter to generate the appropriate format. In my case, I generated a DFXP file for use with a Flash player. Here are three conversion tool options:
    • 3PlayMedia Caption Format Converter. This converter lets you convert from SRT or from SBV to DFXP, SMI or SAMI (Windows Media), CPT.XML (Flash Captionate XML), QT (QuickTime), and STL (Spruce Subtitle File).
    • Subtitle Horse. A free online caption editor. Exports DFXP, SRT, and Adobe Encore files.
    • Subviewer to DFXP. This free online tool from Ohio State University converts a YouTube .SBV file into DFXP, Subrip, or QT (QuickTime caption) files. I used this tool for my project.
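
If you’d rather not paste your transcript into a web form, the conversion step above is simple enough to script. Here’s a minimal sketch in Python of going from Subviewer to SubRip. It assumes the downloaded file uses the usual .SBV layout (a timing line like 0:00:01.500,0:00:04.000, then the caption text, then a blank line). The function name sbv_to_srt is my own invention for illustration, not part of any of the tools listed above.

```python
import re

# SBV cue timing line, e.g. "0:00:01.500,0:00:04.000" (assumed layout)
SBV_TIMING = re.compile(
    r"^(\d+):(\d{2}):(\d{2})\.(\d{3}),(\d+):(\d{2}):(\d{2})\.(\d{3})$"
)


def sbv_to_srt(sbv_text):
    """Convert Subviewer (.sbv) caption text to SubRip (.srt) text.

    Hypothetical helper: only the timing syntax and cue numbering change,
    the caption text itself is passed through untouched.
    """
    # Normalize line endings, then split into cues on blank lines.
    blocks = re.split(r"\n\s*\n", sbv_text.replace("\r\n", "\n").strip())
    cues = []
    for block in blocks:
        lines = block.split("\n")
        match = SBV_TIMING.match(lines[0].strip())
        if not match:
            continue  # skip anything that does not start with a timing line
        h1, m1, s1, ms1, h2, m2, s2, ms2 = match.groups()
        # SRT wants zero-padded hours and a comma before the milliseconds.
        start = f"{int(h1):02d}:{m1}:{s1},{ms1}"
        end = f"{int(h2):02d}:{m2}:{s2},{ms2}"
        text = "\n".join(lines[1:])
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + "\n"


if __name__ == "__main__":
    sample = "0:00:01.500,0:00:04.000\nHello there.\n"
    print(sbv_to_srt(sample))
```

The time codes pass straight through, which is the whole point: Google did the hard part, and the script only changes the punctuation around it.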

What’s the appropriate format?

  • YouTube: Subviewer (.SBV) 
  • iTunes, iOS: Scenarist Closed Caption (.SCC) 
  • Flash: DFXP (Timed Text Markup Language, the W3C recommendation). These are plain ol’ XML files. You could also use the SubRip (.SRT) file format for Flash.
  • HTML5:  See this post.
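
To show just how plain those DFXP files are, here’s a rough Python sketch that builds a bare-bones one from a list of cues using only the standard library. Treat the details as assumptions: the function name cues_to_dfxp is hypothetical, the namespace is the one from the W3C TTML recommendation (some older players may expect an earlier draft namespace), and a real player may also want head, styling, or language attributes that I’ve left out.

```python
import xml.etree.ElementTree as ET

# Namespace from the W3C TTML recommendation (assumed; check your player's docs).
TTML_NS = "http://www.w3.org/ns/ttml"


def cues_to_dfxp(cues):
    """Build a minimal DFXP/TTML document from (begin, end, text) tuples.

    Hypothetical helper: times are passed through as "hh:mm:ss.mmm" strings.
    """
    ET.register_namespace("", TTML_NS)  # serialize with a default xmlns
    tt = ET.Element(f"{{{TTML_NS}}}tt")
    body = ET.SubElement(tt, f"{{{TTML_NS}}}body")
    div = ET.SubElement(body, f"{{{TTML_NS}}}div")
    for begin, end, text in cues:
        # One <p> element per caption, timed with begin/end attributes.
        p = ET.SubElement(div, f"{{{TTML_NS}}}p", {"begin": begin, "end": end})
        p.text = text  # ElementTree escapes &, < and > on output
    return ET.tostring(tt, encoding="unicode")


if __name__ == "__main__":
    print(cues_to_dfxp([("00:00:01.500", "00:00:04.000", "Hello there.")]))
```

Each caption ends up as a single timed paragraph element, which is why converting between the text-based formats and DFXP is mostly a matter of reshuffling the same time codes.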

If you’re not using a hosted service like YouTube or Vimeo (which, incidentally, does not support external captions), you’ll of course have to decide how to present the video on your site. There are many options. You can roll your own player with external captions using Adobe Flash. You can use off-the-shelf players that support captioning such as Flowplayer and JW Player — these two commercial products offer very easy setup and they offer HTML5 video players with Flash fallback. Another option: you might try HTML5 with experimental captioning support (note that Safari 5 now supports captioning with the HTML5 video tag). As I said, there are options. The video player discussion is beyond the scope of this post (and I don’t want to go down the HTML5 vs. Flash rabbit hole!).

My main goal here is to point out that Google’s machine transcription is good for more than just hosting a captioned video on YouTube. It’s trivial to convert this caption file into a variety of formats. The key point is that you don’t have to manually add time codes for your video. This critical step is done for you.

Yet even with this handy Google tool, generating caption files (and getting them to work with video players) remains an unwieldy task. We clearly need better tools and standards to help bring video captioning into the mainstream.

P.S. While researching this post, I came across two low-cost tools that look like solid options to create iOS and iTunes movies with captions. Both are from a company called bitfield. The first is called Submerge. This tool makes it very easy to embed (hard-code) subtitles in a movie and will import all the popular external captioning formats. The second is called iSubtitle. This tool will ‘soft-code’ subtitle tracks so you can add multiple files (languages) and easily add metadata to your movie.


Affordable Tapeless Video Capture

Here’s the second post from guest contributor Brandon, who is currently attending the National Association of Broadcasters’ annual convention. Today’s topic is tapeless video acquisition and how this tech is starting to filter down to consumer cameras. There are also many good tips here for those looking to buy a video camera. Enjoy.

“Day two of NAB 2008 found me exploring yet another hall of the Las Vegas convention center. I know you’re eagerly waiting to find out what cool stuff I found but, unfortunately, there was nothing of direct Mac relevance. Everything I found today was geared (and priced) directly toward the professional video market.

To be honest, I spent the better part of the day evaluating industrial gear cases, and I just don’t think you’d find it that interesting. Unless of course, you’re willing to spend $600 on a camera case… No? Ok, then. In the interest of keeping fresh material coming in, I thought I’d talk a little about one of the trends in professional gear that is making good progress on its way down from the halls of NAB to the consumer market: tapeless acquisition.

Tapeless acquisition is a technology that is just now really beginning to realize its potential. A few years ago it was only available in high-end professional cameras. We’re talking cameras that cost more than the gross national products of many small South American countries. More recently, though, the technology has found its way into lower-end field cameras such as Panasonic’s P2 and Sony’s XDCAM lines. These are the cameras that serve as the primary tools of documentary crews, independent video journalists and anyone else who needs to move fast and shoot broadcast-quality footage. Essentially, they are BMWs compared to the higher-end ‘Ferraris’ of the camera world. The good news this year is that we’re beginning to see pro technology (such as quality tapeless acquisition) filter down to the consumer level at a Chevrolet price point!

So what does this mean for you? No more spending $5 for a single 60-minute DV cassette. Great! But wait, there’s always a catch, isn’t there? Let’s take the JVC Everio line as an example. These cameras can store up to 37.5 hours of standard definition footage onto their 30GB hard drives, so the issue is not how much drive space you will need.

The first major issue is the compression used to obtain that very tempting specification. A great number of internet reviews of the Everio line indicate that the video produced is soft and exhibits obvious artifacts. This is not exactly what I would like to see in my preserved-for-posterity memories. The other issue is compatibility for playback and editing on your computer. Unless you intend to use the bundled proprietary software to chop your precious memories into bite-sized YouTube morsels, you’ll need to carefully check the compatibility of the camera with your editing software before purchasing.
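
To put a rough number on that compression, here’s a back-of-the-envelope calculation in Python. It takes the 37.5-hour and 30GB figures above at face value and assumes 1GB means 10^9 bytes. The 25 Mbit/s figure for DV tape video is an outside reference point, not something from the spec sheet, and the helper name is made up for illustration.

```python
def average_bitrate_mbps(capacity_gb, hours):
    """Average bitrate in megabits per second, assuming 1 GB = 10**9 bytes."""
    bits = capacity_gb * 1e9 * 8
    seconds = hours * 3600
    return bits / seconds / 1e6


# Figures quoted above: 30GB drive, 37.5 hours of standard definition footage.
everio = average_bitrate_mbps(30, 37.5)

# Nominal DV video data rate (assumption from outside the post, for comparison).
DV_VIDEO_MBPS = 25

if __name__ == "__main__":
    print(f"Everio at maximum recording time: {everio:.2f} Mbit/s")
    print(f"DV tape video is roughly {DV_VIDEO_MBPS / everio:.0f}x that rate")
```

Under those assumptions the longest recording mode averages a little under 2 Mbit/s, which goes some way toward explaining the soft picture and visible artifacts. Shorter, higher-quality modes would presumably fare better.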

For the readers of this site, you should know that the JVC cameras don’t bundle any Mac love. While the JVC website states that “third-party software is available for Macintosh,” I spent nearly 15 minutes (all my ADD would allow) searching for exactly what “third-party software” was available. Guess what…I need to keep looking. Now, in fairness to the little Everios, every report I’ve read indicates that the ‘direct from camera to DVD burner’ feature worked simply and flawlessly — but that really takes the fun out of the whole process.

While I’ve picked on JVC cameras here, these are issues that should be considered and researched when considering offerings from any of the major manufacturers.

But let’s get back to the main benefit of tapeless acquisition. No capturing tapes! It’s really that simple. Not only do you no longer need to buy the expensive little things, you don’t have to spend all that time capturing them into the computer in order to work on your upcoming Academy Award-nominated cinematography. Assuming you do your research and get yourself a great little camera that works perfectly with your Mac, transferring video from your camera will be as simple as copying files from a thumb drive. If your camera is really cool, it will even utilize super-secret CIA scene detection technology to break your happy little trip to the zoo into distinct clips of monkeys, panda bears and tourists falling into the polar bear pit. You may not realize now how great a time saver this is, but it is. Put it this way: the pros utilize modern indentured servitude (interns) so they don’t have to do it themselves. Most of us have to do it ourselves.

In summary: do your homework. Look for documented compatibility with your Mac and software. Pay attention to the little stickers that tell you what size CCD the camera has; more megapixels + bigger CCD = higher quality video. HD is cheap — and all HD cameras should give you the option to shoot standard definition as well, so look for HDV or AVCHD format cameras. Finally, be sure to buy a case to protect your investment…and remember: with video gear you really do get what you pay for.”