Monday, April 18, 2011

Getting a TLC5940 to retrigger itself

At first glance, the TLC5940 seems quite nice. It offers 16 output channels, each with 12-bit PWM and a 6-bit current limit. Unfortunately, the chip has no internal oscillator, and when one PWM cycle ends, it needs to be retriggered. This is kind of annoying, and it partly defeats the purpose of using an extra chip for PWM: you have to supply a jitter-free PWM clock, count its cycles, and retrigger the chip via BLANK when necessary.

I wanted to get it working quickly, so I devised a hack. By capacitively coupling OUT15 to BLANK, the rising edge of OUT15 at the end of the PWM cycle became a positive BLANK pulse which started the next cycle. I used a 4.7 kohm pullup at OUT15 to create a signal there, a 22 nF capacitor between the pins, and a 100 kohm pulldown at BLANK. At BLANK, I also added diodes to the power supply rails to clip the signal; I'm not sure whether those were needed.

I never checked whether the signal was within or even close to spec, but it was perfectly reliable once started. To start it, the grayscale value of OUT15 had to be set to the desired length of the PWM cycle, and BLANK had to be pulsed high. Sometimes it started by itself, but that was unreliable, especially because the initial contents of the TLC5940's registers are undefined.
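For concreteness, here is a minimal C sketch of that start-up sequence, assuming bit-banged control lines and a free-running GSCLK. The gpio_* functions are hypothetical placeholders for whatever the host MCU provides; only the order of operations follows the description above.

    #include <stdint.h>

    /* Hypothetical pin-control functions (assumed, platform-specific). */
    extern void gpio_sin(int level);    /* TLC5940 SIN  (serial data)  */
    extern void gpio_sclk(int level);   /* TLC5940 SCLK (serial clock) */
    extern void gpio_xlat(int level);   /* TLC5940 XLAT (latch pulse)  */
    extern void gpio_blank(int level);  /* TLC5940 BLANK               */

    /* Shift one 12-bit grayscale value into the chip, MSB first. */
    static void shift12(uint16_t value)
    {
        int i;
        for (i = 11; i >= 0; i--) {
            gpio_sin((value >> i) & 1);
            gpio_sclk(1);
            gpio_sclk(0);
        }
    }

    /* Load all 16 grayscale values (channel 15 first, as the chip
       expects) and fire one manual BLANK pulse.  gs[15] determines when
       OUT15 turns off, and therefore the length of the PWM cycle that
       the OUT15-to-BLANK coupling keeps retriggering. */
    void tlc5940_start(const uint16_t gs[16])
    {
        int ch;
        for (ch = 15; ch >= 0; ch--)
            shift12(gs[ch]);
        gpio_xlat(1);
        gpio_xlat(0);
        gpio_blank(1);  /* the one manual trigger */
        gpio_blank(0);
    }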

In up mode, all output modes are useful for TACCR0

In the MSP430x2xx Family User's Guide, the section on Timer_A output modes claims that "Output modes 2, 3, 6, and 7 are not useful for output unit 0, because EQUx = EQU0". However, the output example for up mode states: "The OUTx signal is changed when the timer counts up to the TACCRx value, and rolls from TACCR0 to zero, depending on the output mode." Based on this, those output modes do seem useful for TACCR0: they can create a one-cycle pulse. This actually works; I used it to drive the BLANK input of a TLC5940.
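As a sketch, this is roughly what that configuration looks like in C, assuming a part where TA0.0 comes out on P1.1 (true of many MSP430x2xx devices) and using output mode 3; the period value is just an example.

    #include <msp430.h>

    /* Up mode with output mode 3 (set/reset) on CCR0: OUT0 is set when
       the timer counts up to TACCR0 and reset when it rolls over to
       zero, so TA0.0 carries a positive pulse one timer clock wide at
       the end of every period. */
    void blank_timer_init(void)
    {
        P1DIR  |= BIT1;             /* TA0.0 output on P1.1 (device-specific) */
        P1SEL  |= BIT1;
        TACCR0  = 4095;             /* period = TACCR0 + 1 = 4096 clocks */
        TACCTL0 = OUTMOD_3;         /* set at TACCR0, reset at rollover  */
        TACTL   = TASSEL_2 | MC_1;  /* SMCLK, up mode                    */
    }

With the timer clocked from the same source as the TLC5940's GSCLK, the pulse lands at the end of each 4096-count grayscale cycle.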

Sunday, April 03, 2011

Video capture

I recently used VirtualDub 1.9.11 to capture video with audio.

I captured video via an ATI TV-Wonder VE card, which is based on the Brooktree/Conexant Bt878 chip. Windows 7 drivers are not available from Microsoft or the manufacturer, but version 5.3.8 of the open-source btwincap driver works pretty well. In VirtualDub capture mode, overlay never works and resolutions can't be set via the capture pin, but preview works, and resolutions can be set via "Set custom format". Spending many minutes in (nonfunctional) overlay mode can cause a bluescreen or crash, but that's easy to avoid by switching to preview or "no display" mode. When previewing or capturing, the card is perfectly stable, and video quality is great.

The TV-Wonder VE doesn't capture audio, so I recorded audio via Realtek ALC889a line in, with driver version 6.0.1.6235. The audio quality was good, but since audio and video were recorded via separate devices, they weren't in sync. My first capture was a major disappointment, perhaps due to Windows "enhancements" on the line-in device; after I disabled those, things got better. I discovered that VirtualDub could sync the two perfectly if "Correct video timing for fewer frame drops/inserts" was not checked in "Capture timing options". If the option was checked, no frames were dropped and fewer frames were inserted, but VirtualDub did not synchronize the audio very well: there was both an offset and a drift. Fortunately, in most cases the drift was insignificant, so I could simply correct for the offset. I captured most video this way to minimize frame drops and inserts.

I spent some time wondering whether to extend the luma black or white points. It seemed like there was information beyond the luma white point, but extending it didn't help, perhaps because that information was distorted. There was also a bit of information beyond the black point, but extending it increased noise. Extending either made properly exposed scenes look washed out. I decided never to extend the white point, and to extend the black point only for a few scenes which would otherwise have been excessively dark.

All the frame inserts I got happened before scene changes or severe noise. This makes me think that the card simply didn't record frames when a usable signal wasn't available. When an insert happened before a scene change, what followed was either one frame of the old scene followed by the first frame of the new scene, or one frame of the old scene followed by a frame in which the old scene was interlaced with the new one. In such cases, I removed those frames.

I gave up on deinterlacing to 29.97 fps because all deinterlacing algorithms are a compromise between blur and artifacts, and I didn't want to bake that permanently into the video. Deinterlacing to 59.94 fps produced good results, but it also increased the compressed video size substantially. Because of that, I decided to encode interlaced (MBAFF) H.264 video with x264. It plays acceptably in Windows Media Player 12 and excellently in MPC-HC with ffdshow Yadif deinterlacing.
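For reference, a minimal x264 invocation along these lines might look like the following; the CRF value and file names are placeholders rather than the settings actually used, and reading .avs input assumes a Windows build with AviSynth support.

    x264 --crf 18 --interlaced -o video.264 video.avs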

Before encoding, I considered various forms of denoising, but I never found one that really improved the end result. Denoising could decrease the slight noise, but it also blurred subtle patterns like distant grass, asphalt, and slightly dirty walls. The file size decrease from denoising was also offset by the greater visibility of artifacts in the smoothed areas. The most bothersome effect was that areas which showed motion due to subtle patterns could stop showing motion. Finally, I noticed that x264 seems to perform some denoising itself, so denoising wasn't needed after all.

Reversing a PAL to NTSC conversion

I wanted to capture video from an NTSC VHS tape that had been converted from a PAL tape. Of course, it would have been better to capture from the original PAL tape, but I only had the NTSC tape, so I had to work from that.

The NTSC tape has some fields with breaks in them. The area above the break belongs to the next field, and the area below it to the previous field. These fields cause a slight motion stutter; without them, motion is perfectly smooth. Clearly, they were inserted to increase the field rate from PAL's 50 Hz to NTSC's 59.94 Hz.

Usually, the added fields occur every 6 fields: 5 original fields, 1 fake field, 5 original fields, and so on. However, the break slowly moves down the field. When it moves off the bottom, there are a few fields that are exact duplicates; after that, the break moves into the top of the next field. Before that happens, it is necessary to skip ahead 7 fields once (outputting 6 original fields) to stay in sync with the inserted fields. Another way to look at this is that there is a "repeat one field" signal which recurs every slightly more than 6 field times. It's not exactly 6 because NTSC runs at 59.94 Hz instead of 60, so the field-rate ratio is slightly less than 6/5.
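The expected spacing falls straight out of the two field rates; here is the arithmetic as a trivial C program (plain numbers, nothing tape-specific):

    #include <stdio.h>

    int main(void)
    {
        double pal = 50.0, ntsc = 59.94;
        double inserted = ntsc - pal;     /* 9.94 inserted fields/s */

        printf("one inserted field every %.4f fields\n",
               ntsc / inserted);          /* 6.0302: a bit more than 6 */
        printf("field ratio %.4f, 6/5 = 1.2\n",
               ntsc / pal);               /* 1.1988: a bit less than 6/5 */
        return 0;
    }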

Each inserted field flips the correspondence between PAL and NTSC odd and even fields, and in any case, PAL interlace can't really be translated to NTSC interlace. (Imagine two overlapping combs with different tooth spacing.) Because of this, the converter had to deinterlace to 50 fps progressive, or, more likely, just resize fields vertically while shifting them to compensate for the vertical displacement due to interlacing.

When outputting NTSC video, the converter bobbed the fields up and down to simulate interlace. However, this was done incorrectly: the image should bob up and down by one pixel at 480 lines, i.e., half a pixel at 240 lines, but instead it bobbed by a whole pixel at 240 lines. This finally made me give up on trying to recover interlaced video. Instead, I decided to capture at 320x480 and end up with 50 frames per second progressive video at 320x240.

The first processing step corrected for the bobbing. Using AviSynth, I added a 1-pixel border at the bottom of the top fields and cropped 1 pixel off the top. This didn't lose any image data, because the top row of these fields was black.

Once the bobbing was corrected, I used AviSynth's RGBDifferenceFromPrevious function to collect data on the inserted fields. By cropping the video so that only a few lines at the top or bottom remained, I detected when the top part of a frame belonged to the next frame, or the bottom part to the previous one. By using the function on the whole image, I detected when the switchover fell in the vertical retrace interval and frames were exact duplicates. In all cases, it was necessary to crop off the blackness, the very edges (which are noisy), and the video head switching at the very bottom.

After a bit of experimentation, I chose to use the data from the frame tops, and to process it with a program which decides whether the next duplicate frame occurs 5, 6, or 7 frames after the previous one. It never occurs 5 frames after, but that capability allows the program to resynchronize if it chooses 7 when it should have chosen 6. A sketch of the idea follows. After this, another simple program replaced the "7, 5" combinations with "6, 6" and counted the lengths of the spans of 6s that occurred before each 7.
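This is not the original program, but a sketch of the decision rule described above, assuming one difference value per frame on stdin (from the cropped frame-top data), where a low value marks a duplicate:

    #include <stdio.h>

    #define NFRAMES 200000          /* assumed upper bound */

    int main(void)
    {
        static double diff[NFRAMES];
        int n = 0, pos = 0;

        while (n < NFRAMES && scanf("%lf", &diff[n]) == 1)
            n++;

        /* pos is the frame index of the previous duplicate; the first
           one is assumed to be at frame 0 for simplicity. */
        while (pos + 7 < n) {
            int best = 6, step;
            for (step = 5; step <= 7; step++)
                if (diff[pos + step] < diff[pos + best])
                    best = step;    /* most duplicate-like candidate */
            pos += best;
            printf("%d %d\n", pos, best);
        }
        return 0;
    }

The follow-up pass that rewrote "7, 5" as "6, 6" and measured the spans of 6s would then operate on the second column of this output.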

The resulting data was good, but it had some glitches. I used a spreadsheet to work with it. There, I automatically removed some minor jitter and manually fixed a few larger glitches. When I tried to fit a line to the data, I found that it was actually a hockey stick curve with the bend at the start. This was probably because oscillators drifted during warmup and then stabilized. I considered trying to fit some kind of function to the graph, but minor fixes were sufficient.

Once I was satisfied with the data, I wrote another simple program which created a VirtualDub script. First, I had it keep only the frames I wanted to remove, to verify that I was indeed removing the frames which had tears in them. Then I created the final script, which removed those frames.
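A sketch of what the final script-generating step could look like, assuming the frame numbers to drop arrive sorted on stdin and that the script drives VirtualDub's subset mechanism (VirtualDub.subset.AddRange(start, length) keeps a range of frames):

    #include <stdio.h>

    int main(void)
    {
        long drop, start = 0;

        printf("VirtualDub.subset.Clear();\n");
        while (scanf("%ld", &drop) == 1) {
            if (drop > start)       /* keep everything up to the drop */
                printf("VirtualDub.subset.AddRange(%ld,%ld);\n",
                       start, drop - start);
            start = drop + 1;
        }
        /* A final AddRange up to the clip's last frame (total frame
           count assumed known) would go here. */
        return 0;
    }

Inverting the selection (emitting ranges for the dropped frames instead) gives the verification pass that keeps only the torn frames.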

After the video was finished, it was time to synchronize the audio. This can be done either by resampling the audio or by changing the frame rate. I chose to resample the audio. For perfect sync, I could have used the data I had generated to create a variable sample rate, but I just used a fixed sample rate based on a linear approximation instead. The errors were small enough to be unnoticeable.
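Here is a sketch of the fixed-rate arithmetic, with made-up durations; the real numbers would come from the collected data:

    #include <stdio.h>

    int main(void)
    {
        /* Illustrative values only (assumed, not measured). */
        double audio_seconds = 3600.0;   /* captured audio length        */
        double video_seconds = 3598.5;   /* final video length           */
        double capture_rate  = 48000.0;  /* capture sample rate, assumed */

        /* Scale the audio's length by video/audio so both end together... */
        printf("resample length factor %.6f\n",
               video_seconds / audio_seconds);

        /* ...or, equivalently, treat the existing samples as having this
           source rate and resample them back to the capture rate. */
        printf("treat input as %.2f Hz\n",
               capture_rate * audio_seconds / video_seconds);
        return 0;
    }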

Finally, it was time to encode the audio and video. I used a low (high-quality) CRF in x264, because at 320x240 the video was quite sharp and I didn't want to degrade it.