May 21, 2008

  • Loose keys

    Loose keys are one of the sure signs that you’re about to move.  Your contact lens solution sits alone as the last surviving representative of the bathroom regimen.  Well, except for its partner, the contact lens case.  Your laptop has survived, too, because it’s charging up for the early morning departure and you can’t possibly pack it away yet.  You will need to look up some detail at the last minute.  Your sheets are gone, and you’re probably going to sleep in your clothes.  Relatedly, you hope that Old Spice works for 24 hours, since it’s packed away and, just to be safe, stowed with all the other dangerous multiple-ounce supplies.  Just remember to triple-check your phone’s alarm, because your real alarm clock is packed away.  And sleep a little, if you’re lucky.

    [Image: moving]

    The coffee mug is to save thirty cents.  And the trees.

May 15, 2008

  • Can Hillary still win?

    News organizations are good at making it seem like a horse race that’s still neck and neck. It’s good for fairness amongst the candidates, and it’s good for encouraging people to watch the news. But I wanted some straight answers. Vikas mentioned Intrade, where people can go buy stock in particular outcomes. They’ve got Obama at 91.6% right now. But still, that’s just a market, and markets aren’t always right.

    So, is there a better way to end the discussion? Yes. The following table, I think, ends the dispute.

    Obama will win the Democratic nomination.

    *Updated*

    [Image: election-update]

    Here’s how it works:

    • Take the current delegate status, minus superdelegates.
    • Take recent polls for all remaining contests (Kentucky, Oregon, Puerto Rico, Montana, South Dakota), and give Hillary every single undecided voter. (This will not happen in reality unless she singlehandedly prevents a terrorist attack or Obama reveals a new strategy of “hopelessness,” with the catchphrase “no we can’t!”)
    • Obama ends up on top, 1702 to 1588.
    • Even if you include MI & FL, Obama ends up on top, 1771 to 1766.

    In order for Hillary to win, then, 62% of the undecided superdelegates would have to decide to go against the popular vote. And remember, the real number will be higher than that because every assumption here heavily favored Clinton. Will this happen? No. I’d go buy some Intrade stock if I were you.

  • Running versus walking

    Today I was running for a while in Central Park.  It’s nice to run when the Olympic athletes aren’t out in their fancy running gear, lapping you while their hearts thump along at a chill 65 bpm.

    But anyway, if you aren’t an Olympic athlete, I think you have to put a decent amount of mental focus into running.  I can’t really focus on anything moderately complex when running.  The good news is you’re boosting your cardiovascular system and improving muscle tone.  And your exercise goes by faster.

    In comparison, if I walk, I can have a more interesting mental dialogue.  This is a great time to put your thoughts in perspective, and look at things from a different angle without all the contextual clutter of your usual workspace.  For me, this usually includes rethinking some technical challenge or general strategy.  It’s not always work — maybe it includes time to think of what birthday presents I should get for others, or who I need to send a thank-you card to or get in touch with.  But the point is, I’m always surprised by the number of important ideas that emerge when I’m just walking.  This is, I think, more valuable than a little added muscle tone.

    Hybrid

    I think the most advantageous strategy is to run first.  Run until you get a second wind and your legs start to feel different, maybe a little numb, and your field of vision has narrowed by about 15 degrees.  Then, with your senses partially numbed, walk home and divert all of your energy to your thoughts.  If you’ve got an MP3 player, either shut it off or play some song you’ve already heard a billion times.  That way you’ll ignore it.  Make sure your field of vision has expanded again; that means your brain is getting the right amount of oxygen.  Take a few breaths, and think like you’ve never thought before.

May 9, 2008

  • Code comments are archaic

    [Image: codecomment]

    Comments in code today are so archaic.  I just realized this when I thought about migrating this little sketch into some code I was writing.  I could do it, and it wouldn’t be the first time I turned a simple diagram into either some overly verbose English description or some time-consuming ASCII art, like …

    /*
    Here’s a bad ASCII art diagram that poorly explains
    the code below.

    [  ][  ][  ]
    [  ][  ][  ]

    [  ][  ][  ] [  ][  ]

             [  ][  ][  ]

             [  ][  ][  ]

    */

    Wouldn’t it be great if we could just annotate something with a tablet pen?  The annotation could be stored as a compressed base64 string at the bottom of the same code file.  It could be visually anchored to a marker within existing code or comments.  Visual Studio could just collapse this block and render the inked annotation instead.
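
    Here’s roughly what I have in mind, sketched in C#.  The marker syntax (“// ink:” in the code, “// inkdata:” in the footer) and the helper names are invented for illustration; a real version would live in an editor add-in that renders the decoded image in place of the collapsed block.

    using System;
    using System.IO;
    using System.IO.Compression;

    static class InkedComments
    {
        // Append a compressed, base64-encoded PNG of a pen sketch to a source file,
        // tied to a named anchor that appears somewhere in the code's comments.
        public static void Embed(string sourceFile, string anchor, byte[] pngBytes)
        {
            using (MemoryStream packed = new MemoryStream())
            {
                using (GZipStream gz = new GZipStream(packed, CompressionMode.Compress, true))
                    gz.Write(pngBytes, 0, pngBytes.Length);

                string line = string.Format("// inkdata:{0} {1}",
                    anchor, Convert.ToBase64String(packed.ToArray()));
                File.AppendAllText(sourceFile, Environment.NewLine + line);
            }
        }

        // Find the annotation for an anchor and return the decoded PNG bytes,
        // ready for the editor to render instead of the collapsed comment block.
        public static byte[] Extract(string sourceFile, string anchor)
        {
            string prefix = "// inkdata:" + anchor + " ";
            foreach (string line in File.ReadAllLines(sourceFile))
            {
                if (!line.StartsWith(prefix)) continue;

                byte[] packed = Convert.FromBase64String(line.Substring(prefix.Length));
                using (MemoryStream output = new MemoryStream())
                using (GZipStream gz = new GZipStream(new MemoryStream(packed),
                                                      CompressionMode.Decompress))
                {
                    gz.CopyTo(output);
                    return output.ToArray();
                }
            }
            return null;    // no inked annotation found for this anchor
        }
    }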

    Does anything like this already exist?

January 16, 2008

  • Can I still learn to spell?

    Apparently, I originally learned to spell with the Letter People.  Participation in middle school spelling bees was mandatory; I won the class-wide “primary” and usually took second place to my best friend Reid Turner.  While I remember very few specific incidents from elementary and middle school, I still recall vividly the crushing defeat in the cafeteria-turned-spelling-bee-stage when I failed to spell the simple word, “neither.”  My spelling technique is primarily phonetic, and secondarily visual recognition.  My Achilles’ heel was, and continues to be, this “i before e, except after c, …” nonsense.


    While “neither” successfully took a prominent place in my memory, there are a handful of other words that did not.  Perhaps the worst member of this set is the word “retrieve.”  Every time this word comes up, I instantly recognize that I don’t know how to spell it, and fail to recall the correct spelling.  Is it retreive?  Or is it retrieve?  Why am I not capable of this simple recollection?  I’m beginning to wonder: can I still learn to spell, or did that phase of life end in the cafeteria circa 6th grade?


    To patch up this egregious flaw in my brain, I’ve decided on an alternate technique.  Retrieve is now RETRY EVE, and since there’s no Y in retrieve, the correct spelling becomes RETRIEVE.  Strange that I had to resort to such extreme measures.

October 7, 2007

  • Idea: Upgrade[hack] your digital camera

    If you haven’t seen High Dynamic Range (HDR) photographs, you need to try a Google Image Search for “HDR Photo.”  The idea is that you can capture a photo with really bright (overexposed) areas and really dark (underexposed) areas, and adjust the colors so that both areas have a lot of definition.


    The pictures come out looking more like artwork than photographs.


    Here’s an example HDR photo of New York City:


    [Image: NewYorkHDR]


    Once I saw a few of these photos, I wanted to make my own.  Tutorials on the subject tell you that you need to take two photos every time you want to create an HDR image, unless your camera can take RAW images.  First you take an underexposure, then an overexposure, and later you merge them together with Photoshop.  The reason is that when your camera compresses your photo into a JPEG, it discards most of the captured dynamic range.  For most uses, this is fine, but it doesn’t bode well for creating HDR versions later.
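
    As a rough illustration of what that merge step is conceptually doing, here’s a hedged C# sketch for a single pixel, under the simplifying assumption of a linear sensor response.  Real tools recover the camera’s response curve and usually combine more than two exposures, so treat the names and weighting here as illustrative only.

    using System;

    static class HdrMergeSketch
    {
        // Estimate relative scene radiance for one pixel from an under- and an
        // overexposed 8-bit sample, weighting each sample by how far it sits
        // from the clipped extremes (0 and 255).
        public static double MergePixel(byte under, double underExposureSec,
                                        byte over, double overExposureSec)
        {
            double wUnder = Weight(under), wOver = Weight(over);
            if (wUnder + wOver == 0)        // both samples clipped; fall back to an average
                return (under / underExposureSec + over / overExposureSec) / 2.0;

            double radianceFromUnder = under / underExposureSec;
            double radianceFromOver  = over / overExposureSec;
            return (wUnder * radianceFromUnder + wOver * radianceFromOver)
                   / (wUnder + wOver);
        }

        // "Hat" weight: trust mid-tones, distrust values near 0 or 255.
        private static double Weight(byte v)
        {
            return 1.0 - Math.Abs((v - 127.5) / 127.5);
        }
    }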


    The trouble with RAW formats is that they aren’t standard (Canon is different from Nikon is different from Sony…), and most cameras don’t even let you choose RAW.  The other difficulty with RAW is that the files are huge (sometimes 6 times larger than JPEG), so you’ll be crippled in terms of how many photos you can take.


    The guys who standardized JPEG (in 1994!) just voted on whether to “start work” on a new format called “JPEG XR.”  But the proposal is actually more or less a request to rename an existing format called “HD Photo.”  HD Photo comes from Microsoft; they’ve built support into Vista, and there are plugins for XP and OS X.  This format fixes most of the annoyances of JPEG — namely, you can retain HDR information, and the files are about the same size as JPEG.  Sounds great, but you can’t buy a camera that supports it.


    You can download freely available C++ source code that encodes HD Photo/JPEG XR.  It is intended to be used as a base for building HD Photo support into digital cameras.  But I suspect it’s going to be a while before HD Photo/JPEG XR shows up in real cameras.


    So here’s my idea-of-the-day:


    Hack the firmware for your digital camera to save files as JPEG XR instead of JPEG!


    Of course, this will almost certainly void every warranty on your camera.  I don’t suggest trying it unless you absolutely know what you’re doing or you don’t care about your warranty, so take this idea as-is.  Don’t blame me if you brick your camera, but do send me your HDR photos if you pull it off!

August 15, 2007

  • Finding close matches with C#


    Ever have trouble finding things?


    This is a quick code snippet for finding the nearest item in a generic list.  It leverages the .NET BinarySearch function, and it also uses the new extension methods feature of .NET 3.5 (Visual Studio 2008).


    Here’s the extension in its entirety:


    public static class BinarySearchExtension
    {
        // Computes a distance between the query and a given object.
        public delegate double DistanceFunction<T>(T item);

        // Finds the nearest item to the query item in a sorted list.
        public static int BinarySearchNearest<T>(
            this List<T> list, T query,
            DistanceFunction<T> distFunc)
        {
            int n = list.Count,
                i = ~list.BinarySearch(query);      /* O(log(n)) */

            return
                (i < 0)  ? ~i :                     /* query (q) was in the list */
                (i < 1)  ? 0 :                      /* q < [0] */
                (i == n) ? n - 1 :                  /* q > [n-1] */
                Math.Abs(distFunc(list[i])) <       /* [i-1] < q < [i] */
                Math.Abs(distFunc(list[i - 1])) ?
                    i : i - 1;                      /* return the closer of the two */
        }
    }


    Here’s an example of using it to search a list of dates:


    List<DateTime> times = new List<DateTime>();
    times.Add(DateTime.Parse("8/16/2007"));
    times.Add(DateTime.Parse("7/7/2007"));

    times.Sort();   /* Must be sorted before BinarySearch */

    int i = times.BinarySearchNearest(DateTime.Now,
            /* distance: hours between now and t */
            t => t.Subtract(DateTime.Now).TotalHours);

    DateTime nearest = times[i];   /* nearest is 8/16/2007 */


    I have mixed feelings about extensions. They are clean and messy at the same time — clean because you can, with very little code, add new functionality to old classes (even ones you didn’t write). Unfortunately, because the extensions are separated from the classes they concern, they can easily set up a messy dependency situation. For many projects, I think a good practice is to keep all of your extensions localized together in one assembly.
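
    To illustrate that practice, here’s a tiny sketch (the namespace and the extension itself are made up): keep every extension class in one dedicated namespace, so a source file only sees the extensions after an explicit using directive.

    // Sketch: all extension classes live in one dedicated namespace/assembly,
    // so a file only picks them up after "using MyProject.Extensions;".
    namespace MyProject.Extensions
    {
        using System.Collections.Generic;

        public static class ListExtensions
        {
            // Example extension kept alongside the others instead of being
            // scattered across the codebase.
            public static bool IsNullOrEmpty<T>(this List<T> list)
            {
                return list == null || list.Count == 0;
            }
        }
    }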

April 27, 2007

  • Performance of blurring a photo

    I’ve been working on the problem of blurring an image. Take, for example, this picture that I borrowed from Flickr:
    [Image: Original]
    [Image: Blurred]

    The processing that I need to do on the image is as follows:

    • Convert to grayscale
    • Apply a Gaussian blur

    The challenge is that the image was originally 3037×2007, or 6.1 megapixels. To do the blurring, each final pixel ends up being determined by a weighted average of 122 other pixels. So, that’s 0.7 billion floating point (double) multiplications & adds. I split the work up on two threads, and tried to speed the process up as much as possible. Here are some timing results for doing this operation:
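
    To make that workload concrete, here’s a minimal CPU sketch of the two steps. This is not the implementation I timed, and the kernel radius and sigma below are illustrative rather than the exact filter from the benchmarks.

    using System;

    static class BlurSketch
    {
        // Convert packed 32-bit BGRA pixels to a grayscale double array.
        public static double[] ToGrayscale(byte[] bgra, int width, int height)
        {
            double[] gray = new double[width * height];
            for (int i = 0; i < gray.Length; ++i)
            {
                int o = i * 4;
                gray[i] = 0.114 * bgra[o] + 0.587 * bgra[o + 1] + 0.299 * bgra[o + 2];
            }
            return gray;
        }

        // Plain (non-separable) 2D Gaussian convolution with clamp-to-edge sampling.
        public static double[] GaussianBlur(double[] src, int width, int height,
                                            int radius, double sigma)
        {
            // Build and normalize the 2D Gaussian kernel.
            int size = 2 * radius + 1;
            double[,] kernel = new double[size, size];
            double sum = 0;
            for (int ky = -radius; ky <= radius; ++ky)
                for (int kx = -radius; kx <= radius; ++kx)
                {
                    double w = Math.Exp(-(kx * kx + ky * ky) / (2.0 * sigma * sigma));
                    kernel[ky + radius, kx + radius] = w;
                    sum += w;
                }
            for (int ky = 0; ky < size; ++ky)
                for (int kx = 0; kx < size; ++kx)
                    kernel[ky, kx] /= sum;

            // Each output pixel is a weighted average of its neighborhood.
            double[] dst = new double[src.Length];
            for (int y = 0; y < height; ++y)
                for (int x = 0; x < width; ++x)
                {
                    double acc = 0;
                    for (int ky = -radius; ky <= radius; ++ky)
                        for (int kx = -radius; kx <= radius; ++kx)
                        {
                            // Clamp coordinates so edge pixels reuse the border values.
                            int sy = Math.Min(Math.Max(y + ky, 0), height - 1);
                            int sx = Math.Min(Math.Max(x + kx, 0), width - 1);
                            acc += kernel[ky + radius, kx + radius] * src[sy * width + sx];
                        }
                    dst[y * width + x] = acc;
                }
            return dst;
        }
    }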

    • 1607 ms, unmanaged C++ single precision (float) with /fp:precise flag
    • 1431 ms, C# using fixed pointers
    • 1270 ms, managed C++, identical to the C# code
    • 938 ms, unmanaged C++ with SSE2 compiler flag
    • 845 ms, unmanaged with SSE2 & compiled-in filter size
    • 801 ms, unmanaged C++ with default compile options
    • 800 ms, unmanaged C++ with defaults & compiled-in filter size
    • 765 ms, unmanaged C++ single precision (float) with /fp:fast flag
    • 490 ms, C# with managed Direct3D on Radeon X1600
    • 171 ms, same as above with XFX GeForce 7900GS 600M (thanks Tony)
    • 140 ms, C# with XNA on XFX GeForce 7900GS 600M on WinXP

    As you can see, there is a pretty big reward jump (35%) between managed and unmanaged code, but I was surprised that the SSE2 optimizations actually slowed things down, apparently due to a poor job by the compiler. Also striking was the effect of the floating-point precision mode! In the C++ compiler, you can select one of several floating-point modes. If you don’t select the ‘fp:fast’ mode, it appears that the compiler does everything with double (64-bit) precision and then casts the results back to single (32-bit) precision before storing them. As a result, running the single-precision version of the code actually took the longest of all!

    However, it seems the best way to do this is to use a hardware-accelerated 3D graphics card. The method I used involved writing pixel shaders, which are tiny programs that execute onboard a graphics processor. It is quite a pain — I ended up making a function which writes a pixel shader, which in turn gets compiled by DirectX on the fly and sent to the graphics card. Fortunately, that all happens pretty quickly. Timings include copying the blurred result back into system memory (via PCI Express). At the extreme end, the GeForce card on Windows XP runs the operation 25 times faster than my Pentium D 2.0GHz. This was also my first pixel shader program, so I suspect it could be made more efficient. The other benefit is that the main CPU is free to do whatever it wants (say, blur another image!!) in the meantime.

    Vista versus Windows XP. The comparison between the GeForce card on Vista and then on XP originally seemed like proof that Vista and/or the Vista drivers were terrible for this application, but it turns out there were incompatibilities when the program was run on XP, and the output wasn’t correct. After reworking the application to run under XNA instead of Managed DirectX, the erroneous output went away and we were left with a modest improvement from Vista to XP. Hopefully this can be addressed with driver improvements.

    [Image: Difference between GPU and CPU blurring]

    Accuracy with the graphics card: To use the graphics card, you have to be very careful to align pixels to a fake “3D” scene, or the GPU will take the liberty of interpolating between two neighboring pixels. I ran a pixel-to-pixel comparison between the C# and the GPU output. In the center of the image (white region), pixel values differed by at most 0.00038% (less than 1/65536 gray levels). But near the edges, some pixels were off by about 10%! I’m not surprised, because the pixel shader just blindly addressed texture values right off the edges of the image. I was just hoping the GPU would magically do the “right” thing when tiling textures next to each other — no such luck. So the conclusion is that if you demand precise values near the edges of your images, you’ll need to either pad them appropriately before executing a pixel shader, or make sure you understand exactly how the adjacent textures will be indexed.
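
    If you go the padding route, here’s a small CPU-side sketch of the idea (the class and function names are mine, not from the code I timed): replicate the border pixels outward by the kernel radius before the image is ever uploaded as a texture, so the shader can’t sample past the real edge.

    using System;

    static class GpuPrep
    {
        // Replicate-pad a grayscale image by `radius` pixels on every side, so a
        // pixel shader sampling up to `radius` texels away never leaves the image.
        public static float[] PadReplicate(float[] src, int width, int height, int radius)
        {
            int pw = width + 2 * radius, ph = height + 2 * radius;
            float[] dst = new float[pw * ph];

            for (int y = 0; y < ph; ++y)
            {
                // Clamp each padded coordinate back into the source image.
                int sy = Math.Min(Math.Max(y - radius, 0), height - 1);
                for (int x = 0; x < pw; ++x)
                {
                    int sx = Math.Min(Math.Max(x - radius, 0), width - 1);
                    dst[y * pw + x] = src[sy * width + sx];
                }
            }
            return dst;
        }
    }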

    Other things I might try:

    • Use 32-bit floating points because SSE2 can multiply 4 of those at once (vs. 2 64-bit doubles)
    • Write manual SSE2 assembly
    • Utilize DirectX and a hardware-accelerated pixel shader to do the job
    • Some better idea that you have…?

    Which one do you think I should try next?

April 11, 2007

  • Loop performance in C#

    Lately I’ve been attempting to perform arithmetic-heavy computations with C#, and I encountered a somewhat surprising detail about loop performance. Computing the squared (Euclidean) distance between two vectors, I discovered that using a variant of loop unrolling actually improves performance.

    Check it out. I ran the following two functions 25 million times each for random 64-dimensional arrays, under ‘Release’ mode in Visual Studio 2005 on a 2GHz Core Duo.

    Using this loop, the full operation takes about 10.0 seconds:


    int Distance(int[] a, int[] b)
    {
        int d_sum = 0;
        int d = 0;

        for (int i = 0; i < a.Length; ++i)
        {
            d = a[i] - b[i];
            d_sum += (d * d);
        }

        return d_sum;
    }

    Whereas this loop takes about 7.9 seconds!?:


    int Distance(int[] a, int[] b)
    {
        int d_sum = 0;
        int d1 = 0, d2 = 0, d3 = 0, d4 = 0;

        for (int i = 0; i < a.Length; ++i)
        {
            d1 = a[i] - b[i]; ++i;
            d2 = a[i] - b[i]; ++i;
            d3 = a[i] - b[i]; ++i;
            d4 = a[i] - b[i];

            d_sum += (d1 * d1) + (d2 * d2) + (d3 * d3) + (d4 * d4);
        }

        return d_sum;
    }

    This is quite puzzling to me. Okay, so the loop unrolling reduces the branching rate, but this branch should (almost) always be predicted correctly, and, if unrolling makes sense at a lower level, this is one of those cases where the compiler should definitely optimize things on its own. (I even tried explicitly specifying ‘i < 64’ for the loop to increase the odds of this… no improvement.) I’m not sure what’s going on here, and I guess the next step would be to look at the IL code generated by the C# compiler.

    Other interesting points:

    • Using a temporary value for ‘a.Length’ doesn’t speed anything up. Apparently the compiler is able to notice that the loop has no chance of affecting the array’s size.
    • Explicitly inlining this function inside of the outer loop (5,000 runs) actually slows performance back down to the 10 sec range!
    • Switching this loop to use fixed unsafe pointers doesn’t improve performance either. Why? Either fixing the pointer takes too much time, or there are some fancy low-level optimizations that are helping loop #2 which aren’t being applied to the unsafe code.
    • Software pipelining (moving the use of d1, d2, … until the next loop iteration) doesn’t help, either.
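
    For reference, here is roughly the kind of harness behind the timings above: a Stopwatch around 25 million calls on random 64-dimensional arrays. The checksum exists only to keep the JIT from optimizing the work away, and the helper names are mine, not from my original test code.

    using System;
    using System.Diagnostics;

    static class DistanceBenchmark
    {
        static void Main()
        {
            Random rng = new Random(42);
            int[] a = new int[64], b = new int[64];
            for (int i = 0; i < 64; ++i) { a[i] = rng.Next(256); b[i] = rng.Next(256); }

            const int runs = 25000000;
            long checksum = 0;                  // keeps the JIT from discarding the work

            Stopwatch sw = Stopwatch.StartNew();
            for (int r = 0; r < runs; ++r)
                checksum += Distance(a, b);
            sw.Stop();

            Console.WriteLine("{0} ms (checksum {1})", sw.ElapsedMilliseconds, checksum);
        }

        // The simple (non-unrolled) version from above; swap in the unrolled one to compare.
        static int Distance(int[] a, int[] b)
        {
            int d_sum = 0, d = 0;
            for (int i = 0; i < a.Length; ++i)
            {
                d = a[i] - b[i];
                d_sum += (d * d);
            }
            return d_sum;
        }
    }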

March 9, 2007

  • Millennium Pong

    In two hours, a party will be starting in Ann Arbor.  1000 cups, one ping-pong table.


    If you’re nowhere near Ann Arbor, you can still take part.  How?  Visit the site, and click one of the “cheer” buttons.  Within 1 second, an audio cheer will blast out of the speakers at the party.  Oh, and I should mention that you can also watch the cups dwindle, live, on the same webpage.


    [Image: cam]


    http://www.milleniumpong.com


    http://www.milleniumpong.com/cam.php