The Blog of Ryan Foss

It's a start!

Automatic Road Geometry Generation in Unity

I’ve had an awesome project at work developing a simulation in Unity3D that builds a test track, essentially a road, right before your eyes.  Technically, we’re using some pretty hefty data, including OpenCRG and Power Spectral Density data as inputs, but I also added a lighter-weight random algorithm that can generate roads more appropriate for gameplay.

In the image above you can see a long length of track that I generated with a few inputs.  Additionally, it creates road rails to help keep the vehicle on the road.  It also creates a simple ditch on each side of the road from a simple profile. Additional options include material selection, which changes the road appearance.

It’s not often that I get to share stuff from work, but this is only a taste of what it can do. The project will eventually be released to the public. Hopefully some day I can share a video of everything it does.

The vehicles are from the Car Tutorial project provided by Unity.

Booth Bunny Me

I was out of town this last week, employed as a booth bunny for a simulation demonstration we made to showcase a number of our systems. It’s fun and frustrating and exciting and boring, all in one!

Position from Depth

I’ve been doing some stuff at work using our visualization, shaders and some python scripting.  I normally don’t post stuff about work for many reasons, but this project has been a lot of fun and is worth blogging about (and I have permission).  I also want to document what I did as well as address some of the issues I encountered.

Essentially, long story short, we’re doing some human safety systems work where we need to detect where a human is in an environment.  I’m not directly involved with that part of the effort, but the team that is is using some depth cameras (like Kinect in a way) to evaluate the safety systems.  Our role, and mine specifically, is to provide visualization elements to meld reality and simulation and our first step is to generate some depth data for analysis.

We started by taking our Ogre3D visualization and a co-worker got a depth of field sample shader working.  This first image shows a typical view in our Ogre visualization.  The scene has some basic elements in world space (the floor, frame and man) and others in local space (the floating boxes) we can test against.

A sample scene, showing the typical camera view.  The upper-right cut-in is a depth preview.

The next image shows the modifications I made to the depth shader.  Instead of using a typical black-and-white depth image, I decided to use two channels, red and green.  The blue channel is reserved for any geometry beyond the sensor’s vision. Black means no depth; essentially, no geometry exists there.

Two color channel depth output image.

I decided to use two color channels for depth to improve the accuracy.  That’s why you see color banding: I hop around both channels.  If I only used one channel, at 8 bits, that would be 256 values.  A depth range of 10 meters would mean the accuracy would only be about 4 cm (10.0 m / 256). By using two color channels I’m effectively using 16 bits, for a total of 65536 values (256 * 256), which increased our accuracy to about 0.15 mm (10.0 m / 65536).  In retrospect, perhaps I could have used a 16-bit image format instead.

The math for this is surprisingly easy.  Essentially you take the depth value right from the shader as a range of 0 to 1, with 1 being the max depth.  Since we are using two channels, we want the range to be between 0 and 65536, so just take the depth and multiply by 65536.  Determining the 256 values for each channel is pretty easy too, using the modulus.  (A quick explanation of modulus: it’s how numbers wrap around, like 1 pm being hour 13.  The modulus of 13 by 12 is 1, for example, as is the modulus of 25 by 12.  You could also consider it the remainder after division.)  So the red channel is determined by the modulus of the depth by 256.  The green channel is done similarly, but in this case is determined by the modulus of depth/256 by 256.

red channel = modulus(depth, 256)
green channel = modulus(depth/256, 256)

Here’s an example.  Lets say the depth is 0.9.  That would result in a color value of 58982.4 (0.9 * 65536).  The red channel color would be the modulus of 58982.4 by 256, which equals 102.  The green channel would be the modulus of 58982.4/256 by 256, which is 230.
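Here’s a minimal Python sketch of that encode/decode round trip (the real version lives in a shader; this just demonstrates the math):

```python
def encode_depth(depth):
    """Split a normalized depth (0..1) across two 8-bit channels."""
    value = depth * 65536             # scale to the 16-bit range
    red = int(value % 256)            # low byte
    green = int((value / 256) % 256)  # high byte
    return red, green

def decode_depth(red, green):
    """Reverse the encoding to recover the normalized depth."""
    return (green * 256 + red) / 65536.0

r, g = encode_depth(0.9)   # red = 102, green = 230, as in the example
print(r, g, decode_depth(r, g))
```

The decode loses the fractional part of the low byte, so the recovered depth is accurate to one part in 65536, which is the whole point of using two channels.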

With that done, I save out the image representing the depth with two channels as I illustrate above.

Next I calculate the position from the image and depth information.  This particular aspect caused me lots of headaches because I was over-complicating my calculations with unnecessary trigonometry.  It also requires that you know some basic information about the image.  First off, it has to be a symmetric view frustum.  Next, you need to know your fields of view, both horizontal and vertical, or at least one of them and the aspect ratio.  From there it’s pretty easy, so long as you realize the depth is flat (not curved like a real camera).  Many of the samples out there that cover this sort of thing assume the far clip is the cutoff, but in my case I designed the depth to be a function of a specified depth.

I know the depth by taking the color of a pixel in the image and reversing the process I outlined above.  To find the x and y positions in the scene, I take a pixel’s image position as a percentage (like a UV coordinate, for instance), then determine that position relative to the center of the image.  This is really quite easy, though it may sound confusing.  For example, take pixel 700, 120 in a 1000 x 1000 pixel image.  The position is 0.70, 0.12.  The position based on center is 0.40, -0.76.  That means the pixel is 40% right of center and 76% down from center.  The easiest way to calculate it is to double the value, then subtract 1.

pixelx = pixelx * 2 - 1
pixely = pixely * 2 - 1

To find the x and y positions, in coordinates local to the view, it’s some easy math.

x = tan(horizontal FOV / 2) * pixelx * depth
y = tan(vertical FOV / 2) * pixely * depth
z = depth

This assumes that positive X values are on the right, and positive Y values are down (or up, depending on which corner 0,0 is in your image).  Positive Z values are projected out from the view.
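As a sanity check, here’s that math as a small Python function.  The 90-degree fields of view are just example values, not what our sensor used:

```python
import math

def position_from_pixel(px, py, width, height, depth,
                        h_fov_deg=90.0, v_fov_deg=90.0):
    """Convert an image pixel plus its (flat) depth into a local-space
    position, with +x right, +y down, and +z projected out from the view."""
    # pixel position as a -1..1 offset from the image center
    nx = (px / width) * 2 - 1
    ny = (py / height) * 2 - 1
    x = math.tan(math.radians(h_fov_deg) / 2) * nx * depth
    y = math.tan(math.radians(v_fov_deg) / 2) * ny * depth
    return x, y, depth

print(position_from_pixel(700, 120, 1000, 1000, 5.0))
```

With a 90-degree FOV the tangent term is 1, so the pixel from the earlier example at 5 m depth lands at 40% and -76% of the depth in x and y.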

To confirm that all my math was correct I took a sample depth image (the one above) and calculated the xyz for each pixel then projected those positions back into the environment.  The following images are the result.

Resulting depth data to position, from the capture angle.
The position data from depth from a different angle.
The position data from depth from a different angle.

The results speak for themselves.  It was a learning process, but I loved it.  You may notice the frame rate drop in the later images.  That’s because I represent the pixel data with a lot of planes, over 900,000.  It isn’t an efficient way to display the data, but all I wanted was confirmation that the real scene and the calculated positions correspond.

A Nearest Color Algorithm

I’ve been developing this software at work to take an image from a video stream (a webcam in this case) and detect if a green laser dot is present. After spending most of my day getting a project to compile and a program to recognize my webcam as a stream, and being distracted by numerous other projects and co-workers, I just wasn’t at the top of my algorithm development game to detect the green dot. Combine that with office lighting and you know my dilemma (fluorescent lighting is often green-tinted and throws off the white balance).

I start by stepping through each pixel in the source and checking its color. If you take the pixel’s values from each channel and find the average, you’ve essentially got a grayscale value. This doesn’t work so well, since some shiny metal might throw off the detection (like a ring). If I wanted to find the brightest spot this would be a good approach, but I needed to emphasize the green. At first I added a weight to the green channel by simply doubling it, so the average is calculated as (R+G+G+B)/4, counting the green channel twice. This was better, but also prone to problems.

At the end of the day I had a basic project set up and working with some hit-or-miss detection. I had invested over an hour into multiple approaches, and although my algorithms for color matching were working somewhat, I knew there had to be a better approach.

What dawned on me later was something I should have noted much earlier. I know that the RGB color concept is essentially a three dimensional array, but since it is color I’m used to thinking of it two dimensionally like you see in an image editing program. Once I visualized it as a 3D space, a big cube made of blocks so to speak, I knew my answer.

You know the shortest route from A to B is a straight line, right? This is easy to calculate in 2D using the Pythagorean Theorem, as you probably recognize:

a² + b² = c²

This also applies in 3D space, which is exactly what RGB color is. An RGB value is a point in a 3D space. The Theorem applies like this:

a² + b² + c² = d²

The obvious solution was in front of me but I didn’t see it. Euclidean distance would give me exactly what I needed. I needed to treat the color difference as a distance.

r² + g² + b² = d²

For instance, say I wanted to find the closest pixel in the supplied source to a target color RGB 128, 255, 128. I check pixels against my target color by finding their channel distance. So imagine I have a black pixel of 0, 0, 0. My distance is calculated as:

r = pixel_color - target_color = 0 - 128 = -128
g = pixel_color - target_color = 0 - 255 = -255
b = pixel_color - target_color = 0 - 128 = -128

d = sqrt(r² + g² + b²) = 312.7

What if I have a pixel with color 0, 0, 255, or a pixel with color 0, 255, 0? Which one is closer to my target? If I calculate the average, they are the same. But by distance, the green pixel is closer to my desired color.
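Here’s a quick Python sketch of the idea (the actual detector ran on a video stream; this just shows the distance check, with pixels as (r, g, b) tuples in the 0..255 range):

```python
import math

def color_distance(c1, c2):
    """Euclidean distance between two RGB colors treated as 3D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))

target = (128, 255, 128)
print(round(color_distance((0, 0, 0), target), 1))  # 312.7, as above
print(color_distance((0, 0, 255), target))          # blue pixel
print(color_distance((0, 255, 0), target))          # green pixel is closer
```

To find the laser dot you’d just scan every pixel and keep the one with the smallest distance to the target green.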

PdaNet for Android Doesn’t Work For Me

I follow a number of tech blogs, game blogs, fun blogs and serious code blogs, and then some, and the best time for me to do it is at work. Granted, much of it is personal interest, but it turns out my work is often very related. This was working fine, and although they block everything cool or interesting on the main network, they didn’t on the guest network, so I was free to roam via my laptop. Until Monday, that is. They have now installed the world’s most ridiculous, draconian, big-brother web blocking scheme imaginable for guests. Game is a four-letter word. Storage sites like ImageShack are blocked, no questions asked. Absolutely no streaming anything, including YouTube and Vimeo, and a sad consequence is the TED presentations. This from a company that is shouting (through its up-to-the-era internal message systems, such as that paper newsletter we can pick up near the door or get in our mailbox) how innovative and cutting edge we are.

Damn it pisses me off. The internets is video and they block it. The internets is streaming and they block it. If I try to search for something graphics or programming related, it often ends up blocked, flagged as game or hacking or forums. Forums and IRC and newsgroups are off limits, good thing this isn’t where the answers to common questions are easily found. The company is digging themselves a grave IMO; a shallow grave of cold old technology to cuddle up with and wither away as the world passes them by and their employees dream of greener pastures.

Anyway. For mostly personal reasons I was trying to find a way to bypass their guest internets connection altogether, so I looked to my G1 phone and found an app called PdaNet. It basically tethers a smartphone to a PC via USB and the phone acts as a modem, using the 3G (or EDGE, I guess) network to access the internets. The install is one of the best I’ve seen and was painless. The software is easy to use and has user controls where they should be (it only runs if you ask it to). But the problem is that although I’m able to see the internets unfiltered, much of it just doesn’t load or times out. Gmail and Reader fail to load, as did a few others I tried. Other sites opened fine, but it was of little consequence since my main hope was to bypass the filtering so I can read the blogs I like as they are intended, with images and video.

One of my objectives at work is to stay up to date on relevant technology. I can’t do that at work now. A project I’ll soon be working on will incorporate interactive technology, which is impossible to research at work because of the blocking. To me, this is basically guilty until proven innocent shit that is going to hurt them more than help.

BTW, if you want to e-mail me at work, don’t put swears in the e-mail because I won’t get it. That’s right, they filter my e-mail and if it has a naughty word in it, they swat it down.

Sound Served

At work I played around with our sound server and sound messaging, and I was excited to get variable engine sounds working. When we first started doing sound on the PC, the system couldn’t keep up with the simulation and play the sound in sync. None of us had any sound programming experience and we could not get it to work well. We had some balancing issues, so we made a sound server that could run on a remote machine where the processor wasn’t consumed. This was good for a lot of reasons, but ultimately is ugly today IMO because it adds complexity and lots of messaging. Hopefully I can integrate sound directly into our simulation this summer. Meanwhile, since we are simulating some boats, I added the ability to control volume and pitch on indexed sounds that are attached to vehicles, and send messages with the corresponding values. The code looks something like this.

pitch = 0.75 + 0.5*currentSpeed/maxSpeed;
volume = 0.50 + 0.5*currentSpeed/maxSpeed;

//send message “vehicle index soundname x y z h p r volume pitch”

It’s pretty simple and it works quite well. Essentially, as the vehicle increases in speed, the volume and pitch of the motor sound increase. It proved the concept, which I’m sure is how some games do it. Next I want to add an idling sound and modify it something like this.

if (currentSpeed < 0.25*maxSpeed) {
    volumeIdle = 1.0 - 1.0/0.25*currentSpeed/maxSpeed; //goes from 100% to 0%
    volumeEngine = 0.5 + 1.0*currentSpeed/maxSpeed; //this should max at 0.75
}
else {
    volumeIdle = 0.0;
    volumeEngine = 0.75 + 0.25*currentSpeed/maxSpeed;
}
pitch = 0.75 + 0.5*currentSpeed/maxSpeed;

//send message “vehicle index soundnameIdle x y z h p r volume pitch”
//send message “vehicle index soundnameEngine x y z h p r volume pitch”

This should let the engine idle sound be heard when the vehicle is not moving, or moving very slowly. I don’t know if this will work, but I bet with some tweaking it will sound OK.

Also, since this is for a boat it makes sense, as there is no shifting or gears; the engine just ramps up. But if I want to apply this to another type of vehicle with a more complicated engine, it could be modified like this:

pitch = 0.5 + 0.25*gear + 0.75*currentSpeed/maxSpeed;
volume = 0.5 + 0.75*gear + 0.25*currentSpeed/maxSpeed;

Where gear is used to make the sounds see-saw some for that vroom-vrooooom-vrooooom sound of changing gears.
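Put together, the idle/engine crossfade and the gear offset look something like this Python sketch (the constants mirror the snippets above; gear is assumed to be a 0-based integer, and all the values are illustrative rather than tuned):

```python
def engine_sound(current_speed, max_speed, gear=0):
    """Return (idle volume, engine volume, pitch) for a vehicle sound message."""
    ratio = current_speed / max_speed
    if current_speed < 0.25 * max_speed:
        volume_idle = 1.0 - 4.0 * ratio   # fades from 100% to 0% at the cutoff
        volume_engine = 0.5 + ratio       # reaches 0.75 at the cutoff
    else:
        volume_idle = 0.0
        volume_engine = 0.75 + 0.25 * ratio
    pitch = 0.5 + 0.25 * gear + 0.75 * ratio  # gear offsets the pitch
    return volume_idle, volume_engine, pitch

# At rest the idle sound dominates; at full speed the engine does.
print(engine_sound(0.0, 10.0))   # (1.0, 0.5, 0.5)
print(engine_sound(10.0, 10.0, gear=3))
```

The returned values would then be packed into the “vehicle index soundname x y z h p r volume pitch” message for the sound server.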

XNA W00T!

Got to play with some XNA today at work. I whipped this quick generic gun model up to prove out the content pipeline (Blender and textures). This is a modification to the BasicModel project, but with my parts and textures.

It was a lot of fun, and whoa, a lot of learning. I see I have my work cut out for me, not only as a programmer, but as an artist.

Overworked, Overstressed, Undersleep

I’ve spent too much time at work. These silly projects that I can’t really attach myself to. The only reason I work is for the money. I’ve so much as said that although I may only be 75% effective after so many hours, at least I’m getting paid. It sucks that I only get straight time too, not double pay or whatever others may get if they work OT.

Last week I worked 72 hours. This week, I only worked 52, but I took Friday off (kids and wife sick, vomit in my wife’s purse to prove it).

In my case, anything over 45 hours gets paid straight OT. So my 72 hour week means I’ll make an extra $500 or so after taxes. All that time and effort, lost sleep, lost social life, lost family time. It adds up to so little. Is it worth it?

I’m deadly (and geeky) too!


I spent the last four days in DC at the Navy League something-or-other trade show, standing on the floor demonstrating our modeling and simulation to VIPs and other not-so-VIPs. Turns out the schedule I conned my co-worker/friend Brent into coincided with an interview. Brent got a stunning write-up on the Popular Mechanics technology blog and all I got was an extra hour of sleep.

Work, work, work

Quiz was easy, super easy. The report on the other hand took me a while, partially because I was actually very interested in the subject. I had to write a 1 page report, double-spaced, on the history of video games. The tough part was keeping it to one page.

Work has me traveling to DC Sunday for an early Monday morning meeting, what fun.