Friday, December 2, 2016

The World is Big and Round, and Why that Matters

The world is big and round.  While this may sound like a very prosaic and obvious remark, it also has major implications for attempting to render a detailed world on a computer.  In the previous examples where I've done so (and have blogged about it here), the "worlds" were relatively low-resolution raster images, either shown as a flat projection or wrapped around a small model of a sphere.

However, the world is in fact round.  It's not exactly a sphere, but a slightly oblate ellipsoid.  And a flat map representing even a sphere, let alone an ellipsoid, is never free from distortion.  Angles and/or distances will be skewed, though the mix and extent of the distortion depends upon the projection chosen and its exact parameters.  Nothing eliminates distortion entirely, but it may be minimized to some extent, especially over a relatively small area.  This is a large part of why coordinates in latitude-longitude form have been traditional for large-scale cartography.  They don't work directly for computer rendering, however, which works on coordinates in Cartesian (x,y,z) space.

Latitude and longitude are not the only approach, however.  Surveys for various purposes have often utilized a more local coordinate system, measuring Cartesian (x,y) distances from previously established baselines.  (I'm simplifying a bit, here.)  Such an approach works well enough over a certain distance, but distortion becomes a problem after a certain point.

The Universal Transverse Mercator (UTM) coordinate system attempts to do something similar but on a larger scale.  UTM divides the world into sixty zones, each six degrees of longitude in width.  UTM coordinates take the form of a zone number and then an easting (x) and northing (y) value in meters.  For example, 17N 630084 4833438 would be zone 17, northern hemisphere, easting of 630,084 meters, northing of 4,833,438 meters.  The baseline for each zone is its central meridian (a line running North-South through the middle of the zone).  At the baseline, the x (easting) component is five hundred thousand (500000) meters.  This eliminates the need for negative numbers.  For a better explanation, check out the link above, or a GIS or cartography text.  I'm just hitting the high points here.
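To make the zone arithmetic concrete, here is a small Python sketch.  The function names are my own invention for illustration, and it ignores the special-case zones around Norway and Svalbard, so treat it as a rough aid rather than a real UTM library.

```python
import math

FALSE_EASTING_M = 500_000.0  # easting assigned to each zone's central meridian


def utm_zone(lon_deg: float) -> int:
    """Zone number (1-60) for a longitude in degrees (hypothetical helper).

    Ignores the irregular zones around Norway and Svalbard.
    """
    return int(math.floor((lon_deg + 180.0) / 6.0)) % 60 + 1


def central_meridian(zone: int) -> float:
    """Longitude in degrees of the central meridian of a zone."""
    return zone * 6.0 - 183.0


def offset_from_central_meridian(easting_m: float) -> float:
    """Signed east-west grid distance from the central meridian, in meters."""
    return easting_m - FALSE_EASTING_M


# -79.4 is just an illustrative longitude that falls inside zone 17.
print(utm_zone(-79.4))                         # zone number
print(central_meridian(17))                    # degrees of longitude
print(offset_from_central_meridian(630084.0))  # meters east of the baseline
```

An easting below 500000 simply means the point lies west of the central meridian, which is how the scheme avoids negative numbers.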

With all these systems that address coordinates effectively, why is there a challenge in rendering the big round world?  UTM is Cartesian, 3D rendering is Cartesian, shouldn't that simplify matters?  Not as much as you might think.  At the equator, a UTM zone is around 668 kilometers wide.  At the northern limit of UTM coverage, 84 degrees North, a zone is only around 70 kilometers wide.  That's the whole distortion issue again.  The coordinates themselves are close to undistorted.  X meters from the central meridian of the zone is always (very nearly) X meters.  But the "edge", so to speak, of each zone moves.  A zone is not a rectangle.  You cannot treat a zone or collection of zones as a simple Cartesian grid.
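The narrowing of a zone toward the poles is easy to estimate.  The sketch below assumes a perfectly spherical Earth with a mean radius of 6,371 km (my simplification, the real ellipsoid differs slightly), and the function name is hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean radius, spherical approximation


def zone_width_km(lat_deg: float, zone_width_deg: float = 6.0) -> float:
    """Approximate east-west width of a zone at a given latitude, in km.

    A degree of longitude shrinks with the cosine of the latitude.
    """
    arc_m = (math.radians(zone_width_deg) * EARTH_RADIUS_M
             * math.cos(math.radians(lat_deg)))
    return arc_m / 1000.0


for lat in (0, 30, 60, 84):
    print(f"latitude {lat:2d}: {zone_width_km(lat):6.1f} km")
```

At 60 degrees of latitude the zone is half as wide as at the equator, since the cosine of 60 degrees is 0.5.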

But what if you used geocentric coordinates instead?  That is, what if you converted the latitude-longitude or UTM coordinates into coordinates in X, Y, Z form, relative to the center of the planet?  Would that work?  It would, but it would be very, very slow.
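For reference, the conversion from latitude, longitude, and height to geocentric X, Y, Z is a standard formula.  This sketch assumes the WGS84 ellipsoid (my choice, nothing above commits to a particular ellipsoid), and the function name is made up for illustration.

```python
import math

# WGS84 ellipsoid constants (assumed here).
WGS84_A = 6_378_137.0                  # semi-major axis, meters
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_geocentric(lat_deg: float, lon_deg: float, h_m: float = 0.0):
    """Convert latitude/longitude/height to Earth-centered X, Y, Z in meters."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Radius of curvature in the prime vertical at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h_m) * math.cos(lat) * math.cos(lon)
    y = (n + h_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h_m) * math.sin(lat)
    return x, y, z


# Values on the surface are all in the millions of meters.
print(geodetic_to_geocentric(43.65, -79.4))
```

The thing to notice is the magnitude: every point on the surface ends up more than six million meters from the origin.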

Why?  This takes a bit more explaining.  Modern graphics cards and their GPUs (graphics processing units) are optimized for single-precision floating point math.  Single-precision floating point uses 32 bits to represent numbers.  In the IEEE 754 standard, that means a single bit for the sign, eight bits for the exponent (where the binary point goes, essentially), and 23 bits for the fraction of the number itself.  Because a leading 1 bit is implied, that gives 24 bits of precision, which sort of means a whole number between -16777216 and 16777216 that the exponent portion can then multiply or divide by powers of two, up to about 2 to the 127th.  However, you can only exactly represent every integer in the -16777216 to 16777216 range.  Integers of larger magnitude begin to lose accuracy.  Numbers in that range with fractional components lose accuracy even sooner.  You basically only have about 7 decimal digits but the ability to move the point around to make really large or really small numbers, or approximations thereof.
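You can poke at this from Python without a GPU.  The sketch below uses the standard struct module to round values through a 32-bit float and to measure the gap between neighboring representable values.  The helper names are my own.

```python
import struct


def to_float32(value: float) -> float:
    """Round a Python float (double precision) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def float32_fields(value: float):
    """Split a 32-bit float into (sign, biased exponent, fraction) bit fields."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF


def float32_spacing(value: float) -> float:
    """Gap between a finite value (as float32) and the next larger float32."""
    bits = struct.unpack("<I", struct.pack("<f", abs(value)))[0]
    next_up = struct.unpack("<f", struct.pack("<I", bits + 1))[0]
    return next_up - to_float32(abs(value))


print(float32_fields(1.0))          # sign 0, biased exponent 127, fraction 0
print(to_float32(16_777_217.0))     # cannot be stored exactly
print(float32_spacing(6_378_137.0)) # gap at roughly one Earth radius, in meters
```

At a distance of about one Earth radius from the origin, neighboring single-precision values are half a meter apart, so nothing finer than that can be positioned at all.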

What this means when rendering is that in single-precision mode, with coordinates far from the origin, you experience problems getting things to line up accurately.  Objects that are supposed to be adjacent may end up visibly apart.  Movements become jerky.  All is not well.  And problems can occur long before you near the range mentioned above.  This is because every calculation on single-precision floating point numbers loses a little accuracy, and the farther from the origin, the more pronounced the loss.  So if a lot of calculations are performed on the coordinates, problems can occur even with coordinates only a portion of the range from the origin.
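Here is a rough simulation of the jerky movement, emulating single-precision arithmetic in Python by rounding after every addition.  The function names and step sizes are made up for illustration.

```python
import struct


def f32(value: float) -> float:
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def distance_walked(start_m: float, step_m: float, steps: int) -> float:
    """Move in equal steps using float32 math; return the distance covered."""
    start = f32(start_m)
    pos = start
    for _ in range(steps):
        pos = f32(pos + f32(step_m))  # each update is rounded to float32
    return pos - start


# Near the origin, 100 steps of 0.1 m cover very nearly 10 m.
print(distance_walked(0.0, 0.1, 100))
# About one Earth radius out, 0.1 m steps are rounded away entirely.
print(distance_walked(6_378_137.0, 0.1, 100))
# Steps of 0.3 m snap to 0.5 m each, so 30 m of intended travel becomes 50 m.
print(distance_walked(6_378_137.0, 0.3, 100))
```

Far from the origin the object either refuses to move or lurches in half-meter jumps, which is exactly the kind of stutter described above.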

Since I mentioned single-precision above, you may be wondering if there's greater precision available that might alleviate the problem.  Double-precision floating point numbers exist, and when using them this problem effectively goes away.  So why not just use double-precision floating point numbers instead?

Remember my mention above of GPU optimization for single-precision floating point math?  GPUs usually perform double-precision floating point math, if they can do so at all, at a small fraction of the rate they can perform equivalent single-precision floating point math.  How small a fraction?  Some brief research yesterday, looking at published benchmarks, suggests 1/3 the performance on a very expensive, high-end GPU and a mere 1/32 on a more typical GPU.  Cheaper GPUs may have even worse performance, if they support double precision at all.

So what's the solution?  In Game Programming Gems IV, Peter Freese has an article, "Solving Accuracy Problems in Large World Coordinates".  It offers a solution by breaking the world into segments, and using a coordinate system that contains both the segment and then an offset within the segment.  The article compares it to the segmented memory model of the early PC world, which is a valid comparison, but it also reminds me of UTM coordinates, in a way.

This approach assumes that for rendering purposes there will be a base segment, likely where the camera or avatar is located.  The location of the base segment would function as the origin, with objects from the base segment and nearby segments having their coordinates (or transformation matrices) translated (on the CPU, likely) relative to the base segment.  As long as the only objects being rendered are from the base segment and nearby segments, the coordinates can then be kept well within the single-precision range.  The base segment could be changed if the camera or avatar moves through the scene to another segment, but doing so will require some recalculation of coordinates or matrices.
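To get a feel for the idea, here is a minimal two-dimensional sketch of segment-plus-offset coordinates.  This is my own illustration and not Freese's code: the segment size, the class, and the function names are all assumptions.

```python
from dataclasses import dataclass

SEGMENT_SIZE_M = 1024.0  # assumed segment edge length, in meters


@dataclass(frozen=True)
class SegmentedPosition:
    seg_x: int    # which segment, as exact integers
    seg_y: int
    off_x: float  # offset inside the segment, always small
    off_y: float


def normalized(p: SegmentedPosition) -> SegmentedPosition:
    """Carry whole segments out of the offsets so they stay in [0, size)."""
    carry_x = int(p.off_x // SEGMENT_SIZE_M)
    carry_y = int(p.off_y // SEGMENT_SIZE_M)
    return SegmentedPosition(
        p.seg_x + carry_x,
        p.seg_y + carry_y,
        p.off_x - carry_x * SEGMENT_SIZE_M,
        p.off_y - carry_y * SEGMENT_SIZE_M,
    )


def relative_to(p: SegmentedPosition, base_seg_x: int, base_seg_y: int):
    """Coordinates relative to the base segment's origin.

    The segment difference is exact integer math, so the result stays
    small (and safe for single precision) for nearby segments.
    """
    return (
        (p.seg_x - base_seg_x) * SEGMENT_SIZE_M + p.off_x,
        (p.seg_y - base_seg_y) * SEGMENT_SIZE_M + p.off_y,
    )


tree = SegmentedPosition(6000, -3, 10.5, 1000.0)
print(relative_to(tree, 6000, -3))  # camera in the same segment
print(relative_to(tree, 5999, -3))  # camera one segment to the west
```

Changing the base segment only means calling relative_to again with new base indices, which matches the recalculation step mentioned above.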

Unfortunately, Freese's solution assumes a Cartesian plane.  So how do we apply this to a whole planet that has great detail?  I'm not sure yet.  Geocentric coordinates and a three dimensional segmentation of space?  Maybe?

There's a book, 3D Engine Design for Virtual Globes by Patrick Cozzi and Kevin Ring, that may offer some ideas.  I may check that out.  I will also continue searching the web to see if there are solutions out there.  When I have found a solution I like, I'll mention it here on this blog.  For now, I should go to bed.  It took me nearly an hour to write this up, and I have to get up for work in about six hours.  Good night, world.
