(light music)
- Hey, my name's Tristan Goulden,
I'm a remote sensing scientist in the AOP group,
and I'm gonna give a talk on Discrete LiDAR Uncertainty.
So, generally here we talk about
two major sources of uncertainty,
geolocation uncertainty as well as processing uncertainty.
So geolocation uncertainty deals with
the uncertainty that's associated
with each of the instruments and subsystems within the LiDAR.
So the GPS and IMU, laser ranger, laser scanner,
and the measurements that they make
and how the error in each of those measurements combines
into geolocation error for the actual point cloud.
So generally in that situation,
horizontal uncertainty for LiDAR
is greater than vertical uncertainty.
What we've seen is that if you look
at the instrument specifications for LiDAR
they generally don't give you a very good impression
of what the uncertainty is.
So generally they give you uncertainty specifications
in very optimistic conditions that you're not gonna see,
for the most part, in the real world.
And so vegetation and terrain conditions
will also affect the uncertainty in the point cloud.
But then we also have processing uncertainty,
which is really one of the larger sources
of error that we have and it's much more difficult
to quantify than the geolocation error,
and we'll talk a little bit about that.
So I just wanted to go through
sort of the different processing steps
and how the uncertainty is introduced
into the LiDAR system in each one of those steps.
So the first is the airborne trajectory,
which we talked about yesterday.
And so you can see here we've got a picture
of this airborne trajectory and it's colored by
an uncertainty that was given, a predicted uncertainty
that was given by the commercial software
that we use to produce the trajectory.
So the red areas are high uncertainty,
yellow sort of middle,
and then the blue areas are a little bit better.
So the uncertainty in trajectory
is a combination of the distance you are
from your GPS base station,
the distribution and number of satellites,
the lever arms inside of the system.
And those are the linear distances from the GPS antenna
down to the IMU and from the IMU to the laser sensor.
So you have to measure those linear distances so that when we get the position at the GPS antenna we can translate down to the laser and then down to the ground.
And of course, the accuracy of the IMU.
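To make the lever-arm idea concrete, here's a minimal sketch of that translation; the lever-arm and attitude values are made up, and the rotation order shown is just one common convention, not necessarily what the commercial software uses:

```python
import numpy as np

def rotation_from_attitude(roll, pitch, yaw):
    """Body-to-local rotation matrix from IMU attitude (radians)."""
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

# Hypothetical values: GPS antenna position in a local frame (meters),
# aircraft attitude, and a surveyed antenna-to-laser lever arm (body frame).
antenna_pos = np.array([0.0, 0.0, 1000.0])
R = rotation_from_attitude(np.radians(2.0), np.radians(-1.5), np.radians(90.0))
lever_arm = np.array([0.15, 0.02, -1.10])

laser_origin = antenna_pos + R @ lever_arm  # errors in lever_arm map straight in
```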
Now, what we found, some really nice stats
that Bridget worked up this past year,
is that when you look at the simulated uncertainty
from the software, what it tells us is that the distance
from the base station is actually the most important factor
when we're looking at the uncertainty in the trajectory.
And this is sort of an average of the predicted uncertainty
for all our flights across the entire season
and the distance from the base station.
And you can see that around 20 kilometers you get this jump and the uncertainty starts increasing.
So this is one of the reasons
we try to keep our base stations
always within 20 kilometers of the flight
'cause we know that after that the uncertainty starts to really rise in that trajectory.
And the trajectory is really the base of our,
all of our geolocation so it's really important
that we maintain really accurate trajectory.
So we also get these stats at the end of the flight
that tell us what the uncertainty
in the easting, northing, and elevation are with the flight.
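As a rough illustration of that 20 kilometer rule of thumb, a check like the following could flag epochs that stray too far from the base station; the coordinates here are invented, and the trajectory and base station are assumed to share one projection:

```python
import numpy as np

def baseline_distance_km(traj_en, base_en):
    """Horizontal distance (km) from each trajectory epoch (easting,
    northing in meters) to the GPS base station."""
    return np.hypot(traj_en[:, 0] - base_en[0],
                    traj_en[:, 1] - base_en[1]) / 1000.0

traj = np.array([[482000.0, 4432000.0],
                 [495000.0, 4450000.0]])           # made-up epochs
base = np.array([480000.0, 4430000.0])             # made-up base station
too_far = baseline_distance_km(traj, base) > 20.0  # flag epochs beyond 20 km
```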
So to further look at this idea
of the distance of the base station,
we had some flights in, I think it was D8, at three different sites,
and what we did is we had base stations located at the site,
and we processed the trajectory with the base station
and without the base station,
and then we compared the difference
between those trajectories at the sites.
And so in some cases this didn't turn out very well.
In fact we got upwards of over half a meter difference
in those trajectories when
we weren't using the base station.
So this is a huge deal for us.
We're trying to meet 15 centimeters
of accuracy in the LiDAR.
So if we're getting these types of errors on the trajectory,
we're completely gone.
But a lotta times, I mean,
I think in this particular trajectory,
this area of high uncertainty was when we were transiting
and far from other base stations.
And so you can get situations like that.
Another set we looked at was a little bit better.
It wasn't quite as bad.
It was about 15 centimeters of difference between those two,
but still a big deal to us.
So, I mean, it's obvious that having
that base station really close to the trajectory
is really important to maintain the error that we want.
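The comparison itself is just an epoch-by-epoch difference of the two trajectory solutions; a minimal sketch, assuming both solutions are exported on matching GPS time stamps:

```python
import numpy as np

def trajectory_difference(with_base, without_base):
    """Per-epoch differences between two trajectory solutions.
    Each array is (n, 4): GPS time, easting, northing, elevation."""
    assert np.allclose(with_base[:, 0], without_base[:, 0]), "epochs must match"
    diff = with_base[:, 1:] - without_base[:, 1:]
    return diff, np.linalg.norm(diff, axis=1).max()  # worst-case 3D offset
```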
PDOP is a descriptor of uncertainty in the GPS satellite constellation. So that's one of the portions that contributes to, or gives you an idea of, what the uncertainty in the trajectory is gonna be.
If you have a high PDOP
then you're gonna have a high uncertainty in the trajectory.
But what we found is that that distance
from the base station, making sure that that's low
is way more important than making sure that PDOP is low.
'Cause generally since we're doing flights
just in the United States,
the GPS satellite constellation
is dense most of the time around here,
and so we usually get enough satellites
in a good distribution so the PDOP is generally low.
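For reference, PDOP falls out of the satellite geometry alone; a minimal sketch of the standard computation, where the receiver and satellite positions would come from the navigation solution:

```python
import numpy as np

def pdop(sat_positions, receiver_pos):
    """Position dilution of precision from satellite geometry.
    sat_positions: (n, 3) ECEF meters, n >= 4; receiver_pos: (3,)."""
    los = sat_positions - receiver_pos
    unit = los / np.linalg.norm(los, axis=1, keepdims=True)
    A = np.hstack([-unit, np.ones((len(unit), 1))])  # one row per satellite
    Q = np.linalg.inv(A.T @ A)                       # cofactor matrix
    return np.sqrt(np.trace(Q[:3, :3]))              # lower = better geometry
```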
So after the trajectory we have the LMS processing.
So this is a processing that we do
in the commercial software that's provided by Optech.
And so a couple things.
At the beginning of the season,
we do a flight to measure the boresights.
And so the boresights are angular differences
between how the LiDAR sits and how the IMU sits.
So basically the IMU is giving us
our orientation in the sky.
And then for the LiDAR head, we need to know the relationship between how that's sitting relative to the IMU to properly geolocate all the observations on the ground.
And these small angular differences, usually subdegree differences between the IMU and the laser head, are called boresight misalignments,
and we do a dedicated flight over Greeley each year
to measure what those boresight misalignments are.
Of course those are calculated, and so there's always potentially a little bit of uncertainty.
And so after we do a flight what we can do is we can look at
how the data in the overlapping strips matches.
It's like I mentioned before,
we have 30% overlap in each one of those strips.
So what we can do is we can look to see
how well that overlap data matches
and how well it compares with each other.
So we get these vertical differences associated with scan angle, and the software plots these. And if this is a nice flat line, that tells us that the system is in a really good alignment.
But it's also possible to get situations like this
where we kinda get this angled distribution here
where there's some bias with scan angle.
So if that happens it tells us that the boresight alignments
need to be redone or checked again.
And then often if we see this then
we'll do mid-season boresight alignments
to get these graphs to go back flat.
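The plot the commercial software produces amounts to binning the strip-to-strip vertical differences by scan angle; a simplified sketch of that check, with the matching of overlapping points assumed done upstream:

```python
import numpy as np

def dz_by_scan_angle(dz, scan_angle, bin_deg=1.0):
    """Mean vertical difference between overlapping strips per scan-angle
    bin. A flat profile suggests good boresight alignment; a tilt
    suggests a residual angular bias."""
    edges = np.arange(scan_angle.min(), scan_angle.max() + bin_deg, bin_deg)
    idx = np.digitize(scan_angle, edges)
    means = [dz[idx == i].mean() if np.any(idx == i) else np.nan
             for i in range(1, len(edges))]
    return edges[:-1] + bin_deg / 2.0, np.array(means)  # bin centers, means
```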
So, there's also what's called intensity table corrections.
These are factory calibrations that are provided by Optech.
And basically these are range adjustments
that are applied to the range based on the PRF
and the returned intensity.
So we really have no control over these.
It's corrections that are done in the lab back at Optech.
So after we fly the trajectory,
we get our boresight misalignments,
we process this data through the Optech software,
what we're then able to do is check the vertical accuracy
of the LiDAR, and we do that over a runway here in Boulder.
So a couple years ago we went out and took about 200 to 300 really high accuracy GPS points across the entire runway, with errors of about one centimeter or so.
So then what we do is we use all of those GPS points
and interpolate between them to get
sort of a validation surface of the entire runway,
so we know what the elevation is everywhere on the runway.
And then when we fly over it these are all the LiDAR points
that land on the runway,
and then we can get the vertical difference
between each of those LiDAR points
and that validation surface.
And so when we do that, since the LiDAR's collecting
hundreds of thousands of points per second,
we get this really great distribution
with a really high sample,
that gives us an impression of what the error is.
And so, since we try to fly over the runway
with the laser at nadir so the plane
is directly above the runway,
the primary error sources that are gonna be contributing
to these statistics are the errors in the laser ranger
and the vertical error in the GPS.
Other types of errors, like in the IMU or in the scan angle, are only gonna propagate more heavily at large scan angles, not so much at nadir.
So usually these stats are just giving us an idea
of how well the laser ranger and the GPS are operating.
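Conceptually the runway check reduces to interpolating the GPS points into a surface and differencing the LiDAR returns against it; a minimal sketch with SciPy, though the actual AOP workflow may differ:

```python
import numpy as np
from scipy.interpolate import griddata

def runway_vertical_errors(gps_xyz, lidar_xyz):
    """Interpolate high-accuracy GPS points into a validation surface,
    then difference each LiDAR return against it."""
    surface_z = griddata(gps_xyz[:, :2], gps_xyz[:, 2],
                         lidar_xyz[:, :2], method='linear')
    dz = lidar_xyz[:, 2] - surface_z
    dz = dz[~np.isnan(dz)]            # drop returns outside the GPS hull
    return dz.mean(), dz.std()
```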
So these are some results for several different lines
that we did over the runway.
You can see that we separate them by PRF.
And so that's the pulse repetition frequency,
how fast the laser is pulsing.
And as I mentioned yesterday,
we only fly at 100 kilohertz or less,
and this chart shows why.
So at 100 kilohertz you can see that we have very low mean and standard deviations, but at some of these higher PRFs, 125 and 142 kilohertz, the errors are above our limit of 15 centimeters. So this is why we fly only at 100 kilohertz and below.
So we also wanna test the horizontal accuracy
of the LiDAR system in addition to the vertical.
And the main source of error in the horizontal component
of the LiDAR points is due to
the beam divergence of the laser pulse.
And so you think about a laser, you think it's coming out as a very thin, tight beam of energy. But the instantaneous field of view of the laser, the beam divergence, is 0.8 milliradians.
So that means when we're flying at 1,000 meters,
when that laser pulse hits the ground
its diameter is 80 centimeters.
And so what can happen is that the energy distribution
of that pulse is actually Gaussian shaped.
And so most of the energy is contained in the center,
but out towards the edge, at this one over e level,
this is our 80 centimeter diameter here.
So you can see there's still lots of energy
out further than that,
and it only takes about 1 to 2% of the energy
to get returned back to the LiDAR system
to trigger a return pulse.
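The footprint arithmetic is simple: the diameter is roughly the beam divergence (full angle, in radians) times the altitude above ground:

```python
divergence_mrad = 0.8   # full-angle divergence at the 1/e level
altitude_m = 1000.0     # flying height above ground
footprint_m = (divergence_mrad / 1000.0) * altitude_m
print(footprint_m)      # 0.8 m, i.e. the 80 cm diameter quoted above
```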
And so what can happen when you have this really wide beam
is that, say, if we were flying over here
and we were going to the table,
which is a very hard, flat surface,
if our beam came down here and it's 80 centimeters,
it can come down and the edge of the beam can hit the table.
That return's gonna go back from the edge of the table,
but the coordinate gets associated
with the center of the beam.
So then it looks like the table is over here,
because the center of the beam was over here,
and the edge of it hit the table.
And since the coordinate is associated with here, but we've got the elevation from the edge of the table, it actually ends up over here.
And so what we do is actually fly several flights over the headquarters buildings.
And we went out and used traditional surveying with a total station to survey all the corners of the headquarters buildings,
and then we fly over these, and we look, as we're scanning across and the pulses are coming up to the building edge, at where they first jump from the ground up to the building edge.
And what we find is that it's usually some distance
away from the building edge where we see that first jump up.
And then we can calculate this perpendicular distance,
and that gives us an impression of
what the horizontal error is gonna be.
And so when we do that we see that it's pretty close to half of our beam footprint, which is 40 centimeters.
So we have that 80 centimeter full diameter,
but then as we're coming up we're only 40 centimeters
away from the building edge when we see that jump.
So that shows us that the primary source of error
in this horizontal component is the beam divergence.
So there's gonna be some GPS error,
some other types of error,
but they're pretty much dwarfed
by this beam divergence error.
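Measuring that edge offset is just a point-to-line distance between the first "jumped" return and the surveyed building edge; a small 2D sketch:

```python
import numpy as np

def perpendicular_distance(point, edge_start, edge_end):
    """Perpendicular distance (2D) from a first roof return to a
    surveyed building edge defined by two corner coordinates."""
    e = edge_end - edge_start
    v = point - edge_start
    # |cross product| / edge length = distance to the edge line
    return abs(e[0] * v[1] - e[1] * v[0]) / np.hypot(e[0], e[1])
```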
So when we can, we also try to validate our digital terrain models when we're going out to sites. We visit a couple of sites per year to do some ASD measurements to support the spectrometer.
But when we do that we also collect LiDAR validation points
using rapid static GPS techniques.
So basically we take a high accuracy GPS,
set it out for about 20 minutes, collect observations,
get elevations sort of throughout the site,
and then we take each one of those elevations
and we compare it to the elevation
we get from the digital terrain model.
So this is an example of doing that at Oak Ridge,
and all these circles show the different GPS points
that we collected and then this chart down here
shows that vertical difference
between the GPS points and the DTM.
So you can see that we're doing pretty good.
We got a mean of about four centimeters
and a standard deviation of about six centimeters.
So this is pretty consistent with what you can expect
for most commercial LiDAR providers.
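The DTM check is the same differencing idea, sampling the raster at each GPS point; a sketch using rasterio, where the file path and point list are placeholders:

```python
import numpy as np
import rasterio

def dtm_vertical_errors(dtm_path, gps_points):
    """Difference rapid-static GPS elevations against a DTM raster.
    gps_points: iterable of (easting, northing, elevation) tuples."""
    with rasterio.open(dtm_path) as dtm:
        dtm_z = np.array([v[0] for v in
                          dtm.sample([(e, n) for e, n, _ in gps_points])])
    dz = np.array([z for _, _, z in gps_points]) - dtm_z
    return dz.mean(), dz.std()
```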
So then to give people an idea of what those errors are,
kind of across the entire site
that are associated with the instrument,
what we do is we actually simulate the error
in every single point that the LiDAR has acquired
based on errors that we know for the GPS and IMU,
laser ranger and laser scanner.
So we propagate the errors through
each one of those instrument components
into every single point.
And then we get horizontal and vertical errors
for every single point and then we create LAZ files
or LAS files where we take out the elevation,
but insert the vertical uncertainty.
So then we can plot these LAZ files.
And instead of having the elevation,
they have the vertical uncertainty instead.
We use the algorithm that I published in 2010,
so if anyone wants to know more about that
then feel free to ask.
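Packaging the result that way can be done with any LAS library; a minimal sketch, assuming laspy 2.x and a per-point sigma array computed elsewhere:

```python
import laspy  # assuming laspy 2.x

def write_uncertainty_cloud(in_path, out_path, vertical_sigma):
    """Copy a point cloud, replacing elevation with per-point vertical
    uncertainty so viewers can color the cloud by sigma_z."""
    las = laspy.read(in_path)
    assert len(vertical_sigma) == len(las.points)
    las.z = vertical_sigma        # z now carries uncertainty, not elevation
    las.write(out_path)
```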
Generally what you find, and you can see it here, these are all different lines that we've flown, is that at the edges of the lines the uncertainty's a little bit higher.
And that's because at nadir you don't have
a lot of the errors propagating in from the scan angle.
So as you scan wider, any errors, say in beam divergence, in the scan angle, or in roll, pitch, and yaw, will propagate more heavily into the vertical coordinate as you get a larger scan angle.
So generally what we see is that the edges of scans
have higher uncertainty than the center.
It's also potentially good if you can fly with 50% overlap, where your edge is hitting the center of the adjacent line, because then you're pairing your highest error with your lowest error.
But it's always a trade off between flying time
and things like that.
I think I mentioned yesterday we use a Triangulated Irregular Network (TIN) to create our DTMs, and then from those DTMs we create our slope and aspect.
And so I mentioned that one of the downfalls
of the TIN interpolation method is that
we don't get any filtering due to
redundancy within each individual grid cell.
And so we create the DTM just natively
with the TIN interpolation routine.
But then as we create the slope and aspect
I run a three by three moving average
across the DTM before calculating the slope and aspect.
And this slide kinda demonstrates why we do that.
You can see over here this is just
the raw DTM over the runway.
And if you look at the slope you can see
it's like really variable across the runway.
The runway's a really flat surface.
It doesn't have slopes that are ranging
from zero to five degrees.
And the reason we see that is because
there's a lot of noise in the LiDAR points.
So you're just getting your slope
between those really noisy points.
And so then if we run a three by three moving average across the DTM and then calculate the slope, you get this blue line here.
So you can see the slope is a lot less over the runway
after we do that.
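That smoothing step is a plain 3 by 3 mean filter ahead of the slope calculation; a minimal sketch with SciPy, where the grid and cell size are placeholders:

```python
import numpy as np
from scipy.ndimage import uniform_filter

def smoothed_slope_deg(dtm, cell_size_m):
    """3x3 moving average over the DTM, then slope in degrees."""
    smooth = uniform_filter(dtm, size=3)
    dzdy, dzdx = np.gradient(smooth, cell_size_m)
    return np.degrees(np.arctan(np.hypot(dzdx, dzdy)))
```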
So next I wanna talk about
the Canopy Height Model uncertainty.
This is an analysis I did
at the San Joaquin Experimental Range.
So I was able to get field measured tree heights
for a lot of the trees throughout the site
and then compare those directly to grid cells
in the Canopy Height Model.
And after getting rid of some outliers
and some other points that they measured,
for example, sometimes you'll get points
that they measure on trees that are lower
than the upper canopy,
and the LiDAR's only seeing the top of the canopy.
So you need to get rid of those.
I got this regression line.
So we should get a one to one regression,
and this is pretty close.
It's actually not statistically different from one.
No trend in the residuals.
But the important part here is that
the intercept value's negative 0.493 meters, which means that generally we're underestimating the tree height with the LiDAR.
This is a fairly common problem
that you'll see in the literature
that tree heights are generally underestimated by LiDAR.
And this is because the pulse actually penetrates partially into the tree crown before enough energy is returned to trigger that return pulse. So you'll get some infiltration down before enough energy goes back to register a return.
And so us seeing about half a meter below these trees is pretty consistent with what most people have seen in the literature.
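The comparison itself is an ordinary least-squares fit of LiDAR heights against field heights; a minimal sketch, with the outlier screening described above assumed done first:

```python
from scipy.stats import linregress

def chm_vs_field(field_heights, lidar_heights):
    """Slope near one with a negative intercept indicates a consistent
    underestimate from crown penetration."""
    fit = linregress(field_heights, lidar_heights)
    return fit.slope, fit.intercept, fit.rvalue ** 2
```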
We've also done some more in-depth analysis of the Canopy Height Model uncertainty.
And we leveraged BRDF flights that we flew primarily for the spectrometer.
These flights are designed so that we can see
how the spectrometer's gonna give
different observations using
different flight tracks, angles,
and orientations of the flight tracks.
So the nice thing about these flights
is we're actually able to leverage
the center portion of this
where we get 20 lines overlapping.
So I can actually make 20 Canopy Height Models.
And then in this overlapping portion
just look at every cell and see how it varies
between all of those different Canopy Height Models
in that center portion.
So it enables us to sort of empirically derive
what the precision in the Canopy Height Model is.
I did this analysis on Canopy Height Models.
Amanda is actually continuing this analysis this summer
and applying the same algorithm
to all of our other data products,
in addition to the Canopy Height Model,
and that's what she'll talk about this afternoon.
So these are just some images showing that when we overlap all those flight lines, you get this nice area in the center where we have all 20 flight lines overlapping.
So we're able to create all those different rasters.
There's an example.
And then we can look at the center portion
and actually get these rasters of uncertainty, where each cell represents the standard deviation of the Canopy Height Model across all those different lines.
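With the repeated CHMs co-registered on one grid, the per-cell precision is just a standard deviation down the stack; a minimal sketch:

```python
import numpy as np

def chm_precision(chm_stack):
    """chm_stack: (n_lines, rows, cols) co-registered Canopy Height
    Models over the overlap area, NaN outside coverage."""
    sigma = np.nanstd(chm_stack, axis=0)  # per-cell standard deviation
    return sigma, np.nanmean(sigma)       # uncertainty raster + site mean
```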
So I guess kind of the take home message from this
is that this is the average uncertainty
that we saw in the Canopy Height Model
at each one of these sites.
So at San Joaquin it's 1.9 meters, at Soaproot 2.2 meters, and at Oak Ridge 1.1 meters.
I have sort of a more in depth presentation on this stuff,
which I'd be happy to give people,
but the basic take home idea here was that
each one of these sites represented
really different forest types,
and there's different factors at each forest type
that contribute to the overall uncertainty.
But also what this tells us is generally
if you're looking at an individual cell
in a Canopy Height Model you could be looking at
one to two meters of error at that actual cell.
Yeah, so SJER is like a savanna-type landscape
with shorter blue oak trees.
And each oak tree is kind of individual,
has some space around it.
And sort of what we saw at San Joaquin was that
due to that beam divergence issue that I mentioned before,
the edges of the individual trees at San Joaquin
had a lot of uncertainty because you had some points
that would hit the edge of the tree
and some points that would hit the ground.
And so you got a lot of variation
at the edges of those trees.
At Soaproot you had really tall, thin, ponderosa pines.
And so what happened is that as we flew
those different orientations of the flight lines,
sometimes the LiDAR point would hit mid tree
on those really tall thin trees,
and sometimes it would hit the top,
and so on these really tall thin trees
you'd get really high standard deviation sometimes,
like 18 to 20 meters, just based on
where the LiDAR point happened to hit the tree.
And then at Oak Ridge, where we have a really heavy canopy,
what happened was we got these areas here
of high uncertainty,
kinda these segments of high uncertainty
throughout the Canopy Height Model.
And when you look into that what you find is that
at these areas we also had really poor ground penetration
underneath those heavy canopies,
and so what happens is that there was a lot of interpolation
that was occurring across the ground surface here
and in the different flight lines,
this interpolation resulted
in really different ground surfaces,
and then when we're subtracting
the top of the canopy down to the bottom,
that resulted in really different canopy height estimations.
So this problem that I mentioned
at Oak Ridge is really important
because it's commonplace across a lot of our sites
that we don't get good ground penetration.
So this is an example of the Great Smoky Mountains flight
that we flew in 2015.
And what it's actually colored by is the longest edge
in any one of the TIN triangles across the entire site.
And so what we see is that generally these range between zero and three meters, three at the most.
And this is for all the points.
And so at the most we're interpolating
three meters across any given area.
We can look at that distribution.
See at the most it was three,
but generally it was below 1.5.
And this is because, like I told you guys,
we're getting generally between
two and four pulses per meter,
and so generally we don't have to interpolate much more
than 1.5 meters.
But, this is what happens when we look at that same plot
using the ground only points.
Using the ground only points
we're going from zero to 25 meters,
and so there are particular areas in this really heavy canopy
where we're interpolating
the ground surface across 25 meters.
And so that's gonna add a lot of uncertainty
into the Canopy Height Model
because then if we miss a dip
or a hill in the ground surface
it's really gonna affect the canopy height.
So this is that same distribution
except for the ground points only.
You see we got this little bump, sort of aligned with the previous histogram; that's the open areas, within this larger histogram showing what's underneath the canopy.
And I will say the Great Smoky Mountains
is probably one of the worst sites that we fly for this.
So it is a worst case example.
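The longest-edge metric can be reproduced from the points alone; a minimal sketch with SciPy's Delaunay triangulation standing in for the production TIN:

```python
import numpy as np
from scipy.spatial import Delaunay

def longest_tin_edges(points_xy):
    """Longest edge (meters) of each TIN triangle, a proxy for how far
    the surface is interpolated between returns."""
    tri = Delaunay(points_xy)
    corners = points_xy[tri.simplices]             # (n_tri, 3, 2)
    edges = corners - np.roll(corners, 1, axis=1)  # three edge vectors
    return np.linalg.norm(edges, axis=2).max(axis=1)
```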
So something else that we've done and you will do
directly after this is look at differences at Pringle Creek.
This is a really nice site to analyze the uncertainty
because last year we flew the whole site
in bad weather conditions just to get LiDAR coverage
'cause we didn't think that the weather was gonna improve,
and then lo and behold the weather improved,
so the very next day we flew it again
to get good weather spectrometer data.
So we have two LiDAR collections one day apart.
So we can assume that nothing has changed in the site from day to day, and we can look at, okay, well, how did the acquisition change between these two days.
So that's the lesson we're gonna
look at directly after this.
So then there's also some larger processing uncertainty,
errors that mostly have to do
with misclassification of the point cloud.
So I mentioned yesterday about how we classify
the point cloud into ground points, vegetation points,
buildings, and unclassified.
So this is a good example from the Flatirons
just local to here.
Originally, when we did our ground classification, the algorithm thought that because those Flatirons were so steep, there was no way that the ground could go up that fast and that steep. So it assumed that the points on top of the Flatirons were not ground points, and it actually cut all of the tops of the Flatirons off because it assumed that that was vegetation.
And so we actually talked with Martin, who created LAStools and did the classification on this, and he made an improvement to the algorithm that allowed us to correct for that error.
So you can see this is the original profile,
across the Flatirons where
we were cutting off a lot of those tops.
And then this was an improvement that was made
to the algorithm that allowed us to do that.
Unfortunately this improvement works well in these cases
but doesn't work as well in some other cases.
And so I still generally use the old way to do this,
and I just got an email three weeks ago
from the park service at Great Smoky Mountains
that said, hey, you cut off a whole bunch
of the top of the mountains in Great Smoky Mountains.
So then I reprocessed it with the new method
to correct that for them.
So this can also happen with vegetation.
So you can see up here this is an RGB image
of an area at Dead Lake, which is one of our D8 sites.
And in this area here,
there's a lot of really low vegetation close to the ground.
There's actually one taller tree right here.
You can see its shadow.
And when we look at the Canopy Height Model
all we see is this one tree.
Everything else here is zero.
So when the algorithm went through,
it classified all this short vegetation as ground points.
Okay, and then when we look at the Digital Terrain Model,
those are included in the Digital Terrain Model,
and then you can see them here in the hill shade.
So actually this misclassification of the vegetation points
has added a lot of error into the Digital Terrain Model.
We look at a profile that goes
across the Digital Terrain Model there.
I'm assuming that ground probably doesn't look like this.
And basically what we've gotten
is a lot of the different vegetation
that was incorrectly classified as ground points.
So this can occur within our data.
It's more likely to occur on short vegetation.
I think in the waveform presentation yesterday, he mentioned the range resolution of the laser pulses.
So the outgoing pulse width of the Optech system is 10 nanoseconds.
And so based on that we're only able
to get a two meter range resolution.
So we can't distinguish between two objects
that are less than two meters apart.
So when we get short vegetation that's lower than two meters
we're not gonna get the ground point
beneath that vegetation.
And so what happens is that the algorithm sees this as the last return and assumes that it must be the ground point.
And then we get situations like this.
So beware of short vegetation because
it can definitely affect the Digital Terrain Models.
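As a worked check on those figures: the textbook approximation for range resolution is the speed of light times the pulse width over two, which gives 1.5 meters for a 10 nanosecond pulse and 0.45 meters for a 3 nanosecond pulse; the slightly larger 2 meter and 60 centimeter figures quoted in the talk presumably fold in detector and pulse-shape effects.

```python
C = 3.0e8                   # speed of light, m/s
for tau_ns in (10.0, 3.0):  # current Optech pulse vs. the newer 3 ns system
    print(tau_ns, "ns ->", C * tau_ns * 1e-9 / 2.0, "m")  # 1.5 m, 0.45 m
```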
So, I mean, obviously we would like to correct these things,
but we're a small group here at NEON.
We're collecting a lot of data,
so we rely on our classification algorithms to get us 85 to 90% of the way there.
Usually commercial providers will then have employees
that get them the last 10% that takes 90% of the time.
We're getting 90% of the way there,
and then we're delivering the data.
So I just say, if you're using NEON data
it's good to be aware that the classifications
are gettin' you almost there, but not completely.
Yeah so, all this classification is done from the LAS files,
which are available as the L1 product.
And so, I mean, we give those LAS files by flight line with no classifications.
And so you can definitely reclassify the points.
The other thing is, right now, the classification routine takes in several parameters.
We use a standard set of parameters for all the sites,
and probably it would be best to tweak
those parameters slightly for each individual site.
If we're ever gonna do that we're gonna need to figure out
a dynamic way to calculate what those parameters are
as opposed to going in and changing them every time
'cause the process is so automated at this point.
And I have seen research on this starting to come out, figuring out how to dynamically calculate the parameters for the classification, so hopefully that's gonna happen,
and then we'll be able to apply something
that does a little bit better.
And so the Riegl system that we're gonna start flying at the end of this year, next year, its outgoing pulse width is three nanoseconds, as opposed to 10 nanoseconds.
So that brings the range resolution of that system
down to 60 centimeters as opposed to two meters.
So then our take home message, for our uncertainty, is that we try to get those base stations to less than 20 kilometers away to make sure our trajectory is high fidelity.
We test that sensor twice a year, basically when it's going out and when it comes back, at the runway for the vertical accuracy and then here at headquarters for the horizontal accuracy.
And then we're monitoring that boresight,
those boresight misalignments throughout the season.
The simulated errors in the point clouds are available,
but remember, these are based only on the errors
in the individual sensor components.
So those errors have nothing to do
with any sort of classification error
that may be introduced into the point cloud.
'Cause that's something that's really difficult to quantify.
So these errors really only tell you
how well the sensor was operating,
not how it interacted with the land cover.
The ground point density under heavy canopy can be sparse, which can lead to errors in the DTM and the CHM.
And also these misclassifications
are probably our largest source of error right now,
so just be aware of those.
Yeah, so what we do is we actually relate everything
back to the IMU.
And so the IMU is like our base orientation system
inside the plane.
And so when the IMU's tipping back and forth,
and then the laser scanner is scanning out,
we need to know what that difference is
so that when we apply the roll and pitch and yaw, it's being applied correctly.
But then the (voice muffled)
is also sitting slightly differently.
So we relate that back to the IMU as well.
So since we have both related to the IMU
then we have that really high geolocation,
relative geolocation between the two instruments.