University of California
Informatics and GIS Program


Distillation from the NEON Data Institute

So much to learn! Here is my distillation of the main take-homes from last week. 

Notes about the workshop in general:

NEON data and resources:

Other misc. tools:

Posted on Monday, June 26, 2017 at 9:33 PM
Tags: class (28), conferences (32), programming (11), remote sensing (33)

Day 4 Wrap Up from the NEON Data Institute 2017

Day 4

This is it! Final day of LUV-DATA. Today we focused on hyperspectral data and vegetation. Paul Gader from the University of Florida kicked off the day with a survey of some of his projects in hyperspectral data, explorations in NEON data, and big data algorithmic challenges. Katie Jones talked about the terrestrial observational plot protocol at the NEON sites. Plots are either tower plots (in the tower airshed) or distributed plots (throughout the site). She focused on the vegetation sampling protocols (individual, diversity, phenology, biomass, productivity, biogeochemistry). These data will be released in the fall. Samantha Weintraub talked to us about foliar chemistry data (e.g. C, N, lignin, chlorophyll, trace elements) and linking them with remote sensing. Since we are still learning about the fundamental controls on canopy traits within and between ecosystems, and we have a poor understanding of their response to global change, this kind of NEON work is very important. All these foliar chemistry data will be released in the fall. She also mentioned the extensive soil biogeochemical and microbial measurements in soil plots (30cm depth), again in both tower and distributed plots (during peak greenness and 2 seasonal transitions).

The coding work focused on classifying spectra in three tutorials (Classification of Hyperspectral Data with Ordinary Least Squares in Python), (Classification of Hyperspectral Data with Principal Components Analysis in Python), and (Using SciKit for Support Vector Machine (SVM) Classification with Python), all using our new best friend Jupyter Notebooks. We spent most of the time talking about statistical learning, machine learning, and the hazards of using these without an understanding of the target system.
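To give a flavor of the least-squares idea, here is a minimal sketch of my own, not the tutorial code. It assumes a pixel spectrum is a linear mix of two reference ("endmember") spectra, solves the 2x2 normal equations for the mixing weights, and labels the pixel by the larger weight. The function name `ols_abundances`, the four-band spectra, and the 70/30 mix are all hypothetical, and it uses plain Python lists so it runs without extra packages.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def ols_abundances(pixel, end_a, end_b):
    """Least-squares fit of pixel ~ x_a * end_a + x_b * end_b (no intercept)."""
    aa, ab, bb = dot(end_a, end_a), dot(end_a, end_b), dot(end_b, end_b)
    ap, bp = dot(end_a, pixel), dot(end_b, pixel)
    det = aa * bb - ab * ab
    if det == 0:
        raise ValueError("endmember spectra are collinear")
    # Cramer's rule on the 2x2 normal equations.
    x_a = (ap * bb - bp * ab) / det
    x_b = (bp * aa - ap * ab) / det
    return x_a, x_b


# Hypothetical 4-band reflectance spectra (blue, green, red, NIR).
vegetation = [0.04, 0.08, 0.05, 0.45]
soil = [0.10, 0.15, 0.20, 0.30]

# Hypothetical mixed pixel: 70% vegetation, 30% soil.
pixel = [0.7 * v + 0.3 * s for v, s in zip(vegetation, soil)]

x_veg, x_soil = ols_abundances(pixel, vegetation, soil)
label = "vegetation" if x_veg >= x_soil else "soil"
print(f"vegetation weight {x_veg:.2f}, soil weight {x_soil:.2f} -> {label}")
```

This also shows the hazard we discussed: the fit will happily return weights for a pixel that is neither vegetation nor soil, so the endmembers have to make sense for the target system.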

Fun additional take-home messages/resources:

  • NEON data seems like a tremendous resource for research and teaching. Increasing amounts of data are going to be added to their data portal. Stay tuned:
  • NRC has collaborated with NEON to do some spatially extensive soil characterization across the sites. These data will also be available as a NEON product.
  • For more on when data rolls out, sign up for the NEON eNews here:

Thanks to everyone today! Megan Jones (ran a flawless workshop), Paul Gader (remote sensing use cases/classification), Katie Jones (NEON terrestrial vegetation sampling), Samantha Weintraub (foliar chemistry data).

And thanks to NEON for putting on this excellent workshop. I learned a ton, met great people, got re-energized about reproducible workflows (have some ideas about incorporating these concepts into everyday work), and got to spend some nostalgic time walking around my former haunts in Boulder.

Posted on Thursday, June 22, 2017 at 6:12 PM

Day 4 Wrap Up from the NEON Data Institute 2017


Posted on Thursday, June 22, 2017 at 3:20 PM

NEON 2017 Data Workshop Notes - Day 3

Today we focused on uncertainty. Yay!

Tristan Goulden gave a talk on sources of uncertainty in the discrete return lidar data. Uncertainty comes from two main sources: geolocation, both horizontal and vertical (e.g. distance from base station, distribution and number of satellites, and accuracy of the IMU), and processing (e.g. classification of the point cloud, interpolation method). The NEON remote sensing team has developed tests for each of these error sources. With all of their lidar data, NEON provides a simulated point cloud error product, with horizontal and vertical error per point in LAS format (cool!). These products show the error is largest at the edges of scans, obvi.

  • The take homes are: fly within 20km of a base station; test your lidar sensor annually; check your boresight; dense canopy makes ground point density sparser, so the DTM is problematic; and initial point cloud misclassification can lead to large errors in downstream products. So much more in my notes.

We then coded an example from the PRIN NEON site, where NEON captured lidar data twice within 2 days, so we could explore how different the data were. Again, we used Jupyter Notebooks and explored the relative differences in DSM and DTM values between the two lidar captures. The differences are random but non-negligible, at least for the DSM. For the DTM, the differences range from 0 to 20cm, but for the DSM they range from 0 to 1.5m. The mean DSM is 6.34m, so the difference can be ~20%. The take home is that despite a 15cm vertical accuracy spec from vendors, you can get very different measures on different flights, and those differences can be considerable, especially with vegetation. In fact, NEON meets its 15cm accuracy requirement only in non-vegetated areas. Note that when you download NEON data, you can get line-to-line differences in the NEON lidar metadata to explore this. But if you are in heavily vegetated areas, you should expect higher than 15cm error.
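The comparison between two captures boils down to differencing the rasters pixel by pixel and summarizing the result. Here is a minimal sketch of that step, assuming the two DSMs are already co-registered on the same grid. The six height values and the helper `difference_stats` are made up for illustration; they are not the PRIN data, and a real raster would be a 2D array rather than a short list.

```python
from statistics import mean

# Hypothetical DSM heights (m) for the same six pixels on two flights.
dsm_flight1 = [6.2, 7.9, 5.1, 6.8, 4.4, 7.6]
dsm_flight2 = [6.0, 8.4, 5.1, 5.9, 4.6, 7.1]


def difference_stats(first, second):
    """Summarize per-pixel absolute differences between two captures."""
    diffs = [abs(a - b) for a, b in zip(first, second)]
    return {
        "min": min(diffs),
        "max": max(diffs),
        "mean": mean(diffs),
        # Largest difference relative to the mean height of the first flight.
        "max_pct_of_mean": 100 * max(diffs) / mean(first),
    }


stats = difference_stats(dsm_flight1, dsm_flight2)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Expressing the largest difference as a percentage of the mean height is the same kind of calculation behind the ~20% figure above.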

After lunch we launched into the NEON Imaging Spectrometer data and uncertainty with Nathan Leisso. This is something I had not really thought about before this workshop.
We talked about orthorectification and geolocation, focal plane characterization, spectral calibration, and radiometric calibration, and all the possible sources of error that can creep into the data, like blurring and ghosting of light. NEON calibrates their data across these areas, and provided information on each. I don't think there are many standards for reporting these kinds of spectral uncertainties.

The first live coding exercise (Hyperspectral Variation Uncertainty Analysis in Python) looked at the NEON site F07A, at which NEON acquired 18 individual flights (for BRDF work) over an hour on one day. We used these data and plotted the different spectral reflectance curves for several pixels. For a vegetated pixel, the NIR can vary tremendously! (e.g. 20% reflectance compared to 50% reflectance, depending on time of day, solar angle, etc.) Wow! I should note that related indices like NDVI, which are ratios, will not be as affected. Also, you can normalize the output using some nifty methods like the Standard Normal Variate (SNV) algorithm, if you have large areas over which you can gather multiple samples.
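A small sketch of why ratios and SNV help, under a big simplifying assumption: that the flight-to-flight change is a single brightness factor applied to every band. Real variation with solar angle is not that clean, so treat this as an illustration only. The band values are hypothetical, NDVI is the usual (NIR - red) / (NIR + red), and SNV here is the standard per-spectrum form (subtract the spectrum's mean, divide by its standard deviation).

```python
from statistics import mean, stdev


def ndvi(red, nir):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def snv(spectrum):
    """Standard Normal Variate: center and scale a single spectrum."""
    mu = mean(spectrum)
    sigma = stdev(spectrum)  # sample standard deviation across bands
    return [(value - mu) / sigma for value in spectrum]


# Hypothetical reflectance for one vegetated pixel (blue, green, red, NIR).
flight_a = [0.03, 0.06, 0.04, 0.50]
# Same pixel on another flight, modeled as uniformly darker (NIR drops to 0.20).
flight_b = [value * 0.4 for value in flight_a]

print("NIR:", flight_a[3], "vs", round(flight_b[3], 2))
print("NDVI:", round(ndvi(flight_a[2], flight_a[3]), 3),
      "vs", round(ndvi(flight_b[2], flight_b[3]), 3))
print("SNV:", [round(v, 3) for v in snv(flight_a)],
      "vs", [round(v, 3) for v in snv(flight_b)])
```

Under this assumption the raw NIR differs by more than a factor of two while NDVI and the SNV-normalized spectra come out the same for both flights.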

The second live coding exercise (Assessing Spectrometer Accuracy using Validation Tarps with Python) focused on a calibration experiment they conducted at CHEQ for the NIS instrument. They laid out two reflectance tarps - 3% (black) and 48% (white), measured reflectance with an ASD spectrometer, and flew over with the NIS. We compared the data across wavelengths. Results summary: small differences between ASD and NIS across wavelengths; water absorption bands play a role; % differences can be quite high - up to 50% for the black tarp. This is mostly from stray light from neighboring areas. NEON has a calibration method for this (they call it their "de-blurring correction").
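The large percent difference on the black tarp makes sense once you see how a small absolute offset behaves against a dark target. Here is a minimal sketch, assuming stray light adds the same absolute reflectance to both tarps. The 0.015 offset is a made-up number chosen for illustration, not a NEON measurement; only the 3% and 48% tarp reflectances come from the exercise.

```python
def percent_difference(reference, measured):
    """Difference of measured from reference, as a percent of reference."""
    return 100 * (measured - reference) / reference


# Hypothetical stray-light offset, in absolute reflectance units.
stray_light = 0.015

# Nominal tarp reflectances stand in for the ASD (reference) measurements.
tarps = {"black": 0.03, "white": 0.48}

results = {}
for name, asd_reflectance in tarps.items():
    nis_reflectance = asd_reflectance + stray_light
    results[name] = percent_difference(asd_reflectance, nis_reflectance)
    print(f"{name} tarp: {results[name]:.1f}% difference")
```

The same offset is roughly half of the black tarp's signal but only a few percent of the white tarp's, which is why dark targets are the hard case.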

Fun additional take-home messages/resources:

  • All NEON point cloud classifications are done with LAStools. Go LAStools!
  • Check out PDAL - like GDAL for point clouds. It can be used from bash. Learned from my workshop neighbor Sergio Marconi.
  • Reflectance Tarps are made by GroupVIII
  • ATCOR says we should be able to rely on 3-5% error on reflectance when atmospheric correction is done correctly (say that 10 times fast) with a well-calibrated instrument.
  • NEON hyperspectral data is stored in HDF5 format. HDFView is a great tool for interrogating the metadata, among other things.

Thanks to everyone today! Megan Jones (our fearless leader), Tristan Goulden (Discrete Lidar Uncertainty and all the coding), Nathan Leisso (spectral data uncertainty), and Amanda Roberts (NEON intern - spectral uncertainty).

Day 1 Wrap Up
Day 2 Wrap Up

Posted on Thursday, June 22, 2017 at 12:00 AM

Day 2 Wrap Up from the NEON Data Institute 2017

First of all, Pearl Street Mall is just as lovely as I remember, but OMG it is so crowded, with so many new stores and chains. Still, good food, good views, hot weather, lovely walk.

Welcome to Day 2!
Our morning session focused on reproducibility and workflows with the great Naupaka Zimmerman. Remember the characteristics of reproducibility - organization, automation, documentation, and dissemination. We focused on organization, and spent an enjoyable hour sorting through an example messy directory of misc. data files and code. The directory looked a bit like many of my directories. Lesson learned. We then moved to working with new data and git to reinforce yesterday's lessons. Git was super confusing to me 2 weeks ago, but now I think I love it. We also went back and forth between Jupyter and standalone Python scripts, and abstracted variables, and lo and behold I got my script to run.

The afternoon focused on lidar (yay!), and prior to coding we talked about discrete and waveform data and collection, and the OpenTopography project with Benjamin Gross. The OpenTopography talk was really interesting. They are not just a data distributor any more; they also provide an HPC framework (mostly TauDEM for now) on their servers at SDSC. They are going to roll out a user-initiated HPC functionality soon, so stay tuned for their new "pluggable assets" program. This is well worth checking into. We also spent some time live coding in Python with Bridget Hass, working with a CHM from the SERC site in California, and had a nerve-wracking code challenge to wrap up the day.

Fun additional take-home messages/resources:

Thanks to everyone today! Megan Jones (our fearless leader), Naupaka Zimmerman (Reproducibility), Tristan Goulden (Discrete Lidar), Keith Krause (Waveform Lidar), Benjamin Gross (OpenTopography), Bridget Hass (coding lidar products).

Our home for the week

Posted on Tuesday, June 20, 2017 at 10:59 PM
Tags: class (28), cloud (6), coding (10), collaboration (24), conferences (32), learning (9), lidar (8), open source (10), programming (11), remote sensing (33), tools (5), training (10)
