University of California
Informatics and GIS Program

Posts Tagged: coding

Day 2 Wrap Up from the NEON Data Institute 2017

First of all, Pearl Street Mall is just as lovely as I remember, but OMG it is so crowded, with so many new stores and chains. Still, good food, good views, hot weather, lovely walk.

Welcome to Day 2!
Our morning session focused on reproducibility and workflows with the great Naupaka Zimmerman. Remember the characteristics of reproducibility - organization, automation, documentation, and dissemination. We focused on organization, and spent an enjoyable hour sorting through an example messy directory of misc data files and code. The directory looked a bit like many of my directories. Lesson learned. We then moved to working with new data and git to reinforce yesterday's lessons. Git was super confusing to me 2 weeks ago, but now I think I love it. We also went back and forth between Jupyter and python stand alone scripts, and abstracted variables, and lo and behold I got my script to run.

The afternoon focused on Lidar (yay!) and prior to coding we talked about discrete and waveform data and collection, and the opentopography ( project with Benjamin Gross. The opentopography talk was really interesting. They are not just a data distributor any more, they also provide a HPC framework (mostly TauDEM for now) on their servers at SDSC ( They are going to roll out a user-initiated HPC functionality soon, so stay tuned for their new "pluggable assets" program. This is well worth checking into. We also spent some time live coding with Python with Bridget Hass working with a CHM from the SERC site in California, and had a nerve-wracking code challenge to wrap up the day.

Fun additional take-home messages/resources:

Thanks for everyone today! Megan Jones (our fearless leader), Naupaka Zimmerman (Reproducibility), Tristan Goulden (Discrete Lidar), Keith Krause (Waveform Lidar), Benjamin Gross (OpenTopography), Bridget Hass (coding lidar products).

Our home for the week

Posted on Tuesday, June 20, 2017 at 10:59 PM
Tags: class (5), cloud (3), coding (10), collaboration (5), conferences (7), learning (2), lidar (1), open source (3), programming (5), remote sensing (2), tools (2), training (3)

Day 1 Wrap Up from the NEON Data Institute 2017

I left Boulder 20 years ago on a wing and a prayer with a PhD in hand, overwhelmed with bittersweet emotions. I was sad to leave such a beautiful city, nervous about what was to come, but excited to start something new in North Carolina. My future was uncertain, and as I took off from DIA that final time I basically had Tom Petty's Free Fallin' and Learning to Fly on repeat on my walkman. Now I am back, and summer in Boulder is just as breathtaking as I remember it: clear blue skies, the stunning flatirons making a play at outshining the snow-dusted Rockies behind them, and crisp fragrant mountain breezes acting as my Madeleine. I'm back to visit the National Ecological Observatory Network (NEON) headquarters and attend their 2017 Data Institute, and re-invest in my skillset for open reproducible workflows in remote sensing. 

Day 1 Wrap Up from the NEON Data Institute 2017
What a day!
Attendees (about 30) included graduate students, old dogs (new tricks!) like me, and research scientists interested in developing reproducible workflows into their work. We are a mix of ages and genders. The morning session focused on learning about the NEON program ( its purpose, sites, sensors, data, and protocols. NEON, funded by NSF and managed by Battelle, was conceived in 2004 and will go online for a 30-year mission providing free and open data on the drivers of and responses to ecological change starting in Jan 2018. NEON data comes from IS (instrumented systems), OS (observation systems), and RS (remote sensing). We focused on the Airborne Observation Platform (AOP) which uses 2, soon to be 3 aircraft, each with a payload of a hyperspectral sensor (from JPL, 426, 5nm bands (380-2510 nm), 1 mRad IFOV, 1 m res at 1000m AGL) and lidar (Optech and soon to be Riegl, discrete and waveform) sensors and a RGB camera (PhaseOne D8900). These sensors produce co-registered raw data, are processed at NEON headquarters into various levels of data products. Flights are planned to cover each NEON site once, timed to capture 90% or higher peak greenness, which is pretty complicated when distance and weather are taken into account. Pilots and techs are on the road and in the air from March through October collecting these data. Data is processed at headquarters.

In the afternoon session, we got through a fairly immersive dunk into Jupyter notebooks for exploring hyperspectral imagery in HDF5 format. We did exploration, band stacking, widgets, and vegetation indices. We closed with a fast discussion about TGF (The Git Flow): the way to store, share, control versions of your data and code to ensure reproducibility. We forked, cloned, committed, pushed, and pulled. Not much more to write about, but the whole day was awesome!

Fun additional take-home messages:
- NEON is amazing. I should build some class labs around NEON data, and NEON classroom training materials are available:
- Making participants do organized homework is necessary for complicated workshop content:
- HDF5 as an possible alternative data format for Lidar - holding both discrete and waveform
- NEON imagery data is FEDExed daily to headquarters after collected
- I am a crap python coder
- #whofallsbehindstaysbehind
- Tabs are my friend

Thanks to everyone today, including: Megan Jones (Main leader), Nathan Leisso (AOP), Bill Gallery (RGB camera), Ted Haberman (HDF5 format), David Hulslander (AOP), Claire Lunch (Data), Cove Sturtevant (Towers), Tristan Goulden (Hyperspectral), Bridget Hass (HDF5), Paul Gader, Naupaka Zimmerman (GitHub flow).

IMG_5359 copy.JPG
IMG_5365 copy.JPG
IMG_5368 copy.JPG
IMG_5369 copy.JPG
Posted on Monday, June 19, 2017 at 11:55 PM
Tags: class (5), coding (10), collaboration (5), conferences (7), data (1), open source (3), privacy (1), programming (5), remote sensing (2), tools (2), training (3)

Cloud-based raster processors out there

Hi all,

Just trying to get my head around some of the new big raster processors out there, in addition of course to Google Earth Engine. Bear with me (bare?) while I sort through these. Thanks for raster sleuth Stefania Di Tomasso for the leg work. 

1. Geotrellis (

Geotrellis is a Scala-based raster processing engine, and it is one of the first geospatial libraries on Spark.  Geotrellis is able to process big datasets. Users can interact with geospatial data and see results in real time in an interactive web application (for regional, statewide dataset).  For larger raster datasets (eg. US NED). GeoTrellis performs fast batch processing using Akka clustering to distribute data across the cluster.  GeoTrellis was designed to solve three core problems, with a focus on raster processing:

  • Creating scalable, high performance geoprocessing web services;
  • Creating distributed geoprocessing services that can act on large data sets; and
  • Parallelizing geoprocessing operations to take full advantage of multi-core architecture.


  • GeoTrellis is designed to help a developer create simple, standard REST services that return the results of geoprocessing models.
  • GeoTrellis will automatically parallelize and optimize your geoprocessing models where possible.
  • In the spirit of the object-functional style of Scala, it is easy to both create new operations and compose new operations with existing operations.

2. GeoPySpark - in synthesis GeoTrellis for Python community

Geopyspark provides python bindings for working with geospatial data on PySpark (PySpark is the Python API for Spark). Spark is open source processing engine originally developed at UC Berkeley in 2009.  GeoPySpark makes Geotrellis ( accessible to the python community.  Scala is a difficult language so they have created this Python library. 

3. RasterFoundry

Posted on Tuesday, May 2, 2017 at 5:58 PM
Tags: big (1), cloud (3), coding (10), raster (1)

Spatial Data Science Bootcamp 2016!

Last week we held another bootcamp on Spatial Data Science. We had three packed days learning about the concepts, tools and workflow associated with spatial databases, analysis and visualizations. Our goal was not to teach a specific suite of tools but rather to teach participants how to develop and refine repeatable and testable workflows for spatial data using common standard programming practices.

2016 Bootcamp participants

On Day 1 we focused on setting up a collaborative virtual data environment through virtual machines, spatial databases (PostgreSQL/PostGIS) with multi-user editing and versioning (GeoGig). We also talked about open data and open standards, and moderndata formats and tools (GeoJSON, GDAL).  On Day 2 we focused on open analytical tools for spatial data. We focused on Python (i.e. PySAL, NumPy, PyCharm, iPython Notebook), and R tools.  Day 3 was dedicated to the web stack, and visualization via ESRI Online, CartoDB, and Leaflet. Web mapping is great, and as says: “Internet maps appear magical: portals into infinitely large, infinitely deep pools of data. But they aren't magical, they are built of a few standard pieces of technology, and the pieces can be re-arranged and sourced from different places.…Anyone can build an internet map."

All-in-all it was a great time spent with a collection of very interesting mapping professionals from around the country. Thanks to everyone!

Posted on Saturday, April 2, 2016 at 8:43 PM
Tags: class (5), coding (10), collaboration (5), conferences (7), programming (5), viz (3)

ESRI @ GIF Open GeoDev Hacker Lab

We had a great day today exploring ESRI open tools in the GIF.  We had a full class of 30 participants, and two great ESRI instructors (leaders? evangelists?) John Garvois and Allan Laframboise, and we worked through a range of great online mapping (data, design, analysis, and 3D) examples in the morning, and focused on using ESRI Leaflet API in the afternoon. Here are some of the key resources out there.

Great Stuff! Thanks Allan and John

Posted on Friday, January 15, 2016 at 9:39 PM
Tags: cartography (2), class (5), cloud (3), coding (10), conferences (7), geoweb (1), programming (5), webgis (1)

Next 5 stories | Last story

Webmaster Email: