Mapping Afterschool in Vermont with Tableau and Mapbox

In the spring of 2017, I attended the Northeast Arc Users Group Conference (NEARC) in Amherst, MA and gave a presentation about how I was able to leverage the powers of Tableau Public, Mapbox, MapZen, Excel, and Access to create an easily-update-able interactive web map for Vermont Afterschool’s website. I walked the audience through my decision making process and discoveries that I made along the way. An audience member described it as a very “nuts and bolts” presentation - just what I was going for.

Two years later, the web map continues to thrive, although the database on the back end has changed. Shortly after I gave the presentation, Microsoft announced the discontinuation of its Access Web Apps; shortly after, a second gut-punch: MapZen announced that its free and open access geocoding service was shutting down. Fortunately, I was able to relatively painlessly shift the database over to Zoho Creator and write some internal scripts that called out to Open Street Map to geocode through that service instead of MapZen. It was a lesson in the risks of relying exclusively on third party services for technical operations.

Regardless of the changes, it still seems valuable to share the presentation as a demonstration of the thought processes behind my work and what was possible with these straight-forward tools up until 2017. Here are the key slides and text of the presentation.

Vermont Afterschool is a public-private statewide partnership that helps support and sustain learning opportunities beyond the school day for Vermont’s children and youth. It does things such as advocacy work and training that will support organizations in providing afterschool and summer learning. We want the kids in Vermont to not only have safe places to be when their parents are working, but also to have places where they can develop useful skills and grow in their 21st century skills such as leadership, confidence, and healthy lifestyles. At its inception in 2009-10, the organization didn’t really have a sense of were all these programs were located, or how many there were. We thought it’d be helpful to get an idea of this so we could begin to reach out to them. We didn’t initially think of a map, just mainly a database with contact information.

The short story is that we did eventually end up with this map which we launched about a year ago. It’s hosted on Tableau Public and embedded on our website. Users have the ability to search for programs by name or by location and they can filter by type of program, such as YMCA or Boys and Girls Clubs and can also filter by location (such as county or town). Finally, they can hover over various points to see a pop-up with general location and contact information.

In summary, I used a total of five applications to create this map. I was able to do it basically for free (we only pay a monthly fee for the database as a SQL web app) and with very little actual programming needed.

Tableau Public is where the application was ultimately styled and is hosted. Tableau reads from Mapbox to get the style of the map. It also reads from Excel to get the program data, which includes the GPS coordinates for each. Down one level, Excel reads from the SQL data from the MS Access database through a live data connection. And Excel calls up Mapbox through a function to have each row in the table geocoded.

Here is how the map came to be. Back in 2010, I surveyed school principals to figure out which schools in the state hosted afterschool programs. We also obtained lists of programs that were funded with the 21st Century Community Learning Centers federal funding and which were licensed by the Child Development Division of the Agency of Human Services in Vermont. In 2015, we got the latest version on MS Access and we were able to turn our database into a MS Access password-protected web app where our data was hosted on a SQL server. This was a big step because we could allow readability and write-ability of our data through other apps.

Then in 2016 came the web map.

Here is the front-end of our internal database – the upgrade on 2010. It’s hosted on Microsoft’s SQL server. Our staff updates program information as we become aware of it. We have street address data for every program. Yes, that took some Googling on our part. This web map is great and easy to maintain, but one thing that we can’t do with it is geocode our addresses.

That brings us to the two main things that I needed to think about when beginning the process of creating the map. The first was to figure out how to geocode all of the approximately 400 street addresses of the afterschool programs and the second was to create some kind of user-friendly web map interface so that people could find and interact with this information in some kind of useful way.

While taking these two main points into account and also realizing there are so many ways to do this on the web, I also recognized the limitations. Geocoding, especially really accurate geocoding, can be expensive. Also, we were not looking to do a one-time geocoding, but rather we did want some kind of automatic process in place where whenever a new program was added to the database, it would be geocoded and then appear on the map. Shouldn’t be too much to ask, right?

Sure enough, there are plenty of applications that can both geocode and turn geocoded data into beautiful web maps. However, the cost is sometimes significant and also, it was really unclear to me if it would be possible to connect our SQL data to these applications, have the data geocoded, and automatically kept up to date. It seemed like by leveraging the APIs (application programming interfaces), and by doing some manual programming this might be possible, but this was a bit beyond me at the time.

Enter…Tableau Public. This is a free service that allows anyone to publish interactive data visualizations on the web. And you can embed them onto your own website. I had been somewhat familiar with Tableau and knew there was a mapping component and that you could add filters and pop-ups to your data and it was easily customizable.

I began testing out how well Tableau would work for us. I connected a spreadsheet of our programs from the database and checked it out on the map. Now at first glance, it looks like the data are geocoded beautifully, but this isn’t actually the case. In Tableau, you can specify which fields contain geographic data, but not down to the address level. Onl yas far down as the town or zip level.

This is what the ZIP codes of each program look like. A point is placed in the center of each ZIP code area. Of course, many ZIP codes have more than one program so those are just piled on top of one another, which doesn’t really work. We don’t need absolutely perfect geocoding, but it would be helpful to better see the density of programs so we need to do a little better than zip-code level.

City-level might be a little better, since in Vermont some ZIP codes comprise more than one town since many of the towns are small. But, when I tried it that way, only a few points appeared on the map and I got this list of 165 errors. Even when I specified that all of these towns should be in Vermont, Tableau couldn’t identify most of them.

One thing that Tableau will do nicely though is map your data if it is pre-geocoded; that is, if you already have the latitude and longitude coordinates for each of your rows. So that means that thinking back to our two big things to do of geocoding and creating the web interface, I decided that this would have to happen in two different places.

However I decided to geocode the address data, I was going to have to store the results so that I could use them in Tableau. This turned out to be an important consideration because often services that allow free or cheap geocoding will have a cost or a higher cost associated with storing the coordinates in your own database as opposed to displaying them directly on their mapping applications.

The second consideration with geocoding, which I mentioned briefly earlier would to be make it some sort of semi-automatic process by which a new program gets added to the database and then it’s automatically geocoded and displayed on the map.

Finally, I need the coordinates in some type of useable format that I can understand. I can connect a few different file formats to Tableau, Excel being one of them. So ideally I would want the output of the geocoding to be two nice neat columns that could go in Excel – one for latitude and one for longitude.

Here’s what I mean by wanting to put the coordinates into Excel. I know that I can link the specific query in our Access SQL database with an Excel file and also link an Excel file to a Tableau project, and I know Excel pretty well, so I figured if I could somehow make the geocoding happen in Excel, this would be a feasible plan.

There are many many geocoding services available and varying costs – any varying levels of accuracy. Most have storage and usage restrictions. Like I mentioned, for many there are higher cost associated with storing the results for your own use. Esri’s service and Mapbox’s service are among some of these options.

I also discovered Mapzen, which has very flexible and open licensing. According to their terms of service, you can geocode up to 25,000 addresses per month at no charge AND use them with any map – not just on a Mapzen map.

The output of geocoding with Mapzen are in GeoJSON format. JSON stands for Javascript ObjectNotation and GeoJSON just means that the output in that format contains geographic data.

When you sign up for a Mapzen account, you receive an API key that you tack onto a URL along with your search parameters, which are the address information. This is called their search URL and this is the template of what it looks like:

You can specify more than just the address. For example, you can specify a country that you want your results limited to and the number of results that you want returned.

This what the GeoJSON output looks like – for geocoding of one address. This is the result when I put that URL into chrome and I have a JSON viewer extension. Without the JSON viewer extension, you’d just see a giant block of text.

And you can see, included in the output are the GPS coordinates for the address. So then the question becomes, how can I geocode everything, on a continual basis and extract the latitude and longitude so that they are in some type of useable and storable format?

This is where Excel, again, comes in. Instead of putting a specific Mapzen search URL into a browser window, I can use a function in Excel called WEBSERVICE. The WEBSERVICE function allows returns data from the web when you use a URL as the parameter.

We can write a function using the Mapzen search URL and each address to return the GeoJSON search text result into each row.

Again, we need to get the latitude and longitude into separate columns for mapping purposes. So, the next step was to write a formula in separate cells using combinations of the Excel functions, VALUE and FIND and a few others to locate the word “coordinates” in the GeoJSON output for each of the latitudes and longitudes.

Also, I had wanted this all to happen automatically. Well, because all of this data is in an Excel table, it kind of does. Whenever the file is opened, the data gets refreshed. So that means if new programs hard been added (or deleted) from the database query, new rows will be added or taken away from the table accordingly. And the WEBSERVICE formula will geocode the newly added rows automatically. This is all because it is a table.

The reason it’s only semi-automatic is because the Excel file has to be opened in order for the data connection to refresh.There’s probably some way to not have to make this happen, but I don’t know how. Either way, it’s so much easier than manually adding new sites or re-uploading data to a map for every update.

Now we have lat/long coordinates for every program. Now, each program can be mapped more accurately. And we can see, for example, that there is a big cluster in the Burlington area, which makes sense because it has the largest highest population density in the state. This is way more useful than just a dot in the center of each ZIP code region.

So here’s where Mapbox comes in. A few releases ago, Tableau made a way to connect with Mapbox for nicer map styling. For our map, we don’t really need to see the bordering states. Also, it might be nicer to get some county borders in there so we can more easily visualize where in the state the programs are located.

Then I added filters and info windows to make the map more user-friendly and we embedded it on our website.

Remember how I had to open the Excel file for the data to refresh from the connection? I have to do the same thing here. I can’t just view the map on our website to see new points added whenever there are new programs added to our database. I have to open Excel torefresh the data, and then I also need to open the map application in Tableau so it can get a fresh read off of the Excel workbook. So there are two steps thatI wish didn’t have to exist, but they are very easy steps and do not involve changing anything with the design.

Using Mapbox Studio, with a free account I was able to create a ‘map style’ where I uploaded a few shapefiles –one with the county polygons and one with the town polygons to create this mapstyle. Mapbox provides a URL for using any style you’ve created in Tableau.

 Then back in Tableau there’s a menu option for adding your own Mapbox style, so the map can look more like the one on the right than the one on the left:

In summary, I used a total of five applications/tools to create this map. I was able to do it basically for free (we only pay a monthly fee for the database as a SQL web app) and with very little actual programming needed.

Tableau Public is where the application was ultimately styled and is hosted. Tableau reads from Mapbox to get the style of the map. It also reads from Excel to get the program data, which includes the GPS coordinates for each.

Down one level, Excel reads from the SQL data from the MS Access database through a live data connection. And Excel calls up Mapbox through a function to have each row in the table geocoded.

The map is great because it works, it’s live, it updates semi-automatically and we didn’t need to vastly improve our skills set or budget in order to get it and keep it up and running. However, there are still some things we’d like to improve in the future.

It would be great, as I’ve stated to have the process be more automated. I’d like to not have to open the Excel Workbook and Tableau file in order to refresh the data connections.

I’d also like to be able to make the info windows a bit more customizable. Right now they are just plain text and it would be great to have links to the programs’ websites. There doesn’t seem to be an obvious way to do this with Tableau.

I will probably at some point have to reassess the limits of using this particular set of methods of connecting various services and avoiding actual programming to get this map to do what we want it to do. We are going to first explore other data hosting options and free tools available for non-profits.