Data Concepts

Before we dive in, here are a few ideas to help you connect your data to WebGeoDa.

Three main data concepts matter when loading your data into WebGeoDa:

  1. Geospatial data: Geospatial data (geodata) defines the geographic shapes you'll be working with. Currently, WebGeoDa supports GeoJSON or Mapbox Vector Map Tiles (hosted through Mapbox) for the core polygons that you want to represent. Each geodata source needs an identifier column, such as a ZIP code or FIPS code, that it shares with its tables. Ideally, most data about these geographies will live in the tables described below, but including some basic information in the geodata, such as land area or population, may be useful.

  2. Tables: Tables, or tabular data, should contain most of the information you want to map. Each table needs an identifier column whose values match the ID column on your geodata. Tables can be CSV files, but you can also connect externally to JSON data or a Google Sheet. Each geodata has a set of tables attached to it, from which WebGeoDa infers which variables are valid for that set of geographies. You can define the same table name for different geographies, which lets you swap between geographies for the same variable. For example, consider exploring unemployment rates at the state, county, and ZIP code level: each level of granularity reveals something different about the spatial patterns. Note: It is critical that tables sharing a name across different geographies have the same columns. Treat each table name you define as a preset data schema, because WebGeoDa uses the tables available at each geography to infer which variables are valid at which geographic scale. If tables of the same name, joined at different spatial scales, have different columns, you may get unexpected results.

  3. Variables: Variables define how you want to visualize your data, both spatially and in charts, by specifying the combination of data you want to understand. Each variable needs at minimum a name, data table, column name, and binning strategy. Optionally, you can normalize a variable by dividing a numerator column by a denominator column in each geography, as in GDP per capita. Variables can also handle time-series data, where you define a date index rather than a specific column name. Variables are used both for map representations, where each polygon is colored based on the value of the data, and for chart or information representations. The currently supported charts are histograms, 2D scatterplots, scaled dot 2D histograms, and heatmaps. WebGeoDa will automatically populate the spatial scales that have the correct tables. It will also automatically swap to the relevant dataset when you change variables if the current geographic scale doesn't have the data needed.
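
To make the relationship between the three concepts concrete, here is a minimal Python sketch. It is not WebGeoDa's configuration format or API: the identifier column name `GEOID`, the table name `economy`, the helper functions, and every number are assumptions made up for illustration. It shows a table being joined to geodata on a shared identifier, a check that same-named tables at different geographies share one set of columns, and a normalized variable computed as numerator divided by denominator.

```python
# Hypothetical illustration only; none of these names come from WebGeoDa.

ID_COLUMN = "GEOID"  # assumed name for the shared identifier column

# 1. Geodata: GeoJSON-style features, each carrying the identifier.
geodata = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"GEOID": "17031"}, "geometry": None},
        {"type": "Feature", "properties": {"GEOID": "17043"}, "geometry": None},
    ],
}

# 2. Tables: rows keyed by the same identifier, grouped by geography.
#    The table name "economy" is reused at both scales with identical columns.
tables = {
    "county": {
        "economy": [
            {"GEOID": "17031", "gdp": 400.0, "population": 5.0},
            {"GEOID": "17043", "gdp": 90.0, "population": 3.0},
        ]
    },
    "state": {
        "economy": [{"GEOID": "17", "gdp": 900.0, "population": 12.0}]
    },
}

# 3. Variable: a name, a table, and the columns to combine.
variable = {
    "name": "GDP per capita",
    "table": "economy",
    "numerator": "gdp",
    "denominator": "population",
    "binning": "quantile",  # placeholder label; binning is not computed here
}


def join_table(geo, rows, id_column=ID_COLUMN):
    """Match each feature to the table row that has the same identifier."""
    rows_by_id = {row[id_column]: row for row in rows}
    joined = {}
    for feature in geo["features"]:
        key = feature["properties"][id_column]
        if key in rows_by_id:
            joined[key] = rows_by_id[key]
    return joined


def has_consistent_schema(all_tables, table_name):
    """Return True if `table_name` has the same columns at every geography."""
    column_sets = [
        set(named[table_name][0].keys())
        for named in all_tables.values()
        if named.get(table_name)
    ]
    return all(columns == column_sets[0] for columns in column_sets)


def evaluate(var, joined):
    """Compute numerator / denominator per geography (None if denominator is 0)."""
    values = {}
    for key, row in joined.items():
        denominator = row[var["denominator"]]
        values[key] = row[var["numerator"]] / denominator if denominator else None
    return values


joined = join_table(geodata, tables["county"][variable["table"]])
per_capita = evaluate(variable, joined)
consistent = has_consistent_schema(tables, variable["table"])
```

If the `state` table in this sketch were missing the `population` column, `has_consistent_schema` would return `False`, which is the situation the note about matching columns warns against.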
