00

Folder Structure

All files must be in the same folder before you start.

DSPG26Data-Organization-Workshop-main/
  ├── acs_virginia_raw.csv
  ├── data_availability_workshop_exercise.R
  ├── data_availability_dashboard.qmd
  └── data_availability_dashboard_shinyapp.R
Open RStudio → File → Open Project → select this folder. Your working directory will be set automatically.

01

Install & Load Packages

Run the install line once. Load packages at the start of every session.

R
# Install once
install.packages(c("tidyverse", "tidycensus", "reactable",
                   "htmlwidgets", "jsonlite", "shiny"))

# Load every session — run this before anything else
library(tidyverse)
library(tidycensus)
library(reactable)
library(jsonlite)

02

Census API Key

Register your key once — it saves permanently to your machine. Get a free key at api.census.gov/data/key_signup.html.

R
# Replace with your actual key from api.census.gov/data/key_signup.html
census_api_key("YOUR_API_KEY_HERE", install = TRUE)
readRenviron("~/.Renviron")

# Verify it worked — should return your key, not an empty string
Sys.getenv("CENSUS_API_KEY")

03

Single Data Pull

One function call, one variable, one year. This is the building block of everything else.

R
va_income <- get_acs(
  geography = "county",      # unit of analysis
  variables = "B19013_001",  # ACS variable code
  state     = "VA",          # Virginia counties only
  year      = 2022,          # which year
  survey    = "acs5"         # 5-year estimates
)

head(va_income)
# Columns: GEOID, NAME, variable, estimate, moe
# estimate = the ACS value · moe = margin of error
survey = "acs5" means 5-year estimates. Use "acs1" for 1-year, but only for areas with population > 65,000.

04

Multi-Year Download

The full loop that created acs_virginia_raw.csv. Pull each variable separately so failures in one year don't block others. Takes several minutes.

R — reference only, do not run during workshop
vars  <- c("B19013_001", "B17001_002", "B15003_022",
           "B27001_001", "B01003_001", "B28002_004")
years <- 2010:2023

raw_data <- map_dfr(vars, function(var) {
  map_dfr(years, function(yr) {
    tryCatch({
      get_acs(geography = "county", variables = var,
              state = "VA", year = yr, survey = "acs5") |>
        mutate(year = yr)
    }, error = function(e) NULL)  # skip silently if unavailable
  })
})

# Save immediately — never have to re-run the loop
write_csv(raw_data, "acs_virginia_raw.csv")
tryCatch() handles errors gracefully — if a variable doesn't exist for a year (e.g. broadband before 2017), it returns NULL and skips instead of crashing.

05

Load the Pre-Downloaded Data

No API needed for today — load the CSV we prepared for you.

R
raw <- read_csv("acs_virginia_raw.csv")

glimpse(raw)
# 9,725 rows · 6 columns: GEOID, NAME, variable, estimate, moe, year

06

Find Variable Codes

ACS codes follow a pattern: B19013_001 = Table B19013, Cell 001. Search with load_variables().

R
vars_lookup <- load_variables(2022, "acs5", cache = TRUE)

# Search by keyword — opens as a spreadsheet in RStudio
vars_lookup |>
  filter(str_detect(label, "income|Income")) |>
  select(name, label, concept) |>
  head(10) |>
  View()
concept = the table name  ·  name = the code you use in get_acs()  ·  label = what it measures

07

Audit Variable Availability

Check which variables exist in which years — directly from the CSV, no API calls needed.

R
# What actually exists in the CSV
years_in_data <- raw |>
  distinct(variable, year)

# All possible combinations (6 variables × 14 years = 84 rows)
all_combos <- expand_grid(
  variable = unique(raw$variable),
  year     = 2010:2023
)

# Join — missing combinations become FALSE
audit_results <- all_combos |>
  left_join(
    years_in_data |> mutate(available = TRUE),
    by = c("variable", "year")
  ) |>
  mutate(available = replace_na(available, FALSE))

08

Add Labels & Categories

Map variable codes to human-readable labels and thematic categories.

R
var_metadata <- tibble(
  variable = c("B19013_001", "B17001_002", "B15003_022",
               "B27001_001", "B01003_001", "B28002_004"),
  label    = c("Median Household Income", "Population Below Poverty",
               "Bachelor's Degree", "Health Insurance Universe",
               "Total Population", "Broadband Access"),
  category = c("Socioeconomic", "Socioeconomic", "Socioeconomic",
               "Healthcare Access", "Demographics", "Infrastructure")
)

audit_final <- audit_results |>
  left_join(var_metadata, by = "variable") |>
  group_by(variable, label, category) |>
  summarise(
    n_years       = sum(available),
    years_present = list(year[available == TRUE]),
    missing_years = list(year[available == FALSE]),
    .groups = "drop"
  )

# Save for dashboard QMD and Shiny app
saveRDS(audit_final, "audit_final.rds")

09

Option 1 — reactable Widget

Quickest option. Produces a standalone HTML file that works offline.

R
library(htmlwidgets)

widget <- reactable(
  audit_final,
  filterable = TRUE,
  searchable = TRUE
)

saveWidget(widget, "data_availability.html", selfcontained = TRUE)

10

Option 2 — Quarto Dashboard

Styled VT-colored table. Requires audit_final.rds — run Step 08 first.

R
# Run Step 08 first to create audit_final.rds, then:
quarto::quarto_render("data_availability_dashboard.qmd")

# Open data_availability_dashboard.html in your browser

11

Option 3 — Shiny App

Live reactive filtering inside RStudio. Requires audit_final.rds — run Step 08 first.

R
# Run Step 08 first to create audit_final.rds, then:
shiny::runApp("data_availability_dashboard_shinyapp.R")
Note: Shiny requires R to be running. Unlike Options 1 and 2, it cannot be saved as a standalone file.

12

Option 4 — R + Claude

Export JSON from R, paste into Claude with the prompt below to generate a polished color-coded dashboard.

R — Export JSON
cat(toJSON(audit_final |>
  mutate(
    years_present = map(years_present, as.integer),
    missing_years = map(missing_years, as.integer)
  ), pretty = TRUE))
Claude Prompt — paste your JSON at the bottom
You are a data visualization expert. I have audited variable
availability in ACS 5-year estimates for Virginia counties
from 2010-2023. Below is a JSON file with the results.

Create a self-contained interactive HTML dashboard for an ACS variable availability audit.

STYLE
- Virginia Tech colors: maroon #861F41, orange #E5751F, cream background
- Fonts: Cormorant Garamond (headings), JetBrains Mono (code), DM Sans (body)

LAYOUT
- Dark maroon hero header with grid texture, italic serif title, monospace metadata
- White stat bar: total variables, full/partial counts, animated fill-rate progress bar
- Filter pills: coverage (All / Full / Partial) and category — each category in its own color
- Grid: left column shows variable name, ACS code, category badge, mini progress bar
  Year columns show filled colored squares (available) or dashed grey squares (missing)
- Tooltip on hover: variable name, year status (✓/✗), coverage fraction, missing years as chips
- Footer with institution name and data source

RULES
- Vanilla JS only, no libraries
- Each category gets a distinct color across cells, badges, pills, and progress bars
- Tooltip repositions to stay in viewport
- Filters show empty state when nothing matches
- Fill-rate bar animates on load

DATA [Paste JSON here]