53033 FipsDecoder

How to Zero-Pad FIPS Codes in Python, R, and SQL

FIPS codes must always be treated as fixed-length, zero-padded strings — never as integers. This is the most common source of "why is my join producing zero results?" bugs when working with federal data.

Why This Matters

Nine US states have FIPS codes that start with zero: Alabama (01), Alaska (02), Arizona (04), Arkansas (05), California (06), Colorado (08), Connecticut (09). If you store or import state FIPS codes as integers, these become 1, 2, 4, 5, 6, 8, 9 — which are wrong values that will silently fail to join with any other dataset that stores the correct zero-padded form.

At the county level, the problem is even more widespread. Hundreds of counties have 5-digit FIPS codes beginning with zero (e.g., 01001 for Autauga County, AL). Cast to integer, that becomes 1001 — a completely different code.

Silent failure: This bug often produces a result — just the wrong one. If your dataset happens to also have an integer FIPS, the join succeeds but maps to a different geographic entity. Always validate your join counts against expected county or state totals.

Python

import pandas as pd

df = pd.read_csv('data.csv', dtype=str)  # ← always read FIPS as string

# If FIPS was already read as integer (e.g., from Excel):
df['county_fips'] = df['county_fips'].astype(str).str.zfill(5)
df['state_fips']  = df['state_fips'].astype(str).str.zfill(2)

# Splitting a 5-digit county FIPS into components:
df['state_part']  = df['county_fips'].str[:2]
df['county_part'] = df['county_fips'].str[2:]

# Building a 5-digit FIPS from separate state and county columns:
df['fips'] = df['state_fips'].str.zfill(2) + df['county_fips'].str.zfill(3)
# Common mistake — never do this:
df['county_fips'] = df['county_fips'].astype(int)  # ✗ drops leading zeros
df[df['county_fips'] == 1001]  # ✗ should be '01001'

# Correct comparison:
df[df['county_fips'] == '01001']  # ✓

R

library(tidyverse)

# Read CSV — ensure FIPS is character, not numeric
df <- read_csv("data.csv", col_types = cols(county_fips = col_character()))

# Zero-pad after reading
df <- df |>
  mutate(
    county_fips = str_pad(county_fips, width = 5, pad = "0"),
    state_fips  = str_pad(state_fips,  width = 2, pad = "0")
  )

# Build 5-digit from components
df <- df |>
  mutate(fips = paste0(
    str_pad(state_fips, 2, pad = "0"),
    str_pad(county_part, 3, pad = "0")
  ))

# With tidycensus — it handles padding for you:
library(tidycensus)
get_acs(geography = "county", state = "01", county = "001", ...)

SQL (PostgreSQL)

-- Ensure zero-padding when storing
INSERT INTO counties (fips_code, ...)
VALUES (LPAD('1001', 5, '0'), ...);  -- → '01001'

-- Repair a column that was stored as integer
UPDATE counties
SET fips_code = LPAD(fips_code::text, 5, '0')
WHERE LENGTH(fips_code) < 5;

-- Joining two tables where one has integer FIPS
SELECT a.*, b.*
FROM federal_data a
JOIN my_counties b
  ON LPAD(a.fips::text, 5, '0') = b.county_fips;

-- Cast on the way in (never store as integer)
ALTER TABLE my_table
  ALTER COLUMN county_fips TYPE char(5)
  USING LPAD(county_fips::text, 5, '0');

Excel / CSV Imports

Excel is the most common source of this bug. When you open a CSV containing FIPS codes, Excel silently converts the column to numbers, stripping leading zeros. Solutions:

  • Import via Data → From Text/CSV and set the column type to "Text"
  • Prefix each FIPS value with a single quote in the CSV: '01001
  • Use a helper column: =TEXT(A1,"00000") to repad a numeric FIPS
  • Better: don't use Excel for FIPS-based analysis — use Python or R

Quick Reference: FIPS Code Lengths

GeographyLengthExamplePad to
State FIPS206str.zfill(2)
County portion only3037str.zfill(3)
Full county FIPS506037str.zfill(5)
Census Tract GEOID1106037137000str.zfill(11)