Data Engineering Runs on Judgment Calls
The actual work of data engineering, at least in my experience, is judgment calls. You make hundreds of them on a single project and each one changes what the numbers look like downstream. The SQL is fine. Joins are joins. You learn a new dialect, figure out the optimizer quirks, move on. That part takes care of itself.
I worked on two projects over the last couple years that both taught me this. Different domains, different stakeholders, same core problem: someone hands you data and a vague goal, and you end up being the person whose decisions shape what everyone else sees.
“You’ll figure it out”
First one was a research analytics dashboard. Claims database, about 40 million patients, and the ask was to track treatment patterns for a specific condition. No spec. No requirements doc. No list of what drugs to track. Connection string and a goal.
First question: who even counts as a patient? Diagnosis codes are messy. Does one occurrence qualify? Two? Does it matter if it came from a hospital visit vs. outpatient? The research team had opinions but hadn’t actually defined the criteria, because no one had needed to until I started building.
So I made the call. Two or more diagnosis codes in a rolling 12-month window, or one code plus a condition-specific prescription fill. Ran the numbers, checked them against published prevalence rates for the population, and they lined up.
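The rule is simple enough to sketch in a few lines. This is a minimal illustration with made-up records and field names, not the production query; the `"dx"`/`"rx"` event shape is an assumption for the example:

```python
from datetime import date

# Hypothetical records: (patient_id, event_date, kind), where kind is
# "dx" for a diagnosis code or "rx" for a condition-specific fill.
events = [
    ("p1", date(2023, 1, 10), "dx"),
    ("p1", date(2023, 8, 2), "dx"),   # second dx within 12 months
    ("p2", date(2023, 3, 5), "dx"),
    ("p2", date(2023, 4, 1), "rx"),   # one dx plus a fill
    ("p3", date(2021, 6, 1), "dx"),
    ("p3", date(2023, 6, 1), "dx"),   # two dx, but 24 months apart
]

def qualifies(patient_events, window_days=365):
    """Two or more diagnosis codes within a rolling 12-month window,
    or one diagnosis code plus a condition-specific prescription fill."""
    dx_dates = sorted(d for d, k in patient_events if k == "dx")
    has_rx = any(k == "rx" for _, k in patient_events)
    if dx_dates and has_rx:
        return True
    # Any two consecutive diagnosis dates within the window qualify.
    return any(
        (b - a).days <= window_days
        for a, b in zip(dx_dates, dx_dates[1:])
    )

by_patient = {}
for pid, d, k in events:
    by_patient.setdefault(pid, []).append((d, k))

cohort = sorted(p for p, ev in by_patient.items() if qualifies(ev))
print(cohort)  # ['p1', 'p2'] — p3's two codes are too far apart
```

The payoff of writing it this way is that the criteria are explicit and checkable, which is exactly what you need when someone asks why a patient is or isn't in the cohort.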
“Defensible” is the whole game when there’s no spec. The goal is to be the person who can explain their reasoning when someone asks.
Building a drug list from scratch
Whole project was about tracking what drugs patients take and in what order. I had to build the drug list myself.
I started from what I knew about domestic prescribing, then mapped everything to a foreign market where brand names are different, some drugs aren’t approved, and there were local staples I’d never heard of. Ended up with about 30 molecules across acute and preventative categories.
That split became its own problem. Same molecule can be acute or preventative depending on how it’s prescribed. A triptan taken as-needed? Acute. A beta-blocker taken daily? Preventative. The data had a flag for this. I was excited about it for maybe 15 minutes before I looked at the actual values and saw it was populated about 60% of the time. When it was there, it was sometimes just wrong.
So I built frequency-based rules. Patient filling a script more than once every 30 days consistently? Probably preventative. Fills sporadic and clustered? Probably acute. Anything ambiguous defaulted to whatever the clinical literature says that drug is most commonly used for. This also meant I was reading clinical papers at 11pm trying to understand why doctors prescribe certain drugs in a certain order, because you can’t build classification rules for something you don’t understand.
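The classification logic boils down to gaps between fill dates. Here's a simplified sketch; the 30-day threshold matches the rule above, but the minimum-fill count and the fall-through to a per-drug default are illustrative assumptions:

```python
from datetime import date, timedelta

# Hypothetical fill histories for one patient, one drug.
daily_fills = [date(2023, 1, 1) + timedelta(days=28 * i) for i in range(6)]
sporadic_fills = [date(2023, 1, 5), date(2023, 1, 20), date(2023, 6, 2)]

def classify(fill_dates, default="acute", max_gap=30, min_fills=3):
    """Consistent refills (every gap <= max_gap days) -> preventative;
    sporadic or clustered fills -> acute; too little data -> fall back
    to the drug's most common use per the clinical literature."""
    fills = sorted(fill_dates)
    if len(fills) < min_fills:
        return default
    gaps = [(b - a).days for a, b in zip(fills, fills[1:])]
    if all(g <= max_gap for g in gaps):
        return "preventative"
    return "acute"

print(classify(daily_fills))     # preventative
print(classify(sporadic_fills))  # acute
```

The `default` parameter is where the 11pm literature reading ends up encoded: one hardcoded value per molecule.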
Every decision changes the numbers
Patient takes Drug A for six months, switches to Drug B for three months, goes back to Drug A. Is that three lines of therapy or two?
I went with two. Going back to something you already tried isn’t a new strategy, and collapsing re-initiations made the survival analysis cleaner.
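In code terms the rule is just "distinct drugs, in order of first use." A minimal sketch, with placeholder drug names:

```python
def lines_of_therapy(drug_sequence):
    """Count distinct drugs in order of first use. Returning to a drug
    already tried (A -> B -> back to A) is a re-initiation, not a new
    line of therapy."""
    lines = []
    for drug in drug_sequence:
        if drug not in lines:
            lines.append(drug)
    return lines

# Six months of A, three of B, back to A: two lines, not three.
print(lines_of_therapy(["A", "A", "B", "A"]))  # ['A', 'B']
```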
Then you need to define discontinuation. If there’s a 45-day gap between prescription fills, did they stop? 60 days? 90? Claims data gives you fill dates but not instructions. You don’t know if the doctor said “take this for two months” or if the patient just stopped showing up.
I set it at 60 days with a grace period proportional to days supply. Each one of these is a judgment call. Each one moves the numbers. There’s no answer key.
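The gap rule can be sketched like this. The 60-day threshold is the one described above; the grace-period factor, field layout, and the choice to date the discontinuation at supply run-out are illustrative assumptions:

```python
from datetime import date, timedelta

# Hypothetical fills: (fill_date, days_supply)
fills = [
    (date(2023, 1, 1), 30),
    (date(2023, 2, 1), 30),
    (date(2023, 6, 15), 30),  # long gap after the February fill
]

def discontinuation_date(fills, gap_days=60, grace_factor=0.5):
    """Flag a discontinuation when the gap between one fill's supply
    running out and the next fill exceeds gap_days plus a grace period
    proportional to days supply."""
    fills = sorted(fills)
    for (d1, supply), (d2, _) in zip(fills, fills[1:]):
        runout = d1 + timedelta(days=supply)
        allowed = gap_days + grace_factor * supply
        if (d2 - runout).days > allowed:
            return runout  # treated as stopped when the supply ran out
    return None

print(discontinuation_date(fills))  # 2023-03-03
```

With a 30-day supply the allowed gap works out to 75 days; the February fill runs out March 3 and the next fill is 104 days later, so that patient counts as discontinued.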
The dashboard that ran on tribal knowledge
Second project was completely different. Rebuilding a legacy BI dashboard for a pharma sales team, about 200 people using it every week. The old tool had separate views that never talked to each other: sales by one dimension in this view, by another in that one, by geography in a third. Each one made sense on its own.
New dashboard needed everything in a single data model. And the second you do that, every discrepancy that was always there becomes a bug report. “The total for Dr. Smith in one view doesn’t match the other view.” It never did. You just couldn’t see it before because the old tool wouldn’t let you compare.
The business rules behind all of this lived in maybe three people’s heads. Discovery was a loop: build model, run numbers, compare against old dashboard, find discrepancy, ask why, get told “that’s just how it works,” escalate until someone who’s been there 8+ years says “oh, that’s because we do X.” Implement X. Find new discrepancy. Repeat for weeks.
The kind of thing that comes out of that loop: a rep visits a physician and also talks to three other attendees. That’s four interactions for display. For completion percentage? Same visit counts as one. Same event, two different numbers depending on which metric you’re computing. I found this when counts didn’t match between two sections and spent a day and a half tracing it back.
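Once you know the rule, it's two different aggregations over the same records. The record shape here is hypothetical, but the split is the one that cost me a day and a half:

```python
# One visit: a target physician plus any other attendees spoken with.
visits = [
    {"visit_id": 1, "physician": "Dr. Smith", "other_attendees": ["a", "b", "c"]},
    {"visit_id": 2, "physician": "Dr. Smith", "other_attendees": []},
]

# Display metric: every person spoken to is an interaction.
interactions = sum(1 + len(v["other_attendees"]) for v in visits)

# Completion metric: the same visit counts once, attendees or not.
completed_visits = len({v["visit_id"] for v in visits})

print(interactions, completed_visits)  # 5 2
```

Same two events, and the "right" count is 5 or 2 depending on which metric you're computing. That's the tribal knowledge that was never written down anywhere.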
Being the only data person
I was solo on the data layer. Every data question came to me. No peer review on my SQL, no one to sanity-check my join logic. When the numbers are wrong, there’s one person to look at. When they’re right, no one notices.
After the third time a bad refresh went live and generated confused emails from the sales team, I built QC gates. Before any refresh hits production, automated checks run. YTD sales should only go up. Month-over-month deltas should be within historical ranges. Row counts shouldn’t drop more than 5%. If anything fails, the refresh blocks and the dashboard shows yesterday’s data.
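Those gates really are an afternoon of code. A sketch with made-up metric names and numbers; the thresholds are the ones described above:

```python
def qc_gates(prev, new):
    """Run pre-publish checks against the previous refresh and return
    a list of failures. Any failure blocks the refresh, and the
    dashboard keeps serving yesterday's data."""
    failures = []
    if new["row_count"] < prev["row_count"] * 0.95:
        failures.append("row count dropped more than 5%")
    if new["ytd_sales"] < prev["ytd_sales"]:
        failures.append("YTD sales went down")
    mom_delta = new["month_sales"] - prev["month_sales"]
    lo, hi = prev["mom_delta_range"]  # historical min/max deltas
    if not lo <= mom_delta <= hi:
        failures.append("month-over-month delta outside historical range")
    return failures

prev = {"row_count": 1_000_000, "ytd_sales": 5_200_000,
        "month_sales": 400_000, "mom_delta_range": (-50_000, 80_000)}
new = {"row_count": 990_000, "ytd_sales": 5_650_000, "month_sales": 450_000}

failures = qc_gates(prev, new)
print(failures)  # [] — this refresh is clear to publish
```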
The checks are basic. Any data engineer could write them in an afternoon. But the decision to gate production behind them, to say stale data is better than wrong data, was the best call I made on this project.
I think the thing that caught me off guard about both of these is how little of the actual work was technical. You spend years getting good at SQL and Python and stats, and then the job is mostly sitting in ambiguity, making calls, and hoping your reasoning holds up when someone pulls on a thread.