From Audit Anxiety to SOC 2 Type II: A HealthTech Success Story

The email arrived on a Monday morning: a large hospital network was interested in piloting the client's patient engagement platform — contingent on a SOC 2 Type II report. The startup had 12 engineers, a solid product, and exactly zero formal security controls. Their CTO called us that afternoon.

Fourteen weeks later, they had their Type II report in hand. The hospital deal closed. Here's an honest account of every obstacle we hit and how we worked through it, because the path from zero to SOC 2 is rarely as clean as the auditors make it sound.

The starting point

Before we could plan, we needed to understand what we were working with. The gap assessment revealed a familiar picture for a fast-moving startup:

  • No SSO — every engineer had individual credentials to every service, rotated inconsistently
  • Shared AWS root account credentials stored in a team Notion doc
  • Kubernetes with no RBAC beyond default settings — every deployment ran as root
  • No centralized logging — logs lived on individual pods and disappeared on restart
  • Secrets in environment variables, hardcoded in Dockerfiles, or scattered across a shared .env file in Slack
  • No incident response plan, no on-call rotation, no postmortem process
  • Vendor access never revoked for two former contractors

None of this was negligence. It was the natural state of a team that had been moving fast and building product. SOC 2 simply requires you to stop and formalize what you've been doing informally — and fill in the gaps you've been ignoring.

The hardest implementation challenges

Challenge 1: SSO rollout without breaking production

Enforcing SSO with MFA across every service sounds straightforward until you realize how many things will break. We chose Okta as the identity provider and worked service by service:

  • AWS IAM Identity Center for console and CLI access
  • Okta SAML integration for GitHub, DataDog, and PagerDuty
  • Kubernetes OIDC authentication via Dex
  • Database access through a bastion with Okta MFA (Teleport)

The critical move was doing this in parallel, keeping existing credentials active for two weeks while engineers migrated, rather than forcing a hard cutover, which would have been chaotic. Parallel operation gave us time to find the service accounts, CI/CD integrations, and scripts quietly running on personal credentials that nobody remembered issuing.
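One trick that helped during the parallel window: scanning for long-lived IAM access keys, which tend to mark the scripts and integrations still running on personal credentials. A minimal sketch, with hypothetical key IDs and a cutoff we chose ourselves; real metadata would come from boto3's iam.list_access_keys:

```python
from datetime import datetime, timedelta, timezone

def flag_stale_keys(access_keys, cutoff_days=90, now=None):
    """Return IDs of active access keys older than the cutoff --
    likely candidates for credentials a script quietly depends on."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=cutoff_days)
    return [
        k["AccessKeyId"]
        for k in access_keys
        if k["Status"] == "Active" and k["CreateDate"] < cutoff
    ]

# Example metadata in the shape boto3's iam.list_access_keys returns
keys = [
    {"AccessKeyId": "AKIAOLDKEY", "Status": "Active",
     "CreateDate": datetime(2023, 1, 5, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIANEWKEY", "Status": "Active",
     "CreateDate": datetime.now(timezone.utc)},
]
print(flag_stale_keys(keys))  # the old key is flagged for investigation
```

Anything this flags either gets migrated to a role behind SSO or documented as a known service account.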

Challenge 2: Secrets sprawl

This was the most time-consuming part. The team had secrets in seven different places. We spent three days just inventorying them before we could start migrating. The approach:

  1. Audit every secret location (Notion, Slack, Dockerfiles, .env files, CI variables, Kubernetes manifests)
  2. Migrate everything into AWS Secrets Manager as the single source of truth
  3. Update application code to fetch secrets at runtime via the AWS SDK
  4. Rotate every secret that had been exposed — a non-trivial coordination exercise
  5. Add Gitleaks to the CI pipeline to catch any future accidental commits
  6. Enforce Kubernetes Sealed Secrets for any config that had to live in Git
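Step 3 in practice: a small runtime accessor that reads from Secrets Manager instead of the environment. This is a hedged sketch rather than the client's actual code; the secret name is hypothetical, and the client is injectable so the function can be exercised without AWS credentials:

```python
import json
from functools import lru_cache

@lru_cache(maxsize=None)
def get_secret(name, client=None):
    """Fetch and cache a JSON secret from AWS Secrets Manager."""
    if client is None:
        import boto3  # AWS SDK for Python; real calls need credentials
        client = boto3.client("secretsmanager")
    resp = client.get_secret_value(SecretId=name)
    return json.loads(resp["SecretString"])

# Stand-in for boto3's client, so the sketch runs without AWS access
class FakeSecretsClient:
    def get_secret_value(self, SecretId):
        return {"SecretString": json.dumps({"user": "app", "password": "s3cr3t"})}

creds = get_secret("prod/db-credentials", client=FakeSecretsClient())
print(creds["user"])
```

Caching matters here: without it, every request would add a Secrets Manager round trip to the hot path.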

We also set up AWS CloudTrail with alerts for any access to production secrets from outside the CI/CD pipeline or application runtime. That becomes evidence for the auditor — not just that secrets are protected, but that you know when they're accessed.
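The alerting logic can be sketched as an EventBridge pattern plus a filter on the calling identity. The role-name hints below are hypothetical placeholders for the client's CI/CD and application roles:

```python
# EventBridge pattern matching CloudTrail records for secret reads
SECRET_ACCESS_PATTERN = {
    "source": ["aws.secretsmanager"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["secretsmanager.amazonaws.com"],
        "eventName": ["GetSecretValue"],
    },
}

# Hypothetical role names for the paths allowed to read secrets
ALLOWED_ROLE_HINTS = ("ci-cd-role", "app-runtime-role")

def is_suspicious(event):
    """True if a GetSecretValue call came from outside CI/CD or the app."""
    arn = event.get("detail", {}).get("userIdentity", {}).get("arn", "")
    return not any(hint in arn for hint in ALLOWED_ROLE_HINTS)

human = {"detail": {"userIdentity": {
    "arn": "arn:aws:sts::123456789012:assumed-role/engineer-sso/alice"}}}
print(is_suspicious(human))  # True: page the on-call
```

In practice the pattern would be attached to an EventBridge rule whose target notifies the on-call channel.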

Challenge 3: Kubernetes RBAC without disrupting the team

Going from "everyone has cluster-admin" to least-privilege RBAC is as much a political challenge as a technical one. Engineers who had always been able to kubectl exec into any pod suddenly couldn't. The solution was to implement RBAC in three phases:

  • Phase 1 (audit mode): Deploy audit logging to see what everyone was actually doing, without restricting anything. Two weeks of data.
  • Phase 2 (namespaced permissions): Restrict access by namespace — each team only had full access to their own namespace. Production was read-only for most engineers.
  • Phase 3 (production lockdown): Production access via just-in-time approval through Teleport, logged and time-limited. Emergency access still possible, but fully audited.

By the time we hit Phase 3, the team had internalized the pattern. Nobody complained about the production lockdown because they'd seen two weeks of audit logs showing how rarely they actually needed it.
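The Phase 2 restriction boils down to namespaced, read-only access in production. A sketch of the Role such a setup might use, expressed as the dict a Kubernetes client or YAML serializer would emit; the names and exact resource list are illustrative, not the client's actual manifest:

```python
def read_only_role(namespace="production"):
    """A namespaced Role granting read-only access: get/list/watch,
    with no exec, no delete, and no create."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": "prod-read-only", "namespace": namespace},
        "rules": [{
            "apiGroups": ["", "apps"],
            "resources": ["pods", "pods/log", "deployments", "services"],
            "verbs": ["get", "list", "watch"],
        }],
    }

role = read_only_role()
print(role["rules"][0]["verbs"])  # no mutating verbs in production
```

A RoleBinding then maps each engineer's Okta-backed OIDC group to this Role, so access reviews reduce to reviewing group membership.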

Challenge 4: Logging — what to collect and for how long

SOC 2 requires logs that demonstrate your controls are operating effectively. That's not the same as application logs. We needed three distinct log streams:

  • Application logs: Already existed, but unstructured and ephemeral. We standardized on JSON structured logging and shipped everything to Loki.
  • Audit logs: Who accessed what, when, from where. CloudTrail for AWS, Kubernetes audit logs, Teleport session recordings for SSH/kubectl sessions.
  • Security event logs: Failed authentications, privilege escalations, unusual API patterns. These fed into DataDog Security Monitoring with alert rules.
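Of the three streams, the application-log standardization is the easiest to show. A minimal JSON formatter on Python's stdlib logging; the field names are our own convention, not a Loki requirement:

```python
import json
import logging
import sys

class JsonFormatter(logging.Formatter):
    """Emit one JSON object per line so Loki/DataDog can index fields."""
    def format(self, record):
        entry = {
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "msg": record.getMessage(),
        }
        entry.update(getattr(record, "ctx", {}))  # structured context
        return json.dumps(entry)

handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("api")
log.addHandler(handler)
log.setLevel(logging.INFO)

# extra={"ctx": ...} attaches searchable fields to the record
log.info("login failed", extra={"ctx": {"user_id": "u_123", "ip": "203.0.113.7"}})
```

One JSON object per line means the shipper needs no multi-line parsing, and every field becomes queryable downstream.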

Retention was 1 year for all audit logs, 90 days for application logs. The critical decision was storing audit logs in a separate S3 bucket with write-once (Object Lock) enabled — auditors want to see that log integrity cannot be tampered with, even by your own team.
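In boto3 terms, that bucket decision looks roughly like the following request parameters. The one-year retention matches the policy above, but treat the bucket name and exact call shapes as a sketch, not the client's actual configuration:

```python
# Object Lock generally must be enabled when the bucket is created;
# the bucket name is a hypothetical placeholder.
create_params = {
    "Bucket": "acme-audit-logs",
    "ObjectLockEnabledForBucket": True,
}

# Default retention: every new object is write-once for one year.
# COMPLIANCE mode means nobody, including the root account, can
# shorten the retention or delete the object version early.
lock_config = {
    "ObjectLockEnabled": "Enabled",
    "Rule": {"DefaultRetention": {"Mode": "COMPLIANCE", "Days": 365}},
}

# With boto3: s3.create_bucket(**create_params), then
# s3.put_object_lock_configuration(Bucket="acme-audit-logs",
#                                  ObjectLockConfiguration=lock_config)
print(lock_config["Rule"]["DefaultRetention"])
```

COMPLIANCE mode rather than GOVERNANCE is what answers the auditor's tamper-resistance question.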

Challenge 5: Writing policies people would actually follow

Every SOC 2 audit requires written policies — access control, incident response, vendor management, change management, and more. Most companies either write policies nobody reads or copy templates that don't match how they actually operate.

We took a different approach: we wrote policies by reverse-engineering what the team was already doing well, then adding what was missing. If your engineers already require PR reviews before merging to main, write that down and enforce it. If your incident response "process" is a Slack channel called #incidents, formalize it into a procedure with severity levels and response SLAs — but keep it close enough to how the team naturally works that it doesn't become shelf-ware.

The auditor will interview your engineers. If your policies don't match what your engineers actually do, that gap surfaces as exceptions in the report. Authentic documentation beats polished fiction every time.

The audit process

We engaged the auditor at week 6 — before the observation period started — to align on scope and control mapping. This is the move most teams skip, and it costs them months. The auditor told us which controls they'd be testing and what evidence they'd need. We spent the remaining weeks generating that evidence systematically.

The Type II observation period was 3 months. During that time, the team operated under the new controls, and we built a continuous evidence-collection pipeline:

  • Weekly automated exports of access reviews (IAM, Kubernetes RBAC)
  • Monthly reports from vulnerability scanners (Trivy, Prowler for AWS)
  • Incident log exports (even in a period with zero incidents, the auditor wants to see the process was running)
  • Change management records — every infrastructure change as a tagged Git commit
  • Vendor access reviews — a quarterly Notion page showing every third-party with production access
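The weekly exports shared a common last mile: turning raw records into a dated artifact an auditor can sample. A sketch of that step, with hypothetical field names and record shape; the real inputs came from the IAM and RBAC exports above:

```python
import csv
import io
from datetime import date

def access_review_csv(records, review_date=None):
    """Render access records as a dated CSV for the evidence folder."""
    review_date = review_date or date.today()
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["user", "system", "role", "last_used"])
    writer.writeheader()
    # Stable ordering makes week-over-week diffs meaningful
    for rec in sorted(records, key=lambda r: (r["system"], r["user"])):
        writer.writerow(rec)
    return f"access-review-{review_date.isoformat()}.csv", buf.getvalue()

name, body = access_review_csv(
    [{"user": "alice", "system": "aws", "role": "admin", "last_used": "2024-05-01"}],
    review_date=date(2024, 5, 6),
)
print(name)
```

Run on a schedule and dropped into the evidence bucket, this turns "pull three months of access reviews" from a fire drill into a directory listing.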

The outcome

A Type II report with zero exceptions. The auditor's report noted "mature controls for an organization of this size", which, given where they'd started 14 weeks earlier, was remarkable.

The tangible business results went beyond the hospital contract:

  • Two enterprise deals accelerated — security questionnaires that previously took 3 weeks of back-and-forth became a 2-page attachment
  • Cyber insurance premium reduced by 34% — the insurer's risk model reflected the new controls
  • Engineering velocity increased — counterintuitively, enforced processes eliminated the informal coordination overhead. The RBAC rollout reduced "can you give me access to X" Slack messages by 80%.
  • One near-miss caught — during the observation period, the audit logging system flagged a former contractor's credentials still active in a third-party integration. It would never have been found without the new controls.

"I expected SOC 2 to slow us down. Instead, we ship faster, we sleep better, and we just closed our largest contract. The hospital deal alone was worth 10x what we spent on the entire implementation."

What we'd do differently

Every engagement teaches us something. On this one:

  • Start secrets management earlier. Secrets sprawl is always worse than it looks, and the rotation coordination is the most disruptive part. If you're not yet SOC 2 ready, start centralizing secrets today — you'll need to do it anyway.
  • Loop in the auditor at week 4, not week 6. Two more weeks of alignment would have saved a late-stage control mapping revision.
  • Automate evidence collection from day one. Manually pulling evidence for an observation period is painful. Scripts that generate weekly compliance reports pay back their build time within the first month.

SOC 2 is achievable for any engineering team. The infrastructure work is known and repeatable — we've now run this playbook across FinTech, HealthTech, and SaaS companies at every stage. The differentiator isn't the tools; it's knowing which order to tackle things, where the hidden complexity lives, and how to keep engineers productive while the controls go in around them.

Starting your SOC 2 journey?

We'll run a free gap assessment against the Trust Service Criteria and give you a week-by-week implementation plan — no obligation.

Book Free Assessment