About LearnWise AI
We’re an AI-first startup helping universities and colleges support students and faculty with smart, knowledge-driven tools. Small team, high ownership, real user impact. LearnWise is a place where the best idea wins—no matter who says it—and where innovation and growth are in our DNA.
You’ll work closely with our CTO, Head of AI, and our DevOps/infra owners.
Why this role exists
We’re scaling quickly, and some of our biggest “make or break” constraints are now AWS and Atlas spend, latency, and how our data layer behaves under load.
MongoDB Atlas is a core system for us today, and we’re also investing more deeply in AWS optimization, including evaluating architecture changes such as moving vector workloads to AWS. We need an engineer who can own performance and cost end-to-end, and who enjoys deep dives into finding bottlenecks, fixing them properly, and making sure they don’t come back. This also includes optimizing the application side, and occasionally picking up back-end development tasks when things are quiet on the platform front.
Important: this is not a generic DevOps role and not a pure DBA role. It’s a platform/data optimization role focused on reducing costs and improving latency, with occasional back-end work.
What you’ll do
- Own everything related to MongoDB and Atlas, including latency and costs
- Contribute to AWS optimization
- Partner with DevOps to identify and implement high-impact improvements: right-sizing, storage/network tuning, monitoring, and ongoing cost/performance reviews.
- Reduce AWS spend without degrading reliability or developer velocity.
- Benchmark and evaluate architecture changes (including a potential move of vector workloads to S3 Vectors) and drive rollout plans based on evidence.
- Audit and improve schemas, indexes, query patterns, aggregation pipelines, and connection usage.
- Reduce p95/p99 latency and eliminate common scaling pitfalls (index bloat, write amplification, inefficient query shapes, hot partitions).
- Establish a practical capacity/scaling approach as usage grows (including when/if sharding is warranted).
- Build dashboards and alerts that make regressions obvious (cost, latency, saturation, slow queries, error rates).
- Create lightweight guardrails so performance doesn’t rely on heroics.
- Startup-style ad-hoc work
- Jump in on performance investigations, production incidents, or architecture reviews when needed.
- Help unblock teams by turning fuzzy performance problems into clear plans and shipped fixes.
Non-negotiable requirements