Technical Leadership in a 2025 AI/EdTech Startup: Leveraging Short-Cycle R&D
I claim that technical leadership in the era of generative AI implies a fundamental shift: away from managing large groups, and toward harnessing short-cycle R&D for immediate leverage. As a result, at Stemuli I have found that popular systems of software engineering management are actively unhelpful for meeting the challenges of AI-native teams. What follows is an outline of my responses to these challenges, with particular emphasis on lightweight systems of engineering management that enable rapid execution and R&D efforts.
In no particular order, I’ll outline some constituent components of that approach.
Organizational Design
In a 2018 graduate seminar run by Chip Heath at Stanford, we were invited to write on a whiteboard: "How can you mitigate the organizational drawbacks of scaling?" I made myself unpopular in the room with the somewhat antagonistic phrase: "automate, don't hire".
Fast-forward to today: this doesn't mean we never hire for AI at Stemuli (we take it very seriously, and we get great people!), but "let's hire someone" as an answer to every new challenge is a recipe for organizational complexity.
The first organizational design priority of a technical leader in an applied-AI org must be to resist the instinct to treat headcount as the input that increases productive output. Headcount and output are more loosely coupled than ever before. Shift executive focus onto productionizing R&D efforts, prototyping, automating, and scaling workflows, and bringing in little AI homunculi ('agents'). It isn't about minimizing cost; it is about minimizing complexity.
Put differently, you can screw a lot of stuff up (and then correct your mistakes) with poor organizational design and reporting structures, if you have kept headcount low and IC quality high. This is the 20% insight that delivers 80% for organizational design at an applied AI startup.
Day-to-day micro-level management (tickets)
Legacy engineering management systems (Agile, Schmagile) involve excessive overhead for effectively balancing execution with R&D projects, so I developed a simple system for our AI team. The elements retained from traditional approaches are fairly minimal: a simplified Kanban board with basic statuses, one board per team. This may incur switching costs for managers and PMs. Too bad (and of course, at low headcount we can get away with that more easily anyway). It protects IC focus, enables ‘larger’, more substantive tickets, and provides an async view of everything a team is doing, in one place. Our tickets are stripped of what Tim Ferriss would call "productivity fidgets": no automations, labels, color coding, zodiac signs, ...
Within tickets, I embrace simplicity through markdown or other nearly-plain/just-barely-rich text systems. While it may seem inelegant to keep extensive information in text fields inside tickets, big text blobs prove remarkably resilient, surviving system migrations, upgrades, and organizational changes. We had to migrate ours from ClickUp to Jira as part of a whole-org decision, and it survived ~95% intact (Jira is admittedly much worse for updating long text sections; I recommend picking another product if you can). Using text fields like this is also highly compatible with LLM processing for documentation and knowledge management. A bonus: you can paste a ticket (with the log, as described below) into your chat and get a blog post or a piece of high-level documentation out. This helps rapidly onboard new joiners and update existing team members.
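To make the "paste it into your chat" step concrete, here is a minimal sketch of the kind of prompt one might assemble around a ticket's raw text field. The function name, prompt wording, and ticket contents are all invented for illustration; nothing here is specific tooling we use.

```python
def ticket_to_doc_prompt(ticket_text: str, audience: str = "new joiners") -> str:
    """Wrap a ticket's raw text field (description, status, log, scoping)
    in an instruction suitable for an LLM chat window. All wording is
    illustrative, not a prescribed prompt."""
    return (
        f"Below is an engineering ticket, including its running log. "
        f"Write a short piece of high-level documentation for {audience}, "
        f"covering what was built, why, and what was learned.\n\n---\n"
        f"{ticket_text}"
    )

prompt = ticket_to_doc_prompt("Description: ...\nLog:\n2025-05-12: tried X ...")
```

Because the ticket is one plain-text blob, this kind of plumbing is trivial: no export API or schema mapping is needed, just the text itself.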
So what’s in the ticket? A description of how the project or task came to be; a short list of deliverables, a ‘current status’ block, a log, and a scoping section. All in one text field.
The first two of these need no further elaboration.
The log is a stream-of-consciousness record, updated at least once daily by ICs. It serves as a companion to their git activity and documentation: what they did or tried, why, and how it turned out. There is no need for paragraphs, or really any structure, within a given brief log entry. So long as the keywords are appropriate, entries are searchable. They can also be pasted straight into an LLM chat window for synthesis. To quote one of my engineers, the log also makes it "really obvious if someone isn't getting anything done."
The scoping exercise draws from a mix of pop and evidence-based paradigms in psychology. For example, there is an exercise known as the "Disney pattern", after Walt Disney. The user cycles through adopting the posture of the 'dreamer', the 'critic', and the 'realist' in order to assess a creative idea. My adaptation of this process for engineering scoping also tracks evidence-based paradigms in organizational psychology, such as the distinction between promotion and prevention mindsets (itself related to prospect theory/loss-aversion tendencies in human reasoning), and the 'premortem' exercise beloved of Huggy Rao.
For each ticket, ICs must briefly outline three scenarios: an "optimistic" timeline broken into stages, a "realistic" timeline with appropriate padding and side-notes, and finally a mildly pessimistic listing of anticipated sources of delay. As they then work on the ticket, they keep the ‘current status’ block updated according to which stage of their scoping they are in.
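Put together, a ticket's single text field might look something like the following. This is a hypothetical sketch; the project, dates, and contents are invented for illustration.

```
Ticket: Migrate embedding pipeline to batch inference

Background: Came out of Q2 cost review; online inference bills growing.

Deliverables:
- Batch job producing nightly embeddings
- Fallback path for cold-start items

Current status: Realistic track, stage 2 of 3.

Log:
2025-05-12: tried provider batch API, hit quota limits, opened support ticket.
2025-05-13: switched to chunked submission; throughput acceptable.

Scoping:
Optimistic (5 days): stage 1 prototype / stage 2 integration / stage 3 rollout.
Realistic (8 days): padding for quota approval and review latency.
Pessimistic: delays likely from upstream API quotas, infra access, reviews.
```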
The scoping exercise takes no more than 20-30 minutes, but provides rich observability benefits: we know what's been done, where an IC is within the stages of the ticket, and whether they're ahead (in line with the optimistic projection), on schedule (in line with the realistic one), or behind. Their team lead and upper managers can then attempt to mitigate anticipated delays (provide information or resources, connect to stakeholders) before these cause actual delays.
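The ahead/on/behind read-off can even be mechanized. The sketch below assumes the scoping deadlines have been transcribed into per-stage dates; the field names, dates, and function are invented for illustration, not tooling the team necessarily uses.

```python
from datetime import date

# Hypothetical scoping data: cumulative deadline per completed stage,
# for the optimistic and realistic scenarios from the ticket.
scoping = {
    "optimistic": {1: date(2025, 5, 14), 2: date(2025, 5, 16), 3: date(2025, 5, 19)},
    "realistic":  {1: date(2025, 5, 16), 2: date(2025, 5, 20), 3: date(2025, 5, 23)},
}

def schedule_status(stage_done: int, today: date, scoping: dict) -> str:
    """Classify progress by comparing today's date against the scenario
    deadlines for the last completed stage."""
    if today <= scoping["optimistic"][stage_done]:
        return "ahead"        # tracking the optimistic projection
    if today <= scoping["realistic"][stage_done]:
        return "on schedule"  # within the padded, realistic projection
    return "behind"           # past even the realistic deadline

print(schedule_status(2, date(2025, 5, 15), scoping))  # -> ahead
print(schedule_status(2, date(2025, 5, 19), scoping))  # -> on schedule
print(schedule_status(1, date(2025, 5, 21), scoping))  # -> behind
```

In practice a glance at the 'current status' block gives the same answer; the point is that the scoping section carries enough structure that the check is unambiguous.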
Week-to-week management (IC scheduling and focus work)
There's a famous Paul Graham essay about the radically differing blocks of time needed to 'get things done' in a people manager role as opposed to the role of an individual knowledge worker. Any rational inspection of the treatment of ICs in technology companies should lead one to extreme alarm. The plural of anecdote is surely data, and I could relate many anecdotes concerning ICs having their working week shredded or filled with coordination tasks and meetings.
So here's my radical notion: we just don't do it. ICs spend less than three hours per week in formally scheduled meetings, with no more than three working half-days impacted. That leaves them with a minimum of 7/10 half-days in the M-F week to actually do their work. Team leads need to be aggressive in defending their ICs from "hop on a call" type requests, or having them pulled into committees or taskforces that might create further blocks of committed sitting-in-meetings time.
Leveraging saved time for R&D
An old joke first told to me by my father: an American executive is boasting about his powerful automobile, the speeds it can achieve, the acceleration it can deliver. He is asked "what do you do with the time you save?", a question that shuts him up.
Of course, I claim that minimal headcount, the above ticket system, guarding IC time, and the rest are just performance-enhancing in their own right (they also help attract and retain excellent people). But what do we do with the time we save from unproductive organizing rituals and fidgets?
The answer lies in R&D. We use the slack that we gain to steadily make short-cycle R&D bets, with a clear process for leveraging those that pay off. We bet on emerging (not theoretical) technologies, rapidly making self-contained prototypes. We attempt to anticipate whole-organization priorities when dreaming these up. We remain ruthless about avoiding 'shiny object syndrome' in this process: a high-risk bet is fine, but we are industrial researchers, so the payoff has to be commensurate. Blue-sky research would need to be resourced separately; it’s not in the scope of our day-to-day.
A note on scale
We prioritize high-reliability systems scoped for expected scale.
All startups aspire to serve bajillions of users one day. Few are well-served by building for real scale from day 1. Much better to go through evolutions: the system that is good enough for O(1000s) of users, replaced (along with any other architecture changes and features you need by then) by the system that handles O(100ks), and so on. You learn things and anticipate events for the next gear-shift, and can incorporate them 'next time'.
Feedback from the troops
Hearteningly, my ICs regularly comment on how little distraction and bullshit they endure in my teams, especially compared to their prior work experiences.
Perhaps relatedly, typical onboarding 'time-to-useful-contributor' is around 1 to 1.5 days, although this will probably lengthen a little as our codebase gets larger and our systems scale.
In Conclusion
As Taylor Shead (CEO, Stemuli) says, "be obsessed, be quick, learners desperately need us". I hope that the considerations I have shared above are helpful for others navigating the challenges of Applied AI technical leadership in 2025.