Name:
Platform Strategies 2025: How AI is Transforming Platform Strategy
Description:
Platform Strategies 2025: How AI is Transforming Platform Strategy
Thumbnail URL:
https://cadmoremediastorage.blob.core.windows.net/7791e59c-710b-4239-b1a5-9b957f57420c/thumbnails/7791e59c-710b-4239-b1a5-9b957f57420c.png
Duration:
T00H37M09S
Embed URL:
https://stream.cadmore.media/player/7791e59c-710b-4239-b1a5-9b957f57420c
Content URL:
https://cadmoreoriginalmedia.blob.core.windows.net/7791e59c-710b-4239-b1a5-9b957f57420c/Silverchair_2025_Panels_Thursday_4_ How_AI_Is_Transforming_P.mp4?sv=2019-02-02&sr=c&sig=Zr45o9q0s369x%2BIzYwmC8Qvn4BcLegUGSPUwOfIX5W8%3D&st=2025-10-08T22%3A26%3A26Z&se=2025-10-09T00%3A31%3A26Z&sp=r
Upload Date:
2025-10-08T00:00:00.0000000
Transcript:
Language: EN.
Segment:0 .
STUART LEITCH: All right. So the last time I was here was two years ago, and I gave a presentation about the coming disruption with AI. We're now really in early contact with that disruption. It's still early days, but we're starting to see some important thresholds get crossed. That's part of what I'm going to talk about today, and it's actually going to be a little bit hands-on.
STUART LEITCH: I'm actually going to attempt to do a live demo over hotel Wi-Fi, which is a little bit sketchy. It may not work out, but I think there's a bit of excitement that comes with that. I'm also going to add to the chaos and allow you to ask questions throughout. Probably easiest is to put questions into the app, and Stephanie will basically just get on a mic and let me know.
STUART LEITCH: But I'm really trying to land a few things with you today. Some things that may not be obvious to you about how AI is evolving. And so if you're asking questions to try and understand that, it'll probably help the room get that more. So one of the big questions here is, are we in an AI bubble? And the media narrative is all over the place in this regard.
STUART LEITCH: You've got a lot of people saying, well, it's just-- there's way more money going in than value being extracted. It is just an astronomical amount of money. That's going into these things. You've got a lot of people saying, well, it's hitting a ceiling. It's not getting any smarter. It's not telling jokes better. It's losing its personality.
STUART LEITCH: And there are also players like the hedge fund Bridgewater, the largest hedge fund in the world. And they're saying the market still hasn't priced in the upside. And it's actually really hard to know. It's really a question for economists. But what I will show you today is where we're actually at. So this is-- I'll jump to this side.
STUART LEITCH: So at least in the world of coding, coding is not a unique domain, but it's somewhat specialized in that because coding is fairly complex to write but relatively easy to validate. And I say relatively easy to validate. There's a big asymmetry there. And it actually allows tests to be run to understand how complex a task can AI take on.
STUART LEITCH: And what this graph is showing is that over the last five years, the time horizon of tasks, measured by how long they would take a human being to complete, where AI can get there 80% of the time, has been steadily increasing. And so three years ago, we had GitHub Copilot. And it was really solving for method-level coding.
STUART LEITCH: Let me jump back here. In terms of time horizon, it's a proxy for complexity. And so what you can see here is that the level of complexity that an AI coding assistant can engage with has basically been doubling every seven months for the past five years. And this has really tracked with our anecdotal experience on the AI team at Silverchair, where we've been saying internally, like, boy, it seems like every six months, things get half as hard and it can go twice as deep into things.
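As a rough illustration of what a seven-month doubling time compounds to, here is the arithmetic; the starting horizon of 10 minutes is an assumption for illustration only, not a figure from the talk:

```python
# Illustrative compounding of a capability that doubles every 7 months.
# The 10-minute starting horizon is an assumed number, for illustration.

def horizon_after(months: float, start_minutes: float = 10.0,
                  doubling_months: float = 7.0) -> float:
    """Task time-horizon (in minutes) after `months` of exponential growth."""
    return start_minutes * 2 ** (months / doubling_months)

# Five years at a 7-month doubling time is just over 8.5 doublings:
growth = horizon_after(60) / horizon_after(0)
print(f"{growth:.0f}x growth over 5 years")  # 380x
```

Roughly a 380-fold increase over five years, which is why the curve "goes to the moon" if the exponential holds.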
STUART LEITCH: And we've gone from method level completion to being able to interact with whole classes, to now being able to interact with whole code bases. It's still not fully there in terms of being able to do all the things we want it to do with large enterprise code bases. But it's really pretty crazy that this exponential is continuing to hold. So at least in the realm of software development, I can tell you that this thing is getting really, really strong.
STUART LEITCH: And I'm part of a CTO group in Charlottesville. And in the last meeting, there were like 17 leaders there. Charlottesville's got a lot of tech companies. And we did essentially a poll. We asked the question, at your organization, do you have over 50% of your developers actively using an AI coding tool on a daily basis? Every single one of the leaders said yes. So coding-- software development is an industry that's going to get massively disrupted around this.
STUART LEITCH: But what's interesting is that trend has really held. And this, kind of obviously, goes to the moon if the exponential holds. What's interesting is that starting in late 2024, that rate started to accelerate. So we're seeing an acceleration of that exponential. And it's pretty interesting because of this asymmetry that I talked about with software development. Software development is a little bit unusual compared to other domains in that it's a really constrained language, a finite set of rules.
STUART LEITCH: There's a huge amount of training data out there. And the output is relatively verifiable. So these systems can engage in self-play. And they can essentially learn in an unsupervised way. And that's in part why they're getting so strong here. My belief is that we will see that trickle down into other domains. But software is definitely going to be one of the first dominoes to fall in terms of major industry disruption.
STUART LEITCH: What I want to talk about today is just different ways of looking at what's going on. So we've got the models themselves. So the original one that really burst onto the scene was ChatGPT 3.5. Like really amazing. And you're just talking with the model and you're dealing with its internal knowledge.
STUART LEITCH: And it was just like, oh, my God, it can actually converse. It can almost pass a Turing test. But it was also pretty flaky about a lot of things. It hallucinated like crazy. The models since then, like when I was up here last time, GPT 4 had just dropped. And if you could even get it to work because the API, there was so much load on it. And it was really, really slow.
STUART LEITCH: Well, what's happened since then is that GPT 4 level intelligence is now lightning fast. There's been an over 200 times compression in token cost around that for that same level of intelligence. And like OpenAI released the OSS model, which is superior intelligence to GPT 4, that I can run on this laptop. And it runs really fast. So what's happening is that these models are also just getting a lot more efficient really quickly.
STUART LEITCH: But not only are the models getting stronger. They've also learned how to use tools. So most of you, if you've been using ChatGPT or any of these other tools, are going to have experienced the primary tool, which is search. And that's where it can go off, and it can answer questions beyond the knowledge cut off of the model. But it's not just search that they've learned to use. There's a protocol called model context protocol, which is essentially a standard.
STUART LEITCH: It's a USB-C type concept to be able to connect any arbitrary tool. So anything that's got an API, you can basically create an MCP server for it, to be able to plug it in to something like Claude Desktop or ChatGPT and essentially do relatively arbitrary things. And the third part of this is the scaffolding. And the scaffolding is where it gets really interesting.
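The "USB-C for tools" idea can be sketched without the real MCP SDK. This is a hypothetical, simplified tool registry illustrating the pattern MCP standardizes (tool discovery plus one uniform call interface), not actual protocol code, and `lookup_doi` is an invented example tool:

```python
# Hypothetical sketch of the pattern MCP standardizes: tools advertise a
# name and description, and the model invokes any of them through one
# uniform interface, regardless of what API sits underneath.

from typing import Any, Callable

class ToolRegistry:
    def __init__(self):
        self._tools: dict[str, dict[str, Any]] = {}

    def register(self, name: str, description: str, fn: Callable[..., Any]):
        self._tools[name] = {"description": description, "fn": fn}

    def list_tools(self) -> list[dict[str, str]]:
        # What the model sees when it asks "what can I do here?"
        return [{"name": n, "description": t["description"]}
                for n, t in self._tools.items()]

    def call(self, name: str, **kwargs) -> Any:
        # Uniform invocation, whatever the tool wraps underneath.
        return self._tools[name]["fn"](**kwargs)

registry = ToolRegistry()
registry.register("lookup_doi", "Resolve a DOI to citation metadata",
                  lambda doi: {"doi": doi, "title": "(stub result)"})

print([t["name"] for t in registry.list_tools()])  # ['lookup_doi']
print(registry.call("lookup_doi", doi="10.1000/xyz123")["doi"])
```

The point of the standard is that the model never needs tool-specific plumbing: discovery and invocation look the same for every tool you plug in.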
STUART LEITCH: It's the glue that holds it all together. So if you're using ChatGPT, there's scaffolding underneath that. But the scaffolding is actually getting pretty sophisticated. And this is one of the things that I actually want to spend a bit of time on today, really grounding us in this, because there are pieces of this coming that I don't think most people are aware of, where you're going from a really simple single-loop chatbot experience into something that's actually multi-headed.
STUART LEITCH: And it's got a lot of sophistication going on under the covers. And I'm actually going to assert that the future is really multi-agentic. And we'll spend a bit of time really drilling into why this is. But if you think about having a regular chat session with, say, ChatGPT: you start off, and it's a day-one problem every time.
STUART LEITCH: It starts off from nothing. And it does have some memory components in there now. But you can effectively think of it as a day-one intern that's super eager to please you and do things for you. Unfortunately, with the human feedback, because you've got this whole army of people ranking responses, human beings tend to prefer sycophantic responses over other things.
STUART LEITCH: And so we're seeing some kind of distortions creep in there. But what happens is that if you're in the chat session and you give it access to tools, and it goes and uses those tools to try and find out more about a given topic, it's going to do searches, it's going to pull down articles. And it's going to find some really good stuff, and it's going to find some irrelevant stuff as well. But what's happening is that all of that information is filling up its context window, and it's like it's working memory.
STUART LEITCH: And the deeper you get into a conversation, the more of the model's intelligence is used just attending to all these different things. It's actually remarkable that you can have a context window of 200,000 tokens, up to 1 million tokens, and it can still somewhat make sense of that. But you've probably experienced that the deeper you get into a chat conversation, the more and more of that intelligence is spent just attending to all the stuff that's in its memory.
STUART LEITCH: And so in a kind of a manual workflow, if you always say using ChatGPT and you're wanting to-- you got some sort of complex research assignment, what you'll typically want to do is use an expand-contract methodology, where you go and you're researching something and it goes and pulls a whole lot of things into its context window. And then you try and boil that down into some sort of succinct report out of that.
STUART LEITCH: It's like having an intern, and you send the intern off to the library, and the intern goes and checks out a whole lot of books. What you don't want is the intern coming back and just kind of like dumping all those books on your desk. What you want is you want just the information you need to know relative to what you asked for. So if you're doing this manually, what you want to be doing is kind of breaking up how you're working with this such that you're expanding this out, using up the context window, getting out of it what you need, then bring the output of that into a new session.
STUART LEITCH: Because when the context window is not filled up, the models are way, way, way smarter. The problem is that when they start out, they just don't have that context. But what I'm going to do, and we're going to do a bit of an experiment here over hotel Wi-Fi to actually I'm just going to show you what's currently available. And this is going to be in a coding tool.
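The expand-contract workflow just described can be sketched as plain orchestration logic. All three functions here are hypothetical stand-ins for LLM calls, not a real API:

```python
# Expand-contract: fill a session's context with raw material, boil it
# down, then carry only the digest into a fresh session. LLM calls stubbed.

def research(topic: str) -> list[str]:
    # Stand-in for an expansive search session (fills the context window).
    return [f"raw finding about {topic} #{i}" for i in range(50)]

def summarize(findings: list[str], limit: int = 3) -> str:
    # The "contract" step: distill to just what you need to know.
    return " | ".join(findings[:limit])

def fresh_session(prior_digest: str, question: str) -> str:
    # A new session starts near-empty: only the digest, not the book dump.
    return f"context={len(prior_digest)} chars; answering: {question}"

digest = summarize(research("AI bot traffic on academic journals"))
print(fresh_session(digest, "Will institutions want MCP access?"))
```

The design point: the second session sees a few hundred characters of digest instead of everything the first session pulled down, so it runs with a mostly empty, and therefore smarter, context window.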
STUART LEITCH: So this will probably freak some people out. This is command line stuff. There's no reason why there couldn't be a pretty interface on top of this. It just so happens to be at this point in time, the state of these tools is that they're really focusing on just the raw muscles of the infrastructure underneath this scaffolding.
STUART LEITCH: And we'll see pretty interfaces coming. But this is going to be coming relatively soon to other domains. And this tool, by the way, is actually fundamentally usable beyond just coding solutions. We've got a bunch of non-technical users within Silverchair that are doing some pretty crazy stuff. So I'm going to basically give it a set of instructions. And with LLMs, you want to be really explicit with what you're asking for, and you generally want to give it a lot of context.
STUART LEITCH: So one of the things I do is I talk to it. Dictation's gotten really pretty damn good. So let's go. So I've got a research problem for you. Scholarly publishers are seeing disruption in their traffic patterns on their academic journals. Increasingly large traffic is coming from AI bots representing human users. Research the issues around this, including the likelihood that institutions will want MCP access.
STUART LEITCH: So you can see down here, dictation's largely a solved problem now. So this is a pretty naive prompt. And this is a contrived exercise. But what I'm trying to do is show you the mechanics of orchestration here, because we're going to give it some interesting instructions. So firstly, spin up a subagent to gather context about the annual Silverchair strategies conference.
STUART LEITCH: It's in the scholarly publishing industry because I want the research you create to be tailored to this audience and save that to a markdown document. So this is where we're actually-- this is the concept of a subagent. And a subagent, this is where the scaffolding is kind of breaking off and spawning a new kind of chat session that doesn't share all of the memory with the larger session.
STUART LEITCH: So you're essentially getting-- it's like you're a project manager and you're sending the intern off to the library. It's going to go off and do research. It's going to do web searches. It's going to pile all this stuff together. It's going to go down wrong paths. But what you want is for it to return a fairly succinct description, as we've just asked it for, about what this conference is about, such that it can ground its research in that.
STUART LEITCH: Then I want you to use a subagent to do preliminary research on the research topic I described above, to get a sense of the problem space and break it down into key themes. So what we're doing here is we're doing kind of a work breakdown structure. We're trying to-- it's like human beings. You might have the smartest person, but you don't just get them to try and solve everything at once.
STUART LEITCH: You typically break the work down into manageable chunks. So that's part of what we're doing here. And I'm a bit like an LLM here too. I'm kind of trying to attend to all of you. And I've got a lot of things in my context window. Then I want you to spin up separate subagents in parallel to do web research on each of these themes with the audience in mind, and save this to markdown.
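The parallel fan-out across themes could look like this. `run_subagent` is a hypothetical stand-in for spawning a fresh, isolated session per theme, and the theme names are invented for illustration:

```python
# Fan out one isolated subagent per theme, in parallel, then collect the
# reports. Each call represents a fresh session with its own empty context,
# seeing only its single assignment. The subagent itself is a stub.

from concurrent.futures import ThreadPoolExecutor

def run_subagent(theme: str) -> str:
    # In a real system: a web-research session scoped to this one theme.
    return f"report on '{theme}'"

themes = ["bot traffic patterns", "MCP access for institutions",
          "licensing implications"]

with ThreadPoolExecutor(max_workers=len(themes)) as pool:
    reports = list(pool.map(run_subagent, themes))

for r in reports:
    print(r)
```

The isolation is the point: no subagent's wrong turns or irrelevant downloads pollute the orchestrator's context, which only ever sees the returned reports.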
STUART LEITCH: So here's where we're sending the interns off to do the actual real research on these particular things. Once all the [INAUDIBLE] research is done, spin up a subagent for each theme to extract the key claims and conclusions. So this is one of the real challenges with models, particularly when we're giving it-- and there's nothing set up in the background.
STUART LEITCH: This is just a vanilla instance of Claude Code, nothing preconfigured. And I'm not being very directive about what level of output I'm looking for here. And it is going to hallucinate. It's researching across this stuff. It's also going to be scanning the web. It'll probably get a whole bunch of things off Reddit. I'm sure you're all really clear that Reddit's not necessarily the source of truth for all things.
STUART LEITCH: So it's going to have extracted these claims. Now spin up in parallel a separate subagent for each assumption or claim to critically evaluate it and produce a confidence score on that, and then save all that down to a markdown file as well. When that is done, use that info to revise the reports on each theme.
STUART LEITCH: And if a theme no longer really holds up because the claims there aren't sufficiently substantiated, essentially, do more research and repeat the entire validation process until it's done. So what we're doing here is we're actually creating a feedback loop within that system, and we've got multiple actors engaged within the system. When that's done, use a subagent to merge all the themes down into a single research report, and then use another subagent to reduce the redundancy between the sections of the report and go through the report.
STUART LEITCH: And just make sure that all of this relates back to the original research prompt. Because this is one of the things you'll find with LLMs is particularly as they get deep into things that can have an idea and then repeat the idea and think that it's really relevant. So what we're doing here is we're having another instance running that's essentially just going to pressure test that from a fresh perspective.
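The validate-and-revise loop amounts to scoring each claim and iterating until everything clears a confidence bar. This sketch stubs out the critic; the scoring function and threshold are invented for illustration:

```python
# Feedback loop: score each claim, re-research anything below threshold,
# repeat until everything passes or a round limit is hit. Scoring stubbed.

def score_claim(claim: str, attempt: int) -> float:
    # Stand-in for a critic subagent; here confidence simply rises with
    # research depth so the loop visibly converges.
    return min(1.0, 0.3 + 0.2 * attempt)

def validate(claims: list[str], threshold: float = 0.7, max_rounds: int = 5):
    for round_ in range(1, max_rounds + 1):
        scores = {c: score_claim(c, round_) for c in claims}
        failing = [c for c, s in scores.items() if s < threshold]
        if not failing:
            return round_, scores
        # In a real pipeline: spawn research subagents for `failing` here.
    return max_rounds, scores

rounds, scores = validate(["OA share projected to hit 90% by 2030"])
print(f"converged after {rounds} round(s)")  # converged after 2 round(s)
```

The orchestrator only sees pass/fail and scores; the messy re-research happens inside throwaway subagent contexts.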
STUART LEITCH: And then finally, I want you to research, with a subagent, the slide deck application gamma.app, and create a version of this presentation that's ready to produce slides. Because the challenge that I'm faced with here is that this is going to be a raw research report. I'm in a ballroom setting. And I want to be able to flash something up so people get a sense of what the output is.
STUART LEITCH: So that's the instructions that I gave it. I'm kicking it off. So it's actually going to take a little bit to run. At home, on my gigabit fiber, this runs pretty quick. Over this hotel Wi-Fi, you'll just notice that it goes pretty slowly. But what I can do is show you some stuff, which is, if we were to represent this in a workflow, what would this actually look like?
STUART LEITCH: This is basically what we're doing: we're creating a fairly complex agentic orchestration. We're breaking things out into parallel. We've got all these steps. We have this notion of the work breakdown structure. But then we're also essentially instantiating other personalities. We're getting role differentiation in here as well.
STUART LEITCH: And I've been doing this for two years. But the thing that's changed is the scaffolding has become powerful enough that you just literally saw us do this with language. It actually doesn't require technical knowledge anymore. And so that's actually going to be really, really interesting, because when you use these kind of sophisticated workflows, you can get a lot more out of the existing generation of LLM technology.
STUART LEITCH: And you've all actually experienced something similar to this if you've used, say, ChatGPT deep research. You've probably noticed that the results that come out of that are just a lot more solid than what comes out of a regular chat session. And that is because under the covers, it's doing this. So this is a contrived example. We've effectively just re-implemented deep research.
STUART LEITCH: So you don't actually need to understand this code. You don't need to capture that prompt. Just use deep research. But the point here is that you can create things of relatively arbitrary complexity just with your words. And what we can do here is we can basically look at the logs of what's going on here. So this was a session I ran earlier.
STUART LEITCH: So we got the initial prompt. It's then essentially generating a to-do list, so that it's kind of keeping track of what it needs to do. It's going off. It's researching the conference. We can see, OK. So this is the instructions that sub agent was given. So notice that this agent doesn't have all the other instructions we gave.
STUART LEITCH: So it's like an intern that feels like a mushroom: kept in the dark and fed bullshit. It's need-to-know. And this agent is basically then going to go off and do research. It's going to go off and do a bunch of web searches. It's going to fetch various things, and it's kind of crawling around until it gets to its level of satisfaction.
STUART LEITCH: Then it comes back with its report, and it basically goes on and on like this. There's a lot of activity happening here. And we can jump a little bit further along. And we can-- actually, I lost my place with that. What we can see here is that as the claims were being checked, some of the claims, like open access projected to reach 90% of publications by 2030-- well, when it looked at that critically, it was like, oh, maybe I got that off Reddit.
STUART LEITCH: And you can essentially specify how much rigor you want to bring. So you can essentially spin up-- you might think of, who are the most curmudgeonly people in the industry that have published enough that they're actually known to the large language models? Well, I want you to take on the personality of each of those three people to really beat this up from their specialties.
STUART LEITCH: So this really unlocks a lot. So let's go back and check and see if this is actually really doing anything. It's still on job one. So this is clearly not going to complete by the end of this session. But I did want to really land this point with you. I'm hoping at least half of you in the audience get that this is technology that's available today. Essentially, when you combine this with tools like MCP, you're able to plug it up to Salesforce, you're able to plug it up to Jira, in our case, you're able to connect it to your databases, and you're able to potentially, although it's pretty dangerous, read your emails, read your Teams messages, read your Slack.
STUART LEITCH: And you think about the kinds of problems that you're just routinely solving within the organization, where there's a lot of-- it requires intelligence, but it's stuff that is going to increasingly be automatable. And it's this multi-agent concept that's going to have us jump beyond chatbots to things where we can essentially outsource really significant cognitive labor.
STUART LEITCH: So what I'm going to do is I'm just going to quickly take an earlier version of this, where we got some output here. And I'm going to take that, and I'm going to drop this into the slide deck application Gamma. And I'm basically going to just create a new one. I'm going to paste in some text. And I'm going to give it some permissions to just go a little wild.
STUART LEITCH: I'll make this concise. Make sure I've got-- yep. And now this is essentially AI on top of AI. So the quality of the results here aren't necessarily-- it's going to have confabulated some things in the process here. But this is basically-- this is an end to end process that actually had no human supervision whatsoever. So in this room, this is not typically how we want to do things.
STUART LEITCH: We want to think about where we're putting the human supervision. So I would typically have a lot of human supervision in terms of actually creating the initial instructions for what the research was. You saw, I just gave it a really vague, just directional thing that I wanted to research. I would typically iterate a lot more until I got a really meaty problem statement.
STUART LEITCH: And then I would probably also have human in the loop review when it's generating its key themes. And what I probably would have said to it is, generate the five to seven key themes you think really rise to the top. But while you're at it, give me another 20 themes to look through. And what I typically do is I just look at that artifact, and I'd just be there hitting my dictation key, and I'd be talking back to it, saying, OK, I've got some feedback on this.
STUART LEITCH: I love one, three, and five, and I want you to include seven and eight. And I think there's another one you missed. Go back and do research on that. And essentially, human in the loop review until that's in shape. And then certainly, before throwing it into a slide deck, I wouldn't be just blindly throwing it into a tool like this.
STUART LEITCH: So this is kind of chugging along. I've got an earlier version of it that I created, just with different formatting. But it's pretty interesting. It's not going to be fully accurate. But what I'm hoping lands with you is how rapidly this technology is moving, and how you can orchestrate relatively complex workflows and then figure out how to essentially inject your way into that.
STUART LEITCH: Stephanie, did we get any questions through the app?
STEPHANIE: Not yet. Not yet. No, but anybody is welcome to put some in there.
STUART LEITCH: And anybody's welcome. You all look engaged, which is good. I love it when people are actually all looking at me and not down at their phones. I was going to talk about a lot of things at a high level. But as I've been talking to people, I realized people aren't really getting what's available with this scaffolding. And one of the things that I think is really important is that you all be thinking about who in your organizations are playing with these, playing with things like [INAUDIBLE] code that are currently mostly developer tools, and who are essentially learning about this.
STUART LEITCH: Because ultimately, when we hear the headlines about the coming kind of white collar bloodbath, like, why is that even being taken seriously? Well, it's because of this. It's because of the multi-agent orchestration. And so there's a real premium on understanding this, understanding how to actually work with LLMs. Like this industry is, in general, a very conservative industry for a really good reason.
STUART LEITCH: It's the bulwark to make sure that science is just progressively layered, and that there's real rigor involved in this. But this technology is coming really fast, and it's going to disrupt our businesses in all kinds of ways. We already know that we're seeing disintermediation with traffic. We know that papers are becoming cheaper and cheaper to produce, as people are using LLMs to just crank out papers.
STUART LEITCH: We know that tech companies in particular have basically just stopped hiring junior developers. It's a really bad world out there right now if you're a junior developer coming out of college. There's a reason why the tech companies are doing this. And there's a reason why Salesforce was able to lay off-- or they said kind of repurpose, but effectively lay off-- like 4,000 of their customer service reps, because the technology is getting to a place where it can essentially handle these more basic tasks.
STUART LEITCH: And as we see with that exponential, basically every seven months, or now every four or five months, the time horizon in which it can effectively work is doubling. So I'll basically stop there and open this up to questions.
STEPHANIE: I have a couple in the app to start us out.
STUART LEITCH: Yeah.
STEPHANIE: How long did it take you to develop the skills to exploit or master Claude Code in this manner? 500 hours, more, less?
STUART LEITCH: So it's taken me-- I've been working at this for three years, but I've been able to transfer these skills to many other people in the organization, including non-technical people, including, like many of the folks on the executive team have Claude Code installed. And these skills are-- you can-- it's not rocket science. It's just understanding these concepts.
STUART LEITCH: It's just the notion of subagents, or just taking prompts like that. There's plenty of resources on the internet. It's just that most people aren't aware of it. So for most people, it's probably a week or two of ramp-up. We've now deployed Claude Code to all of our technologists. I need to be careful how I say this. It's not yet at a place where it's magically solving all the things in the code base.
STUART LEITCH: But one of the questions we had was, what is the ramp up time for somebody to be able to extract more value out of this tool than it is consuming in terms of the ramp up? And in most cases, we've found that has occurred within under a week. So these tools are becoming increasingly accessible. And you can essentially get them to do a whole lot of grunt work there.
STUART LEITCH: I'll use an example: you've got an incoming RFP, a huge document. I mean, in some organizations it's like, wow. And then we've got all these existing RFPs. And if you throw all of that into a chatbot, you've basically filled up its whole context window, and you're asking it to do stuff where it's kind of averaging out-- it's not going to do a great job. But if you use something like Claude Code, you essentially say, OK, here's the RFP, here are the other resources. Take that RFP, break it down by each individual question.
STUART LEITCH: Then for each individual question, separate subagent, go do research. And then you have checks and you're saying, OK, what are all the historical issues we've had with RFPs? What have we learned? Let's have agents essentially checking for that. You're actually able to orchestrate a lot of this work. And it's far from being able to reduce-- to take the human out of the loop.
STUART LEITCH: But it can really augment what you're doing.
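That RFP workflow, split by question, answer each in its own subagent, then check every draft against lessons learned from past RFPs, could be sketched like this. Every function here is a hypothetical stub, and the sample questions and issue list are invented:

```python
# Per-question RFP pipeline: split the document, answer each question in an
# isolated subagent, then flag historical failure modes. All stubbed.

def split_questions(rfp_text: str) -> list[str]:
    # Naive split on question marks; a real pipeline would parse structure.
    return [q.strip() for q in rfp_text.split("?") if q.strip()]

def answer_subagent(question: str) -> str:
    # Fresh context per question: sees only this question plus resources.
    return f"draft answer for: {question}"

def lessons_learned_check(answer: str, past_issues: list[str]) -> list[str]:
    # Flag any historical issue keyword the draft doesn't address.
    return [issue for issue in past_issues if issue not in answer]

rfp = "What is your uptime SLA? How is customer data encrypted?"
past_issues = ["SLA", "encrypted"]

for q in split_questions(rfp):
    draft = answer_subagent(q + "?")
    print(draft, "| flags:", lessons_learned_check(draft, past_issues))
```

The checking agents are just another fan-out: one pass per known failure mode, with a human reviewing whatever gets flagged.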
STEPHANIE: OK. We have just about five minutes left, so maybe our last question. We'll see. Are you concerned about the environmental impact of AI?
STUART LEITCH: Yeah. Yeah. I mean, I think we're going to see extraordinary amounts of power being consumed with this. The upside is that if the trends continue to hold, we're going to see these models get increasingly compute-efficient. ChatGPT, when it was originally running, was really heating the environment up in indirect ways.
STUART LEITCH: And now when I run a model of that caliber on my laptop, it's just like running a game. It actually has very little environmental impact. But I do expect that this is going to be very, very broadly deployed, and you can't have NVIDIA at a $4 trillion market cap without the expectation that this is going to go really big. I think the hope is that AI is actually able to advance science itself.
STUART LEITCH: And we're certainly not there yet. But that it is actually, in and of itself, able to create greater efficiencies and help us manage the environment. But it's a real concern.
STEPHANIE: OK. This might be an even better last question. Any chance we could get a copy of the prompt you entered into Claude Code?
STUART LEITCH: Yes, yes. Yeah.
SPEAKER 2: Will you be able to do one more?
STEPHANIE: OK. I've been told we can do one more. I've got more. We'll keep going. Let's see here. How do you control for AI work slop?
STUART LEITCH: Yeah. So that's a really good question, because it's something that we're seeing at a coding level. These tools allow you to do things so much faster, and particularly if you're using them in a fairly naive way, it'll just spit out code, and then you go to try and run the code and it's like, ugh. So part of it is you're trying to create these feedback loops where it's actually able to measure itself, and in code that's easier to do.
STUART LEITCH: But like with this research we did here, it's about trying to put in these quality gates. And as we saw, you can use language to put in these quality gates. Our architecture team is both loving AI and hating AI simultaneously. And one of the things that I'm really asking them to do is think about all the different types of concerns you have about slop that you're seeing coming through the system, and let's try and get that represented in different subagents, such that we can have that in our continuous integration as code's being checked in.
STUART LEITCH: And there's a pull request. But even better than that, pushing that down to individual developers machines, such that when people are using Claude Code and they're building out software, we have it in the instructions, you always need to consult these five agents, and basically keep iterating until you've reached a level of satisfaction with all these agents.
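The "consult these agents and keep iterating until they're all satisfied" rule is just another loop. Here is a sketch with two trivially stubbed reviewer agents; the checks and the revision step are invented for illustration, not real review logic:

```python
# Quality gates as reviewer agents: keep revising until every reviewer
# signs off, or a round limit is hit. Reviewers and revise() are stubs.

def security_reviewer(code: str) -> bool:
    return "eval(" not in code          # flag a known-dangerous pattern

def style_reviewer(code: str) -> bool:
    return "TODO" not in code           # flag unfinished work

REVIEWERS = [security_reviewer, style_reviewer]

def revise(code: str) -> str:
    # Stand-in for asking the coding agent to fix the flagged issues.
    return code.replace("eval(", "safe_parse(").replace("TODO", "")

def gate(code: str, max_rounds: int = 3) -> tuple[str, bool]:
    for _ in range(max_rounds):
        if all(check(code) for check in REVIEWERS):
            return code, True
        code = revise(code)
    return code, False

final, ok = gate("x = eval(user_input)  # TODO harden")
print("passed all gates:", ok)  # passed all gates: True
```

A round limit matters in practice: without it, an agent that can't satisfy a reviewer will loop forever, and a failed gate should fall back to a human.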
STUART LEITCH: Now, that's still not going to get rid of all the slop. You still need manual code reviews. The level of slop you get is going to be inversely proportional to the level of effort you put into the original instructions, the checks and balances, and how much you steer it along the way. All right.
STUART LEITCH: So I think we're at time. So thank you for really paying attention. I appreciate it.