In no way am I comparing myself to the great Bill James, but I do feel like he and I are at least cut from the same cloth. I don't believe he was a data scientist, and nor am I. He can't do it as well as the data scientists either, but what separates the pure data guys from the guys who make data actually useful, and dare I say accessible, is that the latter both know enough about ball to make a formula that makes sense and have the skills to use it and explain it to people.
You've invoked quite a bit of modesty with the comparison to the legend, and thank you very much by the way. I just hope that I go in the same general direction as he did, raising understanding for people everywhere.
hi.
Bill J got a BA in psychology, so he had a little stats there. intro plus maybe a bit.
IMHO he has a scientific bent of thought.
and he was a very clear writer.
basically, i think he would hear or read something and pause and think “If that is true, what evidence should I be able to find?”
he was terrific at tapping into available data and coming up with how to extract information from it.
i loved how he combined all-season, all-team data about basic factoids (e.g. walks) and tried to see how they contributed to runs.
he just kept building like that.
he expanded into multivariate thinking but avoided a lot of “traditional” statistical approaches because, correctly, he realized baseball data did not match the statistical assumptions about “distributions of data”. the “general linear model” was the main approach to lots of statistical analysis when he was writing a lot of his most groundbreaking stuff in the 1970s and 1980s.
he was completely right about that.
i read all the annual books, with great data and well-conceived information boiled out of the data, and then an essay about every team.
and his HoF book was terrific.
i am a retired psychologist who was a “research methodologist”, computer programmer and clinical psychologist. taught graduate statistics.
he is my hero.
I must say this is a great way to think about things.
If something is true, you should be able to find evidence for it. So many easily identifiable truths (e.g. in this sport, that shotgun running is better than under centre running) are buried under this fictitious 'common sense' that people generally don't wish to challenge. A similar idea in baseball was that batting average is more important than on-base percentage when evaluating a player. This was the prevailing thought for ages and ages, the first 100 years at least of organised professional baseball, and all it took was one or a few people looking for evidence to the contrary. They found it quite easily.
If Bill had intro stats plus a little bit more, that's probably about the same statistical background that I have. It's nowhere near enough to change the world through sheer statistical prowess. I am no Ben Baldwin, but I think what Bill did is what we all need a little bit more of. The drive to look at these numbers, and make real sense out of them, instead of taking so many things as givens.
Zach Wilson might still have a chance to be successful. He spent the last year without the pressure of being the starting QB and with good coaches in Denver, and got to focus on just improving himself. Now he's in Miami, and if any backup is going to get playing time that's the spot. I really, really hope Tua can avoid more concussions, but if he does get another he'll be out for a while. It would be the most New York Jets story ever if Wilson does succeed in the AFC East... as a Miami Dolphin.
I suppose you're right, Nick. We shouldn't ring the bell on him just yet, but the problem is that it's really hard to improve on both accuracy and sack avoidance at once. Generally, one comes at the expense of the other, and you improve in increments over multiple seasons. In 2023, his sk%+ was 69 and his CPOE was -3.1.
If he does get some small (or large) sample playing time in 2025, I would like to see either a CPOE above zero or a sk%+ above 100. It doesn't necessarily have to be both. Either would give him a jumping off point to at least be a workable NFL player. The problem with this is that sack avoidance tends to be the very most difficult thing to improve, and Zach is awful at it. However, he has just spent a whole season with one of the best sack avoiders in the world in Bo Nix, and the offensive staff that made him that way. You never know how much that kind of thing helps.
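For anybody unfamiliar with the stat, CPOE is completion percentage over expected: the actual completion rate minus the average expected completion probability of the passes thrown. Here is a minimal sketch, assuming each pass already comes with an expected completion probability from some model. The function name and the five-pass sample are made up for illustration, and are not Zach Wilson's numbers or my actual model.

```python
def cpoe(completed, expected_prob):
    """Completion percentage over expected, in percentage points.

    completed:     list of 1 (complete) / 0 (incomplete), one per pass
    expected_prob: list of expected completion probabilities, one per pass
    """
    if len(completed) != len(expected_prob) or not completed:
        raise ValueError("need two equal-length, non-empty lists")
    actual = sum(completed) / len(completed)
    expected = sum(expected_prob) / len(expected_prob)
    return 100.0 * (actual - expected)


# Hypothetical five-pass sample (not real data).
completed = [1, 0, 1, 1, 0]
expected_prob = [0.70, 0.55, 0.80, 0.60, 0.65]

# Actual rate is 60%, expected rate is 66%, so CPOE is about -6 points.
print(round(cpoe(completed, expected_prob), 1))
```

A CPOE above zero just means the passer completed more than an average passer would have on the same throws.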
Zach actually did make big year on year accuracy improvements every year he played, but when you start as one of the worst ever, even solid improvements every season only got him to the point where I would describe him as wildly inaccurate. No longer one of the least accurate passers ever, but not very good. I actually have some faith the CPOE over zero can happen at some point, but if his sk%+ stays below 85 (where it's been for all of his NFL career) his arm will never get a chance to matter anyways.
Josh Allen:
In defense of the model, Josh had to make the greatest improvements in accuracy in NFL history to become what he is today. Anthony Richardson and Josh are great examples of how far outlier physical traits can get you as an NFL prospect, as both had sub 10% chances of being a success and ended up as top 10 picks.
Caleb Williams, Bryce Young, and Trevor Lawrence:
These guys are prime examples of guys who were considered #1 picks before their 3rd year. The overwhelming consensus was that they could have sat out their last year and still been #1 picks, similarly to how Chase sat out in 2020 and was still taken #5.
This leads me more specifically to Caleb. On one hand, I understand why you take him, because nobody gets fired for buying IBM and he was great in his first two seasons (I'm assuming he had better xEPA stats, but I could be way off). On the other hand, Jayden was the Heisman winner, so it is not like there would be unbearable pushback (like if you drafted Bo Nix).
The Model:
It amazes me how NFL teams can do all sorts of cap shenanigans, understand and communicate libraries full of tactics and strategy, and manage huge workloads, yet can barely beat a guy in his basement on a laptop using a model that seems relatively easy (for an NFL team) to build. Maybe they do build things like this and other decision making factors rule them out, or only some teams use them, but the fact you can go toe-to-toe with a billion dollar franchise is both a compliment to you and a huge insult to teams.
Edit: Maybe this points more to how you can only accurately predict so much. There is so much variability in football that maybe the best you can do is get it right 60% of the time. By the same token, by using a really good model plus good reasoning, you should be able to hit that 60% mark.
You've hit on the only thing I was considering accounting for, but did not. Through the course of this process, I kept finding guys (Bryce Young is a perfect example) where their final season was actually a down year compared to their previous ones. I considered putting a simple dummy control in the model to account for this possibility, but I didn't, because I feel as if it could suffer from survivor bias in that case. How many guys who have down years in their final season never get drafted at all? I feel like that could introduce more flaws than it fixes. If I'm to go back and redo this at any point, I will look into this possibility.
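To make the idea concrete, the dummy control I considered would look something like the sketch below. This is only an illustration, not something that is in the model: the function name, the choice of comparing the final season to the average of earlier seasons, and the per-season numbers are all hypothetical.

```python
def down_final_year(season_values):
    """Return 1 if the final season was worse than the average of the
    earlier seasons, else 0. Values are listed oldest season first."""
    if len(season_values) < 2:
        return 0  # no earlier season to compare against
    *earlier, final = season_values
    return int(final < sum(earlier) / len(earlier))


# Hypothetical per-season efficiency numbers (not real prospects).
prospects = {
    "QB A": [0.10, 0.25, 0.15],  # final year below earlier average
    "QB B": [0.05, 0.12, 0.20],  # final year is the best year
    "QB C": [0.22],              # one-year starter, nothing to compare
}

flags = {name: down_final_year(vals) for name, vals in prospects.items()}
print(flags)
```

The survivor bias worry still applies to a flag like this, because it would only ever be fitted on the guys who got drafted despite the down year.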
This is the odd thing that NFL teams must deal with when drafting QBs that I was not tasked with. I can just say that Caleb was not on my draft board at all for the first round, meanwhile Jayden and Bo are two of the best QB prospects of the last 20 years, and nobody will yell at me, but people will yell at NFL teams if they pass on the consensus number one guy, even if the second guy is clearly better.
You see this in a draft like 2004, where the Chargers had Philip Rivers higher than both Manning and Roethlisberger on their board, but wouldn't have picked Philip over either of them, had it actually been their pick, because Philip was a consensus third that year. The only reason San Diego was able to get the guy they wanted, even though they had the first overall pick (which should've in theory made it quite easy), was that they were able to coerce the Giants into reaching for the best QB in the class, so that San Diego themselves wouldn't have to. This is the same reason the NBA teams are allowed to make picks on behalf of other teams.
This is part of the reason why I can draft basically as well as the NFL can, because their hands are somewhat tied in a way that mine are not. San Diego took Eli Manning in 2004, even though they knew in their hearts that Philip was better, because nobody gets fired for drafting the consensus number one guy. Imagine if a team had actually drafted Brock Purdy in the first round in 2022. There would've been anarchy.
I feel like if a team was able to free themselves from these constraints, that would definitely be a way to improve on inefficiencies at the draft table. If an owner could somehow make a binding agreement with a GM that he will not be fired based upon what he does in the draft, I'm almost certain that drafting leaguewide would get much better, because then the constraints would be off, like they're off me. I believe this is the reason the Patriots drafted so well for so long. The GM just knew he wasn't going to be fired, no matter what happened, because he also happened to be possibly the best coach ever, so he could just make the pick he wanted, instead of trying to balance what's best for the team against what's not going to get him fired.
There's a famous story about the 1990s Bills. Their previous regime was set to trade away all kinds of draft picks for solid players, in an effort to not get themselves fired. Instead, the Bills drafted both Bruce Smith and Andre Reed in 1985, and the team was immensely successful because of that, but that old regime was fired after that year, so was this a good decision or a bad decision on their part?
The draft is a game of misaligned incentives, but in my opinion, the fanbase gets excited when a young QB is drafted, no matter who it is. Front offices really should not feel as beholden to the draft board at this position as they do, because as soon as guys are throwing passes in September, nobody gives a damn what the draft board was. If the Bears drafted Bo Nix instead of Caleb Williams, the fans would've been ready to set the seats on fire, but as soon as September they would've been talking about how much of a genius their GM is.
It's a double edged sword here. You're looking to get the pick right, but you're also looking to placate everybody. This is how the draft has always worked, but in my opinion, since this position in particular is so important, the weight should be skewed a little more towards trying to get it right.
I don't have any misaligned incentives at all. I am entirely focused on trying to get it right, which perhaps illuminates just how big the inefficiency created by the misaligned incentives is, because my being without it allows me to catch all the way up to the NFL process, despite a budget of nothing and a head count of one person. If the NFL people were not so handcuffed, and were allowed to draft the guys they actually wanted, surely they would blow me out of the water, but I'm not comparing my ability to theirs. Doing that, one guy vs. a whole staff will lose every time.
I'm comparing my process to their process, and their process includes this gross inefficiency, while mine doesn't. Therefore, we are equal.
On top of this, QBs are also just really hard to predict, so I think your final paragraph is probably correct too. There is a ceiling as to how often you can get it right, and based on our different approaches to the draft, the NFL and I are converging towards this ceiling from different directions. They (technically) hit on guys like Josh Allen, Jordan Love, Sam Darnold, and Daniel Jones, while I hit on guys like Jalen Hurts, Russell Wilson, Andy Dalton, and Gardner Minshew.
I think if anybody can get better than a 50% hit rate drafting QBs, they should consider themselves satisfied, and I do consider myself satisfied for just that reason, but overall, I think what I have proven is a) QBs are extremely tough to predict, and b) the NFL is wasting all kinds of money trying to do it, when I can do it just as well by looking at an ESPN.com stat page.
Very cool, Robbie! It's not super surprising to me that your model was comparable to actual NFL GMs.
This is a pretty niche hockey stat, but are you familiar with NHLe? There are a few different versions, but basically the idea is to try to translate scoring stats from non-NHL leagues to their NHL equivalent (hence the "e"). It's a way to try to account for varying strength of leagues. A few years ago, someone found that simply drafting players by descending NHLe outperformed actual NHL GMs by a decent amount. I wonder if incorporating some kind of conference or strength of schedule adjustment would be enough for your box score model to beat NFL GMs straight up (and might allow you to include FCS QBs as well).
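For anyone who hasn't seen it, the basic shape of an NHLe-style translation is roughly the sketch below: points per game, scaled by a league strength factor, projected over an 82-game NHL season. The league names, the factors, and the players are invented for illustration. Real versions of NHLe estimate the factors from players who moved between leagues, and different versions disagree on the values.

```python
NHL_GAMES = 82  # length of an NHL regular season


def nhl_equivalent(points, games, league_factor):
    """Translate a scoring line into an NHL-equivalent point total."""
    per_game = points / games
    return per_game * league_factor * NHL_GAMES


# Hypothetical league strength factors (not real NHLe values).
factors = {"League X": 0.30, "League Y": 0.60}

# (name, league, points, games) for two made-up prospects.
prospects = [
    ("Player 1", "League X", 90, 60),
    ("Player 2", "League Y", 40, 50),
]

# Rank by descending NHL equivalent, the naive draft order described above.
ranked = sorted(
    prospects,
    key=lambda p: nhl_equivalent(p[2], p[3], factors[p[1]]),
    reverse=True,
)
for name, league, points, games in ranked:
    print(name, round(nhl_equivalent(points, games, factors[league]), 1))
```

Player 1 has the gaudier raw numbers, but the weaker league drags his translated total below Player 2's, which is the kind of adjustment a conference or strength of schedule factor would make for college QBs.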
In short, I was silly not to include age in the first place. I suppose I got so caught up on football skill I completely forgot about it. Including age as a control improves the performance of the model by about five percentage points. I don't know if you're a statistics guy or not, but that's a ton of improvement from one change. Without age, the model can predict NFL success with approximately 24% accuracy. With age, it's about 29 percent.
Funnily enough, taking into account that he did what he did at Texas A&M, entering the draft at just 21 years old, this skyrockets Johnny Manziel to be the second best draft prospect of the last 20 years, still behind Deshaun Watson, who was also crazily young at Clemson.
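To put the size of that jump in perspective, going from about 24% to about 29% is five percentage points in absolute terms, which is roughly a 21% relative improvement. The arithmetic, using the two rounded figures quoted above:

```python
without_age = 0.24  # approximate model performance without age
with_age = 0.29     # approximate model performance with age

abs_gain = with_age - without_age   # gain in absolute terms
rel_gain = abs_gain / without_age   # gain relative to the starting point

print(f"{abs_gain * 100:.0f} percentage points, {rel_gain * 100:.1f}% relative")
```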
The change does not impact my first round pick hit rate any. It's still 31 out of my 58 highest rated guys that made it in the NFL. The only change on this front is compositional. The age inclusive model replaces the hits on Andy Dalton and Russell Wilson with hits on Sam Darnold and Matthew Stafford.
So, including age is likely a slight downgrade in terms of actual fitted performance, but in terms of conceptual strength, it's a big upgrade, and that's the key thing.
Also, taking into account that guys leave college so late these days, the model now only thinks four guys are first round calibre from this 2025 draft: Riley Leonard, Jaxson Dart, Kyle McCord, and Jalen Milroe. This is down from six in the original specification, and no longer includes Cam Ward at all. I suspect this will keep happening, as college prospects continue to bolt for the pros at older and older ages, as is the pattern we're seeing now. Late first round guys in terms of football skill will be weighed down by the fact they're 25 years old in a lot of cases these days.
It's interesting to see how NIL completely flips this pattern. Guys were trending towards the pros younger and younger, and then suddenly the switch flipped, and now guys are coming out at 24 and 25 on a regular basis. I think my or any model will need some time to adjust to this new reality.
No doubt. I was more thinking along the lines of it would be interesting to kind of see if/what the velocity threshold was. Obviously, raw arm strength isn’t nearly as important as people make it out to be, but I have to imagine off-platform arm strength is a game changer in today’s game. You call Mahomes and Goff game managers, but Mahomes can manage a game in a way more dynamic way because of that off-platform, unset-base arm strength.
If you went to a GM and said I can improve your QB draft success rate by 5%, he’d nut himself. Something I wish the NFL combine did with QBs was like a max velocity test. Ideally it would measure ball speed out of the hand, and then at yard intervals. I feel like if you had that variable, you could really make a strong model, as I’m sure it’d weed out a few guys with great college statistical markers who just don’t have the raw arm strength to hit in the NFL. I doubt it would be as powerful for predicting top tier success, but removing misses is probably more important when you consider the value of a rookie deal.
If we're talking about drafting guys purely statistically, a velocity variable would be very important, but I was more going for the ability to read a mock draft, and rank the QBs on it, simply assuming that if they've made it to the point of being considered, they have NFL level arm strength.
Well, do keep in mind that this particular model was built out of only QBs who were actually drafted. That's why I felt comfortable in leaving out something like that, because all who've made it this far have at least adequate NFL level arm strength, and I'm not even sure such a thing exists. Chad Pennington had to throw a moon ball to get the ball to go 30 yards, and he nearly won MVP on two separate occasions.
Especially in the modern NFL, where game managers (like Patrick Mahomes and Jared Goff) are having more success than they've ever had before, I'm not especially sure how important arm strength is anymore.
There does need to be a certain level of throw power to deal with the speed of the NFL game. Fair enough, but once you get to the level of adequate, I'm seriously doubtful that any more arm strength makes a difference, because we're already controlling for whether guys can complete passes or not. If their weak arm hindered their ability to complete passes, this model would just hate them natively. It wouldn't need an augmentation, and if a weak arm didn't hinder their ability to complete passes, why wouldn't that ability transfer to the NFL at the same level as anybody else's ability to complete passes does? It worked for Chad Pennington and Tua Tagovailoa.
I truthfully cannot think of a player whose weak arm strength is legitimately hindering them at the NFL level right now. I think it all gets weeded out in the draft process well enough through looking at the man's ability to complete throws, and if he can complete throws on the sheet, it will only take a cursory look at the player in real life to determine if the arm is strong enough for the NFL level. If not, throw his row out of the dataset. If so, treat him the same as everybody else.
In other words, I agree with your last sentence wholeheartedly. Arm strength is more a disqualifier than a qualifier. Poor arm strength will hurt your case, but elite arm strength will not help it, once you get to the NFL level. At the NFL level, it's all about the ability to complete passes, which is not correlated with raw arm strength at this level, likely because all the players in the pros have arms of at least the minimum strength to complete the necessary throws to have a solid NFL offence.
They could still do your velocity test. It could be fun, but it just seems surplus to requirements, when arm strength is at best a niche skill, and at worst not important in any way, once we've weeded out the bottom end, which the draft process already manages to accomplish.
“Zach Wilson in the Sun Belt.” I understand that in context, but man is it jarring to read. Especially since we basically had a power conference schedule before COVID wrecked everything. Surprises me a bit how much better the model likes Wilson than Beck—I wonder how much that’s influenced by the difference between the Sun Belt in 2020 vs MWC in 2006 baseline. The rushing value stands out on the chart as a differentiator between the two—and I can’t disagree that Wilson was a better runner than Beck—but I would have pegged Beck as the better passer and your CPOE model disagrees. But my perception may be also skewed by Beck being a breath of fresh air at the QB position for BYU after a few seasons of bad QB play whereas Wilson followed some merely mediocre seasons.
I imagine the model missing on Tua is largely attributable to only using the final season, when he was really drafted based on what he had shown in prior seasons. Similarly, Jalen Hurts was probably discounted by his performance before his season in Oklahoma relative to where he would have been drafted if it was just based on that final season—I imagine it would have been more in line with his predecessors Baker and Kyler.
Missing on Josh Allen is probably a plus in my book in terms of process. He was a purely potential pick, seemingly because of his big arm. That doesn’t feel like a particularly successful archetype, so I have no problem missing on him in the first round, despite what he has later become.
The 2020 Sun Belt is slightly better than the 2006 MWC in terms of passing the ball, giving John Beck a slightly lower baseline to go up against, but not enough to make up the gap. Perhaps John might've been a better passer fundamentally, but in 2020 Zach Wilson completed 73.5 percent of passes, on an extremely long average pass. It's one of the best college CPOE seasons ever, and mine is not the only CPOE model that thinks that. There's the famous video of a New York Jets guy talking about their draft strategy, and you can see his CPOE model open in the background, which also has Wilson really high in 2020.
He's a guy that should've been able to work out at the NFL level. A 48% chance of success in my preferred specification is the same as Matt Ryan. Zach Wilson is not like JaMarcus Russell, where the talent in a workout was high, but xEPA on the field did not indicate a first round draft pick. There was something there for Zach, even against a non-power schedule, but it just didn't happen for him. That's what makes the draft fun and unpredictable.
The purely rate specifications like Tua a lot. They like him better than they like Joe Burrow, but when you mix the volume in, Alabama QBs just don't get very much of it, and especially Tua, because he was hurt in 2019. That's why I said I may choose to override the model in his case if I were actually drafting, because his 2019 did not impress the model very much, but his 2018 very much did, and if I were an NFL team, with the ability to give him the all clear on his injuries, assuring myself (in theory) that it would not be a frequent problem moving forward, I'd be willing to draft Tua in round one. Perhaps not ahead of Jalen Hurts, but definitely in the first round.
As for Jalen Hurts, it's true that in both 2016 and 2017 he was really not special, but by the time 2019 had rolled around, he was fantastic, better than either Baker or Kyler in that Oklahoma offence. It's only one year, but so was Kyler Murray the year before, so I really don't understand what the miss was here. Is playing in a mediocre fashion for Alabama as a freshman and sophomore, like Jalen did, worse than not playing at all, like Kyler did? This is what you're implicitly proposing, and perhaps it's true, but that doesn't make any sense to me. Why would that be the case?
Josh Allen is tough. You're right. There is no way I can coerce the model into liking a negative CPOE, poor sack avoider in the non-power Mountain West, and I tried to torture the data into liking Josh Allen. I tried for a long time, and I just couldn't make it happen. There is no way a model based on college skill can be coaxed into liking either Josh Allen or Jordan Love. Both of them were kind of meh, in non-power conferences. Strictly worse than Zach Wilson, in terms of results on the college field.
I would've gone nowhere near either of them with a first round draft choice, but in the bottom quartile of my model, I believe the expected value is 2.06 successful NFL QBs, so somebody was going to hit the longshot. They are the two. Good for them, but I still don't see their story as repeatable.
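The logic behind "somebody was going to hit the longshot" is just adding up small probabilities. The expected number of hits in a group is the sum of each guy's individual chance, and the chance that at least one hits is one minus the chance that they all miss. A quick sketch with made-up numbers, treating each prospect as independent (the 40 prospects at 5% each are hypothetical, and are not the actual bottom quartile that gives the 2.06 figure):

```python
import math


def expected_hits(probs):
    """Expected number of successes: the sum of individual probabilities."""
    return sum(probs)


def prob_at_least_one(probs):
    """Chance at least one prospect hits, assuming independence."""
    return 1.0 - math.prod(1.0 - p for p in probs)


# Hypothetical group: 40 longshots with a 5% chance each.
longshots = [0.05] * 40

print(round(expected_hits(longshots), 2))      # about two expected hits
print(round(prob_at_least_one(longshots), 3))  # well above a coin flip
```

So even when no individual longshot is worth a first round pick, a couple of them succeeding across 20 years of drafts is exactly what the model would expect.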
The reason playing mediocre as a freshman and sophomore might be worse than not playing at all is it concretizes a poorer level of expected performance. We know that 2018 Kyler and 2019 Jalen both performed incredibly—but we don’t know whether that represents a 67th percentile season from Kyler but a 95th percentile season from Hurts (or vice versa). Because we’ve seen a poorer level of performance from Hurts, I’d be more likely to attribute his performance in 2019 to an outlier (in a general rather than technical sense) season for him, while Murray represented more unknown, but with that unknown I’d be less confident projecting it as an outlier season.
That model, of course, completely ignores improvement across seasons (which you certainly should not do, and that alone may wash out the difference in this specific example), but I think simplifies the illustration of the principle.
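A toy version of that simplified model, for illustration only: treat each season as one observation of a player's true level and average it with a league-average prior. The function name, the prior weight of one season, and the season values are all assumptions made up for this example, and like the argument above it deliberately ignores improvement across seasons.

```python
def shrunk_estimate(seasons, prior_mean=0.0, prior_weight=1.0):
    """Average observed seasons with a prior worth `prior_weight` seasons.

    seasons: per-season performance, in standard deviations above average.
    """
    total = prior_mean * prior_weight + sum(seasons)
    return total / (prior_weight + len(seasons))


# Hypothetical values: both players post the same great final season (+2.0).
one_year_starter = [2.0]             # nothing else on record
three_year_starter = [0.2, 0.3, 2.0]  # two mediocre years first

print(shrunk_estimate(one_year_starter))    # 1.0
print(shrunk_estimate(three_year_starter))  # 0.625
```

The mediocre seasons pull the three-year starter's estimate below the one-year starter's, even though their final seasons are identical. Weighting recent seasons more heavily, or modelling improvement, would narrow or erase that gap.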
The beginning of Bill James was caused by his skepticism about the way announcers and ex players and writers described and ranked players.
Absolutely.
There's a famous story about the 1990s Bills. Their previous regime was set to trade away all kinds of draft picks for solid players, in an effort to not get themselves fired. Instead, the Bills drafted both Bruce Smith and Andre Reed in 1985, and the team was immensely successful because of that, but that old regime was fired after that year, so was this a good decision or a bad decision on their part?
The draft is a game of misaligned incentives, but in my opinion, the fanbase gets excited when a young QB is drafted, no matter who it is. Front offices really should not feel as beholden to the draft board at this position as they do, because as soon as guys are throwing passes in September, nobody gives a damn what the draft board was. If the Bears drafted Bo Nix instead of Caleb Williams, the fans would've been ready to set the seats on fire, but as soon as September they would've been talking about how much of a genius their GM is.
It's a double edged sword here. You're looking to get the pick right, but you're also looking to placate everybody. This is how the draft has always worked, but in my opinion, since this position in particular is so important, the weight should be skewed a little more towards trying to get it right.
I don't have any misaligned incentives at all. I am entirely focused on trying to get it right, which perhaps illuminates just how big the inefficiency created by the misaligned incentives is, because my being without it allows me to catch all the way up to the NFL process, despite a budget of nothing and a head count of one person. If the NFL people were not so handcuffed, and were allowed to draft the guys they actually wanted, surely they would blow me out of the water, but I'm not comparing my ability to theirs. Doing that, one guy vs. a whole staff will lose every time.
I'm comparing my process to their process, and their process includes this gross inefficiency, while mine doesn't. Therefore, we are equal.
On top of this, QBs are also just really hard to predict, so I think your final paragraph is probably correct too. There is a ceiling as to how often you can get it right, and based on our different approaches to the draft, the NFL and I are converging towards this ceiling from different directions. They (technically) hit on guys like Josh Allen, Jordan Love, Sam Darnold, and Daniel Jones, while I hit on guys like Jalen Hurts, Russell Wilson, Andy Dalton, and Gardner Minshew.
I think if anybody can get better than a 50% hit rate drafting QBs, they should consider themselves satisfied, and I do consider myself satisfied for just that reason, but overall, I think what I have proven is a) QBs are extremely tough to predict, and b) the NFL is wasting all kinds of money trying to do it, when I can do it just as well by looking at an ESPN.com stat page.
Very cool, Robbie! It's not super surprising to me that your model was comparable to actual NFL GMs.
This is a pretty niche hockey stat, but are you familiar with NHLe? There are a few different versions, but basically the idea is to try to translate scoring stats from non-NHL leagues to their NHL equivalent (hence the "e"). It's a way to try to account for varying strength of leagues. A few years ago, someone found that simply drafting players by descending NHLe outperformed actual NHL GMs by a decent amount. I wonder if incorporating some kind of conference or strength of schedule adjustment would be enough for your box score model to beat NFL GMs straight up (and might allow you to include FCS QBs as well).
Throwing darts at a board blindfolded would do a better job drafting QBs than the NFL 😀😀😀
What happens if you throw age into this? Basically in every sport, the younger you are, the better.
You've asked a very good question my friend.
In short, I was silly not to include age in the first place. I suppose I got so caught up on football skill I completely forgot about it. Including age as a control improves the performance of the model by about five percentage points, which, I don't know if you're a statistics guy or not, is a ton of improvement for one change. Without age, the model can predict NFL success with approximately 24% accuracy. With age, it's about 29 percent.
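To put the 24% and 29% figures in context, here is a quick bit of arithmetic separating the gain in percentage points from the gain relative to the starting level. It uses only the two approximate numbers quoted above; the variable names are just for illustration.

```python
# Approximate figures quoted in the comment above.
without_age = 0.24
with_age = 0.29

# Absolute gain, measured in percentage points.
point_gain = with_age - without_age

# Gain relative to where the model started.
relative_gain = point_gain / without_age

print(f"{point_gain * 100:.0f} percentage points")
print(f"{relative_gain * 100:.1f}% relative improvement")
```

So a five point gain is roughly a one-fifth improvement over the original specification, which is why one added control counts as a big jump.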
Funnily enough, taking into account that he did what he did at Texas A&M, entering the draft at just 21 years old, this skyrockets Johnny Manziel to be the second best draft prospect of the last 20 years, still behind Deshaun Watson, who was also crazily young at Clemson.
The change does not impact my first round pick hit rate any. It's still 31 out of my 58 highest rated guys that made it in the NFL. The only change on this front is compositional. The age inclusive model replaces the hits on Andy Dalton and Russell Wilson with hits on Sam Darnold and Matthew Stafford.
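For reference, 31 out of 58 works out to a hit rate of roughly 53.4%. The sketch below also attaches a rough binomial standard error to that number. That part is my own simplifying assumption (it treats every pick as an independent trial with the same chance of success, which is not literally true), so take it as a ballpark for how much noise a 58 pick sample carries, not as a formal result.

```python
import math

hits = 31    # top-rated prospects who made it, as quoted above
picks = 58   # number of top-rated prospects considered

hit_rate = hits / picks

# Rough binomial standard error. Assumes each pick is an independent
# trial with the same success chance, which is a simplification.
std_err = math.sqrt(hit_rate * (1 - hit_rate) / picks)

print(f"hit rate: {hit_rate:.1%}, standard error about {std_err:.1%}")
```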
So, including age is likely a slight downgrade in terms of actual fitted performance, but in terms of conceptual strength, it's a big upgrade, and that's the key thing.
Also, taking into account that guys leave college so late these days, the model now only thinks four guys are first round calibre from this 2025 draft: Riley Leonard, Jaxson Dart, Kyle McCord, and Jalen Milroe. This is down from six in the original specification, and no longer includes Cam Ward at all. I suspect this will keep happening, as college prospects continue to bolt for the pros at older and older ages, as is the pattern we're seeing now. Late first round guys in terms of football skill will be weighed down by the fact they're 25 years old in a lot of cases these days.
It's interesting to see how NIL completely flips this pattern. Guys were trending towards the pros younger and younger, and then suddenly the switch flipped, and now guys are coming out at 24 and 25 on a regular basis. I think my or any model will need some time to adjust to this new reality.
No doubt. I was more thinking it would be interesting to see whether there is a velocity threshold, and what it is. Obviously, raw arm strength isn’t nearly as important as people make it out to be, but I have to imagine off-platform arm strength is a game changer in today’s game. You call Mahomes and Goff game managers, but Mahomes can manage a game in a way more dynamic way because of that off-platform, unset-base arm strength.
If you went to a GM and said I can improve your QB draft success rate by 5%, he’d nut himself. Something I wish the NFL combine did with QBs was a max velocity test. Ideally it would measure ball speed out of the hand, and then at yard intervals. I feel like if you had that variable, you could really make a strong model, as I’m sure it’d weed out a few guys with great college statistical markers who just don’t have the raw arm strength to hit in the NFL. I doubt it would be as powerful for predicting top tier success, but removing misses is probably more important when you consider the value of a rookie deal.
If we're talking about drafting guys purely statistically, a velocity variable would be very important, but I was more going for the ability to read a mock draft, and rank the QBs on it, simply assuming that if they've made it to the point of being considered, they have NFL level arm strength.
Well, do keep in mind that this particular model was built out of only QBs who were actually drafted. That's why I felt comfortable in leaving out something like that, because all who've made it this far have at least adequate NFL level arm strength, and I'm not even sure such a thing exists. Chad Pennington had to throw a moon ball to get the ball to go 30 yards, and he nearly won MVP on two separate occasions.
Especially in the modern NFL, where game managers (like Patrick Mahomes and Jared Goff) are having more success than they've ever had before, I'm not especially sure how important arm strength is anymore.
There does need to be a certain level of throw power to deal with the speed of the NFL game. Fair enough, but once you get to the level of adequate, I'm seriously doubtful that any more arm strength makes a difference, because we're already controlling for whether guys can complete passes or not. If their weak arm hindered their ability to complete passes, this model would just hate them natively. It wouldn't need an augmentation, and if a weak arm didn't hinder their ability to complete passes, why wouldn't that ability transfer to the NFL at the same level as anybody else's ability to complete passes does? It worked for Chad Pennington and Tua Tagovailoa.
I truthfully cannot think of a player whose weak arm strength is legitimately hindering them at the NFL level right now. I think it all gets weeded out in the draft process well enough through looking at the man's ability to complete throws, and if he can complete throws on the sheet, it will only take a cursory look at the player in real life to determine if the arm is strong enough for the NFL level. If not, throw his row out of the dataset. If so, treat him the same as everybody else.
In other words, I agree with your last sentence wholeheartedly. Arm strength is more a disqualifier than a qualifier. Poor arm strength will hurt your case, but elite arm strength will not help it, once you get to the NFL level. At the NFL level, it's all about the ability to complete passes, which is not correlated with raw arm strength at this level, likely because all the players in the pros have arms of at least the minimum strength to complete the necessary throws to have a solid NFL offence.
They could still do your velocity test. It could be fun, but it just seems surplus to requirements, when arm strength is at best a niche skill, and at worst not important in any way, once we've weeded out the bottom end, which the draft process already manages to accomplish.
“Zach Wilson in the Sun Belt.” I understand that in context, but man is it jarring to read. Especially since we basically had a power conference schedule before COVID wrecked everything. Surprises me a bit how much better the model likes Wilson than Beck—I wonder how much that’s influenced by the difference between the Sun Belt in 2020 vs MWC in 2006 baseline. The rushing value stands out on the chart as a differentiator between the two—and I can’t disagree that Wilson was a better runner than Beck—but I would have pegged Beck as the better passer and your CPOE model disagrees. But my perception may be also skewed by Beck being a breath of fresh air at the QB position for BYU after a few seasons of bad QB play whereas Wilson followed some merely mediocre seasons.
I imagine the model missing on Tua is largely attributable to only using the final season, when he was really drafted based on what he had shown in prior seasons. Similarly, Jalen Hurts was probably discounted by his performance before his season in Oklahoma relative to where he would have been drafted if it was just based on that final season—I imagine it would have been more in line with his predecessors Baker and Kyler.
Missing on Josh Allen is probably a plus in my book in terms of process. He was a purely potential pick, seemingly because of his big arm. That doesn’t feel like a particularly successful archetype, so I have no problem missing on him in the first round, despite what he has later become.
The 2020 Sun Belt is slightly better than the 2006 MWC in terms of passing the ball, giving John Beck a slightly lower baseline to go up against, but not enough to make up the gap. Perhaps John might've been a better passer fundamentally, but in 2020 Zach Wilson completed 73.5 percent of passes, on an extremely long average pass. It's one of the best college CPOE seasons ever, and mine is not the only CPOE model that thinks that. There's the famous video of a New York Jets guy talking about their draft strategy, and you can see his CPOE model open in the background, which also has Wilson really high in 2020.
He's a guy that should've been able to work out at the NFL level. A 48% chance of success in my preferred specification is the same as Matt Ryan. Zach Wilson is not like JaMarcus Russell, where the talent in a workout was high, but xEPA on the field did not indicate a first round draft pick. There was something there for Zach, even against a non-power schedule, but it just didn't happen for him. That's what makes the draft fun and unpredictable.
The purely rate specifications like Tua a lot. They like him better than they like Joe Burrow, but when you mix the volume in, Alabama QBs just don't get very much of it, and especially Tua, because he was hurt in 2019. That's why I said I may choose to override the model in his case if I were actually drafting, because his 2019 did not impress the model very much, but his 2018 very much did, and if I were an NFL team, with the ability to give him the all clear on his injuries, assuring myself (in theory) that it would not be a frequent problem moving forward, I'd be willing to draft Tua in round one. Perhaps not ahead of Jalen Hurts, but definitely in the first round.
As for Jalen Hurts, it's true that in both 2016 and 2017 he was really not special, but by the time 2019 had rolled around, he was fantastic, better than either Baker or Kyler in that Oklahoma offence. It's only one year, but so was Kyler Murray the year before, so I really don't understand what the miss was here. Is playing in a mediocre fashion for Alabama as a freshman and sophomore, like Jalen did, worse than not playing at all, like Kyler did? This is what you're implicitly proposing, and perhaps it's true, but that doesn't make any sense to me. Why would that be the case?
Josh Allen is tough. You're right. There is no way I can coerce the model into liking a negative CPOE, poor sack avoider in the non-power Mountain West, and I tried to torture the data into liking Josh Allen. I tried for a long time, and I just couldn't make it happen. There is no way a model based on college skill can be coaxed into liking either Josh Allen or Jordan Love. Both of them were kind of meh, in non-power conferences. Strictly worse than Zach Wilson, in terms of results on the college field.
I would've gone nowhere near either of them with a first round draft choice, but in the bottom quartile of my model, I believe the expected value is 2.06 successful NFL QBs, so somebody was going to hit the longshot. They are the two. Good for them, but I still don't see their story as repeatable.
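For anyone wondering where a number like 2.06 comes from: the expected number of hits in a group is just the sum of each player's individual success probability. Here is a minimal sketch with made up probabilities (these are not my model's actual outputs, and the function names are only for illustration). The "at least one" calculation additionally assumes the players succeed or fail independently.

```python
def expected_hits(probs):
    """Expected number of successes: the sum of individual probabilities."""
    return sum(probs)


def chance_of_at_least_one(probs):
    """Chance that at least one player hits, assuming independence."""
    p_none = 1.0
    for p in probs:
        p_none *= 1.0 - p
    return 1.0 - p_none


# Hypothetical success probabilities for a handful of low-rated prospects.
# Made up for illustration only.
longshots = [0.05, 0.08, 0.03, 0.10, 0.06, 0.04, 0.07, 0.09]

print(f"expected hits: {expected_hits(longshots):.2f}")
print(f"chance of at least one hit: {chance_of_at_least_one(longshots):.1%}")
```

The point is that a big enough pile of longshots will produce a hit or two even when no individual one of them was a good bet.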
The reason playing mediocre as a freshman and sophomore might be worse than not playing at all is it concretizes a poorer level of expected performance. We know that 2018 Kyler and 2019 Jalen both performed incredibly—but we don’t know whether that represents a 67th percentile season from Kyler but a 95th percentile season from Hurts (or vice versa). Because we’ve seen a poorer level of performance from Hurts, I’d be more likely to attribute his performance in 2019 to an outlier (in a general rather than technical sense) season for him, while Murray represented more unknown, but with that unknown I’d be less confident projecting it as an outlier season.
That model, of course, completely ignores improvement across seasons (which you certainly should not do, and that alone may wash out the difference in this specific example), but I think simplifies the illustration of the principle.
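A toy version of that principle, as a sketch only: treat each season as a noisy reading of a fixed underlying talent level and average it against a prior. The normal-normal setup, the variances, and the season scores below are all assumptions picked for illustration (scores are in made up standardized units), and as noted it deliberately ignores improvement across seasons.

```python
def posterior_mean(seasons, prior_mean=0.0, prior_var=1.0, season_var=1.0):
    """Posterior mean of a fixed talent level under a normal-normal model.

    Every season is treated as an equally noisy reading of the same
    talent level, so year-to-year improvement is ignored.
    """
    precision = 1.0 / prior_var + len(seasons) / season_var
    weighted_sum = prior_mean / prior_var + sum(seasons) / season_var
    return weighted_sum / precision


# Hypothetical season scores: 0.0 is average, 2.0 is outstanding.
one_great_year = posterior_mean([2.0])                 # no earlier seasons on record
two_meh_then_great = posterior_mean([0.0, 0.0, 2.0])   # mediocre years first

print(f"one great year only: {one_great_year:.2f}")
print(f"two mediocre years, then a great one: {two_meh_then_great:.2f}")
```

Under these assumptions the same great final season gets pulled down further when there are mediocre seasons on record, which is the sense in which playing poorly early can be worse than not playing at all.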