Cooking with Otters (Arguably Too Hard)

11/02/2024



Sanctum of All, at this point, is known primarily for our constructed innovations. For almost every single constructed event, at least a few members bring a spicy brew - and Worlds 2024 was certainly no different.


This past weekend was a bit unique though, in that literally every member of our team registered the same deck. And even beyond that, 7 out of the 8 of us registered the same 75 - Rei Zhang (cftsoc) was the only odd one out, with 3 small card changes.


i'm an otter!



Going into the event, we were very happy with our deck choice. We didn't quite think it was broken, per se, but we thought it was a very good deck that lined up fairly well against the metagame, with a few close matchups but certainly no terrible ones.


But unfortunately, our results did not reflect our expectations. Of the 8 of us, only 3 made day 2, and our best finish was Ryan Condon going 9-5 to qualify for their first PT(!). Our combined constructed record was below 50%, and my personal record was also quite mediocre at 6-8.


So, what went wrong? Why did we choose to play otters, and why did it underperform our expectations? Where do we go from here?


DISCLAIMER


This article is very intentionally not a deck/sideboard guide, and rather more of a brewing history and tournament report. It will be heavily focused on the testing process, rather than specifically how to play the Otters deck.


If you want to learn Otters as a deck, you should also go check out Ryan's guide! They do a fantastic job of comprehensively breaking down every part of the deck, and I could not recommend their guide enough!




Part 1: How did we get here?


One of my favorite parts about working as part of a testing team is the feeling of a deck truly being a team effort. There are always going to be individuals that are spearheading efforts to work on a deck, but I always love seeing how much every single extra pair of eyes can matter.


In this regard, the origin story of the otters deck is extremely amusing to me.


It actually started many months ago, right after BLB released. At that point, we weren't really seriously prepping for Worlds yet, but I was still doing a bit of Standard brewing just for fun. And in that brewing, I had a thought: what if we played Valley Floodcaller with Invasion of Segovia?


see if you can find the infinite loop!



So I sent this list to Autumn Burchett, and she played it a bit on stream and thought it was fun, though we both agreed it was reasonable but not particularly competitive.


Months later, after the release of DSK, we were fiddling with yoman5's Stormsplitter combo list, and trying to improve on it. And as we were trying various things, Autumn sent back the Otter Segovia list, with the comment: "I wonder if there’s a way to incorporate ideas from both decks into something new".


And thus, the deck was born.


Slow Beginnings


We pretty quickly built out what would become the core of the deck: Stormchaser's Talent, Thundertrap Trainer, Valley Floodcaller, Enduring Vitality, This Town Ain't Big Enough (TTABE), Roaring Furnace, and Torch the Tower were in there from the beginning, while Analyze the Pollen joined soon after to increase Trainer hits and Bitter Reunion was not far behind as we realized how important haste was.


the first somewhat-refined version of the deck



cft was actually the first person who was really advocating for the deck. Their biggest argument for why the deck worked well was, in their words: "enduring is really good at counting".


"Counting", here, refers to the concept that many decks need to "count" to a certain number of raw resources (cards in hand or play, lands in play) in order to enact their gameplan. I wrote an article all about it, but the short version here is that Enduring completely breaks this paradigm by providing so much extra mana with just one card. It then allows you to leverage more mana-intensive play-patterns (like Talent+TTABE), without having to invest very many cards into producing that mana.


This certainly was all quite appealing - I did love the flexible gameplans that the deck had access to. However, I started off still not quite convinced, as to me the deck had one glaring issue: it felt about half a turn too slow.


Basically, in playtesting the deck felt like it was really nice when it got its engine running, but struggled to set up fast enough. There were many games where it felt like it died while a mere 1-2 mana away from comboing.


So, while I was interested in the deck, I was skeptical. That is until...


It Says Rat???


At some point, we were doing Scryfall searches to see what cards might help improve the Otters deck. I remember I was searching through each of the creature types listed on Floodcaller to see if there was anything cute we could do, when my eyes glanced over the card Song of Totentanz.


Amusingly, I think my initial reaction was actually to dismiss it as "Oh that's cute, it gives haste. Probably worse than Bitter Reunion though, we don't care much about the tokens." For some reason, my brain dismissed it as a haste-granter, and not a token-maker, even though the search I was doing was literally for rats.


But on my second pass I caught it. And as soon as I internalized exactly what the card did, my skepticism of the otters deck fell away. This, to me, was exactly the missing piece we needed - it was a compact way to speed up the combo by a full turn or more, giving just that much more oomph to the combo gameplan.


a proof of concept video i posted shortly after - 23 damage starting from 6 mana and Vitality in play using only Floodcaller, Talent, and Song



After adding Song to the deck, I started working a lot on it. To me, one of the biggest flaws of the deck had been solved with the inclusion of Song - but there were still many many things we could optimize further.


This Town Ain't Beans Enough


As we were working on the deck, one of the biggest questions was what kind of card advantage we wanted to play.


We nicely already had some inherent card advantage through modal cards like Talent and Trainer providing extra cards with extra mana put in, but we also wanted some cheaper ways to just go up raw cards more easily. TTABE could sometimes provide this by bouncing our own cards, but we wanted just a bit more than just that.


Some early considerations included card draw permanents like The Everflowing Well (for its synergy with TTABE), and raw card draw like Plan the Heist. But all of those options just felt a bit too clunky - 3 is a lot of mana when compared to the rest of the deck mostly costing 1-2 mana, even for a grindy card.



The breakthrough was quite simple and obvious in the end: Up the Beanstalk is quite well-known as a strong card draw engine. We had been hesitant to play it at first because it seemed kind of silly to do so when we only really had one way to trigger it (TTABE), but once we actually tried it, it turned out that it was just good enough anyways!


Basically the reason was simply that we were actively trying to find TTABE every game anyways, and on top of that often recurred TTABE with Talent multiple times to boot! So while Beans was fairly limited in how often it triggered, even just one trigger was good enough and common enough to justify including, and it wasn't actually that hard to get multiple.


On top of that, there are actually a couple other possible Beans triggers in the deck! Song for X=4 or more is something that would actually happen pretty regularly; and while casting the Sauna half of Furnace/Sauna first is less common than the other way around, it can happen, especially in games where you care about the card draw.


Fighting Against Red


The final big development in our deck came at the last minute, basically during our final constructed meeting.


Throughout our testing, the decks we had been by far the most worried about and focused on were the red aggro decks - whether with Leyline of Resonance or with Innkeeper's Talent. We thought it was not quite a bad matchup per se, but both hard to play and very close even when played well. We tried many things to improve the matchup, but never felt we could quite get it to favored.



At some point though, cft suggested the card Bushwhack. We had been playing with Analyze since the start of the deck, and the initial Bushwhack suggestion was meant more as a 5th Analyze than anything more - the idea was to make Trainer just a bit more consistent. But as we played more with it, Claire started advocating for the idea that maybe we should just be playing many Bushwhacks. And as we kept playing the red matchups, it started to look more and more appealing.


Just having access to more cheap removal in those matchups felt game-changing. It wasn't perfect, of course - Torch was still much more important, as an instant-speed answer - but being able to turn some of your lands into removal spells was huge.


The idea caught on quick, and in the final 24 hours we shifted from 4 Analyze 1 Bushwhack, all the way to 3 Analyze 3 Bushwhack.




Part 2: What happened?


Going into the event, we were confident. I distinctly remember both Claire and Nicole commenting that this was the most confident they felt about their constructed deck in some time. And while I personally wasn't as happy as I had been with Slogurk or Nadu (it would be hard to be without actually breaking a format), I felt like there was a good chance we had the best deck in the room.


So what happened? What went wrong? Was there something we missed in testing? Or was it just variance?


Cooking Too Hard


At the end of day 1, I ran a small postmortem in the players' lounge. Not everyone on the team was there, but I was interested in hearing any thoughts anyone had - both for how they might inform the three of us going into day 2, and for how the team could improve our processes in general for future events.


And the consensus was that the deck did in fact seem as good as we thought it was during testing!


Most of the matchups seemed to go just as we thought they would - though the Oculus matchup did seem scarier than we expected given that people were building that deck better than online versions. The metagame seemed to be roughly what we thought it would be - there was more Dimir Midrange and less Golgari Midrange than we thought there would be, but both were good matchups for us so that was a bit of a wash.


Instead, the conclusion we came to was that the deck was just too hard. Multiple people highlighted making specific mistakes that concretely cost them games. It wasn't so much that the deck was failing us; we were failing the deck.


But is that really all there is to it? Magic is a hard game no matter what deck you play, so it's hard to simply claim "I made gameplay mistakes" and dismiss that as the only problem with your performance. "The deck was too hard" is an oversimplification. What can we learn from this? How can we avoid this in the future?


Sample Size


First, an important sidenote: this tournament featured a much smaller sample size than even most PTs. Not only did we only have 7 constructed rounds per day instead of 8, we also had half our usual team size, as the tournament overall was smaller.


This makes it even harder than usual to read into numerical results. Sure, we had a less than 50% constructed winrate - but flip the results of as few as five matches and suddenly we'd be reporting a winrate over 60%!


I specifically also felt that I had a bad variance weekend, to the tune of worse than 10% variance. And the margins on Magic events are quite small - as the saying goes, it takes both luck and skill to win at Magic. So keep that in mind as we talk about results: the results were bad, but sometimes you just have a bad weekend.
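
To put rough numbers on this, here is a minimal Python sketch. The team record in it is hypothetical (17 wins in 36 matches, picked only to illustrate the arithmetic, not our actual Worlds record), and the standard deviation estimate assumes every match is an independent coin flip with a fixed win probability, which real matches are not.

```python
from math import sqrt


def winrate(wins: int, matches: int) -> float:
    """Observed winrate as a fraction between 0 and 1."""
    return wins / matches


def flipped_winrate(wins: int, matches: int, flips: int) -> float:
    """Winrate if `flips` of the losses had been wins instead."""
    if flips > matches - wins:
        raise ValueError("cannot flip more matches than were lost")
    return (wins + flips) / matches


def winrate_std(true_winrate: float, matches: int) -> float:
    """Standard deviation of the observed winrate over `matches` games,
    assuming independent matches with a fixed win probability (binomial model)."""
    return sqrt(true_winrate * (1 - true_winrate) / matches)


# Hypothetical team record, NOT the real tournament data.
team_wins, team_matches = 17, 36

print(f"as played:       {winrate(team_wins, team_matches):.1%}")
print(f"5 matches flip:  {flipped_winrate(team_wins, team_matches, 5):.1%}")

# Spread of a single player's 14-round record for a true 50% player.
print(f"std over 14 rds: {winrate_std(0.5, 14):.1%}")
```

Under those assumptions, one standard deviation on a 14-round record is around 13 percentage points, so a swing of 10% in either direction over a single weekend is not unusual.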


Structurally Unforgiving


One more nuanced way to phrase the idea "the deck was too hard" is to look at it structurally: the deck was quite unforgiving to mistakes in its play patterns.


Some decks are very forgiving. Take Nadu, for example: as long as you know how the combo works (and even if you mess up the combo a bit!), you will win a lot of games from just being a blisteringly fast and resilient combo deck. When you win you win by miles, so giving an inch does not make a difference. Small mistakes that cause you to take 2 more damage than you need to won't matter if you're winning on turn 3 anyways.


On the other hand, some decks are extremely unforgiving - otters is one such example. It is a deck with a lot of decisions and agency. And on one hand this is a great strength of the deck - it allows for a lot of flexibility in gameplay, giving you more paths to take and making it more likely that at least one leads to victory. But on the other hand, a lot of the power we attributed to the deck was tied up in this flexibility - so if you made a misstep and took the wrong line, there was not as much inherent power to cushion you.


This isn't to say that the deck is "worse" than we thought overall. We do not think we misjudged the strength of the deck at its peak, and we still believe that we brought one of the best decks at the tournament (perhaps Quinn Red was better)! But I believe that we did have a slight misconception in exactly how that strength was distributed.


a sketched graph comparing how strong Nadu and Otters are as decks relative to skill level. obviously Nadu is stronger by many metrics, but it also has the property of leveling off and getting close to maximum strength earlier in skill level. on the otter hand, otters requires a lot of skill expression to maximize.




Testing Takes Time


We certainly recognized how hard Otters was as a deck. Internally during testing there were a lot of comparisons to Slogurk, where we concluded that Slogurk was more tactically difficult but Otters was much more strategically difficult. But I do think that we failed to fully realize the consequences of this.


We spent a lot of time testing, and a lot of time working on gameplay specifically, but empirically it still wasn't enough. Several others specifically commented during the postmortem that they felt they needed more reps, and that they should've recognized this during testing.


But at the same time, I think we also spent too much time on constructed. Our limited performance this tournament was also worse than usual, and I think we skimped a bit on a few of the things we usually do to prep for limited - especially draft log reviews. And in hindsight, this is pretty clearly tied to the fact that our constructed deck was so hard - we felt that we needed to spend more time than usual on constructed, and that time had to come from somewhere.


Does that mean that we should've avoided trying to make a hard deck work? Well, I don't think that's the case. I still think I would register the same deck again, and I still think that we were right to pursue the deck.


If there's any lesson to learn here, it's just to both be more aware of exactly what our testing needs are, and to be more efficient overall. Every second of testing time counts; no tournament before this one has made this more clear to me.


No Plan Survives Contact


Finally, I think one thing we could've done better is specifically focus more on testing in an environment similar to the tournament.


A lot of the testing we do on Sanctum is somewhat casual in form, if not in intent. We're trying our hardest to learn and develop decks, but this often involves things like allowing some degree of takebacks, or pausing games to have people deeply analyze positions.


And I think this kind of testing is very valuable! It's certainly the kind of testing I personally find most useful, as I value exploration and innovation a lot.


But I think this time we ran into some issues with not practicing enough in a tournament environment, where you have specifically limited time to think. Practicing playing slow is essential to doing well at playing fast, but it is also important to have some sense of what it feels like to play fast.




Part 3: Where do we go from here?


So, would I recommend Otters as a deck?


Well, I would certainly register it, personally. I still believe in it as one of the best decks in the format, and I think that it has the tools to take on basically any direction the meta could develop in.


I would caution, though, that you should know what you're getting into before picking up the deck. It is legitimately quite hard, and so will take effort to learn. Don't let that scare you away! But keep it in mind.


With that in mind, here are some ideas I have for future work on the deck:


Tuning Mono-Red



The other deck out of Worlds that I thought might be the best deck in the room is Quinn Tonole's Mono-Red deck.


We actually were pretty much aware of this list going into the tournament - as Quinn had been playing it in challenges - but we (correctly) predicted that no one except Quinn would bring it. And we were pretty happy with that, as it is probably slightly scarier to our deck than the Gruul versions of red aggro.


And after Quinn's good performance at Worlds, the deck has been gaining popularity online. As such, while we didn't focus much on it during Worlds testing, it's probably now quite important to be prepared for this version specifically.


My first priority would be to test out how good Dissection Tools and Screaming Nemesis are against a deck with four copies of Nemesis. They were our sideboard plans against Gruul, but both have some awkwardnesses against opposing Nemeses.


If Nemesis is a big problem, I would look into whether non-damage-based removal like Unable to Scream was at all viable. Alternatively, I would want to look into different sideboard plans, especially ones that try to lean more into the combo to try and race - since the deck is inherently a bit slower but more inevitable compared to Gruul.


The Ninth Otters Player



There was another Otters player at Worlds, Rei Hirayama, who had a different take on the deck than we did. Overall I like our list better, but there are many ideas in Rei's deck that I think are worth exploring more.


Sleight of Hand is the biggest one to me. I am somewhat skeptical of making the mana work while playing both Sleight AND green land-searching effects as pseudo-land-replacements, but it might not be quite as bad as I think, and is definitely worth exploring further.


The other big thing, which we did think about but never tried, is the four copies of Bristlebud Farmer in the sideboard. I suspect they are worse than Dissection Tools at stabilizing against red, but again, certainly worth trying.


Foundations


Unfortunately, there aren't that many cards out of Foundations that look appealing for this deck. It's possible Inspiration from Beyond could do something as a spell that loops with two copies, but it seems pretty anemic and unnecessary to me. Aetherize could be a nice option against aggro, but at the same time seems a bit too slow.


But even if there are no direct upgrades to Otters from Foundations, the set will almost certainly have a big impact on Standard, which will shape how the deck wants to be constructed going forwards. I'm pretty hopeful that Otters will gain from the changes overall, as I think it's a deck that is structurally pretty good at dealing with Llanowar Elves decks.


But we'll see! I'll probably be bringing Otters to some RCQs when Standard season starts just so I can play the deck more, but other than that there aren't that many opportunities for me to play Standard in the near future at a higher level - I'm going to Atlanta, but judging there, not playing.


There is VML Champs coming up soon though - and I would not be surprised if I registered Otters for that.




Conclusion


Overall, I'm okay with where the deck ended up. It's disappointing that we couldn't show off its power more, but we did have one team member, Ryan, qualify for her first PT! (again, reminder to go check out her excellent guide)


Sometimes, you just have a bad weekend. And while it's important to recognize and learn from the mistakes you made, it's also important not to read too much into what actually is just variance. So while I do think there are many parts of our testing process we on Sanctum can improve on going forward as a team, I am overall happy with the deck we made.


It certainly was an otter delight to play.









#FreePalestine | Consider donating to UNRWA or PCRF, supporting protesters locally, and educating yourself.