class: center, middle

# Migrating Applications Without Fear

???

* Hi everyone, thanks for coming to PyTennessee, and thanks for coming to my talk.
* You're at _Migrating Applications without Fear_, a beginner to intermediate level talk about dealing with some of the technical and organizational challenges around changing the infrastructure an application runs on.
* My goal is that this material will be useful to you whether you're primarily a developer, operator, or a mix.
* This is a talk about tactics, not strategy. I'm not going to discuss how to decide whether an infrastructure migration is warranted, just how to be prepared for one.
* I'm going to try to spend 2/3 of my time on my prepared material. If you want to stop me to ask a question, go ahead!
* If at any point you decide this isn't the talk for you, don't be afraid to migrate to another one.

---

# Me

* Brian Pitts
* Operations Engineer at Mozilla
* Previously Eventbrite, Lonely Planet, and more
* Former Nashvillian, Current Atlantan
* https://www.polibyte.com

???

* write a lot of tooling in python, run a lot of python applications amongst other languages
* always glad for pytn to give me an excuse to visit nashville, grateful it still came together despite terrible damage and loss of life from tornado

---

class: center, middle

???

* so why talk on migrations?
* made diagrams showing the applications and teams my team works with, this is one filtered down to just areas i touch
* Talk about migrations on all those projects
* decided to share what i'd learned, if only to force myself to think about it more systematically

---

class: center, middle

???

As I was putting this together, I realized there were 2 callbacks to past PyTN. One is to Lars's 2017 keynote on socorro, because I too am going to talk about that. If you saw both

* mozilla never learns
* lars is more entertaining

---

class: center, middle

# Integration Points

???

* The other is my 2017 talk on Capacity and Stability Patterns.
One of the key messages of that talk is that much of the risk is found in your system's integration points.
* As I was putting this together and looking for the overarching theme, I realized that the same message applied.
* Migrating without fear is enabled by understanding your application's integration points (with your infrastructure, with other applications, and with its users) well enough to confidently make changes.
* And it's enabled by designing those integration points in ways that minimize the impact of changing them. So let's talk a bit about design first.

---

class: center, middle

# Use boring technology

???

* set of common services you can find on any cloud provider or run yourself: relational databases, message queues, object stores, etc
* layer you can build on and ensure portability
* prefer these common, well proven tools to newer, provider specific ones
* not saying build to lowest common denominator, e.g. if you're going to use postgres, maybe go all in! don't have to be able to switch to any db, just have to know you can run pg anywhere
* sometimes you may want to build heavily on a provider specific service, e.g. mozilla's data team is investing heavily in bigquery right now. but do this conscious of the tradeoffs, including lock-in.

---

class: center, middle

# Swappable Implementations

???

* don't let dependencies on your infra layer pervade your codebase
* structure codebase to allow swapping implementations, or use libraries that abstract
* example is socorro, which has seen its queueing and crash storage mechanisms shift dramatically over time
* recently did this again, and at this point it's a one line config change to go from, say, google's pubsub to amazon's sqs

---

class: center, middle

# Containerize

???
* originally said dockerize, but there are other toolchains you can use if you prefer
* think about dependencies of your python app, you probably think about requirements file or pipfile and what you get from pypi, but there's more
  * dependency on specific python version
  * for modules that aren't pure python, specific versions of system libraries
  * likely for a webapp you have a webserver, like nginx, in front
  * expectations around how things like logs are managed
* huge number of things you rely on from the host, making your build artifact be a container has two benefits
  * helps you clearly define what those dependencies are
  * decouples satisfying those dependencies from where your app actually runs
* at this point tons of ways to run a container, from simply installing docker on a server, to clustering tools like ecs or kubernetes, to traditional PaaS like heroku
* my part of mozilla made this transition in 2016, over 200 containers on dockerhub and it's paid off really well

---

class: center, middle

# Infrastructure as Code

???

* just as container lets you capture host-level dependencies, IaC lets you capture other infrastructure dependencies. numerous benefits
* makes your dependencies clear, no questions like "oh, i think this app uses a cache". does your infra code set up a cache?
* depending on type of migration, it can potentially be applied wholesale, or serve as basis for porting
* preferably use something multiprovider like terraform. even if resources you are calling change during migration, can reuse structure, and easier to have one tool throughout
* another benefit is speed of turning up new environments during migration, and consistency across them
* example of elmo vs taskcluster

---

class: center, middle

# Instrumentation

???
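For anyone reading these notes later, here is a minimal sketch of what instrumenting an integration point can look like: counting calls and timing them, split by success and failure. The `CALLS` dict, the `instrumented` helper, and the `fetch_from_cache` stand-in are all hypothetical names made up for illustration; a real app would report to whatever metrics system it already uses rather than an in-memory dict.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Hypothetical in-memory store, keyed by (integration point, outcome).
# Assumption: a real app would send these numbers to its metrics system.
CALLS = defaultdict(lambda: {"count": 0, "total_seconds": 0.0})


@contextmanager
def instrumented(integration_point):
    """Record the count and latency of one call, by success or failure."""
    start = time.monotonic()
    outcome = "success"
    try:
        yield
    except Exception:
        outcome = "failure"
        raise  # never swallow the caller's error, just record it
    finally:
        stats = CALLS[(integration_point, outcome)]
        stats["count"] += 1
        stats["total_seconds"] += time.monotonic() - start


def fetch_from_cache(key):
    """Stand-in for a real call to a cache (hypothetical)."""
    if key == "missing":
        raise KeyError(key)
    return "value"


# One successful call and one failed call against the same integration point.
with instrumented("cache.get"):
    fetch_from_cache("present")

try:
    with instrumented("cache.get"):
        fetch_from_cache("missing")
except KeyError:
    pass
```

Wrapping every call to a dependency this way also leaves you with a list of integration point names, which doubles as a rough inventory of what the app depends on.
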
* last year i gave a talk on instrumentation, and sam clarke gave one yesterday
* by instrumentation i mean metrics, logs, traces, whatever you've added to your code to understand its behavior in the wild
* number of ways that it's important
  * in my talk i said that the first thing you instrument is your integration points, tracking count and latency of successful and failed calls. so this acts as another form of documentation for your dependencies.
  * understand utilization, better size your new infrastructure
  * validate behavior before, during, after migration

---

class: center, middle

# Understand Decisionmakers' Motivations

???

* Now let's switch gears and talk about organizational rather than technical issues.
* big one is why are you migrating, what do decisionmakers want out of it.
* they don't just want to be able to say we migrated, they want to be able to say we migrated and X
* X could be decreased cost, improved reliability, etc
* need to know what X is so you can prioritize
* example of elmo = datacenter shutdown = speed, taskcluster = separation of concerns = security

---

class: center, middle

# Carefully negotiate scope and timeline

???

* No matter how well you think you understand your system, changes inherent in migration introduce risks
* Minimize taking on any additional risks
* Main unnecessary risks you can take on are doing too much or doing it too fast
* Resist the urge to rearchitect. Example of taskcluster changing its auth system, introducing new worker types, etc., which slowed it down
* Fight for the time to do it right. Things like end-to-end testing often take longer than expected and there can be pressure to skip them. Don't gamble.

---

class: center, middle

# Plan

???

* understanding scope and timeline means you need a plan!
* not set in stone, in fact should evolve and get more detailed and concrete as you progress
* at the beginning you may just have broad steps, at the end for the actual cutover you should have a checklist of who is doing what, when, how it's validated, etc

---

class: center, middle

# Build Rapport

???

* identify key players
* get them to trust each other

---

class: center, middle

# Low friction communication

???

* lots of info needs to be shared
* people will have dumb questions
* safe and easy place to chat
* e.g. on taskcluster went from one weekly mtg to slack channel

---

class: center, middle

# Regular status updates

???

foo

---

class: center, middle

# Thanks!

???

my status update is, that's it! i hope some of what i shared is helpful