DevOps Engineering is about Workflow


Currently, I do a bunch of different things (like most Brazilians): Architecture, DevOps Engineering, Software Engineering, Agile Coaching and a bit of management. There are different types of companies and cultures, but basically you are either more specialized or more cross-functional. I think in general Brazil is similar to Europe, more cross-functional, while the US is more specialized, but it really depends on the company. I also run a small team working on Dynomite / DM, Serverless Remediation, a Stress Test and Chaos Platform, and Telemetry / Observability. My teams need to provide solutions (engineering) but also stability.

Working with DevOps Engineering can be very tricky. Depending on how we decide to do things, we can end up with complex code that is hard to maintain and also hard to understand. IMHO clear intentions and clear code are always a goal when delivering a solution, no matter if it is software engineering or DevOps engineering.

Provisioning microservices is quite straightforward: you often install some dependencies on the OS using a provisioning tool like Ansible, Chef, Puppet, SaltStack or CloudFormation, and then you take a package like a WAR, RPM, tar.gz, egg or gem and deploy it on the box or in a specific server like Jetty, Tomcat, WildFly, etc. However, when you need to do provisioning for NoSQL databases, things get a little bit more fun and more complicated as well.


Hidden Concurrency Issues


It's easy to fall into hidden concurrency issues when you are doing deploys with bigger clusters. Let's say you add some code under /etc/init.d/ on the machine to create a schema in Cassandra, for instance. If you deploy a 3-box cluster you might be fine, but when you do it with 12 or 20 boxes there is a high chance you will run into trouble, since every box runs that same code when it comes up. So you could continue going down the rabbit hole and add some extra IFs, retries, try/catches, timeouts and exponential backoff to deal with that. However, IMHO this sounds like a design issue: this code should not be there.
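One way to picture the design fix is to move the schema step out of the per-box startup code and into a single orchestrating function. The sketch below is only an illustration: provision_cluster, start_node and create_schema are hypothetical names, and nothing here talks to a real Cassandra cluster.

```python
def provision_cluster(node_names, start_node, create_schema):
    """Start every node, then create the schema exactly once.

    start_node and create_schema are hypothetical callables supplied by
    the caller; they stand in for whatever really boots a box and applies
    the schema.
    """
    if not node_names:
        raise ValueError("at least one node is required")

    # Bring up every box first; no box touches the schema on its own.
    started = [start_node(name) for name in node_names]

    # The schema step runs once, from the orchestrator, against one node.
    # Cluster size no longer changes how many times this code executes.
    create_schema(started[0])
    return started
```

With 3 boxes or with 20, create_schema is called one time, so the extra IFs and retries around concurrent schema creation are not needed for this particular problem.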


Footnote on Retries, Timeouts and Exponential Backoff


I'm not saying these are bad things. It's the opposite, actually: these are default practices in distributed systems. Google talks a lot about them in the SRE book. There are some great practices here and here.
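For reference, here is a minimal sketch of what a retry with exponential backoff can look like in Python. The function name and its parameters (retry_with_backoff, base_delay, max_delay, and the injectable sleep and rng) are my own assumptions for illustration, not something taken from the SRE book or from a specific library. It uses random jitter so that many clients do not retry at the same instant.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.5,
                       max_delay=30.0, sleep=time.sleep, rng=random.random):
    """Call operation() until it succeeds or max_attempts is reached.

    Before retry number n (starting at 0) the wait is a random value
    between 0 and min(max_delay, base_delay * 2**n).
    sleep and rng are injectable so the function can be tested without
    actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(rng() * cap)
```

In real code you would usually catch only the exceptions you know to be transient instead of a bare Exception.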


The Frame or Framework


A framework “frames” our way of thinking about work, and this is good and bad at the same time. Without review, it's easy to keep doing the same things over and over, or to do something one time and keep it that way forever, but this creates tech debt. IMHO not re-thinking code, or just moving forward, can be a big source of tech debt. Good design for me is like good coffee: once you get it, it will be hard to live without it.

The Flow


This is something I've thought about for a while, and from time to time I go back to this thought about flow. I found some great evidence that DevOps Engineering is about flow. For instance, take a look at some open source projects and services like:


Jenkins Pipeline: You will realize it's about flow, and the visualization makes it easier to understand. We could build a monolithic job doing everything with lots of complicated scripts, but it would definitely be harder to understand.


NetflixOSS Spinnaker is another great piece of evidence of flow, since Spinnaker works as a cloud orchestrator, calling Jenkins jobs and doing AWS (or another cloud vendor like Google) specific tasks such as creating ASGs, SGs, ELBs, etc.


Apache Airflow is another piece of evidence. It is used mainly by Airbnb and PayPal. PayPal uses it for Hadoop job remediations like replacing nodes.


AWS Step Functions: You can see it is a very simple state machine with very basic Enterprise Integration Patterns (EIP). So this “problem” definitely exists in Serverless as well, and it looks like AWS sees it.


All these tools are great, but they are kind of macro. We might also need some kind of micro-workflow, something like code-level orchestration. You could easily achieve that with an internal DSL. Basically, a main method that just calls functions could easily do the job as well. There are also some libs in Python (for instance) that do this, like Workflow, SpiffWorkflow, Toil and many others.
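To make the micro-workflow idea concrete, here is a small sketch of a main method that just calls functions in order, with the steps declared as a plain list. It does not use any of the libs mentioned above; run_workflow and the three steps (launch_nodes, create_schema, smoke_test) are hypothetical names, and each step only records what a real step would do.

```python
def run_workflow(steps, context=None):
    """Run each step in order, passing a shared context dict along."""
    context = dict(context or {})
    context.setdefault("completed", [])
    for step in steps:
        step(context)
        # Keep a trail of finished steps so the flow stays visible.
        context["completed"].append(step.__name__)
    return context


# Hypothetical steps: each one only writes to the context.
def launch_nodes(ctx):
    ctx["nodes"] = [f"node-{i}" for i in range(ctx["cluster_size"])]


def create_schema(ctx):
    ctx["schema_created_on"] = ctx["nodes"][0]


def smoke_test(ctx):
    ctx["healthy"] = len(ctx["nodes"]) == ctx["cluster_size"]


def main():
    # The list below is the whole "flow": reading it tells you what happens.
    steps = [launch_nodes, create_schema, smoke_test]
    return run_workflow(steps, {"cluster_size": 3})


if __name__ == "__main__":
    print(main()["completed"])
```

The point of the design is that the intention lives in one readable list of steps, while retries, logging or timing could later be added in run_workflow without touching the steps themselves.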


There are different kinds of problems in DevOps Engineering, and more and more I see a fit for workflow solutions, given how hard and counterintuitive your code can get. However, I don't believe in silver bullets, so this is not for everything.


Cheers,
Diego Pacheco

