How to do DevOps

transient states

Rebrand your ops/dev/any team as the DevOps team

Separate DevOps group

Manage change with plan and architects

No plan survives contact with the enemy.

Helmuth von Moltke the Elder

SRE

Site Reliability Engineering

SRE BOOK

https://landing.google.com/sre/books/

“specific implementation of DevOps with some idiosyncratic extensions.”

SRE is “what happens when a software engineer is tasked with what used to be called operations.”

Hope is not a strategy

50% ops (issues, on-call, and manual intervention)

50% development tasks (new features, scaling or automation)

Reduce organizational silos

  • SRE shares ownership with developers to create shared responsibility
  • SREs use the same tools that developers use, and vice versa

Implement gradual changes

  • SRE encourages developers and product owners to move quickly by reducing the cost of failure

Leverage tooling and automation

  • SREs have a charter to automate manual tasks (called “toil”) away

Measure everything

  • SRE defines prescriptive ways to measure values
  • SRE fundamentally believes that systems operation is a software problem

Managing Risk

What is a Risk

  • occurrence
  • severity
  • non-detectability
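One way to make these three factors comparable is to multiply them into a single priority score, in the style of an FMEA risk priority number. The sketch below assumes a 1 to 10 scale for each factor; the `Risk` class, the scale, and the sample risks are illustrative assumptions, not something the slides prescribe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    """A risk scored on three factors (assumed scale: 1 = low, 10 = high)."""
    name: str
    occurrence: int         # how often it is expected to happen
    severity: int           # how bad the impact is when it happens
    non_detectability: int  # how likely it is to go unnoticed

    def priority(self) -> int:
        # Product of the three factors: higher means "handle this first".
        return self.occurrence * self.severity * self.non_detectability


# Hypothetical sample risks, for illustration only.
risks = [
    Risk("bad config push", occurrence=6, severity=7, non_detectability=3),
    Risk("silent data corruption", occurrence=2, severity=9, non_detectability=9),
]

# Rank the risks from highest to lowest priority.
ranked = sorted(risks, key=Risk.priority, reverse=True)
for risk in ranked:
    print(f"{risk.priority():4d}  {risk.name}")
```

A rare but hard-to-detect risk can outrank a frequent one that is caught quickly, which is why non-detectability is scored separately.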

Risks

  • Software fault tolerance
    unusable product ⚖️ stable but not useful
  • Testing
    outages, leaks… ⚖️ lose your market
  • Push
    Every push is risky
  • Canary duration and size

Costs

  • redundant resources
  • 📉 opportunities

SLA / SLI / SLO

  • Service Level Agreement: commitment between a service provider and a client
    • Service Level Objective
      Target values for an SLI
    • Service Level Indicator
      Measure of the service level provided by a service provider to a customer
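The relationship between an indicator and its objective can be sketched as a measured value compared against a target. The `SLO` class, the "availability" indicator name, and the 99.9% target are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLO:
    """A target value for one SLI (names and numbers here are hypothetical)."""
    sli_name: str
    target: float  # e.g. 0.999 means "99.9% of the time / of requests"

    def is_met(self, measured_sli: float) -> bool:
        # The objective is met when the measured indicator reaches the target.
        return measured_sli >= self.target


availability_slo = SLO(sli_name="availability", target=0.999)

print(availability_slo.is_met(0.9995))  # measured SLI above the target
print(availability_slo.is_met(0.998))   # measured SLI below the target
```

The SLA is then the contractual layer on top: what the provider commits to, and what happens when the objective is missed.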

Time-based availability

uptime.is
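Time-based availability is usually computed as uptime / (uptime + downtime). A minimal sketch of that ratio, plus the reverse calculation (how much downtime a target allows over a period), follows; the function names and the 30-day period are assumptions for illustration.

```python
from datetime import timedelta


def availability(uptime: timedelta, downtime: timedelta) -> float:
    """Fraction of the observed time during which the service was up."""
    return uptime / (uptime + downtime)


def allowed_downtime(target: float, period: timedelta) -> timedelta:
    """Downtime that a given availability target tolerates over a period."""
    return period * (1 - target)


# One hour of downtime in a 30-day (720-hour) window.
print(f"{availability(timedelta(hours=719), timedelta(hours=1)):.4%}")

# Downtime tolerated by a 99.9% target over 30 days (about 43 minutes).
print(allowed_downtime(0.999, timedelta(days=30)))
```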

Aggregate availability

  • requests
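Aggregate availability replaces time with requests: successful requests divided by total requests. A minimal sketch, assuming the two counts are already available from monitoring (the function name and the sample numbers are illustrative):

```python
def aggregate_availability(successful_requests: int, total_requests: int) -> float:
    """Share of requests that succeeded over the measurement window."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return successful_requests / total_requests


# Hypothetical window: 2,500,000 requests, 1,750 of which failed.
total = 2_500_000
failed = 1_750
print(f"{aggregate_availability(total - failed, total):.4%}")
```

This form suits services that are never fully "down" but fail a fraction of their traffic.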

Error Budget

Determines how unreliable the service is allowed to be
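The error budget is what remains once the SLO is subtracted from 100%. The sketch below expresses it in requests; the function names, the 99.9% target, and the traffic numbers are assumptions for illustration.

```python
def allowed_failures(slo_target: float, total_requests: int) -> int:
    """Number of failed requests the SLO tolerates over the window."""
    return round((1 - slo_target) * total_requests)


def budget_remaining(slo_target: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    budget = allowed_failures(slo_target, total_requests)
    if budget == 0:
        raise ValueError("this SLO leaves no error budget for the given traffic")
    return 1 - failed_requests / budget


# Hypothetical quarter: 99.9% SLO, 1,000,000 requests, 250 failures so far.
print(allowed_failures(0.999, 1_000_000))
print(f"{budget_remaining(0.999, 1_000_000, 250):.0%}")
```

While budget remains, risky changes can ship; once it is spent, the usual policy is to slow down releases and work on reliability.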

Chaos Team

Test your errors

“Risk friendly” culture

Blameless Postmortem

  • Downtime or degradation
  • Data loss
  • On-call engineer intervention
  • A resolution time above some threshold
  • A monitoring failure
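Read as a list of conditions that call for a postmortem, the items above can be written as a simple predicate. The `Incident` fields and the one-hour threshold are hypothetical; each team would choose its own.

```python
from dataclasses import dataclass
from datetime import timedelta

# Hypothetical threshold; the slides only say "some threshold".
RESOLUTION_THRESHOLD = timedelta(hours=1)


@dataclass(frozen=True)
class Incident:
    """Facts about an incident (field names are illustrative)."""
    downtime_or_degradation: bool = False
    data_loss: bool = False
    on_call_intervention: bool = False
    monitoring_failure: bool = False
    time_to_resolve: timedelta = timedelta(0)


def needs_postmortem(incident: Incident) -> bool:
    """True when at least one of the listed conditions applies."""
    return any((
        incident.downtime_or_degradation,
        incident.data_loss,
        incident.on_call_intervention,
        incident.monitoring_failure,
        incident.time_to_resolve > RESOLUTION_THRESHOLD,
    ))


print(needs_postmortem(Incident(data_loss=True)))
print(needs_postmortem(Incident(time_to_resolve=timedelta(minutes=10))))
```

Note that the predicate looks only at what happened, never at who did it, which keeps the process blameless.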

Team Organization

Problems to solve

Scaling

http://blog.idonethis.com/two-pizza-team/

Planning

It always takes longer than you expect, even when you take into account Hofstadter’s Law.

Hofstadter’s Law

Finishing projects

Adding manpower to a late software project makes it later.

Brooks’s Law

Empowerment

Individuals are less likely to offer help to a victim when other people are present; the greater the number of bystanders, the less likely it is that one of them will help

Bystander effect

Conway’s law

Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure

— Melvin E. Conway

Component Team

  • optimized for delivering the maximum number of lines of code
  • focus on increased individual productivity by implementing ‘easy’ lower-value features

https://less.works/less/structure/feature-teams.html

Feature Team

  • optimized for delivering the maximum customer value
  • focus on high-value features and system productivity (value throughput)

https://less.works/less/structure/feature-teams.html

“Spotify Model”

The “Spotify Model” is not an Agile Method

  • Don’t scale agile… descale your organization

Valve: Cabal

  • Self-organized multidisciplinary project team
  • Form organically
  • People decide to join the group based on their own belief
  • Structure changes according to new requirements

p15 Valve_NewEmployeeHandbook.pdf

OKR

  • Objective
    a clearly defined goal
  • Key Results
    specific measures used to track the achievement
  • 1 quarter
  • Public
  • Can be shared across the organization
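An OKR can be modeled as an objective with measurable key results. In the sketch below, progress is the plain average of the key results' progress; that scoring rule, the class names, and the sample values are assumptions, since the slides do not define one.

```python
from dataclasses import dataclass, field


@dataclass
class KeyResult:
    """A specific measure, tracked from a start value to a target value."""
    description: str
    start: float
    target: float
    current: float

    def progress(self) -> float:
        span = self.target - self.start
        if span == 0:
            return 1.0
        # Clamp to [0, 1]; also works when the target is below the start.
        return max(0.0, min(1.0, (self.current - self.start) / span))


@dataclass
class Objective:
    """A clearly defined goal for one quarter, with its key results."""
    title: str
    quarter: str
    key_results: list[KeyResult] = field(default_factory=list)

    def progress(self) -> float:
        if not self.key_results:
            return 0.0
        return sum(kr.progress() for kr in self.key_results) / len(self.key_results)


# Hypothetical objective, for illustration only.
objective = Objective(
    title="Make deployments boring",
    quarter="Q3",
    key_results=[
        KeyResult("Services on the shared pipeline (%)", start=0, target=100, current=50),
        KeyResult("Manual deployment steps", start=10, target=0, current=5),
    ],
)
print(f"{objective.progress():.0%}")
```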

Voyager

Join another team for 1 sprint/quarter to help achieve an Objective that needs it

Peer Review Evaluation

No one but your peers can evaluate what you did to achieve your Objectives

Build your own model

Depending on:

  • Legacy
  • Culture
  • People

Change when needed

Today’s work is the legacy of tomorrow.

Product Creation

Everything-as-a-Service (EaaS / XaaS)

Day-to-day work

Pair programming

Review

Changes have to be read and merged by other members of your team.

Everyone is responsible for them afterwards.

Declaration over Convention

Your conventions may not be those of other teams

Eat Your Own Dog Food

You have to use your own product to know how your users feel

Test everything

Never assume it works

Communication Rituals

Brown Bag Lunch (BBL)

OpenSpaces

MEP It Easy


  • Start Small – Build Trust
  • Create Champions
  • Build Confidence
  • Celebrate Success
  • Exploit Circumstance

https://www.slideshare.net/jesserobbins/cloud-expo-jesserobbinsopscode20130129b

End
