Tactics and Strategies in Reliability

Incident response has two key dimensions: handling the incident as it occurs and driving follow-up to prevent future issues. An incident response framework can provide structure for both. Register and explore how to prevent future issues with the framework Dropbox uses.

This session is part of the PagerDuty Virtual Summit.
Recorded Nov 2 2016 27 mins
Presented by
Andrew Fong, Director of Engineering, Dropbox

  • Artificial Intelligence and Machine Learning Get Real Dec 6 2017 8:00 pm UTC 60 mins
    Dominic Marion, Operations Support and Release Manager, NBC News & Lilia Gutnik, Product Manager, Data Experience, PagerDuty
    Too often the terms machine learning and AI are associated with complex black box algorithms and models that no one seems to understand. PagerDuty is taking a more applied approach to machine learning – specifically focusing on real-life problems that users face. In this session, we'll talk about our recent release and future plans.

    You'll hear how we have delivered real, tangible value to some of our early customers using machine learning, advanced statistics, and sometimes, just simple math. No magic black boxes or unicorns and rainbows allowed.
  • Panel: Reimagining Customer Support Nov 29 2017 3:00 am UTC 45 mins
    Disha Gosalia, Sr. Director of Customer Support, Cloud Products, GE Digital
    Consumers are on the hunt for the ultimate customer experience, whether you are a consumer-facing brand or an enterprise company. Discover how brands like GE Digital, Gainsight, and Okta use PagerDuty to coordinate response to customer issues of all sizes, delivering the best possible responses 24×7, every time.
  • Panel: The Business Case for DevOps Nov 29 2017 2:00 am UTC 45 mins
    Andrew Fong, Director of Engineering, Dropbox
    What is the real value that organizations get from undergoing a DevOps transformation? Join us as industry experts and IT leaders discuss actual DevOps implementations and their value to the customer, the business, and the team, in dollars, time, and satisfaction. We’ll explore the key metrics that you should use to build a strong case for DevOps.
  • The Future of DevOps Nov 29 2017 1:00 am UTC 45 mins
    Adam Jacob, Co-Founder & CTO, Chef
    What does the future hold for the world of DevOps? Implementing DevOps means changing the technology system as well as the cultural system. Both are intertwined and, to be successful, both systems must evolve to drive velocity. Learning how to manage and lead by giving people context and information to make better decisions—versus giving them tickets or briefs—is how leaders and teams will thrive.
  • Incident Response Best Practices Nov 29 2017 12:00 am UTC 45 mins
    Susan Fowler, Editor-in-Chief of Increment, Stripe
    Hear from Susan Fowler about the current state of incident response. After surveying over 30 leading Bay Area companies about their incident response practices for the first issue of Increment, she’ll share insight into the trends found, tips and tricks for implementing change at your company, and what the future of incident response holds.
  • William Hill Australia Migrates to the Cloud with PagerDuty Nov 28 2017 11:00 pm UTC 30 mins
    Alan Alderson, Head of Infrastructure & Operations at William Hill Australia & Ancy Dow, Sr Product Mktg. Manager, PagerDuty
    76.5% of Australian consumers will leave a digital app or service in less than one minute if it’s slow or unresponsive. Imagine the stakes at hand when placing a bet online - downtime can prevent huge potential winnings for a customer. At William Hill’s Australian division, 99.999% uptime is not good enough, especially on Tier One event days like the Melbourne Cup. That’s why they've placed a safe bet in PagerDuty and have seen immediate results. Learn more about how PagerDuty has helped the operations team at William Hill Australia reduce team time to awareness from minutes to seconds, and what’s in store for the future!
  • The Investment in Digital Insight and the Power of PagerDuty Nov 28 2017 10:00 pm UTC 45 mins
    Jennifer Tejada, CEO, PagerDuty
    PagerDuty loves DevOps. We support a culture of empowerment and enablement. As such, our newly announced capabilities integrate applied machine learning, end-to-end response automation, and the mobilization of people in real-time across the entire business, to help teams eliminate inefficiencies when it matters most and get back to innovation.
  • Best Practices for Driving IT and DevOps transformation in the enterprise Nov 28 2017 3:00 am UTC 30 mins
    Firaas Rashid, Director & Head of Production of International Wealth Management, Credit Suisse
    How does one effectively drive IT and DevOps transformation at a large, established company? In this session, we explore key best practices, challenges, and organizational changes that established enterprises need to drive in order to be successful as they become more digital. Learn about the evolving role of IT and Ops, from a tactical function to a key strategic organization for the business.
  • Apprentices of Scale: The Slack Operations Story Nov 28 2017 2:30 am UTC 30 mins
    Richard Crowley, Operations Architect, Slack
    Growing up is hard to do. Learn from the story of Slack’s Operations team as they share how they learned to cope, scale, and thrive at the intersection of growing traffic, infrastructure, company, and market.
  • Bulletproof and PagerDuty: Accelerating Digital Transformation Nov 28 2017 2:00 am UTC 30 mins
    Greg Cockburn, Chief Cloud Officer, David Wall, Head of Asia-Pacific Japan, PagerDuty & Ancy Dow, Sr Product Marketing Manager
    As the leading cloud services provider in Australia and New Zealand, Bulletproof's key commitment to its customers is immediate human response time. With PagerDuty, Bulletproof has the confidence to know that, regardless of when an issue occurs, the right person receives the alert. Learn more about how PagerDuty has helped Bulletproof stay on top of extremely varied customer workloads across multiple tools and applications.
  • Panel: Leading Transformation in a Digital World Nov 28 2017 1:00 am UTC 45 mins
    Merline Saintil, Head of Ops, Product & Technology, Intuit
    Join us as we sit down with leaders from Intuit, Gainsight, and Symantec as they each share their digital transformation stories and lessons learned.
  • AI and Machine Learning Get Real Nov 28 2017 12:30 am UTC 30 mins
    Dominic Marion, Operations Support and Release Management Manager, NBC News
    The terms machine learning and AI are often associated with complex black box algorithms and models that no one seems to understand. We’re taking a more applied approach to machine learning – focusing on real-life problems. Hear how PagerDuty has delivered real, tangible value to some of our early customers using machine learning and advanced statistics. No magic black boxes or unicorns and rainbows allowed.
  • SoA Observability and Control: Present and Future Nov 28 2017 12:00 am UTC 30 mins
    Matt Klein, Software Engineer, Lyft
    State of the art (SoA) observability currently relies primarily on disparate systems to get the job done, which creates a huge amount of cognitive overload for operators who are continuously trying to stitch together all of the different threads. Join us as we discuss the current state and look forward to the power of a unified observability and control plane for SoA, driven primarily by the burgeoning “service mesh” paradigm.
  • The PagerDuty Community & Product Updates Nov 27 2017 11:30 pm UTC 30 mins
    Alex Solomon, CTO and Co-Founder, PagerDuty
    As software becomes the de facto medium through which business is conducted, the quality of the digital experience defines an organization’s success. That’s why we’ve introduced new capabilities that integrate applied machine learning, end-to-end response automation, and the mobilization of people in real-time across the entire business. Our latest features help you eliminate inefficiencies when it matters most and get back to innovation.
  • Fireside Chat with Adrian Cockcroft Nov 27 2017 11:00 pm UTC 30 mins
    Adrian Cockcroft, VP of Architecture, AWS
    We sit down with Adrian Cockcroft and chat about everything from working better in a developer-driven culture to chaos engineering. He shares insight into how AWS and Netflix instill DevOps best practices that enable teams to write better code, bring tools and ideas together, go on-call for code they own, influence organizational change, and so much more.
  • Empowering Developers & Ops in a Digital World Nov 27 2017 10:00 pm UTC 45 mins
    Jennifer Tejada, CEO, PagerDuty
    In this digital world, where brands must deliver on the ultimate customer experience, the role of developers is changing rapidly. Everything must work perfectly. As a result, digital operations is not only critical to developers and engineers, it’s becoming business critical. Join us as we discuss the idea of implementing DevOps across the business. Should you put everyone on-call?
  • Organizing and Optimizing ITSM Toolsets Nov 22 2017 9:00 am UTC 60 mins
    Manish Kalra, Director of Product Marketing, PagerDuty
    IT Service Management (ITSM) is an approach for designing, delivering, managing and improving the way IT is used within an organization. To make that approach a reality, a core requirement is having the right strategic toolset for your unique organizational needs.

    But what are the right tools to choose to help you deliver optimal services and keep your application and critical infrastructure available? How do you organize all the information these tools are feeding your organization every day?

    Join PagerDuty as they take you through what it takes to:

    ● Consolidate multiple ITSM services into one hub
    ● Support a 2-speed IT infrastructure and multiple ITSM processes within a single organization
    ● Evaluate integrations and flexibility of potential toolsets
  • DevOps at Scale: Using AWS & PagerDuty to Improve Growth & Incident Resolution Recorded: Oct 19 2017 39 mins
    Thomas Robinson, Solutions Architect, AWS & Eric Sigler, Head of DevOps, PagerDuty; Christopher Hoey, SRE, Datadog
    Meeting the demands of ever-changing IT management and security requirements means evolving how you both respond to and resolve incidents. It’s critical for organizations to adopt a scalable DevOps solution that integrates with their current monitoring systems to enable collaboration across development and operations teams, reducing the mean time to resolution.

    PagerDuty works with AWS services like Amazon CloudWatch to provide rapid incident response with rich, contextual details that allow you to analyze trends and monitor the performance of your applications and AWS environment.
  • Introduction to Being an Incident Responder Recorded: Sep 28 2017 30 mins
    Eric Sigler - Head of DevOps @ PagerDuty
    Best practices to succeed during major incident response

    What do you do when the unexpected happens and causes customer-impacting downtime? It’s of the utmost importance that you are prepared and can get your systems back into full working order as quickly as possible. It’s crucial to have a well-defined strategy to come together as a team, work the problem, and get to a solution quickly.

    Drawing from the experiences of thousands of operationally mature teams, this incident responder training will help you gain the understanding required to help support your team’s success when mitigating customer-impacting issues.

    Join us to learn:
    • What is incident response?
    • The roles involved in incident response
    • How to incorporate learnings from previous incident responses
    • Skills for success
  • The Definitive Incident Resolution Lifecycle for Modern Ops Recorded: Sep 20 2017 39 mins
    Dave Cliffe, Group Product Manager, PagerDuty & Sean Higgins, Product Manager, PagerDuty
    The stakes of managing complex infrastructure continue to increase alongside the ever-increasing costs of outages. And while many IT Operations teams are investing in monitoring and ITSM tools to detect issues, they are often forced to react to high volumes of event data without context and without any consistent, well-defined processes. This leads to costly operational inefficiencies, employee burnout, and extended customer downtime. In this webinar, you'll learn how to:

    • Optimize your ITSM toolsets by integrating people, data, and processes
    • Maximize cross-functional transparency and consistency
    • Prioritize incidents with well-defined rules
    • Automate troubleshooting and remediation
    • Improve problem management with postmortems and continuous learning across your team
Your Fastest Path to Incident Resolution
PagerDuty is helping IT Operations and DevOps professionals deliver on the promise of agility, performance and uptime. Our enterprise-grade incident management helps you orchestrate the ideal response to create better customer, employee, and business value.

  • Title: Tactics and Strategies in Reliability
  • Live at: Nov 2 2016 7:00 pm
  • Presented by: Andrew Fong, Director of Engineering, Dropbox