The Hows and Whats of Service Monitoring at CERN

Presented by

David Moreno García, DevOps Engineer, CERN

About this talk

The CERN IT infrastructure is spread across several countries and provides the resources for physics computing, analysing petabytes of data from the Large Hadron Collider and other experiments. We have approximately 40k Puppet-managed nodes and more than one hundred virtual machines providing the infrastructure needed to support them. Monitoring has become a key aspect of our daily operations, allowing us not only to identify problems in real time but also to narrow down their causes. In the long term, it is also a key asset for planning ahead and improving the team's efficiency. This session showcases how we monitor our Puppet infrastructure (using tools such as Elasticsearch, collectd, Flume, Kibana and Grafana, among others) and how this has helped us in real situations.
