What is Observability? (Part 1)

Posted on October 21, 2022 by Doug Moll
INSIGHTS

In part one of this two-part series of posts, I’ll be discussing my views on the fundamentals and key elements of Observability, as opposed to a technical deep dive. There are many great resources out there which already take a closer look at the key concepts. First off, let’s look at what Observability is.

What is Observability?

The CNCF defines Observability as “the capability to continuously generate and discover actionable insights based on signals from the system under observation”.

Essentially, the goal of Observability is to detect and fix problems as fast as possible. In the world of monolithic apps and older architectures, monitoring was often enough to accomplish this goal. With the world moving to distributed architectures and microservices, however, merely monitoring an isolated metric which has spiked does not always make it obvious why a problem has occurred.

This is where observability becomes a necessity. Observability is essentially a measure of how well the internal state of a system can be understood from its signals, so it stands to reason that all the right data is needed! In a distributed system, the right data is typically regarded as logs, metrics and application traces, often referred to as the “three pillars of observability”.

While these are the generally agreed-upon key indicators, in my view it is also important to include user experience data, uptime data and synthetic data to provide an end-to-end observable system.

The analyst’s ability to then gain the relevant insights from this data to detect and fix root cause events in the quickest and most efficient way possible is the measure of how effectively observability has been implemented for the system.

There are a number of aspects which can determine the success of your observability efforts, some of which bear more weight than others. There are also tons of observability tools and solutions to choose from. What is fairly typical amongst customers that LSD engages with is that they have numerous tools in their stable but have not achieved their goals in terms of observability, and therefore haven’t achieved the desired state.

Let’s explore this a bit more by looking at what the desired state may look like.

What is the desired state?

This is best explained by looking at an example: A particular service has a spike in latency which is likely picked up through an alert. How does an analyst go from there to determine the root cause of the latency spike?

Firstly, the analyst may want to trace the transaction causing the latency spike. For this, they would analyse the full distributed trace of the high-latency events. Having identified the transaction, the analyst still does not know the root cause. Some clues may lie in the metrics of the host or container it ran in, so that may be the next course of action. The root cause is most often found in the logs, so ultimately the analyst would want to analyse the logs for the specific transaction in question.
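To make the first step a little more concrete, here is a minimal Python sketch of how one might narrow a distributed trace down to the operation that contributed most to the latency. It is only an illustration: the `Span` type, its field names and the sample trace are hypothetical, and the calculation assumes that child spans run one after the other rather than in parallel.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Span:
    """One timed operation in a trace (hypothetical, simplified shape)."""
    span_id: str
    parent_id: Optional[str]
    service: str
    duration_ms: float


def self_times(spans: List[Span]) -> Dict[str, float]:
    """Time spent in each span itself, excluding its direct children.

    Assumes children run sequentially; with parallel children this
    simple subtraction would under-report the parent's own time.
    """
    child_total: Dict[str, float] = {}
    for span in spans:
        if span.parent_id is not None:
            child_total[span.parent_id] = (
                child_total.get(span.parent_id, 0.0) + span.duration_ms
            )
    return {
        span.span_id: span.duration_ms - child_total.get(span.span_id, 0.0)
        for span in spans
    }


def slowest_span(spans: List[Span]) -> Span:
    """Return the span with the largest self time."""
    times = self_times(spans)
    return max(spans, key=lambda span: times[span.span_id])


# Made-up trace for a single slow checkout request.
TRACE = [
    Span("root", None, "checkout", 900.0),
    Span("pay", "root", "payments", 700.0),
    Span("inv", "root", "inventory", 100.0),
    Span("db", "pay", "db", 650.0),
]

if __name__ == "__main__":
    culprit = slowest_span(TRACE)
    print(culprit.service, self_times(TRACE)[culprit.span_id])
```

In this made-up trace most of the time is spent in the `db` span, which tells the analyst where to look but not yet why it was slow. That is the point at which metrics and logs come in.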

The above scenario is fairly simple; however, achieving this in the most efficient way relies on the ability to optimally correlate between logs, metrics and traces.

Proper correlation means being able to jump directly from a transaction in a trace to the logs for that specific transaction, or being able to jump directly to the metrics of the container it ran in. To me, the most effective way to achieve this is for all the logs, metrics and traces to exist in the same observability platform and to share the same schema.
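As a rough illustration of why a shared schema matters, the Python sketch below keeps spans, logs and metrics as plain records that use the same field names. Everything here is an assumption made for the example: the field names (`trace_id`, `container_id`, `ts` and so on), the sample data and the helper functions are hypothetical and are not taken from any particular observability platform.

```python
# Field names are illustrative assumptions, not a real platform's schema.
# Timestamps are in seconds, durations in milliseconds.
SPANS = [
    {"trace_id": "t-100", "service": "payments", "container_id": "c-7",
     "start_ts": 1000, "end_ts": 1004, "duration_ms": 4000},
    {"trace_id": "t-101", "service": "payments", "container_id": "c-8",
     "start_ts": 1001, "end_ts": 1001, "duration_ms": 120},
]

LOGS = [
    {"trace_id": "t-100", "ts": 1003, "level": "ERROR",
     "message": "connection pool exhausted"},
    {"trace_id": "t-101", "ts": 1001, "level": "INFO",
     "message": "payment accepted"},
]

METRICS = [
    {"container_id": "c-7", "ts": 1002, "name": "cpu_percent", "value": 97.0},
    {"container_id": "c-7", "ts": 2000, "name": "cpu_percent", "value": 12.0},
    {"container_id": "c-8", "ts": 1001, "name": "cpu_percent", "value": 20.0},
]


def logs_for_span(span, logs):
    """Jump from a span to its logs via the shared trace_id field."""
    return [entry for entry in logs if entry["trace_id"] == span["trace_id"]]


def metrics_for_span(span, metrics):
    """Jump from a span to the metrics of the container it ran in,
    limited to the time window in which the span was active."""
    return [
        point for point in metrics
        if point["container_id"] == span["container_id"]
        and span["start_ts"] <= point["ts"] <= span["end_ts"]
    ]


if __name__ == "__main__":
    slow = max(SPANS, key=lambda span: span["duration_ms"])
    print(logs_for_span(slow, LOGS))
    print(metrics_for_span(slow, METRICS))
```

Because all three record types agree on what `trace_id` and `container_id` mean, each jump is a simple filter on one field. When the signals live in separate tools with different field names, the analyst has to do that mapping by hand, which is exactly the friction described above.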

In the digital age, customers want a flawless experience when interacting with businesses. Let’s look at a bank for example. There is no room for error when a service is directly interacting with a customer’s finances. So when an online banking service goes down for three days (it happens), it will lose customers or at least suffer reputational damage.

The ultimate goal is to detect and fix root cause events as quickly and efficiently as possible, and this is where the approach of using multiple tools fails.

In part two of this series, I will discuss the most critical factors which contribute to a good Observability solution that will help businesses reach the goals set out above.

Learn more about Observability by reading this blog post by Mark Billett, an Observability engineer at LSD.

If you would like to know more about Observability or a Managed Observability Platform, check out our page.

Doug Moll

Doug Moll is a solution architect at LSD, focusing on Observability and Event-Streaming for cloud native platforms.
