Tag: apm

What are the benefits of Observability?

Posted on November 4, 2022 by Doug Moll

Now that we have explored what observability is and what makes up a good observability solution, we can dive a bit deeper into the benefits. This is again not an exhaustive list, but these are the benefits I consider most impactful to businesses. Although some have been touched on in my previous posts, here I will consolidate them and add the missing pieces.

More performance, less downtime

Leaders in the observability space can detect and resolve issues considerably faster than businesses that are still relatively immature in this area. This includes issues relating to application performance or downtime.

Poorly performing applications or applications experiencing downtime have a direct impact on costs for any business. These can be in the form of tangible costs such as a direct loss in revenue or intangible costs such as brand and reputational damage.

Consider an eCommerce store that cannot transact due to a broken payment service, a social application that can no longer serve ads, a real-time trading application with extremely high latency, or a logistics application with a broken tracking service. There are thousands of examples across industries where the costs associated with downtime or poorly performing applications are very tangible.

When a banking application goes down, almost everyone knows about it the minute it happens. Twitter lights up, it appears on everyone’s news feeds and it even ends up on radio and television news broadcasts. Apart from the direct costs, the reputational damage caused by the downtime of an application can also be very costly, leading to increased customer churn, the loss of new customers and a host of other outcomes that impact the bottom line.

Measuring the true cost of downtime or poorly performing applications can be difficult, but it typically far outweighs the cost of making sure observability is done right, so that issues are detected early and fixed before they can have a significant impact.

Higher productivity, better customer experience

A properly implemented observability solution provides businesses with massively improved insights across the entirety of the business. These insights improve efficiencies and workflows in detecting and resolving issues across the application landscape. In today’s modern architectures this landscape is distributed, extending to the infrastructure, networks and platforms on which the applications run, in both on-prem and cloud environments. These insights and efficiencies ultimately provide multiple benefits across business operations.

One of the more tangible benefits is that if your developers and DevOps engineers are not stuck diagnosing problems all day, they can spend their time developing and deploying applications. This means accelerated development cycles that get applications to market quicker and lead to better, more innovative applications.

With businesses being ever more defined by the digital experiences they provide to their customers, observability is one of the edges required to become a leader in the industry. The deeper insights also help to align the different functions of the business. Having visibility into all aspects of the system, from higher-level SLAs to all the frontend and backend processes, enables operations and development teams to optimise processes across the landscape. These insights can even enable businesses to introduce new sources of income.

Observability is also vital in providing businesses with confidence in their cross-functional processes and assurance that the applications that are brought to market are robust. This confidence is even more important in today’s complex distributed systems which stretch across on-prem and cloud environments.

Happy people, better talent retention

One of the often overlooked benefits of observability is talent retention. With highly skilled developers and DevOps engineers being scarce, it stands to reason that businesses would want to do what they can to retain their best talent.

The frustration of sitting in endless war rooms and spending the majority of the day putting out fires is a surefire way to ensure highly skilled talent will look for opportunities elsewhere, where they can do what they enjoy.

Efficient observability practices and workflows drastically reduce the amount of time developers and engineers spend dealing with issues, making them happier and ultimately helping to retain them.

Fewer monitoring tools, look at all those benefits

One of the themes from my previous posts is that using multiple monitoring tools instead of a centralised observability solution creates inefficiencies and severely impacts a business’s ability to detect and resolve issues. From this post, it should be apparent that the insights gained from a centralised observability solution across the landscape have a number of other benefits too.

Although this post deals with the generic benefits of observability without necessarily comparing it to other approaches, addressing a few drawbacks of the multiple-tool approach will also highlight additional benefits of the central-platform approach to observability. Below are some of these drawbacks:

  • Licensing multiple monitoring tools introduces unnecessary costs as well as complexity in administering multiple different licensing models.
  • Multiple tools also introduce complexity across your environment, with many different agents and tools to be managed and operationally maintained.
  • The diverse and often rare skills required to operate multiple tools either burden existing operations teams or create reliance on multiple external parties to implement, manage and maintain the tools.
  • Data governance is vital in any tool or system that stores data. Monitoring tools are no different and often contain sensitive data. Governance for a single observability solution is far simpler to achieve and less costly than multiple tools.
  • Storing data also has a cost burden which is often far higher when you have multiple tools, each with its own storage requirements.

The main thing to highlight is that the above drawbacks are really secondary to the most important benefit of centralised observability over the multiple-tool approach: detecting and resolving issues as quickly and efficiently as possible. This is best achieved with seamless correlation between your logs, metrics and APM data in a centralised platform.

Realising your benefits

Becoming a leader in the observability space is a journey. As I mentioned in previous posts, observability is not achieved simply by deploying a tool. It starts with architecture and design to ensure the solution adheres to best practices and can scale and grow as the business needs it to. It then extends to ingesting all the right data, formatted and stored in a way that facilitates efficient correlations and workflows. Then all the other backend and frontend pieces need to fall into place, such as retention management, alerting, security and machine learning.
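
To make “alerting” a bit more concrete, here is a minimal, hypothetical sketch of the kind of rule an observability platform evaluates continuously: fire when the 95th-percentile latency over a recent window crosses a threshold. Real platforms express rules like this declaratively; the window data, threshold and function name here are invented purely for illustration.

```python
# A toy latency alert: fire when p95 over a window exceeds a threshold.
# Purely illustrative; real platforms evaluate rules like this
# continuously against ingested metrics.
import statistics

def p95_exceeds(latencies_ms: list[float], threshold_ms: float = 500.0) -> bool:
    """Return True if the 95th-percentile latency exceeds the threshold."""
    p95 = statistics.quantiles(latencies_ms, n=20)[-1]  # last cut point ~ p95
    return p95 > threshold_ms

window = [120, 95, 180, 2400, 110, 130, 90, 3100, 100, 140]  # invented samples
if p95_exceeds(window):
    print("ALERT: p95 latency above threshold")
```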

LSD has been deploying observability solutions for our customers for many years, and we help accelerate their journey through our battle-tested solutions and experience in implementing them. Please follow this link to learn more.

Doug Moll

Doug Moll is a solution architect at LSD, focusing on Observability and Event-Streaming for cloud native platforms.

What is Observability? (Part 1)

Posted on October 21, 2022 by Doug Moll

In part one of this two-part series, I’ll be discussing my views on the fundamentals and key elements of Observability, as opposed to giving a technical deep dive. There are many great resources out there which already take a closer look at the key concepts. First off, let’s look at what Observability is.

What is Observability?

The CNCF defines Observability as “the capability to continuously generate and discover actionable insights based on signals from the system under observation”.

Essentially, the goal of Observability is to detect and fix problems as fast as possible. In the world of monolithic apps and older architectures, monitoring was often enough to accomplish this goal, but with the world moving to distributed architectures and microservices, it is not always obvious why a problem has occurred merely by monitoring an isolated metric that has spiked.

This is where observability becomes a necessity. With observability essentially being a measure of how well the internal state of a system can be understood from its signals, it stands to reason that all the right data is needed. In a distributed system the right data is typically regarded to be logs, metrics and application traces, often referred to as the “three pillars of observability”.

While these are the generally agreed-upon key indicators, it is important in my view to also include user experience data, uptime data and synthetic data to provide an end-to-end observable system.

The analyst’s ability to then gain the relevant insights from this data to detect and fix root cause events in the quickest and most efficient way possible is the measure of how effectively observability has been implemented for the system.
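
To make the three pillars concrete, here is a minimal sketch of emitting all three signals from a single code path, using the OpenTelemetry Python SDK (the opentelemetry-sdk package). The service name, counter name and order ID are invented for illustration, and the console exporters stand in for a real observability backend.

```python
import logging

from opentelemetry import metrics, trace
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import (
    ConsoleMetricExporter,
    PeriodicExportingMetricReader,
)
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Console exporters for demonstration; a real deployment would ship the
# signals to a central observability platform instead.
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)
metrics.set_meter_provider(
    MeterProvider(
        metric_readers=[PeriodicExportingMetricReader(ConsoleMetricExporter())]
    )
)
logging.basicConfig(level=logging.INFO)

tracer = trace.get_tracer("payments")
meter = metrics.get_meter("payments")
checkouts = meter.create_counter("checkouts", description="Completed checkouts")
log = logging.getLogger("payments")

def checkout(order_id: str) -> None:
    # Pillar 1, traces: one span per transaction, carrying its context.
    with tracer.start_as_current_span("checkout") as span:
        span.set_attribute("order.id", order_id)
        # Pillar 2, metrics: an aggregate signal that alerts can watch.
        checkouts.add(1)
        # Pillar 3, logs: a discrete event, stamped with the trace ID so
        # it can later be correlated back to this exact transaction.
        trace_id = span.get_span_context().trace_id
        log.info("checkout complete order=%s trace_id=%032x", order_id, trace_id)

checkout("A-1001")
```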

There are a number of aspects which can determine the success of your observability efforts, some of which bear more weight than others. There are also tons of observability tools and solutions to choose from. What is fairly typical amongst customers that LSD engages with is that they have numerous tools in their stable but have not achieved their observability goals, and therefore haven’t reached the desired state.

Let’s explore this a bit more by looking at what the desired state may look like.

What is the desired state?

This is best explained by looking at an example: a particular service has a spike in latency, most likely picked up through an alert. How does an analyst go from there to determine the root cause of the latency spike?

Firstly, the analyst may want to trace the transaction causing the latency spike. For this, they would analyse the full distributed trace of the high-latency events. Having identified the transaction, the analyst still does not know the root cause. Some clues may lie in the metrics of the host or container it ran in, so that may be the next course of action. The root cause is most often found in the logs, so ultimately the analyst would want to analyse the logs for the specific transaction in question.

The above scenario is fairly simple; however, achieving it in the most efficient way relies on the ability to optimally correlate between logs, metrics and traces.

Proper correlation means being able to jump directly from a transaction in a trace to the logs for that specific transaction, or directly to the metrics of the container it ran in. To me, the most effective way to achieve this is for all the logs, metrics and traces to exist in the same observability platform and to share the same schema.
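
To illustrate why a shared schema matters, here is a hypothetical sketch: when trace and log records carry the same trace ID field (named “trace.id” below, as in Elastic Common Schema), pivoting from a slow trace to its logs is a simple key lookup rather than a cross-tool investigation. All records are invented sample data.

```python
# Invented trace and log records sharing one schema: the "trace.id" key.
traces = [
    {"trace.id": "abc123", "service.name": "payments", "duration_ms": 4200},
    {"trace.id": "def456", "service.name": "payments", "duration_ms": 35},
]
logs = [
    {"trace.id": "abc123", "message": "timeout calling card processor"},
    {"trace.id": "def456", "message": "checkout complete"},
]

def logs_for_slowest_trace(traces: list[dict], logs: list[dict]) -> list[dict]:
    """Pivot from the slowest trace straight to its log lines."""
    slowest = max(traces, key=lambda t: t["duration_ms"])
    return [entry for entry in logs if entry["trace.id"] == slowest["trace.id"]]

print(logs_for_slowest_trace(traces, logs))
# -> [{'trace.id': 'abc123', 'message': 'timeout calling card processor'}]
```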

In the digital age, customers want a flawless experience when interacting with businesses. Take a bank, for example: there is no room for error when a service directly interacts with a customer’s finances. So when an online banking service goes down for three days (it happens), the bank will lose customers or at least suffer reputational damage.

The ultimate goal is to detect and fix root cause events as quickly and efficiently as possible, and in this, the approach of using multiple tools fails.

In part two of this series, I will discuss the most critical factors which contribute to a good Observability solution that will help businesses reach the goals set out above.


Learn more about Observability by reading this blog post by Mark Billett, an Observability engineer at LSD.

If you would like to know more about Observability or a Managed Observability Platform, check out our page.


