OpenTelemetry Metrics: A Guide to Getting Started

Observability

Sep 2025

In today's complex, distributed technology landscape, understanding what's happening inside your applications has become more critical than ever. As organizations move toward microservices, cloud-native deployments, and DevOps practices, traditional monitoring approaches often fall short. This is where OpenTelemetry metrics come in: a powerful, standardized way to gain visibility into your application's performance and behavior.

If you're an engineer, developer, or observability architect looking to understand what OpenTelemetry metrics are and how they can transform your monitoring strategy, you've come to the right place. This introductory guide will walk you through everything you need to know about OpenTelemetry metrics, from basic concepts to practical implementation steps.

What Are OpenTelemetry Metrics?

OpenTelemetry metrics are standardized, numerical measurements that track how your applications perform over time. Think of them as the vital signs of your software. Just as a doctor monitors heart rate, blood pressure, and temperature, metrics give you continuous insight into your application's health, performance, and behavior.

The Basics: What Makes OpenTelemetry Metrics Special

OpenTelemetry (often abbreviated as OTel) is an open-source observability framework that provides a unified way to collect, process, and export telemetry data. Metrics are one of three core pillars of observability, alongside traces (which show how requests flow through your system) and logs (which provide detailed event information).

Key characteristics that set OpenTelemetry metrics apart:

Standardized format: Unlike vendor-specific monitoring solutions, OTel metrics follow the open OpenTelemetry specification
Language agnostic: Works consistently across Python, Java, Node.js, Go, and many other languages
Vendor neutral: Export to any monitoring backend, including Prometheus, Grafana, Datadog, and more
Rich context: Include attributes (key-value labels) for detailed filtering and analysis

How OpenTelemetry Metrics Work

At their core, OpenTelemetry metrics work by recording numerical measurements as your code runs and exporting the aggregated data points at regular intervals. These data points are then stored and made available for analysis. Here's the basic flow:

Instrumentation: Your application code records measurements (like response times, request counts)
Collection: The OpenTelemetry SDK gathers these measurements
Processing: Data is aggregated and formatted according to OTel standards
Export: Metrics are sent to your chosen monitoring backend
Analysis: You can query, visualize, and alert on the collected data
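
The flow above can be sketched in plain JavaScript with no OpenTelemetry dependency. This is only a toy model: the names recordMeasurement, collect, and exportAsText are hypothetical and simply mirror the steps in the list, while the real SDK does this work for you.

```javascript
// A toy metrics pipeline: record -> collect/aggregate -> export.
// All names here are illustrative, not part of the OpenTelemetry API.

const measurements = []; // Raw measurements recorded by application code

// Instrumentation: application code records a value with attributes
function recordMeasurement(name, value, attributes = {}) {
  measurements.push({ name, value, attributes });
}

// Collection + processing: sum values per metric name and attribute set
function collect() {
  const series = new Map();
  for (const { name, value, attributes } of measurements) {
    // Sort the keys so {a, b} and {b, a} map to the same series
    const labels = Object.keys(attributes)
      .sort()
      .map((key) => `${key}="${attributes[key]}"`)
      .join(',');
    const seriesKey = `${name}{${labels}}`;
    series.set(seriesKey, (series.get(seriesKey) || 0) + value);
  }
  return series;
}

// Export: render each series as one line of text for a backend to read
function exportAsText(series) {
  return [...series].map(([key, total]) => `${key} ${total}`).join('\n');
}

recordMeasurement('api_requests_total', 1, { method: 'GET', endpoint: '/api/users' });
recordMeasurement('api_requests_total', 1, { endpoint: '/api/users', method: 'GET' });
recordMeasurement('api_requests_total', 1, { method: 'POST', endpoint: '/api/users' });

console.log(exportAsText(collect()));
```

The two GET measurements end up in the same series even though their attributes were written in a different order, which is the behavior you want from any real aggregation step.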

Why Are OpenTelemetry Metrics Useful?

The Observability Problem

Before diving into the benefits, let's understand the challenge: modern applications are complex. A single user request might touch dozens of services, databases, caches, and external APIs. When something goes wrong, traditional debugging approaches often fail because:

You can't reproduce the issue in development
The problem only occurs under specific load conditions
Multiple services are involved, making root cause analysis difficult
Performance issues are gradual and hard to spot

How OpenTelemetry Metrics Solve These Problems

1. Proactive Problem Detection

Instead of waiting for users to report issues, metrics give you early warning signs:

Response times creeping up
Error rates increasing
Resource usage approaching limits
Unusual traffic patterns

2. Performance Optimization

Metrics help you identify bottlenecks and optimization opportunities:

Which database queries are slowest
Which API endpoints consume the most resources
Where memory leaks might be occurring
How caching strategies are performing

3. Business Intelligence

Beyond technical monitoring, metrics provide business insights:

User engagement patterns
Feature usage statistics
Conversion funnel performance
Revenue impact of performance issues

4. Operational Efficiency

Metrics enable data-driven operations:

Capacity planning based on actual usage
Automated scaling based on demand
SLA monitoring and alerting
Cost optimization through resource tracking

Understanding the OpenTelemetry Metrics Data Model

Core Concepts

To effectively use OpenTelemetry metrics, you need to understand a few fundamental concepts:

Meter: Think of a meter as a factory that creates different types of measurement instruments. Meters are obtained from a MeterProvider; an application typically has one MeterProvider and one named meter per library or component.

Instrument: These are the actual measurement tools created by the meter. Different types of instruments measure different aspects of your application.

Attributes: These are key-value pairs that provide context to your measurements. For example, you might track request counts with attributes like method=GET, endpoint=/api/users, and status=200.

Aggregation: This is how individual measurements are combined over time. For example, you might want to see the average response time over the last 5 minutes, or the total number of requests in the last hour.
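
As a rough illustration of the time-window aggregation described above, here is a minimal sketch in plain JavaScript. The function name averageOverWindow and the sample data are assumptions made for this example; in practice the SDK and your monitoring backend perform this kind of aggregation for you.

```javascript
// Average of the measurements that fall inside a time window.
// `samples` is an array of { timestampMs, value }; names are illustrative.
function averageOverWindow(samples, nowMs, windowMs) {
  const recent = samples.filter(
    (sample) => sample.timestampMs > nowMs - windowMs && sample.timestampMs <= nowMs
  );
  if (recent.length === 0) return null; // No data in the window
  const sum = recent.reduce((total, sample) => total + sample.value, 0);
  return sum / recent.length;
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;
const now = 10 * 60 * 1000; // Pretend "now" is 10 minutes after time zero

// Response times in milliseconds
const responseTimes = [
  { timestampMs: 2 * 60 * 1000, value: 900 }, // Older than 5 minutes: ignored
  { timestampMs: 6 * 60 * 1000, value: 100 },
  { timestampMs: 8 * 60 * 1000, value: 200 },
  { timestampMs: 9 * 60 * 1000, value: 300 }
];

console.log(averageOverWindow(responseTimes, now, FIVE_MINUTES_MS)); // 200
```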

Types of Metrics in OpenTelemetry

OpenTelemetry supports several metric types, each designed for specific measurement scenarios:

1. Counter

Counters only go up, which makes them perfect for tracking cumulative events that can never decrease.

What they measure:

Total number of HTTP requests
Total number of database queries
Total number of user registrations
Total number of errors

Example use case:

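const { metrics } = require('@opentelemetry/api');

// Get a meter from the global meter provider
const meter = metrics.getMeter('my-web-app');

// Track total requests to your API
const requestCounter = meter.createCounter('api_requests_total', {
  description: 'Total number of API requests'
});

// Increment the counter for each request
requestCounter.add(1, {
  method: 'GET',
  endpoint: '/api/users'
});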
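
Because a counter only reports a running total, dashboards usually turn it into a rate by comparing two readings. The sketch below shows that arithmetic in plain JavaScript; ratePerSecond and the readings are hypothetical, and a real backend would do this with its own query functions.

```javascript
// Derive a per-second rate from two readings of a cumulative counter.
// Each reading is { timestampMs, total }; names are illustrative.
function ratePerSecond(earlier, later) {
  const elapsedSeconds = (later.timestampMs - earlier.timestampMs) / 1000;
  if (elapsedSeconds <= 0) return null; // Readings must be in time order
  const increase = later.total - earlier.total;
  if (increase < 0) return null; // Counter was reset (for example, a restart)
  return increase / elapsedSeconds;
}

const firstReading = { timestampMs: 0, total: 1200 };
const secondReading = { timestampMs: 60000, total: 1500 };

console.log(ratePerSecond(firstReading, secondReading)); // 5
```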

2. Gauge

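Gauges represent current values that can go up or down, like a fuel gauge in a car. In the OpenTelemetry API, a value that you sample (memory usage, CPU utilization) is recorded with a gauge, typically an observable gauge whose callback runs at collection time, while a value that you increment and decrement yourself (active connections, queue depth) is recorded with an UpDownCounter.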

What they measure:

Current memory usage
Current number of active connections
Current queue depth
Current CPU utilization

Example use case:

// Track current active connections.
// An UpDownCounter fits values you increment and decrement yourself;
// use meter.createObservableGauge() for values you sample, such as memory usage.
const activeConnections = meter.createUpDownCounter('active_connections', {
  description: 'Number of currently active connections'
});

// Update the value when connections change
activeConnections.add(1); // Connection opened
activeConnections.add(-1); // Connection closed

3. Histogram

Histograms track the distribution of values and let your backend estimate percentiles, which is crucial for understanding performance characteristics.

What they measure:

Response time distributions
Request size distributions
Processing duration patterns
Error rate distributions

Example use case:

// Track response time distribution
const responseTimeHistogram = meter.createHistogram('response_time_seconds', {
  description: 'Response time in seconds',
  unit: 's'
});

// Record individual response times
responseTimeHistogram.record(0.125, {
  endpoint: '/api/users',
  method: 'GET'
});
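
To make the idea of a distribution concrete, the sketch below counts values into cumulative buckets, the form you see in Prometheus-style histogram output. The function buildHistogram, the bucket boundaries, and the sample values are all assumptions chosen for this example, not the SDK's implementation.

```javascript
// Count values into cumulative buckets, the way a histogram with explicit
// boundaries is reported. Boundaries and names are illustrative assumptions.
function buildHistogram(values, boundaries) {
  const buckets = boundaries.map((upperBound) => ({ le: upperBound, count: 0 }));
  buckets.push({ le: Infinity, count: 0 }); // Catch-all bucket
  let sum = 0;
  for (const value of values) {
    sum += value;
    // Cumulative: a value counts toward every bucket whose bound it fits under
    for (const bucket of buckets) {
      if (value <= bucket.le) bucket.count += 1;
    }
  }
  return { buckets, sum, count: values.length };
}

// Five response times in seconds, bucketed at 0.1 s, 0.5 s, and 1 s
const histogram = buildHistogram([0.05, 0.125, 0.3, 0.75, 2], [0.1, 0.5, 1]);

for (const { le, count } of histogram.buckets) {
  console.log(`le=${le} count=${count}`);
}
```

Reading the buckets this way is how percentiles are estimated: here 4 of the 5 values fall at or below 1 second, so the 80th percentile is at most 1 second.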

How to Start Using OpenTelemetry Metrics

Getting Started: A Simple Example

Let's walk through setting up OpenTelemetry metrics in a Node.js application. This will give you a concrete understanding of how everything works together.

Step 1: Install the Required Packages

npm install @opentelemetry/api @opentelemetry/sdk-metrics @opentelemetry/sdk-node
npm install @opentelemetry/exporter-prometheus express

Step 2: Set Up Basic Instrumentation

Create a file called instrumentation.js:

const { metrics } = require('@opentelemetry/api');
const { MeterProvider } = require('@opentelemetry/sdk-metrics');
const { PrometheusExporter } = require('@opentelemetry/exporter-prometheus');

// Create a Prometheus exporter; it starts an HTTP server that Prometheus can scrape
const prometheusExporter = new PrometheusExporter({
  port: 9464,        // Port where metrics will be exposed
  endpoint: '/metrics' // URL path for the metrics endpoint
});

// Create a meter provider (the factory for creating meters) and attach the exporter.
// A metric reader can be registered with only one meter provider.
// Note: older SDK releases used meterProvider.addMetricReader(prometheusExporter) instead.
// (Alternatively, NodeSDK from @opentelemetry/sdk-node can build the provider for you;
// use one approach or the other, not both.)
const meterProvider = new MeterProvider({
  readers: [prometheusExporter]
});

// Set this meter provider as the global default
metrics.setGlobalMeterProvider(meterProvider);

console.log('OpenTelemetry metrics initialized on port 9464');

Step 3: Create Your First Metrics

Now create a file called app.js to use the metrics:

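// Load the instrumentation first so the global meter provider is registered
require('./instrumentation');

const express = require('express');
const { metrics } = require('@opentelemetry/api');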

const app = express();

// Get a meter instance for your application
const meter = metrics.getMeter('my-web-app');

// Create a counter for tracking requests
const requestCounter = meter.createCounter('http_requests_total', {
  description: 'Total number of HTTP requests'
});

// Create a histogram for tracking response times
const responseTimeHistogram = meter.createHistogram('http_response_time_seconds', {
  description: 'HTTP response time in seconds',
  unit: 's'
});

// Middleware to track all requests
app.use((req, res, next) => {
  const startTime = Date.now();
  
  // Increment request counter
  requestCounter.add(1, {
    method: req.method,
    endpoint: req.path
  });
  
  // Record the response time once the response has been sent
  res.on('finish', () => {
    const responseTime = (Date.now() - startTime) / 1000;

    responseTimeHistogram.record(responseTime, {
      method: req.method,
      endpoint: req.path,
      status: res.statusCode.toString()
    });
  });
  
  next();
});

// Example route
app.get('/api/users', (req, res) => {
  // Simulate some work
  setTimeout(() => {
    res.json({ users: [] });
  }, Math.random() * 1000); // Random delay up to 1 second
});

app.listen(3000, () => {
  console.log('Server running on port 3000');
});

Step 4: View Your Metrics

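After starting your application (node app.js) and sending a few requests to http://localhost:3000/api/users, you can view the metrics by visiting http://localhost:9464/metrics in your browser. You'll see output similar to this (abridged; the exact bucket boundaries depend on your histogram configuration):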

# HELP http_requests_total Total number of HTTP requests
# TYPE http_requests_total counter
http_requests_total{method="GET",endpoint="/api/users"} 5

# HELP http_response_time_seconds HTTP response time in seconds
# TYPE http_response_time_seconds histogram
http_response_time_seconds_bucket{method="GET",endpoint="/api/users",status="200",le="0.1"} 2
http_response_time_seconds_bucket{method="GET",endpoint="/api/users",status="200",le="0.5"} 3
http_response_time_seconds_bucket{method="GET",endpoint="/api/users",status="200",le="1"} 5
http_response_time_seconds_bucket{method="GET",endpoint="/api/users",status="200",le="+Inf"} 5
http_response_time_seconds_sum{method="GET",endpoint="/api/users",status="200"} 2.1
http_response_time_seconds_count{method="GET",endpoint="/api/users",status="200"} 5
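
As a quick worked example of reading this output, the sum, count, and bucket lines are enough to derive an average and a rough latency share. The numbers below are copied from the sample output above; the variable names are just for illustration.

```javascript
// Values copied from the sample output above
const sum = 2.1;               // http_response_time_seconds_sum
const count = 5;               // http_response_time_seconds_count
const atOrBelowHalfSecond = 3; // bucket with le="0.5"

// Average response time = sum of all observations / number of observations
const averageSeconds = sum / count;

// Share of requests that finished in 0.5 seconds or less
const shareAtOrBelowHalfSecond = atOrBelowHalfSecond / count;

console.log(`average: ${averageSeconds.toFixed(2)} s`);
console.log(`<= 0.5 s: ${(shareAtOrBelowHalfSecond * 100).toFixed(0)}%`);
```

With these values the average works out to 0.42 seconds, and 60% of the requests completed within half a second.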

Best Practices for Using OpenTelemetry Metrics

1. Choose Meaningful Names

Good metric names are descriptive and follow consistent patterns:

// Good examples
const goodMetrics = {
  http_requests_total: 'Total HTTP requests',
  database_query_duration_seconds: 'Database query duration',
  cache_hit_ratio: 'Cache hit ratio',
  active_user_sessions: 'Active user sessions'
};

// Avoid these patterns
const badMetrics = {
  'requests': 'Too vague',
  'HTTP_Requests': 'Inconsistent casing',
  'req_count': 'Abbreviated names'
};

2. Design Attributes Thoughtfully

Attributes should provide useful filtering and grouping without creating too many unique combinations:

// Good attribute design
requestCounter.add(1, {
  method: 'GET',           // Limited set of values
  endpoint: '/api/users',  // Limited set of values
  status_code: '200',      // Limited set of values
  service: 'user-service'  // Limited set of values
});

// Avoid high cardinality attributes
requestCounter.add(1, {
  user_id: '12345',        // Too many unique values
  session_id: 'sess_abc',  // Too many unique values
  timestamp: '2024-01-15T10:30:00Z' // Don't include timestamps
});
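
To see why the second example is a problem, it helps to estimate how many distinct series a set of attributes can produce. The sketch below multiplies the number of distinct values per attribute; the function name maxSeriesCount and all the counts are made-up assumptions for illustration.

```javascript
// Upper bound on the number of time series: the product of the number of
// distinct values each attribute can take. The counts below are made up.
function maxSeriesCount(distinctValuesPerAttribute) {
  return Object.values(distinctValuesPerAttribute).reduce(
    (product, distinctValues) => product * distinctValues,
    1
  );
}

const boundedAttributes = { method: 5, endpoint: 20, status_code: 10, service: 3 };
const withUserId = { ...boundedAttributes, user_id: 100000 };

console.log(maxSeriesCount(boundedAttributes)); // 3000
console.log(maxSeriesCount(withUserId));        // 300000000
```

A single unbounded attribute multiplies the total, which is why user IDs, session IDs, and timestamps belong in logs or traces rather than in metric attributes.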

3. Start Simple, Iterate Gradually

Don't try to instrument everything at once:

// Start with basic metrics
const basicMetrics = {
  requestCounter: meter.createCounter('requests_total'),
  responseTimeHistogram: meter.createHistogram('response_time_seconds')
};

// Add more sophisticated metrics later
const advancedMetrics = {
  businessMetrics: meter.createCounter('business_events_total'),
  customHistogram: meter.createHistogram('custom_measurement')
};

4. Monitor Your Monitoring

Track the performance impact of your metrics collection:

// Monitor the monitoring system itself
const metricCollectionCounter = meter.createCounter('metric_collection_operations_total');
const metricCollectionDuration = meter.createHistogram('metric_collection_duration_seconds');

// Track how long metric collection takes
const startTime = Date.now();
// ... collect metrics ...
const duration = (Date.now() - startTime) / 1000;

metricCollectionDuration.record(duration);
metricCollectionCounter.add(1);

Common Challenges and Pitfalls

1. High Cardinality Problems

The Problem: When you have too many unique attribute combinations, it can overwhelm your monitoring system and increase costs.

Example of the problem:

// This creates a new metric series for every user
userActionCounter.add(1, {
  user_id: req.user.id,        // Could be millions of unique values
  action: req.body.action,
  timestamp: new Date().toISOString()
});

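Solution: Limit cardinality by grouping or bucketing values, for example by recording a status class instead of every status code and a route pattern instead of a raw path, and by leaving user IDs and timestamps out of attributes entirely.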
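
One possible way to group values is sketched below in plain JavaScript. The helper names (statusClass, normalizePath, boundedAttributes) and the grouping rules are assumptions for this example; adapt them to the routes and values your application actually has.

```javascript
// Replace unbounded values with a small, fixed set of groups.
// The grouping rules and names below are illustrative assumptions.

// 200, 201, 204 -> '2xx'; 404 -> '4xx'; and so on
function statusClass(statusCode) {
  return `${Math.floor(statusCode / 100)}xx`;
}

// '/api/users/12345' -> '/api/users/:id' (numeric path segments only)
function normalizePath(path) {
  return path.replace(/\/\d+(?=\/|$)/g, '/:id');
}

// Build low-cardinality attributes: no user ID, no timestamp.
// `action` is assumed to come from a short, fixed list of action names.
function boundedAttributes(action, path, statusCode) {
  return {
    action,
    endpoint: normalizePath(path),
    status_class: statusClass(statusCode)
  };
}

console.log(boundedAttributes('update_profile', '/api/users/12345', 204));
```

The resulting object can be passed as the attributes argument of a counter's add() call in place of the high-cardinality version.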

2. Memory Leaks from Metric Collection

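The Problem: The SDK keeps state in memory for every unique attribute combination it has seen, so unbounded attributes make memory usage grow over time, especially in long-running applications.

Solution: Use metric views to limit which attributes are kept:

const { MeterProvider, View } = require('@opentelemetry/sdk-metrics');

// Only keep specific attributes to limit memory usage
const limitedView = new View({
  instrumentName: 'http_requests_total',
  attributeKeys: ['method', 'endpoint'] // Only these attributes
});

// Views are registered when the meter provider is created.
// (View class as in SDK 1.x; newer releases may expect plain view
// option objects, so check the version you have installed.)
const meterProvider = new MeterProvider({
  views: [limitedView]
});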
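
Conceptually, restricting attributes merges series that differ only in the dropped attributes, which is where the memory saving comes from. The plain JavaScript sketch below imitates that effect; keepAttributes and aggregate are hypothetical helpers, not the SDK's implementation.

```javascript
// Keep only allow-listed attribute keys, then re-aggregate.
// Helper names and data are illustrative assumptions.
function keepAttributes(attributes, allowedKeys) {
  return Object.fromEntries(
    allowedKeys
      .filter((key) => key in attributes)
      .map((key) => [key, attributes[key]])
  );
}

// Sum values per remaining attribute set
function aggregate(points, allowedKeys) {
  const totals = new Map();
  for (const { attributes, value } of points) {
    const seriesKey = JSON.stringify(keepAttributes(attributes, allowedKeys));
    totals.set(seriesKey, (totals.get(seriesKey) || 0) + value);
  }
  return totals;
}

const points = [
  { attributes: { method: 'GET', endpoint: '/api/users', status: '200' }, value: 3 },
  { attributes: { method: 'GET', endpoint: '/api/users', status: '404' }, value: 1 },
  { attributes: { method: 'GET', endpoint: '/api/users', status: '500' }, value: 2 }
];

console.log(aggregate(points, ['method', 'endpoint', 'status']).size); // 3
console.log(aggregate(points, ['method', 'endpoint']).size);           // 1
```

The total count is preserved (6 requests either way); only the level of detail changes.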

3. Performance Impact

The Problem: Collecting too many metrics can slow down your application.

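Solution: Use asynchronous collection and reasonable intervals:

const { PeriodicExportingMetricReader, ConsoleMetricExporter } = require('@opentelemetry/sdk-metrics');

// Push-based readers export on a timer; export every 30 seconds rather than
// at a very short interval. (With the pull-based Prometheus exporter, the
// interval is the scrape_interval configured on the Prometheus server,
// not an exporter option.)
const metricReader = new PeriodicExportingMetricReader({
  exporter: new ConsoleMetricExporter(),
  exportIntervalMillis: 30000
});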

How OpenTelemetry Metrics Compare to Other Solutions

OpenTelemetry vs. Prometheus Metrics

What's the difference? The two are often confused, but they play different roles.

Prometheus metrics are a specific format and protocol for exposing metrics. Prometheus itself is a monitoring system that scrapes metrics from HTTP endpoints.

OpenTelemetry metrics are a standardized way to generate and collect metrics that can be exported to Prometheus (and many other systems).

Key differences:

OpenTelemetry provides the instrumentation and collection framework
Prometheus provides the storage, querying, and alerting capabilities
OpenTelemetry can export to Prometheus, but also to many other backends
Prometheus primarily ingests metrics by scraping its own exposition format, although recent versions can also receive OTLP data

Think of it this way: OpenTelemetry is like a universal translator that can speak to many monitoring systems, while Prometheus is one specific monitoring system that mainly speaks its own language.

OpenTelemetry vs. Vendor-Specific Solutions

Traditional APM tools (like New Relic, Datadog, AppDynamics) often have their own proprietary instrumentation methods. This creates vendor lock-in and makes it difficult to switch between monitoring solutions.

OpenTelemetry provides vendor-neutral instrumentation that works with any monitoring backend. You can start with one solution and easily switch to another without rewriting your instrumentation code.

Getting Started: Next Steps

1. Choose Your First Application

Start with a simple, non-critical application to learn the ropes:

A development environment
A staging application
A simple API service
A background job processor

2. Identify Key Metrics

Focus on the most important measurements first:

Availability: Is your service responding?
Performance: How fast is it responding?
Errors: How often does it fail?
Throughput: How much work is it doing?

3. Set Up Basic Monitoring

Start with simple dashboards showing:

Request rates and response times
Error rates and types
Resource usage (CPU, memory, disk)
Business metrics (if applicable)

4. Implement Alerting

Set up basic alerts for:

High error rates
Slow response times
Service unavailability
Resource exhaustion

5. Iterate and Expand

Once you're comfortable with the basics:

Add more sophisticated metrics
Implement custom dashboards
Set up advanced alerting
Expand to more services

Conclusion

OpenTelemetry metrics represent a fundamental shift in how we approach application monitoring and observability. By providing a standardized, vendor-neutral way to collect and export metrics, they reduce the complexity and lock-in associated with traditional monitoring solutions.

The key benefits of OpenTelemetry metrics include:

Standardization: Consistent approach across different languages and frameworks
Flexibility: Export to any monitoring backend that suits your needs
Rich context: Detailed attributes for meaningful analysis
Performance: Efficient collection with minimal application impact
Future-proofing: Industry-standard approach that will continue to evolve

Getting started with OpenTelemetry metrics doesn't have to be overwhelming. Start small with basic instrumentation, focus on the metrics that matter most to your application, and gradually expand your observability coverage. The investment in proper instrumentation will pay dividends in faster debugging, better performance, and improved user experience.

Remember, observability is not just about collecting data. It's about creating a system that helps you understand, optimize, and improve your applications. OpenTelemetry metrics provide the foundation for building that understanding.
