
Grafana Loki

Grafana Loki is a horizontally scalable, highly available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be cost-effective and easy to operate, as it does not index the contents of logs but rather labels for each log stream. Loki allows you to efficiently store and query logs from your applications and infrastructure using LogQL, a powerful query language similar to PromQL.

Authentication Types

The Loki connector supports API Key authentication, which works with both self-hosted and Grafana Cloud deployments:

API Key Authentication

For Self-Hosted Loki:

  • Uses HTTP Basic Authentication through a reverse proxy (nginx, HAProxy, etc.)
  • Format: Username and password are configured in the connector
  • Loki itself doesn't handle authentication - it must be configured at the reverse proxy layer

Pros:

  • Simple to configure once reverse proxy is set up
  • Standard HTTP authentication mechanism
  • Works with any reverse proxy solution

Cons:

  • Requires reverse proxy setup and configuration
  • Need to manage username/password credentials
  • Additional infrastructure component to maintain

For Grafana Cloud Loki:

  • Uses Bearer token authentication with API keys
  • Format: API key token is configured in the connector
  • Grafana Cloud handles authentication directly

Pros:

  • No additional infrastructure needed
  • Centralized credential management through Grafana Cloud
  • Easy to create and revoke API keys
  • Supports different access levels (e.g., MetricsPublisher, Admin)

Cons:

  • Requires Grafana Cloud account
  • API keys need to be stored securely
  • May have rate limits depending on plan

General Settings

Before using the Loki connector, you need to configure the following settings:

Loki Instance URL

The base URL of your Loki instance. This varies depending on your deployment type:

Self-Hosted Examples:

  • http://loki.localhost (local development)
  • http://loki.example.com (production)
  • https://loki.internal.company.com (internal deployment)

Grafana Cloud Examples:

  • https://logs-prod-us-central1.grafana.net (US region)
  • https://logs-prod-eu-west-0.grafana.net (EU region)

The URL should not include the /loki/api/v1 path - just the base URL.
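
As a rough illustration of this rule, the sketch below strips an accidental /loki/api/v1 suffix from a user-supplied URL and then builds a full endpoint from the base URL. The helper names normalize_base_url and endpoint are hypothetical and are not part of the connector, which may handle this differently.

```python
from urllib.parse import urlsplit, urlunsplit

API_PREFIX = "/loki/api/v1"


def normalize_base_url(url: str) -> str:
    """Return the base URL without a trailing slash or /loki/api/v1 suffix."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    if path.endswith(API_PREFIX):
        path = path[: -len(API_PREFIX)]
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def endpoint(base_url: str, name: str) -> str:
    """Build a full API URL such as <base>/loki/api/v1/labels."""
    return f"{normalize_base_url(base_url)}{API_PREFIX}/{name.lstrip('/')}"


print(endpoint("http://loki.example.com/loki/api/v1/", "labels"))
```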

Tenant/Org ID (Optional)

For multi-tenant Loki deployments, specify the organization or tenant ID. This is used to set the X-Scope-OrgID header in requests.

When to use:

  • Multi-tenant self-hosted Loki installations
  • When you need to query a specific tenant's logs
  • Usually not needed for Grafana Cloud, which automatically includes the organization ID in the URL

Format:

  • Usually a numeric ID or string identifier
  • Example: 12345 or my-org

Leave empty for single-tenant deployments.
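
A minimal sketch of how the tenant setting could map to request headers: the X-Scope-OrgID header is added only when a tenant ID is set. The function name tenant_headers is hypothetical, and the connector's actual implementation is not shown in this document.

```python
from typing import Dict, Optional


def tenant_headers(tenant_id: Optional[str]) -> Dict[str, str]:
    """Return the X-Scope-OrgID header, or no header for single-tenant setups."""
    tenant = (tenant_id or "").strip()
    return {"X-Scope-OrgID": tenant} if tenant else {}


print(tenant_headers("my-org"))  # multi-tenant deployment
print(tenant_headers(""))        # single-tenant deployment: header omitted
```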

Setting up API Key Authentication

For Self-Hosted Loki

Self-hosted Loki requires a reverse proxy (like nginx) configured with HTTP Basic Authentication. Here's how to set it up:

Step 1: Configure nginx with Basic Authentication

  1. Create a password file using htpasswd:
htpasswd -c /etc/nginx/passwords loki_user

Enter a secure password when prompted.

  2. Create or update your nginx configuration file (e.g., /etc/nginx/conf.d/loki.conf):
upstream loki {
    server 127.0.0.1:3100;
    keepalive 15;
}

server {
    listen 80;
    server_name loki.example.com;

    auth_basic "Loki Authentication";
    auth_basic_user_file /etc/nginx/passwords;

    location / {
        proxy_read_timeout 1800s;
        proxy_connect_timeout 1600s;
        proxy_pass http://loki;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "Keep-Alive";
        proxy_set_header Proxy-Connection "Keep-Alive";
        proxy_redirect off;
    }

    # Health check endpoint without auth
    location /ready {
        proxy_pass http://loki;
        auth_basic "off";
    }
}

  3. Test and reload nginx:
nginx -t
nginx -s reload

Step 2: Configure the Connector

In the connector settings:

  1. Authentication Type: Select "API Key"
  2. Loki Instance URL: Enter your nginx proxy URL (e.g., http://loki.example.com)
  3. API Key Configuration:
    • Username: Enter the username from htpasswd (e.g., loki_user)
    • Password: Enter the password you set

Step 3: Test the Connection

You can verify the setup with curl:

curl -u loki_user:your_password "http://loki.example.com/loki/api/v1/labels"
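
The same check can be sketched in Python using only the standard library. This is an illustrative equivalent of the curl command, not the connector's code. The function name basic_auth_request is hypothetical, and the credentials are the placeholder values used above. The request is only built here; the commented lines show how it could be sent.

```python
import base64
import urllib.request


def basic_auth_request(base_url: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for the labels endpoint with a Basic auth header."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    request = urllib.request.Request(f"{base_url.rstrip('/')}/loki/api/v1/labels")
    request.add_header("Authorization", f"Basic {token}")
    return request


request = basic_auth_request("http://loki.example.com", "loki_user", "your_password")
print(request.full_url)

# To actually send it (requires a reachable proxy):
# with urllib.request.urlopen(request, timeout=10) as response:
#     print(response.status, response.read()[:200])
```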

For Grafana Cloud Loki

Grafana Cloud provides managed Loki with built-in authentication using API keys.

Step 1: Create an API Key

  1. Log in to Grafana Cloud

  2. Navigate to Security > Service Accounts (or API Keys in older versions)

  3. Click Add service account or Add API Key

  4. Configure the service account:

    • Display name: e.g., "Loki MCP Connector"
    • Role: Select appropriate role (e.g., "MetricsPublisher" for write access, "Viewer" for read-only)
  5. Click Add token to generate an API token

  6. Important: Copy the token immediately - it won't be shown again

Step 2: Find Your Loki Instance URL

  1. In Grafana Cloud, go to your stack details

  2. Find the Loki section

  3. Copy the URL (e.g., https://logs-prod-us-central1.grafana.net)

  4. Note your User/Instance ID if needed for the tenant ID

Step 3: Configure the Connector

In the connector settings:

  1. Authentication Type: Select "API Key"
  2. Loki Instance URL: Enter your Grafana Cloud Loki URL
  3. API Key Configuration:
    • Token: Paste the API token you generated
    • Leave Username and Password empty
  4. Tenant/Org ID: Enter your instance ID (optional, usually included in the URL)

Step 4: Test the Connection

You can verify with curl:

curl -H "Authorization: Bearer YOUR_API_TOKEN" \
"https://logs-prod-us-central1.grafana.net/loki/api/v1/labels"

LogQL Query Language Primer

LogQL is Loki's query language, similar to PromQL. Here are the basics:

Log Stream Selectors

Select logs using label matchers:

{app="nginx"}                          # Exact match
{app="nginx", env="production"}        # Multiple labels (AND)
{app=~"nginx|apache"}                  # Regex match
{app!="nginx"}                         # Not equal
{app=~"nginx", env!="dev"}             # Combined
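
When selectors are assembled from user input, quoting mistakes are easy to make. The sketch below builds a selector from (label, operator, value) triples. The function name stream_selector is hypothetical, and json.dumps is used only as an approximation of LogQL's double-quoted string escaping.

```python
import json

VALID_OPERATORS = {"=", "!=", "=~", "!~"}


def stream_selector(*matchers: tuple) -> str:
    """Join (label, operator, value) triples into a LogQL stream selector."""
    parts = []
    for label, operator, value in matchers:
        if operator not in VALID_OPERATORS:
            raise ValueError(f"unsupported operator: {operator!r}")
        # json.dumps adds the surrounding quotes and escapes embedded quotes.
        parts.append(f"{label}{operator}{json.dumps(value)}")
    return "{" + ", ".join(parts) + "}"


print(stream_selector(("app", "=~", "nginx|apache"), ("env", "!=", "dev")))
```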

Log Pipeline Expressions

Filter and parse log content:

{app="nginx"} |= "error"               # Contains "error"
{app="nginx"} != "debug"               # Doesn't contain "debug"
{app="nginx"} |~ "error|warn"          # Regex match
{app="nginx"} | json                   # Parse JSON
{app="nginx"} | logfmt                 # Parse logfmt
{app="nginx"} | json | level="error"   # Parse and filter

Metric Queries

Generate metrics from logs:

rate({app="nginx"}[5m])                     # Log rate per second
count_over_time({app="nginx"}[1h])          # Count in time window
sum(rate({app="nginx"}[5m]))                # Total rate
sum by (status) (rate({app="nginx"}[5m]))   # Rate by status
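
To make the relationship between the two functions concrete: over the same window, rate() is the entry count divided by the window length in seconds. The numbers below are made up for illustration, and per_second_rate is a hypothetical helper.

```python
def per_second_rate(entry_count: int, window_seconds: float) -> float:
    """Average entries per second over a window, as rate() reports."""
    return entry_count / window_seconds


# Hypothetical: count_over_time({app="nginx"}[5m]) returned 1500 entries.
entries_in_window = 1500
window = 5 * 60  # the [5m] range, in seconds

print(per_second_rate(entries_in_window, window))
```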

Common Query Patterns

Find errors in the last hour (set the query's time range to the last hour; a plain log query does not accept a [1h] range):

{namespace="production"} |~ "(?i)error|exception|fatal"

Count HTTP errors by status code:

sum by (status) (count_over_time({job="nginx"} | json | status >= 400 [5m]))

Calculate request rate by endpoint:

sum by (endpoint) (rate({app="api"} | json [1m]))

Find slow queries:

{app="database"} | logfmt | duration > 1000

Common Use Cases

1. Real-Time Error Monitoring

Use Stream Logs (Tail) to watch errors as they occur:

// Watch production errors live
{
  query: '{namespace="production", level="error"}',
  limit: 50
}
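
For context, Loki exposes live tailing as a WebSocket endpoint at /loki/api/v1/tail. The sketch below shows how the parameters above could map onto that URL. Whether the connector builds the URL this way is an assumption, and tail_url is a hypothetical helper.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def tail_url(base_url: str, query: str, limit: int = 50) -> str:
    """Build a ws:// or wss:// URL for Loki's tail endpoint."""
    parts = urlsplit(base_url.rstrip("/"))
    scheme = "wss" if parts.scheme == "https" else "ws"
    params = urlencode({"query": query, "limit": limit})
    return urlunsplit((scheme, parts.netloc, parts.path + "/loki/api/v1/tail", params, ""))


print(tail_url("http://loki.example.com", '{namespace="production", level="error"}'))
```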

2. Historical Log Analysis

Use Query Logs Range to analyze past incidents:

// Find errors during an incident
{
  query: '{app="api"} |= "timeout"',
  start: '2024-01-15T10:00:00Z',
  end: '2024-01-15T11:00:00Z'
}
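
As a sketch of what a range query looks like at the HTTP level, the code below converts the RFC 3339 timestamps to nanosecond Unix epochs and builds a URL for Loki's /loki/api/v1/query_range endpoint. The helper names are hypothetical, and the connector may pass timestamps in a different form.

```python
from datetime import datetime
from urllib.parse import urlencode


def rfc3339_to_unix_nanos(timestamp: str) -> int:
    """Convert a timestamp such as 2024-01-15T10:00:00Z to Unix nanoseconds."""
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        raise ValueError("timestamp must include a UTC offset or 'Z'")
    whole_seconds = int(parsed.replace(microsecond=0).timestamp())
    return whole_seconds * 1_000_000_000 + parsed.microsecond * 1_000


def query_range_url(base_url: str, query: str, start: str, end: str, limit: int = 100) -> str:
    """Build a query_range URL for the given LogQL query and time window."""
    params = urlencode({
        "query": query,
        "start": rfc3339_to_unix_nanos(start),
        "end": rfc3339_to_unix_nanos(end),
        "limit": limit,
    })
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


print(query_range_url(
    "http://loki.example.com",
    '{app="api"} |= "timeout"',
    "2024-01-15T10:00:00Z",
    "2024-01-15T11:00:00Z",
))
```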

3. Discovering Log Structure

Use List Labels and Query Label Values to explore:

// Step 1: Find all labels
// (no parameters needed)

// Step 2: Get values for a specific label
{
  label: "app"
}
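
Under the hood, these two steps correspond to Loki's labels and label-values endpoints, which return a JSON document with status and data fields. The sketch below builds both URLs and parses a sample response. The helper names and the sample payload are illustrative, not output from a real instance.

```python
import json
from typing import List
from urllib.parse import quote


def labels_url(base_url: str) -> str:
    """URL that lists all label names."""
    return f"{base_url.rstrip('/')}/loki/api/v1/labels"


def label_values_url(base_url: str, label: str) -> str:
    """URL that lists the values of one label."""
    return f"{base_url.rstrip('/')}/loki/api/v1/label/{quote(label, safe='')}/values"


def parse_label_response(body: str) -> List[str]:
    """Extract the data list from a labels or label-values response."""
    payload = json.loads(body)
    if payload.get("status") != "success":
        raise ValueError(f"unexpected status: {payload.get('status')!r}")
    return list(payload.get("data", []))


sample = '{"status": "success", "data": ["app", "env", "namespace"]}'
print(label_values_url("http://loki.example.com", "app"))
print(parse_label_response(sample))
```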

4. Volume Analysis

Use Query Log Volume Range to identify traffic patterns:

// See log volume by service over time
{
  query: '{namespace="production"}',
  targetLabel: 'app',
  start: '2024-01-15T00:00:00Z',
  end: '2024-01-16T00:00:00Z',
  step: '1h'
}

5. Pattern Detection

Use Detect Patterns to find common log formats:

// Discover log patterns in a service
{
  query: '{app="nginx"}',
  start: '2024-01-15T00:00:00Z',
  end: '2024-01-15T01:00:00Z'
}

6. Query Optimization

Use Query Log Statistics before running expensive queries:

// Check data volume before querying
{
  query: '{namespace="production"}',
  start: '2024-01-01T00:00:00Z',
  end: '2024-01-15T00:00:00Z'
}
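
One way to act on the statistics is a simple budget check before running the full query. This sketch assumes the result exposes streams, chunks, entries, and bytes fields, as Loki's index stats endpoint does. The exact shape returned by the connector tool, the sample numbers, and the 1 GiB threshold are all assumptions.

```python
from typing import Dict

ONE_GIB = 1024 ** 3


def within_budget(stats: Dict[str, int], max_bytes: int = ONE_GIB) -> bool:
    """Return True if the query would scan no more than max_bytes."""
    return int(stats.get("bytes", 0)) <= max_bytes


# Hypothetical statistics for a two-week production query.
stats = {"streams": 12, "chunks": 480, "entries": 1_200_000, "bytes": 3_500_000_000}

if within_budget(stats):
    print("OK to run the query")
else:
    print("Too much data: narrow the time range or add label filters")
```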

Troubleshooting

Authentication Failed (401)

Cause: Incorrect credentials or API token

Solution:

For Self-Hosted:

  • Verify username/password in htpasswd file
  • Check nginx configuration is loading the correct password file
  • Test with curl: curl -u username:password http://loki.example.com/loki/api/v1/labels

For Grafana Cloud:

  • Regenerate API token in Grafana Cloud console
  • Ensure token has correct permissions (read/write as needed)
  • Verify token is entered correctly without extra spaces

No Such Host / Connection Refused

Cause: Incorrect Loki Instance URL

Solution:

  • Verify the URL format (should not include /loki/api/v1)
  • Check network connectivity to the Loki instance
  • Ensure Loki is running and accessible
  • For self-hosted: verify nginx/reverse proxy is running
  • For Grafana Cloud: check the URL matches your stack details

Invalid Query Syntax

Cause: LogQL syntax error

Solution:

  • Use Format LogQL Query tool to validate syntax
  • Check LogQL documentation for correct operators
  • Common mistakes:
    • Missing quotes in labels: {app=nginx} should be {app="nginx"}
    • Invalid regex: ensure proper escaping
    • Missing range: rate() requires a range such as [5m], e.g. rate({app="nginx"}[5m])

Empty Results

Cause: The query doesn't match any logs, or the time range contains no log data

Solution:

  • Use List Labels to verify label names exist
  • Use Query Label Values to check valid label values
  • Verify time range includes log data
  • Start with broader query, then narrow down
  • Check if tenant ID is required and set correctly

Multi-Tenancy Issues

Cause: Missing or incorrect X-Scope-OrgID header

Solution:

  • Set the Tenant/Org ID in general settings
  • Verify the tenant ID matches your Loki configuration
  • For single-tenant deployments, leave it empty
  • Check Loki server logs for tenant-related errors

Rate Limiting (429)

Cause: Too many requests or query is too expensive

Solution:

  • Reduce query frequency
  • Use smaller time ranges
  • Add more specific filters to queries
  • For Grafana Cloud: check your plan limits
  • Use Query Log Statistics to understand query cost before executing
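
Reducing query frequency is often implemented as retry with exponential backoff. The sketch below is a generic pattern, not the connector's behavior. retry_on_429 and its parameters are hypothetical, and send stands for any callable that performs the request and returns a (status, body) pair.

```python
import random
import time
from typing import Callable, Tuple


def retry_on_429(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-429 status or attempts run out."""
    status, body = 429, ""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            break
        if attempt < max_attempts - 1:
            # Double the wait each time and add jitter to spread out retries.
            sleep(base_delay * 2 ** attempt + random.uniform(0, base_delay))
    return status, body
```

Passing sleep as a parameter keeps the function easy to test without real delays.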

WebSocket Connection Failed (Tail)

Cause: WebSocket support disabled or proxy issue

Solution:

  • Ensure nginx/reverse proxy supports WebSocket upgrades
  • Check nginx config includes Upgrade headers
  • Verify firewall allows WebSocket connections
  • For Grafana Cloud: ensure no proxy blocks WebSockets

Best Practices

  1. Start Broad, Then Narrow: Begin with simple label selectors, then add filters
  2. Use Statistics First: Check query cost with Query Log Statistics before running expensive queries
  3. Limit Result Sets: Always use reasonable limit values to avoid overwhelming responses
  4. Leverage Labels: Structure your logs with meaningful labels for easier querying
  5. Cache Metadata: Labels and label values change slowly - cache them locally
  6. Use Patterns: Let Detect Patterns discover log structures automatically
  7. Monitor Volume: Use volume queries to identify noisy log sources and optimize
  8. Validate Queries: Use Format LogQL Query to catch syntax errors early
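
For the Cache Metadata practice, a small time-based cache is usually enough. The sketch below is one possible shape. The TTLCache class, the 60-second lifetime, and the loader function are all hypothetical, and the loader stands in for a real List Labels call.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """Cache values for ttl_seconds, reloading them after they expire."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # still fresh
        value = loader()
        self._entries[key] = (now, value)
        return value


def load_labels():
    # Placeholder for a real List Labels request.
    return ["app", "env", "namespace"]


cache = TTLCache(ttl_seconds=60)
print(cache.get_or_load("labels", load_labels))
```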

Additional Resources