Prometheus
Prometheus is an open-source monitoring and alerting toolkit designed for reliability and scalability. It collects and stores metrics as time series data, with each time series identified by a metric name and key-value pairs called labels. Prometheus provides powerful query capabilities through PromQL (Prometheus Query Language), enabling real-time analysis of metrics, alerting on conditions, and operational insights.
Authentication Types
The Prometheus connector supports one authentication method:
- API Key (Basic Auth) - Username and password authentication using HTTP Basic Authentication
  - Pros: Simple to set up, widely supported, standard HTTP authentication
  - Cons: Credentials are base64-encoded (not encrypted), so HTTPS is required for security
  - Best for: Internal deployments, authenticated access to Prometheus instances
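To make the base64 point concrete, here is a minimal Python sketch of how an HTTP Basic Authentication header is built and how easily it is reversed. The credentials are placeholders and the function names are illustrative; they are not part of Webrix or Prometheus.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


def decode_basic_auth(header: str) -> tuple:
    """Recover the credentials from a header value; no key or secret is needed."""
    encoded = header[len("Basic "):]
    username, _, password = base64.b64decode(encoded).decode("utf-8").partition(":")
    return username, password


header = basic_auth_header("admin", "s3cret")  # placeholder credentials
print(header)
print(decode_basic_auth(header))
```

Anyone who can read the header on the wire can recover the password the same way, which is why plain HTTP is unsuitable outside a trusted network.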
General Settings
Before using the connector, you need to configure:
- Prometheus Instance URL - The base URL of your Prometheus server (e.g., https://prometheus.example.com or http://localhost:9090)
Setting up API Key (Basic Auth)
If your Prometheus instance requires authentication, configure Basic Authentication as described below before connecting it to Webrix.
1. Configure Basic Auth on Prometheus (if not already done)
- Generate a bcrypt-hashed password:
htpasswd -nBC 10 "" | tr -d ':\n'
- Create a web.yml configuration file:
basic_auth_users:
  admin: <bcrypt_hashed_password>
- Start Prometheus with the web config file:
prometheus --web.config.file=web.yml
2. Enable Admin API (Optional)
If you want to use administrative features (snapshots, series deletion, config reload), enable these flags when starting Prometheus:
prometheus \
--web.config.file=web.yml \
--web.enable-admin-api \
--web.enable-lifecycle
- --web.enable-admin-api - Enables TSDB admin operations (snapshot, delete series, clean tombstones)
- --web.enable-lifecycle - Enables the config reload endpoint (and the shutdown endpoint)
Admin endpoints should only be enabled in trusted environments as they allow destructive operations and configuration changes.
3. Configure in Webrix
- In Webrix, go to Integrations → New → Built-in
- Select Prometheus and click Use
- Under General Settings, enter your Prometheus Instance URL
  - Example: https://prometheus.example.com
  - Example: http://localhost:9090
- Under Authentication Type, select API Key
- Enter your Username (e.g., admin)
- Enter your Password (the plain text password, not the bcrypt hash)
- Click Save Changes
4. Test the Connection
- After saving, click Connect to test the authentication
- Try running a simple query like "Execute Instant Query" with query: up
- You should see metrics data returned successfully
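For reference, an instant query for up corresponds to a GET request against the Prometheus HTTP API endpoint /api/v1/query. The following is a minimal standard-library Python sketch of such a request with Basic Auth. The base URL and credentials are placeholders, and the helper names are illustrative assumptions, not something Webrix exposes.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_instant_query(base_url: str, query: str, username: str, password: str) -> urllib.request.Request:
    """Build a GET request for /api/v1/query carrying a Basic Auth header."""
    url = base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode({"query": query})
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})


def run_instant_query(request: urllib.request.Request) -> list:
    """Send the request and return the result list from the JSON body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        body = json.load(response)
    if body.get("status") != "success":
        raise RuntimeError(body.get("error", "query failed"))
    return body["data"]["result"]


# Placeholder URL and credentials; replace with your own instance.
req = build_instant_query("http://localhost:9090", "up", "admin", "password")
print(req.full_url)
# Uncomment when a Prometheus server is reachable:
# for sample in run_instant_query(req):
#     print(sample["metric"], sample["value"])
```

A 401 response from this request points to the same credential problems described under Troubleshooting.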
Common Use Cases
Querying Metrics
Use the query tools to retrieve metrics data:
- Execute Instant Query - Get current metric values
  - Example: up - Check which targets are up
  - Example: rate(http_requests_total[5m]) - Per-second request rate averaged over the last 5 minutes
  - Example: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes - Fraction of memory available (subtract from 1 for utilization)
- Execute Range Query - Get metrics over a time period
  - Example: Query cpu_usage from 1 hour ago to now with 1-minute resolution
  - Well suited to building graphs and dashboards
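A range query takes a start time, an end time, and a step. As a sketch of the "1 hour ago to now with 1-minute resolution" example, the Python below builds the corresponding /api/v1/query_range URL. The base URL is a placeholder, cpu_usage is the illustrative metric name from the example, and the function name is hypothetical.

```python
import time
import urllib.parse


def range_query_url(base_url: str, query: str, start: float, end: float, step: str) -> str:
    """Build a /api/v1/query_range URL; start and end are Unix timestamps in seconds."""
    params = {"query": query, "start": f"{start:.0f}", "end": f"{end:.0f}", "step": step}
    return base_url.rstrip("/") + "/api/v1/query_range?" + urllib.parse.urlencode(params)


now = time.time()
# One hour of data at 1-minute resolution.
print(range_query_url("http://localhost:9090", "cpu_usage", now - 3600, now, "1m"))
```

With these parameters each matching series returns at most 61 samples (one per minute, both endpoints included), which keeps responses small enough to graph directly.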
Discovering Metrics
Explore what metrics and labels are available:
- List All Label Names - See all available labels (job, instance, status_code, etc.)
- Get Label Values - Find all values for a specific label (e.g., all job names)
- Find Series by Label Matchers - Discover time series matching specific criteria
- List Metric Metadata - Get descriptions and types for metrics
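These discovery tools line up with four read-only endpoints of the Prometheus HTTP API. The sketch below only builds the URLs, assuming a placeholder base URL; how Webrix maps its tools onto these endpoints internally is an assumption, and the function name is illustrative.

```python
import urllib.parse


def discovery_urls(base_url: str, label: str, matcher: str) -> dict:
    """Return discovery endpoint URLs for one label name and one series matcher."""
    root = base_url.rstrip("/") + "/api/v1"
    return {
        "label_names": f"{root}/labels",
        "label_values": f"{root}/label/{urllib.parse.quote(label, safe='')}/values",
        "series": f"{root}/series?" + urllib.parse.urlencode({"match[]": matcher}),
        "metadata": f"{root}/metadata",
    }


for name, url in discovery_urls("http://localhost:9090", "job", 'up{job="prometheus"}').items():
    print(f"{name}: {url}")
```

Note that the series endpoint requires at least one match[] selector, which must be URL-encoded because of the brackets and quotes.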
Monitoring Operations
Check the health and status of your monitoring infrastructure:
- List Scrape Targets - See all targets being scraped and their status
- List Active Alerts - View currently firing alerts
- List Alerting and Recording Rules - Audit configured rules
- List Alertmanagers - Check Alertmanager discovery status
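As an illustration of what to look for in scrape target output, the Python sketch below filters a targets response for anything that is not healthy. The sample payload is made up and trimmed, but follows the shape of the Prometheus /api/v1/targets response; the function name is hypothetical.

```python
import json

# Trimmed, made-up sample in the shape returned by GET /api/v1/targets.
SAMPLE = json.loads("""
{"status": "success", "data": {"activeTargets": [
  {"labels": {"job": "prometheus", "instance": "localhost:9090"}, "health": "up", "lastError": ""},
  {"labels": {"job": "node", "instance": "10.0.0.5:9100"}, "health": "down", "lastError": "connection refused"}
], "droppedTargets": []}}
""")


def unhealthy_targets(payload: dict) -> list:
    """Return (job, instance, lastError) for every active target that is not up."""
    return [
        (t["labels"].get("job"), t["labels"].get("instance"), t.get("lastError", ""))
        for t in payload["data"]["activeTargets"]
        if t.get("health") != "up"
    ]


print(unhealthy_targets(SAMPLE))
```

The lastError field is usually the quickest pointer to why a target is down (refused connection, timeout, TLS failure, and so on).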
Administration
Manage your Prometheus instance:
- Create TSDB Snapshot - Backup your metrics data
- Delete Time Series - Remove unwanted metrics
- Reload Configuration - Apply config changes without restart
- Get TSDB Statistics - Analyze cardinality and resource usage
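For orientation, the sketch below builds (but does not send) requests for three of these operations against the Prometheus HTTP API, noting which server flag each one depends on. The base URL is a placeholder, authentication headers are omitted for brevity, and the function name is illustrative.

```python
import urllib.request


def admin_requests(base_url: str) -> dict:
    """Build, without sending, requests for common administrative endpoints."""
    root = base_url.rstrip("/")
    return {
        # Requires --web.enable-admin-api
        "snapshot": urllib.request.Request(f"{root}/api/v1/admin/tsdb/snapshot", method="POST"),
        # Requires --web.enable-lifecycle
        "reload": urllib.request.Request(f"{root}/-/reload", method="POST"),
        # Read-only status endpoint; no extra flag needed
        "tsdb_stats": urllib.request.Request(f"{root}/api/v1/status/tsdb", method="GET"),
    }


for name, req in admin_requests("http://localhost:9090").items():
    print(name, req.get_method(), req.full_url)
```

Keeping the flag requirements next to each request makes it easier to tell a disabled endpoint from a genuine failure.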
Troubleshooting
Authentication Failed (401 Unauthorized)
You receive 401 errors when trying to query Prometheus.
Cause: Incorrect username or password, or Basic Auth not configured on Prometheus.
Solution:
- Verify your username and password are correct
- Check that Prometheus is started with --web.config.file pointing to your auth config
- Test authentication manually (quote the URL so the shell does not interpret the ? character):
curl -u username:password 'http://prometheus-url/api/v1/query?query=up'
- Ensure the password in Webrix matches the plain text password (not the bcrypt hash)
Connection Refused or Timeout
Cannot connect to Prometheus instance.
Cause: Incorrect instance URL, Prometheus not running, or network issues.
Solution:
- Verify the Instance URL is correct and includes the protocol (http:// or https://)
- Check that Prometheus is running:
curl http://localhost:9090/-/healthy
- Ensure no firewall rules are blocking access
- For HTTPS instances, ensure the certificate is valid
Admin API Disabled (501 Not Implemented)
Error when trying to use snapshot, delete, or config reload tools.
Cause: Admin API endpoints are not enabled on the Prometheus server.
Solution:
- Restart Prometheus with the appropriate flags:
prometheus --web.enable-admin-api --web.enable-lifecycle
- These flags enable:
  - --web.enable-admin-api: Snapshot, delete series, clean tombstones
  - --web.enable-lifecycle: Config reload
- Note: Only enable these in trusted environments
Query Timeout (503 Service Unavailable)
Queries fail with timeout errors.
Cause: Query is too expensive or takes too long to execute.
Solution:
- Simplify your query or reduce the time range
- Add a timeout parameter to your query (e.g., "2m")
- Increase the query timeout on Prometheus server:
prometheus --query.timeout=5m
- Check if high cardinality is causing performance issues using "Get TSDB Statistics"
- Consider adding more specific label matchers to reduce data scanned
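To show where the timeout parameter goes, the sketch below adds it to an instant query URL alongside a narrower label matcher. The base URL, metric name, and job label value are illustrative, and the function name is hypothetical. The per-query timeout is capped by the server's --query.timeout value, so it can shorten the limit but not extend it.

```python
import urllib.parse


def instant_query_url(base_url: str, query: str, timeout: str = "") -> str:
    """Build a /api/v1/query URL, adding the optional timeout parameter if given."""
    params = {"query": query}
    if timeout:
        params["timeout"] = timeout  # a Prometheus duration such as "30s" or "2m"
    return base_url.rstrip("/") + "/api/v1/query?" + urllib.parse.urlencode(params)


# A narrower matcher plus an explicit timeout.
print(instant_query_url("http://localhost:9090", 'rate(http_requests_total{job="api"}[5m])', "2m"))
```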
Invalid Query (400 Bad Request)
Query fails with parsing or validation errors.
Cause: Syntax error in PromQL expression.
Solution:
- Use the "Format Query" tool to check query syntax
- Use the "Parse Query" tool to see how Prometheus interprets your query
- Common mistakes:
  - Missing closing brackets or parentheses
  - Invalid label matchers (use =, !=, =~, !~)
  - Invalid duration formats (use 5m, 1h, 30s)
- Refer to PromQL documentation
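Unbalanced brackets are the easiest of these mistakes to catch before sending a query. The following is a rough client-side pre-check in Python, not a PromQL parser: it only tracks (), [], and {} and skips quoted strings. It does not understand # comments or any other syntax, and the function name is illustrative.

```python
PAIRS = {")": "(", "]": "[", "}": "{"}


def first_bracket_error(query: str):
    """Return a message about the first unbalanced bracket, or None if balanced.

    Brackets inside single-, double-, or backtick-quoted strings are ignored.
    """
    stack = []
    quote = None
    i = 0
    while i < len(query):
        ch = query[i]
        if quote:
            if ch == "\\" and quote != "`":
                i += 1  # skip the escaped character (backtick strings have no escapes)
            elif ch == quote:
                quote = None
        elif ch in "\"'`":
            quote = ch
        elif ch in "([{":
            stack.append((ch, i))
        elif ch in PAIRS:
            if not stack or stack[-1][0] != PAIRS[ch]:
                return f"unexpected '{ch}' at position {i}"
            stack.pop()
        i += 1
    if quote:
        return "unterminated string"
    if stack:
        ch, pos = stack[-1]
        return f"'{ch}' opened at position {pos} is never closed"
    return None


print(first_bracket_error("rate(http_requests_total[5m])"))
print(first_bracket_error("rate(http_requests_total[5m]"))
```

A query that passes this check can still be invalid, so the "Parse Query" tool remains the authoritative test.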
No Data Returned
Query succeeds but returns empty results.
Cause: No matching time series, or querying outside the retention period.
Solution:
- Use "Find Series by Label Matchers" to verify the series exists
- Check the time range - metrics may have been deleted or expired
- Verify label matchers are correct (case-sensitive)
- Use "List Scrape Targets" to ensure targets are being scraped successfully
- Check for relabeling issues using "Get Relabel Steps"
High Cardinality Warnings
Prometheus performance degrading or high memory usage.
Cause: Too many unique time series being created.
Solution:
- Use "Get TSDB Statistics" to identify high cardinality labels
- Avoid labels with unbounded values (user IDs, timestamps, etc.)
- Use relabeling to drop or aggregate high cardinality labels
- Consider using recording rules to pre-aggregate data
- Review "List Metric Metadata" to audit your metrics
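As a sketch of how TSDB statistics can be read for cardinality problems, the Python below scans a statistics payload for metrics and labels above a threshold. The sample data is made up and trimmed, but uses field names from the Prometheus /api/v1/status/tsdb response; the threshold of 10000 is an arbitrary assumption, and the function name is hypothetical.

```python
import json

# Made-up, trimmed sample in the shape returned by GET /api/v1/status/tsdb.
SAMPLE = json.loads("""
{"status": "success", "data": {
  "seriesCountByMetricName": [
    {"name": "http_requests_total", "value": 48000},
    {"name": "up", "value": 12}
  ],
  "labelValueCountByLabelName": [
    {"name": "user_id", "value": 35000},
    {"name": "job", "value": 4}
  ]
}}
""")


def over_threshold(payload: dict, field: str, threshold: int) -> list:
    """Return the names in one statistics list whose value exceeds the threshold."""
    return [entry["name"] for entry in payload["data"].get(field, []) if entry["value"] > threshold]


print(over_threshold(SAMPLE, "seriesCountByMetricName", 10000))
print(over_threshold(SAMPLE, "labelValueCountByLabelName", 10000))
```

In this sample a user_id label with tens of thousands of values is the likely cause of the series explosion in http_requests_total, which is the unbounded-label pattern to avoid.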
Cannot Delete Series
Series deletion fails or doesn't free up space.
Cause: Admin API not enabled, or tombstones not cleaned.
Solution:
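- Ensure the --web.enable-admin-api flag is set
- After "Delete Time Series", run "Clean Tombstones" to reclaim disk space
- Note: Deletion initially only marks data as deleted (with tombstones); the data stays on disk until it is compacted away or the tombstones are cleaned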
- Create a snapshot before deletion for safety
- Verify deletion with "Find Series by Label Matchers"
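The order of operations matters here, so the sketch below lays out a cautious deletion as three Prometheus admin API requests: snapshot, delete, then clean tombstones. It builds the requests without sending them. The base URL and matcher are placeholders, authentication headers are omitted, and the function name is illustrative.

```python
import urllib.parse
import urllib.request


def deletion_steps(base_url: str, matcher: str) -> list:
    """Return the requests for a cautious series deletion, in the order to send them."""
    admin = base_url.rstrip("/") + "/api/v1/admin/tsdb"
    match = urllib.parse.urlencode({"match[]": matcher})
    return [
        urllib.request.Request(f"{admin}/snapshot", method="POST"),  # 1. back up first
        urllib.request.Request(f"{admin}/delete_series?{match}", method="POST"),  # 2. write tombstones
        urllib.request.Request(f"{admin}/clean_tombstones", method="POST"),  # 3. reclaim disk space
    ]


for step in deletion_steps("http://localhost:9090", 'old_metric{job="legacy"}'):
    print(step.get_method(), step.full_url)
```

All three endpoints require --web.enable-admin-api, so a failure at step 1 is a quick signal that the flag is missing before any data is touched.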
Important Notes
Security Considerations
- Always use HTTPS in production to protect Basic Auth credentials
- Limit access to admin endpoints (--web.enable-admin-api) to trusted users only
- Consider network-level access controls for Prometheus
- Regular backups using "Create TSDB Snapshot" are recommended
Performance Best Practices
- Use specific label matchers in queries to reduce data scanned
- Monitor cardinality with "Get TSDB Statistics"
- Use recording rules for frequently-queried expensive aggregations
- Set appropriate retention periods to balance storage and query performance
PromQL Tips
- rate() and irate() for counter metrics
- increase() for cumulative counters over time
- histogram_quantile() for histogram metrics
- Use by and without for aggregations
- Label matchers: = (equal), != (not equal), =~ (regex match), !~ (regex not match)