Graylog

Julien A. Raemy edited this page Aug 11, 2025 · 1 revision

📢 Important Update: LINDAS is undergoing a major infrastructure migration to LINDASnext. The current Stardog-based system is being replaced with GraphDB EE. New contracts have been awarded in two lots for the period 2025-2034: Lot 1 (Infrastructure) to Cognizone and Lot 2 (Application Development) to Liip, Zazuko, and Adnovum with their respective partners. Services remain operational during transition. See LINDASnext for details.

LINDAS Graylog Guide

Graylog is the centralized logging platform for LINDAS, providing comprehensive log management and analysis capabilities for all system components.

Overview

Graylog collects and manages logs from all LINDAS components, including Stardog, Trifid, Visualize, Cube Creator, and infrastructure services. This gives broad visibility into system operations, but the volume of data makes it hard to find specific entries without proper filtering.

Access

Graylog Interface: https://logging.ldbar.ch/search

Documentation Resources

  • Official Documentation
  • VSHN-Specific Documentation
  • How-To Guides

Understanding Log Streams

Graylog organizes logs into streams to categorize different types of log entries. Current streams include:

Stream  | Description                         | Components
Stardog | Database operations and queries     | Stardog triplestore
Varnish | HTTP caching operations             | Varnish cache layer
Zazuko  | Application-level operations        | Trifid (SPARQL endpoints), possibly other Zazuko components

Note: The Zazuko stream likely includes Trifid logs, but this requires validation.

Identifying Stream Membership

When viewing a log entry, stream membership is shown as: Routed into streams: [Stream Name]

Understanding Log Entry Structure

Basic Fields

Every log entry contains standard fields:

  • timestamp: When the event occurred
  • hostname: Source system identifier
  • kubernetes_*: Kubernetes-specific metadata fields

Message Field Structure

The message field contains the actual log content emitted by a component. Graylog extracts structured data from these messages and exposes it as individual searchable fields.

Example: A log message might contain:

{
  "level": "info",
  "pid": "12345", 
  "msg": "request completed",
  "responseTime": 150
}

The keys of this message become searchable as separate fields: level, pid, msg, responseTime.
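To make the idea concrete, here is a minimal Python sketch of how a JSON message maps to separate fields. It only illustrates the concept; it is not how Graylog implements extraction, and the helper name extract_fields is hypothetical.

```python
import json


def extract_fields(message: str) -> dict:
    """Parse a JSON log message into a dict of field name -> value.

    Hypothetical helper for illustration; non-JSON messages are kept
    as a single opaque "message" field.
    """
    try:
        parsed = json.loads(message)
    except json.JSONDecodeError:
        return {"message": message}
    if not isinstance(parsed, dict):
        return {"message": message}
    return parsed


raw = '{"level": "info", "pid": "12345", "msg": "request completed", "responseTime": 150}'
fields = extract_fields(raw)
print(sorted(fields))  # → ['level', 'msg', 'pid', 'responseTime']
```

Note that pid stays a string and responseTime stays a number, which matters later for numeric comparisons.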

Search Query Syntax

Basic Field Searches

Search specific fields using the format field_name:value:

kubernetes_labels_app_kubernetes_io_name:stardog
msg:"request completed"
reqId:"req-1uj7"

Note: Enclose values in quotes when they contain spaces or special characters.
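When queries are generated by a script, the quoting rule can be applied automatically. The sketch below is a hypothetical helper (field_term is not part of Graylog); it quotes conservatively, on the assumption that quoting a value that did not strictly need it is harmless.

```python
import re

# Values made only of these characters are left unquoted; anything else
# (spaces, hyphens, colons, quotes, ...) is wrapped in double quotes.
_SAFE_VALUE = re.compile(r"[A-Za-z0-9_.]+")


def field_term(field: str, value: str) -> str:
    """Build a field_name:value search term, quoting the value if needed."""
    if _SAFE_VALUE.fullmatch(value):
        return f"{field}:{value}"
    # Escape backslashes first, then double quotes, before wrapping.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


print(field_term("kubernetes_labels_app_kubernetes_io_name", "stardog"))
print(field_term("msg", "request completed"))
print(field_term("reqId", "req-1uj7"))
```

The three calls reproduce the three search terms listed above.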

Combining Search Terms

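  • Default behavior: Multiple terms without an operator are combined with OR
  • Explicit AND: Use AND between terms to require all of them
  • Explicit OR: Use OR between terms (same as the default)
  • Operators are case sensitive and must be written in upper case (AND, OR, NOT)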

Examples:

kubernetes_labels_app_kubernetes_io_instance:trifid-lindas-int AND reqId:"req-1uj7"
level:error OR level:warn
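When AND and OR are mixed in one query, parentheses make the grouping explicit. A small sketch, using hypothetical helpers all_of and any_of, shows how such a query could be assembled:

```python
def any_of(*terms: str) -> str:
    """Join terms with OR; wrap in parentheses when there are several."""
    joined = " OR ".join(terms)
    return f"({joined})" if len(terms) > 1 else joined


def all_of(*terms: str) -> str:
    """Join terms with AND; wrap in parentheses when there are several."""
    joined = " AND ".join(terms)
    return f"({joined})" if len(terms) > 1 else joined


query = all_of(
    "kubernetes_labels_app_kubernetes_io_instance:trifid-lindas-int",
    any_of("level:error", "level:warn"),
)
print(query)
# → (kubernetes_labels_app_kubernetes_io_instance:trifid-lindas-int AND (level:error OR level:warn))
```

The outer parentheses are redundant but harmless; they keep the result safe to embed in a larger query.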

Component-Specific Log Identification

SPARQL Queries via Trifid

Filters:

kubernetes_labels_app_kubernetes_io_name:trifid
kubernetes_labels_app_kubernetes_io_instance:trifid-lindas-prod

Environment-specific instances:

  • trifid-lindas-prod: Production
  • trifid-lindas-int: Integration
  • trifid-lindas-test: Testing

SPARQL Query Tracking:

  • Multiple log entries may share the same reqId
  • Start entry: Query initiation
  • Completion entry: Contains msg:"request completed" and responseTime

Tip: To display the responseTime field in results, customize columns using the edit button next to "All Messages".
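For offline analysis of exported log entries, the reqId pairing described above can be sketched in Python. This assumes the entries have already been loaded as dictionaries with the field names used in this guide; the start message text "incoming request" and the function name are assumptions made for the example.

```python
from collections import defaultdict


def response_times_by_request(entries):
    """Map each reqId to the responseTime of its completion entry.

    Requests without a "request completed" entry map to None.
    """
    by_req = defaultdict(list)
    for entry in entries:
        req_id = entry.get("reqId")
        if req_id is not None:
            by_req[req_id].append(entry)

    result = {}
    for req_id, group in by_req.items():
        completed = [e for e in group if e.get("msg") == "request completed"]
        result[req_id] = completed[-1].get("responseTime") if completed else None
    return result


# Sample data; "incoming request" is a hypothetical start message.
entries = [
    {"reqId": "req-1uj7", "msg": "incoming request"},
    {"reqId": "req-1uj7", "msg": "request completed", "responseTime": 150},
    {"reqId": "req-2abc", "msg": "incoming request"},
]
print(response_times_by_request(entries))
# → {'req-1uj7': 150, 'req-2abc': None}
```

A None value points at a request that started but has no completion entry in the selected time range.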

Cube Creator Logs

Filters:

kubernetes_labels_app_kubernetes_io_name:app
kubernetes_labels_app_kubernetes_io_part-of:cube-creator
kubernetes_namespace_name:zazuko-int
kubernetes_pod_name:cube-creator-app-955fffdf4-tz9p5

Note: Pod names are dynamic and will change with deployments.
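Because the pod name changes with each deployment, a saved query may be more durable if it matches on the stable prefix with a trailing wildcard. The sketch below assumes the usual Deployment naming pattern (name, ReplicaSet hash, pod suffix) and that wildcard matching works on the kubernetes_pod_name field in this Graylog setup; both are assumptions to verify.

```python
def pod_name_prefix_query(pod_name: str) -> str:
    """Turn a concrete pod name into a prefix query with a trailing wildcard.

    Assumes names of the form <deployment>-<replicaset-hash>-<pod-suffix>.
    Names with fewer than three dash-separated parts are used unchanged.
    """
    parts = pod_name.rsplit("-", 2)
    if len(parts) < 3:
        return f"kubernetes_pod_name:{pod_name}"
    return f"kubernetes_pod_name:{parts[0]}-*"


print(pod_name_prefix_query("cube-creator-app-955fffdf4-tz9p5"))
# → kubernetes_pod_name:cube-creator-app-*
```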

Varnish Cache Logs

Filters:

kubernetes_labels_app_kubernetes_io_name:varnish
kubernetes_labels_app_kubernetes_io_instance:varnish-prod

Visualize Application Logs

Challenge: No direct Kubernetes label filtering available for Visualize.

Workaround: Filter by remoteAddress field using known Visualize IP addresses.

Recommendation: Contact Zazuko team for current Visualize IP ranges.
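Once the addresses are known, the workaround amounts to an OR query over remoteAddress. A minimal sketch follows; the addresses shown are placeholders from the reserved documentation range, not real Visualize addresses, and the helper name is hypothetical.

```python
def remote_address_query(addresses):
    """Build an OR query matching any of the given remoteAddress values."""
    if not addresses:
        raise ValueError("at least one address is required")
    # Quote each address so dots and colons are treated literally.
    return " OR ".join(f'remoteAddress:"{addr}"' for addr in addresses)


# Placeholder addresses (documentation range), to be replaced with real ones.
print(remote_address_query(["192.0.2.10", "192.0.2.11"]))
# → remoteAddress:"192.0.2.10" OR remoteAddress:"192.0.2.11"
```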

Advanced Usage Tips

Performance Analysis

  1. Query Response Times: Search for msg:"request completed" and display responseTime column
  2. Error Investigation: Use level:error to identify system issues
  3. Request Tracing: Use reqId to follow specific requests through the system

Time-based Analysis

  • Use Graylog's time range selector for focused analysis
  • Combine with component filters for targeted investigation

Custom Queries for Common Tasks

Find slow SPARQL queries:

msg:"request completed" AND responseTime:>5000
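The same filter can be applied to exported entries outside Graylog, which is useful when responseTime arrives as text rather than a number. This is a sketch under assumptions: entries are dictionaries with the field names used in this guide, and the threshold is in the same unit as responseTime (presumably milliseconds, which the guide does not state).

```python
def slow_requests(entries, threshold=5000):
    """Return completion entries whose responseTime exceeds the threshold,
    slowest first. Entries with a missing or non-numeric value are skipped."""
    slow = []
    for entry in entries:
        if entry.get("msg") != "request completed":
            continue
        try:
            duration = float(entry.get("responseTime"))
        except (TypeError, ValueError):
            continue
        if duration > threshold:
            slow.append((duration, entry))
    slow.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in slow]


sample = [
    {"reqId": "a", "msg": "request completed", "responseTime": 150},
    {"reqId": "b", "msg": "request completed", "responseTime": "7200"},
    {"reqId": "c", "msg": "request completed", "responseTime": 9100},
    {"reqId": "d", "msg": "request completed"},
]
print([e["reqId"] for e in slow_requests(sample)])  # → ['c', 'b']
```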

Monitor error rates by component:

level:error AND kubernetes_labels_app_kubernetes_io_name:[component_name]
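To compare components side by side, error entries from an export can be counted per component label. A minimal sketch, assuming entries are dictionaries carrying the level and kubernetes_labels_app_kubernetes_io_name fields; the "unknown" bucket is an assumption for entries without that label.

```python
from collections import Counter

COMPONENT_FIELD = "kubernetes_labels_app_kubernetes_io_name"


def error_counts(entries, field=COMPONENT_FIELD):
    """Count entries with level "error", grouped by component label."""
    return Counter(
        entry.get(field, "unknown")
        for entry in entries
        if entry.get("level") == "error"
    )


sample = [
    {"level": "error", COMPONENT_FIELD: "trifid"},
    {"level": "info", COMPONENT_FIELD: "trifid"},
    {"level": "error", COMPONENT_FIELD: "varnish"},
    {"level": "error", COMPONENT_FIELD: "trifid"},
    {"level": "error"},
]
print(error_counts(sample).most_common())
# → [('trifid', 2), ('varnish', 1), ('unknown', 1)]
```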

Trace a specific request:

reqId:"[specific_request_id]"

Best Practices

  1. Start Broad: Begin with component-level filters, then narrow down
  2. Use Time Ranges: Limit searches to relevant time periods
  3. Combine Filters: Use multiple criteria to isolate specific issues
  4. Save Searches: Bookmark useful queries for repeated analysis
  5. Monitor Streams: Regularly check different streams for system health