Lumigo OpenTelemetry Distribution for Python
The Lumigo OpenTelemetry Distribution for Python is a package that provides no-code distributed tracing for containerized applications.
The Lumigo OpenTelemetry Distribution for Python is made of several upstream OpenTelemetry packages, with additional automated quality assurance and customizations that optimize for no-code injection: you should not need to change a single line of code in your application in order to use the Lumigo OpenTelemetry Distribution. (See the No-code activation section for auto-instrumentation instructions.)
Note: If you are looking for the Lumigo Python tracer for AWS Lambda functions, lumigo-tracer
is the package you should use instead.
Adding the Lumigo OpenTelemetry Distro for Python to your application is a three-step process:
The lumigo_opentelemetry
package needs to be a dependency of your application.
In most cases, you will add lumigo_opentelemetry
as a line in requirements.txt
:
lumigo_opentelemetry
Or, you may use pip
:
pip install lumigo_opentelemetry
Configure the LUMIGO_TRACER_TOKEN
environment variable with the token value generated for you by the Lumigo platform, under Settings --> Tracing --> Manual tracing
:
LUMIGO_TRACER_TOKEN=<token>
Replace <token>
with the token generated for you by the Lumigo platform.
It is also strongly suggested that you set the OTEL_SERVICE_NAME
environment variable to the service name you have chosen for your application:
OTEL_SERVICE_NAME=<service name>
Replace <service name> with the desired name of the service.
Note: While you are providing environment variables for configuration, consider also providing the one needed for no-code tracer activation :-)
There are two ways to activate the lumigo_opentelemetry
package: one based on importing the package in code (manual activation), and the other via the environment (no-code activation).
The no-code activation approach is the preferred one.
Note: The instructions in this section are mutually exclusive with those provided in the Manual activation section.
Set the following environment variable:
AUTOWRAPT_BOOTSTRAP=lumigo_opentelemetry
Note: The instructions in this section are mutually exclusive with those provided in the No-code activation section.
Import lumigo_opentelemetry
at the beginning of your main file:
import lumigo_opentelemetry
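As a quick sanity check of the setup steps above, a deployment script could inspect the environment before the application starts. The following is only a sketch: the check_lumigo_environment helper and its messages are hypothetical and are not part of the lumigo_opentelemetry package; only the environment variable names come from this document.

```python
import os
from typing import List, Mapping


def check_lumigo_environment(environ: Mapping[str, str]) -> List[str]:
    """Return warnings about the Lumigo-related environment variables.

    Hypothetical helper for illustration; not part of lumigo_opentelemetry.
    """
    warnings = []
    if not environ.get("LUMIGO_TRACER_TOKEN"):
        warnings.append("LUMIGO_TRACER_TOKEN is not set")
    if not environ.get("OTEL_SERVICE_NAME"):
        warnings.append("OTEL_SERVICE_NAME is not set (strongly suggested)")
    if environ.get("AUTOWRAPT_BOOTSTRAP") != "lumigo_opentelemetry":
        # Without no-code activation, the package must be imported in code.
        warnings.append(
            "AUTOWRAPT_BOOTSTRAP is not lumigo_opentelemetry; "
            "manual activation is needed"
        )
    return warnings


if __name__ == "__main__":
    for warning in check_lumigo_environment(os.environ):
        print(f"warning: {warning}")
```

The helper takes the environment as a plain mapping so that it can be exercised without touching the real process environment.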
The Lumigo OpenTelemetry Distro for Python is made of several upstream OpenTelemetry packages, together with additional logic and, as such, the environment variables that work with "vanilla" OpenTelemetry work also with the Lumigo OpenTelemetry Distro for Python. Specifically supported are:
The lumigo_opentelemetry
package additionally supports the following configuration options as environment variables:
LUMIGO_TRACER_TOKEN
: [Required] The token used to send data to Lumigo; you will find the right value in Lumigo under Settings --> Tracing --> Manual tracing
.
LUMIGO_DEBUG=true
: Enables debug logging
LUMIGO_DEBUG_SPANDUMP
: Path to a local file where a local copy of the spans sent to Lumigo will be written; this option is handy for local testing but should not be used in production unless you are instructed to do so by Lumigo support.
LUMIGO_SECRET_MASKING_REGEX=["regex1", "regex2"]
: Prevents Lumigo from sending keys that match the supplied regular expressions. All regular expressions are case-insensitive. By default, Lumigo applies the following regular expressions: [".*pass.*", ".*key.*", ".*secret.*", ".*credential.*", ".*passphrase.*"]
.
LUMIGO_SECRET_MASKING_REGEX_HTTP_REQUEST_BODIES
, LUMIGO_SECRET_MASKING_REGEX_HTTP_REQUEST_HEADERS
, LUMIGO_SECRET_MASKING_REGEX_HTTP_RESPONSE_BODIES
, LUMIGO_SECRET_MASKING_REGEX_HTTP_RESPONSE_HEADERS
, LUMIGO_SECRET_MASKING_REGEX_HTTP_QUERY_PARAMS
, LUMIGO_SECRET_MASKING_REGEX_ENVIRONMENT
.
LUMIGO_SWITCH_OFF=true
: This option disables the Lumigo OpenTelemetry distro entirely; no instrumentation will be injected, no tracing data will be collected.
LUMIGO_REPORT_DEPENDENCIES=false
: This option disables the built-in dependency reporting to Lumigo SaaS. For more information, refer to the Automated dependency reporting section.
LUMIGO_AUTO_FILTER_EMPTY_SQS
: This option enables the automatic filtering of empty SQS messages from being sent to Lumigo SaaS. For more information, refer to the Filtering out empty SQS messages section.
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX='["regex1", "regex2"]'
: This option enables the filtering of client and server endpoints through regular expression searches. Fine-tune your settings via the following environment variables, which work in conjunction with LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX
for a specific span type:
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER
applies the regular expression search exclusively to server spans. Searching is performed against the following attributes on a span: url.path
and http.target
.
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT
applies the regular expression search exclusively to client spans. Searching is performed against the following attributes on a span: url.full
and http.url
.
For more information, see the Filtering http endpoints section.
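To get a feel for which keys the default LUMIGO_SECRET_MASKING_REGEX expressions would affect, the matching can be approximated locally. This is a sketch under assumptions: the keys_to_mask helper is hypothetical, and matching each expression against the whole key is an assumption about how the distro applies them; only the default expressions and their case-insensitivity come from the description above.

```python
import re
from typing import Iterable, List, Sequence

# Default expressions as listed in this document.
DEFAULT_MASKING_REGEXES = (
    ".*pass.*",
    ".*key.*",
    ".*secret.*",
    ".*credential.*",
    ".*passphrase.*",
)


def keys_to_mask(
    keys: Iterable[str],
    patterns: Sequence[str] = DEFAULT_MASKING_REGEXES,
) -> List[str]:
    """Return the keys matched by at least one case-insensitive expression.

    Hypothetical helper; whole-key matching is an assumption.
    """
    compiled = [re.compile(pattern, re.IGNORECASE) for pattern in patterns]
    return [key for key in keys if any(c.fullmatch(key) for c in compiled)]


if __name__ == "__main__":
    print(keys_to_mask(["Password", "API_KEY", "username", "aws_secret", "region"]))
```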
Execution Tags allow you to dynamically add dimensions to your invocations so that they can be identified, searched for, and filtered in Lumigo. For example, in multi-tenanted systems, execution tags are often used to mark invocations with the identifiers of the end-users that trigger them, for analysis (e.g., in the Explore view) and alerting purposes.
In the Lumigo OpenTelemetry Distro for Python, execution tags are represented as span attributes and, specifically, as span attributes with the lumigo.execution_tags.
prefix.
For example, you could add an execution tag as follows:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.foo','bar')
Notice that, using OpenTelemetry's get_current_span()
API, you do not need to keep track of the current span: you can get it at any point of your program execution.
In OpenTelemetry, span attributes can be strings
, numbers
(double precision floating point or signed 64 bit integer), booleans
(a.k.a. "primitive types"), and arrays of one primitive type (e.g., an array of strings, an array of numbers, or an array of booleans).
In Lumigo, booleans and numbers are transformed to strings.
IMPORTANT: If you use the Span.set_attribute
API multiple times on the same span to set values for the same key, each call overrides the previous value rather than adding to it:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.foo','bar')
get_current_span().set_attribute('lumigo.execution_tags.foo','baz')
In the snippets above, the foo
execution tag will have in Lumigo only the baz
value!
Multiple values for an execution tag are supported as follows:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.foo',['bar', 'baz'])
Tuples are also supported to specify multiple values for an execution tag:
from opentelemetry.trace import get_current_span
get_current_span().set_attribute('lumigo.execution_tags.bar',('baz','xyz',))
The list snippet above will produce in Lumigo the foo tag with both the bar and baz values; the tuple snippet will produce the bar tag with both the baz and xyz values.
Another option for setting multiple values is to set execution tags on different spans of an invocation.
In Lumigo, multiple spans may be merged together into one invocation, which is the entry that you see, for example, in the Explore view. The invocation will include all execution tags on all its spans, and merge their values:
from opentelemetry import trace
trace.get_current_span().set_attribute('lumigo.execution_tags.foo','bar')
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span('child_span') as child_span:
    child_span.set_attribute('lumigo.execution_tags.foo','baz')
In the examples above, the invocation in Lumigo resulting from executing the code will have both bar
and baz
values associated with the foo
execution tag.
Which spans are merged in the same invocation depends on the parent-child relations among those spans.
Explaining this topic is outside the scope of this documentation; a good first read to get deeper into the topic is the Traces documentation of OpenTelemetry.
In case your execution tags on different spans appear on different invocations than what you would expect, get in touch with Lumigo support.
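Because Span.set_attribute overrides earlier values for the same key, code that adds tag values incrementally on one span may find it convenient to accumulate them first. The following is a minimal sketch of that idea: ExecutionTagCollector is a hypothetical helper, not part of the distro, and FakeSpan is only a stand-in so that the example runs without OpenTelemetry installed.

```python
from collections import defaultdict
from typing import Dict, List

EXECUTION_TAG_PREFIX = "lumigo.execution_tags."


class ExecutionTagCollector:
    """Hypothetical helper: remembers all values added for each tag."""

    def __init__(self, span) -> None:
        self._span = span
        self._values: Dict[str, List[str]] = defaultdict(list)

    def add(self, key: str, value: str) -> None:
        values = self._values[key]
        if value not in values:
            values.append(value)
        # Re-set the attribute with every value seen so far, so that the
        # override performed by set_attribute does not lose earlier values.
        self._span.set_attribute(EXECUTION_TAG_PREFIX + key, list(values))


class FakeSpan:
    """Stand-in for an OpenTelemetry span, used only for this demo."""

    def __init__(self) -> None:
        self.attributes: Dict[str, object] = {}

    def set_attribute(self, key: str, value: object) -> None:
        self.attributes[key] = value  # Overrides, as described above


span = FakeSpan()
tags = ExecutionTagCollector(span)
tags.add("foo", "bar")
tags.add("foo", "baz")
print(span.attributes)  # {'lumigo.execution_tags.foo': ['bar', 'baz']}
```

In an application, the object returned by opentelemetry.trace.get_current_span() would be passed instead of FakeSpan.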
The key of an execution tag cannot contain the . character; for example, lumigo.execution_tags.my.tag is not a valid tag. The OpenTelemetry Span.set_attribute() API will not fail or log warnings, but the tag will be displayed as my in Lumigo.
The lumigo.execution_tags. prefix does not count against the 50-character limit on execution tag keys.
Programmatic Errors allow you to customize errors, and to monitor and troubleshoot issues that should not necessarily interfere with the service. For example, an application may try to remove a user who doesn't exist. These custom errors can be captured by adding just a few lines of additional code to your application.
Programmatic errors indicate that a non-fatal error occurred, such as an application error. You can log programmatic errors, track custom error issues, and trigger alerts.
Programmatic errors are created by adding span events with a custom attribute whose key is lumigo.type
.
For example, you could add a programmatic error as follows:
from opentelemetry.trace import get_current_span
get_current_span().add_event('<error-message>', {'lumigo.type': '<error-type>'})
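The "remove a user who doesn't exist" scenario mentioned above could be wired up roughly as follows. This is a sketch under assumptions: record_programmatic_error, remove_user, and the UserNotFound error type are hypothetical names chosen for illustration, and FakeSpan is a stand-in that only mimics the add_event call so that the example runs without OpenTelemetry installed.

```python
from typing import Any, Dict, List, Optional, Tuple


def record_programmatic_error(span, message: str, error_type: str) -> None:
    """Add a span event carrying the lumigo.type attribute (hypothetical helper)."""
    span.add_event(message, {"lumigo.type": error_type})


def remove_user(span, users: Dict[str, str], user_id: str) -> bool:
    """Remove a user; report a non-fatal programmatic error if it is unknown."""
    if user_id not in users:
        record_programmatic_error(
            span, f"user {user_id} does not exist", "UserNotFound"
        )
        return False
    del users[user_id]
    return True


class FakeSpan:
    """Stand-in for an OpenTelemetry span, used only for this demo."""

    def __init__(self) -> None:
        self.events: List[Tuple[str, Dict[str, Any]]] = []

    def add_event(self, name: str, attributes: Optional[Dict[str, Any]] = None) -> None:
        self.events.append((name, dict(attributes or {})))


span = FakeSpan()
users = {"a": "Alice"}
remove_user(span, users, "b")  # Unknown user: an event is recorded
remove_user(span, users, "a")  # Known user: removed, no event
print(span.events)
```

In an application, the span would come from opentelemetry.trace.get_current_span() rather than FakeSpan.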
Supported versions, by Python version:
Instrumentation | Package | 3.7 | 3.8 | 3.9 | 3.10 | 3.11
---|---|---|---|---|---|---
botocore | boto3 | 1.17.22~1.33.13 | 1.17.22~1.34.11 | 1.17.22~1.34.11 | 1.17.22~1.34.11 | 1.17.22~1.34.11 |
django | django | 3.2.1~3.2.23 | 3.2.1~3.2.23 | 3.2.1~3.2.23 | 3.2.1~3.2.23 | 3.2.1~3.2.23 |
3.2 | 4.0.1~4.2.8 | 4.0.1~4.2.8 | 4.0.1~4.2.8 | 4.0.1~4.2.8 | ||
3.2 | 3.2 | 5.0.1~5.0.1 | 5.0.1~5.0.1 | |||
4.0 | 4.0 | 3.2 | 3.2 | |||
4.0.a1 | 4.0.a1 | 4.0 | 4.0 | |||
4.0.b1 | 4.0.b1 | 4.0.a1 | 4.0.a1 | |||
4.0.rc1 | 4.0.rc1 | 4.0.b1 | 4.0.b1 | |||
4.1 | 4.1 | 4.0.rc1 | 4.0.rc1 | |||
4.1.a1 | 4.1.a1 | 4.1 | 4.1 | |||
4.1.b1 | 4.1.b1 | 4.1.a1 | 4.1.a1 | |||
4.1.rc1 | 4.1.rc1 | 4.1.b1 | 4.1.b1 | |||
4.1rc1 | 4.1rc1 | 4.1.rc1 | 4.1.rc1 | |||
4.2 | 4.2 | 4.1rc1 | 4.1rc1 | |||
4.2.a1 | 4.2.a1 | 4.2 | 4.2 | |||
4.2.b1 | 4.2.b1 | 4.2.a1 | 4.2.a1 | |||
4.2.rc1 | 4.2.rc1 | 4.2.b1 | 4.2.b1 | |||
4.2rc1 | 4.2rc1 | 4.2.rc1 | 4.2.rc1 | |||
4.2rc1 | 4.2rc1 | |||||
5.0 | 5.0 | |||||
5.0rc1 | 5.0rc1 | |||||
fastapi | uvicorn | 0.11.3~0.22.0 | 0.11.3~0.22.0 | 0.11.3~0.22.0 | 0.11.3~0.22.0 | 0.12.0~0.22.0 |
0.24.0~0.25.0 | 0.24.0~0.25.0 | 0.24.0~0.25.0 | 0.24.0~0.25.0 | |||
fastapi | 0.56.1~0.100.0 | 0.56.1~0.100.0 | 0.56.1~0.100.0 | 0.56.1~0.100.0 | 0.56.1~0.100.0 | |
0.100.0b2~0.103.2 | 0.100.0b2~0.108.0 | 0.100.0b2~0.108.0 | 0.100.0b2~0.108.0 | 0.100.0b2~0.108.0 | ||
flask | flask | 2.0.0~2.2.5 | 2.0.0~2.2.5 | 2.0.0~2.2.5 | 2.0.0~2.2.5 | 2.0.0~2.2.5 |
grpcio | grpcio | 1.45.0~1.60.0rc1 | 1.45.0~1.60.0rc1 | 1.45.0~1.60.0rc1 | 1.45.0~1.60.0rc1 | 1.49.0~1.60.0rc1 |
kafka_python | kafka_python | 2.0.0~2.0.2 | 2.0.0~2.0.2 | 2.0.0~2.0.2 | 2.0.0~2.0.2 | 2.0.0~2.0.2 |
pika | pika | 1.0.0 | 1.0.0 | 1.0.0 | 1.0.0 | 1.0.0 |
1.0.1~1.3.0 | 1.0.1~1.3.0 | 1.0.1~1.3.0 | 1.0.1~1.3.0 | 1.0.1~1.3.0 | ||
1.3.0rc5~1.3.2 | 1.3.0rc5~1.3.2 | 1.3.0rc5~1.3.2 | 1.3.0rc5~1.3.2 | 1.3.0rc5~1.3.2 | ||
psycopg | psycopg-binary | 3.1.1~3.1.16 | 3.1.1~3.1.16 | 3.1.1~3.1.16 | 3.1.1~3.1.16 | 3.1.4~3.1.16 |
3.1 | 3.1 | 3.1 | 3.1 | |||
psycopg | 3.1.1~3.1.16 | 3.1.1~3.1.16 | 3.1.1~3.1.16 | 3.1.1~3.1.16 | 3.1.1~3.1.16 | |
3.1 | 3.1 | 3.1 | 3.1 | 3.1 | ||
psycopg2 | psycopg2 | 2.7.5~2.9.9 | 2.8.1~2.9.9 | 2.8.1~2.9.9 | 2.8.1~2.8.6 | 2.9.5~2.9.9 |
2.8 | 2.8 | 2.8 | 2.9.5~2.9.9 | |||
2.9 | 2.9 | 2.9 | 2.8 | |||
psycopg2-binary | 2.7.5~2.9.9 | 2.8.1~2.9.9 | 2.8.1~2.9.9 | 2.8.1~2.8.6 | 2.9.5~2.9.9 | |
2.8 | 2.8 | 2.8 | 2.9.5~2.9.9 | |||
2.9 | 2.9 | 2.9 | 2.8 | |||
pymongo | pymongo | 3.1.1~3.3.1 | 3.1.1~3.3.1 | 3.1.1~3.3.1 | 3.1.1~3.3.1 | 3.1.1~3.3.1 |
3.5.0~3.13.0 | 3.5.0~3.13.0 | 3.5.0~3.13.0 | 3.5.0~3.13.0 | 3.5.0~3.13.0 | ||
4.0.1~4.6.1 | 4.0.1~4.6.1 | 4.0.1~4.6.1 | 4.0.1~4.6.1 | 4.0.1~4.6.1 | ||
3.1 | 3.1 | 3.1 | 3.1 | 3.1 | ||
3.2 | 3.2 | 3.2 | 3.2 | 3.2 | ||
4.0 | 4.0 | 4.0 | 4.0 | 4.0 | ||
pymysql | pymysql | 0.9.0~0.10.1 | 0.9.0~0.10.1 | 0.9.0~0.10.1 | 0.9.0~0.10.1 | 0.9.0~0.10.1 |
1.0.0~1.0.3 | 1.0.0~1.0.3 | 1.0.0~1.0.3 | 1.0.0~1.0.3 | 1.0.0~1.0.3 | ||
1.1.0~1.1.0rc2 | 1.1.0~1.1.0rc2 | 1.1.0~1.1.0rc2 | 1.1.0~1.1.0rc2 | 1.1.0~1.1.0rc2 | ||
redis | redis | 4.1.1~4.2.0 | 4.1.1~4.2.0 | 4.1.1~4.2.0 | 4.1.1~4.2.0 | 4.1.1~4.2.0 |
4.2.1~4.6.0 | 4.2.1~4.6.0 | 4.2.1~4.6.0 | 4.2.1~4.6.0 | 4.2.1~4.6.0 | ||
5.0.0~5.1.0a1 | 5.0.0~5.1.0a1 | 5.0.0~5.1.0a1 | 5.0.0~5.1.0a1 | 5.0.0~5.1.0a1 |
To provide better support and better data-driven product decisions with respect to which packages to support next, the Lumigo OpenTelemetry Distro for Python will report to Lumigo SaaS on startup the packages and their versions used in this application, together with the OpenTelemetry resource data to enable analytics in terms of which platforms use which dependencies.
The data uploaded to Lumigo is a set of key-value pairs with package names and versions. Similar data is available through the tracing data sent to Lumigo, except that this report aims at covering dependencies for which the Lumigo OpenTelemetry Distro for Python does not have instrumentation yet. Lumigo's only goal for this analytics data is to be able to give you the instrumentations you need without you needing to tell us!
The dependencies data is sent only when a LUMIGO_TRACER_TOKEN
is present in the process environment, and you can opt out of it by setting the LUMIGO_REPORT_DEPENDENCIES=false
environment variable.
The Lumigo OpenTelemetry Distro will automatically create the following OpenTelemetry constructs and provide them to a TracerProvider
.
The attributes from the default resource:
telemetry.sdk.language
: python
telemetry.sdk.name
: opentelemetry
telemetry.sdk.version
: depends on the version of the opentelemetry-sdk
included in the dependencies
The lumigo.distro.version
containing the version of the Lumigo OpenTelemetry Distro for Python as specified in the VERSION file
The following process.runtime.*
attributes as specified in the Process Semantic Conventions:
process.runtime.description
process.runtime.name
process.runtime.version
A non-standard process.environ
resource attribute, containing a stringified representation of the process environment, with environment variables scrubbed based on the LUMIGO_SECRET_MASKING_REGEX
configuration.
If the instrumented Python application is running on the Amazon Elastic Container Service (ECS):
cloud.provider
attribute with value aws
cloud.platform
with value aws_ecs
container.name
with the hostname of the ECS Task container
container.id
with the ID of the Docker container (based on the cgroup id)
If the ECS task uses the ECS agent v1.4.0, and therefore has access to the Task metadata endpoint version 4, the following experimental attributes as specified in the AWS ECS Resource Attributes specification:
aws.ecs.container.arn
aws.ecs.cluster.arn
aws.ecs.launchtype
aws.ecs.task.arn
aws.ecs.task.family
aws.ecs.task.revision
If the instrumented Python application is running on Kubernetes:
k8s.pod.uid
with the Pod identifier, supported for both cgroups v1 and v2
If the LUMIGO_TRACER_TOKEN
environment variable is set: a BatchSpanProcessor, which uses an OTLPSpanExporter
to push tracing data to Lumigo
If the LUMIGO_DEBUG_SPANDUMP
environment variable is set: a SimpleSpanProcessor
, which uses a ConsoleSpanExporter
to save to file the spans collected. Do not use this in production!
The following SDK environment variables are supported:
OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT
OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT
If the OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT
environment variable is not set, the span attribute size limit will be taken from the OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT
environment variable. The default size limit when neither is set is 2048.
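The fallback order described above can be written down as a small function. This is only a sketch of the documented precedence, not the distro's actual code: the resolve_span_attribute_length_limit name is hypothetical, and how invalid (non-numeric) values are handled is not specified in this document, so the sketch simply lets int() raise.

```python
import os
from typing import Mapping

DEFAULT_ATTRIBUTE_VALUE_LENGTH_LIMIT = 2048


def resolve_span_attribute_length_limit(environ: Mapping[str, str]) -> int:
    """Pick the span attribute size limit following the documented precedence."""
    for name in (
        "OTEL_SPAN_ATTRIBUTE_VALUE_LENGTH_LIMIT",  # Span-specific, checked first
        "OTEL_ATTRIBUTE_VALUE_LENGTH_LIMIT",  # Generic fallback
    ):
        raw = environ.get(name)
        if raw:
            return int(raw)  # Assumption: non-numeric values raise ValueError
    return DEFAULT_ATTRIBUTE_VALUE_LENGTH_LIMIT


if __name__ == "__main__":
    print(resolve_span_attribute_length_limit(os.environ))
```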
The Lumigo OpenTelemetry Distro provides access to the TracerProvider
it configures (see the Baseline setup section for more information) through the tracer_provider
attribute of the lumigo_opentelemetry
package:
from lumigo_opentelemetry import tracer_provider
# Do here stuff like adding span processors
For short-running processes, the BatchSpanProcessor
configured by the Lumigo OpenTelemetry Distro may not ensure that the tracing data are sent to Lumigo (see the Baseline setup section for more information).
Through the access to the tracer_provider
, however, it is possible to ensure that all spans are flushed to Lumigo as follows:
from lumigo_opentelemetry import tracer_provider
# Do some logic
tracer_provider.force_flush()
# Now the Python process can terminate, with all the spans closed so far sent to Lumigo
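To make sure the flush also happens when the logic raises an exception, the call can be wrapped in a context manager. The following is a minimal sketch: flush_spans_on_exit is a hypothetical helper, and FakeTracerProvider is a stand-in used only so that the example runs without the distro installed.

```python
from contextlib import contextmanager


@contextmanager
def flush_spans_on_exit(provider):
    """Call provider.force_flush() when the block ends, even on errors."""
    try:
        yield provider
    finally:
        provider.force_flush()


class FakeTracerProvider:
    """Stand-in for the distro's tracer_provider, used only for this demo."""

    def __init__(self) -> None:
        self.flush_count = 0

    def force_flush(self) -> bool:
        self.flush_count += 1
        return True


provider = FakeTracerProvider()
try:
    with flush_spans_on_exit(provider):
        raise RuntimeError("job failed")  # Simulated failure in the job logic
except RuntimeError:
    pass
print(provider.flush_count)  # 1
```

In an application, the tracer_provider imported from lumigo_opentelemetry would be passed instead of FakeTracerProvider.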
Messaging instrumentations that retrieve messages from queues tend to be counter-intuitive for end-users: when retrieving one or more messages from the queue, one would naturally expect that all calls done using data from those messages, e.g., sending their content to a database or another queue, would result in spans that are children of the span describing the retrieval of those messages.
Consider the following scenario, which is supported by the boto3
SQS receive_message
instrumentation of the Lumigo OpenTelemetry Distro for Python:
from opentelemetry import trace
tracer = trace.get_tracer(__name__)
response = client.receive_message(...) # Instrumentation creates a `span_0` span
for message in response.get("Messages", []):
    # The SQS.ReceiveMessage span is active in this scope
    with tracer.start_as_current_span("span_1"):  # span_0 is the parent of span_1
        do_something()
Without the scope provided by the iterator over response["Messages"]
, span_1
would be without a parent span, and that would result in a separate invocation and a separate transaction in Lumigo.
A common pattern in SQS-based applications is to continuously poll an SQS queue for messages, and to process them as they arrive. In order not to clutter the Lumigo platform with empty SQS polling messages, the default behavior is to filter them out from being sent to Lumigo.
You can change this behavior by setting the boolean environment variable LUMIGO_AUTO_FILTER_EMPTY_SQS
to false
.
The possible variations are:
LUMIGO_AUTO_FILTER_EMPTY_SQS=true
filter out empty SQS polling messages
LUMIGO_AUTO_FILTER_EMPTY_SQS=false
do not filter out empty SQS polling messages
You can selectively filter spans based on HTTP server/client endpoints for various components, not limited to web frameworks.
Set the LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX
environment variable to a list of regex strings. Spans with matching server/client endpoints will not be traced.
For exclusive server (inbound) or client (outbound) span filtering, use the environment variables:
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT
Notes:
To match endpoints with the hostname google.com, the environment variable value should be ["google\\.com"].
Examples:
Filtering out every incoming HTTP request to the /login endpoint (will also match requests such as /login?user=foo, /login/bar):
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_SERVER=["\\/login"]
Filtering out every outgoing HTTP request to the google.com domain (will also match requests such as google.com/foo, bar.google.com):
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT=["google\\.com"]
Filtering out every outgoing HTTP request to https://www.google.com (will also match requests such as https://www.google.com/, https://www.google.com/foo):
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX_CLIENT=["https:\\/\\/www\\.google\\.com"]
Filtering out every HTTP request (incoming or outgoing) whose endpoint contains login:
LUMIGO_FILTER_HTTP_ENDPOINTS_REGEX=["login"]
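The doubled backslashes in the values above are easy to get wrong by hand. One way to produce and check such values is sketched below, under the assumption that the variable is parsed as a JSON array of strings and that each expression is applied as a regular expression search; the to_env_value and matches_any helpers are hypothetical and not part of the distro.

```python
import json
import re
from typing import List


def to_env_value(patterns: List[str]) -> str:
    """Serialize regular expressions into a JSON array string."""
    return json.dumps(patterns)


def matches_any(env_value: str, endpoint: str) -> bool:
    """Check whether any expression in the value is found in the endpoint."""
    patterns = json.loads(env_value)
    return any(re.search(pattern, endpoint) for pattern in patterns)


# Raw strings hold the regular expressions as one would write them in Python.
server_value = to_env_value([r"\/login"])
client_value = to_env_value([r"google\.com"])

print(server_value)  # ["\\/login"]
print(client_value)  # ["google\\.com"]
print(matches_any(server_value, "/login?user=foo"))  # True
print(matches_any(client_value, "bar.google.com"))  # True
```

Serializing with json.dumps adds the second backslash automatically, which is why the regular expression google\.com appears as google\\.com in the environment variable.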
For guidelines on contributing, please see CONTRIBUTING.md.