Checking a job's logs
Find out how to get your job's logs while the job is running or after it is finished
Last updated 14th July, 2021
This guide explains how to check your job's logs while the job is running or after it has finished.
For an introduction to the Data Processing service, see the Data Processing Overview.
When you launch a job with Data Processing, you may want to read your job's logs as it is running. There are three ways to get live logs:
To see your logs in the Manager, follow these steps:
1. Go to the Public Cloud section of the OVHcloud Manager.
2. Select Data Processing from the left panel.
3. Open the Logs tab in your job dashboard page.
These logs will appear only while your job is running. Once your job has ended, you will get a link to the Object Storage container where your log files are stored.
If you are using the ovh-spark-submit CLI (see How to launch jobs through the CLI), you do not need to take any further action.
The logs will appear in the standard output while your job is running.
Once the job has ended, the URL of the Object Storage container containing the job's logs will be displayed. You can use it to list your job's log files through the OpenStack API and download them.
Please refer to the section below to find out how.
If you use the OVHcloud Manager or the CLI, you may not see the last entries of your logs before the job stops, because the job finishes before the UI is updated. Don't worry: all the logs are uploaded to your Object Storage at the job's end.
Another way to read your job's logs is to use the OVHcloud API by calling the endpoint to GET a job's logs (see How to use the OVHcloud API).
To get the logs, use the GET method on the /cloud/project/{serviceName}/dataProcessing/jobs/{jobId}/logs endpoint (where serviceName is your Public Cloud project ID).
This endpoint accepts a from query parameter, which allows you to specify the date from which you want to retrieve the logs. Its default value is 1970-01-01T00:00:00 UTC.
Whether you choose to set this start date or not, you will retrieve all the logs produced after it, within a limit of 10 000 characters.
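For illustration, here is a minimal sketch of such a call with curl, assuming your API credentials are stored in APP_KEY, APP_SECRET and CONSUMER_KEY (obtained as described in How to use the OVHcloud API), that PROJECT_ID and JOB_ID are already set, and that your account uses the eu.api.ovh.com endpoint; the from value is only an example:

    # Build the signed request expected by the OVHcloud API
    METHOD="GET"
    URL="https://eu.api.ovh.com/1.0/cloud/project/$PROJECT_ID/dataProcessing/jobs/$JOB_ID/logs?from=2021-07-14T00:00:00"
    BODY=""              # empty body for a GET request
    TS=$(date +%s)       # Unix timestamp used in the signature
    # The signature is "$1$" followed by SHA1(secret+consumerKey+method+url+body+timestamp)
    SIG='$1$'$(printf '%s' "$APP_SECRET+$CONSUMER_KEY+$METHOD+$URL+$BODY+$TS" | sha1sum | cut -d' ' -f1)
    curl -s "$URL" \
      -H "X-Ovh-Application: $APP_KEY" \
      -H "X-Ovh-Consumer: $CONSUMER_KEY" \
      -H "X-Ovh-Timestamp: $TS" \
      -H "X-Ovh-Signature: $SIG"

You can also run the same call from the OVHcloud API web console without writing any code.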
As with the CLI, once the job has ended the URL of the Object Storage container containing the job's logs is returned. You can use it to list your job's log files through the OpenStack API and download them.
Please refer to the section below to find out how.
When streaming logs, you are limited to 10 000 characters at a time. This means you may experience missing logs in streaming mode (in the Manager or in the CLI). All logs will be uploaded to your Object Storage at the job's end.
Once your job is finished, its logs are uploaded to your Object Storage. While you can only retrieve the Spark driver node's logs while the job is running, the logs from all the nodes (driver and executors) are stored.
For each node, you will have at least two log files:
Some {jobId}/{nodeName}/spark.log.yyyy-MM-ddThhhmmmss.sssssssss files can be uploaded to Object Storage while your job is still running. This is due to log rotation, which is configured to upload files that reach the maximum log file size of 100 MiB.
There are three ways to download your logs from your Object Storage:
To get your logs in the Manager, go to the Public Cloud section of the OVHcloud Manager. From here, you can either go to the Object Storage section of your Public Cloud project, select the odp-logs container, and filter the list of objects with your job ID to get its logs.
Or you can go through your job dashboard instead. To do so:
1. Select Data Processing from the left panel.
2. Open the Logs tab in your job dashboard page.
3. Click Download logs to download the output logs of your job from your Object Storage account. This button will lead you to the Object Storage container, pre-filtered with the wanted job ID.
You can access your Object Storage using the OpenStack CLI or the Swift CLI (Swift being the name of the OpenStack Object Storage service).
Please follow the OpenStack documentation for installing and using the CLI.
In order to authenticate with the CLI, you will have to set environment variables using an OpenStack RC file. You can find your RC file by following these steps:
1. Go to the Public Cloud section of the OVHcloud Manager.
2. Select Users & Roles from the left panel.
3. Click the ... option button of your user and select Download OpenStack's RC file.
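Once the RC file is sourced, you can list and download a job's log files. Here is a minimal sketch, assuming the logs live in the odp-logs container mentioned above and that JOB_ID holds your job ID (the RC file name and the downloaded object name are only illustrative):

    # Load the OpenStack credentials into your environment
    source openrc.sh

    # List the log files of a given job in the odp-logs container
    openstack object list odp-logs --prefix "$JOB_ID"

    # Download a single log file (use an object name returned by the listing)
    openstack object save odp-logs "$JOB_ID/driver/spark.log"

    # Or fetch every object matching the job ID with the Swift CLI
    swift download odp-logs --prefix "$JOB_ID"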
In order to use the OpenStack API, you will need an OpenStack token. You can generate one in the OVHcloud Manager by following these steps:
1. Go to the Public Cloud section of the OVHcloud Manager.
2. Select Users & Roles from the left panel.
3. Click the ... option button of your user and select Generate an OpenStack token.
With this token, you will be able to list and download the log files of your jobs using the OpenStack API, as sketched below.
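As a minimal sketch, assuming OS_TOKEN holds the generated token and that your Object Storage endpoint has the form https://storage.<region>.cloud.ovh.net/v1/AUTH_<projectId> (check the exact endpoint returned alongside your token), you could call the Swift API like this:

    # List the log files of a job: objects in odp-logs prefixed with the job ID
    # (replace the <region>, <projectId> and <jobId> placeholders with your values)
    curl -H "X-Auth-Token: $OS_TOKEN" \
      "https://storage.<region>.cloud.ovh.net/v1/AUTH_<projectId>/odp-logs?prefix=<jobId>"

    # Download one log file (object name taken from the listing above)
    curl -o spark.log -H "X-Auth-Token: $OS_TOKEN" \
      "https://storage.<region>.cloud.ovh.net/v1/AUTH_<projectId>/odp-logs/<jobId>/<nodeName>/spark.log"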
To learn more about using Data Processing and how to submit a job and process your data, we invite you to look at the Data Processing documentation page.
You can send your questions, suggestions or feedback to our community of users on https://community.ovh.com/en/ or on our Discord in the #dataprocessing-spark channel.