How to generate an environment file for Python jobs

Find out how to create your Python environment and export it as an environment file.

Last updated 04th May, 2020


This guide helps you create your Python environment with Conda. It then shows how to export that environment so you can use it to submit your Python job on the OVHcloud Data Processing platform.

To read an introduction about the Data Processing service you can visit Data Processing Overview.


To follow this guide, you need:

  • Your application code as Python files.
  • Conda installed on your computer (refer to this guide).


Step 1: Create your environment

OVHcloud Data Processing uses Conda to manage packages and their dependencies. If you haven't installed Conda yet, please do so before continuing.

With Conda, you can create, export, list, remove, and update environments that have different versions of Python and/or packages installed in them. OVHcloud Data Processing uses this environment to make sure your Python job has everything necessary to run smoothly. If you want to learn more about Conda, have a look at their documentation.

Once installed, Conda automatically creates a first environment, named base. You can then start installing the packages you need. To do so, use the install command:

$ conda install numpy

This installs the latest version of NumPy in the current environment. Repeat the command for each package you need. You can learn more about the install command and its options here.
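Before exporting, it can help to confirm that every module your job imports is available in the active environment. The following is a minimal sketch using only the Python standard library. The helper name `missing_modules` and the list of module names are hypothetical and are not part of Conda or OVHcloud tooling. Note that import names can differ from package names (for example, the beautifulsoup4 package is imported as bs4, scikit-learn as sklearn, and pillow as PIL).

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [
        name
        for name in module_names
        if importlib.util.find_spec(name) is None
    ]


if __name__ == "__main__":
    # Hypothetical list: replace with the top-level modules your job imports.
    required = ["numpy", "pandas"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run the script with the Python interpreter of the environment you plan to export; any name it reports still needs to be installed with conda install.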

Step 2: Export your environment

Now that you have an environment that suits your code, it's time to export it. To do so, make sure the environment you want to export is the active one, then run this command:

$ conda env export --from-history -f environment.yml

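This generates a portable environment file named environment.yml. Because of the --from-history option, the file lists only the packages you explicitly asked for, not every dependency Conda resolved on your machine. You will need this file to run your code on OVHcloud Data Processing. To learn more about environment files, have a look here.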
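To double-check what was exported, you can list the entries under the dependencies key of the generated file. The sketch below is a simplified line-based reader, not a full YAML parser. It assumes the flat layout shown in the sample text, which is illustrative and not the output of a real export. The function name `read_dependencies` is hypothetical.

```python
def read_dependencies(environment_text):
    """Collect the entries listed under the top-level `dependencies:` key.

    Simplified reading: only flat `- item` lists are handled, and nested
    blocks (such as a `pip:` sub-list) are skipped.
    """
    dependencies = []
    in_dependencies = False
    item_indent = None
    for line in environment_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if not line.startswith((" ", "-")):
            # An unindented line is a new top-level key.
            in_dependencies = stripped == "dependencies:"
            continue
        if in_dependencies and stripped.startswith("- "):
            indent = len(line) - len(line.lstrip())
            if item_indent is None:
                item_indent = indent
            # Keep only first-level items; ignore nested lists and their headers.
            if indent == item_indent and not stripped.endswith(":"):
                dependencies.append(stripped[2:].strip())
    return dependencies


# Illustrative sample only; a real file comes from `conda env export`.
SAMPLE = """\
name: my-job
channels:
  - defaults
dependencies:
  - python=3.7.6
  - numpy
prefix: /home/user/miniconda3/envs/my-job
"""

if __name__ == "__main__":
    print(read_dependencies(SAMPLE))
```

To inspect your own export, read environment.yml with open("environment.yml").read() and pass the resulting text to the function instead of SAMPLE.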

You can now move on to the next step and submit a Python job.

Generic environment file

If you want to quickly test OVHcloud Data Processing with a basic job, you can use the following environment file. It includes commonly used packages:

name: datascience-environment
channels:
  - defaults
dependencies:
  - python=3.7.6
  - numpy
  - requests
  - pandas
  - boto3
  - pyspark
  - beautifulsoup4
  - sqlalchemy
  - pillow
  - scikit-learn

Do not hesitate to reuse this environment file. Also, feel free to add or remove packages to better fit your needs.
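If you keep several variants of such a file, the same layout (a name, a channels list, and a dependencies list) can be produced from a package list. This is only a sketch of that layout; `render_environment` is a hypothetical helper, not a Conda command, and the values passed to it are examples.

```python
def render_environment(name, dependencies, channels=("defaults",)):
    """Return the text of a Conda environment file with the given content."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dependency}" for dependency in dependencies]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example values: a trimmed-down variant of the generic file.
    text = render_environment(
        "datascience-environment",
        ["python=3.7.6", "numpy", "pandas"],
    )
    print(text, end="")
```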

Go further

To learn more about using Data Processing, submitting a job and processing your data, have a look at the Data Processing documentation page.

You can send your questions, suggestions or feedback to our community of users or on our Discord, in the #dataprocessing-spark channel.
