How to generate an environment file for Python jobs
Find out how to create your Python environment and export it as an environment file.
Last updated 04th May, 2020
This guide helps you create your Python environment with Conda. It then shows how to export that environment so you can use it to submit your Python job on the OVHcloud Data Processing platform.
For an introduction to the Data Processing service, see the Data Processing Overview.
OVHcloud Data Processing uses Conda to manage packages and their dependencies. If you haven't installed Conda yet, please do so before continuing.
With Conda, you can create, export, list, remove, and update environments that have different versions of Python and/or packages installed in them. OVHcloud Data Processing uses this environment to make sure your Python job has everything it needs to run. If you want to learn more about Conda, have a look at its documentation.
Once installed, Conda automatically creates a first environment, named base. You can then start installing the packages you need. To do so, use the install command:
$ conda install numpy
This installs the latest version of NumPy in the current environment. Repeat the command for each package you need. You can learn more about the install command and its options in the Conda documentation.
Now that you have an environment that suits your code, it's time to export it! To do so, make sure the environment you want to export is the active one, then run this command:
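Before exporting, it can help to confirm that every module your job imports is available in the current environment. The following is a minimal sketch using only the Python standard library; the helper name `missing_modules` and the list of modules are hypothetical and should be adapted to your own code. Note that the import name can differ from the package name (for example, beautifulsoup4 is imported as `bs4`, pillow as `PIL`, and scikit-learn as `sklearn`).

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported here."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in module_names if find_spec(name) is None]


if __name__ == "__main__":
    # Hypothetical list: replace it with the modules your job imports.
    required = ["numpy", "requests", "pandas"]
    missing = missing_modules(required)
    if missing:
        print("Missing from this environment:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Run this script with the Python interpreter of the environment you plan to export, then install whatever it reports as missing.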
$ conda env export --from-history -f environment.yml
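This generates a portable environment file named environment.yml. The --from-history option lists only the packages you explicitly asked Conda to install, rather than every resolved dependency, which keeps the file independent of your operating system. You will need this file to run your code on OVHcloud Data Processing. To learn more about environment files, have a look at the Conda documentation.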
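If you prefer to script the export, for instance as part of a build step, the same command can be launched from Python. This is only a sketch under assumptions: the function names `build_export_command` and `export_environment` are hypothetical, and it assumes the `conda` executable found on your PATH belongs to the installation whose active environment you want to export.

```python
import shutil
import subprocess
from pathlib import Path


def build_export_command(output="environment.yml"):
    """Return the export command from this guide as an argument list."""
    return ["conda", "env", "export", "--from-history", "-f", str(output)]


def export_environment(output="environment.yml"):
    """Run the export if conda is on PATH; return the file path, else None."""
    conda_path = shutil.which("conda")
    if conda_path is None:
        return None  # conda is not installed or not on PATH
    command = build_export_command(output)
    command[0] = conda_path  # use the resolved executable path
    subprocess.run(command, check=True)  # raises CalledProcessError on failure
    return Path(output)
```

Calling `export_environment()` writes environment.yml in the current directory, exactly as the command above does, and returns `None` without doing anything when Conda cannot be found.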
You can now move on to the next step and submit a Python job.
If you want to quickly test OVHcloud Data Processing with a basic job, you can use the following environment file, which includes commonly used packages:
name: datascience-environment
channels:
- defaults
dependencies:
- python=3.7.6
- numpy
- requests
- pandas
- boto3
- pyspark
- beautifulsoup4
- sqlalchemy
- pillow
- scikit-learn
Do not hesitate to reuse this environment file. Also, feel free to add or remove packages to better fit your needs.
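When you edit an environment file by hand, a quick check of which packages carry a pinned version can be useful. The sketch below is a deliberately simple, standard-library reader, not a full YAML parser: the function name `read_dependencies` is hypothetical, and it assumes the flat layout shown in this guide (a `dependencies:` key followed by `- name` or `- name=version` entries, with no nested pip section).

```python
ENV_TEXT = """\
name: datascience-environment
channels:
- defaults
dependencies:
- python=3.7.6
- numpy
- pandas
"""


def read_dependencies(text):
    """Return (name, version_or_None) pairs from a flat dependencies list."""
    deps = []
    in_deps = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if not line.startswith("- "):
            # Any key line starts a new section; track whether it is ours.
            in_deps = line == "dependencies:"
            continue
        if in_deps:
            name, _, version = line[2:].strip().partition("=")
            deps.append((name, version or None))
    return deps


if __name__ == "__main__":
    for name, version in read_dependencies(ENV_TEXT):
        print(name, version if version else "(latest available)")
```

Entries without a version are resolved to whatever Conda selects at install time, so pin the packages whose version matters for your job.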
To learn more about using Data Processing, including how to submit a job and process your data, have a look at the Data Processing documentation page.
You can send your questions, suggestions or feedback to our community of users on https://community.ovh.com/en/ or on our Discord in the channel #dataprocessing-spark.