AI Deploy - Tutorial - Deploy and call a spam classifier with FastAPI

How to deploy and call an API for spam classification using FastAPI

Last updated 31st January, 2023.

AI Deploy is in beta. During the beta-testing phase, the infrastructure’s availability and data longevity are not guaranteed. Please do not use this service for applications that are in production, as this phase is not complete.

AI Deploy is covered by OVHcloud Public Cloud Special Conditions.

Objective

In this tutorial, you will deploy an API that classifies text messages as spam or ham (not spam).

The use case is based on the Spam Ham Collection Dataset, a set of text messages labelled as spam or ham.

Overview

This tutorial shows how to create, deploy and call an API with AI Deploy.

To do this, we will use FastAPI, a web framework for developing RESTful APIs in Python. You will also learn how to build and use a custom Docker image for a FastAPI application.

Requirements

Instructions

Follow the steps below to build your FastAPI app.

  • More information about FastAPI capabilities can be found here.
  • Direct link to the full code can be found here.

Here we mainly discuss how to write the model.py and app.py files, the requirements.txt file and the Dockerfile. If you want to see the whole code, please refer to the GitHub repository.

Create the API

Two Python files are needed: one defines the model, the other builds the API.

Define the model.py file

This Python file is dedicated to building and defining the Logistic Regression model. You can find the full code here.

The method is explained in detail in this notebook, which walks through the steps to follow to build a spam classifier based on logistic regression.

First, we have to load the Spam Ham Collection Dataset.

import pandas as pd

def load_data():

    # tab-separated file: one label and one message per line
    PATH = 'SMSSpamCollection'
    df = pd.read_csv(PATH, delimiter = "\t", names=["classe", "message"])
    X = df['message']
    y = df['classe']

    return X, y
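To make the expected file layout concrete, here is a minimal standard-library sketch that parses the same kind of tab-separated lines. The two sample lines are made up for illustration, and parse_labelled_messages is a hypothetical helper, not part of model.py. Quoting is disabled here so that quote characters inside a message are kept verbatim, which is a choice of this sketch.

```python
import csv
import io

# Two made-up lines in the assumed "<label>\t<message>" layout (not real dataset rows)
SAMPLE = "ham\tSee you at lunch?\nspam\tWIN a FREE prize now, call today\n"

def parse_labelled_messages(text):
    """Return (messages, labels) from tab-separated label/message lines."""
    messages, labels = [], []
    reader = csv.reader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 2:
            continue  # skip malformed lines
        label, message = row
        labels.append(label)
        messages.append(message)
    return messages, labels

messages, labels = parse_labelled_messages(SAMPLE)
print(labels)
print(messages)
```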

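Then, we create the function that splits the dataset into training and test sets. Only the training part is returned, since it is all the API needs.

from sklearn import model_selection

def split_data(X, y):

    # fraction of the data held out for testing (2000 out of 5572 messages)
    ntest = 2000/(3572+2000)
    X_train, X_test, y_train, y_test = model_selection.train_test_split(X, y, test_size=ntest, random_state=0)

    return X_train, y_train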
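The idea of a reproducible holdout split can be sketched with the standard library alone. This is not scikit-learn's algorithm, only an illustration of the same principle: shuffle with a fixed seed, then set aside a fraction for testing. The holdout_split helper is hypothetical, and the 5572 rows are an assumption taken from the expression 3572+2000 used above.

```python
import random

def holdout_split(items, test_fraction, seed=0):
    """Shuffle indices reproducibly, then hold out a fraction for testing."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    n_test = round(test_fraction * len(items))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [items[i] for i in train_idx], [items[i] for i in test_idx]

ntest = 2000 / (3572 + 2000)       # same fraction as in split_data, about 0.359
rows = list(range(3572 + 2000))    # stand-in for the messages
train, test = holdout_split(rows, ntest)
print(len(train), len(test))       # 3572 2000
```

Fixing the seed means the same messages end up in the training set on every run, so the deployed model is rebuilt identically each time the container starts.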

The last function allows us to build the model.

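import numpy as np
from sklearn.linear_model import LogisticRegression

def spam_classifier_model(Xtrain, ytrain):

    # first fit on all features to obtain one coefficient per word
    model_logistic_regression = LogisticRegression()
    model_logistic_regression = model_logistic_regression.fit(Xtrain, ytrain)

    coeff = model_logistic_regression.coef_
    coef_abs = np.abs(coeff)

    # quantiles of the absolute coefficients; quantiles[1] is the 25% quantile
    quantiles = np.quantile(coef_abs,[0, 0.25, 0.5, 0.75, 0.9, 1])

    # keep the columns whose (signed) coefficient exceeds that threshold
    index = np.where(coeff[0] > quantiles[1])
    newXtrain = Xtrain[:, index[0]]

    # refit on the selected features only
    model_logistic_regression = LogisticRegression()

    model_logistic_regression.fit(newXtrain, ytrain)

    return model_logistic_regression, index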
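The selection step can be illustrated on a handful of numbers. The sketch below uses a hypothetical pure-Python quantile helper that follows the linear interpolation rule NumPy uses by default, and five made-up coefficients. It also shows a consequence of the function as written: the threshold is computed on absolute values but compared against signed coefficients, so words with a negative coefficient are never kept. The source does not say whether this is intentional; the abs_keep line shows what comparing absolute values would select instead.

```python
def quantile(values, q):
    """Linearly interpolated quantile (same rule as NumPy's default method)."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])

coeff = [0.1, -2.0, 0.5, 1.5, -0.05]   # hypothetical coefficients, one per word
threshold = quantile([abs(c) for c in coeff], 0.25)

# comparison used in spam_classifier_model: signed coefficient vs. threshold
signed_keep = [i for i, c in enumerate(coeff) if c > threshold]
# alternative: compare absolute values, which also keeps strong negative weights
abs_keep = [i for i, c in enumerate(coeff) if abs(c) > threshold]

print(threshold)     # 0.1
print(signed_keep)   # [2, 3]
print(abs_keep)      # [1, 2, 3]
```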

Calling these functions as follows produces the fitted vectorizer, the trained model and the index of the selected features. The API uses them to return a classification result (spam or ham) as well as a confidence score.

from sklearn.feature_extraction.text import CountVectorizer

# extract input and output data
data_input, data_output = load_data()

# split data
X_train, ytrain = split_data(data_input, data_output)

# transform and fit training set
vectorizer = CountVectorizer(stop_words='english', binary=True, min_df=10)
Xtrain = vectorizer.fit_transform(X_train.tolist())
Xtrain = Xtrain.toarray()

# use the model and index for prediction
model_logistic_regression, index = spam_classifier_model(Xtrain, ytrain)
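To show what binary=True and min_df mean in the vectorizer above, here is a simplified standard-library sketch of a binary bag-of-words encoding. It is not scikit-learn's implementation: the tokenization is a plain regular expression, stop words are not removed, and fit_vocabulary and binary_vector are hypothetical helpers working on made-up messages.

```python
import re
from collections import Counter

def fit_vocabulary(documents, min_df=1):
    """Keep words that appear in at least min_df documents, sorted alphabetically."""
    doc_freq = Counter()
    for doc in documents:
        # count each word once per document (document frequency)
        doc_freq.update(set(re.findall(r"[a-z0-9]+", doc.lower())))
    return sorted(word for word, df in doc_freq.items() if df >= min_df)

def binary_vector(document, vocabulary):
    """1 if the vocabulary word occurs in the document, else 0."""
    words = set(re.findall(r"[a-z0-9]+", document.lower()))
    return [1 if word in words else 0 for word in vocabulary]

docs = ["free prize call now", "call me later", "free free entry"]  # made-up messages
vocab = fit_vocabulary(docs, min_df=2)
print(vocab)                                     # ['call', 'free']
print([binary_vector(d, vocab) for d in docs])   # [[1, 1], [1, 0], [0, 1]]
```

With a binary encoding, the repeated "free" in the third message still yields a 1, not a 2: only the presence of a word matters, not how often it appears.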

Write the app.py file

Initialize an instance of FastAPI.

from fastapi import FastAPI

app = FastAPI()

Define the data format.

from pydantic import BaseModel

class request_body(BaseModel):
    message : str

Process the message sent by the user.

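def process_message(message):
    # message is a list of strings; vectorizer and index come from model.py

    # encode the message with the vocabulary learned during training
    desc = vectorizer.transform(message)
    dense_desc = desc.toarray()
    # keep only the feature columns selected when the model was built
    dense_select = dense_desc[:, index[0]]

    return dense_select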
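The column selection at the end of process_message has to mirror the one applied to the training matrix, otherwise the model receives features in positions it was not trained on. A minimal pure-Python equivalent of that slicing, with a hypothetical select_columns helper and made-up values, looks like this:

```python
def select_columns(rows, column_index):
    """Keep only the listed columns of each row, in the given order."""
    return [[row[j] for j in column_index] for row in rows]

dense_desc = [[1, 0, 0, 1, 1]]   # made-up binary vector for one message
index = [1, 3, 4]                # made-up columns kept during training
print(select_columns(dense_desc, index))   # [[0, 1, 1]]
```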

Define the GET method.

@app.get('/')
def root():
    return {'message': 'Welcome to the SPAM classifier API'}

Create the POST method.

The classify_message function allows the user to send a message.

It will then call the model and return the result of the classification.

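from fastapi import HTTPException

@app.post('/spam_detection_path')
def classify_message(data : request_body):

    # reject empty or whitespace-only messages
    if not data.message.strip():
        raise HTTPException(status_code=400, detail="Please provide a valid text message")

    # the vectorizer expects a list of documents
    message = [
        data.message
    ]

    dense_select = process_message(message)
    label = model_logistic_regression.predict(dense_select)
    proba = model_logistic_regression.predict_proba(dense_select)

    # probability columns follow the class order: 'ham' first, then 'spam'
    if label[0]=='ham':
        label_proba = proba[0][0]
    else:
        label_proba = proba[0][1]

    # convert NumPy types to plain Python types for the JSON response
    return {'label': str(label[0]), 'label_probability': float(label_proba)}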
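In scikit-learn, the columns of predict_proba follow the order of the estimator's classes_ attribute, which is why the 'ham' probability sits in column 0 and the 'spam' probability in column 1 here. An alternative to hard-coding the positions is to look the column up by label. The sketch below uses plain lists as stand-ins for classes_ and for one row of predict_proba; label_probability is a hypothetical helper and the probabilities are made up.

```python
def label_probability(classes, proba_row, label):
    """Return the probability stored in the column that matches `label`."""
    return proba_row[list(classes).index(label)]

classes = ["ham", "spam"]      # stand-in for model_logistic_regression.classes_
proba_row = [0.92, 0.08]       # made-up output row of predict_proba
print(label_probability(classes, proba_row, "ham"))    # 0.92
print(label_probability(classes, proba_row, "spam"))   # 0.08
```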

All the functions defined above are in the app.py Python file.

You can find the code on the GitHub repository.

Write the requirements.txt file for the API

The requirements.txt file lists all the Python modules needed to make our application work. It will be used when writing the Dockerfile.

fastapi==0.87.0
pydantic==1.10.2
uvicorn==0.20.0
pandas==1.5.1
scikit-learn==1.1.3

Write the Dockerfile for the application

Your Dockerfile should start with the FROM instruction indicating the parent image to use. In our case we choose to start from the python:3.8 image:

FROM python:3.8

Create the home directory and add your files to it:

WORKDIR /workspace
ADD . /workspace

Install the requirements.txt file which contains your needed Python modules using a pip install ... command:

RUN pip install -r requirements.txt

Set the listening port of the container:

EXPOSE 8000

Define the entrypoint and the default launching command to start the application:

ENTRYPOINT ["uvicorn"]
CMD ["app:app", "--host", "0.0.0.0"]

Give the correct access rights to the OVHcloud user (42420:42420):

RUN chown -R 42420:42420 /workspace
ENV HOME=/workspace

Build the Docker image from the Dockerfile

Launch the following command from the Dockerfile directory to build your application image:

docker build . -t fastapi-spam-classification:latest

The dot . argument indicates that your build context (place of the Dockerfile and other needed files) is the current directory.

The -t argument allows you to choose the identifier to give to your image. Usually image identifiers are composed of a name and a version tag <name>:<version>. For this example we chose fastapi-spam-classification:latest.

Please make sure that the Docker image you push in order to run containers using AI products targets the linux/amd64 architecture. You could, for instance, build your image using buildx as follows:

docker buildx build --platform linux/amd64 ...

Test it locally (optional)

Run the following Docker command to launch your application locally on your computer:

docker run --rm -it -p 8000:8000 --user=42420:42420 fastapi-spam-classification:latest

The -p 8000:8000 argument maps port 8000 of your local machine to port 8000 of the Docker container. Port 8000 is the default port used by uvicorn, the server that runs the FastAPI application.

Don't forget the --user=42420:42420 argument if you want to simulate the exact same behaviour that will occur on AI Deploy apps. It executes the Docker container as the specific OVHcloud user (user 42420:42420).

Once started, your application should be available on http://localhost:8000.
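Once the container is running, the POST route can be called from any HTTP client. Here is a minimal standard-library sketch that builds the request for the /spam_detection_path route. The build_classify_request helper and the sample message are illustrative assumptions; the request is only sent if you uncomment the last lines while the application is running.

```python
import json
import urllib.request

def build_classify_request(base_url, message):
    """Build (without sending) a POST request for the /spam_detection_path route."""
    body = json.dumps({"message": message}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/spam_detection_path",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

request = build_classify_request("http://localhost:8000", "WIN a FREE prize now")
print(request.get_method(), request.full_url)

# To actually send it while the container is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

The JSON body has a single message field, matching the request_body model defined in app.py.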

Push the image into the shared registry

The shared registry of AI Deploy should only be used for testing purposes. Please consider attaching your own Docker registry. More information about this can be found here.

Find the address of your shared registry by launching this command:

ovhai registry list

Log in on the shared registry with your usual OpenStack credentials:

docker login -u <user> -p <password> <shared-registry-address>

Push the built image into the shared registry:

docker tag fastapi-spam-classification:latest <shared-registry-address>/fastapi-spam-classification:latest
docker push <shared-registry-address>/fastapi-spam-classification:latest

Launch the AI Deploy app

The following command starts a new app running your FastAPI application:

ovhai app run \
      --default-http-port 8000 \
      --cpu 4 \
      <shared-registry-address>/fastapi-spam-classification:latest

--default-http-port 8000 indicates that the app URL forwards traffic to port 8000 of the container.

--cpu 4 indicates that we request 4 CPUs for that app.

Consider adding the --unsecure-http flag if you want your application to be reachable without any authentication.

Interact with the deployed API through the dashboard

By clicking on the link of your AI Deploy app, you will arrive on its root page, which returns the welcome message defined in the GET method.

How to interact with your API?

You can add /docs at the end of the URL of your app.

In our example, the URL is as follows: https://ba2ef330-3e95-444a-a81b-7ca83dff5836.app.gra.training.ai.cloud.ovh.net/docs

It provides a complete dashboard for interacting with the API!

To be able to send a message for classification, select /spam_detection_path in the green box. Click on Try it out and type the message of your choice in the dedicated zone.

To get the result of the prediction, click on the Execute button.

Congratulations! You have obtained the result of the prediction with the label and the confidence score.

Go further

  • You can also deploy an AI model with another tool: Gradio. Read this tutorial.
  • Another way to create an AI Deploy app is to use Streamlit! Follow this tutorial.

Feedback

Please send us your questions, feedback and suggestions to improve the service:

