Is FastAPI the Key to Effortless Background File Processing?

Taming File Uploads: FastAPI's Secret Weapon for Efficiency and Performance

Building web applications that involve file uploads? It’s super important to keep your application responsive and efficient. FastAPI, a modern Python web framework, can help you with that. How? With background tasks. This powerful feature lets your application process files in the background while still keeping everything running smoothly for users. Let’s dive into how this can be set up.

Background Tasks in FastAPI

So, what’s the deal with background tasks in FastAPI? These are operations that run asynchronously after the main request has been taken care of and a response has been sent back to the client. This is a game changer for handling time-consuming tasks like processing uploaded files, sending emails, or updating records. FastAPI’s BackgroundTasks class makes it super easy to add and manage these tasks.

How to Use Background Tasks for File Uploads

When it comes to handling asynchronous file uploads with FastAPI’s background tasks, it’s pretty straightforward. Let’s take a look at the steps.

First, you’ll need to define the background task function. This function handles the actual processing of the uploaded file: reading a CSV, converting its content into rows, and writing them to a database.

Check this out:

import time

import polars
from fastapi import BackgroundTasks, FastAPI, UploadFile
from fastapi.concurrency import run_in_threadpool

app = FastAPI()

# csv_repository and status_repository stand for the application's own
# persistence layer; they are not defined in this article.

def read_csv(file: UploadFile) -> dict:
    # A plain function, because run_in_threadpool calls it in a worker thread.
    # The upload must still be open when this runs.
    return polars.read_csv(file.file).to_dict(as_series=False)

async def update_status(name: str, status: str):
    await status_repository.update(name, status)

async def process_file(file: UploadFile, name: str):
    started = time.perf_counter()
    await update_status(name, 'Uploading')
    try:
        # Parsing is blocking work, so keep it off the event loop.
        csv_data = await run_in_threadpool(read_csv, file)
    except Exception:
        await update_status(name, 'Failed')
        return
    # Turn the column-oriented dict into one dict per row.
    csv_rows = (dict(zip(csv_data.keys(), row)) for row in zip(*csv_data.values()))
    await csv_repository.create(csv_rows, name)
    await update_status(name, 'Uploaded')
    print('ALL TIME IS: ', time.perf_counter() - started)
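The snippet above relies on status_repository and csv_repository without showing them. As a rough sketch of what they might look like, here are two purely hypothetical in-memory stand-ins (a real application would talk to a database), together with the column-to-row conversion from process_file pulled out into a helper called columns_to_rows so it can be tried on its own.

```python
import asyncio
from typing import Iterable


class InMemoryStatusRepository:
    """Hypothetical stand-in for status_repository."""

    def __init__(self) -> None:
        self.statuses: dict[str, str] = {}

    async def update(self, name: str, status: str) -> None:
        self.statuses[name] = status


class InMemoryCsvRepository:
    """Hypothetical stand-in for csv_repository."""

    def __init__(self) -> None:
        self.rows: dict[str, list[dict]] = {}

    async def create(self, rows: Iterable[dict], name: str) -> None:
        # Materialize the generator so the rows can be read back later.
        self.rows[name] = list(rows)


def columns_to_rows(columns: dict[str, list]) -> Iterable[dict]:
    # Same expression as in process_file: transpose columns into row dicts.
    return (dict(zip(columns.keys(), row)) for row in zip(*columns.values()))


async def demo() -> tuple[dict, list[dict]]:
    status_repository = InMemoryStatusRepository()
    csv_repository = InMemoryCsvRepository()
    columns = {"id": [1, 2], "city": ["Oslo", "Lima"]}

    await status_repository.update("demo", "Uploading")
    await csv_repository.create(columns_to_rows(columns), "demo")
    await status_repository.update("demo", "Uploaded")
    return status_repository.statuses, csv_repository.rows["demo"]


statuses, rows = asyncio.run(demo())
print(statuses)
print(rows)
```

Each column contributes one value per row, so a dict of two columns with two values each becomes two row dicts.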

The next step is to create the endpoint that handles file uploads and kicks off the background task.

from fastapi import File, UploadFile
from fastapi.responses import JSONResponse
from uuid import uuid4

@app.post("/upload-file/")
async def upload_file(background_tasks: BackgroundTasks, file: UploadFile = File(...)):
    name = uuid4().hex
    background_tasks.add_task(process_file, file, name)
    return JSONResponse(content={"message": f"File with ID {name} has been uploaded and is being processed in the background."}, status_code=201)

Handling Large Files & Performance Considerations

Handling big files is no joke. You need to ensure that your app doesn’t hit performance issues.

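Reading File Content: Read the file content inside the endpoint and pass the resulting bytes to the background task. The uploaded file object can be closed once the response has been sent, so a task that receives the UploadFile itself may find it already closed.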

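def print_file_content(file_content: bytes):
    # Placeholder for the real processing step.
    print(file_content)

async def get_file_content(file: UploadFile):
    return await file.read()

@app.post("/upload/")
async def upload_file(background_tasks: BackgroundTasks, file: UploadFile = File(...)):
    # Read while the upload is still open, then hand plain bytes to the task.
    file_content = await get_file_content(file)
    background_tasks.add_task(print_file_content, file_content)
    return {"message": "File uploaded successfully"}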
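Reading the whole upload with file.read() keeps every byte in memory, which can hurt with really big files. One alternative, sketched here under the assumption that the background task can work from a path on disk, is to copy the upload to a temporary file in fixed-size chunks and pass only the path along. save_to_temp_file and count_lines are hypothetical helpers, and io.BytesIO stands in for the file object behind an UploadFile.

```python
import io
import os
import shutil
import tempfile
from typing import BinaryIO

CHUNK_SIZE = 1024 * 1024  # copy 1 MiB at a time


def save_to_temp_file(source: BinaryIO, suffix: str = ".csv") -> str:
    """Copy a file-like object to disk in chunks and return the new path."""
    with tempfile.NamedTemporaryFile(delete=False, suffix=suffix) as target:
        shutil.copyfileobj(source, target, length=CHUNK_SIZE)
        return target.name


def count_lines(path: str) -> int:
    """Example background job: stream the saved file, then clean it up."""
    try:
        with open(path, "rb") as handle:
            return sum(1 for _ in handle)
    finally:
        os.remove(path)


upload = io.BytesIO(b"id,city\n1,Oslo\n2,Lima\n")
path = save_to_temp_file(upload)
line_count = count_lines(path)
print(line_count)
```

In an endpoint, the copy would happen before the response is returned, and something like background_tasks.add_task(count_lines, path) would queue the rest. The task owns the temporary file, so it is also the one that deletes it.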

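Using Thread Pool: Blocking calls, such as parsing a CSV file with a synchronous library, stall the event loop if they run directly inside an async function. FastAPI provides run_in_threadpool to run a regular function in a worker thread instead.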

from fastapi.concurrency import run_in_threadpool

async def process_file(file: UploadFile, name: str):
    started = time.perf_counter()
    await update_status(name, 'Uploading')
    try:
        # read_csv must be a regular (non-async) function to run in the pool.
        csv_data = await run_in_threadpool(read_csv, file)
    except Exception:
        await update_status(name, 'Failed')
        return
    # Turn the column-oriented dict into one dict per row.
    csv_rows = (dict(zip(csv_data.keys(), row)) for row in zip(*csv_data.values()))
    await csv_repository.create(csv_rows, name)
    await update_status(name, 'Uploaded')
    print('ALL TIME IS: ', time.perf_counter() - started)
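One detail is easy to get wrong here: a thread pool helper expects a plain function. If it is handed an async def function, the call in the worker thread only creates a coroutine object and the body never runs. The sketch below shows the difference using asyncio.to_thread from the standard library as a stand-in, on the assumption that run_in_threadpool behaves the same way in this respect. read_sync and read_async are made-up names.

```python
import asyncio
import inspect


def read_sync() -> dict:
    # Regular function: the worker thread runs the body and returns the data.
    return {"id": [1, 2]}


async def read_async() -> dict:
    # Coroutine function: calling it only builds a coroutine object.
    return {"id": [1, 2]}


async def main() -> tuple[dict, bool]:
    good = await asyncio.to_thread(read_sync)
    bad = await asyncio.to_thread(read_async)
    bad_is_coroutine = inspect.iscoroutine(bad)
    bad.close()  # discard the unused coroutine to avoid a warning
    return good, bad_is_coroutine


good, bad_is_coroutine = asyncio.run(main())
print(good)
print(bad_is_coroutine)
```

So a function that will be sent to the thread pool should be declared with def, not async def.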

Chaining Multiple Background Tasks

Sometimes you’ll need to run several tasks in sequence. Call add_task once per task: the tasks run one after another, in the order they were added, once the response has been sent.

from fastapi import FastAPI, BackgroundTasks

app = FastAPI()

def task1(arg: str):
    pass

def task2(arg: int):
    pass

@app.post("/chain-tasks")
async def chain_tasks(background_tasks: BackgroundTasks):
    background_tasks.add_task(task1, "arg1")
    background_tasks.add_task(task2, 42)
    return {"message": "Chained tasks started"}
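Tasks queued with add_task do not pass return values to one another. When a later step needs the output of an earlier one, one option is to wrap the steps in a single function and queue that function instead. Here is a minimal sketch with made-up names (parse_upload, summarize, run_pipeline, results):

```python
def parse_upload(raw: bytes) -> list[list[str]]:
    # Step 1: split the raw CSV text into rows of fields.
    lines = raw.decode("utf-8").strip().splitlines()
    return [line.split(",") for line in lines]


def summarize(rows: list[list[str]]) -> dict:
    # Step 2: needs the output of step 1.
    header, *records = rows
    return {"columns": header, "record_count": len(records)}


# Hypothetical place to keep results so another endpoint could read them.
results: dict[str, dict] = {}


def run_pipeline(name: str, raw: bytes) -> None:
    # One background task that runs both steps in order.
    rows = parse_upload(raw)
    results[name] = summarize(rows)


run_pipeline("demo", b"id,city\n1,Oslo\n2,Lima\n")
print(results["demo"])
```

In an endpoint this would be queued with background_tasks.add_task(run_pipeline, name, raw), keeping the hand-off between steps inside ordinary Python code.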

Using Asynchronous Background Tasks

FastAPI supports async background tasks too, which suit I/O-bound operations. An async def task is awaited on the event loop, while a regular def task runs in a thread pool.

import asyncio
from fastapi import FastAPI, BackgroundTasks

app = FastAPI()

async def async_task(seconds: int):
    await asyncio.sleep(seconds)

@app.post("/async-background")
async def async_background(background_tasks: BackgroundTasks):
    background_tasks.add_task(async_task, 10)
    return {"message": "Async task started"}
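An async task only helps if it actually yields to the event loop. The sketch below, which uses nothing but the standard library, counts how often a small heartbeat coroutine gets to run while a task is in progress: once with asyncio.sleep and once with a blocking time.sleep inside an async def. All names here (count_ticks, cooperative_task, blocking_task, ticks_during) are illustrative.

```python
import asyncio
import time


async def count_ticks(stop: asyncio.Event) -> int:
    # Heartbeat: counts how many times the event loop lets it run.
    ticks = 0
    while not stop.is_set():
        await asyncio.sleep(0.01)
        ticks += 1
    return ticks


async def cooperative_task(seconds: float) -> None:
    await asyncio.sleep(seconds)  # yields to the event loop while waiting


async def blocking_task(seconds: float) -> None:
    time.sleep(seconds)  # holds the event loop for the whole duration


async def ticks_during(task, seconds: float) -> int:
    stop = asyncio.Event()
    counter = asyncio.create_task(count_ticks(stop))
    await asyncio.sleep(0)  # give the heartbeat a chance to start
    await task(seconds)
    stop.set()
    return await counter


async def main() -> tuple[int, int]:
    cooperative = await ticks_during(cooperative_task, 0.2)
    blocking = await ticks_during(blocking_task, 0.2)
    return cooperative, blocking


cooperative_ticks, blocking_ticks = asyncio.run(main())
print(cooperative_ticks, blocking_ticks)
```

The heartbeat keeps ticking during the cooperative task and barely moves during the blocking one. Blocking work is therefore better written as a regular def task or sent to a thread pool.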

Real-World Example: Image Processing API

Imagine you’re building an API to process images. You could offer two endpoints: one that processes the image before responding, and one that hands the work to a background task and responds right away.

import time
from fastapi import FastAPI, BackgroundTasks, File, UploadFile

app = FastAPI()

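def process_image(data: bytes):
    # Simulate two seconds of blocking image work.
    time.sleep(2)

@app.post("/process-sync")
async def process_sync(image: UploadFile = File(...)):
    # The client waits the full two seconds before getting a response.
    process_image(await image.read())
    return {"message": "Image processed synchronously"}

@app.post("/process-async")
async def process_async(background_tasks: BackgroundTasks, image: UploadFile = File(...)):
    # Read the bytes now; the upload may be closed by the time the task runs.
    data = await image.read()
    background_tasks.add_task(process_image, data)
    return {"message": "Image processing started in the background"}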

Wrapping Up

FastAPI’s background tasks are incredibly handy for managing async operations, especially when dealing with file uploads. By pushing time-intensive tasks to the background, you can keep your application swift and user-friendly. Whether you’re wrestling with large files, sending out emails, or updating databases, background tasks help you build scalable and high-performing web apps. With these steps, you can easily integrate background tasks into your FastAPI application and boost its overall performance.