
Building an Image Upload Service with Azure Blob Storage (and a Few Tangents Along the Way)

By Markian Mumba · September 1, 2025
Tags: Tech, cloud, Azure Blob Storage
Almond Blossom (1890), painted by Vincent van Gogh

Currently, most of the company’s stack still runs on VPSs. The bigger goal is to move services to the cloud (some are already there), and our cloud of choice is Azure.

So recently I got tasked with building a microservice to push images to Azure.

That was the entire statement. Nothing else. A black box. So the first thing was to define what the MVP of such a service should actually look like.

Here’s what I came up with:

  • A service calling this service should be able to upload an image.

  • It should also be able to delete images it uploaded.

  • There should be an option for multiple image uploads.

  • And we should have an endpoint to fetch image metadata if needed.

Pretty straightforward list. Now… the tricky part: I had never used Azure.

Blob Storage — Azure’s S3

So, I did some reading. Azure Blob Storage is basically Azure’s version of AWS S3. Both store unstructured data — videos, images, MP3s, and the like.

Tangent: what do we even mean by “unstructured data”?

Well, structured data is nice and clean — rows and columns in a database, easily queryable. Unstructured data, on the other hand, doesn’t fit that shape.

Take an image: sure, at the lowest level it’s just a grid of pixels with RGB values (so technically “structured”). But the pixel matrix itself doesn’t tell you:

  • “This is a dog.”

  • “This person is smiling.”

Databases can’t natively answer a query like “all images containing cats” unless you first run the images through an AI or computer vision model to extract features (which then become structured metadata).

TL;DR → images are unstructured because they don’t carry semantic meaning by themselves.

How Blob Storage is organized

For our case, there are only three things we care about:

  • Storage Account → the top-level namespace (think of it as your virtual hard drive).

  • Container → a logical grouping of blobs (like a folder or bucket).

  • Blob → the actual file/object.
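
These three map directly onto a blob’s URL, which looks like https://&lt;storage-account&gt;.blob.core.windows.net/&lt;container&gt;/&lt;blob-name&gt;.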

That’s enough theory. Let’s get to uploading.

Upload Flow

The actor here is another microservice that gives us a multipart file. We have to accept it and push it to Azure.

First thing: we work with an InputStream. Why? Because JSON has structure, so Spring can map it into a DTO automatically. An image is just a stream of bytes; there’s no schema for Spring to map it onto.
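
The receiving endpoint isn’t shown in the post, but a minimal sketch (assuming Spring Web; the controller, path, and ImageUploadResponse type here are illustrative, not the service’s real names) might look like:

Code
import java.io.IOException;

import org.springframework.http.MediaType;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import org.springframework.web.multipart.MultipartFile;

@RestController
@RequestMapping("/images")
public class ImageController {

    private final ImageService imageService; // hypothetical service wrapping the flow below

    public ImageController(ImageService imageService) {
        this.imageService = imageService;
    }

    @PostMapping(consumes = MediaType.MULTIPART_FORM_DATA_VALUE)
    public ResponseEntity<ImageUploadResponse> upload(
            @RequestParam("file") MultipartFile file,
            @RequestParam("uploadedBy") String uploadedBy) throws IOException {
        // No DTO binding happens here; we pass the raw byte stream straight through.
        return ResponseEntity.ok(imageService.upload(
                file.getInputStream(),
                file.getOriginalFilename(),
                file.getContentType(),
                file.getSize(),
                uploadedBy));
    }
}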

Step 1: Check for duplicates with MD5

So, once we have the stream, we calculate an MD5 hash of it.

Code
data.mark(Integer.MAX_VALUE);                  // remember the current position so we can re-read the stream
String md5 = utilities.calculateMD5Hash(data); // consumes the stream to compute the hash

data.reset();                                  // rewind so the same stream can be uploaded afterwards
Optional<ImageMetadata> existingImage = imageMetadataRepository.findByMd5Hash(md5);
if (existingImage.isPresent()) {
    log.info("Duplicate image detected: {} (original: {})", 
             name, existingImage.get().getOriginalFilename());
    return createDuplicateResponse(existingImage.get());
}

Why MD5? Because if someone uploads cat.png, renames it to dog.png, and tries again, the hash will be the same → we know it’s a duplicate.

The utility for that looks like this:

Code
public String calculateMD5Hash(InputStream inputStream) throws IOException {
    try {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] buffer = new byte[8192];
        int bytesRead;

        // Feed the stream through the digest in 8 KB chunks.
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            md.update(buffer, 0, bytesRead);
        }

        // Convert the raw digest bytes into a lowercase hex string.
        byte[] hashBytes = md.digest();
        StringBuilder sb = new StringBuilder();
        for (byte b : hashBytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    } catch (NoSuchAlgorithmException e) {
        throw new RuntimeException("MD5 algorithm not available", e);
    }
}
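
One subtlety the duplicate check depends on: mark() and reset() only work on streams that actually support them, and the raw multipart stream may not. A simple safeguard (my own sketch, not code from the service) is to wrap the stream before hashing; note that this buffers the read bytes in memory, so it suits small-to-medium images:

Code
// Hypothetical safeguard: BufferedInputStream always supports mark/reset.
InputStream data = new BufferedInputStream(file.getInputStream());
data.mark(Integer.MAX_VALUE); // now safe regardless of the underlying stream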

Step 2: Rename and extract metadata

If the file is new, we give it a unique name and then extract details like type, size, width, and height.

Code
String newName = utilities.fileRename(name);
ImageMetadata metadata = imageMetadataService.extractMetadata(
    data, name, newName, contentType, size, uploadedBy
);
metadata.setMd5Hash(md5);
Rename utility:
Code
public String fileRename(String name) {
    // Keep the original extension; guard against names with no extension at all.
    int dot = name.lastIndexOf('.');
    String extension = (dot >= 0) ? name.substring(dot) : "";
    return UUID.randomUUID().toString() + extension;
}
Metadata extraction (simplified):
Code
// imageBytes: the uploaded stream, already read into a byte array in the full version.
// ImageIO.read returns null for unsupported formats, so the full version checks for that.
BufferedImage bufferedImage = ImageIO.read(new ByteArrayInputStream(imageBytes));
metadata.setOriginalFilename(name);
metadata.setBlobName(blobName);
metadata.setContentType(contentType);
metadata.setSizeBytes(size);
metadata.setUploadedBy(uploadedBy);
metadata.setWidth(bufferedImage.getWidth());
metadata.setHeight(bufferedImage.getHeight());

For JPEGs, we even try to parse EXIF data.
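
The post doesn’t name the EXIF library; one common choice is metadata-extractor, and a best-effort helper with it (the method name is illustrative) could look like this:

Code
import java.io.ByteArrayInputStream;
import java.util.Date;

import com.drew.imaging.ImageMetadataReader;
import com.drew.metadata.Metadata;
import com.drew.metadata.exif.ExifSubIFDDirectory;

// Hypothetical helper: pull the capture date out of a JPEG's EXIF block, if present.
public Date extractCaptureDate(byte[] imageBytes) {
    try {
        Metadata exif = ImageMetadataReader.readMetadata(new ByteArrayInputStream(imageBytes));
        ExifSubIFDDirectory dir = exif.getFirstDirectoryOfType(ExifSubIFDDirectory.class);
        return (dir != null) ? dir.getDateOriginal() : null;
    } catch (Exception e) {
        return null; // EXIF is best-effort; bad or missing data shouldn't fail the upload
    }
}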

Step 3: Upload to Azure

Now the actual upload:

Code
// Get a client scoped to our container, then a client for the (not yet existing) blob.
BlobContainerClient blobContainerClient = 
        blobServiceClient.getBlobContainerClient(containerName);
BlobClient blobClient = blobContainerClient.getBlobClient(newName);

// Upload the stream in 4 MB blocks; null timeout, no extra context.
Response<BlockBlobItem> response = blobClient.uploadWithResponse(
        new BlobParallelUploadOptions(data)
            .setParallelTransferOptions(
                new ParallelTransferOptions().setBlockSizeLong(4L * 1024 * 1024)
            ),
        null,
        Context.NONE
);

It’s a bit like Linux:

  • touch file.txt creates the file.

  • echo "hello" writes data into it.

Same here: getBlobClient() gives us a reference, then uploadWithResponse() actually writes the data.

Finally, we grab the blob URL and save it:

Code
metadata.setBlobUrl(blobClient.getBlobUrl());
metadata.setUploadedAt(LocalDateTime.now());
ImageMetadata savedMetadata = imageMetadataRepository.save(metadata);

log.info("Successfully uploaded image: {} with ID: {}", 
         name, savedMetadata.getId());

Step 4: Error handling

And of course, we wrap the whole thing in try/catch for Azure-specific, I/O, and generic errors.
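
The post doesn’t show the catch blocks themselves, but the shape is roughly this (a sketch; createErrorResponse is an illustrative helper, while BlobStorageException is the Azure SDK’s own exception type):

Code
try {
    // ... hashing, metadata extraction, and upload from the steps above ...
} catch (BlobStorageException e) {
    // Azure-specific failures: auth problems, missing container, throttling, etc.
    log.error("Azure upload failed with status {}", e.getStatusCode(), e);
    return createErrorResponse("Storage error: " + e.getErrorCode());
} catch (IOException e) {
    // Stream-level failures while reading the multipart data.
    log.error("Failed to read upload stream", e);
    return createErrorResponse("Could not read the uploaded file");
} catch (Exception e) {
    // Anything unexpected: log loudly, respond cleanly.
    log.error("Unexpected error during image upload", e);
    return createErrorResponse("Unexpected error");
}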

Other Functions

  • Delete → look up the blob name in the DB → get a blob client → call delete() (see the sketch after this list).

  • Get metadata → fetch metadata by ID.

  • Upload multiple → simple for loop over a list of multipart files.
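
For the delete path, a minimal sketch (reusing the repository and client wiring from above; the ID type and exception are illustrative) could be:

Code
public void deleteImage(Long id) {
    // The caller only knows the metadata ID, so resolve the blob name first.
    ImageMetadata metadata = imageMetadataRepository.findById(id)
            .orElseThrow(() -> new IllegalArgumentException("No image with id " + id));

    // deleteIfExists() is idempotent: a blob that's already gone is not an error.
    blobServiceClient.getBlobContainerClient(containerName)
            .getBlobClient(metadata.getBlobName())
            .deleteIfExists();

    imageMetadataRepository.delete(metadata);
}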

Finally, register the service in the service registry so other microservices can pick it up.

Wrap up

And honestly, that’s it.

The hardest part wasn’t Azure, really — it was handling unstructured data and wiring it into a nice service flow. Blob Storage itself is pretty straightforward once you get the hang of storage account → container → blob.

It feels a lot like dealing with files and folders locally… just with the cloud sprinkled on top.

Of course, down the line, we can add fancy stuff like CDN, caching headers, or lifecycle rules (delete images after X days). But for now, we’ve got something that works: upload, deduplicate, delete, and return metadata.

💎 Random Nugget

"One must never let the fire in one’s soul die, for it is only through this inner fire that we can overcome the difficulties of life. What is done in love is done well."
Vincent van Gogh, Letter to Theo, September 1883