Acontext supports multi-modal messages that include text, images, audio, and PDF documents. You can send and retrieve these messages in both OpenAI and Anthropic formats, with automatic format conversion between providers.

Prerequisites

Before working with multi-modal messages, ensure you have:
  • A running Acontext server (you can run one locally)
  • An Acontext API key
Multi-modal content is stored as assets in S3, while message metadata is stored in PostgreSQL. Acontext automatically handles file uploads and generates presigned URLs for retrieval.

Supported content types

Acontext supports the following multi-modal content types:

Images

PNG, JPEG, GIF, WebP formats for visual content

Audio

WAV, MP3 formats for voice and sound

Documents

PDF documents for analysis and summarization

Sending images

Images with OpenAI format

OpenAI supports images through the image_url content part type, which accepts both external URLs and base64-encoded data URLs. The following example sends an external image URL; a base64-encoded variant is sketched after the note below:
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
client.ping()
session = client.sessions.create()

# Send a message with an image URL
message = client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "What's in this image?"
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://example.com/image.png",
                    "detail": "high"  # Options: "low", "high", "auto"
                }
            }
        ]
    },
    format="openai"
)

print(f"Message with image sent: {message.id}")
The detail parameter controls image processing quality. Use "high" for detailed analysis, "low" for faster processing, or "auto" to let the system decide.
Base64-encoded images in OpenAI format use the data URL scheme: data:image/[type];base64,[base64-data]. The image data is stored within the message parts and returned as base64 when retrieved.
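For reference, here is a minimal sketch of the base64-encoded variant mentioned above. It follows the same client setup as the earlier example; the image.png filename is a placeholder:

import base64
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
session = client.sessions.create()

# Read the image and wrap it in a data URL: data:image/[type];base64,[base64-data]
with open("image.png", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

message = client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/png;base64,{image_data}"
                }
            }
        ]
    },
    format="openai"
)

print(f"Message with base64 image sent: {message.id}")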

Images with Anthropic format

Anthropic requires images to be base64-encoded:
import base64
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
session = client.sessions.create()

# Read and encode image as base64
with open("image.png", "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

# Send message with base64 image
message = client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Describe this image"
            },
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": image_data
                }
            }
        ]
    },
    format="anthropic"
)

print(f"Message with image sent: {message.id}")
Anthropic format requires images to be base64-encoded. The base64 data is stored within the message parts and returned as base64 when you retrieve the message.

Sending audio

Audio content can be included in messages for speech-to-text or audio analysis use cases:
import base64
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
session = client.sessions.create()

# Read and encode audio file
with open("audio.wav", "rb") as audio_file:
    audio_data = base64.b64encode(audio_file.read()).decode("utf-8")

# Send message with audio (OpenAI format)
message = client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Transcribe this audio"
            },
            {
                "type": "input_audio",
                "input_audio": {
                    "data": audio_data,
                    "format": "wav"
                }
            }
        ]
    },
    format="openai"
)

print(f"Message with audio sent: {message.id}")

Sending files

You can send files for analysis and understanding using base64-encoded content, though different formats handle files differently. Anthropic supports sending files using the document content type with base64-encoded data (an OpenAI-format sketch follows the note after this example):
import base64
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
session = client.sessions.create()

# Read and encode PDF file as base64
with open("report.pdf", "rb") as pdf_file:
    pdf_data = base64.b64encode(pdf_file.read()).decode("utf-8")

# Send message with PDF document (Anthropic format)
message = client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {
                "type": "document",
                "source": {
                    "type": "base64",
                    "media_type": "application/pdf",
                    "data": pdf_data
                }
            },
            {
                "type": "text",
                "text": "Summarize the key findings in this report"
            }
        ]
    },
    format="anthropic"
)

print(f"Message with PDF sent: {message.id}")
When you send a PDF with base64 data, the base64 content is stored within the message parts JSON in S3. When you retrieve the message, the PDF is returned as base64 data again—not as a presigned URL. This keeps the PDF data inline with the message content.
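The OpenAI-format equivalent is sketched below. This is a minimal sketch assuming Acontext passes through OpenAI's file content part (a filename plus a base64 file_data data URL); check the API reference to confirm the exact shape:

import base64
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
session = client.sessions.create()

with open("report.pdf", "rb") as pdf_file:
    pdf_data = base64.b64encode(pdf_file.read()).decode("utf-8")

# Assumption: the OpenAI "file" content part is accepted as-is when format="openai"
message = client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {
                "type": "file",
                "file": {
                    "filename": "report.pdf",
                    "file_data": f"data:application/pdf;base64,{pdf_data}"
                }
            },
            {"type": "text", "text": "Summarize the key findings in this report"}
        ]
    },
    format="openai"
)

print(f"Message with PDF sent: {message.id}")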

Supported document formats

Acontext does not restrict document formats; you can store any file type. However, not every file type is supported by your LLM provider, so check your provider's documentation to confirm that a given file type is supported.

Retrieving multi-modal messages

When retrieving messages, the content format depends on how the message was originally sent:
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)

# Retrieve messages
result = client.sessions.get_messages(
    session_id="session_uuid",
    format="anthropic",  # or "openai"
    limit=50
)

print(f"Retrieved {len(result.items)} messages")

# Access messages
for msg in result.items:
    for block in msg.content:
        if block.get('type') == 'image':
            # Images sent as base64 are returned as base64
            print(f"Image source type: {block['source']['type']}")
How content is returned (a decoding sketch follows this list):
  • Images/PDFs sent as base64 data are returned as base64 data (stored within the message parts)
  • Images/PDFs sent as URLs in OpenAI format are stored as URLs in metadata
  • Files uploaded via multipart form-data are stored as separate S3 assets (not covered in this guide)
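Continuing from the retrieval example above, here is a minimal sketch that decodes a base64 image block back to disk (block fields follow the Anthropic shape shown earlier; output.png is a placeholder name):

import base64

for msg in result.items:
    for block in msg.content:
        if block.get('type') == 'image' and block['source']['type'] == 'base64':
            # Decode the inline base64 payload back into raw bytes
            raw = base64.b64decode(block['source']['data'])
            with open("output.png", "wb") as f:
                f.write(raw)
            print(f"Wrote {len(raw)} bytes to output.png")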

Format conversion

Acontext automatically converts between formats when retrieving messages. Format conversion for images works as follows:
  • Images sent as base64 are stored as base64 and returned as base64 in any format
  • Images sent as URLs (in OpenAI format) are stored as URLs and can be:
    • Retrieved as URLs in OpenAI format (URL is preserved)
    • Retrieved as base64 in Anthropic format (URL is downloaded and converted on-the-fly)
The following example stores a message in OpenAI format and retrieves it in Anthropic format; the reverse direction is sketched after this example:
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
session = client.sessions.create()

# Send in OpenAI format
client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this image"},
            {
                "type": "image_url",
                "image_url": {
                    "url": "https://example.com/photo.jpg"
                }
            }
        ]
    },
    format="openai"
)

# Retrieve in Anthropic format
result = client.sessions.get_messages(
    session_id=session.id,
    format="anthropic"  # Different format!
)

# Image is automatically converted to Anthropic format
print("Message retrieved in Anthropic format")
for msg in result.items:
    print(f"Role: {msg.role}")
    for block in msg.content:
        print(f"  Block type: {block.get('type')}")
Format conversion is bidirectional and lossless for common content types. Use the format that best matches your workflow when retrieving messages.
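The reverse direction works the same way. Here is a minimal sketch that stores a base64 image in Anthropic format and retrieves it in OpenAI format (image.png is a placeholder; per the rules above, base64 content comes back as base64):

import base64
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)
session = client.sessions.create()

with open("image.png", "rb") as f:
    image_data = base64.b64encode(f.read()).decode("utf-8")

# Send in Anthropic format
client.sessions.send_message(
    session_id=session.id,
    blob={
        "role": "user",
        "content": [
            {"type": "text", "text": "Analyze this image"},
            {
                "type": "image",
                "source": {
                    "type": "base64",
                    "media_type": "image/png",
                    "data": image_data
                }
            }
        ]
    },
    format="anthropic"
)

# Retrieve in OpenAI format; the base64 image is returned as base64
result = client.sessions.get_messages(
    session_id=session.id,
    format="openai"
)
for msg in result.items:
    print(f"Role: {msg.role}")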

Best practices

  • Compress images and PDFs to reduce storage costs and improve performance
  • Use appropriate resolutions (e.g., 2048px max for most image analysis tasks)
  • Consider using "detail": "low" in OpenAI format for simple image understanding tasks
  • Base64-encoded content increases message size by ~33%, so optimization is important
  • Base64 encoding works well for files under 10MB
  • For very large files (>10MB), consider using multipart file uploads instead (see the size-check sketch after this list)
  • Monitor your storage usage and clean up old sessions regularly
  • Use OpenAI format for GPT-4 Vision and similar models
  • Use Anthropic format for Claude with vision and document analysis capabilities
  • Format conversion is automatic and lossless for common content types
  • Base64 data (images, PDFs, audio) is stored within the message parts JSON
  • Message parts are stored in S3, with metadata in PostgreSQL
  • When retrieving, base64 content is returned as-is (not converted to URLs)
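As a rough guard against oversized inline payloads, here is a minimal sketch of the ~10MB guideline above (the threshold constant and helper name are illustrative, not part of the SDK):

import os

MAX_INLINE_BYTES = 10 * 1024 * 1024  # ~10MB guideline from the list above

def should_inline_as_base64(path: str) -> bool:
    # Base64 adds ~33% overhead, so larger files are better sent as multipart uploads
    return os.path.getsize(path) <= MAX_INLINE_BYTES

print(should_inline_as_base64("report.pdf"))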

Complete workflow example

Here’s a complete example that demonstrates sending and retrieving multi-modal messages:
import base64
from acontext import AcontextClient

client = AcontextClient(
    api_key="sk-ac-your-root-api-bearer-token",
    base_url="http://localhost:8029/api/v1"
)

try:
    # Create session
    session = client.sessions.create()
    print(f"Session created: {session.id}")
    
    # Send text + image message
    with open("screenshot.png", "rb") as f:
        image_data = base64.b64encode(f.read()).decode("utf-8")
    
    message = client.sessions.send_message(
        session_id=session.id,
        blob={
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What UI improvements would you suggest for this design?"
                },
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": "image/png",
                        "data": image_data
                    }
                }
            ]
        },
        format="anthropic"
    )
    print(f"Message sent: {message.id}")
    
    # Retrieve messages
    result = client.sessions.get_messages(
        session_id=session.id,
        format="openai"  # Convert to OpenAI format
    )
    
    print(f"\nRetrieved {len(result.items)} messages:")
    for msg in result.items:
        print(f"  Role: {msg.role}")
        if isinstance(msg.content, list):
            for part in msg.content:
                if part.get('type') == 'text':
                    print(f"  Text: {part.get('text')[:50]}...")
                    
finally:
    client.close()
