MultimodalEmbeddings

from ai_infra.embeddings import MultimodalEmbeddings

ai_infra.embeddings

Provider-agnostic embeddings for mixed text and image inputs. Generates a single embedding vector from an ordered sequence of text strings and/or images. Supports interleaved content (e.g. caption + image + follow-up text) where the provider supports it.

Supported providers:

- voyage: Voyage AI voyage-multimodal-3.5 (single-backbone, best RAG)
- cohere: Cohere embed-v4.0 (128K context, multilingual)
- google_vertexai: Google multimodalembedding@001 (Vertex AI)
- amazon: Amazon Titan image embeddings (AWS Bedrock)

Requires at least one of: VOYAGE_API_KEY, COHERE_API_KEY, GOOGLE_APPLICATION_CREDENTIALS, or AWS_ACCESS_KEY_ID.

python

from pathlib import Path
from ai_infra import MultimodalEmbeddings

emb = MultimodalEmbeddings()  # auto-detects provider from the environment

# Embed a single image
vector = emb.embed([Path("photo.jpg")])

# Embed image + caption together into one vector
vector = emb.embed([Path("photo.jpg"), "a picture of a mountain"])

# Batch embedding: one input sequence per item
vectors = emb.embed_batch([
    [Path("img1.jpg"), "caption one"],
    [Path("img2.png"), "caption two"],
])

# Async (must run inside a coroutine or an async-aware REPL)
vector = await emb.aembed([Path("photo.jpg")])
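
The `await` call above only works inside a coroutine. As a rough sketch of how `aembed` might be driven from synchronous code, the block below runs several calls concurrently with `asyncio.gather`. `StubEmbeddings` and `embed_all` are hypothetical names introduced here so the example runs without API keys; the sketch assumes only that `aembed` is a coroutine accepting the same input sequence as `embed`.

```python
import asyncio
from pathlib import Path


class StubEmbeddings:
    """Hypothetical stand-in for MultimodalEmbeddings; returns a dummy vector."""

    async def aembed(self, inputs):
        await asyncio.sleep(0)  # yield control, as a real network call would
        return [float(len(inputs))]  # dummy one-dimensional "embedding"


async def embed_all(emb, items):
    # Schedule one aembed call per input sequence and wait for all of them.
    # gather preserves the order of its arguments in the result list.
    return await asyncio.gather(*(emb.aembed(item) for item in items))


items = [[Path("img1.jpg"), "caption one"], [Path("img2.png")]]
vectors = asyncio.run(embed_all(StubEmbeddings(), items))
```

With a real `MultimodalEmbeddings` instance in place of the stub, the same pattern should apply, though provider rate limits may make `embed_batch` the better choice for large workloads.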

Providers:

- voyage / voyage_ai: Voyage AI (VOYAGE_API_KEY)
- cohere: Cohere (COHERE_API_KEY)
- google / google_vertexai / vertexai: Google Vertex AI (GOOGLE_APPLICATION_CREDENTIALS or GOOGLE_CLOUD_PROJECT)
- amazon / bedrock / aws: Amazon Bedrock (AWS_ACCESS_KEY_ID)
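
To make the alias and environment-variable listing concrete, here is a minimal sketch of how provider resolution could work. It is not the library's actual implementation: `resolve_provider`, `ALIASES`, and `ENV_VARS` are hypothetical names, and the detection order (Voyage, Cohere, Google, Amazon) is an assumption, since the page does not state which provider wins when several are configured.

```python
import os

# Alias -> canonical provider name, taken from the list above.
ALIASES = {
    "voyage": "voyage", "voyage_ai": "voyage",
    "cohere": "cohere",
    "google": "google_vertexai", "google_vertexai": "google_vertexai",
    "vertexai": "google_vertexai",
    "amazon": "amazon", "bedrock": "amazon", "aws": "amazon",
}

# Environment variables that indicate a provider is configured.
# The dict order doubles as the (assumed) auto-detection precedence.
ENV_VARS = {
    "voyage": ("VOYAGE_API_KEY",),
    "cohere": ("COHERE_API_KEY",),
    "google_vertexai": ("GOOGLE_APPLICATION_CREDENTIALS", "GOOGLE_CLOUD_PROJECT"),
    "amazon": ("AWS_ACCESS_KEY_ID",),
}


def resolve_provider(name=None, env=None):
    """Return a canonical provider name from an alias or from the environment."""
    env = os.environ if env is None else env
    if name is not None:
        try:
            return ALIASES[name.lower()]
        except KeyError:
            raise ValueError(f"unknown provider: {name!r}") from None
    # No explicit name: pick the first provider with a non-empty variable set.
    for provider, variables in ENV_VARS.items():
        if any(env.get(var) for var in variables):
            return provider
    raise RuntimeError("no provider credentials found in environment")
```

For example, `resolve_provider("bedrock")` yields `"amazon"`, and calling it with no name and only `COHERE_API_KEY` set yields `"cohere"`.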

Constructor

MultimodalEmbeddings(provider: str | None = None, model: str | None = None, kwargs: Any = {}) -> None

Parameter	Type	Default	Description
`provider`	`str | None`	None	Provider name. Auto-detects from environment if not given.
`model`	`str | None`	None	Model name. Uses provider default if not specified.
`kwargs`	`Any`	{}	—

Methods

- embed: takes an ordered sequence of text strings and/or image paths and returns a single embedding vector, e.g. emb.embed([Path("photo.jpg"), "a picture of a mountain"]).
- embed_batch: takes a list of such sequences and returns one vector per sequence.
- aembed: awaitable counterpart of embed, e.g. await emb.aembed([Path("photo.jpg")]).