39.3. LoadModel
39.3.1. Model Loading System
- class memories.models.load_model.LoadModel[source]
Bases:
object- __init__(use_gpu=True, model_provider=None, deployment_type=None, model_name=None, api_key=None, endpoint=None, device=None)[source]
Initialize model loader with configuration.
- Parameters:
use_gpu (bool) – Whether to use GPU if available
model_provider (str) – The model provider (e.g., “deepseek”, “azure-ai”, “mistral”)
deployment_type (str) – Either “local” or “api”
model_name (str) – Short name of the model from BaseModel.MODEL_MAPPINGS
api_key (str) – API key for the model provider (required for API deployment type)
endpoint (str) – Endpoint URL for the model provider (optional)
device (str) – Specific GPU device to use (e.g., “cuda:0”, “cuda:1”)
- get_response(prompt, **kwargs)[source]
Generate a response using either local model or API.
- Parameters:
prompt (str) – The input prompt
**kwargs – Additional generation parameters including: max_length: Maximum length of generated response temperature: Sampling temperature (0.0 to 1.0) top_p: Nucleus sampling parameter top_k: Top-k sampling parameter num_beams: Number of beams for beam search
- Returns:
- Response dictionary containing:
text: The generated response text metadata: Generation metadata (tokens, time, etc) error: Error message if generation failed
- Return type:
Dict[str, Any]
- get_response_with_context(prompt, context_data, **kwargs)[source]
Generate a response using context-aware prompting.
- Parameters:
prompt (str) – The input prompt
context_data (Dict[str, Any]) – Dictionary containing contextual information including: - location_info: Location details - raw_data_summary: Summary of raw data from different sources - analysis_results: Results of various analyses - scenario_projections: Future scenario projections - historical_trends: Historical trend analysis
**kwargs – Additional generation parameters including: max_length: Maximum length of generated response temperature: Sampling temperature (0.0 to 1.0) top_p: Nucleus sampling parameter top_k: Top-k sampling parameter num_beams: Number of beams for beam search
- Returns:
- Response dictionary containing:
text: The generated response text metadata: Generation metadata including context usage error: Error message if generation failed
- Return type:
Dict[str, Any]
- chat_completion(messages, tools=None, tool_choice='auto', **kwargs)[source]
Generate a chat completion response using either local model or API.
- Parameters:
messages (List[Dict[str, str]]) – List of message dictionaries with ‘role’ and ‘content’ keys. Roles can be ‘user’, ‘assistant’, ‘system’, or ‘function’.
tools (List[Dict[str, Any]] | None) – Optional list of tool/function definitions that the model can use. Each tool should have a ‘type’, ‘function’ with ‘name’, ‘description’, ‘parameters’.
tool_choice (str) – How to handle tool selection. Options: - “auto”: Let the model decide if it should call a function - “none”: Don’t call any functions - Dict with specific function to call
**kwargs – Additional parameters including: temperature: Sampling temperature (0.0 to 1.0) max_tokens: Maximum tokens in the response top_p: Nucleus sampling parameter frequency_penalty: Frequency penalty parameter presence_penalty: Presence penalty parameter
- Returns:
- Response dictionary containing:
message: The assistant’s message tool_calls: List of tool calls if any metadata: Generation metadata error: Error message if generation failed
- Return type:
Dict[str, Any]
39.3.2. LoadModel Class
- class memories.models.load_model.LoadModel[source]
Bases:
object- __init__(use_gpu=True, model_provider=None, deployment_type=None, model_name=None, api_key=None, endpoint=None, device=None)[source]
Initialize model loader with configuration.
- Parameters:
use_gpu (bool) – Whether to use GPU if available
model_provider (str) – The model provider (e.g., “deepseek”, “azure-ai”, “mistral”)
deployment_type (str) – Either “local” or “api”
model_name (str) – Short name of the model from BaseModel.MODEL_MAPPINGS
api_key (str) – API key for the model provider (required for API deployment type)
endpoint (str) – Endpoint URL for the model provider (optional)
device (str) – Specific GPU device to use (e.g., “cuda:0”, “cuda:1”)
- get_response(prompt, **kwargs)[source]
Generate a response using either local model or API.
- Parameters:
prompt (str) – The input prompt
**kwargs – Additional generation parameters including: max_length: Maximum length of generated response temperature: Sampling temperature (0.0 to 1.0) top_p: Nucleus sampling parameter top_k: Top-k sampling parameter num_beams: Number of beams for beam search
- Returns:
- Response dictionary containing:
text: The generated response text metadata: Generation metadata (tokens, time, etc) error: Error message if generation failed
- Return type:
Dict[str, Any]
- get_response_with_context(prompt, context_data, **kwargs)[source]
Generate a response using context-aware prompting.
- Parameters:
prompt (str) – The input prompt
context_data (Dict[str, Any]) – Dictionary containing contextual information including: - location_info: Location details - raw_data_summary: Summary of raw data from different sources - analysis_results: Results of various analyses - scenario_projections: Future scenario projections - historical_trends: Historical trend analysis
**kwargs – Additional generation parameters including: max_length: Maximum length of generated response temperature: Sampling temperature (0.0 to 1.0) top_p: Nucleus sampling parameter top_k: Top-k sampling parameter num_beams: Number of beams for beam search
- Returns:
- Response dictionary containing:
text: The generated response text metadata: Generation metadata including context usage error: Error message if generation failed
- Return type:
Dict[str, Any]
- chat_completion(messages, tools=None, tool_choice='auto', **kwargs)[source]
Generate a chat completion response using either local model or API.
- Parameters:
messages (List[Dict[str, str]]) – List of message dictionaries with ‘role’ and ‘content’ keys. Roles can be ‘user’, ‘assistant’, ‘system’, or ‘function’.
tools (List[Dict[str, Any]] | None) – Optional list of tool/function definitions that the model can use. Each tool should have a ‘type’, ‘function’ with ‘name’, ‘description’, ‘parameters’.
tool_choice (str) – How to handle tool selection. Options: - “auto”: Let the model decide if it should call a function - “none”: Don’t call any functions - Dict with specific function to call
**kwargs – Additional parameters including: temperature: Sampling temperature (0.0 to 1.0) max_tokens: Maximum tokens in the response top_p: Nucleus sampling parameter frequency_penalty: Frequency penalty parameter presence_penalty: Presence penalty parameter
- Returns:
- Response dictionary containing:
message: The assistant’s message tool_calls: List of tool calls if any metadata: Generation metadata error: Error message if generation failed
- Return type:
Dict[str, Any]
39.3.3. Model Types
39.3.3.1. Base Model
- class memories.models.base_model.BaseModel[source]
Bases:
objectBase model class that can be shared across modules
- initialize_model(model, use_gpu=True, device=None)[source]
Initialize a model with the specified configuration.
39.3.3.2. API Connectors
- class memories.models.api_connector.APIConnector[source]
Bases:
ABCBase class for API connectors.
- class memories.models.api_connector.OpenAIConnector[source]
Bases:
APIConnectorConnector for OpenAI API.
- chat_completion(messages, tools=None, tool_choice='auto', **kwargs)[source]
Generate a chat completion using the OpenAI API.
- Parameters:
messages (List[Dict[str, str]]) – List of message dictionaries with ‘role’ and ‘content’ keys
tools (List[Dict[str, Any]] | None) – Optional list of tools/functions the model can use
tool_choice (str) – How to handle tool selection (“auto”, “none”, or specific)
**kwargs – Additional parameters for the API call
- Returns:
Dict containing the response message, tool calls, and metadata
- Return type:
- class memories.models.api_connector.AnthropicConnector[source]
Bases:
APIConnectorConnector for Anthropic API.
- class memories.models.api_connector.DeepseekConnector[source]
Bases:
APIConnectorConnector for Deepseek API.
39.3.4. Example Usage
from memories.models.load_model import LoadModel
# Initialize model with local deployment
local_model = LoadModel(
use_gpu=True,
model_provider="deepseek-ai",
deployment_type="local",
model_name="deepseek-coder-small"
)
# Generate text with the local model
response = local_model.get_response("Write a function to calculate factorial")
print(response["text"])
# Initialize model with API deployment
api_model = LoadModel(
model_provider="openai",
deployment_type="api",
model_name="gpt-4",
api_key="your-api-key" # Or set OPENAI_API_KEY environment variable
)
# Generate text with the API model
response = api_model.get_response(
"Explain quantum computing",
temperature=0.7,
max_tokens=500
)
print(response["text"])
# Clean up resources when done
local_model.cleanup()