39. API Reference
Contents:
- 39.1. MemoryStore
- 39.2. Models
- 39.3. LoadModel
- 39.4. GPU Utilities
- 39.4.1. 🔑 Key Features
- 39.4.2. GPU Resource Management
- 39.4.3. gpu_stat
- 39.4.4. 📊 Usage Examples
- 39.4.5. ⚡ Performance Optimization
- 39.4.6. 🔧 Troubleshooting Guide
- 39.4.7. 📚 Additional Resources
- 39.4.8. GPU Memory Management
- 39.4.9. GPU Acceleration
- 39.4.10. GPU Memory Monitoring
- 39.4.11. Error Handling
- 39.4.12. Performance Comparison
- 39.5. Data Utilities
- 39.6. Utilities
39.7. Core Components
39.7.1. Memory System
Mock memory module for documentation build.
This module provides mock objects for the memories.core.memory module to allow documentation to be built without requiring all dependencies.
39.7.2. Models
- class memories.models.load_model.LoadModel[source]
Bases:
object- __init__(use_gpu=True, model_provider=None, deployment_type=None, model_name=None, api_key=None, endpoint=None, device=None)[source]
Initialize model loader with configuration.
- Parameters:
use_gpu (bool) – Whether to use GPU if available
model_provider (str) – The model provider (e.g., “deepseek”, “azure-ai”, “mistral”)
deployment_type (str) – Either “local” or “api”
model_name (str) – Short name of the model from BaseModel.MODEL_MAPPINGS
api_key (str) – API key for the model provider (required for API deployment type)
endpoint (str) – Endpoint URL for the model provider (optional)
device (str) – Specific GPU device to use (e.g., “cuda:0”, “cuda:1”)
- get_response(prompt, **kwargs)[source]
Generate a response using either local model or API.
- Parameters:
prompt (str) – The input prompt
**kwargs – Additional generation parameters including: max_length: Maximum length of generated response temperature: Sampling temperature (0.0 to 1.0) top_p: Nucleus sampling parameter top_k: Top-k sampling parameter num_beams: Number of beams for beam search
- Returns:
- Response dictionary containing:
text: The generated response text metadata: Generation metadata (tokens, time, etc) error: Error message if generation failed
- Return type:
Dict[str, Any]
- get_response_with_context(prompt, context_data, **kwargs)[source]
Generate a response using context-aware prompting.
- Parameters:
prompt (str) – The input prompt
context_data (Dict[str, Any]) – Dictionary containing contextual information including: - location_info: Location details - raw_data_summary: Summary of raw data from different sources - analysis_results: Results of various analyses - scenario_projections: Future scenario projections - historical_trends: Historical trend analysis
**kwargs – Additional generation parameters including: max_length: Maximum length of generated response temperature: Sampling temperature (0.0 to 1.0) top_p: Nucleus sampling parameter top_k: Top-k sampling parameter num_beams: Number of beams for beam search
- Returns:
- Response dictionary containing:
text: The generated response text metadata: Generation metadata including context usage error: Error message if generation failed
- Return type:
Dict[str, Any]
- chat_completion(messages, tools=None, tool_choice='auto', **kwargs)[source]
Generate a chat completion response using either local model or API.
- Parameters:
messages (List[Dict[str, str]]) – List of message dictionaries with ‘role’ and ‘content’ keys. Roles can be ‘user’, ‘assistant’, ‘system’, or ‘function’.
tools (List[Dict[str, Any]] | None) – Optional list of tool/function definitions that the model can use. Each tool should have a ‘type’, ‘function’ with ‘name’, ‘description’, ‘parameters’.
tool_choice (str) – How to handle tool selection. Options: - “auto”: Let the model decide if it should call a function - “none”: Don’t call any functions - Dict with specific function to call
**kwargs – Additional parameters including: temperature: Sampling temperature (0.0 to 1.0) max_tokens: Maximum tokens in the response top_p: Nucleus sampling parameter frequency_penalty: Frequency penalty parameter presence_penalty: Presence penalty parameter
- Returns:
- Response dictionary containing:
message: The assistant’s message tool_calls: List of tool calls if any metadata: Generation metadata error: Error message if generation failed
- Return type:
Dict[str, Any]