AI-powered hardware and scaling recommendations for optimal AI/ML infrastructure
FinOpsMetrics provides intelligent, AI-powered recommendations to optimize your AI/ML infrastructure. Configure your preferred LLM provider (OpenAI, Anthropic, Azure OpenAI, Ollama) or use the built-in rule-based engine; either way, recommendations cover hardware optimization, scaling strategies, and cost reduction opportunities.
Optimize your GPU, CPU, memory, and storage configuration
Intelligent auto-scaling and capacity planning
Reduce infrastructure costs without sacrificing performance
OpenAI: GPT-4, GPT-3.5
Anthropic: Claude 3 (Opus, Sonnet)
Azure OpenAI: Enterprise GPT-4
Ollama: Local LLaMA models
Rule-based engine: No LLM required
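For local models, a minimal configuration sketch might look like the following. Note that LLMProvider.OLLAMA and the base_url parameter are assumptions based on the provider list above; check your installed version for the exact names.

from finopsmetrics.observability.llm_config import LLMConfig, LLMProvider

# Hypothetical local-model configuration; LLMProvider.OLLAMA and base_url
# are assumed names -- verify them against your installed version.
local_config = LLMConfig(
    provider=LLMProvider.OLLAMA,
    model_name="llama3",
    base_url="http://localhost:11434",  # default Ollama endpoint (assumption)
    temperature=0.3,
    max_tokens=1000
)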
Get up and running with AI-powered recommendations in 4 simple steps
Step 1: Installation. Install the FinOpsMetrics package with all dependencies, or install only the LLM provider you need.
# Install with all LLM providers
pip install "finopsmetrics[all]"
# Or install with specific LLM provider
pip install finopsmetrics openai # For OpenAI (GPT-4, GPT-3.5)
pip install finopsmetrics anthropic # For Anthropic (Claude)
pip install finopsmetrics # No LLM (rule-based only)
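To confirm the install worked, a quick import check (printing __version__ assumes the package exposes that attribute; a plain import also suffices):

python -c "import finopsmetrics; print(finopsmetrics.__version__)"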
Step 2: Configure your LLM (optional). Choose your preferred LLM provider, or skip this step to use the built-in rule-based recommendation engine. You can configure the LLM with environment variables or a configuration file.
# Set your API key and model
export OPENAI_API_KEY="sk-your-api-key-here"
export OPENAI_MODEL="gpt-4-turbo-preview"
# Or for Anthropic Claude
export ANTHROPIC_API_KEY="sk-ant-your-api-key-here"
export ANTHROPIC_MODEL="claude-3-opus-20240229"
Alternatively, create a file named llm_config.json:
{
  "provider": "openai",
  "model_name": "gpt-4-turbo-preview",
  "api_key": "sk-your-api-key-here",
  "temperature": 0.3,
  "max_tokens": 1000
}
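There may be a built-in loader for this file; as a fallback, the values can always be read manually and passed to LLMConfig. A sketch, assuming LLMProvider members are named after the providers in uppercase (as LLMProvider.ANTHROPIC is, further below):

import json
from finopsmetrics.observability.llm_config import LLMConfig, LLMProvider

# Read the JSON config written above and build an LLMConfig by hand.
with open("llm_config.json") as f:
    cfg = json.load(f)

llm_config = LLMConfig(
    provider=LLMProvider[cfg["provider"].upper()],  # "openai" -> LLMProvider.OPENAI (assumed member name)
    model_name=cfg["model_name"],
    api_key=cfg["api_key"],
    temperature=cfg["temperature"],
    max_tokens=cfg["max_tokens"]
)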
Step 3: Get recommendations. Provide your current infrastructure metrics and receive intelligent, actionable optimization recommendations.
from finopsmetrics.observability.intelligent_recommendations import get_recommendations

# Define your current infrastructure metrics
current_metrics = {
    'instance_count': 4,
    'avg_cpu_utilization': 65,       # Average CPU usage percentage
    'avg_gpu_utilization': 45,       # Average GPU usage percentage
    'gpu_count': 1,                  # GPUs per instance
    'cost_per_instance_hour': 3.06,  # Hourly cost per instance
    'workload_type': 'inference'     # 'training' or 'inference'
}

# Get intelligent recommendations
recommendations = get_recommendations(
    current_metrics=current_metrics,
    workload_type='inference',
    cloud_provider='aws'  # 'aws', 'azure', 'gcp', or 'on-prem'
)

# Display the recommendations
for rec in recommendations:
    print(f"\n{'='*60}")
    print(f"📌 {rec.title}")
    print(f"   Priority: {rec.priority}")
    print(f"   Impact: ${rec.impact['cost_monthly_usd']}/month savings")
    print(f"   {rec.description}")
Step 4: Dashboard integration. Integrate recommendations directly into your executive dashboards to visualize and track potential savings.
from finopsmetrics.dashboard import COODashboard

# Initialize the dashboard
dashboard = COODashboard()

# Get recommendations with summary statistics
recommendations = dashboard.get_intelligent_recommendations(
    current_metrics=current_metrics
)

# Display summary
print(f"📊 Total Recommendations: {recommendations['summary']['total_recommendations']}")
print(f"💰 Potential Monthly Savings: ${recommendations['summary']['potential_monthly_savings']:.2f}")
print(f"⚡ High Priority Items: {recommendations['summary']['high_priority_count']}")
Start with the rule-based engine (no LLM required) to get immediate recommendations, then upgrade to LLM-powered recommendations for more contextual and nuanced insights.
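As a minimal sketch of that progression, assuming the coordinator can be constructed without an LLM config and then falls back to the rule-based engine:

from finopsmetrics.observability.intelligent_recommendations import IntelligentRecommendationsCoordinator

# Day one: rule-based recommendations, no API key needed
# (constructing the coordinator without a config is an assumption).
coordinator = IntelligentRecommendationsCoordinator()
baseline = coordinator.get_all_recommendations(
    current_metrics=current_metrics,
    use_llm=False
)

# Later: re-run the same call with an LLM configured for richer context.
contextual = coordinator.get_all_recommendations(
    current_metrics=current_metrics,
    use_llm=True
)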
LLM-powered analysis provides contextual recommendations beyond simple rules
Typically 30-50% reduction in infrastructure costs without performance loss
Every recommendation includes specific implementation steps
Works with any supported LLM provider, with a rule-based fallback when no LLM is configured
View recommendations directly in executive dashboards
Continuous monitoring and updated recommendations
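Since every recommendation ships with implementation steps, they can be printed alongside the fields used in Step 3. A sketch, assuming the steps are exposed as an actions list (the attribute name is a guess; only title, priority, impact, and description are confirmed above):

# Print each recommendation's implementation steps; `rec.actions` is a
# hypothetical attribute name -- check your installed version.
for rec in recommendations:
    print(f"📌 {rec.title} ({rec.priority})")
    for step in rec.actions:
        print(f"   - {step}")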
For full control, build a custom LLM configuration and pass it to the recommendations coordinator directly:

from finopsmetrics.observability.llm_config import LLMConfig, LLMProvider
from finopsmetrics.observability.intelligent_recommendations import IntelligentRecommendationsCoordinator

# Custom LLM config
llm_config = LLMConfig(
    provider=LLMProvider.ANTHROPIC,
    model_name="claude-3-opus-20240229",
    api_key="your-api-key",
    temperature=0.2,
    max_tokens=1500,
    track_api_costs=True,
    max_monthly_cost=100.0
)

# Initialize coordinator with custom config
coordinator = IntelligentRecommendationsCoordinator(llm_config)

# Get recommendations
recommendations = coordinator.get_all_recommendations(
    current_metrics=current_metrics,
    use_llm=True
)

# Export as Markdown report
markdown_report = coordinator.export_recommendations(recommendations, format='markdown')

# Export as HTML report
html_report = coordinator.export_recommendations(recommendations, format='html')

# Export as JSON
json_report = coordinator.export_recommendations(recommendations, format='json')
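Assuming the exports return strings (as the variable names suggest), writing a report to disk is a one-liner:

from pathlib import Path

# Persist the Markdown report for sharing or CI artifacts.
Path("recommendations.md").write_text(markdown_report)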
Example recommendations:

Title: Downgrade to Cost-Effective GPU
Impact: Save $1,458/month
Description: Current GPU utilization is only 45%. Switch from A100 to L4 GPUs for inference workloads.

Title: Enable Predictive Auto-Scaling
Impact: Save $892/month
Description: Detected repeatable usage patterns. Enable predictive scaling to pre-provision capacity.

Title: Use Spot Instances for Training
Impact: Save $2,102/month
Description: Training workload is fault-tolerant. Use spot instances with checkpointing for a 70% cost reduction.