MLflow - Machine Learning Lifecycle Management Platform

I. About MLflow

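1. Project Overview

MLflow is an open-source platform designed for machine learning practitioners and teams. It streamlines management of the end-to-end machine learning workflow and keeps every stage trackable and reproducible.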

2. Resources

Source code: https://github.com/mlflow/mlflow
Documentation: https://mlflow.org/docs/latest/index.html
Community (Slack): https://mlflow.org/community/#slack
Demo: https://mlflow.org/docs/latest/index.html#running-mlflow-anywhere
License: Apache 2.0 https://github.com/mlflow/mlflow/blob/master/LICENSE.txt
PyPI downloads: https://pepy.tech/project/mlflow
Twitter: https://twitter.com/MLflow
Stack Overflow: https://stackoverflow.com/questions/tagged/mlflow
Issue tracker: https://github.com/mlflow/mlflow/issues/new/choose

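3. Core Features

1. Experiment Tracking 📝

Log model parameters and experiment results through the API
Compare experiments in an interactive UI
Automatic logging for frameworks such as scikit-learn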

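2. Model Packaging 📦

Package a model and its metadata in a standardized format
Record dependency versions so that deployments are reliable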

3. Model Registry 💾

Store and manage models in a central place
Version control and lifecycle management through an API

4. Model Serving 🚀

Deploy with Docker or Kubernetes
Compatible with platforms such as AWS SageMaker and Azure ML

5. Model Evaluation 📊

A suite of automated evaluation tools
Visual comparison of performance across multiple models

6. Observability 🔍

Tracing for GenAI libraries such as OpenAI and LangChain
A Python SDK for manual instrumentation

II. Installation

# Standard installation
pip install mlflow

# Lightweight installation (no extra dependencies)
pip install mlflow-skinny
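
To confirm which of the two packages ended up in the current environment, a small check along the following lines may help. It is a minimal sketch that uses only the Python standard library; the helper name installed_version is hypothetical and not part of MLflow.

```python
from importlib import metadata


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report both distributions named in the installation commands.
for name in ("mlflow", "mlflow-skinny"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

A result of "not installed" for both names usually means pip installed into a different interpreter than the one running the check.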

III. Usage Examples

1. Experiment Tracking

The following example trains a simple regression model with scikit-learn and enables MLflow autologging to track the experiment.

See: https://mlflow.org/docs/latest/tracking.html

import mlflow
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Enable autologging for scikit-learn
mlflow.sklearn.autolog()

# Load the training dataset
db = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(db.data, db.target)

# Fitting the model records the experiment data automatically
rf = RandomForestRegressor(n_estimators=100, max_depth=6, max_features=3)
rf.fit(X_train, y_train)

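When the code above finishes, MLflow has automatically created a Run that tracks the training dataset, hyperparameters, performance metrics, the trained model, dependencies, and more.

To inspect it, run the following command in a separate terminal and open the printed URL to access the MLflow UI: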

mlflow ui

2. Model Serving

A single MLflow CLI command deploys a logged model to a local inference server. To deploy to other hosting platforms, see the documentation.

mlflow models serve --model-uri runs:/<run-id>/model

See: https://mlflow.org/docs/latest/deployment/index.html
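
Once the server is running, predictions are requested over HTTP. The following sketch shows how a client might build such a request with only the standard library. It assumes the server listens on the default address http://127.0.0.1:5000, exposes the /invocations endpoint, and accepts a JSON body with an "inputs" key holding a 2-D array, as in recent MLflow 2.x releases. The helper names and the feature values are made up for illustration.

```python
import json
import urllib.request

# Assumed default address of `mlflow models serve`; adjust if --host/--port were set.
SERVER_URL = "http://127.0.0.1:5000/invocations"


def build_scoring_request(rows, url=SERVER_URL):
    """Build (but do not send) a JSON scoring request for a batch of feature rows."""
    body = json.dumps({"inputs": rows}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(rows, url=SERVER_URL):
    """Send the request to a running server and return the decoded JSON response."""
    with urllib.request.urlopen(build_scoring_request(rows, url)) as resp:
        return json.loads(resp.read())


# One illustrative row with 10 features, matching the diabetes dataset's width.
sample_rows = [[0.04, 0.05, 0.06, 0.02, -0.04, -0.03, -0.04, 0.0, 0.02, -0.02]]
request = build_scoring_request(sample_rows)
print(request.full_url, request.data.decode("utf-8"))
```

Calling predict(sample_rows) requires the server to be up. The exact shape of the JSON response depends on the MLflow version, so check it against the deployment documentation.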

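3. Model Evaluation

The following example runs an automatic evaluation of a question-answering task with several built-in metrics.

See: https://mlflow.org/docs/latest/model-evaluation/index.html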

import mlflow
import pandas as pd

# Evaluation set contains (1) input question (2) model outputs (3) ground truth
df = pd.DataFrame(
    {
        "inputs": ["What is MLflow?", "What is Spark?"],
        "outputs": [
            "MLflow is an innovative fully self-driving airship powered by AI.",
            "Sparks is an American pop and rock duo formed in Los Angeles.",
        ],
        "ground_truth": [
            "MLflow is an open-source platform for managing the end-to-end machine learning (ML) "
            "lifecycle.",
            "Apache Spark is an open-source, distributed computing system designed for big data "
            "processing and analytics.",
        ],
    }
)
eval_dataset = mlflow.data.from_pandas(
    df, predictions="outputs", targets="ground_truth"
)

# Start an MLflow Run to record the evaluation results to
with mlflow.start_run(run_name="evaluate_qa"):
    # Run automatic evaluation with a set of built-in metrics for question-answering models
    results = mlflow.evaluate(
        data=eval_dataset,
        model_type="question-answering",
    )

print(results.tables["eval_results_table"])
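
To give a feel for what one such metric measures, here is a hand-rolled exact-match score. This is a sketch for intuition only and not MLflow's implementation. It assumes that exact match is among the default metrics for the question-answering model type, which is worth checking against the documentation for your installed version. The function name exact_match is hypothetical, and the ground-truth strings are shortened.

```python
def exact_match(predictions, targets):
    """Return the fraction of predictions that equal their target string exactly."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        return 0.0
    hits = sum(pred == target for pred, target in zip(predictions, targets))
    return hits / len(predictions)


# Same model outputs as the evaluation set above; ground truths shortened.
outputs = [
    "MLflow is an innovative fully self-driving airship powered by AI.",
    "Sparks is an American pop and rock duo formed in Los Angeles.",
]
ground_truth = [
    "MLflow is an open-source platform for managing the end-to-end machine learning (ML) lifecycle.",
    "Apache Spark is an open-source, distributed computing system.",
]

score = exact_match(outputs, ground_truth)
print(f"exact match: {score}")
```

Neither output matches its reference here, so the score is zero. Strict string equality is a harsh measure for free-form answers, which is one reason several metrics are used together.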

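4. LLM Tracing

MLflow Tracing provides LLM observability for a range of GenAI libraries, including OpenAI, LangChain, LlamaIndex, DSPy, and AutoGen. To enable automatic tracing, call mlflow.xyz.autolog() before running your model, where xyz is the name of the library. For customization or manual instrumentation, see the documentation.

See: https://mlflow.org/docs/latest/llms/tracing/index.html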

import mlflow
from openai import OpenAI

# Enable automatic tracing for OpenAI
mlflow.openai.autolog()

# Query OpenAI LLM normally
response = OpenAI().chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hi!"}],
    temperature=0.1,
)