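Deep Learning Day 3: Automatic Differentiation

VII Automatic Differentiation

The torch.autograd module is PyTorch's automatic differentiation module. It computes the gradients of tensor operations automatically, and those gradients are then used to update the weight parameters.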

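1 Basic Concepts

Tensor: everything in Torch is a tensor. The requires_grad attribute decides whether gradients are computed for a tensor; True enables gradient computation.

Computation graph: torch.autograd tracks tensor operations by building a dynamic computation graph.

Backpropagation: tensor.backward() runs backpropagation and automatically computes the gradient of the loss function with respect to each tensor that requires gradients.

Gradient: tensor.grad gives access to the computed gradient, which is used to minimize the loss function and optimize the model.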
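The four concepts above can be seen together in a few lines. This is a minimal sketch, not part of the original notes; the variable names and the function y = x*x + 1 are chosen only for illustration.

```python
import torch

# A user-created tensor that autograd should track (a leaf of the graph).
x = torch.tensor(3.0, requires_grad=True)

# Each operation on x is recorded in the dynamic graph while it executes.
y = x * x + 1.0

# y is the result of an operation, so it carries a grad_fn; x has none.
y_has_grad_fn = y.grad_fn is not None
x_is_leaf = x.is_leaf

# Backward pass: computes dy/dx = 2 * x and stores it in x.grad.
y.backward()
print(x.grad)
```

Before backward() is called, x.grad is None; the attribute is only filled in by the backward pass.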

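2 Computing Gradients

1. Gradient of a scalar

import torch

def test01():
    # Gradient of a scalar
    # Create a scalar; it must be a floating-point type
    # (only floating-point and complex tensors can require gradients)
    x = torch.tensor(5, requires_grad=True, dtype=torch.float32)
    # Loss function
    y = x**2 + 2*x - 5
    # Backpropagation: compute the gradient
    y.backward()
    # Two steps: 1. differentiate y with respect to x;
    # 2. substitute the current value of x into that derivative
    # Read the gradient
    print(x.grad)  # the derivative (gradient) of y at x
    # Do not update x by reassignment: this would rebind x to a new tensor
    # x = x + 5

if __name__ == '__main__':
    test01()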
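To make the two steps in the comments concrete, the value autograd returns can be compared with the hand-derived derivative dy/dx = 2x + 2 and with a numerical estimate. The sketch below is an illustration only; the helper f and the step size h are assumptions, not part of the original notes.

```python
import torch

def f(x):
    # Same function as in the notes: y = x^2 + 2x - 5
    return x**2 + 2*x - 5

# 1. Autograd
x = torch.tensor(5.0, requires_grad=True, dtype=torch.float64)
f(x).backward()
autograd_value = x.grad.item()

# 2. Derivative taken by hand, dy/dx = 2x + 2, evaluated at x = 5
analytic_value = 2 * 5.0 + 2

# 3. Central finite difference as an independent numerical check
h = 1e-6
x0 = torch.tensor(5.0, dtype=torch.float64)
numeric_value = ((f(x0 + h) - f(x0 - h)) / (2 * h)).item()

print(autograd_value, analytic_value, numeric_value)
```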

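2. Gradient of a vector

import torch

def test02():
    # Gradient of a vector
    # Create a vector
    x = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float64)
    # Loss function
    y = x**2 + 2*x - 5
    print(y)
    y = y.sum()  # reduce to a scalar
    # y = y.mean()
    # backward() with no arguments requires y to be a scalar
    y.backward()
    print(x.grad)

if __name__ == '__main__':
    test02()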
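Reducing with sum() is one way to get a scalar. As a hedged sketch (the helper names below are made up for illustration), backward() also accepts a gradient argument for non-scalar outputs, and passing a tensor of ones gives the same result as summing first, while mean() scales every entry by 1/n.

```python
import torch

def make_y(values):
    x = torch.tensor(values, dtype=torch.float64, requires_grad=True)
    return x, x**2 + 2*x - 5

def grad_via_sum(values):
    x, y = make_y(values)
    y.sum().backward()
    return x.grad

def grad_via_ones(values):
    x, y = make_y(values)
    # Weight every element of y by 1: equivalent to differentiating y.sum()
    y.backward(gradient=torch.ones_like(y))
    return x.grad

def grad_via_mean(values):
    x, y = make_y(values)
    y.mean().backward()
    return x.grad

g_sum = grad_via_sum([1.0, 2.0, 3.0])
g_ones = grad_via_ones([1.0, 2.0, 3.0])
g_mean = grad_via_mean([1.0, 2.0, 3.0])
print(g_sum, g_ones, g_mean)
```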

3. Gradients of multiple scalars

import torch

def test03():
    # Gradients of multiple scalars
    x1 = torch.tensor(1., requires_grad=True)
    x2 = torch.tensor(2., requires_grad=True)
    y = x1**2 + 3*x2 - 5
    y.backward()  # compute the gradients
    print(x1.grad)
    print(x2.grad)

if __name__ == '__main__':
    test03()

4. Gradients of multiple vectors

import torch

def test04():
    # Gradients of multiple vectors
    x1 = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float32)
    x2 = torch.tensor([2, 2, 1], requires_grad=True, dtype=torch.float32)
    y = x1**2 + 3*x2 - 5
    y = y.sum()
    y.backward()
    print(x1.grad)
    print(x2.grad)
    # Why tensor([2., 4., 6.]) and tensor([3., 3., 3.])?
    # dy/dx1 = 2*x1 = [2, 4, 6], and dy/dx2 = 3 for every element.

if __name__ == '__main__':
    test04()
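The partial derivatives can also be requested directly. The sketch below uses torch.autograd.grad, which returns one gradient per input as a tuple and leaves the .grad attributes untouched; it is shown here as an alternative illustration, not as the method used in the notes.

```python
import torch

x1 = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
x2 = torch.tensor([2.0, 2.0, 1.0], requires_grad=True)

y = (x1**2 + 3*x2 - 5).sum()

# One gradient per input, in the same order as the inputs.
g1, g2 = torch.autograd.grad(y, (x1, x2))

print(g1)  # dy/dx1 = 2 * x1
print(g2)  # dy/dx2 = 3 for every element
```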

5. Gradient of a matrix

import torch

def test05():
    # Gradient of a matrix
    x1 = torch.tensor([[1, 2], [3, 4]], requires_grad=True, dtype=torch.float32)
    y = x1**2 + 2
    y = y.sum()
    y.backward()
    print(x1.grad)

if __name__ == '__main__':
    test05()

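3 Gradient Context Control

Gradient context control is used to manage the computation graph, memory consumption, and computational efficiency.

1. Controlling gradient computation

Gradient tracking has a performance cost, so turn it off where gradients are not needed.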

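import torch

def test01():
    x = torch.tensor(5, requires_grad=True, dtype=torch.float64)
    y = x**2 + 2*x - 5
    print(y)
    y.backward()
    print(x.grad)
    print(y.requires_grad)  # True by default
    # In practice y is already the final expression,
    # yet gradient tracking is still enabled for it
    # z = y**2 + 3
    # print(z)
    # z.backward()
    # print(y.grad)

# Turn off gradient tracking for y, method 1: with block
def test02():
    x = torch.tensor(5, requires_grad=True, dtype=torch.float64)
    with torch.no_grad():  # inside the with block, nothing is tracked for y
        y = x**2 + 2*x - 5
    print(y.requires_grad)

# Turn off gradient tracking for y, method 2: built-in decorator
@torch.no_grad()  # same effect as the with block
def test03():
    x = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float64)
    y = x**2 + 2*x - 5
    y = y.sum()
    print(y.requires_grad)

# Turn off gradient tracking for y, method 3: hand-written decorator
# A custom decorator that disables gradient tracking
def my_no_grad(func):
    def wrapper():
        # func is the function decorated with my_no_grad, i.e. test04
        with torch.no_grad():
            res = func()
        return res
    return wrapper

@my_no_grad  # same effect as the with block; test04 is first passed to my_no_grad()
def test04():
    x = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float64)
    y = x**2 + 2*x - 5
    y = y.sum()
    print(y.requires_grad)

# Turn off gradient tracking for y, method 4: global switch.
# Use with care: it affects all code that runs afterwards.
def test05():
    # Global switch
    x = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float64)
    torch.set_grad_enabled(False)
    y = x**2 + 2*x - 5
    y = y.sum()
    print(x.requires_grad)
    print(y.requires_grad)
    try:
        y.backward()  # fails: y was computed with tracking off
    except RuntimeError as e:
        print(e)
    print(x.grad)  # no gradient can be computed for x either
    torch.set_grad_enabled(True)  # restore the default

if __name__ == '__main__':
    test05()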
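A hand-written decorator is more reusable if it forwards arguments and keeps the wrapped function's name. The following is a sketch under those assumptions; the evaluate function and its scale parameter are hypothetical. It also shows torch.set_grad_enabled used as a context manager, which restores the previous state on exit instead of changing it globally.

```python
import functools
import torch

def my_no_grad(func):
    @functools.wraps(func)           # keep func's name and docstring
    def wrapper(*args, **kwargs):    # forward any arguments to func
        with torch.no_grad():
            return func(*args, **kwargs)
    return wrapper

@my_no_grad
def evaluate(x, scale=1.0):
    # Hypothetical function used only to exercise the decorator.
    return (scale * x**2).sum()

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
out = evaluate(x, scale=2.0)
print(out.requires_grad)

# Scoped switch: tracking is off only inside the with block.
with torch.set_grad_enabled(False):
    inside = (x * 2).requires_grad
after = (x * 2).requires_grad
print(inside, after)
```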

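2. Gradient accumulation

When gradients are computed repeatedly for the same variable, each new gradient is added to the value already stored in its .grad.

import torch

def test06():
    # Gradient accumulation
    x = torch.tensor(4, requires_grad=True, dtype=torch.float64)
    # y = x**2 + 2*x - 5
    # y.backward()
    # print(x.grad)

    # y = 2*x**2 + 7
    # y.backward()
    # print(x.grad)

    # y = 2*x**2 + 7
    # y.backward()
    # print(x.grad)
    # Accumulation: every backward call adds to x.grad
    for _ in range(4):
        y = 2*x**2 + 7
        y.backward()
        print(x.grad)

if __name__ == '__main__':
    test06()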
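For y = 2x^2 + 7 the derivative is 4x, which is 16 at x = 4, so each backward call adds 16 to x.grad. The sketch below records the value after every call; the list name seen is an illustrative choice.

```python
import torch

x = torch.tensor(4.0, requires_grad=True, dtype=torch.float64)

seen = []
for _ in range(4):
    y = 2 * x**2 + 7
    y.backward()
    # Copy the number out: x.grad is the same tensor object on every pass.
    seen.append(x.grad.item())

print(seen)
```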

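3. Zeroing gradients

When accumulation is not wanted, zero the gradient before running backpropagation.

import torch

def test07():
    # Zeroing gradients
    x = torch.tensor(4, requires_grad=True, dtype=torch.float64)
    y = 2*x**2 + 7
    # If it is not known whether a gradient has ever been computed for x,
    # check for an existing gradient first (x.grad is None before the first backward)
    if x.grad is not None:
        x.grad.zero_()
    y.backward()
    print(x.grad)

    z = 3*x**2 + 7*x
    # Zero the gradient of x before backpropagation
    x.grad.zero_()
    z.backward()
    print(x.grad)

def test08():
    x = torch.tensor(4, requires_grad=True, dtype=torch.float64)
    for _ in range(10):
        y = 2*x**2 + 7
        # Zero the gradient
        if x.grad is not None:
            x.grad.zero_()
        y.backward()
        print(x.grad)

if __name__ == "__main__":
    test07()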
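There are two common ways to reset a gradient: filling the existing tensor with zeros, or setting .grad back to None. A minimal sketch of both, with variable names chosen for illustration:

```python
import torch

x = torch.tensor(4.0, requires_grad=True)

(2 * x**2 + 7).backward()

# Option 1: keep the gradient tensor and fill it with zeros in place.
x.grad.zero_()
zeroed = x.grad.item()
(2 * x**2 + 7).backward()
after_zero = x.grad.item()

# Option 2: drop the gradient tensor; backward() allocates a fresh one.
x.grad = None
reset_is_none = x.grad is None
(2 * x**2 + 7).backward()
after_none = x.grad.item()

print(zeroed, after_zero, reset_is_none, after_none)
```

With option 2 the `if x.grad is not None` guard from the notes matters, because calling zero_() on None would raise an AttributeError.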

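4. Gradient update

import torch
import matplotlib.pyplot as plt

def test():
    # Plot the loss curve
    w = torch.linspace(-200, 100, 1000)
    loss = 3*w**2
    plt.grid()
    plt.plot(w, loss)
    plt.show()

def test01():
    # Initialize w
    w = torch.tensor(5., requires_grad=True)
    # Training hyperparameters
    lr = 0.01
    epoch = 100
    for i in range(epoch):
        # Loss function
        loss = 3*w**2 + 2*w - 5
        # Zero the gradient
        # w.grad.zero_()
        if w.grad is not None:
            w.grad.zero_()
        # Backpropagation: the derivative (gradient, slope) at the current w
        loss.backward()
        # Slope at the current w
        print(w.grad)
        # Update w using the gradient
        # Do not reassign w itself, otherwise w becomes a new object
        # (a new tensor that is no longer the tracked leaf)
        # Updating the data attribute of w is enough
        # w = w - 0.1*w.grad
        w.data = w.data - lr*w.grad.data
        print(w)
    # Value of w after 100 epochs
    print(w.item())

def test02():
    # Initialize w
    w = torch.tensor([10., 2., 3.], requires_grad=True)
    # Training hyperparameters
    lr = 0.01
    epoch = 100
    for i in range(epoch):
        # Loss function
        loss = 3*w**2 + 2*w - 5
        loss = loss.sum()
        # Zero the gradient
        # w.grad.zero_()
        if w.grad is not None:
            w.grad.zero_()
        # Backpropagation: the derivative (gradient, slope) at the current w
        loss.backward()
        # Slope at the current w
        print(w.grad)
        # Update w using the gradient (see the notes in test01)
        # w = w - 0.1*w.grad
        w.data = w.data - lr*w.grad.data
        print(w)
    # Value of w after 100 epochs
    print(w.data)
    # Save the weights (the ./data directory must already exist)
    torch.save(w.data, './data/weights.pth')

# Load the saved weights
def detect():
    # map_location="cuda" requires a CUDA device; use "cpu" otherwise
    w = torch.load('./data/weights.pth', map_location="cuda")
    print(w)

if __name__ == "__main__":
    test01()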
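The same update rule, w = w - lr * grad, can be written without touching w.data. The sketch below shows two alternatives that are not in the original notes: an in-place update inside torch.no_grad(), and torch.optim.SGD. The function names are illustrative. The loss 3w^2 + 2w - 5 has its minimum at w = -1/3, so both runs should end close to that value.

```python
import torch

def loss_fn(w):
    return 3 * w**2 + 2 * w - 5

def train_manual(lr=0.01, epochs=100):
    w = torch.tensor(5.0, requires_grad=True)
    for _ in range(epochs):
        if w.grad is not None:
            w.grad.zero_()
        loss_fn(w).backward()
        # In-place update outside the graph: w stays the same leaf tensor.
        with torch.no_grad():
            w -= lr * w.grad
    return w.item()

def train_sgd(lr=0.01, epochs=100):
    w = torch.tensor(5.0, requires_grad=True)
    optimizer = torch.optim.SGD([w], lr=lr)
    for _ in range(epochs):
        optimizer.zero_grad()   # clear the previous gradient
        loss_fn(w).backward()   # compute the new gradient
        optimizer.step()        # w <- w - lr * w.grad
    return w.item()

w_manual = train_manual()
w_sgd = train_sgd()
print(w_manual, w_sgd)
```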

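4 Leaf Nodes

When requires_grad=True, calling numpy() to convert the tensor to an ndarray raises an error. detach() can be used to create a leaf node instead: the new tensor shares its data with the original tensor but does not require gradients.

import torch

def test01():
    x = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float32)
    print(x)
    # x2 = x.numpy()
    # A tensor that requires gradients cannot be used directly as a plain
    # tensor, for example it cannot call numpy()
    # print(x2)
    x2 = x.detach()
    print(x2)
    x3 = x2.numpy()
    print(x3)

def test02():
    x = torch.tensor([1, 2, 3], requires_grad=True, dtype=torch.float32)
    print(x)
    x2 = x.detach()
    print(x2)
    print(id(x), id(x2))  # tensors are objects; x and x2 are different objects
    # data_ptr() gives the address of the underlying data, which is shared
    # (id(x.data) is not reliable here: .data builds a new temporary object on each access)
    print(x.data_ptr(), x2.data_ptr())

if __name__ == "__main__":
    test01()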
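Sharing data means that an in-place change made through the detached tensor is visible in the original tensor and in the NumPy array as well. A minimal sketch of that behavior, with illustrative variable names:

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# numpy() is refused while the tensor requires gradients.
try:
    x.numpy()
    numpy_failed = False
except RuntimeError:
    numpy_failed = True

# The result of an operation is not a leaf, but its detached copy is.
y = x * 2
y_is_leaf = y.is_leaf
y_detached_is_leaf = y.detach().is_leaf

d = x.detach()
same_storage = x.data_ptr() == d.data_ptr()
arr = d.numpy()

# In-place write through the detached tensor: x and arr see it too.
d[0] = 100.0
print(x, arr, d.requires_grad)
```

Because of this sharing, call clone() on the detached tensor when an independent copy is needed.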
