Loading Model Parameters

weixin_44040169

Published 2024-03-23 16:13:28

Article link: https://blog.csdn.net/weixin_44040169/article/details/136452202

The usual way to save and then load model parameters:

torch.save(model.state_dict(), path)

model.load_state_dict(torch.load(path))
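As a minimal sketch of this save/load round trip, the example below writes a small model's state_dict to a temporary file and restores it into a fresh instance. The class name `TinyModel`, the layer sizes, and the file path are assumptions for illustration and do not come from the original post.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyModel(nn.Module):  # hypothetical model for illustration
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)


src = TinyModel()
path = os.path.join(tempfile.mkdtemp(), "tiny.pth")  # hypothetical path
torch.save(src.state_dict(), path)

dst = TinyModel()  # fresh instance with different random weights
# map_location="cpu" keeps the example independent of any GPU.
state = torch.load(path, map_location="cpu")
result = dst.load_state_dict(state)  # strict=True by default

print(list(state.keys()))
print(result.missing_keys, result.unexpected_keys)
print(torch.equal(dst.fc.weight, src.fc.weight))
```

With the default `strict=True`, `load_state_dict` raises an error when the keys do not match, which makes silent partial loads easier to notice.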

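However, in my case a parameter created directly with nn.Parameter, for example nn.Parameter(torch.ones(10)), was not restored by load_state_dict(). (An nn.Parameter assigned as a module attribute is normally part of the state_dict, so this usually points to a key-name mismatch, which strict=False silently skips.) As a workaround, the tensor can be read from the checkpoint and assigned manually: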

print("Model loaded from {}".format(self.args.checkpoint_dir))
# Note: plain string concatenation assumes checkpoint_dir ends with a path separator.
checkpoint = torch.load(self.args.checkpoint_dir + 'checkpoint_epoch_best.pth', map_location=f'cuda:{self.args.gpu}')
# print("checkpoint:", checkpoint)
pl = checkpoint['state_dict']['ctx']  # the saved context tensor

# self.model.load_state_dict(checkpoint['state_dict'], strict=False)  # no longer used
self.model.prompt_learner.set_pl(pl)

Of course, the model class must then define a method that assigns the parameter from external data; the loaded tensor is simply passed to that method.

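import torch
import torch.nn as nn

class PromptLearner(nn.Module):
    def __init__(self, llm, tokenizer, embed_tokens):
        super().__init__()
        print("Initializing a generic context")
        # n_ctx, ctx_dim and dtype are defined elsewhere (not shown in this post).
        ctx_vectors = torch.ones(1, n_ctx, ctx_dim, dtype=dtype)
        self.ctx = nn.Parameter(ctx_vectors)  # register the vectors as a parameter to be optimized

    def set_pl(self, pl):
        # Replaces the parameter with a new one built from the loaded tensor.
        # An optimizer created before this call would still hold the old parameter.
        self.ctx = nn.Parameter(pl)

    def forward(self):
        ctx = self.ctx
        print("ccctx:", ctx, ctx.size())
        # (rest of forward not shown in this post)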
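A self-contained sketch of this pattern that runs on CPU is shown here. The class name `PromptLearnerSketch`, the sizes `n_ctx=4` and `ctx_dim=8`, and the in-memory checkpoint dict are assumptions for illustration. The in-place `copy_` is one possible alternative to re-creating the parameter, not the method used in the post.

```python
import torch
import torch.nn as nn


class PromptLearnerSketch(nn.Module):  # hypothetical, simplified class
    def __init__(self, n_ctx=4, ctx_dim=8, dtype=torch.float32):
        super().__init__()
        # Learnable context vectors, initialised to ones as in the post.
        self.ctx = nn.Parameter(torch.ones(1, n_ctx, ctx_dim, dtype=dtype))

    def set_pl(self, pl):
        # Copy the loaded values in place. The Parameter object stays the
        # same, so an optimizer built earlier still tracks it.
        with torch.no_grad():
            self.ctx.copy_(pl)

    def forward(self):
        return self.ctx


# Stand-in for torch.load(...): same nesting as the checkpoint in the post.
checkpoint = {"state_dict": {"ctx": torch.full((1, 4, 8), 0.5)}}

learner = PromptLearnerSketch()
param_id_before = id(learner.ctx)
learner.set_pl(checkpoint["state_dict"]["ctx"])

print(learner().shape)
print(bool((learner.ctx == 0.5).all()))
print(id(learner.ctx) == param_id_before)
```

Because `copy_` requires matching (or broadcastable) shapes, a checkpoint saved with a different `n_ctx` or `ctx_dim` raises an error here instead of silently changing the parameter's shape.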

torch.argmax()

next_tokens = torch.argmax(next_tokens_scores, dim=-1)  # returns the index of the maximum value along the given dimension

In other words, for a 2-D tensor, dim=0 looks down each column and dim=1 looks across each row.

import torch

x = torch.randn(2, 4)
print(x)
'''
tensor([[ 1.2864, -0.5955,  1.5042,  0.5398],
        [-1.2048,  0.5106, -2.0288,  1.4782]])
'''

# y0 holds, for dim=0 (each column), the index of the maximum value
y0 = torch.argmax(x, dim=0)
print(y0)
'''
tensor([0, 1, 0, 1])
'''

# y1 holds, for dim=1 (each row), the index of the maximum value
y1 = torch.argmax(x, dim=1)
print(y1)
'''
tensor([2, 3])
'''
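To connect this with the `next_tokens` line at the start of this section, here is a small sketch of greedy token selection with `dim=-1`. The score values and the vocabulary size of 5 are made up for illustration.

```python
import torch

# Hypothetical scores for a batch of 2 sequences over a 5-token vocabulary.
next_tokens_scores = torch.tensor([
    [0.1, 2.0, -1.0, 0.3, 0.0],
    [1.5, 0.2, 0.1, 3.2, -0.7],
])

# dim=-1 is the last dimension (here the vocabulary axis), so the result
# holds one token index per sequence.
next_tokens = torch.argmax(next_tokens_scores, dim=-1)
print(next_tokens)        # tensor([1, 3])
print(next_tokens.shape)  # torch.Size([2])

# keepdim=True keeps the reduced axis with size 1.
kept = torch.argmax(next_tokens_scores, dim=-1, keepdim=True)
print(kept.shape)         # torch.Size([2, 1])
```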

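Regular expressions and string methods

^ : negation, when it is the first character inside [...]

^A-Za-z : inside [...], anything that is not a letter

[^A-Za-z]+ : one or more consecutive non-letter characters

.strip() : removes leading and trailing whitespace

.lower() : converts all uppercase letters to lowercase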
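The pieces above are often combined into a small text-cleaning step. The sketch below is one way to do that; the function name `clean_text` and the choice to replace each non-letter run with a single space are assumptions, not something stated in the post.

```python
import re


def clean_text(text):
    """Replace runs of non-letters with a space, trim, and lowercase."""
    # [^A-Za-z]+ matches one or more consecutive characters that are not
    # ASCII letters; each run is replaced by a single space.
    letters_only = re.sub(r"[^A-Za-z]+", " ", text)
    return letters_only.strip().lower()


print(clean_text("  Hello, World!! 123 PyTorch "))  # hello world pytorch
```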
