Description

DeciLM-7B-instruct is a derivative of the recently released DeciLM-7B language model, a pretrained, high-efficiency generative text model with 7.04 billion parameters. DeciLM-7B-instruct is a model for short-form instruction following. It was built by LoRA fine-tuning on the SlimOrca dataset.

Publishers
Deci AI Team

Submitted Version
December 12, 2023

Latest Version
N/A

Size
N/A 

Text to Text

Overview


DeciLM-7B-instruct is a derivative of the recently released DeciLM-7B language model, a pretrained, high-efficiency generative text model with 7.04 billion parameters. DeciLM-7B-instruct is one of the best 7B instruct models obtained using simple LoRA fine-tuning, without relying on advanced techniques such as RLHF, DPO, etc.

DeciLM-7B-instruct is available under the Apache 2.0 license, offering unrestricted use. It’s designed for versatile deployment, whether locally or on any cloud platform.

Model Highlights

  • Task: Text Generation 
  • Model Type: An auto-regressive language model using an optimized transformer decoder architecture that includes variable Grouped-Query Attention
  • Languages (NLP): English 

Model Architecture

Parameters: 7.04B
Layers: 32
Heads: 32
Sequence Length: 8K
GQA Key Value Heads: Variable*

*AutoNAC was employed to optimize the selection of the GQA num_key_value_heads for each model layer.

  • Decoder layer: Variable Grouped-Query Attention. Grouped-Query Attention (GQA) was introduced in Ainslie et al., 2023
  • Position Embeddings: Dynamic NTK Scaling Rotary Position Embeddings Su et al., 2021
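To illustrate the grouped-query attention idea referenced above, here is a toy NumPy sketch, not DeciLM's actual implementation: all shapes and names are illustrative. Each group of query heads shares a single key/value head, shrinking the KV cache relative to full multi-head attention.

```python
import numpy as np

def grouped_query_attention(q, k, v, num_kv_heads):
    """Toy grouped-query attention (Ainslie et al., 2023) sketch.

    q: (num_q_heads, seq, dim); k, v: (num_kv_heads, seq, dim).
    Each group of query heads shares one key/value head.
    """
    num_q_heads, seq_len, dim = q.shape
    group = num_q_heads // num_kv_heads
    # Repeat each KV head so every query head has a matching KV head.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(dim)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((8, 4, 16))  # 8 query heads
k = rng.standard_normal((2, 4, 16))  # only 2 KV heads -> smaller KV cache
v = rng.standard_normal((2, 4, 16))
out = grouped_query_attention(q, k, v, num_kv_heads=2)
print(out.shape)  # (8, 4, 16)
```

In DeciLM-7B the number of KV heads varies per layer (selected by AutoNAC) rather than being fixed as in this sketch.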

Uses

This model is intended for commercial and research use in English and can be fine-tuned for use in other languages.
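Since the model was produced by LoRA fine-tuning and can be further fine-tuned, the core idea is worth sketching. The NumPy snippet below is a minimal, hypothetical illustration of a LoRA-adapted linear layer, not the actual training setup used for this model: the frozen weight W is augmented by a trainable low-rank update B @ A.

```python
import numpy as np

def lora_forward(x, W, A, B, alpha=16):
    """Hypothetical sketch of a LoRA-adapted linear layer.

    The frozen weight W (d_out x d_in) is augmented by a low-rank
    update B @ A, so only A (r x d_in) and B (d_out x r) are trained.
    """
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T @ B.T)

rng = np.random.default_rng(0)
d_in, d_out, r = 64, 64, 8
W = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))  # B starts at zero: the adapter is a no-op at init
x = rng.standard_normal((2, d_in))
assert np.allclose(lora_forward(x, W, A, B), x @ W.T)  # matches the base layer
```

Because r is much smaller than the layer width, the adapter adds only a tiny fraction of trainable parameters, which is why LoRA fine-tuning is cheap compared to full fine-tuning.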

How to Use
from transformers import AutoModelForCausalLM, AutoTokenizer

# Note: the Hugging Face model ID below is assumed; DeciLM ships custom
# modeling code, so trust_remote_code=True is required.
model_id = "Deci/DeciLM-7B-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("How do I cook pancakes?", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=128)[0]))