Deploy Tiny-Llama on AWS EC2

by Marcello Politi, Jan 2024


Tiny-Llama logo (src: https://github.com/jzhang38/TinyLlama)

Learn to deploy a real ML application using AWS and FastAPI

Introduction

I’ve always thought that even the best project in the world doesn’t have much value if people can’t use it. That’s why it is so important to learn how to deploy Machine Learning models. In this article we focus on deploying a small large language model, Tiny-Llama, on an AWS EC2 instance.

List of tools I’ve used for this project:

  • Deepnote: a cloud-based notebook that’s great for collaborative data science projects, good for prototyping
  • FastAPI: a web framework for building APIs with Python (see the minimal sketch after this list)
  • AWS EC2: a web service that provides resizable compute capacity in the cloud
  • Nginx: an HTTP and reverse proxy server. I use it to connect the FastAPI server to AWS
  • GitHub: a hosting service for software projects
  • HuggingFace: a platform to host and collaborate on unlimited models, datasets, and applications.
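To make the setup concrete, here is a minimal sketch of the kind of FastAPI service we will end up with. The /generate endpoint name and the request schema are my illustrative assumptions, not necessarily the project’s actual code:

```python
# Minimal FastAPI sketch; the endpoint name and schema are illustrative.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Prompt(BaseModel):
    text: str
    max_new_tokens: int = 64  # illustrative default

@app.post("/generate")
def generate(prompt: Prompt):
    # Model inference would go here; see the TinyLlama loading
    # sketch in the next section.
    return {"completion": "placeholder for model output"}
```

Locally this runs with `uvicorn main:app --host 0.0.0.0 --port 8000`; on EC2, Nginx sits in front of it as a reverse proxy.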

About Tiny Llama

TinyLlama-1.1B is a project aiming to pretrain a 1.1B-parameter Llama model on 3 trillion tokens. It uses the same architecture as Llama 2.
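To get a feel for the model, here is a minimal sketch of loading it with the Hugging Face transformers library. I am assuming the chat-tuned checkpoint TinyLlama/TinyLlama-1.1B-Chat-v1.0; the dtype and generation settings are illustrative:

```python
# Minimal sketch: running TinyLlama locally with transformers.
# The checkpoint, dtype, and generation settings are assumptions.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
    torch_dtype=torch.bfloat16,  # roughly halves memory vs. float32
    device_map="auto",           # requires accelerate; picks GPU if available
)

result = pipe("What is Amazon EC2?", max_new_tokens=64)
print(result[0]["generated_text"])
```

At 1.1B parameters the weights fit in a couple of GB in bfloat16, which is exactly what makes a modest EC2 instance viable for serving it.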

Today’s large language models have impressive capabilities but are extremely expensive in terms of hardware. In many settings we have limited hardware: think smartphones or satellites. So there is a lot of research on creating smaller models so they can be deployed on the edge.

Here is a list of “small” models that are catching on:

  • MobileVLM (Multimodal)
  • Phi-2
  • Obsidian (Multimodal)
