Substrate Python SDK
Updated Jun 1, 2024 - Python
Search, Knowledge, Uncertainty, Optimization, Learning, Neural Networks and Language.
A high-throughput and memory-efficient inference and serving engine for LLMs
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
High-efficiency floating-point neural network inference operators for mobile, server, and Web
Cross-platform, customizable ML solutions for live and streaming media.
OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference
Template designed to kickstart your machine learning training projects in Python
Large Language Model Text Generation Inference
PyTorch/XLA integration with JetStream (https://github.com/google/JetStream) for LLM inference
Substrate TypeScript SDK
TensorRT C++ API Tutorial
Study materials for taking the Harvard Biostatistics PhD Qualifying Exam, Summer 2024
A universal scalable machine learning model deployment solution
Example 📓 Jupyter notebooks that demonstrate how to build, train, and deploy machine learning models using 🧠 Amazon SageMaker.
JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).
🏗️ Fine-tune, build, and deploy open-source LLMs easily!
MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.
A lightweight, fast, parallel inference server for Llama