Alibaba Group << USTC
Kaiyuan County, Liaoning
https://hitachinsk.github.io/
Starred repositories
🚀 PyTorch implementation of "Progressive Distillation for Fast Sampling of Diffusion Models" (v-diffusion)
CLIP+MLP Aesthetic Score Predictor
A replication of DeepSeek-R1-Zero and DeepSeek-R1 training on small models with limited data
Wan: Open and Advanced Large-Scale Video Generative Models
PyTorch implementation of InstructDiffusion, a unifying and generic framework for aligning computer vision tasks with human instructions.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Official Repo for Open-Reasoner-Zero
[CVPR 2024] Code for the paper "Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model"
Implementation of the sparse attention pattern proposed by the Deepseek team in their "Native Sparse Attention" paper
A collection of awesome video generation studies.
A curated list of recent diffusion models for video generation, editing, and various other applications.
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
My implementation of "Patch n’ Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution"
Tarsier -- a family of large-scale video-language models designed to generate high-quality video descriptions, with strong general video-understanding capability.
Official repository for LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers
Video Generation Foundation Models: https://saiyan-world.github.io/goku/
LLM knowledge explained so that anyone can understand it; a must-read before spring/autumn recruitment LLM interviews, so you can speak confidently with your interviewer.
[ICLR 2025] Rectified Diffusion: Straightness Is Not Your Need
Janus-Series: Unified Multimodal Understanding and Generation Models
Investigating CoT Reasoning in Autoregressive Image Generation
ICLR 2024 Spotlight: curation/training code, metadata, distribution, and pre-trained models for MetaCLIP; CVPR 2024: MoDE: CLIP Data Experts via Clustering
An image-prompt adapter that enables a pretrained text-to-image diffusion model to generate images conditioned on an image prompt.