Tianyu Pang P2333

Machine Learning

395 followers · 13 following

Sea AI Lab
Singapore
https://p2333.github.io/
@TianyuPang1

Stars

rosieyzh / openrlhf-pretrain

Code for "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining"

Python 6 Updated Apr 9, 2025

xingjunm / Awesome-Large-Model-Safety

Safety at Scale: A Comprehensive Survey of Large Model Safety

145 3 Updated Feb 19, 2025

TIGER-AI-Lab / verl-tool

Python 11 2 Updated Apr 14, 2025

sail-sg / understand-r1-zero

Understanding R1-Zero-Like Training: A Critical Perspective

Python 845 39 Updated Apr 14, 2025

BytedTsinghua-SIA / DAPO

An Open-source RL System from ByteDance Seed and Tsinghua AIR

1,099 46 Updated Apr 10, 2025

sail-sg / SkyLadder

Forked from jzhang38/TinyLlama

The official repository for SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Python 29 Updated Mar 20, 2025

sail-sg / LightTrans

The official implementation of "LightTransfer: Your Long-Context LLM is Secretly a Hybrid Model with Effortless Adaptation"

17 Updated Mar 17, 2025

sail-sg / Meta-ARVDM

Official Implementation of "Error Analyses of Auto-Regressive Video Diffusion Models: A Unified Framework"

5 Updated Mar 12, 2025

haonan3 / V1

V1: Toward Multimodal Reasoning by Designing Auxiliary Task

Python 28 Updated Apr 14, 2025

ganler / code-r1

Reproducing R1 for Code with Reliable Rewards

Python 167 11 Updated Apr 7, 2025

sail-sg / oat

🌾 OAT: A research-friendly framework for LLM online alignment, including preference learning, reinforcement learning, etc.

Python 319 20 Updated Apr 8, 2025

QwenLM / QwQ

QwQ is the reasoning model series developed by Qwen team, Alibaba Cloud.

Python 451 17 Updated Mar 27, 2025

hamishivi / automated-instruction-selection

Exploration of automated dataset selection approaches at large scales.

Python 37 2 Updated Mar 4, 2025

John-AI-Lab / Unnatural_Language

The official repository of 'Unnatural Language Are Not Bugs but Features for LLMs'

Python 13 Updated Mar 5, 2025

PeterGriffinJin / Search-R1

Search-R1: An Efficient, Scalable RL Training Framework for Reasoning & Search Engine Calling interleaved LLM based on veRL

Python 1,846 135 Updated Apr 11, 2025

RAGEN-AI / RAGEN

RAGEN leverages reinforcement learning to train LLM reasoning agents in interactive, stochastic environments.

Python 1,358 98 Updated Apr 10, 2025

wonderNefelibata / Awesome-LRM-Safety

Awesome Large Reasoning Model (LRM) Safety. This repository collects security-related research on currently popular large reasoning models such as DeepSeek-R1 and OpenAI o1.

Python 61 4 Updated Apr 14, 2025

sail-sg / LongSpec

LongSpec: Long-Context Speculative Decoding with Efficient Drafting and Verification

Python 44 Updated Mar 2, 2025

zhaochenyang20 / Awesome-ML-SYS-Tutorial

My learning notes/codes for ML SYS.

Python 1,787 109 Updated Apr 14, 2025

open-thoughts / open-thoughts

Fully open data curation for reasoning models

Python 1,697 146 Updated Apr 7, 2025

NovaSky-AI / SkyThought

Sky-T1: Train your own O1 preview model within $450

Python 3,192 322 Updated Apr 8, 2025

huggingface / evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.

Python 2,178 272 Updated Jan 10, 2025

EleutherAI / lm-evaluation-harness

A framework for few-shot evaluation of language models.

Python 8,619 2,301 Updated Apr 10, 2025

aiverify-foundation / moonshot-data

Contains all assets to run with Moonshot Library (Connectors, Datasets and Metrics)

Python 29 29 Updated Apr 6, 2025

sail-sg / oat-zero

A lightweight reproduction of DeepSeek-R1-Zero with in-depth analysis of self-reflection behavior.

Python 222 10 Updated Apr 4, 2025

volcengine / verl

verl: Volcano Engine Reinforcement Learning for LLMs

Python 6,567 699 Updated Apr 14, 2025

sail-sg / Rigging-ChatbotArena

Improving Your Model Ranking on Chatbot Arena by Vote Rigging

Python 20 2 Updated Feb 25, 2025

openai / simple-evals

Python 2,633 238 Updated Apr 10, 2025

SpatialVision / Orient-Anything

Python 260 9 Updated Apr 11, 2025

sail-sg / sailor2

🔱 Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs

55 3 Updated Mar 21, 2025
