Zengyi Qin | qinzy [at] mit.edu
I am an MIT PhD student affiliated with MIT CSAIL. My current research focus is
I have solid experience pre-training and post-training LLMs. I was the project lead of JetMoE, an MoA+MoE LLM pre-trained and post-trained
I am also the main author of several popular open-source projects. One has 28k stars and trended 1st on GitHub. Another averages more than 4M monthly downloads on Hugging Face (more than Stable Diffusion).
Previously, I was also a visiting researcher at the Stanford Vision and Learning Lab, where I had the privilege of working with
Beyond research, I also have extensive experience
Doctor of Philosophy - PhD
Affiliated with MIT CSAIL (Computer Science and Artificial Intelligence Laboratory) and MIT AeroAstro
Technical blog
MIT CSAIL posts
Comments from the field (1 2 3)
The breakthrough represented by JetMoE-8B signals a significant democratization of AI technology (1)
Technical blog
Trended 1st on GitHub. Now 27k stars
Covered by VentureBeat, HyScaler, and other media outlets
AI Voice Cloning Redefined: OpenVoice Unveils Revolutionary Open-Source Technology (1)
Visual Reasoning by Learning Latent Symbolization

JetMoE: Reaching LLaMA2 Performance with 0.1M Dollars
JetMoE is pre-trained and post-trained from scratch for less than 0.1M USD, yet it outperforms LLaMA2-7B. It democratized high-performance LLM pre-training and post-training with remarkable cost-efficiency.

OpenVoice: Versatile Instant Voice Cloning
Instantly clone any voice to generate speech in various styles and languages.

DreamVoice: Text-Guided Voice Conversion
Convert a voice into any voice based on the input text prompt.

MeloTTS: A high-quality multi-lingual multi-accent text-to-speech library
High-quality multi-lingual text-to-speech library that supports English (US, BR, AU, Indian), Spanish, French, Chinese, Japanese, and Korean.

MonoGRNet: A General Framework for Monocular 3D Object Detection
A general monocular 3D object detection framework that flexibly adapts to both fully and weakly supervised learning. It alleviates the need for extensive 3D labels and requires only ground-truth 2D bounding boxes during training.

Weakly Supervised 3D Object Detection from Point Clouds
A state-of-the-art framework for weakly supervised 3D object detection from point clouds that uses no ground-truth 3D bounding boxes for training. The core of our method is the unsupervised 3D object proposal module and the cross-modal knowledge distillation strategy.

Triangulation Learning Network: from Monocular to Stereo 3D Object Detection
A pioneering work on stereo-image-based 3D object detection that does not compute pixel-level depth maps. We proposed a triangulation learning method to learn the object-level stereo geometric correspondence for 3D object detection.

MonoGRNet: A Geometric Reasoning Network for Monocular 3D Object Localization
A state-of-the-art monocular 3D object detection approach based on geometric reasoning. We proposed decomposing the whole task into four progressive sub-tasks, which significantly facilitates monocular 3D object detection.

SABLAS: Learning Safe Control for Black-Box Dynamical Systems
Learning control barrier functions (CBFs) for safe control of black-box systems. CBFs are a powerful tool for providing safety guarantees, but before this work they could not be directly applied to black-box systems whose models are unavailable.

KETO: Learning Keypoint Representations for Tool Manipulation
KETO is a framework for robots to manipulate unseen objects as tools to complete diverse tasks. We proposed a method to learn keypoint representations of objects, which simplifies the manipulation task and improves generalization to novel objects.

Learning Safe Multi-agent Control with Decentralized Neural Barrier Certificates
We study the multi-agent safe control problem, where agents must avoid any collision while reaching their goals. Our method scales to an arbitrarily large number of agents (e.g., >1000 in our experiments) and achieves a 99-100% safety rate.

Reactive and Safe Road User Simulations using Neural Barrier Certificate
Reactive and safe agent modeling is important for modern traffic simulator design and safe planning applications. We propose a control barrier function-based method to simulate traffic agents that behave like humans or human-controlled vehicles and react to other road participants.

Density Constrained Reinforcement Learning
We study constrained reinforcement learning (CRL) from a novel perspective by setting constraints directly on state density functions, rather than on the value functions considered by previous work. State density has a clear physical and mathematical interpretation and can express a wide variety of constraints, such as resource limits and safety requirements.

Safe Nonlinear Control Using Robust Neural Lyapunov-Barrier Functions
Safety and stability are common requirements for robotic control systems. We propose a robust feedback method based on robust control Lyapunov barrier functions that generalize despite model uncertainty, with safety and stability guarantees.

Controller synthesis for linear system with reach-avoid specifications
We address the problem of synthesizing provably correct controllers for linear systems with reach-avoid specifications. Our solution decomposes the overall synthesis problem into two smaller and more tractable problems, achieving a 2-150 times speedup compared with previous techniques.

Learning fine-grained estimation of physiological states from coarse-grained labels by distribution restoration
Our method allows machine learning algorithms to perform fine-grained estimation of physiological states (e.g., sleep depth) even if the training labels are coarse-grained.

sEMG based Tremor Severity Evaluation for Parkinson's Disease using a Light-weight CNN
A machine learning framework to assist in the diagnosis of Parkinson's Disease by assessing pathological tremor. We proposed a light-weight convolutional neural network and a similarity learning strategy to handle the scarcity of medical data.