Stars
TabICLv2: A state-of-the-art tabular foundation model
PluRel: Synthetic Data unlocks Scaling Laws for Relational Foundation Models
MLE-bench is a benchmark for measuring how well AI agents perform at machine learning engineering
A holistic framework for advancing LLMs as data science agents
Comprehensive open-source library of AI research and engineering skills for any AI model. Package the skills and your Claude Code/Codex/Gemini agent will be an AI research agent with full horsepowe…
LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence https://arxiv.org/abs/2509.03505
A practical guide to diffusion models, implemented from scratch.
[NeurIPS 2025] Tracking and Understanding Object Transformations
A largely incomplete but hopefully useful list of links to datasets for relational learning and inductive logic programming. No guarantees on availability.
Relational Transformer: Toward Zero-Shot Foundation Models for Relational Data
GraphPFN: A Prior-Data Fitted Graph Foundation Model
A model zoo for Grounding-DINO-based open-world detection models.
This repository is for tracking issues, feature requests, and feedback for KumoRFM.
🔬 MCP server to query KumoRFM in your agentic flows
Tabular Deep Learning Library for PyTorch
This is a curated list of research on diffusion models for tabular data, and serves as the official repository for the survey paper "Diffusion Models for Tabular Data: Challenges, Current Progress,…
[WIP] Resources for AI engineers. Also contains supporting materials for the book AI Engineering (Chip Huyen, 2025)
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
[ICCV 25] The official repository of the paper 'Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle'
ReDeLEx is a Python framework for developing and evaluating RDL models on relational databases via RelBench and CTU datasets.
RelBench: Relational Deep Learning Benchmark
🦀 Small exercises to get you used to reading and writing Rust code!
[PyTorch] Generative retrieval model using semantic IDs from "Recommender Systems with Generative Retrieval"
A comprehensive toolkit and benchmark for tabular data learning, featuring 35+ deep methods, more than 10 classical methods, and 300 diverse tabular datasets.
(ICLR 2025 Spotlight) TabReD: Analyzing Pitfalls and Filling the Gaps in Tabular Deep Learning Benchmarks
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.



