Research areas

Where we focus

Four interlocking pillars that reflect where we believe AI can have the largest near-term impact.

🧠

Reasoning & foundation models

Advancing the capabilities of large models, particularly in long-horizon reasoning, planning, and tool use, and developing the evaluation methods that make those advances measurable.

🤖

Agentic systems

Building AI agents that can take real-world actions reliably: orchestration, memory, verification, and safety guardrails for production-grade autonomous workflows.

🧬

AI for science

Applied AI for biology, chemistry, materials, climate, and medicine — partnering with research groups to accelerate the experiments that move human knowledge forward.

🛡️

Trustworthy AI

Interpretability, robustness, alignment, and the deployment science required to make AI systems safer, more transparent, and more accountable in practice.

Selected work

Recent publications & releases

A sample of what we've been working on. (Placeholder entries — update with your real outputs.)

Paper • 2026

Long-horizon planning in tool-using agents

A study of decomposition strategies and verification methods that improve agent reliability on multi-step scientific workflows.

Read paper →

Open source • 2026

PureBench: an evaluation suite for AI in the lab

A benchmark for evaluating LLMs and agents on realistic scientific tasks — from literature triage to experiment design.

View on GitHub →

Paper • 2025

Retrieval-grounded reasoning for chemistry

How structured retrieval over chemical knowledge bases improves both factual accuracy and exploratory reasoning in domain models.

Read paper →

Tech report • 2025

Safe deployment patterns for agent workflows

A working document covering the production patterns, evaluation gates, and human-in-the-loop designs we use across deployments.

Read report →

Collaborate with us.

We partner with academic labs, scientific institutions, and industry teams on long-horizon research problems. If you have a problem you think we'd find interesting, we'd love to hear about it.

Get in touch →