Building Fast

Another interesting paper on AI. It presents a new AI system that automatically writes expert-level scientific software to solve empirical research tasks, such as data analysis, forecasting, and modeling, where the goal is to maximize a measurable quality metric (for example, prediction accuracy or fit to data). The approach combines a large language model (LLM) with tree search (TS), allowing it to systematically generate, evolve, and select code solutions that improve the score for a given scientific problem. I could see all of what it does being useful for business applications as well.

At its heart, the system works by prompting an LLM with a description of the scientific task, existing code, evaluation metrics, and relevant research ideas sourced from papers or textbooks. The LLM rewrites code candidates, which are then scored and explored in a tree search that efficiently navigates the space of possible solutions. The TS mechanism uses an upper confidence bound strategy to balance trying promising candidates (exploitation) against exploring new variants (exploration), leading to rapid jumps in solution quality as improvements are discovered. This would be a very tedious process if done by hand, without AI assistance.
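
To make the exploitation/exploration trade-off concrete, here is a minimal sketch of an upper-confidence-bound tree search. It is not the paper's implementation: the post does not give the exact selection rule, so this uses a generic UCB1-style formula, and every name in it (Node, tree_search, toy_propose, toy_evaluate) is hypothetical. The "code candidate" is reduced to a single number, the LLM rewrite is replaced by a random perturbation, and the quality metric is a toy function that peaks at 3.

```python
import math
import random
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    candidate: float                 # stand-in for one code candidate
    score: float                     # quality metric, higher is better
    parent: Optional["Node"] = None
    visits: int = 1                  # how often this node or a descendant was expanded
    best_in_subtree: float = 0.0     # best score seen at or below this node


def ucb(node: Node, total_visits: int, c: float) -> float:
    """UCB1-style value: exploitation term plus an exploration bonus."""
    bonus = c * math.sqrt(math.log(total_visits) / node.visits)
    return node.best_in_subtree + bonus


def tree_search(
    root_candidate: float,
    propose: Callable[[float, random.Random], float],
    evaluate: Callable[[float], float],
    iterations: int = 200,
    c: float = 0.5,
    seed: int = 0,
) -> Node:
    rng = random.Random(seed)
    root = Node(root_candidate, evaluate(root_candidate))
    root.best_in_subtree = root.score
    nodes = [root]
    best = root
    for _ in range(iterations):
        # Simplification: pick the node to expand among all nodes in the tree.
        total = sum(n.visits for n in nodes)
        parent = max(nodes, key=lambda n: ucb(n, total, c))
        # In the real system an LLM would rewrite the parent's code here.
        new_candidate = propose(parent.candidate, rng)
        child = Node(new_candidate, evaluate(new_candidate), parent=parent)
        child.best_in_subtree = child.score
        nodes.append(child)
        # Propagate the visit count and best score up to the root.
        node = parent
        while node is not None:
            node.visits += 1
            node.best_in_subtree = max(node.best_in_subtree, child.score)
            node = node.parent
        if child.score > best.score:
            best = child
    return best


def toy_evaluate(x: float) -> float:
    """Toy metric in (0, 1], maximal at x = 3."""
    return 1.0 / (1.0 + (x - 3.0) ** 2)


def toy_propose(x: float, rng: random.Random) -> float:
    """Toy 'rewrite': nudge the candidate by Gaussian noise."""
    return x + rng.gauss(0.0, 0.5)


if __name__ == "__main__":
    best = tree_search(0.0, toy_propose, toy_evaluate)
    print(f"best candidate {best.candidate:.3f}, score {best.score:.3f}")
```

The constant c controls the balance: a larger value spends more expansions on rarely visited candidates, a smaller one keeps refining the current best. A real system would replace toy_propose with an LLM call and toy_evaluate with the task's benchmark score.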

The paper demonstrates the system's capabilities across multiple fields: it outperformed human-made software on public leaderboards for single-cell RNA sequencing data integration, COVID-19 hospitalization forecasting, satellite image segmentation, neural activity prediction in zebrafish, time series prediction, and numerical solution of challenging integrals. Notably, it created dozens of novel methods: for example, 40 new approaches for single-cell analysis and 14 for COVID-19 forecasting that beat the prior best human and CDC ensemble solutions in independent benchmark tests.

For scientists and engineers, this AI system has transformative potential: it drastically shortens the time needed to create, test, and optimize research software, turning months of work into hours. It can serve as a co-scientist by generating, recombining, and refining ideas from the literature or domain experts, speeding up hypothesis testing, benchmarking, and discovery pipelines in computational biology, epidemiology, environmental studies, neuroscience, data science, and mathematical modeling. Because it rapidly produces validated, high-performance software, scientists can more quickly explore alternative approaches and advance the frontiers of their disciplines. For the non-scientists among us, opportunities abound as well.


