AlphaEvolve verifies, runs and scores the proposed programs using automated evaluation metrics. These metrics provide an objective, quantifiable assessment of each solution’s accuracy and quality.
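To make that evaluation step concrete, here is a minimal sketch of what an automated scoring harness might look like. The function name `evaluate_candidate`, the subprocess-based execution, and the simple pass/fail scoring are illustrative assumptions, not AlphaEvolve's actual harness, which uses task-specific metrics.

```python
# Hypothetical sketch of an automated evaluation step; not AlphaEvolve's API.
import subprocess
import sys
import tempfile

def evaluate_candidate(candidate_source: str, test_input: str, expected: str) -> float:
    """Run a proposed program in a subprocess and return a numeric score."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate_source)
        path = f.name
    try:
        result = subprocess.run(
            [sys.executable, path],
            input=test_input,
            capture_output=True,
            text=True,
            timeout=10,  # guard against non-terminating candidates
        )
    except subprocess.TimeoutExpired:
        return 0.0  # programs that fail to finish score zero
    if result.returncode != 0:
        return 0.0  # programs that crash score zero
    # Toy scoring rule: 1.0 for an exact output match, 0.0 otherwise.
    # A real harness would use richer, task-specific accuracy and quality metrics.
    return 1.0 if result.stdout.strip() == expected.strip() else 0.0

# Example: score a trivial candidate that echoes its input.
candidate = "import sys; print(sys.stdin.read().strip())"
print(evaluate_candidate(candidate, "hello", "hello"))  # -> 1.0
```

The key design point is that the score comes from actually running the proposed program, so the feedback is objective and quantifiable rather than based on the model's own judgment.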
Yeah, that’s the way genetic algorithms have worked for decades. Have they figured out a way to turn those evaluation metrics directly into code improvements, or do they just keep doing a bunch of rounds of trial and error?