Apple's New AI Dataset Aims to Improve Photo Editing Models

Apple researchers have released Pico-Banana-400K, a dataset of 400,000 curated images designed specifically to improve how AI systems edit photos based on text prompts. The massive dataset aims to address what Apple describes as a gap in current AI image-editing training: while systems like GPT-4o can make impressive edits, the researchers say progress has been limited by a lack of adequate training data built from real photographs.

Apple's new dataset aims to improve the situation. Pico-Banana-400K features images organized into 35 edit types across eight categories, from basic adjustments like color changes to complex transformations such as converting people into Pixar-style characters or LEGO figures. Each image passed through Apple's AI-powered quality-control pipeline, with Google's Gemini-2.5-Pro evaluating the results for instruction compliance and technical quality.

The dataset also includes three specialized subsets: 258,000 single-edit examples for basic training, 56,000 preference pairs comparing successful and failed edits, and 72,000 multi-turn sequences showing how images evolve through multiple consecutive edits. Apple built the dataset using Google's Gemini-2.5-Flash-Image (aka Nano-Banana) editing model, which was released just a few months ago. However, Apple's research revealed its limitations.
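To make the three subsets concrete, here is a minimal sketch of how their records might be represented in code. All class and field names below are illustrative assumptions, not the dataset's actual schema, which is defined in Apple's GitHub release; only the subset sizes come from the announcement.

```python
# Hypothetical sketch of Pico-Banana-400K's three subsets.
# Class and field names are assumptions for illustration only.
from dataclasses import dataclass, field
from typing import List

@dataclass
class SingleEdit:
    source_image_id: str
    instruction: str        # text prompt, e.g. "make the sky more vivid"
    edited_image_id: str

@dataclass
class PreferencePair:
    source_image_id: str
    instruction: str
    preferred_image_id: str  # the successful edit
    rejected_image_id: str   # the failed edit

@dataclass
class MultiTurnSequence:
    source_image_id: str
    turns: List[SingleEdit] = field(default_factory=list)  # consecutive edits

# Subset sizes as reported in the announcement
SUBSET_SIZES = {
    "single_edit": 258_000,
    "preference_pairs": 56_000,
    "multi_turn": 72_000,
}

print(sum(SUBSET_SIZES.values()))  # 386000 examples across the three subsets
```

Note that the subsets sum to 386,000; the remaining examples in the 400K total sit outside these three specialized splits.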

While global style changes succeeded 93% of the time, precise tasks such as relocating objects or editing text struggled badly, with success rates below 60%. Despite these limitations, the researchers say their aim with Pico-Banana-400K is to establish "a robust foundation for training and benchmarking the next generation of text-guided image editing models." The complete dataset is freely available for non-commercial research use on GitHub, so developers can use it to train more capable image-editing AI.
