09 Towards Fair Decentralized Benchmarking of Design Innovation Assessment Systems: A Federated Design Evaluation Challenge
Keywords:
Federated Design Evaluation, Design Innovation Assessment, Distributed Collaboration, AI-driven Design Analysis, Cross-cultural Design, Intellectual Property Protection
Abstract
Computational design competitions have become the standard for benchmarking design innovation assessment algorithms, but they typically rely on small, curated test datasets acquired from a few design studios, leaving a gap between benchmark results and the reality of diverse multicultural design contexts. To address this limitation, we introduce the Federated Design Evaluation (FeDe) Challenge, a new paradigm for real-world algorithmic performance evaluation in design innovation assessment. The FeDe Challenge is a competition that benchmarks both federated design knowledge aggregation algorithms and state-of-the-art design evaluation algorithms across multiple international design studios. Design knowledge aggregation and studio selection techniques were compared on a multicultural design dataset in realistic federated learning simulations; adaptive knowledge aggregation proved beneficial, and selective studio sampling yielded efficiency gains. Quantitative performance evaluation of state-of-the-art design assessment algorithms on data distributed internationally across 32 design institutions revealed good generalization on average, although worst-case performance exposed culture-specific failure modes. Similar multi-site setups can help validate the real-world utility of design innovation assessment algorithms in the future, enabling more inclusive and culturally aware design evaluation systems that respect intellectual property while fostering global creative collaboration.
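To make the two ingredients named in the abstract more concrete, the following is a minimal sketch of one federated round: a subset of studios is sampled, and their locally trained parameters are combined by a sample-size-weighted average. This is a generic federated-averaging illustration, not the aggregation method of any challenge participant. The function names `sample_studios` and `federated_average`, the `fraction` parameter, and the toy parameter vectors are all assumptions introduced here for illustration.

```python
import random
from typing import Dict, List


def sample_studios(studio_ids: List[str], fraction: float,
                   rng: random.Random) -> List[str]:
    """Pick a random subset of studios for one round (always at least one)."""
    k = max(1, round(fraction * len(studio_ids)))
    return rng.sample(studio_ids, k)


def federated_average(updates: Dict[str, List[float]],
                      num_samples: Dict[str, int]) -> List[float]:
    """Average per-studio parameter vectors, weighted by local dataset size.

    Assumes every studio reports a vector of the same length.
    """
    total = sum(num_samples[studio] for studio in updates)
    dim = len(next(iter(updates.values())))
    aggregated = [0.0] * dim
    for studio, params in updates.items():
        weight = num_samples[studio] / total
        for i, value in enumerate(params):
            aggregated[i] += weight * value
    return aggregated


if __name__ == "__main__":
    # Hypothetical studios with toy two-parameter "models".
    local_params = {"studio_a": [1.0, 2.0], "studio_b": [3.0, 4.0],
                    "studio_c": [5.0, 6.0], "studio_d": [7.0, 8.0]}
    local_sizes = {"studio_a": 10, "studio_b": 30,
                   "studio_c": 20, "studio_d": 40}

    rng = random.Random(0)
    chosen = sample_studios(list(local_params), fraction=0.5, rng=rng)
    round_updates = {s: local_params[s] for s in chosen}
    print(chosen, federated_average(round_updates, local_sizes))
```

Sampling only a fraction of studios per round reduces communication at the cost of a noisier aggregate. An adaptive scheme would replace the fixed sample-size weights with weights that change from round to round.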
