The Adaptive Cultural Visual Transformer (ACVT-BiLSTM): Enhancing Consumer Purchase Intention through Dynamic Cultural Symbol Processing in Smart Retail Environments
Keywords:
Smart Retail, Cultural Symbols, Visual Merchandising, Deep Learning, Vision Transformer, Purchase Intention
Abstract
Smart retail environments are evolving rapidly and offer unprecedented personalization, yet they often overlook the influence of cultural symbols on consumer psychology. Existing visual merchandising strategies focus primarily on product placement and generic aesthetics. They fail to leverage the emotional resonance and identity congruence triggered by culturally relevant visual stimuli, which leads to suboptimal engagement and conversion rates. To address this gap, this study proposes a deep learning framework, the Adaptive Cultural Visual Transformer (ACVT-BiLSTM), designed to process and deploy culturally resonant visual elements dynamically in real-time retail displays. The framework is built on an enhanced Vision Transformer backbone (ACVT), which incorporates a novel Cultural Feature Fusion (CFF) module to extract multi-scale cultural features from image patches. The ACVT is coupled with a Bidirectional Long Short-Term Memory (BiLSTM) network that analyzes the temporal sequence of visual stimuli and predicts their cumulative effect on consumer emotional states (e.g., pleasure, arousal, dominance) and on subsequent purchase intention. We deployed this system in a simulated smart retail lab, using a curated dataset of traditional regional cultural symbols and their modern interpretations. Experimental results show that the ACVT-BiLSTM model achieves 94.2% accuracy in classifying the emotional valence of dynamic cultural displays. More critically, the system-generated dynamic displays produced a 15.8% increase in observed consumer engagement time and a 12.1% uplift in purchase intention compared with static, non-cultural displays.
This research provides a data-driven methodology for integrating cultural design and artificial intelligence in commercial spaces. It contributes theoretically to visual merchandising, cross-cultural design, and consumer psychology, and it offers a practical blueprint for hyper-personalized, emotionally intelligent retail experiences.
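To make the data flow described in the abstract concrete, the following is a minimal PyTorch sketch of one way such a pipeline could be wired together: patch embedding, a Transformer encoder, a multi-scale fusion step, and a BiLSTM over the frame sequence. It is an illustration under stated assumptions, not the authors' implementation. The class names, layer sizes, pooling scales, and the use of average pooling as a stand-in for the CFF module are all hypothetical choices that the abstract does not specify.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CulturalFeatureFusion(nn.Module):
    """Hypothetical stand-in for the CFF module: fuses patch tokens pooled at several scales."""

    def __init__(self, dim, scales=(1, 2, 4)):
        super().__init__()
        self.scales = scales  # assumed pooling scales, not taken from the paper
        self.proj = nn.Linear(dim * len(scales), dim)

    def forward(self, tokens, grid):
        # tokens: (batch, grid * grid, dim) -> feature map (batch, dim, grid, grid)
        b, n, d = tokens.shape
        fmap = tokens.transpose(1, 2).reshape(b, d, grid, grid)
        pooled = []
        for s in self.scales:
            # Average-pool at scale s, then upsample back to the patch grid
            p = F.avg_pool2d(fmap, kernel_size=s)
            pooled.append(F.interpolate(p, size=(grid, grid), mode="nearest"))
        fused = torch.cat(pooled, dim=1).flatten(2).transpose(1, 2)
        # Residual connection keeps the original patch tokens
        return tokens + self.proj(fused)


class ACVTEncoder(nn.Module):
    """ViT-style encoder followed by the fusion step; returns one vector per image."""

    def __init__(self, image_size=32, patch_size=8, dim=64, depth=2, heads=4):
        super().__init__()
        self.grid = image_size // patch_size
        # Patch embedding as a strided convolution, as is common for Vision Transformers
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch_size, stride=patch_size)
        self.pos = nn.Parameter(torch.zeros(1, self.grid ** 2, dim))
        layer = nn.TransformerEncoderLayer(
            d_model=dim, nhead=heads, dim_feedforward=2 * dim, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.cff = CulturalFeatureFusion(dim)

    def forward(self, images):
        x = self.patch_embed(images).flatten(2).transpose(1, 2) + self.pos
        x = self.cff(self.encoder(x), self.grid)
        return x.mean(dim=1)  # mean-pool patch tokens: (batch, dim)


class ACVTBiLSTM(nn.Module):
    """Encodes each display frame, then models the frame sequence with a BiLSTM."""

    def __init__(self, dim=64, hidden=32):
        super().__init__()
        self.hidden = hidden
        self.encoder = ACVTEncoder(dim=dim)
        self.bilstm = nn.LSTM(dim, hidden, batch_first=True, bidirectional=True)
        self.pad_head = nn.Linear(2 * hidden, 3)     # pleasure, arousal, dominance
        self.intent_head = nn.Linear(2 * hidden, 1)  # purchase-intention score

    def forward(self, frames):
        # frames: (batch, time, 3, height, width)
        b, t = frames.shape[:2]
        feats = self.encoder(frames.flatten(0, 1)).reshape(b, t, -1)
        out, _ = self.bilstm(feats)  # (batch, time, 2 * hidden)
        # Final forward state sits at the last step, final backward state at the first
        summary = torch.cat([out[:, -1, :self.hidden], out[:, 0, self.hidden:]], dim=-1)
        return {
            "pad": self.pad_head(summary),
            "purchase_intention": torch.sigmoid(self.intent_head(summary)).squeeze(-1),
        }
```

For example, calling ACVTBiLSTM() on a tensor of shape (2, 5, 3, 32, 32), that is, two sequences of five 32x32 RGB frames, yields a "pad" tensor of shape (2, 3) and a "purchase_intention" tensor of shape (2,) with values in [0, 1]. A valence classification head, as evaluated in the reported experiments, could be added in the same way as the two heads shown; the number of valence classes is not stated in the abstract, so it is omitted here.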
