Recently, Associate Professor Hengyun (Neil) Li from The Hong Kong Polytechnic University delivered an academic lecture titled “Survival Through Text-Image Fusion and Joint Learning: A Multi-modal Model for Predicting and Explaining Restaurant Success” for faculty and students of the Integrated Resort and Tourism Management Department.
Professor Li pointed out that restaurants are typical small and medium-sized enterprises whose operational decision-making relies heavily on survival prediction. However, existing research often focuses on a single data modality and lacks intuitive explanations. To address this, his team proposed an analytical framework that integrates multi-modal online review data, including text and images, and introduces a joint learning method that enables information sharing between the prediction task and the interpretation task. By leveraging generative machine learning techniques, the framework can also provide human-like textual explanations, thereby enhancing the interpretability of the prediction results.
Professor Li elaborated on the data annotation method: the opening and closing years of restaurants are determined by analyzing the timestamps of online reviews, and closure times are identified through a combination of keyword matching and manual annotation. In terms of research design, the team systematically conducted five sets of experiments to comprehensively examine the capabilities of the proposed “IMRSP model” in predicting and interpreting restaurant survival. The results confirmed that deep interaction and fusion of text and images can effectively improve prediction performance, and that the information flow from interpretation to prediction plays a more critical role. This breaks through the limitations of previous studies that merely combined multi-modal data in a simplistic manner, highlighting the value of deep integration and bidirectional interaction.
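The annotation idea described above can be illustrated with a minimal Python sketch. It is not the team's actual procedure: it assumes the opening year can be approximated by the year of the earliest review, the closure keyword list is hypothetical, and the function name `annotate_restaurant` is invented for illustration. Keyword hits are only flagged as candidates, since the lecture described manual annotation as a second step.

```python
from datetime import date

# Hypothetical closure keywords; the lecture summary does not list the real ones.
CLOSURE_KEYWORDS = ("permanently closed", "closed down", "out of business", "shut down")


def annotate_restaurant(reviews):
    """Infer an opening year and a candidate closing year from (date, text) reviews.

    Assumption: the opening year is approximated by the earliest review's year.
    The candidate closing year is the year of the earliest review that contains
    a closure keyword, or None if no review mentions one.
    """
    if not reviews:
        raise ValueError("at least one review is required")

    # Sort by timestamp so the first element is the earliest review.
    ordered = sorted(reviews, key=lambda r: r[0])
    opening_year = ordered[0][0].year

    closing_year = None
    for review_date, text in ordered:
        lowered = text.lower()
        if any(keyword in lowered for keyword in CLOSURE_KEYWORDS):
            closing_year = review_date.year
            break

    return {
        "opening_year": opening_year,
        "closing_year": closing_year,
        # A keyword hit is only a candidate; a human annotator should confirm it.
        "needs_manual_check": closing_year is not None,
    }


reviews = [
    (date(2016, 3, 14), "Great noodles, friendly staff."),
    (date(2018, 7, 2), "Still good, a bit pricey now."),
    (date(2019, 11, 20), "Sad to see this place has permanently closed."),
]
result = annotate_restaurant(reviews)
print(result)
```

In this sketch, a restaurant with no keyword hit is simply left without a closing year, which mirrors the idea that keyword matching narrows the candidates and manual annotation makes the final call.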
Professor Li candidly shared the current limitations of the research and the directions they suggest for future work, such as enriching the data dimensions, extending the study to more cities in Hong Kong and the United States, and incorporating the impact of external competition and environmental factors.
During the interactive session, Professor Li engaged in lively discussions with faculty and students on topics such as model construction and finding academic inspiration. He encouraged researchers to read widely and stay curious about the real world, drawing scientific questions from the phenomena they observe.
This seminar featured cutting-edge content and novel perspectives, not only showcasing the innovative application of artificial intelligence in tourism management research but also providing practical insights for the industry regarding operational monitoring and risk prediction. It received positive feedback from the attendees.


