Abstract. This article investigates how photographic images in Telegram discourse visually frame intercultural interaction. The research aims to identify the models of visual framing and the mechanisms by which photographic images trigger the «conflict» frame and shape ideas about inter-ethnic contradictions. The empirical basis of the research comprises 527 images shared on a Telegram channel dedicated to «News and Media» from January to July 2024. The methodology draws on discourse analysis, social semiotics, framing theory, and the theory of cognitive-discursive world-modelling. The results reveal three models of visual framing. The dominant model, «neutral image + conflictogenic linguistic mode», is based on semantic re-coding: the image does not visualize the conflict, while the propositional content of the linguistic substance conveys ideas about contradictions. The «conflictogenic image + conflictogenic linguistic mode» model depicts a conflict whose effect is enhanced by the emotional impact of the verbal component. The «neutral pictorial substance + neutral linguistic substance» model reveals latent conflicts that arise under the influence of outwardly neutral components, and it indicates the potential for ambiguous interpretations of the content. The authors argue that the regular application of these models in Telegram communication forms structured ideas about migrant workers among readers and leads to the formation of a representational structure that ensures the problematization of migration and conflictogenic world-modelling. These findings contribute to discourse analysis and support a critical approach to the representation of intercultural interaction in the digital environment.