Team News

Gong Jianya, Zhang Mi, and colleagues make progress on full-stack, independently controllable large models for intelligent remote sensing interpretation and on multi-agent systems

10 December 2024

Associate Researcher Zhang Mi, as first author, and Academician Gong Jianya, as corresponding author, recently published a research paper titled "Foundation model for generalist remote sensing intelligence: potentials and prospects" in Science Bulletin (IF=18.8). The paper systematically reviews the progress made in China and internationally over the past five years on large remote sensing models for scene classification and retrieval, object detection, land cover classification (semantic segmentation), change detection, video tracking, and geoscience applications. It proposes a unified computing framework for large remote sensing models comprising five parts: domain-specific deep learning frameworks and sample libraries, fusion and fine-tuning with multimodal geoscience domain knowledge, a bidirectional human-machine feedback mechanism, quality and reliability evaluation, and transparent downstream application models. Based on this framework, the research team developed LuoJia SmartSensing, a multimodal, multi-task large remote sensing model with 2.8 billion parameters. The prototype system has been successfully deployed at Shandong Haiyang Dongfang Aerospace Port.

Figure 1  Unified computing framework for general-purpose large remote sensing models

Another study, "TopoSense: Agent Driven Topological Graph Extraction from Remote Sensing Images", was presented at the ISPRS Technical Commission III Mid-term Symposium on Remote Sensing. This work builds on the research group's CVPR 2023 paper, "TopDiG: Class-agnostic Topological Directional Graph Extraction from Remote Sensing Images", and further explores the use of visual agents for vector extraction from remote sensing images. Unlike current agent methods driven by language models, TopoSense treats the "point, line, and polygon" elements of ground objects in remote sensing images as visual agent primitives, and uses these primitives to automatically discover and correct broken or discontinuous features. It also draws on pre-trained foundation models to address the limited generalization of learned features, enabling online updating and robust extraction of vector ground features.


Figure 2  Schematic of the TopoSense visual-agent vector extraction model

The LuoJiaNet intelligent remote sensing interpretation team, guided by Academician Gong Jianya and Professor Hu Xiangyun and led by Associate Researcher Zhang Mi, has systematically addressed the bottleneck of achieving independent control over the full stack of large models for intelligent remote sensing interpretation. The team collaborated with Huawei to establish the open-source LuoJiaNet deep learning framework and the LuoJiaSet sample library; built LuoJia SmartSensing, a multimodal, multi-task large remote sensing model with 2.8 billion parameters, and deployed it in collaboration with Baidu at Shandong Haiyang Dongfang Aerospace Port; and built the LuoJiaNet remote sensing AI practical training platform for digital education. These results have been put into use at organizations including the Guangdong Provincial Land and Resources Technology Center, the Guangzhou Urban Planning Survey and Design Research Institute, and the Zhejiang Provincial Institute of Surveying and Mapping Science.

The co-authors of the research above include Academician Zhang Zuxun, Professor Hu Xiangyun, and doctoral student Yang Bingnan, among others. The work was supported by projects including the National Natural Science Foundation of China, the Ministry of Education's Integrated Research and Development Platform, the Hubei Provincial Key Research and Development Program, and the Hubei Luojia Laboratory Fund.

Related papers and supplementary materials:

Zhang M, Yang B, Hu X, Gong J#, Zhang Z. Foundation model for generalist remote sensing intelligence: potentials and prospects. Science Bulletin. 2024 Sep 19. (Supplementary: https://ars.els-cdn.com/content/image/1-s2.0-S2095927324006510-mmc1.pdf)

Zhang M, Yang B, Gong J, Hu X. TopoSense: agent driven topological graph extraction from remote sensing image. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences. 2024 Nov 4;10:445-52.

Yang B, Zhang M#, Zhang Z, Zhang Z, Hu X. TopDiG: Class-agnostic Topological Directional Graph Extraction from Remote Sensing Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2023 (pp. 1265-1274).

Yang B, Zhang M#, Zhang Z, Zhao Y, Gong J. UniVecMapper: A universal model for thematic and multi-class vector graph extraction. International Journal of Applied Earth Observation and Geoinformation. 2024 Jun 1;130:103915.