
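A Fine-grained, Real-Time Computing Resource Management System
Cite this article: Wang Bin, Zong Xiang, Wei Min. A Fine-grained, Real-Time Computing Resource Management System [J]. 应用气象学报 (Quarterly Journal of Applied Meteorology), 2008, 19(4): 507-512.
Authors: Wang Bin, Zong Xiang, Wei Min
Affiliation: Computer Division, National Meteorological Information Center, Beijing 100081
Funding: Ministry of Science and Technology Basic Conditions Platform Program; China Meteorological Administration Key Science and Technology Project
Abstract: Because the corresponding operational software is lacking, resource management for the national meteorological high performance computers has fallen behind the growth in computing capability. In response, this paper proposes a fine-grained, real-time computing resource management system. The design centers on computing resources, currently the most heavily contended resource, and uses a virtual resource unit, the GCU, as the unit of measurement for resource usage. This hides the architectural differences among high performance computer systems and enables fine-grained, uniform quantitative accounting of computing resources. The system consists of three layers: the user interface layer, the resource management layer, and the HPC system layer. It runs in one of two modes, depending on how it is combined with the Grid platform software. The system was developed, deployed, and trial-run at the National Meteorological Information Center, and partial data from the trial run were used for a statistical analysis of computing resource usage by user organizations and individual users. The system is now in successful use for the operational management of computing resources on the national meteorological high performance computers.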
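As an illustration of how a virtual unit such as the GCU can put usage from different machines on one scale, here is a minimal Python sketch. The abstract does not define how a GCU is computed, so everything here is an assumption: the system names `system_a` and `system_b`, the conversion weights in `GCU_PER_CPU_HOUR`, and the helper names `JobRecord`, `job_gcus`, and `usage_by_organization` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical conversion weights: GCUs charged per CPU-hour on each system.
# The source does not publish the real factors; these values are placeholders.
GCU_PER_CPU_HOUR = {
    "system_a": 1.0,
    "system_b": 0.5,
}


@dataclass(frozen=True)
class JobRecord:
    """One finished job as a scheduler accounting log might report it."""
    user: str
    organization: str
    system: str
    cpus: int
    wall_seconds: int


def job_gcus(job: JobRecord) -> float:
    """Convert a job's CPU-hours into GCUs using the per-system weight."""
    cpu_hours = job.cpus * job.wall_seconds / 3600
    return cpu_hours * GCU_PER_CPU_HOUR[job.system]


def usage_by_organization(jobs) -> dict:
    """Sum GCU usage per organization across all systems."""
    totals = {}
    for job in jobs:
        totals[job.organization] = totals.get(job.organization, 0.0) + job_gcus(job)
    return totals
```

Because every job is converted before it is summed, per-user or per-organization totals can be compared directly even when the jobs ran on machines with different architectures.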

Keywords: national meteorological high performance computer resources; resource management; GCU; real-time; fine-grained
Received: 2007-07-26
Revised: 4/7/2008

A Fine-grained, Real Time HPC Resource Management System
Wang Bin, Zong Xiang and Wei Min. A Fine-grained, Real Time HPC Resource Management System [J]. Quarterly Journal of Applied Meteorology, 2008, 19(4): 507-512.
Authors: Wang Bin, Zong Xiang and Wei Min
Institution: Computer Division, National Meteorological Information Center, Beijing 100081
Abstract: In contrast to the rapid growth in computing capability, resource management for the national meteorological high performance computers has lagged behind. The absence of operational resource management software keeps system administrators from knowing in detail what is happening on these machines and from exerting effective control over resource allocation. To address these problems, a fine-grained, real-time high performance computer resource management system with cross-cluster (Grid) support is proposed. The system focuses on CPU hours, the resource under the keenest competition. By introducing the GCU (General Computing Unit), a resource virtualization unit, to measure computing resources, the system hides the differences among computing resources on different high performance computer systems and enables fine-grained, uniform quantitative management. Target users include resource users, leaders of user organizations, resource system administrators, and decision makers. The system comprises three layers: user interfaces, resource management, and high performance computer systems. The resource management layer, the primary layer, is divided into the resource accounting and allocation manager, the Grid platform, and the resource information database. Built on open source software from supercomputing centers abroad, a Grid project funded by MOST, and an RDBMS, the system has been implemented, deployed, and run experimentally at the National Meteorological Information Center. The fundamental functions of resource accounting and allocation management have been implemented, including cluster job accounting, resource account management, and the management, allocation, and querying of users and organizations, with a command-line interface for users.
PostgreSQL serves as the resource information database, in which tables are created for accounts, users, organizations, computer systems, job records, and the accounting and allocation relations. The software has been deployed on the three partitions of the IBM high performance computer system, the Sunway 32I cluster, the Sunway 32P cluster, and the IBM SP system, working with the LoadLeveler and PBS schedulers. User information on the national meteorological high performance computer systems has been sorted and updated, yielding uniform UIDs and GIDs, and loaded into the database. Two levels of management are established: organizations (projects) and individuals. Computing resources are allocated evenly among user organizations, with the amount allocated set to 200 percent of the total available resource measured in GCUs. Individual users can use only the resources allocated to their own department. Allocations are valid for one season. Overdraft is allowed. Based on partial data collected during the experimental run, initial statistical analyses examine resource usage by user organizations and individuals. The high performance computer resource management system is now in operational use and has been successfully applied to operational management.
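The allocation policy described in the abstract (an even split among organizations, a budget of 200 percent of the available GCUs, and overdraft allowed) can be sketched in Python as follows. This is one possible reading of the abstract, not the system's actual implementation: the names `OVERCOMMIT_FACTOR`, `even_allocation`, and `Account` are hypothetical, and the seasonal expiry of allocations is left out.

```python
from dataclasses import dataclass

# From the abstract: the amount handed out is 200 percent of what is available.
OVERCOMMIT_FACTOR = 2.0


def even_allocation(total_available_gcus: float, organizations: list) -> dict:
    """Split the overcommitted GCU budget evenly among organizations."""
    if not organizations:
        raise ValueError("at least one organization is required")
    share = total_available_gcus * OVERCOMMIT_FACTOR / len(organizations)
    return {org: share for org in organizations}


@dataclass
class Account:
    """GCU account of one organization; members charge their jobs against it."""
    organization: str
    allocated: float
    used: float = 0.0

    def charge(self, gcus: float) -> None:
        # Overdraft is allowed, so a charge is never rejected;
        # the balance is simply permitted to go negative.
        self.used += gcus

    @property
    def balance(self) -> float:
        return self.allocated - self.used

    @property
    def overdrawn(self) -> bool:
        return self.balance < 0
```

For example, with 1000 GCUs available and four organizations, each organization receives 500 GCUs. Overcommitting in this way assumes that not every organization will spend its full share within the validity period.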
Keywords: GCU
This article is indexed in databases including 维普 (VIP) and 万方数据 (Wanfang Data).