EF Portal and DCV HA Solution
### Node overview

```
192.168.1.26 rhel-dcv            # DCV Server node
192.168.1.25 rhel-efserver       # EF Portal master node
192.168.1.28 rhel-efserver-bak   # EF Portal backup node
192.168.1.29 rhel-mariadb        # Broker's db, EF Portal's data (/efp-data) and EF Portal's mysql
```

### Keepalived

Health checks monitor the Nginx process and the DCV SM Broker service.

```
# master node rhel-efserver
global_defs {
    router_id m01
    vrrp_skip_check_adv_addr
    # vrrp_strict    # commented out; otherwise the VIP cannot be pinged in some network environments
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

# Script 1: check the Nginx and Broker services
vrrp_script check_services {
    script "pidof nginx && systemctl is-active dcv-session-manager-broker"
    interval 2
    weight -20    # on failure, lower the priority by 20
}

# Script 2: check the MySQL service status
vrrp_script check_mysql {
    script "mysqladmin ping -u root -p'Ins@1234'"
    interval 2
    weight -30    # on failure, lower the priority by 30
}

vrrp_instance VI_1 {
    state MASTER            # initial state of the master node
    interface eth2          # confirm the NIC name is correct (e.g. ens33, eth0)
    virtual_router_id 51    # must be identical on master and backup
    priority 100            # master priority (must be higher than the backup node)
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass dcv_ha_pwd
    }
    virtual_ipaddress {
        192.168.1.100       # virtual IP (VIP)
    }
    track_script {
        check_services
        check_mysql
    }
}
```

```
# bak node rhel-efserver-bak
global_defs {
    router_id m02           # should differ from the master node
    vrrp_skip_check_adv_addr
    # vrrp_strict
    vrrp_garp_interval 0
    vrrp_gna_interval 0
}

vrrp_script check_services {
    script "pidof nginx && systemctl is-active dcv-session-manager-broker"
    interval 2
    weight -20
}

vrrp_script check_mysql {
    script "mysqladmin ping -u root -p'Ins@1234'"
    interval 2
    weight -30
}

vrrp_instance VI_1 {
    state BACKUP            # initial state of the backup node
    interface eth2          # must match the physical NIC
    virtual_router_id 51
    priority 90             # lower than the master node
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass dcv_ha_pwd
    }
    virtual_ipaddress {
        192.168.1.100
    }
    track_script {
        check_services
        check_mysql
    }
}
```

### DCV Session Manager

The master and backup nodes use the same MariaDB database, `dcvsmdb`.

```
# master node rhel-efserver
persistence-db = mysql
jdbc-connection-url = jdbc:mysql://localhost:3306/dcvsmdb
jdbc-user = dcvtest
jdbc-password = Ins@1234

# bak node rhel-efserver-bak
persistence-db = mysql
jdbc-connection-url = jdbc:mysql://localhost:3306/dcvsmdb
jdbc-user = dcvtestbak
jdbc-password = Ins@1234
```

### DCV Connection Gateway

Each node points the resolver at its own address. Make sure Keepalived is already configured correctly.

```
# master node rhel-efserver
[resolver]
url = "https://rhel-efserver:8447"
ca-file = "/opt/ca.pem"

# bak node rhel-efserver-bak
[resolver]
url = "https://rhel-efserver-bak:8447"
ca-file = "/opt/ca.pem"
```

### DCV Server

The authentication endpoint uses the VIP. Make sure the ca-file is the same certificate that DCV Connection Gateway uses.

```
# node rhel-dcv
[security]
administrators=["dcvsmagent"]
ca-file="/etc/dcv/dcv.pem"
auth-token-verifier="https://192.168.1.100:8445/agent/validate-authentication-token"
no-tls-strict=true
```

### DCV Agent

The broker host is set to the VIP.

```
# node rhel-dcv
[agent]

# hostname or IP of the broker. This parameter is mandatory.
broker_host = '192.168.1.100'

# The port of the broker. Default: 8445
#broker_port =

# CA used to validate the certificate of the broker.
ca_file = '/etc/dcv/dcv.pem'

# Set to false to accept invalid certificates. True by default.
tls_strict = false
```

### Nginx

The master and backup nodes (rhel-efserver, rhel-efserver-bak) use the same configuration.

```
server {
    listen       80;
    server_name  localhost;

    #charset koi8-r;

    access_log  logs/host.access.log;

    location / {
        proxy_pass http://127.0.0.1:8080;

        # Pass the Host header through so it matches the address in the user's browser
        proxy_set_header Host $http_host;

        # Pass the real client IP to the backend
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Tell the backend the original protocol (HTTP or HTTPS)
        proxy_set_header X-Forwarded-Proto $scheme;

        # EnginFrame's CSRF check also needs the Referer header passed through
        proxy_set_header Referer $http_referer;

        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
}
```

### EF Portal

The master and backup nodes mount the same EF Portal installation directory, which is specified at installation time.
Make sure the Java version and the Java path are identical on both nodes.

* efinstall.config

```
######################################################################
# EF Portal Spoolers
# Choose the location for the EF Portal spoolers
######################################################################
# Spoolers directory
ef.spooler.dir = /efp-data/enginframe/spoolers

######################################################################
# EF Portal Repository
# Choose the location for the EF Portal repository
######################################################################
# Repositories directory
ef.repository.dir = /efp-data/enginframe/repository

######################################################################
# EF Portal Sessions
# Choose the location for the EF Portal sessions
######################################################################
# Sessions directory
ef.sessions.dir = /efp-data/enginframe/sessions

######################################################################
# EF Portal Data
# Choose the location for the EF Portal data directory
######################################################################
# Data directory
ef.data.root.dir = /efp-data/enginframe/data

######################################################################
# EF Portal Logs and Temp
# Choose the location for the EF Portal logs and temp directories
######################################################################
# Logs directory
ef.logs.root.dir = /efp-data/enginframe/logs
# Temp directory
ef.temp.root.dir = /efp-data/enginframe/tmp
```

* Mounting efp-data

```
# master node rhel-efserver
[root@rhel-efserver conf]# df -h
Filesystem              Size  Used Avail Use% Mounted on
devtmpfs                3.8G     0  3.8G   0% /dev
tmpfs                   3.8G     0  3.8G   0% /dev/shm
tmpfs                   3.8G   18M  3.8G   1% /run
tmpfs                   3.8G     0  3.8G   0% /sys/fs/cgroup
/dev/mapper/rhel-root    35G  9.6G   26G  28% /
/dev/xvda1             1014M  265M  750M  27% /boot
tmpfs                   766M   12K  766M   1% /run/user/42
tmpfs                   766M  4.0K  766M   1% /run/user/0
rhel-mariadb:/efp-data   35G  7.4G   28G  22% /efp-data
tmpfs                   766M     0  766M   0% /run/user/1110
tmpfs                   766M  4.0K  766M   1% /run/user/1000
[root@rhel-efserver ~]# ll /efp-data/
total 0
drwxrwxr-x. 6 efnobody efnobody 61 Jan  8 11:59 data
drwx------. 4 efnobody efadmin  52 Jan  8 13:52 logs
drwxr-x---. 3 efnobody root     21 Jan  8 14:10 repository
drwxrwsrwt. 3 efnobody efnobody 21 Jan  8 14:17 sessions
drwxr-xr-x. 3 efnobody root     21 Jan  8 14:10 spoolers
drwxr-xr-x. 4 efnobody root     52 Jan  8 13:52 tmp

# bak node rhel-efserver-bak
[root@rhel-efserver-bak ~]# df -h
Filesystem              Size  Used Avail Use% Mounted on
devtmpfs                3.8G     0  3.8G   0% /dev
tmpfs                   3.8G     0  3.8G   0% /dev/shm
tmpfs                   3.8G  265M  3.5G   7% /run
tmpfs                   3.8G     0  3.8G   0% /sys/fs/cgroup
/dev/mapper/rhel-root    35G  9.5G   26G  28% /
/dev/xvda1             1014M  265M  750M  27% /boot
tmpfs                   766M   12K  766M   1% /run/user/42
/dev/sr0                 12G   12G     0 100% /media/iso
rhel-mariadb:/efp-data   35G  7.5G   28G  22% /efp-data
tmpfs                   766M  4.0K  766M   1% /run/user/0
[root@rhel-efserver-bak ~]# ll /efp-data/
total 0
drwxrwxr-x. 6 efnobody efnobody 61 Jan  8 16:51 data
drwx------. 3 efnobody efadmin  27 Jan  8 17:01 logs
drwxr-x---. 3 efnobody root     21 Jan  8 17:09 repository
drwxrwsrwt. 2 efnobody efnobody  6 Jan  8 16:51 sessions
drwxr-xr-x. 3 efnobody root     21 Jan  8 17:09 spoolers
drwxr-xr-x. 3 efnobody root     27 Jan  8 17:01 tmp
```

* Identical mysql-connector version on both nodes

```
[root@rhel-efserver ~]# find /efp-data/ -name mysql-connector-j-8.0.32.jar
/efp-data/enginframe/2025.2-r2056/enginframe/WEBAPP/WEB-INF/lib/mysql-connector-j-8.0.32.jar

[root@rhel-efserver-bak ~]# find /efp-data/ -name mysql-connector-j-8.0.32.jar
/efp-data/enginframe/2025.2-r2056/enginframe/WEBAPP/WEB-INF/lib/mysql-connector-j-8.0.32.jar
```

* dcvsm plugin configuration

Set the ENDPOINT values to the Keepalived VIP.
Set AUTH_ID and AUTH_PASSWORD to the API client credentials that this node registered with `dcv-session-manager-broker register-api-client`.

```
# master node rhel-efserver
# Configuration for cluster dcvsm_cluster1
DCVSM_CLUSTER_dcvsm_cluster1_AUTH_ID=aaed74c5-d81d-4239-98f0-690068a06d0b
DCVSM_CLUSTER_dcvsm_cluster1_AUTH_PASSWORD=OTAxNDQ1YWItZGYzZi00Nzc3LThhNGEtN2YwMWE5MTUwMDMz
DCVSM_CLUSTER_dcvsm_cluster1_AUTH_ENDPOINT=https://192.168.1.100:8448/oauth2/token
DCVSM_CLUSTER_dcvsm_cluster1_SESSION_MANAGER_ENDPOINT=https://192.168.1.100:8448
DCVSM_CLUSTER_dcvsm_cluster1_NO_STRICT_TLS=true

# bak node rhel-efserver-bak
# Configuration for cluster dcvsm_cluster1
DCVSM_CLUSTER_dcvsm_cluster1_AUTH_ID=5e2ea15e-331e-4196-b142-234dc4e6d6ae
DCVSM_CLUSTER_dcvsm_cluster1_AUTH_PASSWORD=MTJlOGYxNWMtNzYwNi00YWI3LWFjM2MtZGQzY2U1NjUzZmY4
DCVSM_CLUSTER_dcvsm_cluster1_AUTH_ENDPOINT=https://192.168.1.100:8448/oauth2/token
DCVSM_CLUSTER_dcvsm_cluster1_SESSION_MANAGER_ENDPOINT=https://192.168.1.100:8448
DCVSM_CLUSTER_dcvsm_cluster1_NO_STRICT_TLS=true
```

* Tomcat Server configuration

Edit /efp-data/enginframe/conf/tomcat/conf/server.xml and add a RemoteIpValve inside the Host element.

```
<Host workDir="${catalina.workdir}" name="localhost" appBase="webapps"
      unpackWARs="true" autoDeploy="false">

  <!-- SingleSignOn valve, share authentication between web applications
       Documentation at: /docs/config/valve.html -->
  <!--
  <Valve className="org.apache.catalina.authenticator.SingleSignOn" />
  -->

  <Valve className="org.apache.catalina.valves.RemoteIpValve"
         remoteIpHeader="x-forwarded-for"
         protocolHeader="x-forwarded-proto"
         internalProxies="127\.0\.0\.1" />

  <!-- Access log processes all example.
       Documentation at: /docs/config/valve.html
       Note: The pattern used is equivalent to using pattern="common" -->
  <!-- Modified by EnginFrame Installer -->
  <Valve className="org.apache.catalina.valves.AccessLogValve" directory="${catalina.logdir}"
         prefix="localhost_access_log" suffix=".txt"
         pattern="combined" maxDays="60" resolveHosts="false"/>

  <!-- Added by EnginFrame Installer -->
  <Valve className="org.apache.catalina.valves.ErrorReportValve" showReport="false" showServerInfo="false"/>

</Host>
```

* nat.conf configuration

The VIP (192.168.1.100) can also be mapped to a domain name.

```
rhel-dcv:8443 192.168.1.100:8443
```

* /efp-data/enginframe/conf/enginframe/server.conf configures the MySQL database (generated after installation when an external database is specified)

```
EF_DEFAULT_AUTHORITY=pam
ef.db.url=jdbc:mysql://localhost:3306/efpdb
EF_ADMIN=efadmin

# Enable or disable recording of audit events to a log file.
# Log file is located in EF_TOP/logs/<hostname>/audit.log, e.g. /opt/nisp/enginframe/logs/myhost/audit.log
# Server restart is required for changes to take effect.
# Syntax: ef.audit.logfile.enabled=[true | false]
# Default: true
ef.audit.logfile.enabled=true

EF_KEY_STORE=${EF_CONF_ROOT}/enginframe/certs/server.keystore
EF_KEY_STORE_PASSWORD=60d2a53b95c0dcceb1aafea64762c1b6002faf13
EF_TRUST_STORE=${EF_CONF_ROOT}/enginframe/certs/server.truststore
EF_TRUST_STORE_PASSWORD=60d2a53b95c0dcceb1aafea64762c1b6002faf13
EF_DB_KEY_STORE_ENABLED=true
EF_DB_KEY_STORE=${EF_CONF_ROOT}/enginframe/certs/db.keystore
```

### MySQL (Master-Master) configuration

Master node

```
[mysqld]
# --- Basic settings ---
server-id = 25                      # must be unique; the last octet of the IP is a good choice
port = 3306
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock

# --- Core replication settings ---
log-bin = mysql-bin                 # enable binary logging
binlog-ignore-db=mysql
binlog_format = row                 # ROW format is recommended; it is the safest for data
relay-log = relay-bin
log_slave_updates = 1               # key: replicated changes are also written to this node's own binlog

# --- Prevent primary key conflicts (required for master-master) ---
auto_increment_increment = 2        # step of 2
auto_increment_offset = 1           # start at 1 (generates IDs 1, 3, 5...)

# --- High availability and safety settings ---
read_only = 0                       # the master starts out read-write
skip_name_resolve = 1               # disable DNS resolution to speed up connections
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
```

Backup node

```
[mysqld]
# --- Basic settings ---
server-id = 28                      # must be unique
port = 3306
datadir = /var/lib/mysql
socket = /var/lib/mysql/mysql.sock

# --- Core replication settings ---
log-bin = mysql-bin
binlog-ignore-db=mysql
binlog_format = row
relay-log = relay-bin
log_slave_updates = 1               # key: allows cascading replication

# --- Prevent primary key conflicts (required for master-master) ---
auto_increment_increment = 2        # step of 2
auto_increment_offset = 2           # start at 2 (generates IDs 2, 4, 6...)

# --- High availability and safety settings ---
read_only = 0
skip_name_resolve = 1
innodb_flush_log_at_trx_commit = 1
sync_binlog = 1
```

Run the following on both nodes (each node is in fact a replica of the other).

```
# Run on 1.25
CREATE USER 'repl'@'%' IDENTIFIED WITH 'mysql_native_password' BY 'Ins@1234';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

# Point to the other node as master
# Run on 1.28:
CHANGE MASTER TO master_host='192.168.1.25',master_user='repl',master_password='Ins@1234';
START SLAVE;

# Run on 1.28
CREATE USER 'repl'@'%' IDENTIFIED WITH 'mysql_native_password' BY 'Ins@1234';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

# Point to the other node as master
# Run on 1.25:
CHANGE MASTER TO master_host='192.168.1.28',master_user='repl',master_password='Ins@1234';
START SLAVE;
```
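The `auto_increment_increment` and `auto_increment_offset` values in the two my.cnf files above are what keep the two writable nodes from generating the same primary key. The following is a minimal Python sketch of that arithmetic only. It is a simplified model, not MySQL's implementation: the helper `auto_increment_ids` is a hypothetical name introduced for illustration, and a real server picks the next value in its series above the current maximum, so gaps can appear.

```python
from itertools import islice


def auto_increment_ids(offset, increment):
    """Yield offset, offset + increment, offset + 2 * increment, ...

    Hypothetical helper: models the ID series a node produces for a given
    auto_increment_offset / auto_increment_increment pair.
    """
    current = offset
    while True:
        yield current
        current += increment


# Values taken from the master (offset 1) and backup (offset 2) my.cnf files.
master_ids = list(islice(auto_increment_ids(offset=1, increment=2), 5))
backup_ids = list(islice(auto_increment_ids(offset=2, increment=2), 5))

print(master_ids)                          # [1, 3, 5, 7, 9]
print(backup_ids)                          # [2, 4, 6, 8, 10]
print(set(master_ids) & set(backup_ids))   # set()
```

Because both nodes share the same step and use different offsets, the two series never overlap, so rows inserted on either node can replicate to the other without a duplicate-key error on an auto-increment column.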
xxnie
January 11, 2026, 19:57