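Ansible in the Enterprise: From Installation to Policy Rollout at a Mid-Sized Company (with Commands)

This article uses mid-sized enterprise IT operations as its scenario. It sets up a fictional but typical business and network topology and shows how to take Ansible from zero to a working deployment: installation, layered planning, base configuration, networking and security, application rollout, and monitoring integration. It is aimed at beginners and provides commands and inventory samples that can be reused in practice.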


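1. Scenario and Topology (Fictional Company)

  • Headquarters (HQ) and one data center (DC); two environments (prod/dev); a single-hop bastion host (bastion)
  • Services: Nginx web front end, Java application, PostgreSQL database, Redis, object storage gateway
  • Networks and VLANs (examples)
    • Management network 10.10.0.0/24 (bastion/jump host and the Ansible control node)
    • Production zone 10.10.10.0/24, development zone 10.10.20.0/24
    • Database zone 10.10.30.0/24, storage/backup zone 10.10.40.0/24

ASCII topology (simplified):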

Laptop -> VPN -> Bastion(10.10.0.10) -> { prod, dev }
                                   \-> DB(10.10.30.x)
                                     -> Web/App(10.10.10.x / 20.x)

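2. Installation and Basic Preparation

2.1 Install Ansible (virtual environment recommended)

  • Ubuntu/Debian
sudo apt update && sudo apt install -y python3-venv python3-pip
python3 -m venv ~/.venvs/ansible && source ~/.venvs/ansible/bin/activate
# Quote the version spec so the shell does not treat * as a glob
pip install --upgrade pip && pip install 'ansible==9.*' ansible-lint
ansible --version
  • RHEL/CentOS/Rocky
# Ansible 9 (ansible-core 2.16) needs Python 3.10+ on the control node;
# the python3.11 package names assume your repositories provide them
sudo dnf install -y python3.11 python3.11-pip
python3.11 -m venv ~/.venvs/ansible && source ~/.venvs/ansible/bin/activate
pip install --upgrade pip && pip install 'ansible==9.*'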

2.2 Directory Layout (recommended)

enterprise-ansible/
├─ ansible.cfg
├─ inventories/
│  ├─ prod/hosts.ini
│  └─ dev/hosts.ini
├─ group_vars/{all,prod,dev}.yml
├─ host_vars/
├─ roles/
│  ├─ base_hardening/{tasks,handlers,templates,files}
│  ├─ linux_network/{tasks,handlers,templates}
│  ├─ web_nginx/{tasks,handlers,templates}
│  ├─ db_postgresql/{tasks,templates}
│  └─ exporters_node/{tasks,templates}
└─ site.yml

2.3 ansible.cfg (example)

[defaults]
inventory = inventories/prod/hosts.ini
forks = 20
timeout = 30
host_key_checking = False
interpreter_python = auto_silent
retry_files_enabled = False
stdout_callback = yaml

2.4 SSH and Privilege Escalation

# ~/.ssh/config (bastion)
Host bastion
  HostName 10.10.0.10
  User ops
Host 10.10.*.*
  ProxyJump bastion

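# Target hosts must grant the ops user sudo rights. Passwordless sudo is the simplest option;
# Ansible runs modules through a shell, so sudoers rules restricted to specific commands generally do not work.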

3. Inventory and Groups

inventories/prod/hosts.ini:

[web]
web01 ansible_host=10.10.10.11 env=prod
web02 ansible_host=10.10.10.12 env=prod

[app]
app01 ansible_host=10.10.10.21 env=prod

[db]
db01 ansible_host=10.10.30.31 env=prod pg_role=primary

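# Parent group, so that group_vars/prod.yml applies to every host in this inventory
[prod:children]
web
app
db

[all:vars]
ansible_user=ops
ansible_become=true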
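
The same inventory can also be written in Ansible's YAML inventory format, which makes the group hierarchy easier to see. The following is a minimal sketch, not part of the original layout: the file name inventories/prod/hosts.yml and the prod parent group are assumptions, and env is moved to a group variable instead of being repeated on each host.

```yaml
# inventories/prod/hosts.yml (assumed file name; equivalent to hosts.ini)
all:
  vars:
    ansible_user: ops
    ansible_become: true
  children:
    prod:                      # assumed parent group so group_vars/prod.yml applies
      vars:
        env: prod
      children:
        web:
          hosts:
            web01: { ansible_host: 10.10.10.11 }
            web02: { ansible_host: 10.10.10.12 }
        app:
          hosts:
            app01: { ansible_host: 10.10.10.21 }
        db:
          hosts:
            db01: { ansible_host: 10.10.30.31, pg_role: primary }
```

Running ansible-inventory -i inventories/prod/hosts.yml --graph should print the resulting group tree, which is a quick way to confirm the hierarchy before running a playbook.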

group_vars/all.yml (base parameters):

ntp_servers: [time1.aliyun.com, time2.aliyun.com]
timezone: Asia/Shanghai
ssh_allow_groups: [ops]
firewall_allowed:
  - { port: 22, proto: tcp }
  - { port: 80, proto: tcp }
  - { port: 443, proto: tcp }

group_vars/prod.yml (production preferences):

packages_common: [vim, curl, htop, chrony]

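4. Roles and Tasks (examples)

4.1 Baseline Hardening: roles/base_hardening/tasks/main.yml

- name: Set the time zone
  community.general.timezone:
    name: "{{ timezone }}"

- name: Install base packages
  ansible.builtin.package:
    name: "{{ packages_common }}"
    state: present

- name: Start and enable chrony
  ansible.builtin.service:
    # The unit is chrony on Debian/Ubuntu and chronyd on RHEL-family systems
    name: "{{ 'chrony' if ansible_facts['os_family'] == 'Debian' else 'chronyd' }}"
    state: started
    enabled: true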

- name: Harden sshd (disable password logins)
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?PasswordAuthentication'
    line: 'PasswordAuthentication no'
    backup: true
  notify: Restart sshd

- name: Configure the firewall (firewalld as an example)
  ansible.posix.firewalld:
    port: "{{ item.port }}/{{ item.proto }}"
    permanent: true
    immediate: true
    state: enabled
  loop: "{{ firewall_allowed }}"

- name: Save asset information (facts) to a file
  ansible.builtin.copy:
    content: "{{ ansible_facts | to_nice_yaml }}\n"
    dest: "/var/tmp/facts_{{ inventory_hostname }}.yml"
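
The task above writes the facts file onto each managed host. If the goal is a central copy for auditing, one option is to delegate the write to the control node. This is a sketch under assumptions: the facts/ directory next to the playbook is a made-up location, and become is switched off so the control node does not need sudo.

```yaml
# Sketch: keep a copy of each host's facts on the control node.
# The facts/ directory next to the playbook is an assumed location.
- name: Ensure the local facts directory exists
  ansible.builtin.file:
    path: "{{ playbook_dir }}/facts"
    state: directory
    mode: "0750"
  delegate_to: localhost
  become: false
  run_once: true

- name: Write this host's facts to the control node
  ansible.builtin.copy:
    content: "{{ ansible_facts | to_nice_yaml }}\n"
    dest: "{{ playbook_dir }}/facts/{{ inventory_hostname }}.yml"
    mode: "0640"
  delegate_to: localhost
  become: false
```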

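roles/base_hardening/handlers/main.yml:

- name: Restart sshd
  ansible.builtin.service:
    # The unit is ssh on Debian/Ubuntu and sshd on RHEL-family systems
    name: "{{ 'ssh' if ansible_facts['os_family'] == 'Debian' else 'sshd' }}"
    state: restarted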
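
Because a broken sshd_config can lock you out of a host, it may be worth validating the file before it replaces the live one. The variant below is a sketch of the hardening task using lineinfile's validate option; the /usr/sbin/sshd path is an assumption about where the binary lives on your distribution.

```yaml
# Sketch: same hardening task, but the edited file is syntax-checked first.
# If `sshd -t` rejects the temporary copy (%s), the original file is left untouched.
- name: Harden sshd with a syntax check before saving
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?PasswordAuthentication'
    line: 'PasswordAuthentication no'
    backup: true
    validate: /usr/sbin/sshd -t -f %s   # assumed binary path
  notify: Restart sshd
```

On distributions that ship drop-in files under /etc/ssh/sshd_config.d/, a drop-in may still set PasswordAuthentication differently, so checking the effective value with sshd -T after the run is a reasonable follow-up.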

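4.2 Network Configuration: roles/linux_network/tasks/main.yml (example)

  • Ubuntu (Netplan): create the VLAN interface vlan10:
- name: Render the netplan configuration
  ansible.builtin.template:
    src: netplan.yaml.j2
    dest: /etc/netplan/01-netcfg.yaml
    mode: "0600"
  # site.yml applies this role to all hosts, so skip systems that do not use netplan
  when: ansible_facts['os_family'] == 'Debian'
  notify: Apply netplan

roles/linux_network/handlers/main.yml: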

- name: Apply netplan
  ansible.builtin.command: netplan apply
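
The template netplan.yaml.j2 itself is not shown in this article. As a rough idea of what the rendered file could look like for vlan10, here is a sketch; the parent interface eth0, the address 10.10.10.11/24, and the gateway 10.10.10.1 are assumptions borrowed from the nmcli example in this section, and in a real template they would come from host variables.

```yaml
# Possible rendered result of netplan.yaml.j2 (illustrative values only)
network:
  version: 2
  ethernets:
    eth0: {}                 # parent interface, no address of its own here
  vlans:
    vlan10:
      id: 10                 # VLAN ID
      link: eth0             # parent interface carrying the tagged traffic
      addresses:
        - 10.10.10.11/24
      routes:
        - to: default        # older netplan releases may need gateway4 instead
          via: 10.10.10.1
```
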
  • RHEL: use nmcli (ad-hoc example; it targets a single host because the address is host-specific)
ansible web01 -m community.general.nmcli -a \
"conn_name=vlan10 ifname=eth0.10 type=vlan vlandev=eth0 vlanid=10 ip4=10.10.10.11/24 gw4=10.10.10.1 state=present"

4.3 Web and Database (examples)

roles/web_nginx/tasks/main.yml:

- name: Install Nginx
  ansible.builtin.package:
    name: nginx
    state: present
- name: Start and enable Nginx
  ansible.builtin.service:
    name: nginx
    state: started
    enabled: true
- name: Deploy the site template
  ansible.builtin.template:
    src: default.conf.j2
    dest: /etc/nginx/conf.d/default.conf
  notify: Reload nginx

roles/web_nginx/handlers/main.yml:

- name: Reload nginx
  ansible.builtin.service:
    name: nginx
    state: reloaded

roles/db_postgresql/tasks/main.yml (installation and initialization omitted)

4.4 Exporters (Monitoring Integration)

roles/exporters_node/tasks/main.yml:

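- name: Download and unpack node_exporter
  ansible.builtin.unarchive:
    src: https://github.com/prometheus/node_exporter/releases/download/v1.8.1/node_exporter-1.8.1.linux-amd64.tar.gz
    dest: /opt/
    remote_src: true
    # Skip the download when the binary is already unpacked
    creates: /opt/node_exporter-1.8.1.linux-amd64/node_exporter
- name: Install the systemd unit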
  ansible.builtin.copy:
    dest: /etc/systemd/system/node_exporter.service
    content: |
      [Unit]
      Description=Node Exporter
      [Service]
      ExecStart=/opt/node_exporter-1.8.1.linux-amd64/node_exporter
      [Install]
      WantedBy=multi-user.target
- name: Start and enable node_exporter
  ansible.builtin.systemd:
    name: node_exporter
    state: started
    enabled: true
    daemon_reload: true  # pick up the unit file written by the previous task
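
To confirm that the exporter is actually serving metrics before Prometheus is pointed at it, a probe task can be appended to the role. This is a sketch that assumes node_exporter listens on its default port 9100 on the loopback address; the register name node_exporter_probe is made up for this example.

```yaml
# Sketch: retry for a few seconds until /metrics answers with HTTP 200.
- name: Check that node_exporter answers on its default port
  ansible.builtin.uri:
    url: http://127.0.0.1:9100/metrics
    status_code: 200
  register: node_exporter_probe      # hypothetical variable name
  retries: 5
  delay: 2
  until: node_exporter_probe is succeeded
```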

5. Playbook Assembly and Execution

site.yml:

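- hosts: all
  roles:
    # Each role carries a tag of the same name so that -t can select it
    - { role: base_hardening, tags: [base_hardening] }
    - { role: linux_network, tags: [linux_network] }
    - { role: exporters_node, tags: [exporters_node] }

- hosts: web
  roles:
    - { role: web_nginx, tags: [web_nginx] }

- hosts: db
  roles:
    - { role: db_postgresql, tags: [db_postgresql] }

To run:

# Check connectivity and privilege escalation first (id should report root when become works)
ansible all -m ping
ansible all -m command -a 'id'

# Real run (using the prod inventory); -t limits the run to the base_hardening role
ANSIBLE_CONFIG=./ansible.cfg ansible-playbook -i inventories/prod/hosts.ini site.yml -t base_hardening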

6. Common Ad-hoc Commands and Tips

# Push files/templates
ansible all -m copy -a "src=./file dest=/tmp/file mode=0644"
ansible web -m template -a "src=nginx.j2 dest=/etc/nginx/nginx.conf"

# Users and keys
ansible all -m user -a "name=deploy shell=/bin/bash state=present"
ansible all -m authorized_key -a "user=deploy key='{{ lookup('file','~/.ssh/id_rsa.pub') }}'"

# Packages and services
ansible all -m package -a "name=htop state=present"
ansible web -m service -a "name=nginx state=restarted"
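  • Manage secrets with ansible-vault:
ansible-vault create group_vars/prod/vault.yml
# Load it in a play with vars_files, or map it in vars: using vault_xxx names;
# pass --ask-vault-pass or --vault-password-file at run time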
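
One common way to apply the vault_xxx naming is to keep a readable variables file next to the encrypted one and let it point at the vaulted values. The sketch below assumes group_vars/prod is laid out as a directory; db_user, db_password, and vault_db_password are hypothetical names used only for illustration.

```yaml
# group_vars/prod/vars.yml (plain text, safe to read and grep)
db_user: app                                 # hypothetical variable
db_password: "{{ vault_db_password }}"       # points at the encrypted value
---
# group_vars/prod/vault.yml (as seen through `ansible-vault edit`; encrypted on disk)
vault_db_password: "REPLACE_ME"              # placeholder, not a real secret
```

Roles then reference db_password only, so searching the repository still shows where a secret is used even though its value stays encrypted.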

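7. Policy Rollout and Best Practices

  • Layering: inventory (environment) → group_vars (domain) → role (responsibility) → playbook (orchestration)
  • Change safety: --check, --diff, rolling updates in batches, blue-green/canary releases
  • Auditing: centralize facts, generate a CMDB (ansible-cmdb), keep logs as an audit trail
  • Monitoring integration: deploy exporters and hook into Alertmanager/Grafana
  • Backup integration: pre- and post-run hooks that call NetBackup backup/restore scripts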
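
The batched rolling update mentioned above can be sketched with the serial keyword. This is an illustrative variant of the web play, not the article's site.yml: the batch size of one host and the port 80 readiness check are assumptions to adapt to your own service.

```yaml
# Sketch: update web hosts one at a time and stop at the first failure.
- hosts: web
  serial: 1                  # batch size: one host per pass
  max_fail_percentage: 0     # any failed host aborts the remaining batches
  roles:
    - web_nginx
  post_tasks:
    - name: Wait until Nginx accepts connections again
      ansible.builtin.wait_for:
        port: 80             # assumed listening port
        timeout: 30
```

Running the same play first with ansible-playbook --check --diff gives a preview of the changes without applying them.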

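8. Common Pitfalls

  • Conflicts between Ansible's Python and the system Python: isolate Ansible in a venv
  • SSH through the bastion: use ProxyJump, or set ansible_ssh_common_args with an -o ProxyCommand option
  • Firewall/SELinux: manage them through modules rather than editing configuration by hand
  • Idempotency: prefer modules (package/service/template) and avoid bare command/shell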
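
Two of the pitfalls above can be sketched in YAML. The first document sets the jump host through an inventory variable instead of ~/.ssh/config, reusing the bastion user and address from section 2.4. The second shows a raw command guarded with creates so that repeated runs do nothing; the script and marker paths are hypothetical.

```yaml
# group_vars/all.yml: route every SSH connection through the bastion
ansible_ssh_common_args: "-o ProxyJump=ops@10.10.0.10"
---
# In a role's tasks: a command that runs only until its marker file exists
- name: Run a one-time initialization script
  ansible.builtin.command:
    cmd: /usr/local/bin/init-app.sh        # hypothetical script
    creates: /var/lib/app/.initialized     # hypothetical marker written by the script
```

When no module covers a step, guards such as creates, removes, or changed_when keep the play's changed/ok reporting honest.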