diff options
116 files changed, 3342 insertions, 595 deletions
diff --git a/DEPLOYMENT_TYPES.md b/DEPLOYMENT_TYPES.md new file mode 100644 index 000000000..1f64e223a --- /dev/null +++ b/DEPLOYMENT_TYPES.md @@ -0,0 +1,23 @@ +#Deployment Types + +This module supports OpenShift Origin, OpenShift Enterprise, and Atomic +Enterprise Platform. Each deployment type sets various defaults used throughout +your environment. + +The table below outlines the defaults per `deployment_type`. + +| deployment_type | origin | enterprise (< 3.1) | atomic-enterprise | openshift-enterprise (>= 3.1) | +|-----------------------------------------------------------------|------------------------------------------|----------------------------------------|----------------------------------|----------------------------------| +| **openshift.common.service_type** (also used for package names) | origin | openshift | atomic-openshift | | +| **openshift.common.config_base** | /etc/origin | /etc/openshift | /etc/origin | /etc/origin | +| **openshift.common.data_dir** | /var/lib/origin | /var/lib/openshift | /var/lib/origin | /var/lib/origin | +| **openshift.master.registry_url openshift.node.registry_url** | openshift/origin-${component}:${version} | openshift3/ose-${component}:${version} | aos3/aos-${component}:${version} | aos3/aos-${component}:${version} | +| **Image Streams** | centos | rhel + xpaas | N/A | rhel | + + +**NOTE** `enterprise` deloyment type is used for OpenShift Enterprise version +3.0.x OpenShift Enterprise deployments utilizing version 3.1 and later will +make use of the new `openshift-enterprise` deployment type. Additional work to +migrate between the two will be forthcoming. + + diff --git a/README_AWS.md b/README_AWS.md index c511741b9..3a5790eb3 100644 --- a/README_AWS.md +++ b/README_AWS.md @@ -154,18 +154,10 @@ Note: If no deployment type is specified, then the default is origin. ## Post-ansible steps -Create the default router -------------------------- -On the master host: -```sh -oadm router --create=true \ - --credentials=/etc/openshift/master/openshift-router.kubeconfig -``` - -Create the default docker-registry ----------------------------------- -On the master host: -```sh -oadm registry --create=true \ - --credentials=/etc/openshift/master/openshift-registry.kubeconfig -```
\ No newline at end of file + +You should now be ready to follow the **What's Next?** section of the advanced installation guide to deploy your router, registry, and other components. + +Refer to the advanced installation guide for your deployment type: + +* [OpenShift Enterprise](https://docs.openshift.com/enterprise/3.0/install_config/install/advanced_install.html#what-s-next) +* [OpenShift Origin](https://docs.openshift.org/latest/install_config/install/advanced_install.html#what-s-next) diff --git a/README_GCE.md b/README_GCE.md index f6c5138c1..50f8ade70 100644 --- a/README_GCE.md +++ b/README_GCE.md @@ -39,6 +39,13 @@ Create a gce.ini file for GCE * gce_service_account_pem_file_path - Full path from previous steps * gce_project_id - Found in "Projects", it list all the gce projects you are associated with. The page lists their "Project Name" and "Project ID". You want the "Project ID" +Mandatory customization variables (check the values according to your tenant): +* zone = europe-west1-d +* network = default +* gce_machine_type = n1-standard-2 +* gce_machine_image = preinstalled-slave-50g-v5 + + 1. vi ~/.gce/gce.ini 1. make the contents look like this: ``` @@ -46,11 +53,15 @@ Create a gce.ini file for GCE gce_service_account_email_address = long...@developer.gserviceaccount.com gce_service_account_pem_file_path = /full/path/to/project_id-gce_key_hash.pem gce_project_id = project_id +zone = europe-west1-d +network = default +gce_machine_type = n1-standard-2 +gce_machine_image = preinstalled-slave-50g-v5 + ``` -1. Setup a sym link so that gce.py will pick it up (link must be in same dir as gce.py) +1. Define the environment variable GCE_INI_PATH so gce.py can pick it up and bin/cluster can also read it ``` - cd openshift-ansible/inventory/gce - ln -s ~/.gce/gce.ini gce.ini +export GCE_INI_PATH=~/.gce/gce.ini ``` diff --git a/README_OSE.md b/README_OSE.md index cce1ec030..79ad07044 100644 --- a/README_OSE.md +++ b/README_OSE.md @@ -101,6 +101,7 @@ ose3-master.example.com # host group for nodes [nodes] +ose3-master.example.com ose3-node[1:2].example.com ``` @@ -116,22 +117,8 @@ ansible-playbook playbooks/byo/config.yml inventory file use the -i option for ansible-playbook. ## Post-ansible steps -#### Create the default router -On the master host: -```sh -oadm router --create=true \ - --credentials=/etc/openshift/master/openshift-router.kubeconfig \ - --images='rcm-img-docker01.build.eng.bos.redhat.com:5001/openshift3/ose-${component}:${version}' -``` -#### Create the default docker-registry -On the master host: -```sh -oadm registry --create=true \ - --credentials=/etc/openshift/master/openshift-registry.kubeconfig \ - --images='rcm-img-docker01.build.eng.bos.redhat.com:5001/openshift3/ose-${component}:${version}' \ - --mount-host=/var/lib/openshift/docker-registry -``` +You should now be ready to follow the [What's Next?](https://docs.openshift.com/enterprise/3.0/install_config/install/advanced_install.html#what-s-next) section of the advanced installation guide to deploy your router, registry, and other components. ## Overriding detected ip addresses and hostnames Some deployments will require that the user override the detected hostnames diff --git a/README_libvirt.md b/README_libvirt.md index 60af0ac88..18ec66f2a 100644 --- a/README_libvirt.md +++ b/README_libvirt.md @@ -68,9 +68,14 @@ If your `$HOME` is world readable, everything is fine. If your `$HOME` is privat error: Cannot access storage file '$HOME/libvirt-storage-pool-openshift/lenaic-master-216d8.qcow2' (as uid:99, gid:78): Permission denied ``` -In order to fix that issue, you have several possibilities:* set `libvirt_storage_pool_path` inside `playbooks/libvirt/openshift-cluster/launch.yml` and `playbooks/libvirt/openshift-cluster/terminate.yml` to a directory: * backed by a filesystem with a lot of free disk space * writable by your user; * accessible by the qemu user.* Grant the qemu user access to the storage pool. +In order to fix that issue, you have several possibilities: + * set `libvirt_storage_pool_path` inside `playbooks/libvirt/openshift-cluster/launch.yml` and `playbooks/libvirt/openshift-cluster/terminate.yml` to a directory: + * backed by a filesystem with a lot of free disk space + * writable by your user; + * accessible by the qemu user. + * Grant the qemu user access to the storage pool. -On Arch: +On Arch or Fedora 22+: ``` setfacl -m g:kvm:--x ~ @@ -89,7 +94,8 @@ dns=dnsmasq - Configure dnsmasq to use the Virtual Network router for example.com: ```sh -sudo vi /etc/NetworkManager/dnsmasq.d/libvirt_dnsmasq.conf server=/example.com/192.168.55.1 +sudo vi /etc/NetworkManager/dnsmasq.d/libvirt_dnsmasq.conf +server=/example.com/192.168.55.1 ``` Test The Setup diff --git a/README_origin.md b/README_origin.md index f13fe660a..cb213a93a 100644 --- a/README_origin.md +++ b/README_origin.md @@ -73,6 +73,7 @@ osv3-master.example.com # host group for nodes [nodes] +osv3-master.example.com osv3-node[1:2].example.com ``` @@ -88,23 +89,8 @@ ansible-playbook playbooks/byo/config.yml inventory file use the -i option for ansible-playbook. ## Post-ansible steps -#### Create the default router -On the master host: -```sh -oadm router --create=true \ - --credentials=/etc/openshift/master/openshift-router.kubeconfig -``` - -#### Create the default docker-registry -On the master host: -```sh -oadm registry --create=true \ - --credentials=/etc/openshift/master/openshift-registry.kubeconfig -``` -If you would like persistent storage, refer to the -[OpenShift documentation](https://docs.openshift.org/latest/admin_guide/install/docker_registry.html) -for more information on deployment options for the built in docker-registry. +You should now be ready to follow the [What's Next?](https://docs.openshift.org/latest/install_config/install/advanced_install.html#what-s-next) section of the advanced installation guide to deploy your router, registry, and other components. ## Overriding detected ip addresses and hostnames Some deployments will require that the user override the detected hostnames diff --git a/Vagrantfile b/Vagrantfile index 4675b5d60..33532cd63 100644 --- a/Vagrantfile +++ b/Vagrantfile @@ -38,7 +38,7 @@ Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| end config.vm.provider "virtualbox" do |vbox, override| - override.vm.box = "chef/centos-7.1" + override.vm.box = "centos/7" vbox.memory = 1024 vbox.cpus = 2 @@ -54,8 +54,7 @@ Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| when "enterprise" override.vm.box = "rhel-7" when "origin" - override.vm.box = "centos-7.1" - override.vm.box_url = "https://download.gluster.org/pub/gluster/purpleidea/vagrant/centos-7.1/centos-7.1.box" + override.vm.box = "centos/7" override.vm.box_download_checksum = "b2a9f7421e04e73a5acad6fbaf4e9aba78b5aeabf4230eebacc9942e577c1e05" override.vm.box_download_checksum_type = "sha256" end @@ -66,7 +65,7 @@ Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| config.vm.define "node#{node_index}" do |node| node.vm.hostname = "ose3-node#{node_index}.example.com" node.vm.network :private_network, ip: "192.168.100.#{200 + n}" - config.vm.provision "shell", inline: "nmcli connection reload; systemctl restart network.service" + config.vm.provision "shell", inline: "nmcli connection reload; systemctl restart NetworkManager.service" end end @@ -74,7 +73,7 @@ Vagrant.configure(VAGRANTFILE_API_VERSION) do |config| master.vm.hostname = "ose3-master.example.com" master.vm.network :private_network, ip: "192.168.100.100" master.vm.network :forwarded_port, guest: 8443, host: 8443 - config.vm.provision "shell", inline: "nmcli connection reload; systemctl restart network.service" + config.vm.provision "shell", inline: "nmcli connection reload; systemctl restart NetworkManager.service" master.vm.provision "ansible" do |ansible| ansible.limit = 'all' ansible.sudo = true diff --git a/bin/cluster b/bin/cluster index c80fe0cab..59a6755d3 100755 --- a/bin/cluster +++ b/bin/cluster @@ -5,6 +5,7 @@ import argparse import ConfigParser import os import sys +import subprocess import traceback @@ -48,11 +49,11 @@ class Cluster(object): deployment_type = os.environ['OS_DEPLOYMENT_TYPE'] return deployment_type + def create(self, args): """ Create an OpenShift cluster for given provider :param args: command line arguments provided by user - :return: exit status from run command """ env = {'cluster_id': args.cluster_id, 'deployment_type': self.get_deployment_type(args)} @@ -64,65 +65,60 @@ class Cluster(object): env['num_infra'] = args.infra env['num_etcd'] = args.etcd - return self.action(args, inventory, env, playbook) + self.action(args, inventory, env, playbook) def terminate(self, args): """ Destroy OpenShift cluster :param args: command line arguments provided by user - :return: exit status from run command """ env = {'cluster_id': args.cluster_id, 'deployment_type': self.get_deployment_type(args)} playbook = "playbooks/{}/openshift-cluster/terminate.yml".format(args.provider) inventory = self.setup_provider(args.provider) - return self.action(args, inventory, env, playbook) + self.action(args, inventory, env, playbook) def list(self, args): """ List VMs in cluster :param args: command line arguments provided by user - :return: exit status from run command """ env = {'cluster_id': args.cluster_id, 'deployment_type': self.get_deployment_type(args)} playbook = "playbooks/{}/openshift-cluster/list.yml".format(args.provider) inventory = self.setup_provider(args.provider) - return self.action(args, inventory, env, playbook) + self.action(args, inventory, env, playbook) def config(self, args): """ Configure or reconfigure OpenShift across clustered VMs :param args: command line arguments provided by user - :return: exit status from run command """ env = {'cluster_id': args.cluster_id, 'deployment_type': self.get_deployment_type(args)} playbook = "playbooks/{}/openshift-cluster/config.yml".format(args.provider) inventory = self.setup_provider(args.provider) - return self.action(args, inventory, env, playbook) + self.action(args, inventory, env, playbook) def update(self, args): """ Update to latest OpenShift across clustered VMs :param args: command line arguments provided by user - :return: exit status from run command """ env = {'cluster_id': args.cluster_id, 'deployment_type': self.get_deployment_type(args)} playbook = "playbooks/{}/openshift-cluster/update.yml".format(args.provider) inventory = self.setup_provider(args.provider) - return self.action(args, inventory, env, playbook) + self.action(args, inventory, env, playbook) def service(self, args): """ Make the same service call across all nodes in the cluster :param args: command line arguments provided by user - :return: exit status from run command """ env = {'cluster_id': args.cluster_id, 'deployment_type': self.get_deployment_type(args), @@ -131,7 +127,7 @@ class Cluster(object): playbook = "playbooks/{}/openshift-cluster/service.yml".format(args.provider) inventory = self.setup_provider(args.provider) - return self.action(args, inventory, env, playbook) + self.action(args, inventory, env, playbook) def setup_provider(self, provider): """ @@ -141,10 +137,14 @@ class Cluster(object): """ config = ConfigParser.ConfigParser() if 'gce' == provider: - config.readfp(open('inventory/gce/hosts/gce.ini')) + gce_ini_default_path = os.path.join( + 'inventory/gce/hosts/gce.ini') + gce_ini_path = os.environ.get('GCE_INI_PATH', gce_ini_default_path) + if os.path.exists(gce_ini_path): + config.readfp(open(gce_ini_path)) - for key in config.options('gce'): - os.environ[key] = config.get('gce', key) + for key in config.options('gce'): + os.environ[key] = config.get('gce', key) inventory = '-i inventory/gce/hosts' elif 'aws' == provider: @@ -163,7 +163,7 @@ class Cluster(object): boto_configs = [conf for conf in boto_conf_files if conf_exists(conf)] if len(key_missing) > 0 and len(boto_configs) == 0: - raise ValueError("PROVIDER aws requires {} environment variable(s). See README_AWS.md".format(missing)) + raise ValueError("PROVIDER aws requires {} environment variable(s). See README_AWS.md".format(key_missing)) elif 'libvirt' == provider: inventory = '-i inventory/libvirt/hosts' @@ -182,7 +182,6 @@ class Cluster(object): :param inventory: derived provider library :param env: environment variables for kubernetes :param playbook: ansible playbook to execute - :return: exit status from ansible-playbook command """ verbose = '' @@ -212,7 +211,18 @@ class Cluster(object): sys.stderr.write('RUN [{}]\n'.format(command)) sys.stderr.flush() - return os.system(command) + try: + subprocess.check_call(command, shell=True) + except subprocess.CalledProcessError as exc: + raise ActionFailed("ACTION [{}] failed: {}" + .format(args.action, exc)) + + +class ActionFailed(Exception): + """ + Raised when action failed. + """ + pass if __name__ == '__main__': @@ -258,6 +268,9 @@ if __name__ == '__main__': meta_parser.add_argument('-t', '--deployment-type', choices=['origin', 'online', 'enterprise'], help='Deployment type. (default: origin)') + meta_parser.add_argument('-T', '--product-type', + choices=['openshift', 'atomic-enterprise'], + help='Product type. (default: openshift)') meta_parser.add_argument('-o', '--option', action='append', help='options') @@ -324,14 +337,11 @@ if __name__ == '__main__': sys.stderr.write('\nACTION [update] aborted by user!\n') exit(1) - status = 1 try: - status = args.func(args) - if status != 0: - sys.stderr.write("ACTION [{}] failed with exit status {}\n".format(args.action, status)) - except Exception, e: + args.func(args) + except Exception as exc: if args.verbose: traceback.print_exc(file=sys.stderr) else: - sys.stderr.write("{}\n".format(e)) - exit(status) + print >>sys.stderr, exc + exit(1) diff --git a/inventory/byo/hosts.example b/inventory/byo/hosts.example index 1685003e8..f554cc660 100644 --- a/inventory/byo/hosts.example +++ b/inventory/byo/hosts.example @@ -18,7 +18,7 @@ ansible_ssh_user=root #ansible_sudo=true # deployment type valid values are origin, online and enterprise -deployment_type=enterprise +deployment_type=atomic-enterprise # Enable cluster metrics #use_cluster_metrics=true @@ -61,7 +61,7 @@ openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', # For installation the value of openshift_master_cluster_hostname must resolve # to the first master defined in the inventory. # The HA solution must be manually configured after installation and must ensure -# that openshift-master is running on a single master host. +# that the master is running on a single master host. #openshift_master_cluster_hostname=openshift-ansible.test.example.com #openshift_master_cluster_public_hostname=openshift-ansible.test.example.com #openshift_master_cluster_defer_ha=True @@ -70,11 +70,14 @@ openshift_master_identity_providers=[{'name': 'htpasswd_auth', 'login': 'true', #osm_default_subdomain=apps.test.example.com # additional cors origins -#osm_custom_cors_origins=['foo.example.com', 'bar.example.com'] +#osm_custom_cors_origins=['foo.example.com', 'bar.example.com'] # default project node selector #osm_default_node_selector='region=primary' +# set RPM version for debugging purposes +#openshift_pkg_version=-3.0.0.0 + # host group for masters [masters] ose3-master[1:3]-ansible.test.example.com @@ -82,7 +85,9 @@ ose3-master[1:3]-ansible.test.example.com [etcd] ose3-etcd[1:3]-ansible.test.example.com -# host group for nodes +# NOTE: Currently we require that masters be part of the SDN which requires that they also be nodes +# However, in order to ensure that your masters are not burdened with running pods you should +# make them unschedulable by adding openshift_scheduleable=False any node that's also a master. [nodes] -ose3-master[1:3]-ansible.test.example.com openshift_scheduleable=False +ose3-master[1:3]-ansible.test.example.com ose3-node[1:2]-ansible.test.example.com openshift_node_labels="{'region': 'primary', 'zone': 'default'}" diff --git a/inventory/gce/hosts/gce.py b/inventory/gce/hosts/gce.py index 3403f735e..6ed12e011 100755 --- a/inventory/gce/hosts/gce.py +++ b/inventory/gce/hosts/gce.py @@ -120,6 +120,7 @@ class GceInventory(object): os.path.dirname(os.path.realpath(__file__)), "gce.ini") gce_ini_path = os.environ.get('GCE_INI_PATH', gce_ini_default_path) + # Create a ConfigParser. # This provides empty defaults to each key, so that environment # variable configuration (as opposed to INI configuration) is able @@ -173,6 +174,7 @@ class GceInventory(object): args[1] = os.environ.get('GCE_PEM_FILE_PATH', args[1]) kwargs['project'] = os.environ.get('GCE_PROJECT', kwargs['project']) + # Retrieve and return the GCE driver. gce = get_driver(Provider.GCE)(*args, **kwargs) gce.connection.user_agent_append( @@ -211,7 +213,8 @@ class GceInventory(object): 'gce_image': inst.image, 'gce_machine_type': inst.size, 'gce_private_ip': inst.private_ips[0], - 'gce_public_ip': inst.public_ips[0], + # Hosts don't always have a public IP name + #'gce_public_ip': inst.public_ips[0], 'gce_name': inst.name, 'gce_description': inst.extra['description'], 'gce_status': inst.extra['status'], @@ -219,8 +222,8 @@ class GceInventory(object): 'gce_tags': inst.extra['tags'], 'gce_metadata': md, 'gce_network': net, - # Hosts don't have a public name, so we add an IP - 'ansible_ssh_host': inst.public_ips[0] + # Hosts don't always have a public IP name + #'ansible_ssh_host': inst.public_ips[0] } def get_instance(self, instance_name): diff --git a/inventory/openstack/hosts/nova.py b/inventory/openstack/hosts/nova.py index d5bd8d1ee..3197a57bc 100755 --- a/inventory/openstack/hosts/nova.py +++ b/inventory/openstack/hosts/nova.py @@ -34,7 +34,7 @@ except ImportError: # executed with no parameters, return the list of # all groups and hosts -NOVA_CONFIG_FILES = [os.getcwd() + "/nova.ini", +NOVA_CONFIG_FILES = [os.path.join(os.path.dirname(os.path.realpath(__file__)), "nova.ini"), os.path.expanduser(os.environ.get('ANSIBLE_CONFIG', "~/nova.ini")), "/etc/ansible/nova.ini"] diff --git a/playbooks/adhoc/atomic_openshift_tutorial_reset.yml b/playbooks/adhoc/atomic_openshift_tutorial_reset.yml index 1200caa2a..54d3ea278 100644 --- a/playbooks/adhoc/atomic_openshift_tutorial_reset.yml +++ b/playbooks/adhoc/atomic_openshift_tutorial_reset.yml @@ -19,10 +19,12 @@ - openshift-node - atomic-enterprise-master - atomic-enterprise-node + - etcd - yum: name={{ item }} state=absent with_items: - openvswitch + - etcd - origin - origin-master - origin-node @@ -61,12 +63,15 @@ - shell: docker ps -a -q | xargs docker stop changed_when: False + failed_when: False - shell: docker ps -a -q| xargs docker rm changed_when: False + failed_when: False - shell: docker images -q |xargs docker rmi changed_when: False + failed_when: False - file: path={{ item }} state=absent with_items: @@ -86,6 +91,8 @@ - /etc/sysconfig/openshift-node - /etc/sysconfig/atomic-enterprise-master - /etc/sysconfig/atomic-enterprise-node + - /etc/etcd + - /var/lib/etcd - user: name={{ item }} state=absent remove=yes with_items: diff --git a/playbooks/adhoc/docker_loopback_to_lvm/docker-storage-setup b/playbooks/adhoc/docker_loopback_to_lvm/docker-storage-setup new file mode 100644 index 000000000..059058823 --- /dev/null +++ b/playbooks/adhoc/docker_loopback_to_lvm/docker-storage-setup @@ -0,0 +1,2 @@ +DEVS=/dev/xvdb +VG=docker_vg diff --git a/playbooks/adhoc/docker_loopback_to_lvm/docker_loopback_to_direct_lvm.yml b/playbooks/adhoc/docker_loopback_to_lvm/docker_loopback_to_direct_lvm.yml new file mode 100644 index 000000000..b6a2d2f26 --- /dev/null +++ b/playbooks/adhoc/docker_loopback_to_lvm/docker_loopback_to_direct_lvm.yml @@ -0,0 +1,142 @@ +--- +# This playbook coverts docker to go from loopback to direct-lvm (the Red Hat recommended way to run docker) +# in AWS. This adds an additional EBS volume and creates the Volume Group on this EBS volume to use. +# +# To run: +# 1. Source your AWS credentials (make sure it's the corresponding AWS account) into your environment +# export AWS_ACCESS_KEY_ID='XXXXX' +# export AWS_SECRET_ACCESS_KEY='XXXXXX' +# +# 2. run the playbook: +# ansible-playbook -e 'cli_tag_name=<tag-name>' -e "cli_volume_size=30" docker_loopback_to_direct_lvm.yml +# +# Example: +# ansible-playbook -e 'cli_tag_name=ops-master-12345' -e "cli_volume_size=30" docker_loopback_to_direct_lvm.yml +# +# Notes: +# * By default this will do a 30GB volume. +# * iops are calculated by Disk Size * 30. e.g ( 30GB * 30) = 900 iops +# * This will remove /var/lib/docker! +# * You may need to re-deploy docker images after this is run (like monitoring) +# + +- name: Fix docker to have a provisioned iops drive + hosts: "tag_Name_{{ cli_tag_name }}" + user: root + connection: ssh + gather_facts: no + + vars: + cli_volume_type: gp2 + cli_volume_size: 30 + + pre_tasks: + - fail: + msg: "This playbook requires {{item}} to be set." + when: "{{ item }} is not defined or {{ item }} == ''" + with_items: + - cli_tag_name + - cli_volume_size + + - debug: + var: hosts + + - name: start docker + service: + name: docker + state: started + + - name: Determine if loopback + shell: docker info | grep 'Data file:.*loop' + register: loop_device_check + ignore_errors: yes + + - debug: + var: loop_device_check + + - name: fail if we don't detect loopback + fail: + msg: loopback not detected! Please investigate manually. + when: loop_device_check.rc == 1 + + - name: stop zagg client monitoring container + service: + name: oso-rhel7-zagg-client + state: stopped + ignore_errors: yes + + - name: stop pcp client monitoring container + service: + name: oso-f22-host-monitoring + state: stopped + ignore_errors: yes + + - name: stop docker + service: + name: docker + state: stopped + + - name: delete /var/lib/docker + command: rm -rf /var/lib/docker + + - name: remove /var/lib/docker + command: rm -rf /var/lib/docker + + - name: check to see if /dev/xvdb exists + command: test -e /dev/xvdb + register: xvdb_check + ignore_errors: yes + + - debug: var=xvdb_check + + - name: fail if /dev/xvdb already exists + fail: + msg: /dev/xvdb already exists. Please investigate + when: xvdb_check.rc == 0 + + - name: Create a volume and attach it + delegate_to: localhost + ec2_vol: + state: present + instance: "{{ ec2_id }}" + region: "{{ ec2_region }}" + volume_size: "{{ cli_volume_size | default(30, True)}}" + volume_type: "{{ cli_volume_type }}" + device_name: /dev/xvdb + register: vol + + - debug: var=vol + + - name: tag the vol with a name + delegate_to: localhost + ec2_tag: region={{ ec2_region }} resource={{ vol.volume_id }} + args: + tags: + Name: "{{ ec2_tag_Name }}" + env: "{{ ec2_tag_environment }}" + register: voltags + + - name: Wait for volume to attach + pause: + seconds: 30 + + - name: copy the docker-storage-setup config file + copy: + src: docker-storage-setup + dest: /etc/sysconfig/docker-storage-setup + owner: root + group: root + mode: 0664 + + - name: docker storage setup + command: docker-storage-setup + register: setup_output + + - debug: var=setup_output + + - name: start docker + command: systemctl start docker.service + register: dockerstart + + - debug: var=dockerstart + diff --git a/playbooks/adhoc/docker_loopback_to_lvm/ops-docker-loopback-to-direct-lvm.yml b/playbooks/adhoc/docker_loopback_to_lvm/ops-docker-loopback-to-direct-lvm.yml new file mode 100755 index 000000000..614b2537a --- /dev/null +++ b/playbooks/adhoc/docker_loopback_to_lvm/ops-docker-loopback-to-direct-lvm.yml @@ -0,0 +1,104 @@ +#!/usr/bin/ansible-playbook +--- +# This playbook coverts docker to go from loopback to direct-lvm (the Red Hat recommended way to run docker). +# +# It requires the block device to be already provisioned and attached to the host. This is a generic playbook, +# meant to be used for manual conversion. For AWS specific conversions, use the other playbook in this directory. +# +# To run: +# ./ops-docker-loopback-to-direct-lvm.yml -e cli_host=<host to run on> -e cli_docker_device=<path to device> +# +# Example: +# ./ops-docker-loopback-to-direct-lvm.yml -e cli_host=twiesttest-master-fd32 -e cli_docker_device=/dev/sdb +# +# Notes: +# * This will remove /var/lib/docker! +# * You may need to re-deploy docker images after this is run (like monitoring) + +- name: Fix docker to have a provisioned iops drive + hosts: "{{ cli_name }}" + user: root + connection: ssh + gather_facts: no + + pre_tasks: + - fail: + msg: "This playbook requires {{item}} to be set." + when: "{{ item }} is not defined or {{ item }} == ''" + with_items: + - cli_docker_device + + - name: start docker + service: + name: docker + state: started + + - name: Determine if loopback + shell: docker info | grep 'Data file:.*loop' + register: loop_device_check + ignore_errors: yes + + - debug: + var: loop_device_check + + - name: fail if we don't detect loopback + fail: + msg: loopback not detected! Please investigate manually. + when: loop_device_check.rc == 1 + + - name: stop zagg client monitoring container + service: + name: oso-rhel7-zagg-client + state: stopped + ignore_errors: yes + + - name: stop pcp client monitoring container + service: + name: oso-f22-host-monitoring + state: stopped + ignore_errors: yes + + - name: "check to see if {{ cli_docker_device }} exists" + command: "test -e {{ cli_docker_device }}" + register: docker_dev_check + ignore_errors: yes + + - debug: var=docker_dev_check + + - name: "fail if {{ cli_docker_device }} doesn't exist" + fail: + msg: "{{ cli_docker_device }} doesn't exist. Please investigate" + when: docker_dev_check.rc != 0 + + - name: stop docker + service: + name: docker + state: stopped + + - name: delete /var/lib/docker + command: rm -rf /var/lib/docker + + - name: remove /var/lib/docker + command: rm -rf /var/lib/docker + + - name: copy the docker-storage-setup config file + copy: + content: > + DEVS={{ cli_docker_device }} + VG=docker_vg + dest: /etc/sysconfig/docker-storage-setup + owner: root + group: root + mode: 0664 + + - name: docker storage setup + command: docker-storage-setup + register: setup_output + + - debug: var=setup_output + + - name: start docker + command: systemctl start docker.service + register: dockerstart + + - debug: var=dockerstart diff --git a/playbooks/adhoc/docker_storage_cleanup/docker_storage_cleanup.yml b/playbooks/adhoc/docker_storage_cleanup/docker_storage_cleanup.yml new file mode 100644 index 000000000..a19291a9f --- /dev/null +++ b/playbooks/adhoc/docker_storage_cleanup/docker_storage_cleanup.yml @@ -0,0 +1,69 @@ +--- +# This playbook attempts to cleanup unwanted docker files to help alleviate docker disk space issues. +# +# To run: +# +# 1. run the playbook: +# +# ansible-playbook -e 'cli_tag_name=<tag-name>' docker_storage_cleanup.yml +# +# Example: +# +# ansible-playbook -e 'cli_tag_name=ops-node-compute-12345' docker_storage_cleanup.yml +# +# Notes: +# * This *should* not interfere with running docker images +# + +- name: Clean up Docker Storage + gather_facts: no + hosts: "tag_Name_{{ cli_tag_name }}" + user: root + connection: ssh + + pre_tasks: + + - fail: + msg: "This playbook requires {{item}} to be set." + when: "{{ item }} is not defined or {{ item }} == ''" + with_items: + - cli_tag_name + + - name: Ensure docker is running + service: + name: docker + state: started + enabled: yes + + - name: Get docker info + command: docker info + register: docker_info + + - name: Show docker info + debug: + var: docker_info.stdout_lines + + - name: Remove exited and dead containers + shell: "docker ps -a | awk '/Exited|Dead/ {print $1}' | xargs --no-run-if-empty docker rm" + ignore_errors: yes + + - name: Remove dangling docker images + shell: "docker images -q -f dangling=true | xargs --no-run-if-empty docker rmi" + ignore_errors: yes + + - name: Remove non-running docker images + shell: "docker images | grep -v -e registry.access.redhat.com -e docker-registry.usersys.redhat.com -e docker-registry.ops.rhcloud.com | awk '{print $3}' | xargs --no-run-if-empty docker rmi 2>/dev/null" + ignore_errors: yes + + # leaving off the '-t' for docker exec. With it, it doesn't work with ansible and tty support + - name: update zabbix docker items + command: docker exec -i oso-rhel7-zagg-client /usr/local/bin/cron-send-docker-metrics.py + + # Get and show docker info again. + - name: Get docker info + command: docker info + register: docker_info + + - name: Show docker info + debug: + var: docker_info.stdout_lines diff --git a/playbooks/adhoc/grow_docker_vg/filter_plugins/oo_filters.py b/playbooks/adhoc/grow_docker_vg/filter_plugins/oo_filters.py new file mode 100644 index 000000000..d0264cde9 --- /dev/null +++ b/playbooks/adhoc/grow_docker_vg/filter_plugins/oo_filters.py @@ -0,0 +1,41 @@ +#!/usr/bin/python +# -*- coding: utf-8 -*- +# vim: expandtab:tabstop=4:shiftwidth=4 +''' +Custom filters for use in openshift-ansible +''' + +import pdb + + +class FilterModule(object): + ''' Custom ansible filters ''' + + @staticmethod + def oo_pdb(arg): + ''' This pops you into a pdb instance where arg is the data passed in + from the filter. + Ex: "{{ hostvars | oo_pdb }}" + ''' + pdb.set_trace() + return arg + + @staticmethod + def translate_volume_name(volumes, target_volume): + ''' + This filter matches a device string /dev/sdX to /dev/xvdX + It will then return the AWS volume ID + ''' + for vol in volumes: + translated_name = vol["attachment_set"]["device"].replace("/dev/sd", "/dev/xvd") + if target_volume.startswith(translated_name): + return vol["id"] + + return None + + + def filters(self): + ''' returns a mapping of filters to methods ''' + return { + "translate_volume_name": self.translate_volume_name, + } diff --git a/playbooks/adhoc/grow_docker_vg/grow_docker_vg.yml b/playbooks/adhoc/grow_docker_vg/grow_docker_vg.yml new file mode 100644 index 000000000..63d473146 --- /dev/null +++ b/playbooks/adhoc/grow_docker_vg/grow_docker_vg.yml @@ -0,0 +1,206 @@ +--- +# This playbook grows the docker VG on a node by: +# * add a new volume +# * add volume to the existing VG. +# * pv move to the new volume. +# * remove old volume +# * detach volume +# * mark old volume in AWS with "REMOVE ME" tag +# * grow docker LVM to 90% of the VG +# +# To run: +# 1. Source your AWS credentials (make sure it's the corresponding AWS account) into your environment +# export AWS_ACCESS_KEY_ID='XXXXX' +# export AWS_SECRET_ACCESS_KEY='XXXXXX' +# +# 2. run the playbook: +# ansible-playbook -e 'cli_tag_name=<tag-name>' grow_docker_vg.yml +# +# Example: +# ansible-playbook -e 'cli_tag_name=ops-compute-12345' grow_docker_vg.yml +# +# Notes: +# * By default this will do a 55GB GP2 volume. The can be overidden with the "-e 'cli_volume_size=100'" variable +# * This does a GP2 by default. Support for Provisioned IOPS has not been added +# * This will assign the new volume to /dev/xvdc. This is not variablized, yet. +# * This can be done with NO downtime on the host +# * This playbook assumes that there is a Logical Volume that is installed and called "docker-pool". This is +# the LV that gets created via the "docker-storage-setup" command +# + +- name: Grow the docker volume group + hosts: "tag_Name_{{ cli_tag_name }}" + user: root + connection: ssh + gather_facts: no + + vars: + cli_volume_type: gp2 + cli_volume_size: 55 +# cli_volume_iops: "{{ 30 * cli_volume_size }}" + + pre_tasks: + - fail: + msg: "This playbook requires {{item}} to be set." + when: "{{ item }} is not defined or {{ item }} == ''" + with_items: + - cli_tag_name + - cli_volume_size + + - debug: + var: hosts + + - name: start docker + service: + name: docker + state: started + + - name: Determine if Storage Driver (docker info) is devicemapper + shell: docker info | grep 'Storage Driver:.*devicemapper' + register: device_mapper_check + ignore_errors: yes + + - debug: + var: device_mapper_check + + - name: fail if we don't detect devicemapper + fail: + msg: The "Storage Driver" in "docker info" is not set to "devicemapper"! Please investigate manually. + when: device_mapper_check.rc == 1 + + # docker-storage-setup creates a docker-pool as the lvm. I am using docker-pool lvm to test + # and find the volume group. + - name: Attempt to find the Volume Group that docker is using + shell: lvs | grep docker-pool | awk '{print $2}' + register: docker_vg_name + ignore_errors: yes + + - debug: + var: docker_vg_name + + - name: fail if we don't find a docker volume group + fail: + msg: Unable to find docker volume group. Please investigate manually. + when: docker_vg_name.stdout_lines|length != 1 + + # docker-storage-setup creates a docker-pool as the lvm. I am using docker-pool lvm to test + # and find the physical volume. + - name: Attempt to find the Phyisical Volume that docker is using + shell: "pvs | grep {{ docker_vg_name.stdout }} | awk '{print $1}'" + register: docker_pv_name + ignore_errors: yes + + - debug: + var: docker_pv_name + + - name: fail if we don't find a docker physical volume + fail: + msg: Unable to find docker physical volume. Please investigate manually. + when: docker_pv_name.stdout_lines|length != 1 + + + - name: get list of volumes from AWS + delegate_to: localhost + ec2_vol: + state: list + instance: "{{ ec2_id }}" + region: "{{ ec2_region }}" + register: attached_volumes + + - debug: var=attached_volumes + + - name: get volume id of current docker volume + set_fact: + old_docker_volume_id: "{{ attached_volumes.volumes | translate_volume_name(docker_pv_name.stdout) }}" + + - debug: var=old_docker_volume_id + + - name: check to see if /dev/xvdc exists + command: test -e /dev/xvdc + register: xvdc_check + ignore_errors: yes + + - debug: var=xvdc_check + + - name: fail if /dev/xvdc already exists + fail: + msg: /dev/xvdc already exists. Please investigate + when: xvdc_check.rc == 0 + + - name: Create a volume and attach it + delegate_to: localhost + ec2_vol: + state: present + instance: "{{ ec2_id }}" + region: "{{ ec2_region }}" + volume_size: "{{ cli_volume_size | default(30, True)}}" + volume_type: "{{ cli_volume_type }}" + device_name: /dev/xvdc + register: create_volume + + - debug: var=create_volume + + - name: Fail when problems creating volumes and attaching + fail: + msg: "Failed to create or attach volume msg: {{ create_volume.msg }}" + when: create_volume.msg is defined + + - name: tag the vol with a name + delegate_to: localhost + ec2_tag: region={{ ec2_region }} resource={{ create_volume.volume_id }} + args: + tags: + Name: "{{ ec2_tag_Name }}" + env: "{{ ec2_tag_environment }}" + register: voltags + + - name: check for attached drive + command: test -b /dev/xvdc + register: attachment_check + until: attachment_check.rc == 0 + retries: 30 + delay: 2 + + - name: partition the new drive and make it lvm + command: parted /dev/xvdc --script -- mklabel msdos mkpart primary 0% 100% set 1 lvm + + - name: pvcreate /dev/xvdc + command: pvcreate /dev/xvdc1 + + - name: Extend the docker volume group + command: vgextend "{{ docker_vg_name.stdout }}" /dev/xvdc1 + + - name: pvmove onto new volume + command: "pvmove {{ docker_pv_name.stdout }} /dev/xvdc1" + async: 43200 + poll: 10 + + - name: Remove the old docker drive from the volume group + command: "vgreduce {{ docker_vg_name.stdout }} {{ docker_pv_name.stdout }}" + + - name: Remove the pv from the old drive + command: "pvremove {{ docker_pv_name.stdout }}" + + - name: Extend the docker lvm + command: "lvextend -l '90%VG' /dev/{{ docker_vg_name.stdout }}/docker-pool" + + - name: detach old docker volume + delegate_to: localhost + ec2_vol: + region: "{{ ec2_region }}" + id: "{{ old_docker_volume_id }}" + instance: None + + - name: tag the old vol valid label + delegate_to: localhost + ec2_tag: region={{ ec2_region }} resource={{old_docker_volume_id}} + args: + tags: + Name: "{{ ec2_tag_Name }} REMOVE ME" + register: voltags + + - name: Update the /etc/sysconfig/docker-storage-setup with new device + lineinfile: + dest: /etc/sysconfig/docker-storage-setup + regexp: ^DEVS= + line: DEVS=/dev/xvdc diff --git a/playbooks/adhoc/s3_registry/s3_registry.j2 b/playbooks/adhoc/s3_registry/s3_registry.j2 new file mode 100644 index 000000000..acfa89515 --- /dev/null +++ b/playbooks/adhoc/s3_registry/s3_registry.j2 @@ -0,0 +1,20 @@ +version: 0.1 +log: + level: debug +http: + addr: :5000 +storage: + cache: + layerinfo: inmemory + s3: + accesskey: {{ aws_access_key }} + secretkey: {{ aws_secret_key }} + region: us-east-1 + bucket: {{ clusterid }}-docker + encrypt: true + secure: true + v4auth: true + rootdirectory: /registry +middleware: + repository: + - name: openshift diff --git a/playbooks/adhoc/s3_registry/s3_registry.yml b/playbooks/adhoc/s3_registry/s3_registry.yml new file mode 100644 index 000000000..5dc1abf17 --- /dev/null +++ b/playbooks/adhoc/s3_registry/s3_registry.yml @@ -0,0 +1,61 @@ +--- +# This playbook creates an S3 bucket named after your cluster and configures the docker-registry service to use the bucket as its backend storage. +# Usage: +# ansible-playbook s3_registry.yml -e clusterid="mycluster" +# +# The AWS access/secret keys should be the keys of a separate user (not your main user), containing only the necessary S3 access role. +# The 'clusterid' is the short name of your cluster. + +- hosts: security_group_{{ clusterid }}_master + remote_user: root + gather_facts: False + + vars: + aws_access_key: "{{ lookup('env', 'AWS_ACCESS_KEY_ID') }}" + aws_secret_key: "{{ lookup('env', 'AWS_SECRET_ACCESS_KEY') }}" + tasks: + + - name: Check for AWS creds + fail: + msg: "Couldn't find {{ item }} creds in ENV" + when: "{{ item }} == ''" + with_items: + - aws_access_key + - aws_secret_key + + - name: Create S3 bucket + local_action: + module: s3 bucket="{{ clusterid }}-docker" mode=create + + - name: Generate docker registry config + template: src="s3_registry.j2" dest="/root/config.yml" owner=root mode=0600 + + - name: Determine if new secrets are needed + command: oc get secrets + register: secrets + + - name: Create registry secrets + command: oc secrets new dockerregistry /root/config.yml + when: "'dockerregistry' not in secrets.stdout" + + - name: Determine if service account contains secrets + command: oc describe serviceaccount/registry + register: serviceaccount + + - name: Add secrets to registry service account + command: oc secrets add serviceaccount/registry secrets/dockerregistry + when: "'dockerregistry' not in serviceaccount.stdout" + + - name: Determine if deployment config contains secrets + command: oc volume dc/docker-registry --list + register: dc + + - name: Add secrets to registry deployment config + command: oc volume dc/docker-registry --add --name=dockersecrets -m /etc/registryconfig --type=secret --secret-name=dockerregistry + when: "'dockersecrets' not in dc.stdout" + + - name: Scale up registry + command: oc scale --replicas=1 dc/docker-registry + + - name: Delete temporary config file + file: path=/root/config.yml state=absent diff --git a/playbooks/adhoc/upgrades/README.md b/playbooks/adhoc/upgrades/README.md new file mode 100644 index 000000000..6de8a970f --- /dev/null +++ b/playbooks/adhoc/upgrades/README.md @@ -0,0 +1,21 @@ +# [NOTE] +This playbook will re-run installation steps overwriting any local +modifications. You should ensure that your inventory has been updated with any +modifications you've made after your initial installation. If you find any items +that cannot be configured via ansible please open an issue at +https://github.com/openshift/openshift-ansible + +# Overview +This playbook is available as a technical preview. It currently performs the +following steps. + + * Upgrade and restart master services + * Upgrade and restart node services + * Applies latest configuration by re-running the installation playbook + * Applies the latest cluster policies + * Updates the default router if one exists + * Updates the default registry if one exists + * Updates image streams and quickstarts + +# Usage +ansible-playbook -i ~/ansible-inventory openshift-ansible/playbooks/adhoc/upgrades/upgrade.yml diff --git a/playbooks/adhoc/upgrades/filter_plugins b/playbooks/adhoc/upgrades/filter_plugins new file mode 120000 index 000000000..b0b7a3414 --- /dev/null +++ b/playbooks/adhoc/upgrades/filter_plugins @@ -0,0 +1 @@ +../../../filter_plugins/
\ No newline at end of file diff --git a/playbooks/adhoc/upgrades/lookup_plugins b/playbooks/adhoc/upgrades/lookup_plugins new file mode 120000 index 000000000..73cafffe5 --- /dev/null +++ b/playbooks/adhoc/upgrades/lookup_plugins @@ -0,0 +1 @@ +../../../lookup_plugins/
\ No newline at end of file diff --git a/playbooks/adhoc/upgrades/roles b/playbooks/adhoc/upgrades/roles new file mode 120000 index 000000000..e2b799b9d --- /dev/null +++ b/playbooks/adhoc/upgrades/roles @@ -0,0 +1 @@ +../../../roles/
\ No newline at end of file diff --git a/playbooks/adhoc/upgrades/upgrade.yml b/playbooks/adhoc/upgrades/upgrade.yml new file mode 100644 index 000000000..e666f0472 --- /dev/null +++ b/playbooks/adhoc/upgrades/upgrade.yml @@ -0,0 +1,115 @@ +--- +- name: Re-Run cluster configuration to apply latest configuration changes + include: ../../common/openshift-cluster/config.yml + vars: + g_etcd_group: "{{ 'etcd' }}" + g_masters_group: "{{ 'masters' }}" + g_nodes_group: "{{ 'nodes' }}" + openshift_cluster_id: "{{ cluster_id | default('default') }}" + openshift_deployment_type: "{{ deployment_type }}" + +- name: Upgrade masters + hosts: masters + vars: + openshift_version: "{{ openshift_pkg_version | default('') }}" + tasks: + - name: Upgrade master packages + yum: pkg={{ openshift.common.service_type }}-master{{ openshift_version }} state=latest + - name: Restart master services + service: name="{{ openshift.common.service_type}}-master" state=restarted + +- name: Upgrade nodes + hosts: nodes + vars: + openshift_version: "{{ openshift_pkg_version | default('') }}" + tasks: + - name: Upgrade node packages + yum: pkg={{ openshift.common.service_type }}-node{{ openshift_version }} state=latest + - name: Restart node services + service: name="{{ openshift.common.service_type }}-node" state=restarted + +- name: Determine new master version + hosts: oo_first_master + tasks: + - name: Determine new version + command: > + rpm -q --queryformat '%{version}' {{ openshift.common.service_type }}-master + register: _new_version + +- name: Ensure AOS 3.0.2 or Origin 1.0.6 + hosts: oo_first_master + tasks: + fail: This playbook requires Origin 1.0.6 or Atomic OpenShift 3.0.2 or later + when: _new_version.stdout < 1.0.6 or (_new_version.stdout >= 3.0 and _new_version.stdout < 3.0.2) + +- name: Update cluster policy + hosts: oo_first_master + tasks: + - name: oadm policy reconcile-cluster-roles --confirm + command: > + {{ openshift.common.admin_binary}} --config={{ openshift.common.config_base }}/master/admin.kubeconfig + policy reconcile-cluster-roles --confirm + +- name: Upgrade default router + hosts: oo_first_master + vars: + - router_image: "{{ openshift.master.registry_url | replace( '${component}', 'haproxy-router' ) | replace ( '${version}', 'v' + _new_version.stdout ) }}" + - oc_cmd: "{{ openshift.common.client_binary }} --config={{ openshift.common.config_base }}/master/admin.kubeconfig" + tasks: + - name: Check for default router + command: > + {{ oc_cmd }} get -n default dc/router + register: _default_router + failed_when: false + changed_when: false + - name: Check for allowHostNetwork and allowHostPorts + when: _default_router.rc == 0 + shell: > + {{ oc_cmd }} get -o yaml scc/privileged | /usr/bin/grep -e allowHostPorts -e allowHostNetwork + register: _scc + - name: Grant allowHostNetwork and allowHostPorts + when: + - _default_router.rc == 0 + - "'false' in _scc.stdout" + command: > + {{ oc_cmd }} patch scc/privileged -p '{"allowHostPorts":true,"allowHostNetwork":true}' --loglevel=9 + - name: Update deployment config to 1.0.4/3.0.1 spec + when: _default_router.rc == 0 + command: > + {{ oc_cmd }} patch dc/router -p + '{"spec":{"strategy":{"rollingParams":{"updatePercent":-10},"spec":{"serviceAccount":"router","serviceAccountName":"router"}}}}' + - name: Switch to hostNetwork=true + when: _default_router.rc == 0 + command: > + {{ oc_cmd }} patch dc/router -p '{"spec":{"template":{"spec":{"hostNetwork":true}}}}' + - name: Update router image to current version + when: _default_router.rc == 0 + command: > + {{ oc_cmd }} patch dc/router -p + '{"spec":{"template":{"spec":{"containers":[{"name":"router","image":"{{ router_image }}"}]}}}}' + +- name: Upgrade default + hosts: oo_first_master + vars: + - registry_image: "{{ openshift.master.registry_url | replace( '${component}', 'docker-registry' ) | replace ( '${version}', 'v' + _new_version.stdout ) }}" + - oc_cmd: "{{ openshift.common.client_binary }} --config={{ openshift.common.config_base }}/master/admin.kubeconfig" + tasks: + - name: Check for default registry + command: > + {{ oc_cmd }} get -n default dc/docker-registry + register: _default_registry + failed_when: false + changed_when: false + - name: Update registry image to current version + when: _default_registry.rc == 0 + command: > + {{ oc_cmd }} patch dc/docker-registry -p + '{"spec":{"template":{"spec":{"containers":[{"name":"registry","image":"{{ registry_image }}"}]}}}}' + +- name: Update image streams and templates + hosts: oo_first_master + vars: + openshift_examples_import_command: "update" + openshift_deployment_type: "{{ deployment_type }}" + roles: + - openshift_examples diff --git a/playbooks/adhoc/zabbix_setup/filter_plugins b/playbooks/adhoc/zabbix_setup/filter_plugins new file mode 120000 index 000000000..b0b7a3414 --- /dev/null +++ b/playbooks/adhoc/zabbix_setup/filter_plugins @@ -0,0 +1 @@ +../../../filter_plugins/
\ No newline at end of file diff --git a/playbooks/adhoc/zabbix_setup/oo-clean-zaio.yml b/playbooks/adhoc/zabbix_setup/oo-clean-zaio.yml new file mode 100755 index 000000000..0fe65b338 --- /dev/null +++ b/playbooks/adhoc/zabbix_setup/oo-clean-zaio.yml @@ -0,0 +1,7 @@ +#!/usr/bin/env ansible-playbook +--- +- include: clean_zabbix.yml + vars: + g_server: http://localhost/zabbix/api_jsonrpc.php + g_user: Admin + g_password: zabbix diff --git a/playbooks/adhoc/zabbix_setup/oo-config-zaio.yml b/playbooks/adhoc/zabbix_setup/oo-config-zaio.yml new file mode 100755 index 000000000..e2b8150c6 --- /dev/null +++ b/playbooks/adhoc/zabbix_setup/oo-config-zaio.yml @@ -0,0 +1,13 @@ +#!/usr/bin/ansible-playbook +--- +- hosts: localhost + gather_facts: no + vars: + g_server: http://localhost/zabbix/api_jsonrpc.php + g_user: Admin + g_password: zabbix + roles: + - role: os_zabbix + ozb_server: "{{ g_server }}" + ozb_user: "{{ g_user }}" + ozb_password: "{{ g_password }}" diff --git a/playbooks/adhoc/zabbix_setup/roles b/playbooks/adhoc/zabbix_setup/roles new file mode 120000 index 000000000..20c4c58cf --- /dev/null +++ b/playbooks/adhoc/zabbix_setup/roles @@ -0,0 +1 @@ +../../../roles
\ No newline at end of file diff --git a/playbooks/aws/openshift-cluster/tasks/launch_instances.yml b/playbooks/aws/openshift-cluster/tasks/launch_instances.yml index b77bcdc1a..9c699120b 100644 --- a/playbooks/aws/openshift-cluster/tasks/launch_instances.yml +++ b/playbooks/aws/openshift-cluster/tasks/launch_instances.yml @@ -172,6 +172,7 @@ - rotate 7 - compress - sharedscripts + - missingok scripts: postrotate: "/bin/kill -HUP `cat /var/run/syslogd.pid 2> /dev/null` 2> /dev/null || true" diff --git a/playbooks/aws/openshift-cluster/vars.online.int.yml b/playbooks/aws/openshift-cluster/vars.online.int.yml index bb18e13b0..2e2f25ccd 100644 --- a/playbooks/aws/openshift-cluster/vars.online.int.yml +++ b/playbooks/aws/openshift-cluster/vars.online.int.yml @@ -3,7 +3,7 @@ ec2_image: ami-9101c8fa ec2_image_name: libra-ops-rhel7* ec2_region: us-east-1 ec2_keypair: mmcgrath_libra -ec2_master_instance_type: t2.small +ec2_master_instance_type: t2.medium ec2_master_security_groups: [ 'integration', 'integration-master' ] ec2_infra_instance_type: c4.large ec2_infra_security_groups: [ 'integration', 'integration-infra' ] diff --git a/playbooks/aws/openshift-cluster/vars.online.prod.yml b/playbooks/aws/openshift-cluster/vars.online.prod.yml index bbef9cc56..18a53e12e 100644 --- a/playbooks/aws/openshift-cluster/vars.online.prod.yml +++ b/playbooks/aws/openshift-cluster/vars.online.prod.yml @@ -3,7 +3,7 @@ ec2_image: ami-9101c8fa ec2_image_name: libra-ops-rhel7* ec2_region: us-east-1 ec2_keypair: mmcgrath_libra -ec2_master_instance_type: t2.small +ec2_master_instance_type: t2.medium ec2_master_security_groups: [ 'production', 'production-master' ] ec2_infra_instance_type: c4.large ec2_infra_security_groups: [ 'production', 'production-infra' ] diff --git a/playbooks/aws/openshift-cluster/vars.online.stage.yml b/playbooks/aws/openshift-cluster/vars.online.stage.yml index 9008a55ba..1f9ac4252 100644 --- a/playbooks/aws/openshift-cluster/vars.online.stage.yml +++ b/playbooks/aws/openshift-cluster/vars.online.stage.yml @@ -3,7 +3,7 @@ ec2_image: ami-9101c8fa ec2_image_name: libra-ops-rhel7* ec2_region: us-east-1 ec2_keypair: mmcgrath_libra -ec2_master_instance_type: t2.small +ec2_master_instance_type: t2.medium ec2_master_security_groups: [ 'stage', 'stage-master' ] ec2_infra_instance_type: c4.large ec2_infra_security_groups: [ 'stage', 'stage-infra' ] diff --git a/playbooks/byo/openshift_facts.yml b/playbooks/byo/openshift_facts.yml index cd282270f..6d7c12fd4 100644 --- a/playbooks/byo/openshift_facts.yml +++ b/playbooks/byo/openshift_facts.yml @@ -1,5 +1,5 @@ --- -- name: Gather OpenShift facts +- name: Gather Cluster facts hosts: all gather_facts: no roles: diff --git a/playbooks/common/openshift-master/config.yml b/playbooks/common/openshift-master/config.yml index e8e3dcfdc..14ec82e85 100644 --- a/playbooks/common/openshift-master/config.yml +++ b/playbooks/common/openshift-master/config.yml @@ -37,7 +37,7 @@ public_console_url: "{{ openshift_master_public_console_url | default(None) }}" - name: Check status of external etcd certificatees stat: - path: "/etc/openshift/master/{{ item }}" + path: "{{ openshift.common.config_base }}/master/{{ item }}" with_items: - master.etcd-client.crt - master.etcd-ca.crt @@ -47,7 +47,7 @@ | map(attribute='stat.exists') | list | intersect([false])}}" etcd_cert_subdir: openshift-master-{{ openshift.common.hostname }} - etcd_cert_config_dir: /etc/openshift/master + etcd_cert_config_dir: "{{ openshift.common.config_base }}/master" etcd_cert_prefix: master.etcd- when: groups.oo_etcd_to_config is defined and groups.oo_etcd_to_config @@ -96,7 +96,7 @@ tasks: - name: Ensure certificate directory exists file: - path: /etc/openshift/master + path: "{{ openshift.common.config_base }}/master" state: directory when: etcd_client_certs_missing is defined and etcd_client_certs_missing - name: Unarchive the tarball on the master @@ -134,7 +134,7 @@ - name: Check status of master certificates stat: - path: "/etc/openshift/master/{{ item }}" + path: "{{ openshift.common.config_base }}/master/{{ item }}" with_items: openshift_master_certs register: g_master_cert_stat_result - set_fact: @@ -142,12 +142,12 @@ | map(attribute='stat.exists') | list | intersect([false])}}" master_cert_subdir: master-{{ openshift.common.hostname }} - master_cert_config_dir: /etc/openshift/master + master_cert_config_dir: "{{ openshift.common.config_base }}/master" - name: Configure master certificates hosts: oo_first_master vars: - master_generated_certs_dir: /etc/openshift/generated-configs + master_generated_certs_dir: "{{ openshift.common.config_base }}/generated-configs" masters_needing_certs: "{{ hostvars | oo_select_keys(groups['oo_masters_to_config'] | difference(groups['oo_first_master'])) | oo_filter_list(filter_attr='master_certs_missing') }}" @@ -186,10 +186,11 @@ vars: sync_tmpdir: "{{ hostvars.localhost.g_master_mktemp.stdout }}" openshift_master_ha: "{{ groups.oo_masters_to_config | length > 1 }}" + embedded_etcd: "{{ openshift.master.embedded_etcd }}" pre_tasks: - name: Ensure certificate directory exists file: - path: /etc/openshift/master + path: "{{ openshift.common.config_base }}/master" state: directory when: master_certs_missing and 'oo_first_master' not in group_names - name: Unarchive the tarball on the master @@ -215,7 +216,8 @@ roles: - role: openshift_master_cluster when: openshift_master_ha | bool - - openshift_examples + - role: openshift_examples + when: deployment_type in ['enterprise','openshift-enterprise','origin'] - role: openshift_cluster_metrics when: openshift.common.use_cluster_metrics | bool @@ -243,3 +245,12 @@ tasks: - file: name={{ g_master_mktemp.stdout }} state=absent changed_when: False + +- name: Configure service accounts + hosts: oo_first_master + + vars: + accounts: ["router", "registry"] + + roles: + - openshift_serviceaccounts diff --git a/playbooks/common/openshift-master/service.yml b/playbooks/common/openshift-master/service.yml index 5636ad156..27e1e66f9 100644 --- a/playbooks/common/openshift-master/service.yml +++ b/playbooks/common/openshift-master/service.yml @@ -10,9 +10,9 @@ add_host: name={{ item }} groups=g_service_masters with_items: oo_host_group_exp | default([]) -- name: Change openshift-master state on master instance(s) +- name: Change state on master instance(s) hosts: g_service_masters connection: ssh gather_facts: no tasks: - - service: name=openshift-master state="{{ new_cluster_state }}" + - service: name={{ openshift.common.service_type }}-master state="{{ new_cluster_state }}" diff --git a/playbooks/common/openshift-node/config.yml b/playbooks/common/openshift-node/config.yml index e0060a9a3..a14ca8e11 100644 --- a/playbooks/common/openshift-node/config.yml +++ b/playbooks/common/openshift-node/config.yml @@ -20,9 +20,10 @@ local_facts: labels: "{{ openshift_node_labels | default(None) }}" annotations: "{{ openshift_node_annotations | default(None) }}" + schedulable: "{{ openshift_schedulable | default(openshift_scheduleable) | default(None) }}" - name: Check status of node certificates stat: - path: "/etc/openshift/node/{{ item }}" + path: "{{ openshift.common.config_base }}/node/{{ item }}" with_items: - "system:node:{{ openshift.common.hostname }}.crt" - "system:node:{{ openshift.common.hostname }}.key" @@ -35,8 +36,8 @@ certs_missing: "{{ stat_result.results | map(attribute='stat.exists') | list | intersect([false])}}" node_subdir: node-{{ openshift.common.hostname }} - config_dir: /etc/openshift/generated-configs/node-{{ openshift.common.hostname }} - node_cert_dir: /etc/openshift/node + config_dir: "{{ openshift.common.config_base }}/generated-configs/node-{{ openshift.common.hostname }}" + node_cert_dir: "{{ openshift.common.config_base }}/node" - name: Create temp directory for syncing certs hosts: localhost @@ -89,9 +90,9 @@ path: "{{ node_cert_dir }}" state: directory - # TODO: notify restart openshift-node + # TODO: notify restart node # possibly test service started time against certificate/config file - # timestamps in openshift-node to trigger notify + # timestamps in node to trigger notify - name: Unarchive the tarball on the node unarchive: src: "{{ sync_tmpdir }}/{{ node_subdir }}.tgz" @@ -124,21 +125,14 @@ - os_env_extras - os_env_extras_node -- name: Set scheduleability +- name: Set schedulability hosts: oo_first_master vars: openshift_nodes: "{{ hostvars | oo_select_keys(groups['oo_nodes_to_config']) | oo_collect('openshift.common.hostname') }}" - openshift_unscheduleable_nodes: "{{ hostvars | oo_select_keys(groups['oo_nodes_to_config'] | default([])) - | oo_collect('openshift.common.hostname', {'openshift_scheduleable': False}) }}" openshift_node_vars: "{{ hostvars | oo_select_keys(groups['oo_nodes_to_config']) }}" pre_tasks: - - set_fact: - openshift_scheduleable_nodes: "{{ hostvars - | oo_select_keys(groups['oo_nodes_to_config'] | default([])) - | oo_collect('openshift.common.hostname') - | difference(openshift_unscheduleable_nodes) }}" roles: - openshift_manage_node diff --git a/playbooks/common/openshift-node/service.yml b/playbooks/common/openshift-node/service.yml index f76df089f..5cf83e186 100644 --- a/playbooks/common/openshift-node/service.yml +++ b/playbooks/common/openshift-node/service.yml @@ -10,9 +10,9 @@ add_host: name={{ item }} groups=g_service_nodes with_items: oo_host_group_exp | default([]) -- name: Change openshift-node state on node instance(s) +- name: Change state on node instance(s) hosts: g_service_nodes connection: ssh gather_facts: no tasks: - - service: name=openshift-node state="{{ new_cluster_state }}" + - service: name={{ service_type }}-node state="{{ new_cluster_state }}" diff --git a/playbooks/gce/openshift-cluster/config.yml b/playbooks/gce/openshift-cluster/config.yml index fd5dfcc72..6ca4f7395 100644 --- a/playbooks/gce/openshift-cluster/config.yml +++ b/playbooks/gce/openshift-cluster/config.yml @@ -10,6 +10,8 @@ - set_fact: g_ssh_user_tmp: "{{ deployment_vars[deployment_type].ssh_user }}" g_sudo_tmp: "{{ deployment_vars[deployment_type].sudo }}" + use_sdn: "{{ do_we_use_openshift_sdn }}" + sdn_plugin: "{{ sdn_network_plugin }}" - include: ../../common/openshift-cluster/config.yml vars: @@ -18,7 +20,10 @@ g_nodes_group: "{{ 'tag_env-host-type-' ~ cluster_id ~ '-openshift-node' }}" g_ssh_user: "{{ hostvars.localhost.g_ssh_user_tmp }}" g_sudo: "{{ hostvars.localhost.g_sudo_tmp }}" + g_nodeonmaster: true openshift_cluster_id: "{{ cluster_id }}" openshift_debug_level: 2 openshift_deployment_type: "{{ deployment_type }}" openshift_hostname: "{{ gce_private_ip }}" + openshift_use_openshift_sdn: "{{ hostvars.localhost.use_sdn }}" + os_sdn_network_plugin_name: "{{ hostvars.localhost.sdn_plugin }}" diff --git a/playbooks/gce/openshift-cluster/join_node.yml b/playbooks/gce/openshift-cluster/join_node.yml new file mode 100644 index 000000000..0dfa3e9d7 --- /dev/null +++ b/playbooks/gce/openshift-cluster/join_node.yml @@ -0,0 +1,49 @@ +--- +- name: Populate oo_hosts_to_update group + hosts: localhost + gather_facts: no + vars_files: + - vars.yml + tasks: + - name: Evaluate oo_hosts_to_update + add_host: + name: "{{ node_ip }}" + groups: oo_hosts_to_update + ansible_ssh_user: "{{ deployment_vars[deployment_type].ssh_user }}" + ansible_sudo: "{{ deployment_vars[deployment_type].sudo }}" + +- include: ../../common/openshift-cluster/update_repos_and_packages.yml + +- name: Populate oo_masters_to_config host group + hosts: localhost + gather_facts: no + vars_files: + - vars.yml + tasks: + - name: Evaluate oo_nodes_to_config + add_host: + name: "{{ node_ip }}" + ansible_ssh_user: "{{ deployment_vars[deployment_type].ssh_user }}" + ansible_sudo: "{{ deployment_vars[deployment_type].sudo }}" + groups: oo_nodes_to_config + + - name: Evaluate oo_first_master + add_host: + name: "{{ groups['tag_env-host-type-' ~ cluster_id ~ '-openshift-master'][0] }}" + ansible_ssh_user: "{{ deployment_vars[deployment_type].ssh_user }}" + ansible_sudo: "{{ deployment_vars[deployment_type].sudo }}" + groups: oo_first_master + when: "'tag_env-host-type-{{ cluster_id }}-openshift-master' in groups" + +#- include: config.yml +- include: ../../common/openshift-node/config.yml + vars: + openshift_cluster_id: "{{ cluster_id }}" + openshift_debug_level: 4 + openshift_deployment_type: "{{ deployment_type }}" + openshift_hostname: "{{ ansible_default_ipv4.address }}" + openshift_use_openshift_sdn: true + openshift_node_labels: "{{ lookup('oo_option', 'openshift_node_labels') }} " + os_sdn_network_plugin_name: "redhat/openshift-ovs-subnet" + osn_cluster_dns_domain: "{{ hostvars[groups.oo_first_master.0].openshift.dns.domain }}" + osn_cluster_dns_ip: "{{ hostvars[groups.oo_first_master.0].openshift.dns.ip }}" diff --git a/playbooks/gce/openshift-cluster/launch.yml b/playbooks/gce/openshift-cluster/launch.yml index 7a3b80da0..c22b897d5 100644 --- a/playbooks/gce/openshift-cluster/launch.yml +++ b/playbooks/gce/openshift-cluster/launch.yml @@ -34,27 +34,28 @@ count: "{{ num_infra }}" - include: tasks/launch_instances.yml vars: - instances: "{{ infra_names }}" + instances: "{{ node_names }}" cluster: "{{ cluster_id }}" type: "{{ k8s_type }}" g_sub_host_type: "{{ sub_host_type }}" - - set_fact: - a_infra: "{{ infra_names[0] }}" - - add_host: name={{ a_infra }} groups=service_master + - add_host: + name: "{{ master_names.0 }}" + groups: service_master + when: master_names is defined and master_names.0 is defined - include: update.yml - -- name: Deploy OpenShift Services - hosts: service_master - connection: ssh - gather_facts: yes - roles: - - openshift_registry - - openshift_router - -- include: ../../common/openshift-cluster/create_services.yml - vars: - g_svc_master: "{{ service_master }}" +# +#- name: Deploy OpenShift Services +# hosts: service_master +# connection: ssh +# gather_facts: yes +# roles: +# - openshift_registry +# - openshift_router +# +#- include: ../../common/openshift-cluster/create_services.yml +# vars: +# g_svc_master: "{{ service_master }}" - include: list.yml diff --git a/playbooks/gce/openshift-cluster/list.yml b/playbooks/gce/openshift-cluster/list.yml index 5ba0f5a48..53b2b9a5e 100644 --- a/playbooks/gce/openshift-cluster/list.yml +++ b/playbooks/gce/openshift-cluster/list.yml @@ -14,11 +14,11 @@ groups: oo_list_hosts ansible_ssh_user: "{{ deployment_vars[deployment_type].ssh_user | default(ansible_ssh_user, true) }}" ansible_sudo: "{{ deployment_vars[deployment_type].sudo }}" - with_items: groups[scratch_group] | default([]) | difference(['localhost']) | difference(groups.status_terminated) + with_items: groups[scratch_group] | default([], true) | difference(['localhost']) | difference(groups.status_terminated | default([], true)) - name: List instance(s) hosts: oo_list_hosts gather_facts: no tasks: - debug: - msg: "public ip:{{ hostvars[inventory_hostname].gce_public_ip }} private ip:{{ hostvars[inventory_hostname].gce_private_ip }}" + msg: "private ip:{{ hostvars[inventory_hostname].gce_private_ip }}" diff --git a/playbooks/gce/openshift-cluster/tasks/launch_instances.yml b/playbooks/gce/openshift-cluster/tasks/launch_instances.yml index 6307ecc27..c428cb465 100644 --- a/playbooks/gce/openshift-cluster/tasks/launch_instances.yml +++ b/playbooks/gce/openshift-cluster/tasks/launch_instances.yml @@ -10,14 +10,33 @@ service_account_email: "{{ lookup('env', 'gce_service_account_email_address') }}" pem_file: "{{ lookup('env', 'gce_service_account_pem_file_path') }}" project_id: "{{ lookup('env', 'gce_project_id') }}" + zone: "{{ lookup('env', 'zone') }}" + network: "{{ lookup('env', 'network') }}" +# unsupported in 1.9.+ + #service_account_permissions: "datastore,logging-write" tags: - created-by-{{ lookup('env', 'LOGNAME') |default(cluster, true) }} - env-{{ cluster }} - host-type-{{ type }} - - sub-host-type-{{ sub_host_type }} + - sub-host-type-{{ g_sub_host_type }} - env-host-type-{{ cluster }}-openshift-{{ type }} + when: instances |length > 0 register: gce +- set_fact: + node_label: + # There doesn't seem to be a way to get the region directly, so parse it out of the zone. + region: "{{ gce.zone | regex_replace('^(.*)-.*$', '\\\\1') }}" + type: "{{ g_sub_host_type }}" + when: instances |length > 0 and type == "node" + +- set_fact: + node_label: + # There doesn't seem to be a way to get the region directly, so parse it out of the zone. + region: "{{ gce.zone | regex_replace('^(.*)-.*$', '\\\\1') }}" + type: "{{ type }}" + when: instances |length > 0 and type != "node" + - name: Add new instances to groups and set variables needed add_host: hostname: "{{ item.name }}" @@ -27,16 +46,17 @@ groups: "{{ item.tags | oo_prepend_strings_in_list('tag_') | join(',') }}" gce_public_ip: "{{ item.public_ip }}" gce_private_ip: "{{ item.private_ip }}" - with_items: gce.instance_data + openshift_node_labels: "{{ node_label }}" + with_items: gce.instance_data | default([], true) - name: Wait for ssh wait_for: port=22 host={{ item.public_ip }} - with_items: gce.instance_data + with_items: gce.instance_data | default([], true) - name: Wait for user setup command: "ssh -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o ConnectTimeout=10 -o UserKnownHostsFile=/dev/null {{ hostvars[item.name].ansible_ssh_user }}@{{ item.public_ip }} echo {{ hostvars[item.name].ansible_ssh_user }} user is setup" register: result until: result.rc == 0 - retries: 20 - delay: 10 - with_items: gce.instance_data + retries: 30 + delay: 5 + with_items: gce.instance_data | default([], true) diff --git a/playbooks/gce/openshift-cluster/terminate.yml b/playbooks/gce/openshift-cluster/terminate.yml index 098b0df73..e20e0a8bc 100644 --- a/playbooks/gce/openshift-cluster/terminate.yml +++ b/playbooks/gce/openshift-cluster/terminate.yml @@ -1,25 +1,18 @@ --- - name: Terminate instance(s) hosts: localhost + connection: local gather_facts: no vars_files: - vars.yml tasks: - - set_fact: scratch_group=tag_env-host-type-{{ cluster_id }}-openshift-node + - set_fact: scratch_group=tag_env-{{ cluster_id }} - add_host: name: "{{ item }}" - groups: oo_hosts_to_terminate, oo_nodes_to_terminate + groups: oo_hosts_to_terminate ansible_ssh_user: "{{ deployment_vars[deployment_type].ssh_user | default(ansible_ssh_user, true) }}" ansible_sudo: "{{ deployment_vars[deployment_type].sudo }}" - with_items: groups[scratch_group] | default([]) | difference(['localhost']) | difference(groups.status_terminated) - - - set_fact: scratch_group=tag_env-host-type-{{ cluster_id }}-openshift-master - - add_host: - name: "{{ item }}" - groups: oo_hosts_to_terminate, oo_masters_to_terminate - ansible_ssh_user: "{{ deployment_vars[deployment_type].ssh_user | default(ansible_ssh_user, true) }}" - ansible_sudo: "{{ deployment_vars[deployment_type].sudo }}" - with_items: groups[scratch_group] | default([]) | difference(['localhost']) | difference(groups.status_terminated) + with_items: groups[scratch_group] | default([], true) | difference(['localhost']) | difference(groups.status_terminated | default([], true)) - name: Unsubscribe VMs hosts: oo_hosts_to_terminate @@ -32,14 +25,34 @@ lookup('oo_option', 'rhel_skip_subscription') | default(rhsub_skip, True) | default('no', True) | lower in ['no', 'false'] -- include: ../openshift-node/terminate.yml - vars: - gce_service_account_email: "{{ lookup('env', 'gce_service_account_email_address') }}" - gce_pem_file: "{{ lookup('env', 'gce_service_account_pem_file_path') }}" - gce_project_id: "{{ lookup('env', 'gce_project_id') }}" +- name: Terminate instances(s) + hosts: localhost + connection: local + gather_facts: no + vars_files: + - vars.yml + tasks: + + - name: Terminate instances that were previously launched + local_action: + module: gce + state: 'absent' + name: "{{ item }}" + service_account_email: "{{ lookup('env', 'gce_service_account_email_address') }}" + pem_file: "{{ lookup('env', 'gce_service_account_pem_file_path') }}" + project_id: "{{ lookup('env', 'gce_project_id') }}" + zone: "{{ lookup('env', 'zone') }}" + with_items: groups['oo_hosts_to_terminate'] | default([], true) + when: item is defined -- include: ../openshift-master/terminate.yml - vars: - gce_service_account_email: "{{ lookup('env', 'gce_service_account_email_address') }}" - gce_pem_file: "{{ lookup('env', 'gce_service_account_pem_file_path') }}" - gce_project_id: "{{ lookup('env', 'gce_project_id') }}" +#- include: ../openshift-node/terminate.yml +# vars: +# gce_service_account_email: "{{ lookup('env', 'gce_service_account_email_address') }}" +# gce_pem_file: "{{ lookup('env', 'gce_service_account_pem_file_path') }}" +# gce_project_id: "{{ lookup('env', 'gce_project_id') }}" +# +#- include: ../openshift-master/terminate.yml +# vars: +# gce_service_account_email: "{{ lookup('env', 'gce_service_account_email_address') }}" +# gce_pem_file: "{{ lookup('env', 'gce_service_account_pem_file_path') }}" +# gce_project_id: "{{ lookup('env', 'gce_project_id') }}" diff --git a/playbooks/gce/openshift-cluster/vars.yml b/playbooks/gce/openshift-cluster/vars.yml index ae33083b9..6de007807 100644 --- a/playbooks/gce/openshift-cluster/vars.yml +++ b/playbooks/gce/openshift-cluster/vars.yml @@ -1,8 +1,11 @@ --- +do_we_use_openshift_sdn: true +sdn_network_plugin: redhat/openshift-ovs-subnet +# os_sdn_network_plugin_name can be ovssubnet or multitenant, see https://docs.openshift.org/latest/architecture/additional_concepts/sdn.html#ovssubnet-plugin-operation deployment_vars: origin: - image: centos-7 - ssh_user: + image: preinstalled-slave-50g-v5 + ssh_user: root sudo: yes online: image: libra-rhel7 @@ -12,4 +15,3 @@ deployment_vars: image: rhel-7 ssh_user: sudo: yes - diff --git a/playbooks/libvirt/openshift-cluster/tasks/launch_instances.yml b/playbooks/libvirt/openshift-cluster/tasks/launch_instances.yml index 2a0c90b46..4b91c6da8 100644 --- a/playbooks/libvirt/openshift-cluster/tasks/launch_instances.yml +++ b/playbooks/libvirt/openshift-cluster/tasks/launch_instances.yml @@ -64,7 +64,7 @@ register: nb_allocated_ips until: nb_allocated_ips.stdout == '{{ instances | length }}' retries: 60 - delay: 1 + delay: 3 when: instances | length != 0 - name: Collect IP addresses of the VMs diff --git a/playbooks/libvirt/openshift-cluster/templates/network.xml b/playbooks/libvirt/openshift-cluster/templates/network.xml index 86dcd62bb..050bc7ab9 100644 --- a/playbooks/libvirt/openshift-cluster/templates/network.xml +++ b/playbooks/libvirt/openshift-cluster/templates/network.xml @@ -8,7 +8,7 @@ <!-- TODO: query for first available virbr interface available --> <bridge name='virbr3' stp='on' delay='0'/> <!-- TODO: make overridable --> - <domain name='example.com'/> + <domain name='example.com' localOnly='yes' /> <dns> <!-- TODO: automatically add host entries --> </dns> diff --git a/playbooks/libvirt/openshift-cluster/templates/user-data b/playbooks/libvirt/openshift-cluster/templates/user-data index 77b788109..eacae7c7e 100644 --- a/playbooks/libvirt/openshift-cluster/templates/user-data +++ b/playbooks/libvirt/openshift-cluster/templates/user-data @@ -19,5 +19,5 @@ system_info: ssh_authorized_keys: - {{ lookup('file', '~/.ssh/id_rsa.pub') }} -bootcmd: +runcmd: - NETWORK_CONFIG=/etc/sysconfig/network-scripts/ifcfg-eth0; if ! grep DHCP_HOSTNAME ${NETWORK_CONFIG}; then echo 'DHCP_HOSTNAME="{{ item[0] }}.example.com"' >> ${NETWORK_CONFIG}; fi; pkill -9 dhclient; service network restart diff --git a/roles/ansible_tower/tasks/main.yaml b/roles/ansible_tower/tasks/main.yaml index c110a3b70..b7757214d 100644 --- a/roles/ansible_tower/tasks/main.yaml +++ b/roles/ansible_tower/tasks/main.yaml @@ -9,6 +9,7 @@ - ansible - telnet - ack + - pylint - name: download Tower setup get_url: url=http://releases.ansible.com/ansible-tower/setup/ansible-tower-setup-2.1.1.tar.gz dest=/opt/ force=no @@ -38,5 +39,3 @@ regexp: "^({{ item.option }})( *)=" line: '\1\2= {{ item.value }}' with_items: config_changes | default([], true) - - diff --git a/roles/etcd/tasks/main.yml b/roles/etcd/tasks/main.yml index 27bfb7de9..656901409 100644 --- a/roles/etcd/tasks/main.yml +++ b/roles/etcd/tasks/main.yml @@ -38,6 +38,7 @@ template: src: etcd.conf.j2 dest: /etc/etcd/etcd.conf + backup: true notify: - restart etcd diff --git a/roles/etcd_ca/tasks/main.yml b/roles/etcd_ca/tasks/main.yml index 8a266f732..625756867 100644 --- a/roles/etcd_ca/tasks/main.yml +++ b/roles/etcd_ca/tasks/main.yml @@ -18,6 +18,7 @@ - template: dest: "{{ etcd_ca_dir }}/fragments/openssl_append.cnf" src: openssl_append.j2 + backup: true - assemble: src: "{{ etcd_ca_dir }}/fragments" diff --git a/roles/fluentd_master/tasks/main.yml b/roles/fluentd_master/tasks/main.yml index d592dc306..55cd94460 100644 --- a/roles/fluentd_master/tasks/main.yml +++ b/roles/fluentd_master/tasks/main.yml @@ -39,12 +39,16 @@ owner: 'td-agent' mode: 0444 -- name: "Pause before restarting td-agent and openshift-master, depending on the number of nodes." - pause: seconds={{ ( num_nodes|int < 3 ) | ternary(15, (num_nodes|int * 5)) }} +- name: wait for etcd to start up + wait_for: port=4001 delay=10 + when: embedded_etcd | bool + +- name: wait for etcd peer to start up + wait_for: port=7001 delay=10 + when: embedded_etcd | bool - name: ensure td-agent is running service: name: 'td-agent' state: started enabled: yes - diff --git a/roles/lib_zabbix/library/zbx_action.py b/roles/lib_zabbix/library/zbx_action.py new file mode 100644 index 000000000..d64cebae1 --- /dev/null +++ b/roles/lib_zabbix/library/zbx_action.py @@ -0,0 +1,538 @@ +#!/usr/bin/env python +''' + Ansible module for zabbix actions +''' +# vim: expandtab:tabstop=4:shiftwidth=4 +# +# Zabbix action ansible module +# +# +# Copyright 2015 Red Hat Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# This is in place because each module looks similar to each other. +# These need duplicate code as their behavior is very similar +# but different for each zabbix class. +# pylint: disable=duplicate-code + +# pylint: disable=import-error +from openshift_tools.monitoring.zbxapi import ZabbixAPI, ZabbixConnection, ZabbixAPIError + +def exists(content, key='result'): + ''' Check if key exists in content or the size of content[key] > 0 + ''' + if not content.has_key(key): + return False + + if not content[key]: + return False + + return True + +def conditions_equal(zab_conditions, user_conditions): + '''Compare two lists of conditions''' + c_type = 'conditiontype' + _op = 'operator' + val = 'value' + if len(user_conditions) != len(zab_conditions): + return False + + for zab_cond, user_cond in zip(zab_conditions, user_conditions): + if zab_cond[c_type] != str(user_cond[c_type]) or zab_cond[_op] != str(user_cond[_op]) or \ + zab_cond[val] != str(user_cond[val]): + return False + + return True + +def filter_differences(zabbix_filters, user_filters): + '''Determine the differences from user and zabbix for operations''' + rval = {} + for key, val in user_filters.items(): + + if key == 'conditions': + if not conditions_equal(zabbix_filters[key], val): + rval[key] = val + + elif zabbix_filters[key] != str(val): + rval[key] = val + + return rval + +# This logic is quite complex. We are comparing two lists of dictionaries. +# The outer for-loops allow us to descend down into both lists at the same time +# and then walk over the key,val pairs of the incoming user dict's changes +# or updates. The if-statements are looking at different sub-object types and +# comparing them. The other suggestion on how to write this is to write a recursive +# compare function but for the time constraints and for complexity I decided to go +# this route. +# pylint: disable=too-many-branches +def operation_differences(zabbix_ops, user_ops): + '''Determine the differences from user and zabbix for operations''' + + # if they don't match, take the user options + if len(zabbix_ops) != len(user_ops): + return user_ops + + rval = {} + for zab, user in zip(zabbix_ops, user_ops): + for key, val in user.items(): + if key == 'opconditions': + for z_cond, u_cond in zip(zab[key], user[key]): + if not all([str(u_cond[op_key]) == z_cond[op_key] for op_key in \ + ['conditiontype', 'operator', 'value']]): + rval[key] = val + break + elif key == 'opmessage': + # Verify each passed param matches + for op_msg_key, op_msg_val in val.items(): + if zab[key][op_msg_key] != str(op_msg_val): + rval[key] = val + break + + elif key == 'opmessage_grp': + zab_grp_ids = set([ugrp['usrgrpid'] for ugrp in zab[key]]) + usr_grp_ids = set([ugrp['usrgrpid'] for ugrp in val]) + if usr_grp_ids != zab_grp_ids: + rval[key] = val + + elif key == 'opmessage_usr': + zab_usr_ids = set([usr['userid'] for usr in zab[key]]) + usr_ids = set([usr['userid'] for usr in val]) + if usr_ids != zab_usr_ids: + rval[key] = val + + elif zab[key] != str(val): + rval[key] = val + return rval + +def get_users(zapi, users): + '''get the mediatype id from the mediatype name''' + rval_users = [] + + for user in users: + content = zapi.get_content('user', + 'get', + {'filter': {'alias': user}}) + rval_users.append({'userid': content['result'][0]['userid']}) + + return rval_users + +def get_user_groups(zapi, groups): + '''get the mediatype id from the mediatype name''' + user_groups = [] + + content = zapi.get_content('usergroup', + 'get', + {'search': {'name': groups}}) + + for usr_grp in content['result']: + user_groups.append({'usrgrpid': usr_grp['usrgrpid']}) + + return user_groups + +def get_mediatype_id_by_name(zapi, m_name): + '''get the mediatype id from the mediatype name''' + content = zapi.get_content('mediatype', + 'get', + {'filter': {'description': m_name}}) + + return content['result'][0]['mediatypeid'] + +def get_priority(priority): + ''' determine priority + ''' + prior = 0 + if 'info' in priority: + prior = 1 + elif 'warn' in priority: + prior = 2 + elif 'avg' == priority or 'ave' in priority: + prior = 3 + elif 'high' in priority: + prior = 4 + elif 'dis' in priority: + prior = 5 + + return prior + +def get_event_source(from_src): + '''Translate even str into value''' + choices = ['trigger', 'discovery', 'auto', 'internal'] + rval = 0 + try: + rval = choices.index(from_src) + except ValueError as _: + ZabbixAPIError('Value not found for event source [%s]' % from_src) + + return rval + +def get_status(inc_status): + '''determine status for action''' + rval = 1 + if inc_status == 'enabled': + rval = 0 + + return rval + +def get_condition_operator(inc_operator): + ''' determine the condition operator''' + vals = {'=': 0, + '<>': 1, + 'like': 2, + 'not like': 3, + 'in': 4, + '>=': 5, + '<=': 6, + 'not in': 7, + } + + return vals[inc_operator] + +def get_host_id_by_name(zapi, host_name): + '''Get host id by name''' + content = zapi.get_content('host', + 'get', + {'filter': {'name': host_name}}) + + return content['result'][0]['hostid'] + +def get_trigger_value(inc_trigger): + '''determine the proper trigger value''' + rval = 1 + if inc_trigger == 'PROBLEM': + rval = 1 + else: + rval = 0 + + return rval + +def get_template_id_by_name(zapi, t_name): + '''get the template id by name''' + content = zapi.get_content('template', + 'get', + {'filter': {'host': t_name}}) + + return content['result'][0]['templateid'] + + +def get_host_group_id_by_name(zapi, hg_name): + '''Get hostgroup id by name''' + content = zapi.get_content('hostgroup', + 'get', + {'filter': {'name': hg_name}}) + + return content['result'][0]['groupid'] + +def get_condition_type(event_source, inc_condition): + '''determine the condition type''' + c_types = {} + if event_source == 'trigger': + c_types = {'host group': 0, + 'host': 1, + 'trigger': 2, + 'trigger name': 3, + 'trigger severity': 4, + 'trigger value': 5, + 'time period': 6, + 'host template': 13, + 'application': 15, + 'maintenance status': 16, + } + + elif event_source == 'discovery': + c_types = {'host IP': 7, + 'discovered service type': 8, + 'discovered service port': 9, + 'discovery status': 10, + 'uptime or downtime duration': 11, + 'received value': 12, + 'discovery rule': 18, + 'discovery check': 19, + 'proxy': 20, + 'discovery object': 21, + } + + elif event_source == 'auto': + c_types = {'proxy': 20, + 'host name': 22, + 'host metadata': 24, + } + + elif event_source == 'internal': + c_types = {'host group': 0, + 'host': 1, + 'host template': 13, + 'application': 15, + 'event type': 23, + } + else: + raise ZabbixAPIError('Unkown event source %s' % event_source) + + return c_types[inc_condition] + +def get_operation_type(inc_operation): + ''' determine the correct operation type''' + o_types = {'send message': 0, + 'remote command': 1, + 'add host': 2, + 'remove host': 3, + 'add to host group': 4, + 'remove from host group': 5, + 'link to template': 6, + 'unlink from template': 7, + 'enable host': 8, + 'disable host': 9, + } + + return o_types[inc_operation] + +def get_action_operations(zapi, inc_operations): + '''Convert the operations into syntax for api''' + for operation in inc_operations: + operation['operationtype'] = get_operation_type(operation['operationtype']) + if operation['operationtype'] == 0: # send message. Need to fix the + operation['opmessage']['mediatypeid'] = \ + get_mediatype_id_by_name(zapi, operation['opmessage']['mediatypeid']) + operation['opmessage_grp'] = get_user_groups(zapi, operation.get('opmessage_grp', [])) + operation['opmessage_usr'] = get_users(zapi, operation.get('opmessage_usr', [])) + if operation['opmessage']['default_msg']: + operation['opmessage']['default_msg'] = 1 + else: + operation['opmessage']['default_msg'] = 0 + + # NOT supported for remote commands + elif operation['operationtype'] == 1: + continue + + # Handle Operation conditions: + # Currently there is only 1 available which + # is 'event acknowledged'. In the future + # if there are any added we will need to pass this + # option to a function and return the correct conditiontype + if operation.has_key('opconditions'): + for condition in operation['opconditions']: + if condition['conditiontype'] == 'event acknowledged': + condition['conditiontype'] = 14 + + if condition['operator'] == '=': + condition['operator'] = 0 + + if condition['value'] == 'acknowledged': + condition['operator'] = 1 + else: + condition['operator'] = 0 + + + return inc_operations + +def get_operation_evaltype(inc_type): + '''get the operation evaltype''' + rval = 0 + if inc_type == 'and/or': + rval = 0 + elif inc_type == 'and': + rval = 1 + elif inc_type == 'or': + rval = 2 + elif inc_type == 'custom': + rval = 3 + + return rval + +def get_action_conditions(zapi, event_source, inc_conditions): + '''Convert the conditions into syntax for api''' + + calc_type = inc_conditions.pop('calculation_type') + inc_conditions['evaltype'] = get_operation_evaltype(calc_type) + for cond in inc_conditions['conditions']: + + cond['operator'] = get_condition_operator(cond['operator']) + # Based on conditiontype we need to set the proper value + # e.g. conditiontype = hostgroup then the value needs to be a hostgroup id + # e.g. conditiontype = host the value needs to be a host id + cond['conditiontype'] = get_condition_type(event_source, cond['conditiontype']) + if cond['conditiontype'] == 0: + cond['value'] = get_host_group_id_by_name(zapi, cond['value']) + elif cond['conditiontype'] == 1: + cond['value'] = get_host_id_by_name(zapi, cond['value']) + elif cond['conditiontype'] == 4: + cond['value'] = get_priority(cond['value']) + + elif cond['conditiontype'] == 5: + cond['value'] = get_trigger_value(cond['value']) + elif cond['conditiontype'] == 13: + cond['value'] = get_template_id_by_name(zapi, cond['value']) + elif cond['conditiontype'] == 16: + cond['value'] = '' + + return inc_conditions + + +def get_send_recovery(send_recovery): + '''Get the integer value''' + rval = 0 + if send_recovery: + rval = 1 + + return rval + +# The branches are needed for CRUD and error handling +# pylint: disable=too-many-branches +def main(): + ''' + ansible zabbix module for zbx_item + ''' + + + module = AnsibleModule( + argument_spec=dict( + zbx_server=dict(default='https://localhost/zabbix/api_jsonrpc.php', type='str'), + zbx_user=dict(default=os.environ.get('ZABBIX_USER', None), type='str'), + zbx_password=dict(default=os.environ.get('ZABBIX_PASSWORD', None), type='str'), + zbx_debug=dict(default=False, type='bool'), + + name=dict(default=None, type='str'), + event_source=dict(default='trigger', choices=['trigger', 'discovery', 'auto', 'internal'], type='str'), + action_subject=dict(default="{TRIGGER.NAME}: {TRIGGER.STATUS}", type='str'), + action_message=dict(default="{TRIGGER.NAME}: {TRIGGER.STATUS}\r\n" + + "Last value: {ITEM.LASTVALUE}\r\n\r\n{TRIGGER.URL}", type='str'), + reply_subject=dict(default="{TRIGGER.NAME}: {TRIGGER.STATUS}", type='str'), + reply_message=dict(default="Trigger: {TRIGGER.NAME}\r\nTrigger status: {TRIGGER.STATUS}\r\n" + + "Trigger severity: {TRIGGER.SEVERITY}\r\nTrigger URL: {TRIGGER.URL}\r\n\r\n" + + "Item values:\r\n\r\n1. {ITEM.NAME1} ({HOST.NAME1}:{ITEM.KEY1}): " + + "{ITEM.VALUE1}\r\n2. {ITEM.NAME2} ({HOST.NAME2}:{ITEM.KEY2}): " + + "{ITEM.VALUE2}\r\n3. {ITEM.NAME3} ({HOST.NAME3}:{ITEM.KEY3}): " + + "{ITEM.VALUE3}", type='str'), + send_recovery=dict(default=False, type='bool'), + status=dict(default=None, type='str'), + escalation_time=dict(default=60, type='int'), + conditions_filter=dict(default=None, type='dict'), + operations=dict(default=None, type='list'), + state=dict(default='present', type='str'), + ), + #supports_check_mode=True + ) + + zapi = ZabbixAPI(ZabbixConnection(module.params['zbx_server'], + module.params['zbx_user'], + module.params['zbx_password'], + module.params['zbx_debug'])) + + #Set the instance and the template for the rest of the calls + zbx_class_name = 'action' + state = module.params['state'] + + content = zapi.get_content(zbx_class_name, + 'get', + {'search': {'name': module.params['name']}, + 'selectFilter': 'extend', + 'selectOperations': 'extend', + }) + + #******# + # GET + #******# + if state == 'list': + module.exit_json(changed=False, results=content['result'], state="list") + + #******# + # DELETE + #******# + if state == 'absent': + if not exists(content): + module.exit_json(changed=False, state="absent") + + content = zapi.get_content(zbx_class_name, 'delete', [content['result'][0]['itemid']]) + module.exit_json(changed=True, results=content['result'], state="absent") + + # Create and Update + if state == 'present': + + conditions = get_action_conditions(zapi, module.params['event_source'], module.params['conditions_filter']) + operations = get_action_operations(zapi, module.params['operations']) + params = {'name': module.params['name'], + 'esc_period': module.params['escalation_time'], + 'eventsource': get_event_source(module.params['event_source']), + 'status': get_status(module.params['status']), + 'def_shortdata': module.params['action_subject'], + 'def_longdata': module.params['action_message'], + 'r_shortdata': module.params['reply_subject'], + 'r_longdata': module.params['reply_message'], + 'recovery_msg': get_send_recovery(module.params['send_recovery']), + 'filter': conditions, + 'operations': operations, + } + + # Remove any None valued params + _ = [params.pop(key, None) for key in params.keys() if params[key] is None] + + #******# + # CREATE + #******# + if not exists(content): + content = zapi.get_content(zbx_class_name, 'create', params) + + if content.has_key('error'): + module.exit_json(failed=True, changed=True, results=content['error'], state="present") + + module.exit_json(changed=True, results=content['result'], state='present') + + + ######## + # UPDATE + ######## + _ = params.pop('hostid', None) + differences = {} + zab_results = content['result'][0] + for key, value in params.items(): + + if key == 'operations': + ops = operation_differences(zab_results[key], value) + if ops: + differences[key] = ops + + elif key == 'filter': + filters = filter_differences(zab_results[key], value) + if filters: + differences[key] = filters + + elif zab_results[key] != value and zab_results[key] != str(value): + differences[key] = value + + if not differences: + module.exit_json(changed=False, results=zab_results, state="present") + + # We have differences and need to update. + # action update requires an id, filters, and operations + differences['actionid'] = zab_results['actionid'] + differences['operations'] = params['operations'] + differences['filter'] = params['filter'] + content = zapi.get_content(zbx_class_name, 'update', differences) + + if content.has_key('error'): + module.exit_json(failed=True, changed=False, results=content['error'], state="present") + + module.exit_json(changed=True, results=content['result'], state="present") + + module.exit_json(failed=True, + changed=False, + results='Unknown state passed. %s' % state, + state="unknown") + +# pylint: disable=redefined-builtin, unused-wildcard-import, wildcard-import, locally-disabled +# import module snippets. This are required +from ansible.module_utils.basic import * + +main() diff --git a/roles/lib_zabbix/library/zbx_application.py b/roles/lib_zabbix/library/zbx_application.py index d3d08c9dd..21e3d91f4 100644 --- a/roles/lib_zabbix/library/zbx_application.py +++ b/roles/lib_zabbix/library/zbx_application.py @@ -41,16 +41,17 @@ def exists(content, key='result'): return True -def get_template_ids(zapi, template_names): +def get_template_ids(zapi, template_name): ''' get related templates ''' template_ids = [] # Fetch templates by name - for template_name in template_names: - content = zapi.get_content('template', 'get', {'search': {'host': template_name}}) - if content.has_key('result'): - template_ids.append(content['result'][0]['templateid']) + content = zapi.get_content('template', + 'get', + {'search': {'host': template_name}}) + if content.has_key('result'): + template_ids.append(content['result'][0]['templateid']) return template_ids def main(): @@ -63,8 +64,8 @@ def main(): zbx_user=dict(default=os.environ.get('ZABBIX_USER', None), type='str'), zbx_password=dict(default=os.environ.get('ZABBIX_PASSWORD', None), type='str'), zbx_debug=dict(default=False, type='bool'), - name=dict(default=None, type='str'), - template_name=dict(default=None, type='list'), + name=dict(default=None, type='str', required=True), + template_name=dict(default=None, type='str'), state=dict(default='present', type='str'), ), #supports_check_mode=True @@ -81,10 +82,11 @@ def main(): aname = module.params['name'] state = module.params['state'] # get a applicationid, see if it exists + tids = get_template_ids(zapi, module.params['template_name']) content = zapi.get_content(zbx_class_name, 'get', {'search': {'name': aname}, - 'selectHost': 'hostid', + 'templateids': tids[0], }) if state == 'list': module.exit_json(changed=False, results=content['result'], state="list") @@ -97,9 +99,10 @@ def main(): module.exit_json(changed=True, results=content['result'], state="absent") if state == 'present': - params = {'hostid': get_template_ids(zapi, module.params['template_name'])[0], + params = {'hostid': tids[0], 'name': aname, } + if not exists(content): # if we didn't find it, create it content = zapi.get_content(zbx_class_name, 'create', params) diff --git a/roles/lib_zabbix/library/zbx_discoveryrule.py b/roles/lib_zabbix/library/zbx_discoveryrule.py index 71a0580c2..f52f350a5 100644 --- a/roles/lib_zabbix/library/zbx_discoveryrule.py +++ b/roles/lib_zabbix/library/zbx_discoveryrule.py @@ -85,6 +85,7 @@ def main(): Ansible module for zabbix discovery rules ''' + module = AnsibleModule( argument_spec=dict( zbx_server=dict(default='https://localhost/zabbix/api_jsonrpc.php', type='str'), @@ -93,6 +94,7 @@ def main(): zbx_debug=dict(default=False, type='bool'), name=dict(default=None, type='str'), key=dict(default=None, type='str'), + description=dict(default=None, type='str'), interfaceid=dict(default=None, type='int'), ztype=dict(default='trapper', type='str'), delay=dict(default=60, type='int'), @@ -113,18 +115,27 @@ def main(): idname = "itemid" dname = module.params['name'] state = module.params['state'] + template = get_template(zapi, module.params['template_name']) # selectInterfaces doesn't appear to be working but is needed. content = zapi.get_content(zbx_class_name, 'get', {'search': {'name': dname}, + 'templateids': template['templateid'], #'selectDServices': 'extend', #'selectDChecks': 'extend', #'selectDhosts': 'dhostid', }) + + #******# + # GET + #******# if state == 'list': module.exit_json(changed=False, results=content['result'], state="list") + #******# + # DELETE + #******# if state == 'absent': if not exists(content): module.exit_json(changed=False, state="absent") @@ -132,24 +143,37 @@ def main(): content = zapi.get_content(zbx_class_name, 'delete', [content['result'][0][idname]]) module.exit_json(changed=True, results=content['result'], state="absent") + + # Create and Update if state == 'present': - template = get_template(zapi, module.params['template_name']) params = {'name': dname, 'key_': module.params['key'], 'hostid': template['templateid'], 'interfaceid': module.params['interfaceid'], 'lifetime': module.params['lifetime'], 'type': get_type(module.params['ztype']), + 'description': module.params['description'], } if params['type'] in [2, 5, 7, 11]: params.pop('interfaceid') + # Remove any None valued params + _ = [params.pop(key, None) for key in params.keys() if params[key] is None] + + #******# + # CREATE + #******# if not exists(content): - # if we didn't find it, create it content = zapi.get_content(zbx_class_name, 'create', params) + + if content.has_key('error'): + module.exit_json(failed=True, changed=True, results=content['error'], state="present") + module.exit_json(changed=True, results=content['result'], state='present') - # already exists, we need to update it - # let's compare properties + + ######## + # UPDATE + ######## differences = {} zab_results = content['result'][0] for key, value in params.items(): @@ -163,6 +187,10 @@ def main(): # We have differences and need to update differences[idname] = zab_results[idname] content = zapi.get_content(zbx_class_name, 'update', differences) + + if content.has_key('error'): + module.exit_json(failed=True, changed=False, results=content['error'], state="present") + module.exit_json(changed=True, results=content['result'], state="present") module.exit_json(failed=True, diff --git a/roles/lib_zabbix/library/zbx_item.py b/roles/lib_zabbix/library/zbx_item.py index bcd389e47..6faa82dfc 100644 --- a/roles/lib_zabbix/library/zbx_item.py +++ b/roles/lib_zabbix/library/zbx_item.py @@ -53,6 +53,8 @@ def get_value_type(value_type): vtype = 0 if 'int' in value_type: vtype = 3 + elif 'log' in value_type: + vtype = 2 elif 'char' in value_type: vtype = 1 elif 'str' in value_type: @@ -60,18 +62,53 @@ def get_value_type(value_type): return vtype -def get_app_ids(zapi, application_names): +def get_app_ids(application_names, app_name_ids): ''' get application ids from names ''' - if isinstance(application_names, str): - application_names = [application_names] - app_ids = [] - for app_name in application_names: - content = zapi.get_content('application', 'get', {'search': {'name': app_name}}) - if content.has_key('result'): - app_ids.append(content['result'][0]['applicationid']) - return app_ids + applications = [] + if application_names: + for app in application_names: + applications.append(app_name_ids[app]) + return applications + +def get_template_id(zapi, template_name): + ''' + get related templates + ''' + template_ids = [] + app_ids = {} + # Fetch templates by name + content = zapi.get_content('template', + 'get', + {'search': {'host': template_name}, + 'selectApplications': ['applicationid', 'name']}) + if content.has_key('result'): + template_ids.append(content['result'][0]['templateid']) + for app in content['result'][0]['applications']: + app_ids[app['name']] = app['applicationid'] + + return template_ids, app_ids + +def get_multiplier(inval): + ''' Determine the multiplier + ''' + if inval == None or inval == '': + return None, 0 + + rval = None + try: + rval = int(inval) + except ValueError: + pass + + if rval: + return rval, 1 + + return rval, 0 + +# The branches are needed for CRUD and error handling +# pylint: disable=too-many-branches def main(): ''' ansible zabbix module for zbx_item @@ -88,7 +125,10 @@ def main(): template_name=dict(default=None, type='str'), zabbix_type=dict(default=2, type='int'), value_type=dict(default='int', type='str'), - applications=dict(default=[], type='list'), + multiplier=dict(default=None, type='str'), + description=dict(default=None, type='str'), + units=dict(default=None, type='str'), + applications=dict(default=None, type='list'), state=dict(default='present', type='str'), ), #supports_check_mode=True @@ -101,59 +141,83 @@ def main(): #Set the instance and the template for the rest of the calls zbx_class_name = 'item' - idname = "itemid" state = module.params['state'] - key = module.params['key'] - template_name = module.params['template_name'] - - content = zapi.get_content('template', 'get', {'search': {'host': template_name}}) - templateid = None - if content['result']: - templateid = content['result'][0]['templateid'] - else: - module.exit_json(changed=False, - results='Error: Could find template with name %s for item.' % template_name, + + templateid, app_name_ids = get_template_id(zapi, module.params['template_name']) + + # Fail if a template was not found matching the name + if not templateid: + module.exit_json(failed=True, + changed=False, + results='Error: Could find template with name %s for item.' % module.params['template_name'], state="Unkown") content = zapi.get_content(zbx_class_name, 'get', - {'search': {'key_': key}, + {'search': {'key_': module.params['key']}, 'selectApplications': 'applicationid', + 'templateids': templateid, }) + #******# + # GET + #******# if state == 'list': module.exit_json(changed=False, results=content['result'], state="list") + #******# + # DELETE + #******# if state == 'absent': if not exists(content): module.exit_json(changed=False, state="absent") - content = zapi.get_content(zbx_class_name, 'delete', [content['result'][0][idname]]) + content = zapi.get_content(zbx_class_name, 'delete', [content['result'][0]['itemid']]) module.exit_json(changed=True, results=content['result'], state="absent") + # Create and Update if state == 'present': + + formula, use_multiplier = get_multiplier(module.params['multiplier']) params = {'name': module.params.get('name', module.params['key']), - 'key_': key, - 'hostid': templateid, + 'key_': module.params['key'], + 'hostid': templateid[0], 'type': module.params['zabbix_type'], 'value_type': get_value_type(module.params['value_type']), - 'applications': get_app_ids(zapi, module.params['applications']), + 'applications': get_app_ids(module.params['applications'], app_name_ids), + 'formula': formula, + 'multiplier': use_multiplier, + 'description': module.params['description'], + 'units': module.params['units'], } + # Remove any None valued params + _ = [params.pop(key, None) for key in params.keys() if params[key] is None] + + #******# + # CREATE + #******# if not exists(content): - # if we didn't find it, create it content = zapi.get_content(zbx_class_name, 'create', params) + + if content.has_key('error'): + module.exit_json(failed=True, changed=True, results=content['error'], state="present") + module.exit_json(changed=True, results=content['result'], state='present') - # already exists, we need to update it - # let's compare properties + + + ######## + # UPDATE + ######## + _ = params.pop('hostid', None) differences = {} zab_results = content['result'][0] for key, value in params.items(): if key == 'applications': - zab_apps = set([item['applicationid'] for item in zab_results[key]]) - if zab_apps != set(value): - differences[key] = zab_apps + app_ids = [item['applicationid'] for item in zab_results[key]] + if set(app_ids) != set(value): + differences[key] = value elif zab_results[key] != value and zab_results[key] != str(value): differences[key] = value @@ -162,8 +226,12 @@ def main(): module.exit_json(changed=False, results=zab_results, state="present") # We have differences and need to update - differences[idname] = zab_results[idname] + differences['itemid'] = zab_results['itemid'] content = zapi.get_content(zbx_class_name, 'update', differences) + + if content.has_key('error'): + module.exit_json(failed=True, changed=False, results=content['error'], state="present") + module.exit_json(changed=True, results=content['result'], state="present") module.exit_json(failed=True, diff --git a/roles/lib_zabbix/library/zbx_itemprototype.py b/roles/lib_zabbix/library/zbx_itemprototype.py index 24f85710d..e7fd6fa21 100644 --- a/roles/lib_zabbix/library/zbx_itemprototype.py +++ b/roles/lib_zabbix/library/zbx_itemprototype.py @@ -38,13 +38,14 @@ def exists(content, key='result'): return True -def get_rule_id(zapi, discoveryrule_name): +def get_rule_id(zapi, discoveryrule_key, templateid): '''get a discoveryrule by name ''' content = zapi.get_content('discoveryrule', 'get', - {'search': {'name': discoveryrule_name}, + {'search': {'key_': discoveryrule_key}, 'output': 'extend', + 'templateids': templateid, }) if not content['result']: return None @@ -53,6 +54,9 @@ def get_rule_id(zapi, discoveryrule_name): def get_template(zapi, template_name): '''get a template by name ''' + if not template_name: + return None + content = zapi.get_content('template', 'get', {'search': {'host': template_name}, @@ -124,16 +128,17 @@ def get_status(status): return _status -def get_app_ids(zapi, application_names): +def get_app_ids(zapi, application_names, templateid): ''' get application ids from names ''' app_ids = [] for app_name in application_names: - content = zapi.get_content('application', 'get', {'search': {'name': app_name}}) + content = zapi.get_content('application', 'get', {'filter': {'name': app_name}, 'templateids': templateid}) if content.has_key('result'): app_ids.append(content['result'][0]['applicationid']) return app_ids +# pylint: disable=too-many-branches def main(): ''' Ansible module for zabbix discovery rules @@ -147,16 +152,17 @@ def main(): zbx_debug=dict(default=False, type='bool'), name=dict(default=None, type='str'), key=dict(default=None, type='str'), + description=dict(default=None, type='str'), interfaceid=dict(default=None, type='int'), ztype=dict(default='trapper', type='str'), value_type=dict(default='float', type='str'), delay=dict(default=60, type='int'), lifetime=dict(default=30, type='int'), - template_name=dict(default=[], type='list'), state=dict(default='present', type='str'), status=dict(default='enabled', type='str'), - discoveryrule_name=dict(default=None, type='str'), applications=dict(default=[], type='list'), + template_name=dict(default=None, type='str'), + discoveryrule_key=dict(default=None, type='str'), ), #supports_check_mode=True ) @@ -169,20 +175,27 @@ def main(): #Set the instance and the template for the rest of the calls zbx_class_name = 'itemprototype' idname = "itemid" - dname = module.params['name'] state = module.params['state'] + template = get_template(zapi, module.params['template_name']) # selectInterfaces doesn't appear to be working but is needed. content = zapi.get_content(zbx_class_name, 'get', - {'search': {'name': dname}, + {'search': {'key_': module.params['key']}, 'selectApplications': 'applicationid', 'selectDiscoveryRule': 'itemid', - #'selectDhosts': 'dhostid', + 'templated': True, }) + + #******# + # GET + #******# if state == 'list': module.exit_json(changed=False, results=content['result'], state="list") + #******# + # DELETE + #******# if state == 'absent': if not exists(content): module.exit_json(changed=False, state="absent") @@ -190,26 +203,39 @@ def main(): content = zapi.get_content(zbx_class_name, 'delete', [content['result'][0][idname]]) module.exit_json(changed=True, results=content['result'], state="absent") + # Create and Update if state == 'present': - template = get_template(zapi, module.params['template_name']) - params = {'name': dname, + params = {'name': module.params['name'], 'key_': module.params['key'], 'hostid': template['templateid'], 'interfaceid': module.params['interfaceid'], - 'ruleid': get_rule_id(zapi, module.params['discoveryrule_name']), + 'ruleid': get_rule_id(zapi, module.params['discoveryrule_key'], template['templateid']), 'type': get_type(module.params['ztype']), 'value_type': get_value_type(module.params['value_type']), - 'applications': get_app_ids(zapi, module.params['applications']), + 'applications': get_app_ids(zapi, module.params['applications'], template['templateid']), + 'description': module.params['description'], } + if params['type'] in [2, 5, 7, 8, 11, 15]: params.pop('interfaceid') + # Remove any None valued params + _ = [params.pop(key, None) for key in params.keys() if params[key] is None] + + #******# + # CREATE + #******# if not exists(content): - # if we didn't find it, create it content = zapi.get_content(zbx_class_name, 'create', params) + + if content.has_key('error'): + module.exit_json(failed=True, changed=False, results=content['error'], state="present") + module.exit_json(changed=True, results=content['result'], state='present') - # already exists, we need to update it - # let's compare properties + + #******# + # UPDATE + #******# differences = {} zab_results = content['result'][0] for key, value in params.items(): @@ -218,6 +244,11 @@ def main(): if value != zab_results['discoveryRule']['itemid']: differences[key] = value + elif key == 'applications': + app_ids = [app['applicationid'] for app in zab_results[key]] + if set(app_ids) - set(value): + differences[key] = value + elif zab_results[key] != value and zab_results[key] != str(value): differences[key] = value @@ -227,6 +258,10 @@ def main(): # We have differences and need to update differences[idname] = zab_results[idname] content = zapi.get_content(zbx_class_name, 'update', differences) + + if content.has_key('error'): + module.exit_json(failed=True, changed=False, results=content['error'], state="present") + module.exit_json(changed=True, results=content['result'], state="present") module.exit_json(failed=True, diff --git a/roles/lib_zabbix/library/zbx_trigger.py b/roles/lib_zabbix/library/zbx_trigger.py index 43f3d2677..ab7731faa 100644 --- a/roles/lib_zabbix/library/zbx_trigger.py +++ b/roles/lib_zabbix/library/zbx_trigger.py @@ -65,7 +65,7 @@ def get_deps(zapi, deps): for desc in deps: content = zapi.get_content('trigger', 'get', - {'search': {'description': desc}, + {'filter': {'description': desc}, 'expandExpression': True, 'selectDependencies': 'triggerid', }) @@ -74,6 +74,36 @@ def get_deps(zapi, deps): return results + +def get_trigger_status(inc_status): + ''' Determine the trigger's status + 0 is enabled + 1 is disabled + ''' + r_status = 0 + if inc_status == 'disabled': + r_status = 1 + + return r_status + +def get_template_id(zapi, template_name): + ''' + get related templates + ''' + template_ids = [] + app_ids = {} + # Fetch templates by name + content = zapi.get_content('template', + 'get', + {'search': {'host': template_name}, + 'selectApplications': ['applicationid', 'name']}) + if content.has_key('result'): + template_ids.append(content['result'][0]['templateid']) + for app in content['result'][0]['applications']: + app_ids[app['name']] = app['applicationid'] + + return template_ids, app_ids + def main(): ''' Create a trigger in zabbix @@ -98,10 +128,14 @@ def main(): zbx_password=dict(default=os.environ.get('ZABBIX_PASSWORD', None), type='str'), zbx_debug=dict(default=False, type='bool'), expression=dict(default=None, type='str'), + name=dict(default=None, type='str'), description=dict(default=None, type='str'), dependencies=dict(default=[], type='list'), priority=dict(default='avg', type='str'), + url=dict(default=None, type='str'), + status=dict(default=None, type='str'), state=dict(default='present', type='str'), + template_name=dict(default=None, type='str'), ), #supports_check_mode=True ) @@ -115,36 +149,60 @@ def main(): zbx_class_name = 'trigger' idname = "triggerid" state = module.params['state'] - description = module.params['description'] + tname = module.params['name'] + + templateid = None + if module.params['template_name']: + templateid, _ = get_template_id(zapi, module.params['template_name']) content = zapi.get_content(zbx_class_name, 'get', - {'search': {'description': description}, + {'filter': {'description': tname}, 'expandExpression': True, 'selectDependencies': 'triggerid', + 'templateids': templateid, }) + + # Get if state == 'list': module.exit_json(changed=False, results=content['result'], state="list") + # Delete if state == 'absent': if not exists(content): module.exit_json(changed=False, state="absent") content = zapi.get_content(zbx_class_name, 'delete', [content['result'][0][idname]]) module.exit_json(changed=True, results=content['result'], state="absent") + # Create and Update if state == 'present': - params = {'description': description, + params = {'description': tname, + 'comments': module.params['description'], 'expression': module.params['expression'], 'dependencies': get_deps(zapi, module.params['dependencies']), 'priority': get_priority(module.params['priority']), + 'url': module.params['url'], + 'status': get_trigger_status(module.params['status']), } + # Remove any None valued params + _ = [params.pop(key, None) for key in params.keys() if params[key] is None] + + #******# + # CREATE + #******# if not exists(content): # if we didn't find it, create it content = zapi.get_content(zbx_class_name, 'create', params) + + if content.has_key('error'): + module.exit_json(failed=True, changed=True, results=content['error'], state="present") + module.exit_json(changed=True, results=content['result'], state='present') - # already exists, we need to update it - # let's compare properties + + ######## + # UPDATE + ######## differences = {} zab_results = content['result'][0] for key, value in params.items(): diff --git a/roles/lib_zabbix/library/zbx_triggerprototype.py b/roles/lib_zabbix/library/zbx_triggerprototype.py new file mode 100644 index 000000000..c1224b268 --- /dev/null +++ b/roles/lib_zabbix/library/zbx_triggerprototype.py @@ -0,0 +1,177 @@ +#!/usr/bin/env python +''' +ansible module for zabbix triggerprototypes +''' +# vim: expandtab:tabstop=4:shiftwidth=4 +# +# Zabbix triggerprototypes ansible module +# +# +# Copyright 2015 Red Hat Inc. +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# + +# This is in place because each module looks similar to each other. +# These need duplicate code as their behavior is very similar +# but different for each zabbix class. +# pylint: disable=duplicate-code + +# pylint: disable=import-error +from openshift_tools.monitoring.zbxapi import ZabbixAPI, ZabbixConnection + +def exists(content, key='result'): + ''' Check if key exists in content or the size of content[key] > 0 + ''' + if not content.has_key(key): + return False + + if not content[key]: + return False + + return True + +def get_priority(priority): + ''' determine priority + ''' + prior = 0 + if 'info' in priority: + prior = 1 + elif 'warn' in priority: + prior = 2 + elif 'avg' == priority or 'ave' in priority: + prior = 3 + elif 'high' in priority: + prior = 4 + elif 'dis' in priority: + prior = 5 + + return prior + +def get_trigger_status(inc_status): + ''' Determine the trigger's status + 0 is enabled + 1 is disabled + ''' + r_status = 0 + if inc_status == 'disabled': + r_status = 1 + + return r_status + + +def main(): + ''' + Create a triggerprototype in zabbix + ''' + + module = AnsibleModule( + argument_spec=dict( + zbx_server=dict(default='https://localhost/zabbix/api_jsonrpc.php', type='str'), + zbx_user=dict(default=os.environ.get('ZABBIX_USER', None), type='str'), + zbx_password=dict(default=os.environ.get('ZABBIX_PASSWORD', None), type='str'), + zbx_debug=dict(default=False, type='bool'), + name=dict(default=None, type='str'), + expression=dict(default=None, type='str'), + description=dict(default=None, type='str'), + priority=dict(default='avg', type='str'), + url=dict(default=None, type='str'), + status=dict(default=None, type='str'), + state=dict(default='present', type='str'), + ), + #supports_check_mode=True + ) + + zapi = ZabbixAPI(ZabbixConnection(module.params['zbx_server'], + module.params['zbx_user'], + module.params['zbx_password'], + module.params['zbx_debug'])) + + #Set the instance and the template for the rest of the calls + zbx_class_name = 'triggerprototype' + idname = "triggerid" + state = module.params['state'] + tname = module.params['name'] + + content = zapi.get_content(zbx_class_name, + 'get', + {'filter': {'description': tname}, + 'expandExpression': True, + 'selectDependencies': 'triggerid', + }) + + # Get + if state == 'list': + module.exit_json(changed=False, results=content['result'], state="list") + + # Delete + if state == 'absent': + if not exists(content): + module.exit_json(changed=False, state="absent") + content = zapi.get_content(zbx_class_name, 'delete', [content['result'][0][idname]]) + module.exit_json(changed=True, results=content['result'], state="absent") + + # Create and Update + if state == 'present': + params = {'description': tname, + 'comments': module.params['description'], + 'expression': module.params['expression'], + 'priority': get_priority(module.params['priority']), + 'url': module.params['url'], + 'status': get_trigger_status(module.params['status']), + } + + # Remove any None valued params + _ = [params.pop(key, None) for key in params.keys() if params[key] is None] + + #******# + # CREATE + #******# + if not exists(content): + # if we didn't find it, create it + content = zapi.get_content(zbx_class_name, 'create', params) + + if content.has_key('error'): + module.exit_json(failed=True, changed=True, results=content['error'], state="present") + + module.exit_json(changed=True, results=content['result'], state='present') + + ######## + # UPDATE + ######## + differences = {} + zab_results = content['result'][0] + for key, value in params.items(): + + if zab_results[key] != value and zab_results[key] != str(value): + differences[key] = value + + if not differences: + module.exit_json(changed=False, results=zab_results, state="present") + + # We have differences and need to update + differences[idname] = zab_results[idname] + content = zapi.get_content(zbx_class_name, 'update', differences) + module.exit_json(changed=True, results=content['result'], state="present") + + + module.exit_json(failed=True, + changed=False, + results='Unknown state passed. %s' % state, + state="unknown") + +# pylint: disable=redefined-builtin, unused-wildcard-import, wildcard-import, locally-disabled +# import module snippets. This are required +from ansible.module_utils.basic import * + +main() diff --git a/roles/lib_zabbix/library/zbx_user.py b/roles/lib_zabbix/library/zbx_user.py index c916fa96a..62c85c1bf 100644 --- a/roles/lib_zabbix/library/zbx_user.py +++ b/roles/lib_zabbix/library/zbx_user.py @@ -164,7 +164,7 @@ def main(): if key == 'usrgrps': # this must be done as a list of ordered dictionaries fails comparison - if not all([True for _ in zab_results[key][0] if _ in value[0]]): + if not all([_ in value for _ in zab_results[key]]): differences[key] = value elif zab_results[key] != value and zab_results[key] != str(value): diff --git a/roles/lib_zabbix/library/zbx_user_media.py b/roles/lib_zabbix/library/zbx_user_media.py index 3f7760475..8895c78c3 100644 --- a/roles/lib_zabbix/library/zbx_user_media.py +++ b/roles/lib_zabbix/library/zbx_user_media.py @@ -54,8 +54,8 @@ def get_mtype(zapi, mtype): except ValueError: pass - content = zapi.get_content('mediatype', 'get', {'search': {'description': mtype}}) - if content.has_key['result'] and content['result']: + content = zapi.get_content('mediatype', 'get', {'filter': {'description': mtype}}) + if content.has_key('result') and content['result']: return content['result'][0]['mediatypeid'] return None @@ -63,7 +63,7 @@ def get_mtype(zapi, mtype): def get_user(zapi, user): ''' Get userids from user aliases ''' - content = zapi.get_content('user', 'get', {'search': {'alias': user}}) + content = zapi.get_content('user', 'get', {'filter': {'alias': user}}) if content['result']: return content['result'][0] @@ -104,15 +104,17 @@ def find_media(medias, user_media): ''' Find the user media in the list of medias ''' for media in medias: - if all([media[key] == user_media[key] for key in user_media.keys()]): + if all([media[key] == str(user_media[key]) for key in user_media.keys()]): return media return None -def get_active(in_active): +def get_active(is_active): '''Determine active value + 0 - enabled + 1 - disabled ''' active = 1 - if in_active: + if is_active: active = 0 return active @@ -128,6 +130,21 @@ def get_mediatype(zapi, mediatype, mediatype_desc): return mtypeid +def preprocess_medias(zapi, medias): + ''' Insert the correct information when processing medias ''' + for media in medias: + # Fetch the mediatypeid from the media desc (name) + if media.has_key('mediatype'): + media['mediatypeid'] = get_mediatype(zapi, mediatype=None, mediatype_desc=media.pop('mediatype')) + + media['active'] = get_active(media.get('active')) + media['severity'] = int(get_severity(media['severity'])) + + return medias + +# Disabling branching as the logic requires branches. +# I've also added a few safeguards which required more branches. +# pylint: disable=too-many-branches def main(): ''' Ansible zabbix module for mediatype @@ -166,11 +183,17 @@ def main(): # User media is fetched through the usermedia.get zbx_user_query = get_zbx_user_query_data(zapi, module.params['login']) - content = zapi.get_content('usermedia', 'get', zbx_user_query) - + content = zapi.get_content('usermedia', 'get', + {'userids': [uid for user, uid in zbx_user_query.items()]}) + ##### + # Get + ##### if state == 'list': module.exit_json(changed=False, results=content['result'], state="list") + ######## + # Delete + ######## if state == 'absent': if not exists(content) or len(content['result']) == 0: module.exit_json(changed=False, state="absent") @@ -178,13 +201,14 @@ def main(): if not module.params['login']: module.exit_json(failed=True, changed=False, results='Must specifiy a user login.', state="absent") - content = zapi.get_content(zbx_class_name, 'deletemedia', [content['result'][0][idname]]) + content = zapi.get_content(zbx_class_name, 'deletemedia', [res[idname] for res in content['result']]) if content.has_key('error'): module.exit_json(changed=False, results=content['error'], state="absent") module.exit_json(changed=True, results=content['result'], state="absent") + # Create and Update if state == 'present': active = get_active(module.params['active']) mtypeid = get_mediatype(zapi, module.params['mediatype'], module.params['mediatype_desc']) @@ -197,13 +221,21 @@ def main(): 'severity': int(get_severity(module.params['severity'])), 'period': module.params['period'], }] + else: + medias = preprocess_medias(zapi, medias) params = {'users': [zbx_user_query], 'medias': medias, 'output': 'extend', } + ######## + # Create + ######## if not exists(content): + if not params['medias']: + module.exit_json(changed=False, results=content['result'], state='present') + # if we didn't find it, create it content = zapi.get_content(zbx_class_name, 'addmedia', params) @@ -216,6 +248,9 @@ def main(): # If user params exists, check to see if they already exist in zabbix # if they exist, then return as no update # elif they do not exist, then take user params only + ######## + # Update + ######## diff = {'medias': [], 'users': {}} _ = [diff['medias'].append(media) for media in params['medias'] if not find_media(content['result'], media)] @@ -225,6 +260,9 @@ def main(): for user in params['users']: diff['users']['userid'] = user['userid'] + # Medias have no real unique key so therefore we need to make it like the incoming user's request + diff['medias'] = medias + # We have differences and need to update content = zapi.get_content(zbx_class_name, 'updatemedia', diff) diff --git a/roles/lib_zabbix/tasks/create_template.yml b/roles/lib_zabbix/tasks/create_template.yml index 022ca52f2..41381e76c 100644 --- a/roles/lib_zabbix/tasks/create_template.yml +++ b/roles/lib_zabbix/tasks/create_template.yml @@ -1,6 +1,4 @@ --- -- debug: var=template - - name: Template Create Template zbx_template: zbx_server: "{{ server }}" @@ -9,12 +7,10 @@ name: "{{ template.name }}" register: created_template -- debug: var=created_template - set_fact: - lzbx_applications: "{{ template.zitems | oo_select_keys_from_list(['applications']) | oo_flatten | unique }}" - -- debug: var=lzbx_applications + lzbx_item_applications: "{{ template.zitems | default([], True) | oo_select_keys_from_list(['applications']) | oo_flatten | unique }}" + lzbx_itemprototype_applications: "{{ template.zitemprototypes | default([], True) | oo_select_keys_from_list(['applications']) | oo_flatten | unique }}" - name: Create Application zbx_application: @@ -23,11 +19,11 @@ zbx_password: "{{ password }}" name: "{{ item }}" template_name: "{{ template.name }}" - with_items: lzbx_applications + with_items: + - "{{ lzbx_item_applications }}" + - "{{ lzbx_itemprototype_applications }}" register: created_application - when: template.zitems is defined - -- debug: var=created_application + when: template.zitems is defined or template.zitemprototypes is defined - name: Create Items zbx_item: @@ -37,25 +33,66 @@ key: "{{ item.key }}" name: "{{ item.name | default(item.key, true) }}" value_type: "{{ item.value_type | default('int') }}" + description: "{{ item.description | default('', True) }}" + multiplier: "{{ item.multiplier | default('', True) }}" + units: "{{ item.units | default('', True) }}" template_name: "{{ template.name }}" applications: "{{ item.applications }}" with_items: template.zitems register: created_items when: template.zitems is defined -#- debug: var=ctp_created_items - - name: Create Triggers zbx_trigger: zbx_server: "{{ server }}" zbx_user: "{{ user }}" zbx_password: "{{ password }}" - description: "{{ item.description }}" + name: "{{ item.name }}" + description: "{{ item.description | default('', True) }}" + dependencies: "{{ item.dependencies | default([], True) }}" expression: "{{ item.expression }}" priority: "{{ item.priority }}" + url: "{{ item.url | default(None, True) }}" with_items: template.ztriggers when: template.ztriggers is defined -#- debug: var=ctp_created_triggers +- name: Create Discoveryrules + zbx_discoveryrule: + zbx_server: "{{ server }}" + zbx_user: "{{ user }}" + zbx_password: "{{ password }}" + name: "{{ item.name }}" + key: "{{ item.key }}" + lifetime: "{{ item.lifetime }}" + template_name: "{{ template.name }}" + description: "{{ item.description | default('', True) }}" + with_items: template.zdiscoveryrules + when: template.zdiscoveryrules is defined +- name: Create Item Prototypes + zbx_itemprototype: + zbx_server: "{{ server }}" + zbx_user: "{{ user }}" + zbx_password: "{{ password }}" + name: "{{ item.name }}" + key: "{{ item.key }}" + discoveryrule_key: "{{ item.discoveryrule_key }}" + value_type: "{{ item.value_type }}" + template_name: "{{ template.name }}" + applications: "{{ item.applications }}" + description: "{{ item.description | default('', True) }}" + with_items: template.zitemprototypes + when: template.zitemprototypes is defined +- name: Create Trigger Prototypes + zbx_triggerprototype: + zbx_server: "{{ server }}" + zbx_user: "{{ user }}" + zbx_password: "{{ password }}" + name: "{{ item.name }}" + expression: "{{ item.expression }}" + url: "{{ item.url | default('', True) }}" + priority: "{{ item.priority | default('average', True) }}" + description: "{{ item.description | default('', True) }}" + with_items: template.ztriggerprototypes + when: template.ztriggerprototypes is defined diff --git a/roles/openshift_common/tasks/main.yml b/roles/openshift_common/tasks/main.yml index d9f2dafab..73bd28630 100644 --- a/roles/openshift_common/tasks/main.yml +++ b/roles/openshift_common/tasks/main.yml @@ -1,5 +1,5 @@ --- -- name: Set common OpenShift facts +- name: Set common Cluster facts openshift_facts: role: common local_facts: diff --git a/roles/openshift_common/vars/main.yml b/roles/openshift_common/vars/main.yml index 8e7d71154..50816d319 100644 --- a/roles/openshift_common/vars/main.yml +++ b/roles/openshift_common/vars/main.yml @@ -5,5 +5,3 @@ # chains with the public zone (or the zone associated with the correct # interfaces) os_firewall_use_firewalld: False - -openshift_data_dir: /var/lib/openshift diff --git a/roles/openshift_examples/defaults/main.yml b/roles/openshift_examples/defaults/main.yml index 3246790aa..7d4f100e3 100644 --- a/roles/openshift_examples/defaults/main.yml +++ b/roles/openshift_examples/defaults/main.yml @@ -14,3 +14,5 @@ db_templates_base: "{{ examples_base }}/db-templates" xpaas_image_streams: "{{ examples_base }}/xpaas-streams/jboss-image-streams.json" xpaas_templates_base: "{{ examples_base }}/xpaas-templates" quickstarts_base: "{{ examples_base }}/quickstart-templates" + +openshift_examples_import_command: "create" diff --git a/roles/openshift_examples/files/examples/image-streams/image-streams-centos7.json b/roles/openshift_examples/files/examples/image-streams/image-streams-centos7.json index 03affbddf..f213d99ca 100644 --- a/roles/openshift_examples/files/examples/image-streams/image-streams-centos7.json +++ b/roles/openshift_examples/files/examples/image-streams/image-streams-centos7.json @@ -161,19 +161,19 @@ "creationTimestamp": null }, "spec": { - "dockerImageRepository": "openshift/wildfly-8-centos", + "dockerImageRepository": "openshift/wildfly-81-centos7", "tags": [ { "name": "latest" }, { - "name": "8", + "name": "8.1", "annotations": { - "description": "Build and run Java applications on Wildfly 8", + "description": "Build and run Java applications on Wildfly 8.1", "iconClass": "icon-wildfly", "tags": "builder,wildfly,java", - "supports":"wildfly:8,jee,java", - "version": "8" + "supports":"wildfly:8.1,jee,java", + "version": "8.1" }, "from": { "Kind": "ImageStreamTag", @@ -260,13 +260,13 @@ "creationTimestamp": null }, "spec": { - "dockerImageRepository": "openshift/jenkins-16-centos7", + "dockerImageRepository": "openshift/jenkins-1-centos7", "tags": [ { "name": "latest" }, { - "name": "1.6", + "name": "1", "from": { "Kind": "ImageStreamTag", "Name": "latest" diff --git a/roles/openshift_examples/files/examples/image-streams/image-streams-rhel7.json b/roles/openshift_examples/files/examples/image-streams/image-streams-rhel7.json index 0bd885af3..8c125f76a 100644 --- a/roles/openshift_examples/files/examples/image-streams/image-streams-rhel7.json +++ b/roles/openshift_examples/files/examples/image-streams/image-streams-rhel7.json @@ -230,13 +230,13 @@ "creationTimestamp": null }, "spec": { - "dockerImageRepository": "registry.access.redhat.com/openshift3/jenkins-16-rhel7", + "dockerImageRepository": "registry.access.redhat.com/openshift3/jenkins-1-rhel7", "tags": [ { "name": "latest" }, { - "name": "1.6", + "name": "1", "from": { "Kind": "ImageStreamTag", "Name": "latest" diff --git a/roles/openshift_examples/files/examples/quickstart-templates/jenkins-ephemeral-template.json b/roles/openshift_examples/files/examples/quickstart-templates/jenkins-ephemeral-template.json index da08ffbd5..14bd032af 100644 --- a/roles/openshift_examples/files/examples/quickstart-templates/jenkins-ephemeral-template.json +++ b/roles/openshift_examples/files/examples/quickstart-templates/jenkins-ephemeral-template.json @@ -88,7 +88,7 @@ "containers": [ { "name": "jenkins", - "image": "openshift/jenkins-16-centos7", + "image": "${JENKINS_IMAGE}", "env": [ { "name": "JENKINS_PASSWORD", @@ -133,6 +133,11 @@ "value": "jenkins" }, { + "name": "JENKINS_IMAGE", + "description": "Jenkins Docker image to use", + "value": "openshift/jenkins-1-centos7" + }, + { "name": "JENKINS_PASSWORD", "description": "Password for the Jenkins user", "generate": "expression", diff --git a/roles/openshift_examples/files/examples/quickstart-templates/jenkins-persistent-template.json b/roles/openshift_examples/files/examples/quickstart-templates/jenkins-persistent-template.json index 33df68c74..fa31de486 100644 --- a/roles/openshift_examples/files/examples/quickstart-templates/jenkins-persistent-template.json +++ b/roles/openshift_examples/files/examples/quickstart-templates/jenkins-persistent-template.json @@ -105,7 +105,7 @@ "containers": [ { "name": "jenkins", - "image": "openshift/jenkins-16-centos7", + "image": "${JENKINS_IMAGE}", "env": [ { "name": "JENKINS_PASSWORD", @@ -156,6 +156,11 @@ "value": "password" }, { + "name": "JENKINS_IMAGE", + "description": "Jenkins Docker image to use", + "value": "openshift/jenkins-1-centos7" + }, + { "name": "VOLUME_CAPACITY", "description": "Volume space available for data, e.g. 512Mi, 2Gi", "value": "512Mi", diff --git a/roles/openshift_examples/files/examples/xpaas-templates/eap6-https-sti.json b/roles/openshift_examples/files/examples/xpaas-templates/eap6-https-sti.json index 0497e6824..5df36ccc2 100644 --- a/roles/openshift_examples/files/examples/xpaas-templates/eap6-https-sti.json +++ b/roles/openshift_examples/files/examples/xpaas-templates/eap6-https-sti.json @@ -6,10 +6,10 @@ "iconClass" : "icon-jboss", "description": "Application template for EAP 6 applications built using STI." }, - "name": "eap6-basic-sti" + "name": "eap6-https-sti" }, "labels": { - "template": "eap6-basic-sti" + "template": "eap6-https-sti" }, "parameters": [ { diff --git a/roles/openshift_examples/tasks/main.yml b/roles/openshift_examples/tasks/main.yml index bfc6dfb0a..3a829a4c6 100644 --- a/roles/openshift_examples/tasks/main.yml +++ b/roles/openshift_examples/tasks/main.yml @@ -7,7 +7,7 @@ # RHEL and Centos image streams are mutually exclusive - name: Import RHEL streams command: > - {{ openshift.common.client_binary }} create -n openshift -f {{ rhel_image_streams }} + {{ openshift.common.client_binary }} {{ openshift_examples_import_command }} -n openshift -f {{ rhel_image_streams }} when: openshift_examples_load_rhel register: oex_import_rhel_streams failed_when: "'already exists' not in oex_import_rhel_streams.stderr and oex_import_rhel_streams.rc != 0" @@ -15,7 +15,7 @@ - name: Import Centos Image streams command: > - {{ openshift.common.client_binary }} create -n openshift -f {{ centos_image_streams }} + {{ openshift.common.client_binary }} {{ openshift_examples_import_command }} -n openshift -f {{ centos_image_streams }} when: openshift_examples_load_centos | bool register: oex_import_centos_streams failed_when: "'already exists' not in oex_import_centos_streams.stderr and oex_import_centos_streams.rc != 0" @@ -23,7 +23,7 @@ - name: Import db templates command: > - {{ openshift.common.client_binary }} create -n openshift -f {{ db_templates_base }} + {{ openshift.common.client_binary }} {{ openshift_examples_import_command }} -n openshift -f {{ db_templates_base }} when: openshift_examples_load_db_templates | bool register: oex_import_db_templates failed_when: "'already exists' not in oex_import_db_templates.stderr and oex_import_db_templates.rc != 0" @@ -31,7 +31,7 @@ - name: Import quickstart-templates command: > - {{ openshift.common.client_binary }} create -n openshift -f {{ quickstarts_base }} + {{ openshift.common.client_binary }} {{ openshift_examples_import_command }} -n openshift -f {{ quickstarts_base }} when: openshift_examples_load_quickstarts register: oex_import_quickstarts failed_when: "'already exists' not in oex_import_quickstarts.stderr and oex_import_quickstarts.rc != 0" @@ -40,7 +40,7 @@ - name: Import xPaas image streams command: > - {{ openshift.common.client_binary }} create -n openshift -f {{ xpaas_image_streams }} + {{ openshift.common.client_binary }} {{ openshift_examples_import_command }} -n openshift -f {{ xpaas_image_streams }} when: openshift_examples_load_xpaas | bool register: oex_import_xpaas_streams failed_when: "'already exists' not in oex_import_xpaas_streams.stderr and oex_import_xpaas_streams.rc != 0" @@ -48,7 +48,7 @@ - name: Import xPaas templates command: > - {{ openshift.common.client_binary }} create -n openshift -f {{ xpaas_templates_base }} + {{ openshift.common.client_binary }} {{ openshift_examples_import_command }} -n openshift -f {{ xpaas_templates_base }} when: openshift_examples_load_xpaas | bool register: oex_import_xpaas_templates failed_when: "'already exists' not in oex_import_xpaas_templates.stderr and oex_import_xpaas_templates.rc != 0" diff --git a/roles/openshift_facts/library/openshift_facts.py b/roles/openshift_facts/library/openshift_facts.py index c13e96c2b..69bb49c9b 100755 --- a/roles/openshift_facts/library/openshift_facts.py +++ b/roles/openshift_facts/library/openshift_facts.py @@ -6,7 +6,7 @@ DOCUMENTATION = ''' --- module: openshift_facts -short_description: OpenShift Facts +short_description: Cluster Facts author: Jason DeTiberus requirements: [ ] ''' @@ -16,6 +16,7 @@ EXAMPLES = ''' import ConfigParser import copy import os +from distutils.util import strtobool def hostname_valid(hostname): @@ -283,28 +284,6 @@ def normalize_provider_facts(provider, metadata): facts = normalize_openstack_facts(metadata, facts) return facts -def set_registry_url_if_unset(facts): - """ Set registry_url fact if not already present in facts dict - - Args: - facts (dict): existing facts - Returns: - dict: the facts dict updated with the generated identity providers - facts if they were not already present - """ - for role in ('master', 'node'): - if role in facts: - deployment_type = facts['common']['deployment_type'] - if 'registry_url' not in facts[role]: - registry_url = "openshift/origin-${component}:${version}" - if deployment_type == 'enterprise': - registry_url = "openshift3/ose-${component}:${version}" - elif deployment_type == 'online': - registry_url = ("openshift3/ose-${component}:${version}") - facts[role]['registry_url'] = registry_url - - return facts - def set_fluentd_facts_if_unset(facts): """ Set fluentd facts if not already present in facts dict dict: the facts dict updated with the generated fluentd facts if @@ -317,12 +296,28 @@ def set_fluentd_facts_if_unset(facts): """ if 'common' in facts: - deployment_type = facts['common']['deployment_type'] if 'use_fluentd' not in facts['common']: - use_fluentd = True if deployment_type == 'online' else False + use_fluentd = False facts['common']['use_fluentd'] = use_fluentd return facts +def set_node_schedulability(facts): + """ Set schedulable facts if not already present in facts dict + Args: + facts (dict): existing facts + Returns: + dict: the facts dict updated with the generated schedulable + facts if they were not already present + + """ + if 'node' in facts: + if 'schedulable' not in facts['node']: + if 'master' in facts: + facts['node']['schedulable'] = False + else: + facts['node']['schedulable'] = True + return facts + def set_metrics_facts_if_unset(facts): """ Set cluster metrics facts if not already present in facts dict dict: the facts dict updated with the generated cluster metrics facts if @@ -447,6 +442,54 @@ def set_aggregate_facts(facts): return facts +def set_deployment_facts_if_unset(facts): + """ Set Facts that vary based on deployment_type. This currently + includes common.service_type, common.config_base, master.registry_url, + node.registry_url + + Args: + facts (dict): existing facts + Returns: + dict: the facts dict updated with the generated deployment_type + facts + """ + # Perhaps re-factor this as a map? + # pylint: disable=too-many-branches + if 'common' in facts: + deployment_type = facts['common']['deployment_type'] + if 'service_type' not in facts['common']: + service_type = 'atomic-openshift' + if deployment_type == 'origin': + service_type = 'origin' + elif deployment_type in ['enterprise', 'online']: + service_type = 'openshift' + facts['common']['service_type'] = service_type + if 'config_base' not in facts['common']: + config_base = '/etc/origin' + if deployment_type in ['enterprise', 'online']: + config_base = '/etc/openshift' + facts['common']['config_base'] = config_base + if 'data_dir' not in facts['common']: + data_dir = '/var/lib/origin' + if deployment_type in ['enterprise', 'online']: + data_dir = '/var/lib/openshift' + facts['common']['data_dir'] = data_dir + facts['common']['version'] = get_openshift_version() + + for role in ('master', 'node'): + if role in facts: + deployment_type = facts['common']['deployment_type'] + if 'registry_url' not in facts[role]: + registry_url = 'openshift/origin-${component}:${version}' + if deployment_type in ['enterprise', 'online', 'openshift-enterprise']: + registry_url = 'openshift3/ose-${component}:${version}' + elif deployment_type == 'atomic-enterprise': + registry_url = 'aep3/aep-${component}:${version}' + facts[role]['registry_url'] = registry_url + + return facts + + def set_sdn_facts_if_unset(facts): """ Set sdn facts if not already present in facts dict @@ -457,8 +500,10 @@ def set_sdn_facts_if_unset(facts): were not already present """ if 'common' in facts: + use_sdn = facts['common']['use_openshift_sdn'] + if not (use_sdn == '' or isinstance(use_sdn, bool)): + facts['common']['use_openshift_sdn'] = bool(strtobool(str(use_sdn))) if 'sdn_network_plugin_name' not in facts['common']: - use_sdn = facts['common']['use_openshift_sdn'] plugin = 'redhat/openshift-ovs-subnet' if use_sdn else '' facts['common']['sdn_network_plugin_name'] = plugin @@ -468,6 +513,10 @@ def set_sdn_facts_if_unset(facts): if 'sdn_host_subnet_length' not in facts['master']: facts['master']['sdn_host_subnet_length'] = '8' + if 'node' in facts: + if 'sdn_mtu' not in facts['node']: + facts['node']['sdn_mtu'] = '1450' + return facts def format_url(use_ssl, hostname, port, path=''): @@ -509,7 +558,7 @@ def get_current_config(facts): # anything from working properly as far as I can tell, perhaps because # we override the kubeconfig path everywhere we use it? # Query kubeconfig settings - kubeconfig_dir = '/var/lib/openshift/openshift.local.certificates' + kubeconfig_dir = '/var/lib/origin/openshift.local.certificates' if role == 'node': kubeconfig_dir = os.path.join( kubeconfig_dir, "node-%s" % facts['common']['hostname'] @@ -550,6 +599,21 @@ def get_current_config(facts): return current_config +def get_openshift_version(): + """ Get current version of openshift on the host + + Returns: + version: the current openshift version + """ + version = '' + + if os.path.isfile('/usr/bin/openshift'): + _, output, _ = module.run_command(['/usr/bin/openshift', 'version']) + versions = dict(e.split(' v') for e in output.splitlines()) + version = versions.get('openshift', '') + + #TODO: acknowledge the possility of a containerized install + return version def apply_provider_facts(facts, provider_facts): """ Apply provider facts to supplied facts dict @@ -595,7 +659,7 @@ def merge_facts(orig, new): facts = dict() for key, value in orig.iteritems(): if key in new: - if isinstance(value, dict): + if isinstance(value, dict) and isinstance(new[key], dict): facts[key] = merge_facts(value, new[key]) else: facts[key] = copy.copy(new[key]) @@ -656,25 +720,25 @@ def get_local_facts_from_file(filename): class OpenShiftFactsUnsupportedRoleError(Exception): - """OpenShift Facts Unsupported Role Error""" + """Origin Facts Unsupported Role Error""" pass class OpenShiftFactsFileWriteError(Exception): - """OpenShift Facts File Write Error""" + """Origin Facts File Write Error""" pass class OpenShiftFactsMetadataUnavailableError(Exception): - """OpenShift Facts Metadata Unavailable Error""" + """Origin Facts Metadata Unavailable Error""" pass class OpenShiftFacts(object): - """ OpenShift Facts + """ Origin Facts Attributes: - facts (dict): OpenShift facts for the host + facts (dict): facts for the host Args: role (str): role for setting local facts @@ -717,10 +781,11 @@ class OpenShiftFacts(object): facts['current_config'] = get_current_config(facts) facts = set_url_facts_if_unset(facts) facts = set_fluentd_facts_if_unset(facts) + facts = set_node_schedulability(facts) facts = set_metrics_facts_if_unset(facts) facts = set_identity_providers_if_unset(facts) - facts = set_registry_url_if_unset(facts) facts = set_sdn_facts_if_unset(facts) + facts = set_deployment_facts_if_unset(facts) facts = set_aggregate_facts(facts) return dict(openshift=facts) diff --git a/roles/openshift_facts/tasks/main.yml b/roles/openshift_facts/tasks/main.yml index b2cda3a85..6301d4fc0 100644 --- a/roles/openshift_facts/tasks/main.yml +++ b/roles/openshift_facts/tasks/main.yml @@ -1,10 +1,10 @@ --- -- name: Verify Ansible version is greater than 1.8.0 and not 1.9.0 +- name: Verify Ansible version is greater than 1.8.0 and not 1.9.0 and not 1.9.0.1 assert: that: - ansible_version | version_compare('1.8.0', 'ge') - ansible_version | version_compare('1.9.0', 'ne') - ansible_version | version_compare('1.9.0.1', 'ne') -- name: Gather OpenShift facts +- name: Gather Cluster facts openshift_facts: diff --git a/roles/openshift_manage_node/tasks/main.yml b/roles/openshift_manage_node/tasks/main.yml index 74e702248..637e494ea 100644 --- a/roles/openshift_manage_node/tasks/main.yml +++ b/roles/openshift_manage_node/tasks/main.yml @@ -1,25 +1,21 @@ - name: Wait for Node Registration command: > - {{ openshift.common.client_binary }} get node {{ item }} + {{ openshift.common.client_binary }} get node {{ item | lower }} register: omd_get_node until: omd_get_node.rc == 0 - retries: 10 + retries: 20 delay: 5 with_items: openshift_nodes -- name: Handle unscheduleable node +- name: Set node schedulability command: > - {{ openshift.common.admin_binary }} manage-node {{ item }} --schedulable=false - with_items: openshift_unscheduleable_nodes - -- name: Handle scheduleable node - command: > - {{ openshift.common.admin_binary }} manage-node {{ item }} --schedulable=true - with_items: openshift_scheduleable_nodes + {{ openshift.common.admin_binary }} manage-node {{ item.openshift.common.hostname | lower }} --schedulable={{ 'true' if item.openshift.node.schedulable | bool else 'false' }} + with_items: + - "{{ openshift_node_vars }}" - name: Label nodes command: > - {{ openshift.common.client_binary }} label --overwrite node {{ item.openshift.common.hostname }} {{ item.openshift.node.labels | oo_combine_dict }} + {{ openshift.common.client_binary }} label --overwrite node {{ item.openshift.common.hostname | lower }} {{ item.openshift.node.labels | oo_combine_dict }} with_items: - "{{ openshift_node_vars }}" when: "'labels' in item.openshift.node and item.openshift.node.labels != {}" diff --git a/roles/openshift_master/README.md b/roles/openshift_master/README.md index 0e7ef3aab..155bdb58b 100644 --- a/roles/openshift_master/README.md +++ b/roles/openshift_master/README.md @@ -1,7 +1,7 @@ -OpenShift Master -================ +OpenShift/Atomic Enterprise Master +================================== -OpenShift Master service installation +Master service installation Requirements ------------ @@ -15,8 +15,8 @@ Role Variables From this role: | Name | Default value | | |-------------------------------------|-----------------------|--------------------------------------------------| -| openshift_master_debug_level | openshift_debug_level | Verbosity of the debug logs for openshift-master | -| openshift_node_ips | [] | List of the openshift node ip addresses to pre-register when openshift-master starts up | +| openshift_master_debug_level | openshift_debug_level | Verbosity of the debug logs for master | +| openshift_node_ips | [] | List of the openshift node ip addresses to pre-register when master starts up | | oreg_url | UNDEF | Default docker registry to use | | openshift_master_api_port | UNDEF | | | openshift_master_console_port | UNDEF | | diff --git a/roles/openshift_master/defaults/main.yml b/roles/openshift_master/defaults/main.yml index ca8860099..9766d01ae 100644 --- a/roles/openshift_master/defaults/main.yml +++ b/roles/openshift_master/defaults/main.yml @@ -5,11 +5,11 @@ openshift_node_ips: [] os_firewall_allow: - service: etcd embedded port: 4001/tcp -- service: OpenShift api https +- service: api server https port: 8443/tcp -- service: OpenShift dns tcp +- service: dns tcp port: 53/tcp -- service: OpenShift dns udp +- service: dns udp port: 53/udp - service: Fluentd td-agent tcp port: 24224/tcp @@ -22,9 +22,9 @@ os_firewall_allow: - service: Corosync UDP port: 5405/udp os_firewall_deny: -- service: OpenShift api http +- service: api server http port: 8080/tcp -- service: former OpenShift web console port +- service: former web console port port: 8444/tcp - service: former etcd peer port port: 7001/tcp diff --git a/roles/openshift_master/handlers/main.yml b/roles/openshift_master/handlers/main.yml index f1e7e1ab3..2981979e0 100644 --- a/roles/openshift_master/handlers/main.yml +++ b/roles/openshift_master/handlers/main.yml @@ -1,4 +1,4 @@ --- -- name: restart openshift-master - service: name=openshift-master state=restarted +- name: restart master + service: name={{ openshift.common.service_type }}-master state=restarted when: not openshift_master_ha | bool diff --git a/roles/openshift_master/meta/main.yml b/roles/openshift_master/meta/main.yml index 41a183c3b..c125cb5d0 100644 --- a/roles/openshift_master/meta/main.yml +++ b/roles/openshift_master/meta/main.yml @@ -1,7 +1,7 @@ --- galaxy_info: author: Jhon Honce - description: OpenShift Master + description: Master company: Red Hat, Inc. license: Apache License, Version 2.0 min_ansible_version: 1.7 diff --git a/roles/openshift_master/tasks/main.yml b/roles/openshift_master/tasks/main.yml index 9204d25ce..73c04cb08 100644 --- a/roles/openshift_master/tasks/main.yml +++ b/roles/openshift_master/tasks/main.yml @@ -12,11 +12,7 @@ msg: "openshift_master_cluster_password must be set for multi-master installations" when: openshift_master_ha | bool and not openshift.master.cluster_defer_ha | bool and openshift_master_cluster_password is not defined -- name: Install OpenShift Master package - yum: pkg=openshift-master state=present - register: install_result - -- name: Set master OpenShift facts +- name: Set master facts openshift_facts: role: master local_facts: @@ -59,8 +55,26 @@ api_server_args: "{{ osm_api_server_args | default(None) }}" controller_args: "{{ osm_controller_args | default(None) }}" +- name: Install Master package + yum: pkg={{ openshift.common.service_type }}-master{{ openshift_version }} state=present + register: install_result + +- name: Check for RPM generated config marker file /etc/origin/.config_managed + stat: path=/etc/origin/.rpmgenerated + register: rpmgenerated_config + +- name: Remove RPM generated config files + file: + path: "{{ item }}" + state: absent + when: openshift.common.service_type in ['atomic-enterprise','openshift-enterprise'] and rpmgenerated_config.stat.exists == true + with_items: + - "{{ openshift.common.config_base }}/master" + - "{{ openshift.common.config_base }}/node" + - "{{ openshift.common.config_base }}/.rpmgenerated" + # TODO: These values need to be configurable -- name: Set dns OpenShift facts +- name: Set dns facts openshift_facts: role: dns local_facts: @@ -80,20 +94,28 @@ args: creates: "{{ openshift_master_policy }}" notify: - - restart openshift-master + - restart master - name: Create the scheduler config template: dest: "{{ openshift_master_scheduler_conf }}" src: scheduler.json.j2 + backup: true notify: - - restart openshift-master + - restart master - name: Install httpd-tools if needed yum: pkg=httpd-tools state=present when: item.kind == 'HTPasswdPasswordIdentityProvider' with_items: openshift.master.identity_providers +- name: Ensure htpasswd directory exists + file: + path: "{{ item.filename | dirname }}" + state: directory + when: item.kind == 'HTPasswdPasswordIdentityProvider' + with_items: openshift.master.identity_providers + - name: Create the htpasswd file if needed copy: dest: "{{ item.filename }}" @@ -108,12 +130,13 @@ template: dest: "{{ openshift_master_config_file }}" src: master.yaml.v1.j2 + backup: true notify: - - restart openshift-master + - restart master -- name: Configure OpenShift settings +- name: Configure master settings lineinfile: - dest: /etc/sysconfig/openshift-master + dest: /etc/sysconfig/{{ openshift.common.service_type }}-master regexp: "{{ item.regex }}" line: "{{ item.line }}" with_items: @@ -122,10 +145,10 @@ - regex: '^CONFIG_FILE=' line: "CONFIG_FILE={{ openshift_master_config_file }}" notify: - - restart openshift-master + - restart master -- name: Start and enable openshift-master - service: name=openshift-master enabled=yes state=started +- name: Start and enable master + service: name={{ openshift.common.service_type }}-master enabled=yes state=started when: not openshift_master_ha | bool register: start_result @@ -146,20 +169,24 @@ shell: echo {{ openshift_master_cluster_password | quote }} | passwd --stdin hacluster when: install_result | changed -- name: Create the OpenShift client config dir(s) +- name: Lookup default group for ansible_ssh_user + command: "/usr/bin/id -g {{ ansible_ssh_user }}" + register: _ansible_ssh_user_gid + +- name: Create the client config dir(s) file: path: "~{{ item }}/.kube" state: directory mode: 0700 owner: "{{ item }}" - group: "{{ item }}" + group: "{{ 'root' if item == 'root' else _ansible_ssh_user_gid.stdout }}" with_items: - root - "{{ ansible_ssh_user }}" # TODO: Update this file if the contents of the source file are not present in # the dest file, will need to make sure to ignore things that could be added -- name: Copy the OpenShift admin client config(s) +- name: Copy the admin client config(s) command: cp {{ openshift_master_config_dir }}/admin.kubeconfig ~{{ item }}/.kube/config args: creates: ~{{ item }}/.kube/config @@ -167,13 +194,13 @@ - root - "{{ ansible_ssh_user }}" -- name: Update the permissions on the OpenShift admin client config(s) +- name: Update the permissions on the admin client config(s) file: path: "~{{ item }}/.kube/config" state: file mode: 0700 owner: "{{ item }}" - group: "{{ item }}" + group: "{{ 'root' if item == 'root' else _ansible_ssh_user_gid.stdout }}" with_items: - root - "{{ ansible_ssh_user }}" diff --git a/roles/openshift_master/templates/master.yaml.v1.j2 b/roles/openshift_master/templates/master.yaml.v1.j2 index fff123d0d..cc1dee13d 100644 --- a/roles/openshift_master/templates/master.yaml.v1.j2 +++ b/roles/openshift_master/templates/master.yaml.v1.j2 @@ -46,7 +46,7 @@ etcdConfig: certFile: etcd.server.crt clientCA: ca.crt keyFile: etcd.server.key - storageDirectory: {{ openshift_data_dir }}/openshift.local.etcd + storageDirectory: {{ openshift.common.data_dir }}/openshift.local.etcd {% endif %} etcdStorageConfig: kubernetesStoragePrefix: kubernetes.io @@ -87,7 +87,11 @@ masterPublicURL: {{ openshift.master.public_api_url }} networkConfig: clusterNetworkCIDR: {{ openshift.master.sdn_cluster_network_cidr }} hostSubnetLength: {{ openshift.master.sdn_host_subnet_length }} + {% if openshift.common.use_openshift_sdn %} networkPluginName: {{ openshift.common.sdn_network_plugin_name }} + {% endif %} +# serviceNetworkCIDR must match kubernetesMasterConfig.servicesSubnet + serviceNetworkCIDR: {{ openshift.master.portal_net }} {% include 'v1_partials/oauthConfig.j2' %} policyConfig: bootstrapPolicyFile: {{ openshift_master_policy }} diff --git a/roles/openshift_master/templates/scheduler.json.j2 b/roles/openshift_master/templates/scheduler.json.j2 index 835f2383e..cb5f43bb2 100644 --- a/roles/openshift_master/templates/scheduler.json.j2 +++ b/roles/openshift_master/templates/scheduler.json.j2 @@ -1,4 +1,6 @@ { + "kind": "Policy", + "apiVersion": "v1", "predicates": [ {"name": "MatchNodeSelector"}, {"name": "PodFitsResources"}, diff --git a/roles/openshift_master/templates/v1_partials/oauthConfig.j2 b/roles/openshift_master/templates/v1_partials/oauthConfig.j2 index 72889bc29..8a4f5a746 100644 --- a/roles/openshift_master/templates/v1_partials/oauthConfig.j2 +++ b/roles/openshift_master/templates/v1_partials/oauthConfig.j2 @@ -80,6 +80,7 @@ oauthConfig: provider: {{ identity_provider_config(identity_provider) }} {%- endfor %} + masterCA: ca.crt masterPublicURL: {{ openshift.master.public_api_url }} masterURL: {{ openshift.master.api_url }} sessionConfig: diff --git a/roles/openshift_master/vars/main.yml b/roles/openshift_master/vars/main.yml index f6f69966a..ecdb4f883 100644 --- a/roles/openshift_master/vars/main.yml +++ b/roles/openshift_master/vars/main.yml @@ -1,8 +1,9 @@ --- -openshift_master_config_dir: /etc/openshift/master +openshift_master_config_dir: "{{ openshift.common.config_base }}/master" openshift_master_config_file: "{{ openshift_master_config_dir }}/master-config.yaml" openshift_master_scheduler_conf: "{{ openshift_master_config_dir }}/scheduler.json" openshift_master_policy: "{{ openshift_master_config_dir }}/policy.json" +openshift_version: "{{ openshift_pkg_version | default('') }}" openshift_master_valid_grant_methods: - auto diff --git a/roles/openshift_master_ca/tasks/main.yml b/roles/openshift_master_ca/tasks/main.yml index 03eb7e15f..5c9639ea5 100644 --- a/roles/openshift_master_ca/tasks/main.yml +++ b/roles/openshift_master_ca/tasks/main.yml @@ -1,6 +1,6 @@ --- -- name: Install the OpenShift package for admin tooling - yum: pkg=openshift state=present +- name: Install the base package for admin tooling + yum: pkg={{ openshift.common.service_type }}{{ openshift_version }} state=present register: install_result - name: Reload generated facts diff --git a/roles/openshift_master_ca/vars/main.yml b/roles/openshift_master_ca/vars/main.yml index 2925680bb..b35339b18 100644 --- a/roles/openshift_master_ca/vars/main.yml +++ b/roles/openshift_master_ca/vars/main.yml @@ -1,5 +1,6 @@ --- -openshift_master_config_dir: /etc/openshift/master +openshift_master_config_dir: "{{ openshift.common.config_base }}/master" openshift_master_ca_cert: "{{ openshift_master_config_dir }}/ca.crt" openshift_master_ca_key: "{{ openshift_master_config_dir }}/ca.key" openshift_master_ca_serial: "{{ openshift_master_config_dir }}/ca.serial.txt" +openshift_version: "{{ openshift_pkg_version | default('') }}" diff --git a/roles/openshift_master_certificates/tasks/main.yml b/roles/openshift_master_certificates/tasks/main.yml index bb23cf4f4..0d75a9eb3 100644 --- a/roles/openshift_master_certificates/tasks/main.yml +++ b/roles/openshift_master_certificates/tasks/main.yml @@ -44,5 +44,3 @@ args: creates: "{{ openshift_generated_configs_dir }}/{{ item.master_cert_subdir }}/master.server.crt" with_items: masters_needing_certs - - diff --git a/roles/openshift_master_certificates/vars/main.yml b/roles/openshift_master_certificates/vars/main.yml index 6214f7918..3f18ddc79 100644 --- a/roles/openshift_master_certificates/vars/main.yml +++ b/roles/openshift_master_certificates/vars/main.yml @@ -1,3 +1,3 @@ --- -openshift_generated_configs_dir: /etc/openshift/generated-configs -openshift_master_config_dir: /etc/openshift/master +openshift_generated_configs_dir: "{{ openshift.common.config_base }}/generated-configs" +openshift_master_config_dir: "{{ openshift.common.config_base }}/master" diff --git a/roles/openshift_master_cluster/tasks/configure.yml b/roles/openshift_master_cluster/tasks/configure.yml index 8ddc8bfda..7ab9afb51 100644 --- a/roles/openshift_master_cluster/tasks/configure.yml +++ b/roles/openshift_master_cluster/tasks/configure.yml @@ -22,14 +22,14 @@ command: pcs resource defaults resource-stickiness=100 - name: Add the cluster VIP resource - command: pcs resource create virtual-ip IPaddr2 ip={{ openshift_master_cluster_vip }} --group openshift-master + command: pcs resource create virtual-ip IPaddr2 ip={{ openshift_master_cluster_vip }} --group {{ openshift.common.service_type }}-master - name: Add the cluster public VIP resource - command: pcs resource create virtual-ip IPaddr2 ip={{ openshift_master_cluster_public_vip }} --group openshift-master + command: pcs resource create virtual-ip IPaddr2 ip={{ openshift_master_cluster_public_vip }} --group {{ openshift.common.service_type }}-master when: openshift_master_cluster_public_vip != openshift_master_cluster_vip -- name: Add the cluster openshift-master service resource - command: pcs resource create master systemd:openshift-master op start timeout=90s stop timeout=90s --group openshift-master +- name: Add the cluster master service resource + command: pcs resource create master systemd:{{ openshift.common.service_type }}-master op start timeout=90s stop timeout=90s --group {{ openshift.common.service_type }}-master - name: Disable stonith command: pcs property set stonith-enabled=false diff --git a/roles/openshift_master_cluster/tasks/configure_deferred.yml b/roles/openshift_master_cluster/tasks/configure_deferred.yml index a80b6c5b4..3b416005b 100644 --- a/roles/openshift_master_cluster/tasks/configure_deferred.yml +++ b/roles/openshift_master_cluster/tasks/configure_deferred.yml @@ -1,8 +1,8 @@ --- - debug: msg="Deferring config" -- name: Start and enable openshift-master +- name: Start and enable the master service: - name: openshift-master + name: "{{ openshift.common.service_type }}-master" state: started enabled: yes diff --git a/roles/openshift_node/README.md b/roles/openshift_node/README.md index 427269931..3aff81274 100644 --- a/roles/openshift_node/README.md +++ b/roles/openshift_node/README.md @@ -1,12 +1,12 @@ -OpenShift Node -============== +OpenShift/Atomic Enterprise Node +================================ -OpenShift Node service installation +Node service installation Requirements ------------ -One or more OpenShift Master servers. +One or more Master servers. A RHEL 7.1 host pre-configured with access to the rhel-7-server-rpms, rhel-7-server-extras-rpms, and rhel-7-server-ose-3.0-rpms repos. @@ -14,10 +14,10 @@ rhel-7-server-extras-rpms, and rhel-7-server-ose-3.0-rpms repos. Role Variables -------------- From this role: -| Name | Default value | | -|------------------------------------------|-----------------------|----------------------------------------| -| openshift_node_debug_level | openshift_debug_level | Verbosity of the debug logs for openshift-node | -| oreg_url | UNDEF (Optional) | Default docker registry to use | +| Name | Default value | | +|------------------------------------------|-----------------------|--------------------------------------------------------| +| openshift_node_debug_level | openshift_debug_level | Verbosity of the debug logs for node | +| oreg_url | UNDEF (Optional) | Default docker registry to use | From openshift_common: | Name | Default Value | | diff --git a/roles/openshift_node/defaults/main.yml b/roles/openshift_node/defaults/main.yml index 1dbcc4301..c4abf9d7c 100644 --- a/roles/openshift_node/defaults/main.yml +++ b/roles/openshift_node/defaults/main.yml @@ -1,6 +1,6 @@ --- os_firewall_allow: -- service: OpenShift kubelet +- service: Kubernetes kubelet port: 10250/tcp - service: http port: 80/tcp diff --git a/roles/openshift_node/handlers/main.yml b/roles/openshift_node/handlers/main.yml index 8b5acefbf..633f3ed13 100644 --- a/roles/openshift_node/handlers/main.yml +++ b/roles/openshift_node/handlers/main.yml @@ -1,6 +1,6 @@ --- -- name: restart openshift-node - service: name=openshift-node state=restarted +- name: restart node + service: name={{ openshift.common.service_type }}-node state=restarted - name: restart docker service: name=docker state=restarted diff --git a/roles/openshift_node/tasks/main.yml b/roles/openshift_node/tasks/main.yml index 7679adbf3..d45dd8073 100644 --- a/roles/openshift_node/tasks/main.yml +++ b/roles/openshift_node/tasks/main.yml @@ -10,16 +10,7 @@ msg: "SELinux is disabled, This deployment type requires that SELinux is enabled." when: (not ansible_selinux or ansible_selinux.status != 'enabled') and deployment_type in ['enterprise', 'online'] -- name: Install OpenShift Node package - yum: pkg=openshift-node state=present - register: node_install_result - -- name: Install openshift-sdn-ovs - yum: pkg=openshift-sdn-ovs state=present - register: sdn_install_result - when: openshift.common.use_openshift_sdn - -- name: Set node OpenShift facts +- name: Set node facts openshift_facts: role: "{{ item.role }}" local_facts: "{{ item.local_facts }}" @@ -31,24 +22,38 @@ deployment_type: "{{ openshift_deployment_type }}" - role: node local_facts: - labels: "{{ openshift_node_labels | default(none) }}" + labels: "{{ lookup('oo_option', 'openshift_node_labels') | default( openshift_node_labels | default(none), true) }}" annotations: "{{ openshift_node_annotations | default(none) }}" registry_url: "{{ oreg_url | default(none) }}" debug_level: "{{ openshift_node_debug_level | default(openshift.common.debug_level) }}" portal_net: "{{ openshift_master_portal_net | default(None) }}" kubelet_args: "{{ openshift_node_kubelet_args | default(None) }}" + sdn_mtu: "{{ openshift_node_sdn_mtu | default(None) }}" + schedulable: "{{ openshift_schedulable | default(openshift_scheduleable) | default(None) }}" + +# We have to add tuned-profiles in the same transaction otherwise we run into depsolving +# problems because the rpms don't pin the version properly. +- name: Install Node package + yum: pkg={{ openshift.common.service_type }}-node{{ openshift_version }},tuned-profiles-{{ openshift.common.service_type }}-node{{ openshift_version }} state=present + register: node_install_result + +- name: Install sdn-ovs package + yum: pkg={{ openshift.common.service_type }}-sdn-ovs{{ openshift_version }} state=present + register: sdn_install_result + when: openshift.common.use_openshift_sdn # TODO: add the validate parameter when there is a validation command to run - name: Create the Node config template: dest: "{{ openshift_node_config_file }}" src: node.yaml.v1.j2 + backup: true notify: - - restart openshift-node + - restart node -- name: Configure OpenShift Node settings +- name: Configure Node settings lineinfile: - dest: /etc/sysconfig/openshift-node + dest: /etc/sysconfig/{{ openshift.common.service_type }}-node regexp: "{{ item.regex }}" line: "{{ item.line }}" with_items: @@ -57,13 +62,13 @@ - regex: '^CONFIG_FILE=' line: "CONFIG_FILE={{ openshift_node_config_file }}" notify: - - restart openshift-node + - restart node - stat: path=/etc/sysconfig/docker register: docker_check # TODO: Enable secure registry when code available in origin -- name: Secure OpenShift Registry +- name: Secure Registry lineinfile: dest: /etc/sysconfig/docker regexp: '^OPTIONS=.*$' @@ -119,8 +124,8 @@ seboolean: name=virt_use_nfs state=yes persistent=yes when: ansible_selinux and ansible_selinux.status == "enabled" -- name: Start and enable openshift-node - service: name=openshift-node enabled=yes state=started +- name: Start and enable node + service: name={{ openshift.common.service_type }}-node enabled=yes state=started register: start_result - name: pause to prevent service restart from interfering with bootstrapping diff --git a/roles/openshift_node/templates/node.yaml.v1.j2 b/roles/openshift_node/templates/node.yaml.v1.j2 index e176e7511..4931d127e 100644 --- a/roles/openshift_node/templates/node.yaml.v1.j2 +++ b/roles/openshift_node/templates/node.yaml.v1.j2 @@ -12,13 +12,22 @@ kind: NodeConfig kubeletArguments: {{ openshift.node.kubelet_args | to_json }} {% endif %} masterKubeConfig: system:node:{{ openshift.common.hostname }}.kubeconfig +{% if openshift.common.use_openshift_sdn %} networkPluginName: {{ openshift.common.sdn_network_plugin_name }} -nodeName: {{ openshift.common.hostname }} +{% endif %} +# networkConfig struct introduced in origin 1.0.6 and OSE 3.0.2 which +# deprecates networkPluginName above. The two should match. +networkConfig: + mtu: {{ openshift.node.sdn_mtu }} +{% if openshift.common.use_openshift_sdn %} + networkPluginName: {{ openshift.common.sdn_network_plugin_name }} +{% endif %} +nodeName: {{ openshift.common.hostname | lower }} podManifestConfig: servingInfo: bindAddress: 0.0.0.0:10250 certFile: server.crt clientCA: ca.crt keyFile: server.key -volumeDirectory: {{ openshift_data_dir }}/openshift.local.volumes -{% include 'partials/kubeletArguments.j2' %}
\ No newline at end of file +volumeDirectory: {{ openshift.common.data_dir }}/openshift.local.volumes +{% include 'partials/kubeletArguments.j2' %} diff --git a/roles/openshift_node/vars/main.yml b/roles/openshift_node/vars/main.yml index cf47f8354..43dc50ca8 100644 --- a/roles/openshift_node/vars/main.yml +++ b/roles/openshift_node/vars/main.yml @@ -1,3 +1,4 @@ --- -openshift_node_config_dir: /etc/openshift/node +openshift_node_config_dir: "{{ openshift.common.config_base }}/node" openshift_node_config_file: "{{ openshift_node_config_dir }}/node-config.yaml" +openshift_version: "{{ openshift_pkg_version | default('') }}" diff --git a/roles/openshift_node_certificates/README.md b/roles/openshift_node_certificates/README.md index c6304e4b0..6264d253a 100644 --- a/roles/openshift_node_certificates/README.md +++ b/roles/openshift_node_certificates/README.md @@ -1,5 +1,5 @@ -OpenShift Node Certificates -======================== +OpenShift/Atomic Enterprise Node Certificates +============================================= TODO diff --git a/roles/openshift_node_certificates/vars/main.yml b/roles/openshift_node_certificates/vars/main.yml index a018bb0f9..61fbb1e51 100644 --- a/roles/openshift_node_certificates/vars/main.yml +++ b/roles/openshift_node_certificates/vars/main.yml @@ -1,7 +1,7 @@ --- -openshift_node_config_dir: /etc/openshift/node -openshift_master_config_dir: /etc/openshift/master -openshift_generated_configs_dir: /etc/openshift/generated-configs +openshift_node_config_dir: "{{ openshift.common.config_base }}/node" +openshift_master_config_dir: "{{ openshift.common.config_base }}/master" +openshift_generated_configs_dir: "{{ openshift.common.config_base }}/generated-configs" openshift_master_ca_cert: "{{ openshift_master_config_dir }}/ca.crt" openshift_master_ca_key: "{{ openshift_master_config_dir }}/ca.key" openshift_master_ca_serial: "{{ openshift_master_config_dir }}/ca.serial.txt" diff --git a/roles/openshift_registry/vars/main.yml b/roles/openshift_registry/vars/main.yml index 9fb501e85..9967e26f4 100644 --- a/roles/openshift_registry/vars/main.yml +++ b/roles/openshift_registry/vars/main.yml @@ -1,3 +1,2 @@ --- -openshift_master_config_dir: /etc/openshift/master - +openshift_master_config_dir: "{{ openshift.common.config_base }}/master" diff --git a/roles/openshift_repos/vars/main.yml b/roles/openshift_repos/vars/main.yml index bbb4c77e7..319611a0b 100644 --- a/roles/openshift_repos/vars/main.yml +++ b/roles/openshift_repos/vars/main.yml @@ -1,2 +1,7 @@ --- -known_openshift_deployment_types: ['origin', 'online', 'enterprise'] +# origin uses community packages named 'origin' +# online currently uses 'openshift' packages +# enterprise is used for OSE 3.0 < 3.1 which uses packages named 'openshift' +# atomic-enterprise uses Red Hat packages named 'atomic-openshift' +# openshift-enterprise uses Red Hat packages named 'atomic-openshift' starting with OSE 3.1 +known_openshift_deployment_types: ['origin', 'online', 'enterprise','atomic-enterprise','openshift-enterprise'] diff --git a/roles/openshift_router/vars/main.yml b/roles/openshift_router/vars/main.yml index 9fb501e85..9967e26f4 100644 --- a/roles/openshift_router/vars/main.yml +++ b/roles/openshift_router/vars/main.yml @@ -1,3 +1,2 @@ --- -openshift_master_config_dir: /etc/openshift/master - +openshift_master_config_dir: "{{ openshift.common.config_base }}/master" diff --git a/roles/openshift_serviceaccounts/tasks/main.yml b/roles/openshift_serviceaccounts/tasks/main.yml new file mode 100644 index 000000000..d93a25a21 --- /dev/null +++ b/roles/openshift_serviceaccounts/tasks/main.yml @@ -0,0 +1,26 @@ +- name: Create service account configs + template: + src: serviceaccount.j2 + dest: "/tmp/{{ item }}-serviceaccount.yaml" + with_items: accounts + +- name: Create {{ item }} service account + command: > + {{ openshift.common.client_binary }} create -f "/tmp/{{ item }}-serviceaccount.yaml" + with_items: accounts + register: _sa_result + failed_when: "'serviceaccounts \"{{ item }}\" already exists' not in _sa_result.stderr and _sa_result.rc != 0" + changed_when: "'serviceaccounts \"{{ item }}\" already exists' not in _sa_result.stderr and _sa_result.rc == 0" + +- name: Get current security context constraints + shell: "{{ openshift.common.client_binary }} get scc privileged -o yaml > /tmp/scc.yaml" + +- name: Add security context constraint for {{ item }} + lineinfile: + dest: /tmp/scc.yaml + line: "- system:serviceaccount:default:{{ item }}" + insertafter: "^users:$" + with_items: accounts + +- name: Apply new scc rules for service accounts + command: "{{ openshift.common.client_binary }} update -f /tmp/scc.yaml" diff --git a/roles/openshift_serviceaccounts/templates/serviceaccount.j2 b/roles/openshift_serviceaccounts/templates/serviceaccount.j2 new file mode 100644 index 000000000..931e249f9 --- /dev/null +++ b/roles/openshift_serviceaccounts/templates/serviceaccount.j2 @@ -0,0 +1,4 @@ +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ item }} diff --git a/roles/openshift_storage_nfs_lvm/tasks/main.yml b/roles/openshift_storage_nfs_lvm/tasks/main.yml index e9f5814bb..ead81b876 100644 --- a/roles/openshift_storage_nfs_lvm/tasks/main.yml +++ b/roles/openshift_storage_nfs_lvm/tasks/main.yml @@ -21,4 +21,4 @@ template: src=../templates/nfs.json.j2 dest=/root/persistent-volume.{{ item }}.json with_sequence: start={{osnl_volume_num_start}} count={{osnl_number_of_volumes}} format={{osnl_volume_prefix}}{{osnl_volume_size}}g%04d -# TODO - Get the json files to an openshift-master, and load them.
\ No newline at end of file +# TODO - Get the json files to a master, and load them. diff --git a/roles/os_zabbix/tasks/main.yml b/roles/os_zabbix/tasks/main.yml index 7111c778b..a503b24d7 100644 --- a/roles/os_zabbix/tasks/main.yml +++ b/roles/os_zabbix/tasks/main.yml @@ -7,10 +7,14 @@ state: list register: templates -- debug: var=templates - - include_vars: template_heartbeat.yml - include_vars: template_os_linux.yml +- include_vars: template_docker.yml +- include_vars: template_openshift_master.yml +- include_vars: template_openshift_node.yml +- include_vars: template_ops_tools.yml +- include_vars: template_app_zabbix_server.yml +- include_vars: template_app_zabbix_agent.yml - name: Include Template Heartbeat include: ../../lib_zabbix/tasks/create_template.yml @@ -28,3 +32,50 @@ user: "{{ ozb_user }}" password: "{{ ozb_password }}" +- name: Include Template docker + include: ../../lib_zabbix/tasks/create_template.yml + vars: + template: "{{ g_template_docker }}" + server: "{{ ozb_server }}" + user: "{{ ozb_user }}" + password: "{{ ozb_password }}" + +- name: Include Template Openshift Master + include: ../../lib_zabbix/tasks/create_template.yml + vars: + template: "{{ g_template_openshift_master }}" + server: "{{ ozb_server }}" + user: "{{ ozb_user }}" + password: "{{ ozb_password }}" + +- name: Include Template Openshift Node + include: ../../lib_zabbix/tasks/create_template.yml + vars: + template: "{{ g_template_openshift_node }}" + server: "{{ ozb_server }}" + user: "{{ ozb_user }}" + password: "{{ ozb_password }}" + +- name: Include Template Ops Tools + include: ../../lib_zabbix/tasks/create_template.yml + vars: + template: "{{ g_template_ops_tools }}" + server: "{{ ozb_server }}" + user: "{{ ozb_user }}" + password: "{{ ozb_password }}" + +- name: Include Template App Zabbix Server + include: ../../lib_zabbix/tasks/create_template.yml + vars: + template: "{{ g_template_app_zabbix_server }}" + server: "{{ ozb_server }}" + user: "{{ ozb_user }}" + password: "{{ ozb_password }}" + +- name: Include Template App Zabbix Agent + include: ../../lib_zabbix/tasks/create_template.yml + vars: + template: "{{ g_template_app_zabbix_agent }}" + server: "{{ ozb_server }}" + user: "{{ ozb_user }}" + password: "{{ ozb_password }}" diff --git a/roles/os_zabbix/vars/template_app_zabbix_agent.yml b/roles/os_zabbix/vars/template_app_zabbix_agent.yml new file mode 100644 index 000000000..06c4eda8b --- /dev/null +++ b/roles/os_zabbix/vars/template_app_zabbix_agent.yml @@ -0,0 +1,23 @@ +--- +g_template_app_zabbix_agent: + name: Template App Zabbix Agent + zitems: + - key: agent.hostname + applications: + - Zabbix agent + value_type: character + zabbix_type: '0' + + - key: agent.ping + applications: + - Zabbix agent + description: The agent always returns 1 for this item. It could be used in combination with nodata() for availability check. + value_type: int + zabbix_type: '0' + + ztriggers: + - name: '[Reboot] Zabbix agent on {HOST.NAME} is unreachable for 15 minutes' + description: Zabbix agent is unreachable for 15 minutes. + expression: '{Template App Zabbix Agent:agent.ping.nodata(15m)}=1' + priority: high + url: https://github.com/openshift/ops-sop/blob/master/Alerts/check_ping.asciidoc diff --git a/roles/os_zabbix/vars/template_app_zabbix_server.yml b/roles/os_zabbix/vars/template_app_zabbix_server.yml new file mode 100644 index 000000000..dace2aa29 --- /dev/null +++ b/roles/os_zabbix/vars/template_app_zabbix_server.yml @@ -0,0 +1,408 @@ +--- +g_template_app_zabbix_server: + name: Template App Zabbix Server + zitems: + - key: housekeeper_creates + applications: + - Zabbix server + description: A simple count of the number of partition creates output by the housekeeper script. + units: '' + value_type: int + zabbix_type: '2' + + - key: housekeeper_drops + applications: + - Zabbix server + description: A simple count of the number of partition drops output by the housekeeper script. + units: '' + value_type: int + zabbix_type: '2' + + - key: housekeeper_errors + applications: + - Zabbix server + description: A simple count of the number of errors output by the housekeeper script. + units: '' + value_type: int + zabbix_type: '2' + + - key: housekeeper_total + applications: + - Zabbix server + description: A simple count of the total number of lines output by the housekeeper + script. + units: '' + value_type: int + zabbix_type: '2' + + - key: zabbix[process,alerter,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,configuration syncer,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,db watchdog,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,discoverer,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,escalator,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,history syncer,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,housekeeper,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,http poller,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,icmp pinger,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,ipmi poller,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,java poller,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,node watcher,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,poller,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,proxy poller,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,self-monitoring,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,snmp trapper,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,timer,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,trapper,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[process,unreachable poller,avg,busy] + applications: + - Zabbix server + description: '' + units: '%' + value_type: float + zabbix_type: '5' + + - key: zabbix[queue,10m] + applications: + - Zabbix server + description: '' + units: '' + value_type: int + zabbix_type: '5' + + - key: zabbix[queue] + applications: + - Zabbix server + description: '' + units: '' + value_type: int + zabbix_type: '5' + + - key: zabbix[rcache,buffer,pfree] + applications: + - Zabbix server + description: '' + units: '' + value_type: float + zabbix_type: '5' + + - key: zabbix[wcache,history,pfree] + applications: + - Zabbix server + description: '' + units: '' + value_type: float + zabbix_type: '5' + + - key: zabbix[wcache,text,pfree] + applications: + - Zabbix server + description: '' + units: '' + value_type: float + zabbix_type: '5' + + - key: zabbix[wcache,trend,pfree] + applications: + - Zabbix server + description: '' + units: '' + value_type: float + zabbix_type: '5' + + - key: zabbix[wcache,values] + applications: + - Zabbix server + description: '' + units: '' + value_type: float + zabbix_type: '5' + ztriggers: + - description: "There has been unexpected output while running the housekeeping script\ + \ on the Zabbix. There are only three kinds of lines we expect to see in the output,\ + \ and we've gotten something enw.\r\n\r\nCheck the script's output in /var/lib/zabbix/state\ + \ for more details." + expression: '{Template App Zabbix Server:housekeeper_errors.last(0)}+{Template App Zabbix Server:housekeeper_creates.last(0)}+{Template App Zabbix Server:housekeeper_drops.last(0)}<>{Template App Zabbix Server:housekeeper_total.last(0)}' + name: Unexpected output in Zabbix DB Housekeeping + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_DB_Housekeeping.asciidoc + + - description: An error has occurred during running the housekeeping script on the Zabbix. Check the script's output in /var/lib/zabbix/state for more details. + expression: '{Template App Zabbix Server:housekeeper_errors.last(0)}>0' + name: Errors during Zabbix DB Housekeeping + priority: high + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,alerter,avg,busy].min(600)}>75' + name: Zabbix alerter processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,configuration syncer,avg,busy].min(600)}>75' + name: Zabbix configuration syncer processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,db watchdog,avg,busy].min(600)}>75' + name: Zabbix db watchdog processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,discoverer,avg,busy].min(600)}>75' + name: Zabbix discoverer processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,escalator,avg,busy].min(600)}>75' + name: Zabbix escalator processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,history syncer,avg,busy].min(600)}>75' + name: Zabbix history syncer processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,housekeeper,avg,busy].min(1800)}>75' + name: Zabbix housekeeper processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,http poller,avg,busy].min(600)}>75' + name: Zabbix http poller processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,icmp pinger,avg,busy].min(600)}>75' + name: Zabbix icmp pinger processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,ipmi poller,avg,busy].min(600)}>75' + name: Zabbix ipmi poller processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,java poller,avg,busy].min(600)}>75' + name: Zabbix java poller processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,node watcher,avg,busy].min(600)}>75' + name: Zabbix node watcher processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,poller,avg,busy].min(600)}>75' + name: Zabbix poller processes more than 75% busy + priority: high + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,proxy poller,avg,busy].min(600)}>75' + name: Zabbix proxy poller processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,self-monitoring,avg,busy].min(600)}>75' + name: Zabbix self-monitoring processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,snmp trapper,avg,busy].min(600)}>75' + name: Zabbix snmp trapper processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: Timer processes usually are busy because they have to process time + based trigger functions + expression: '{Template App Zabbix Server:zabbix[process,timer,avg,busy].min(600)}>75' + name: Zabbix timer processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,trapper,avg,busy].min(600)}>75' + name: Zabbix trapper processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[process,unreachable poller,avg,busy].min(600)}>75' + name: Zabbix unreachable poller processes more than 75% busy + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/Zabbix_state_check.asciidoc + + - description: "This alert generally indicates a performance problem or a problem\ + \ with the zabbix-server or proxy.\r\n\r\nThe first place to check for issues\ + \ is Administration > Queue. Be sure to check the general view and the per-proxy\ + \ view." + expression: '{Template App Zabbix Server:zabbix[queue,10m].min(600)}>1000' + name: More than 1000 items having missing data for more than 10 minutes + priority: high + url: https://github.com/openshift/ops-sop/blob/master/Alerts/data_lost_overview_plugin.asciidoc + + - description: Consider increasing CacheSize in the zabbix_server.conf configuration + file + expression: '{Template App Zabbix Server:zabbix[rcache,buffer,pfree].min(600)}<5' + name: Less than 5% free in the configuration cache + priority: info + url: https://github.com/openshift/ops-sop/blob/master/Alerts/check_cache.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[wcache,history,pfree].min(600)}<25' + name: Less than 25% free in the history cache + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/check_cache.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[wcache,text,pfree].min(600)}<25' + name: Less than 25% free in the text history cache + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/check_cache.asciidoc + + - description: '' + expression: '{Template App Zabbix Server:zabbix[wcache,trend,pfree].min(600)}<25' + name: Less than 25% free in the trends cache + priority: avg + url: https://github.com/openshift/ops-sop/blob/master/Alerts/check_cache.asciidoc diff --git a/roles/os_zabbix/vars/template_docker.yml b/roles/os_zabbix/vars/template_docker.yml new file mode 100644 index 000000000..395e054de --- /dev/null +++ b/roles/os_zabbix/vars/template_docker.yml @@ -0,0 +1,89 @@ +--- +g_template_docker: + name: Template Docker + zitems: + - key: docker.ping + applications: + - Docker Daemon + value_type: int + + - key: docker.storage.is_loopback + applications: + - Docker Storage + value_type: int + + - key: docker.storage.data.space.total + applications: + - Docker Storage + value_type: float + + - key: docker.storage.data.space.used + applications: + - Docker Storage + value_type: float + + - key: docker.storage.data.space.available + applications: + - Docker Storage + value_type: float + + - key: docker.storage.data.space.percent_available + applications: + - Docker Storage + value_type: float + + - key: docker.storage.metadata.space.total + applications: + - Docker Storage + value_type: float + + - key: docker.storage.metadata.space.used + applications: + - Docker Storage + value_type: float + + - key: docker.storage.metadata.space.available + applications: + - Docker Storage + value_type: float + + - key: docker.storage.metadata.space.percent_available + applications: + - Docker Storage + value_type: float + ztriggers: + - name: 'docker.ping failed on {HOST.NAME}' + expression: '{Template Docker:docker.ping.max(#3)}<1' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_docker_ping.asciidoc' + priority: high + + - name: 'Docker storage is using LOOPBACK on {HOST.NAME}' + expression: '{Template Docker:docker.storage.is_loopback.last()}<>0' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_docker_loopback.asciidoc' + priority: high + + - name: 'Critically low docker storage data space on {HOST.NAME}' + expression: '{Template Docker:docker.storage.data.space.percent_available.max(#3)}<5 or {Template Docker:docker.storage.data.space.available.max(#3)}<5' # < 5% or < 5GB + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_docker_storage.asciidoc' + priority: high + + - name: 'Critically low docker storage metadata space on {HOST.NAME}' + expression: '{Template Docker:docker.storage.metadata.space.percent_available.max(#3)}<5 or {Template Docker:docker.storage.metadata.space.available.max(#3)}<0.005' # < 5% or < 5MB + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_docker_storage.asciidoc' + priority: high + + # Put triggers that depend on other triggers here (deps must be created first) + - name: 'Low docker storage data space on {HOST.NAME}' + expression: '{Template Docker:docker.storage.data.space.percent_available.max(#3)}<10 or {Template Docker:docker.storage.data.space.available.max(#3)}<10' # < 10% or < 10GB + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_docker_storage.asciidoc' + dependencies: + - 'Critically low docker storage data space on {HOST.NAME}' + priority: average + + - name: 'Low docker storage metadata space on {HOST.NAME}' + expression: '{Template Docker:docker.storage.metadata.space.percent_available.max(#3)}<10 or {Template Docker:docker.storage.metadata.space.available.max(#3)}<0.01' # < 10% or < 10MB + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_docker_storage.asciidoc' + dependencies: + - 'Critically low docker storage metadata space on {HOST.NAME}' + priority: average + diff --git a/roles/os_zabbix/vars/template_heartbeat.yml b/roles/os_zabbix/vars/template_heartbeat.yml index 3d0f3d51a..8dbe0d0d6 100644 --- a/roles/os_zabbix/vars/template_heartbeat.yml +++ b/roles/os_zabbix/vars/template_heartbeat.yml @@ -7,6 +7,7 @@ g_template_heartbeat: - Heartbeat key: heartbeat.ping ztriggers: - - description: 'Heartbeat.ping has failed on {HOST.NAME}' + - name: 'Heartbeat.ping has failed on {HOST.NAME}' expression: '{Template Heartbeat:heartbeat.ping.nodata(20m)}=1' priority: avg + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_node_heartbeat.asciidoc' diff --git a/roles/os_zabbix/vars/template_host.yml b/roles/os_zabbix/vars/template_host.yml deleted file mode 100644 index e7cc667cb..000000000 --- a/roles/os_zabbix/vars/template_host.yml +++ /dev/null @@ -1,27 +0,0 @@ ---- -g_template_host: - params: - name: Template Host - host: Template Host - groups: - - groupid: 1 # FIXME (not real) - output: extend - search: - name: Template Host - zitems: - - name: Host Ping - hostid: - key_: host.ping - type: 2 - value_type: 0 - output: extend - search: - key_: host.ping - ztriggers: - - description: 'Host ping has failed on {HOST.NAME}' - expression: '{Template Host:host.ping.last()}<>0' - priority: 3 - searchWildcardsEnabled: True - search: - description: 'Host ping has failed on*' - expandExpression: True diff --git a/roles/os_zabbix/vars/template_master.yml b/roles/os_zabbix/vars/template_master.yml deleted file mode 100644 index 5f9b41a4f..000000000 --- a/roles/os_zabbix/vars/template_master.yml +++ /dev/null @@ -1,27 +0,0 @@ ---- -g_template_master: - params: - name: Template Master - host: Template Master - groups: - - groupid: 1 # FIXME (not real) - output: extend - search: - name: Template Master - zitems: - - name: Master Etcd Ping - hostid: - key_: master.etcd.ping - type: 2 - value_type: 0 - output: extend - search: - key_: master.etcd.ping - ztriggers: - - description: 'Master Etcd ping has failed on {HOST.NAME}' - expression: '{Template Master:master.etcd.ping.last()}<>0' - priority: 3 - searchWildcardsEnabled: True - search: - description: 'Master Etcd ping has failed on*' - expandExpression: True diff --git a/roles/os_zabbix/vars/template_node.yml b/roles/os_zabbix/vars/template_node.yml deleted file mode 100644 index 98c343a24..000000000 --- a/roles/os_zabbix/vars/template_node.yml +++ /dev/null @@ -1,27 +0,0 @@ ---- -g_template_node: - params: - name: Template Node - host: Template Node - groups: - - groupid: 1 # FIXME (not real) - output: extend - search: - name: Template Node - zitems: - - name: Kubelet Ping - hostid: - key_: kubelet.ping - type: 2 - value_type: 0 - output: extend - search: - key_: kubelet.ping - ztriggers: - - description: 'Kubelet ping has failed on {HOST.NAME}' - expression: '{Template Node:kubelet.ping.last()}<>0' - priority: 3 - searchWildcardsEnabled: True - search: - description: 'Kubelet ping has failed on*' - expandExpression: True diff --git a/roles/os_zabbix/vars/template_openshift_master.yml b/roles/os_zabbix/vars/template_openshift_master.yml new file mode 100644 index 000000000..1de4fefbb --- /dev/null +++ b/roles/os_zabbix/vars/template_openshift_master.yml @@ -0,0 +1,58 @@ +--- +g_template_openshift_master: + name: Template Openshift Master + zitems: + - name: create_app + applications: + - Openshift Master + key: create_app + + - key: openshift.master.process.count + description: Shows number of master processes running + type: int + applications: + - Openshift Master + + - key: openshift.master.user.count + description: Shows number of users in a cluster + type: int + applications: + - Openshift Master + + - key: openshift.master.pod.running.count + description: Shows number of pods running + type: int + applications: + - Openshift Master + + - key: openshift.project.counter + description: Shows number of projects on a cluster + type: int + applications: + - Openshift Master + + ztriggers: + - name: 'Application creation has failed on {HOST.NAME}' + expression: '{Template Openshift Master:create_app.last(#1)}=1 and {Template Openshift Master:create_app.last(#2)}=1' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_create_app.asciidoc' + priority: avg + + - name: 'Openshift Master process not running on {HOST.NAME}' + expression: '{Template Openshift Master:openshift.master.process.count.max(#3)}<1' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/openshift_master.asciidoc' + priority: high + + - name: 'Too many Openshift Master processes running on {HOST.NAME}' + expression: '{Template Openshift Master:openshift.master.process.count.min(#3)}>1' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/openshift_master.asciidoc' + priority: high + + - name: 'Number of users for Openshift Master on {HOST.NAME}' + expression: '{Template Openshift Master:openshift.master.user.count.last()}=0' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/openshift_master.asciidoc' + priority: info + + - name: 'There are no projects running on {HOST.NAME}' + expression: '{Template Openshift Master:openshift.project.counter.last()}=0' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/openshift_master.asciidoc' + priority: info diff --git a/roles/os_zabbix/vars/template_openshift_node.yml b/roles/os_zabbix/vars/template_openshift_node.yml new file mode 100644 index 000000000..ce28b1048 --- /dev/null +++ b/roles/os_zabbix/vars/template_openshift_node.yml @@ -0,0 +1,44 @@ +--- +g_template_openshift_node: + name: Template Openshift Node + zitems: + - key: openshift.node.process.count + description: Shows number of OpenShift Node processes running + type: int + applications: + - Openshift Node + + - key: openshift.node.ovs.pids.count + description: Shows number of ovs process ids running + type: int + applications: + - Openshift Node + + - key: openshift.node.ovs.ports.count + description: Shows number of OVS ports defined + type: int + applications: + - Openshift Node + + ztriggers: + - name: 'Openshift Node process not running on {HOST.NAME}' + expression: '{Template Openshift Node:openshift.node.process.count.max(#3)}<1' + url: 'https://github.com/openshift/ops-sop/blob/node/V3/Alerts/openshift_node.asciidoc' + priority: high + + - name: 'Too many Openshift Node processes running on {HOST.NAME}' + expression: '{Template Openshift Node:openshift.node.process.count.min(#3)}>1' + url: 'https://github.com/openshift/ops-sop/blob/node/V3/Alerts/openshift_node.asciidoc' + priority: high + + - name: 'OVS may not be running on {HOST.NAME}' + expression: '{Template Openshift Node:openshift.node.ovs.pids.count.last()}<>4' + url: 'https://github.com/openshift/ops-sop/blob/node/V3/Alerts/openshift_node.asciidoc' + priority: high + + - name: 'Number of OVS ports is 0 on {HOST.NAME}' + expression: '{Template Openshift Node:openshift.node.ovs.ports.count.last()}=0' + url: 'https://github.com/openshift/ops-sop/blob/node/V3/Alerts/openshift_node.asciidoc' + priority: high + + diff --git a/roles/os_zabbix/vars/template_ops_tools.yml b/roles/os_zabbix/vars/template_ops_tools.yml new file mode 100644 index 000000000..d1b8a2514 --- /dev/null +++ b/roles/os_zabbix/vars/template_ops_tools.yml @@ -0,0 +1,23 @@ +--- +g_template_ops_tools: + name: Template Operations Tools + zdiscoveryrules: + - name: disc.ops.runner + key: disc.ops.runner + lifetime: 1 + description: "Dynamically register operations runner items" + + zitemprototypes: + - discoveryrule_key: disc.ops.runner + name: "Exit code of ops-runner[{#OSO_COMMAND}]" + key: "disc.ops.runner.command.exitcode[{#OSO_COMMAND}]" + value_type: int + description: "The exit code of the command run from ops-runner" + applications: + - Ops Runner + + ztriggerprototypes: + - name: 'ops-runner[{#OSO_COMMAND}]: non-zero exit code on {HOST.NAME}' + expression: '{Template Operations Tools:disc.ops.runner.command.exitcode[{#OSO_COMMAND}].last()}<>0' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_ops_runner_command.asciidoc' + priority: average diff --git a/roles/os_zabbix/vars/template_os_linux.yml b/roles/os_zabbix/vars/template_os_linux.yml index 1c9d10bb0..69432273f 100644 --- a/roles/os_zabbix/vars/template_os_linux.yml +++ b/roles/os_zabbix/vars/template_os_linux.yml @@ -10,17 +10,20 @@ g_template_os_linux: - key: kernel.all.cpu.wait.total applications: - Kernel - value_type: int + value_type: float + units: '%' - key: kernel.all.cpu.irq.hard applications: - Kernel - value_type: int + value_type: float + units: '%' - key: kernel.all.cpu.idle applications: - Kernel - value_type: int + value_type: float + units: '%' - key: kernel.uname.distro applications: @@ -35,7 +38,8 @@ g_template_os_linux: - key: kernel.all.cpu.irq.soft applications: - Kernel - value_type: int + value_type: float + units: '%' - key: kernel.all.load.15_minute applications: @@ -45,32 +49,19 @@ g_template_os_linux: - key: kernel.all.cpu.sys applications: - Kernel - value_type: int + value_type: float + units: '%' - key: kernel.all.load.5_minute applications: - Kernel value_type: float - - key: mem.freemem - applications: - - Memory - value_type: int - - key: kernel.all.cpu.nice applications: - Kernel - value_type: int - - - key: mem.util.bufmem - applications: - - Memory - value_type: int - - - key: swap.used - applications: - - Memory - value_type: int + value_type: float + units: '%' - key: kernel.all.load.1_minute applications: @@ -82,92 +73,188 @@ g_template_os_linux: - Kernel value_type: string - - key: swap.length + - key: kernel.all.uptime applications: - - Memory + - Kernel value_type: int - - key: mem.physmem + - key: kernel.all.cpu.user applications: - - Memory - value_type: int + - Kernel + value_type: float + units: '%' - - key: kernel.all.uptime + - key: kernel.uname.machine applications: - Kernel - value_type: int + value_type: string - - key: swap.free + - key: hinv.ncpu applications: - - Memory + - Kernel value_type: int - - key: mem.util.used + - key: kernel.all.cpu.steal applications: - - Memory - value_type: int + - Kernel + value_type: float + units: '%' - - key: kernel.all.cpu.user + - key: kernel.all.pswitch applications: - Kernel value_type: int - - key: kernel.uname.machine + - key: kernel.uname.release applications: - Kernel value_type: string - - key: hinv.ncpu + - key: proc.nprocs applications: - Kernel value_type: int - - key: mem.util.cached + # Memory Items + - key: mem.freemem applications: - Memory value_type: int + description: "PCP: free system memory metric from /proc/meminfo" + multiplier: 1024 + units: B - - key: kernel.all.cpu.steal + - key: mem.util.bufmem applications: - - Kernel + - Memory value_type: int + description: "PCP: Memory allocated for buffer_heads.; I/O buffers metric from /proc/meminfo" + multiplier: 1024 + units: B - - key: kernel.all.pswitch + - key: swap.used applications: - - Kernel + - Memory value_type: int + description: "PCP: swap used metric from /proc/meminfo" + multiplier: 1024 + units: B - - key: kernel.uname.release + - key: swap.length applications: - - Kernel - value_type: string + - Memory + value_type: int + description: "PCP: total swap available metric from /proc/meminfo" + multiplier: 1024 + units: B - - key: proc.nprocs + - key: mem.physmem applications: - - Kernel + - Memory value_type: int + description: "PCP: The value of this metric corresponds to the \"MemTotal\" field reported by /proc/meminfo. Note that this does not necessarily correspond to actual installed physical memory - there may be areas of the physical address space mapped as ROM in various peripheral devices and the bios may be mirroring certain ROMs in RAM." + multiplier: 1024 + units: B - - key: filesys.avail + - key: swap.free applications: - - Disk + - Memory value_type: int + description: "PCP: swap free metric from /proc/meminfo" + multiplier: 1024 + units: B - - key: filesys.capacity + - key: mem.util.available applications: - - Disk + - Memory value_type: int + description: "PCP: The amount of memory that is available for a new workload, without pushing the system into swap. Estimated from MemFree, Active(file), Inactive(file), and SReclaimable, as well as the \"low\" watermarks from /proc/zoneinfo.; available memory from /proc/meminfo" + multiplier: 1024 + units: B - - key: filesys.free + - key: mem.util.used applications: - - Disk + - Memory value_type: int + description: "PCP: Used memory is the difference between mem.physmem and mem.freemem; used memory metric from /proc/meminfo" + multiplier: 1024 + units: B - - key: filesys.full + - key: mem.util.cached applications: - - Disk + - Memory + value_type: int + description: "PCP: Memory used by the page cache, including buffered file data. This is in-memory cache for files read from the disk (the pagecache) but doesn't include SwapCached.; page cache metric from /proc/meminfo" + multiplier: 1024 + units: B + + zdiscoveryrules: + - name: disc.filesys + key: disc.filesys + lifetime: 1 + description: "Dynamically register the filesystems" + + zitemprototypes: + - discoveryrule_key: disc.filesys + name: "disc.filesys.full.{#OSO_FILESYS}" + key: "disc.filesys.full[{#OSO_FILESYS}]" value_type: float - - - key: filesys.used + description: "PCP filesys.full option. This is the percent full returned from pcp filesys.full" applications: - Disk + + - discoveryrule_key: disc.filesys + name: "Percentage of used inodes on {#OSO_FILESYS}" + key: "disc.filesys.inodes.pused[{#OSO_FILESYS}]" value_type: float + description: "PCP derived value of percentage of used inodes on a filesystem." + applications: + - Disk + + ztriggerprototypes: + - name: 'Filesystem: {#OSO_FILESYS} has less than 15% free disk space on {HOST.NAME}' + expression: '{Template OS Linux:disc.filesys.full[{#OSO_FILESYS}].last()}>85' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_filesys_full.asciidoc' + priority: warn + + - name: 'Filesystem: {#OSO_FILESYS} has less than 10% free disk space on {HOST.NAME}' + expression: '{Template OS Linux:disc.filesys.full[{#OSO_FILESYS}].last()}>90' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_filesys_full.asciidoc' + priority: high + + - name: 'Filesystem: {#OSO_FILESYS} has less than 10% free inodes on {HOST.NAME}' + expression: '{Template OS Linux:disc.filesys.inodes.pused[{#OSO_FILESYS}].last()}>90' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_filesys_full.asciidoc' + priority: warn + + - name: 'Filesystem: {#OSO_FILESYS} has less than 5% free inodes on {HOST.NAME}' + expression: '{Template OS Linux:disc.filesys.inodes.pused[{#OSO_FILESYS}].last()}>95' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_filesys_full.asciidoc' + priority: high + + ztriggers: + - name: 'Too many TOTAL processes on {HOST.NAME}' + expression: '{Template OS Linux:proc.nprocs.last()}>5000' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_proc.asciidoc' + priority: warn + + - name: 'Lack of available memory on {HOST.NAME}' + expression: '{Template OS Linux:mem.freemem.last()}<30720000' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_memory.asciidoc' + priority: warn + description: 'Alert on less than 30MegaBytes. This is 30 Million Bytes. 30000 KB x 1024' + + # CPU Utilization # + - name: 'CPU idle less than 5% on {HOST.NAME}' + expression: '{Template OS Linux:kernel.all.cpu.idle.last()}<5 and {Template OS Linux:kernel.all.cpu.idle.last(#2)}<5' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_cpu_idle.asciidoc' + priority: high + description: 'CPU is less than 5% idle' + + - name: 'CPU idle less than 10% on {HOST.NAME}' + expression: '{Template OS Linux:kernel.all.cpu.idle.last()}<10 and {Template OS Linux:kernel.all.cpu.idle.last(#2)}<10' + url: 'https://github.com/openshift/ops-sop/blob/master/V3/Alerts/check_cpu_idle.asciidoc' + priority: warn + description: 'CPU is less than 10% idle' + dependencies: + - 'CPU idle less than 5% on {HOST.NAME}' diff --git a/roles/os_zabbix/vars/template_router.yml b/roles/os_zabbix/vars/template_router.yml deleted file mode 100644 index 4dae7da1e..000000000 --- a/roles/os_zabbix/vars/template_router.yml +++ /dev/null @@ -1,27 +0,0 @@ ---- -g_template_router: - params: - name: Template Router - host: Template Router - groups: - - groupid: 1 # FIXME (not real) - output: extend - search: - name: Template Router - zitems: - - name: Router Backends down - hostid: - key_: router.backends.down - type: 2 - value_type: 0 - output: extend - search: - key_: router.backends.down - ztriggers: - - description: 'Number of router backends down on {HOST.NAME}' - expression: '{Template Router:router.backends.down.last()}<>0' - priority: 3 - searchWildcardsEnabled: True - search: - description: 'Number of router backends down on {HOST.NAME}' - expandExpression: True |