ansible-junos-stdlib icon indicating copy to clipboard operation
ansible-junos-stdlib copied to clipboard

Timeout value for rpc timeout not used

Open T0nip opened this issue 3 years ago • 10 comments

Issue Type

  • Bug Report

Module Name

juniper.device.config

juniper.device collection and Python libraries version

ansible --version
ansible [core 2.14.1]
  config file = None
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.9/dist-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/local/bin/ansible
  python version = 3.9.2 (default, Feb 28 2021, 17:03:44) [GCC 10.2.1 20210110] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True
pip freeze
ansible==7.1.0
ansible-core==2.14.1
ansible-pylibssh==1.1.0
bcrypt==4.0.1
cffi==1.15.1
cryptography==39.0.0
Jinja2==3.1.2
junos-eznc==2.6.6
jxmlease==1.0.3
lxml==4.9.2
MarkupSafe==2.1.2
ncclient==0.6.13
netaddr==0.8.0
packaging==23.0
paramiko==3.0.0
pycparser==2.21
PyNaCl==1.5.0
pyparsing==3.0.9
pyserial==3.5
PyYAML==6.0
resolvelib==0.8.1
scp==0.14.4
six==1.16.0
supervisor==4.2.2
transitions==0.9.0
xmltodict==0.13.0
yamlordereddictloader==0.4.0

OS / Environment

Juniper srx300 21.2R3-S2.9 Summary

The provided value for timeout isn't used in modules, so the playbook fails while commiting changes.

Steps to reproduce

Commit changes that take longer than 30s.

        - name: install new config
          config:
            host: "{{ inventory_hostname }}"
            port: "{{ sshport }}"
            user: "{{ username }}"
            passwd: "{{ password }}"
            load: 'set'
            src: "{{ config_path }}{{ config_file }}"
            logfile: "junos.log"
            confirmed: 20
            timeout: 120

Expected results

timeout value should be used, so setting 120s should adjust the timeout used, to 120s.

Actual results

playbook stops with timeout error after 30s


T0nip avatar Jan 26 '23 13:01 T0nip

the same issue is there any workaround?

IvanBCode avatar Feb 24 '23 17:02 IvanBCode

i have the same issue timeout is 180 but the default value is used

  • name: Commit confirmed 10 config: host: "{{ scp_ip }}" config_mode: "private" load: "set" format: "set" src: "../../output/{{ inventory_hostname }}-all.txt" confirm: 10 timeout: 180

fatal: [xxx]: FAILED! => {"changed": false, "msg": "Failure committing the configuraton: RpcTimeoutError(host: xxx, cmd: commit-configuration, timeout: 30)"}

fatal: [xxx]: FAILED! => { "changed": false, "invocation": { "module_args": { "attempts": null, "baud": null, "check": null, "check_commit_wait": null, "comment": "Full Config Update via "junos_fw_commit_auto.yml"", "commit": null, "commit_empty_changes": false, "commit_force_sync": false, "commit_full": false, "commit_sync": false, "config_mode": "private", "confirm": 10, "confirmed": 10, "console": null, "cs_passwd": null, "cs_user": null, "dest": null, "dest_dir": null, "diff": null, "diffs_file": null, "filter": null, "format": "set", "host": "xxx", "huge_tree": false, "ignore_warning": null, "level": null, "lines": null, "load": "set", "logdir": null, "logfile": null, "mode": null, "model": null, "namespace": null, "options": {}, "passwd": null, "port": 830, "remove_ns": null, "retrieve": null, "return_output": true, "rollback": null, "src": "../../output/xxx", "ssh_config": null, "ssh_private_key_file": "xxx", "template": null, "timeout": 180, "url": null, "user": "root", "vars": null } }, "msg": "Failure committing the configuraton: RpcTimeoutError(host: xxx, cmd: commit-configuration, timeout: 30)" }

m0msg avatar Mar 18 '23 15:03 m0msg

Same here. Fortunately, I have a dev server where it's taken into account. The only difference is juniper.device runs in 1.0.1.

Force install 1.0.1 solve the issue : ansible-galaxy collection install juniper.device:==1.0.1

Darkuja avatar Mar 21 '23 13:03 Darkuja

Ran into same issue. Putting timeout value in quotes fixed it for me.

juniper.device.config:
  port: "830"
  timeout: "45"
  load: merge

I noticed similar behavior with the port setting. Although I had defined 830 it was still opening a connection over port 22. After putting the port value in quotes it started using port 830 as I had defined.

Its confusing because the documentation shows timeout value as an int and port value as int or str

cbutera-sqsp avatar Jun 15 '23 18:06 cbutera-sqsp

I too have been running into this and it seems the only function of the task that does not use the provided timeout is the commit function. Adding the timeout value in quotes as previously described did not help me.

Every other step in the config task from Open a candidate configuration database to Retrieve the configuration database from the Junos device uses the provided timeout value from the Ansible task. The commit step does not. From this output snippet, using -vvvv during the running of the task, we can see the timeout is set to the one I defined in Ansible (300), but the default is still 30 seconds.

....

            "timeout": 300,    <---------------- Timeout defined by me in task code
            "url": null,
            "user": "sanitized",
            "vars": null
        }
    },
    "msg": "Failure committing the configuraton: RpcTimeoutError(host: router_name, cmd: commit-configuration, timeout: 30)"  <-------------- default in use
}

Potential Bug:

I looked through the module code and I think I identified a potential bug.

In file https://github.com/Juniper/ansible-junos-stdlib/blob/master/ansible_collections/juniper/device/plugins/module_utils/juniper_junos_common.py

The function name is commit_configuration.

The start of the function sets the default timeout to 30:

   def commit_configuration(self, ignore_warning=None, comment=None,
                             confirmed=None, timeout=30, full=False,
                             sync=False, force_sync=False):

The commit is called later in the function and shows timeout=timeout:

        try:
            self.config.commit(ignore_warning=ignore_warning,
                               comment=comment,
                               confirm=confirmed,
                               timeout=timeout,
                               full=full,
                               force_sync=force_sync,
                               sync=sync)
            self.logger.debug("Configuration committed.")
        except (self.pyez_exception.RpcError,
                self.pyez_exception.ConnectError) as ex:
            self.fail_json(msg='Failure committing the configuraton: %s' %
                               (str(ex)))

Comparing this to previous functions, there never seems to be an instance where the commit_configuration calls the user defined timeout. Previous functions have the following in use and I believe a call to a user defined timeout if it exists is needed to resolve this issue.

Example from the open function:

self.dev.timeout = timeout

*** Edited this section to provide the correct code I meant to copy

The commit_configuration just always calls it's own default.

Workaround:

ncclient times out at 30 seconds, but the router continues to commit fine so the configuration changes do go through. I worked around this by putting each config change in it's own block and adding a rescue that was a copy of the original task.

Ansible's second attempt sees the config is there and moves right along with the next task fine. I have upgraded ~10-15 routers this way now and it appears to be a valid workaround.

Example:

- name: Disabling all procotols block
  block:
    - name: Disable protocols
      juniper.device.config:
        load: "set"
        lines:
          - "deactivate protocols ldp"
          - "deactivate protocols l2circuit"
          - "deactivate protocols mpls"
          - "deactivate protocols rsvp"
        ignore_warning: true
        commit: true
        timeout: 300
        level: DEBUG
        logfile: "{{ logfile }}"

  rescue:
    - name: Disable protocols rescue
      juniper.device.config:
        load: "set"
        lines:
          - "deactivate protocols ldp"
          - "deactivate protocols l2circuit"
          - "deactivate protocols mpls"
          - "deactivate protocols rsvp"
        ignore_warning: true
        commit: true
        timeout: 300
        level: DEBUG
        logfile: "{{ logfile }}"

jnicholson56 avatar Sep 15 '23 16:09 jnicholson56

I was able to do a quick and dirty update to the juniper_junos_common.py and it seems to resolve the issue.

I added the following code to the commit_configuration function:

        if self.dev.timeout:
            timeout = self.dev.timeout

Full function code:

    def commit_configuration(
        self,
        ignore_warning=None,
        comment=None,
        confirmed=None,
        timeout=30,
        full=False,
        sync=False,
        force_sync=False,
    ):
        """Commit the candidate configuration.

        Commit the configuration. Assumes the configuration is already opened.

        Args:
            ignore_warning - Which warnings to ignore.
            comment - The commit comment
            confirmed - Number of minutes for commit confirmed.
            timeout - Timeout for commit configuration. Default timeout value is 30s.
            full - apply full commit
            sync - Check for commit syntax and sync between RE's
            force_sync - Ignore syntax check and force to sync between RE's

        Failures:
            - An error returned from committing the configuration.
        """

        if self.dev.timeout:
            timeout = self.dev.timeout

        if self.conn_type != "local":
            self._pyez_conn.commit_configuration(
                ignore_warning, comment, timeout, confirmed
            )
            return

        if self.dev is None or self.config is None:
            self.fail_json(msg="The device or configuration is not open.")

        self.logger.debug("Committing the configuration.")
        try:
            self.config.commit(
                ignore_warning=ignore_warning,
                comment=comment,
                confirm=confirmed,
                timeout=timeout,
                full=full,
                force_sync=force_sync,
                sync=sync,
            )
            self.logger.debug("Configuration committed.")
        except (self.pyez_exception.RpcError, self.pyez_exception.ConnectError) as ex:
            self.fail_json(msg="Failure committing the configuraton: %s" % (str(ex)))

jnicholson56 avatar Sep 15 '23 17:09 jnicholson56

Confirmed fix: Before adding fix: TASK [configuration settings] ******************************************************************************************************************* fatal: [ex2300]: FAILED! => {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python3"}, "changed": false, "msg": "Failure committing the configuraton: RpcTimeoutError(host: <snip>, cmd: commit-configuration, timeout: 30)"}

After adding fix (with "timeout: '60'"): TASK [configuration settings] ****************************************************************************************************************************************************** changed: [ex2300]

mpc755 avatar Nov 10 '23 17:11 mpc755

@chidanandpujar Can we fix this in the main code? The fix mentioned by @jnicholson56 is working fine for us.

vishnubraj avatar Dec 12 '23 18:12 vishnubraj

@chidanandpujar @jnicholson56 Can we create a PR for this?

vishnubraj avatar Jan 05 '24 07:01 vishnubraj

@vishnubraj I thought you were with juniper :D I am not against that but it seems like maybe no one is working this case actively. I am also unsure of the release cadence for this ansible package.

jnicholson56 avatar Jan 05 '24 15:01 jnicholson56

Hi @T0nip Fix is merged now https://github.com/Juniper/ansible-junos-stdlib/pull/665 hence closing this issue, Please verify and update the results .

Thanks Chidanand

chidanandpujar avatar May 17 '24 07:05 chidanandpujar