You want to fetch files from a website (with wget)
Sometimes, despite all the other tools and processes available, you just
need to fetch a file from a website and put it on the local machine.
While it's not the recommended way to manage things, it's always nice to
have as an option. In this example we'll use the wget
puppet wrapper to download the file for us.
First, install the Puppet Forge module:
# install the module and its dependencies
$ sudo /opt/puppetlabs/bin/puppet module install maestrodev-wget
...
Notice: Installing -- do not interrupt ...
/etc/puppetlabs/code/environments/production/modules
- maestrodev-wget (v1.7.3)
...
Once you have the module you can use it to download the file:
class fetch_file {
  include ::wget

  wget::fetch { 'https://www.unixdaemon.net/index.xml':
    destination => '/tmp/unixdaemon-feed.xml',
    timeout     => 15,
    verbose     => true,
  }
}
# run puppet
...
Notice: /Stage[main]/Fetch_file/Wget::Fetch[https://www.unixdaemon.net/index.xml]
/Exec[wget-https://www.unixdaemon.net/index.xml]/returns: executed successfully
...
$ ls -alh /tmp/unixdaemon-feed.xml
-rw-r--r--. 1 root root 79K Jun 2 15:59 /tmp/unixdaemon-feed.xml
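If you want to try this locally, one way (not the only one) is to save
the class to a standalone manifest, add include fetch_file at the end,
and run it through puppet apply:

# fetch_file.pp holds the class from above plus 'include fetch_file'
$ sudo /opt/puppetlabs/bin/puppet apply fetch_file.pp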
There are a few other use cases documented in the README
that are worth understanding, especially local caching, which ensures
you're not constantly fetching the file just to discard it when it
hasn't changed; there's a short sketch of that after the next example.
One that provides a big benefit for very little effort is
better resource naming. By specifying the URL in a source parameter
you can use an actual descriptive name as the resource title, which
should make the logs much easier to read.
class fetch_named_file {
  include ::wget

  wget::fetch { 'unixdaemon index file':
    source      => 'https://www.unixdaemon.net/index.xml',
    destination => '/tmp/unixdaemon-feed.xml',
    timeout     => 15,
  }
}
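And here's the caching sketch promised above. The module's README
documents a cache_dir parameter on wget::fetch; this assumes the
parameter is still present in the version shown here and that the cache
directory itself is managed elsewhere:

class fetch_cached_file {
  include ::wget

  # cache_dir keeps a local copy and only refetches when the remote
  # file changes (parameter taken from the module README)
  wget::fetch { 'unixdaemon index file':
    source      => 'https://www.unixdaemon.net/index.xml',
    destination => '/tmp/unixdaemon-feed.xml',
    cache_dir   => '/var/cache/wget',
    timeout     => 15,
  }
}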
I also think you should always specify a timeout on all wget
resources. A resource default is an easy way to enforce this, though
remember that Puppet resource defaults are dynamically scoped, so keep
them close to the resources they're meant to cover.
class resource_default {
  include ::wget

  Wget::Fetch {
    timeout => 15,
  }
}
Without being too judgemental, I feel I should note a few things about
using this approach in your Puppet code base. At the very least you
should use the module's caching functionality to ensure your runs don't
slow down while you constantly re-fetch the same resource. There's also
no checksum verification of the fetched files, even those behind a
username and password, to ensure they're what you expected. In most
cases you'd be better off either checking the files into the puppet
master, or, even better, packaging them with fpm and deploying them via
your normal package manager, with all the benefits (checksum
verification, file tracking etc.) that ensue.
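As a sketch of that first alternative, serving the file from the puppet
master with a plain file resource gets you content checksums and change
reporting for free. The module name and path here are hypothetical:

class fetch_file_from_master {
  # puppet:///modules/unixdaemon/index.xml maps to
  # modules/unixdaemon/files/index.xml on the master
  file { '/tmp/unixdaemon-feed.xml':
    ensure => file,
    source => 'puppet:///modules/unixdaemon/index.xml',
    owner  => 'root',
    group  => 'root',
    mode   => '0644',
  }
}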