RSYNC:Private Article:Implementation & Lab Testing Guide
From CoolSolutionsWiki
Document Overview
This document serves as a reference guide for implementing RSYNC on NetWare 6.x to achieve bi-directional replication. Bi-directional replication essentially consists of two servers replicating to one another; the two directions can be separated so that replication is unidirectional if desired, as in NBO.
Objective
To develop a method by which changes to files on NetWare servers can be efficiently synchronized from one point to another over existing private network connections.
Methodology
Leverage rsync, a freeware synchronization utility from rsync.samba.org that has been ported to NetWare, to create a unidirectional and, optionally, bidirectional synchronization and replication mechanism.
Requirements
- Create a standard configuration for NetWare servers that will allow them to act as destinations for incoming data, and sources for outgoing data.
- Design an architecture that allows for data at remote locations to be centrally collected at Tulsa's highly available, fault-tolerant, redundant, controlled environment.
- Provide a mechanism by which data stays "loosely consistent" throughout the day, synchronizing changes no less frequently than once per hour.
- Provide the ability to perform bandwidth throttling of incoming synchronized data to avoid performance and IP telephony issues.
- Handle changes to files as efficiently as possible (at the bit or block level) for any type of data file.
Description of Test Environment Configuration
Test environment consisted of a NetWare 6 SP2 server in a non-production tree (LAB) on the production Ethernet network, and a NetWare 6 SP2 server in the production tree (PROD). Both servers had equivalent patch revisions, SYS and DATA volumes, and were configured to be roughly identical.
Configuration Steps
Obtained rsync binaries from http://forge.novell.com. Copied contents of \Novell\Rsync\Sys to the LAB server (named BACKUP), and the PROD server (SERVERNAME).
Performing modifications to the default configuration involves modifying two files, and creating one directory for each incoming server's data. The two files are rsyncd.conf, and rsyncstr.ncf.
rsyncd.conf
Modification of the rsyncd.conf is very straightforward. Directories and paths on the "server" file system (destination) are aliased with a friendly name, configurable in the rsyncd.conf as a "Module". The entries for Modules in rsyncd.conf roughly resemble a Windows INI configuration file. The default rsyncd.conf contains an entry named BOMA pointing to a sample file system location. Renamed the default BOMA module to reflect the source server's name. The configuration file used on the SERVERNAME server in this environment is included here as a reference:
uid = nobody
gid = nobody
max connections = 0
syslog facility = local5
# Change the pid file, log file, and motd as needed
pid file = SYS:/rsync/rsyncd.pid
log file = SYS:/rsync/rsyncd.log
motd file = SYS:/rsync/rsyncd.motd
# Set the timeout value to 1 hour (60 seconds * 60 minutes) for now
# this will affect all modules, or you can put it under each module
timeout = 3600

[SERVERNAME]
path = DATA:/rsync/SERVERNAME
comment = SERVERNAME Server Backup Area
read only = no
use chroot = no
timeout = 3600
transfer logging = yes
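Since one module is needed for each incoming server, the stanza can be generated rather than typed by hand. The following is a minimal sketch, not part of the tested configuration: `render_module` is a hypothetical helper, and it assumes every source server gets a directory under `<volume>:/rsync/` with the same per-module settings as the reference file above.

```python
def render_module(server_name, volume="DATA", timeout=3600):
    """Return an rsyncd.conf module stanza for one incoming source server.

    Hypothetical helper: the keys and values mirror the [SERVERNAME]
    entry in the reference configuration; only the name varies.
    """
    lines = [
        f"[{server_name}]",
        f"path = {volume}:/rsync/{server_name}",
        f"comment = {server_name} Server Backup Area",
        "read only = no",
        "use chroot = no",
        f"timeout = {timeout}",
        "transfer logging = yes",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print a stanza that could be appended to rsyncd.conf on the receiver.
    print(render_module("SERVERNAME"))
```

The directory named in the `path` line still has to be created manually on the receiving server.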
rsyncstr.ncf - Startup for rsync daemon.
For the rsyncstr.ncf file, removed the comment for the single-server configuration load statement. Also took out the SSL switch, as this test was between trees with non-matching "SSL CertificateIP" objects. The SSL switch is not necessary in the private network, but is expected to work when loading the daemon and connecting within a single tree. IP addresses were not modified, leaving the default 0.0.0.0 and port 873. The rsyncstr.ncf file used in the tests is included here for reference:
REM rsyncstr.ncf - Start rsync Receive Daemon
REM Use the following command line for the SSL-enabled daemon
REM rsync -v --progress --ssl --port=873 --daemon
REM Use the following command line for the non-SSL daemon
rsync -v --progress --port=873 --daemon
Folder Creation
The "path" entry in the module listed above must be created manually in order for the files to be stored properly by the rsync daemon. In this instance, the directory path DATA:RSYNC\SERVERNAME was created on BACKUP to handle data coming from the SERVERNAME server. The opposite was done on the SERVERNAME server, where a DATA:RSYNC\BACKUP directory was created.
"Send" Replication - Execution from Command line
From the "sending" NetWare server, used the following command line to get the sync to work. In this example, I wanted to send SYS:ETC from the local server to DATA:RSYNC\SERVERNAME\ETC on the remote server. Again, the rsyncd.conf file had a module named "SERVERNAME" pointing to that path (DATA:RSYNC\SERVERNAME):
SERVERNAME: rsync -rRutzvP --volume=SYS: /etc 192.168.1.000::SERVERNAME
This command successfully replicated the SYS:ETC directory, contents, and all sub-folders, to DATA:RSYNC\SERVERNAME on the RSYNC server. Subsequent iterations of this command with minor modifications did in fact indicate that only changes to files were being replicated.
Further tested with options such as --delete, etc. to develop the preferred implementation command line options, also without issue.
Testing Notes / Lessons Learned
Case Sensitivity
The Module name (specified after the "::") is CASE SENSITIVE. This must match what is in rsyncd.conf on the remote server. For example, specifying the destination as 192.168.1.6::servername would not work, but 192.168.1.6::SERVERNAME would work. You may use whichever case you like, as long as you are consistent between the command line and rsyncd.conf.
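A case mismatch can be caught before a transfer is attempted by comparing the requested module name against the section headers in the remote server's rsyncd.conf. This is a minimal sketch under the assumption that a copy of that file's text is available locally; `module_names` and `check_module` are hypothetical helpers, not part of rsync.

```python
import re

# A module header is a line consisting only of a bracketed name, e.g. [SERVERNAME].
MODULE_HEADER = re.compile(r"^\s*\[([^\]]+)\]\s*$")


def module_names(conf_text):
    """Return the module names declared in rsyncd.conf text, in file order."""
    names = []
    for line in conf_text.splitlines():
        match = MODULE_HEADER.match(line)
        if match:
            names.append(match.group(1).strip())
    return names


def check_module(conf_text, requested):
    """Report whether `requested` matches a module name exactly (case included)."""
    names = module_names(conf_text)
    if requested in names:
        return "ok"
    for name in names:
        if name.lower() == requested.lower():
            return f"case mismatch: use {name!r}"
    return "unknown module"


if __name__ == "__main__":
    conf = "uid = nobody\n[SERVERNAME]\npath = DATA:/rsync/SERVERNAME\n"
    print(check_module(conf, "servername"))
```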
Command Line Syntax
Must use two colons ("::") in the destination in order to avoid use of SSH or any of the other Linux-esque functionality of the rsync program. A single colon implies that rsync is being run over SSH or another remote shell program. This may be useful on Linux/Unix systems, but is not necessary or relevant on the NetWare platform. Using two colons indicates you are using the native rsync protocol and ports ONLY.
Must also use forward slashes in paths, not backslashes. The utility was developed for Linux systems primarily.
Finally, the --volume switch must be specified in order for rsync to properly locate the source files specified on the command line. In order to replicate files from more than one volume, rsync must be run iteratively with a new value for volume on the command line.
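The three syntax rules above (double colon, forward slashes, explicit --volume) can be enforced by assembling the command line in one place. The sketch below is illustrative only: `build_rsync_command` is a hypothetical helper, and it emits only switches that appear in this document (-rRutzvP, n, --delete, --bwlimit, --exclude, --volume).

```python
def build_rsync_command(volume, source_path, host, module, *,
                        delete=False, dry_run=False,
                        bwlimit_kbps=None, excludes=()):
    """Assemble a NetWare rsync "send" command line as a single string."""
    if "\\" in source_path:
        # Paths must use forward slashes, not backslashes.
        raise ValueError("use forward slashes in source paths")
    # Appending "n" to the short switches requests a dry run.
    args = ["rsync", "-rRutzvP" + ("n" if dry_run else "")]
    if delete:
        args.append("--delete")
    if bwlimit_kbps is not None:
        args.append(f"--bwlimit={bwlimit_kbps}")
    for pattern in excludes:
        args += ["--exclude", f'"{pattern}"']
    # --volume tells rsync where to locate the source path.
    args.append(f"--volume={volume}:")
    args.append(source_path)
    # Two colons select the native rsync protocol; the module name is case sensitive.
    args.append(f"{host}::{module}")
    return " ".join(args)


if __name__ == "__main__":
    print(build_rsync_command("SYS", "/etc", "192.168.1.6", "SERVERNAME"))
```

To replicate more than one volume, the helper would be called once per volume, matching the one-volume-per-run limitation noted above.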
Using Secure Sockets Layer (SSL)
You have to add the --ssl option on the command line if you want secure communication to a remote server. The server must also have been started with the --ssl switch in the rsyncstr.ncf file (the default option).
Using rsync Re-entrantly
You can run the daemon and simultaneously start an rsync session to another server. It does not appear that rsync executes re-entrantly; instead, it launches another instance.
Monitoring Progress
Details on what's being copied should show up in the logger screen on the source server, and on the rsync screen on the remote server's console.
Performing Experimental rsync Commands
Use the "n" switch (e.g. -rRutzvPn) to do a dry-run evaluation of files to be copied. It will do everything but send them over the wire, making it easy to determine whether or not your config and command line are correct.
Changes to rsyncd.conf While the Daemon is Running
The rsyncd.conf file is read each time a client connects, so it is unnecessary to unload and reload rsync in order for configuration changes to take effect. Like any Linux configuration file, this can be opened in a standard text editor.
Efficiency and Bandwidth Utilization
Rsync demonstrated an ability to take minute changes to any type of file, text or otherwise, and replicate them with the minimum amount of over-the-wire traffic. Bandwidth throttling worked as described, limiting throughput to as little as 5 KBps during the laboratory testing.
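The idea behind block-level replication can be illustrated with a deliberately simplified sketch. This is not rsync's actual algorithm: rsync uses a rolling checksum so that matching data can be found at any byte offset, whereas the hypothetical `changed_blocks` helper below only compares fixed-position blocks. It is enough to show why a small edit to a large file needs only a small transfer.

```python
import hashlib


def changed_blocks(old, new, block_size=4):
    """Return indexes of fixed-size blocks in `new` that differ from `old`.

    Simplified illustration only: blocks are compared position by position,
    so an insertion that shifts later data would mark every later block.
    """
    def digests(data):
        return [hashlib.sha256(data[i:i + block_size]).digest()
                for i in range(0, len(data), block_size)]

    old_digests = digests(old)
    new_digests = digests(new)
    return [index for index, digest in enumerate(new_digests)
            if index >= len(old_digests) or digest != old_digests[index]]


if __name__ == "__main__":
    # Only the middle block differs, so only that block would need to be sent.
    print(changed_blocks(b"aaaabbbbcccc", b"aaaaXbbbcccc"))
```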
Recommended Implementation
Implementation should follow the steps outlined here, replacing host names and addresses as necessary.
For remote sites, each should point to a single IP address in Tulsa. Ideally, this would be a clustered resource connected to a large SAN array dedicated to remote synchronization.
A CRONTAB file would be created on each remote site that executed the replication command line on an hourly schedule. The minimum replication command line options are referenced below. Additionally, command line options such as --bwlimit can be put in place to perform bandwidth throttling for on-hours replication traffic.
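The hourly schedule could be expressed as a single crontab entry per remote site. The sketch below assumes the scheduler on each server accepts conventional five-field crontab entries (minute, hour, day of month, month, day of week); that format was not verified in the lab tests, and `hourly_crontab_line` is a hypothetical helper.

```python
def hourly_crontab_line(command, minute=0):
    """Return a five-field crontab entry that runs `command` once per hour.

    Assumes conventional crontab syntax; adjust if the scheduler differs.
    """
    if not 0 <= minute <= 59:
        raise ValueError("minute must be between 0 and 59")
    return f"{minute} * * * * {command}"


if __name__ == "__main__":
    # A different `minute` per site would spread the load on the central server.
    print(hourly_crontab_line("rsync -rRutzvP --volume=SYS: /etc 192.168.1.6::SERVERNAME", minute=15))
```

Giving each remote site a different minute value is one way to keep all sites from connecting to Tulsa at the same moment.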
Simple Replication
Command line for true replication (delete files on destination that are not on source):
rsync -rRutzvP --delete --volume=SYS: /etc 192.168.1.000::SERVERNAME
Personal and Shared Data Replication
For Data volumes, also exclude temporary files from Office products, etc., using the exclude directive as shown here:
rsync -rRutzvP --delete --exclude "~*.*" --exclude "*.tmp" --exclude "*.mp3" --volume=SYS: /etc 192.168.1.000::SERVERNAME
Bandwidth Conservative Replication
The --bwlimit switch would be added, specifying a number in KBps at which to limit throughput from the client to the server.
rsync -rRutzvP --delete --bwlimit=25 --exclude "~*.*" --exclude "*.tmp" --exclude "*.mp3" --volume=SYS: /etc 192.168.1.000::SERVERNAME
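When choosing a --bwlimit value, it helps to check it against the hourly synchronization requirement. The arithmetic below is a rough upper bound on what one run can put on the wire before the next run is due; the helper names are hypothetical, and the 3600-second window reflects the once-per-hour requirement.

```python
def max_transfer_kb(bwlimit_kbps, window_seconds=3600):
    """Upper bound, in KB, on data sent in one window at the given limit."""
    return bwlimit_kbps * window_seconds


def fits_in_window(changed_kb, bwlimit_kbps, window_seconds=3600):
    """True if `changed_kb` of traffic can be sent before the next run starts."""
    return changed_kb <= max_transfer_kb(bwlimit_kbps, window_seconds)


if __name__ == "__main__":
    # 25 KBps * 3600 s = 90000 KB, or about 88 MB per hourly run.
    print(max_transfer_kb(25))
```

If the changes at a site routinely exceed this ceiling, hourly runs would start to overlap, so either the limit or the schedule would need to change.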
Next Steps
Ideally, a single Perl or other script could be deployed to each server via TED, along with the required software. The goal would be to have a script that could automatically configure itself based upon the server's address, name, etc.
This would allow us to maintain a single script file, easing the integration of this functionality into new and existing servers.
