Rsync for bulk transfer
Suppose you have to transfer a huge number of files, say 50,000, over a WAN as a one-time data copy, and each file is 3-4 MB. There are a couple of options: tar (needs extra space unless you pipe it through ssh), scp (copies sequentially), or rsync.
What I have done here is use rsync's --files-from option, which specifies exactly which files to transfer.
On the source, create the file list (50,000 lines in this case) with ls > /tmp/a, run in the source directory because the names given to --files-from are taken relative to it. Then split that list by line count. The line count is the deciding factor: it sets how many list files you get, and therefore how many simultaneous sessions you can start. If you have a DS3 or faster link, make it smaller; a smaller line count gives more list files and so more concurrent sessions.
split -l 1000 /tmp/a
This creates 50000/1000 = 50 list files named xaa, xab, and so on (x is split's default prefix; you can choose your own) in the present working directory, assumed here to be /tmp.
Once that is done, run the command below for each list file, either one at a time or from a script.
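The list-and-split step can be tried out on a small scale first. The sketch below is a self-contained demo, not the original job: it works in a scratch directory from mktemp, and 25 dummy files with 10 lines per chunk stand in for the 50,000 files and 1000 lines. The chunk. prefix is an arbitrary choice for the demo.

```shell
#!/bin/sh
# Demo of the list-and-split step on a scratch directory.
# Assumptions: 25 dummy files and 10 lines per chunk stand in for the
# real 50000 files and 1000 lines; the "chunk." prefix is arbitrary.
set -eu

work=$(mktemp -d)            # scratch area, left in place for inspection
src="$work/src"              # stands in for the real source directory
mkdir "$src"

# Create 25 empty dummy files.
i=1
while [ "$i" -le 25 ]; do
    : > "$src/file$i.dat"
    i=$((i + 1))
done

# Build the list with names relative to the source directory, which is
# how rsync --files-from interprets them.
( cd "$src" && ls ) > "$work/filelist"

# Split into chunks of at most 10 lines: chunk.aa, chunk.ab, chunk.ac
split -l 10 "$work/filelist" "$work/chunk."

total=$(wc -l < "$work/filelist" | tr -d ' ')
chunk_count=$(ls "$work"/chunk.* | wc -l | tr -d ' ')
echo "files listed: $total, chunks created: $chunk_count"
```

The last chunk holds whatever is left over (5 names here), so the number of chunks is the file count divided by the line count, rounded up.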
nohup /root/rsync/bin/rsync -avz --files-from=/tmp/xaa --log-file=/tmp/xaa.log --sockopts=SO_SNDBUF=65535,SO_RCVBUF=65535 -e rsh:/abc/ /abc/ &
Repeat the command for each remaining list file: the next one uses /tmp/xab with /tmp/xab.log as its log file, and so on. You could use a single log file for all of them, but I prefer to keep them separate so it is easy to see whether each transfer succeeded.
You can use either ssh or rsh as the remote shell. The socket options shown here match my TCP/IP stack tuning; yours may differ.
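Putting the per-chunk commands in a script could look roughly like the sketch below. It is only an illustration under some assumptions: DEST is a hypothetical placeholder for the remote host and directory, the launch_chunks function name is made up for this example, and DRY_RUN defaults to 1 so the script prints the commands without starting any transfer. The rsync path, flags, and socket options are the ones used in the command above.

```shell
#!/bin/sh
# Sketch: one background rsync per list file produced by split.
# Assumptions (not from the original post): DEST is a placeholder for the
# remote host and directory, and DRY_RUN=1 only prints the commands.
set -eu

RSYNC=${RSYNC:-/root/rsync/bin/rsync}    # rsync binary path used in the post
SRC=${SRC:-/abc/}                         # source directory from the post
DEST=${DEST:-remotehost:/abc/}            # hypothetical destination
DRY_RUN=${DRY_RUN:-1}                     # set to 0 to really start transfers
SOCKOPTS=SO_SNDBUF=65535,SO_RCVBUF=65535  # tuning values from the post

launch_chunks() {
    chunk_dir=$1
    launched=0
    # x[a-z][a-z] matches xaa, xab, ... but not the xaa.log files.
    for list in "$chunk_dir"/x[a-z][a-z]; do
        [ -f "$list" ] || continue        # glob matched nothing
        if [ "$DRY_RUN" = 1 ]; then
            echo "$RSYNC -avz --files-from=$list --log-file=$list.log" \
                 "--sockopts=$SOCKOPTS -e rsh $SRC $DEST"
        else
            nohup "$RSYNC" -avz --files-from="$list" --log-file="$list.log" \
                --sockopts="$SOCKOPTS" -e rsh "$SRC" "$DEST" \
                >/dev/null 2>&1 &
        fi
        launched=$((launched + 1))
    done
    wait                                  # returns at once in a dry run
    echo "sessions: $launched"
}

launch_chunks "${CHUNK_DIR:-/tmp}"
```

Each session still writes its own log file next to its list file, so a failed chunk can be rerun on its own by repeating just that one command.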