Tar on Linux
在Linux下,文件以及文件系统的压缩与打包工具十分丰富,如compress,gzip,bzip2,tar,dump,restore,dd,cpio等等。这里仅针对文件的增量备份做一个简单的测试,主要使用tar。
命令组合:tar,bzip2,split。其中tar负责打包文件,bzip2负责压缩,split负责文件分割。
其他工具:rsync,inotify,sha1sum。rsync负责文件镜像,inotify负责文件实时监控,sha1sum负责文件校验。
测试
边打包、压缩,边分割
[dylan@server ~]# tar -g snapshot/snapshot.snar -cjpvf - source | split -a 3 -b 1K - backup/backup_201303121100.tar.bz2.
tar 选项
-c 创建新的归档文件
-j 使用bzip2压缩归档文件
-g 即--listed-incremental 指定增量备份
-p 归档时保留文件权限
-f 指定归档文件
- stdout
split 选项
-a 指定后缀名的长度。根据数字或字母,确定分割后的最大文件数
如果后缀为数字[0-9],则分割后最多有 10 ** ${suffix_length};
如果后缀为字母[a-z],则分割后最多有 26 ** ${suffix_length};
-b 指定分割大小,以字节为单位
- stdin
查看目录结构
[dylan@server ~]# tree .
.
├── backup
│ ├── backup_201303121100.tar.bz2.aaa
│ ├── backup_201303121100.tar.bz2.aab
│ └── backup_201303121100.tar.bz2.aac
├── snapshot
│ └── snapshot.snar
└── source
... ...
测试分割后的文件
[dylan@server backup]# tar -tf backup_201303121100.tar.bz2.aaa
bzip2: Compressed file ends unexpectedly;
perhaps it is corrupted? *Possible* reason follows.
bzip2: Inappropriate ioctl for device
Input file = (stdin), output file = (stdout)
It is possible that the compressed file(s) have become corrupted.
You can use the -tvv option to test integrity of such files.
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
tar: Child returned status 2
tar: Error is not recoverable: exiting now
合并分割后的文件
[dylan@server backup]# cat * > backup_201303121100.tar.bz2
查看要恢复的文件
[dylan@server backup]# tar -tf backup_201303121100.tar.bz2
source/
source/foo/
source/foo/bar/
source/foo/bar/fooo/
source/0
source/1
[dylan@server backup]# tar -tf backup_201303121100.tar.bz2 | grep 50
source/50
解压指定的文件
[dylan@server backup]# tar -xjvf backup_201303121100.tar.bz2 source/50
source/50
查看文件
[dylan@server backup]# tree source/
source/
└── 50
0 directories, 1 file
脚本
backup.sh
#!/bin/sh
#abstract:
#backup files using tar, bzip2
#history:
#2013-3-12 [email protected] first release
source=/fs
#backup settings
dist=/backup
prefix=''
suffix='.tar.bz2.'
timestamp=$(date +%Y%m%d%H%M%S)
current=$(date +%Y/%m)
size=500M
recipients=dba
base=${dist}/${current}
pieces=${base}/${prefix}${timestamp}${suffix}
snapshot=${base}/snapshot.snar
log=${base}/${timestamp}.log
start=$(date +%s)
checksum="sha1sum started:"
ls ${source} ${dist} && mkdir -p ${base} && tar -g ${snapshot} -cjpvf - ${source} | split -b ${size} - ${pieces} && echo -e ${checksum} > ${log} && sha1sum ${pieces}* >> ${log}
end=$(date +%s)
elapsed=$(expr ${end} - ${start})
summary="sha1sum ended.\nsummary:\n\tstarted: ${start}\tended: ${end}\n\telapsed: ${elapsed} seconds."
echo -e ${summary} >> ${log}
#mail log
cat ${log} | mail -a ${log} -s "$(date +%Y-%m-%d\ %H:%M:%S):$(hostname) dba daily check of fs backup" ${recipients}
以上脚本每个月都会产生新的快照文件(snapshot每个月会重新创建);增量备份的频率,即脚本运行的频率。若以月为周期太长,则可以更改current=$(date +%Y/%m)
,(如current=$(date +%Y/%M)
,周期为一个星期)自行定义。
log
sha1sum started:
24c00b3eff967fec250ac7663ac03cbd35cbe40e /backup/2013/03/20130312163458.tar.bz2.aa
c2ccb143c514d1afcebfb7a84b4642e168bdd93d /backup/2013/03/20130312163458.tar.bz2.ab
7313889f5057365b168980c8f1765f97016fe913 /backup/2013/03/20130312163458.tar.bz2.ac
a186a9aee90606d53755719a63bed1f407ce6d88 /backup/2013/03/20130312163458.tar.bz2.ad
945979757291b2f1ac588cbf5c82637f20e37eb3 /backup/2013/03/20130312163458.tar.bz2.ae
642732865fcc1d43b620b255bf9f32d504f02079 /backup/2013/03/20130312163458.tar.bz2.af
9cf6a651d4d084d5fe3dc595aa57dc4d61484e3c /backup/2013/03/20130312163458.tar.bz2.ag
9eae089684016860aec91b50913d451c5a8689ce /backup/2013/03/20130312163458.tar.bz2.ah
dce088dcdd7e3f6e8d7a58007e590cfdc9047b9e /backup/2013/03/20130312163458.tar.bz2.ai
5db4a8f6196e1c11cc9dcf29422f56248111dc6e /backup/2013/03/20130312163458.tar.bz2.aj
sha1sum ended.
summary:
started:1363077298 ended at:1363077466
elapsed: 168 seconds.
参考
blog comments powered by Disqus