
Downloading an entire FTP directory tree

2025/5/5 9:19:20 · Source: https://blog.csdn.net/weixin_45249411/article/details/144643324

I recently needed to download close to 10 TB of data from an FTP server. After evaluating a number of tools, lftp's mirror command turned out to be the least hassle.
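For orientation, a minimal invocation looks roughly like this; the host name, credentials, and paths are placeholders rather than the ones used in this job:

# Mirror a remote directory tree to a local directory (placeholders throughout)
lftp -c "open -u myuser,mypass ftp.example.com; \
         mirror --verbose /remote/dir /local/dir"

A trailing slash on the target makes mirror append the source base name to the target path, which is what the script later in this post relies on.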

mirror options

Mirror specified source directory to local target directory. If target directory ends with a slash, the source base name is appended to target directory name. Source and/or target can be URLs pointing to directories.

-c,    --continue               continue a mirror job if possible
-e,    --delete                 delete files not present at remote site
       --delete-first           delete old files before transferring new ones
       --depth-first            descend into subdirectories before transferring files
-s,    --allow-suid             set suid/sgid bits according to remote site
       --allow-chown            try to set owner and group on files
       --ascii                  use ascii mode transfers (implies --ignore-size)
       --ignore-time            ignore time when deciding whether to download
       --ignore-size            ignore size when deciding whether to download
       --only-missing           download only missing files
       --only-existing          download only files already existing at target
-n,    --only-newer             download only newer files (-c won't work)
       --no-empty-dirs          don't create empty directories (implies --depth-first)
-r,    --no-recursion           don't go to subdirectories
       --no-symlinks            don't create symbolic links
-p,    --no-perms               don't set file permissions
       --no-umask               don't apply umask to file modes
-R,    --reverse                reverse mirror (put files)
-L,    --dereference            download symbolic links as files
-N,    --newer-than=SPEC        download only files newer than specified time
       --on-change=CMD          execute the command if anything has been changed
       --older-than=SPEC        download only files older than specified time
       --size-range=RANGE       download only files with size in specified range
-P,    --parallel[=N]           download N files in parallel
       --use-pget[-n=N]         use pget to transfer every single file
       --loop                   loop until no changes found
-i RX, --include RX             include matching files
-x RX, --exclude RX             exclude matching files
-I GP, --include-glob GP        include matching files
-X GP, --exclude-glob GP        exclude matching files
-v,    --verbose[=level]        verbose operation
       --log=FILE               write lftp commands being executed to FILE
       --script=FILE            write lftp commands to FILE, but don't execute them
       --just-print, --dry-run  same as --script=-
       --use-cache              use cached directory listings
       --Remove-source-files    remove files after transfer (use with caution)
-a                              same as --allow-chown --allow-suid --no-umask
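Before kicking off a multi-terabyte transfer, it can help to preview what mirror would actually do. The sketch below combines --dry-run (print the transfer commands instead of executing them) with an exclude pattern; the host, credentials, paths, and the *.tmp pattern are placeholders:

# Print what would be transferred without downloading anything (placeholders throughout)
lftp -c "open -u myuser,mypass ftp.example.com; \
         mirror --dry-run --only-missing --exclude-glob *.tmp /remote/dir /local/dir"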

Issues encountered

1. Although mirror supports parallel transfers, and we were downloading three large directories (each containing many subdirectories), directory listing took up a surprising amount of the run time. It is better to mirror the subdirectories directly, so that more transfer threads stay busy; a minimal sketch of this is shown after these notes.

2. Make sure to use the --only-missing option. With other options such as --only-newer, for reasons I never pinned down, files that already existed locally were deleted and then downloaded again.
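A minimal sketch of the per-subdirectory approach from note 1, shown for just two subdirectories, with placeholder credentials and a modest --parallel value:

# One mirror job per subdirectory: listings stay small and the jobs overlap
lftp -c "open -u myuser,mypass ftp.example.com; \
         mirror --parallel=8 --only-missing /fumulu/zimulu1 /data/0/bendi/fumulu/" &
lftp -c "open -u myuser,mypass ftp.example.com; \
         mirror --parallel=8 --only-missing /fumulu/zimulu2 /data/0/bendi/fumulu/" &
wait   # wait for both background lftp processes to finish

The full script below applies the same idea to all seven subdirectories and adds per-directory logging and failure tracking.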

#!/bin/bash
# FTP server details
FTP_HOST="xxxxx"
FTP_USER="xxxx"
FTP_PASS="xxxxxxx"

# Remote-to-local directory pairs to sync
declare -A DIR_MAP=(
    ["/fumulu/zimulu1"]="/data/0/bendi/fumulu/"
    ["/fumulu/zimulu2"]="/data/0/bendi/fumulu/"
    ["/fumulu/zimulu3"]="/data/0/bendi/fumulu/"
    ["/fumulu/zimulu4"]="/data/0/bendi/fumulu/"
    ["/fumulu/zimulu5"]="/data/0/bendi/fumulu/"
    ["/fumulu/zimulu6"]="/data/0/bendi/fumulu/"
    ["/fumulu/zimulu7"]="/data/0/bendi/fumulu/"
)

# Create the log directory
LOG_DIR="sync_logs"
mkdir -p "$LOG_DIR"

sync_directory() {
    local remote_dir=$1
    local local_dir=$2
    # Build the log file name (replace directory separators with underscores)
    local log_name=$(echo "${remote_dir}" | tr '/' '_')
    local log_file="$LOG_DIR/${log_name}sync.log"

    # Make sure the local directory exists
    mkdir -p "$local_dir"

    echo "Starting sync of $remote_dir to $local_dir..." | tee -a "$log_file"
    echo "Sync started: $(date)" >> "$log_file"

    # Sync with lftp, downloading only files that are missing locally
    temp_log=$(mktemp)
    lftp -c "open -u $FTP_USER,$FTP_PASS $FTP_HOST; \
        mirror --parallel=1000 --verbose --only-missing $remote_dir $local_dir" 2>&1 | tee -a "$temp_log" "$log_file"

    # Check for files that failed to download
    if grep -i "File not available" "$temp_log" > /dev/null; then
        echo "Some downloads failed; recording them in shibai.txt..."
        # Extract and record the failed file paths
        grep -i "File not available" "$temp_log" | while read -r line; do
            # Pull out the full path and file name
            full_path=$(echo "$line" | grep -o "@.*" | cut -d' ' -f1)
            echo "$full_path" >> shibai.txt
        done
    fi

    echo "Sync finished: $(date)" >> "$log_file"
    echo "----------------------------------------" >> "$log_file"

    # Remove the temporary log file
    rm -f "$temp_log"
}

# Start all sync jobs at the same time
for remote_dir in "${!DIR_MAP[@]}"; do
    local_dir=${DIR_MAP[$remote_dir]}
    sync_directory "$remote_dir" "$local_dir" &
done

# Wait for all background jobs to finish
wait
echo "All sync jobs finished."
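One convenient consequence of --only-missing: rerunning the script is effectively a retry of whatever ended up in shibai.txt, since files already on disk are skipped. Assuming the script above is saved as sync_ftp.sh (a name chosen here for illustration, not from the original setup):

# See how many files failed, then simply rerun the script to retry them
wc -l shibai.txt
bash sync_ftp.sh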
