Batch-download exe for UCAR FNL and other data (resumable)

sam_doggy · posted on 2019-3-2 22:28:33

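This post was last edited by sam_doggy on 2019-4-15 19:50

Update 2019.4.15

The program is now packaged as an exe file, so it can be run directly without installing Python.
Since UCAR already provides the download code as a .py script, I did not bother scraping the site. Instead, the user is expected to copy certain variables from the provided .py code.

Note that the copied variables should be in the format shown in the picture, and filelist must not contain any line breaks.
When I next have time, I may change the keyboard input into a direct regex search over the script text.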
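
A minimal sketch of what such a regex search could look like, assuming the UCAR-provided script defines `dspath` and `filelist` at the start of a line, in the same form as the first-version script further down in this post. The function name `extract_download_vars` is hypothetical and is not part of the exe.

```python
import ast
import re

def extract_download_vars(script_text):
    """Return (dspath, filelist) parsed from the text of a UCAR download script."""
    # dspath is assumed to be a single quoted string on one line
    dspath_match = re.search(r"""^dspath\s*=\s*(['"])(.*?)\1""",
                             script_text, re.MULTILINE)
    # filelist is assumed to be a list literal that may span several lines
    filelist_match = re.search(r"^filelist\s*=\s*(\[.*?\])",
                               script_text, re.MULTILINE | re.DOTALL)
    if dspath_match is None or filelist_match is None:
        raise ValueError("dspath or filelist not found in script text")
    dspath = dspath_match.group(2)
    # literal_eval parses the list literal without executing the script
    filelist = ast.literal_eval(filelist_match.group(1))
    return dspath, filelist

sample = """
dspath = 'http://rda.ucar.edu/data/ds502.0/'
filelist = [
'big_endian/2003/20030101_3hr-025deg_cpc+comb',
'big_endian/2003/20030102_3hr-025deg_cpc+comb']
for file in filelist:
    pass
"""
dspath, filelist = extract_download_vars(sample)
print(dspath)
print(len(filelist))
```

Because the list is parsed as a Python literal, line breaks inside filelist would no longer matter. The sketch would fail on a file name that itself contains `]`.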

################# First version

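Sharing a small modification to the Python requests download script from the UCAR website. It is only useful as a reference for people who have never used the requests library.
The main additions are:
1. Resumable downloads: an interrupted download continues from where it stopped
2. Files that are already fully downloaded are skipped, so you only need to re-run the program each time

Only two places in the script were changed.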
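
The two additions come down to one decision per file, which can be sketched in isolation. This is only an illustration: `plan_download` is a hypothetical helper name, and the sizes are passed in by hand here instead of being read from disk and from the server.

```python
def plan_download(local_size, total_size):
    """Decide what to do with one file.

    Returns None when the local copy is already complete (skip it),
    otherwise the extra request headers that ask the server for only
    the bytes that are still missing.
    """
    if local_size == total_size:
        return None
    return {'Range': 'bytes=%d-' % local_size}

print(plan_download(0, 1000))     # nothing on disk yet
print(plan_download(400, 1000))   # interrupted after 400 bytes
print(plan_download(1000, 1000))  # already complete
```

The returned dictionary is what gets passed as `headers=` to `requests.get`, and the local file is then opened in append mode so the new bytes land after the existing ones.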

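For those who have never used Python:
1. Search for Python 3 (for example on Baidu) and download it,
2. After configuring the environment variables, run it directly in cmd,
3. To run, type "python <path of the script to run>"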

#!/usr/bin/env python
#################################################################
# Python Script to retrieve 2 online Data files of 'ds502.0',
# total 88.47M. This script uses 'requests' to download data.
#
# Highlight this script by Select All, Copy and Paste it into a file;
# make the file executable and run it on command line.
#
# You need pass in your password as a parameter to execute
# this script; or you can set an environment variable RDAPSWD
# if your Operating System supports it.
#
# Contact tcram@ucar.edu (Thomas Cram) for further assistance.
#################################################################
import sys, os
import requests
def check_file_status(filepath, filesize):
    sys.stdout.write('\r')
    sys.stdout.flush()
    size = int(os.stat(filepath).st_size)
    percent_complete = (size/filesize)*100
    sys.stdout.write('%.3f %s' % (percent_complete, '% Completed'))
    sys.stdout.flush()
# Try to get password
if len(sys.argv) < 2 and not 'RDAPSWD' in os.environ:
    try:
        import getpass
        input = getpass.getpass
    except:
        try:
            input = raw_input
        except:
            pass
    pswd = input('Password: ')
else:
    try:
        pswd = sys.argv[1]
    except:
        pswd = os.environ['RDAPSWD']
url = 'https://rda.ucar.edu/cgi-bin/login'
values = {'email' : 'your email address', 'passwd' : pswd, 'action' : 'login'}
# Fill in your own email address here
ret = requests.post(url,data=values)
if ret.status_code != 200:
    print('Bad Authentication')
    print(ret.text)
    exit(1)
dspath = 'http://rda.ucar.edu/data/ds502.0/'
filelist = [
'big_endian/2003/20030101_3hr-025deg_cpc+comb',
'big_endian/2003/20030102_3hr-025deg_cpc+comb']
for file in filelist:
    filename=dspath+file
    file_base = os.path.basename(file)
    print('Downloading',file_base)
    ##### Add this part every time, at this same position ##################
    # The first request is only used to get the total file size
    r1 = requests.get(filename, cookies = ret.cookies, allow_redirects=True, stream=True)
    total_size = int(r1.headers['Content-Length'])
    r1.close()
    # Check how much of the local file (file_base, not the URL) is already downloaded
    if os.path.exists(file_base):
        temp_size = os.path.getsize(file_base)
        # Size of the part already downloaded locally
    else:
        temp_size = 0
    # Skip the file if it is already complete
    if temp_size==total_size:
        continue
    # Core part: request only the bytes after what is already on disk
    headers = {'Range': 'bytes=%d-' % temp_size}
    # Send the request again with the new request header
    ################################################
    req = requests.get(filename, cookies = ret.cookies, allow_redirects=True, stream=True,headers=headers)
    with open(file_base, 'ab') as outfile:
        # 'wb' must be changed to 'ab' every time, so new data is appended
        chunk_size=1048576
        for chunk in req.iter_content(chunk_size=chunk_size):
            outfile.write(chunk)
            # Report progress against the full file size, not the remaining part
            if chunk_size < total_size:
                check_file_status(file_base, total_size)
    check_file_status(file_base, total_size)
    print()
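# This is my modified file, with my additions marked. To re-download, just run it again; files that are already downloaded are skipped automatically. Remember to change the email address to your own.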
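
One thing the script above does not check is whether the server actually honoured the Range header. As a hedged sketch: under HTTP range requests, a server that supports them answers with status 206 and a Content-Range header, while a server that ignores the header answers 200 with the whole file, in which case appending would corrupt the local copy. The helper names below are hypothetical, and how the UCAR server responds in practice is not verified here.

```python
import re

def open_mode_for(status_code):
    """Pick the local file mode from the HTTP status of a ranged request."""
    if status_code == 206:
        # Partial Content: only the missing bytes were sent, so append
        return 'ab'
    if status_code == 200:
        # Whole file was sent: Range was ignored, so overwrite from scratch
        return 'wb'
    raise ValueError('unexpected status code: %d' % status_code)

def total_size_from_content_range(value):
    """Parse the total size from a value such as 'bytes 400-999/1000'.

    Returns None when the server reports the total as '*' (unknown).
    """
    match = re.fullmatch(r'bytes \d+-\d+/(\d+|\*)', value.strip())
    if match is None:
        raise ValueError('unrecognised Content-Range: %r' % value)
    total = match.group(1)
    return None if total == '*' else int(total)

print(open_mode_for(206))
print(total_size_from_content_range('bytes 400-999/1000'))
```

In the script, `open_mode_for(req.status_code)` could replace the fixed `'ab'` in the `open` call.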

cookie-o-o · posted on 2019-3-18 10:10:01

Learning from this, thank you very much!

cookie-o-o · posted on 2019-3-18 11:26:59

Learning from this, thank you very much!

taxueyueming · posted on 2019-7-30 10:09:57

OP, can this do batch downloads?

沐展眉 · posted on 2019-9-27 10:54:33

OP, I want to download the UCAR dataset "NCAR CESM Global Bias-Corrected CMIP5 Output to Support WRF/MPAS Research". I modified dspath and filelist, but the downloaded files always end up 0 KB in size. Why is that?

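taxueyueming · posted on 2019-10-10 09:16:06

Hello OP, the program has recently stopped being able to download (automatic reply: please do not use download tools such as Xunlei; click here for download help). It always worked before. Is something wrong?
The error message flashes by too quickly to read, so I cannot take a screenshot...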

bearwoo · posted on 2020-3-8 10:59:27

Thanks for sharing! Learning from this~
