Merge remote-tracking branch 'upstream/master' into upstream

gongqijian 2013-02-16 22:08:53 +08:00
commit 20ef501ec9
50 changed files with 278 additions and 122 deletions

5
.gitignore vendored

@ -1,9 +1,10 @@
/build/
/dist/
/*.egg-info/
/MANIFEST
*.egg-info/
*.py[cod]
_*/
*.py[cod]
*.download
*.cmt.*

6
.travis.yml Normal file

@ -0,0 +1,6 @@
# https://travis-ci.org/soimort/you-get
language: python
python:
- "3.2"
- "3.3"
script: make test


@ -1,6 +1,32 @@
Changelog
=========
0.3.1
-----
*Date: 2013-02-15*
* Fix issues for Google+ and Mixcloud.
* API changed.
0.3.0
-----
*Date: 2013-02-08*
* Add support for:
- Niconico
0.3dev-20130201
---------------
*Date: 2013-02-01*
* Add support for:
- Mixcloud
- Facebook
- Joy.cn
0.3dev-20130125
---------------


@ -1,21 +0,0 @@
# file GENERATED by distutils, do NOT edit
CHANGELOG.txt
LICENSE.txt
Makefile
README.md
README.txt
setup.cfg
setup.py
you-get
you-get.json
you_get/__init__.py
you_get/common.py
you_get/main.py
you_get/downloader/__init__.py
you_get/downloader/tudou.py
you_get/downloader/yinyuetai.py
you_get/downloader/youku.py
you_get/downloader/youtube.py
you_get/processor/__init__.py
you_get/processor/merge_flv.py
you_get/processor/merge_mp4.py


@ -2,13 +2,21 @@ SETUP = python3 setup.py
.PHONY: default clean build sdist bdist bdist_egg install release
default: build sdist bdist bdist_egg
default: i
i:
@(cd src/; python -i -c 'import you_get; print("You-Get %s (%s)\n>>> import you_get" % (you_get.__version__, you_get.__date__))')
test:
$(SETUP) test
clean:
zenity --question
rm -fr build/ dist/ *.egg-info/
rm -fr build/ dist/ src/*.egg-info/
find . | grep __pycache__ | xargs rm -fr
all: build sdist bdist bdist_egg
build:
$(SETUP) build


@ -1,11 +1,13 @@
# You-Get
[You-Get](https://github.com/soimort/you-get) is a video downloader that runs on Python 3. It aims to ease downloading videos from [YouTube](http://www.youtube.com), [Youku](http://www.youku.com)/[Tudou](http://www.tudou.com) (the biggest online video providers in China), etc., in one script.
[You-Get](https://github.com/soimort/you-get) is a video downloader that runs on Python 3. It aims to ease downloading videos from [YouTube](http://www.youtube.com), [Youku](http://www.youku.com)/[Tudou](http://www.tudou.com) (the biggest online video providers in China), [Niconico](http://www.nicovideo.jp), etc., in one script.
See the project homepage <http://www.soimort.org/you-get> for further documentation.
Fork me on GitHub: <https://github.com/soimort/you-get>
[![Build Status](https://api.travis-ci.org/soimort/you-get.png)](https://travis-ci.org/soimort/you-get)
## Features
### Supported Sites (As of Now)
@ -17,6 +19,8 @@ Fork me on GitHub: <https://github.com/soimort/you-get>
* Google+ <http://plus.google.com>
* Tumblr <http://www.tumblr.com>
* SoundCloud <http://soundcloud.com>
* Mixcloud <http://www.mixcloud.com>
* Niconico (ニコニコ動画) <http://www.nicovideo.jp>
* Youku (优酷) <http://www.youku.com>
* Tudou (土豆) <http://www.tudou.com>
* YinYueTai (音悦台) <http://www.yinyuetai.com>
@ -148,17 +152,14 @@ For a complete list of all available options, see:
In Python 3 (interactive):
>>> import you_get
>>> you_get.__version__
'0.2'
>>> you_get.youtube_download("http://www.youtube.com/watch?v=8bQlxQJEzLk", info_only = True)
>>> from you_get.downloader import *
>>> youtube.download("http://www.youtube.com/watch?v=8bQlxQJEzLk", info_only = True)
Video Site: YouTube.com
Title: If you're good at something, never do it for free!
Type: WebM video (video/webm)
Size: 0.13 MB (133176 Bytes)
>>> import you_get
>>> you_get.any_download("http://www.youtube.com/watch?v=sGwy8DsUJ4M")
Video Site: YouTube.com
Title: Mort from Madagascar LIKES
@ -211,6 +212,8 @ You-Get is based on the Youku download script [iambus/youku-lixian](https://github.com/iambus/y
* Google+ <http://plus.google.com>
* Tumblr <http://www.tumblr.com>
* SoundCloud <http://soundcloud.com>
* Mixcloud <http://www.mixcloud.com>
* Niconico Douga (NICONICO动画) <http://www.nicovideo.jp>
* Youku (优酷) <http://www.youku.com>
* Tudou (土豆) <http://www.tudou.com>
* YinYueTai (音悦台) <http://www.yinyuetai.com>


@ -1,7 +1,7 @@
You-Get
=======
`You-Get <https://github.com/soimort/you-get>`_ is a video downloader that runs on Python 3. It aims to ease downloading videos from `YouTube <http://www.youtube.com>`_, `Youku <http://www.youku.com>`_/`Tudou <http://www.tudou.com>`_ (the biggest online video providers in China), etc., in one script.
`You-Get <https://github.com/soimort/you-get>`_ is a video downloader that runs on Python 3. It aims to ease downloading videos from `YouTube <http://www.youtube.com>`_, `Youku <http://www.youku.com>`_/`Tudou <http://www.tudou.com>`_ (the biggest online video providers in China), `Niconico <http://www.nicovideo.jp>`_, etc., in one script.
See the project homepage http://www.soimort.org/you-get for further documentation.
@ -20,6 +20,8 @@ Supported Sites (As of Now)
* Google+ http://plus.google.com
* Tumblr http://www.tumblr.com
* SoundCloud http://soundcloud.com
* Mixcloud http://www.mixcloud.com
* Niconico (ニコニコ動画) http://www.nicovideo.jp
* Youku (优酷) http://www.youku.com
* Tudou (土豆) http://www.tudou.com
* YinYueTai (音悦台) http://www.yinyuetai.com
@ -156,17 +158,14 @@ Examples (For Developers)
In Python 3 (interactive)::
>>> import you_get
>>> you_get.__version__
'0.2'
>>> you_get.youtube_download("http://www.youtube.com/watch?v=8bQlxQJEzLk", info_only = True)
>>> from you_get.downloader import *
>>> youtube.download("http://www.youtube.com/watch?v=8bQlxQJEzLk", info_only = True)
Video Site: YouTube.com
Title: If you're good at something, never do it for free!
Type: WebM video (video/webm)
Size: 0.13 MB (133176 Bytes)
>>> import you_get
>>> you_get.any_download("http://www.youtube.com/watch?v=sGwy8DsUJ4M")
Video Site: YouTube.com
Title: Mort from Madagascar LIKES


@ -1,13 +1,16 @@
#!/usr/bin/env python3
PROJ_METADATA = 'you-get.json'
PROJ_NAME = 'you-get'
PACKAGE_NAME = 'you_get'
PROJ_METADATA = '%s.json' % PROJ_NAME
import os, json, imp
here = os.path.abspath(os.path.dirname(__file__))
proj_info = json.loads(open(os.path.join(here, PROJ_METADATA)).read())
README = open(os.path.join(here, 'README.txt')).read()
CHANGELOG = open(os.path.join(here, 'CHANGELOG.txt')).read()
VERSION = imp.load_source('version', os.path.join(here, 'you_get/version.py')).__version__
VERSION = imp.load_source('version', os.path.join(here, 'src/%s/version.py' % PACKAGE_NAME)).__version__
from setuptools import setup, find_packages
setup(
@ -24,7 +27,10 @@ setup(
long_description = README + '\n\n' + CHANGELOG,
packages = find_packages(),
packages = find_packages('src'),
package_dir = {'' : 'src'},
test_suite = 'tests',
platforms = 'any',
zip_safe = False,

9
src/you_get/__init__.py Normal file

@ -0,0 +1,9 @@
#!/usr/bin/env python
from .processor import *
from .downloader import *
from .version import *
from .common import *
from .__main__ import *


@ -32,6 +32,8 @@ def url_to_module(url):
'joy': joy,
'kankanews': bilibili,
'ku6': ku6,
'mixcloud': mixcloud,
'nicovideo': nicovideo,
'pptv': pptv,
'qq': qq,
'sina': sina,
@ -63,3 +65,6 @@ def any_download_playlist(url, output_dir = '.', merge = True, info_only = False
def main():
script_main('you-get', any_download, any_download_playlist)
if __name__ == "__main__":
main()


@ -18,7 +18,7 @@ fake_headers = {
'Accept-Charset': 'UTF-8,*;q=0.5',
'Accept-Encoding': 'gzip,deflate,sdch',
'Accept-Language': 'en-US,en;q=0.8',
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1'
'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20100101 Firefox/13.0'
}
if sys.stdout.isatty():
@ -131,10 +131,21 @@ def url_info(url, faker = False):
    }
    if type in mapping:
        ext = mapping[type]
    else:
        type = None
        if headers['content-disposition']:
            filename = parse.unquote(r1(r'filename="?(.+)"?', headers['content-disposition']))
            if len(filename.split('.')) > 1:
                ext = filename.split('.')[-1]
            else:
                ext = None
        else:
            ext = None

    if headers['transfer-encoding'] != 'chunked':
        size = int(headers['content-length'])
    else:
        size = None

    return type, ext, size
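
Review note on the `content-disposition` fallback in this hunk: `(.+)` is greedy, so the pattern `filename="?(.+)"?` keeps a trailing quote in the captured name. Assuming `r1` returns the first capture group, a header such as `attachment; filename="video.mp4"` would yield the extension `mp4"`. A minimal sketch of a stricter parse follows. `ext_from_content_disposition` is a hypothetical helper, not part of you-get, and it deliberately ignores the RFC 6266 `filename*=` form.

```python
import re
from urllib import parse


def ext_from_content_disposition(value):
    """Return the filename extension from a Content-Disposition value, or None."""
    if not value:
        return None
    # Stop the capture at a quote or semicolon so a closing quote is never included.
    match = re.search(r'filename="?([^";]+)"?', value)
    if not match:
        return None
    filename = parse.unquote(match.group(1)).strip()
    if '.' not in filename:
        return None
    return filename.rsplit('.', 1)[-1]
```

With a helper like this, the `else` branch could reduce to a single assignment to `ext`, with `None` still signalling that no extension was found.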
@ -357,7 +368,6 @@ def download_urls(urls, title, ext, total_size, output_dir = '.', refer = None,
print('Real URLs:\n', urls, '\n')
return
#assert ext in ('3gp', 'flv', 'mp4', 'webm')
if not total_size:
try:
total_size = urls_size(urls)
@ -506,7 +516,7 @@ def playlist_not_supported(name):
def print_info(site_info, title, type, size):
if type in ['3gp']:
type = 'video/3gpp'
elif type in ['asf']:
elif type in ['asf', 'wmv']:
type = 'video/x-ms-asf'
elif type in ['flv', 'f4v']:
type = 'video/x-flv'
@ -646,7 +656,7 @@ def script_main(script_name, download, download_playlist = None):
sys.exit(1)
if not args:
print(help)
sys.exit(1)
sys.exit()
set_http_proxy(proxy)


@ -10,7 +10,9 @@ from .ifeng import *
from .iqiyi import *
from .joy import *
from .ku6 import *
from .mixcloud import *
from .netease import *
from .nicovideo import *
from .pptv import *
from .qq import *
from .sina import *


@ -10,7 +10,7 @@ def facebook_download(url, output_dir = '.', merge = True, info_only = False):
title = r1(r'<title id="pageTitle">(.+) \| Facebook</title>', html)
for fmt in ["hd_src", "sd_src"]:
src = parse.unquote(unicodize(r1(r'\["' + fmt + '","([^"]*)', html)))
src= re.sub(r'\\/', r'/', r1(r'"' + fmt + '":"([^"]*)"', parse.unquote(unicodize(r1(r'\["params","([^"]*)"\]', html)))))
if src:
break


@ -0,0 +1,62 @@
#!/usr/bin/env python

__all__ = ['googleplus_download']

from ..common import *

import re

def googleplus_download(url, output_dir = '.', merge = True, info_only = False):
    # Percent-encoding Unicode URL
    url = parse.quote(url, safe = ':/+%')

    html = get_html(url)
    html = parse.unquote(html).replace('\/', '/')

    title = r1(r'<meta property="og:title" content="([^"]*)"', html) or r1(r'<title>(.*)</title>', html) or r1(r'<title>(.*)\n', html)

    url2 = r1(r'<a href="([^"]+)" target="_blank" class="Mn" >', html)
    if url2:
        html = get_html(url2)
        html = parse.unquote(html.replace('\/', '/'))

    real_url = unicodize(r1(r'"(https://video.googleusercontent.com/[^"]*)",1\]', html).replace('\/', '/'))
    if real_url:
        type, ext, size = url_info(real_url)
    if not real_url or not size:
        url_data = re.findall(r'(\[[^\[\"]+\"http://redirector.googlevideo.com/.*\"\])', html)
        for itag in [
            '38',
            '46', '37',
            '102', '45', '22',
            '84',
            '120',
            '85',
            '44', '35',
            '101', '100', '43', '34', '82', '18',
            '6',
            '83', '5', '36',
            '17',
            '13',
        ]:
            real_url = None
            for url_item in url_data:
                if itag == str(eval(url_item)[0]):
                    real_url = eval(url_item)[3]
                    break
            if real_url:
                break
        real_url = unicodize(real_url)
        type, ext, size = url_info(real_url)

    if not ext:
        ext = 'mp4'

    print_info(site_info, title, ext, size)
    if not info_only:
        download_urls([real_url], title, ext, size, output_dir, merge = merge)

site_info = "plus.google.com"
download = googleplus_download
download_playlist = playlist_not_supported('googleplus')
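
Review note on the itag selection above: calling `eval` on strings scraped from a remote page executes whatever the page contains. If the matched items are plain list literals (an assumption; the sample strings in the usage note are made up), `ast.literal_eval` parses them without running code. The sketch below keeps the same "first match in preference order" behaviour. `pick_url_by_itag` is a hypothetical name, and the preference list is copied from the diff.

```python
import ast

# itag preference order copied from the diff, most preferred first.
ITAG_PREFERENCE = [
    '38', '46', '37', '102', '45', '22', '84', '120', '85', '44', '35',
    '101', '100', '43', '34', '82', '18', '6', '83', '5', '36', '17', '13',
]


def pick_url_by_itag(url_data, preference=ITAG_PREFERENCE):
    """Return the URL of the most preferred itag in url_data, or None."""
    by_itag = {}
    for item in url_data:
        try:
            fields = ast.literal_eval(item)
        except (ValueError, SyntaxError):
            continue  # not a plain literal: skip it instead of executing it
        if isinstance(fields, list) and len(fields) > 3:
            # Keep the first entry seen for each itag, as the nested loop does.
            by_itag.setdefault(str(fields[0]), fields[3])
    for itag in preference:
        if itag in by_itag:
            return by_itag[itag]
    return None
```

For example, given the made-up items `'[18,640,360,"http://redirector.googlevideo.com/a"]'` and `'[22,1280,720,"http://redirector.googlevideo.com/b"]'`, the function returns the second URL because itag 22 ranks ahead of 18.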


@ -0,0 +1,28 @@
#!/usr/bin/env python

__all__ = ['mixcloud_download']

from ..common import *

def mixcloud_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)
    title = r1(r'<meta property="og:title" content="([^"]*)"', html)
    url = r1("data-preview-url=\"([^\"]+)\"", html)

    url = re.sub(r'previews', r'cloudcasts/originals', url)
    for i in range(10, 30):
        url = re.sub(r'stream[^.]*', r'stream' + str(i), url)

        try:
            type, ext, size = url_info(url)
            break
        except:
            continue

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge = merge)

site_info = "Mixcloud.com"
download = mixcloud_download
download_playlist = playlist_not_supported('mixcloud')
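
Review note on the server probing loop above: the bare `except:` also swallows `KeyboardInterrupt`, and if no server answers, `type`, `ext` and `size` are never assigned, so the later `print_info` call fails with an `UnboundLocalError` rather than a clear message. A minimal sketch of the same probing idea with an explicit failure path follows. The helper names and the `probe` callback are hypothetical, and it assumes network failures surface as `OSError` (which `urllib.error.URLError` subclasses).

```python
import re


def candidate_urls(preview_url, servers=range(10, 30)):
    """Yield one candidate original-file URL per stream server number."""
    original = preview_url.replace('previews', 'cloudcasts/originals')
    for i in servers:
        yield re.sub(r'stream[^.]*', 'stream%d' % i, original)


def first_reachable(urls, probe):
    """Return (url, probe(url)) for the first URL whose probe does not fail."""
    for url in urls:
        try:
            return url, probe(url)
        except OSError:
            continue  # this server did not answer; try the next one
    raise RuntimeError('no stream server responded')
```

In `mixcloud_download`, `probe` would be `url_info`, so the result unpacks as `url, (type, ext, size)`, and a page with no reachable server ends in one descriptive exception.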


@ -0,0 +1,39 @@
#!/usr/bin/env python

__all__ = ['nicovideo_download']

from ..common import *

def nicovideo_login(user, password):
    data = "current_form=login&mail=" + user +"&password=" + password + "&login_submit=Log+In"
    response = request.urlopen(request.Request("https://secure.nicovideo.jp/secure/login?site=niconico", headers = fake_headers, data = data.encode('utf-8')))
    return response.headers

def nicovideo_download(url, output_dir = '.', merge = True, info_only = False):
    request.install_opener(request.build_opener(request.HTTPCookieProcessor()))

    import netrc, getpass
    info = netrc.netrc().authenticators('nicovideo')
    if info is None:
        user = input("User: ")
        password = getpass.getpass("Password: ")
    else:
        user, password = info[0], info[2]
    print("Logging in...")
    nicovideo_login(user, password)

    html = get_html(url) # necessary!
    title = unicodize(r1(r'title:\s*\'(.*)\',', html))

    api_html = get_html('http://www.nicovideo.jp/api/getflv?v=%s' % url.split('/')[-1])
    real_url = parse.unquote(r1(r'url=([^&]+)&', api_html))

    type, ext, size = url_info(real_url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([real_url], title, ext, size, output_dir, merge = merge)

site_info = "Nicovideo.jp"
download = nicovideo_download
download_playlist = playlist_not_supported('nicovideo')
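
Review note on `nicovideo_login`: the form body is built by string concatenation, so a password containing `&`, `=`, `+` or `%` is sent corrupted. A minimal sketch using `urllib.parse.urlencode` follows. `build_login_body` is a hypothetical helper, and the field names are the ones in the diff.

```python
from urllib import parse


def build_login_body(user, password):
    """Return the URL-encoded login form body as bytes."""
    fields = [
        ('current_form', 'login'),
        ('mail', user),
        ('password', password),
        ('login_submit', 'Log In'),  # urlencode turns the space into '+'
    ]
    return parse.urlencode(fields).encode('utf-8')
```

The result can be passed directly as the `data` argument of `request.Request`. For credentials without reserved characters it produces the same bytes as the concatenated string.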

6
src/you_get/version.py Normal file

@ -0,0 +1,6 @@
#!/usr/bin/env python
__all__ = ['__version__', '__date__']
__version__ = '0.3.1'
__date__ = '2013-02-15'

0
tests/__init__.py Normal file

32
tests/test.py Normal file

@ -0,0 +1,32 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

import unittest

from you_get import *
from you_get.__main__ import url_to_module

class YouGetTests(unittest.TestCase):

    def test_googleplus(self):
        for url in [
            "http://plus.google.com/111438309227794971277/posts/So6bW37WWtp",
            "http://plus.google.com/114038303885145553998/posts/7Jkwa35HZu8",
            "http://plus.google.com/109544372058574620997/posts/Hn9P3Mbuyud",
            "http://plus.google.com/photos/109544372058574620997/albums/5835145047890484737/5835145057636064194",
            "http://plus.google.com/102663035987142737445/posts/jJRu43KQFT5",
            "http://plus.google.com/+%E5%B9%B3%E7%94%B0%E6%A2%A8%E5%A5%88/posts/jJRu43KQFT5",
            "http://plus.google.com/+平田梨奈/posts/jJRu43KQFT5",
            "http://plus.google.com/photos/102663035987142737445/albums/5844078581209509505/5844078587839097874",
            "http://plus.google.com/photos/+%E5%B9%B3%E7%94%B0%E6%A2%A8%E5%A5%88/albums/5844078581209509505/5844078587839097874",
            "http://plus.google.com/photos/+平田梨奈/albums/5844078581209509505/5844078587839097874",
        ]:
            url_to_module(url).download(url, info_only = True)

    def test_mixcloud(self):
        for url in [
            "http://www.mixcloud.com/beatbopz/beat-bopz-disco-mix/",
            "http://www.mixcloud.com/beatbopz/tokyo-taste-vol4/",
            "http://www.mixcloud.com/DJVadim/north-america-are-you-ready/",
        ]:
            url_to_module(url).download(url, info_only = True)


@ -1,5 +1,8 @@
#!/usr/bin/env python3
import os, sys
sys.path.insert(0, os.path.join((os.path.dirname(os.path.abspath(__file__))), "src"))
from you_get import *
if __name__ == "__main__":


@ -5,8 +5,8 @@
"url": "http://www.soimort.org/you-get/",
"license": "MIT",
"description": "A YouTube/Youku video downloader written in Python 3.",
"keywords": "video download youtube youku",
"description": "A YouTube/Youku/Niconico video downloader written in Python 3.",
"keywords": "video download youtube youku niconico",
"classifiers": [
"Development Status :: 2 - Pre-Alpha",
@ -31,6 +31,6 @@
],
"console_scripts": [
"you-get = you_get.main:main"
"you-get = you_get.__main__:main"
]
}


@ -1,8 +0,0 @@
#!/usr/bin/env python
from .processor import *
from .downloader import *
from .version import __version__, __date__
from .common import script_main
from .main import *


@ -1,56 +0,0 @@
#!/usr/bin/env python

__all__ = ['googleplus_download']

from ..common import *

import re

def googleplus_download(url, output_dir = '.', merge = True, info_only = False):
    # Percent-encoding Unicode URL
    url = parse.quote(url, safe = ':/+%')

    html = get_html(url)
    html = parse.unquote(html).replace('\/', '/')

    title = r1(r'<title>(.*)</title>', html) or r1(r'<title>(.*)\n', html)

    url2 = r1(r'"(https\://plus\.google\.com/photos/.*?)",,"image/jpeg","video"\]', html)
    if url2:
        html = get_html(url2)
        html = parse.unquote(html.replace('\/', '/'))

    url_data = re.findall(r'(\[[^\[\"]+\"http://redirector.googlevideo.com/.*\"\])', html)
    for itag in [
        '38',
        '46', '37',
        '102', '45', '22',
        '84',
        '120',
        '85',
        '44', '35',
        '101', '100', '43', '34', '82', '18',
        '6',
        '83', '5', '36',
        '17',
        '13',
    ]:
        real_url = None
        for url_item in url_data:
            if itag == str(eval(url_item)[0]):
                real_url = eval(url_item)[3]
                break
        if real_url:
            break
    real_url = unicodize(real_url)

    type, ext, size = url_info(real_url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([real_url], title, ext, size, output_dir, merge = merge)

site_info = "plus.google.com"
download = googleplus_download
download_playlist = playlist_not_supported('googleplus')


@ -1,4 +0,0 @@
#!/usr/bin/env python
__version__ = '0.3dev-20130125'
__date__ = '2013-01-25'