mirror of https://github.com/soimort/you-get.git
synced 2025-02-03 00:33:58 +03:00

Commit 5e18df8a49: Merge branch 'soimort-develop' into soimort-master

1  .gitignore (vendored)
@@ -6,6 +6,7 @@
 _*/
 *.bak
 *.download
 *.cmt.*
+*.3gp

1  .travis.yml
@@ -3,4 +3,5 @@ language: python
 python:
 - "3.2"
 - "3.3"
+- "3.4"
 script: make test
105  CHANGELOG.txt

@@ -1,6 +1,111 @@
 Changelog
 =========
 
+0.3.29
+------
+
+*Date: 2014-05-29*
+
+* Bug fix release
+
+0.3.28.3
+--------
+
+*Date: 2014-05-18*
+
+* New site support:
+    - CBS.com
+
+0.3.28.2
+--------
+
+*Date: 2014-04-13*
+
+* Bug fix release
+
+0.3.28.1
+--------
+
+*Date: 2014-02-28*
+
+* Bug fix release
+
+0.3.28
+------
+
+*Date: 2014-02-21*
+
+* New site support:
+    - Magisto.com
+    - VK.com
+
+0.3.27
+------
+
+*Date: 2014-02-14*
+
+* Bug fix release
+
+0.3.26
+------
+
+*Date: 2014-02-08*
+
+* New features:
+    - Play video in players (#286)
+    - LeTV support (#289)
+    - Youku 1080P support
+* Bug fixes:
+    - YouTube (#282, #292)
+    - Sina (#246, #280)
+    - Mixcloud
+    - NetEase
+    - QQ
+    - Vine
+
+0.3.25
+------
+
+*Date: 2013-12-20*
+
+* Bug fix release
+
+0.3.24
+------
+
+*Date: 2013-10-30*
+
+* Experimental: Sogou proxy server
+* Fix issues for:
+    - Vimeo
+
+0.3.23
+------
+
+*Date: 2013-10-23*
+
+* Support YouTube playlists
+* Support general short URLs
+* Fix issues for:
+    - Sina
+
+0.3.22
+------
+
+*Date: 2013-10-18*
+
+* Fix issues for:
+    - Baidu
+    - Bilibili
+    - JPopsuki TV
+    - Niconico
+    - PPTV
+    - TED
+    - Tumblr
+    - YinYueTai
+    - YouTube
+    - ...
+
 0.3.21
 ------
 
351  README.md

@@ -1,6 +1,6 @@
 # You-Get
 
-[![Build Status](https://api.travis-ci.org/soimort/you-get.png)](https://travis-ci.org/soimort/you-get)
+[![Build Status](https://api.travis-ci.org/soimort/you-get.png)](https://travis-ci.org/soimort/you-get) [![PyPI version](https://badge.fury.io/py/you-get.png)](http://badge.fury.io/py/you-get)
 
 [You-Get](https://github.com/soimort/you-get) is a video downloader runs on Python 3. It aims at easing the download of videos on [YouTube](http://www.youtube.com), [Youku](http://www.youku.com)/[Tudou](http://www.tudou.com) (biggest online video providers in China), [Niconico](http://www.nicovideo.jp), etc., in one script.
 
@@ -8,6 +8,8 @@ See the project homepage <http://www.soimort.org/you-get> for further documentation.
 Fork me on GitHub: <https://github.com/soimort/you-get>
 
+__中文说明__已移至[wiki](https://github.com/soimort/you-get/wiki/%E4%B8%AD%E6%96%87%E8%AF%B4%E6%98%8E)。
+
 ## Features
 
 ### Supported Sites (As of Now)
@@ -16,6 +18,7 @@ Fork me on GitHub: <https://github.com/soimort/you-get>
 * Vimeo <http://vimeo.com>
 * Coursera <https://www.coursera.org>
 * Blip <http://blip.tv>
+* CBS <http://www.cbs.com>
 * Dailymotion <http://dailymotion.com>
 * eHow <http://www.ehow.com>
 * Facebook <http://facebook.com>
@@ -26,22 +29,25 @@ Fork me on GitHub: <https://github.com/soimort/you-get>
 * Tumblr <http://www.tumblr.com>
 * Vine <http://vine.co>
 * Instagram <http://instagram.com>
+* Magisto <http://www.magisto.com>
 * SoundCloud <http://soundcloud.com>
 * Mixcloud <http://www.mixcloud.com>
 * Freesound <http://www.freesound.org>
+* JPopsuki <http://jpopsuki.tv>
 * VID48 <http://vid48.com>
 * Niconico (ニコニコ動画) <http://www.nicovideo.jp>
 * Youku (优酷) <http://www.youku.com>
 * Tudou (土豆) <http://www.tudou.com>
 * YinYueTai (音悦台) <http://www.yinyuetai.com>
-* AcFun <http://www.acfun.tv>
-* bilibili <http://www.bilibili.tv>
+* AcFun <http://www.acfun.com>
+* bilibili <http://www.bilibili.com>
 * CNTV (中国网络电视台) <http://www.cntv.cn>
 * Douban (豆瓣) <http://douban.com>
 * ifeng (凤凰视频) <http://v.ifeng.com>
 * iQIYI (爱奇艺) <http://www.iqiyi.com>
 * Joy.cn (激动网) <http://www.joy.cn>
 * Ku6 (酷6网) <http://www.ku6.com>
+* LeTV (乐视网) <http://www.letv.com>
 * MioMio <http://www.miomio.tv>
 * NetEase (网易视频) <http://v.163.com>
 * PPTV <http://www.pptv.com>
@@ -55,87 +61,74 @@ Fork me on GitHub: <https://github.com/soimort/you-get>
 * Baidu Wangpan (百度网盘) <http://pan.baidu.com>
 * SongTaste <http://www.songtaste.com>
 * Alive.in.th <http://alive.in.th>
+* VK <http://vk.com>
 
 ## Dependencies
 
 * [Python 3](http://www.python.org/download/releases/)
-* __(Optional)__ [FFmpeg](http://ffmpeg.org)
-    * Used for converting and joining video files.
+* __(Optional)__ [FFmpeg](http://ffmpeg.org) / [Libav](http://libav.org/)
+    * For converting and joining video files.
+* __(Optional)__ [RTMPDump](http://rtmpdump.mplayerhq.hu/)
+    * For processing RTMP streams.
 
 ## Installation
 
-### 1. Install via [Pip](http://www.pip-installer.org/):
+### 1. Install via Pip:
 
-    $ [sudo] pip install you-get
+    $ pip install you-get
 
 Check if the installation was successful:
 
     $ you-get -V
 
-### 2. Install via [EasyInstall](http://pypi.python.org/pypi/setuptools):
-
-    $ easy_install you-get
-
-Check if the installation was successful:
-
-    $ you-get -V
-
-### 3. Install from Git:
+### 2. Install from Git:
 
     $ git clone git://github.com/soimort/you-get.git
 
 Use the raw script without installation:
 
     $ cd you-get/
     $ ./you-get -V
 
 To install the package into the system path, execute:
 
     $ make install
 
 Check if the installation was successful:
 
     $ you-get -V
 
-### 4. Direct download (from <https://github.com/soimort/you-get/zipball/master>):
+### 3. Direct download (from <https://github.com/soimort/you-get/zipball/master>):
 
     $ wget -O you-get.zip https://github.com/soimort/you-get/zipball/master
     $ unzip you-get.zip
 
 Use the raw script without installation:
 
    $ cd soimort-you-get-*/
     $ ./you-get -V
 
 To install the package into the system path, execute:
 
     $ make install
 
 Check if the installation was successful:
 
     $ you-get -V
 
-### 5. Install from [AUR (Arch User Repository)](http://aur.archlinux.org/):
+### 4. Install from your distro's repo:
 
-Click [here](https://aur.archlinux.org/packages.php\?ID=62576).
+* __AUR (Arch)__: <https://aur.archlinux.org/packages/?O=0&K=you-get>
 
-### Upgrading:
+* __Overlay (Gentoo)__: <http://gpo.zugaina.org/net-misc/you-get>
 
+## Upgrading
+
 Using Pip:
 
-    $ pip install --upgrade you-get
+    $ [sudo] pip install --upgrade you-get
 
 ### FAQ (For Windows Users):
 
 * Q: I don't know how to install it on Windows.
 
 * A: Then don't do it. Just put your `you-get` folder into system `%PATH%`.
 
 * Q: I got something like `UnicodeDecodeError: 'gbk' codec can't decode byte 0xb0 in position 1012: illegal multibyte sequence`.
 
 * A: Run `set PYTHONIOENCODING=utf-8`.
 
-## Examples (For End-Users)
+## Examples
 
 Display the information of the video without downloading:
 
@@ -172,31 +165,25 @@ By default, Python will apply the system proxy settings (i.e. environment variab
 For a complete list of all available options, see:
 
     $ you-get --help
+    Usage: you-get [OPTION]... [URL]...
 
-## Examples (For Developers)
+    Startup options:
+        -V | --version                           Display the version and exit.
+        -h | --help                              Print this help and exit.
 
-In Python 3 (interactive):
-
-    >>> from you_get.downloader import *
-    >>> youtube.download("http://www.youtube.com/watch?v=8bQlxQJEzLk", info_only = True)
-    Video Site: YouTube.com
-    Title:      If you're good at something, never do it for free!
-    Type:       WebM video (video/webm)
-    Size:       0.13 MB (133176 Bytes)
-
-    >>> import you_get
-    >>> you_get.any_download("http://www.youtube.com/watch?v=sGwy8DsUJ4M")
-    Video Site: YouTube.com
-    Title:      Mort from Madagascar LIKES
-    Type:       WebM video (video/webm)
-    Size:       1.78 MB (1867072 Bytes)
-
-    Downloading Mort from Madagascar LIKES.webm ...
-    100.0% ( 1.8/1.8 MB) [========================================] 1/1
-
-## API Reference
-
-See source code.
+    Download options (use with URLs):
+        -f | --force                             Force overwriting existed files.
+        -i | --info                              Display the information of videos without downloading.
+        -u | --url                               Display the real URLs of videos without downloading.
+        -n | --no-merge                          Don't merge video parts.
+        -c | --cookies                           Load NetScape's cookies.txt file.
+        -o | --output-dir <PATH>                 Set the output directory for downloaded videos.
+        -p | --player <PLAYER [options]>         Directly play the video with PLAYER like vlc/smplayer.
+        -x | --http-proxy <HOST:PORT>            Use specific HTTP proxy for downloading.
+             --no-proxy                          Don't use any proxy. (ignore $http_proxy)
+        -S | --sogou                             Use a Sogou proxy server for downloading.
+             --sogou-proxy <HOST:PORT>           Run a standalone Sogou proxy server.
+             --debug                             Show traceback on KeyboardInterrupt.
 
 ## License
 
@@ -205,227 +192,3 @@ You-Get is licensed under the [MIT license](https://raw.github.com/soimort/you-get/master/LICENSE.txt).
 ## Contributing
 
 Please see [CONTRIBUTING.md](https://github.com/soimort/you-get/blob/master/CONTRIBUTING.md).
-
-
-***
-
-
-# You-Get - 中文说明
-
-[You-Get](https://github.com/soimort/you-get)是一个基于Python 3的视频下载工具。之所以写它的主要原因是,我找不到一个现成的下载工具能够同时支持[YouTube](http://www.youtube.com/)和[优酷](http://www.youku.com/);而且,几乎所有以前的视频下载程序都是基于Python 2的。
-
-项目主页:<http://www.soimort.org/you-get>
-
-GitHub地址:<https://github.com/soimort/you-get>
-
-## 特点
-
-### 说明
-
-You-Get基于优酷下载脚本[iambus/youku-lixian](https://github.com/iambus/youku-lixian)用Python 3改写而成,增加了以下功能:
-
-* 支持YouTube、Vimeo等国外视频网站
-* 支持断点续传
-* 可设置HTTP代理
-
-### 支持的站点(截至目前)
-
-已实现对以下站点的支持,以后会陆续增加(・∀・)
-
-* YouTube <http://www.youtube.com>
-* Vimeo <http://vimeo.com>
-* Coursera <https://www.coursera.org>
-* Blip <http://blip.tv>
-* Dailymotion <http://dailymotion.com>
-* eHow <http://www.ehow.com>
-* Facebook <http://facebook.com>
-* Google+ <http://plus.google.com>
-* Google Drive <http://docs.google.com>
-* Khan Academy <http://www.khanacademy.org>
-* TED <http://www.ted.com>
-* Tumblr <http://www.tumblr.com>
-* Vine <http://vine.co>
-* Instagram <http://instagram.com>
-* SoundCloud <http://soundcloud.com>
-* Mixcloud <http://www.mixcloud.com>
-* Freesound <http://www.freesound.org>
-* VID48 <http://vid48.com>
-* NICONICO动画 <http://www.nicovideo.jp>
-* 优酷 <http://www.youku.com>
-* 土豆 <http://www.tudou.com>
-* 音悦台 <http://www.yinyuetai.com>
-* AcFun <http://www.acfun.tv>
-* bilibili <http://www.bilibili.tv>
-* CNTV <http://www.cntv.cn>
-* 豆瓣 <http://douban.com>
-* 凤凰视频 <http://v.ifeng.com>
-* 爱奇艺 <http://www.iqiyi.com>
-* 激动网 <http://www.joy.cn>
-* 酷6网 <http://www.ku6.com>
-* MioMio <http://www.miomio.tv>
-* 网易视频 <http://v.163.com>
-* PPTV <http://www.pptv.com>
-* 腾讯视频 <http://v.qq.com>
-* 新浪视频 <http://video.sina.com.cn>
-* 搜狐视频 <http://tv.sohu.com>
-* 56网 <http://www.56.com>
-* 虾米 <http://www.xiami.com>
-* 5sing <http://www.5sing.com>
-* 百度音乐 <http://music.baidu.com>
-* 百度网盘 <http://pan.baidu.com>
-* SongTaste <http://www.songtaste.com>
-* Alive.in.th <http://alive.in.th>
-
-## 依赖
-
-* [Python 3](http://www.python.org/download/releases/)
-* __(可选)__ [FFmpeg](http://ffmpeg.org)
-    * 用于转换与合并视频文件。
-
-## 安装说明
-
-(以下命令格式均以Linux shell为例)
-
-### 1. 通过[Pip](http://www.pip-installer.org/)安装:
-
-    $ pip install you-get
-
-检查安装是否成功:
-
-    $ you-get -V
-
-### 2. 通过[EasyInstall](http://pypi.python.org/pypi/setuptools)安装:
-
-    $ easy_install you-get
-
-检查安装是否成功:
-
-    $ you-get -V
-
-### 3. 从Git安装:
-
-    $ git clone git://github.com/soimort/you-get.git
-
-在不安装的情况下直接使用脚本:
-
-    $ cd you-get/
-    $ ./you-get -V
-
-若要将Python package安装到系统默认路径,执行:
-
-    $ make install
-
-检查安装是否成功:
-
-    $ you-get -V
-
-### 4. 直接下载(从<https://github.com/soimort/you-get/zipball/master>):
-
-    $ wget -O you-get.zip https://github.com/soimort/you-get/zipball/master
-    $ unzip you-get.zip
-
-在不安装的情况下直接使用脚本:
-
-    $ cd soimort-you-get-*/
-    $ ./you-get -V
-
-若要将Python package安装到系统默认路径,执行:
-
-    $ make install
-
-检查安装是否成功:
-
-    $ you-get -V
-
-### 5. 从[AUR (Arch User Repository)](http://aur.archlinux.org/)安装:
-
-点击[这里](https://aur.archlinux.org/packages.php\?ID=62576)。
-
-### 升级:
-
-使用Pip:
-
-    $ pip install --upgrade you-get
-
-### FAQ(针对Windows用户):
-
-* Q:我不知道该如何在Windows下安装。
-
-* A:不需要安装。直接把`you-get`目录放到系统`%PATH%`中。
-
-* Q:出现错误提示`UnicodeDecodeError: 'gbk' codec can't decode byte 0xb0 in position 1012: illegal multibyte sequence`。
-
-* A:执行`set PYTHONIOENCODING=utf-8`。
-
-## 使用方法示例
-
-### 如何下载视频
-
-显示视频信息,但不进行下载(`-i`或`--info`选项):
-
-    $ you-get -i http://www.yinyuetai.com/video/463772
-
-下载视频:
-
-    $ you-get http://www.yinyuetai.com/video/463772
-
-下载多个视频:
-
-    $ you-get http://www.yinyuetai.com/video/463772 http://www.yinyuetai.com/video/471500
-
-若当前目录下已有与视频标题同名的文件,下载时会自动跳过。若有同名的`.download`临时文件,程序会从上次中断处开始下载。
-如要强制重新下载该视频,可使用`-f`(`--force`)选项:
-
-    $ you-get -f http://www.yinyuetai.com/video/463772
-
-`-l`(`--playlist`)选项用于下载播放列表(只对某些网站适用):
-
-    $ you-get -l http://www.youku.com/playlist_show/id_5344313.html
-
-__注:从0.1.3以后的版本起,`-l`选项不再必须。You-Get可以自动识别并处理播放列表的下载。__
-
-指定视频文件的下载目录:
-
-    $ you-get -o ~/Downloads http://www.yinyuetai.com/video/463772
-
-显示详细帮助:
-
-    $ you-get -h
-
-### 如何设置代理
-
-默认情况下,Python自动使用系统的代理配置。可以通过环境变量`http_proxy`来设置系统的HTTP代理。
-
-`-x`(`--http-proxy`)选项用于手动指定You-Get所使用的HTTP代理。例如:GoAgent的代理服务器是`http://127.0.0.1:8087`,则通过该代理下载某YouTube视频的命令是:
-
-    $ you-get -x 127.0.0.1:8087 http://www.youtube.com/watch?v=KbtO_Ayjw0M
-
-Windows下的自由门等翻墙软件会自动设置系统全局代理,因此无需指定HTTP代理即可下载YouTube视频:
-
-    $ you-get http://www.youtube.com/watch?v=KbtO_Ayjw0M
-
-如果不希望程序在下载过程中使用任何代理(包括系统的代理配置),可以显式地指定`--no-proxy`选项:
-
-    $ you-get --no-proxy http://v.youku.com/v_show/id_XMjI0ODc1NTc2.html
-
-### 断点续传
-
-下载未完成时被中止(因为`Ctrl+C`终止程序或者网络中断等原因),在目标路径中会有一个扩展名为`.download`的临时文件。
-
-下次运行只要在目标路径中找到相应的`.download`临时文件,程序会自动从中断处继续下载。(除非指定了`-f`选项)
-
-## 使用Python 2?
-
-优酷等国内视频网站的下载,请移步:[iambus/youku-lixian](https://github.com/iambus/youku-lixian)
-
-YouTube等国外视频网站的下载,请移步:[rg3/youtube-dl](https://github.com/rg3/youtube-dl)
-
-## 许可证
-
-You-Get在[MIT License](https://raw.github.com/soimort/you-get/master/LICENSE.txt)下发布。
-
-## 如何参与贡献 / 报告issue
-
-请阅读 [CONTRIBUTING.md](https://github.com/soimort/you-get/blob/master/CONTRIBUTING.md)。
116  README.txt

@@ -3,6 +3,8 @@ You-Get
 
 .. image:: https://api.travis-ci.org/soimort/you-get.png
 
+.. image:: https://badge.fury.io/py/you-get.png
+
 `You-Get <https://github.com/soimort/you-get>`_ is a video downloader runs on Python 3. It aims at easing the download of videos on `YouTube <http://www.youtube.com>`_, `Youku <http://www.youku.com>`_/`Tudou <http://www.tudou.com>`_ (biggest online video providers in China), `Niconico <http://www.nicovideo.jp>`_, etc., in one script.
 
 See the project homepage http://www.soimort.org/you-get for further documentation.
@@ -19,6 +21,7 @@ Supported Sites (As of Now)
 * Vimeo http://vimeo.com
 * Coursera https://www.coursera.org
 * Blip http://blip.tv
+* CBS http://www.cbs.com
 * Dailymotion http://dailymotion.com
 * eHow http://www.ehow.com
 * Facebook http://facebook.com
@@ -29,22 +32,25 @@ Supported Sites (As of Now)
 * Tumblr http://www.tumblr.com
 * Vine http://vine.co
 * Instagram http://instagram.com
+* Magisto http://www.magisto.com
 * SoundCloud http://soundcloud.com
 * Mixcloud http://www.mixcloud.com
 * Freesound http://www.freesound.org
+* JPopsuki http://jpopsuki.tv
 * VID48 http://vid48.com
 * Niconico (ニコニコ動画) http://www.nicovideo.jp
 * Youku (优酷) http://www.youku.com
 * Tudou (土豆) http://www.tudou.com
 * YinYueTai (音悦台) http://www.yinyuetai.com
-* AcFun http://www.acfun.tv
-* bilibili http://www.bilibili.tv
+* AcFun http://www.acfun.com
+* bilibili http://www.bilibili.com
 * CNTV (中国网络电视台) http://www.cntv.cn
 * Douban (豆瓣) http://douban.com
 * ifeng (凤凰视频) http://v.ifeng.com
 * iQIYI (爱奇艺) http://www.iqiyi.com
 * Joy.cn (激动网) http://www.joy.cn
 * Ku6 (酷6网) http://www.ku6.com
+* LeTV (乐视网) http://www.letv.com
 * MioMio http://www.miomio.tv
 * NetEase (网易视频) http://v.163.com
 * PPTV http://www.pptv.com
@@ -58,74 +64,78 @@ Supported Sites (As of Now)
 * Baidu Wangpan (百度网盘) http://pan.baidu.com
 * SongTaste http://www.songtaste.com
 * Alive.in.th http://alive.in.th
+* VK http://vk.com
 
 Dependencies
 ------------
 
 * `Python 3 <http://www.python.org/download/releases/>`_
-* (Optional) `FFmpeg <http://ffmpeg.org>`_
-    * Used for converting and joining video files.
+* (Optional) `FFmpeg <http://ffmpeg.org>`_ / `Libav <http://libav.org/>`_
+    * For converting and joining video files.
+* (Optional) `RTMPDump <http://rtmpdump.mplayerhq.hu/>`_
+    * For processing RTMP streams.
 
 Installation
 ------------
 
-#) Install via `Pip <http://www.pip-installer.org/>`_::
+#) Install via Pip::
 
-    $ [sudo] pip install you-get
+    $ pip install you-get
 
    Check if the installation was successful::
 
    $ you-get -V
 
 #) Install via `EasyInstall <http://pypi.python.org/pypi/setuptools>`_::
 
    $ easy_install you-get
 
   Check if the installation was successful::
 
    $ you-get -V
 
 #) Install from Git::
 
    $ git clone git://github.com/soimort/you-get.git
 
   Use the raw script without installation::
 
    $ cd you-get/
    $ ./you-get -V
 
   To install the package into the system path, execute::
 
    $ make install
 
   Check if the installation was successful::
 
    $ you-get -V
 
 #) Direct download::
 
    $ wget -O you-get.zip https://github.com/soimort/you-get/zipball/master
    $ unzip you-get.zip
 
   Use the raw script without installation::
 
    $ cd soimort-you-get-*/
    $ ./you-get -V
 
   To install the package into the system path, execute::
 
    $ make install
 
   Check if the installation was successful::
 
    $ you-get -V
 
-#) Install from `AUR (Arch User Repository) <http://aur.archlinux.org/>`_:
+#) Install from your distro's repo:
 
-   Click `here <https://aur.archlinux.org/packages.php?ID=62576>`_.
+   * `AUR (Arch) <https://aur.archlinux.org/packages/?O=0&K=you-get>`_
 
-Examples (For End-Users)
-------------------------
+   * `Overlay (Gentoo) <http://gpo.zugaina.org/net-misc/you-get>`_
 
+Upgrading
+---------
+
+Using Pip::
+
+    $ [sudo] pip install --upgrade you-get
+
+Examples
+--------
 
 Display the information of the video without downloading::
 
@@ -163,33 +173,25 @@ Command-Line Options
 For a complete list of all available options, see::
 
     $ you-get --help
+    Usage: you-get [OPTION]... [URL]...
 
-Examples (For Developers)
--------------------------
+    Startup options:
+        -V | --version                           Display the version and exit.
+        -h | --help                              Print this help and exit.
 
-In Python 3 (interactive)::
-
-    >>> from you_get.downloader import *
-    >>> youtube.download("http://www.youtube.com/watch?v=8bQlxQJEzLk", info_only = True)
-    Video Site: YouTube.com
-    Title:      If you're good at something, never do it for free!
-    Type:       WebM video (video/webm)
-    Size:       0.13 MB (133176 Bytes)
-
-    >>> import you_get
-    >>> you_get.any_download("http://www.youtube.com/watch?v=sGwy8DsUJ4M")
-    Video Site: YouTube.com
-    Title:      Mort from Madagascar LIKES
-    Type:       WebM video (video/webm)
-    Size:       1.78 MB (1867072 Bytes)
-
-    Downloading Mort from Madagascar LIKES.webm ...
-    100.0% ( 1.8/1.8 MB) [========================================] 1/1
-
-API Reference
--------------
-
-See source code.
+    Download options (use with URLs):
+        -f | --force                             Force overwriting existed files.
+        -i | --info                              Display the information of videos without downloading.
+        -u | --url                               Display the real URLs of videos without downloading.
+        -n | --no-merge                          Don't merge video parts.
+        -c | --cookies                           Load NetScape's cookies.txt file.
+        -o | --output-dir <PATH>                 Set the output directory for downloaded videos.
+        -p | --player <PLAYER [options]>         Directly play the video with PLAYER like vlc/smplayer.
+        -x | --http-proxy <HOST:PORT>            Use specific HTTP proxy for downloading.
+             --no-proxy                          Don't use any proxy. (ignore $http_proxy)
+        -S | --sogou                             Use a Sogou proxy server for downloading.
+             --sogou-proxy <HOST:PORT>           Run a standalone Sogou proxy server.
+             --debug                             Show traceback on KeyboardInterrupt.
 
 License
 -------
22  setup.py

@@ -7,36 +7,36 @@ PROJ_METADATA = '%s.json' % PROJ_NAME
 
 import os, json, imp
 here = os.path.abspath(os.path.dirname(__file__))
-proj_info = json.loads(open(os.path.join(here, PROJ_METADATA)).read())
-README = open(os.path.join(here, 'README.txt')).read()
-CHANGELOG = open(os.path.join(here, 'CHANGELOG.txt')).read()
+proj_info = json.loads(open(os.path.join(here, PROJ_METADATA), encoding='utf-8').read())
+README = open(os.path.join(here, 'README.txt'), encoding='utf-8').read()
+CHANGELOG = open(os.path.join(here, 'CHANGELOG.txt'), encoding='utf-8').read()
 VERSION = imp.load_source('version', os.path.join(here, 'src/%s/version.py' % PACKAGE_NAME)).__version__
 
 from setuptools import setup, find_packages
 setup(
     name = proj_info['name'],
     version = VERSION,
 
     author = proj_info['author'],
     author_email = proj_info['author_email'],
     url = proj_info['url'],
     license = proj_info['license'],
 
     description = proj_info['description'],
     keywords = proj_info['keywords'],
 
     long_description = README + '\n\n' + CHANGELOG,
 
     packages = find_packages('src'),
     package_dir = {'' : 'src'},
 
     test_suite = 'tests',
 
     platforms = 'any',
     zip_safe = False,
     include_package_data = True,
 
     classifiers = proj_info['classifiers'],
 
     entry_points = {'console_scripts': proj_info['console_scripts']}
 )
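A note on the setup.py hunk above: open() without an explicit encoding falls back to locale.getpreferredencoding(), so installing on a GBK- or ASCII-locale system could crash while reading the non-ASCII text in README.txt. A minimal standalone sketch of the difference (the file name here is only illustrative, not part of the patch):

    # Sketch only; 'README.txt' stands in for any UTF-8 file with non-ASCII text.
    import locale

    print(locale.getpreferredencoding())  # locale-dependent, e.g. 'cp936' or 'ANSI_X3.4-1968'

    # Locale-dependent read -- may raise UnicodeDecodeError on such systems:
    #   text = open('README.txt').read()

    # Deterministic read, regardless of the build machine's locale:
    text = open('README.txt', encoding='utf-8').read()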
src/you_get/__init__.py

@@ -3,7 +3,5 @@
 from .common import *
 from .version import *
 
-# Easy import
-#from .cli_wrapper.converter import *
-#from .cli_wrapper.player import *
-from .downloader import *
+from .cli_wrapper import *
+from .extractor import *
0  src/you_get/cli_wrapper/__init__.py (new file)

0  src/you_get/cli_wrapper/downloader/__init__.py (new file)

0  src/you_get/cli_wrapper/openssl/__init__.py (new file)

3  src/you_get/cli_wrapper/player/__init__.py (new file)

@@ -0,0 +1,3 @@
+#!/usr/bin/env python
+
+from .mplayer import *

7  src/you_get/cli_wrapper/player/__main__.py (new file)

@@ -0,0 +1,7 @@
+#!/usr/bin/env python
+
+def main():
+    script_main('you-get', any_download, any_download_playlist)
+
+if __name__ == "__main__":
+    main()

0  src/you_get/cli_wrapper/player/dragonplayer.py (new file)

0  src/you_get/cli_wrapper/player/gnome_mplayer.py (new file)

0  src/you_get/cli_wrapper/player/mplayer.py (new file)

1  src/you_get/cli_wrapper/player/vlc.py (new file)

@@ -0,0 +1 @@
+#!/usr/bin/env python

0  src/you_get/cli_wrapper/player/wmp.py (new file)

0  src/you_get/cli_wrapper/transcoder/__init__.py (new file)

0  src/you_get/cli_wrapper/transcoder/ffmpeg.py (new file)

0  src/you_get/cli_wrapper/transcoder/libav.py (new file)

0  src/you_get/cli_wrapper/transcoder/mencoder.py (new file)
src/you_get/common.py

@@ -8,11 +8,18 @@ import re
 import sys
 from urllib import request, parse
 import platform
 import threading
 
+from .version import __version__
+from .util import log, sogou_proxy_server, get_filename, unescape_html
+
 dry_run = False
 force = False
+player = None
+extractor_proxy = None
+sogou_proxy = None
+sogou_env = None
+cookies_txt = None
 
 fake_headers = {
     'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
@@ -49,16 +56,16 @@ def r1_of(patterns, text):
 
 def match1(text, *patterns):
     """Scans through a string for substrings matched some patterns (first-subgroups only).
 
     Args:
         text: A string to be scanned.
         patterns: Arbitrary number of regex patterns.
 
     Returns:
         When only one pattern is given, returns a string (None if no match found).
         When more than one pattern are given, returns a list of strings ([] if no match found).
     """
 
     if len(patterns) == 1:
         pattern = patterns[0]
         match = re.search(pattern, text)
@@ -74,23 +81,31 @@ def match1(text, *patterns):
             ret.append(match.group(1))
         return ret
 
+def launch_player(player, urls):
+    import subprocess
+    import shlex
+    subprocess.call(shlex.split(player) + list(urls))
+
 def parse_query_param(url, param):
     """Parses the query string of a URL and returns the value of a parameter.
 
     Args:
         url: A URL.
         param: A string representing the name of the parameter.
 
     Returns:
         The value of the parameter.
     """
 
-    return parse.parse_qs(parse.urlparse(url).query)[param][0]
+    try:
+        return parse.parse_qs(parse.urlparse(url).query)[param][0]
+    except:
+        return None
 
 def unicodize(text):
     return re.sub(r'\\u([0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f][0-9A-Fa-f])', lambda x: chr(int(x.group(0)[2:], 16)), text)
 
-# DEPRECATED in favor of filenameable()
+# DEPRECATED in favor of util.legitimize()
 def escape_file_path(path):
     path = path.replace('/', '-')
     path = path.replace('\\', '-')
@@ -98,6 +113,7 @@ def escape_file_path(path):
     path = path.replace('?', '-')
     return path
 
+# DEPRECATED in favor of util.legitimize()
 def filenameable(text):
     """Converts a string to a legal filename through various OSes.
     """
@@ -106,11 +122,7 @@ def filenameable(text):
         0: None,
         ord('/'): '-',
     })
-    if platform.system() == 'Darwin': # For Mac OS
-        text = text.translate({
-            ord(':'): '-',
-        })
-    elif platform.system() == 'Windows': # For Windows
+    if platform.system() == 'Windows': # For Windows
         text = text.translate({
             ord(':'): '-',
             ord('*'): '-',
@@ -124,14 +136,15 @@ def filenameable(text):
             ord('['): '(',
             ord(']'): ')',
         })
-    else:
-        if text.startswith("."):
-            text = text[1:]
+    if platform.system() == 'Darwin': # For Mac OS
+        text = text.translate({
+            ord(':'): '-',
+        })
     return text
 
 def unescape_html(html):
     from html import parser
     html = parser.HTMLParser().unescape(html)
     html = re.sub(r'&#(\d+);', lambda x: chr(int(x.group(1))), html)
     return html
 
 def ungzip(data):
     """Decompresses data for Content-Encoding: gzip.
     """
@@ -146,7 +159,8 @@ def undeflate(data):
     (the zlib compression is used.)
     """
     import zlib
-    return zlib.decompress(data, -zlib.MAX_WBITS)
+    decompressobj = zlib.decompressobj(-zlib.MAX_WBITS)
+    return decompressobj.decompress(data)+decompressobj.flush()
 
 # DEPRECATED in favor of get_content()
 def get_response(url, faker = False):
@@ -154,7 +168,7 @@ def get_response(url, faker = False):
         response = request.urlopen(request.Request(url, headers = fake_headers), None)
     else:
         response = request.urlopen(url)
 
     data = response.read()
     if response.info().get('Content-Encoding') == 'gzip':
         data = ungzip(data)
@@ -174,32 +188,36 @@ def get_decoded_html(url, faker = False):
     data = response.data
     charset = r1(r'charset=([\w-]+)', response.headers['content-type'])
     if charset:
-        return data.decode(charset)
+        return data.decode(charset, 'ignore')
     else:
         return data
 
 def get_content(url, headers={}, decoded=True):
     """Gets the content of a URL via sending a HTTP GET request.
 
     Args:
         url: A URL.
         headers: Request headers used by the client.
         decoded: Whether decode the response body using UTF-8 or the charset specified in Content-Type.
 
     Returns:
         The content as a string.
     """
 
-    response = request.urlopen(request.Request(url, headers=headers))
+    req = request.Request(url, headers=headers)
+    if cookies_txt:
+        cookies_txt.add_cookie_header(req)
+        req.headers.update(req.unredirected_hdrs)
+    response = request.urlopen(req)
     data = response.read()
 
     # Handle HTTP compression for gzip and deflate (zlib)
     content_encoding = response.getheader('Content-Encoding')
     if content_encoding == 'gzip':
         data = ungzip(data)
     elif content_encoding == 'deflate':
         data = undeflate(data)
 
     # Decode the response body
     if decoded:
         charset = match1(response.getheader('Content-Type'), r'charset=([\w-]+)')
@@ -207,7 +225,7 @@ def get_content(url, headers={}, decoded=True):
             data = data.decode(charset)
         else:
             data = data.decode('utf-8')
 
     return data
 
 def url_size(url, faker = False):
@@ -215,7 +233,7 @@ def url_size(url, faker = False):
         response = request.urlopen(request.Request(url, headers = fake_headers), None)
     else:
         response = request.urlopen(url)
 
     size = int(response.headers['content-length'])
     return size
 
@@ -227,9 +245,9 @@ def url_info(url, faker = False):
         response = request.urlopen(request.Request(url, headers = fake_headers), None)
     else:
         response = request.urlopen(request.Request(url))
 
     headers = response.headers
 
     type = headers['content-type']
     mapping = {
         'video/3gpp': '3gp',
@@ -257,12 +275,12 @@ def url_info(url, faker = False):
             ext = None
     else:
         ext = None
 
     if headers['transfer-encoding'] != 'chunked':
         size = int(headers['content-length'])
     else:
         size = None
 
     return type, ext, size
 
 def url_locations(urls, faker = False):
@@ -272,13 +290,13 @@ def url_locations(urls, faker = False):
             response = request.urlopen(request.Request(url, headers = fake_headers), None)
         else:
             response = request.urlopen(request.Request(url))
 
         locations.append(response.url)
     return locations
 
 def url_save(url, filepath, bar, refer = None, is_part = False, faker = False):
     file_size = url_size(url, faker = faker)
 
     if os.path.exists(filepath):
         if not force and file_size == os.path.getsize(filepath):
             if not is_part:
@@ -296,19 +314,19 @@ def url_save(url, filepath, bar, refer = None, is_part = False, faker = False):
                 print('Overwriting %s' % tr(os.path.basename(filepath)), '...')
     elif not os.path.exists(os.path.dirname(filepath)):
         os.mkdir(os.path.dirname(filepath))
 
     temp_filepath = filepath + '.download'
     received = 0
     if not force:
         open_mode = 'ab'
 
         if os.path.exists(temp_filepath):
             received += os.path.getsize(temp_filepath)
             if bar:
                 bar.update_received(os.path.getsize(temp_filepath))
     else:
         open_mode = 'wb'
 
     if received < file_size:
         if faker:
             headers = fake_headers
@@ -318,7 +336,7 @@ def url_save(url, filepath, bar, refer = None, is_part = False, faker = False):
             headers['Range'] = 'bytes=' + str(received) + '-'
         if refer:
             headers['Referer'] = refer
 
         response = request.urlopen(request.Request(url, headers = headers), None)
         try:
             range_start = int(response.headers['content-range'][6:].split('/')[0].split('-')[0])
@@ -326,13 +344,13 @@ def url_save(url, filepath, bar, refer = None, is_part = False, faker = False):
             range_length = end_length - range_start
         except:
             range_length = int(response.headers['content-length'])
 
         if file_size != received + range_length:
             received = 0
             if bar:
                 bar.received = 0
             open_mode = 'wb'
 
         with open(temp_filepath, open_mode) as output:
             while True:
                 buffer = response.read(1024 * 256)
@@ -346,9 +364,9 @@ def url_save(url, filepath, bar, refer = None, is_part = False, faker = False):
                 received += len(buffer)
                 if bar:
                     bar.update_received(len(buffer))
 
     assert received == os.path.getsize(temp_filepath), '%s == %s == %s' % (received, os.path.getsize(temp_filepath), temp_filepath)
 
     if os.access(filepath, os.W_OK):
         os.remove(filepath) # on Windows rename could fail if destination filepath exists
     os.rename(temp_filepath, filepath)
@@ -371,19 +389,19 @@ def url_save_chunked(url, filepath, bar, refer = None, is_part = False, faker = False):
                 print('Overwriting %s' % tr(os.path.basename(filepath)), '...')
     elif not os.path.exists(os.path.dirname(filepath)):
         os.mkdir(os.path.dirname(filepath))
 
     temp_filepath = filepath + '.download'
     received = 0
     if not force:
         open_mode = 'ab'
 
         if os.path.exists(temp_filepath):
             received += os.path.getsize(temp_filepath)
             if bar:
                 bar.update_received(os.path.getsize(temp_filepath))
     else:
         open_mode = 'wb'
 
     if faker:
         headers = fake_headers
     else:
@@ -392,9 +410,9 @@ def url_save_chunked(url, filepath, bar, refer = None, is_part = False, faker = False):
         headers['Range'] = 'bytes=' + str(received) + '-'
     if refer:
         headers['Referer'] = refer
 
     response = request.urlopen(request.Request(url, headers = headers), None)
 
     with open(temp_filepath, open_mode) as output:
         while True:
             buffer = response.read(1024 * 256)
@@ -404,9 +422,9 @@ def url_save_chunked(url, filepath, bar, refer = None, is_part = False, faker = False):
             received += len(buffer)
             if bar:
                 bar.update_received(len(buffer))
 
     assert received == os.path.getsize(temp_filepath), '%s == %s == %s' % (received, os.path.getsize(temp_filepath))
 
     if os.access(filepath, os.W_OK):
         os.remove(filepath) # on Windows rename could fail if destination filepath exists
     os.rename(temp_filepath, filepath)
@@ -418,7 +436,7 @@ class SimpleProgressBar:
         self.total_pieces = total_pieces
         self.current_piece = 1
         self.received = 0
 
     def update(self):
         self.displayed = True
         bar_size = 40
@@ -437,14 +455,14 @@ class SimpleProgressBar:
         bar = '{0:>5}% ({1:>5}/{2:<5}MB) [{3:<40}] {4}/{5}'.format(percent, round(self.received / 1048576, 1), round(self.total_size / 1048576, 1), bar, self.current_piece, self.total_pieces)
         sys.stdout.write('\r' + bar)
         sys.stdout.flush()
 
     def update_received(self, n):
         self.received += n
         self.update()
 
     def update_piece(self, n):
         self.current_piece = n
 
     def done(self):
         if self.displayed:
             print()
@@ -457,20 +475,20 @@ class PiecesProgressBar:
         self.total_pieces = total_pieces
         self.current_piece = 1
         self.received = 0
 
     def update(self):
         self.displayed = True
         bar = '{0:>5}%[{1:<40}] {2}/{3}'.format('?', '?' * 40, self.current_piece, self.total_pieces)
         sys.stdout.write('\r' + bar)
         sys.stdout.flush()
 
     def update_received(self, n):
         self.received += n
         self.update()
 
     def update_piece(self, n):
         self.current_piece = n
 
     def done(self):
         if self.displayed:
             print()
@@ -486,12 +504,16 @@ class DummyProgressBar:
     def done(self):
         pass
 
-def download_urls(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
+def download_urls(urls, title, ext, total_size, output_dir='.', refer=None, merge=True, faker=False):
     assert urls
     if dry_run:
-        print('Real URLs:\n', urls, '\n')
+        print('Real URLs:\n%s\n' % urls)
         return
 
+    if player:
+        launch_player(player, urls)
+        return
+
     if not total_size:
         try:
             total_size = urls_size(urls)
@@ -500,9 +522,9 @@ def download_urls(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
             import sys
             traceback.print_exc(file = sys.stdout)
             pass
 
-    title = filenameable(title)
+    title = get_filename(title)
 
     filename = '%s.%s' % (title, ext)
     filepath = os.path.join(output_dir, filename)
     if total_size:
@@ -513,7 +535,7 @@ def download_urls(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
             bar = SimpleProgressBar(total_size, len(urls))
         else:
             bar = PiecesProgressBar(total_size, len(urls))
 
     if len(urls) == 1:
         url = urls[0]
         print('Downloading %s ...' % tr(filename))
@@ -530,7 +552,7 @@ def download_urls(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
             bar.update_piece(i + 1)
             url_save(url, filepath, bar, refer = refer, is_part = True, faker = faker)
         bar.done()
 
     if not merge:
         print()
         return
@@ -548,7 +570,7 @@ def download_urls(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
             else:
                 for part in parts:
                     os.remove(part)
 
     elif ext == 'mp4':
         try:
             from .processor.ffmpeg import has_ffmpeg_installed
@@ -563,22 +585,26 @@ def download_urls(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
             else:
                 for part in parts:
                     os.remove(part)
 
     else:
         print("Can't merge %s files" % ext)
 
     print()
 
-def download_urls_chunked(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
+def download_urls_chunked(urls, title, ext, total_size, output_dir='.', refer=None, merge=True, faker=False):
     assert urls
     if dry_run:
-        print('Real URLs:\n', urls, '\n')
+        print('Real URLs:\n%s\n' % urls)
         return
 
+    if player:
+        launch_player(player, urls)
+        return
+
     assert ext in ('ts')
 
-    title = filenameable(title)
+    title = get_filename(title)
 
     filename = '%s.%s' % (title, 'ts')
     filepath = os.path.join(output_dir, filename)
     if total_size:
@@ -589,7 +615,7 @@ def download_urls_chunked(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
         bar = SimpleProgressBar(total_size, len(urls))
     else:
         bar = PiecesProgressBar(total_size, len(urls))
 
     if len(urls) == 1:
         parts = []
         url = urls[0]
@@ -598,7 +624,7 @@ def download_urls_chunked(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
         parts.append(filepath)
         url_save_chunked(url, filepath, bar, refer = refer, faker = faker)
         bar.done()
 
         if not merge:
             print()
             return
@@ -626,7 +652,7 @@ def download_urls_chunked(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
             bar.update_piece(i + 1)
             url_save_chunked(url, filepath, bar, refer = refer, is_part = True, faker = faker)
         bar.done()
 
         if not merge:
             print()
             return
@@ -643,9 +669,25 @@ def download_urls_chunked(urls, title, ext, total_size, output_dir = '.', refer = None, merge = True, faker = False):
             print('No ffmpeg is found. Merging aborted.')
         else:
             print("Can't merge %s files" % ext)
 
     print()
 
+def download_rtmp_url(url, playpath, title, ext, total_size=0, output_dir='.', refer=None, merge=True, faker=False):
+    assert url
+    if dry_run:
+        print('Real URL:\n%s\n' % [url])
+        print('Real Playpath:\n%s\n' % [playpath])
+        return
+
+    if player:
+        from .processor.rtmpdump import play_rtmpdump_stream
+        play_rtmpdump_stream(player, url, playpath)
+        return
+
+    from .processor.rtmpdump import has_rtmpdump_installed, download_rtmpdump_stream
+    assert has_rtmpdump_installed(), "RTMPDump not installed."
+    download_rtmpdump_stream(url, playpath, title, ext, output_dir)
+
 def playlist_not_supported(name):
     def f(*args, **kwargs):
         raise NotImplementedError('Playlist is not supported for ' + name)
@@ -672,7 +714,7 @@ def print_info(site_info, title, type, size):
         type = 'video/MP2T'
     elif type in ['webm']:
         type = 'video/webm'
 
     if type in ['video/3gpp']:
         type_info = "3GPP multimedia file (%s)" % type
     elif type in ['video/x-flv', 'video/f4v']:
@@ -699,13 +741,42 @@ def print_info(site_info, title, type, size):
         type_info = "MP3 (%s)" % type
     else:
         type_info = "Unknown type (%s)" % type
 
     print("Video Site:", site_info)
-    print("Title:     ", tr(title))
+    print("Title:     ", unescape_html(tr(title)))
     print("Type:      ", type_info)
-    print("Size:      ", round(size / 1048576, 2), "MB (" + str(size) + " Bytes)")
+    print("Size:      ", round(size / 1048576, 2), "MiB (" + str(size) + " Bytes)")
     print()
 
+def parse_host(host):
+    """Parses host name and port number from a string.
+    """
+    if re.match(r'^(\d+)$', host) is not None:
+        return ("0.0.0.0", int(host))
+    if re.match(r'^(\w+)://', host) is None:
+        host = "//" + host
+    o = parse.urlparse(host)
+    hostname = o.hostname or "0.0.0.0"
+    port = o.port or 0
+    return (hostname, port)
+
+def get_sogou_proxy():
+    return sogou_proxy
+
+def set_proxy(proxy):
+    proxy_handler = request.ProxyHandler({
+        'http': '%s:%s' % proxy,
+        'https': '%s:%s' % proxy,
+    })
+    opener = request.build_opener(proxy_handler)
+    request.install_opener(opener)
+
+def unset_proxy():
+    proxy_handler = request.ProxyHandler({})
+    opener = request.build_opener(proxy_handler)
+    request.install_opener(opener)
+
+# DEPRECATED in favor of set_proxy() and unset_proxy()
 def set_http_proxy(proxy):
     if proxy == None: # Use system default setting
         proxy_support = request.ProxyHandler()
@@ -722,7 +793,7 @@ def download_main(download, download_playlist, urls, playlist, output_dir, merge, info_only):
         url = url[8:]
     if not url.startswith('http://'):
         url = 'http://' + url
 
     if playlist:
         download_playlist(url, output_dir = output_dir, merge = merge, info_only = info_only)
     else:
@@ -732,7 +803,7 @@ def get_version():
     try:
         import subprocess
         real_dir = os.path.dirname(os.path.realpath(__file__))
-        git_hash = subprocess.Popen(['git', 'rev-parse', '--short', 'HEAD'], cwd=real_dir, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL).stdout.read().decode('utf-8').strip()
+        git_hash = subprocess.check_output(['git', 'rev-parse', '--short', 'HEAD'], cwd=real_dir, stderr=subprocess.DEVNULL).decode('utf-8').strip()
         assert git_hash
         return '%s-%s' % (__version__, git_hash)
     except:
@@ -749,31 +820,46 @@ def script_main(script_name, download, download_playlist = None):
     -f | --force                             Force overwriting existed files.
     -i | --info                              Display the information of videos without downloading.
     -u | --url                               Display the real URLs of videos without downloading.
+    -c | --cookies                           Load NetScape's cookies.txt file.
     -n | --no-merge                          Don't merge video parts.
     -o | --output-dir <PATH>                 Set the output directory for downloaded videos.
-    -x | --http-proxy <PROXY-SERVER-IP:PORT> Use specific HTTP proxy for downloading.
+    -p | --player <PLAYER [options]>         Directly play the video with PLAYER like vlc/smplayer.
+    -x | --http-proxy <HOST:PORT>            Use specific HTTP proxy for downloading.
+    -y | --extractor-proxy <HOST:PORT>       Use specific HTTP proxy for extracting stream data.
         --no-proxy                           Don't use any proxy. (ignore $http_proxy)
+    -S | --sogou                             Use a Sogou proxy server for downloading.
+        --sogou-proxy <HOST:PORT>            Run a standalone Sogou proxy server.
        --debug                              Show traceback on KeyboardInterrupt.
     '''
 
-    short_opts = 'Vhfiuno:x:'
-    opts = ['version', 'help', 'force', 'info', 'url', 'no-merge', 'no-proxy', 'debug', 'output-dir=', 'http-proxy=']
+    short_opts = 'Vhfiuc:nSo:p:x:y:'
+    opts = ['version', 'help', 'force', 'info', 'url', 'cookies', 'no-merge', 'no-proxy', 'debug', 'sogou', 'output-dir=', 'player=', 'http-proxy=', 'extractor-proxy=', 'sogou-proxy=', 'sogou-env=']
     if download_playlist:
         short_opts = 'l' + short_opts
         opts = ['playlist'] + opts
 
     try:
         opts, args = getopt.getopt(sys.argv[1:], short_opts, opts)
     except getopt.GetoptError as err:
-        print(err)
-        print(help)
+        log.e(err)
+        log.e("try 'you-get --help' for more options")
         sys.exit(2)
 
+    global force
+    global dry_run
+    global player
+    global extractor_proxy
+    global sogou_proxy
+    global sogou_env
+    global cookies_txt
+    cookies_txt = None
+
     info_only = False
     playlist = False
     merge = True
     output_dir = '.'
     proxy = None
+    extractor_proxy = None
     traceback = False
     for o, a in opts:
         if o in ('-V', '--version'):
@@ -784,38 +870,175 @@ def script_main(script_name, download, download_playlist = None):
             print(help)
             sys.exit()
         elif o in ('-f', '--force'):
-            global force
             force = True
         elif o in ('-i', '--info'):
             info_only = True
         elif o in ('-u', '--url'):
-            global dry_run
             dry_run = True
+        elif o in ('-c', '--cookies'):
+            from http import cookiejar
+            cookies_txt = cookiejar.MozillaCookieJar(a)
+            cookies_txt.load()
         elif o in ('-l', '--playlist'):
             playlist = True
         elif o in ('-n', '--no-merge'):
             merge = False
-        elif o in ('--no-proxy'):
+        elif o in ('--no-proxy',):
             proxy = ''
-        elif o in ('--debug'):
+        elif o in ('--debug',):
             traceback = True
         elif o in ('-o', '--output-dir'):
             output_dir = a
+        elif o in ('-p', '--player'):
+            player = a
         elif o in ('-x', '--http-proxy'):
             proxy = a
+        elif o in ('-y', '--extractor-proxy'):
+            extractor_proxy = a
+        elif o in ('-S', '--sogou'):
+            sogou_proxy = ("0.0.0.0", 0)
+        elif o in ('--sogou-proxy',):
+            sogou_proxy = parse_host(a)
+        elif o in ('--sogou-env',):
+            sogou_env = a
         else:
+            log.e("try 'you-get --help' for more options")
             sys.exit(2)
     if not args:
-        print(help)
-        sys.exit()
+        if sogou_proxy is not None:
+            try:
+                if sogou_env is not None:
+                    server = sogou_proxy_server(sogou_proxy, network_env=sogou_env)
+                else:
+                    server = sogou_proxy_server(sogou_proxy)
+                server.serve_forever()
+            except KeyboardInterrupt:
+                if traceback:
+                    raise
+                else:
+                    sys.exit()
+        else:
+            print(help)
+            sys.exit(1)
+        sys.exit()
 
     set_http_proxy(proxy)
 
-    if traceback:
-        download_main(download, download_playlist, args, playlist, output_dir, merge, info_only)
-    else:
-        try:
-            download_main(download, download_playlist, args, playlist, output_dir, merge, info_only)
-        except KeyboardInterrupt:
-            sys.exit(1)
+    try:
+        download_main(download, download_playlist, args, playlist, output_dir, merge, info_only)
+    except KeyboardInterrupt:
+        if traceback:
+            raise
+        else:
+            sys.exit(1)
+
+
+class VideoExtractor():
+    def __init__(self, *args):
+        self.url = None
+        self.title = None
+        self.vid = None
+        self.streams = {}
+        self.streams_sorted = []
+
+        if args:
+            self.url = args[0]
+
+    def download_by_url(self, url, **kwargs):
+        self.url = url
+
+        self.prepare(**kwargs)
+
+        self.streams_sorted = [dict([('id', stream_type['id'])] + list(self.streams[stream_type['id']].items())) for stream_type in self.__class__.stream_types if stream_type['id'] in self.streams]
+
+        global extractor_proxy
+        if extractor_proxy:
+            set_proxy(parse_host(extractor_proxy))
+        self.extract(**kwargs)
+        if extractor_proxy:
+            unset_proxy()
+
+        self.download(**kwargs)
+
+    def download_by_vid(self, vid, **kwargs):
+        self.vid = vid
+
+        self.prepare(**kwargs)
+
+        self.streams_sorted = [dict([('id', stream_type['id'])] + list(self.streams[stream_type['id']].items())) for stream_type in self.__class__.stream_types if stream_type['id'] in self.streams]
+
+        global extractor_proxy
+        if extractor_proxy:
+            set_proxy(parse_host(extractor_proxy))
+        self.extract(**kwargs)
+        if extractor_proxy:
+            unset_proxy()
+
+        self.download(**kwargs)
+
+    def prepare(self, **kwargs):
+        pass
+        #raise NotImplementedError()
+
+    def extract(self, **kwargs):
+        pass
+        #raise NotImplementedError()
+
+    def p_stream(self, stream_id):
+        stream = self.streams[stream_id]
+        print("    - id:            \033[7m%s\033[0m" % stream_id)
+        print("      container:     %s" % stream['container'])
+        print("      video-profile: %s" % stream['video_profile'])
+        print("      size:          %s MiB (%s bytes)" % (round(stream['size'] / 1048576, 1), stream['size']))
+        #print("    # download-with: \033[4myou-get --stream=%s\033[0m" % stream_id)
+        print()
+
+    def p(self, stream_id=None):
+        print("site:  %s" % self.__class__.name)
+        print("title: %s" % self.title)
+        if stream_id:
+            # Print the stream
+            print("stream:")
+            self.p_stream(stream_id)
+
+        elif stream_id is None:
+            # Print stream with best quality
+            print("stream: # Best quality")
+            stream_id = self.streams_sorted[0]['id']
+            self.p_stream(stream_id)
+
+        elif stream_id == []:
+            # Print all available streams
+            print("streams: # Available quality and codecs")
+            for stream in self.streams_sorted:
+                self.p_stream(stream['id'])
+
+    def download(self, **kwargs):
+        if 'info_only' in kwargs and kwargs['info_only']:
+            if 'stream_id' in kwargs and kwargs['stream_id']:
+                # Display the stream
+                stream_id = kwargs['stream_id']
+                self.p(stream_id)
+            else:
+                # Display all available streams
+                self.p([])
+
+        else:
+            if 'stream_id' in kwargs and kwargs['stream_id']:
+                # Download the stream
+                stream_id = kwargs['stream_id']
+            else:
+                # Download stream with the best quality
+                stream_id = self.streams_sorted[0]['id']
+
+            self.p(None)
+
+            urls = self.streams[stream_id]['src']
+            if not urls:
+                log.e('[Failed] Cannot extract video source.')
+                log.e('This is most likely because the video has not been made available in your country.')
+                log.e('You may try to use a proxy via \'-y\' for extracting stream data.')
+                exit(1)
+            download_urls(urls, self.title, self.streams[stream_id]['container'], self.streams[stream_id]['size'], output_dir=kwargs['output_dir'], merge=kwargs['merge'])
+
+        self.__init__()
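The undeflate() hunk above replaces a one-shot zlib.decompress with a streaming zlib.decompressobj, whose flush() also drains any internally buffered output. A minimal self-contained sketch of both variants on a raw deflate stream (the sample payload is made up for illustration, not taken from the patch):

    import zlib

    # Produce a raw deflate stream (negative wbits = no zlib header/checksum),
    # the form servers send for "Content-Encoding: deflate" in its raw variant.
    compressor = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
    raw = compressor.compress(b'hello world' * 100) + compressor.flush()

    # One-shot variant (the old code): decompresses the whole buffer at once.
    assert zlib.decompress(raw, -zlib.MAX_WBITS) == b'hello world' * 100

    # Streaming variant (the new code): decompress() plus flush() to drain
    # whatever the decompressor buffered internally.
    d = zlib.decompressobj(-zlib.MAX_WBITS)
    assert d.decompress(raw) + d.flush() == b'hello world' * 100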
src/you_get/downloader/acfun.py (deleted file)

@@ -1,58 +0,0 @@
-#!/usr/bin/env python
-
-__all__ = ['acfun_download']
-
-from ..common import *
-
-from .qq import qq_download_by_id
-from .sina import sina_download_by_vid
-from .tudou import tudou_download_by_iid
-from .youku import youku_download_by_id
-
-import json, re
-
-def get_srt_json(id):
-    url = 'http://comment.acfun.tv/%s.json' % id
-    return get_html(url)
-
-def acfun_download_by_id(id, title = None, output_dir = '.', merge = True, info_only = False):
-    info = json.loads(get_html('http://wenzhou.acfun.tv/api/getVideoByID.aspx?vid=' + id))
-    t = info['vtype']
-    vid = info['vid']
-    if t == 'sina':
-        sina_download_by_vid(vid, title, output_dir = output_dir, merge = merge, info_only = info_only)
-    elif t == 'youku':
-        youku_download_by_id(vid, title, output_dir = output_dir, merge = merge, info_only = info_only)
-    elif t == 'tudou':
-        tudou_download_by_iid(vid, title, output_dir = output_dir, merge = merge, info_only = info_only)
-    elif t == 'qq':
-        qq_download_by_id(vid, title, output_dir = output_dir, merge = merge, info_only = info_only)
-    else:
-        raise NotImplementedError(t)
-
-    if not info_only:
-        print('Downloading %s ...' % (title + '.cmt.json'))
-        cmt = get_srt_json(vid)
-        with open(os.path.join(output_dir, title + '.cmt.json'), 'w') as x:
-            x.write(cmt)
-
-def acfun_download(url, output_dir = '.', merge = True, info_only = False):
-    assert re.match(r'http://[^\.]+.acfun.tv/v/ac(\d+)', url)
-    html = get_html(url)
-
-    title = r1(r'<h1 id="title-article" class="title"[^<>]*>([^<>]+)<', html)
-    assert title
-    title = unescape_html(title)
-    title = escape_file_path(title)
-    title = title.replace(' - AcFun.tv', '')
-
-    id = r1(r"\[Video\](\d+)\[/Video\]", html) or r1(r"\[video\](\d+)\[/video\]", html)
-    if not id:
-        id = r1(r"src=\"/newflvplayer/player.*id=(\d+)", html)
-        sina_download_by_vid(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
-    else:
-        acfun_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
-
-site_info = "AcFun.tv"
-download = acfun_download
-download_playlist = playlist_not_supported('acfun')
@ -1,25 +0,0 @@
#!/usr/bin/env python

__all__ = ['douban_download']

from ..common import *

def douban_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)

    titles = re.findall(r'"name":"([^"]*)"', html)
    real_urls = [re.sub('\\\\/', '/', i) for i in re.findall(r'"rawUrl":"([^"]*)"', html)]

    for i in range(len(titles)):
        title = titles[i]
        real_url = real_urls[i]

        type, ext, size = url_info(real_url)

        print_info(site_info, title, type, size)
        if not info_only:
            download_urls([real_url], title, ext, size, output_dir, merge = merge)

site_info = "Douban.com"
download = douban_download
download_playlist = playlist_not_supported('douban')
@ -1,49 +0,0 @@
#!/usr/bin/env python

__all__ = ['qq_download']

from ..common import *

def qq_download_by_id(id, title = None, output_dir = '.', merge = True, info_only = False):
    url = 'http://vsrc.store.qq.com/%s.flv' % id

    _, _, size = url_info(url)

    print_info(site_info, title, 'flv', size)
    if not info_only:
        download_urls([url], title, 'flv', size, output_dir = output_dir, merge = merge)

def qq_download(url, output_dir = '.', merge = True, info_only = False):
    if re.match(r'http://v.qq.com/([^\?]+)\?vid', url):
        aid = r1(r'(.*)\.html', url)
        vid = r1(r'http://v.qq.com/[^\?]+\?vid=(\w+)', url)
        url = "%s/%s.html" % (aid, vid)

    if re.match(r'http://y.qq.com/([^\?]+)\?vid', url):
        vid = r1(r'http://y.qq.com/[^\?]+\?vid=(\w+)', url)

        url = "http://v.qq.com/page/%s.html" % vid

        r_url = r1(r'<meta http-equiv="refresh" content="0;url=([^"]*)', get_html(url))
        if r_url:
            aid = r1(r'(.*)\.html', r_url)
            url = "%s/%s.html" % (aid, vid)

    if re.match(r'http://static.video.qq.com/.*vid=', url):
        vid = r1(r'http://static.video.qq.com/.*vid=(\w+)', url)
        url = "http://v.qq.com/page/%s.html" % vid

    html = get_html(url)

    title = r1(r'title:"([^"]+)"', html)
    assert title
    title = unescape_html(title)
    title = escape_file_path(title)

    id = r1(r'vid:"([^"]+)"', html)

    qq_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)

site_info = "QQ.com"
download = qq_download
download_playlist = playlist_not_supported('qq')
@ -1,24 +0,0 @@
#!/usr/bin/env python

__all__ = ['ted_download']

from ..common import *

def ted_download(url, output_dir = '.', merge = True, info_only = False):
    page = get_html(url).split("\n")
    for line in page:
        if line.find("<title>") > -1:
            title = line.replace("<title>", "").replace("</title>", "").replace("\t", "")
            title = title[:title.find(' | ')]
        if line.find("no-flash-video-download") > -1:
            url = line.replace('<a id="no-flash-video-download" href="', "").replace(" ", "").replace("\t", "").replace(".mp4", "-480p-en.mp4")
            url = url[:url.find('"')]
            type, ext, size = url_info(url)
            print_info(site_info, title, type, size)
            if not info_only:
                download_urls([url], title, ext, size, output_dir, merge=merge)
            break

site_info = "ted.com"
download = ted_download
download_playlist = playlist_not_supported('ted')
@ -1,20 +0,0 @@
#!/usr/bin/env python

__all__ = ['vine_download']

from ..common import *

def vine_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)

    title = r1(r'<meta property="og:title" content="([^"]*)"', html)
    url = r1(r'<source src="([^"]*)"', html)
    type, ext, size = url_info(url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge = merge)

site_info = "Vine.co"
download = vine_download
download_playlist = playlist_not_supported('vine')
@ -1,208 +0,0 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

__all__ = ['youku_download', 'youku_download_playlist', 'youku_download_by_id']

from ..common import *

import json
from random import randint
from time import time
import re
import sys

def trim_title(title):
    title = title.replace(' - 视频 - 优酷视频 - 在线观看', '')
    title = title.replace(' - 专辑 - 优酷视频', '')
    title = re.sub(r'—([^—]+)—优酷网,视频高清在线观看', '', title)
    return title

def find_video_id_from_url(url):
    patterns = [r'^http://v.youku.com/v_show/id_([\w=]+).html',
                r'^http://player.youku.com/player.php/sid/([\w=]+)/v.swf',
                r'^loader\.swf\?VideoIDS=([\w=]+)',
                r'^([\w=]+)$']
    return r1_of(patterns, url)

def find_video_id_from_show_page(url):
    return re.search(r'<a class="btnShow btnplay.*href="([^"]+)"', get_html(url)).group(1)

def youku_url(url):
    id = find_video_id_from_url(url)
    if id:
        return 'http://v.youku.com/v_show/id_%s.html' % id
    if re.match(r'http://www.youku.com/show_page/id_\w+.html', url):
        return find_video_id_from_show_page(url)
    if re.match(r'http://v.youku.com/v_playlist/\w+.html', url):
        return url
    return None

def parse_video_title(url, page):
    if re.search(r'v_playlist', url):
        # if we are playing a video from a playlist, the meta title might be incorrect
        title = r1_of([r'<div class="show_title" title="([^"]+)">[^<]', r'<title>([^<>]*)</title>'], page)
    else:
        title = r1_of([r'<div class="show_title" title="([^"]+)">[^<]', r'<meta name="title" content="([^"]*)"'], page)
    assert title
    title = trim_title(title)
    if re.search(r'v_playlist', url) and re.search(r'-.*\S+', title):
        title = re.sub(r'^[^-]+-\s*', '', title) # remove the special name from title for playlist video
    title = re.sub(r'—专辑:.*', '', title) # remove the special name from title for playlist video
    title = unescape_html(title)

    subtitle = re.search(r'<span class="subtitle" id="subtitle">([^<>]*)</span>', page)
    if subtitle:
        subtitle = subtitle.group(1).strip()
    if subtitle == title:
        subtitle = None
    if subtitle:
        title += '-' + subtitle
    return title

def parse_playlist_title(url, page):
    if re.search(r'v_playlist', url):
        # if we are playing a video from a playlist, the meta title might be incorrect
        title = re.search(r'<title>([^<>]*)</title>', page).group(1)
    else:
        title = re.search(r'<meta name="title" content="([^"]*)"', page).group(1)
    title = trim_title(title)
    if re.search(r'v_playlist', url) and re.search(r'-.*\S+', title):
        title = re.sub(r'^[^-]+-\s*', '', title)
    title = re.sub(r'^.*—专辑:《(.+)》', r'\1', title)
    title = unescape_html(title)
    return title

def parse_page(url):
    url = youku_url(url)
    page = get_html(url)
    id2 = re.search(r"var\s+videoId2\s*=\s*'(\S+)'", page).group(1)
    title = parse_video_title(url, page)
    return id2, title

def get_info(videoId2):
    return json.loads(get_html('http://v.youku.com/player/getPlayList/VideoIDS/' + videoId2 + '/timezone/+08/version/5/source/out/Sc/2'))

def find_video(info, stream_type = None):
    #key = '%s%x' % (info['data'][0]['key2'], int(info['data'][0]['key1'], 16) ^ 0xA55AA5A5)
    segs = info['data'][0]['segs']
    types = segs.keys()
    if not stream_type:
        for x in ['hd2', 'mp4', 'flv']:
            if x in types:
                stream_type = x
                break
        else:
            raise NotImplementedError()
    assert stream_type in ('hd2', 'mp4', 'flv')
    file_type = {'hd2': 'flv', 'mp4': 'mp4', 'flv': 'flv'}[stream_type]

    seed = info['data'][0]['seed']
    source = list("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890")
    mixed = ''
    while source:
        seed = (seed * 211 + 30031) & 0xFFFF
        index = seed * len(source) >> 16
        c = source.pop(index)
        mixed += c

    ids = info['data'][0]['streamfileids'][stream_type].split('*')[:-1]
    vid = ''.join(mixed[int(i)] for i in ids)

    sid = '%s%s%s' % (int(time() * 1000), randint(1000, 1999), randint(1000, 9999))

    urls = []
    for s in segs[stream_type]:
        no = '%02x' % int(s['no'])
        url = 'http://f.youku.com/player/getFlvPath/sid/%s_%s/st/%s/fileid/%s%s%s?K=%s&ts=%s' % (sid, no, file_type, vid[:8], no.upper(), vid[10:], s['k'], s['seconds'])
        urls.append((url, int(s['size'])))
    return urls

def file_type_of_url(url):
    return str(re.search(r'/st/([^/]+)/', url).group(1))

def youku_download_by_id(id, title, output_dir = '.', stream_type = None, merge = True, info_only = False):
    info = get_info(id)
    urls, sizes = zip(*find_video(info, stream_type))
    ext = file_type_of_url(urls[0])
    total_size = sum(sizes)

    print_info(site_info, title, ext, total_size)
    if not info_only:
        download_urls(urls, title, ext, total_size, output_dir, merge = merge)

def parse_playlist_videos(html):
    return re.findall(r'id="A_(\w+)"', html)

def parse_playlist_pages(html):
    m = re.search(r'<ul class="pages">.*?</ul>', html, flags = re.S)
    if m:
        urls = re.findall(r'href="([^"]+)"', m.group())
        x1, x2, x3 = re.match(r'^(.*page_)(\d+)(_.*)$', urls[-1]).groups()
        return ['http://v.youku.com%s%s%s?__rt=1&__ro=listShow' % (x1, i, x3) for i in range(2, int(x2) + 1)]
    else:
        return []

def parse_playlist(url):
    html = get_html(url)
    video_id = re.search(r"var\s+videoId\s*=\s*'(\d+)'", html).group(1)
    show_id = re.search(r'var\s+showid\s*=\s*"(\d+)"', html).group(1)
    list_url = 'http://v.youku.com/v_vpofficiallist/page_1_showid_%s_id_%s.html?__rt=1&__ro=listShow' % (show_id, video_id)
    html = get_html(list_url)
    ids = parse_playlist_videos(html)
    for url in parse_playlist_pages(html):
        ids.extend(parse_playlist_videos(get_html(url)))
    return ids

def parse_vplaylist(url):
    id = r1_of([r'^http://www.youku.com/playlist_show/id_(\d+)(?:_ascending_\d_mode_pic(?:_page_\d+)?)?.html',
                r'^http://v.youku.com/v_playlist/f(\d+)o[01]p\d+.html',
                r'^http://u.youku.com/user_playlist/pid_(\d+)_id_[\w=]+(?:_page_\d+)?.html'],
               url)
    assert id, 'not valid vplaylist url: ' + url
    url = 'http://www.youku.com/playlist_show/id_%s.html' % id
    n = int(re.search(r'<span class="num">(\d+)</span>', get_html(url)).group(1))
    return ['http://v.youku.com/v_playlist/f%so0p%s.html' % (id, i) for i in range(n)]

def youku_download_playlist(url, output_dir='.', merge=True, info_only=False):
    """Downloads a Youku playlist.
    """

    if re.match(r'http://www.youku.com/playlist_show/id_\d+(?:_ascending_\d_mode_pic(?:_page_\d+)?)?.html', url):
        ids = parse_vplaylist(url)
    elif re.match(r'http://v.youku.com/v_playlist/f\d+o[01]p\d+.html', url):
        ids = parse_vplaylist(url)
    elif re.match(r'http://u.youku.com/user_playlist/pid_(\d+)_id_[\w=]+(?:_page_\d+)?.html', url):
        ids = parse_vplaylist(url)
    elif re.match(r'http://www.youku.com/show_page/id_\w+.html', url):
        url = find_video_id_from_show_page(url)
        assert re.match(r'http://v.youku.com/v_show/id_([\w=]+).html', url), 'URL not supported as playlist'
        ids = parse_playlist(url)
    else:
        ids = []
    assert ids != []

    title = parse_playlist_title(url, get_html(url))
    title = filenameable(title)
    output_dir = os.path.join(output_dir, title)

    for i, id in enumerate(ids):
        print('Processing %s of %s videos...' % (i + 1, len(ids)))
        try:
            id, title = parse_page(youku_url(id))
            youku_download_by_id(id, title, output_dir=output_dir, merge=merge, info_only=info_only)
        except:
            continue

def youku_download(url, output_dir='.', merge=True, info_only=False):
    """Downloads Youku videos by URL.
    """

    try:
        youku_download_playlist(url, output_dir=output_dir, merge=merge, info_only=info_only)
    except:
        id, title = parse_page(url)
        youku_download_by_id(id, title=title, output_dir=output_dir, merge=merge, info_only=info_only)

site_info = "Youku.com"
download = youku_download
download_playlist = youku_download_playlist
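For reference, the seed-based shuffle in find_video above is the heart of Youku's file-id obfuscation: a 16-bit linear congruential generator (seed = (seed * 211 + 30031) mod 2^16) repeatedly draws one character out of a fixed alphabet, and the streamfileids indices are then looked up in the shuffled string. A standalone sketch of just that step; the seed and index values below are made up for illustration:

def youku_mix(seed):
    # Shuffle a fixed alphabet with Youku's 16-bit LCG; each step the
    # seed picks (and removes) one character from the source pool.
    source = list("abcdefghijklmnopqrstuvwxyz"
                  "ABCDEFGHIJKLMNOPQRSTUVWXYZ/\\:._-1234567890")
    mixed = ''
    while source:
        seed = (seed * 211 + 30031) & 0xFFFF
        index = seed * len(source) >> 16
        mixed += source.pop(index)
    return mixed

# Hypothetical example: map streamfileids fragments (decimal indices)
# into the shuffled alphabet to recover the real file id.
mixed = youku_mix(9731)
ids = ['3', '12', '7']
fileid = ''.join(mixed[int(i)] for i in ids)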
@ -5,6 +5,7 @@ from .alive import *
from .baidu import *
from .bilibili import *
from .blip import *
from .cbs import *
from .cntv import *
from .coursera import *
from .dailymotion import *
@ -18,7 +19,10 @@ from .ifeng import *
from .instagram import *
from .iqiyi import *
from .joy import *
from .jpopsuki import *
from .ku6 import *
from .letv import *
from .magisto import *
from .miomio import *
from .mixcloud import *
from .netease import *
@ -29,11 +33,13 @@ from .sina import *
from .sohu import *
from .songtaste import *
from .soundcloud import *
from .theplatform import *
from .tudou import *
from .tumblr import *
from .vid48 import *
from .vimeo import *
from .vine import *
from .vk import *
from .w56 import *
from .xiami import *
from .yinyuetai import *
@ -1,20 +1,19 @@
#!/usr/bin/env python
__all__ = ['main', 'any_download', 'any_download_playlist']

from ..downloader import *
from ..extractor import *
from ..common import *

def url_to_module(url):
    site = r1(r'http://([^/]+)/', url)
    assert site, 'invalid url: ' + url

    if site.endswith('.com.cn'):
        site = site[:-3]
    domain = r1(r'(\.[^.]+\.[^.]+)$', site)
    if not domain:
        domain = site
    video_host = r1(r'https?://([^/]+)/', url)
    video_url = r1(r'https?://[^/]+(.*)', url)
    assert video_host and video_url, 'invalid url: ' + url

    if video_host.endswith('.com.cn'):
        video_host = video_host[:-3]
    domain = r1(r'(\.[^.]+\.[^.]+)$', video_host) or video_host
    assert domain, 'unsupported url: ' + url

    k = r1(r'([^.]+)', domain)
    downloads = {
        '163': netease,
@ -25,6 +24,7 @@ def url_to_module(url):
        'bilibili': bilibili,
        'blip': blip,
        'cntv': cntv,
        'cbs': cbs,
        'coursera': coursera,
        'dailymotion': dailymotion,
        'douban': douban,
@ -38,8 +38,11 @@ def url_to_module(url):
        'instagram': instagram,
        'iqiyi': iqiyi,
        'joy': joy,
        'jpopsuki': jpopsuki,
        'kankanews': bilibili,
        'ku6': ku6,
        'letv': letv,
        'magisto': magisto,
        'miomio': miomio,
        'mixcloud': mixcloud,
        'nicovideo': nicovideo,
@ -51,11 +54,13 @@ def url_to_module(url):
        'songtaste':songtaste,
        'soundcloud': soundcloud,
        'ted': ted,
        'theplatform': theplatform,
        'tudou': tudou,
        'tumblr': tumblr,
        'vid48': vid48,
        'vimeo': vimeo,
        'vine': vine,
        'vk': vk,
        'xiami': xiami,
        'yinyuetai': yinyuetai,
        'youku': youku,
@ -65,17 +70,25 @@ def url_to_module(url):
        #TODO
    }
    if k in downloads:
        return downloads[k]
        return downloads[k], url
    else:
        raise NotImplementedError(url)
        import http.client
        conn = http.client.HTTPConnection(video_host)
        conn.request("HEAD", video_url)
        res = conn.getresponse()
        location = res.getheader('location')
        if location is None:
            raise NotImplementedError(url)
        else:
            return url_to_module(location)

def any_download(url, output_dir = '.', merge = True, info_only = False):
    m = url_to_module(url)
    m.download(url, output_dir = output_dir, merge = merge, info_only = info_only)
def any_download(url, output_dir='.', merge=True, info_only=False):
    m, url = url_to_module(url)
    m.download(url, output_dir=output_dir, merge=merge, info_only=info_only)

def any_download_playlist(url, output_dir = '.', merge = True, info_only = False):
    m = url_to_module(url)
    m.download_playlist(url, output_dir = output_dir, merge = merge, info_only = info_only)
def any_download_playlist(url, output_dir='.', merge=True, info_only=False):
    m, url = url_to_module(url)
    m.download_playlist(url, output_dir=output_dir, merge=merge, info_only=info_only)

def main():
    script_main('you-get', any_download, any_download_playlist)
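For reference, the fallback branch of url_to_module above is what adds "general short URL" support: an unknown host is probed with a bare HEAD request, and if the response carries a Location header, resolution recurses on the redirect target. The same step in isolation; the host and path below are hypothetical:

import http.client

def resolve_redirect(video_host, video_url):
    # Issue a HEAD request and return the redirect target, if any.
    conn = http.client.HTTPConnection(video_host)
    conn.request("HEAD", video_url)
    res = conn.getresponse()
    return res.getheader('location')

# e.g. resolve_redirect('t.cn', '/abc123') might return the expanded URL,
# which is then fed back into url_to_module().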
74 src/you_get/extractor/acfun.py Normal file
@ -0,0 +1,74 @@
#!/usr/bin/env python

__all__ = ['acfun_download']

from ..common import *

from .qq import qq_download_by_id
from .sina import sina_download_by_vid
from .tudou import tudou_download_by_iid
from .youku import youku_download_by_vid

import json, re

def get_srt_json(id):
    url = 'http://comment.acfun.com/%s.json' % id
    return get_html(url)

def get_srt_lock_json(id):
    url = 'http://comment.acfun.com/%s_lock.json' % id
    return get_html(url)

def acfun_download_by_vid(vid, title=None, output_dir='.', merge=True, info_only=False):
    info = json.loads(get_html('http://www.acfun.com/video/getVideo.aspx?id=' + vid))
    sourceType = info['sourceType']
    sourceId = info['sourceId']
    danmakuId = info['danmakuId']
    if sourceType == 'sina':
        sina_download_by_vid(sourceId, title, output_dir=output_dir, merge=merge, info_only=info_only)
    elif sourceType == 'youku':
        youku_download_by_vid(sourceId, title, output_dir=output_dir, merge=merge, info_only=info_only)
    elif sourceType == 'tudou':
        tudou_download_by_iid(sourceId, title, output_dir=output_dir, merge=merge, info_only=info_only)
    elif sourceType == 'qq':
        qq_download_by_id(sourceId, title, output_dir=output_dir, merge=merge, info_only=info_only)
    else:
        raise NotImplementedError(sourceType)

    if not info_only:
        title = get_filename(title)
        try:
            print('Downloading %s ...\n' % (title + '.cmt.json'))
            cmt = get_srt_json(danmakuId)
            with open(os.path.join(output_dir, title + '.cmt.json'), 'w') as x:
                x.write(cmt)
            print('Downloading %s ...\n' % (title + '.cmt_lock.json'))
            cmt = get_srt_lock_json(danmakuId)
            with open(os.path.join(output_dir, title + '.cmt_lock.json'), 'w') as x:
                x.write(cmt)
        except:
            pass

def acfun_download(url, output_dir = '.', merge = True, info_only = False):
    assert re.match(r'http://[^\.]+.acfun.[^\.]+/v/ac(\d+)', url)
    html = get_html(url)

    title = r1(r'<h1 id="txt-title-view">([^<>]+)<', html)
    title = unescape_html(title)
    title = escape_file_path(title)
    assert title

    videos = re.findall("data-vid=\"(\d+)\" href=\"[^\"]+\" title=\"([^\"]+)\"", html)
    if videos is not None:
        for video in videos:
            p_vid = video[0]
            p_title = title + " - " + video[1]
            acfun_download_by_vid(p_vid, p_title, output_dir=output_dir, merge=merge, info_only=info_only)
    else:
        # Useless - to be removed?
        id = r1(r"src=\"/newflvplayer/player.*id=(\d+)", html)
        sina_download_by_vid(id, title, output_dir=output_dir, merge=merge, info_only=info_only)

site_info = "AcFun.com"
download = acfun_download
download_playlist = playlist_not_supported('acfun')
@ -8,45 +8,59 @@ from .. import common

from urllib import parse

def baidu_get_song_html(sid):
    return get_html('http://music.baidu.com/song/%s/download?__o=%%2Fsong%%2F%s' % (sid, sid), faker = True)
def baidu_get_song_data(sid):
    data = json.loads(get_html('http://music.baidu.com/data/music/fmlink?songIds=%s' % sid, faker = True))['data']

def baidu_get_song_url(html):
    return r1(r'downlink="/data/music/file\?link=(.+?)"', html)
    if data['xcode'] != '':
        # inside China mainland
        return data['songList'][0]
    else:
        # outside China mainland
        return None

def baidu_get_song_artist(html):
    return r1(r'singer_name:"(.+?)"', html)
def baidu_get_song_url(data):
    return data['songLink']

def baidu_get_song_album(html):
    return r1(r'ablum_name:"(.+?)"', html)
def baidu_get_song_artist(data):
    return data['artistName']

def baidu_get_song_title(html):
    return r1(r'song_title:"(.+?)"', html)
def baidu_get_song_album(data):
    return data['albumName']

def baidu_download_lyric(sid, file_name, output_dir):
    if common.dry_run:
        return
def baidu_get_song_title(data):
    return data['songName']

    html = get_html('http://music.baidu.com/song/' + sid)
    href = r1(r'<a class="down-lrc-btn" data-lyricdata=\'{ "href":"(.+?)" }\' href="#">', html)
    if href:
        lrc = get_html('http://music.baidu.com' + href)
        if len(lrc) > 0:
            with open(output_dir + "/" + file_name.replace('/', '-') + '.lrc', 'w') as x:
                x.write(lrc)
def baidu_get_song_lyric(data):
    lrc = data['lrcLink']
    return None if lrc is '' else "http://music.baidu.com%s" % lrc

def baidu_download_song(sid, output_dir = '.', merge = True, info_only = False):
    html = baidu_get_song_html(sid)
    url = baidu_get_song_url(html)
    title = baidu_get_song_title(html)
    artist = baidu_get_song_artist(html)
    album = baidu_get_song_album(html)
    type, ext, size = url_info(url, faker = True)
def baidu_download_song(sid, output_dir='.', merge=True, info_only=False):
    data = baidu_get_song_data(sid)
    if data is not None:
        url = baidu_get_song_url(data)
        title = baidu_get_song_title(data)
        artist = baidu_get_song_artist(data)
        album = baidu_get_song_album(data)
        lrc = baidu_get_song_lyric(data)
        file_name = "%s - %s - %s" % (title, album, artist)
    else:
        html = get_html("http://music.baidu.com/song/%s" % sid)
        url = r1(r'data_url="([^"]+)"', html)
        title = r1(r'data_name="([^"]+)"', html)
        file_name = title

    type, ext, size = url_info(url, faker=True)
    print_info(site_info, title, type, size)
    if not info_only:
        file_name = "%s - %s - %s" % (title, album, artist)
        download_urls([url], file_name, ext, size, output_dir, merge = merge, faker = True)
        baidu_download_lyric(sid, file_name, output_dir)
        download_urls([url], file_name, ext, size, output_dir, merge=merge, faker=True)

    try:
        type, ext, size = url_info(lrc, faker=True)
        print_info(site_info, title, type, size)
        if not info_only:
            download_urls([lrc], file_name, ext, size, output_dir, faker=True)
    except:
        pass

def baidu_download_album(aid, output_dir = '.', merge = True, info_only = False):
    html = get_html('http://music.baidu.com/album/%s' % aid, faker = True)
@ -56,32 +70,40 @@ def baidu_download_album(aid, output_dir = '.', merge = True, info_only = False)
    ids = json.loads(r1(r'<span class="album-add" data-adddata=\'(.+?)\'>', html).replace('"', '').replace(';', '"'))['ids']
    track_nr = 1
    for id in ids:
        song_html = baidu_get_song_html(id)
        song_url = baidu_get_song_url(song_html)
        song_title = baidu_get_song_title(song_html)
        song_data = baidu_get_song_data(id)
        song_url = baidu_get_song_url(song_data)
        song_title = baidu_get_song_title(song_data)
        song_lrc = baidu_get_song_lyric(song_data)
        file_name = '%02d.%s' % (track_nr, song_title)

        type, ext, size = url_info(song_url, faker = True)
        print_info(site_info, song_title, type, size)
        if not info_only:
            download_urls([song_url], file_name, ext, size, output_dir, merge = merge, faker = True)
            baidu_download_lyric(id, file_name, output_dir)

        if song_lrc:
            type, ext, size = url_info(song_lrc, faker = True)
            print_info(site_info, song_title, type, size)
            if not info_only:
                download_urls([song_lrc], file_name, ext, size, output_dir, faker = True)

        track_nr += 1

def baidu_download(url, output_dir = '.', stream_type = None, merge = True, info_only = False):
    if re.match(r'http://pan.baidu.com', url):
        html = get_html(url)

        title = r1(r'server_filename="([^"]+)"', html)
        if len(title.split('.')) > 1:
            title = ".".join(title.split('.')[:-1])

        real_url = r1(r'\\"dlink\\":\\"([^"]*)\\"', html).replace('\\\\/', '/')
        type, ext, size = url_info(real_url, faker = True)

        print_info(site_info, title, ext, size)
        if not info_only:
            download_urls([real_url], title, ext, size, output_dir, merge = merge)

    elif re.match(r'http://music.baidu.com/album/\d+', url):
        id = r1(r'http://music.baidu.com/album/(\d+)', url)
        baidu_download_album(id, output_dir, merge, info_only)
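For reference, the new Baidu code above replaces HTML scraping with one JSON call to the fmlink API; everything (URL, artist, album, lyric link) comes from a single song record. A minimal sketch of that fetch, under the assumption that the endpoint answers plain HTTP without the faker headers the real code sends:

import json, urllib.request

def baidu_song_record(sid):
    # Query the fmlink API (as used above) and return the first song
    # record, or None when xcode is empty (request blocked by region).
    url = 'http://music.baidu.com/data/music/fmlink?songIds=%s' % sid
    data = json.loads(urllib.request.urlopen(url).read().decode('utf-8'))['data']
    return data['songList'][0] if data['xcode'] != '' else None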
@ -6,12 +6,12 @@

from .sina import sina_download_by_vid
from .tudou import tudou_download_by_id
from .youku import youku_download_by_id
from .youku import youku_download_by_vid

import re

def get_srt_xml(id):
    url = 'http://comment.bilibili.tv/%s.xml' % id
    url = 'http://comment.bilibili.com/%s.xml' % id
    return get_html(url)

def parse_srt_p(p):
@ -19,7 +19,7 @@ def parse_srt_p(p):
    assert len(fields) == 8, fields
    time, mode, font_size, font_color, pub_time, pool, user_id, history = fields
    time = float(time)

    mode = int(mode)
    assert 1 <= mode <= 8
    # mode 1~3: scrolling
@ -28,17 +28,17 @@ def parse_srt_p(p):
    # mode 6: reverse?
    # mode 7: position
    # mode 8: advanced

    pool = int(pool)
    assert 0 <= pool <= 2
    # pool 0: normal
    # pool 1: srt
    # pool 2: special?

    font_size = int(font_size)

    font_color = '#%06x' % int(font_color)

    return pool, mode, font_size, font_color

def parse_srt_xml(xml):
@ -54,9 +54,9 @@ def parse_cid_playurl(xml):
    return urls

def bilibili_download_by_cid(id, title, output_dir = '.', merge = True, info_only = False):
    url = 'http://interface.bilibili.tv/playurl?cid=' + id
    url = 'http://interface.bilibili.com/playurl?cid=' + id
    urls = [i if not re.match(r'.*\.qqvideo\.tc\.qq\.com', i) else re.sub(r'.*\.qqvideo\.tc\.qq\.com', 'http://vsrc.store.qq.com', i) for i in parse_cid_playurl(get_html(url, 'utf-8'))] # dirty fix for QQ

    if re.search(r'\.(flv|hlv)\b', urls[0]):
        type = 'flv'
    elif re.search(r'/flv/', urls[0]):
@ -65,25 +65,24 @@ def bilibili_download_by_cid(id, title, output_dir = '.', merge = True, info_onl
        type = 'mp4'
    else:
        type = 'flv'

    size = 0
    for url in urls:
        _, _, temp = url_info(url)
        size += temp

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls(urls, title, type, total_size = None, output_dir = output_dir, merge = merge)

def bilibili_download(url, output_dir = '.', merge = True, info_only = False):
    assert re.match(r'http://(www.bilibili.tv|bilibili.kankanews.com|bilibili.smgbb.cn)/video/av(\d+)', url)
    html = get_html(url)

    title = r1(r'<h2>([^<>]+)</h2>', html)
    title = r1(r'<h2[^>]*>([^<>]+)</h2>', html)
    title = unescape_html(title)
    title = escape_file_path(title)

    flashvars = r1_of([r'player_params=\'(cid=\d+)', r'flashvars="([^"]+)"', r'"https://secure.bilibili.tv/secure,(cid=\d+)(?:&aid=\d+)?"'], html)
    flashvars = r1_of([r'player_params=\'(cid=\d+)', r'flashvars="([^"]+)"', r'"https://[a-z]+\.bilibili\.com/secure,(cid=\d+)(?:&aid=\d+)?"'], html)
    assert flashvars
    t, id = flashvars.split('=', 1)
    id = id.split('&')[0]
@ -92,18 +91,19 @@ def bilibili_download(url, output_dir = '.', merge = True, info_only = False):
    elif t == 'vid':
        sina_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
    elif t == 'ykid':
        youku_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
        youku_download_by_vid(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
    elif t == 'uid':
        tudou_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
    else:
        raise NotImplementedError(flashvars)

    if not info_only:
        print('Downloading %s ...' % (title + '.cmt.xml'))
        title = get_filename(title)
        print('Downloading %s ...\n' % (title + '.cmt.xml'))
        xml = get_srt_xml(id)
        with open(os.path.join(output_dir, title + '.cmt.xml'), 'w') as x:
        with open(os.path.join(output_dir, title + '.cmt.xml'), 'w', encoding='utf-8') as x:
            x.write(xml)

site_info = "bilibili.tv"
site_info = "bilibili.com"
download = bilibili_download
download_playlist = playlist_not_supported('bilibili')
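For reference, parse_srt_p above unpacks the eight comma-separated fields of a danmaku comment's p attribute: appearance time, display mode, font size, font color, publish time, comment pool, user id, and history id. A worked example with a made-up attribute string:

p = '12.5,1,25,16777215,1398000000,0,abcd1234,567890'
time, mode, font_size, font_color, pub_time, pool, user_id, history = p.split(',')
# time 12.5 s into the video; mode 1 means a scrolling comment;
# font_size 25; color 16777215 == 0xFFFFFF, rendered as '#ffffff'.
print('#%06x' % int(font_color))   # prints '#ffffff'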
21 src/you_get/extractor/cbs.py Normal file
@ -0,0 +1,21 @@
#!/usr/bin/env python

__all__ = ['cbs_download']

from ..common import *

from .theplatform import theplatform_download_by_pid

def cbs_download(url, output_dir='.', merge=True, info_only=False):
    """Downloads CBS videos by URL.
    """

    html = get_content(url)
    pid = match1(html, r'video\.settings\.pid\s*=\s*\'([^\']+)\'')
    title = match1(html, r'video\.settings\.title\s*=\s*\"([^\"]+)\"')

    theplatform_download_by_pid(pid, title, output_dir=output_dir, merge=merge, info_only=info_only)

site_info = "CBS.com"
download = cbs_download
download_playlist = playlist_not_supported('cbs')
@ -7,22 +7,22 @@ from ..common import *
def dailymotion_download(url, output_dir = '.', merge = True, info_only = False):
    """Downloads Dailymotion videos by URL.
    """

    id = match1(url, r'/video/([^\?]+)')
    id = match1(url, r'/video/([^\?]+)') or match1(url, r'video=([^\?]+)')
    embed_url = 'http://www.dailymotion.com/embed/video/%s' % id
    html = get_content(embed_url)

    info = json.loads(match1(html, r'var\s*info\s*=\s*({.+}),\n'))

    title = info['title']

    for quality in ['stream_h264_hd1080_url', 'stream_h264_hd_url', 'stream_h264_hq_url', 'stream_h264_url', 'stream_h264_ld_url']:
        real_url = info[quality]
        if real_url:
            break

    type, ext, size = url_info(real_url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([real_url], title, ext, size, output_dir, merge = merge)
54 src/you_get/extractor/douban.py Normal file
@ -0,0 +1,54 @@
#!/usr/bin/env python

__all__ = ['douban_download']

import urllib.request, urllib.parse
from ..common import *

def douban_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)
    if 'subject' in url:
        titles = re.findall(r'data-title="([^"]*)">', html)
        song_id = re.findall(r'<li class="song-item" id="([^"]*)"', html)
        song_ssid = re.findall(r'data-ssid="([^"]*)"', html)
        get_song_url = 'http://music.douban.com/j/songlist/get_song_url'

        for i in range(len(titles)):
            title = titles[i]
            datas = {
                'sid': song_id[i],
                'ssid': song_ssid[i]
            }
            post_params = urllib.parse.urlencode(datas).encode('utf-8')
            try:
                resp = urllib.request.urlopen(get_song_url, post_params)
                resp_data = json.loads(resp.read().decode('utf-8'))
                real_url = resp_data['r']
                type, ext, size = url_info(real_url)
                print_info(site_info, title, type, size)
            except:
                pass

            if not info_only:
                try:
                    download_urls([real_url], title, ext, size, output_dir, merge = merge)
                except:
                    pass

    else:
        titles = re.findall(r'"name":"([^"]*)"', html)
        real_urls = [re.sub('\\\\/', '/', i) for i in re.findall(r'"rawUrl":"([^"]*)"', html)]

        for i in range(len(titles)):
            title = titles[i]
            real_url = real_urls[i]

            type, ext, size = url_info(real_url)

            print_info(site_info, title, type, size)
            if not info_only:
                download_urls([real_url], title, ext, size, output_dir, merge = merge)

site_info = "Douban.com"
download = douban_download
download_playlist = playlist_not_supported('douban')
@ -43,54 +43,54 @@ fmt_level = dict(
def google_download(url, output_dir = '.', merge = True, info_only = False):
    # Percent-encoding Unicode URL
    url = parse.quote(url, safe = ':/+%')

    service = url.split('/')[2].split('.')[0]

    if service == 'plus': # Google Plus

        if not re.search(r'plus.google.com/photos/[^/]*/albums/\d+/\d+', url):
            html = get_html(url)
            url = r1(r'"(https://plus.google.com/photos/\d+/albums/\d+/\d+)', html)
            url = "https://plus.google.com/" + r1(r'"(photos/\d+/albums/\d+/\d+)', html)
            title = r1(r'<title>([^<\n]+)', html)
        else:
            title = None

        html = get_html(url)
        real_urls = re.findall(r'\[(\d+),\d+,\d+,"([^"]+)"\]', html)
        real_url = unicodize(sorted(real_urls, key = lambda x : fmt_level[x[0]])[0][1])

        if title is None:
            post_url = r1(r'"(https://plus.google.com/\d+/posts/[^"]*)"', html)
            post_html = get_html(post_url)
            title = r1(r'<title>([^<\n]+)', post_html)
            title = r1(r'<title[^>]*>([^<\n]+)', post_html)

        if title is None:
            response = request.urlopen(request.Request(real_url))
            if response.headers['content-disposition']:
                filename = parse.unquote(r1(r'filename="?(.+)"?', response.headers['content-disposition'])).split('.')
                title = ''.join(filename[:-1])

        type, ext, size = url_info(real_url)
        if ext is None:
            ext = 'mp4'

    elif service in ['docs', 'drive'] : # Google Docs

        html = get_html(url)

        title = r1(r'"title":"([^"]*)"', html) or r1(r'<meta itemprop="name" content="([^"]*)"', html)
        if len(title.split('.')) > 1:
            title = ".".join(title.split('.')[:-1])

        docid = r1(r'"docid":"([^"]*)"', html)

        request.install_opener(request.build_opener(request.HTTPCookieProcessor()))

        request.urlopen(request.Request("https://docs.google.com/uc?id=%s&export=download" % docid))
        real_url ="https://docs.google.com/uc?export=download&confirm=no_antivirus&id=%s" % docid

        type, ext, size = url_info(real_url)

    print_info(site_info, title, ext, size)
    if not info_only:
        download_urls([real_url], title, ext, size, output_dir, merge = merge)
@ -6,13 +6,13 @@ from ..common import *

def instagram_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)

    id = r1(r'instagram.com/p/([^/]+)/', html)
    vid = r1(r'instagram.com/p/([^/]+)/', html)
    description = r1(r'<meta property="og:description" content="([^"]*)"', html)
    title = description + " [" + id + "]"
    title = description + " [" + vid + "]"
    url = r1(r'<meta property="og:video" content="([^"]*)"', html)
    type, ext, size = url_info(url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge = merge)
@ -6,20 +6,23 @@ from ..common import *

def iqiyi_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)

    videoId = r1(r'data-player-videoid="([^"]+)"', html)
    assert videoId

    info_url = 'http://cache.video.qiyi.com/v/%s' % videoId
    info_xml = get_html(info_url)

    tvid = r1(r'data-player-tvid="([^"]+)"', html)
    videoid = r1(r'data-player-videoid="([^"]+)"', html)
    assert tvid
    assert videoid

    info_url = 'http://cache.video.qiyi.com/vj/%s/%s/' % (tvid, videoid)
    info = get_html(info_url)
    raise NotImplementedError('iqiyi')

    from xml.dom.minidom import parseString
    doc = parseString(info_xml)
    title = doc.getElementsByTagName('title')[0].firstChild.nodeValue
    size = int(doc.getElementsByTagName('totalBytes')[0].firstChild.nodeValue)
    urls = [n.firstChild.nodeValue for n in doc.getElementsByTagName('file')]
    assert urls[0].endswith('.f4v'), urls[0]

    for i in range(len(urls)):
        temp_url = "http://data.video.qiyi.com/%s" % urls[i].split("/")[-1].split(".")[0] + ".ts"
        try:
@ -28,7 +31,7 @@ def iqiyi_download(url, output_dir = '.', merge = True, info_only = False):
            key = r1(r'key=(.*)', e.geturl())
        assert key
        urls[i] += "?key=%s" % key

    print_info(site_info, title, 'flv', size)
    if not info_only:
        download_urls(urls, title, 'flv', size, output_dir = output_dir, merge = merge)
23 src/you_get/extractor/jpopsuki.py Normal file
@ -0,0 +1,23 @@
#!/usr/bin/env python

__all__ = ['jpopsuki_download']

from ..common import *

def jpopsuki_download(url, output_dir='.', merge=True, info_only=False):
    html = get_html(url, faker=True)

    title = r1(r'<meta name="title" content="([^"]*)"', html)
    if title.endswith(' - JPopsuki TV'):
        title = title[:-14]

    url = "http://jpopsuki.tv%s" % r1(r'<source src="([^"]*)"', html)
    type, ext, size = url_info(url, faker=True)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge=merge, faker=True)

site_info = "JPopsuki.tv"
download = jpopsuki_download
download_playlist = playlist_not_supported('jpopsuki')
58 src/you_get/extractor/letv.py Normal file
@ -0,0 +1,58 @@
#!/usr/bin/env python

__all__ = ['letv_download']

import json
import random
import xml.etree.ElementTree as ET
from ..common import *

def get_timestamp():
    tn = random.random()
    url = 'http://api.letv.com/time?tn={}'.format(tn)
    result = get_content(url)
    return json.loads(result)['stime']

def get_key(t):
    for s in range(0, 8):
        e = 1 & t
        t >>= 1
        e <<= 31
        t += e
    return t ^ 185025305

def video_info(vid):
    tn = get_timestamp()
    key = get_key(tn)
    url = 'http://api.letv.com/mms/out/video/play?id={}&platid=1&splatid=101&format=1&tkey={}&domain=http%3A%2F%2Fwww.letv.com'.format(vid, key)
    r = get_content(url, decoded=False)
    xml_obj = ET.fromstring(r)
    info = json.loads(xml_obj.find("playurl").text)
    title = info.get('title')
    urls = info.get('dispatch')
    for k in urls.keys():
        url = urls[k][0]
        break
    url += '&termid=1&format=0&hwtype=un&ostype=Windows7&tag=letv&sign=letv&expect=1&pay=0&rateid={}'.format(k)
    return url, title

def letv_download_by_vid(vid, output_dir='.', merge=True, info_only=False):
    url, title = video_info(vid)
    _, _, size = url_info(url)
    ext = 'flv'
    print_info(site_info, title, ext, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir=output_dir, merge=merge)

def letv_download(url, output_dir='.', merge=True, info_only=False):
    if re.match(r'http://www.letv.com/ptv/vplay/(\d+).html', url):
        vid = match1(url, r'http://www.letv.com/ptv/vplay/(\d+).html')
    else:
        html = get_content(url)
        vid = match1(html, r'vid="(\d+)"')
    letv_download_by_vid(vid, output_dir=output_dir, merge=merge, info_only=info_only)

site_info = "letv.com"
download = letv_download
download_playlist = playlist_not_supported('letv')
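For reference, the get_key loop above moves the low bit of the server timestamp to bit 31 eight times over, so for a timestamp that fits in 32 bits (the stime returned by the time API is in Unix seconds, which does) it amounts to a 32-bit rotate-right by 8 followed by an XOR with the constant 185025305. A closed-form sketch under that 32-bit assumption:

def letv_tkey(stime):
    # Rotate the 32-bit timestamp right by 8 bits, then XOR with the
    # magic constant; equivalent to the 8-step loop in get_key above.
    stime &= 0xFFFFFFFF
    rotated = ((stime >> 8) | (stime << 24)) & 0xFFFFFFFF
    return rotated ^ 185025305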
23 src/you_get/extractor/magisto.py Normal file
@ -0,0 +1,23 @@
#!/usr/bin/env python

__all__ = ['magisto_download']

from ..common import *

def magisto_download(url, output_dir='.', merge=True, info_only=False):
    html = get_html(url)

    title1 = r1(r'<meta name="twitter:title" content="([^"]*)"', html)
    title2 = r1(r'<meta name="twitter:description" content="([^"]*)"', html)
    video_hash = r1(r'http://www.magisto.com/video/([^/]+)', url)
    title = "%s %s - %s" % (title1, title2, video_hash)
    url = r1(r'<source type="[^"]+" src="([^"]*)"', html)
    type, ext, size = url_info(url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge=merge)

site_info = "Magisto.com"
download = magisto_download
download_playlist = playlist_not_supported('magisto')
@ -4,21 +4,24 @@ __all__ = ['miomio_download']

from ..common import *

from .sina import sina_download_by_vid
from .tudou import tudou_download_by_id
from .youku import youku_download_by_id
from .youku import youku_download_by_vid

def miomio_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)

    title = r1(r'<meta name="description" content="([^"]*)"', html)
    flashvars = r1(r'flashvars="(type=[^"]*)"', html)

    t = r1(r'type=(\w+)', flashvars)
    id = r1(r'vid=([^"]+)', flashvars)
    if t == 'youku':
        youku_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
        youku_download_by_vid(id, title, output_dir=output_dir, merge=merge, info_only=info_only)
    elif t == 'tudou':
        tudou_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)
        tudou_download_by_id(id, title, output_dir=output_dir, merge=merge, info_only=info_only)
    elif t == 'sina':
        sina_download_by_vid(id, title, output_dir=output_dir, merge=merge, info_only=info_only)
    else:
        raise NotImplementedError(flashvars)
@ -7,9 +7,9 @@ from ..common import *
def mixcloud_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_html(url)
    title = r1(r'<meta property="og:title" content="([^"]*)"', html)
    preview_url = r1("data-preview-url=\"([^\"]+)\"", html)
    preview_url = r1("m-preview=\"([^\"]+)\"", html)

    url = re.sub(r'previews', r'cloudcasts/originals', preview_url)
    url = re.sub(r'previews', r'c/originals', preview_url)
    for i in range(10, 30):
        url = re.sub(r'stream[^.]*', r'stream' + str(i), url)

@ -22,7 +22,7 @@ def mixcloud_download(url, output_dir = '.', merge = True, info_only = False):
    try:
        type
    except:
        url = re.sub('cloudcasts/originals', r'cloudcasts/m4a/64', url)
        url = re.sub('c/originals', r'c/m4a/64', url)
        url = re.sub('.mp3', '.m4a', url)
        for i in range(10, 30):
            url = re.sub(r'stream[^.]*', r'stream' + str(i), url)
@ -6,8 +6,9 @@ from ..common import *

def netease_download(url, output_dir = '.', merge = True, info_only = False):
    html = get_decoded_html(url)

    title = r1('movieDescription=\'([^\']+)\'', html) or r1('<title>(.+)</title>', html)

    if title[0] == ' ':
        title = title[1:]

@ -27,7 +28,7 @@ def netease_download(url, output_dir = '.', merge = True, info_only = False):
        ext = 'flv'

    else:
        url = r1(r'["\'](.+)-list.m3u8["\']', html) + ".mp4"
        url = (r1(r'["\'](.+)-list.m3u8["\']', html) or r1(r'["\'](.+).m3u8["\']', html)) + ".mp4"
        _, _, size = url_info(url)
        ext = 'mp4'
@ -6,12 +6,17 @@ from ..common import *

def nicovideo_login(user, password):
    data = "current_form=login&mail=" + user +"&password=" + password + "&login_submit=Log+In"
    response = request.urlopen(request.Request("https://secure.nicovideo.jp/secure/login?site=niconico", headers = fake_headers, data = data.encode('utf-8')))
    response = request.urlopen(request.Request("https://secure.nicovideo.jp/secure/login?site=niconico", headers=fake_headers, data=data.encode('utf-8')))
    return response.headers

def nicovideo_download(url, output_dir = '.', merge = True, info_only = False):
    request.install_opener(request.build_opener(request.HTTPCookieProcessor()))

def nicovideo_download(url, output_dir='.', merge=True, info_only=False):
    import ssl
    ssl_context = request.HTTPSHandler(
        context=ssl.SSLContext(ssl.PROTOCOL_TLSv1))
    cookie_handler = request.HTTPCookieProcessor()
    opener = request.build_opener(ssl_context, cookie_handler)
    request.install_opener(opener)

    import netrc, getpass
    info = netrc.netrc().authenticators('nicovideo')
    if info is None:
@ -21,15 +26,15 @@ def nicovideo_download(url, output_dir = '.', merge = True, info_only = False):
    user, password = info[0], info[2]
    print("Logging in...")
    nicovideo_login(user, password)

    html = get_html(url) # necessary!
    title = unicodize(r1(r'<span class="videoHeaderTitle">([^<]+)</span>', html))

    api_html = get_html('http://www.nicovideo.jp/api/getflv?v=%s' % url.split('/')[-1])
    real_url = parse.unquote(r1(r'url=([^&]+)&', api_html))

    type, ext, size = url_info(real_url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([real_url], title, ext, size, output_dir, merge = merge)
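For reference, the nicovideo change above pins the login connection to TLSv1 by installing an HTTPSHandler with an explicit SSLContext next to the cookie processor, so the login cookie survives into the later getflv call. The same pattern in isolation:

import ssl
from urllib import request

# Build an opener that speaks TLSv1 and keeps cookies across requests;
# install it globally so subsequent urlopen() calls reuse the session.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLSv1)
opener = request.build_opener(request.HTTPSHandler(context=ctx),
                              request.HTTPCookieProcessor())
request.install_opener(opener)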
@ -14,7 +14,7 @@ def pptv_download_by_id(id, title = None, output_dir = '.', merge = True, info_o
    key = r1(r'<key expire=[^<>]+>([^<>]+)</key>', xml)
    rid = r1(r'rid="([^"]+)"', xml)
    title = r1(r'nm="([^"]+)"', xml)
    pieces = re.findall('<sgm no="(\d+)".*fs="(\d+)"', xml)
    pieces = re.findall('<sgm no="(\d+)"[^<>]+fs="(\d+)"', xml)
    numbers, fs = zip(*pieces)
    urls = ['http://%s/%s/%s?k=%s' % (host, i, rid, key) for i in numbers]
    total_size = sum(map(int, fs))
70 src/you_get/extractor/qq.py Normal file
@ -0,0 +1,70 @@
#!/usr/bin/env python

__all__ = ['qq_download']

from ..common import *

def qq_download_by_id(id, title=None, output_dir='.', merge=True, info_only=False):
    xml = get_html('http://www.acfun.com/getinfo?vids=%s' % id)
    from xml.dom.minidom import parseString
    doc = parseString(xml)
    doc_root = doc.getElementsByTagName('root')[0]
    doc_vl = doc_root.getElementsByTagName('vl')[0]
    doc_vi = doc_vl.getElementsByTagName('vi')[0]
    fn = doc_vi.getElementsByTagName('fn')[0].firstChild.data
    fclip = doc_vi.getElementsByTagName('fclip')[0].firstChild.data
    if int(fclip) > 0:
        fn = fn[:-4] + "." + fclip + fn[-4:]
    fvkey = doc_vi.getElementsByTagName('fvkey')[0].firstChild.data
    doc_ul = doc_vi.getElementsByTagName('ul')
    url = doc_ul[0].getElementsByTagName('url')[0].firstChild.data
    url = url + fn + '?vkey=' + fvkey

    _, ext, size = url_info(url)

    print_info(site_info, title, ext, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir=output_dir, merge=merge)

def qq_download(url, output_dir = '.', merge = True, info_only = False):
    if re.match(r'http://v.qq.com/([^\?]+)\?vid', url):
        aid = r1(r'(.*)\.html', url)
        vid = r1(r'http://v.qq.com/[^\?]+\?vid=(\w+)', url)
        url = 'http://sns.video.qq.com/tvideo/fcgi-bin/video?vid=%s' % vid

    if re.match(r'http://y.qq.com/([^\?]+)\?vid', url):
        vid = r1(r'http://y.qq.com/[^\?]+\?vid=(\w+)', url)

        url = "http://v.qq.com/page/%s.html" % vid

        r_url = r1(r'<meta http-equiv="refresh" content="0;url=([^"]*)', get_html(url))
        if r_url:
            aid = r1(r'(.*)\.html', r_url)
            url = "%s/%s.html" % (aid, vid)

    if re.match(r'http://static.video.qq.com/.*vid=', url):
        vid = r1(r'http://static.video.qq.com/.*vid=(\w+)', url)
        url = "http://v.qq.com/page/%s.html" % vid

    if re.match(r'http://v.qq.com/cover/.*\.html', url):
        html = get_html(url)
        vid = r1(r'vid:"([^"]+)"', html)
        url = 'http://sns.video.qq.com/tvideo/fcgi-bin/video?vid=%s' % vid

    html = get_html(url)

    title = match1(html, r'<title>(.+?)</title>', r'title:"([^"]+)"')[0].strip()
    assert title
    title = unescape_html(title)
    title = escape_file_path(title)

    try:
        id = vid
    except:
        id = r1(r'vid:"([^"]+)"', html)

    qq_download_by_id(id, title, output_dir = output_dir, merge = merge, info_only = info_only)

site_info = "QQ.com"
download = qq_download
download_playlist = playlist_not_supported('qq')
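For reference, the fclip handling in qq_download_by_id above covers long videos that the server splits into numbered clips: when fclip is nonzero, the clip index is spliced into the filename before its extension. A worked example with a made-up filename:

fn, fclip = 'h00146p8hzp.mp4', '1'
if int(fclip) > 0:
    fn = fn[:-4] + "." + fclip + fn[-4:]
# fn is now 'h00146p8hzp.1.mp4'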
@ -4,8 +4,19 @@ __all__ = ['sina_download', 'sina_download_by_vid', 'sina_download_by_vkey']

from ..common import *

def video_info(id):
    xml = get_content('http://v.iask.com/v_play.php?vid=%s' % id, decoded=True)
from hashlib import md5
from random import randint
from time import time

def get_k(vid, rand):
    t = str(int('{0:b}'.format(int(time()))[:-6], 2))
    return md5((vid + 'Z6prk18aWxP278cVAH' + t + rand).encode('utf-8')).hexdigest()[:16] + t

def video_info(vid):
    rand = "0.{0}{1}".format(randint(10000, 10000000), randint(10000, 10000000))
    url = 'http://v.iask.com/v_play.php?vid={0}&ran={1}&p=i&k={2}'.format(vid, rand, get_k(vid, rand))
    xml = get_content(url, headers=fake_headers, decoded=True)

    urls = re.findall(r'<url>(?:<!\[CDATA\[)?(.*?)(?:\]\]>)?</url>', xml)
    name = match1(xml, r'<vname>(?:<!\[CDATA\[)?(.+?)(?:\]\]>)?</vname>')
    vstr = match1(xml, r'<vstr>(?:<!\[CDATA\[)?(.+?)(?:\]\]>)?</vstr>')
@ -15,7 +26,7 @@ def sina_download_by_vid(vid, title=None, output_dir='.', merge=True, info_only=
    """Downloads a Sina video by its unique vid.
    http://video.sina.com.cn/
    """

    urls, name, vstr = video_info(vid)
    title = title or name
    assert title
@ -23,7 +34,7 @@ def sina_download_by_vid(vid, title=None, output_dir='.', merge=True, info_only=
    for url in urls:
        _, _, temp = url_info(url)
        size += temp

    print_info(site_info, title, 'flv', size)
    if not info_only:
        download_urls(urls, title, 'flv', size, output_dir = output_dir, merge = merge)
@ -32,10 +43,10 @@ def sina_download_by_vkey(vkey, title=None, output_dir='.', merge=True, info_onl
    """Downloads a Sina video by its unique vkey.
    http://video.sina.com/
    """

    url = 'http://video.sina.com/v/flvideo/%s_0.flv' % vkey
    type, ext, size = url_info(url)

    print_info(site_info, title, 'flv', size)
    if not info_only:
        download_urls([url], title, 'flv', size, output_dir = output_dir, merge = merge)
@ -43,7 +54,7 @@ def sina_download_by_vkey(vkey, title=None, output_dir='.', merge=True, info_onl
def sina_download(url, output_dir='.', merge=True, info_only=False):
    """Downloads Sina videos by URL.
    """

    vid = match1(url, r'vid=(\d+)')
    if vid is None:
        video_page = get_content(url)
@ -51,9 +62,10 @@ def sina_download(url, output_dir='.', merge=True, info_only=False):
        if hd_vid == '0':
            vids = match1(video_page, r'[^\w]vid\s*:\s*\'([^\']+)\'').split('|')
            vid = vids[-1]

    if vid:
        sina_download_by_vid(vid, output_dir=output_dir, merge=merge, info_only=info_only)
        title = match1(video_page, r'title\s*:\s*\'([^\']+)\'')
        sina_download_by_vid(vid, title=title, output_dir=output_dir, merge=merge, info_only=info_only)
    else:
        vkey = match1(video_page, r'vkey\s*:\s*"([^"]+)"')
        title = match1(video_page, r'title\s*:\s*"([^"]+)"')
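For reference, the new get_k above derives the ?k= checksum that v_play.php now requires: the Unix time has its low 6 bits dropped (the binary-string slice is just an integer shift by 6, so the key stays valid for a ~64-second window), and the key is the first 16 hex digits of md5(vid + salt + t + rand) with t appended. A commented re-derivation of the same computation:

from hashlib import md5
from time import time

def sina_k(vid, rand):
    # time() >> 6 == int('{0:b}'.format(int(time()))[:-6], 2):
    # dropping the last 6 binary digits divides by 64.
    t = str(int(time()) >> 6)
    digest = md5((vid + 'Z6prk18aWxP278cVAH' + t + rand).encode('utf-8')).hexdigest()
    return digest[:16] + t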
@ -12,9 +12,22 @@ def real_url(host, prot, file, new):
    return '%s%s?key=%s' % (start[:-1], new, key)

def sohu_download(url, output_dir = '.', merge = True, info_only = False):
    vid = r1('vid\s*=\s*"(\d+)"', get_html(url))

    if vid:
    if re.match(r'http://share.vrs.sohu.com', url):
        vid = r1('id=(\d+)', url)
    else:
        html = get_html(url)
        vid = r1(r'\Wvid\s*[\:=]\s*[\'"]?(\d+)[\'"]?', html)
    assert vid

    # Open Sogou proxy if required
    if get_sogou_proxy() is not None:
        server = sogou_proxy_server(get_sogou_proxy(), ostream=open(os.devnull, 'w'))
        server_thread = threading.Thread(target=server.serve_forever)
        server_thread.daemon = True
        server_thread.start()
        set_proxy(server.server_address)

    if re.match(r'http://tv.sohu.com/', url):
        data = json.loads(get_decoded_html('http://hot.vrs.sohu.com/vrs_flash.action?vid=%s' % vid))
        for qtyp in ["oriVid","superVid","highVid" ,"norVid","relativeId"]:
            hqvid = data['data'][qtyp]
@ -31,10 +44,9 @@ def sohu_download(url, output_dir = '.', merge = True, info_only = False):
        for file, new in zip(data['clipsURL'], data['su']):
            urls.append(real_url(host, prot, file, new))
        assert data['clipsURL'][0].endswith('.mp4')

    else:
        vid = r1('vid\s*=\s*\'(\d+)\'', get_html(url))
        data = json.loads(get_decoded_html('http://my.tv.sohu.com/videinfo.jhtml?m=viewnew&vid=%s' % vid))
        data = json.loads(get_decoded_html('http://my.tv.sohu.com/play/videonew.do?vid=%s&referer=http://my.tv.sohu.com' % vid))
        host = data['allot']
        prot = data['prot']
        urls = []
@ -45,7 +57,12 @@ def sohu_download(url, output_dir = '.', merge = True, info_only = False):
        for file, new in zip(data['clipsURL'], data['su']):
            urls.append(real_url(host, prot, file, new))
        assert data['clipsURL'][0].endswith('.mp4')

    # Close Sogou proxy if required
    if get_sogou_proxy() is not None:
        server.shutdown()
        unset_proxy()

    print_info(site_info, title, 'mp4', size)
    if not info_only:
        download_urls(urls, title, 'mp4', size, output_dir, refer = url, merge = merge)
24
src/you_get/extractor/ted.py
Normal file
@ -0,0 +1,24 @@
#!/usr/bin/env python

__all__ = ['ted_download']

from ..common import *
import json

def ted_download(url, output_dir='.', merge=True, info_only=False):
    html = get_html(url)
    metadata = json.loads(match1(html, r'({"talks"(.*)})\)'))
    title = metadata['talks'][0]['title']
    nativeDownloads = metadata['talks'][0]['nativeDownloads']
    for quality in ['high', 'medium', 'low']:
        if quality in nativeDownloads:
            url = nativeDownloads[quality]
            type, ext, size = url_info(url)
            print_info(site_info, title, type, size)
            if not info_only:
                download_urls([url], title, ext, size, output_dir, merge=merge)
            break

site_info = "TED.com"
download = ted_download
download_playlist = playlist_not_supported('ted')
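A hedged usage sketch, not part of this commit: the new extractor can be driven directly from Python, assuming src/ is on sys.path; the talk URL is the one exercised by tests/test.py further down.

    from you_get.extractor.ted import ted_download

    # info_only=True prints the format and size without downloading
    ted_download("http://www.ted.com/talks/jennifer_lin_improvs_piano_magic.html",
                 info_only=True)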
24
src/you_get/extractor/theplatform.py
Normal file
@ -0,0 +1,24 @@
#!/usr/bin/env python

from ..common import *

def theplatform_download_by_pid(pid, title, output_dir='.', merge=True, info_only=False):
    smil_url = "http://link.theplatform.com/s/dJ5BDC/%s/meta.smil?format=smil&mbr=true" % pid
    smil = get_content(smil_url)
    smil_base = unescape_html(match1(smil, r'<meta base="([^"]+)"'))
    smil_videos = {y:x for x,y in dict(re.findall(r'<video src="([^"]+)".+height="([^"]+)"', smil)).items()}
    for height in ['1080', '720', '480', '360', '240', '216']:
        if height in smil_videos:
            smil_video = smil_videos[height]
            break
    assert smil_video

    type, ext, size = 'mp4', 'mp4', 0

    print_info(site_info, title, type, size)
    if not info_only:
        download_rtmp_url(url=smil_base, playpath=ext+':'+smil_video, title=title, ext=ext, output_dir=output_dir)

site_info = "thePlatform.com"
download = theplatform_download_by_pid
download_playlist = playlist_not_supported('theplatform')
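The height-to-source mapping above inverts a src-to-height dict so the quality loop can index by height. A self-contained illustration of that comprehension on a fabricated SMIL snippet (real SMIL from link.theplatform.com is more verbose):

    import re

    smil = '''<video src="videos/a_480.mp4" width="854" height="480"/>
    <video src="videos/a_720.mp4" width="1280" height="720"/>'''
    videos = {y: x for x, y in dict(re.findall(r'<video src="([^"]+)".+height="([^"]+)"', smil)).items()}
    for height in ['1080', '720', '480']:
        if height in videos:
            print(height, '->', videos[height])  # 720 -> videos/a_720.mp4
            break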
@ -8,7 +8,7 @@ def tudou_download_by_iid(iid, title, output_dir = '.', merge = True, info_only
    data = json.loads(get_decoded_html('http://www.tudou.com/outplay/goto/getItemSegs.action?iid=%s' % iid))
    vids = []
    for k in data:
        if len(data[k]) == 1:
        if len(data[k]) > 0:
            vids.append({"k": data[k][0]["k"], "size": data[k][0]["size"]})

    temp = max(vids, key=lambda x:x["size"])
@ -25,10 +25,10 @@ def tudou_download_by_iid(iid, title, output_dir = '.', merge = True, info_only
    if not info_only:
        download_urls([url], title, ext, size, output_dir = output_dir, merge = merge)

def tudou_download_by_id(id, title, output_dir = '.', merge = True, info_only = False):
    html = get_html('http://www.tudou.com/programs/view/%s/' % id)

    iid = r1(r'iid\s*[:=]\s*(\S+)', html)
    title = r1(r'kw\s*[:=]\s*[\'\"]([^\']+?)[\'\"]', html)
    tudou_download_by_iid(iid, title, output_dir = output_dir, merge = merge, info_only = info_only)

@ -37,22 +37,22 @@ def tudou_download(url, output_dir = '.', merge = True, info_only = False):
    id = r1(r'http://www.tudou.com/v/([^/]+)/', url)
    if id:
        return tudou_download_by_id(id, title="", info_only=info_only)

    html = get_decoded_html(url)

    title = r1(r'kw\s*[:=]\s*[\'\"]([^\']+?)[\'\"]', html)
    assert title
    title = unescape_html(title)

    vcode = r1(r'vcode\s*[:=]\s*\'([^\']+)\'', html)
    if vcode:
        from .youku import youku_download_by_id
        return youku_download_by_id(vcode, title, output_dir = output_dir, merge = merge, info_only = info_only)
        from .youku import youku_download_by_vid
        return youku_download_by_vid(vcode, title, output_dir = output_dir, merge = merge, info_only = info_only)

    iid = r1(r'iid\s*[:=]\s*(\d+)', html)
    if not iid:
        return tudou_download_playlist(url, output_dir, merge, info_only)

    tudou_download_by_iid(iid, title, output_dir = output_dir, merge = merge, info_only = info_only)

def parse_playlist(url):
@ -81,4 +81,4 @@ def tudou_download_playlist(url, output_dir = '.', merge = True, info_only = Fal

site_info = "Tudou.com"
download = tudou_download
download_playlist = tudou_download_playlist
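The segment pick above keeps whichever variant reports the largest size. A tiny illustration of that max-by-key idiom on fabricated data:

    vids = [{'k': 'f1', 'size': 1000}, {'k': 'f2', 'size': 3000}]
    temp = max(vids, key=lambda x: x['size'])
    print(temp)  # {'k': 'f2', 'size': 3000}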
@ -5,19 +5,16 @@ __all__ = ['vimeo_download', 'vimeo_download_by_id']

from ..common import *

def vimeo_download_by_id(id, title = None, output_dir = '.', merge = True, info_only = False):
    html = get_html('http://vimeo.com/%s' % id, faker = True)
    video_page = get_content('http://player.vimeo.com/video/%s' % id, headers=fake_headers)
    title = r1(r'<title>([^<]+)</title>', video_page)
    info = dict(re.findall(r'"([^"]+)":\{[^{]+"url":"([^"]+)"', video_page))
    for quality in ['hd', 'sd', 'mobile']:
        if quality in info:
            url = info[quality]
            break
    assert url

    signature = r1(r'"signature":"([^"]+)"', html)
    timestamp = r1(r'"timestamp":([^,]+)', html)
    hd = r1(r',"hd":(\d+),', html)

    title = r1(r'"title":"([^"]+)"', html)
    title = escape_file_path(title)

    url = 'http://player.vimeo.com/play_redirect?clip_id=%s&sig=%s&time=%s' % (id, signature, timestamp)
    if hd == "1":
        url += '&quality=hd'
    type, ext, size = url_info(url, faker = True)
    type, ext, size = url_info(url, faker=True)

    print_info(site_info, title, type, size)
    if not info_only:
25
src/you_get/extractor/vine.py
Normal file
@ -0,0 +1,25 @@
#!/usr/bin/env python

__all__ = ['vine_download']

from ..common import *

def vine_download(url, output_dir='.', merge=True, info_only=False):
    html = get_html(url)

    vid = r1(r'vine.co/v/([^/]+)/', html)
    title1 = r1(r'<meta property="twitter:title" content="([^"]*)"', html)
    title2 = r1(r'<meta property="twitter:description" content="([^"]*)"', html)
    title = "%s - %s" % (title1, title2) + " [" + vid + "]"
    url = r1(r'<source src="([^"]*)"', html) or r1(r'<meta itemprop="contentURL" content="([^"]*)"', html)
    if url[0:2] == "//":
        url = "http:" + url
    type, ext, size = url_info(url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge = merge)

site_info = "Vine.co"
download = vine_download
download_playlist = playlist_not_supported('vine')
25
src/you_get/extractor/vk.py
Normal file
@ -0,0 +1,25 @@
#!/usr/bin/env python

__all__ = ['vk_download']

from ..common import *

def vk_download(url, output_dir='.', merge=True, info_only=False):
    video_page = get_content(url)
    title = unescape_html(r1(r'"title":"([^"]+)"', video_page))
    info = dict(re.findall(r'\\"url(\d+)\\":\\"([^"]+)\\"', video_page))
    for quality in ['1080', '720', '480', '360', '240']:
        if quality in info:
            url = re.sub(r'\\\\\\/', r'/', info[quality])
            break
    assert url

    type, ext, size = url_info(url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge=merge)

site_info = "VK.com"
download = vk_download
download_playlist = playlist_not_supported('vk')
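VK pages embed the stream URLs inside a backslash-escaped JavaScript string, which is why the extractor matches the escaped "urlNNN" keys literally and then strips the escaped-slash sequences. A self-contained illustration on a fabricated page fragment:

    import re

    fragment = r'\"url720\":\"http:\\/\\/cs1.vk.me\\/video.mp4\"'
    info = dict(re.findall(r'\\"url(\d+)\\":\\"([^"]+)\\"', fragment))
    print(re.sub(r'\\\\\\/', r'/', info['720']))  # http://cs1.vk.me/video.mp4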
@ -29,8 +29,9 @@ def location_dec(str):

def xiami_download_lyric(lrc_url, file_name, output_dir):
    lrc = get_html(lrc_url, faker = True)
    filename = get_filename(file_name)
    if len(lrc) > 0:
        with open(output_dir + "/" + file_name.replace('/', '-').replace('?', '-') + '.lrc', 'w', encoding='utf-8') as x:
        with open(output_dir + "/" + filename + '.lrc', 'w', encoding='utf-8') as x:
            x.write(lrc)

def xiami_download_pic(pic_url, file_name, output_dir):
@ -50,7 +51,10 @@ def xiami_download_song(sid, output_dir = '.', merge = True, info_only = False):
        album_name = i.getElementsByTagName("album_name")[0].firstChild.nodeValue
        song_title = i.getElementsByTagName("title")[0].firstChild.nodeValue
        url = location_dec(i.getElementsByTagName("location")[0].firstChild.nodeValue)
        lrc_url = i.getElementsByTagName("lyric")[0].firstChild.nodeValue
        try:
            lrc_url = i.getElementsByTagName("lyric")[0].firstChild.nodeValue
        except:
            pass
        type, ext, size = url_info(url, faker = True)
        if not ext:
            ext = 'mp3'
@ -78,7 +82,10 @@ def xiami_download_showcollect(cid, output_dir = '.', merge = True, info_only =
        album_name = i.getElementsByTagName("album_name")[0].firstChild.nodeValue
        song_title = i.getElementsByTagName("title")[0].firstChild.nodeValue
        url = location_dec(i.getElementsByTagName("location")[0].firstChild.nodeValue)
        lrc_url = i.getElementsByTagName("lyric")[0].firstChild.nodeValue
        try:
            lrc_url = i.getElementsByTagName("lyric")[0].firstChild.nodeValue
        except:
            pass
        type, ext, size = url_info(url, faker = True)
        if not ext:
            ext = 'mp3'
@ -107,7 +114,10 @@ def xiami_download_album(aid, output_dir = '.', merge = True, info_only = False)
    for i in tracks:
        song_title = i.getElementsByTagName("title")[0].firstChild.nodeValue
        url = location_dec(i.getElementsByTagName("location")[0].firstChild.nodeValue)
        lrc_url = i.getElementsByTagName("lyric")[0].firstChild.nodeValue
        try:
            lrc_url = i.getElementsByTagName("lyric")[0].firstChild.nodeValue
        except:
            pass
        if not pic_exist:
            pic_url = i.getElementsByTagName("pic")[0].firstChild.nodeValue
        type, ext, size = url_info(url, faker = True)
74
src/you_get/extractor/youku.py
Normal file
@ -0,0 +1,74 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-

from ..common import *

class Youku(VideoExtractor):
    name = "优酷 (Youku)"

    stream_types = [
        {'id': 'hd3', 'container': 'flv', 'video_profile': '1080P'},
        {'id': 'hd2', 'container': 'flv', 'video_profile': '超清'},
        {'id': 'mp4', 'container': 'mp4', 'video_profile': '高清'},
        {'id': 'flv', 'container': 'flv', 'video_profile': '标清'},
        {'id': '3gphd', 'container': '3gp', 'video_profile': '高清(3GP)'},
    ]

    def __init__(self, *args):
        super().__init__(args)

    def get_vid_from_url(url):
        """Extracts video ID from URL.
        """
        patterns = [
            'youku.com/v_show/id_([\w=]+)',
            'player.youku.com/player.php/sid/([\w=]+)/v.swf',
            'loader\.swf\?VideoIDS=([\w=]+)',
        ]
        matches = match1(url, *patterns)
        if matches:
            return matches[0]
        else:
            return None

    def parse_m3u8(m3u8):
        return re.findall(r'(http://[^?]+)\?ts_start=0', m3u8)

    def prepare(self, **kwargs):
        assert self.url or self.vid
        if self.url and not self.vid:
            self.vid = __class__.get_vid_from_url(self.url)

        meta = json.loads(get_html('http://v.youku.com/player/getPlayList/VideoIDS/%s' % self.vid))
        metadata0 = meta['data'][0]

        self.title = metadata0['title']

        for stream_type in self.stream_types:
            if stream_type['id'] in metadata0['streamsizes']:
                stream_id = stream_type['id']
                stream_size = int(metadata0['streamsizes'][stream_id])
                self.streams[stream_id] = {'container': stream_type['container'], 'video_profile': stream_type['video_profile'], 'size': stream_size}

    def extract(self, **kwargs):
        if 'stream_id' in kwargs and kwargs['stream_id']:
            # Extract the stream
            stream_id = kwargs['stream_id']
        else:
            # Extract stream with the best quality
            stream_id = self.streams_sorted[0]['id']

        m3u8_url = "http://v.youku.com/player/getM3U8/vid/{vid}/type/{stream_id}/video.m3u8".format(vid=self.vid, stream_id=stream_id)
        m3u8 = get_html(m3u8_url)
        if not m3u8:
            log.w('[Warning] This video can only be streamed within Mainland China!')
            log.w('Use \'-y\' to specify a proxy server for extracting stream data.\n')

        self.streams[stream_id]['src'] = __class__.parse_m3u8(m3u8)

site = Youku()
download = site.download_by_url
download_playlist = playlist_not_supported('youku')

youku_download_by_vid = site.download_by_vid
# Used by: acfun.py bilibili.py miomio.py tudou.py
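A hedged usage sketch of the new class-based extractor, assuming the VideoExtractor base exposes download_by_url the way tudou.py above relies on youku_download_by_vid; the video id is a made-up placeholder:

    from you_get.extractor.youku import Youku, site

    # Pure helper; no network round trip needed:
    print(Youku.get_vid_from_url('http://v.youku.com/v_show/id_XMTIzNDU2Nzg5.html'))
    # -> 'XMTIzNDU2Nzg5' (placeholder id)

    # Probe the available streams without downloading:
    site.download_by_url('http://v.youku.com/v_show/id_XMTIzNDU2Nzg5.html', info_only=True)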
@ -5,36 +5,37 @@ __all__ = ['youtube_download', 'youtube_download_by_id']

from ..common import *

# YouTube media encoding options, in descending quality order.
# taken from http://en.wikipedia.org/wiki/YouTube#Quality_and_codecs, 3/22/2013.
# taken from http://en.wikipedia.org/wiki/YouTube#Quality_and_codecs, 2/14/2014.
yt_codecs = [
    {'itag': 38, 'container': 'MP4', 'video_resolution': '3072p', 'video_encoding': 'H.264', 'video_profile': 'High', 'video_bitrate': '3.5-5', 'audio_encoding': 'AAC', 'audio_bitrate': '192'},
    #{'itag': 85, 'container': 'MP4', 'video_resolution': '1080p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '3-4', 'audio_encoding': 'AAC', 'audio_bitrate': '192'},
    {'itag': 46, 'container': 'WebM', 'video_resolution': '1080p', 'video_encoding': 'VP8', 'video_profile': '', 'video_bitrate': '', 'audio_encoding': 'Vorbis', 'audio_bitrate': '192'},
    {'itag': 37, 'container': 'MP4', 'video_resolution': '1080p', 'video_encoding': 'H.264', 'video_profile': 'High', 'video_bitrate': '3-4.3', 'audio_encoding': 'AAC', 'audio_bitrate': '192'},
    {'itag': 102, 'container': '', 'video_resolution': '', 'video_encoding': 'VP8', 'video_profile': '', 'video_bitrate': '2', 'audio_encoding': 'Vorbis', 'audio_bitrate': '192'},
    {'itag': 45, 'container': 'WebM', 'video_resolution': '720p', 'video_encoding': '', 'video_profile': '', 'video_bitrate': '', 'audio_encoding': '', 'audio_bitrate': ''},
    {'itag': 22, 'container': 'MP4', 'video_resolution': '720p', 'video_encoding': 'H.264', 'video_profile': 'High', 'video_bitrate': '2-2.9', 'audio_encoding': 'AAC', 'audio_bitrate': '192'},
    {'itag': 84, 'container': 'MP4', 'video_resolution': '720p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '2-2.9', 'audio_encoding': 'AAC', 'audio_bitrate': '152'},
    {'itag': 120, 'container': 'FLV', 'video_resolution': '720p', 'video_encoding': 'AVC', 'video_profile': 'Main@L3.1', 'video_bitrate': '2', 'audio_encoding': 'AAC', 'audio_bitrate': '128'},
    {'itag': 85, 'container': 'MP4', 'video_resolution': '520p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '2-2.9', 'audio_encoding': 'AAC', 'audio_bitrate': '152'},
    #{'itag': 102, 'container': 'WebM', 'video_resolution': '720p', 'video_encoding': 'VP8', 'video_profile': '3D', 'video_bitrate': '', 'audio_encoding': 'Vorbis', 'audio_bitrate': '192'},
    {'itag': 45, 'container': 'WebM', 'video_resolution': '720p', 'video_encoding': 'VP8', 'video_profile': '', 'video_bitrate': '2', 'audio_encoding': 'Vorbis', 'audio_bitrate': '192'},
    #{'itag': 84, 'container': 'MP4', 'video_resolution': '720p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '2-3', 'audio_encoding': 'AAC', 'audio_bitrate': '192'},
    {'itag': 22, 'container': 'MP4', 'video_resolution': '720p', 'video_encoding': 'H.264', 'video_profile': 'High', 'video_bitrate': '2-3', 'audio_encoding': 'AAC', 'audio_bitrate': '192'},
    {'itag': 120, 'container': 'FLV', 'video_resolution': '720p', 'video_encoding': 'H.264', 'video_profile': 'Main@L3.1', 'video_bitrate': '2', 'audio_encoding': 'AAC', 'audio_bitrate': '128'},
    {'itag': 44, 'container': 'WebM', 'video_resolution': '480p', 'video_encoding': 'VP8', 'video_profile': '', 'video_bitrate': '1', 'audio_encoding': 'Vorbis', 'audio_bitrate': '128'},
    {'itag': 35, 'container': 'FLV', 'video_resolution': '480p', 'video_encoding': 'H.264', 'video_profile': 'Main', 'video_bitrate': '0.8-1', 'audio_encoding': 'AAC', 'audio_bitrate': '128'},
    {'itag': 101, 'container': 'WebM', 'video_resolution': '360p', 'video_encoding': 'VP8', 'video_profile': '3D', 'video_bitrate': '', 'audio_encoding': 'Vorbis', 'audio_bitrate': '192'},
    {'itag': 100, 'container': 'WebM', 'video_resolution': '360p', 'video_encoding': 'VP8', 'video_profile': '3D', 'video_bitrate': '', 'audio_encoding': 'Vorbis', 'audio_bitrate': '128'},
    #{'itag': 101, 'container': 'WebM', 'video_resolution': '360p', 'video_encoding': 'VP8', 'video_profile': '3D', 'video_bitrate': '', 'audio_encoding': 'Vorbis', 'audio_bitrate': '192'},
    #{'itag': 100, 'container': 'WebM', 'video_resolution': '360p', 'video_encoding': 'VP8', 'video_profile': '3D', 'video_bitrate': '', 'audio_encoding': 'Vorbis', 'audio_bitrate': '128'},
    {'itag': 43, 'container': 'WebM', 'video_resolution': '360p', 'video_encoding': 'VP8', 'video_profile': '', 'video_bitrate': '0.5', 'audio_encoding': 'Vorbis', 'audio_bitrate': '128'},
    {'itag': 34, 'container': 'FLV', 'video_resolution': '360p', 'video_encoding': 'H.264', 'video_profile': 'Main', 'video_bitrate': '0.5', 'audio_encoding': 'AAC', 'audio_bitrate': '128'},
    {'itag': 82, 'container': 'MP4', 'video_resolution': '360p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '0.5', 'audio_encoding': 'AAC', 'audio_bitrate': '96'},
    #{'itag': 82, 'container': 'MP4', 'video_resolution': '360p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '0.5', 'audio_encoding': 'AAC', 'audio_bitrate': '96'},
    {'itag': 18, 'container': 'MP4', 'video_resolution': '270p/360p', 'video_encoding': 'H.264', 'video_profile': 'Baseline', 'video_bitrate': '0.5', 'audio_encoding': 'AAC', 'audio_bitrate': '96'},
    {'itag': 6, 'container': 'FLV', 'video_resolution': '270p', 'video_encoding': 'Sorenson H.263', 'video_profile': '', 'video_bitrate': '0.8', 'audio_encoding': 'MP3', 'audio_bitrate': '64'},
    {'itag': 83, 'container': 'MP4', 'video_resolution': '240p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '0.5', 'audio_encoding': 'AAC', 'audio_bitrate': '96'},
    #{'itag': 83, 'container': 'MP4', 'video_resolution': '240p', 'video_encoding': 'H.264', 'video_profile': '3D', 'video_bitrate': '0.5', 'audio_encoding': 'AAC', 'audio_bitrate': '96'},
    {'itag': 13, 'container': '3GP', 'video_resolution': '', 'video_encoding': 'MPEG-4 Visual', 'video_profile': '', 'video_bitrate': '0.5', 'audio_encoding': 'AAC', 'audio_bitrate': ''},
    {'itag': 5, 'container': 'FLV', 'video_resolution': '240p', 'video_encoding': 'Sorenson H.263', 'video_profile': '', 'video_bitrate': '0.25', 'audio_encoding': 'MP3', 'audio_bitrate': '64'},
    {'itag': 36, 'container': '3GP', 'video_resolution': '240p', 'video_encoding': 'MPEG-4 Visual', 'video_profile': 'Simple', 'video_bitrate': '0.17', 'audio_encoding': 'AAC', 'audio_bitrate': '38'},
    {'itag': 36, 'container': '3GP', 'video_resolution': '240p', 'video_encoding': 'MPEG-4 Visual', 'video_profile': 'Simple', 'video_bitrate': '0.175', 'audio_encoding': 'AAC', 'audio_bitrate': '36'},
    {'itag': 17, 'container': '3GP', 'video_resolution': '144p', 'video_encoding': 'MPEG-4 Visual', 'video_profile': 'Simple', 'video_bitrate': '0.05', 'audio_encoding': 'AAC', 'audio_bitrate': '24'},
]

def decipher(js, s):
    def tr_js(code):
        code = re.sub(r'function', r'def', code)
        code = re.sub(r'\$', '_dollar', code)
        code = re.sub(r'\{', r':\n\t', code)
        code = re.sub(r'\}', r'\n', code)
        code = re.sub(r'var\s+', r'', code)
@ -44,73 +45,96 @@ def decipher(js, s):
        code = re.sub(r'(\w+).slice\((\d+)\)', r'\1[\2:]', code)
        code = re.sub(r'(\w+).split\(""\)', r'list(\1)', code)
        return code

    f1 = match1(js, r'g.sig\|\|(\w+)\(g.s\)')
    f1def = match1(js, r'(function %s\(\w+\)\{[^\{]+\})' % f1)

    f1 = match1(js, r'\w+\.sig\|\|([$\w]+)\(\w+\.\w+\)')
    f1def = match1(js, r'(function %s\(\w+\)\{[^\{]+\})' % re.escape(f1))
    code = tr_js(f1def)
    f2 = match1(f1def, r'(\w+)\(\w+,\d+\)')
    f2 = match1(f1def, r'([$\w]+)\(\w+,\d+\)')
    if f2 is not None:
        f2def = match1(js, r'(function %s\(\w+,\w+\)\{[^\{]+\})' % f2)
        f2e = re.escape(f2)
        f2def = match1(js, r'(function %s\(\w+,\w+\)\{[^\{]+\})' % f2e)
        f2 = re.sub(r'\$', '_dollar', f2)
        code = code + 'global %s\n' % f2 + tr_js(f2def)

    code = code + 'sig=%s(s)' % f1

    code = code + 'sig=%s(s)' % re.sub(r'\$', '_dollar', f1)
    exec(code, globals(), locals())
    return locals()['sig']

def youtube_download_by_id(id, title=None, output_dir='.', merge=True, info_only=False):
    """Downloads a YouTube video by its unique id.
    """

    raw_video_info = get_content('http://www.youtube.com/get_video_info?video_id=%s' % id)
    video_info = parse.parse_qs(raw_video_info)

    if video_info['status'] == ['ok'] and ('use_cipher_signature' not in video_info or video_info['use_cipher_signature'] == ['False']):
        title = parse.unquote_plus(video_info['title'][0])
        stream_list = parse.parse_qs(raw_video_info)['url_encoded_fmt_stream_map'][0].split(',')

    else:
        # Parse video page when video_info is not usable.
        video_page = get_content('http://www.youtube.com/watch?v=%s' % id)
        ytplayer_config = json.loads(match1(video_page, r'ytplayer.config\s*=\s*([^\n]+);'))
        ytplayer_config = json.loads(match1(video_page, r'ytplayer.config\s*=\s*([^\n]+});'))

        title = ytplayer_config['args']['title']
        stream_list = ytplayer_config['args']['url_encoded_fmt_stream_map'].split(',')

        html5player = ytplayer_config['assets']['js']

        if html5player[0:2] == '//':
            html5player = 'http:' + html5player

    streams = {
        parse.parse_qs(stream)['itag'][0] : parse.parse_qs(stream)
        for stream in stream_list
    }

    for codec in yt_codecs:
        itag = str(codec['itag'])
        if itag in streams:
            download_stream = streams[itag]
            break

    url = download_stream['url'][0]
    if 'sig' in download_stream:
        sig = download_stream['sig'][0]
    else:
        url = '%s&signature=%s' % (url, sig)
    elif 's' in download_stream:
        js = get_content(html5player)
        sig = decipher(js, download_stream['s'][0])
        url = '%s&signature=%s' % (url, sig)

    url = '%s&signature=%s' % (url, sig)

    type, ext, size = url_info(url)

    print_info(site_info, title, type, size)
    if not info_only:
        download_urls([url], title, ext, size, output_dir, merge = merge)

def youtube_list_download_by_id(list_id, title=None, output_dir='.', merge=True, info_only=False):
    """Downloads a YouTube video list by its unique id.
    """

    video_page = get_content('http://www.youtube.com/playlist?list=%s' % list_id)
    ids = set(re.findall(r'<a href="\/watch\?v=([\w-]+)', video_page))
    for id in ids:
        youtube_download_by_id(id, title, output_dir, merge, info_only)

def youtube_download(url, output_dir='.', merge=True, info_only=False):
    """Downloads YouTube videos by URL.
    """

    id = match1(url, r'youtu.be/([^/]+)') or parse_query_param(url, 'v')
    assert id

    youtube_download_by_id(id, title=None, output_dir=output_dir, merge=merge, info_only=info_only)

    id = match1(url, r'youtu.be/([^/]+)') or \
         match1(url, r'youtube.com/embed/([^/]+)') or \
         parse_query_param(url, 'v') or \
         parse_query_param(parse_query_param(url, 'u'), 'v')
    if id is None:
        list_id = parse_query_param(url, 'list') or \
                  parse_query_param(url, 'p')
    assert id or list_id

    if id:
        youtube_download_by_id(id, title=None, output_dir=output_dir, merge=merge, info_only=info_only)
    else:
        youtube_list_download_by_id(list_id, title=None, output_dir=output_dir, merge=merge, info_only=info_only)

site_info = "YouTube.com"
download = youtube_download
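Both branches above end up with url_encoded_fmt_stream_map: a comma-separated list of per-itag query strings. A self-contained illustration of the dict-comprehension parsing, on a fabricated and much-shortened map:

    from urllib import parse

    stream_map = 'itag=22&url=http%3A%2F%2Fexample.com%2Fa,itag=18&url=http%3A%2F%2Fexample.com%2Fb'
    streams = {
        parse.parse_qs(stream)['itag'][0]: parse.parse_qs(stream)
        for stream in stream_map.split(',')
    }
    print(streams['22']['url'][0])  # http://example.com/a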
@ -3,3 +3,4 @@
from .join_flv import concat_flv
from .join_mp4 import concat_mp4
from .ffmpeg import *
from .rtmpdump import *
35
src/you_get/processor/rtmpdump.py
Normal file
@ -0,0 +1,35 @@
#!/usr/bin/env python

import os.path
import subprocess

def get_usable_rtmpdump(cmd):
    try:
        p = subprocess.Popen([cmd], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
        out, err = p.communicate()
        return cmd
    except:
        return None

RTMPDUMP = get_usable_rtmpdump('rtmpdump')

def has_rtmpdump_installed():
    return RTMPDUMP is not None

def download_rtmpdump_stream(url, playpath, title, ext, output_dir='.'):
    filename = '%s.%s' % (title, ext)
    filepath = os.path.join(output_dir, filename)

    params = [RTMPDUMP, '-r']
    params.append(url)
    params.append('-y')
    params.append(playpath)
    params.append('-o')
    params.append(filepath)

    subprocess.call(params)
    return

def play_rtmpdump_stream(player, url, playpath):
    os.system("rtmpdump -r '%s' -y '%s' -o - | %s -" % (url, playpath, player))
    return
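A hedged usage sketch of the new rtmpdump wrapper, assuming an rtmpdump binary is on PATH; the server URL and playpath are placeholders:

    from you_get.processor.rtmpdump import has_rtmpdump_installed, download_rtmpdump_stream

    if has_rtmpdump_installed():
        # writes ./sample.mp4 via `rtmpdump -r <url> -y <playpath> -o <file>`
        download_rtmpdump_stream('rtmp://example.com/app', 'mp4:sample',
                                 'sample', 'mp4', output_dir='.')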
6
src/you_get/util/__init__.py
Normal file
@ -0,0 +1,6 @@
#!/usr/bin/env python

from .fs import *
from .log import *
from .sogou_proxy import *
from .strings import *
45
src/you_get/util/fs.py
Normal file
@ -0,0 +1,45 @@
#!/usr/bin/env python

import platform

def legitimize(text, os=platform.system()):
    """Converts a string to a valid filename.
    """

    # POSIX systems
    text = text.translate({
        0: None,
        ord('/'): '-',
    })

    if os == 'Windows':
        # Windows (non-POSIX namespace)
        text = text.translate({
            # Reserved in Windows VFAT and NTFS
            ord(':'): '-',
            ord('*'): '-',
            ord('?'): '-',
            ord('\\'): '-',
            ord('|'): '-',
            ord('\"'): '\'',
            # Reserved in Windows VFAT
            ord('+'): '-',
            ord('<'): '-',
            ord('>'): '-',
            ord('['): '(',
            ord(']'): ')',
        })
    else:
        # *nix
        if os == 'Darwin':
            # Mac OS HFS+
            text = text.translate({
                ord(':'): '-',
            })

        # Remove leading .
        if text.startswith("."):
            text = text[1:]

    text = text[:82] # Trim to 82 Unicode characters long
    return text
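What legitimize does in practice, mirroring tests/test_util.py further down; note the result depends on the os argument, not on the machine it runs on:

    from you_get.util.fs import legitimize

    print(legitimize('1*2', os='Linux'))        # 1*2
    print(legitimize('a/b: c?', os='Windows'))  # a-b- c-
    print(legitimize('a/b: c?', os='Darwin'))   # a-b- c?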
126
src/you_get/util/log.py
Normal file
@ -0,0 +1,126 @@
#!/usr/bin/env python

from ..version import __name__

import os, sys, subprocess

# Is terminal ANSI/VT100 compatible
if os.getenv('TERM') in (
    'xterm',
    'vt100',
    'linux',
    'eterm-color',
    'screen',
):
    has_colors = True
else:
    try:
        # Eshell
        ppid = os.getppid()
        has_colors = (subprocess.getoutput('ps -p %d -ocomm=' % ppid)
                      == 'emacs')
    except:
        has_colors = False

# ANSI/VT100 escape code
# http://en.wikipedia.org/wiki/ANSI_escape_code
colors = {
    'none': '',
    'reset': '\033[0m',

    'black': '\033[30m',
    'bold-black': '\033[30;1m',
    'dark-gray': '\033[90m',
    'bold-dark-gray': '\033[90;1m',

    'red': '\033[31m',
    'bold-red': '\033[31;1m',
    'light-red': '\033[91m',
    'bold-light-red': '\033[91;1m',

    'green': '\033[32m',
    'bold-green': '\033[32;1m',
    'light-green': '\033[92m',
    'bold-light-green': '\033[92;1m',

    'yellow': '\033[33m',
    'bold-yellow': '\033[33;1m',
    'light-yellow': '\033[93m',
    'bold-light-yellow': '\033[93;1m',

    'blue': '\033[34m',
    'bold-blue': '\033[34;1m',
    'light-blue': '\033[94m',
    'bold-light-blue': '\033[94;1m',

    'magenta': '\033[35m',
    'bold-magenta': '\033[35;1m',
    'light-magenta': '\033[95m',
    'bold-light-magenta': '\033[95;1m',

    'cyan': '\033[36m',
    'bold-cyan': '\033[36;1m',
    'light-cyan': '\033[96m',
    'bold-light-cyan': '\033[96;1m',

    'light-gray': '\033[37m',
    'bold-light-gray': '\033[37;1m',
    'white': '\033[97m',
    'bold-white': '\033[97;1m',
}

def underlined(text):
    """Returns an underlined text.
    """
    return "\33[4m%s\33[24m" % text if has_colors else text

def println(text, color=None, ostream=sys.stdout):
    """Prints a text line to stream.
    """
    if has_colors and color in colors:
        ostream.write("{0}{1}{2}\n".format(colors[color], text, colors['reset']))
    else:
        ostream.write("{0}\n".format(text))

def printlog(message, color=None, ostream=sys.stderr):
    """Prints a log message to stream.
    """
    if has_colors and color in colors:
        ostream.write("{0}{1}: {2}{3}\n".format(colors[color], __name__, message, colors['reset']))
    else:
        ostream.write("{0}: {1}\n".format(__name__, message))

def i(message, ostream=sys.stderr):
    """Sends an info log message.
    """
    printlog(message,
             None,
             ostream=ostream)

def d(message, ostream=sys.stderr):
    """Sends a debug log message.
    """
    printlog(message,
             'blue' if has_colors else None,
             ostream=ostream)

def w(message, ostream=sys.stderr):
    """Sends a warning log message.
    """
    printlog(message,
             'yellow' if has_colors else None,
             ostream=ostream)

def e(message, ostream=sys.stderr):
    """Sends an error log message.
    """
    printlog(message,
             'bold-yellow' if has_colors else None,
             ostream=ostream)

def wtf(message, ostream=sys.stderr):
    """What a Terrible Failure.
    """
    printlog(message,
             'bold-red' if has_colors else None,
             ostream=ostream)
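The logging helpers write to stderr and only colorize when the terminal is recognized as ANSI-capable. A short usage sketch:

    from you_get.util import log

    log.i('an informational message')         # plain
    log.w('a warning')                        # yellow when supported
    log.e('an error')                         # bold-yellow when supported
    log.wtf('a terrible failure')             # bold-red when supported
    print(log.underlined('underlined text'))  # ANSI underline when supported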
141
src/you_get/util/sogou_proxy.py
Normal file
@ -0,0 +1,141 @@
#!/usr/bin/env python

# Original code from:
# http://xiaoxia.org/2011/03/26/using-python-to-write-a-local-sogou-proxy-server-procedures/

from . import log

from http.client import HTTPResponse
from http.server import BaseHTTPRequestHandler, HTTPServer
from socketserver import ThreadingMixIn
from threading import Thread
import random, socket, struct, sys, time

def sogou_proxy_server(
    host=("0.0.0.0", 0),
    network_env='CERNET',
    ostream=sys.stderr):
    """Returns a Sogou proxy server object.
    """

    x_sogou_auth = '9CD285F1E7ADB0BD403C22AD1D545F40/30/853edc6d49ba4e27'
    proxy_host = 'h0.cnc.bj.ie.sogou.com'
    proxy_port = 80

    def sogou_hash(t, host):
        s = (t + host + 'SogouExplorerProxy').encode('ascii')
        code = len(s)
        dwords = int(len(s) / 4)
        rest = len(s) % 4
        v = struct.unpack(str(dwords) + 'i' + str(rest) + 's', s)
        for vv in v:
            if type(vv) != bytes:
                a = (vv & 0xFFFF)
                b = (vv >> 16)
                code += a
                code = code ^ (((code << 5) ^ b) << 0xb)
                # To avoid overflows
                code &= 0xffffffff
                code += code >> 0xb
        if rest == 3:
            code += s[len(s) - 2] * 256 + s[len(s) - 3]
            code = code ^ ((code ^ (s[len(s) - 1]) * 4) << 0x10)
            code &= 0xffffffff
            code += code >> 0xb
        elif rest == 2:
            code += (s[len(s) - 1]) * 256 + (s[len(s) - 2])
            code ^= code << 0xb
            code &= 0xffffffff
            code += code >> 0x11
        elif rest == 1:
            code += s[len(s) - 1]
            code ^= code << 0xa
            code &= 0xffffffff
            code += code >> 0x1
        code ^= code * 8
        code &= 0xffffffff
        code += code >> 5
        code ^= code << 4
        code = code & 0xffffffff
        code += code >> 0x11
        code ^= code << 0x19
        code = code & 0xffffffff
        code += code >> 6
        code = code & 0xffffffff
        return hex(code)[2:].rstrip('L').zfill(8)

    class Handler(BaseHTTPRequestHandler):
        _socket = None
        def do_proxy(self):
            try:
                if self._socket is None:
                    self._socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
                    self._socket.connect((proxy_host, proxy_port))
                self._socket.send(self.requestline.encode('ascii') + b'\r\n')
                log.d(self.requestline, ostream)

                # Add Sogou Verification Tags
                self.headers['X-Sogou-Auth'] = x_sogou_auth
                t = hex(int(time.time()))[2:].rstrip('L').zfill(8)
                self.headers['X-Sogou-Tag'] = sogou_hash(t, self.headers['Host'])
                self.headers['X-Sogou-Timestamp'] = t
                self._socket.send(str(self.headers).encode('ascii') + b'\r\n')

                # Send POST data
                if self.command == 'POST':
                    self._socket.send(self.rfile.read(int(self.headers['Content-Length'])))
                response = HTTPResponse(self._socket, method=self.command)
                response.begin()

                # Response
                status = 'HTTP/1.1 %s %s' % (response.status, response.reason)
                self.wfile.write(status.encode('ascii') + b'\r\n')
                h = ''
                for hh, vv in response.getheaders():
                    if hh.upper() != 'TRANSFER-ENCODING':
                        h += hh + ': ' + vv + '\r\n'
                self.wfile.write(h.encode('ascii') + b'\r\n')
                while True:
                    response_data = response.read(8192)
                    if len(response_data) == 0:
                        break
                    self.wfile.write(response_data)

            except socket.error:
                log.e('Socket error for ' + self.requestline, ostream)

        def do_POST(self):
            self.do_proxy()

        def do_GET(self):
            self.do_proxy()

    class ThreadingHTTPServer(ThreadingMixIn, HTTPServer):
        pass

    # Server starts
    log.printlog('Sogou Proxy Mini-Server', color='bold-green', ostream=ostream)

    try:
        server = ThreadingHTTPServer(host, Handler)
    except Exception as ex:
        log.wtf("Socket error: %s" % ex, ostream)
        exit(1)
    host = server.server_address

    if network_env.upper() == 'CERNET':
        proxy_host = 'h%s.edu.bj.ie.sogou.com' % random.randint(0, 10)
    elif network_env.upper() == 'CTCNET':
        proxy_host = 'h%s.ctc.bj.ie.sogou.com' % random.randint(0, 3)
    elif network_env.upper() == 'CNCNET':
        proxy_host = 'h%s.cnc.bj.ie.sogou.com' % random.randint(0, 3)
    elif network_env.upper() == 'DXT':
        proxy_host = 'h%s.dxt.bj.ie.sogou.com' % random.randint(0, 10)
    else:
        proxy_host = 'h%s.edu.bj.ie.sogou.com' % random.randint(0, 10)

    log.i('Remote host: %s' % log.underlined(proxy_host), ostream)
    log.i('Proxy server running on %s' %
          log.underlined("%s:%s" % host), ostream)

    return server
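A hedged sketch of driving the proxy, mirroring how sohu.py above uses it: start it on a daemon thread, read back the bound address, and shut it down when done. Binding port 0 lets the OS pick a free port.

    import threading
    from you_get.util.sogou_proxy import sogou_proxy_server

    server = sogou_proxy_server(('127.0.0.1', 0), network_env='CERNET')
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    print('proxy listening on %s:%s' % server.server_address)
    # ... route HTTP requests through that address ...
    server.shutdown()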
25
src/you_get/util/strings.py
Normal file
@ -0,0 +1,25 @@
try:
    # py 3.4
    from html import unescape as unescape_html
except ImportError:
    import re
    from html.entities import entitydefs

    def unescape_html(string):
        '''HTML entity decode'''
        string = re.sub(r'&#[^;]+;', _sharp2uni, string)
        string = re.sub(r'&[^;]+;', lambda m: entitydefs[m.group(0)[1:-1]], string)
        return string

    def _sharp2uni(m):
        '''&#...; ==> unicode'''
        s = m.group(0)[2:].rstrip(';;')
        if s.startswith('x'):
            return chr(int('0'+s, 16))
        else:
            return chr(int(s))

from .fs import legitimize

def get_filename(htmlstring):
    return legitimize(unescape_html(htmlstring))
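get_filename chains the two utilities: HTML entities are decoded first, then the result is made filesystem-safe. A short illustration (legitimize defaults to the rules of the current platform):

    from you_get.util.strings import get_filename, unescape_html

    print(unescape_html('Tom &amp; Jerry'))  # Tom & Jerry
    print(get_filename('A&lt;B&gt;/C'))      # A<B>-C on Linux, A-B--C on Windows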
@ -1,5 +1,6 @@
#!/usr/bin/env python
__all__ = ['__version__', '__date__']

__version__ = '0.3.21'
__date__ = '2013-08-17'
__name__ = 'you-get'
__version__ = '0.3.30dev'
__date__ = '2014-06-24'
@ -4,37 +4,44 @@

import unittest

from you_get import *
from you_get.downloader.__main__ import url_to_module
from you_get.extractor.__main__ import url_to_module

def test_urls(urls):
    for url in urls:
        url_to_module(url).download(url, info_only = True)
        url_to_module(url)[0].download(url, info_only = True)

class YouGetTests(unittest.TestCase):

    def test_freesound(self):
        test_urls([
            "http://www.freesound.org/people/Corsica_S/sounds/184419/",
        ])

    def test_magisto(self):
        test_urls([
            "http://www.magisto.com/album/video/f3x9AAQORAkfDnIFDA",
        ])

    def test_mixcloud(self):
        test_urls([
            "http://www.mixcloud.com/beatbopz/beat-bopz-disco-mix/",
            "http://www.mixcloud.com/DJVadim/north-america-are-you-ready/",
        ])

    def test_ted(self):
        test_urls([
            "http://www.ted.com/talks/jennifer_lin_improvs_piano_magic.html",
            "http://www.ted.com/talks/derek_paravicini_and_adam_ockelford_in_the_key_of_genius.html",
        ])

    def test_vimeo(self):
        test_urls([
            "http://vimeo.com/56810854",
        ])

    def test_xiami(self):
        test_urls([
            "http://www.xiami.com/song/1769835121",
        ])

    def test_youtube(self):
        test_urls([
            "http://www.youtube.com/watch?v=pzKerr0JIPA",
            "http://youtu.be/pzKerr0JIPA",
            "http://www.youtube.com/attribution_link?u=/watch?v%3DldAKIzq7bvs%26feature%3Dshare"
        ])
11
tests/test_util.py
Normal file
@ -0,0 +1,11 @@
#!/usr/bin/env python

import unittest

from you_get.util import *

class TestUtil(unittest.TestCase):
    def test_legitimize(self):
        self.assertEqual(legitimize("1*2", os="Linux"), "1*2")
        self.assertEqual(legitimize("1*2", os="Darwin"), "1*2")
        self.assertEqual(legitimize("1*2", os="Windows"), "1-2")
2
you-get
@ -4,7 +4,7 @@ import os, sys
__path__ = os.path.dirname(os.path.realpath(__file__))
__srcdir__ = 'src'
sys.path.insert(1, os.path.join(__path__, __srcdir__))
from you_get.downloader import main
from you_get.extractor import main

if __name__ == '__main__':
    main()
@ -31,6 +31,6 @@
    ],

    "console_scripts": [
        "you-get = you_get.downloader.__main__:main"
        "you-get = you_get.extractor.__main__:main"
    ]
}