|
| Author |
Message |
wmcbrine

Joined: 04 Jan 2008 Posts: 1333
|
Posted: Sat Jan 26, 2008 4:53 pm Post subject: Pre-filtering videos by file extension |
|
|
Calling ffmpeg to determine whether a file is transcodable or not is a very expensive operation. Of course, it's the only way to be sure. However, I think there's a strong case to be made for pre-filtering by file extension before calling ffmpeg. On a directory that mixes different types of files, it gives a huge speedup. It also eliminates the false positives from still image files. Even in a directory of pure videos, if they all have metadata files, using pre-filtering cuts the number of calls to ffmpeg in half.
But to make this work, we need a fairly exhaustive list of acceptable extensions. (One could use a list of extensions to exclude instead, but any way you cut it, that list's going to be longer.) Here's what I used in my tests -- definitely not exhaustive. If you have others that should be added, please post them:
mpg
avi
mov
wmv
asf
flv
mpeg
qt
The basic implementation is very simple -- a three-line tweak to video_file_filter in video.py (extension list shortened for clarity):
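| Code: | def video_file_filter(full_path, type=None):
    # Extensions worth handing to ffmpeg (list shortened for clarity)
    goodexts = ('.mpg', '.mov', '.avi')
    if os.path.isdir(full_path):
        return True
    # Cheap test first; lower() so that '.AVI' matches as well
    if os.path.splitext(full_path)[1].lower() not in goodexts:
        return False
    return transcode.supported_format(full_path)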
|
To take this further, we could remove the supported_format() call from video_file_filter() altogether, relying solely on the extension to build the list. That makes for a much faster startup on large directories, but a slightly slower page down (the first time through), when ffmpeg would finally be called. The problem is that some videos with valid extensions would still be rejected by ffmpeg. My solution to that would be to use the "CopyProtected" flag -- which is just what TiVo Desktop does. _________________ My pyTivo fork |
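A rough sketch of the deferred check described above, not code from pyTivo. The names list_candidates and describe_page, and the dictionary layout, are made up for illustration; supported_format is passed in as a stand-in for transcode.supported_format. The listing is built from extensions alone, and the expensive test runs only for the handful of items about to be shown.

```python
import os

# Shortened, illustrative list; not the real one.
GOOD_EXTS = {'.mpg', '.avi', '.mov'}

def list_candidates(names):
    """Cheap pass: keep names whose extension is on the list (no ffmpeg)."""
    return [n for n in names
            if os.path.splitext(n)[1].lower() in GOOD_EXTS]

def describe_page(candidates, start, count, supported_format):
    """Expensive pass, run only for the items on the page being displayed.

    supported_format is a stand-in for the ffmpeg-backed test. Anything
    that fails it stays in the list but is flagged as copy protected.
    """
    page = []
    for name in candidates[start:start + count]:
        page.append({'name': name,
                     'copy_protected': not supported_format(name)})
    return page
```

With this split, a directory of thousands of files costs one extension comparison per file up front, and one ffmpeg call per item per page the first time that page is viewed.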
|
|
 |
PaulS
Joined: 05 Jan 2008 Posts: 176
|
Posted: Sat Jan 26, 2008 5:27 pm Post subject: |
|
|
| Don't forget Transport Streams (.ts), Matroska (.mkv) containers, and all of the Ogg ones as well... |
|
|
 |
gonzotek
Joined: 12 Jan 2008 Posts: 64
|
Posted: Sat Jan 26, 2008 9:56 pm Post subject: |
|
|
| I suggest, if you do add this, also adding an option in pytivo.conf that lets users add their own extensions to the list (and possibly one for exclusions as well). |
|
|
 |
wgw
Joined: 06 Jan 2008 Posts: 284
|
Posted: Sat Jan 26, 2008 10:04 pm Post subject: |
|
|
.vob, .m2v, .divx, .mp4
I sure hate the idea of displaying an inaccurate list, but I understand the need.
Might as well include everything in the following list, and let the CopyProtected flag filter out the files not supported by ffmpeg. That way they will automatically change to a supported format if ffmpeg adds support in the future.
http://www.fileinfo.net/filetypes/video _________________ Download pyTivo
my pyTivo branch |
|
|
 |
wgw
Joined: 06 Jan 2008 Posts: 284
|
Posted: Sat Jan 26, 2008 11:25 pm Post subject: |
|
|
Rather than hard-coding extensions in the program, would it be possible to list them in a text file called video.ext in the video plugin folder, one extension per line?
That would also add the following possibility.
If video.ext exists, use extensions to speed up display
else
revert to current pyTivo behavior _________________ Download pyTivo
my pyTivo branch |
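One way the "use video.ext if it exists, else current behavior" idea could be sketched. This is only an illustration: load_extensions and passes_prefilter are hypothetical names, and the one-extension-per-line format follows the suggestion above.

```python
import os

def load_extensions(plugin_dir, filename='video.ext'):
    """Return a set of extensions, or None if the list file is missing."""
    path = os.path.join(plugin_dir, filename)
    if not os.path.exists(path):
        return None  # signals "no pre-filtering"
    with open(path) as f:
        # One extension per line; blank lines are ignored.
        return {line.strip().lower() for line in f if line.strip()}

def passes_prefilter(full_path, extensions):
    """With no list, every file passes and ffmpeg decides as before."""
    if extensions is None:
        return True
    return os.path.splitext(full_path)[1].lower() in extensions
```

Returning None rather than an empty set keeps "file missing" (check everything with ffmpeg) distinct from "file present but empty" (show nothing).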
|
|
 |
wgw
Joined: 06 Jan 2008 Posts: 284
|
Posted: Sun Jan 27, 2008 12:39 am Post subject: |
|
|
Video.ext
.3g2
.3gp
.3gp2
.3gpp
.3mm
.60d
.aep
.ajp
.amv
.asf
.asx
.avb
.avi
.avs
.bik
.bix
.box
.byu
.camrec
.cvcc
.d2v
.dat
.dce
.dif
.dir
.divx
.dmb
.dpg
.dv
.dvr-ms
.dxr
.eye
.fcp
.flc
.fli
.flv
.flx
.gl
.grasp
.gvi
.gvp
.ifo
.imovieproj
.imovieproject
.ivf
.ivs
.izz
.izzy
.lsf
.lsx
.m1v
.m21
.m2v
.m4e
.m4u
.m4v
.mjp
.mkv
.mod
.moov
.mov
.movie
.mp21
.mp4
.mpe
.mpeg
.mpg
.mpv2
.mqv
.msh
.mswmm
.mvb
.mvc
.nsv
.nvc
.ogm
.pds
.piv
.playlist
.pro
.prproj
.prx
.qt
.qtch
.qtz
.rm
.rmvb
.rp
.rts
.sbk
.scm
.sfvidcap
.smil
.smv
.spl
.srt
.ssm
.str
.svi
.swf
.swi
.tda3mt
.tivo
.ts
.vdo
.veg
.vf
.vfw
.vid
.viewlet
.viv
.vivo
.vob
.vp6
.vp7
.vro
.w32
.wcp
.wm
.wmd
.wmv
.wmx
.wvx
.yuv _________________ Download pyTivo
my pyTivo branch |
|
|
 |
MasterCephus

Joined: 05 Jan 2008 Posts: 195 Location: Hueytown, AL
|
Posted: Mon Jan 28, 2008 6:30 pm Post subject: |
|
|
If I may suggest a couple of things...
1. I would make this (if possible) a pure user choice in the config file. Some people do not have that much video, so it's not that big of a burden; however, I do realize that some people have HUGE libraries and this would speed things up. I would suggest letting the user decide, and you could wrap the whole three-line fix in an IF/ELSE that checks whether the user has opted in.
2. I would also suggest giving the user the burden of creating this list. I don't know if the config file would be a good place to keep this list of extensions, or another text file somewhere, but a user-created list would definitely be better, because if one extension is missed, the user has to wait for a new update to get what he/she wants. _________________ MetaGenerator
pyTivo Manager |
|
|
 |
PaulS
Joined: 05 Jan 2008 Posts: 176
|
Posted: Mon Jan 28, 2008 6:45 pm Post subject: |
|
|
I'm torn. Yes, it would be nice to have a configuration option to allow the user to flexibly define what pyTivo will do for them. However, it also goes against the recent flow of development, which is to simplify and default as much as possible to help newbies get up and running with a minimum of fuss.
I think I'm with wgw on this one. Distribute a pre-created default list of audio/video/picture extensions with pyTivo, and use that list to determine which files need to be passed to ffmpeg for parsing.
On top of that, a config option could be provided to the user to allow them to specify their own list of extensions. I guess it'd be similar in behavior to many of the other options available to users in the pytivo.conf file. Provide a stock solution which will cover the vast majority of installations, but provide an option to allow for a user-specified override. |
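A minimal sketch of the "stock list plus user override" arrangement, assuming a hypothetical option named video_extensions in a [Server] section. Neither the option name nor its placement is confirmed anywhere in this thread; they are placeholders for illustration.

```python
import configparser

# Stock list shipped with the program (shortened here).
DEFAULT_EXTS = {'.mpg', '.mpeg', '.avi', '.mov', '.wmv'}

def configured_extensions(conf_text, section='Server',
                          option='video_extensions'):
    """Return the user's list if the option is set, else the stock list.

    The section and option names are assumptions for this example.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    if parser.has_option(section, option):
        # Whitespace-separated extensions, compared case-insensitively.
        return {e.lower() for e in parser.get(section, option).split()}
    return set(DEFAULT_EXTS)
```

A new user who never touches the config gets the defaults; someone with an unusual extension adds one line instead of waiting for a release.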
|
|
 |
TreborPugly
Joined: 05 Jan 2008 Posts: 52
|
Posted: Tue Jan 29, 2008 10:19 pm Post subject: |
|
|
| In the first post, you specifically mention Metadata.txt files, and still image files. Wouldn't it make more sense to build up a smaller list of files likely to be in your folders, that you know ffmpeg can't do anything with? Certainly txt, and then jpg, gif, a few others, which could be user configurable. If it is on the list, pyTivo ignores it. Experienced users could go in and add extensions that they know they have floating around. |
|
|
 |
wmcbrine

Joined: 04 Jan 2008 Posts: 1333
|
Posted: Wed Jan 30, 2008 12:15 am Post subject: |
|
|
| TreborPugly wrote: | Wouldn't it make more sense to build up a smaller list of files likely to be in your folders, that you know ffmpeg can't do anything with? |
| wmcbrine wrote: | (One could use a list of extensions to exclude instead, but any way you cut it, that list's going to be longer.) |
So, in my view: No, it would not. And the bigger savings comes not from excluding files from the overall list, but from skipping the ffmpeg test for all files, until the last possible moment (i.e., when they're displayed a few at a time). But for that approach to make sense, they really need to be pre-filtered.
I think some people are unduly worrying about false negatives (files that ffmpeg could handle, but which have the wrong extension). Take a look at that list wgw posted, and ask yourself if you've ever had a video with an extension that wasn't in the list. Or even heard of one. Then consider, in the unlikely event that you did, the few seconds it would take to either add the extension to pyTivo's list, or just rename the file.
I'm actually inclined to trim that list quite a bit.
The other issue is false positives (files that ffmpeg can't handle, but which have the right extension). These are the ones I was talking about marking as "CopyProtected". I don't see this as a big issue, either; there aren't that many videos that ffmpeg can't handle. I know, for my own case, that the false positives list would be much shorter with pre-filtering than it is now, since ffmpeg makes every still image a false positive. (BTW, the copy protection flag isn't even needed, just a courtesy -- without it, you'd just get the same kind of error on unsupported videos that you get now on still pictures.)
Anyway, I appreciate all the responses. _________________ My pyTivo fork |
|
|
 |
PaulS
Joined: 05 Jan 2008 Posts: 176
|
Posted: Wed Jan 30, 2008 12:47 am Post subject: |
|
|
Part of the problem is that these calls to ffmpeg currently need to be made on demand, whenever a user initiates an action which forces a directory traversal. Being thorough introduces lag, and being lax could introduce unexpected results for the user.
Is it possible to background these calls to ffmpeg ? I'm not familiar enough with Python to determine the feasibility of it, so that's why I'm asking. Is there a Python method of periodically spawning a thread to recurse through the given list of shares to determine the nature of these files, and populate the cache with transcodable files ? It would seem to me that pre-populating the cache would dramatically speed things up. |
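On the Python side this is feasible with the standard threading module. The following is a sketch only: FormatCache and its methods are hypothetical names, check stands in for the ffmpeg test, and nothing here reflects how pyTivo's actual cache works. It walks a share once in a background thread; a periodic rescan would need a timer or loop on top of it.

```python
import os
import threading

class FormatCache:
    """Caches the result of an expensive per-file check."""

    def __init__(self, check):
        self._check = check          # stand-in for the ffmpeg test
        self._results = {}
        self._lock = threading.Lock()

    def warm(self, root):
        """Walk root and record the check result for every file."""
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                result = self._check(path)
                with self._lock:
                    self._results[path] = result

    def warm_in_background(self, root):
        """Start warm() in a daemon thread and return the thread."""
        worker = threading.Thread(target=self.warm, args=(root,),
                                  daemon=True)
        worker.start()
        return worker

    def supported(self, path):
        """Use the cached answer if present, else check on demand."""
        with self._lock:
            if path in self._results:
                return self._results[path]
        result = self._check(path)
        with self._lock:
            self._results[path] = result
        return result
```

Requests that arrive before the walk finishes still work; they simply pay for the check themselves, as they do today.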
|
|
 |
reneg
Joined: 04 Jan 2008 Posts: 133
|
Posted: Wed Jan 30, 2008 4:32 am Post subject: |
|
|
| I think the copy protected flag is a good addition for the files that ffmpeg cannot handle. If I understand this correctly, the user gets a visual notification on the TiVo that a file cannot be downloaded. |
|
|
 |
wmcbrine

Joined: 04 Jan 2008 Posts: 1333
|
Posted: Sun Feb 03, 2008 7:13 pm Post subject: |
|
|
OK, I've implemented this and posted it to my repo (both master and subfolders-8.3 branches). It uses an external file (video.ext, in plugins/video), and falls back to the old behavior if it's not present. And it puts the CopyProtected flag on anything that passes the extension test but fails the ffmpeg test.
Putting the list in an external file and allowing fallback to the old behavior are concessions to concerns that, frankly, I think are unwarranted. I urge you to at least try this system before rejecting it. It's orders of magnitude faster, and so far, I have zero false positives or negatives out of hundreds of files. (I had to add a non-video extension to video.ext just to test the CopyProtected flag.)
I ended up using wgw's whole list (above), since trimming it didn't seem to make a significant difference. The extensions are separated by any white space, so you don't have to put them one per line (and I didn't), but you can. _________________ My pyTivo fork |
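For readers who want the behavior above in miniature, here is an illustrative sketch rather than the code in the repo. parse_extensions and outcome are hypothetical names, and supported_format stands in for the ffmpeg test. It shows the whitespace-separated parsing and the three possible results for a file.

```python
import os

def parse_extensions(text):
    """Split on any whitespace, so one per line or many per line both work."""
    return {ext.lower() for ext in text.split()}

def outcome(full_path, extensions, supported_format):
    """Classify a file as 'hidden', 'listed', or 'copy-protected'."""
    ext = os.path.splitext(full_path)[1].lower()
    if ext not in extensions:
        return 'hidden'            # fails the extension test
    if supported_format(full_path):
        return 'listed'            # passes both tests
    return 'copy-protected'        # right extension, ffmpeg says no
```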
|
|
 |
wgw
Joined: 06 Jan 2008 Posts: 284
|
Posted: Sun Feb 03, 2008 8:20 pm Post subject: |
|
|
Very nice! I threw in .txt as a valid extension to try it out and decided to leave it in. I found that I like knowing which files have associated metadata. The txt file also tells me the file extension of the video, which will aid in testing other enhancements, and lets me know if transcoding will be required. _________________ Download pyTivo
my pyTivo branch |
|
|
 |
windracer

Joined: 04 Jan 2008 Posts: 213 Location: St. Pete, FL
|
Posted: Tue Feb 05, 2008 3:17 am Post subject: |
|
|
| wmcbrine wrote: | And it puts the CopyProtected flag on anything that passes the extension test but fails the ffmpeg test. |
Hmmm ... about 50% of my movie files are being flagged as copy-protected now.
edit: and subfolders seem to be broken in the subfolders branch again. _________________ pyTiVo on Ubuntu 10.04 |
|
|
 |
|