Feb 19 2008, 07:16 PM
Post
#1
|
|
Advanced Member Group: Members Posts: 177 Joined: 14-June 05 From: The Great White North Member No.: 13,899 |
(Quoting from the blog...)
> Updating seeds/leechers statistics
>
> This one was on top of the list of requested features....

The "update" button is not enough; what Mininova -- and the whole public community in general -- REALLY needs is to begin scraping the DHT-layer (rather than, or in addition to, site trackers) for seed/peer info in DHT-enabled torrents. Given the increasingly rapid pace of major sites closing, this should be a high priority item.

The MPAA is perfectly well aware of the fact that if they can shut down a site, they can also nerf off-site meta-indexed torrents if their seed/peer counts drop to 0/0 and everybody subsequently thinks they're dead (or they show up five pages down a list of search-returns, if at all). In just the last year, Demonoid, Oink (private, but nevertheless), and now Zerotracker are gone or "out for maintenance" or "unavailable due to legal stuff", resulting in a massive percentage of otherwise perfectly healthy torrents reporting as 0/0.

The "Update" button just makes things worse, as any torrent whose first listed announce URL is to a defunct site will have its previously stalled seed/peer count "updated" to 0/0 as soon as any registered member clicks it. And it won't matter if other announce URLs in the .torrent-file are to healthy sites, since Mininova only examines the first one.

For instance, I have over a hundred posted multi-tracker torrents with Zerotracker or Demonoid as the first announce (mainly very old films, of which my torrent is often the only one in existence), and with several PB announces (working) farther down the list. Very shortly, every one of these will be reporting 0/0 even though they are all DHT-enabled and have scores if not hundreds of actual peers. If PB is ever hounded out of existence, the overwhelming majority of the public torrents listed on Mininova will register as 0/0 seed/peer unless DHT-scraping is implemented.
==//==

Example: http://www.mininova.org/tor/1104862 -- The 58.8gb eng-dub DBZ torrent had the highest TPI (torrent popularity index, or [peer count]x[gb size]) on the internet, yet is now reporting 0/0 on Mininova. Prior to Zerotracker (whose tracker was the first announce) going down a few weeks ago, it was reporting 2150 peers, for a TPI of over 125,000.

The same (hash-number) torrent cross-posted elsewhere:

Torrentbox: http://www.torrentbox.com/torrent_details?id=117491 -- 1281 peers. Only scrapes the Tbox announce (and any torrent with only the Tbox tracker in the announce list will report as 0/0 to any person with a US ip-address).

Piratebay: http://thepiratebay.org/tor/3739634/ -- now reporting fewer than a hundred peers, because PB only scrapes the reports of clients using a .torrent-file which has a PB tracker listed first (so this torrent has a low count on PB because the Zerotracker announce was first listed in the versions crossposted to sites other than Demonoid, Zerotracker or PB, which are/were "narcissist" sites allowing public torrents but requiring their announces to be first in the .torrent-file).

BTjunkie meta-index: 20,000 peers (grossly over-reported because BTjunkie scrapes every tracker in a multi-tracker torrent and just adds them together, meaning that peers are counted multiple times).

Actual peer count: [image not preserved] ...as is shown, a good half of the peers are using a DHT-enabled client with the feature turned on.

Recommendation: Mininova should collect stats from all trackers as well as the DHT layer, and report the highest seed count from any, and the highest leecher count from any.

-------------------- |
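The recommendation above (take the highest seed count and the highest leecher count from any source, rather than only the first tracker or a sum over all of them) can be sketched as follows. This is only an illustration: the source names and numbers are invented, and nothing here reflects how Mininova or BTjunkie actually store their scrape results.

```python
from typing import Dict, Tuple

# Hypothetical scrape results: source name -> (seeds, leechers).
# A source could be a tracker or, in principle, a DHT lookup.
Counts = Dict[str, Tuple[int, int]]


def best_counts(per_source: Counts) -> Tuple[int, int]:
    """Highest seed count from any source and highest leecher count from any source."""
    if not per_source:
        return (0, 0)
    seeds = max(s for s, _ in per_source.values())
    leechers = max(l for _, l in per_source.values())
    return (seeds, leechers)


def summed_counts(per_source: Counts) -> Tuple[int, int]:
    """Add every source together; peers announcing to several trackers are counted repeatedly."""
    seeds = sum(s for s, _ in per_source.values())
    leechers = sum(l for _, l in per_source.values())
    return (seeds, leechers)


# Invented numbers: a dead first tracker plus two live ones that share most peers.
example: Counts = {
    "dead-first-tracker": (0, 0),
    "live-tracker-a": (120, 900),
    "live-tracker-b": (150, 700),
}
print(best_counts(example))    # max per column, possibly from different sources
print(summed_counts(example))  # over-reports when the swarms overlap
```

Taking the maximum can under-report when the trackers see mostly different peers, but unlike the sum it never counts the same peer twice, and a defunct first announce no longer drags the listing to 0/0.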
|
|
|
Feb 20 2008, 12:05 AM
Post
#2
|
|
Just call me DL Group: Ex-moderators Posts: 13,279 Joined: 4-July 05 From: U.S.A. Member No.: 17,023 |
I'll let the admins know about this and they can reply to you if they wish.
|
|
|
|
Feb 20 2008, 10:13 PM
Post
#3
|
|
Advanced Member Group: Members Posts: 177 Joined: 14-June 05 From: The Great White North Member No.: 13,899 |
Excellent. If anything's going to happen, a timeline would be great.
-------------------- |
|
|
|
Feb 21 2008, 01:00 PM
Post
#4
|
|
|
Mininova staff Group: Admin Posts: 739 Joined: 14-March 05 Member No.: 2 |
We thought about this a while ago... but it's not a trivial task to implement. To "scrape" the number of peers from the DHT layer you'll have to run a dumbed-down client simulating downloading all our 570k+ torrents... Yes, it is possible, but no, it's not easy. We'll think about it.
|
|
|
|
Feb 21 2008, 05:28 PM
Post
#5
|
|
Advanced Member Group: Members Posts: 177 Joined: 14-June 05 From: The Great White North Member No.: 13,899 |
> We thought about this a while ago... but it's not a trivial task to implement. To "scrape" the number of peers from the DHT layer you'll have to run a dumbed-down client simulating downloading all our 570k+ torrents... Yes, it is possible, but no, it's not easy. We'll think about it

What about just scraping the rest of the trackers in a multi-tracker torrent for the best seed/leecher counts? That, methinks, would be far easier to implement, and would save all the multi-tracker torrents with bad first announces before they've languished completely. (DHT counting would be ideal, but it can wait.)

-------------------- |
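For illustration, here is a minimal sketch of how the scrape targets for every tracker in a multi-tracker torrent could be listed. It assumes the common tracker scrape convention (the last path segment of the announce URL must start with "announce", which is replaced by "scrape") and the usual multi-tracker layout of a primary announce URL plus an announce-list of tiers. The function names and example URLs are made up, and none of this is known to match Mininova's own scraper.

```python
from typing import List, Optional
from urllib.parse import urlsplit, urlunsplit


def scrape_url(announce_url: str) -> Optional[str]:
    """Derive a scrape URL from an announce URL, or None if the convention does not apply."""
    parts = urlsplit(announce_url)
    head, sep, last = parts.path.rpartition("/")
    if not last.startswith("announce"):
        return None  # tracker does not follow the announce -> scrape convention
    new_path = head + sep + "scrape" + last[len("announce"):]
    return urlunsplit((parts.scheme, parts.netloc, new_path, parts.query, parts.fragment))


def scrape_candidates(announce: str, announce_list: List[List[str]]) -> List[str]:
    """Scrape URLs for the primary announce and every tier entry, in order, without repeats."""
    urls = [announce] + [u for tier in announce_list for u in tier]
    seen = set()
    out: List[str] = []
    for url in urls:
        target = scrape_url(url)
        if target is not None and target not in seen:
            seen.add(target)
            out.append(target)
    return out


# Hypothetical torrent whose first announce points at a defunct site.
print(scrape_candidates(
    "http://dead.example/announce",
    [["http://dead.example/announce"],
     ["http://live.example:6969/announce", "http://odd.example/tracker"]],
))
```

Each returned URL would still have to be fetched and its response decoded; the sketch only covers the step of not stopping at the first announce.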
|
|
|
Feb 21 2008, 10:19 PM
Post
#6
|
|
|
Mininova staff Group: Admin Posts: 739 Joined: 14-March 05 Member No.: 2 |
I agree, multi-tracker scraping would be handy. We actually looked at this a few weeks ago, but there are some nasty technical issues with that.
I'll do some research on the DHT scraping though... I know some people who might have coded a solution for this. |
|
|
|
Feb 24 2008, 06:32 AM
Post
#7
|
|
Advanced Member Group: Ex-moderators Posts: 227 Joined: 29-April 05 Member No.: 5,380 |
One thing about DHT, well, two actually. (Sorry to show you up a little, niek, but remember, as we talked about earlier, bugatti ride's been late coming.)

First, there are two incompatible DHT swarms. There's Az (Azureus), and there is everyone else. That means twice the work.

Second, at least with the regular DHT, there is no indication of seeds or leechers. It's just peer numbers only. So, there's no way to tell if the DHT scrape showing 1000 is 1000 seeds, 900 seeds and 100 leechers, 1 seed and 999 leechers, or all 1000 stuck at 0%.

Oh, and that TPI (torrent popularity index, or [peer count]x[gb size]) is actually a bit of a misnomer. A 1Gb and a 50Gb torrent can be equally popular, but the 1Gb one will show fewer peers, because people will have downloaded and then uploaded to their seed target quicker. As such, that method of rating would significantly favour any reasonably popular huge torrent.

Or, to put it another way, all else being equal, doubling the size means that it will take twice as long to download, and twice as long to reach the seed target. So, a 700Mb file should have a 'TPI' 4x greater than a 350Mb file of the same actual popularity. It's all in the application of common sense. |
|
|
|
Feb 26 2008, 10:44 PM
Post
#8
|
|
Advanced Member Group: Members Posts: 177 Joined: 14-June 05 From: The Great White North Member No.: 13,899 |
> So, there's no way to tell if the DHT scrape showing 1000 is 1000 seeds, 900 seeds and 100 leechers, 1 seed and 999 leechers, or all 1000 stuck at 0%.

(Toss-up) ...anyone know how uTorrent 1.8 does it? (It has seed and leecher counts for the DHT layer.)

> Oh, and that TPI (torrent popularity index, or [peer count]x[gb size]) is actually a bit of a misnomer. A 1Gb and a 50Gb torrent can be equally popular, but the 1Gb one will show fewer peers, because people will have downloaded and then uploaded to their seed target quicker. As such, that method of rating would significantly favour any reasonably popular huge torrent.

That's not an unreasonable way of looking at it, since twice as long of a download means twice as long online attached to that torrent (esp. as opposed to a different one, given that each peer's bandwidth is finite). Big torrents with a high TPI subsequently endure an extremely long time, while trendy 350mb TVrips might have 25,000 peers one week, but be gasping on fumes a couple months later. The bigger torrent is also likely to be a multi-file compilation, and its TPI therefore comprises the collective "popularity" of each of its individual components (e.g., all 276 DBZ eps...DBZ was the most popular eng-dubbed anime, so it's not surprising that its eng-dub torrent has the highest TPI).

> Or, to put it another way, all else being equal, doubling the size means that it will take twice as long to download, and twice as long to reach the seed target. So, a 700Mb file should have a 'TPI' 4x greater than a 350Mb file of the same actual popularity.

Per [size x peers], it should only be 2x. (Seeding time and downloading time are concurrent, so there's no reason to count them both; the seeding time of the torrent's initial uploader is also statistically irrelevant because his bandwidth is an infinitesimal fraction of that of all other peers.)
-------------------- |
|
|
|
Feb 28 2008, 04:33 AM
Post
#9
|
|
Advanced Member Group: Ex-moderators Posts: 227 Joined: 29-April 05 Member No.: 5,380 |
> Per [size x peers], it should only be 2x. (Seeding time and downloading time are concurrent, so there's no reason to count them both; the seeding time of the torrent's initial uploader is also statistically irrelevant because his bandwidth is an infinitesimal of that of all other peers.)

If you're at size A, and length of time B, then TPI is AxB. Double the size, and so double the time, leads to (2xA)x(2xB) or, expanded out, 2xAx2xB. Last time I was in a maths class (many years ago admittedly) it didn't matter what order things were put in, so that works out at 4xAxB or 4AB. Four times bigger. |
|
|
|
Feb 28 2008, 06:39 PM
Post
#10
|
|
Advanced Member Group: Members Posts: 177 Joined: 14-June 05 From: The Great White North Member No.: 13,899 |
> If you're at size A, and length of time B, then TPI is AxB....

TPI isn't size x time, it's size x peers. (Time is merely an aspect of size, since, if given a minimal base of peers to enable a mean peer to trade at his bandwidth capacity, size and time are in 1:1 correlation.)

What TPI measures is a torrent's bandwidth commitment. I.e., if there are 2,150 peers in the 58.8gb DBZ torrent, then they are collectively committed to trading 126,420gb. In contrast, 10,000 peers in a 700mb aXXo torrent are collectively committed to trading only 7,000gb.

==//==

How are things going on the DHT/multi-tracker-scraping fix?

-------------------- |
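The two figures in this post follow directly from the [peer count]x[gb size] definition. A quick check, with the function name chosen here purely for illustration:

```python
def tpi(peers: int, size_gb: float) -> float:
    """Torrent popularity index as defined in the thread: peer count times size in GB."""
    return peers * size_gb


# Figures quoted in the post above.
dbz = tpi(2150, 58.8)    # 58.8 GB compilation with 2,150 peers
axxo = tpi(10000, 0.7)   # 700 MB release with 10,000 peers

print(f"DBZ:  {dbz:,.0f} GB")
print(f"aXXo: {axxo:,.0f} GB")
print(f"ratio: {dbz / axxo:.1f}x")
```

So the larger torrent carries roughly eighteen times the "bandwidth commitment" despite having about a fifth of the peers.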
|
|
|
Feb 28 2008, 11:32 PM
Post
#11
|
|
Advanced Member Group: Ex-moderators Posts: 227 Joined: 29-April 05 Member No.: 5,380 |
> TPI isn't size x time, it's size x peers. (Time is merely an aspect of size, since, if given a minimal base of peers to enable a mean peer to trade at his bandwidth capacity, size and time are in 1:1 correlation.) What TPI measures is a torrent's bandwidth commitment. I.e., if there are 2,150 peers in the 58.8gb DBZ torrent, then they are collectively committed to trading 126,420gb. In contrast, 10,000 peers in a 700mb aXXo torrent are collectively committed to trading only 7,000gb. ==//== How are things going on the DHT/multi-tracker-scraping fix?

Time, peers, same thing. If it's going to take twice as long, that's twice as long a peer will be there.

Also, you have said that 58Gb torrent is a massive multi-file torrent, so saying they're collectively committed to trading 126,420Gb is phooey. That's only true if everyone is there to download and fully upload the entire torrent. In these days of selective file downloads, that just isn't true. A 700Mb torrent of a movie only has one file, so it's downloading that file, or nothing. |
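The 2x-versus-4x disagreement can be made concrete with a toy model. The sketch below is not from the thread: it assumes Little's law (average peers present = arrival rate times average time spent in the swarm) and assumes, as the ex-moderator does, that time in the swarm grows in proportion to size. All names and numbers are invented.

```python
def expected_peers(arrivals_per_hour: float, size_gb: float, hours_per_gb: float) -> float:
    """Little's law: concurrent peers = arrival rate x time each peer stays in the swarm."""
    return arrivals_per_hour * (size_gb * hours_per_gb)


def tpi(peers: float, size_gb: float) -> float:
    """Torrent popularity index: peer count times size in GB."""
    return peers * size_gb


small_gb, large_gb = 0.35, 0.7   # the 350Mb and 700Mb files from the thread
rate, hours_per_gb = 100.0, 2.0  # invented: same demand and same per-GB speed for both

peers_small = expected_peers(rate, small_gb, hours_per_gb)
peers_large = expected_peers(rate, large_gb, hours_per_gb)

# Same demand: the larger file holds its peers twice as long, so peers double too.
ratio_same_demand = tpi(peers_large, large_gb) / tpi(peers_small, small_gb)

# Same observed peer count: only the size factor changes.
ratio_same_peers = tpi(peers_small, large_gb) / tpi(peers_small, small_gb)

print(ratio_same_demand, ratio_same_peers)
```

Under these assumptions each poster's arithmetic holds for his own comparison: TPI grows about fourfold when two torrents of equal demand are compared, and twofold when two torrents with equal peer counts are compared.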
|
|
|
Mar 3 2008, 11:52 AM
Post
#12
|
|
Advanced Member Group: Members Posts: 177 Joined: 14-June 05 From: The Great White North Member No.: 13,899 |
Getting back on subject....
-------------------- |
|
|
|
Mar 15 2008, 06:48 PM
Post
#13
|
|
Advanced Member Group: Members Posts: 177 Joined: 14-June 05 From: The Great White North Member No.: 13,899 |
How is progress going on this?
-------------------- |
|
|
|