Comments on Math and Hats: "The average rank of elliptic curves" by Simon Spicer (feed retrieved 2017-03-22)

Simon Spicer — 2014-07-31 07:16
...computing these as we go *too*,...

Simon Spicer — 2014-07-30 14:20
5) See replies to 2) and 3).

6) I wasn't being intelligent at all when doing the Sage rank() timings - I just called the method with the specified parameters and checked how long it took. No doubt there are faster ways to get ranks out of Sage. For example, I'm aware one could speed things up when running 2-descent on lots of curves by combining all the homogeneous space computations. Also, since we're assuming GRH anyway, the class group computation part should be faster. And there's bound to be some duplication that we could eliminate by keeping track of the right things.

Thanks for all your comments. Who am I replying to, by the way?

- Simon

Simon Spicer — 2014-07-30 14:03
The parameter Delta determines the runtime of the routine. However, Delta directly specifies the tightness of the zero sum; since curves with large conductor have more zeros, we should scale our choice of Delta accordingly to get rank estimates of comparable fidelity.

Specifically, if $E$ has conductor $N$, then if I remember correctly the expected number of zeros with imaginary part at most 1 in magnitude is $\frac{\log(N)}{2\pi} + O(1)$. Accordingly, Delta should be set to some constant times $\log(N)$. Since runtime is exponential in $\Delta$, this is tantamount to saying that in order to produce rank bounds of constant fidelity, runtime will scale with the conductor to some power. In other words: yes, you're correct.

I haven't done this in my code yet.
I should at least make it available as an option.

Simon Spicer — 2014-07-30 13:49
3) We haven't thought too hard about sampling; the aim of this project from the outset was to compute the ranks of *all* curves up to some large height bound. It might just be that random sampling of curves ordered by naive height is actually less prone to bias than, say, ordering by discriminant; in that case sampling would be a good idea. I will have to think about this some more.

Simon Spicer — 2014-07-30 13:48
Thanks - I've mentioned Matt Young's results and added a link to his paper in my post, and reworded the definition of average rank and Bhargava's results thereon to make it a bit clearer.

CatacycleUmberto — 2014-07-30 13:43
Perhaps a comment in a different direction. If you want to compute a_p fast for many elliptic curves, one way is to do a giant precomputation.

For instance, for primes up to X, for each j-invariant mod p you precompute (up to sign) the associated a_p, and then just need to figure out which twist you have. This latter bit is, I think, a Kronecker symbol computation, so much faster than an a_p computation.

For X = 10^6 the table should be about 75GB in size (and thus fits in RAM on large machines).
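The twist trick underlying this precomputation can be sketched in a few lines of plain Python (brute-force point counting purely for illustration - a real table would use fast point counting; the function names are illustrative, not from either commenter's code):

```python
def legendre(t, p):
    """Legendre symbol (t/p) for an odd prime p, via Euler's criterion."""
    t %= p
    if t == 0:
        return 0
    return 1 if pow(t, (p - 1) // 2, p) == 1 else -1

def ap_naive(a, b, p):
    """Trace of Frobenius a_p for E: y^2 = x^3 + a*x + b over F_p,
    using a_p = -sum_x ((x^3 + a*x + b)/p). Fine only for tiny p."""
    return -sum(legendre(x**3 + a * x + b, p) for x in range(p))

# The table stores a_p (up to sign) keyed by j-invariant mod p; a curve
# with the same j-invariant is a quadratic twist, and its a_p differs
# only by the quadratic character of the twisting factor d:
p, a, b, d = 13, 1, 1, 2
twist_a, twist_b = d**2 * a % p, d**3 * b % p
assert ap_naive(twist_a, twist_b, p) == legendre(d, p) * ap_naive(a, b, p)
```

The assertion is the point: once the table gives a_p up to sign, only the cheap symbol computation legendre(d, p) is needed per curve.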
For an *experiment* this is probably better, as it allows you to compute up to 10^6 maybe 100x faster, but for *usable Sage code* it obviously is not - sort of like your previous "computing primes" question.

CatacycleUmberto — 2014-07-30 13:29
Two more comments: You mention the works of Brumer and Heath-Brown, but neglect to point out that Matt Young beat the 2 barrier, achieving 25/14. http://www.ams.org/journals/jams/2006-19-01/S0894-0347-05-00503-5/

In the last sentence of that paragraph, it is hard to parse what you are saying, but I don't think the current Bhargava/Shankar methods can hope to get much below 0.8 (or 0.85), let alone reach 1/2. I think your prior phrasing of "average rank of E for all curves up to height X, as X gets bigger and bigger" makes it sound (to the outsider) as if B/S could improve their 0.885 by taking X larger, whereas that isn't really what is going on with any numerical improvements they might get. Rather, one way to improve 0.885 would be to understand the variation of the root number in a larger class of curves than the one they currently use.

And a comment on my comment #1: Maybe I'm confusing what Calegari said. He might have been more fretful about the "65% of curves satisfy BSD" statements. See his comment here:
http://mattbakerblog.wordpress.com/2014/03/10/the-bsd-conjecture-is-true-for-most-elliptic-curves/comment-page-1/#comment-235

Simon Spicer — 2014-07-30 13:26
2) 10 CPU-years isn't too long in the greater scheme of things, but if we can cut that down by an order of magnitude by using other methods, then we should certainly do so. This is a relatively small project and we don't as yet have much in the way of cluster access, so I'm trying to make the computations as efficient as possible. In all likelihood we will have a certain proportion of curves at the end for which all the easy methods fail, and we'll have to resort to some beefier hardware to get those ranks.

Simon Spicer — 2014-07-30 13:20
Hi CatacycleUmberto,
1) Agreed, the definition of height is not canonical; right at the beginning of this project we were umming and aahing over whether to use the [4, 27] coefficients or not, and in the end opted not to. Certainly whatever resulting averages we eventually get out will be affected by this; however, the idea is that their inclusion or exclusion shouldn't affect the asymptotics. The good news is that reordering curves by a recalibrated height is quick, so it won't be the bottleneck. Either way, I doubt height 10^8 - however you choose to define it - is going to be large enough to see average rank stop increasing and start decreasing.

I know Bhargava's work is really about average n-Selmer group sizes; we are computing these as we go to, so hopefully we'll have some data to show there too.

CatacycleUmberto — 2014-07-30 11:35
4) Regarding "essentially constant" runtime: The parameters chosen in this rank-bounding method should depend on height. More precisely, to get good results you should be taking a number of L-series coefficients equal to some power of the conductor (fortunately much less than the square root of the conductor, which is what you would need to compute the L-values outright). Maybe this power z is so small that you can consider N^z to be constant in your range. My guess (from some experiments) is that z = 1/4 already works a good deal of the time, and even z = 1/8 is reasonable to try for larger N; for z = 1/4 and curves of conductor up to 10^12 that is 1000 L-series coefficients, which as you say takes microseconds. But I insist that, asymptotically at least, the number of coefficients used should scale with N by some formula.

5) I went through an analysis like yours about a month ago, though I was ordering by discriminant (cf.
the Stein-Watkins database), considering a height bound of 10^12, or about 10^10 curves. I ended up concluding that the "huge savings" was a constant factor of maybe 3 or 5; the point is that you *do* have a lot of rank 2 curves (20%?), not to mention another (small) percentage with low-lying zeros. Unfortunately, I was unable to convince myself that this savings of 3 or 5 really made the experiment desirable, as the convergence to the asymptotic should be slow (logarithmic?). My conclusion was that sampling (#3 above) was simply superior evidence. The main reason to consider all curves up to a given height would be to dovetail with the Bhargava/Shankar bounds, but as #1 notes, this is finicky in the first place. OTOH, I *did* think of using analytic rank as you do (I got the idea from Bober's ANTS paper), and this should be just as applicable when sampling. Bottom line: whatever method you use to compute ranks, sampling should be part of your arsenal in the statistical analysis.

6) Finally, "However, its runtime for curves of height near 100 million is on the order of 20 seconds" ... I don't know what Sage is doing, but this seems *way* too long. The Cremona mwrank method should be suitable for many curves (those whose discriminant is small enough), and for the rest the algebraic 2-descent method should be used, where the (PARI?) class group computation (assuming GRH), e.g. with Denis Simon's code, should not be overwhelming. In reality, I might choose the second method, as its runtime is more nearly constant. I really can't imagine *any* curve of conductor less than 10^10 taking more than a second (2-Selmer computations give an upper bound on the rank, which, as with your methods, filters out a large subset). I ran a simple loop in Magma over curves with A of size 10^4 and B about 10^6, and every TwoSelmerGroup call took less than 0.3 seconds, with the average about 0.1s.
Obviously your analytic rank methods are still useful, but I think you should reconsider the timings for the other methods, if nothing else to get the comparison factor correct. My *guess* is that "rank" in Sage is doing the whole work of computing the Mordell-Weil group in every case(?), which is overkill. Maybe you can investigate this and figure out what is going on.

CatacycleUmberto — 2014-07-30 11:35
I have a number of comments - more than the 4096-character limit allows!

1) The choice of max(A^3, B^2) in the height bound is not canonical. Neither is max(4A^3, 27B^2), but there are reasons to use the latter. In fact, I remember Frank Calegari saying that he specifically disliked Bhargava phrasing all his results in terms of bounds on average ranks (like 0.885), as the specific constants one obtains are sensitive to the choice of 4 and 27. I am not sure I agree with Frank, but it is something to consider.

2) I can't think that 10 CPU-years is all *that* long given cluster sizes today. I guess it depends on how important you think this computation is (I heard a Bianchi modular form team with Cremona quoting 900000 CPU-hours for their work, which is over 100 CPU-years). See also #6 below.

3) The paper of Stein (with Mazur, Bektemirov and Watkins) in the BAMS a few years back gave some additional evidence with the "obvious" idea of sampling curves at a larger height (Section 5.2.1). They say it is nontrivial to achieve an unbiased sample (with respect to the real period?), but I am not thoroughly convinced of this (or sure that their methods are truly robust).
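The two height normalizations debated in point 1 differ only in the discriminant-style weights 4 and 27; a minimal Python sketch (the function name and flag are illustrative, not from the post) makes the bounded discrepancy concrete:

```python
def naive_height(A, B, weighted=False):
    """Naive height of E: y^2 = x^3 + A*x + B.

    weighted=False gives max(|A|^3, B^2) (the post's choice);
    weighted=True gives max(4*|A|^3, 27*B^2) (the [4, 27] variant).
    Both the name and the keyword are illustrative conventions.
    """
    if weighted:
        return max(4 * abs(A)**3, 27 * B**2)
    return max(abs(A)**3, B**2)

# The two versions differ by a factor between 1 and 27, so they order
# curves almost identically, but the constants in any asymptotic count
# (such as the 0.885 average-rank bound) are sensitive to the choice.
print(naive_height(-1, 1), naive_height(-1, 1, weighted=True))  # 1 27
```

Since the discrepancy is a bounded factor, either choice gives the same asymptotics while shifting which curves fall below any fixed cutoff X, which is the sense in which the specific constants are not canonical.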