Math and Hats

Fall quarter at UW begins!

2014-09-23T09:25:00.002-07:00

All good things must come to an end! It's been a gloriously warm and sunny summer, but with the advent of august this week the rain is due to set in. Cue the six months in Seattle where you don't see the sun (I've started taking my vitamin D supplements again).

This week also sees the University of Washington start back up again: Wednesday 24 September is the first day of fall quarter classes. I will be teaching two sections of MATH307: Introduction to Differential Equations. My website for the course is up and running at http://www.math.washington.edu/~mlungu/teaching/math307au14/index.html - go take a look (and let me know if there are any typos)!

GSoC: Wrapping Up

2014-08-22T09:04:00.003-07:00

Today marks the last day of my Google Summer of Code 2014 project. Evaluations are due midday Friday PDT, and code submissions for successfully passed projects start soon thereafter. The end result of my project can be found at Sage Trac Ticket 16773. In total it's just over 2000 lines of Python and Cython code, to be added to the next major Sage release (6.4) if/when it gets a final thumbs-up review.

When I write just the number of lines of code it doesn't sound like very much at all - I'm sure there are GSoC projects this year that produced at least 10k lines of code! However, given that what I wrote is complex mathematical code that needed a) significant optimization, and b) to be be mathematically sound in the first place, I'd say that isn't too bad. Especially since the code does what the project description aimed for it to do: compute analytic ranks of rational elliptic curves very quickly with a very low rate of failure. Hopefully this functionality can and will be used to produce numerical data for number theory research in the years to come.

The Google Summer of Code has been a thoroughly rewarding experience for me. It's a great way to sharpen one's coding skills and get paid in the process. I recommend it for any university student who's looking to go into any industry that requires programming skills; I'd apply to do it again next year if I wasn't planning to graduate then!

Things are Better in Parallel

2014-08-14T12:23:00.000-07:00

A recent improvement I've implemented in my GSoC code is to allow for parallelized computation. The world has rapidly moved to multi-core as a default, so it makes sense to write code that can exploit this. And it turns out that the zero sum rank estimation method that I've been working on can be parallelized in a very natural way.

THE @parallel DECORATOR IN SAGE

But first: how does one compute in parallel in Sage? Suppose I have written a function in a Sage environment (e.g. a SageMathCloud worksheet, .sage file, Sage console etc.) which takes in some input and returns some other input. The simple example f below takes in a number n and returns the square of that number.

sage: def f(n):

....:     return n*n

....: 

sage: f(2),f(3),f(5)

(4, 9, 25)

This is a fairly straightforward beast; put in a value, get a value out. But what if we have some computation that requires evaluating that function on a large number of inputs? For example, say we want to compute the sum of the first 10 million squares. If you only have one processor to tap, then you're limited to calling f over and over again in series:

sage: def g(N):

....:     y = 0

....:     for n in range(N+1):

....:         y += f(n)

....:     return y

....: 

sage: %time g(10000000)

CPU times: user 17.5 s, sys: 214 ms, total: 17.7 s

Wall time: 17.6 s

333333383333335000000

In this example you could of course invoke the formula for the sum of the first $n$ squares and just write down the answer without having to add anything up, but in general you won't be so lucky. You can optimize the heck out of f, but in general when you're limited to a single processor you're confined to iterating over all the values you need to consider sequentially .

However, if you have 2 processors available one could try write code that splits the work into two roughly equal parts that can run relatively independently. For example, for our function we could compute the sum of all the even squares up to a given bound in one process and the sum of all the odd squares in another, and then add the two results together to get the sum of all square up to our bound.

Sage has a readily available mechanism to do exactly this: the @parallel decorator. To enable parallel computation in your function, put @parallel above your function definition (the decorator can take some parameters; below ncpus=2 tells it that we want to use 2 processors). Note that we also have to modify the function: now it no longer only takes the bound up to which we must add squares, but also a flag indicating whether we should consider even or odd squares.

sage: @parallel(ncpus=2)

....: def h(N,parity):

....:     y = 0

....:     if parity=="even":

....:         nums = range(0,N+1,2)

....:     elif parity=="odd":

....:         nums = range(1,N+1,2)

....:     for n in nums:

....:         y += f(n)

....:     return y

Instead of calling h with its standard sequence of parameters, we can pass it a list of tuples, where each tuple is a valid sequence of inputs. Sage then sends each tuple of inputs off to an available processor and evaluates the function on them there. What's returned is a generator object that can iterate over all the outputs; we can always see the output directly by calling list() on this returned generator:

sage: for tup in list(h([(1000,"even"),(1000,"odd")])):

....:     print(tup)

....: 

(((1000, 'even'), {}), 167167000)

(((1000, 'odd'), {}), 166666500)

Note that the tuple of inputs is also returned. Because we're doing things in parallel, we need to know which output corresponds to which input, especially since processes may finish at different times and return order is not necessarily the same as the order of the input list.

Finally, we can write a wrapper function which calls our parallelized function and combines the returned results:

sage: def i(N):

....:     y = 0

....:     for output in h([(N,"even"),(N,"odd")]):

....:         y += output[1]

....:     return y

....: 

sage: %time i(10000000)

CPU times: user 1.76 ms, sys: 33.2 ms, total: 35 ms

Wall time: 9.26 s

333333383333335000000

Note that i(10000000) produces the same output at g(10000000) but in about half the time, as the work is split between two processes instead of one. This is the basic gist of parallel computation: write code that can be partitioned into parts that can operate (relatively) independently; run those parts on different processors simultaneously; and then collect returned outputs and combine to produce desired result.

PARALLELIZING THE ZERO SUM COMPUTATION

Let's take a look at the rank estimating zero formula again. Let $E$ be a rational elliptic curve with analytic rank $r$. Then

\begin{align*}
r < \sum_{\gamma} \text{sinc}^2(\Delta\gamma) &= \frac{1}{\pi \Delta}\left(-\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)\right) \\
&+ \frac{1}{2\pi^2\Delta^2}\left(\frac{\pi^2}{6}-\text{Li}_2\left(e^{-2\pi\Delta}\right) \right) \\
&+ \frac{1}{\pi \Delta}\sum_{\substack{n = p^k \\ n < e^{2\pi\Delta}}} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right)
\end{align*}
where

$\gamma$ ranges over the nontrivial zeros of the $L$-function attached to $E$
$\Delta$ is a positive parameter
$\eta$ is the Euler-Mascheroni constant $=0.5772\ldots$
$N$ is the conductor of $E$
$\text{Li}_2(x)$ is the dilogarithm function, defined by $\text{Li}_2(x) = \sum_{k=1}^{\infty} \frac{x^k}{k^2}$
$c_n$ is the $n$th coefficient of the logarithmic derivative of the $L$-function of $E$, which is zero when $n$ is not a perfect prime power.

The right hand side of the sum, which is what we actually compute, can be broken up into three parts: the first term involving the curve's conductor $N$; the second term involving the dilogarithm function $Li_2(x)$; and the sum over prime powers. The first two parts are quick to compute: evaluating them can basically be done in constant time regardless of the magnitude of $N$ or $\Delta$.

It is therefore not worth considering parallelizing these two components, since the prime power sum dominates computation time for all but the smallest $\Delta$ values. Instead, what I've done is rewritten the zero sum code so that the prime power sum can be evaluated using multiple cores.

As mentioned in this post, we can turn this sum into one indexed by the primes (and not the prime powers); this actually makes parallelization quite straightforward. Recall that all primes except $2$ are odd, and all primes except $2$ and $3$ are either $1$ or $5$ remainder $6$. One can scale this up: given a list of small primes $[p_1,p_2,\ldots,p_n]$, all other primes fall into one of a relatively small number of residue classes modulo $p_1 p_2\cdots p_n$. For example, all primes beyond $2$, $3$, $5$ and $7$ have one of the following 48 remainders when you divide them by $210 = 2\cdot 3\cdot 5 \cdot 7$:
\begin{align*}
&1, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79,\\
&83, 89, 97, 101, 103, 107, 109, 113, 121, 127, 131, 137, 139, 143, 149, \\
&151, 157, 163, 167, 169, 173, 179, 181, 187, 191, 193, 197, 199, 209,
\end{align*}
and no other.

If we had 48 processors available. the natural thing to do would be to get each of them to iterate over all integers in a particular residue class up to $e^{2\pi\Delta}$, evaluating the summand whenever that integer is prime, and returning the sum thereof. For example, if the bound was 1 million, then processor 1 would compute and return $\sum_{n \equiv 1 (\text{mod } 210)}^{1000000} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right)$. Processor 2 would do the same with all integers that are $11$ modulo $210$, etcetera.

In reality, we have to figure out a) how many processors are available, and b) partition the work relatively equally among those processors. Thankfully sage.parallel.ncpus.ncpus() succinctly addresses the former, and the latter is achieved by splitting the residue classes into $n$ chunks of approximately equal size (where $n$ is the number of available CPUs) and then getting a given processor to evaluate the sum over those residues in a single chunk.

Here is the method I wrote that computes the $\text{sinc}^2$ zero sum with (the option of) multiple processors:

Note that I've defined a subfunction to compute the prime sum over a given subset of residue classes; this is the part that is parallelized. Obtaining the residue chunks and computing the actual summand at each prime are both farmed out to external methods.

Let's see some timings. The machine I'm running Sage on has 12 available processors:

sage: sage.parallel.ncpus.ncpus()

12

sage: E = EllipticCurve([12838,-51298])

sage: Z = LFunctionZeroSum(E)

sage: %time Z._zerosum_sincsquared_fast(Delta=2.6)

CPU times: user 36.7 s, sys: 62.9 ms, total: 36.8 s

Wall time: 36.8 s

2.8283629046

sage: %time Z._zerosum_sincsquared_parallel(Delta=2.6)

CPU times: user 7.87 ms, sys: 133 ms, total: 141 ms

Wall time: 4.06 s

2.8283629046

Same answer in a ninth of the time! Note that the parallelized method has some extra overhead, so even with 12 processors we're unlikely to get a full factor of 12 speedup. Nevertheless, the speed boost will allow us to run the zero sum code with larger $\Delta$ values, allowing us to investigate elliptic curves with even larger conductors.

How big should Delta be?

2014-08-05T11:45:00.000-07:00

Let's take a look at the central formula in my GSoC rank estimation code. Let $E$ be a rational elliptic curve with analytic rank $r$. Then
$$ r < \sum_{\gamma} \text{sinc}^2(\Delta\gamma) = \frac{1}{\pi\Delta} \left[ C_0 + \frac{1}{2\pi\Delta}\left(\frac{\pi^2}{6}-\text{Li}_2\left(e^{-2\pi\Delta}\right) \right) + \sum_{n=1}^{e^{2\pi\Delta}} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right) \right] $$
where

$\gamma$ ranges over the nontrivial zeros of the $L$-function attached to $E$
$\Delta$ is a positive parameter
$C_0 = -\eta + \log\left(\frac{\sqrt{N}}{2\pi}\right)$; $\eta$ is the Euler-Mascheroni constant $=0.5772\ldots$ and $N$ is the conductor of $E$
$\text{Li}_2(x)$ is the dilogarithm function, defined by $\text{Li}_2(x) = \sum_{k=1}^{\infty} \frac{x^k}{k^2}$
$c_n$ is the $n$th coefficient of the logarithmic derivative of the $L$-function of $E$.

The thing I want to look at in this post is that parameter $\Delta$. The larger you make it, the closer the sum on the left hand side over the zeros is to the analytic rank, so when trying to determine the rank of $E$ we want to pick as large a $\Delta$ as we can. However, the bigger this parameter is the more terms we have to compute in the sum over the $c_n$ on the right hand side; moreover the number of terms - and thus the total computation time - scales exponentially with $\Delta$. This severely constrains how big we can make $\Delta$; generally a value of $\Delta=2$ may take a second or two for a single curve on SageMathCloud, while $\Delta=3$ may take upwards of an hour. For the average rank project I'm working on I ran the code on 12 million curves using $\Delta=1.3$; the total computation time was about 4 days on SMC.

However, it should be clear that using too large a $\Delta$ is overkill: if you run the code on a curve with $\Delta=1$ and get a bound of zero out, you know that curve's rank exactly zero (since it's at most zero, and rank is a non-negative integer). Thus using larger $\Delta$ values on that curve will do nothing except provide you the same bound while taking much longer to do so.

This begs the question: just how big of a $\Delta$ value is good enough? Can we, given some data defining an elliptic curve, decide a priori what size $\Delta$ to use so that a) the computation returns a bound that is likely to be the true of the curve, and b) it will do so in as little time as possible?

The relevant invariant to look at here is conductor $N$ of the elliptic curve; go back to the formula above and you'll see that the zero sum includes a term which is $O\left(\frac{\log(N)}{2\pi\Delta}\right)$ (coming from the $\frac{1}{\pi \Delta} C_0$ term). This means that size of the returned estimate will scale with $\log(N)$: for a given $\Delta$, the bound returned on a curve with 10-digit conductor will be about double that which is returned for a 5-digit conductor curve, for example. However, we can compensate this by increasing $\Delta$ accordingly.

To put it all more concretely we can pose the following questions:

Given an elliptic curve $E$ with conductor $N$, how large does $\Delta$ need to be in terms of $N$ so that the returned bound is guaranteed to be the true analytic rank of the curve?
Given a conductor size $N$ and a proportionality constant $P \in [0,1]$, how big does $\Delta$ have to be in terms of $N$ and $P$ so that at least $P\cdot 100$ percent of all elliptic curves with conductor of size about $N$ will, when the rank estimation code is run on them with that $\Delta$ value, yield returned bounds that are equal to their true analytic rank?

[Note: in both of the above questions we are making the implicit assumption that the returned rank bound is monotonically decreasing for increasing $\Delta$. This is not necessarily the case: the function $y = \text{sinc}^2(x)$ is not a decreasing function in $x$. However, in practice any upwards fluctuation we see in the zero sum is small enough to be eliminated when we take the floor function to get an integer rank bound.]

A $\Delta$ CHOICE GOOD ENOUGH FOR MOST CURVES

The first question is easier to phrase, but more difficult to answer, so we will defer it for now. To answer the second question, it is useful mention what we know about the distribution and density of nontrivial zeros of an elliptic curve $L$-function.

Using some complex analysis we can show that, for the $L$-function of an elliptic curve with conductor $N$, the expected number of zeros in the critical strip with imaginary part at most $T$, is $O(T\log N+ T\log T)$. That is, expected zero density has two distinct components: a part that scales linearly with log of the conductor, and a part that doesn't scale with conductor (but does scale slightly faster than linearly with how far out you go).

Consider the following: if we let

$$\Delta(N) = \frac{C_0}{\pi} = \frac{1}{\pi}\left(-\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)\right)$$

then the first term in the right hand side of the zero sum formula is precisely 1 - this is the term that comes from the $\log N$ part of the zero density. The next term - the one involving $\frac{\pi^2}{6}-\text{Li}_2\left(e^{-2\pi\Delta}\right)$ - is the term that comes from the part independent of $N$; because the right hand side is divided by $\Delta$ it therefore goes to zero as the curve's conductor increases. The third term contains the $c_n$ coefficients which (per Sato-Tate) will be positive half the time and negative half the time, so the entire sum could be positive or negative; we therefore expect its contribution to be small on average when we look at large number of elliptic curves.

It thus stands to reason that for this value of Delta, and when the conductor $N$ is sufficiently large, the zero sum will be about 1, plus or minus a smallish amount coming from the $c_n$ sum. This argument is by no means rigorous, but we might therefore expect the zero sum to be within 2 of the actual analytic rank most of the time. Couple that with knowledge of the root number and you get a rank upper bound out which is equal to the actual analytic rank in all but a few pathological cases.

Empirical evidence bears this out. I ran my rank estimation code with this choice of $\Delta$ scaling on the whole Cremona database, which contains all elliptic curves up to conductor 350000:

The proportion of curves up to conductor $N$ for which the computed rank bound is strictly greater than rank. The $x$-axis denotes conductor; the $y$-axis is the proportion of all curves up to that conductor for which the returned rank bound was not equal to the true rank (assuming BSD and GRH as always).

As you can see, the percentage of curves for which the rank bound is strictly greater than rank holds fairly constant at about 0.25%. That's minuscule: what this is saying is that if you type in a random Weierstrass equation, there is only about a 1 in 1000 chance that the rank bounding code with $\Delta = \frac{C_0}{\pi}$ will return a value that isn't the actual analytic rank. Moreover, the code runs considerably faster than traditional analytic rank techniques, so if you wanted to, for example, compute the ranks of a database of millions of elliptic curves, this would be a good first port of call.

Of course, this $\Delta$ scaling approach is by no means problem-free. Some more detailed analysis will show that that as stated above, the runtime of the code will actually be $O(N)$ (omitting log factors), i.e. asymptotic scaling is actually worse than traditional analytic rank methods, which rely on evaluating the $L$-function directly and thus are $O(\sqrt{N})$. It's just that with this code we have some very small constants sitting in front, so the crossover point is at large enough conductor values that neither method is feasible anyway.

This choice of $\Delta$ scaling works for conductor ranges up to about $10^9$; that corresponds to $\Delta \approx 2.5$, which will give you a total runtime of about 10-20 seconds for a single curve on SMC. Increase the conductor by a factor of 10 and your runtime will also go up tenfold.

For curves of larger conductor, instead of setting $\Delta = \frac{C_0}{\pi}$ we can choose to set $\Delta$ to be $\alpha\cdot \frac{C_0}{\pi}$ for any $\alpha \in [0,1]$; the resulting asymptotic runtime will then be $O(N^{\alpha})$, at the expense of having a reduced proportion of elliptic curves where rank bound is equal to true rank.

HOW LARGE DOES $\Delta$ HAVE TO BE TO GUARANTEE TRUE RANK?

When we use $\Delta = \frac{C_0}{\pi}$, the curves for which the returned rank estimate is strictly larger than the true rank are precisely those which have unusually low-lying zeros. For example, the rank 0 curve with Cremona label 256944c1, has a zero with imaginary part at 0.0256 (see here for a plot), compared to an expected value of 0.824. Using $\Delta = \frac{C_0}{\pi}$ on this curve means $\Delta \approx 1.214$; if we compute the corresponding zero sum with this value of $\Delta$ we get a value of 2.07803. The smallest value of $\Delta$ for which we get a zero sum value of less than 2 is empirically about 2.813; at this point taking the floor and invoking parity tells us that the curve's rank is zero.

The above example demonstrates that if we want to guarantee that the returned rank bound is the true analytic rank, we are forced to increase the size of $\Delta$ to something larger than $\frac{C_0}{\pi}$. Do we need to increase $\Delta$ by a fixed value independent of $N$? Do we need to increase it by some constant factor? Or does it need to scale faster than $\log N$? These are hard questions to answer; it comes down to determining how close to the central point the lowest nontrivial zero can be as a function of the conductor $N$ (or some other invariants of $E$), which in turn is intimately related to estimating the size of the leading coefficient of the $L$-series of $E$ at the central point. This is already the topic of a previous post: it is a question that I hope to make progress in answering in my PhD dissertation.

The average rank of elliptic curves

2014-07-29T12:12:00.000-07:00

It's about time I should demonstrate the utility of the code I've written - the aim of the game for my GSoC project, after all, is to provide a new suite of tools with which to conduct mathematical research.

First some background. Given an elliptic curve $E$ specified by equation $y^2 = x^3 + Ax + B$ for integers $A$ and $B$, one of the questions we can ask is: what is the average rank of $E$ as $A$ and $B$ are allowed to vary? Because there are an infinite number of choices of $A$ and $B$, we need to formulate this question a bit more carefully. To this end, let us define the height of $E$ to be
$$ h(E) = \max\{|A|^3,|B|^2\} $$
[Aside: The height essentially measures the size of the coefficients of $E$ and is thus a fairly decent measure of the arithmetic complexity of the curve. We need the 3rd and 2nd powers in order to make the height function scale appropriately with the curve's discriminant.]

We can then ask: what is the limiting value of the average rank of all curves up to height $X$, as $X$ gets bigger and bigger? Is it infinity? Is it 0? Is it some non-trivial positive value? Does the limit even exist? It's possible that the average rank, as a function of $X$ could oscillate about and never stabilize.

There are strong heuristic arguments suggesting that the answer should be exactly $\frac{1}{2}$. Specifically, as the height bound $X$ gets very large, half of all curves should have rank 0, half should have rank 1, and a negligible proportion should have rank 2 or greater.

Even as recently as five years ago this there we knew nothing concrete unconditionally about average curve rank. There are some results by Brumer, Heath-Brown and Young providing successively better upper bounds on the average rank of curves ordered by height (2.3, 2 and $\frac{25}{14}$ respectively), but these results are contingent on the Riemann Hypothesis.

However, starting in 2011 Manjul Bhargava, together with Christopher Skinner and Arul Shankar, published a series of landmark papers (see here for a good expository slideshow, and here and here for two of the latest publications) proving unconditionally that average rank - that is, the limiting value of the average rank of all elliptic curves up to height $X$ - is bounded above by 0.885. A consequence of their work too is that at least 66% of all elliptic curves satisfy the rank part of the Birch and Swinnerton-Dyer Conjecture.

To a computationally-minded number theorist, the obvious question to ask is: Does the data support these results? I am by no means the first person to ask this question. Extensive databases of elliptic curves under various orderings have already been compiled, most notably those by Cremona (ordered by conductor) and Stein-Watkins (ordered essentially by discriminant). However, as yet no extensive tabulation of height-ordered elliptic curves has been carried out.

Here is a summarized table of elliptic curves with height at most 10000 - a total of 8638 curves, and the ranks thereof (all computations done in Sage, of course):

Rank	# Curves	%
0	2988	34.59%
1	4307	49.86%
2	1286	14.89%
3	57	0.66%
$\ge$4	0	0.00%
Total:	8638

Thus the average rank of elliptic curves is 0.816 when the height bound is 10000. This is worrisome: the average is significantly different from the value of 0.5 we're hoping to see.

The situation gets even worse when we go up to height bound 100000:

Rank	# Curves	%
0	19492	33.11%
1	28818	48.96%
2	9747	16.56%
3	801	1.36%
$\ge$4	4	0.01%
Total:	58862

This yields an average rank of 0.862 for height bound 100000. Bizarrely, the average rank is getting bigger, not smaller!

[Note: the fact that 0.862 is close to Bhargava's asymptotic bound of 0.885 is coincidental. Run the numbers for height 1 million, for example, and you get an average rank of 0.8854, which is bigger than the above asymptotic bound. Observationally, we see the average rank continue to increase as we push out to even larger height bounds beyond this.]

So what's the issue here? It turns out that a lot of the asymptotic statements we can make about elliptic curves are precisely that: asymptotic, and we don't yet have a good understanding of the associated rates of convergence. Elliptic curves, ornery beasts that they are, can seem quite different from their limiting behaviour when one only considers curves with small coefficients. We expect (hope?) that the average to eventually turn around and start to decrease back down to 0.5, but the exact point at which that happens is as yet unknown.

This is where I come in. One of the projects I've been working on (with Wei Ho, Jen Balakrishnan, Jamie Weigandt, Nathan Kaplan and William Stein) is to compute the average rank of elliptic curves for as large a height bound as possible, in the hope that we will get results a bit more reassuring than the above. The main steps of the project are thus:

Generate an ordered-by-height database of all elliptic curves up to some very large height bound (currently 100 million; about 18.5 million curves);
Use every trick in the book to compute the ranks of said elliptic curves;
Compute the average of said ranks.

Steps 1 and 3 are easy. Step 2 is not. Determining the rank of an elliptic curve is a notoriously hard problem - no unconditional algorithm with known complexity currently exists - especially when you want to do it for millions of curves in a reasonable amount of time. Sage, for example, already has a rank() method attached to their EllipticCurve class; if you pass the right parameters to it, the method will utilize an array of approaches to get a value out that is (assuming standard conjectures) the curve's rank. However, its runtime for curves of height near 100 million is on the order of 20 seconds; set it loose on 18.5 million such curves and you're looking at a total computation time of about 10 CPU years.

Enter GSoC project stage left. At the expense of assuming the Generalized Riemann Hypothesis and the Birch and Swinnerton-Dyer Conjecture, we can use the zero sum rank bounding code I've been working on to quickly compute concrete upper bounds on an elliptic curve's rank. This approach has a couple of major advantages to it:

It's fast. In the time it's taken me to write this post, I've computed rank bounds on 2.5 million curves.
Runtime is essentially constant for any curve in the database; we don't have to worry about how the method scales with height or conductor. If we want to go up to larger height bounds at a later date, no problem.

As always, some Terms and Conditions apply. The rank bounding code only gives you an upper bound on the rank: if, for example, you run the code on a curve and get the number 4 back, there's no way to determine with this method if the curve's rank is 4, or if it is really some non-negative integer less than 4.

Note: there is an invariant attached to an elliptic curve called the root number which is efficiently computable, even for curves with large conductor (it took less than 20 minutes to compute the root number for all 18.5 million curves in our database). The root number is one of two values: -1 or +1; if it's -1 the curve has odd analytic rank, and if it's +1 the curve has even analytic rank. Assuming BSD we can therefore always easily tell if the curve's rank is even or odd. My GSoC rank estimation code takes the root number into account, so in the example above, a returned value of 4 tells us that the curve's true rank is one of three values: 0, 2 or 4.

Even better, if the returned value is 0 or 1, we know this must be the actual algebraic rank of the curve: firstly, there's no ambiguity as to what the analytic rank is - it has to the returned 0 or 1; secondly, the BSD conjecture has been proven in the rank 0 & 1 cases. Thus even though we are a priori only computing analytic rank upper bounds, for some proportion of curves we've found the actual algebraic rank.

[Note that the rank bounding code I've written is predicated on knowing that all nontrivial zeros of an elliptic curve $L$-function lie on the critical line, so we still have to assume the Generalized Riemann Hypothesis.]

Thus running the rank bound code on the entire database of curves is a very sensible first step, and it's what I'm currently doing. It's computationally cheap to do - on SageMathCloud, using a Delta value of 1.0, the runtime for a single curve is about 4 milliseconds. Moreover, for some non-negligible percentage of curves the bounds will be observably sharp - based on some trial runs I'm expecting about 20-30% of the computed bounds to be 0 or 1.

That's about 4 million curves for which we won't have to resort to much more expensive rank finding methods. Huge savings!

How to efficiently enumerate prime numbers

2014-07-21T14:53:00.000-07:00

There's a part in my code that requires me to evaluate a certain sum

$$ \sum_{p\text{ prime }< \,M} h_E(p) $$

where $h_E$ is a function related to specified elliptic curve that can be evaluated efficiently, and $M$ is a given bound that I know. That is, I need to evaluate the function h_E at all the prime numbers less than $t$, and then add all those values up.

The question I hope to address in this post is: how can we do this efficiently as $M$ gets bigger and bigger? Specifically, what is the best way to compute a sum over all prime numbers up to a given bound when that bound can be very large?

[For those who have read my previous posts (you can skip this paragraph if you haven't - it's not the main thrust of this post), what I want to compute is, for an elliptic curve $E/\mathbb{Q}$, the analytic rank bounding sum $ \sum_{\gamma} \text{sinc}^2(\Delta \gamma) $ over the zeros of $L_E(s)$ for positive parameter $\Delta$; this requires us to evaluate the sum $ \sum_{n < \exp(2\pi\Delta)} c_n\cdot(2\pi\Delta-\log n)$. Here the $c_n$ are the logarithmic derivative coefficients of the completed $L$-function of $E$. Crucially $c_n = 0$ whenever $n$ isn't a prime power, and we can lump together all the terms coming from the same prime; we can therefore express the sum in the form you see in the first paragraph.]

As with so many things in mathematical programming, there is a simple but inefficient way to do this, and then there are more complicated and ugly ways that will be much faster. And as has been the case with other aspects of my code, I've initially gone with the first option to make sure that my code is mathematically correct, and then gone back later and reworked the relevant methods in an attempt to speed things up.

METHOD 1: SUCCINCT BUT STUPID

Here's a Python function that will evaluate the sum over primes. The function takes two inputs: a function $h_E$ and an integer $M$, and returns a value equal to the sum of $h_E(p)$ for all primes less than $M$. We're assuming here that the primality testing function is_prime() is predefined.

As you can see, we can achieve the desired outcome in a whopping six lines of code. Nothing mysterious going on here: we simply iterate over all integers less than our bound and test each one for primality; if that integer is prime, then we evaluate the function h_E at that integer and add the result to y. The variable y is then returned at the end.

Why is this a bad way to evaluate the sum? Because there are far more composite integers than there are primes. According to the prime number theorem, the proportion of integers up to $M$ that are prime is approximately $\frac{1}{\log M}$. For my code I want to compute with bounds in the order of $M = e^{8\pi} \sim 10^{11}$; the proportion of integers that are prime up to this bound value is correspondingly about $\frac{1}{8\pi} \sim 0.04$. That is, 96% of the integers we iterate over aren't prime, and we end up throwing that cycle away.

Just how inefficient this method is of course depends on how quickly we can evaluate the primality testing function is_prime(). The best known deterministic primality testing algorithm has running time that scales with (at most) the 6th power of $\log n$, where $n$ is the number being tested. This places primality testing in a class of algorithms called Polynomial Time Complexity Algorithms, which means the runtime of the function scales relatively well with the size of the input. However, what kills us here is the sheer number of times we have to call is_prime() - on all integers up to our bound $M$ - so even if it ran in constant time the prime_sum() function's running time is going to scale with the magnitude of $M$.

METHOD 2: SKIP THOSE $n$ WE KNOW ARE COMPOSITE

We can speed things up considerably by noting that apart from 2, all prime numbers are odd. We are therefore wasting a huge amount of time running primality tests on integers that we know a priori are composite. Assuming is_prime() takes a similar time to execute than our coefficient function h_E(), we could therefore roughly halve the runtime of the prime sum function by skipping the even integers and just checking odd numbers for primality.

We can go further. Apart from 2 and 3, all primes yield a remainder of 1 or 5 when you divide them by 6 (because all primes except for 2 are 1 (modulo 2) and all primes except for 3 are 1 or 2 (modulo 3)). We can therefore skip all integers that are 0, 2, 3 or 4 modulo 6; this means we only have to check for primality on only one third of all the integers less than $M$.

Here's a second version of the prime_sum() function that does this:

Of course we could go even further with the technique by looking at remainders modulo $p$ for more primes $p$ and combining the results: for example, all primes outside of 2, 3 and 5 can only have a remainder of 7, 11, 13, 17, 19, 23 or 29 modulo 30. However, the further you go the more special cases you need to consider, and the uglier your code becomes; as you can see, just looking at cases modulo 6 requires us to write a function about three times as long as previously. This method therefore will only be able to take us so far before the code we'd need to write would become too unwieldy for practicality.

METHOD 3: PRIME SIEVING...

This second prime_sum() version is a rudimentary example of a technique called prime sieving. The idea is to use quick computations to eliminate a large percentage of integers from consideration in a way that doesn't involve direct primality testing, since this is computationally expensive. Sieving techniques are an entire field of research in their own right, so I thought I'd just give as example one of the most famous methods: the Sieve of Eratosthenes (named after the ancient Greek mathematician who is thought to first come up with the idea). This takes as input a positive bound $M$ and returns a list of all prime numbers less than $M$. The method goes as follows:

Start with a list of boolean flags indexed by the numbers 2 through $M$, and set all of them to True.
Starting at the beginning of the list, let $i$ be the index of the first True entry. Set all entries at indices a multiples of $i$ to False.
Repeat step 2 until the first True entry is at index $> \sqrt{M}$.
Return a list of all integers $i$ such that the entry at index $i$ is True.

This is definitely a case where a (moving) picture is worth a thousand words:

A good graphic representation of the Sieve of Eratosthenes being used to generate all primes less than 121. Courtesy Wikipedia: "Sieve of Eratosthenes animation". Licensed under CC BY-SA 3.0 via Wikimedia Commons.

How and why does this work? By mathematical induction, at each iteration the index of the first True entry will always be prime. However any multiple thereof is by definition composite, so we can walk along the list and flag them as not prime. Wash, rinse, repeat. We can stop at $\sqrt{M}$, since all composite numbers at most $M$ in magnitude must have at least one prime factor at most $\sqrt{M}$ in size.

Here is a third version of our prime_sum() function that utilizes the Sieve of Eratosthenes:

Let's see how the three versions stack up against each other time-wise in the Sage terminal. I've saved the three functions in a file called prime_sum_functions.py, which I then import up front (if you want to do the same yourself, you'll need to import or define appropriate is_prime() and sqrt() functions at the top of the file). I've also defined a sample toy function h_E() and bound M:

sage: from prime_sum_functions import *

sage: def h_E(n): return sin(float(n))/float(n)

sage: M = 10000

sage: prime_sum_v1(h_E,M)

0.19365326958140347

sage: prime_sum_v2(h_E,M)

0.19365326958140347

sage: prime_sum_v3(h_E,M)

0.19365326958140347

sage: %timeit prime_sum_v1(h_E,M)

1 loops, best of 3: 363 ms per loop

sage: %timeit prime_sum_v2(h_E,M)

1 loops, best of 3: 206 ms per loop

sage: %timeit prime_sum_v3(h_E,M)

10 loops, best of 3: 86.8 ms per loop

Good news! All three functions (thankfully) produce the same result. And we see version 2 is about 1.8 times faster than version 1, while version 3 is four times as fast. These ratios remained roughly the same when I timed the functions on larger bounds, which indicates that the three versions have the same or similar asymptotic scaling - this should be expected, since no matter what we do we will always have to check something at each integer up to the bound.

METHOD 4: ...AND BEYOND

It should be noted, however, that the Sieve of Eratosthenes as implemented above would be a terrible choice for my GSoC code. This is because in order to enumerate the primes up to $M$ we need to create a list in memory of size $M$. This isn't an issue when $M$ is small, but for my code I need $M \sim 10^{11}$; an array of booleans that size would take up about 12 gigabytes in memory, and any speedups we get from not having to check for primality would be completely obliterated by read/write slowdowns due to working with an array that size. In other words, while the Sieve of Eratosthenes has great time complexity, it has abysmal space complexity.

Thankfully, more memory-efficient sieving methods exist that drastically cut down the space requirements. The best of these - for example, the Sieve of Atkin - need about $\sqrt{M}$ space. For $M \sim 10^{11}$ this translates to only about 40 kilobytes; much more manageable.

Of course, there's always a downside: bleeding edge prime enumeration methods are finicky and intricate, and there are a plethora of ways to get it wrong when implementing them. At some point squeezing an extra epsilon of speedup from your code is no longer worth it in terms of the time and effort it will take to get there. For now, I've implemented a more optimized version of the second prime_sum() function in my code (where we skip over all integers that are obviously not prime), since for now that is my happy middle ground. If I have time at the end of the project I will revisit the issue of efficient prime enumeration and try implement a more optimized sieving method, but that is a tomorrow problem.

Cythonize!

2014-07-14T11:37:00.001-07:00

I'm at the stage where my code essentially works: it does everthing I initially set out to have it do, including computing the aforementioned zero sums for elliptic curve $L$-functions. However, the code is written in pure Python, and it is therefore not as fast as it could be.

This is often not a problem; Python is designed to be easy to read and maintain, and I'm hoping that the Python code I wrote is both of those. If we were just planning to run it on elliptic curves with small coefficients - for example, the curve represented by the equation $y^2=x^3-16x+16$ - this wouldn't be an issue. Curves with small coefficients have small conductors and thus few low-lying zeros near the central point, which allows us to run the zero sum code on them with small Delta parameter values. A small Delta value means the computation will finish very quickly regardless of how efficiently it's implemented, so it probably wouldn't be worth my while trying to optimize the code in that case.

To illustrate this point, here is the first most high-level, generic version of the method that computes the sum $\sum_{\gamma} \text{sinc}^2(\Delta \gamma)$ over the zeros of a given elliptic curve $L$-function (minus documentation):

[Of course, there's plenty going on in the background here. I have a separate method, self.cn() which computes the logarithmic derivative coefficients, and I call the SciPy function spence() to compute the part of the sum that comes from the Fourier transform of the digamma function $\frac{\Gamma^{\prime}}{\Gamma}$. Nevertheless, the code is simple and straightforward, and (hopefully) it's easy to follow the logic therein.]

However, the whole point of my GSoC project is to produce code that can be used for mathematical research; ultimately we want to push computations as far as we can and run the code on elliptic curves with large conductor, since curves with small conductor are already well-understood. Ergo, it's time I thought about going back over what I've written and seeing how I can speed it up.

There are two distinct ways to achieve speedups. The first is to rewrite the code more cleverly to eliminates unnecessary loops, coercions, function calls etc. Here is a second version I have written of the same function (still in Python):

The major change I've made between the two versions is improving how the sum involving the logarithmic derivative coefficients is computed - captured in the variable y. In the first version, I simply iterated over all integers $n$ up to the bound $t$, calling the method self.cn() each time. However, the logarithmic derivative coefficient $c_n$ is zero whenever $n$ is not a prime power, and knowing its value for $n=p$ a prime allows us to efficiently compute its value for $p^k$ any power of that prime. It therefore makes sense to do everything "in-house": eliminate the method call to self.cn(), iterate only over primes, and compute the logarithmic derivative coefficients for all powers of a given prime together.

Let's see how the methods match up in terms of speed. Below we run the two versions of the zero sum method on the elliptic curve $E: y^2=x^3-16x+16$, which is a rank 1 curve of conductor 37:

sage: import sage.lfunctions.zero_sums as zero_sums

sage: ZS = zero_sums.LFunctionZeroSum(EllipticCurve([-16,16]))

sage: ZS._zerosum_sincsquared(Delta=1)

1.01038406984

sage: ZS._zerosum_sincsquared_fast(Delta=1)

1.01038406984

sage: %timeit ZS._zerosum_sincsquared(Delta=1)

10 loops, best of 3: 20.5 ms per loop

sage: %timeit ZS._zerosum_sincsquared_fast(Delta=1)

100 loops, best of 3: 3.46 ms per loop

That's about a sixfold speedup we've achieved, just by reworking the section of the code that computes the $c_n$ sum.

The downside, of course, is that the code in method version 2 is more complicated - and thus less readable - than that in version 1. This is often the case in software development: you can write code that is elegant and easy to read but slow, or you can write code that is fast but horribly complicated and difficult to maintain. And when it comes to mathematical programming, unfortunately, sometimes the necessity for speed trumps readability.

The second major way to achieve speedups is to abandon pure Python and switch to a more low-level language. I could theoretically take my code and rewrite it in C, for example; if done relatively intelligently I'm sure the resulting code would blow what I have here out the water in terms of speed. However, I have no experience writing C code, and even if I did getting the code to interface with the rest of the Sage codebase would be a major headache.

Thankfully there is a happy middle ground: Cython. Cython is a programming language - technically a superset of Python - that allows direct interfacing with C and C++ data types and structures. Code written in Cython can be as fast as C code and nearly as readable as pure Python. Crucially, because it's so similar to Python it doesn't require rewriting all my code from scratch. And Sage already knows how to deal with Cython, so there aren't any compatibility issues there.

I am therefore in the process of doing exactly that: rewriting my code in Cython. Mostly this is just a cut-and-paste job and is pleasingly hassle-free; however, in order to achieve the desired speedups, the bottleneck methods - such as our $\text{sinc}^2$ zero sum method above - must be modified to make use of C data types.

Here is the third, most recent version of the _zerosum_sincsquared() method for our zero sum class, this time written in Cython:

Definitely longer and uglier. I now must declare my (C) variables up front; previously Python just created them on the fly, which is nice but slower than allocating memory space for the variables a priori. I've eliminated the use of complex arithmetic, so that everything can be done using C integer and float types. I still iterate over all primes up to the bound $t$; however now I deal with those primes that divide the conductor of $E$ (for which the associated summand is calculated slightly differently) beforehand, so that in the main loop I don't have to check at each point if my prime $p$ divides the conductor or not [This last check is expensive: the conductor $N$ can be very large - too large to cast as a $C$ long long even - so we would have to use slower Python or Sage data types to represent it. Better to get rid of the check altogether].

Let's see how this version holds up in a speed test. The Cython code has already been built into Sage and the class loaded into the global namespace, so I can just call it without having to attach or load any file:

sage: ZS = LFunctionZeroSum(EllipticCurve([-16,16]))

sage: ZS._zerosum_sincsquared_fast(Delta=1)

1.01038406984

sage: %timeit ZS._zerosum_sincsquared_fast(Delta=1)

1000 loops, best of 3: 1.72 ms per loop

The good news: the Cythonized version of the method produces the same output as the Python versions, and it's definitely faster. The bad news: the speedup is only about a factor of 2, which isn't hugely impressive given how much uglier the code is.

Why is this? Crucially, we still iterate over all integers up to the bound $t$, testing for primality at each step. This is very inefficient: most integers are not prime (in fact, asymptotically 0 percent of all positive integers are prime); we should be using sieving methods to eliminate primality testing at those integers which we know before checking are composite. For example, we should at the very least only ever iterate over the odd numbers beyond 3. That immediately halves the number of primality tests we have to do, and we should therefore get a comparable speedup if primality testing is what is dominating the runtime in this method.

This is therefore what I hope to implement next: rework the zero sum code yet again to incorporate prime sieving. This has applicability beyond just the $\text{sinc}^2$ method: all explicit formula-type sums at some point invoke summing over primes or prime powers, so having access to code that can do this quickly would be a huge boon.

Hat Operators!

2014-07-09T08:31:00.004-07:00

Clearly I am working in the right field: a lot of what I'm currently doing involves taking Fourier transforms - typically denoted mathematically by a hat operator!
$$ f \longmapsto \hat{f} $$

The Birch and Swinnerton-Dyer Conjecture and computing analytic rank

2014-07-08T14:56:00.000-07:00

Let $E$ be an elliptic curve with $L$-function $L_E(s)$. Recall that Google Summer of Code project is to implement in Sage a method that allows us to compute $\sum_{\gamma} f(\Delta \gamma)$, where $\gamma$ ranges over the imaginary parts of the nontrivial zeros of $L_E$, $\Delta$ is a given positive parameter, and $f(x)$ is a specified symmetric continuous integrable function that is 1 at the origin. The value of this sum then bounds the analytic rank of $E$ - the number of zeros at the central point - from above, since we are summing $1$ with multipliticy $r_{an}$ in the sum, along with some other nonzero positive terms (that are hopefully small). See this post for more info on the method.

One immediate glaring issue here is that zeros that are close to the critical point will contribute values that are close to 1 in the sum, so the curve will then appear to have larger analytic rank than it actually does. An obvious question, then, is to ask: how close can the noncentral zeros get to the central point? Is there some way to show that they cannot be too close? If so, then we could figure out just how large of a $\Delta$ we would need to use in order to get a rank bound that is actually tight.

The rank zero curve $y^2 = x^3 - x^3 - 7460362000712 x - 7842981500851012704$ has an extremely low-lying zero at $\gamma = 0.0256$ (and thus another one at $-0.0256$; as a result the zero sum looks like it's converging towards a value of 2 instead of the actual analytic rank of zero. In order to actually get a sum value out that's less than one we would have to use a $\Delta$ value of about 20; this is far beyond what's feasible due to the exponential dependence of the zero sum method on $\Delta$.

The good news is that there is hope in this regard; the nature of low-lying zeros for elliptic curve $L$-functions is actually the topic of my PhD dissertation (which I'm still working on, so I can't provide a link thereto just yet!). In order to understand how close the lowest zero can get to the central point we will need to talk a bit about the BSD Conjecture.

The Birch and Swinnerton-Dyer Conjecture is one of the two Clay Math Millenium Problems related to $L$-functions. The conjecture is comprised of two parts; the first part I mentioned briefly in this previous post. However, we can use the second part to gain insight into how good our zero sum based estimate for analytic rank will be.

Even though I've stated the first part of the BSD Conjecture before, for completeness I'll reproduce the full conjecture here. Let $E$ be an elliptic curve defined over the rational numbers, e.g. a curve represented by the equation $y^2 = x^3 + Ax + B$ for some integers $A$ and $B$ such that $4A^3+27B^2 \ne 0$. Let $E(\mathbb{Q})$ be the group of rational points on the elliptic curve, and let $r_{al}$ be the algebraic rank of $E(\mathbb{Q})$. Let $L_E(s)$ be the $L$-function attached to $E$, and let $L_E(1+s) = s^{r_{an}}\left[a_0 + a_1 s + \ldots\right]$ be the Taylor expansion of $L_E(s)$ about $s=1$ such that the leading coefficient $a_0$ is nonzero; $r_{an}$ is called the analytic rank of $E$ (see here for more details on all of the above objects). The first part of the BSD conjecture asserts that $r_{al}=r_{an}$; that is, the order of vanishing of the $L$-function about the central point is exactly equal to the number of free generators in the group of rational points on $E$.

The second part of the conjecture asserts that we actually know the exact value of that leading coefficient $a_0$ in terms of other invariants of $E$. Specifically:

$$ a_0 = \frac{\Omega_E\cdot\text{Reg}_E\cdot\prod_p c_p\cdot\#\text{Ш}(E/\mathbb{Q})}{(\#E_{\text{Tor}}(\mathbb{Q}))^2}. $$

Fear not if you have no idea what any of these quantities are. They are all things that we know how to compute - or at least estimate in size. I provide below brief descriptions of each of these quantities; however, feel free to skip this part. It suffices to know that we have a formula for computing the exact value of that leading coefficient $a_0$.

$\#E_{\text{Tor}}(\mathbb{Q})$ is the number of rational torsion points on $E$. Remember that the solutions $(x,y)$ to the elliptic curve equation $y^2 = x^3 + Ax+B$, where $x$ and $y$ are both rational numbers, form a group. Recall also that that the group of rational points $E(\mathbb{Q})$ may be finite or infinite, depending on whether the group has algebraic rank zero, or greater than zero. However, it turns out that there are only ever finitely many torsion points - those which can be added to themselves some finite number of times to get the group identity element. These points of finite order form a subgroup, denoted $E_{\text{Tor}}(\mathbb{Q})$, and the quantity in question is just the size of this finite group (squared in the formula). In fact, it's been proven that the size of $E_{\text{Tor}}(\mathbb{Q})$ is at most 16.
$\Omega_E$ is the real period of $E$. This is perhaps a bit more tricky to define, but it essentially is a number that measures the size of the set of real points of $E$. If you plot the graph of the equation representing $E: y^2 = x^3 + Ax + B$ on the cartesian plane, you get something that looks like one of the following:

The plots of the real solutions to four elliptic curves, and their associated real periods.

There is a way to assign an intrinsic "size" to these graphs, which we denote the real period $\Omega_E$. The technical definition is that $\Omega_E$ is equal to the integral of the absolute value of the differential $\omega = \frac{dx}{2y}$ along the part of the real graph of $E$ that is connected to infinity (that or twice that depending on whether the cubic equation $x^3 + Ax + B$ has one or three real roots respectively).
$\text{Reg}_E$ is the regulator of $E$. This is a number that measures the "density" of rational points on $E$. Recall that $E(\mathbb{Q}) \approx T \times \mathbb{Z}^{r_{an}}$, i.e there free part of $E(\mathbb{Q})$ is isomorphic to $r_{an}$ copies of the integers. There is a canonical way to embed the free part of $E(\mathbb{Q})$ in $\mathbb{R}^{r_{an}}$ as a lattice; the regulator $\text{Reg}_E$ is the volume of the fundamental domain of this lattice. The thing to take away here is that elliptic curves with small regulators have lots of rational points whose coefficients have small numerators and denominators, while curves with large regulators have few such points.
$\prod_p c_p$ is the Tamagawa product for $E$. For each prime $p$, one can consider the points on $E$ over the $p$-adic numbers $\mathbb{Q}_p$. The Tamagawa number $c_p$ is the ratio of the size of the full group of $p$-adic points on $E$ to the subgroup of $p$-adic points that are connected to the identity. This is always a positive integer, and crucially, in all but a finite number of cases the ratio is 1. Thus we can consider the product of the $c_p$ as we range over all prime numbers, and this is precisely the definition of the Tamagawa product.
$\#\text{Ш}(E/\mathbb{Q})$ is the order of the Tate-Shafarevich group of $E$ over the rational numbers. The Tate-Shafarevich group of $E$ is probably the most mysterious part of the BSD formula; it is defined as the subgroup of the Weil–Châtelet group $H^1(G_{\mathbb{Q}},E)$ that becomes trivial when passing to any completion of $\mathbb{Q}$. If you're like me then this definition will be completely opaque; however, we can think of $\text{Ш}(E/\mathbb{Q})$ as measuring how much $E$ violates the local-global principle: that one should be able to find rational solutions to an algebraic equation by finding solutions modulo a prime number $p$ for each $p$, and then piecing this information together with the Chinese Remainder Theorem to get a rational solution. Curves with nontrivial $\text{Ш}$ have homogeneous spaces that have solutions modulo $p$ for all $p$, but no rational points. The main thing here is that $\text{Ш}$ is conjectured to be finite, in which case $\#\text{Ш}(E/\mathbb{Q})$ is just a positive integer (in fact, it can be shown for elliptic curves that if $\text{Ш}$ is indeed finite, then its size is a perfect square).

Why is the BSD Conjecture relevant to rank estimation? Because it helps us overcome the crucial obstacle to computing analytic rank exactly: without extra knowledge, it's impossible to decide using numerical techniques whether the $n$th derivative of the $L$-function at the central point is exactly zero, or just so small that it looks like it is zero to the precision that we're using. If we can use the BSD formula to show a priori that $a_0$ must be at least $M_E$ in magnitude, where $M_E$ is some number that depends only on some easily computable data attached to the elliptic curve $E$, then all we need to do is evaluate successive derivatives of $L_E$ at the central point to enough precision to decide if that derivative is less than $M_E$ or not; this is readily doable on a finite precision machine. Keep going until we hit a derivative which is then definitely greater than $M_E$ in magnitude, at which we can halt and declare that the analytic rank is precisely the order of that derivative.

In the context of the explicit formula-based zero sum rank estimation method implemented in our GSoC project, the size of the leading coefficient also controls how far close the lowest noncentral zero can be from the central point. Specifically, we have the folling result: Let $\gamma_0$ be the lowest-lying noncentral zero for $L_E$ (i.e. the zero closest to the central point that is not actually at the central point); let $\Lambda_E(s) = N^{s/2}(2\pi)^s\Gamma(s)L_E(s)$ is the completed $L$-function for $E$, and let $\Lambda_E(1+s) = s^{r_{an}}\left[b_0 + b_1 s + b_2 s^2 \ldots\right]$ be the Taylor expansion of $\Lambda_E$ about the central point. Then:

$$ \gamma_0 > \sqrt{\frac{b_0}{b_2}}. $$

Thankfully, $b_0$ is easily relatable back to the constant $a_0$, and we have known computable upper bounds on the magnitude of $b_2$, so knowing how small $a_0$ is tells us how close the lowest zero can be to the central point. Thus bounding $a_0$ from below in terms of some function of $E$ tells us how large of a $\Delta$ value we need to use in our zero sum, and thus how long the computation will take as a function of $E$.

If this perhaps sounds a bit handwavy at this point it's because this is all still a work in progress, to be published as part of my dissertation. Nevertheless, the bottom line is that bounding the leading $L$-series coefficient from below gives us a way to directly analyze the computational complexity of analytic rank methods.

I hope to go into more detail in a future post about what we can do to bound from below the leading coefficient of $L_E$ at the central point, and why this is a far from straightforward endeavour.

Bristol Summer School Recap

2014-07-08T11:13:00.000-07:00

For the past week I've been in Bristol, England, attending a summer school on computing with L-functions and automorphic forms. The official name of the summer school was the Building Bridges Summer School on Automorphic Forms and Related Topics.

The school comprised of three intense 2-day long mini-courses, each hosted by a pair of mathematicians in the field:

Explicit Methods for Modular Forms and $L$-functions by Tim Dokchitser and John Cremona, on various methods for computing with modular forms, elliptic curves and $L$-functions;
The Legacy of Ramanujan by Bill Duke and Paul Jenkins, on singular moduli, Borcherds products and forms of half-integral weight;
The Langlands Program by Solomon Friedberg and Jim Codell, on the web of conjectures relating automorphic forms, automorphic representations and $L$-functions.

The reason I attended the summer school is that the subjects all tie directly to the work that I am currently doing - both for my Google Summer of Code project and work towards my dissertation. The central objects of all three mini-courses are motivic $L$-functions - of which elliptic curve $L$-functions are an example. It would take me quite a long time to go through all the math we covered at the summer school - over the course of last week I took 114 pages of lecture notes! This is unfeasible - if you're interested please follow the links above, at which you should find typed up notes of all of the lectures.

Instead, my next post will focus on one of the aspects that ties in with both my GSoC project and my dissertation work: The full Birch and Swinnerton-Dyer conjecture, and how we can use it to obtain results about the complexity of computing analytic rank.

Bristol

2014-07-01T00:55:00.002-07:00

This week finds me in Bristol, in southwest England. I'm here for a weeklong summer school officially titled the Building Bridges: 2nd EU/US Summer School on Automorphic Forms and Related Topics (catchy title!). The summer school comprises of three mini-courses run by pairs of mathematicians in the area of mathematics that I'm in; it's also being co-organized by of of the mentors of my Google Summer of Code project, who is resident here in England. Thus attending will be a great way to kill multiple birds with one stone: learn some new and cool math, see a new place, and get to chat to people relating to my GSoC project as well.

As such, posts on analytic rank estimation of elliptic curves and $L$-functions might be a bit thin on the ground this week. I'll try do a little bit when I can, but they're keeping us fairly busy here.

Bristol, from what I've seen of it so far, is quite beautiful. The sun has been shining this week (which always helps), but the part of town that I'm staying in - the recently revitalized Docklands area - is clean, picturesque, and buzzing with activity. The city has pushed for sustainability and pedestrian-friendliness in recent years, and it shows.

All in all, can't complain. I think it will be a wonderful week.

Fun with elliptic curve L-function explicit formulae

2014-06-26T15:23:00.000-07:00

Although I gave a brief exposition of the "baseline" explicit formula for the elliptic curve $L$-functions in a previous post, I wanted to spend some more time showing how one can use this formula to prove some useful statements about the elliptic curves and their $L$-functions. Some of these are already implemented in some way in the code I am building for my GSoC project, and I hope to include more as time goes on.

First up, let's fix some notation. For the duration of this post we fix an elliptic curve $E$ over the rationals with conductor $N$. Let $L_E(s)$ be its associated $L$-function, and let $\Lambda_E(s)$ be the completed $L$-function thereof, i.e.
$\Lambda_E(s) = N^{s/2}(2\pi)^s\Gamma(s)L_E(s)$, where $\Gamma$ is the usual Gamma function on $\mathbb{C}$. If you want to know more about how $E$, $L_E$ and $\Lambda_E$ are defined and how we compute with them, some of my previous posts (here and here) go into their exposition in more detail.

Let's recap briefly how we derived the baseline explicit formula for the $L$-function of $E$ (see this post for more background thereon). Taking logarithmic derivatives of the above formula for $\Lambda_E(s)$ and shifting to the left by one gives us the following equality:
$$\frac{\Lambda_E^{\prime}}{\Lambda_E}(1+s) = \log\left(\frac{\sqrt{N}}{2\pi}\right) + \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \frac{L_E^{\prime}}{L_E}(1+s).$$
Nothing magic here yet. However, $\frac{\Lambda_E^{\prime}}{\Lambda_E}$, $\frac{\Gamma^{\prime}}{\Gamma}$ and $\frac{L_E^{\prime}}{L_E}$ all have particularly nice series expansions about the central point. We have:

$\frac{\Lambda_E^{\prime}}{\Lambda_E}(1+s) = \sum_{\gamma} \frac{s}{s^2+\gamma^2}$, where $\gamma$ ranges over the imaginary parts of the zeros of $L_E$ on the critical line; this converges for any $s$ not equal to a zero of $L_E$.
$\frac{\Gamma^{\prime}}{\Gamma}(1+s) = -\eta + \sum_{k=1}^{\infty} \frac{s}{k(k+s)}$, where $\eta$ is the Euler-Mascheroni constant $=0.5772156649\ldots$ (this constant is usually denoted by the symbol $\gamma$ - but we'll be using that symbol for something else soon enough!); this sum converges for all $s$ not equal to a negative integer.
$\frac{L_E^{\prime}}{L_E}(1+s) = \sum_{n=1}^{\infty} c_n n^{-s}$; this only converges absolutely in the right half plane $\Re(s)>\frac{1}{2}$.

Here
$$ c_n = \begin{cases}
\left[p^m+1-\#E(\mathbb{F}_{p^m})\right]\frac{\log p}{p^m}, & n = p^m \mbox{ is a perfect prime power,} \\
0 & \mbox{otherwise.}\end{cases} $$

Assembling these equalities gives us the aforementioned explicit formula:
$$ \sum_{\gamma} \frac{s}{s^2+\gamma^2} = \left[-\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)\right] + \sum_{k=1}^{\infty} \frac{s}{k(k+s)} + \sum_n c_n n^{-s}$$
which holds for any $s$ where all three series converge. It is this formula which we will use repeatedly to plumb the nature of $E$.

For ease of exposition we'll denote the term in the square brackets $C_0$. It pitches up a lot in the math below, and it's a pain to keep writing out!

Some things to note:

$C_0 = -\eta+\log\left(\frac{\sqrt{N}}{2\pi}\right)$ is easily computable (assuming you know $N$). Importantly, this constant depends only on the conductor of $E$; it contains no other arithmetic information about the curve, nor does it depend in any way on the complex input $s$.
The sum $\sum_{k=1}^{\infty} \frac{s}{k(k+s)}$ doesn't depend on the curve at all. As such, when it comes to computing values associated to this sum we can just hard-code the computation before we even get to working with the curve itself.
The coefficients $c_n$ can computed by counting the number of points on $E$ over finite fields up to some bound. This is quick to do for any particular prime power.

Good news: the code I've written can compute all the above values quickly and efficiently:

sage: E = EllipticCurve([-12,29])

sage: Z = LFunctionZeroSum(E)

sage: N = E.conductor()

sage: print(Z.C0(),RDF(-euler_gamma+log(sqrt(N)/(2*pi))))

(2.0131665172, 2.0131665172)

sage: print(Z.digamma(3.5),RDF(psi(3.5)))

(1.10315664065, 1.10315664065)

sage: Z.digamma(3.5,include_constant_term=False)

1.68037230555

sage: Z.cn(389)

-0.183966457901

sage: Z.cn(next_prime(1e9))

0.000368045198812

sage: timeit('Z.cn(next_prime(1e9))')

625 loops, best of 3: 238 µs per loop

So by computing values on the right we can compute the sum on the left - without having to know the exact locations of the zeros $\gamma$, which in general is hard to compute.

Now that we have this formula in the bag, let's put it to work.

NAÏVE RANK ESTIMATION

If we multiply the sum over the zeros by $s$ and letting $\Delta = 1/s$, we get
$$\sum_{\gamma} \frac{\Delta^{-2}}{\Delta^{-2}+\gamma^2} = \sum_{\gamma} \frac{1}{1+(\Delta\gamma)^2},$$
Note that for large values of $\Delta$, $\frac{1}{1+(\Delta\gamma)^2}$ is small but strictly positive for all nonzero $\gamma$, and 1 for the central zeros, which have $\gamma=0$. Thus the value of the sum when $\Delta$ is large gives a close upper bound on the analytic rank $r$ of $E$. That is, we can bound the rank of $E$ from above by choosing a suitably large value of $\Delta$ and computing the quantity on the right in the inequality below:
$$r < \sum_{\gamma} \frac{1}{1+(\Delta\gamma)^2} = \frac{1}{\Delta}\left[C_0 + \sum_{k=1}^{\infty} \frac{1}{k(1+\Delta k)} + \sum_n c_n n^{-1/\Delta}\right]. $$
Great! Right? Wrong. In practice this approach is not a very good one. The reason is the infinite sum over $n$ only converges absolutely for $\Delta<2$, and for Delta values as small as this, the obtained bound won't be very good. A value of $\Delta=2$, for example, gives us the zero sum $\sum_{\gamma} \frac{1}{1+(2\gamma)^2}$. If a curve's $L$-function has a zero with imaginary part about 0.5, for example - as many $L$-functions do - then such a zero will contribute 0.5 to the sum. And because zeros come in symmetric pairs, the sum value will then be at least a whole unit larger than the actual analytic rank. In general, for $\Delta<2$ the computed sum can be quite a bit bigger than the actual analytic rank of $E$.

Moreover, even though the Generalized Riemann Hypothesis predicts that the sum $\sum_n c_n n^{-1/\Delta}$ does indeed converge for any positive value of $\Delta$, in practice the convergence is so slow that we end up needing to compute inordinate amounts of the $c_n$ in order to hope to get a good approximation of the sum. So no dice; this approach is just too inefficient to be practical.

A graphic depiction of the convergence problems we run into when trying to evaluate the sum over the $c_n$. For the elliptic curve $E: y^2 + y = x^3 - 79*x + 342$ the above plot shows the cumulative sum $\sum_{n<T} c_n n^{-1/4}$ for $T$ up to 100000; this corresponds to $\Delta=4$. Note that even when we include this many terms in the sum, its value still varies considerably. It's unlikely the final computed value is correct to a single decimal digit yet.

A BETTER APPROACH (THE ONE IMPLEMENTED IN MY CODE)

We can get around this impasse by evaluating a modified sum over the zeros, one which requires us to only ever need to compute finitely many of the $c_n$. Here's how we do it. Start with the sum $\sum_{\gamma} \frac{s}{s^2+\gamma^2}$, and divide by $s^2$ to get $\frac{1}{s(s^2+\gamma^2)}$. Now hearken back to your college sophomore math classes and take the inverse Laplace transform of this sum. We get
$$ \mathcal{L}^{-1} \left[\sum_{\gamma} \frac{1}{s(s^2+\gamma^2)}\right](t) = \sum_{\gamma} \mathcal{L}^{-1} \left[\frac{1}{s(s^2+\gamma^2)}\right](t) = \frac{t^2}{2} \sum_{\gamma} \left(\frac{\sin(\frac{t}{2}\gamma)}{\frac{t}{2}\gamma}\right)^2 $$
Letting $\Delta = \frac{t}{2\pi}$ we get the sum
$$ \mathcal{L}^{-1} \left[\sum_{\gamma} \frac{s}{s^2+\gamma^2}\right](\Delta) = 2\pi^2\Delta^2 \sum_{\gamma} \text{sinc}^2(\Delta\gamma), $$
where $\text{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$, and $\text{sinc}(0) = 1$.

Note that as before, $\text{sinc}^2(\Delta\gamma)$ exaluates to 1 for the central zeros, and is small but positive for all nonzero $\gamma$ when $\Delta$ is large. So again, this sum will give an upper bound on the analytic rank of $E$, and that this bound converges to the true analytic rank as $\Delta \to \infty$.

If we do the same - divide by $s^2$ and take inverse Laplace transforms - to the quantities on the right, we get the following:

$\mathcal{L}^{-1} \left[\frac{C_0}{s^2} \right] = 2\pi\Delta C_0;$
$ \mathcal{L}^{-1} \left[\sum_{k=1}^{\infty} \frac{1}{sk(k+s)}\right] = \sum_{k=1}^{\infty} \frac{1}{k^2}\left(1-e^{-2\pi\Delta k}\right);$
$\mathcal{L}^{-1} \left[ \sum_{n=1}^{\infty} c_n \frac{n^{-s}}{s^2} \right] = \sum_{\log n<2\pi\Delta} c_n\cdot(2\pi\Delta - \log n). $

Note that the last sum is finite: for any given value of $\Delta$, we only need to compute the $c_n$ for $n$ up to $e^{2\pi\Delta}$. This is the great advantage here: we can compute the sum exactly without introducing any truncation error. The other two quantities are also readly computable to any precision.

Combining the above values and dividing by $2\pi^2\Delta^2$, we arrive at the rank bounding formula which is implemented in my GSoC code, hideous as it is:
\begin{align*}
r < \sum_{\gamma} \text{sinc}^2(\Delta\gamma) = \frac{1}{\pi\Delta} &\left[ C_0 + \frac{1}{2\pi\Delta}\sum_{k=1}^{\infty} \frac{1}{k^2}\left(1-e^{-2\pi\Delta k}\right) \right. \\
&\left. + \sum_{n<e^{2\pi\Delta}} c_n\left(1 - \frac{\log n}{2\pi\Delta}\right) \right] \end{align*}
Of course, the number of terms needed on the right hand side is still exponential in $\Delta$, so this limits the tightness of the sum we can compute on the left hand side. In practice a personal computer can compute the sum with $\Delta=2$ in a few seconds, and a more powerful cluster can handle $\Delta=3$ in a similar time. Beyond Delta values of about 4, however, the number of $c_n$ is just too great to make the sum effectively computable.

THE DENSITY OF ZEROS ON THE CRITICAL LINE

Explicit formula-type sums can also be used to answer questions about the distribution of zeros of $L_E$ along the critical line. Let $T$ be a positive real number, and $M_E(T)$ be the counting function that gives the number of zeros in the critical strip with imaginary part at most $T$ in magnitude. By convention, when $T$ coincides with a zero of $L_E$, then we count that zero with weight. In other words, we can write
$$ M_E(T) = \sum_{|\gamma|<T} 1 + \sum_{|\gamma|=T} 1/2.$$
There is a more mathematically elegant way to write this sum. Let $\theta(x)$ be the Heaviside function on $x$ given by
$$ \theta(x) = \begin{cases} 0, & x<0 \\
\frac{1}{2}, & x=0 \\
1, & x>0. \end{cases}$$
Then we have that
$$M_E(T) = \sum_{\gamma} \theta(T^2-\gamma^2). $$
We have written $M_E(T)$ as a sum over the zeros of $L_E$, so we expect that it comprises the left-hand-side of some explicit formula-type sum. This is indeed this the case. Using Fourier transforms we can show that
$$M_E(T) = \frac{2}{\pi}\left[C_0 T + \sum_{k=1}^{\infty} \left(\frac{T}{k} - \arctan \frac{T}{k}\right) + \sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n) \right] $$

With this formula in hand we can start asking questions on how the zeros of $L_E$ are distributed. For example, how does average zero density on the critical line scale as a function of $T$, the imaginary part of the area in the critical strip we're considering? How does zero density scale with the curve's conductor $N$?

If one assumes the Generalized Riemann Hypothesis, we can show that $\sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n) = O(\log T)$. Moreover, this sum is in a sense equidistributed about zero, so we expect its contribution for a sufficiently large randomly chosen value of $T$ to be zero. The other two quantities are more straightforward. The $C_0 T$ term is clearly linear in $T$. Going back to the definition of $C_0$, we see that $C_0 T = \frac{1}{2}T\log N + $constant$\cdot T$. Finally, we can show that the sum over $k$ equals $T\log T + O(\log T)$. Combining this gives us the estimate
$$ M_E(T) = \frac{1}{\pi} T (\log N + 2\log T+a) + O(\log T) $$
for some explicitly computable constant $a$ independent of $E$, $N$ or $T$ . That is, the number of zeros with imaginary part at most $T$ in magnitude is "very close" to $\frac{1}{\pi}T(\log N + 2\log T)$. Another way to put it is than the number of zeros in the interval $[T,T+1]$ is about $\frac{1}{2}\log N + \log T$.

The value of $M_E(T)$ for $T$ up to 30, for the elliptic curve $E: y^2 + y = x^3 - x$. The black line is just the 'smooth part' of $M_E(T)$, given by $\frac{2}{\pi}\left(C_0 T + \sum_{k=1}^{\infty} \left(\frac{T}{k} - \arctan \frac{T}{k}\right)\right)$. This gives us the 'expected number of zeros up to $T$', before we know any data about $E$ other than its conductor. The blue line is what we get when we add in $\sum_{n=1}^{\infty} \frac{c_n}{\log n}\cdot \sin(T\log n)$, which tells us the exact locations of the zeros: they will be found at the places where the blue curve jumps by 2.

Note that the sum over $n$ converges *very* slowly. To produce the above plot, I included terms up to $n=1 000 000$, and there is still considerable rounding visible in the blue curve. If I could some how evaluate the $c_n$ sum in all its infinite glory, then resulting plot would be perfectly sharp, comprised of flat lines that jump vertically by 2 at the loci of the zeros.

These are two of the uses of explicit formula-type sums in the context of elliptic curve $L$-functions. If you want to read more about them, feel free to pick up my PhD thesis - when it eventually gets published!

The structure of my GSoC code and a short demo

2014-06-19T11:50:00.000-07:00

I'm at the point where I can explain the layout of the code that I've written so far, and even demonstrate some of its functionality.

As mentioned in previous posts, the initial goal of my Google Summer of Code project is to implement in Sage functionality that will allow us to bound the analytic rank of an elliptic curve $E$ via certain explicit formula-type sums relating to the $L$-function of $E$. One could certainly achieve this by simply writing a method on the existing elliptic curve class in Sage. However, the machinery we'll be using to bound rank is useful outside of the context of elliptic curves: it allows us to compute sums over zeros of any $L$-function that comes from a modular form. In fact, so long as

The $L$-function has an Euler product expansion over the primes;
The Euler product coefficients - akin to the $a_p$ values we've mentioned previously - are readily computible; and
The $L$-function obeys a functional equation which allows us to define and compute with the function inside of the critical strip;

then we can use this explicit-formula methodology to compute sums over the zeros of the $L$-function.

All the code is hosted on GitHub, so you can view it freely here (note: this repository is a branch of the main Sage repo, so it contains an entire copy of the Sage source. I'm only adding a tiny fraction of code in a few files to this codebase; I'll link to these files directly below).

To this effect, it makes sense to write a new class in Sage that can ultimately be made to compute zero sums for any motivic $L$-function. This is what I'll be doing. I've created a file zero_sums.py in the sage/lfunctions/ directory, inside of which I'll write an family of classes that take as input a motivic object, and contain methods to compute various sums and values related to the $L$-function of that object.

I'll start by writing two classes: LFunctionZeroSum_abstract, which will contain all the general methods that can apply to a zero sum estimator attached to any motivic $L$-function. The second class, LFunctionZeroSum_EllipticCurve, inherits from the abstract class, and contains the code specific to elliptic curve $L$-functions.

I have also added a method to the EllipticCurve_rational_field class in the sage/schemes/elliptic_curves/ell_rational_field.py file (the class modeling elliptic curves over the rationals) to access the analytic rank estimation functionality of the zero sum estimator.

Let's see the code in action. To download a copy yourself, say from within a project in cloud.sagemath.com, open up a terminal and type

~$ git clone git://github.com/haikona/GSoC_2014.git

This will download the code in the repo into a new folder GSoC_2014 in your current directory. CD into that directory, type make and hit enter to build. This will unfortunately take a couple hours to complete, but the good new is you'll only need to do that once. Alternatively, if you have an existing freshly-built copy of the sage source and good github-fu, you should be able to download just the relevant code and apply it, greatly speeding up the build time.

Once your copy of sage containing the new LFunctionZeroSum code is built, fire up that copy of sage (e.g. type ./sage while in the GSoC_2014 directory; don't just type sage, as this runs the system-wide copy of sage - not the one we want). This command-line interface will have all the functionality of any other current copy of Sage, but with the extra methods and classes I've written.

Let's start by estimating the rank of some elliptic curves:

sage: E = EllipticCurve([0,1,1,-2,0]); E

Elliptic Curve defined by y^2 + y = x^3 + x^2 - 2*x

over Rational Field

sage: E.rank(), E.conductor()

(2, 389)

sage: E.analytic_rank_bound()

2.0375000846

In the code above, I've defined an elliptic curve given by the Weierstrass equation $y^2 +y = x^3+x^2-2x$; for those of you in the know, this is the curve with Cremona label '389a', the rank 2 curve with smallest conductor. I've then printed out the rank of the curve, which is indeed 2. Then I've called my analytic_rank_bound() method on $E$, and it returns a value which is strictly greater than the analytic rank of $E$. Since the analytic rank of an elliptic curve is always a non-negative integer, we can conclude that the analytic rank of $E$ is at most 2.

How long did this rank estimation computation take? Thankfully Sage has a timing function to answer this question:

sage: timeit('E.analytic_rank_bound()')

125 loops, best of 3: 3.33 ms per loop

Neat; that was quick. However, this elliptic curve has a small conductor - 389, hence it's Cremona label - so computing with its $L$-function is easy. The great thing with the zero sum code is that you can run it on curves with really large conductors. Here we reproduce the commands above on the curve $y^2 + y = x^3 - 23737x + 960366$, which has rank 8 and conductor $N \sim 4.6\times 10^{14}$:

sage: E = EllipticCurve([0,0,1,-23737,960366]); E

Elliptic Curve defined by y^2 + y = x^3 - 23737*x + 960366 

over Rational Field

sage: E.rank(), E.conductor()

(8, 457532830151317)

sage: E.analytic_rank_bound()

8.51446122452

sage: timeit('E.analytic_rank_bound()')

125 loops, best of 3: 3.37 ms per loop

So we see then that this zero sum code doesn't care very much about the size of the conductor of $E$; this is one of its huge plusses. There are of downsides, of course - but we'll save those for another post.

Now let's look at the LFunctionZeroSum code.

sage: E = EllipticCurve([23,-100])

sage: Z = LFunctionZeroSum(E); Z

Zero sum estimator for L-function attached to 

Elliptic Curve defined by y^2 = x^3 + 23*x - 100 

over Rational Field

Right now we don't have too much beyond that which you can access from the EllipticCurve class. For one, we can compute the coefficients of the logarithmic derivative, since they are needed to compute the rank-estimating zero sum. We can also return and/or compute one or two other values associated to the zero sum class. As time goes on we'll flesh out more functionality for this class.

sage: for n in range(1,11):

....:     print(n,Z.logarithmic_derivative_coefficient(n))

....: 

(1, 0)

(2, 0.0)

(3, -1.09861228867)

(4, 0.0)

(5, 1.28755032995)

(6, 0)

(7, -0.277987164151)

(8, 0.0)

(9, -0.366204096223)

(10, 0)

sage: Z.elliptic_curve()

Elliptic Curve defined by y^2 = x^3 + 23*x - 100

over Rational Field

sage: Z.elliptic_curve().rank()

1

sage: Z.rankbound(Delta=1.7,function="gaussian")

1.58647787024

All of this code is of course written to Sage spec. That means that every method written has documentation, including expected input, output, examples, and notes/warnings/references where necessary. You can access the docstring for a method from the command line by typing that method followed by a question mark and hitting enter.

All my code will initially be written in Python. Our first goal is to replicate the functionality that has already been written by Jonathan Bober in c. As such we're not aiming for speed here; instead the focus is to make sure all the code is mathematically correct.

Once that's up and running there are two natural directions to take the project:

Speed the code up. Rewrite key parts in Cython, so that computations run faster, and thus analytic rank can be estimated for curves of larger conductor.
Generalize the code. Write classes that model zero sum estimators for $L$-functions attached to eigenforms (those modular forms with Euler product expansions) of any level and weight, and allow for more types of sums to be computed. Ultimately write code that can be used to compute sums for any motivic $L$-function.

I'm going to attack speeding up the code first, since the elliptic curve rank estimation functionality will probably see the most use. If there's time at the end of the project I can look to generalizing the code as much as I can.

My favourite hat

2014-06-19T10:35:00.000-07:00

In 2007 I took an overland trip though Namibia with my parents and two sets of friends. Our first port of call was the Augrabies National Park on the South African side of the South Africa/Namibia border. At the gift shop there I bought this hat (along with a warthog fluff toy, which I also still own).

It's a Rogue handcrafted leather bush hat, and it's wonderful. Supremely comfortable, it generally goes where I go whenever there's the chance of sunny weather - which here in Seattle means the months of June through September. I hope to continue to wear this hat for many a year to come.

Also it makes me look like Indiana Jones.

The explicit formula and bounding analytic rank from above

2014-06-13T14:26:00.004-07:00

Wednesday's post gave some detail on how one can numerically evaluate $L_E(s)$, the $L$-function attached to $E$, at any given complex input $s$. This is certainly the first step in determining the analytic rank of $E$ i.e. the order of vanishing of the $L_E(s)$ at the central point $s=1$.

However, we immediately run into two major issues. The first, as mentioned previously, is that we have no proven lower bound on the size of the coefficients of the Taylor expansion of $L_E(s)$ at $s=1$, so we can never truly ascertain if the $n$th derivative there vanishes, or is just undetectibly small.

Today's post focuses on a method that at least partially overcomes the second issue: that evaluating $L_E(s)$ and its derivatives takes at least $O(\sqrt{N})$ time, where $N$ is the conductor of $E$. This means that working with elliptic curve $L$-functions directly becomes computationally infeasible if the curve's conductor is too big.

To see how we can bypass this problem, we'll have to talk about logarithmic derivatives. The logarithmic derivative of a function $f(s)$ is just
$$ \frac{d}{ds} \log(f(s)) = \frac{f^{\prime}(s)}{f(s)}. $$
Logarithmic derivatives have some nice properties. For one, they turn products into sums: if $f(s) = g(s)h(s)$, then $\frac{f^{\prime}}{f} = \frac{g^{\prime}}{g} + \frac{h^{\prime}}{h}$. Because we can write $L_E(s)$ as an Euler product indexed by the primes, we therefore have that
$$ \frac{L_E^{\prime}}{L_E}(s) = \sum_p \frac{d}{ds} \log\left(\left(1-a_p p^{-s} + \epsilon_p p^{-2s}\right)^{-1}\right). $$
When you multiply this out, you rather amazingly get a Dirichlet series whose coefficients can be described in a very mathematically elegant way:
$$ \frac{L_E^{\prime}}{L_E}(1+s) = \sum_{n=1}^{\infty} c_n n^{-s}, $$
where
$$ c_n = \begin{cases}
\left(p+1-\#E(\mathbb{F}_{p^m})\right)\frac{\log p}{p^m}, & n = p^m \mbox{a perfect prime power,} \\
0 & \mbox{otherwise.}\end{cases} $$

That is, the $n$th coefficient of $\frac{L_E^{\prime}}{L_E}(1+s)$ is $0$ when $n$ is not a prime power, and when it is, the coefficient relates directly to the number of points on the reduced curve over the finite field of $n$ elements. [Note we're shifting the logarithmic derivative over to the left by one unit, as this puts the central point at the origin i.e. at $s=0$.]

How does this yield a way to estimate analytic rank? Via something called the Explicit Formula for elliptic curve $L$-functions - which, confusingly, is really more of a suite of related methods than any one specific formula.

Remember the Generalized Riemann Hypothesis? This asserts, in the case of elliptic curve $L$-functions at least, that all nontrivial zeros of $L_E(s)$ lie on the line $\Re(s) = 1$, and two zeros can never lie on top of each other except at the central point $s=1$. The number of such zeros at the central point is precisely the analytic rank of $E$. Each nontrivial zero can therefore be written as $1\pm i\gamma$ for some non-negative real value of $\gamma$ - it turns out that whenever $L_E(1+i\gamma)=0$, then $L_E(1-i\gamma)=0$ too, so noncentral zeros always come in pairs. We believe GRH to be true with a high degree of confidence, since in the 150 years that $L$-functions have been studied, no-one has ever found a non-trivial zero off the critical line.

We can invoke GRH with logarithmic derivatives and something called the Hadamard product representation of an entire function to get another formula for the logarithmic derivative of $L_E(s)$:
$$ \frac{L_E^{\prime}}{L_E}(1+s) = -\log\left(\frac{\sqrt{N}}{2\pi}\right) - \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \sum_{\gamma} \frac{s}{s^2+\gamma^2}. $$
Here $\frac{\Gamma^{\prime}}{\Gamma}(s)$ is the logarithmic derivative of the Gamma function (known as the digamma function; this is a standard function in analysis that we know a lot about and can easily compute), and $\gamma$ runs over the imaginary parts of the zeros of $L_E(s)$ on the critical strip $\Re(s)=1$.

What does this get us? A way to equate a certain sum over the zeros of $L_E(s)$ to other more tractable functions and sums. When you combine the two formulae for $\frac{L_E^{\prime}}{L_E}(1+s)$ you get the following equality:

$$ \sum_{\gamma} \frac{s}{s^2+\gamma^2} = \log\left(\frac{\sqrt{N}}{2\pi}\right) + \frac{\Gamma^{\prime}}{\Gamma}(1+s) + \sum_n \frac{c_n}{n} n^{-s}.$$

The key realization here is that while computing the zeros of $L_E$ is in general nontrivial and time-consuming, all the other quantities in the equation above are readily computable. We can therefore evaluate on the left by computing the quantities on the right to sufficient precision, none of which require knowledge of the exact locations of the zeros of $L_E(s)$.

The above is the first and perhaps most basic example of an explicit formula for elliptic curve $L$-functions, but it is not the one we want. Specificially, trouble still arises in the fact that when were still dealing with an infinite sum on the right, and when we look close to the central point $s=0$, convergence is unworkably slow. We can, however, work with related explicit formulae that reduce the sum on the right to a finite one.

The formula we'll use in our code can be obtained from the above equation by dividing both sides by $s^2$ and taking inverse Laplace transforms; we can also formulate it in terms of Fourier transforms. Specifically, we get the following. Recall the sinc function $\mbox{sinc}(x) = \frac{\sin(\pi x)}{\pi x}$ with $\mbox{sinc}(0)=1$. Then we have:
$$ \sum_{\gamma} \mbox{sinc}(\Delta\gamma)^2 = Q(\Delta,N) + \sum_{\log n<2\pi\Delta} c_n\cdot\left(2\pi\Delta-\log n\right), $$
where $Q(\Delta,N)$ is a certain quantity depending only on $\Delta$ and $N$ that is easy to compute.

Let's look at the left hand side. Since $\mbox{sinc}(\Delta\gamma)^2$ is non-negative for any $\gamma$, and $\mbox{sinc}(0)^2 = 1$, when we sum over all zeros $\gamma$'s we get a value which must be strictly greater than the number of zeros at the central point (i.e. the analytic rank of $E$). Moreover, as we increase $\Delta$, the contribution to the sum from all the noncentral zeros goes to zero, as $\mbox{sinc}(x) \to 0$ as $x \to \infty$.

A graphic representation of the sum on the left side of the equation for the $L$-function attached to the elliptic curve $y^2 = x^3 + 103x - 51$ for three increasing values of the parameter $\Delta$. Vertical lines have been plotted at $x=\gamma$ whenever $L_E(1+i\gamma) = 0$, and the height of each line is given by the black curve $\mbox{sinc}(\Delta x)^2$. Thus summing up the length of the vertical lines gives you the value of the sum on the left side of the equation. We see that as $\Delta$ increases, the contribution from the blue lines - corresponding to noncentral zeros - goes to zero, while the contribution from the central zeros in red remain at 1 apiece. Since there are two central zeros (plotted on top of each other here), the sum will limit to 2 from above as $\Delta$ increases.

Now consider the right hand side of the sum. Note that the sum over $n$ is finite, so we can compute it to any given precision without having to worry about truncation error. Since $Q(\Delta,N)$ is easy to compute, this allows us to evaluate the entire right hand side with (relatively) little fuss. We therefore are able, via this equality, to compute a quantity that converges to $r$ from above as $\Delta \to \infty$.

A graphical representation of the finite sum $\sum_{\log n<2\pi\Delta} c_n\cdot\left(2\pi\Delta-\log n\right)$ for the same curve, with $\Delta=1$. The black lines are the triangular function $y = \pm (2\pi\Delta-x)$, whose value at $\log n$ weight the $c_n$ coefficient. Each blue vertical line is placed at $x=\log n$, and has height $c_n\cdot\left(2\pi\Delta-\log n\right)$. Summing up the signed lengths of the blue lines gives the value of the sum over $n$. Note that there are only 120 non-zero terms in this sum when $\Delta=1$, so it is quick to compute.

This formula, and others akin to it, will form the core functionality of the code I will be writing and adapting to include in Sage, as they give us a way to quickly (at least when $\Delta$ is small) show that the analytic rank of a curve can't be too big.

There are some downsides, of course, most notable of which is that the number of terms in the finite sum on the right is exponential in $\Delta$. Tests show that on a laptop one can evaluate the sum in a few milliseconds for $\Delta=1$, a few seconds for $\Delta = 2$, but the computation takes on the order of days when $\Delta=4$. Nevertheless, for the majority of the elliptic curves we'll be looking at, $\Delta$ values of 2 or less will yield rank bounds that are in fact tight, so this method still has much utility.

Another issue that warrants investigation is to analyze just how fast - or slow - the convergence to $r$ will be as a function of increasing $\Delta$; this will allow us to determine what size $\Delta$ we need to pick a priori to get a good estimate out. But this is a topic for a future date; the next post or two will be dedicated toward detailing how I'm implementing this rank estimation functionality in Sage.

How to compute with elliptic curve L-functions and estimate analytic rank

2014-06-11T15:49:00.002-07:00

In a previous post I defined what an elliptic curve $L$-function $L_E(s)$ is, and indicated why the behaviour of the function at the central point $s=1$ is of such great interest: the order of vanishing of $L_E(s)$ at $s=1$ is conjecturally precisely equal to the algebraic rank of $E$.

Because this equality is still conjectural, we will cal the former quantity -- then number of derivitaves of $L_E(s)$ that vanish at $s=1$ - the analytic rank of $E$. The topic of this post is to address the question: given an elliptic curve $E$, how do we go about computing its analytic rank?

Before we can hope to answer this question, we need to know how to evaluate $L_E(s)$ itself for any given $s$. In the previous post I gave both the Euler product and Dirichlet series definitions for $L_E(s)$; to jog your memory, here's the Euler product of $L_E(s)$:
$$ L_E(s) = \prod_p \left(1-a_p p^{-s} + \epsilon_p p^{-2s}\right)^{-1}, $$
where the product runs over all prime numbers, $a_p = p+1-\#\{E(\mathbb{F}_p)\}$, and $\epsilon_p = 0$ if $p$ divides the conductor of $E$ and $1$ otherwise. The Dirichlet series is $L_E(s) = \sum_{n}a_n n^{-s}$, which is precisely what you get when you multiply out the Euler product. Note that we are suppresing the dependence on $E$ in both the $a_n$ and $\epsilon_p$ constants.

However, both the Euler product and Dirichlet series representations of $L_E(s)$ will only converge absolutely when the real part of $s$ exceeds $\frac{3}{2}$. Although the Sato-Tate Conjecture (now a theorem, but the name has stuck) implies that the expansions will in fact converge conditionally for $\Re{s}>\frac{1}{2}$, the convergence is so slow that attempting to evaluate $L_E(s)$ near the central point by multiplying or summing enough terms is horribly inefficient. As such, we need a better way to evaluate $L_E(s)$ - and thankfully, such better ways do indeed exit.

Remember how I mentioned in the previous post that $L$-functions obey a very nice symmetry condition? Well, here's that condition exactly: first, we need to define something called the completed $L$-function $\Lambda_E(s)$. This is just $L_E(s)$ multiplied by some extra factors. Specifically,
$$\Lambda_E(s) = N^{\frac{s}{2}}(2\pi)^{-s}\Gamma(s) L_E(s), $$
where $N$ is the conductor of $E$ and $\Gamma(s)$ is the usual Gamma function on $\mathbb{C}$ (the one that gives you $(n-1)!$ when you evaluate it at the positive integer $s=n$).

We can show that $\Lambda_E(s)$ is entire on $\mathbb{C}$; that is, it doesn't have any poles. Moreover, $\Lambda_E(s)$ obeys the glorious symmetry property
$$ \Lambda_E(s) = w_E \Lambda_E(2-s), $$
where $w_E$ is either $1$ or $-1$ and depends on $E$. This is called the functional equation for the $L$-function attached to $E$.

Another way to put it is that shifted completed $L$-function $\Lambda_E(1+s)$ is either an even or an odd function of $s$. Because the factors $N^{\frac{s}{2}}$, $(2\pi)^{-s}$ and $\Gamma(s)$ are all readily computable, this allows us to determine the value of $L_E(s)$ when the real part of $s$ is less than $\frac{1}{2}$.

What's left, then, is to figure out how to efficinetly evaluate $L_E(s)$ in the strip $\frac{1}{2} \le \Re(s) \le \frac{3}{2}$. This is called the critical strip for the $L$-function, and it is here that the behaviour of the function is most interesting.

[Aside: we can, for example, show that $\Lambda_E(1+s)$ is never zero outside of the critical strip. The Generalized Riemann Hypothesis in fact asserts that elliptic curve $L$-functions are only ever zero along the exact center of this strip, the critical line $\Re(s)=1$. We'll get back to the zeros of elliptic curve $L$-functions in a later post.]

To evaluate $\Lambda_E(s)$ in the critical strip, we make use of the modularity of $E$. The modularity theorem states that elliptic curves are modular: for every elliptic curve $E$ over the rationals there exists a weight 2 cuspidal eigenform $f_E$ of level $N$ (where $N$ is precisely the conductor of $E$), such that the $a_n$ of $E$ as defined previously equal the Fourier coefficients of $f_E$. If you haven't studied modular forms before, the crucial piece of knowledge is that there is a natural procedure for constructing $L$-functions from modular forms in such a way that the definition makes sense for all complex inputs $s$, and that the $L$-function attached to the cusp form $f_E$ will exactly equal the elliptic curve $L$-function $L_E(s)$. This is in fact how we show that elliptic curve $L$-functions can be analytically continued to all of $\mathbb{C}$.

The take-away is that via modularity there is an infinite sum representation for $L_E(s)$ which converges absolutely for all $s$. Here it is: define the auxiliary function
$$\lambda_E(s) = \left(\frac{\sqrt{N}}{2\pi}\right)^{s} \sum_{n=1}^\infty a_n n^{-s}\Gamma \left(s,\frac{2\pi}{\sqrt{N}}\cdot n\right), $$
where all the quantities are as defined previously, and $\Gamma(s,x)$ is the upper incomplete Gamma function (note that since $\Gamma(s,x)$ goes to zero exponentially as $x$ goes to infinity, this sum will converge absolutely for any $s$, with rate of convergence scaling with $\sqrt{N}$). Then we have
$$ \Lambda_E(s) = \lambda_E(s) + w_E \lambda_E(2-s). $$
Because we know how to compute incomplete Gamma functions, this gives us a way to evaluate $\Lambda_E(s)$, and thus $L_E(s)$, in the critical strip. The upside with this formula and variations thereof is that it works for any value of $s$ you stick in - including values near $s=1$. Similar formulae exist for the derivatives of $L_E(s)$, so we can in theory compute $L_E^{(n)}(1)$, the $n$th derivative of $L_E(s)$ at the central point, to any degree of accuracy for any $n$.

Thus if we want to compute a curve's analytic rank, what's stopping us from just evaluating successive derivatives of $L_E(s)$ at $s=1$ until we hit one which is not zero?

Two reasons. The first is that there's no way around the fact that you need about $\sqrt{N}$ terms to compute $L_E(s)$ or its derivatives to decent precision. If the conductor of the curve is too big, as is often the case, it takes an inordinate amount of time to simply evaluate the $L$-function near the central point. This makes direct evaluation impractical for all but the smallest-conductor curves -- and for those curves we can usually compute rank via other methods anyway.

The second reason is a more subtle one: how do you tell numerically if the $n$th derivative of $L_E(s)$ at $s=1$ is zero? If you think about it, it's easy to answer this question in one direction only: if you evaluate $L_E^{(n)}(1)$ to some precision and get digits that aren't all zero, then (assuming your code is correct) the $n$th derivative of $L_E(s)$ does not vanish. However, no matter how many digits of precision we compute getting all zeros, the possibility will always remain that the next digit along might not be zero.

In general, there is no numerical way to determine if the value of a black-box complex-valued function at a point is zero, or just really really close to zero. This is why, if you look in the literature, you'll find "methods to estimate analytic rank", but never to compute the quantity exactly. It's impossible to do without extra knowledge. Specifically, we'd need to have some sort of a computable lower bound on the size of the derivatives of $L_E(s)$ at the central. Unfortunately, no such theorems currently exist, so we're stuck with estimating analytic rank for now.

Thankfully, the $\sqrt{N}$-dependence issue is more hopeful. The next post will detail a method that provides good estimates for the analytic rank that scales much more slowly with the curve's conductor.

How to develop for Sage using Sage Math Cloud  and Git

2014-06-10T09:01:00.000-07:00

I will be using Sage Math Cloud to write all the code for my Google Summer of Code project. The advantage of choosing the cloud-based route should be clear: you don’t need to install or store anything locally, and it frees you up to use any device with a web browser you choose to write code. The downside, as Sage Math Cloud is still new, is that tutorials and how-tos on the setup process in order to do so are a bit lacking. Since I couldn’t find any single unified source on how to do it, I thought I might to create a step-by-step guide on getting set up in Sage Math Cloud to write code in for inclusion in the Sage codebase.

Much of what I've done follows the Git the Hard Way tutorial for Sage development. However, there are significant departures. I recommend reading through that tutorial first; if you do then what follows below will be a piece of cake.

First, you’ll need an account at cloud.sagemath.com. Good news: the whole service is completely free to use, and there’s no annoying back-and-forthing via emailed confirmation links. It’ll take you two minutes tops.

The cloud.sagemath.com homepage.

Your work on your Sage Math Cloud account is project-based; each account can have a large number of projects associated to it. Each project acts like it’s own Linux-based file system inside of which you can create and edit any number of files of any type. Create a project in which you'll house your code.

The project root view, before any files have been created.

Next, you'll want a local installation of Sage. We're going to initiate a local copy of the Sage master repo, as hosted on GitHub. Click New --> Terminal to open up a new terminal window - we'll be doing everything via terminal commands. Then type the following three commands:


~$ git clone git://github.com/sagemath/sage.git

~$ cd sage

~/sage$ make

The first will download a copy of the Sage codebase into the new folder 'sage' (it should appear in the project root directory after the download is complete); the second changes into the directory, and the third initiated the compiling of the Sage source.

Note that the building of Sage usually takes about 3 hours, so get this going and then kick back and take inspiration from this.

While your code is compiling you'll want to make sure you have a GitHub account. If you don't follow the steps there to create one. While not required, hosting any projects on GitHub is highly recommended: it will allow multiple people to work on a project simultaneously, as well as giving you access to sophisticated revision control functionality that will prevent you from accidentally deleting code or data.

GitHub repo creation page.

Create an empty repository - we'll fill it with things later. So long as the repo exists and you know its name you're good to go.

The next thing we'll want to do is register the Sage Trac repository, where all the active tickets are hosted for Sage development. Once you're done creating an empty repo on your GitHub account, go back to your Sage Math Cloud project and wait for Sage to finish building. When this is done, in the terminal window type the following:



~/sage$ git remote add trac git@trac.sagemath.org:sage.git -t master

~/sage$ git remote -v

origin  git://github.com/sagemath/sage.git (fetch)

origin  git://github.com/sagemath/sage.git (push)

trac    git@trac.sagemath.org:sage.git (fetch)

trac    git@trac.sagemath.org:sage.git (push)

The git remote add command registers the Sage trac repo as a remote repository (trac is the name it is registered as locally; you can of course call it anything you want). The git remote -v command simply lists what repos you have now registered. We won't be using Sage trac for now. If you want more practice with trac, see the Git the Hard Way tutorial mentioned previously.

Note the inclusion of the word master in the command above; this means we will only track the master branch of the remote repo. A branch is an instanced copy of the codebase which you can work on and edit without altering other branches. The master branch should be kept pristine, so we will want to create and switch to a new branch. This is accomplished with the git checkout command:



~/sage$ git checkout -b demo

Switched to a new branch 'demo'

~/sage$ git branch

* demo

  master

The -b parameter creates the branch, while checkout switches to that branch. Typing git branch then shows you what branches currently exist locally, as well as the one you're currently on.

We're now in the position to register the demo branch with our fresh repo on GitHub:



~/sage$ git remote add demo https://github.com/haikona/demo.git

~/sage$ git remote -v

demo    https://github.com/haikona/demo.git (fetch)

demo    https://github.com/haikona/demo.git (push)

origin  git://github.com/sagemath/sage.git (fetch)

origin  git://github.com/sagemath/sage.git (push)

trac    git@trac.sagemath.org:sage.git (fetch)

trac    git@trac.sagemath.org:sage.git (push)

And finally, we can push to our repo to sync the local copy therewith. For this you will need your GitHub password - so not just any old Harry can push to your repo.

~/sage$ git push demo demo

Username for 'https://github.com': haikona

Password for 'https://haikona@github.com':

Counting objects: 4, done.

Delta compression using up to 4 threads.

Compressing objects: 100% (2/2), done.

Writing objects: 100% (3/3), 317 bytes | 0 bytes/s, done.

Total 3 (delta 1), reused 0 (delta 0)

To https://github.com/haikona/demo.git

   2b1d88b..21cddb8  demo -> demo

You're now synced and ready to start writing code! From hereon you should be able to follow one of the many tutorials on working with git; the hard part - setting up - is all done.

Mathematical background: elliptic curves and L-functions

2014-06-09T13:15:00.000-07:00

Day 1 of my Google Summer of Code project! I will try post updates at least once a week; I thought a good starting point would be to give an introduction to the mathematical objects that this project revolves around. The next post will then give a more detailed description of the project itself, and the structure of the code that I'm going to adapt, write and include in Sage.

To that end, for this post I will assume some knowledge of elementary number theory, algebra and complex analysis, but nothing more complicated than that.

Let $E$ be an elliptic curve over the rational numbers. We can think of $E$ as the set of rational solutions $(x,y)$ to a two-variable cubic equation in the form:
$$ E: y^2 = x^3 + Ax + B $$
for some integers $A$ and $B$, along with an extra "point at infinity". An important criterion is that the $E$ be a smooth curve; this translates to the requirement that the discriminant of the curve, given by $-16(4A^3+27B^2)$, is not zero.

One of the natural questions to ask when considering an elliptic curve is "how many rational solutions are there?" It turns out elliptic curves fall in that sweet spot where the answer could be zero, finitely many or infinitely many - and figuring out which is the case is a deeply non-trivial problem.

The rational solutions form an abelian group with a well-defined group operation that can be easily computed. By a theorem of Mordell, the group of rational points on an elliptic curve $E(\mathbb{Q})$ is finitely generated; we can therefore write
$$ E(\mathbb{Q}) \approx T \times \mathbb{Z}^r,$$
where $T$ is a finite group (called the torsion subgroup of $E$), and $r$ is denoted the algebraic rank of $E$.

The elliptic curve group law explained visually: three points in a straight line add to zero; because the point at infinity is the identity element, this means that the sum R of two points P and Q is the reflection about the real axis of the other point on the curve on the straight line connecting P and Q.

Determining the torsion subgroup of $E$ is a relatively straightforward endeavor. By a theorem of Mazur, rational elliptic curves have torsion subgroups that are (non-canonically) isomorphic to one of precisely fifteen possibilities: $\mathbb{Z}/n\mathbb{Z}$ for $n = 1$ through $10$ or $12$; or $\mathbb{Z}/2\mathbb{Z}\oplus \mathbb{Z}/2n\mathbb{Z}$ for $n = 1$ though $4$. Computing the rank $r$ - the number of independent rational points on $E$ - is the hard part, and it is towards this end that this project hopes to contribute.

Perhaps surprisingly, we can translate the algebraic problem of finding rational solutions to $E$ to an analytic one - at least conjecturally. To understand this we'll need to know what an elliptic curve $L$-function is. These are holomorphic functions defined on the whole complex plane that somehow encode a great deal of information about the elliptic curve they're attached to.

The definition goes as follows: for each prime $p$, count the number of solutions to the elliptic curve equation modulo $p$; we'll call this number $N_p(E)$. Then define the number $a_p(E)$ by
$$ a_p(E) = p+1 - N_p(E). $$
Hasse's Theorem states that $a_p(E)$ is always less that $2\sqrt{p}$ in magnitude for any $p$, and the Sato-Tate conjecure (recently proven by Taylor et al) states that for a fixed elliptic curve, the $a_p$ (suitably transformed) are asymptotically distributed in a semi-circular distribution about zero.

Next, for a given $p$ define the local factor $L_p(s)$ to be the function of the complex variable $s$ as follows:
$$ L_p(s) = \left(1-a_p(E)p^{-s} + \epsilon(p)p^{-2s}\right)^{-1}, $$
where $\epsilon(p)$ is 0 if $p$ divides the conductor of $E$, and 1 otherwise.

The conductor of $E$ is a positive integer that encodes the bad reduction of $E$: for a finite list of primes $p$, when we reduce the curve modulo $p$, we don't get a smooth curve but rather a singular one instead. The conductor is just the product of these primes to the first or second power (or in the cases $p=2$ or $3$, up to the eighth and fifth powers respectively). If you're unfamiliar with elliptic curves, the thing to note is that the conductor always divides the discriminant of $E$, namely the quantity $-16(4A^3+27B^2)$ mentioned previously.

Finally we can define the $L$-function attached to $E$:
$$ L_E(s) = \prod_p L_p(s) = \prod_p \left(1-a_p(E)p^{-s} + \epsilon(p)p^{-2s}\right)^{-1}. $$
The above representation of $L_E(s)$ is called the Euler product form of the $L$-function. If we multiply out the terms and use power series inversion we can also write $L_E(s)$ as a Dirichlet series:
$$ L_E(s) = \sum_{n=1}^{\infty} a_n(E) n^{-s}, $$
where for non-prime $n$ the coefficients $a_n$ are defined to be exactly the integers you get when you multiply out the Euler expansion.

If you do some analysis, using Hasse's bound on the size of the $a_p(E)$ and their distribution according to Sato-Tate, one can show that the above to representations only converge absolutely when the real part of $s$ is greater than $\frac{3}{2}$, and conditionally for $\Re(s)>\frac{1}{2}$. However, the modularity theorem states that these elliptic curve $L$-functions can actually be analytically continued to the entire complex plane. That is, for every elliptic curve $L$-function $L_E(s)$ as defined above, there is an entire function on $\mathbb{C}$ which agrees with the Euler product/Dirichlet series definition for $\Re(s)>1$, but is also defined - and readily computable - for all other complex values of $s$. This entire function is what we actually call the $L$-function attached to $E$.

The way we analytically continue $L_E(s)$ yields that the function is highly symmetric about the line $\Re(s)=1$; moreover, because the function is defined by real coefficients $L_E(s)$ also obeys a reflection symmetry along the real axis. The point $s=1$ is in a very real sense therefore the central value for the $L$-function. It thus makes sense to investigate the behaviour of the function around this point.

Because $L_E(s)$ is entire, it has a Taylor expansion at any given point. We can ask what the Taylor expansion of $L_E(s)$ about the point $s=1$ is, for example. One of the central unproven conjectures in modern-day number theory is the Birch and Swinnerton-Dyer Conjecture: that the order of vanishing of the Taylor expansions of $L_E(s)$ about the point $s=1$ is precisely $r$, the algebraic rank of the elliptic curve $E$. That is, if we let $z=s-1$ so that
$$ L_E(s) = c_0 + c_1 z + c_2 z^2 + \ldots $$
is the expansion of $L_E(s)$ about the central point, the BSD conjecture holds that $c_0$ through $c_{r-1}$ are all zero, and $c_r$ is not zero.

The values of three elliptic curve L-functions along the critical line $1+it$ for $-10<=1<=10$. Blue corresponds to the curve $y^2 = x^3 - 13392x - 1080432$, a rank 0 curve, red is that of $y^2 = x^3 - 16x + 16$, a rank 1 curve, and green is $y^2 = x^3 - 3024x + 46224$, a rank 2 curve. Note that close to the origin the graphs look like non-zero constant function, a straight line through the origin and a parabola respectively.

Thus if we can compute the order of vanishing of the elliptic curve $L$-function at the central point, we can at least conjecturally compute the rank of the curve. This converts an algebraic problem into a numerical one, which is perhaps more tractible.

The techniques we'll use to attempt to do this will be the subject of the next blog post. Unfortunately there are still plenty of challenges to this approach - the least of which boils down to: how do you numerically determine if the value of a function is zero, or just really, really close to zero? The short answer is that without theorems to back you up, you can't -- but we can still make considerable progress toward the problem of computing an elliptic curve's rank.

2014-05-19T08:43:00.002-07:00

Gotta pay lip service to the other focus point of this blog, yo.

2014-05-15T10:26:00.001-07:00

The unit of success of a trip is the Hat, equal to the number of additional pieces of headgear one returns with from said trip. For example, my recently concluded sojourn to South Africa was 2 Hats successful.

Two Weeks Down South

2014-05-14T08:55:00.001-07:00

Back from South Africa!

I've just spent a fortnight back in my motherland. I went over for a number of reasons: first and foremost, my US F-1 visa was due to expire soon, so I needed to renew it in order to continue to be able to enter and exit the country while I'm still studying here. While in SA I also voted (the South African national elections were held on May 5) and renewed my pilot's license.

Consequently, very little math or coding has occurred in the past two weeks. So: time to get back in the swing of things! Stay tuned folks.

- Simon

2014-04-24T10:42:00.002-07:00

$\LaTeX$ Test!

I'll be writing a lot of math, so being able to incorporate $\LaTeX$ into my posts is essential. To this end I use MathJax, which allows you to write inline math expressions such as $\frac{1+\sqrt{-163}}{2}$ or standalone equations:
$$\frac{df}{dx} = \frac{1}{2} \left(\frac{ab \textrm{ sech}^2(b \sqrt{x})}{x} - \frac{a \tanh(b \sqrt{x})}{x^{3/2}} \right).$$

Hopefully these render once I get the MathJax plugin working properly...

[Edit] Looks like it's working!

Hello World!

2014-04-24T10:27:00.001-07:00

Hi Everyone!

I'm sure I'll look back on this first post and lament about its lack of meaningful content.

I've been accepted into the Google Summer of Code 2014. My project is to implement efficient analytic rank computation in Sage, the open source mathematical computing package whose development is headed up by my advisor, William Stein. Part of the parcel of working for GSoC is to blog your progress, so here we go.

My posts will hopefully detail this journey, as well as making ample references to hats, the other great love of my life.

- Simon