Jack Marchant

Offset and Cursor Pagination explained

October 30, 2018

Typically in an application with a database, you might have more records than you can fit on a page or in a single result set from a query. When you or your users want to retrieve the next page of results, two common options for paginating data include:

  1. Offset Pagination
  2. Cursor Pagination

Offset Pagination

When retrieving data with offset pagination, you would typically allow clients to supply two additional parameters in their query: an offset, and a limit. An offset is simply the number of records you wish to skip before selecting records. This gets slower as the number of records increases because the database still has to read up to the offset number of rows to know where it should start selecting data. This is often described as O(n) complexity, meaning it's generally the worst-case scenario. Additionally, in datasets that change frequently as is typical of large databses with frequent writes, the window of results will often be inaccurate across different pages in that you will either miss results entirely or see duplicates because results have now been added to the previous page.

If we want to get the first page of the newest posts from a database, the query might look like this:

Post
|> order_by(inserted_at: :desc)
|> limit(20)

Then, when we want the second page of results, we can include an offset:

Post
|> order_by(inserted_at: :desc)
|> limit(20)
|> offset(20)

While you could get away with this method initially, and it's definitely worth doing first - as the number of records increases you can consider alternatives to make reading much faster and more accurate.

Cursor Pagination

This is where cursor based pagination comes in. A cursor is a unique identifier for a specific record, which acts as a pointer to the next record we want to start querying from to get the next page of results. With using a cursor, we remove the need to read rows that we have already seen by using a WHERE clause in our query (making it faster to read data as it's constant i.e. O(1) time complexity) and we address the issue of inaccurate results by always reading after a specific row rather than relying on the position of records to remain the same.

Using our previous example, but this time implementing pagination with a cursor:

Post
|> order_by(inserted_at: :desc)
|> limit(20)
|> where([p], p.id < ^cursor)

In order for us to use a cursor, we need to return the results from the first page, in addition to the cursor for the last item in our result set. Using a cursor in this way is fine for moving forward in the result set, but by changing the fetching direction, you add complexity to how you retrieve records.

Conclusion

Cursor pagination is most often used for real-time data due to the frequency new records are added and because when reading data you often see the latest results first. There different scenarios in which offset and cursor pagination make the most sense so it will depend on the data itself and how often new records are added. When querying static data, the performance cost alone may not be enough for you to use a cursor, as the added complexity that comes with it may be more than you need.


Written by Jack Marchant, a Software Engineer who writes about writing code. You should follow him on Twitter or check out his code on GitHub


using a dependency injection container to decouple code

June 03, 2020

Dependency Injection is the method of passing objects to another (usually during instantiation) to invert the dependency created when you use an object. A Container is often used as a collection of the objects used in your system, to achieve separation between usage and instantiation.

3 tips to help with working from home

April 17, 2020

Working from home has been thrust upon those lucky enough to still have a job. Many aren’t sure how to cope, some are trying to find ways to help them through the day. Make no mistake, this is not a normal remote working environment we find ourselves in, but nonetheless we should find ways to embrace it.

making software a three step process

April 14, 2020

One of the most useful tips that has guided much of my decision over the years has been this simple principle: three steps, executed in sequential order;

help me help you code review

October 24, 2019

Code Reviews are one of the easiest ways to help your team-mates. There are a number of benefits for both the reviewer and pull request author:

a pratical guide to test driven development

September 12, 2019

It’s been a while since I last wrote about why testing is important, but in this post I thought I would expand on that and talk about why not only unit testing is important, but how a full spectrum of automated tests can improve productivity, increase confidence pushing code and help keep users happy.

facade pattern

July 05, 2019

Design Patterns allow you to create abstractions that decouple sections of a codebase with the purpose of making a change to the code later a much easier process.

the problem with elixir umbrella apps

May 03, 2019

Umbrella apps are big projects that contain multiple mix projects. Using umbrella apps feels more like getting poked in the eye from an actual umbrella.

broken windows

April 14, 2019

Ever get the feeling that adding this "one little hack", a couple of lines of code, won't have much of an impact on the rest of the codebase? You think nothing of it and add it, convincing your team members it was the correct decision to get this new feature over the line. In theory, and generally speaking, I would kind of agree with doing it, but every hack is different so it's hard to paint them all with the same brush. If you've been doing software development for long enough you can see this kind of code coming from a mile away. It's the kind of code that can haunt your dreams if you're not careful.

lonestar elixir 2019

March 04, 2019

Last week was Lonestar ElixirConf 2019 held in Austin, Texas. The conference ran over 2 days and was the first Elixir conference I had been to.

genserver async concurrent tasks

February 01, 2019

In most cases I have found inter-process communication to be an unnecessary overhead for the work I have been doing. Although Elixir is known for this (along with Erlang), it really depends on what you’re trying to achieve and processes shouldn’t be spawned just for the fun of it. I have recently come across a scenario where I thought having a separate process be responsible for performing concurrent and asynchronous jobs would be the best way to approach the problem. In this article I will explain the problem and the solution.

best practices third party integrations

December 19, 2018

When we think about what an application does, it's typical to think of how it behaves in context of its dependencies. For example, we could say a ficticious application sync's data with a third-party CRM.

you might not need a genserver

November 20, 2018

When you're browsing your way through Elixir documentation or reading blog posts (like this one), there's no doubt you'll come across a GenServer. It is perhaps one of the most overused modules in the Elixir standard library, simply because it's a good teaching tool for abstractions around processes. It can be confusing though, to know when to reach for your friendly, neighbourhood GenServer.

offset cursor pagination

October 30, 2018

Typically in an application with a database, you might have more records than you can fit on a page or in a single result set from a query. When you or your users want to retrieve the next page of results, two common options for paginating data include:

protocols

September 26, 2018

Protocols are a way to implement polymorphism in Elixir. We can use it to apply a function to multiple object types or structured data types, which are specific to the object itself. There are two steps; defining a protocol in the form of function(s), and one or many implementations for that protocol.

exdocker

August 23, 2018

Recently, I've been writing a tonne of Elixir code, some Phoenix websites and a few other small Elixir applications. One thing that was bugging me every time I would create a new project is that I would want to add Docker to it either straight away because I knew there would be a dependency on Redis or Postgres etc, or halfway through a project and it would really slow down the speed at which I could hack something together.

working with tasks

July 26, 2018

While writing Understanding Concurrency in Elixir I started to grasp processes more than I have before. Working with them more closely has strengthened the concepts in my own mind.

understanding concurrency

July 14, 2018

Concurrency in Elixir is a big selling point for the language, but what does it really mean for the code that we write in Elixir? It all comes down to Processes. Thanks to the Erlang Virtual Machine, upon which Elixir is built, we can create process threads that aren't actual processes on your machine, but in the Erlang VM. This means that in an Elixir application we can create thousands of Erlang processes without the application skipping a beat.

composing ecto queries

July 06, 2018

Ecto is an Elixir library, which allows you to define schemas that map to database tables. It's a super light weight ORM, (Object-Relational Mapper) that allows you to define structs to represent data.

streaming datasets

June 27, 2018

We often think about Streaming as being the way we watch multimedia content such as video/audio. We press play and the content is bufferred and starts sending data over the wire. The client receiving the data will handle those packets and show the content, while at the same time requesting more data. Streaming has allowed us to consume large media content types such as tv shows or movies over the internet.

elixir queues

June 06, 2018

A Queue is a collection data structure, which uses the FIFO (First In, First Out) method. This means that when you add items to a queue, often called enqueuing, the item takes its place at the end of the queue. When you dequeue an item, we remove the item from the front of the queue.

composing plugs

March 23, 2018

Elixir is a functional language, so it’s no surprise that one of the main building blocks of the request-response cycle is the humble Plug. A Plug will take connection struct (see Plug.Conn) and return a new struct of the same type. It is this concept that allows you to join multiple plugs together, each with their own transformation on a Conn struct.

elixir supervision trees

February 06, 2018

A Supervision Tree in Elixir has quite a number of parallels to how developers using React think about a component tree. In this article I will attempt to describe parallel concepts between the two - and if you've used React and are interested in functional programming, it might prompt you to take a look at Elixir.

surviving tech debt

December 21, 2017

Technical debt is a potentially crippling disease that can take over your codebase without much warning. One day, you’re building features, the next, you struggle to untangle the mess you (or maybe your team) has created.

pattern matching elixir

August 15, 2017

Before being introduced to Elixir, a functional programming language built on top of Erlang, I had no idea what pattern matching was. Hopefully, by the end of this article you will have at least a rudimentary understanding of how awesome it is.

first impressions elixir

January 06, 2017

Elixir is a functional programming language based on Erlang. I’m told it’s very similar to Ruby, with a few tweaks and improvements to the developer experience and language syntax.

write unit tests

November 29, 2016

Unit testing can sometimes be a tricky subject no matter what language you’re writing in. There’s a few reasons for this: