The Trouble with Tibbles

Author: Theo Roe

Published: January 8, 2018

tags: r, tidyverse, tibbles

Let’s get something straight: there isn’t really any trouble with tibbles. I’m hoping you’ve noticed this is a play on the 1967 Star Trek episode, “The Trouble with Tribbles”. I’ve recently got myself a job as a Data Scientist, here, at Jumping Rivers. Having never come across tibbles until this point, I now find myself using them in nearly every R script I compose, be that your timeless standard R script, your friendly Shiny app or an analytical Markdown document.

What are tibbles?

Presumably this is why you came here, right?

Tibbles are a modern take on data frames, but crucially they are still data frames. Well, what’s the difference then? There’s a quote I found somewhere on the internet that describes the difference quite well:

“keeping what time has proven to be effective, and throwing out what is not”.

Basically, some clever people took the classic data.frame(), shook it til the ineffective parts fell out, then added some new, more appropriate features.

Precursors

# The easiest way to get access is to install the tibble package.
install.packages("tibble")

# Alternatively, tibbles are a part of the tidyverse and hence
# installing the whole tidyverse will give you access.
install.packages("tidyverse")
# I am just going to use tibble.
library("tibble")

Tribblemaking

There are three ways to form a tibble. It pretty much acts as your friendly old pal data.frame() does. Just like standard data frames, we can create tibbles, coerce objects into tibbles and import data sets into R as a tibble. Below is a table of the traditional data.frame() commands and their respective {tidyverse} commands.

Formation Type   Data Frame Commands   Tibble Commands
Creation         data.frame()          data_frame(), tibble(), tribble()
Coercion         as.data.frame()       as_data_frame(), as_tibble()
Importing        read.*()              read_delim(), read_csv(), read_csv2(), read_tsv()

Let’s take a closer look…

1) Creation.

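Just as data.frame() creates data frames, tibble(), data_frame() and tribble() all create tibbles. Note that in more recent releases of {tibble}, data_frame() and as_data_frame() are deprecated in favour of tibble() and as_tibble().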

Standard data frame.

data.frame(a = 1:5, b = letters[1:5])
##   a b
## 1 1 a
## 2 2 b
## 3 3 c
## 4 4 d
## 5 5 e

A tibble using tibble() (identical to using data_frame()).

tibble(a = 1:5, b = letters[1:5])
## # A tibble: 5 x 2
##       a b
##   <int> <chr>
## 1     1 a
## 2     2 b
## 3     3 c
## 4     4 d
## 5     5 e

A tibble using tribble().

tribble( ~a, ~b,
       #---|----
          1, "a",
          2, "b")
## # A tibble: 2 x 2
##       a b
##   <dbl> <chr>
## 1  1.00 a
## 2  2.00 b

Notice the odd one out? tribble() is different. It’s a way of laying out small amounts of data row by row in an easy-to-read form. I’m not too keen on these, as even writing out that simple 2 x 2 tribble got tedious.

2) Coercion.

Just as as.data.frame() coerces objects into data frames, as_data_frame() and as_tibble() coerce objects into tibbles.

df = data.frame(a = 1:5, b = letters[1:5])
as_data_frame(df)
## # A tibble: 5 x 2
##       a b
##   <int> <fct>
## 1     1 a
## 2     2 b
## 3     3 c
## 4     4 d
## 5     5 e
as_tibble(df)
## # A tibble: 5 x 2
##       a b
##   <int> <fct>
## 1     1 a
## 2     2 b
## 3     3 c
## 4     4 d
## 5     5 e

You can coerce more than just data frames, too. Lists, matrices, vectors and single instances of a class can all be converted.
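As a minimal sketch of coercing something other than a data frame, the snippet below converts a named list and a matrix. The object names (lst, mat and so on) are made up for illustration, and the matrix is given column names up front, on the assumption that a recent {tibble} will want them.

```r
library("tibble")

# A named list of equal-length vectors: one column per element.
lst = list(a = 1:3, b = letters[1:3])
tib_from_list = as_tibble(lst)

# A matrix: one column per matrix column. Column names are supplied
# here so that no name repair is needed.
mat = matrix(1:6, nrow = 3, dimnames = list(NULL, c("x", "y")))
tib_from_mat = as_tibble(mat)

tib_from_list
tib_from_mat
```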

3) Importing.

There are a few options for reading in data files within the {tidyverse}, so we’ll just compare read_csv() and its data.frame() pal, read.csv(). Let’s take a look at them. I have here an example data set that I’ve created in MS Excel. You can download/look at this data here. To get access to read_csv() you’ll need the {readr} package. Again, this is part of the {tidyverse}, so loading either will do.

library("readr")
url = "https://gist.githubusercontent.com/theoroe3/8bc989b644adc24117bc66f50c292fc8/raw/f677a2ad811a9854c9d174178b0585a87569af60/tibbles_data.csv"
tib = read_csv(url)
## Parsed with column specification:
## cols(
##   `<-` = col_integer(),
##   `8` = col_integer(),
##   `%` = col_double(),
##   name = col_character()
## )
tib
## # A tibble: 4 x 4
##    `<-`   `8`   `%` name
##   <int> <int> <dbl> <chr>
## 1     1     2 0.250 t
## 2     2     4 0.250 h
## 3     3     6 0.250 e
## 4     4     8 0.250 o
df = read.csv(url)
df
##   X.. X8   X. name
## 1   1  2 0.25    t
## 2   2  4 0.25    h
## 3   3  6 0.25    e
## 4   4  8 0.25    o

Not only does read_csv() return a pretty tibble, it is also much faster. For proof, check out this article by Erwin Kalvelagen. The keen eyes amongst you will have noticed something odd about the variable names… we’ll get on to that soon.
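If you would rather not depend on the URL, here is a rough sketch that writes a two-row file with the same header to a temporary location and reads it both ways. The file contents and the names path, tib_demo and df_demo are made up for illustration; only read_csv(), read.csv(), tempfile() and writeLines() are real functions.

```r
library("readr")

# Write a tiny csv with the same awkward header as the example data.
path = tempfile(fileext = ".csv")
writeLines(c("<-,8,%,name",
             "1,2,0.25,t",
             "2,4,0.25,h"), path)

tib_demo = read_csv(path)  # tibble, header kept as written
df_demo = read.csv(path)   # data frame, header passed through make.names()

names(tib_demo)
names(df_demo)

# Non-syntactic names need backticks (or [[ with a string) to access.
tib_demo$`%`
tib_demo[["<-"]]
```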

Tibbles vs Data Frames

Did you notice a key difference in the tibble()s and data.frame()s above? Take a look again.

tibble(a = 1:26, b = letters)
## # A tibble: 26 x 2
##       a b
##   <int> <chr>
## 1     1 a
## 2     2 b
## 3     3 c
## 4     4 d
## 5     5 e
## # ... with 21 more rows

The first thing you should notice is the pretty print process. The class of each column is now displayed above it and the dimensions of the tibble are shown at the top. The default print options within tibbles mean they will only display 10 rows if the data frame has more than 20 rows (I’ve changed mine to display 5 rows). Neat. Alongside that, we now only view columns that will fit on the screen. This is already looking quite the part. The row settings can be changed via

options(tibble.print_max = 3, tibble.print_min = 1)

So now if there are more than 3 rows, we print only 1 row. Tibbles with 3 and 4 rows would now print as so.

tibble(1:3)
## # A tibble: 3 x 1
##   `1:3`
##   <int>
## 1     1
## 2     2
## 3     3
tibble(1:4)
## # A tibble: 4 x 1
##   `1:4`
##   <int>
## 1     1
## # ... with 3 more rows

Yes, OK, you could do this with the traditional data frame. But it would be a lot more work, right?
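If you only want a different row count for a single print, rather than for the whole session, print() on a tibble also accepts an n argument. A small sketch (tib26 is just an illustrative name):

```r
library("tibble")

tib26 = tibble(a = 1:26, b = letters)

# Show only the first 3 rows for this call; the global
# tibble.print_max / tibble.print_min options are left untouched.
print(tib26, n = 3)

# Show every row, however many there are.
print(tib26, n = Inf)
```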

As well as the fancy printing, tibbles don’t change the variable type, don’t partial match and they allow non-syntactic column names when importing data. We’re going to use the data from before. Again, it is available here. Notice it has 3 non-syntactic column names and one column of characters. Reading this in as a tibble and a data frame we get

tib
## # A tibble: 4 x 4
##    `<-`   `8`   `%` name
##   <int> <int> <dbl> <chr>
## 1     1     2 0.250 t
## 2     2     4 0.250 h
## 3     3     6 0.250 e
## 4     4     8 0.250 o
df
##   X.. X8   X. name
## 1   1  2 0.25    t
## 2   2  4 0.25    h
## 3   3  6 0.25    e
## 4   4  8 0.25    o

We see already that in the read.csv() process we’ve lost the column names. Let’s try some partial matching…

tib$n
## Warning: Unknown or uninitialised column: 'n'.
## NULL
df$n
## [1] t h e o
## Levels: e h o t

With the tibble we get a warning and NULL, yet with the data frame it leads us straight to our name variable. To read more about why partial matching is bad, check out this thread.
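The same behaviour can be reproduced without the csv file. In this sketch, df_demo and tib_demo are throwaway objects made up for illustration, not the ones read in above:

```r
library("tibble")

df_demo = data.frame(name = c("t", "h", "e", "o"))
tib_demo = as_tibble(df_demo)

# Data frame: $ partially matches "n" to the name column.
df_demo$n

# Tibble: no partial matching, so we get NULL plus a warning.
tib_demo$n
```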

What about subsetting? Let’s try it out using the data from our csv file.

tib[,2]
## # A tibble: 4 x 1
##     `8`
##   <int>
## 1     2
## 2     4
## 3     6
## 4     8
tib[2]
## # A tibble: 4 x 1
##     `8`
##   <int>
## 1     2
## 2     4
## 3     6
## 4     8
df[,2]
## [1] 2 4 6 8
df[2]
##   X8
## 1  2
## 2  4
## 3  6
## 4  8

Using a normal data frame, single square brackets give us a vector in one case (df[,2]) and a data frame in the other (df[2]). Using tibbles, single square brackets, [, will always return another tibble. Much neater. Now for double brackets.


tib[[1]]
## [1] 1 2 3 4
tib$name
## [1] "t" "h" "e" "o"
df[[1]]
## [1] 1 2 3 4
df$name
## [1] t h e o
## Levels: e h o t

Double square brackets, [[, and the traditional dollar, $, are ways to access individual columns as vectors. Now, with tibbles, data frame operations and single column operations are kept separate, so we don’t have to use that pesky drop = FALSE. Note, these are actually quicker than the [[ and $ of the data.frame(), as shown in the documentation for the tibble package.
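To make the drop = FALSE point concrete, here is a short sketch comparing the classes returned by each style of subsetting. The objects df_demo and tib_demo are made up for illustration:

```r
library("tibble")

df_demo = data.frame(x = 1:4, y = c(2L, 4L, 6L, 8L))
tib_demo = as_tibble(df_demo)

# Data frame: selecting one column with [, j] drops to a vector
# unless drop = FALSE is given.
class(df_demo[, 2])
class(df_demo[, 2, drop = FALSE])

# Tibble: [ always gives a tibble, [[ always gives the column itself.
class(tib_demo[, 2])
class(tib_demo[[2]])
```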


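At last, no more strings as factors! Upon reading the data in, tibbles recognise strings as strings, not factors. For example, with the name column in our data set (the data frame here was read in with a version of R older than 4.0.0, where read.csv() still defaulted to stringsAsFactors = TRUE).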

class(df$name)
## [1] "factor"
class(tib$name)
## [1] "character"

I quite like this, it’s much easier to turn a vector of characters into factors than vice versa, so why not give me everything as strings? Now I can choose whether or not to convert to factors.
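Choosing to convert is a one-liner. A minimal sketch, with tib_demo and tib_fct as made-up names:

```r
library("tibble")

tib_demo = tibble(name = c("t", "h", "e", "o"))
class(tib_demo$name)

# Convert to a factor only when (and where) it is wanted.
tib_fct = tib_demo
tib_fct$name = factor(tib_fct$name)
class(tib_fct$name)
levels(tib_fct$name)
```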

Disadvantages

This won’t be long, there’s only one. Some older packages don’t work with tibbles because of their alternative subsetting method. They expect tib[, 1] to return a vector, when in fact it will now return another tibble. Until this functionality is added in, you must convert your tibble back to a data frame using as.data.frame(). Whilst adding this functionality will give users the chance to use packages with tibbles and normal data frames, it of course puts extra work on the shoulders of package writers, who now have to change every package to be compatible with tibbles. For more on this discussion, see this thread.
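A rough sketch of the problem and the workaround, using base R’s as.data.frame() to convert back. Here legacy_mean() is a hypothetical stand-in for an older package function, not a real one:

```r
library("tibble")

# Hypothetical stand-in for an older package function that
# assumes x[, 1] is a plain vector.
legacy_mean = function(x) mean(x[, 1])

tib_demo = tibble(a = 1:5)

# x[, 1] is still a tibble here, so mean() warns and returns NA.
suppressWarnings(legacy_mean(tib_demo))

# Converting to a base data frame first restores the old behaviour.
legacy_mean(as.data.frame(tib_demo))
```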

To summarise…

So, most of the things you can accomplish with tibbles, you can accomplish with data.frame(), but it’s a bit of a pain. Simple things like checking the dimensions of your data or converting strings to factors are small jobs. Small jobs that take time. With tibbles they take no time. Tibbles force you to look at your data earlier and confront the problems earlier, ultimately leading to cleaner code.

Thanks for chatting!

