Questions


Data

The Tate Collection dataset contains ~70,000 artworks that are owned or partly owned by Tate. Data Source


Exploration


library(ggplot2)
library(scales)
library(caTools)
library(randomForest)
library(caret)
library(e1071)

Load data and check structure

artworks <- read.csv("the-tate-collection.csv", stringsAsFactors = FALSE, sep = ";")

str(artworks)
## 'data.frame':    69201 obs. of  20 variables:
##  $ id                : int  20400 20618 20830 21086 21163 21157 21153 21210 21271 21405 ...
##  $ accession_number  : chr  "P77527" "P77580" "P77612" "P77680" ...
##  $ artist            : chr  "Charlton, Alan" "Artschwager, Richard" "Marden, Brice" "Francis, Mark" ...
##  $ artistRole        : chr  "artist" "artist" "artist" "artist" ...
##  $ artistId          : int  891 669 1578 2311 1922 2170 2170 1938 2146 2339 ...
##  $ title             : chr  "[no title]" "Interior" "[no title]" "Untitled" ...
##  $ dateText          : chr  "1991" "1972" "1971" "1994" ...
##  $ medium            : chr  "Screenprint on paper" "Screenprint on paper" "Etching and aquatint on paper" "Monotype on paper" ...
##  $ creditLine        : chr  "Purchased 1992" "Purchased 1992" "Purchased 1993" "Purchased 1994" ...
##  $ year              : int  1991 1972 1971 1994 1968 1994 1994 1982 1994 1988 ...
##  $ acquisitionYear   : int  1992 1992 1993 1994 1994 1994 1994 1983 1995 1995 ...
##  $ dimensions        : chr  "image: 362 x 362 mm" "image: 715 x 1043 mm" "image: 370 x 603 mm" "image: 582 x 584 mm" ...
##  $ width             : int  362 715 370 582 1044 380 383 1147 544 585 ...
##  $ height            : int  362 1043 603 584 681 356 362 760 641 418 ...
##  $ depth             : num  NA NA NA NA NA NA NA NA NA NA ...
##  $ units             : chr  "mm" "mm" "mm" "mm" ...
##  $ inscription       : chr  "date inscribed" "date inscribed" "date inscribed" "" ...
##  $ thumbnailCopyright: chr  "© Alan Charlton" "© ARS, NY and DACS, London 2014" "© ARS, NY and DACS, London 2014" "© Mark Francis" ...
##  $ thumbnailUrl      : chr  "https://data.opendatasoft.com/api/datasets/1.0/the-tate-collection@public/images/a76355ec5781715e41cb08f1780b5ab0" "https://data.opendatasoft.com/api/datasets/1.0/the-tate-collection@public/images/d1875557c1a8dada3e60984cd0447cd7" "https://data.opendatasoft.com/api/datasets/1.0/the-tate-collection@public/images/cffeaa9f8a80028d928d47763167ad8a" "https://data.opendatasoft.com/api/datasets/1.0/the-tate-collection@public/images/c0ada25718520babc378ab32ebd9b811" ...
##  $ url               : chr  "http://www.tate.org.uk/art/artworks/charlton-no-title-p77527" "http://www.tate.org.uk/art/artworks/artschwager-interior-p77580" "http://www.tate.org.uk/art/artworks/marden-no-title-p77612" "http://www.tate.org.uk/art/artworks/francis-untitled-p77680" ...

Remove unneeded columns.

drop <- c("accession_number", "artistRole", "artistId", "dateText", "creditLine", "units", "inscription", "thumbnailCopyright", "thumbnailUrl", "url")
artworks_rem <- artworks[ , !(names(artworks) %in% drop)]

str(artworks_rem)
## 'data.frame':    69201 obs. of  10 variables:
##  $ id             : int  20400 20618 20830 21086 21163 21157 21153 21210 21271 21405 ...
##  $ artist         : chr  "Charlton, Alan" "Artschwager, Richard" "Marden, Brice" "Francis, Mark" ...
##  $ title          : chr  "[no title]" "Interior" "[no title]" "Untitled" ...
##  $ medium         : chr  "Screenprint on paper" "Screenprint on paper" "Etching and aquatint on paper" "Monotype on paper" ...
##  $ year           : int  1991 1972 1971 1994 1968 1994 1994 1982 1994 1988 ...
##  $ acquisitionYear: int  1992 1992 1993 1994 1994 1994 1994 1983 1995 1995 ...
##  $ dimensions     : chr  "image: 362 x 362 mm" "image: 715 x 1043 mm" "image: 370 x 603 mm" "image: 582 x 584 mm" ...
##  $ width          : int  362 715 370 582 1044 380 383 1147 544 585 ...
##  $ height         : int  362 1043 603 584 681 356 362 760 641 418 ...
##  $ depth          : num  NA NA NA NA NA NA NA NA NA NA ...

ggplot(artworks_rem) +
  geom_density(aes(year, fill = "red"), alpha = 0.3) + 
  geom_density(aes(acquisitionYear, fill = "blue"), alpha = 0.3) +
  scale_fill_manual(name = NULL, values = c("red" = "red", "blue" = "blue"), labels=c("blue" = "Acquisition Year", "red" = "Year Artwork Created")) + 
  theme_bw()

As the density plot above shows, both the acquisition year (when Tate acquired the artwork) and the creation year of the artworks in the Tate Collection have two peaks. Interestingly, the first large wave of acquisitions occurred in the mid-19th century, whilst the second happened in the late 20th century. The first acquisition peak, around 1880, can only contain artworks created before then. The second peak, in the 1980s, could contain artworks from any earlier year. The creation year and acquisition year are therefore explored in more depth.
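
One way to start on that comparison is to look at the gap between creation and acquisition. The following is a minimal sketch on made-up toy values, not the Tate data; the column names mirror the dataset, but `lag`, `era`, and the 1900 cut-off are illustrative assumptions.

```r
# Toy rows (illustrative values only, not taken from the Tate dataset).
toy <- data.frame(
  year            = c(1820, 1845, 1960, 1975, 1790, NA),
  acquisitionYear = c(1880, 1882, 1980, 1982, 1985, 1990)
)

# Drop rows where either year is missing.
toy <- toy[complete.cases(toy), ]

# Years between creation and acquisition (hypothetical helper column).
toy$lag <- toy$acquisitionYear - toy$year

# Split acquisitions at an assumed cut-off of 1900.
toy$era <- ifelse(toy$acquisitionYear < 1900, "pre-1900", "1900 onwards")

# Median lag per acquisition era.
lag_by_era <- tapply(toy$lag, toy$era, median)
lag_by_era
```

On the real data, the same steps could be run on `artworks_rem`; a negative `lag` would point to a data-entry problem rather than a real acquisition.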


Aspect Ratio through Time


Creation of the aspect ratio (height of artwork / width of artwork)

# Keep only rows where height, width, and year are all present
artworks_ar <- artworks_rem[complete.cases(artworks_rem[, c("height", "width", "year")]), ]

artworks_ar$aspectratio <- artworks_ar$height / artworks_ar$width
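
When dropping incomplete rows before a ratio calculation, it helps to test each column for `NA` separately. The sketch below, on made-up toy values, contrasts `is.na()` applied to a chained `&` with `complete.cases()`; the names `toy`, `flag_and`, and `flag_cc` are illustrative only.

```r
# Toy data (illustrative values only, not from the Tate dataset).
toy <- data.frame(
  height = c(362L, NA, 603L, NA),
  width  = c(362L, 715L, NA, 0L),
  year   = c(1991L, 1972L, 1971L, 1994L)
)

# is.na() on a chained `&` misses row 4: NA & 0 evaluates to FALSE, not NA.
flag_and <- is.na(toy$height & toy$width & toy$year)

# complete.cases() checks each listed column for NA independently.
flag_cc <- !complete.cases(toy[, c("height", "width", "year")])

# Keep complete rows, then compute height / width.
toy_clean <- toy[!flag_cc, ]
toy_clean$aspectratio <- toy_clean$height / toy_clean$width
```

Here `flag_and` leaves row 4 unflagged even though its height is missing, while `flag_cc` catches it.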

ggplot(artworks_ar) +
  geom_point(aes(year, aspectratio)) + 
  theme_bw()

As seen above, the aspect ratios of artworks in the Tate Collection are relatively static until around the 1950s, when the aspect ratios of some artworks change dramatically.

One artwork in particular has an aspect ratio above 3000. Let’s check whether this is an error in the data or a real artwork.

artworks_ar <- artworks_ar[order(artworks_ar$aspectratio),]

artworks_ar[artworks_ar$aspectratio > 3000,]
##          id          artist              title medium year acquisitionYear
## 40103 21561 Balka, Miroslaw [diameter]1 x 3750  Steel 1995            1996
##                            dimensions width height depth halfcentury
## 40103 unconfirmed: 10 x 37500 x 10 mm    10  37500    10   1950-1999
##       aspectratio
## 40103        3750

The output above shows that it is in fact a real artwork. Created by Miroslaw Balka in 1995, the piece, according to Tate, conceptualises grief and memory.
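
The printed row includes a `halfcentury` column whose construction is not shown in this section. One possible way to derive such a label is sketched below on toy years. It assumes 50-year bins aligned to multiples of 50, which is inferred only from the "1950-1999" value in the output, and `bin_start` is a hypothetical helper name.

```r
# Toy years standing in for artworks_ar$year (illustrative values only).
year <- c(1805, 1849, 1950, 1995, 1999, 2003)

# Lower edge of each 50-year bin, e.g. 1995 -> 1950.
bin_start <- (year %/% 50) * 50

# Label in the same "1950-1999" style as the printed row.
halfcentury <- paste0(bin_start, "-", bin_start + 49)

# Count of toy artworks per bin.
table(halfcentury)
```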


The Rise of Postmodernism


artworks_ar_1950 <- artworks_ar[artworks_ar$year >= 1950,]

ggplot(artworks_ar_1950) +
  geom_point(aes(year, aspectratio)) + 
  theme_bw()

As seen above, artworks created from the 1950s onwards reach far greater aspect ratios.
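
The visual impression can also be put into numbers. The following is a rough sketch on made-up toy values, not a result from the Tate data: it measures how far each aspect ratio sits from square (a ratio of 1) on a log scale, so that 0.1 and 10 count as equally extreme. The names `period` and `log_dev` are illustrative assumptions.

```r
# Toy aspect ratios (illustrative values only, not from the Tate dataset).
toy <- data.frame(
  year        = c(1800, 1850, 1900, 1940, 1955, 1970, 1990, 2005),
  aspectratio = c(0.8, 1.2, 0.7, 1.4, 0.1, 1.0, 25, 3750)
)

# Same 1950 split as used for the plot above.
toy$period <- ifelse(toy$year >= 1950, "1950 onwards", "before 1950")

# Distance from a square format on a log10 scale.
toy$log_dev <- abs(log10(toy$aspectratio))

# Most extreme deviation per period.
spread <- tapply(toy$log_dev, toy$period, max)
spread
```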

This can be seen more clearly by visualising the actual shapes of the artworks.

artworks_ar$postmodern <- ifelse(artworks_ar$year >= 1950, 'yes', 'no')

artworks_ar <- artworks_ar[rev(order(artworks_ar$postmodern)),]

ggplot(artworks_ar, aes(xmin = 0, xmax = width, ymin = 0, ymax = height, color = postmodern)) +
  geom_rect(alpha = 0) +
  labs(color="Postmodern?", x = "Width", y = "Height") +
  theme_bw()