File:Rating averages for users who rated all four categories during AFTv4.svg

Original file (SVG file, nominally 960 × 564 pixels, file size: 47 KB)

Summary

Description
English: Quick plot of the per-user average rating for users who rated all four categories during the AFTv4 public data period. A user could rate anywhere from one to four categories, and those who rated all four tended to give the same rating in each (usually all 1s or all 5s).
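
The pattern described above can be illustrated on a few made-up rows. The sketch below uses hypothetical toy data (not the AFTv4 dump) and assumes that a rating of 0 means the category was left unrated. It keeps the users who rated all four categories, averages each user's ratings, and flags users who gave the same rating everywhere; identical ratings always produce a whole-number average.

```r
# Toy ratings, one row per user. 0 = category not rated (an assumption
# for this illustration; the values are invented).
toy <- data.frame(
  aa_rating_wellsourced = c(5, 1, 4, 0, 3),
  aa_rating_neutral     = c(5, 1, 3, 2, 3),
  aa_rating_complete    = c(5, 1, 5, 0, 0),
  aa_rating_readable    = c(5, 1, 5, 4, 2)
)

# Treat zeros as missing, then keep only users who rated all four.
toy[toy == 0] <- NA
full <- toy[complete.cases(toy), ]

# Per-user average across the four categories.
avgs <- rowMeans(full)

# TRUE when a user gave the same rating in every category.
same.rating <- apply(full, 1, function(r) length(unique(r)) == 1)

# Share of full raters who applied one rating across the board.
share.same <- mean(same.rating)

print(data.frame(full, avg = avgs, same = same.rating))
```

In this toy set, two of the three users who rated everything gave a single rating throughout, which is the kind of clustering on whole-number averages the plot is meant to show.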

Code to reproduce this plot is available on GitHub. A copy is reproduced below for convenience only:

library(ggplot2)

# sort of a waste to download all of them from within R. 
# much easier to download and then run something like:
#      sed "1q" 20110523.csv > names.txt
#  This batch has some undocumented fields which aren't worth figuring out atm
#      rm 20110620.csv
#      cat *.csv > temp
#      grep -v "aa_page" temp > out.csv
#      rm temp

header <- scan("/Users/protonk/R/AFT Dump/names.txt", what = "character", sep = ",")

in.classes <- c(rep("numeric", 2), "character", rep("numeric", 2), "character", rep("numeric", 13))

indrat <- read.csv("/Users/protonk/R/AFT Dump/out.csv",
                   header = FALSE, colClasses = in.classes,
                   nrows = 2508605)
names(indrat) <- header

# we only want the ratings
reduced <- indrat[, c("aa_rating_wellsourced", "aa_rating_neutral", "aa_rating_complete", 
                        "aa_rating_readable")]

# We are only interested in rows where the user rated all 4 categories
reduced[reduced == 0] <- NA
rating.avgs <- rowMeans(reduced[complete.cases(reduced), ])

count.table <- table(rating.avgs)

# Builds an ordered factor of the rating avgs (levels sorted by average value)
# and another factor for just the integers (this is a ggplot2 thing)
count.out <- factor(rating.avgs, levels = names(count.table), ordered = TRUE)
integers <- factor(count.out, levels = as.character(1:5))
# placed into a data frame for easier plotting
preplot <- data.frame(count.out, integers)

# Plot frequency of averages
# theme() and ggtitle() stand in for opts(), which current ggplot2 no longer provides
qplot(count.out, fill = integers, geom = "bar", data = preplot) + 
  theme(legend.position = "none") + 
  ggtitle("Averages of ratings where users rated all four categories") + 
  scale_y_continuous(name = "") +  scale_x_discrete(name = "")
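
The shell steps in the comments at the top of the script (concatenate the dump files, drop the repeated header rows, skip one batch) could also be done in R. The following is only a sketch: `combine_dumps` is a hypothetical helper, the column name `aa_page_id` and the second file name are invented for the example, and it assumes every dump file starts with its own header row. For a dump of roughly 2.5 million rows the shell route may well be faster.

```r
# Hypothetical helper (not part of the original script): combine dump
# files in R rather than with sed/cat/grep.
combine_dumps <- function(paths, skip = character(0)) {
  # Leave out unwanted files, e.g. a batch with undocumented fields.
  paths <- paths[!basename(paths) %in% skip]
  # Each file is assumed to carry its own header row, so reading with
  # header = TRUE drops the repeated headers that grep -v removes in
  # the shell version.
  parts <- lapply(paths, read.csv, header = TRUE, stringsAsFactors = FALSE)
  do.call(rbind, parts)
}

# Usage with two tiny temporary files standing in for real dumps.
dump.dir <- tempfile("aft")
dir.create(dump.dir)
writeLines(c("aa_page_id,aa_rating_neutral", "10,5", "11,0"),
           file.path(dump.dir, "20110523.csv"))
writeLines(c("aa_page_id,aa_rating_neutral", "12,1"),
           file.path(dump.dir, "20110530.csv"))

files <- sort(list.files(dump.dir, pattern = "\\.csv$", full.names = TRUE))
combined <- combine_dumps(files)
print(combined)
```

Passing `skip = "20110620.csv"` would play the role of the `rm 20110620.csv` step without deleting anything from disk.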
Date
Source Own work
Author Protonk

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution-Share Alike 3.0 Unported license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
 
This W3C-unspecified chart was created with R.

File history

Date/Time | Dimensions | User | Comment
18:00, 3 May 2012 (current) | 960 × 564 (47 KB) | Protonk | Much better display of the disparity.
08:29, 3 May 2012 | 960 × 564 (47 KB) | Protonk | updated for all ratings.
05:03, 3 May 2012 | 960 × 564 (46 KB) | Protonk |

There are no pages that use this file.

File usage on other wikis

The following other wikis use this file: